Question

The cancer.csv dataset deals with cancer patients. A tumour is a set of cells that have grown in a specific part of the body. Tumours can be classified as either cancerous or non-cancerous based on various factors. Cancerous tumours continue to grow uncontrollably and spread to different parts of the body and eventually to the bloodstream. At this stage, they begin interfering with body functions, which can lead to death (for example, a heart attack from clogged arteries). It is important to classify tumours correctly because it is generally expensive and risky to try to remove all tumours. In this problem, we want to predict whether a person's tumour is cancerous in order to decide whether surgery is necessary.

Features or Independent Variables:
ID - Sample code number
Clump Thickness: 1 - 10
Uniformity of Cell Size: 1 - 10
Uniformity of Cell Shape: 1 - 10
Marginal Adhesion: 1 - 10
Single Epithelial Cell Size: 1 - 10
Bare Nuclei: 1 - 10
Bland Chromatin: 1 - 10
Normal Nucleoli: 1 - 10
Mitoses: 1 - 10

Label or Dependent Variable:
Class: (2 for benign, 4 for malignant)

Using SparkML
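
One way to approach this with Spark ML is sketched below in PySpark. It is only a starting point under several assumptions, not the expected solution. The question does not give the CSV header, so the column names in FEATURE_COLS, the "class" label column, and the helper names to_binary_label and train_and_evaluate are all hypothetical. Logistic regression, the 80/20 split, and AUC as the metric are likewise illustrative choices that the question does not specify. The ID column is left out of the features because it is only a sample identifier.

```python
# Hypothetical column names: the question lists the fields but not the CSV header.
FEATURE_COLS = [
    "clump_thickness",
    "uniformity_cell_size",
    "uniformity_cell_shape",
    "marginal_adhesion",
    "single_epithelial_cell_size",
    "bare_nuclei",
    "bland_chromatin",
    "normal_nucleoli",
    "mitoses",
]


def to_binary_label(class_code: int) -> int:
    """Map the dataset's Class codes (2 = benign, 4 = malignant) to 0/1."""
    mapping = {2: 0, 4: 1}
    if class_code not in mapping:
        raise ValueError(f"unexpected Class value: {class_code!r}")
    return mapping[class_code]


def train_and_evaluate(csv_path: str = "cancer.csv", seed: int = 42) -> float:
    """Fit a logistic regression with Spark ML and return the test-set AUC."""
    # Imported here so the helpers above can be used without Spark installed.
    from pyspark.sql import SparkSession, functions as F
    from pyspark.ml import Pipeline
    from pyspark.ml.feature import VectorAssembler
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.evaluation import BinaryClassificationEvaluator

    spark = SparkSession.builder.appName("cancer-classification").getOrCreate()
    # Assumes a header row named: id, the nine FEATURE_COLS, class.
    df = spark.read.csv(csv_path, header=True, inferSchema=True)

    # Cast features to double; values that fail to parse become null and are dropped.
    for col in FEATURE_COLS:
        df = df.withColumn(col, F.col(col).cast("double"))
    df = df.dropna(subset=FEATURE_COLS + ["class"])

    # Class 4 (malignant) becomes label 1.0; class 2 (benign) becomes 0.0.
    df = df.withColumn("label", (F.col("class") == 4).cast("double"))

    train, test = df.randomSplit([0.8, 0.2], seed=seed)
    pipeline = Pipeline(stages=[
        VectorAssembler(inputCols=FEATURE_COLS, outputCol="features"),
        LogisticRegression(featuresCol="features", labelCol="label"),
    ])
    model = pipeline.fit(train)

    evaluator = BinaryClassificationEvaluator(
        labelCol="label", metricName="areaUnderROC"
    )
    return evaluator.evaluate(model.transform(test))
```

Calling train_and_evaluate("cancer.csv") on a machine with PySpark installed would return the area under the ROC curve on the held-out 20%. Because missing a malignant tumour is costlier than an unnecessary follow-up, it may also be worth inspecting recall on the malignant class rather than relying on a single summary metric.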
