Question

1 Approved Answer

Posted on Feb 12, 2024

Remove missing values from your dataset(s) before answering the following questions.

1) Use the mobile.csv dataset, which classifies mobile phones by price into four categories (price_range column). Use 70% of the dataset for training and the remainder as the test set.
   a. Generate a decision tree classifier that uses the Gini index as the impurity measure, with a maximum depth of 3.
      i. Plot the resulting decision tree obtained after training the classifier.
      ii. Compute the accuracy on the test set.
      iii. Plot the training and test accuracy for different depth values (try values between 2 and 20). Do you observe overfitting as depth increases?
   b. Generate a k-NN classifier for different numbers of nearest neighbors, k (hyperparameter). Consider k values between 2 and 15. Plot the accuracies.
   c. Generate a support vector machine with a linear kernel, where the hyperparameter C takes the values [0.001, 0.01, 0.1, 1]. Plot the accuracies for the different C values.
   d. Generate ensemble classifiers (bagging, boosting, and AdaBoost) with 150 base classifiers with a maximum depth of 5. Plot the accuracies for both training and test. Which one performs better than the others?

2) Use the California Housing Dataset (data1.csv), which contains data drawn from the 1990 U.S. Census, and apply K-means clustering using the longitude, latitude, and median income columns.
   a. Pick k and form k clusters by assigning each instance to its nearest centroid.
   b. Compute the centroid of each cluster.
   c. Is the k that you picked good enough? Determine the number of clusters (k) in the data using the sum of squared errors.

3) Use the first 50 rows of the Customer Segmentation Dataset (data2.csv) and apply three hierarchical clustering algorithms provided by the Python scipy library: (1) single link (MIN), (2) complete link (MAX), and (3) group average. Plot the dendrograms.

4) Use the data2.csv dataset (every row) and apply the DBSCAN algorithm. Consider annual income as the x axis and spending score as the y axis, and plot the clusters.

Step by Step Solution

Step: 1

1) Mobile Price Classification

a. Decision Tree Classifier

Load the mobile.csv dataset. Preprocess the dataset by removing any missing values. Split the dataset into a training set (70%) and a test set (30%). Cre...
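The load, clean, and split steps listed so far, followed by the depth-3 Gini tree the question asks for, might look roughly like the sketch below. It is only an illustration under stated assumptions: it uses pandas and scikit-learn, the helper name `fit_depth3_tree` is hypothetical, and because mobile.csv is not available here it runs on a small synthetic table whose feature columns (`ram`, `battery_power`) are made up. Only the `price_range` target name comes from the question.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def fit_depth3_tree(df, target="price_range", seed=0):
    """Drop rows with missing values, split 70/30, fit a Gini tree of depth 3.

    Hypothetical helper for illustration; not taken from the original answer.
    """
    df = df.dropna()                      # remove missing values first
    X = df.drop(columns=[target])
    y = df[target]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, train_size=0.7, random_state=seed, stratify=y
    )
    clf = DecisionTreeClassifier(criterion="gini", max_depth=3, random_state=seed)
    clf.fit(X_train, y_train)
    test_acc = accuracy_score(y_test, clf.predict(X_test))
    return clf, test_acc, len(X_train), len(X_test)


# Synthetic stand-in for mobile.csv (assumed columns, NOT the real data).
rng = np.random.default_rng(0)
n = 400
demo = pd.DataFrame({
    "ram": rng.uniform(256, 4000, n),
    "battery_power": rng.uniform(500, 2000, n),
})
# Four price classes (0..3) derived from the made-up ram column.
demo["price_range"] = pd.cut(demo["ram"], bins=4, labels=False)
demo.loc[0, "ram"] = np.nan               # one missing value to exercise dropna

clf, acc, n_train, n_test = fit_depth3_tree(demo)
print(f"train rows: {n_train}, test rows: {n_test}, test accuracy: {acc:.3f}")
```

With the real file, the synthetic table would be replaced by `pd.read_csv("mobile.csv")`. The fitted tree can then be drawn with `sklearn.tree.plot_tree(clf)`, and the same function can be looped over depths 2 to 20 (with `max_depth` made a parameter) to compare training and test accuracy.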
