Question


Here is the project:

[Overview and Rationale

Data mining is used to reveal hidden, hard-to-see patterns and relationships in Big Data datasets. It helps classify data for further examination or create models that predict outcomes for a different set of data. As data miners, you should be able to explain how the code used to mine the data functions, and to analyze and interpret the results of the mining. This allows you to summarize and clarify the results for stakeholders.

Assignment Description

Many people forage for mushrooms and sell them to restaurants or use them for their own consumption. These are experts who know their mushrooms. However, as a novice, it is important to be able to spot a poisonous mushroom.

In this assignment, you will use the data set provided to mine the data using the methods presented in this module. You will document in a report the results of each step of the mining process, analyze and interpret the results. Suggest the characteristics to use when determining if a mushroom is safe to eat. Make recommendations for additional analysis and variables to examine to build other classifications such as use of the mushrooms that are not poisonous.

mushrooms.xlsx

Instructions

The report should include the following:

  • Code walk-through: In this section, provide a step-by-step explanation of how the code is interacting with and/or transforming the data. Provide examples from the output to support your explanations.
  • Analysis: Based on the output, analyze the data and the relationships revealed about the variables of interest. Explain the insights provided by the output. Use visualizations to support your analysis.
  • Interpretation and Recommendations: Interpret the results of your analysis and explain what the results mean for the data owner. Provide recommendations for actions to be taken based on your interpretation. Support those with the data. Explain why and what explicit variables you suggest incorporating. For example, median income by city and state from the census.gov website might be useful for examining home ownership.

]

Here is what I did:

Step 1: Import the necessary libraries

import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

Step 2: Load the dataset into a Pandas dataframe

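# header=None would name the columns 0, 1, 2, ... and treat the header row as data,
# so the later lookup of the 'class' column would raise a KeyError.
# Assumption: the first row of the sheet holds the column names, including 'class'.
mushrooms = pd.read_excel('/content/mushrooms.xlsx')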

Step 3: Explore the data

# view the first few rows of the data
mushrooms.head()
# check the dimensions of the dataset
mushrooms.shape
# check the data types of each variable
mushrooms.dtypes

Step 4: Clean and preprocess the data

# check for missing values
mushrooms.isnull().sum()
# encode the categorical variables as numerical variables
from sklearn.preprocessing import LabelEncoder
encoder = LabelEncoder()
for col in mushrooms.columns:
    mushrooms[col] = encoder.fit_transform(mushrooms[col])
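One thing to note about the loop in Step 4: a single LabelEncoder is refitted on every column, so after the loop it only remembers the categories of the last column. The sketch below keeps one encoder per column so that integer codes can be mapped back to the original letters when reading the results. The toy rows and the column names 'class' and 'odor' are made up for illustration and are assumptions about the real file.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Made-up toy rows for illustration only; the real sheet has many more rows and columns.
toy = pd.DataFrame({
    "class": ["e", "p", "e", "p"],
    "odor":  ["n", "f", "a", "f"],
})

encoders = {}          # column name -> fitted LabelEncoder
encoded = toy.copy()
for col in toy.columns:
    enc = LabelEncoder()
    encoded[col] = enc.fit_transform(toy[col])
    encoders[col] = enc

# classes_ lists the original labels in code order (code 0 first, then 1, ...)
odor_labels = list(encoders["odor"].classes_)

# inverse_transform turns the integer codes back into the original letters
decoded_odor = list(encoders["odor"].inverse_transform(encoded["odor"]))

print(odor_labels)
print(decoded_odor)
```

With the encoders kept in a dictionary, a split such as "odor <= 1.5" can be translated back into the odor categories it separates.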

Step 5: Visualize the data

# visualize the distribution of each variable
mushrooms.hist(figsize=(20,20))
# visualize the correlation between variables
sns.heatmap(mushrooms.corr())
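The integer codes produced by label encoding have no natural order for nominal categories, so a correlation heatmap over them can be hard to interpret. As a possible complement for the Analysis section, here is a minimal sketch that cross-tabulates one feature against the class on the original letter values. The toy rows and the column names 'class' and 'odor' are invented for illustration and are assumptions about the real file.

```python
import pandas as pd

# Made-up toy rows for illustration only.
toy = pd.DataFrame({
    "class": ["e", "p", "e", "p", "e", "p"],
    "odor":  ["n", "f", "a", "f", "n", "n"],
})

# Raw counts of each class within each odor category
counts = pd.crosstab(toy["odor"], toy["class"])

# Row-wise shares: within each odor category, the fraction of each class
share = pd.crosstab(toy["odor"], toy["class"], normalize="index")

print(counts)
print(share)
```

A category whose row is entirely one class is a strong candidate for a characteristic to recommend; `share.plot(kind="bar", stacked=True)` would turn the same table into a chart.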

Step 6: Train and evaluate models

# split the data into training and testing sets
from sklearn.model_selection import train_test_split
X = mushrooms.drop(columns=['class'])
y = mushrooms['class']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# train a decision tree model
from sklearn.tree import DecisionTreeClassifier
tree = DecisionTreeClassifier()
tree.fit(X_train, y_train)
# evaluate the model on the testing set
from sklearn.metrics import accuracy_score
y_pred = tree.predict(X_test)
accuracy_score(y_test, y_pred)
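Accuracy alone does not show which kind of mistake the model makes, and for this task a poisonous mushroom predicted as edible is the costly one. A minimal sketch of a confusion matrix and per-class recall follows. The label letters "e" and "p" and the example predictions are hypothetical, not results from the real data.

```python
from sklearn.metrics import confusion_matrix, recall_score

# Hypothetical true and predicted labels, for illustration only.
y_true = ["p", "p", "p", "e", "e", "e"]
y_pred = ["p", "p", "e", "e", "e", "e"]

# Rows are true classes, columns are predicted classes, both in the order e, p
cm = confusion_matrix(y_true, y_pred, labels=["e", "p"])

# Poisonous mushrooms that were predicted edible (true p, predicted e)
missed_poisonous = cm[1, 0]

# Share of poisonous mushrooms that the model caught
poison_recall = recall_score(y_true, y_pred, pos_label="p")

print(cm)
print(missed_poisonous, poison_recall)
```

If the labels have been integer-encoded, the same calls work with the corresponding codes passed as `labels` and `pos_label`.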

But I can't create a decision tree. I want something like this created:
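A minimal sketch of one way to draw a fitted tree, using scikit-learn's plot_tree for a diagram and export_text for a plain-text version of the same rules. The toy rows and the column names 'class', 'odor' and 'cap-color' are made up for illustration and are assumptions about the real file. OrdinalEncoder is swapped in here to encode only the feature columns, and the class column is left as letters so the leaves are labelled with them.

```python
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import OrdinalEncoder
from sklearn.tree import DecisionTreeClassifier, export_text, plot_tree

# Made-up toy rows for illustration only; odor fully determines the class here.
toy = pd.DataFrame({
    "class":     ["e", "p", "e", "p", "e", "p"],
    "odor":      ["n", "f", "a", "f", "n", "f"],
    "cap-color": ["w", "w", "y", "y", "w", "y"],
})

feature_names = ["odor", "cap-color"]
X = OrdinalEncoder().fit_transform(toy[feature_names])  # letters -> numeric codes
y = toy["class"]                                         # keep class letters as-is

model = DecisionTreeClassifier(random_state=42)
model.fit(X, y)

# Plain-text view of the learned rules
rules = export_text(model, feature_names=feature_names)
print(rules)

# Diagram of the same tree; call plt.show() when running outside a notebook
fig, ax = plt.subplots(figsize=(8, 5))
plot_tree(
    model,
    feature_names=feature_names,
    class_names=list(model.classes_),
    filled=True,
    ax=ax,
)
```

On the full dataset the diagram can get large; passing `max_depth` to DecisionTreeClassifier or to plot_tree keeps it readable.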
