Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

What is the dependent variable? What is the independent variable? As you know the two main variables in an experiment are the independent and dependent

What is the dependent variable? What is the independent variable?
As you know the two main variables in an experiment are the independent and dependent variable. An independent variable is the variable that is changed or controlled in a scientific experiment to test the effects on the dependent variable. A dependent variable is the variable being tested and measured in a scientific experiment. In regression analysis, the dependent variable is denoted "Y" and the independent variables are denoted by "X".
Type of dependent variables and independent variables
In the previous regression model, both the dependent variable and the independent variable are numeric (type of variable). What can you do when the dependent variable is categorical (nominal or ordinal) variable? What should you do if you cannot conduct a scientific experiment?
In todays world of Big Data, you can look into large datasets and perform mining on the data instead of an experiment. A very analogous situation is coal mining, where different tools are required to mine the coal buried deep beneath the ground. Of the tools in Data mining, Decision Tree is one of them.
Decision Tree Data Mining Technique
A decision tree is a supervised machine learning technique wherein we train the machine using the existing data with the known target (i.e., dependent) variable. As the name suggests, this technique has a tree type of structure. In Decision Tree, the algorithm splits the dataset into subsets based on the most important or significant attribute. The most significant attribute is designated in the root node, and that is where the splitting takes the place of the entire dataset present in the root node. This splitting done is known as decision nodes. In case no more split is possible, that node is termed as a leaf node.
To stop the algorithm from reaching an overwhelming stage, a stop criterion is employed. One of the stop criteria is the minimum number of observations in the node before the split happens. While applying the decision tree in splitting the dataset, one must be careful that many nodes might have noisy data. To cater to an outlier or noisy data problems, we employ techniques known as Data Pruning. Data pruning is nothing but an algorithm to classify out data from the subset, making it difficult for learning from a given model.

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image

Step: 3

blur-text-image

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

SQL Server Query Performance Tuning

Authors: Sajal Dam, Grant Fritchey

4th Edition

1430267429, 9781430267423

More Books

Students also viewed these Databases questions

Question

Explain the Equal Opportunity Commission enforcement process.

Answered: 1 week ago