Answered step by step
Verified Expert Solution
Question
1 Approved Answer
What is the dependent variable? What is the independent variable? As you know the two main variables in an experiment are the independent and dependent
What is the dependent variable? What is the independent variable?
As you know the two main variables in an experiment are the independent and dependent variable. An independent variable is the variable that is changed or controlled in a scientific experiment to test the effects on the dependent variable. A dependent variable is the variable being tested and measured in a scientific experiment. In regression analysis, the dependent variable is denoted Y and the independent variables are denoted by X
Type of dependent variables and independent variables
In the previous regression model, both the dependent variable and the independent variable are numeric type of variable What can you do when the dependent variable is categorical nominal or ordinal variable? What should you do if you cannot conduct a scientific experiment?
In todays world of Big Data you can look into large datasets and perform mining on the data instead of an experiment. A very analogous situation is coal mining, where different tools are required to mine the coal buried deep beneath the ground. Of the tools in Data mining, Decision Tree is one of them.
Decision Tree Data Mining Technique
A decision tree is a supervised machine learning technique wherein we train the machine using the existing data with the known target ie dependent variable. As the name suggests, this technique has a tree type of structure. In Decision Tree, the algorithm splits the dataset into subsets based on the most important or significant attribute. The most significant attribute is designated in the root node, and that is where the splitting takes the place of the entire dataset present in the root node. This splitting done is known as decision nodes. In case no more split is possible, that node is termed as a leaf node.
To stop the algorithm from reaching an overwhelming stage, a stop criterion is employed. One of the stop criteria is the minimum number of observations in the node before the split happens. While applying the decision tree in splitting the dataset, one must be careful that many nodes might have noisy data. To cater to an outlier or noisy data problems, we employ techniques known as Data Pruning. Data pruning is nothing but an algorithm to classify out data from the subset, making it difficult for learning from a given model.
Step by Step Solution
There are 3 Steps involved in it
Step: 1
Get Instant Access to Expert-Tailored Solutions
See step-by-step solutions with expert insights and AI powered tools for academic success
Step: 2
Step: 3
Ace Your Homework with AI
Get the answers you need in no time with our AI-driven, step-by-step assistance
Get Started