Question
`semi_conductor.csv` contains quality-control data from a semiconductor manufacturing process. Hundreds of diagnostic sensors along the production line measure various inputs and outputs of the process. `codebook_semi_conductor.txt` contains a description of the dataset.
(1) Fit the data in-sample using random forests.
(2) Fit the data in-sample using factor models.
(3) Compare the R-squared from parts (1) and (2).
(4) Which model does best at predicting unseen data? (Hint: evaluate the out-of-sample performance of each model through k-fold cross-validation.)
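Parts (1) to (4) can be sketched in Python with scikit-learn. This is only an illustrative outline, not the expected solution: the column names in `semi_conductor.csv` are not given here, so the sketch runs on synthetic data with a made-up factor structure, and it assumes a continuous response so that R-squared applies. The "factor model" is taken to mean principal-components regression (standardize, extract a few components, then regress on them), which is one common reading of the term. The number of factors, trees, and folds are arbitrary choices.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# ASSUMPTION: synthetic stand-in for semi_conductor.csv. With the real file,
# X would hold the sensor columns and y the quality-control response named
# in codebook_semi_conductor.txt.
rng = np.random.default_rng(0)
n_obs, n_sensors, n_factors = 300, 50, 3
factors = rng.normal(size=(n_obs, n_factors))
loadings = rng.normal(size=(n_factors, n_sensors))
X = factors @ loadings + 0.5 * rng.normal(size=(n_obs, n_sensors))
y = factors @ np.array([1.0, -2.0, 0.5]) + 0.5 * rng.normal(size=n_obs)

models = {
    # Part (1): random forest on the raw sensor readings.
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
    # Part (2): factor model read as principal-components regression.
    "factor_model": make_pipeline(
        StandardScaler(), PCA(n_components=n_factors), LinearRegression()
    ),
}

# Part (4): 5-fold cross-validation, shuffled with a fixed seed.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

results = {}
for name, model in models.items():
    # Part (3): in-sample R-squared, fitted and scored on the same data.
    in_sample_r2 = model.fit(X, y).score(X, y)
    # Out-of-sample R-squared, averaged over the held-out folds.
    oos_r2 = cross_val_score(model, X, y, cv=cv, scoring="r2").mean()
    results[name] = (in_sample_r2, oos_r2)
    print(f"{name}: in-sample R2 = {in_sample_r2:.3f}, CV R2 = {oos_r2:.3f}")
```

A random forest typically reports a very high in-sample R-squared because each tree can fit its training points closely, so the comparison in part (3) tends to flatter it. The cross-validated R-squared in part (4) is the fairer basis for judging which model predicts unseen data better.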