Question
a. Provide the summary statistics for all the variables in the dataset. Explain some of the key aspects of the dataset.
b. Examine the SAS code in the file mod4knn.sas and, for each SAS statement, explain the code using SAS comments. Put this part in the Appendix of your report.
c. Perform k-NN using k = 1, 2, and 3. For each case, explain the SAS output and give your interpretation(s).
d. Which case (k = 1, 2, or 3) provides the best model? Explain why, using the output from part c.
\begin{tabular}{|l|l|l|l|}
\hline Data Set Name & WORK.IMPORT & Observations & 1599 \\
\hline Member Type & DATA & Variables & 13 \\
\hline Engine & V9 & Indexes & 0 \\
\hline Created & 01/15/2023 09:07:13 & Observation Length & 104 \\
\hline Last Modified & 01/15/2023 09:07:13 & Deleted Observations & 0 \\
\hline Protection & & Compressed & NO \\
\hline Data Set Type & & Sorted & NO \\
\hline Label & & & \\
\hline Data Representation & SOLARIS_X86_64, LINUX_X86_64, ALPHA_TRU64, LINUX_IA64 & & \\
\hline Encoding & utf-8 Unicode (UTF-8) & & \\
\hline
\end{tabular}

\begin{tabular}{|l|l|}
\hline \multicolumn{2}{|l|}{ Engine/Host Dependent Information } \\
\hline Data Set Page Size & 131072 \\
\hline Number of Data Set Pages & 2 \\
\hline First Data Page & 1 \\
\hline Max Obs per Page & 1258 \\
\hline Obs in First Data Page & 1224 \\
\hline Number of Data Set Repairs & 0 \\
\hline Filename & /saswork/SAS_workDC4A00000A0F2_odaws02-usw2-2.oda.sas.com/SAS_work7F760000A0F2_odaws02-usw2-2.oda.sas.com/import.sas7bdat \\
\hline Release Created & 9.0401M7 \\
\hline Host Created & Linux \\
\hline Inode Number & 1879054995 \\
\hline Access Permission & rw-r--r-- \\
\hline Owner Name & u62962320 \\
\hline File Size & 384KB \\
\hline File Size (bytes) & 393216 \\
\hline
\end{tabular}

Alphabetic List of Variables and Attributes

\begin{tabular}{|r|l|r|r|l|l|}
\hline \# & Variable & Type & Len & Format & Informat \\
\hline 1 & VAR1 & Num & 8 & BEST12. & BEST32. \\
\hline 12 & alcohol & Num & 8 & BEST12. & BEST32. \\
\hline 6 & chlorides & Num & 8 & BEST12. & BEST32. \\
\hline 4 & citric_acid & Num & 8 & BEST12. & BEST32. \\
\hline 9 & density & Num & 8 & BEST12. & BEST32. \\
\hline 2 & fixed_acidity & Num & 8 & BEST12. & BEST32. \\
\hline 7 & free_sulfur_dioxide & Num & 8 & BEST12. & BEST32. \\
\hline 10 & pH & Num & 8 & BEST12. & BEST32. \\
\hline 13 & quality & Num & 8 & BEST12. & BEST32. \\
\hline 5 & residual_sugar & Num & 8 & BEST12. & BEST32. \\
\hline 11 & sulphates & Num & 8 & BEST12. & BEST32. \\
\hline 8 & total_sulfur_dioxide & Num & 8 & BEST12. & BEST32. \\
\hline 3 & volatile_acidity & Num & 8 & BEST12. & BEST32. \\
\hline
\end{tabular}

SAS log:

/* Students need to modify this code accordingly */
/* Randomly choose 50% of data to fit the model */
proc surveyselect data=work.IMPORT3 out=wine
ERROR: File WORK.IMPORT3.DATA does not exist.
run;
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.WINE may be incomplete. When this step was stopped there were observations and variables.
NOTE: PROCEDURE SURVEYSELECT used (Total process time):
      real time 0.00 seconds
      user cpu time 0.01 seconds
      system cpu time 0.00 seconds
      memory 337.78k
      OS Memory 23972.00k
      Timestamp 01/15/2023 03:19:41 PM
      Step Count 31
      Switch Count
      Page Faults
      Page Reclaims
      Page Swaps
      Voluntary Context Switches
      Involuntary Context Switches
      Block Input Operations
      Block Output Operations

/* Number of k neighbors */
%let k=1;
/* Perform knn algorithm */
proc discrim data=wine (where=(selected=1))
     test=wine (where=(selected=0))
ERROR: Variable selected is not on file WORK.WINE.
     testout=winetestout
ERROR: Variable selected is not on file WORK.WINE.
     listerr crosslisterr;
class quality;
var fixed_acidity volatile_acidity citric_acid residual_sugar chlorides free_sulfur_dioxide total_sulfur_dioxide density pH sulphates alcohol;
run;
title 'Using KNN on Wine Data';
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE DISCRIM used (Total process time):
      real time 0.00 seconds
      user cpu time 0.00 seconds
      system cpu time 0.00 seconds
      OS Memory 23972.00k
      Timestamp 01/15/2023 03:19:41 PM
      Step Count
      Page Faults
      Page Reclaims
      Page Swaps
      Voluntary Context Switches
      Involuntary Context Switches
      Block Input Operations
      Block Output Operations
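Both errors in the log trace back to the first step: PROC SURVEYSELECT reads WORK.IMPORT3, which does not exist, so WORK.WINE is left without the selected variable and PROC DISCRIM then fails on its WHERE clauses. The PROC CONTENTS output above reports the data set as WORK.IMPORT. The following is a minimal sketch under that assumption, not the contents of mod4knn.sas. The sampling options (method=srs, samprate=0.5, outall), the seed value, and the method=npar k=&k options are assumptions filled in for illustration, since the log does not show them. VAR1 is left out of the analysis on the assumption that it is a row index.

```sas
/* Assumption: the imported data set is WORK.IMPORT, as reported by PROC CONTENTS. */

/* Part a: summary statistics for the numeric variables (VAR1 assumed to be a row index). */
proc means data=work.import n nmiss mean std min median max;
   var fixed_acidity volatile_acidity citric_acid residual_sugar chlorides
       free_sulfur_dioxide total_sulfur_dioxide density pH sulphates alcohol quality;
run;

/* Distribution of the class variable used by k-NN. */
proc freq data=work.import;
   tables quality;
run;

/* Randomly flag 50% of the rows for fitting.                         */
/* OUTALL keeps every row and adds Selected (1 = sampled, 0 = not).   */
/* The seed is a hypothetical value, used only for reproducibility.   */
proc surveyselect data=work.import out=wine
                  method=srs samprate=0.5 seed=12345 outall;
run;

/* Number of k neighbors; rerun with k=2 and k=3 for part c. */
%let k=1;

title 'Using KNN on Wine Data';

/* Nonparametric (k-nearest-neighbor) discriminant analysis:          */
/* fit on Selected=1, score the Selected=0 rows into WINETESTOUT.     */
/* TESTDATA= is the documented name of the test data set option.      */
proc discrim data=wine(where=(selected=1))
             testdata=wine(where=(selected=0))
             testout=winetestout
             method=npar k=&k
             listerr crosslisterr;
   class quality;
   var fixed_acidity volatile_acidity citric_acid residual_sugar chlorides
       free_sulfur_dioxide total_sulfur_dioxide density pH sulphates alcohol;
run;

title;
```

Placing the title statement before the procedure makes it apply to that step's output. For part d, one reasonable basis for comparison is the error rate on the test rows that PROC DISCRIM reports for each value of k, since resubstitution error on the fitting rows tends to look optimistic for small k.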