(All questions must be answered in detail, with the necessary supporting arguments provided.)

1. The file MutualFunds.xlsx contains a data set with information for 45 mutual funds that are part of the Morningstar Funds 500. The data set includes the following five variables: (20 points)
- Fund Type: The type of fund, labeled DE (Domestic Equity), IE (International Equity), and FI (Fixed Income)
- Net Asset Value (\$): The closing price per share
- Five-Year Average Return (\%): The average annual return for the fund over the past five years
- Expense Ratio (\%): The percentage of assets deducted each fiscal year for fund expenses
- Morningstar Rank: The risk-adjusted star rating for each fund; Morningstar ranks go from a low of 1 Star to a high of 5 Stars.
a. Prepare a PivotTable that gives the frequency count of the data by Fund Type (rows) and the five-year average annual return (columns). Use classes of 0-9.99, 10-19.99, 20-29.99, 30-39.99, 40-49.99, and 50-59.99 for the Five-Year Average Return (\%). (10 points)
b. What conclusions can you draw about the fund type and the average return over the past five years? (10 points)

2. The file Fortune500.xlsx contains data for profits and market capitalizations from a recent sample of firms in the Fortune 500. Prepare a scatter diagram to show the relationship between the variables Market Capitalization and Profit, with Market Capitalization on the vertical axis and Profit on the horizontal axis. Comment on any relationship between the variables. (15 points)

3. What is text data? Is text usually classified as structured data? Explain in detail the different steps involved in pre-processing text data for text analytics. (20 points)

4. Compare and contrast hierarchical clustering and k-means clustering. (20 points)

5. Leggere, an Internet book retailer, is interested in better understanding the purchase decisions of its customers.
For a set of 2,000 customer transactions, it has categorized the individual book purchases comprising those transactions into one or more of the following categories: Novels, Willa Bean series, Cooking Books, Bob Villa Do-It-Yourself, Youth Fantasy, Art Books, Biography, Cooking Books by Mossimo Bottura, Harry Potter series, Florence Art Books, and Titian Art Books. Leggere has conducted association rules analysis on this data set and would like to analyze the output. The table below (file Leggere.xlsx) shows the top 10 rules with respect to lift ratio. (20 points)
a. For the rule "If a customer buys a Youth Fantasy book, then they buy Novels and Cooking books," calculate the confidence and lift ratio. (10 points)
b. Interpret both the confidence and the lift ratio calculated in (a). (5 points)
c. Among the rules shown in the table, which one has the highest lift ratio? (5 points)

\begin{tabular}{|l|l|l|l|l|}
\hline
Antecedent (A) & Consequent (C) & Support for A & Support for C & Support for A \& C \\
\hline
BotturaCo & Cooking & 124 & 512 & 101 \\
\hline
Cooking, BobVilla & Art & 227 & 327 & 118 \\
\hline
Cooking, Art & Biography & 170 & 385 & 101 \\
\hline
Cooking, Biography & Art & 207 & 334 & 105 \\
\hline
Youth Fantasy & Novels, Cooking & 227 & 512 & 170 \\
\hline
Cooking, Art & BobVilla & 190 & 385 & 105 \\
\hline
Cooking, BobVilla & Biography & 144 & 512 & 105 \\
\hline
Biography & Novels, Cooking & 194 & 373 & 103 \\
\hline
Novels, Cooking & Biography & 227 & 385 & 124 \\
\hline
Art & Novels, Cooking & 204 & 385 & 110 \\
\hline
\end{tabular}

Data for Question 1 (mutual funds):

\begin{tabular}{|c|c|c|c|c|c|}
\hline
Fund Name & Fund Type & Net Asset Value (\$) & 5-Year Average Return (\%) & Expense Ratio (\%) & Morningstar Rank (Stars) \\
\hline
Amer Cent Inc \& Growth Inv & DE & 28.88 & 12.39 & 0.67 & 2 \\
\hline
American Century Intl. Disc & IE & 14.37 & 30.53 & 1.41 & 3 \\
\hline
American Century Tax-Free Bond & FI & 10.73 & 3.34 & 0.49 & 4 \\
\hline
American Century Ultra & DE & 24.94 & 10.88 & 0.99 & 3 \\
\hline
Ariel & DE & 46.39 & 11.32 & 1.03 & 2 \\
\hline
Artisan Intl Val & IE & 25.52 & 24.95 & 1.23 & 3 \\
\hline
Artisan Small Cap & DE & 16.92 & 15.67 & 1.18 & 3 \\
\hline
Baron Asset & DE & 50.67 & 16.77 & 1.31 & 5 \\
\hline
Brandywine & DE & 36.58 & 18.14 & 1.08 & 4 \\
\hline
Brown Cap Small & DE & 35.73 & 15.85 & 1.20 & 4 \\
\hline
Buffalo Mid Cap & DE & 15.29 & 17.25 & 1.02 & 3 \\
\hline
Delafield & DE & 24.32 & 17.77 & 1.32 & 4 \\
\hline
DFA U.S. Micro Cap & DE & 13.47 & 17.23 & 0.53 & 3 \\
\hline
Dodge \& Cox Income & FI & 12.51 & 4.31 & 0.44 & 4 \\
\hline
Fairholme & DE & 31.86 & 18.23 & 1.00 & 5 \\
\hline
Fidelity Contrafund & DE & 73.11 & 17.99 & 0.89 & 5 \\
\hline
Fidelity Municipal Income & FI & 12.58 & 4.41 & 0.45 & 5 \\
\hline
Fidelity Overseas & IE & 48.39 & 23.46 & 0.90 & 4 \\
\hline
Fidelity Sel Electronics & DE & 45.60 & 13.50 & 0.89 & 3 \\
\hline
Fidelity Sh-Term Bond & FI & 8.60 & 2.76 & 0.45 & 3 \\
\hline
Fidelity & DE & 39.85 & 14.40 & 0.56 & 4 \\
\hline
FPA New Income & FI & 10.95 & 4.63 & 0.62 & 3 \\
\hline
Gabelli Asset AAA & DE & 49.81 & 16.70 & 1.36 & 4 \\
\hline
Greenspring & DE & 23.59 & 12.46 & 1.07 & 3 \\
\hline
Janus & DE & 32.26 & 12.81 & 0.90 & 3 \\
\hline
Janus Worldwide & IE & 54.83 & 12.31 & 0.86 & 2 \\
\hline
Kalmar Gr Val Sm Cp & DE & 15.30 & 15.31 & 1.32 & 3 \\
\hline
Managers Freemont Bond & FI & 10.56 & 5.14 & 0.60 & 5 \\
\hline
Marsico 21st Century & DE & 17.44 & 15.16 & 1.31 & 5 \\
\hline
Matthews Pacific Tiger & IE & 27.86 & 32.70 & 1.16 & 3 \\
\hline
Meridian Value & DE & 31.92 & 15.33 & 1.08 & 4 \\
\hline
Oakmark I & DE & 40.37 & 9.51 & 1.05 & 2 \\
\hline
PIMCO Emerg Mkts Bd D & FI & 10.68 & 13.57 & 1.25 & 3 \\
\hline
RS Value A & DE & 26.27 & 23.68 & 1.36 & 4 \\
\hline
T. Rowe Price Latin Am. & IE & 53.89 & 51.10 & 1.24 & 4 \\
\hline
T. Rowe Price Mid Val & DE & 22.46 & 16.91 & 0.80 & 4 \\
\hline
Templeton Growth A & IE & 24.07 & 15.91 & 1.01 & 3 \\
\hline
Thornburg Value A & DE & 37.53 & 15.46 & 1.27 & 4 \\
\hline
USAA Income & FI & 12.10 & 4.31 & 0.62 & 3 \\
\hline
Vanguard Equity-Inc & DE & 24.42 & 13.41 & 0.29 & 4 \\
\hline
Vanguard Global Equity & IE & 23.71 & 21.77 & 0.64 & 5 \\
\hline
Vanguard GNMA & FI & 10.37 & 4.25 & 0.21 & 5 \\
\hline
Vanguard Sht-Tm TE & FI & 15.68 & 2.37 & 0.16 & 3 \\
\hline
Vanguard SmCp Idx & DE & 32.58 & 17.01 & 0.23 & 3 \\
\hline
Wasatch Sm Cp Growth & DE & 35.41 & 13.98 & 1.19 & 4 \\
\hline
\end{tabular}

Data for Question 2 (Fortune 500 sample):

\begin{tabular}{|l|r|r|}
\hline
\multicolumn{1}{|c|}{Company} & Profits (\$ millions) & Market Capitalization (\$ millions) \\
\hline
Alliant Techsystems & 313.2 & 1891.9 \\
\hline
Amazon.com & 631 & 81458.6 \\
\hline
AmerisourceBergen & 706.6 & 10087.6 \\
\hline
Avis Budget Group & -29 & 1175.8 \\
\hline
Boeing & 4,018.00 & 55188.8 \\
\hline
Cardinal Health & 959 & 14115.2 \\
\hline
Cisco Systems & 6,490.00 & 97376.2 \\
\hline
Coca-Cola & 8,572.00 & 157130.5 \\
\hline
ConocoPhillips & 12,436.00 & 95251.9 \\
\hline
Costco Wholesale & 1,462.00 & 36461.2 \\
\hline
CVS Caremark & 3,461.00 & 53575.7 \\
\hline
Delta Air Lines & 854 & 7082.1 \\
\hline
Fidelity National Financial & 369.5 & 3461.4 \\
\hline
FMC Technologies & 399.8 & 12520.3 \\
\hline
Foot Locker & 278 & 3547.6 \\
\hline
General Motors & 9,190.00 & 32382.4 \\
\hline
Harley-Davidson & 599.1 & 8925.3 \\
\hline
HCA Holdings & 2,465.00 & 9550.2 \\
\hline
Kraft Foods & 3,527.00 & 65917.4 \\
\hline
Kroger & 602 & 13819.5 \\
\hline
Lockheed Martin & 2,655.00 & 26651.1 \\
\hline
Medco Health Solutions & 1,455.70 & 21865.9 \\
\hline
Owens Corning & 276 & 3417.8 \\
\hline
Pitney Bowes & 617.5 & 3681.2 \\
\hline
Procter \& Gamble & 11,797.00 & 182109.9 \\
\hline
Ralph Lauren & 567.6 & 12522.8 \\
\hline
Rockwell Automation & 697.8 & 10514.8 \\
\hline
Rockwell Collins & 634 & 8560.5 \\
\hline
United Stationers & 109 & 1381.6 \\
\hline
United Technologies & 4,979.00 & 66606.5 \\
\hline
UnitedHealth Group & 5,142.00 & 53469.4 \\
\hline
\end{tabular}
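
Note on Question 5(a): the confidence and lift ratio asked for there can be computed directly from the three support columns of the association rules table. The Python sketch below is illustrative only. It assumes the support columns are transaction counts out of the 2,000 transactions mentioned in Question 5, and the function names `confidence` and `lift_ratio` are hypothetical helpers, not part of any file or tool named in this assignment.

```python
# Sketch: confidence and lift ratio for one association rule A -> C.
# Assumption: the "Support" columns are raw transaction counts out of
# the 2,000 transactions mentioned in Question 5.

def confidence(support_a: int, support_a_and_c: int) -> float:
    """Share of the transactions containing A that also contain C."""
    return support_a_and_c / support_a


def lift_ratio(support_a: int, support_c: int,
               support_a_and_c: int, n_transactions: int) -> float:
    """Confidence divided by the baseline share of transactions containing C."""
    baseline = support_c / n_transactions
    return confidence(support_a, support_a_and_c) / baseline


N_TRANSACTIONS = 2000

# Row "Youth Fantasy -> Novels, Cooking", values copied as printed in the table.
support_a, support_c, support_a_and_c = 227, 512, 170

conf = confidence(support_a, support_a_and_c)
lift = lift_ratio(support_a, support_c, support_a_and_c, N_TRANSACTIONS)

print(f"confidence = {conf:.4f}")
print(f"lift ratio = {lift:.4f}")
```

A lift ratio above 1 indicates that the consequent appears more often alongside the antecedent than it does across all transactions. The same two functions can be applied to every row of the table when comparing lift ratios for part (c).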