
Question


QUESTION 14

After the quarantine is lifted, and people start riding the Green Line again, a person sees you scribbling in a notebook while riding the T. He looks over your shoulder and says, "Are you seriously using the Jaccard Coefficient to compare records based on categorical variables? That is so dumb of you!" You ask him politely to stop bothering you, but he insists on saying more. "Jaccard is pretty dumb, yo. If you really care about seeing how similar two records are, just use the matching coefficient instead. Take the number of things that the two observations BOTH have, add that to the number of things that NEITHER observation has, and then just divide that sum by the total number of variables that you're looking at."

Why might this rude stranger's advice not be good?

This rude stranger has overlooked the impact of data normalization. His advice might be good if the data does not require normalization, but it will not be applicable if some form of normalization is needed here.

If there is a huge number of variables being considered, a misleading result could occur if this guy's recommendation is followed.

If many of the categories are not present for either of the records being compared, the matching coefficient would imply a misleading sense of similarity between those records.

The stranger is assuming that the data has already undergone some form of dimensionality reduction, most likely through a process known as Principal Components Analysis (PCA). If that has not yet occurred, then he cannot be sure which approach will be more applicable.

The stranger should have mentioned Gower's distance instead; that is the only certain way to prevent any risk of obtaining a misleading result.
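
To make the two measures in the question concrete, here is a minimal Python sketch that computes both on a pair of binary records. The function names, the 0/1 encoding of "has" and "does not have", and the two sample records are assumptions made for illustration; none of them come from the original question. The matching coefficient follows the stranger's description (both present plus both absent, divided by the number of variables), and the Jaccard coefficient uses its usual definition (both present, divided by the number of variables present in at least one record).

```python
def matching_coefficient(a, b):
    """(count both present + count both absent) / total number of variables."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of variables")
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    neither = sum(1 for x, y in zip(a, b) if x == 0 and y == 0)
    return (both + neither) / len(a)


def jaccard_coefficient(a, b):
    """count both present / count present in at least one record."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of variables")
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    # Assumption: return 0.0 when neither record has any category present.
    return both / either if either else 0.0


# Hypothetical records: ten categories, each record has exactly one,
# and the two records share none of them.
record_a = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
record_b = [0, 1, 0, 0, 0, 0, 0, 0, 0, 0]

print(matching_coefficient(record_a, record_b))  # 0.8
print(jaccard_coefficient(record_a, record_b))   # 0.0
```

In this made-up example the two records have nothing in common, yet the matching coefficient reports 0.8 because eight of the ten categories are absent from both, while the Jaccard coefficient ignores those joint absences and reports 0.0.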

