Question
Question 1. (8 points)
(a) For binary data, the L1 distance corresponds to the Hamming distance; that is, the number of bits that differ between two binary vectors. The Jaccard similarity is a measure of the similarity between two binary vectors. Compute the Hamming distance and the Jaccard similarity between the following two binary vectors.
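As an illustration of the two definitions in part (a), here is a minimal Python sketch. The vectors `x` and `y` are hypothetical stand-ins chosen for the example, not the vectors from the question, and the function names are likewise assumptions. The Jaccard similarity is taken as f11 / (f01 + f10 + f11), where f11 counts positions where both vectors are 1 and f01 + f10 counts mismatches.

```python
def hamming_distance(x, y):
    """Number of positions at which two equal-length binary vectors differ."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(a != b for a, b in zip(x, y))


def jaccard_similarity(x, y):
    """f11 / (f01 + f10 + f11); positions where both bits are 0 are ignored."""
    f11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    mismatches = hamming_distance(x, y)  # f01 + f10
    denom = f11 + mismatches
    # Returning 0.0 for two all-zero vectors is a convention assumed here.
    return f11 / denom if denom else 0.0


# Hypothetical example vectors (not the ones from the question).
x = [1, 0, 0, 1, 1, 0]
y = [1, 1, 0, 0, 1, 0]

print(hamming_distance(x, y))    # 2   (positions 1 and 3 differ)
print(jaccard_similarity(x, y))  # 0.5 (f11 = 2, mismatches = 2)
```

The same two functions can be applied to the question's own vectors once they are written as lists of 0s and 1s.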
(b) Which approach, Jaccard or Hamming distance, is more similar to the Simple Matching Coefficient, and which approach is more similar to the cosine measure? Explain. (Note: The Hamming measure is a distance, while the other three measures are similarities, but don't let this confuse you.)
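One way to explore part (b) numerically is to compute all four measures on the same pair of vectors and then pad both vectors with extra positions where both bits are 0. The sketch below assumes the standard definitions (SMC = (f11 + f00) / n, cosine = f11 / sqrt(|x| |y|) for binary vectors) and uses hypothetical vectors and function names; it is an illustration, not the expected written answer.

```python
import math


def counts(x, y):
    """Return (f11, f00, mismatches) for two equal-length binary vectors."""
    f11 = sum(1 for a, b in zip(x, y) if (a, b) == (1, 1))
    f00 = sum(1 for a, b in zip(x, y) if (a, b) == (0, 0))
    return f11, f00, len(x) - f11 - f00


def smc(x, y):
    """Simple Matching Coefficient: counts 1-1 and 0-0 matches alike."""
    f11, f00, _ = counts(x, y)
    return (f11 + f00) / len(x)


def jaccard(x, y):
    """Jaccard similarity: ignores 0-0 matches (assumes at least one 1)."""
    f11, _, mismatches = counts(x, y)
    return f11 / (f11 + mismatches)


def cosine(x, y):
    """Cosine similarity for binary vectors (assumes neither is all zeros)."""
    f11, _, _ = counts(x, y)
    return f11 / math.sqrt(sum(x) * sum(y))


# Hypothetical example vectors (not the ones from the question).
x = [1, 0, 0, 1, 1, 0]
y = [1, 1, 0, 0, 1, 0]

# Pad both vectors with four positions where both bits are 0.
x_padded = x + [0] * 4
y_padded = y + [0] * 4

for name, f in [("SMC", smc), ("Jaccard", jaccard), ("cosine", cosine)]:
    print(name, round(f(x, y), 3), round(f(x_padded, y_padded), 3))

# Hamming distance equals the number of mismatches, i.e. n * (1 - SMC).
print(counts(x, y)[2], round(len(x) * (1 - smc(x, y)), 6))
```

In this example the padding changes the SMC but leaves the Jaccard and cosine values unchanged, and the Hamming distance is recovered from the SMC by a simple rescaling.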
(c) Suppose that you are comparing how similar two organisms of different species are in terms of the number of genes they share. Describe which measure, Hamming or Jaccard, you think would be more appropriate for comparing the genetic makeup of two organisms. Explain. (Assume that each animal is represented as a binary vector, where each attribute is 1 if a particular gene is present in the organism and 0 otherwise.)
(d) If you wanted to compare the genetic makeup of two organisms of the same species, e.g., two human beings, would you use the Hamming distance, the Jaccard coefficient, or a different measure of similarity or distance? Explain. (Note that two human beings share nearly all of the same genes.)