Question:

In our text, we state that the variance of N observations, x1, x2, . . . , xN (when N is large), for a numeric attribute X is defined as

$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \bar{x})^2 = \left(\frac{1}{N}\sum_{i=1}^{N} x_i^2\right) - \bar{x}^2,$$

where x̄ is the mean value of the observations, as defined in Eq. (??). This is actually the formula for calculating the variance of the whole population using all the data (hence called the population variance). If we are calculating the variance using only a sample of the data (hence called the sample variance), we need to use the following formula

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \frac{1}{n-1}\left(\sum_{i=1}^{n} x_i^2 - n\bar{x}^2\right),$$

where n is the size of the sample. With sample size n, the sample standard deviation can be defined similarly. Explain why there is such a minor difference between the definitions of sample variance and population variance.
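The two definitions can be compared numerically. The sketch below is only an illustration under stated assumptions: the function names `population_variance` and `sample_variance`, the normally distributed population, the sample size of 5, and the number of trials are all hypothetical choices, not taken from the text. It draws many small samples from a fixed population and averages the divide-by-n and divide-by-(n - 1) results, so the two averages can be set against the variance of the full population.

```python
import random


def population_variance(xs):
    """Divide the summed squared deviations by N (all the data are available)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n


def sample_variance(xs):
    """Divide by n - 1 instead of n (the data are only a sample)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


# Hypothetical population, chosen only for illustration.
rng = random.Random(0)
population = [rng.gauss(50, 10) for _ in range(10_000)]
true_var = population_variance(population)

# Draw many small samples and average each estimate over all of them.
n, trials = 5, 20_000
avg_div_n = 0.0
avg_div_n_minus_1 = 0.0
for _ in range(trials):
    sample = rng.choices(population, k=n)  # sampling with replacement
    avg_div_n += population_variance(sample) / trials
    avg_div_n_minus_1 += sample_variance(sample) / trials

print(f"population variance:           {true_var:.2f}")
print(f"average of divide-by-n:        {avg_div_n:.2f}")
print(f"average of divide-by-(n - 1):  {avg_div_n_minus_1:.2f}")
```

In a run like this, the divide-by-n average should fall below the population variance by roughly the factor (n - 1)/n, while the divide-by-(n - 1) average should land close to it. The deviations in a sample are measured from the sample mean rather than the unknown population mean, which makes them slightly too small on average. The factor n/(n - 1) approaches 1 as n grows, which is why the two definitions differ only slightly for large n.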

Related Book

Data Mining Concepts And Techniques

ISBN: 9780128117613

4th Edition

Authors: Jiawei Han, Jian Pei, Hanghang Tong
