4.5 (AUC-PR and AP) We have not discussed the details of how the AUC-PR measurement is calculated. For a binary classification task, we assume that every example $x$ has a score $f(x)$, and sort the test examples into descending order of these scores. Then, for every example, we set the classification threshold as the current example's score (i.e., only this example and the examples before it are classified as positive). A pair of precision and recall values is computed at this threshold. The PR curve is drawn by connecting nearby points using line segments. Then AUC-PR is the area under the PR curve.

Let $(r_i, p_i)$ denote the $i$th recall and precision rates ($i = 1, 2, \dots$). When computing the area, the contribution between $r_i$ and $r_{i-1}$ is calculated using the trapezoidal interpolation $(r_i - r_{i-1}) \frac{p_i + p_{i-1}}{2}$, in which $r_i - r_{i-1}$ is the length on the $x$-axis, and $p_i$ and $p_{i-1}$ are the lengths of two vertical lines on the $y$-axis. Summing over all $i$ values, we obtain the AUC-PR score. Note that we assume the first pair $(r_0, p_0) = (0, 1)$, which is a pseudopair corresponding to the threshold $+\infty$.

(a) For the test set with 10 examples (indexed from 1 to 10) in Table 4.3, calculate the precision ($p_i$) and recall ($r_i$) when the threshold is set as the current example's $f(x_i)$ value. Use class 1 as positive, and fill in these values in Table 4.3. Put the trapezoidal approximation $(r_i - r_{i-1}) \frac{p_i + p_{i-1}}{2}$ in the "AUC-PR" column for the $i$th row, and fill in their sum in the final row.

(b) Average precision (AP) is another way to summarize the PR curve into one number. Similar to AUC-PR, AP approximates the contribution between $r_i$ and $r_{i-1}$, but using a rectangle, as $(r_i - r_{i-1}) p_i$. Fill in this approximation in the "AP" column for the $i$th row, and put their sum in the final row. Both AUC-PR and AP summarize the PR curve, hence they should be similar to each other. Are they?

Table 4.3 Calculation of AUC-PR and AP.

(c) Both AUC-PR and AP are sensitive to the order of labels. If the labels of the ninth and the tenth rows are exchanged, what are the new AUC-PR and AP?

(d) Write a program to calculate both AUC-PR and AP based on the labels, scores, and the positive class. Validate your program's correctness using the example test set in Table 4.3.
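As a possible starting point for part (d), the procedure described in this exercise can be sketched in Python. This is only one way to write it, not a reference solution. The function name `auc_pr_and_ap`, the rule that tied scores keep their input order, and the four-example toy data are all assumptions made for illustration; the toy data are not the contents of Table 4.3.

```python
def auc_pr_and_ap(labels, scores, positive):
    """Return (AUC-PR, AP) using the trapezoid and rectangle rules of this exercise."""
    # Indices sorted by descending score. Python's sort is stable, so tied
    # scores keep their input order (an assumption; the exercise does not
    # say how ties should be broken).
    order = sorted(range(len(scores)), key=lambda i: -scores[i])

    n_pos = sum(1 for y in labels if y == positive)
    if n_pos == 0:
        raise ValueError("no positive examples: recall is undefined")

    prev_r, prev_p = 0.0, 1.0  # pseudopair (r_0, p_0) = (0, 1), threshold +infinity
    tp = 0                     # true positives among the top-ranked examples so far
    auc_pr = 0.0
    ap = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == positive:
            tp += 1
        p = tp / rank    # precision when the top `rank` examples are predicted positive
        r = tp / n_pos   # recall at the same threshold
        auc_pr += (r - prev_r) * (p + prev_p) / 2  # trapezoid between r_{i-1} and r_i
        ap += (r - prev_r) * p                     # rectangle of height p_i
        prev_r, prev_p = r, p
    return auc_pr, ap


# Hypothetical toy data (NOT Table 4.3): two classes, class 1 is positive.
toy_labels = [1, 2, 1, 2]
toy_scores = [0.9, 0.8, 0.7, 0.6]
toy_auc_pr, toy_ap = auc_pr_and_ap(toy_labels, toy_scores, positive=1)
print(f"{toy_auc_pr:.4f} {toy_ap:.4f}")  # prints "0.7917 0.8333"
```

On the toy data the two summaries are $19/24$ and $5/6$, which can be checked by hand in the same way as parts (a) and (b). Rows where the label is negative contribute nothing to either sum, because the recall does not change there.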