
Question


| Data segments | Fraction of the data points between -2.6 and 3.1 inclusively | Does the segment have the symptom? |
|---|---|---|
| -4.5, 0.5, 4.5, -0.1, -4.3 | 2/5 = 0.4 | False |
| -4.1, 0.1, 4.1, 0.4, -4.9 | 2/5 = 0.4 | False |
| -1.3, 0.2, 1.1, 0.4, 1.1 | 5/5 = 1 | True |
| -1.7, 0.3, 3.1, 0.8, -2.6 | 5/5 = 1 (Note: Both -2.6 and 3.1 in the data segment are counted.) | True |
| -1.5, -0.2, 1.2, 0.6, -4.1 | 4/5 = 0.8 | True |
| -4.1, 0.1, 4.1, 0.4, -4.9 | 2/5 = 0.4 | False |
| -1.2, -0.1, 1.2, 0.7, -1.9 | 5/5 = 1 | True |
| -3.9, 0.1, 2.9, 0.5, -2.2 | 4/5 = 0.8 | True |
| -2.0, 0.5, 1.7, 4.6, 4.7 | 3/5 = 0.6 | False |
Note that the algorithmic parameters segment_len, interval and threshold may take on different values in different tests.
After computing whether each complete segment has the symptom, we can summarise the results in a Python list of Boolean values. We will refer to this list using the variable name disorder_status where disorder means the symptom is present. For the flow_rate data in the sample code, the variable disorder_status is:
disorder_status = [False, False, True, True, True, False, True, True, False]
Note that there are 9 elements in disorder_status and they correspond to the 9 complete segments in the given flow_rate. Note also that you can obtain disorder_status from the right-most column in the table above.
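The segment-by-segment computation in the table can be sketched as follows. This is only an illustrative sketch, not the assignment's reference solution: the function name compute_disorder_status is hypothetical, the flow_rate list is rebuilt from the nine table rows, and the threshold value 0.7 together with the comparison fraction >= threshold are assumptions chosen so that 0.8 gives True and 0.6 gives False, as in the table.

```python
def compute_disorder_status(flow_rate, segment_len, interval, threshold):
    """Return one Boolean per complete segment of flow_rate (hypothetical helper)."""
    low, high = interval
    # Only complete segments are considered; an incomplete tail is ignored.
    num_segments = len(flow_rate) // segment_len
    status = []
    for k in range(num_segments):
        segment = flow_rate[k * segment_len:(k + 1) * segment_len]
        # Count the points inside the interval, both end points included.
        in_range = sum(1 for x in segment if low <= x <= high)
        fraction = in_range / segment_len
        # Assumption: the symptom is present when fraction >= threshold.
        status.append(fraction >= threshold)
    return status


# Data rebuilt from the table rows (9 segments of 5 points each).
flow_rate = [
    -4.5, 0.5, 4.5, -0.1, -4.3,
    -4.1, 0.1, 4.1, 0.4, -4.9,
    -1.3, 0.2, 1.1, 0.4, 1.1,
    -1.7, 0.3, 3.1, 0.8, -2.6,
    -1.5, -0.2, 1.2, 0.6, -4.1,
    -4.1, 0.1, 4.1, 0.4, -4.9,
    -1.2, -0.1, 1.2, 0.7, -1.9,
    -3.9, 0.1, 2.9, 0.5, -2.2,
    -2.0, 0.5, 1.7, 4.6, 4.7,
]

# threshold=0.7 is an assumed value; the actual value comes from the sample code.
disorder_status = compute_disorder_status(flow_rate, 5, [-2.6, 3.1], 0.7)
print(disorder_status)
```

Under these assumptions the printed list matches the right-most column of the table.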
The next part of the computation is to determine the episodes from the variable disorder_status.
(Determining the episodes)
An episode is formed by consecutive segments that have symptoms and an episode must have a minimum number of segments. The algorithmic parameter min_segment specifies the minimum number of segments an episode must have. The value of min_segment is 2 in the sample code but its value can change from test to test.
The determination of the episodes requires only two variables: disorder_status and min_segment. When min_segment equals 2, the variable disorder_status given above has two episodes, namely the runs of consecutive True values at list indices 2 to 4 and 6 to 7:
[False, False, True, True, True, False, True, True, False]
The first episode starts in the third segment (corresponding to a Python list index of 2) and has a duration of 3 segments. The second episode starts in the seventh segment (corresponding to a Python list index of 6) and has a duration of 2 segments. We will summarise the information on the episodes by using a list of lists as follows:
[[2,3],[6,2]]
The first list [2,3] corresponds to the first episode. The first element 2 in [2,3] is the Python list index of the segment at which the episode begins, and the second element 3 is the number of segments in the episode. The second list [6,2] is read in the same way. The variable episodes in the last line of the sample code above is expected to take on the value of this list of lists.
Let us consider the case where the variable min_segment has the value 3 instead. In this case, the variable disorder_status given above has only one episode, namely the run of three consecutive True values at list indices 2 to 4:
[False, False, True, True, True, False, True, True, False]
This is because each episode is now required to have at least 3 segments. We will summarise the information on the episodes by using a list of lists as follows:
[[2,3]]
If we further increase min_segment to 4, then there are no episodes in the variable disorder_status given above. In this case, we summarise the information on the episodes by using an empty list, i.e. [].
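The episode rule described above can be sketched as a single pass over disorder_status. The function name find_episodes is hypothetical and this is one possible approach rather than the expected solution: it tracks the start of the current run of True values and records the run as [start index, length] once it ends, provided it has at least min_segment segments.

```python
def find_episodes(disorder_status, min_segment):
    """Return a list of [start_index, num_segments] for each episode (hypothetical helper)."""
    episodes = []
    start = None  # index where the current run of True values began, if any
    for i, has_symptom in enumerate(disorder_status):
        if has_symptom and start is None:
            start = i  # a new run begins
        elif not has_symptom and start is not None:
            length = i - start
            if length >= min_segment:
                episodes.append([start, length])
            start = None  # the run has ended
    # A run that reaches the end of the list still has to be checked.
    if start is not None:
        length = len(disorder_status) - start
        if length >= min_segment:
            episodes.append([start, length])
    return episodes


status = [False, False, True, True, True, False, True, True, False]
print(find_episodes(status, 2))  # [[2, 3], [6, 2]]
print(find_episodes(status, 3))  # [[2, 3]]
print(find_episodes(status, 4))  # []
```

The three calls reproduce the three cases discussed above for min_segment equal to 2, 3 and 4.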
Validity checks
The description above shows how the data (flow_rate) and algorithmic parameters (segment_len, interval, threshold, min_segment) are used to compute the episodes. Note that the algorithmic parameters must be valid so that the computation can be carried out. We require that your code performs a number of validity checks before computing the episodes. For example, the algorithmic parameter segment_len is valid only if it is a positive integer greater than or equal to 1. The following table states the requirements for the algorithmic parameters to be valid and the assumptions you can make when testing.
| Algorithmic parameters | Requirements for the parameter to be valid | Assumptions you can make when testing |
|---|---|---|
| segment_len | A positive integer greater than or equal to 1 | You can assume that, when we test your code, the given segment_len is always a number (int or float). In other words, the given segment_len cannot be of data type str, list e |
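The segment_len requirement stated above could be checked along these lines. This is a hedged sketch only: the function name is_valid_segment_len is hypothetical, and treating every float (even a whole-valued one such as 5.0) as invalid is an assumption, since the text only says the value must be a positive integer.

```python
def is_valid_segment_len(segment_len):
    """Return True if segment_len is an int that is at least 1 (hypothetical helper)."""
    # Assumption: bool is excluded even though it is a subclass of int in Python.
    if isinstance(segment_len, bool):
        return False
    # Assumption: any float, including 5.0, is treated as not being an integer.
    if not isinstance(segment_len, int):
        return False
    return segment_len >= 1


for candidate in [5, 1, 0, -3, 2.5]:
    print(candidate, is_valid_segment_len(candidate))
```

If whole-valued floats should count as valid, the type check would need to be relaxed accordingly.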