Question:
Suppose we are given \(M\) temporal sequences \(S=\left\{\boldsymbol{x}^{(1)}, \ldots, \boldsymbol{x}^{(M)}\right\}\), where each temporal sequence \(\boldsymbol{x}^{(m)}\), \(m=1, \ldots, M\), consists of \(n^{(m)}\) temporal segments, that is, \(\boldsymbol{x}^{(m)}=\left\{x_{1}^{(m)}, \ldots, x_{n^{(m)}}^{(m)}\right\}\). Note that the lengths of the temporal sequences may differ. \(S\) contains both normal sequences (labeled \(Y^{(m)}=0\)) and abnormal sequences (labeled \(Y^{(m)}=1\)).
a. In the unsupervised setting, we do not have labels for either the abnormal or the normal sequences. We observe that (1) the majority of the temporal sequences are normal, whereas only a small portion of the temporal sequences in \(S\) are abnormal; and (2) the abnormal sequences often deviate substantially from the normal sequences. Can you propose your own solution to identify the abnormal sequences in \(S\)?
b. In the supervised setting, we are given a training set with labeled abnormal sequences and normal sequences. Could you name one popular supervised sequence classification model to identify the abnormal sequences? What are the pros and cons of the supervised method, compared to your proposed unsupervised solution in (a)?
Step by Step Answer:
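One possible approach to part (a), offered as a minimal sketch rather than the textbook's solution, is distance-based outlier detection: compare sequences with dynamic time warping (DTW), which tolerates different lengths, and score each sequence by the distance to its k-th nearest neighbor. Under observations (1) and (2), normal sequences should have close neighbors while abnormal ones should not. The sketch assumes each segment \(x_i^{(m)}\) is a real number; the function names, the choice of `k`, and the `contamination` fraction (the assumed share of abnormal sequences) are all hypothetical and would need tuning on real data.

```python
import numpy as np

def dtw_distance(a, b):
    """DTW distance between two 1-D sequences that may differ in length."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

def knn_outlier_scores(sequences, k=2):
    """Score each sequence by the DTW distance to its k-th nearest neighbor."""
    M = len(sequences)
    dist = np.zeros((M, M))
    for i in range(M):
        for j in range(i + 1, M):
            dist[i, j] = dist[j, i] = dtw_distance(sequences[i], sequences[j])
    # After sorting a row, column 0 is the zero self-distance,
    # so column k holds the k-th nearest neighbor distance.
    return np.sort(dist, axis=1)[:, k]

def flag_abnormal(sequences, k=2, contamination=0.1):
    """Label the highest-scoring fraction of sequences as abnormal (1)."""
    scores = knn_outlier_scores(sequences, k)
    n_flag = max(1, int(round(contamination * len(sequences))))
    flagged = np.argsort(scores)[::-1][:n_flag]
    labels = np.zeros(len(sequences), dtype=int)
    labels[flagged] = 1
    return labels, scores

# Synthetic illustration (assumed data): nine sine waves of different
# lengths as normal sequences, plus one shifted far away from them.
lengths = (20, 25, 30, 22, 28, 24, 26, 21, 27)
sequences = [np.sin(np.linspace(0, 2 * np.pi, n)) for n in lengths]
sequences.append(np.sin(np.linspace(0, 2 * np.pi, 23)) + 5.0)

labels, scores = flag_abnormal(sequences, k=2, contamination=0.1)
print(labels)
```

The pairwise DTW computation is quadratic in both the number of sequences and their lengths, so for a large \(S\) one would likely substitute a cheaper representation, for example fixed-length summary features or a learned embedding, before scoring. For part (b), a supervised classifier such as a recurrent neural network could use labels to learn which deviations matter, at the cost of needing enough labeled abnormal examples, which are typically scarce.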
Data Mining Concepts And Techniques
ISBN: 9780128117613
4th Edition
Authors: Jiawei Han, Jian Pei, Hanghang Tong