6.19 Sequence kernels. Let $X = \{a, c, g, t\}$. To classify DNA sequences using SVMs, we...

Question:

6.19 Sequence kernels. Let $X = \{a, c, g, t\}$. To classify DNA sequences using SVMs, we wish to define a kernel between sequences defined over $X$. We are given a finite set $I \subset X^*$ of non-coding regions (introns). For $x \in X^*$, denote by $|x|$ the length of $x$ and by $F(x)$ the set of factors of $x$, i.e., the set of subsequences of $x$ with contiguous symbols. For any two strings $x, y \in X^*$ define $K(x, y)$ by

$$K(x, y) = \sum_{z \in (F(x) \cap F(y)) \setminus I} \rho^{|z|}, \tag{6.32}$$

where $\rho \ge 1$ is a real number.
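
To make definition (6.32) concrete, here is a minimal brute-force sketch in Python. It is not the automaton-based computation the exercise asks about, only a direct enumeration of factors for small strings. The function names `factors` and `sequence_kernel` are hypothetical, and the sketch assumes that the empty string counts as a factor (so it contributes $\rho^0 = 1$ unless it belongs to $I$).

```python
from __future__ import annotations


def factors(x: str) -> set[str]:
    """Return all contiguous substrings of x.

    Assumption: the empty string is included as a factor.
    """
    n = len(x)
    return {x[i:j] for i in range(n + 1) for j in range(i, n + 1)}


def sequence_kernel(x: str, y: str, introns: set[str], rho: float = 2.0) -> float:
    """Brute-force evaluation of K(x, y) from (6.32).

    Sums rho ** len(z) over every z that is a factor of both x and y
    and is not in the intron set. Intended for illustration only:
    it enumerates O(|x|^2 + |y|^2) substrings.
    """
    if rho < 1:
        raise ValueError("rho must be >= 1")
    common = (factors(x) & factors(y)) - introns
    return sum(rho ** len(z) for z in common)


if __name__ == "__main__":
    # Common factors of "acg" and "cgt" are "", "c", "g", "cg";
    # "c" is excluded as an intron in this toy example.
    print(sequence_kernel("acg", "cgt", introns={"c"}, rho=2.0))
```

Because the sum runs over a set intersection, swapping `x` and `y` gives the same value, which matches the symmetry claimed in part (a).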

(a) Show that $K$ is a rational kernel and that it is positive definite symmetric.

(b) Give the time and space complexity of the computation of $K(x, y)$ with respect to the size $s$ of a minimal automaton representing $X^* \setminus I$.

(c) Long common factors between $x$ and $y$ of length greater than or equal to $n$ are likely to be important coding regions (exons). Modify the kernel $K$ to assign weight $\rho_2^{|z|}$ to $z$ when $|z| \ge n$, and $\rho_1^{|z|}$ otherwise, where $1 \le \rho_1 \ll \rho_2$.
Show that the resulting kernel is still positive definite symmetric.
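
The modification in part (c) can be illustrated with the same kind of brute-force sketch. Again this is only an enumeration for small inputs, not an efficient algorithm; the name `weighted_sequence_kernel` and the default values of `rho1` and `rho2` are hypothetical, and the empty string is assumed to count as a factor.

```python
from __future__ import annotations


def factors(x: str) -> set[str]:
    """Return all contiguous substrings of x (empty string included by assumption)."""
    n = len(x)
    return {x[i:j] for i in range(n + 1) for j in range(i, n + 1)}


def weighted_sequence_kernel(
    x: str,
    y: str,
    introns: set[str],
    n: int,
    rho1: float = 1.5,
    rho2: float = 4.0,
) -> float:
    """Brute-force version of the length-thresholded kernel from part (c).

    A common factor z that is not an intron contributes rho2 ** len(z)
    when len(z) >= n, and rho1 ** len(z) otherwise.
    """
    if not 1 <= rho1 <= rho2:
        raise ValueError("expected 1 <= rho1 <= rho2")
    common = (factors(x) & factors(y)) - introns
    return sum((rho2 if len(z) >= n else rho1) ** len(z) for z in common)


if __name__ == "__main__":
    # With n = 2, the factor "cg" is weighted with rho2, shorter ones with rho1.
    print(weighted_sequence_kernel("acg", "cgt", introns={"c"}, n=2, rho1=1.0, rho2=3.0))
```

In this form the weight of each term depends only on $z$ and is nonnegative, which may be a useful starting observation for the positive definiteness argument.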

Related Book:

Foundations Of Machine Learning

ISBN: 9780262351362

2nd Edition

Authors: Mehryar Mohri, Afshin Rostamizadeh, Ameet Talwalkar