
Question

You are hired by a new start-up called Readr that wants to offer automatic book recommendations based on a book's text. You are tasked with building a classifier that, given a book and its content, classifies the book as good or bad. After consulting with a top literary critic, you decide that you only need to consider two binary features for this task: a) whether the length of the book is more than 500 pages (represented by random variable L with values short and long), and b) whether the word "wow" appears in the book text (represented by random variable W with values true and false).

Potts, Christopher. 2011. On the negativity of negation. In Nan Li and David Lutz, eds., Proceedings of Semantics and Linguistic Theory 20, 636-659.

a) You have a dataset of 10,000 books with good/bad labels provided by top literary critics, and you observe the following: i) 2,500 books are labeled as good and the rest as bad, ii) 2,000 of the good books are long, iii) 5,000 of the bad books are short, iv) 50 of the good books have the word "wow", and v) 1,500 of the bad books have the word "wow". Let G be the hypothesis that a book is good, and B the hypothesis that it is bad. What values would you pick for the priors P(G) and P(B), and the likelihoods P(L = long | G), P(L = long | B), P(W = true | G), and P(W = true | B)?

b) You decide to build your classifier using Naive Bayes, building on the probability mass functions derived in part a). Suppose your classifier receives a new book that is 700 pages long and does not contain the word "wow". What will your classifier predict about this book?

c) Suppose you go back to your dataset and notice that all the good books that are long do not have the word "wow", and that 750 of the bad books that are long do not have the word "wow". Would this information lead you to a different answer than the one produced by your classifier from part b)? If so, why?
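The arithmetic the three parts ask for can be checked with a short Python sketch. It is not the page's hidden solution. It assumes the priors and likelihoods are estimated as plain relative frequencies from the stated counts (no smoothing), and that the new book in part b) counts as long. All variable and function names are illustrative.

```python
from fractions import Fraction

# Counts stated in the question.
N_BOOKS = 10_000
N_GOOD = 2_500
N_BAD = N_BOOKS - N_GOOD      # "the rest" are bad: 7,500
GOOD_LONG = 2_000
BAD_LONG = N_BAD - 5_000      # 5,000 bad books are short, so 2,500 are long
GOOD_WOW = 50
BAD_WOW = 1_500

# Part a): relative-frequency estimates (assumption: no smoothing).
prior = {"G": Fraction(N_GOOD, N_BOOKS), "B": Fraction(N_BAD, N_BOOKS)}
p_long = {"G": Fraction(GOOD_LONG, N_GOOD), "B": Fraction(BAD_LONG, N_BAD)}
p_wow = {"G": Fraction(GOOD_WOW, N_GOOD), "B": Fraction(BAD_WOW, N_BAD)}


def naive_bayes_scores(is_long, has_wow):
    """Unnormalized posteriors P(class) * P(L | class) * P(W | class)."""
    scores = {}
    for label in ("G", "B"):
        pl = p_long[label] if is_long else 1 - p_long[label]
        pw = p_wow[label] if has_wow else 1 - p_wow[label]
        scores[label] = prior[label] * pl * pw
    return scores


def predict(scores):
    """Return the label with the highest score."""
    return max(scores, key=scores.get)


# Part b): a long book without the word "wow".
nb_scores = naive_bayes_scores(is_long=True, has_wow=False)
nb_prediction = predict(nb_scores)

# Part c): use the observed joint counts for (long, no "wow") directly,
# instead of multiplying the two per-feature likelihoods.
joint_scores = {"G": Fraction(2_000, N_BOOKS), "B": Fraction(750, N_BOOKS)}
joint_prediction = predict(joint_scores)
joint_posterior_good = joint_scores["G"] / sum(joint_scores.values())

if __name__ == "__main__":
    print("priors:", {k: float(v) for k, v in prior.items()})
    print("P(L=long | class):", {k: float(v) for k, v in p_long.items()})
    print("P(W=true | class):", {k: float(v) for k, v in p_wow.items()})
    print("part b scores:", {k: float(v) for k, v in nb_scores.items()}, nb_prediction)
    print("part c P(G | long, no wow):", float(joint_posterior_good), joint_prediction)
```

Under these assumptions, part a) gives P(G) = 0.25, P(B) = 0.75, P(L = long | G) = 0.8, P(L = long | B) = 1/3, P(W = true | G) = 0.02, and P(W = true | B) = 0.2. In part b) the Naive Bayes scores are 0.196 for G and 0.2 for B, so the classifier narrowly predicts bad. In part c) the joint counts give P(G | long, no "wow") = 2000/2750, about 0.73, so the prediction changes to good. The reason is that Naive Bayes treats L and W as conditionally independent given the class, and the joint counts show that this assumption does not hold in the data.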

