Question

Part 1: Unique Words and Most Frequently Used Words in Novels

In countwords.py, you will compute the number of unique words (or vocabulary size), the average size of those unique words, and the top 20 most frequently used words by authors in their novels. Consider the text file below called lincoln.txt:

    It is rather for us to be here dedicated to the great task remaining before us that from these honored dead we take increased devotion to that cause for which they gave the last full measure of devotion that we here highly resolve that these dead shall not have died in vain-that this nation, under God, shall have a new birth of freedom, and that government of the people, by the people, for the people, shall not perish from the earth

Once completed, countwords.py should have a variable called [...]. [...] length of unique words, and the top-20 most frequently used words along with their frequencies. For example, a sample program output may look as follows (because lincoln.txt is a short text, we only list the top 10):

    [...]
    Total Unique Words: 2[...]
    Average Length of Unique Words: 5.689655172413793
    remaining 1
    [...]

You might have noticed that the number of unique words and the most frequently used words in Lincoln's text seem to be different from what is actually in the text. For example, the most frequently used word is actually "the", with a count of 6, but it doesn't show up in the result. The reason is that we don't want to count the words we call stop words, such as the, a, and, not. Use the stop word list to compare and remove all stop words in the text before processing. This stop word list is also contained in the zip file you downloaded. There are different versions of stop word lists. Please use the given one so our results will be consistent.

The process should work for any book you download off of Project Gutenberg, and as a matter of fact, it should work for any text. You need to pay attention to a few things. Read the [...] for suggestions on how to resolve these issues.

Cases. People and people should be counted as the same word.

Punctuation. When splitting a string into words, Python uses whitespace as the default delimiter. To split a string that is a bit more complex than that, we'd have to use Python's regular expression package or other string functions. (Using regular expressions with some patterns isn't very difficult.) For example, if a string variable s is "How are you?", s.split() would produce ['How', 'are', 'you?']. To get rid of the ? after you, you'd need the help of regular expressions or other string functions.

Plural and other forms of a word. Technically, we'd prefer to keep only the root of a word, e.g. nation and nations should be considered the same word, and so should acting and act. But this process can get very complicated. In this project, different forms of a word are allowed to be counted separately.

Scroll down to Programming Tips and Suggestions for some help on these points.

It is important that your program works exactly as described above. The program should automatically call the main() function when it is run; see the given Python files in the zip file. The name of the text file being analyzed should be contained within the program. When [...] robin hood.txt, the program should repeat the output in a similar format twice, once for each file.

Finally, think carefully about your code in this phase. You should be able to use (copy-paste) the components you build for this phase into the next one as [...]
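The points above (case folding, punctuation, stop words, counting) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the assignment's reference solution: the names tokenize, analyze, analyze_file and SAMPLE_STOP_WORDS are hypothetical, the stop word set is a tiny stand-in for the list shipped in the zip file (which is not reproduced in the question), and the demo runs on an inline excerpt so it works without the assignment files.

```python
import re
from collections import Counter

# Hypothetical stand-in: only the stop words named in the question.
# The real list comes from the assignment's zip file.
SAMPLE_STOP_WORDS = {"the", "a", "and", "not"}


def tokenize(text):
    """Lowercase the text and return runs of letters, so 'People' and
    'people' match and trailing punctuation such as '?' is dropped."""
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())


def analyze(text, stop_words, top_n=20):
    """Return (unique word count, average unique-word length, top_n pairs)."""
    words = [w for w in tokenize(text) if w not in stop_words]
    counts = Counter(words)
    unique = len(counts)
    # Average over distinct words, not over every occurrence.
    avg_len = sum(len(w) for w in counts) / unique if unique else 0.0
    return unique, avg_len, counts.most_common(top_n)


def analyze_file(filename, stop_words, top_n=20):
    """Same analysis for a text file; the file name is supplied by the caller."""
    with open(filename, encoding="utf-8") as f:
        return analyze(f.read(), stop_words, top_n)


def main():
    s = "How are you?"
    print(s.split())    # ['How', 'are', 'you?']
    print(tokenize(s))  # ['how', 'are', 'you']

    excerpt = ("government of the people, by the people, "
               "for the people, shall not perish from the earth")
    unique, avg_len, top = analyze(excerpt, SAMPLE_STOP_WORDS, top_n=10)
    print("Total Unique Words:", unique)
    print("Average Length of Unique Words:", avg_len)
    for word, count in top:
        print(word, count)


if __name__ == "__main__":
    main()
```

Different forms of a word (nation, nations) are deliberately counted separately here, as the assignment allows. Because the stand-in stop word set is much smaller than the provided one, the numbers printed by this sketch will not match the sample output in the question.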

