Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

Create a TextAnalyzer class with the following methods: __init__() TextAnalyzer objects are instantiated by passing in one of the following to the src parameter: A

Create a TextAnalyzer class with the following methods:

__init__()

TextAnalyzer objects are instantiated by passing in one of the following to the src parameter:

A valid URL beginning with "http"

A path to a text file ending with the file extension "txt"

A string of text

The __init__() method also includes a src_type parameter, which is used to specify the type of the src argument. Options are:

discover (default) - You must write code to discover the type of src.

If the src begins with "http", it is a url.

If the src ends in "txt", it is a path.

Otherwise, it is text.

url

path

text

You should set self._src_type, self._content, and self._orig_content in the __init__() method.

set_content_to_tag(self, tag, tag_id=None)

Changes _content to the text within a specific element of an HTML document.

Keyword arguments:

tag (str) -- Tag to read

tag_id (str) -- ID of tag to read

It's possible the HTML does not contain the tag being searched. You should use exception handling to catch any errors.

reset_content(self)

Resets _content to full text. Useful after a call to set_content_to_tag().

_words(self, casesensitive=False):

Returns words in _content as list.

Keyword arguments:

casesensitive (bool) -- If False makes all words uppercase.

Hints

After splitting the text into words using the split() method, strip any leading and trailing punctuation using:

[word.strip(string.punctuation) for word in words]

common_words(self, minlen=1, maxlen=100, count=10, casesensitive=False)

Returns a list of 2-element tuples of the structure (word, num), where num is the number of times wordshows up in _content.

Keyword arguments:

minlen (int) - Minimum length of words to include.

maxlen (int) - Maximum length of words to include.

count (int) - Number of words to include.

casesensitive (bool) -- If False makes all words uppercase

char_distribution(self, casesensitive=False, letters_only=False)

Returns a list of 2-element tuples of the format (char, num), where num is the number of times charshows up in _content. The list should be sorted by num in descending order.

Keyword arguments:

casesensitive (bool) -- Consider case?

letters_only (bool) -- Exclude non-letters?

plot_common_words(self, minlen=1, maxlen=100, count=10, casesensitive=False)

Plots most common words.

Keyword arguments:

minlen (int) -- Minimum length of words to include.

maxlen (int) -- Maximum length of words to include.

count (int) -- Number of words to include.

casesensitive (bool) -- If False makes all words uppercase.

plot_char_distribution(self, casesensitive=False, letters_only=False)

Plots character distribution.

Keyword arguments:

casesensitive (bool) -- If False makes all words uppercase.

letters_only (bool) -- Exclude non-letters?

Properties

In addition, the class must include these properties:

avg_word_length(self)

The average word length in _content rounded to the 100th place (e.g, 3.82).

word_count(self)

The number of words in _content.

distinct_word_count(self)

The number of distinct words in _content.

words(self)

A list of all words used in _content, including repeats, in all uppercase letters.

positivity(self)

A positivity score calculated as follows:

Create local tally variable with initial value of 0.

Increment tally by 1 for every word in self.words found in positive.txt (in same directory)

Decrement tally by 1 for every word in self.words found in negative.txt (in same directory)

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image

Step: 3

blur-text-image

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

More Books

Students also viewed these Databases questions

Question

7. Senior management supports the career system.

Answered: 1 week ago