Question

Posted on Sep 24, 2024

Create a TextAnalyzer class with the following methods:

__init__()

TextAnalyzer objects are instantiated by passing in one of the following to the src parameter:

A valid URL beginning with "http"

A path to a text file ending with the file extension "txt"

A string of text

The __init__() method also includes a src_type parameter, which is used to specify the type of the src argument. Options are:

discover (default) - You must write code to discover the type of src.

If the src begins with "http", it is a url.

If the src ends in "txt", it is a path.

Otherwise, it is text.

url

path

text

You should set self._src_type, self._content, and self._orig_content in the __init__() method.
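The discovery rules above could be sketched as follows. This is a minimal illustration, not the required solution: the helper names discover_src_type, resolve_src_type, and load_content are hypothetical, and the choice of urllib and UTF-8 decoding is an assumption the assignment does not make.

```python
from urllib.request import urlopen


def discover_src_type(src):
    """Guess the source type using the three rules from the assignment."""
    if src.startswith("http"):
        return "url"
    if src.endswith("txt"):
        return "path"
    return "text"


def resolve_src_type(src, src_type="discover"):
    """Return the explicit src_type, or discover it when asked to."""
    if src_type == "discover":
        return discover_src_type(src)
    if src_type not in ("url", "path", "text"):
        raise ValueError(f"Unknown src_type: {src_type!r}")
    return src_type


def load_content(src, src_type):
    """Load the raw content for a resolved src_type (assumes UTF-8 text)."""
    if src_type == "url":
        with urlopen(src) as response:
            return response.read().decode("utf-8")
    if src_type == "path":
        with open(src, encoding="utf-8") as f:
            return f.read()
    return src  # plain text is used as-is
```

Inside __init__(), the result of resolve_src_type() would go to self._src_type, and the loaded content would be stored in both self._content and self._orig_content so that the original can be restored later.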

set_content_to_tag(self, tag, tag_id=None)

Changes _content to the text within a specific element of an HTML document.

Keyword arguments:

tag (str) -- Tag to read

tag_id (str) -- ID of tag to read

It's possible the HTML does not contain the tag being searched. You should use exception handling to catch any errors.
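One way to approach the tag lookup with exception handling is sketched below using only the standard library's html.parser. The assignment does not name an HTML library, so this choice is an assumption, and the names _TagTextExtractor, text_in_tag, and content_for_tag are hypothetical. Keeping the old content when the tag is missing is also an assumption.

```python
from html.parser import HTMLParser


class _TagTextExtractor(HTMLParser):
    """Collect the text inside the first element matching a tag (and id)."""

    def __init__(self, target_tag, target_id=None):
        super().__init__()
        self.target_tag = target_tag.lower()  # the parser lowercases tag names
        self.target_id = target_id
        self.depth = 0       # > 0 while inside the matched element
        self.found = False   # True once a matching element has been opened
        self.done = False    # True once the matched element has been closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth:
            if tag == self.target_tag:
                self.depth += 1  # nested element with the same name
        elif tag == self.target_tag and (
            self.target_id is None or dict(attrs).get("id") == self.target_id
        ):
            self.depth = 1
            self.found = True

    def handle_endtag(self, tag):
        if self.depth and tag == self.target_tag:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)


def text_in_tag(html, tag, tag_id=None):
    """Return the text of the first matching element, or raise LookupError."""
    parser = _TagTextExtractor(tag, tag_id)
    parser.feed(html)
    parser.close()
    if not parser.found:
        raise LookupError(f"No <{tag}> element found (id={tag_id!r})")
    return "".join(parser.parts)


def content_for_tag(current_content, html, tag, tag_id=None):
    """Return the tag's text, falling back to the current content on failure."""
    try:
        return text_in_tag(html, tag, tag_id)
    except LookupError:
        return current_content  # assumption: leave content unchanged
```

In the class, set_content_to_tag() would pass self._orig_content as the HTML and assign the result to self._content.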

reset_content(self)

Resets _content to full text. Useful after a call to set_content_to_tag().

_words(self, casesensitive=False)

Returns words in _content as list.

Keyword arguments:

casesensitive (bool) -- If False makes all words uppercase.

Hints

After splitting the text into words using the split() method, strip any leading and trailing punctuation using:

[word.strip(string.punctuation) for word in words]
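The hint can be turned into a small standalone function, sketched here. The name split_words is hypothetical, and dropping tokens that become empty after stripping (for example a lone "--") is an assumption that the assignment does not state.

```python
import string


def split_words(content, casesensitive=False):
    """Split content on whitespace and strip surrounding punctuation."""
    if not casesensitive:
        content = content.upper()
    words = [word.strip(string.punctuation) for word in content.split()]
    # Assumption: tokens made only of punctuation are discarded.
    return [word for word in words if word]
```

Note that strip() only removes punctuation at the ends of each token, so an apostrophe inside a word such as "don't" is preserved.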

common_words(self, minlen=1, maxlen=100, count=10, casesensitive=False)

Returns a list of 2-element tuples of the structure (word, num), where num is the number of times word shows up in _content.

Keyword arguments:

minlen (int) -- Minimum length of words to include.

maxlen (int) -- Maximum length of words to include.

count (int) -- Number of words to include.

casesensitive (bool) -- If False makes all words uppercase.
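A possible shape for the counting step is shown below, using collections.Counter. This sketch takes an already split list of words rather than self, and it treats minlen and maxlen as inclusive bounds, which is an assumption.

```python
from collections import Counter


def common_words(words, minlen=1, maxlen=100, count=10):
    """Return the `count` most frequent words whose length is in range."""
    # Assumption: both length bounds are inclusive.
    counts = Counter(w for w in words if minlen <= len(w) <= maxlen)
    return counts.most_common(count)
```

In the class, the words argument would come from self._words(casesensitive), so the case handling stays in one place.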

char_distribution(self, casesensitive=False, letters_only=False)

Returns a list of 2-element tuples of the format (char, num), where num is the number of times char shows up in _content. The list should be sorted by num in descending order.

Keyword arguments:

casesensitive (bool) -- Consider case?

letters_only (bool) -- Exclude non-letters?
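The character count can be sketched the same way. Here the function takes the content string directly. Uppercasing when casesensitive is False mirrors the word methods and is an assumption, as is using str.isalpha() to decide what counts as a letter.

```python
from collections import Counter


def char_distribution(content, casesensitive=False, letters_only=False):
    """Return (char, num) pairs sorted by num in descending order."""
    if not casesensitive:
        content = content.upper()  # assumption: fold case like the word methods
    if letters_only:
        content = "".join(ch for ch in content if ch.isalpha())
    # most_common() with no argument returns every pair, highest count first.
    return Counter(content).most_common()
```

Without letters_only, whitespace and punctuation are counted like any other character.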

plot_common_words(self, minlen=1, maxlen=100, count=10, casesensitive=False)

Plots most common words.

Keyword arguments:

minlen (int) -- Minimum length of words to include.

maxlen (int) -- Maximum length of words to include.

count (int) -- Number of words to include.

casesensitive (bool) -- If False makes all words uppercase.

plot_char_distribution(self, casesensitive=False, letters_only=False)

Plots character distribution.

Keyword arguments:

casesensitive (bool) -- If False makes all characters uppercase.

letters_only (bool) -- Exclude non-letters?
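Both plotting methods can share one helper that turns a list of (label, num) tuples into a bar chart. The assignment does not name a plotting library or chart type, so matplotlib and a bar chart are assumptions here, and the names bar_chart_data and plot_pairs are hypothetical.

```python
def bar_chart_data(pairs):
    """Split (label, num) tuples into parallel lists of labels and values."""
    labels = [str(label) for label, _ in pairs]
    values = [num for _, num in pairs]
    return labels, values


def plot_pairs(pairs, title):
    """Draw a bar chart of (label, num) tuples (assumes matplotlib is installed)."""
    import matplotlib.pyplot as plt  # third-party; imported only when plotting

    labels, values = bar_chart_data(pairs)
    plt.bar(labels, values)
    plt.title(title)
    plt.ylabel("Count")
    plt.tight_layout()
    plt.show()
```

plot_common_words() would pass the result of common_words() to such a helper, and plot_char_distribution() would pass the result of char_distribution().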

Properties

In addition, the class must include these properties:

avg_word_length(self)

The average word length in _content, rounded to the nearest hundredth (e.g., 3.82).

word_count(self)

The number of words in _content.

distinct_word_count(self)

The number of distinct words in _content.

words(self)

A list of all words used in _content, including repeats, in all uppercase letters.
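The word-based properties above might look like this when written with the @property decorator. The WordStats class is a hypothetical stand-in that is handed a word list directly, where the real class would derive it from _content. Returning 0.0 for empty input is an assumption.

```python
class WordStats:
    """Stand-in showing the word-based properties on a fixed word list."""

    def __init__(self, words):
        self._word_list = [word.upper() for word in words]

    @property
    def words(self):
        """All words, including repeats, in uppercase."""
        return list(self._word_list)

    @property
    def word_count(self):
        return len(self._word_list)

    @property
    def distinct_word_count(self):
        return len(set(self._word_list))

    @property
    def avg_word_length(self):
        if not self._word_list:
            return 0.0  # assumption: avoid dividing by zero on empty content
        total = sum(len(word) for word in self._word_list)
        return round(total / len(self._word_list), 2)
```

Because these are properties, they are read without parentheses, for example stats.word_count.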

positivity(self)

A positivity score calculated as follows:

Create a local tally variable with an initial value of 0.

Increment tally by 1 for every word in self.words found in positive.txt (in the same directory).

Decrement tally by 1 for every word in self.words found in negative.txt (in the same directory).
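The tally steps above can be sketched as follows. The function names are hypothetical, and the file format (one word per line) is an assumption, since the text does not describe positive.txt or negative.txt. Words from the files are uppercased so they can match self.words. The text shown here stops at the tally, so the sketch returns it unchanged.

```python
def load_word_set(path):
    """Read a word list file into a set (assumes one word per line)."""
    with open(path, encoding="utf-8") as f:
        return {line.strip().upper() for line in f if line.strip()}


def positivity_tally(words, positive_words, negative_words):
    """Add 1 per positive word and subtract 1 per negative word."""
    tally = 0
    for word in words:
        if word in positive_words:
            tally += 1
        if word in negative_words:
            tally -= 1
    return tally
```

Loading each file into a set once keeps every membership test fast, which matters when _content is a long document.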
