Question

Python Code:

In this section, you will verify a key statistical property of text: Zipf's Law.

Zipf's Law describes the relationship between a word's frequency rank and its frequency. For a word w, its frequency is inversely proportional to its rank:

$$\text{count}_w = K \cdot \frac{1}{\text{rank}_w}$$

K is a constant, specific to the corpus and how words are being defined.

What would this look like if you took the log of both sides of the equation?

  • Write your answer in one or two lines here.
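As a sketch of the algebra the question is asking about, taking the logarithm of both sides of the relation above (with any fixed log base) gives:

```latex
\begin{align*}
\text{count}_w &= K \cdot \frac{1}{\text{rank}_w} \\
\log(\text{count}_w) &= \log K - \log(\text{rank}_w)
\end{align*}
```

Read as a function of log(rank), this is a straight line with slope -1 and intercept log K.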

Therefore, if Zipf's Law holds, after sorting the words in descending order of frequency, word frequency decreases approximately linearly on a log-log scale.
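One way to check the linearity claim numerically is to fit a least-squares line to the log-log points and look at its slope. The sketch below is only an illustration: the helper `fit_slope`, the constant `K = 1000.0`, and the 50 synthetic ranks are all assumptions, not part of the assignment. It uses counts that follow Zipf's Law exactly, so the fitted slope should be very close to -1.

```python
import math


def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var = sum((a - mean_x) ** 2 for a in xs)
    return cov / var


K = 1000.0           # hypothetical corpus constant
ranks = range(1, 51)  # hypothetical ranks 1..50

# Idealized counts that obey count = K / rank exactly.
ideal_counts = [K / r for r in ranks]

log_ranks = [math.log(r) for r in ranks]
log_counts = [math.log(c) for c in ideal_counts]

slope = fit_slope(log_ranks, log_counts)
print(f"fitted slope: {slope:.6f}")
```

On real text the slope will only be roughly -1, and the fit is usually worst at the very top and bottom of the ranking.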

Now, please make such a log-log plot of rank versus frequency.

Hint: Make use of the sorted dictionary you just created. Use a scatter plot where the x-axis is log(rank) and the y-axis is log(frequency). You should get this information from word_counts; for example, you can take the individual word counts and sort them. The dict methods .items() and/or .values() may be useful. (Note that it doesn't really matter whether ranks start at 1 or 0 in terms of how the plot comes out.) You can check your results by comparing your plot to the ones on Wikipedia; they should look qualitatively similar.

Please remember to label the meaning of the x-axis and y-axis.

import math

import operator

import matplotlib.pyplot as plt

x = []

y = []

X_LABEL = "log(rank)"

Y_LABEL = "log(frequency)"

# implement me! you should fill the x and y arrays. Add your code here

# running this cell should produce your plot below

plt.scatter(x, y)

plt.xlabel(X_LABEL)

plt.ylabel(Y_LABEL)
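A minimal sketch of how the x and y lists might be filled, assuming word_counts is a dict mapping each word to its count. The sample sentence and the helper name `zipf_points` are hypothetical stand-ins for illustration; in the assignment, word_counts comes from the earlier cell. Ranks start at 1 here so that log(rank) is defined for the most frequent word.

```python
import math
from collections import Counter

# Hypothetical stand-in for the word_counts dict built earlier in the assignment.
sample_text = "the cat and the dog and the bird saw the cat"
word_counts = Counter(sample_text.split())


def zipf_points(counts):
    """Return (x, y): log(rank) and log(frequency), ranks starting at 1."""
    # Sort the raw counts from most to least frequent; the words themselves
    # are not needed for the plot.
    sorted_counts = sorted(counts.values(), reverse=True)
    x = [math.log(rank) for rank in range(1, len(sorted_counts) + 1)]
    y = [math.log(count) for count in sorted_counts]
    return x, y


x, y = zipf_points(word_counts)
```

The resulting x and y can then be passed to plt.scatter(x, y) as in the cell above. If ranks started at 0 instead, log(0) would raise an error, so a 0-based ranking would need log(rank + 1) or a similar shift.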
