
Question


We will use one full day's worth of tweets as our input (there is a total of 4.4M tweets in this file):
Execute and time the following tasks with 110,000 tweets and 550,000 tweets:
a. Write and execute a SQL query to find the average longitude and latitude value for each user ID. This query does not need the User table because User ID is a foreign key in the Tweet table. E.g., something like SELECT UserID, MIN(longitude), MAX(latitude) FROM Tweet, Geo WHERE Tweet.GeoFK = Geo.GeoID GROUP BY UserID;
b. Re-execute the SQL query in part 2-a 5 times and 20 times and measure the total runtime (just re-run the same exact query multiple times using a for-loop, it is as simple as it looks). Does the runtime scale linearly? (i.e., does it take 5X and 20X as much time?) What is the average runtime of each individual run?
c. Write the equivalent of the 2-a query in Python (without using SQL) by reading the tweets from the file with 550,000 tweets.
d. Re-execute the query in part 2-c 5 times and 20 times and measure the total runtime. Does the runtime scale linearly? What is the average runtime of each individual run?
e. Write the equivalent of the 2-a query in Python by using regular expressions instead of json.loads(). Do not use json.loads() here. Note that you only need to find the user ID and geo location (if any) for each tweet; you don't need to parse the whole thing.
f. Re-execute the query in part 2-e 5 times and 20 times and measure the total runtime. Does the runtime scale linearly?
g. Using matplotlib, create a visual for part 2-d showing the distribution of the runtimes.
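
The tasks above can be sketched on a tiny in-memory sample. This is only an illustrative outline under stated assumptions, not the expected solution. It assumes one JSON tweet per line in the classic Twitter layout ("user": {"id": ...} and "geo": {"type": "Point", "coordinates": [latitude, longitude]} or null), and the table columns beyond UserID, GeoFK, GeoID, longitude, and latitude are hypothetical. The names SAMPLE_LINES, parse_json, parse_regex, averages, build_db, and time_runs are invented for this sketch. The SQL uses AVG because the task asks for averages, whereas the example query in the question uses MIN and MAX.

```python
import json
import re
import sqlite3
import time
from collections import defaultdict

# Hypothetical sample data; the real input file has one tweet per line.
SAMPLE_LINES = [
    '{"id": 1, "user": {"id": 10}, "geo": {"type": "Point", "coordinates": [40.0, -74.0]}}',
    '{"id": 2, "user": {"id": 10}, "geo": {"type": "Point", "coordinates": [42.0, -70.0]}}',
    '{"id": 3, "user": {"id": 20}, "geo": null}',
    '{"id": 4, "user": {"id": 20}, "geo": {"type": "Point", "coordinates": [10.0, 20.0]}}',
]

# Assumed key order: "id" is the first key of "user", "type" the first key of "geo".
NUM = r"(-?\d+(?:\.\d+)?)"
USER_RE = re.compile(r'"user":\s*\{\s*"id":\s*(\d+)')
GEO_RE = re.compile(
    r'"geo":\s*\{\s*"type":\s*"Point",\s*"coordinates":\s*\[\s*' + NUM + r",\s*" + NUM + r"\s*\]"
)

SQL_QUERY = (
    "SELECT UserID, AVG(longitude), AVG(latitude) "
    "FROM Tweet, Geo WHERE Tweet.GeoFK = Geo.GeoID GROUP BY UserID"
)


def parse_json(lines):
    """Yield (user_id, latitude, longitude) for geotagged tweets via json.loads()."""
    for line in lines:
        tweet = json.loads(line)
        geo = tweet.get("geo")
        if geo:
            lat, lon = geo["coordinates"]
            yield tweet["user"]["id"], lat, lon


def parse_regex(lines):
    """Same output as parse_json, but using regular expressions only."""
    for line in lines:
        user = USER_RE.search(line)
        geo = GEO_RE.search(line)
        if user and geo:
            yield int(user.group(1)), float(geo.group(1)), float(geo.group(2))


def averages(rows):
    """Map each user ID to (average longitude, average latitude)."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for user_id, lat, lon in rows:
        acc = sums[user_id]
        acc[0] += lon
        acc[1] += lat
        acc[2] += 1
    return {user: (s[0] / s[2], s[1] / s[2]) for user, s in sums.items()}


def build_db(lines):
    """Load geotagged tweets into an in-memory SQLite database (simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Geo (GeoID INTEGER PRIMARY KEY, longitude REAL, latitude REAL)")
    conn.execute("CREATE TABLE Tweet (TweetID INTEGER PRIMARY KEY, UserID INTEGER, GeoFK INTEGER)")
    for key, (user_id, lat, lon) in enumerate(parse_json(lines), start=1):
        conn.execute("INSERT INTO Geo VALUES (?, ?, ?)", (key, lon, lat))
        conn.execute("INSERT INTO Tweet VALUES (?, ?, ?)", (key, user_id, key))
    return conn


def time_runs(func, repeats):
    """Run func `repeats` times; return (total seconds, average seconds per run)."""
    start = time.perf_counter()
    for _ in range(repeats):
        func()
    total = time.perf_counter() - start
    return total, total / repeats


if __name__ == "__main__":
    conn = build_db(SAMPLE_LINES)
    tasks = {
        "sql": lambda: conn.execute(SQL_QUERY).fetchall(),
        "json": lambda: averages(parse_json(SAMPLE_LINES)),
        "regex": lambda: averages(parse_regex(SAMPLE_LINES)),
    }
    for name, task in tasks.items():
        for repeats in (1, 5, 20):
            total, mean = time_runs(task, repeats)
            print(f"{name:5s} x{repeats:2d}: total={total:.6f}s mean={mean:.6f}s")
```

For the real runs, SAMPLE_LINES would be replaced by lines read from the 110,000- or 550,000-tweet file. To plot a distribution, one option is to time each run individually, collect the per-run times in a list, and pass that list to matplotlib's plt.hist.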

