Question

I need help with the following code. Below are the problem and the code.

We will use a full day's worth of tweets as input (the file contains a total of 4.4M tweets, but you only need to read 1M): http://rasinsrv07.cstcis.cti.depaul.edu/CSC455/OneDayOfTweets.txt

Repeat what you did in part b, but instead of saving the tweets to a file, populate the 3-table schema that you created in SQLite. Be sure to execute a commit and verify that the data has been loaded successfully (report the loaded row counts for each of the 3 tables). How long did this step take?
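The commit, row-count, and timing requirements in the problem statement can be illustrated on a tiny scale. This is only a sketch under stated assumptions: the table `demo_tweet`, the helper `load_and_report`, and the two sample rows are hypothetical and stand in for the real 3-table schema and the downloaded tweets.

```python
import sqlite3
import time

def load_and_report(conn, rows):
    """Insert rows into a demo table, commit, and return (row_count, seconds)."""
    start = time.time()
    cur = conn.cursor()
    # demo_tweet is a hypothetical stand-in for the assignment's tables
    cur.execute("CREATE TABLE IF NOT EXISTS demo_tweet (id_str TEXT, text TEXT)")
    cur.executemany("INSERT INTO demo_tweet (id_str, text) VALUES (?, ?)", rows)
    conn.commit()  # make the inserted rows permanent
    # Verify the load by counting the rows that are now in the table
    loaded = cur.execute("SELECT COUNT(*) FROM demo_tweet").fetchone()[0]
    return loaded, time.time() - start

conn = sqlite3.connect(":memory:")  # in-memory database, nothing is written to disk
loaded, seconds = load_and_report(conn, [("1", "hello"), ("2", "world")])
print("rows loaded:", loaded, "in", round(seconds, 3), "seconds")
```

The same pattern (start a timer, insert, commit, then run SELECT COUNT(*) on each table) applies to the three real tables.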

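import urllib.request, time, json, sqlite3

conn = sqlite3.connect('Tweets_Database_final.db')
c = conn.cursor()

# Drop the three tables if they already exist so the script can be rerun
c.execute('DROP TABLE IF EXISTS Geo')
c.execute('DROP TABLE IF EXISTS tweet')
c.execute('DROP TABLE IF EXISTS user')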

# Create Table tweet
c.execute('''CREATE TABLE tweet
(
    created_at DATETIME,
    user_id INT,
    id_str TEXT,
    text TEXT,
    source TEXT,
    in_reply_to_user_id INT,
    in_reply_to_screen_name TEXT,
    in_reply_to_status_id INT,
    retweet_count INT,
    contributors TEXT,
    CONSTRAINT tweet_FK
        FOREIGN KEY (user_id) REFERENCES user(id)
);''')

# Create Table user
c.execute('''CREATE TABLE user
(
    id INT,
    screen_name TEXT,
    description TEXT,
    friends_count INT,
    contributors TEXT,
    CONSTRAINT USER_PK
        PRIMARY KEY (id)
);''')

# Create Table Geo
# Coordinates are decimal degrees, so they are stored as REAL rather than INT.
# A foreign key has to point at a PRIMARY KEY or UNIQUE column,
# so it references user(id) instead of tweet(user_id).
c.execute('''CREATE TABLE Geo
(
    user_id INT,
    type TEXT,
    longitude REAL,
    latitude REAL,
    CONSTRAINT tweet_FK_2
        FOREIGN KEY (user_id) REFERENCES user(id)
);''')

# Open the remote file (assumption: one UTF-8 encoded JSON tweet per line)
webFD = urllib.request.urlopen('http://rasinsrv07.cstcis.cti.depaul.edu/CSC455/OneDayOfTweets.txt')
start = time.time()
errors = open("final_errors.txt", "w", encoding="utf8")
datatweet = []
datauser = []
datageo = []
count = 0

def clean(value):
    # Store empty values as NULL; serialize lists/dicts so SQLite can bind them
    if value in ('', [], 'null'):
        return None
    if isinstance(value, (list, dict)):
        return json.dumps(value)
    return value

tweet_keys = ["id_str", "text", "source", "in_reply_to_user_id", "in_reply_to_screen_name", "in_reply_to_status_id", "retweet_count", "contributors"]
user_keys = ["id", "screen_name", "description", "friends_count"]

for i in range(500000):  # the assignment mentions 1M tweets; adjust as needed
    line = webFD.readline().decode('utf8', errors='replace')
    if not line:
        break  # reached the end of the file
    try:
        tweetrecord = json.loads(line)
        user = tweetrecord["user"]  # the user fields are nested under "user"
        tweet_row = (tweetrecord["created_at"], user["id"]) + tuple(clean(tweetrecord[key]) for key in tweet_keys)
        user_row = tuple(clean(user[key]) for key in user_keys)
        geo = tweetrecord.get("geo")
        # Only append once every row has been built, so a bad tweet adds nothing
        datatweet.append(tweet_row)
        datauser.append(user_row)
        if geo:
            # Assumption: "geo" holds {"type": ..., "coordinates": [latitude, longitude]}
            datageo.append((user["id"], geo["type"], geo["coordinates"][1], geo["coordinates"][0]))
        count += 1
    except (ValueError, KeyError):
        # Log lines that are not valid JSON or lack the expected fields
        errors.write(line)

# Insert everything in bulk; OR IGNORE skips users that were already inserted
c.executemany('INSERT OR IGNORE INTO user (id, screen_name, description, friends_count) VALUES (?,?,?,?)', datauser)
c.executemany('INSERT INTO tweet (created_at, user_id, id_str, text, source, in_reply_to_user_id, in_reply_to_screen_name, in_reply_to_status_id, retweet_count, contributors) VALUES (?,?,?,?,?,?,?,?,?,?)', datatweet)
c.executemany('INSERT INTO Geo (user_id, type, longitude, latitude) VALUES (?,?,?,?)', datageo)
conn.commit()
errors.close()
end = time.time()

print("tweets loaded to database: ", count)

# Report the loaded row count for each of the 3 tables
for table in ('tweet', 'user', 'Geo'):
    print(table, "rows:", c.execute('SELECT COUNT(*) FROM ' + table).fetchone()[0])

print("Difference is ", round((end-start),3), 'seconds')
print("Performance : ", round(count/(end-start), 3), ' operations per second ')
print("Data saved to Tweets_Database_final.db")
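Splitting one tweet into rows for the three tables is the core of the loading step, so here is a minimal, self-contained sketch of it. It rests on assumptions the question does not confirm: that user fields sit in a nested "user" object, and that "geo" (when present) carries "coordinates" in [latitude, longitude] order. The helper `split_tweet` and the sample record are hypothetical.

```python
import json

def split_tweet(line):
    """Return (tweet_row, user_row, geo_row) for one JSON line; geo_row may be None."""
    record = json.loads(line)
    user = record["user"]  # assumed: user fields are nested under "user"
    tweet_row = (record["created_at"], user["id"], record["id_str"], record["text"])
    user_row = (user["id"], user["screen_name"],
                user.get("description"), user.get("friends_count"))
    geo = record.get("geo")
    geo_row = None
    if geo:
        lat, lon = geo["coordinates"]  # assumed order: [latitude, longitude]
        geo_row = (user["id"], geo["type"], lon, lat)
    return tweet_row, user_row, geo_row

# Hypothetical sample record, shaped like the assumptions above
sample = json.dumps({
    "created_at": "Mon May 05 00:00:00 +0000 2014",
    "id_str": "1001",
    "text": "hello",
    "user": {"id": 42, "screen_name": "demo_user", "friends_count": 7},
    "geo": {"type": "Point", "coordinates": [41.88, -87.63]},
})
tweet_row, user_row, geo_row = split_tweet(sample)
print(tweet_row)
print(user_row)
print(geo_row)
```

Building all three rows before appending any of them keeps the lists consistent: a tweet that fails halfway contributes no partial rows.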
