Question
MySQL
Given below is the schema for the data. There are a total of 12 tables, thus 12 CSV files, each corresponding to a relational table.
user (email, password, name, date_of_birth, address, type), primary key(email)
celebrity (email, website, kind), primary key(email)
blurt (blurtid, email, text, location, time), primary key(blurtid, email), foreign key(email) references user(email)
hobby (email, hobby), primary key(email, hobby), foreign key(email) references user(email)
follow (follower, followee), primary key(follower, followee), foreign key(follower) references user(email), foreign key(followee) references user(email)
vendor (id, name), primary key(id)
vendor_ambassador (vendorid, email), primary key(vendorid), foreign key(email) references user(email), foreign key(vendorid) references vendor(id)
topic (id, description), primary key(id)
vendor_topics (vendorid, topicid), primary key(vendorid, topicid), foreign key(vendorid) references vendor(id), foreign key(topicid) references topic(id)
blurt_analysis (email, blurtid, topicid, confidence, sentiment), primary key(email, blurtid, topicid), foreign key(email, blurtid) references blurt(email, blurtid), foreign key(topicid) references topic(id), constraint confidence >= 0 and confidence <= 10 and sentiment >= -5 and sentiment <= 5
advertisement (id, content, vendorid), primary key(id), foreign key(vendorid) references vendor(id)
user_ad (email, adid), primary key(email, adid), foreign key(email) references user(email), foreign key(adid) references advertisement(id)
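As a minimal sketch of how part of this schema could be declared, the following creates four of the twelve tables (user, blurt, topic, blurt_analysis) with their composite keys. It runs against an in-memory SQLite database through Python's sqlite3 module purely so that it is self-contained; that choice, the column types, and the sample rows are assumptions made for illustration and are not part of the assignment. In MySQL the same DDL would use types such as VARCHAR, INT and DATETIME. The CHECK clauses are included only to show the value constraint; the assignment itself says the constraint does not need to be implemented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked to

conn.executescript("""
CREATE TABLE user (
    email TEXT PRIMARY KEY, password TEXT, name TEXT,
    date_of_birth TEXT, address TEXT, type TEXT
);
CREATE TABLE blurt (
    blurtid INTEGER, email TEXT, text TEXT, location TEXT, time TEXT,
    PRIMARY KEY (blurtid, email),              -- blurt ids are unique per user only
    FOREIGN KEY (email) REFERENCES user (email)
);
CREATE TABLE topic (id INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE blurt_analysis (
    email TEXT, blurtid INTEGER, topicid INTEGER,
    confidence INTEGER CHECK (confidence BETWEEN 0 AND 10),
    sentiment INTEGER CHECK (sentiment BETWEEN -5 AND 5),
    PRIMARY KEY (email, blurtid, topicid),
    FOREIGN KEY (email, blurtid) REFERENCES blurt (email, blurtid),
    FOREIGN KEY (topicid) REFERENCES topic (id)
);
""")

# Hypothetical sample rows, invented for this sketch only.
conn.execute("INSERT INTO user (email, name) VALUES ('a@x.org', 'Ann')")
conn.execute("INSERT INTO blurt (blurtid, email, text) VALUES (1, 'a@x.org', 'hello')")
conn.execute("INSERT INTO topic VALUES (1, 'weather')")
conn.execute("INSERT INTO topic VALUES (2, 'music')")


def try_insert(row):
    """Return True if blurt_analysis accepts the row, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO blurt_analysis VALUES (?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


ok_row = try_insert(("a@x.org", 1, 1, 9, -5))        # valid blurt, topic and ranges
bad_sentiment = try_insert(("a@x.org", 1, 2, 9, 7))  # sentiment 7 is outside -5..5
orphan_row = try_insert(("a@x.org", 99, 2, 5, 0))    # no blurt 99 exists for this user
print(ok_row, bad_sentiment, orphan_row)
```

The composite foreign key on (email, blurtid) is what ties an analysis row to one specific blurt, since a blurt id alone does not identify a blurt.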
The design model:
Users can post their thoughts in the form of short messages that we call blurts. When signing up, users need to provide their email and a password of their choice. In addition, they need to enter some basic information: name, date of birth, address, email ID and hobbies. Once signed up, they can (besides blurting) follow other users. To follow a user means subscribing to his/her blurts. Users are categorized into regular users and celebrities. A celebrity has an associated website URL and an attribute called kind indicating whether the celebrity is a politician, actor, singer, etc.

Each blurt by a user (regular or celebrity) is assigned an id. Blurt ids are serial and unique to the blurts by a given user; the first blurt by a given user has blurt id 1, and ids are incremented for each successive blurt by that user. Note that blurt ids are unique only to a user, so blurts by two different users may have the same blurt id. Besides an id, each blurt also has its text, timestamp, and user location as additional attributes.

The system should have a pre-defined notion of topics, which are simply subjects that people may blurt about. Examples of topics might include music, pollution, disease, disaster, sports, weather, etc. A topic has a unique id and a description (the name of the topic). Each blurt by a user is analyzed to associate it with zero or more topics. Related blurt-topic pairs are stored in the blurt_analysis table. To account for the possible ambiguity arising from the choice of words or language used by a user, an association with a topic has a corresponding confidence level (an integer ranging from 1 to 10 indicating the strength of the association). For example, the blurt "I absolutely hate the rainy weather, can't go out, listening to the Beatles, just love them" is analyzed to be associated with two topics, weather and music (Beatles).
For each topic, the associated sentiment is evaluated and quantified as an integral value ranging between -5 and 5, with higher values indicating a more positive sentiment. Considering the example blurt used above, the topic weather would have an associated sentiment of -5 (hate) while for music the corresponding value is 4 (love). Note: You don't need to implement the value constraint as MySQL doesn't support it.

A vendor has interest in one or more topics and is interested in tracking all users who are blurting about a topic of interest. A vendor may also have a celebrity as its brand ambassador. Vendors create advertisements that have an associated unique id and a textual content. These advertisements are stored in the system and are available to be shown to the regular set of users (that is, not to the celebrities, just to the other regular users). Careful matching is done based upon a historical analysis of all blurts by a user. Based upon the analysis, a user may be shown zero or more advertisements.

STEP 4 - Import CSV files
Script Template:
    LOAD DATA LOCAL INFILE "[CSV file name]" INTO TABLE [table name] COLUMNS TERMINATED BY ',' LINES TERMINATED BY '\n'
For each CSV file, replace [CSV file name] and [table name] with the actual CSV file name and the corresponding table name, e.g.:
    LOAD DATA LOCAL INFILE "d:\\csvdata\\advertisement.csv" INTO TABLE advertisement COLUMNS TERMINATED BY ',' LINES TERMINATED BY '\n'
Execute 12 scripts using the GUI client.

STEP 5 - Form SQL Queries
For the following statements, you are required to form SQL queries and execute them using the GUI client. Then export each result using the name "Query x.csv", x being the label of each query. Put all the SQL you formed into a file named "Script.txt" in the same order. Then archive the file as "mp-xxxxxxxx.zip", xxxxxxxx being your student id, and turn it in on eee dropbox under folder mini project. The filenames of your results have to follow the instructions exactly or you may get a deduction in your credit.

1. For each topic, find the total number of blurts that were analyzed as being related to the topic. Order the result by topic id. Your SQL query should print the topic id, topic description and the corresponding count.
2. For each celebrity user, find the total number of followers. Your SQL query should print the name of the celebrity and the corresponding number of followers.
3. For each celebrity, find the number of blurts. Order the result in decreasing order of the number of blurts. Your query should print the name of the celebrity and the associated count in decreasing order of the count.
4. Write an SQL query to print names of all celebrities who are not following anyone!
5. Write an SQL query that gives, for each vendor, the email of its brand ambassador and the number of users who are following the brand ambassador. Your SQL query should print the vendor name, the ambassador's email and the total number of users who are following the ambassador.
6. Let us define the term "advertisement-gap" as the number of users who have blurted about a topic that is of interest to a vendor but are not being shown any advertisements from the vendor. Write an SQL query that gives the vendor name and the corresponding "advertisement-gap" in decreasing order of the advertisement-gap.
7. Write an SQL query to find all pairs of users (A,B) such that both A and B have blurted on a common topic but A is not following B. Your query should print the names of A and B in that order.
8. You need to help users connect with other users. There could be three different users A, B and C such that A follows B, B follows C but A does not follow C. Write an SQL query to find all such triplets of A, B and C. Your query should print the emails of users A, B and C in that order.
9. For each topic, find the states (e.g., California) where the average sentiment associated with the blurts related to the topic is negative. Your query should print the topic id, topic name, state, total number of blurts and average sentiment for each topic.
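To illustrate the query patterns these statements call for, here is a hedged sketch of three of them: query 1 (join with aggregation), query 4 (anti-join with NOT EXISTS) and query 8 (self-join on follow). It runs on an in-memory SQLite database through Python's sqlite3 module so that it is self-contained; the trimmed-down tables and the sample rows are invented for this example and are not the assignment's CSV data. The SQL uses only standard constructs and should carry over to MySQL, but treat it as one possible formulation rather than the expected answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (email TEXT PRIMARY KEY, name TEXT, type TEXT);
CREATE TABLE celebrity (email TEXT PRIMARY KEY, website TEXT, kind TEXT);
CREATE TABLE follow (follower TEXT, followee TEXT, PRIMARY KEY (follower, followee));
CREATE TABLE topic (id INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE blurt_analysis (email TEXT, blurtid INTEGER, topicid INTEGER,
    confidence INTEGER, sentiment INTEGER, PRIMARY KEY (email, blurtid, topicid));

-- Hypothetical sample rows, invented for this sketch only.
INSERT INTO user VALUES ('a@x.org', 'Ann', 'R'), ('b@x.org', 'Bob', 'C'), ('c@x.org', 'Cy', 'C');
INSERT INTO celebrity VALUES ('b@x.org', NULL, 'singer'), ('c@x.org', NULL, 'actor');
INSERT INTO follow VALUES ('a@x.org', 'b@x.org'), ('b@x.org', 'c@x.org');
INSERT INTO topic VALUES (1, 'music'), (2, 'weather');
INSERT INTO blurt_analysis VALUES
    ('a@x.org', 1, 1, 8, 4), ('a@x.org', 1, 2, 9, -5), ('b@x.org', 1, 1, 7, 3);
""")

# Query 1: blurts per topic. The LEFT JOIN keeps topics that have no blurts (count 0).
topic_counts = conn.execute("""
    SELECT t.id, t.description, COUNT(ba.blurtid) AS blurt_count
    FROM topic t
    LEFT JOIN blurt_analysis ba ON ba.topicid = t.id
    GROUP BY t.id, t.description
    ORDER BY t.id
""").fetchall()

# Query 4: celebrities who never appear as a follower.
idle_celebrities = [row[0] for row in conn.execute("""
    SELECT u.name
    FROM celebrity c
    JOIN user u ON u.email = c.email
    WHERE NOT EXISTS (SELECT 1 FROM follow f WHERE f.follower = c.email)
    ORDER BY u.name
""")]

# Query 8: A follows B, B follows C, A does not follow C.
# Assumes nobody follows themselves, so only A <> C needs an explicit check.
triplets = conn.execute("""
    SELECT ab.follower, ab.followee, bc.followee
    FROM follow ab
    JOIN follow bc ON bc.follower = ab.followee
    WHERE ab.follower <> bc.followee
      AND NOT EXISTS (SELECT 1 FROM follow ac
                      WHERE ac.follower = ab.follower
                        AND ac.followee = bc.followee)
""").fetchall()

print(topic_counts)      # [(1, 'music', 2), (2, 'weather', 1)]
print(idle_celebrities)  # ['Cy']
print(triplets)          # [('a@x.org', 'b@x.org', 'c@x.org')]
```

Queries 2, 3 and 5 follow the same join-and-count shape as query 1, while queries 6 and 7 combine a join on blurt_analysis with the NOT EXISTS pattern used in queries 4 and 8.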