Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

Consider the following relational schema. Actors( aID string, fName string, lName string, gender string) Movies( mID integer, name string, mYear year) Directors ( dID integer,

Consider the following relational schema.

Actors(aID string, fName string, lName string, gender string)

Movies(mID integer, name string, mYear year)

Directors (dID integer, fName string, lName string)

Casts (aID string, mID integer, role string)

MovieDirectors (dID integer, mID integer)

Genres (mID integer, genre string)

The primary key fields are underlined, and the domain of each field is listed after the field name. Thus aID is the key for Actors, mID is the key for Movies, dID is the key for Directors, (aID, mID, role) forms the key for Casts, (dID, mID) forms the key for MovieDirectors, and (mID,genre)forms the key for Genres.

These tables have been created using MySQL server. Download IndexDB.sql to your computer and run it on MySQL. You will need to import the data from the movies database, which is a set of plain-text data files. Each file contains all the rows for the named table, one row per line.

In this assignment, you have to import the Movies dataset into MySQL, to analyze the workload of each SQL statement, and to create indexes on the data to make the queries run fast. Turn in your Word file of workload analysis and your SQL index statements to the assignment link.

Task 1: Importing the MoviesDB database. (10 points)

Task 2: Analyzing the workload of each of the following SQL statements. (50 Points)

Queries

SQL Statements

Access Table

Output Column

Join Conditions

Search Conditions

Q1

Find the first and last names of all the actors who played in any comedy movie.

SELECT fName, lName

FROM Actors A

WHERE A.aID IN

(SELECT C.aID

FROM Casts C

WHERE C.mID IN

(SELECT M.mID

FROM Movies M

WHERE M.mID IN

(SELECT G.mID

FROM Genres G

WHERE G.genre = 'Action')));

Q2

Find the first and last names of all the directors who directed an 'Drama' movie in 2005

SELECT fName, lName

FROM Directors D JOIN MovieDirectors MD ON (D.dID = MD.dID)

JOIN Movies M ON (M.mID = MD.mID)

JOIN Genres G ON (M.mID = G.mID)

WHERE genre ='Drama' AND mYear = '2005';

Q3

Find the first and last names of all the actors who casted as Librarian in a film before 1980, and also in a film after 2000. (That is: < 1980 and > 2000.)

SELECT fName, lName

FROM Actors A

WHERE A.aID IN (SELECT C.aID

FROM CASTS C

WHERE role ='Librarian' AND C.mID IN (SELECT M.mID

FROM Movies M, Movies M2

WHERE M.mYear<'1980' AND M2.mYear>'2000'));

Q4

Find the first and last name of all directors who directed 500 movies or more, in ascending order of the number of movies they directed. Return the directors' names and the number of movies each of them directed.

SELECT fName, lName, NoOfMovies

FROM Directors D, (

SELECT dID, COUNT(*) NoOfMovies

FROM MovieDirectors

GROUP BY dID

HAVING COUNT(*) >=500) AS T

WHERE D.dID = T.dID

ORDER BY NoOfMovies;

Q5

Find the first and last names of actors that played three or more roles in the same movie since the year 2010.

SELECT fName, lName

FROM Actors A

WHERE A.aID IN (SELECT C.aID

FROM CASTS C JOIN Movies M ON ( M.mID = C.mID)

WHERE M.mYear >'2010'

GROUP BY aID, M.mID

HAVING COUNT(*) >=3);

Once you have imported the movie database, your main task is to analyze each of the above SQL statements. You may find that even simple queries will take a long time to execute. This is because the original database lacks indexes on frequently accessed attributes of its tables yet. Record the execute time of each query before and after adding indexes. Your goal is to choose up to about 12 indexes, such that once the indexes are created, each query should be executed under 1 minute on a modern computer. Be sure to create only indexes that matter. Put your answers in the following table.

Frequency

Fraction

Estimated Performance

Execute Time (before indexing)

Execute Time (after indexing)

Chosen Index Type

Justification

Q1

300

30%

<0.0015 Sec.

Q2

200

20%

< 0.0015 Sec.

Q3

100

10%

< 0.0015 Sec.

Q4

150

15%

< 0.0015 Sec.

Q5

250

25%

< 0.0015 Sec.

Task 3: Writing SQL statements to create appropriate indexes for each of the above queries (40 Points).

Place a comment for each CREATE INDEX statement to denote the reason for choosing such index (e.g., in order to speed up the selection in question Q1 or in order to speed up the join in query 2). Save all of your code for Task 3 into a single file called creatingIndexYourTeamNumber.sql.

------------------------------------------------------------------------------------------------------

index

DROP DATABASE IF EXISTS IndexDB; CREATE DATABASE IndexDB; USE IndexDB;

CREATE TABLE Actors( aID VARCHAR(10), fName VARCHAR(50), lName VARCHAR(50), gender CHAR(1), PRIMARY KEY(aID));

CREATE TABLE Movies( mID VARCHAR(10), name VARCHAR(200), mYear CHAR(4) NULL, PRIMARY KEY(mID));

CREATE TABLE Directors( dID VARCHAR(10), fName VARCHAR(100), lName VARCHAR(100), PRIMARY KEY(dID));

CREATE TABLE Casts( aID VARCHAR(10), mID VARCHAR(10), role VARCHAR(40), PRIMARY KEY(aID, mID, role));

#FOREIGN KEY(aID) REFERENCES Actors(aID), #FOREIGN KEY(mID) REFERENCES Movies(mID));

CREATE TABLE MovieDirectors( dID VARCHAR(10), mID VARCHAR(10), PRIMARY KEY(dID, mID));

CREATE TABLE Genres( mID VARCHAR(10), genre VARCHAR(40), PRIMARY KEY(mID, genre), FOREIGN KEY(mID) REFERENCES Movies(mID));

LOAD DATA INFILE '/MoviesDB/actors.txt' into table Actors FIELDS TERMINATED BY '|' ENCLOSED BY '"' LINES TERMINATED BY ' ' (aID , fName , lName , gender );

LOAD DATA INFILE '/MoviesDB/movies.txt' into table Movies FIELDS TERMINATED BY '|' ENCLOSED BY '"' LINES TERMINATED BY ' ' (mID, name, mYear );

LOAD DATA INFILE '/MoviesDB/directors.txt' into table Directors FIELDS TERMINATED BY '|' ENCLOSED BY '"' LINES TERMINATED BY ' ' (dID, fname, lname );

LOAD DATA INFILE '/MoviesDB/movieDirectors.txt' into table MovieDirectors FIELDS TERMINATED BY '|' ENCLOSED BY '"' LINES TERMINATED BY ' ' (dID, mID );

LOAD DATA INFILE '/MoviesDB/genre.txt' into table Genres FIELDS TERMINATED BY '|' ENCLOSED BY '"' LINES TERMINATED BY ' ' (mID, genre );

LOAD DATA INFILE '/MoviesDB/casts.txt' into table Casts FIELDS TERMINATED BY '|' ENCLOSED BY '"' LINES TERMINATED BY ' ' (aID, mID , role );

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image_2

Step: 3

blur-text-image_3

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Visualizing Health And Healthcare Data Creating Clear And Compelling Visualizations To See How Youre Doing

Authors: Katherine Rowell ,Lindsay Betzendahl ,Cambria Brown

1st Edition

1119680883, 978-1119680888

More Books

Students also viewed these Databases questions

Question

What is the major role of rivers in the erosion process?

Answered: 1 week ago

Question

=+25-4 Identify the stimulants, and describe their effects.

Answered: 1 week ago

Question

6. Does your speech have a clear and logical structure?

Answered: 1 week ago