Question
CST 325 Homework 4 Due Date: Oct. 29, 2018 at midnight Note: Please type your solution. No late assignments will be accepted. Make sure your
CST 325
Homework 4
Due Date: Oct. 29, 2018 at midnight
Note: Please type your solution. No late assignments will be accepted. Make sure your SQL script file can be run successfully in MySQL before submitting it because I will run and grade it. If it cant run, your grade for this assignment will be a ZERO.
Objective: this assignment is designed to let you have a hands-on experience with creating indexes to tune the Physical database
Instruction: You should work with a team for this assignment. Create a new Word document and save it as HW3Answers_xx (where xx is your team number). Write every members full name and their participation on the first page of the Word document as follows.
Participant | Complete the Assignment before the Meeting (Y/N) | Percentage of Contribution | Justification |
Consider the following relational schema.
Actors(aID string, fName string, lName string, gender string)
Movies(mID integer, name string, mYear year)
Directors (dID integer, fName string, lName string)
Casts (aID string, mID integer, role string)
MovieDirectors (dID integer, mID integer)
Genres (mID integer, genre string)
The primary key fields are underlined, and the domain of each field is listed after the field name. Thus aID is the key for Actors, mID is the key for Movies, dID is the key for Directors, (aID, mID, role) forms the key for Casts, (dID, mID) forms the key for MovieDirectors, and (mID,genre)forms the key for Genres.
These tables have been created using MySQL server. Download IndexDB.sql to your computer and run it on MySQL. You will need to import the data from the movies database, which is a set of plain-text data files. Each file contains all the rows for the named table, one row per line.
In this assignment, you have to import the Movies dataset into MySQL, to analyze the workload of each SQL statement, and to create indexes on the data to make the queries run fast. Turn in your Word file of workload analysis and your SQL index statements to the assignment link.
Task 1: Importing the MoviesDB database. (10 points)
Task 2: Analyzing the workload of each of the following SQL statements. (50 Points)
Queries | SQL Statements | Access Table | Output Column | Join Conditions | Search Conditions | |
Q1 | Find the first and last names of all the actors who played in any comedy movie. | SELECT fName, lName FROM Actors A WHERE A.aID IN (SELECT C.aID FROM Casts C WHERE C.mID IN (SELECT M.mID FROM Movies M WHERE M.mID IN (SELECT G.mID FROM Genres G WHERE G.genre = 'Action')));
|
|
|
|
|
Q2 | Find the first and last names of all the directors who directed an 'Drama' movie in 2005 | SELECT fName, lName FROM Directors D JOIN MovieDirectors MD ON (D.dID = MD.dID) JOIN Movies M ON (M.mID = MD.mID) JOIN Genres G ON (M.mID = G.mID) WHERE genre ='Drama' AND mYear = '2005'; |
|
|
|
|
Q3 | Find the first and last names of all the actors who casted as Librarian in a film before 1980, and also in a film after 2000. (That is: < 1980 and > 2000.) | SELECT fName, lName FROM Actors A WHERE A.aID IN (SELECT C.aID FROM CASTS C WHERE role ='Librarian' AND C.mID IN (SELECT M.mID FROM Movies M, Movies M2 WHERE M.mYear<'1980' AND M2.mYear>'2000')); |
|
|
|
|
Q4 | Find the first and last name of all directors who directed 500 movies or more, in ascending order of the number of movies they directed. Return the directors' names and the number of movies each of them directed. | SELECT fName, lName, NoOfMovies FROM Directors D, ( SELECT dID, COUNT(*) NoOfMovies FROM MovieDirectors GROUP BY dID HAVING COUNT(*) >=500) AS T WHERE D.dID = T.dID ORDER BY NoOfMovies; |
|
|
|
|
Q5 | Find the first and last names of actors that played three or more roles in the same movie since the year 2010. | SELECT fName, lName FROM Actors A WHERE A.aID IN (SELECT C.aID FROM CASTS C JOIN Movies M ON ( M.mID = C.mID) WHERE M.mYear >'2010' GROUP BY aID, M.mID HAVING COUNT(*) >=3); |
|
|
|
|
Once you have imported the movie database, your main task is to analyze each of the above SQL statements. You may find that even simple queries will take a long time to execute. This is because the original database lacks indexes on frequently accessed attributes of its tables yet. Record the execute time of each query before and after adding indexes. Your goal is to choose up to about 12 indexes, such that once the indexes are created, each query should be executed under 1 minute on a modern computer. Be sure to create only indexes that matter. Put your answers in the following table.
Frequency | Fraction | Estimated Performance | Execute Time (before indexing) | Execute Time (after indexing) | Chosen Index Type | Justification | |
Q1 | 300 | 30% | <0.0015 Sec. |
| |||
Q2 | 200 | 20% | < 0.0015 Sec. |
| |||
Q3 | 100 | 10% | < 0.0015 Sec. |
| |||
Q4 | 150 | 15% | < 0.0015 Sec. |
| |||
Q5 | 250 | 25% | < 0.0015 Sec. |
Task 3: Writing SQL statements to create appropriate indexes for each of the above queries (40 Points).
Place a comment for each CREATE INDEX statement to denote the reason for choosing such index (e.g., in order to speed up the selection in question Q1 or in order to speed up the join in query 2). Save all of your code for Task 3 into a single file called creatingIndexYourLastName.sql.
https://blackboard.ncat.edu/bbcswebdav/pid-3235278-dt-content-rid-56295967_2/courses/CST_325001.201910/actors.txt
https://blackboard.ncat.edu/bbcswebdav/pid-3235278-dt-content-rid-56295968_2/courses/CST_325001.201910/casts.txt
https://blackboard.ncat.edu/bbcswebdav/pid-3235278-dt-content-rid-56295969_2/courses/CST_325001.201910/directors.txt
https://blackboard.ncat.edu/bbcswebdav/pid-3235278-dt-content-rid-56295970_2/courses/CST_325001.201910/genre.txt
https://blackboard.ncat.edu/bbcswebdav/pid-3235278-dt-content-rid-56295972_2/courses/CST_325001.201910/movieDirectors.txt
https://blackboard.ncat.edu/bbcswebdav/pid-3235278-dt-content-rid-56295973_2/courses/CST_325001.201910/movies.txt
(((https://blackboard.ncat.edu/webapps/blackboard/content/listContent.jsp?content_id=_3102495_1&course_id=_3532571_1)))
Step by Step Solution
There are 3 Steps involved in it
Step: 1
Get Instant Access to Expert-Tailored Solutions
See step-by-step solutions with expert insights and AI powered tools for academic success
Step: 2
Step: 3
Ace Your Homework with AI
Get the answers you need in no time with our AI-driven, step-by-step assistance
Get Started