Question

Homework Overview
Vast amounts of digital data are generated each day, but raw data are often not immediately usable. Instead,
we are interested in the information content of the data: what patterns are captured? This assignment covers
a few useful tools for acquiring, cleaning, storing, and visualizing datasets.
Why are specific versions of software used in homework assignments? Using specific versions of
software in homework assignments enables us to grade and provide immediate feedback to the large number
of students in the course (1000+ OMS students, 250+ Atlanta students). Autograders are used to grade
students' code submissions, and to ensure that these autograders can grade all submissions, we need to
know the specific versions of software that students use. Different versions of software can
have different features, and fixing the versions lets the autograders detect potential errors that may occur
in different libraries and provide students with appropriate feedback to resolve them. Continuously updating
assignments to keep up with the latest versions of technology is a significant undertaking, so we carefully
select which aspects of our autograders to update, to balance the workload for our course staff and provide
a positive learning experience for students. As a result, you may see that certain assignment questions require
the use of "older" versions of software or specific libraries.
Q1 [40 points] Collect data from TMDb to build a co-actor network
Goal: Collect data using an API for The Movie Database (TMDb). Construct a graph
representation of this data that shows which actors have acted together in various
movies. We use the words graph and network interchangeably.
Technology: Python 3.10.x only (question and autograder developed and tested for this
version). Other versions may also work, but we do not
officially support them (code written with other versions
may break the autograder).
TMDb API version 3
Allowed Libraries: The Python Standard Library only.
All other libraries (including but not limited to Pandas, Numpy, and Requests) are
NOT allowed. Providing a consistent autograder experience for all students vastly
outweighs the marginal utility of extending the scope of supported libraries. For
example, urllib can be easily used instead of Requests in solving this question.
Max runtime: 10 minutes. Submissions exceeding this will receive zero credit.
Deliverables [Gradescope]:
Q1.py: The completed Python file
nodes.csv: The CSV file containing nodes
edges.csv: The CSV file containing edges
For this question, you will use and submit a Python file. Complete all tasks according to the instructions
found in Q1.py to complete the Graph class, the TMDbAPIUtils class, and the one global function. The
Graph class will serve as a reusable way to represent and write out your collected graph data. The
TMDbAPIUtils class will be used to work with the TMDb API for data retrieval.
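
Because only the standard library is allowed, a TMDb request can be made with urllib and json instead of Requests. The following is a minimal sketch, not the interface required by Q1.py: the helper names `build_url` and `fetch_json` are hypothetical, and it assumes the API key is passed as the `api_key` query parameter of a version 3 endpoint such as `person/<person_id>/movie_credits`.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.themoviedb.org/3"  # TMDb API version 3 base URL


def build_url(path: str, api_key: str, **params: str) -> str:
    """Build a TMDb v3 URL with the API key sent as a query parameter.

    Hypothetical helper for illustration; not part of Q1.py.
    """
    query = urllib.parse.urlencode({"api_key": api_key, **params})
    return f"{API_BASE}/{path.lstrip('/')}?{query}"


def fetch_json(url: str, timeout: float = 10.0) -> dict:
    """Send a GET request and decode the JSON response body.

    This performs a real network call, so it needs a valid API key.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call would look like `fetch_json(build_url("person/<person_id>/movie_credits", api_key="<your key>"))`, with the placeholders filled in. Keeping URL construction separate from the network call makes the former easy to check without touching the API.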
Tasks and point breakdown
a) [10 pts] Implementation of the Graph class according to the instructions in Q1.py.
o The graph is undirected, thus {a, b} and {b, a} refer to the same undirected edge in the
graph; keep only either {a, b} or {b, a} in the Graph object. A node's degree is the number
of (undirected) edges incident on it. In-/out-degrees are not defined for undirected graphs.
b) [10 pts] Implementation of the TMDbAPIUtils class according to the instructions in Q1.py. Use version
3 of the TMDb API to download data about actors and their co-actors. To use the API:
o Create a TMDb account and follow the instructions in this document to obtain an
authentication key. Be sure to use the key, not the token.
o Refer to the TMDb API Documentation as you work on this question.
c) [20 pts] Producing correct nodes.csv and edges.csv.
o As mentioned in the Python file, if an actor name has comma characters (,), remove those
characters before writing that name into the CSV files.
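
The undirected-edge, degree, and comma-removal rules above can be sketched together as follows. This is only an illustration under stated assumptions: the class name `UndirectedGraph`, its method names, the `id,name` header row, and the choice to ignore self-loops are all hypothetical, and the real interface is whatever Q1.py specifies.

```python
import csv


class UndirectedGraph:
    """Hypothetical stand-in for the assignment's Graph class."""

    def __init__(self):
        self.nodes = {}     # node id -> display name
        self.edges = set()  # each edge stored once as an ordered (a, b) tuple

    def add_node(self, node_id: str, name: str) -> None:
        # Remove commas from the name before it is ever written to CSV.
        self.nodes.setdefault(node_id, name.replace(",", ""))

    def add_edge(self, a: str, b: str) -> None:
        # Order the endpoints so {a, b} and {b, a} map to the same tuple.
        if a != b:  # assumption: self-loops are ignored
            self.edges.add((a, b) if a <= b else (b, a))

    def degree(self, node_id: str) -> int:
        # Number of undirected edges incident on the node.
        return sum(node_id in edge for edge in self.edges)

    def write_nodes(self, path: str) -> None:
        # Header names are an assumption; follow Q1.py for the real format.
        with open(path, "w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            writer.writerow(["id", "name"])
            writer.writerows(self.nodes.items())
```

Storing each edge as an ordered tuple in a set means that adding the same pair in either order leaves a single entry, so no separate duplicate check is needed.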
