
Question


Please fix this Python code. I am pretty sure the problem is with the getValueAtLine function, but the error is raised on line 50, in main:

if currMovie[1] == "movie" and currMovie[4] == "0":
IndexError: list index out of range

I think the getValueAtLine function could be to blame, possibly something to do with all the escape characters it reads from the files. The three files I am reading from are very large, averaging about 5 million lines each.
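One way this IndexError can arise without any escape characters being involved: if the string handed to split("\t") is empty or has fewer than five fields (a blank trailing line, a truncated record, or the empty string that readline() returns at end of file), then indexing [1] or [4] fails. The sketch below illustrates that case and a simple guard. It assumes the column layout used in the question's code (index 1 is the title type, index 4 is the adult flag); parse_title_row is a hypothetical helper name, not part of the original program.

```python
def parse_title_row(line, min_fields=5):
    """Split one TSV line into fields; return None if it is too short.

    Assumes the layout from the question: field 1 is the title type
    and field 4 is the adult flag. The helper name is hypothetical.
    """
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) < min_fields:
        return None  # blank line, end-of-file "", or truncated record
    return fields


# An empty string (what readline() returns at end of file) splits into
# a list holding one empty field, so indexing [1] on it raises IndexError.
print("".split("\t"))
print(parse_title_row(""))
print(parse_title_row("tt0000001\tmovie\tSome Title\tSome Title\t0\n"))
```

Checking the result for None before testing currMovie[1] and currMovie[4] would turn the crash into a skipped line, which also makes it easier to log which line numbers are malformed.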

The goal is to write Python code that reads files from IMDb (Internet Movie Database), picks 20 random movies (not TV shows, and not adult movies), and writes them along with their principal cast members to a text document. Fields should be separated by tab characters, and there should be one record per line.
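The output format described here (tab-separated fields, one record per line) can be sketched as follows. This is only an illustration under the assumption that each record is a title followed by a list of cast names; format_record and write_records are hypothetical names.

```python
def format_record(title, cast_names):
    """Join a title and its cast names with tabs, ending in a newline."""
    return "\t".join([title, *cast_names]) + "\n"


def write_records(path, records):
    """Write (title, cast_names) pairs to path, one record per line."""
    with open(path, "w", encoding="utf8") as out:
        for title, cast_names in records:
            out.write(format_record(title, cast_names))
```

Ending every record with "\n" is what gives one record per line; joining with "\t" avoids a stray leading or trailing tab when the cast list is empty.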

import random

# open files
# read files
# do some stuff
# close files
open('name.basics1.tsv', encoding="utf8")
titles = open('title.basics1.tsv', encoding="utf8")
#open('titles.txt', encoding="utf8")
# open('title.principals1.tsv', encoding="utf8")

# code to count the lines of the specific file
def countLines(fileObj):
    #fileObj = open(fileName, encoding='utf8')
    counter = 0
    line = " "
    while not line == "":
        counter += 1
        line = fileObj.readline()
    fileObj.seek(0)
    return counter

# gets the value of each line
def getValueAtLine(fileObj, lineNum):
    #fileObj = open(fileName, encoding='utf8')
    counter = 0
    line = " "
    while not counter == lineNum:
        counter += 1
        line = fileObj.readline()
    fileObj.seek(0)
    return line

# main function part 1: starts by getting all the lines in the file, opens,
# and creating an empty array of movies. While the "movies" array is less than 20,
# it will go through the line numbers randomly and grab the value at the line, split it at the tab,
# and if the movie is a movie (currMovie[1]=="movie") and is NOT an adult movie (currMovie[4]=="0")
# then we will append that currMovie value to the movies array.
def main():
    titles = open("title.basics1.tsv", encoding="utf8")
    maxLines = countLines(titles)
    movies = []
    while not len(movies) == 20:
        lineNumber = int(random.uniform(2, maxLines))
        currMovie = getValueAtLine(titles, lineNumber).split("\t")
        if currMovie[1] == "movie" and currMovie[4] == "0":
            movies.append(currMovie)
    titles.close()

    # this code is getting all the lines of the principal cast.
    # getting the cast at a specific line, splitting them at the tabs,
    # and then for currMovie in the array of movies, seeing if the
    # tconst of titles.basics matches the tconst of titles.principals
    # then we split them at the commas(getting principal cast)
    castNum = open("title.principals1.tsv")
    line = " "
    while not line == "":
        line = castNum.readline()[:-1]
        lineSplit = line.split("\t")
        for currMovie in movies:
            if currMovie[0] == lineSplit[0]:
                currMovie.append(lineSplit[1].split(","))
    castNum.close()

    # gets the cast names for everything, splits at tabs, and appends each
    # of the cast in the currMovie[-1] column to the movies array
    # with corresponding numbers
    castName = open("name.basics1.tsv")
    line = " "
    while not line == "":
        line = castName.readline()
        lineSplit = line.split("\t")
        for currMovie in movies:
            for cast in currMovie[-1]:
                if lineSplit[0] == cast:
                    currMovie[-1].append(lineSplit[1])
                    currMovie[-1].remove(lineSplit[0])
    castName.close()

    # writes everything to the titles.txt document
    titleAndCast = open("titles.txt", 'w')
    string = ""
    for currMovie in movies:
        string += currMovie[2]
        for cast in currMovie[-1]:
            string += "\t" + cast
        string += " "
    titleAndCast.write(string)
    titleAndCast.close()

# runs the main function
main()
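Because getValueAtLine rescans the file from the top for every random line number, picking 20 movies from a file of about 5 million lines can mean reading many millions of lines in total. A possible alternative, offered as a sketch rather than as the intended solution, is to read the file once and keep a uniform random sample of the eligible rows using reservoir sampling (Algorithm R). It assumes the same column layout as the code above; sample_movies is a hypothetical name, and short or blank lines are skipped instead of indexed.

```python
import random


def sample_movies(lines, k=20, rng=random):
    """Pick up to k eligible rows uniformly at random in a single pass.

    Uses reservoir sampling (Algorithm R), so the input is read once
    instead of being rescanned for every random line number.
    Assumes the question's layout: field 1 is the title type and
    field 4 is the adult flag.
    """
    reservoir = []
    seen = 0  # number of eligible rows encountered so far
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 5:
            continue  # skip blank or truncated lines instead of crashing
        if fields[1] != "movie" or fields[4] != "0":
            continue  # header row, non-movie titles, and adult titles
        seen += 1
        if len(reservoir) < k:
            reservoir.append(fields)
        else:
            j = rng.randrange(seen)  # uniform index in 0..seen-1
            if j < k:
                reservoir[j] = fields
    return reservoir
```

Since a file object iterates line by line, this could be called as sample_movies(titles) on the open title.basics1.tsv handle. If the file holds fewer than k eligible rows, the function returns all of them rather than looping forever, which the original while loop would do.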
