Question
- Write a program that will do the following:
- Consolidate all the information from the csv files they have provided into a single csv file called "survey_database.csv"
- Write a summary of the collected data in a file called "report.txt"
- Write an error log of the collected data in a file called "error_log.txt"
The following are their detailed requirements for the program:
Survey Database
- They want the program to be able to search, find and read all the .csv files in the current working directory to create the "survey_database.csv" file in the current working directory.
- The CSV should contain the following information (defined in self-identifying column header names):
- "City", "Team Name", "Sport", "Number of Times Picked"
- The rows following that header should contain that information for the top 3 most picked teams in each sport
- A team counts as "picked" once for each time it appears in the CSV files
- The ranking of the team is defined by the inverse of the times they were picked and their alphabetic Team Name. In other words, teams that appear more often in the read csv files are at the top. In the event of ties in their count, their names are used to resolve the ties.
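The ranking rule above can be expressed as a single sort key. This is a minimal sketch, assuming each team is held as a plain `(name, count)` tuple; `rank_teams` and the tuple layout are hypothetical stand-ins for the assignment's SportClub objects.

```python
from typing import List, Tuple


def rank_teams(teams: List[Tuple[str, int]]) -> List[Tuple[str, int]]:
    """Order teams by descending count, breaking ties by ascending name."""
    # Negating the count makes larger counts sort first, while the name
    # keeps its natural ascending order when counts are tied.
    return sorted(teams, key=lambda team: (-team[1], team[0]))


teams = [("Rockets", 80), ("Warriors", 130), ("Lakers", 130)]
print(rank_teams(teams)[:3])
# → [('Lakers', 130), ('Warriors', 130), ('Rockets', 80)]
```

Slicing with `[:3]` also covers the case of fewer than three teams, since a slice never raises on a short list.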
Report
- They want a new report created every time they run the program. The following is the content of the summary report in a "report.txt" file created in the current working directory. The report is simply a text file with the following information:
Number of files read: {num good files read}
Number of lines read: {num lines read in good files read}
Error log
- They want a new error log created every time they run the program. This error log simply lists the names of the files that contain errors, each on its own line. An error could be a missing field or an empty string. Write a file called "error_log.txt" in the current working directory with that information.
- If a file has an error (i.e., it is missing one of the 3 fields it should have: City, Team Name or Sport), the file is considered corrupted and it will not count AT ALL. In other words, if you have already read 5 rows and then encounter an empty field on the sixth, you discard those 5 rows along with the rest of the file.
- Every line, including the last, in report.txt and error_log.txt ends in a new line character.
- If there are no error files encountered, simply create the error_log.txt file with nothing in it.
- If there are less than 3 teams in a sport, simply output the teams that you do have.
- Every CSV file you read will have a header, and the survey_database.csv file you create must also have a header.
- The order of sports in survey_database.csv does not matter.
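The all-or-nothing rule and the two log files can be sketched as follows. This is only an illustration under stated assumptions: `read_rows`, `summarize` and `write_reports` are hypothetical names, only the first three fields of each row are checked (so extra columns are tolerated), and a completely blank line is treated as an error. The assignment text does not settle those last two points.

```python
import csv
from pathlib import Path
from typing import List, Tuple


def read_rows(path: Path) -> List[Tuple[str, str, str]]:
    """Return every data row of one file, or raise ValueError on the first bad row."""
    rows = []
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        next(reader, None)  # skip the header line
        for row in reader:
            fields = [field.strip() for field in row[:3]]
            # Assumption: fewer than three fields, or any empty one, is an error.
            if len(fields) < 3 or not all(fields):
                raise ValueError(f"missing field in {path.name}")
            rows.append(tuple(fields))
    return rows


def summarize(directory: Path) -> Tuple[int, int, List[str]]:
    """Count good files and good lines, and collect the names of bad files."""
    good_files = good_lines = 0
    bad_files = []
    for path in sorted(directory.glob("*.csv")):
        try:
            rows = read_rows(path)
        except ValueError:
            # Nothing from this file was counted yet, so it is dropped whole.
            bad_files.append(path.name)
        else:
            # Counted exactly once per file, and only after the full read succeeded.
            good_files += 1
            good_lines += len(rows)
    return good_files, good_lines, bad_files


def write_reports(directory: Path) -> None:
    """Write report.txt and error_log.txt, every line ending in a newline."""
    good_files, good_lines, bad_files = summarize(directory)
    (directory / "report.txt").write_text(
        f"Number of files read: {good_files}\n"
        f"Number of lines read: {good_lines}\n"
    )
    # With no bad files this writes an empty error_log.txt.
    (directory / "error_log.txt").write_text("".join(name + "\n" for name in bad_files))
```

Because the counters change only in the `else` branch, a file that fails on its sixth row contributes nothing to either total.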
- read.py
from pathlib import Path
import csv
from sportclub import SportClub
from typing import List, Tuple


def readFile(file: Path) -> List[Tuple[str, str, str]]:
    """Read a CSV file and return its content

    A good CSV file will have the header "City,Team Name,Sport" and appropriate content.

    Args:
        file: a path to the file to be read

    Returns:
        a list of tuples that each contain (city, name, sport) of the SportClub

    Raises:
        ValueError: if the reading csv has missing data (empty fields)
    """
    rows = []
    good_lines = 0
    with open(file, newline="") as File:
        reader = csv.reader(File)
        header = next(reader)
        for row in reader:
            if len(row) == 3 and all(row):
                rows.append(tuple(row))
                good_lines += 1
            else:
                raise ValueError(f"Invalid data in file: one or more files have missing fields.")
    return rows


def readAllFiles() -> List[SportClub]:
    """Read all the csv files in the current working directory to create a list of SportClubs that contain unique SportClubs with their corresponding counts

    Take all the csv files in the current working directory, calls readFile(file) on each of them, and accumulates the data gathered into a list of SportClubs.
    Create a new file called "report.txt" in the current working directory containing the number of good files and good lines read.
    Create a new file called "error_log.txt" in the current working directory containing the name of the error/bad files read.

    Returns:
        a list of unique SportClub objects with their respective counts
    """
    good_files = 0
    good_lines = 0
    sport_clubs = []
    error_files = []
    cvs_files = Path.cwd().glob("*.csv")
    for file in cvs_files:
        try:
            rows = readFile(file)
            for row in rows:
                club = SportClub(*row)
                if club not in sport_clubs:
                    sport_clubs.append(club)
                ind = sport_clubs.index(club)
                sport_clubs[ind].count += 1
                good_lines += 1
            good_files += 1
        except ValueError as e:
            error_files.append(file.name)
            print(str(e))
        else:
            good_files += 1
    with open("report.txt", "w") as File:
        File.write(f"Number of files read: {good_files}\n")
        File.write(f"Number of lines read: {good_lines}\n")
    with open("error_log.txt", "w") as File:
        for error_file in error_files:
            File.write(error_file + "\n")
    return sport_clubs
- sportclub.py
class SportClub:
    """A simple class to store and handle information about SportClubs.

    Attributes:
        city (str): The city the SportClub is based in.
        name (str): The name of the SportClub.
        sport (str): The sport the club plays.
        count (int): The amount of time the SportClub has been seen.

    Todo:
        complete the __eq__ and __lt__ functions of this class
    """

    def __init__(self, city: str = "", name: str = "", sport: str = "", count: int = 0) -> None:
        """Make a SportClub.

        Args:
            city: The city the SportClub is based in.
            name: The name of the SportClub.
            sport: The sport the club plays.
            count: The amount of time the SportClub has been seen.
        """
        self.setCity(city)
        self.setName(name)
        self.setSport(sport)
        self.count = count

    def setName(self, name: str) -> None:
        """Set the name of the SportClub.

        Args:
            name: Name of the SportClub.
        """
        self.name = name

    def setCity(self, city: str) -> None:
        """Set the city the SportClub is based in.

        Args:
            city: The city the SportClub is based in.
        """
        self.city = city

    def setSport(self, sport: str) -> None:
        """Set the sport the club plays.

        Args:
            sport: The sport the club plays.
        """
        self.sport = sport

    def getName(self) -> str:
        """Get the name of the SportClub.

        Returns:
            A formatted version of the private attribute name.
        """
        return self.name.title()

    def getCity(self):
        """Get the city the SportClub is based in.

        Returns:
            A formatted version of the private attribute city.
        """
        return self.city.title()

    def getSport(self):
        """Get the sport the club plays.

        Returns:
            A formatted version of the private attribute sport.
        """
        return self.sport.upper()

    def getCount(self):
        """Get the total times the SportClub has been seen.

        Returns:
            A copy of the attribute count
        """
        return self.count

    def incrementCount(self) -> None:
        """Increment the times the SportClub has been seen by 1."""
        self.count += 1

    def __hash__(self) -> int:
        """Get the hash of current object.

        Returns:
            Hash of the object
        """
        unique_identifier = (self.getCity(), self.getName(), self.getSport())
        return hash(unique_identifier)

    def __str__(self) -> str:
        """Get the string version of current object.

        Returns:
            str summary of the object
        """
        return f"Name: {self.getCity()} {self.getName()}, Sport: {self.getSport()}, Count: {self.getCount()}"

    def __eq__(self, o: object) -> bool:
        """Check if another object is equal to self.

        Returns:
            True if they are equal, False otherwise
        """
        if not isinstance(o, SportClub):
            return False
        return (self.getCity(), self.getName(), self.getSport()) == (o.getCity(), o.getName(), o.getSport())

    def __lt__(self, o: object) -> bool:
        """Check if self is less than another object.

        Returns:
            True if self is less than o, False otherwise
        """
        if not isinstance(o, SportClub):
            return NotImplemented
        # Compare by name first, then city, then sport.
        if self.getName() == o.getName():
            if self.getCity() == o.getCity():
                return self.getSport() < o.getSport()
            return self.getCity() < o.getCity()
        return self.getName() < o.getName()
- write.py
import csv
from sportclub import SportClub
from typing import List, Iterable


def separateSports(all_clubs: List[SportClub]) -> Iterable[List[SportClub]]:
    """Separate a list of SportClubs into their own sports

    For example, given the list [SportClub("LA", "Lakers", "NBA"), SportClub("Houston", "Rockets", "NBA"), SportClub("LA", "Angels", "MLB")],
    return the iterable [[SportClub("LA", "Lakers", "NBA"), SportClub("Houston", "Rockets", "NBA")], [SportClub("LA", "Angels", "MLB")]]

    Args:
        all_clubs: A list of SportClubs that contain SportClubs of 1 or more sports.

    Returns:
        An iterable of lists of sportclubs that only contain clubs playing the same sport.
    """
    sports = {}
    for club in all_clubs:
        if club.getSport() not in sports:
            sports[club.getSport()] = []
        sports[club.getSport()].append(club)
    return sports.values()


def sortSport(sport: List[SportClub]) -> List[SportClub]:
    """Sort a list of SportClubs by the inverse of their count and their name

    For example, given the list [SportClub("Houston", "Rockets", "NBA", 80), SportClub("LA", "Warriors", "NBA", 130), SportClub("LA", "Lakers", "NBA", 130)]
    return the list [SportClub("LA", "Lakers", "NBA", 130), SportClub("LA", "Warriors", "NBA", 130), SportClub("Houston", "Rockets", "NBA", 80)]

    Args:
        sport: A list of SportClubs that only contain clubs playing the same sport

    Returns:
        A sorted list of the SportClubs
    """
    return sorted(sport, key=lambda club: (-club.getCount(), club.getName()))


def outputSports(sorted_sports: Iterable[List[SportClub]]) -> None:
    """Create the output csv given an iterable of list of sorted clubs

    Create the csv "survey_database.csv" in the current working directory, and output the information:
    "City,Team Name,Sport,Number of Times Picked" for the top 3 teams in each sport.

    Args:
        sorted_sports: an Iterable of different sports, each already sorted correctly
    """
    with open('survey_database.csv', mode='w', newline="") as csv_file:
        writer = csv.writer(csv_file)
        writer.writerow(["City", "Team Name", "Sport", "Number of Times Picked"])
        rows = [[club.getCity(), club.getName(), club.getSport(), club.getCount()]
                for sport in sorted_sports
                for club in sport[:3]]
        writer.writerows(rows + [[]])
- main.py
from read import readAllFiles
from write import sortSport, separateSports, outputSports


def main() -> None:
    separated_sports = separateSports(readAllFiles())
    sorted_sports = map(sortSport, separated_sports)
    outputSports(sorted_sports)


if __name__ == "__main__":
    main()
- There are some errors in the code above:
- report.txt is incorrect. Make sure it contains lines formatted *exactly* as described in the assignment, that 'files read' only counts valid files and that 'lines read' only counts valid lines (and does not count file headers).
- The SportClub count for Name: Golden State Warriors, Sport: NBA, Count: 1 in your readAllFiles(...) output differs from the corresponding count in the solution's output.
- Caught a ValueError while running your code: not enough values to unpack (expected 4, got 1). Are you sure all your fields are present and of the proper type?
Could you help me fix the errors in the code above? Thank you.
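One plausible source of the "expected 4, got 1" message is the `rows + [[]]` passed to `writer.writerows` in outputSports, which appends a blank line to survey_database.csv. How the autograder parses that file is not shown, so the comma-splitting checker below is an assumption, and `render` and `field_counts` are hypothetical helpers that write to memory rather than to disk.

```python
import csv
import io

HEADER = ["City", "Team Name", "Sport", "Number of Times Picked"]


def render(rows, add_blank_row):
    """Write the header and rows with csv.writer and return the resulting text."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(HEADER)
    # An empty list is written as a bare line terminator, i.e. a blank line.
    writer.writerows((rows + [[]]) if add_blank_row else rows)
    return buffer.getvalue()


def field_counts(text):
    """Split each line on commas, as a simple line-based checker might (assumed)."""
    return [len(line.split(",")) for line in text.splitlines()]


rows = [["Los Angeles", "Lakers", "NBA", 2]]
print(field_counts(render(rows, add_blank_row=True)))   # → [4, 4, 1]
print(field_counts(render(rows, add_blank_row=False)))  # → [4, 4]
```

The final `1` in the first result is the blank line: splitting an empty string on commas yields a single empty field, which cannot be unpacked into four names.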
__________________________________________________________________________________________________________
These are the original instructions for the code above:
- These are the sample test files:
- junesurvey.csv
City,Team Name,Sport,Rank
San Francisco,49ers,NFL,2
Seatle,Seahawks,NFL,29
Los Angeles,Rams,NFL,1
Golden State,Warriors,NBA,1
Los Angeles,Lakers,NBA,3
Sacramento,Kings,NBA,2
San Francisco,Giants,MLB,1
#this is a CSV
- marchsurvey.csv
City,Team Name,Sport,Rank
San Francisco,49ers,NFL,1
Seatle,Seahawks,NFL,3
Los Angeles,Rams,NFL,2
Golden State,Warriors,NBA,1
Los Angeles,Lakers,NBA,2
Sacramento,Kings,NBA,3
San Francisco,Giants,MLB,1
#this is a CSV
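The sample files appear to carry a fourth Rank column in addition to City, Team Name and Sport. The sketch below shows how such rows look to `csv.reader`. It assumes the fields are comma-separated (the delimiters are not visible in the samples as pasted), and `row_widths` and `first_three_fields` are hypothetical helper names.

```python
import csv
import io

# Two rows taken from the June sample, with commas assumed as the delimiter.
SAMPLE = """City,Team Name,Sport,Rank
Golden State,Warriors,NBA,1
Los Angeles,Lakers,NBA,3
"""


def row_widths(text):
    """Return the number of fields in every line, header included."""
    return [len(row) for row in csv.reader(io.StringIO(text, newline=""))]


def first_three_fields(text):
    """Return (city, name, sport) for each data row, ignoring extra columns."""
    reader = csv.reader(io.StringIO(text, newline=""))
    next(reader)  # skip the header
    return [tuple(row[:3]) for row in reader]


print(row_widths(SAMPLE))          # → [4, 4, 4]
print(first_three_fields(SAMPLE))
# → [('Golden State', 'Warriors', 'NBA'), ('Los Angeles', 'Lakers', 'NBA')]
```

If the graded files have this shape, a strict `len(row) == 3` check would reject every row. Whether they do is not stated in the assignment, so treat this as something to verify rather than a confirmed cause.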
You will get 4 files (main.py, read.py, write.py and sportclub.py). The instructions here are also present in the files themselves:
- main.py: Do not change anything in main.py. Your code must run with the main given to you, and will be run the same way by the autograder.
- sportclub.py: Contains the class SportClub. Read through the class to understand how it works, and complete the following functions:
- __eq__(other): Function to check if the current instance of the class (self) is equal to another object. This can be useful in sortSport(sport).
- __lt__(other): Function to check if the current instance of the class (self) is less than another object. This can also be useful in sortSport(sport).
- read.py: Contains two functions you need to complete:
- readFile(file): Function to read one .csv file. Returns a list of tuples of three fields (City, Name and Sport). The output should not include raw input formatting like commas, the header, etc. Raises ValueError if it finds that the file it's reading is an error file.
- readAllFiles(): Function to read all .csv files in the current working directory. Returns a list of SportClub objects. This function should call the readFile(file), and gather its output into a list of unique SportClub objects. Each of these unique SportClub objects will store the identifying information of the team, and how many times the participants have picked the team. This is the function that will also produce the report.txt and error_log.txt files you read about.
- write.py: Contains three functions you need to complete:
- separateSports(all_clubs): Function to separate a list of all sport clubs by their sport. Takes as input the list of all clubs seen, and returns an iterable of lists of sport clubs. Each of those lists only contains sport clubs that play the same sport. The order of the sports does not matter.
- sortSport(sport): Function to sort a list of SportClub objects by their counts and name. Takes a list of SportClub objects of the same sport as an argument. Returns a list of the sorted SportClub objects.
- Hint: make use of the standard __lt__ and __eq__ functions you defined earlier by calling standard Python sorting functions.
- outputSports(sorted_sports): Function to create the survey_database.csv output file and write the top 3 most picked teams of each sport. Takes an iterable of lists of sorted SportClub objects as an argument.
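The hint about `__lt__` suggests letting Python's built-in sorting do the ranking. A minimal sketch of that idea, using a hypothetical `Club` dataclass as a reduced stand-in for SportClub and a hypothetical `top_three_by_sport` helper that combines the grouping and sorting steps:

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Club:
    """Hypothetical stand-in for SportClub, reduced to what ranking needs."""
    name: str
    sport: str
    count: int

    def __lt__(self, other):
        if not isinstance(other, Club):
            return NotImplemented
        # "Less than" means "ranks earlier": higher count first, then name.
        return (-self.count, self.name) < (-other.count, other.name)


def top_three_by_sport(clubs: List[Club]) -> Dict[str, List[Club]]:
    """Group clubs by sport, then keep the three best-ranked clubs of each."""
    by_sport = defaultdict(list)
    for club in clubs:
        by_sport[club.sport].append(club)
    # sorted() only needs __lt__, so no key function is required here.
    return {sport: sorted(group)[:3] for sport, group in by_sport.items()}


clubs = [
    Club("Rockets", "NBA", 80),
    Club("Warriors", "NBA", 130),
    Club("Lakers", "NBA", 130),
    Club("Kings", "NBA", 10),
    Club("Giants", "MLB", 5),
]
for sport, ranked in top_three_by_sport(clubs).items():
    print(sport, [club.name for club in ranked])
# → NBA ['Lakers', 'Warriors', 'Rockets']
# → MLB ['Giants']
```

Putting the ranking rule in `__lt__` keeps the sort call trivial, at the cost of making "less than" mean "ranks earlier" rather than an alphabetical comparison.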