Question
One of the benefits of using computers to solve problems is that they can process data very quickly, helping us discover important facts about the real world. In this part of the lab, we will use Python to perform some data science on observations that biologists recorded about three species of penguins on different islands around Antarctica. In other words, we will use data science as an interdisciplinary approach to answer questions related to biodiversity.
Gentoo Penguin (source: Andrew Shiva at Wikipedia, CC-BY-SA 4.0).
Special thanks and credit to Professor Allison Horst at the University of California, Santa Barbara for making this data set public: Twitter post and thread with more information and GitHub repository.
Data About Penguins
You have been provided with a read_data() function in penguin.py that reads in all of the data from the penguins_data.csv file in your replit project. This file contains data for about 342 real-life penguins. Calling read_data() returns a list of all of the penguins we will be working with.
penguins = read_data()
Each penguin is a list containing the six values described in the table below.
List Index | Information | Type |
---|---|---|
0 | species | str |
1 | home island | str |
2 | bill length | float |
3 | bill depth | float |
4 | flipper length | float |
5 | body mass | float |
For example, one penguin might be represented as the following list:
penguin = ["Adelie", "Torgersen", 39.1, 18.7, 181.0, 3750.0]
Since each penguin is itself a list of values, a list of multiple penguins is represented as a two-dimensional list, such as the following list that contains 5 penguins:
five_penguin_list = [["Adelie", "Torgersen", 39.1, 18.7, 181.0, 3750.0], ["Adelie", "Briscoe", 37.8, 18.3, 174.0, 3400.0], ["Gentoo", "Biscoe", 46.1, 13.2, 211.0, 4500.0], ["Gentoo", "Biscoe", 50.0, 16.3, 230.0, 5700.0], ["Chinstrap", "Dream", 46.5, 17.9, 192.0, 3500.0]]
The penguins list returned by the read_data() function is similar in structure to the five_penguin_list above, except that it contains 342 penguins instead of only 5.
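As a small illustration of how a two-dimensional list like this can be indexed, the sketch below uses a two-penguin stand-in (values copied from the example above) rather than the real list from read_data(). The variable names first_penguin, species, and body_mass are only illustrative.

```python
# A tiny stand-in for the list returned by read_data();
# the real list has the same shape but many more penguins.
penguins = [
    ["Adelie", "Torgersen", 39.1, 18.7, 181.0, 3750.0],
    ["Gentoo", "Biscoe", 46.1, 13.2, 211.0, 4500.0],
]

# The first index selects one penguin (an inner list).
first_penguin = penguins[0]

# The second index selects one value from that penguin.
species = first_penguin[0]    # index 0 holds the species
body_mass = penguins[1][5]    # index 5 holds the body mass of the second penguin

print(species)      # Adelie
print(body_mass)    # 4500.0
```

The two indexing steps can be written separately (as with first_penguin) or chained (as with penguins[1][5]); both read the same data.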
ReadMe
We will read in all the penguin observations from a file, so you do not need to make any assignments like the above (they merely illustrate what the data looks like).
Program Goal
Your goal in this program is to use that list of penguins to discover possible differences between the three species of penguins (Adelie, Chinstrap, and Gentoo) based on their data. In particular, your program should do the following:
1. Ask the user for a species name (either “Adelie”, “Chinstrap”, or “Gentoo”).
2. Ask the user for which type of measurement they want to see (either bill length or body mass; you do not have to handle bill depth or flipper length).
3. Based on the species chosen by the user in Step 1, create a new list containing only the penguins that belong to this species.
4. Calculate and then print the average, minimum, and maximum of the measurements selected by the user in Step 2 (bill length or body mass) for the list of penguins created in Step 3.
During Steps 1 and 2, you should make sure the user enters a valid option. If they do not, you should print a message telling them what mistake they made, then close the program. Tip: you might want to provide the user with a menu like we did with image filters in Lab 4 - maybe even inside a similar main() function!
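One possible way to handle the validation in Steps 1 and 2 is sketched below. Everything here is an assumption rather than the required design: the helper names (is_valid_species, measurement_to_index, ask_user), the prompt wording, the exact-match rule for input, and the use of a dictionary and sys.exit() are all illustrative choices.

```python
import sys

# Assumed menu options; the exact prompts and accepted spellings are up to you.
VALID_SPECIES = ["Adelie", "Chinstrap", "Gentoo"]
MEASUREMENT_INDEXES = {"bill length": 2, "body mass": 5}


def is_valid_species(choice):
    """Return True if choice exactly matches one of the three species names."""
    return choice in VALID_SPECIES


def measurement_to_index(choice):
    """Return the list index for a measurement name, or None if unsupported."""
    return MEASUREMENT_INDEXES.get(choice)


def ask_user():
    """Ask for a species and a measurement; exit with a message on bad input."""
    species = input("Species (Adelie, Chinstrap, or Gentoo): ")
    if not is_valid_species(species):
        print("Error: '" + species + "' is not one of the three species.")
        sys.exit()

    measurement = input("Measurement (bill length or body mass): ")
    index = measurement_to_index(measurement)
    if index is None:
        print("Error: '" + measurement + "' is not a supported measurement.")
        sys.exit()

    return species, index
```

Keeping the checks in small functions separate from input() makes them easy to try out on their own, for example is_valid_species("gentoo") returns False under this exact-match assumption.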
Useful Functions
To complete this assignment, the following functions will help us.
find_species(penguins, species):
The find_species() function will perform Step 3 above by taking in a list of all of the penguins in the data and returning a smaller list that contains only the penguins of a particular species. This can be done by following these steps:
- Create a new empty list called filtered.
- Loop over each penguin in the penguins list.
- Check if the current penguin’s species (in index 0, i.e., penguin[0]) is equal to species.
- If so, append the current penguin to the filtered list.
- Return the filtered list.
For example, say we have the same five penguins as we did above.
five_penguin_list = [["Adelie", "Torgersen", 39.1, 18.7, 181.0, 3750.0], ["Adelie", "Briscoe", 37.8, 18.3, 174.0, 3400.0], ["Gentoo", "Biscoe", 46.1, 13.2, 211.0, 4500.0], ["Gentoo", "Biscoe", 50.0, 16.3, 230.0, 5700.0], ["Chinstrap", "Dream", 46.5, 17.9, 192.0, 3500.0]]
Then, if you want to create a list of only the Adelie penguins from those five, you can call find_species(five_penguin_list, "Adelie"), which should return a list with the two Adelie penguins.
filtered = [["Adelie", "Torgersen", 39.1, 18.7, 181.0, 3750.0], ["Adelie", "Briscoe", 37.8, 18.3, 174.0, 3400.0]]
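The five steps listed for find_species() could be sketched as follows. This is one straightforward reading of those steps, not necessarily the only acceptable solution; the variable adelies is just an illustrative name, and the sample data is copied from the example above.

```python
def find_species(penguins, species):
    """Return only the penguins whose species (index 0) equals species."""
    filtered = []                      # start with an empty list
    for penguin in penguins:           # visit each penguin (an inner list)
        if penguin[0] == species:      # index 0 holds the species name
            filtered.append(penguin)   # keep the whole penguin
    return filtered


five_penguin_list = [
    ["Adelie", "Torgersen", 39.1, 18.7, 181.0, 3750.0],
    ["Adelie", "Briscoe", 37.8, 18.3, 174.0, 3400.0],
    ["Gentoo", "Biscoe", 46.1, 13.2, 211.0, 4500.0],
    ["Gentoo", "Biscoe", 50.0, 16.3, 230.0, 5700.0],
    ["Chinstrap", "Dream", 46.5, 17.9, 192.0, 3500.0],
]

adelies = find_species(five_penguin_list, "Adelie")
print(len(adelies))  # 2
```

Note that a species name that matches no penguin simply produces an empty list, which is one reason to validate the user's input before calling this function.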
find_measurements(filtered, index):
In order to perform Step 4, we need to work with either the bill length or body mass of all of the penguins of a given species (returned as filtered from our find_species() function). To get those measurements, we will use the find_measurements() function.
The find_measurements() function is very similar to find_species(), except we are only saving a particular measurement from each penguin, instead of the entire penguin. This function should:
- Create a new empty list called measurements.
- Loop over each penguin in the filtered list.
- Grab the measurement from penguin[index] (index = 2 if the user chose bill length and index = 5 if they chose body mass).
- Save the measurement in the measurements list.
- Return the measurements list.
For example, say we have the same two Adelie penguins as we did above:
filtered = [["Adelie", "Torgersen", 39.1, 18.7, 181.0, 3750.0], ["Adelie", "Briscoe", 37.8, 18.3, 174.0, 3400.0]]
Then, if you want to create a list of all of their bill lengths, you can call find_measurements(filtered, 2), which should return a list:
measurements = [39.1, 37.8]
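A minimal sketch of find_measurements(), following the steps listed above, might look like this. The variable names bill_lengths and body_masses are illustrative, and the sample data is the two Adelie penguins from the example.

```python
def find_measurements(filtered, index):
    """Return a list holding the value at position index from each penguin."""
    measurements = []                        # start with an empty list
    for penguin in filtered:                 # visit each penguin
        measurements.append(penguin[index])  # save just one value
    return measurements


filtered = [
    ["Adelie", "Torgersen", 39.1, 18.7, 181.0, 3750.0],
    ["Adelie", "Briscoe", 37.8, 18.3, 174.0, 3400.0],
]

bill_lengths = find_measurements(filtered, 2)
body_masses = find_measurements(filtered, 5)
print(bill_lengths)  # [39.1, 37.8]
print(body_masses)   # [3750.0, 3400.0]
```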
find_average(measurements):
For the find_average() function, we will want to add together all the numbers in the input measurements list, then divide that total by the count of numbers in the list and return the result.
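One way to write find_average() is sketched below, using a running total built up in a loop. The variable name total is an assumption, and the sketch deliberately does not handle an empty list: dividing by a length of zero raises ZeroDivisionError, so it assumes the species has at least one penguin.

```python
def find_average(measurements):
    """Return the sum of the measurements divided by how many there are."""
    total = 0.0
    for value in measurements:   # add up every measurement
        total = total + value
    return total / len(measurements)


# The two Adelie body masses from the earlier example
average_mass = find_average([3750.0, 3400.0])
print(average_mass)  # 3575.0
```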
find_max(measurements) and find_min(measurements):
For the find_max() and find_min() functions, we will need to loop through the values in measurements. Within the loop, we will keep track of which value is currently the largest (for find_max()) or smallest (for find_min()). You should not use Python’s built-in max() or min() functions here. In addition, you should not use the variable names min or max as they will collide with the built-in function names.
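The "keep track of the current largest or smallest value" idea could be sketched as follows. The variable names largest and smallest are illustrative choices that avoid the built-in names, and starting from the first element assumes the list is not empty.

```python
def find_max(measurements):
    """Return the largest value, without using the built-in max()."""
    largest = measurements[0]      # assume the first value is the largest so far
    for value in measurements:
        if value > largest:        # found a bigger one, so remember it
            largest = value
    return largest


def find_min(measurements):
    """Return the smallest value, without using the built-in min()."""
    smallest = measurements[0]     # assume the first value is the smallest so far
    for value in measurements:
        if value < smallest:       # found a smaller one, so remember it
            smallest = value
    return smallest


sample = [39.1, 37.8, 46.1]
print(find_max(sample))  # 46.1
print(find_min(sample))  # 37.8
```

Starting from measurements[0] rather than a fixed number such as 0 avoids wrong answers when every value is larger or smaller than the chosen starting number.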
ReadMe
As a hint of how to loop over each penguin contained in a list of penguins (i.e., a list of lists), we can use the following code:
for penguin in penguins:
    # do something with penguin, which is a list of six values
    print(penguin[0])  # for example, print the species
Also, don’t forget to use a main() function here too.
Correct Answers
Species | Measurement | Min | Average | Max |
---|---|---|---|---|
Adelie | Bill Length | 32.1 | 38.7914 | 46.0 |
Adelie | Body Mass | 2850.0 | 3700.6623 | 4775.0 |
Chinstrap | Bill Length | 40.9 | 48.8338 | 58.0 |
Chinstrap | Body Mass | 2700.0 | 3733.0882 | 4800.0 |
Gentoo | Bill Length | 40.9 | 47.5049 | 59.6 |
Gentoo | Body Mass | 3950.0 | 5076.0163 | 6300.0 |
def read_data():
    # read in the contents of the file
    file = open("penguins_data.csv", "r")
    lines = file.readlines()
    file.close()

    # create a new list to store the penguins
    penguins = list()

    # convert the lines in the file into penguins
    for line in lines[1:]:
        # remove the newline character from the end
        line = line.strip()

        # turn the line into a list of strings
        line = line.split(",")

        # create a new list for this penguin
        penguin = list()

        # save the species and island of the penguin
        penguin.append(line[0])  # the species string
        penguin.append(line[1])  # the island string

        # convert the measurements to floats and save them in the penguin
        penguin.append(float(line[2]))  # the bill length
        penguin.append(float(line[3]))  # the bill depth
        penguin.append(float(line[4]))  # the flipper length
        penguin.append(float(line[5]))  # the body mass

        # save this penguin to our list of penguins
        penguins.append(penguin)

    return penguins


def main():
    pass


if __name__ == '__main__':
    main()