Question
B. Programming Task 1
This programming task focuses on using Python to calculate a set of Pearson Correlation Coefficients for a given dataset using built-in functions and data structures ONLY.
For Task 1, you MUST NOT import any Python library functions. This means you cannot use Python modules such as math or libraries such as Pandas or NumPy.
To print the Pearson Correlation Coefficient for a given pair of Python Lists, it would be very easy to use the pearsonr() function provided in the SciPy library. However, this programming task is designed to assess your coding abilities, and by preventing you from using this function you are forced to gain a deeper understanding of how to complete that task. To do this, you will need to develop your own algorithm. Try typing "calculate Pearson Correlation Coefficient by hand" into your favourite search engine.
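The hand calculation usually comes down to r = sum((x - mean_x) * (y - mean_y)) / sqrt(sum((x - mean_x)^2) * sum((y - mean_y)^2)). Below is a minimal sketch of that formula using built-ins only. The function names `mean` and `pearson` are illustrative choices, not names required by the task, and `** 0.5` stands in for a square-root function because the math module is off limits.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("lists must have equal length of at least 2")
    mean_x, mean_y = mean(xs), mean(ys)
    # Numerator: sum of the products of paired deviations from each mean.
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator terms: sum of squared deviations for each list.
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant list")
    # ** 0.5 is the square root, computed without importing math.
    return covariation / (spread_x * spread_y) ** 0.5


# A perfectly linear relationship should give a value very close to 1.
print(pearson([1, 2, 3, 4], [2, 4, 6, 8]))
```

Raising an error for constant lists is one possible design choice; returning 0 or None would also be defensible as long as the report explains the decision.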
There is a single data file available for use in this programming task. The file contains a record of US police killings for the year 2015.
The data file is called task1.csv. This CSV file includes a header row with multiple named data values. This file is available in the Assignments section on Blackboard.
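Reading such a file without the csv module can be sketched as follows. This is a rough outline under several assumptions that the brief does not confirm: fields are separated by plain commas, no field contains a quoted comma, and every data value converts cleanly with float(). The names `read_column` and `read_all_columns` are hypothetical.

```python
def read_column(file_name, column_number):
    """Return (values, column_name) for one column of a simple CSV file."""
    values = []
    with open(file_name) as data_file:
        # The first line is the header row holding the column names.
        header = data_file.readline().strip().split(",")
        if not 0 <= column_number < len(header):
            raise IndexError("column number must be between 0 and n-1")
        column_name = header[column_number]
        for line in data_file:
            line = line.strip()
            if line:  # skip blank lines, e.g. a trailing newline
                # Assumption: every value in the column is numeric.
                values.append(float(line.split(",")[column_number]))
    return values, column_name


def read_all_columns(file_name):
    """Return a dictionary mapping each column name to its list of values."""
    with open(file_name) as data_file:
        column_count = len(data_file.readline().strip().split(","))
    data = {}
    for column_number in range(column_count):
        values, column_name = read_column(file_name, column_number)
        data[column_name] = values
    return data


# Possible usage once task1.csv is in the working directory:
# values, name = read_column("task1.csv", 0)
# data = read_all_columns("task1.csv")
```

Reopening the file once per column is simple but slow for wide files, which is the kind of weakness the reflective report could discuss.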
Students are expected to follow appropriate coding standards such as code commenting, consistent identifier naming, code readability, and appropriate use of data structures.
You are expected to identify the strengths/weaknesses of your approach. For this programming task, you are expected to write a reflective report which focuses on the process taken to develop a solution to the task. Please reflect on your experiences rather than simply describing what you did. The report should:
include an explanation of how you approached the task.
identify any strengths/weaknesses of the approach used.
consider how the approach used could be improved.
suggest alternative approaches that could have been taken instead of the one you used.
B.1. Requirements
ID | Requirement | Description | Marks Available |
FR1 | Develop a function to find the arithmetic mean | The function should take a Python List as a parameter and return its arithmetic mean. You should use the following list to test your function: 85, 29, 35, 55, 82, 45, 42, 21, 42, 60, 56, 30, 72, 56, 37, 65, 29, 14, 66, 43, 23, 39, 81, 56, 74, 29, 22, 27, 14, 66, 55, 33, 31, 66, 63, 41, 30, 48, 68, 58, 51, 44, 66, 34, 20, 71, 59, 57, 43, 48. | 2 |
FR2 | Develop a function to read a single specified column of data from a CSV file | The function should accept two parameters: the data file name and a column number. The column number specifies which of the columns to read. It can range between 0 and n-1 (where n is the number of columns). The function should return two values: a List containing all the specified column's data values and the column name. You should use the task1.csv data file to test your function but your function should also work for other CSV files. An illustration of this is given in Appendix 1. | 6 |
FR3 | Develop a function to read CSV data from a file into memory | The task1.csv data file contains multiple columns of data values. This function should accept a single parameter: the data file name. It should make use of the function developed in FR2 to read all the columns of data from the data file and add them to a Dictionary data structure. The Dictionary should contain one entry for each column in the CSV data file. An illustration of this is given in Appendix 2. | 6 |
FR4 | Develop a function to calculate the Pearson Correlation Coefficient for two lists of data | This function should calculate the Pearson Correlation Coefficient for two lists of data. You should make use of the function developed in FR1. The function should take two lists of data (of equal length) as parameters. The function should return the calculated coefficient value. | 10 |
FR5 | Develop a function to generate a set of Pearson Correlation Coefficients for a given data file | This function should make use of the function developed in FR4 to generate a Pearson Correlation Coefficient for every pair of columns in the data read into memory in FR3. The function should return a list of tuples, each tuple containing the two column names and associated correlation coefficient value. An illustration of this is given in Appendix 3. | 10 |
FR6 | Develop a function to print a custom table | This function should output the Pearson Correlation Coefficient for a subset of the column pairs generated in FR5. The function should take three parameters: list of correlation coefficient tuples, border character to use and which columns to include. High marks will be given for good use of padding in the table cells to improve readability. An illustration of this is given in Appendix 4. | 8 |
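The pairing step in FR5 and the padded table in FR6 could be approached along these lines. This is only a sketch: the appendices are not reproduced here, so the table layout, the choice to list each unordered pair once, and the names `correlation_pairs`, `build_table_lines` and `print_table` are all assumptions. The FR4 function is passed in as the `correlate` parameter so the block stands alone, and the border is assumed to be a single character.

```python
def correlation_pairs(data, correlate):
    """Return (name_a, name_b, coefficient) for each unordered column pair.

    `data` maps column names to lists; `correlate` is the FR4 function.
    """
    names = list(data)
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            coefficient = correlate(data[names[i]], data[names[j]])
            pairs.append((names[i], names[j], coefficient))
    return pairs


def build_table_lines(pairs, border, columns):
    """Return the table as a list of equally wide text lines."""
    # Keep only pairs whose two column names were both requested.
    rows = [(a, b, f"{r:.4f}") for a, b, r in pairs
            if a in columns and b in columns]
    headers = ("Column A", "Column B", "Pearson r")
    # Pad each cell to the widest entry in its column.
    widths = [max(len(row[k]) for row in rows + [headers]) for k in range(3)]

    def format_row(cells):
        padded = [cells[0].ljust(widths[0]), cells[1].ljust(widths[1]),
                  cells[2].rjust(widths[2])]
        return border + " " + (" " + border + " ").join(padded) + " " + border

    # Four border characters plus two spaces of padding per cell.
    rule = border * (sum(widths) + 10)
    return [rule, format_row(headers), rule] + \
        [format_row(row) for row in rows] + [rule]


def print_table(pairs, border, columns):
    """Print the padded table for the selected columns."""
    for line in build_table_lines(pairs, border, columns):
        print(line)


# Made-up names and values purely to show the layout, not results from task1.csv.
sample = [("col_a", "col_b", 0.1234), ("col_a", "col_c", -0.5),
          ("col_b", "col_c", 0.9)]
print_table(sample, "*", ["col_a", "col_b"])
```

If Appendix 3 expects both orderings of each pair, or each column paired with itself, the inner loop range would need to change accordingly.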
Note: Please follow the question and answer it. Please do it fast, it's urgent. This question is related to an MSc Data Science course.