
Question


B. Programming Task 1

This programming task focuses on using Python to calculate a set of Pearson Correlation Coefficients for a given dataset using built-in functions and data structures ONLY.

For Task 1, you MUST NOT import any Python library functions. This means you cannot use Python modules such as math or libraries such as Pandas or NumPy.

To print the Pearson Correlation Coefficient for a given pair of Python Lists, it would be very easy to use the pearsonr() function provided in the SciPy library. However, this programming task is designed to assess your coding abilities, and preventing you from using this function forces you to gain a deeper understanding of how to complete the task. To do this, you will need to develop your own algorithm. Try typing "calculate Pearson Correlation Coefficient by hand" into your favourite search engine.

There is a single data file available for use in this programming task. The file contains a record of US police killings for the year 2015.

The data file is called task1.csv. This CSV file includes a header row with multiple named data values. The file is available in the Assignments section on Blackboard.

Students are expected to follow appropriate coding standards such as code commenting, consistent identifier naming, code readability, and appropriate use of data structures.

You are expected to identify the strengths/weaknesses of your approach. For this programming task, you are expected to write a reflective report which focuses on the process taken to develop a solution to the task. Please reflect on your experiences rather than simply describing what you did. The report should:

include an explanation of how you approached the task.

identify any strengths/weaknesses of the approach used.

consider how the approach used could be improved.

suggest alternative approaches that could have been taken instead of the one you used.

B.1. Requirements

Each requirement lists its ID, requirement, description, and marks available.

FR1: Develop a function to find the arithmetic mean (2 marks)

The function should take a Python List as a parameter and return its arithmetic mean. You should use the following list to test your function: 85, 29, 35, 55, 82, 45, 42, 21, 42, 60, 56, 30, 72, 56, 37, 65, 29, 14, 66, 43, 23, 39, 81, 56, 74, 29, 22, 27, 14, 66, 55, 33, 31, 66, 63, 41, 30, 48, 68, 58, 51, 44, 66, 34, 20, 71, 59, 57, 43, 48.
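A minimal sketch of what FR1 asks for, using only built-in functions. The function name mean and the choice to raise ValueError on an empty list are assumptions made for illustration; the brief does not prescribe either.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    if not values:
        # Assumption: an empty list is treated as an error rather than returning 0.
        raise ValueError("mean() requires at least one value")
    return sum(values) / len(values)


# Test list given in the FR1 description.
test_data = [85, 29, 35, 55, 82, 45, 42, 21, 42, 60,
             56, 30, 72, 56, 37, 65, 29, 14, 66, 43,
             23, 39, 81, 56, 74, 29, 22, 27, 14, 66,
             55, 33, 31, 66, 63, 41, 30, 48, 68, 58,
             51, 44, 66, 34, 20, 71, 59, 57, 43, 48]

print(mean(test_data))
```

Both sum() and len() are built-ins, so nothing is imported.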

FR2: Develop a function to read a single specified column of data from a CSV file (6 marks)

The function should accept two parameters: the data file name and a column number. The column number specifies which of the columns to read. It can range between 0 and n-1 (where n is the number of columns). The function should return two values: a List containing all of the specified column's data values, and the column name. You should use the task1.csv data file to test your function, but your function should also work for other CSV files. An illustration of this is given in Appendix 1.
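One possible shape for the FR2 function is sketched below. The names read_column and to_number are hypothetical. The sketch assumes a plain comma-separated file whose first non-blank line is the header and whose fields contain no quoted commas, and it assumes numeric text should be converted to float; Appendix 1 may specify something different.

```python
def to_number(text):
    """Convert text to a float where possible, otherwise return it unchanged."""
    text = text.strip()
    try:
        return float(text)
    except ValueError:
        return text  # assumption: non-numeric values are kept as strings


def read_column(file_name, column_number):
    """Return (values, column_name) for one column of a CSV file."""
    values = []
    column_name = None
    with open(file_name, "r") as data_file:
        for line in data_file:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            # Assumption: fields never contain quoted commas.
            fields = line.split(",")
            if column_name is None:
                # The first non-blank line is the header row.
                column_name = fields[column_number].strip()
            else:
                values.append(to_number(fields[column_number]))
    return values, column_name
```

A call such as read_column("task1.csv", 0) would then return the first column's values together with its header name.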

FR3: Develop a function to read CSV data from a file into memory (6 marks)

The task1.csv data file contains multiple columns of data values. This function should accept a single parameter: the data file name. It should make use of the function developed in FR2 to read all the columns of data from the data file and add them to a Dictionary data structure. The Dictionary should contain one entry for each column in the CSV data file. An illustration of this is given in Appendix 2.
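The FR3 description could be sketched as follows. The name read_csv is hypothetical, and read_column here is a compact stand-in for the FR2 function so that the example runs on its own; it assumes every data value is numeric and that the file has a header row.

```python
def read_column(file_name, column_number):
    """Compact stand-in for the FR2 function: returns (values, column_name)."""
    with open(file_name, "r") as data_file:
        rows = [line.strip().split(",") for line in data_file if line.strip()]
    name = rows[0][column_number].strip()
    # Assumption: every value in the column is numeric.
    values = [float(row[column_number]) for row in rows[1:]]
    return values, name


def read_csv(file_name):
    """Return a dictionary mapping each column name to its list of values."""
    # Read only the header row to find out how many columns there are.
    with open(file_name, "r") as data_file:
        header = data_file.readline().strip()
    column_count = len(header.split(","))

    data = {}
    for column_number in range(column_count):
        values, name = read_column(file_name, column_number)
        data[name] = values
    return data
```

Because the FR2 function is called once per column, the file is re-read for every column. That cost follows from the requirement to reuse FR2 and is the kind of weakness the reflective report could discuss.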

FR4: Develop a function to calculate the Pearson Correlation Coefficient for two lists of data (10 marks)

This function should calculate the Pearson Correlation Coefficient for two lists of data. You should make use of the function developed in FR1. The function should take two lists of data (of equal length) as parameters. The function should return the calculated coefficient value.
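A sketch of FR4 under the standard definition of the coefficient: r = sum((x - mean_x) * (y - mean_y)) / sqrt(sum((x - mean_x)^2) * sum((y - mean_y)^2)). The names mean and pearson are hypothetical, and the error handling is an assumption. Because the math module is off limits, the square root is taken with ** 0.5.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty list (the FR1 function)."""
    return sum(values) / len(values)


def pearson(xs, ys):
    """Return the Pearson Correlation Coefficient of two equal-length lists."""
    if len(xs) != len(ys):
        raise ValueError("both lists must have the same length")
    if len(xs) < 2:
        raise ValueError("at least two data points are required")

    mean_x = mean(xs)
    mean_y = mean(ys)

    # Numerator: sum of the products of paired deviations from the means.
    product_sum = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    # Denominator: square root of the product of the summed squared deviations.
    squares_x = sum((x - mean_x) ** 2 for x in xs)
    squares_y = sum((y - mean_y) ** 2 for y in ys)
    denominator = (squares_x * squares_y) ** 0.5

    if denominator == 0:
        # Assumption: a constant list is reported as an error, since r is undefined.
        raise ValueError("coefficient is undefined when a list has zero variance")
    return product_sum / denominator
```

For example, pearson([1, 2, 3], [2, 4, 6]) describes a perfectly linear increasing relationship, so the result should be 1 up to floating-point rounding.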

FR5: Develop a function to generate a set of Pearson Correlation Coefficients for a given data file (10 marks)

This function should make use of the function developed in FR4 to generate a Pearson Correlation Coefficient for every pair of columns in the data read into memory in FR3. The function should return a list of tuples, each tuple containing the two column names and the associated correlation coefficient value. An illustration of this is given in Appendix 3.
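The pairing step in FR5 might look like the sketch below. The name correlate_all is hypothetical, and pearson is a compact stand-in for the FR4 function. Two assumptions are made that Appendix 3 may contradict: the function receives the dictionary built in FR3 rather than a file name, and each unordered pair of distinct columns is reported once.

```python
def pearson(xs, ys):
    """Compact stand-in for the FR4 function."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    product_sum = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    squares_x = sum((x - mean_x) ** 2 for x in xs)
    squares_y = sum((y - mean_y) ** 2 for y in ys)
    return product_sum / (squares_x * squares_y) ** 0.5


def correlate_all(data):
    """Return a list of (name_a, name_b, coefficient) tuples.

    data is assumed to be a dictionary of column name -> list of numbers.
    """
    names = list(data)
    results = []
    for i in range(len(names)):
        # Start at i + 1 so each pair of distinct columns appears only once.
        for j in range(i + 1, len(names)):
            coefficient = pearson(data[names[i]], data[names[j]])
            results.append((names[i], names[j], coefficient))
    return results
```

If Appendix 3 expects both orderings of each pair, or a column paired with itself, the inner loop would start at 0 instead of i + 1.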

FR6: Develop a function to print a custom table (8 marks)

This function should output the Pearson Correlation Coefficient for a subset of the column pairs generated in FR5. The function should take three parameters: list of correlation coefficient tuples, border character to use and which columns to include. High marks will be given for good use of padding in the table cells to improve readability. An illustration of this is given in Appendix 4.
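One way to approach the FR6 table is sketched here. The layout is an assumption, since Appendix 4 is not reproduced: a grid with the chosen columns along both axes, coefficients shown to four decimal places, and "-" where no coefficient was supplied for a pair. The names build_table and print_table are hypothetical, and the border is assumed to be a single character.

```python
def build_table(correlations, border, columns):
    """Return the table as a list of text lines, all of equal length."""
    # Index the coefficients by both orderings of each column pair.
    lookup = {}
    for name_a, name_b, coefficient in correlations:
        lookup[(name_a, name_b)] = coefficient
        lookup[(name_b, name_a)] = coefficient

    # One shared cell width keeps the borders aligned ("-0.1234" is 7 wide).
    width = max([len(name) for name in columns] + [7])

    def make_row(cells):
        # Centre each cell and pad it with one space on either side.
        padded = [" " + cell.center(width) + " " for cell in cells]
        return border + border.join(padded) + border

    rule = border * len(make_row([""] * (len(columns) + 1)))
    lines = [rule, make_row([""] + list(columns)), rule]
    for row_name in columns:
        cells = [row_name]
        for column_name in columns:
            coefficient = lookup.get((row_name, column_name))
            cells.append("-" if coefficient is None else format(coefficient, ".4f"))
        lines.append(make_row(cells))
        lines.append(rule)
    return lines


def print_table(correlations, border, columns):
    """Print the table built from the selected columns."""
    for line in build_table(correlations, border, columns):
        print(line)


# Hypothetical column names and values, purely for illustration.
print_table([("age", "income", 0.5), ("age", "pop", -0.25)], "*", ["age", "income"])
```

Keeping the line-building logic separate from the printing makes the padding easy to check without capturing printed output.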

Note: Please answer the question as set; it is urgent. This question relates to an MSc Data Science course.

