Question

C. Programming Task 2

This programming task focuses on using NumPy/SciPy, Pandas, and Matplotlib/Seaborn to combine, clean and analyse two datasets related to student performance.

Two data files have been provided for this task. These data files provide some real data from the Open University.

The task2a.csv data file contains background information about 26746 students including gender, age, disability status and score.

The task2b.csv data file contains information about the number of click events made by 26074 students using the University's Virtual Learning Environment (VLE) system.

The files are available in the Assignment section on Blackboard.

Students are expected to follow appropriate coding standards such as code commenting, consistent identifier naming, code readability, and appropriate use of data structures.

You are expected to identify the strengths/weaknesses of your approach. For this programming task, you must write a reflective report which focuses on the process taken to develop a solution to the task. Please reflect on your experiences rather than simply describing what you did. The report should:

include an explanation of how you approached the task.

identify any strengths/weaknesses of the approach used.

consider how the approach used could be improved.

C.1. Requirements

| ID | Requirement | Description | Marks Available |
|---|---|---|---|
| FR7 | Read CSV data from two files and merge it into a single DataFrame | For this task you should use the task2a.csv and task2b.csv data files. | 2 |
| FR8 | Clean the merged data | Remove all rows from the merged data that contain a missing value in any column. Remove the following unnecessary columns: region, final_result and highest_education. | 3 |
| FR9 | Filter out unnecessary rows | Remove all rows where click_event is smaller than 10. | 2 |
| FR10 | Investigate the effects of engagement on attainment | Use an appropriate visualisation tool (such as Matplotlib or Seaborn) to investigate if there is any relation between the engagement (click events) and the level of attainment (score). You must include an explanation of your findings to achieve good marks for this requirement. | 6 |
| FR11 | Test the hypothesis that engagement has some effect on levels of attainment | Using an appropriate Python library, test if there is any statistically significant relation between engagement and attainment. You must include an explanation of your findings to achieve good marks for this requirement. | 4 |
| FR12 | Investigate the effects of disability on levels of attainment | Use an appropriate visualisation tool (such as Matplotlib or Seaborn) to investigate if there is any effect on levels of attainment due to disability. You must include an explanation of your findings to achieve good marks for this requirement. | 6 |
| FR13 | Test if there is any difference between the attainment of disabled and non-disabled students | Using an appropriate Python library, test if there is any statistically significant difference between disabled and non-disabled students. You must include an explanation of your findings to achieve good marks for this requirement. | 4 |
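As a rough illustration of FR7 to FR9, the sketch below merges, cleans and filters two small in-memory stand-ins for the data files. The join key `id_student`, the `gender`, `age` and `disability` column names, and the sample rows are assumptions made for illustration only; the real headers in task2a.csv and task2b.csv should be checked before reusing any of this. The inner join is also an assumption, since the brief does not say how unmatched students should be handled.

```python
import io

import pandas as pd

# Hypothetical stand-ins for task2a.csv and task2b.csv (invented rows,
# assumed column names). Student 2 has a missing score on purpose.
TASK2A = """id_student,gender,region,highest_education,age,disability,final_result,score
1,M,North,A Level,0-35,N,Pass,72
2,F,South,HE,35-55,Y,Fail,
3,F,North,A Level,0-35,N,Pass,65
4,M,South,HE,0-35,N,Pass,80
"""
TASK2B = """id_student,click_event
1,120
2,45
3,5
4,300
5,12
"""

UNNEEDED_COLUMNS = ["region", "final_result", "highest_education"]


def prepare(students_csv, clicks_csv, key="id_student", min_clicks=10):
    """Merge the two sources, then clean and filter the result."""
    students = pd.read_csv(students_csv)
    clicks = pd.read_csv(clicks_csv)

    # FR7: one DataFrame; an inner join keeps only students present in both.
    merged = students.merge(clicks, on=key, how="inner")

    # FR8: drop rows with a missing value in any column, then drop columns.
    cleaned = merged.dropna().drop(columns=UNNEEDED_COLUMNS)

    # FR9: remove rows where click_event is smaller than the threshold.
    filtered = cleaned[cleaned["click_event"] >= min_clicks]
    return filtered.reset_index(drop=True)


prepared = prepare(io.StringIO(TASK2A), io.StringIO(TASK2B))
print(prepared)
```

With the real files, the two `io.StringIO` arguments would be replaced by the file paths. The order inside FR8 matters: dropping the three columns before calling `dropna()` would keep rows whose only missing value sits in one of those columns, which is a choice worth discussing in the reflective report.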
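For the two visualisation requirements (FR10 and FR12), one possible starting point is a scatter plot of score against click events and a box plot of score by disability status. The sketch below uses Matplotlib on randomly generated data, so the picture it produces says nothing about the real datasets. The `disability` column name, its `N`/`Y` coding and the output file name are assumptions.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic placeholder data with assumed column names and coding.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "click_event": rng.integers(10, 2000, size=n),
    "score": rng.uniform(0, 100, size=n),
    "disability": rng.choice(["N", "Y"], size=n, p=[0.9, 0.1]),
})

fig, (ax_scatter, ax_box) = plt.subplots(1, 2, figsize=(10, 4))

# FR10: engagement against attainment. Click counts are usually heavily
# skewed, so a log-scaled x axis is one way to spread the points out.
ax_scatter.scatter(df["click_event"], df["score"], s=8, alpha=0.4)
ax_scatter.set_xscale("log")
ax_scatter.set_xlabel("click_event (log scale)")
ax_scatter.set_ylabel("score")
ax_scatter.set_title("Engagement vs attainment")

# FR12: distribution of score for each disability group.
levels = ["N", "Y"]
groups = [df.loc[df["disability"] == lvl, "score"].to_numpy() for lvl in levels]
ax_box.boxplot(groups)
ax_box.set_xticks([1, 2])
ax_box.set_xticklabels(levels)
ax_box.set_xlabel("disability")
ax_box.set_ylabel("score")
ax_box.set_title("Attainment by disability status")

fig.tight_layout()
fig.savefig("attainment_plots.png")  # hypothetical output file name
```

Seaborn equivalents such as `scatterplot` and `boxplot` would serve equally well. Whichever tool is used, the marks for these requirements depend on the written explanation of what the plots show.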
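For FR11, one option (not the only valid one) is a correlation test between click events and score using `scipy.stats`. The sketch below reports both Pearson's r, which measures linear association, and Spearman's rho, which is rank-based and less sensitive to skewed click counts. The data is synthetic, and the 0.05 significance level is an assumed convention rather than something stated in the brief.

```python
import numpy as np
from scipy import stats


def correlation_report(x, y, alpha=0.05):
    """Return Pearson and Spearman correlations with their p-values."""
    r, p_r = stats.pearsonr(x, y)
    rho, p_rho = stats.spearmanr(x, y)
    return {
        "pearson_r": float(r),
        "pearson_p": float(p_r),
        "spearman_rho": float(rho),
        "spearman_p": float(p_rho),
        # Null hypothesis: no association between the two variables.
        "reject_null_pearson": bool(p_r < alpha),
        "reject_null_spearman": bool(p_rho < alpha),
    }


# Synthetic placeholder data: score loosely tied to log(clicks) plus noise.
rng = np.random.default_rng(42)
clicks = rng.integers(10, 2000, size=500).astype(float)
score = np.clip(40 + 5 * np.log(clicks) + rng.normal(0, 10, size=500), 0, 100)

report = correlation_report(clicks, score)
for name, value in report.items():
    print(f"{name}: {value}")
```

A statistically significant correlation on its own does not show that engagement causes higher attainment, which is worth stating in the explanation of findings.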
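For FR13, a two-sample test comparing the scores of disabled and non-disabled students is one reasonable approach. The sketch below runs Welch's t-test (which does not assume equal variances) alongside the Mann-Whitney U test (which does not assume normality). Both choices, the 0.05 significance level and the synthetic group data are assumptions for illustration; the brief does not prescribe a particular test.

```python
import numpy as np
from scipy import stats


def compare_groups(scores_a, scores_b, alpha=0.05):
    """Compare two independent samples of scores with two different tests."""
    # Welch's t-test: compares means without assuming equal variances.
    t_stat, p_t = stats.ttest_ind(scores_a, scores_b, equal_var=False)
    # Mann-Whitney U: rank-based alternative for non-normal data.
    u_stat, p_u = stats.mannwhitneyu(scores_a, scores_b, alternative="two-sided")
    return {
        "welch_t": float(t_stat),
        "welch_p": float(p_t),
        "mannwhitney_u": float(u_stat),
        "mannwhitney_p": float(p_u),
        # Null hypothesis: no difference between the two groups.
        "reject_null_welch": bool(p_t < alpha),
        "reject_null_mannwhitney": bool(p_u < alpha),
    }


# Synthetic placeholder scores; the real groups would come from splitting
# the cleaned DataFrame on the disability column.
rng = np.random.default_rng(7)
non_disabled = np.clip(rng.normal(70, 15, size=400), 0, 100)
disabled = np.clip(rng.normal(68, 15, size=60), 0, 100)

result = compare_groups(non_disabled, disabled)
for name, value in result.items():
    print(f"{name}: {value}")
```

Because the two groups are likely to be very different in size, reporting the group sizes and an effect size next to the p-value gives a fuller picture than the test result alone.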

C.2. Deliverables

There is a single deliverable for this task: a Jupyter Notebook file (in .ipynb format) containing a complete solution to Programming Task 2.

You must use the template provided [1].

The Jupyter Notebook should also include a Development Process Report, written using Markdown, reflecting on the process taken to develop a solution to this task.

The report should not exceed 500 words.

C.3. Submission

[1] There is a Jupyter Notebook template available in the Assignment folder on Blackboard - UFCFVQ-15-M_Programming_Task_2_Template.ipynb

Note: Please follow the question and answer it. This question relates to an MSc in Data Science.
