KINGDOM OF SAUDI ARABIA | JAZAN UNIVERSITY
COLLEGE OF COMPUTER SCIENCE & INFORMATION TECHNOLOGY
2023-2024, SECOND SEMESTER
Course with code: Introduction to Data Science, ITEC 313; Section: 1600
Type of task: HOME ASSIGNMENT; Marks: 15
Assignment 5 Q n A
Date of Announcement: 7-12-2023; Deadline: 14-1-2024
Student_Id:                Stu_Name:
Q.1: Write down the profile of a Data Scientist. Search the web and describe any five tools/programs used for Data Science projects that are not mentioned in the course material slides. (C1, CLO:1.1, 1 Mark)
Answer:
Q.2: Write about CSV data storage format with an example. (C2, CLO:1.1,1 Mark)
Download a .CSV file from the internet and answer the below questions:
What is the URL of the website?
What is the file name?
What are the features of the dataset?
What are the data types of the fields?
NOTE: Same datasets should not be used to answer this question.
Answer:
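As a rough illustration of the CSV format and of how the features and field data types could be inspected, here is a minimal sketch. The CSV content, column names, and the `infer_type` helper are all hypothetical (they do not come from any real downloaded dataset), and types are guessed from the first data row only.

```python
import csv
import io

# Hypothetical CSV content: a header row followed by comma-separated records.
raw = """name,age,height_cm,enrolled
Aisha,21,162.5,True
Omar,23,175.0,False
Lina,20,158.2,True
"""

def infer_type(text):
    """Guess a simple data type label for one CSV cell (illustrative helper)."""
    for caster, label in ((int, "integer"), (float, "float")):
        try:
            caster(text)
            return label
        except ValueError:
            pass
    if text in ("True", "False"):
        return "boolean"
    return "string"

reader = csv.DictReader(io.StringIO(raw))
rows = list(reader)

# The header row gives the features (column names) of the dataset.
features = reader.fieldnames

# Guess each field's type from the first data row only (an assumption).
field_types = {name: infer_type(rows[0][name]) for name in features}

print("Features:", features)
print("Types:", field_types)
```

For a real file, `io.StringIO(raw)` would be replaced by `open(path, newline="")` on the downloaded file.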
Q.3: Why is data noisy? How to clean it? Give three examples other than the one mentioned in the slide. (C2, CLO:1.1,1 Mark)
Answer:
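One common way to reduce noise is smoothing by bin means: sort the values, split them into equal-size bins, and replace each value with its bin's mean. The sketch below is only an illustration; the `prices` values and the bin size of 3 are assumed example inputs, not data from this assignment.

```python
def smooth_by_bin_means(values, bin_size):
    """Sort values, group them into bins of bin_size, and replace each
    value with the mean of its bin (equal-frequency binning)."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        bin_values = ordered[start:start + bin_size]
        bin_mean = sum(bin_values) / len(bin_values)
        smoothed.extend([bin_mean] * len(bin_values))
    return smoothed

# Hypothetical noisy measurements.
prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]
smoothed_prices = smooth_by_bin_means(prices, 3)
print(smoothed_prices)
```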
Q.4: Scale down the given data using the data transformation methods given below. (C2, CLO:2.1, 3 Marks)
min-max normalization
2000, 4000, 6000, 9000, 12000
Answer:
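A minimal sketch of min-max normalization, reading the data line as the five values 2000, 4000, 6000, 9000, 12000 and assuming a target range of [0, 1] (the question does not state the target range). Each value v is mapped to (v - min) / (max - min) * (new_max - new_min) + new_min.

```python
def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Rescale values linearly so min(values) maps to new_min
    and max(values) maps to new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; min-max scaling is undefined")
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]

data = [2000, 4000, 6000, 9000, 12000]
scaled = min_max_normalize(data)  # target range [0, 1] is an assumption
print(scaled)
```

With these inputs the smallest value maps to 0, the largest to 1, and 6000 maps to (6000 - 2000) / 10000 = 0.4.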
z-score normalization
2000, 6000, 8000, 10000, 14000, 15000
Answer:
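A minimal sketch of z-score normalization, reading the data line as 2000, 6000, 8000, 10000, 14000, 15000. Each value v becomes (v - mean) / standard deviation. Whether the course expects the population or the sample standard deviation is not stated, so the population form is assumed by default and the `sample` flag switches to the other.

```python
import statistics

def z_score_normalize(values, sample=False):
    """Return (v - mean) / std for each value.
    sample=False uses the population standard deviation (an assumption);
    sample=True uses the sample standard deviation (n - 1 denominator)."""
    mean = statistics.mean(values)
    std = statistics.stdev(values) if sample else statistics.pstdev(values)
    if std == 0:
        raise ValueError("all values are identical; z-scores are undefined")
    return [(v - mean) / std for v in values]

data = [2000, 6000, 8000, 10000, 14000, 15000]
z_scores = z_score_normalize(data)
for value, z in zip(data, z_scores):
    print(f"{value:>6} -> {z:+.3f}")
```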
decimal scaling
353274586520745489742
Answer:
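A minimal sketch of decimal scaling, where each value v becomes v / 10^j and j is the smallest integer such that the largest absolute scaled value is below 1. The data line above is run together in the source, so the split into seven three-digit values used here is only an assumption; substitute the actual values if they differ.

```python
def decimal_scale(values):
    """Return (j, scaled) where scaled = v / 10**j and j is the smallest
    non-negative integer making max(|scaled|) < 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / (10 ** j) >= 1:
        j += 1
    return j, [v / 10 ** j for v in values]

# ASSUMED reading of the run-together data line; not confirmed by the source.
data = [353, 274, 586, 520, 745, 489, 742]
j, scaled = decimal_scale(data)
print("j =", j)
print(scaled)
```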
Q.5: Calculate the mean, median, mode, range and IQR for the given dataset. (C3, CLO:2.1, 2 Marks)
12, 14, 23, 37, 14, 22, 46, 78, 35, 72
Answer:
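A minimal sketch of these summary statistics, reading the data line as ten two-digit values: 12, 14, 23, 37, 14, 22, 46, 78, 35, 72. Quartile conventions differ between textbooks and libraries; this sketch assumes the "median of each half" method, so other methods may give a different IQR.

```python
import statistics

data = [12, 14, 23, 37, 14, 22, 46, 78, 35, 72]
ordered = sorted(data)
n = len(ordered)

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)
data_range = max(data) - min(data)

# Quartiles as medians of the lower and upper halves (assumed convention).
# For an odd n the middle value is excluded from both halves.
half = n // 2
lower_half = ordered[:half]
upper_half = ordered[half + n % 2:]
q1 = statistics.median(lower_half)
q3 = statistics.median(upper_half)
iqr = q3 - q1

print(f"mean={mean}, median={median}, mode={mode}")
print(f"range={data_range}, Q1={q1}, Q3={q3}, IQR={iqr}")
```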
Q.6: Calculate the sample variance (s^2) and standard deviation (s) for the given dataset. (C3, CLO:2.1, 2 Marks)
12, 15, 18, 25, 35, 48
Answer:
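A minimal sketch of the sample variance and standard deviation, reading the data line as 12, 15, 18, 25, 35, 48. The sample variance divides the sum of squared deviations from the mean by n - 1, and the standard deviation is its square root.

```python
import math
import statistics

data = [12, 15, 18, 25, 35, 48]
n = len(data)
mean = sum(data) / n

# Sum of squared deviations from the mean.
squared_deviations = sum((x - mean) ** 2 for x in data)

# Sample variance uses n - 1 in the denominator.
sample_variance = squared_deviations / (n - 1)
sample_std = math.sqrt(sample_variance)

print(f"mean = {mean}")
print(f"s^2  = {sample_variance:.2f}")
print(f"s    = {sample_std:.2f}")

# Cross-check against the standard library.
assert math.isclose(sample_variance, statistics.variance(data))
```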
Q.7: Apply Pearson's r correlation to the Running on Treadmill - Calories Burn dataset given below and draw the relationship between the two variables. (C4, CLO:2.2, 3 Marks)
Running on Treadmill (minutes) (X)    Calories Burn (Y)
40                                    200
30                                    178
20                                    45
15                                    55
10                                    25
8                                     45
5                                     22
Answer:
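A minimal sketch of Pearson's r for this dataset, reading each table row as minutes (X) followed by calories (Y): (40, 200), (30, 178), (20, 45), (15, 55), (10, 25), (8, 45), (5, 22). The coefficient is the sum of the products of deviations divided by the square root of the product of the two sums of squared deviations. Plotting is left out; a scatter plot of X against Y would show the relationship.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

minutes = [40, 30, 20, 15, 10, 8, 5]
calories = [200, 178, 45, 55, 25, 45, 22]

r = pearson_r(minutes, calories)
print(f"r = {r:.2f}")  # roughly 0.94: a strong positive correlation
```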
Q.8: Refer to the following dataset on students' attitudes and examination scores. Here, attitude is the predictor variable, and the score is the outcome variable calculated from attitude. Using linear regression, predict the score of a new student with an attitude value of 79. (C4, CLO:2.2, 2 Marks)
Given that:
Pearson's correlation coefficient: r = 0.94
Standard deviations: Sd_x = 3.10, Sd_y = 22.80
Means: mean of y = 159, mean of x = 70.6
Table: Attitude and score data
#     Attitude    Score
1     65          129
2     67          126
3     68          143
4     70          156
5     71          161
6     72          158
7     72          168
8     73          166
9     73          182
10    75          201
Answer:
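For Q.8, a minimal sketch of the regression prediction using only the summary statistics given in the question (r = 0.94, Sd_x = 3.10, Sd_y = 22.80, mean of x = 70.6, mean of y = 159). It assumes the usual simple linear regression relations: slope b = r * Sd_y / Sd_x, intercept a = mean_y - b * mean_x, and prediction y = a + b * x.

```python
# Summary statistics as stated in the question.
r = 0.94
sd_x, sd_y = 3.10, 22.80
mean_x, mean_y = 70.6, 159.0

# Least-squares line expressed through r and the standard deviations.
slope = r * sd_y / sd_x
intercept = mean_y - slope * mean_x

def predict_score(attitude):
    """Predicted score for a given attitude value on the fitted line."""
    return intercept + slope * attitude

new_attitude = 79
predicted = predict_score(new_attitude)
print(f"slope = {slope:.4f}, intercept = {intercept:.4f}")
print(f"predicted score for attitude {new_attitude}: {predicted:.2f}")
```

With these inputs the predicted score comes out at about 217. Note that an attitude of 79 lies above the largest observed value (75), so the prediction is an extrapolation.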