The following MATLAB program analyzes Divvy bike ridership data. The assignment is organized around a main function that calls the helper functions you need to write. The main function is called DivvyAnalysis, and you pass it the name of the input file along with the analysis to perform (1-9). For example, analysis 1 returns the total # of rides in the file, so you would call the function as DivvyAnalysis('file1.txt', 1).

Analysis 2 computes the average duration of a Divvy ride, in minutes. The rides are fairly short because the Divvy system is geared towards commuting, where rides < 30 minutes do not incur a surcharge.

Analysis 3 computes the percentage of riders who identify as male, or as female. In this case an additional parameter is passed to the DivvyAnalysis function: 1 => male, 2 => female. The percentages do not add up to 100% because in some cases the gender is not specified in the input data.

Analysis 4 computes the average age (in years) of Divvy riders who identify as male, or as female. Again an additional parameter is passed to the DivvyAnalysis function: 1 => male, 2 => female. The input data contains the birth year, not the exact birthday, to help keep the data anonymous. To compute the age of a rider, use (current year - birth year). Here's how to determine the current year in MATLAB:

c = clock(); %% returns a vector of values: year, month, day, etc.

currentyear = c(1); %% 1st value is the year

Note that some birth years in the data are 0 because the birth year was not specified. Do not include the 0 values when computing the average age. Even if the gender is specified, the birth year may be 0.
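As a minimal sketch of the averaging rule just described, the snippet below filters on gender and drops rows whose birth year is 0 before averaging. The column positions (6 for birth year, 7 for gender) follow the file layout given later in the assignment. The last row of the sample matrix is made up for illustration, and returning 0 when no rider matches is an assumption, not something the assignment specifies.

```matlab
% A few rides in the 7-column layout; the last row is hypothetical and
% shows a rider whose gender is given but whose birth year is missing.
data = [  66 171 5292 23  857 1989 1
          81 172  242 22 1303    0 0
         176  74 1252 21  595 1986 1
         331  28 2967 20  612 1970 2
           1   2    3 12  600    0 1 ];

gender = 1;                          % 1 => male, 2 => female

c = clock();                         % [year, month, day, hour, min, sec]
currentYear = c(1);

birthYears = data(:, 6);
keep = (data(:, 7) == gender) & (birthYears > 0);   % skip unknown years
ages = currentYear - birthYears(keep);

if isempty(ages)
    avgAge = 0;                      % assumed fallback when nobody matches
else
    avgAge = mean(ages);
end
```

Here only the riders born in 1989 and 1986 contribute to the male average; the hypothetical row with birth year 0 is ignored.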

Analysis 5 is to compute a histogram (counts) of the # of rides for each starting hour: how many rides started during the midnight hour, how many rides started during the 1am hour, how many rides started during the 2am hour, and so on. There should be a total of 24 counts, one count for each starting hour 0-23. Note that an additional parameter is passed to the DivvyAnalysis function in this case: 0 => all riders, 1 => male riders only, and 2 => female riders only. Hint: MATLAB has various histogram functions you may be able to use, e.g. hist() and histc().
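The counting step can be sketched without any histogram function at all. The loop below is one possible approach, not the required one: it assumes the starting hour is in column 4 and the gender in column 7, as in the file layout given later, and it uses the sample rows from that section.

```matlab
% Sample rows in the 7-column layout (column 4 = starting hour).
data = [  66 171 5292 23  857 1989 1
          81 172  242 22 1303    0 0
          81 172  182 22 1376    0 0
         176  74 1252 21  595 1986 1
         216 502 2773 21  636 1983 1
          35   2 2431 21  964    0 0
         331  28 2967 20  612 1970 2
         344 299 5498 20  984 1966 1
          36  43 2741 20 1093    0 0
          36  43  561 20 1091    0 0
         110 142  141 19  296 1986 1 ];

gender = 0;                          % 0 => all riders, 1 => male, 2 => female

if gender == 0
    hours = data(:, 4);
else
    hours = data(data(:, 7) == gender, 4);
end

counts = zeros(1, 24);               % counts(h+1) holds rides starting in hour h
for h = 0:23
    counts(h + 1) = sum(hours == h);
end
```

Because MATLAB indexes from 1, the count for hour h lives at position h+1, so the four rides that start at 8pm (hour 20) end up in counts(21).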

Analyses 6 and 7 compute the total # of rides that start at a given station, or end at a given station. An additional parameter is passed to the DivvyAnalysis function denoting the station ID. The assignment's sample outputs for these analyses involve stations 211, 349, and 351 from the divvy1.csv file.
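A rough sketch of the per-station counts, assuming the "from" station is in column 1 and the "to" station in column 2 (per the file layout given later). The station IDs used here come from the sample rows in that section, not from divvy1.csv.

```matlab
% A few sample rows in the 7-column layout.
data = [  66 171 5292 23  857 1989 1
          81 172  242 22 1303    0 0
          81 172  182 22 1376    0 0
         176  74 1252 21  595 1986 1
          36  43 2741 20 1093    0 0
          36  43  561 20 1091    0 0 ];

stationID = 81;

% Comparing a column against a scalar gives a logical vector;
% summing it counts the matching rides.
checkouts = sum(data(:, 1) == stationID);   % rides that started at 81
checkins  = sum(data(:, 2) == 172);         % rides that ended at 172
missing   = sum(data(:, 1) == 999);         % a station with no rides gives 0
```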

Finally, analyses 8 and 9 compute the top-10 stations in terms of the largest # of rides starting from these stations, or ending at these stations. The analyses return 10x2 matrices where column 1 holds the top-10 station ids and column 2 holds the corresponding # of rides starting / ending at these stations. For example, for analysis 8 on input file file1.txt, the first result is station #3 with a total of 5 rides starting at that station. Station #176 also had 5 rides starting at that station; it comes second because when there's a tie, the station ids are given in sorted order.
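One way to get the ranking and the tie-breaking rule is to count rides per station and then sort on two keys. The sketch below uses unique, accumarray, and sortrows. It is only one possible approach, and padding with zero rows when fewer than 10 stations exist is an assumption. The variable names are illustrative.

```matlab
% Sample rows in the 7-column layout (column 1 = "from" station).
data = [  66 171 5292 23  857 1989 1
          81 172  242 22 1303    0 0
          81 172  182 22 1376    0 0
         176  74 1252 21  595 1986 1
         216 502 2773 21  636 1983 1
          35   2 2431 21  964    0 0
         331  28 2967 20  612 1970 2
         344 299 5498 20  984 1966 1
          36  43 2741 20 1093    0 0
          36  43  561 20 1091    0 0
         110 142  141 19  296 1986 1 ];

% unique() returns the distinct station ids in ascending order, plus an
% index telling which distinct id each ride belongs to.
[stations, ~, idx] = unique(data(:, 1));
counts = accumarray(idx, 1);                 % # of rides per distinct station

% Sort by count descending (column 2), then by station id ascending
% (column 1) so that ties are listed in sorted id order.
ranked = sortrows([stations, counts], [-2 1]);

top10 = zeros(10, 2);                        % assumed: pad with zeros if < 10
n = min(10, size(ranked, 1));
top10(1:n, :) = ranked(1:n, :);
```

With these rows, stations 36 and 81 both have 2 rides, and 36 is listed first because of the tie rule. Using column 2 of the data instead of column 1 gives the check-in ranking.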

The MATLAB program is given below:

function Result = DivvyAnalysis(filename, analysis, optional)
    %%
    %% let's make sure the file exists:
    %%
    if exist(filename, 'file') ~= 2
        fprintf('**Error: file "%s" cannot be found\n', filename);
        Result = '**Error: file not found';
        return;
    end

    %%
    %% Load the data:
    %%
    data = load(filename);

    %%
    %% Perform requested analysis:
    %%
    if analysis == 1
        Result = NumRides(data);
    elseif analysis == 2
        Result = AverageRide(data);
    elseif analysis == 3
        Result = RiderPercentage(data, optional);
    elseif analysis == 4
        Result = AverageAge(data, optional);
    elseif analysis == 5
        Result = RideHistogram(data, optional);
    elseif analysis == 6
        Result = StationCheckouts(data, optional);
    elseif analysis == 7
        Result = StationCheckins(data, optional);
    elseif analysis == 8
        Result = Top10Checkouts(data);
    elseif analysis == 9
        Result = Top10Checkins(data);
    else
        Result = '**Error: invalid analysis parameter';
    end
end

%% total # of rides in this dataset:
function Rides = NumRides(data)
    Rides = 0;
end

%% average ride duration, in seconds:
function AvgRide = AverageRide(data)
    AvgRide = 0;
end

%% percentage of male (1) or female (2) riders:
function Percentage = RiderPercentage(data, gender)
    Percentage = 0;
end

%% average age of male (1) or female (2) riders:
function AvgAge = AverageAge(data, gender)
    AvgAge = 0;
end

%% # of rides each hour from midnight to 11pm; pass 0 as "gender" to
%% count all rides, otherwise 1 => male and 2 => female:
function Result = RideHistogram(data, gender)
    Result = zeros(1, 24); %% row vector of 24 zeros: [0, 0, 0, ...]
end

%% total # of rides where the ride started at this station:
function Checkouts = StationCheckouts(data, stationID)
    Checkouts = 0;
end

%% total # of rides where the ride ended at this station:
function Checkins = StationCheckins(data, stationID)
    Checkins = 0;
end

%% which stations have the largest # of rides that originated here? Return
%% a 10x2 matrix where column 1 are the top-10 station ids and column 2 are
%% the corresponding # of rides starting from these stations:
function Result = Top10Checkouts(data)
    Result = [zeros(10,1), zeros(10,1)]; %% return 10x2 matrix:
end

%% which stations have the largest # of rides that ended here? Return
%% a 10x2 matrix where column 1 are the top-10 station ids and column 2 are
%% the corresponding # of rides ending at these stations:
function Result = Top10Checkins(data)
    Result = [zeros(10,1), zeros(10,1)]; %% return 10x2 matrix:
end

The assignment is to perform some basic analysis of Divvy's ridership data. In particular, your task is to write functions that perform the following analyses:

1. Total # of rides

2. Average duration of a ride, in seconds

3. Percentage of riders, per gender

4. Average age of riders, per gender

5. Histogram of rides by starting hour, total and per gender

6. Total # of rides starting and ending at a given station

7. The top-10 stations with the most rides starting, and ending

Each line of the input file "file1.txt" describes one Divvy trip: the id of the station where the ride started, the id of the station where the ride ended, the bike id, the ride duration, and so on. A line consists of 7 values, in this order:

From station id: integer, id of station where bike was checked out / ride started

To station id: integer, id of station where bike was returned / ride ended

Bike id: integer

Starting hour: the hour, in military time, of when the ride started (e.g. 0 => midnight and 23 => 11pm)

Trip duration: integer, in seconds

Birth year: 0 => not specified, otherwise year that rider was born (e.g. 1986)

Gender: 0 => not specified, 1 => if rider identifies as male, 2 => if rider identifies as female

66 171 5292 23 857 1989 1
81 172 242 22 1303 0 0
81 172 182 22 1376 0 0
176 74 1252 21 595 1986 1
216 502 2773 21 636 1983 1
35 2 2431 21 964 0 0
331 28 2967 20 612 1970 2
344 299 5498 20 984 1966 1
36 43 2741 20 1093 0 0
36 43 561 20 1091 0 0
110 142 141 19 296 1986 1
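To make the column layout concrete, here is a small sketch that places the sample rows above in a matrix (the same shape load() would return for a whitespace-separated numeric file) and computes the first three statistics. The variable names are illustrative. The duration is stored in seconds, so the sketch shows the average in both seconds and minutes without assuming which unit the grader expects.

```matlab
% The 11 sample rides listed above, one row per ride, 7 columns each.
data = [  66 171 5292 23  857 1989 1
          81 172  242 22 1303    0 0
          81 172  182 22 1376    0 0
         176  74 1252 21  595 1986 1
         216 502 2773 21  636 1983 1
          35   2 2431 21  964    0 0
         331  28 2967 20  612 1970 2
         344 299 5498 20  984 1966 1
          36  43 2741 20 1093    0 0
          36  43  561 20 1091    0 0
         110 142  141 19  296 1986 1 ];

numRides   = size(data, 1);                  % one row per ride
avgSeconds = mean(data(:, 5));               % column 5 = trip duration (s)
avgMinutes = avgSeconds / 60;                % same average, in minutes

% Percentages are taken over all rides, so male + female can be < 100
% when some rows have gender 0 (not specified).
pctMale   = 100 * sum(data(:, 7) == 1) / numRides;
pctFemale = 100 * sum(data(:, 7) == 2) / numRides;
```

For these rows, 5 of the 11 riders are male, 1 is female, and 5 are unspecified, which is why the two percentages sum to well under 100.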
