Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

===== Database: We will use the San Francisco Bike share database. Only part of the data is included with this homework. Database Schema: CREATE

image

===== Database: We will use the San Francisco Bike share database. Only part of the data is included with this homework. Database Schema: CREATE TABLE station ( // bike stations id INTEGER PRIMARY KEY, name TEXT, lat NUMERIC, long NUMERIC, CREATE TABLE station_status ( // the number of available bikes and docks in a station at a given date and time id integer primary key autoincrement, station_id INTEGER, // foreign key to station.id CREATE TABLE trip ( id CREATE TABLE daily_weather ( name dock_count INTEGER, city TEXT, install date TEXT); Notes: - For all questions, include all columns shown in the expected result. - Use the round function to round to two decimal places (unless the expected result of a query has a different number of decimal places). All questions have the same number of points (100/13) ====== bikes_available INTEGER, docks_available INTEGER, time TEXT); 1) List trips with id number is less than or equal 7450. include all attributes in the expected result. Expected Result: id INTEGER PRIMARY KEY, duration INTEGER, start_station_id INTEGER, // foreign key to station.id end_station_id INTEGER, // foreign key to station.id bike_id INTEGER, subscription_type TEXT, 7442 75 7443 75 7444 66 7445 75 7446 25 7447 66 7448 69 7449 56 7450 56 zip_code INTEGER, start_d Text, // start date and time of trip end_d Text // end date and time of trip ); origin_station id integer primary key autoincrement, date TEXT, HADHGEES max_temperature_f INTEGER, mean_temperature_f INTEGER, min_temperature_f INTEGER, max_dew_point_f INTEGER, mean_dew_point_f INTEGER, min_dew_point_f INTEGER, max_humidity INTEGER, mean_humidity INTEGER, min_humidity INTEGER, 14 max_sea_level_pressure_inches NUMERIC, mean_sea_level_pressure_inches NUMERIC, min_sea_level_pressure_inches NUMERIC, max_visibility_miles INTEGER, mean_visibility_miles INTEGER, min_visibility_miles INTEGER, max_wind_Speed_mph INTEGER, mean_wind_speed_mph INTEGER, max_gust_speed_mph INTEGER, precipitation_inches INTEGER, 12 cloud_cover INTEGER, events TEXT, wind_dir_degrees INTEGER, zip_code INTEGER); Mechanics Plaza (Market at Battery) Mechanics Plaza (Market at Battery) South Van Ness at Market Mechanics Plaza (Market at Battery) Stanford in Redwood City South Van Ness at Market San Francisco Caltrain 2 (330 Townsend) Beale at Market Beale at Market Davis at Jackson Embarcadero at Bryant 15 2) Modify the previous query to add the name of the destination station (end station). Note that you will need two joins (connections) with the station table. Expected Result: Expected Result: date Harry Bridges Plaza (Ferry Building) Powell at Post (Union Square) Embarcadero at Sansome 2014-09-01 85.8 2013-09-01 79.2 Harry Bridges Plaza (Ferry Building) Powell at Post (Union Square) Harry Bridges Plaza (Ferry Building) Embarcadero at Vallejo Harry Bridges Plaza (Ferry Building) start_station_id 8.00 13 8.00 8.02 8.05 16 8.07 3) Count the number of trips between every pair of stations. Keep only pairs of stations with 10 or more trips. Show the average trip duration between a pair of stations as well. Expected Result: 17 8.29 11 8.32 18 8.39 10 8.47 09 8.59 Expected Result: name 29 31 32 36 50 Expected Result: hour avg_bikes_available avg_docks_available Expected Result: name id ===================== avg_temp 94041 94041 5 23 26 38 4) The trip table includes data for two days only. Compute the average max tempreature and number of trips in these days. Join the trip and daily weather using the `start_d` and `date` attributes. You will need to use the date() function to extract the date part in the `date` and start_d` attributes. Expected Result: zip_code date. 94041 94041 94041 94041 95113 95113 94107 94041 94041 95113 94041 95113 95113 95113 95113 80 83 24 25 36 84 4 29 31 34 12 22 37 32 21 7 name Mechanics Plaza (Market at Battery) Mechanics Plaza (Market at Battery) South Van Ness at Market 43 Mechanics Plaza (Market at Battery) Stanford in Redwood City South Van Ness at Market San Francisco Caltrain 2 (330 Townsend) Beale at Market Beale at Market 5) The station_status table includes stats (bikes_available and docks_available) about every station recorded every few seconds. Compute the average of these two columns for every hour. You will need to use the strftime('%H', time) to extract the hour from the time attribute. Keep only the rows with the 10 lowest avg_bikes_available values (use order by and limit). 2014-09-01 111 2013-09-01 464 2014-09-01 461 2014-09-01 466 2014-09-01 470 46 46 46 46 46 46 46 46 46 Expected Result: station_id name 46 46 46 46 46 46 46 46 46 46 46 46 46 46 46 San Francisco Caltrain 2 (330 Townsend) Spear at Folsom Harry Bridges Plaza (Ferry Building) Palo Alto Caltrain Station San Jose City Hall Beale at Market Commercial at Montgomery San Francisco City Hall South Van Ness at Market Mechanics Plaza (Market at Battery) ======== number_trips stationID name 368 706 433179 San Francisco Caltrain (Townsend at 4th) 8326 Embarcadero at Sansome Embarcadero at Sansome 8327 8439 8440 Embarcadero at Sansome Embarcadero at Sansome Expected Result: date bike_id start_station ==== 9.74 9.74 9.72 9.68 9.67 9.44 9.42 9.36 9.26 9.14 6) Compute the euclidian distance for every pai of stations (usi the POWER and SQRT functions). Show the 10 closest pairs of stations. Use the round function to keep 6 decimal places. ==== date 2013-09-21 2015-02-07 64 2014-01-30 2015-05-07 56 2014-02-09 2014-02-27 60 2014-02-09 2014-12-05 60 2014-02-16 2014-02-19 56 2014-02-27 2014-12-05 60 2014-02-08 2014-03-02 57 2014-02-15 2014-11-01 56 2014-04-04 2014-11-30 56 2014-03-03 2014-11-20 56 2014-03-29 2015-04-25 58 2014-03-06 2014-12-05 60 2014-11-12 2015-03-22 62 2014-12-01 2014-12-23 59 2014-12-01 2015-02-09 59 2014-12-14 2015-03-02 52 2014-12-23 2015-02-09 59 Expected Result: bike_id ==== ====== 7) Find the 5 longest trips (i.e., the trip is between two stations that are far away when considering the euclidian distance). start_d Expected Result: station_id name name 2013-09-01 00:11 1508 2013-09-01 00:12 1513 2013-09-01 00:19 1345 2013-09-01 00:21 931 2013-09-01 00:23 159 2013-09-01 00:24 1033 2013-09-01 00:29 323 2013-09-01 00:38 825 2013-09-01 00:38 819 Adobe on Almaden San Mateo County Center Davis at Jackson Embarcadero at Sansome San Francisco Caltrain (Townsend at 4th) Powell at Post (Union Square) Embarcadero at Sansome 8) For a particular zip code, find two different dates that have the same mean tempreature and the same precipitation (precipitation_inches). Keep only rows where precipitation is larger than 0.01. Also, discard rows with precipitation = 'T', which means 'trace' (yes, SQLite allows string values in an integer column!). Harry Bridges Plaza (Ferry Building) Harry Bridges Plaza (Ferry Building) Embarcadero at Vallejo Harry Bridges Plaza (Ferry Building) Embarcadero at Sansome Redwood City Medical Center Park at Olive Ryland Park Santa Clara at Almaden name San Antonio Caltrain Station San Antonio Shopping Center Castro Street and El Camino Real California Ave Caltrain Station Redwood City Public Library Stanford in Redwood City 9) Suppose we need to look at how far a bike has traveled in a particular day. Consider a two-leg trip using the same bike on the same day (possibly by two different customers). The first leg begins at start_station and stops at middle_station. The second leg starts at the middle station and continues to an end_station. Find two-leg trips with the largest euclidean distance between the start and end stations. Show only the first 20 characters of all station names (use the substr function). Santa Clara County Civic Center Mezes Park mean_temperature_f precipitation_inches Townsend at 7th San Francisco Caltra South Van Ness at Ma Harry Bridges Plaza South Van Ness at Ma 5th at Howard South Van Ness at Ma 5th at Howard South Van Ness at Ma Market at Sansome San Francisco Caltrain (Townsend at 4th) Embarcadero at Folsom Steuart at Market University and Emerson MLK Library Mechanics Plaza (Market at Battery) Washington at Kearney Market at 10th Market at 10th Market at Sansome 15 16 17 18 19 20 21 22 23 Washington at Kearney 16 1 1 California Ave Caltrain Station 1 1 2 2 2 2 3 3 3 Washington at Kearney Washington at Kearney Washington at Kearney Washington at Kea Washington at Kearney Washington at Kearney 22 Washington at Kearney 23 middle_station num_description 1 number of arriving trips 1 number of arriving trips 1 number of arriving trips 1 number of arriving trips bikes available Harry Bridges Plaza (Ferry Building) 23 10) Which stations are haveing little traffic? Find stations with only one arriving trip (the station is the end station of one trip only). Alos, include stations with 23 or more available bikes at any time. The description column can be added as a fixed string in SQL (e.g., SELECT GPA, 'Out of 4.0' as description, ... ) San Antonio Caltrain Station San Antonio Shopping Center Palo Alto Caltrain Station SJSU 4th at San Carlos Redwood City Caltrain Station Cowper at University Castro Street and El Camino Real 4 Franklin at Maple 5 Paseo de San Antonio 6 start_d duration destination_station name 0 0 0 0 0 0 2013-09-01 00:11 2013-09-01 00:36 1508 2013-09-01 00:12 2013-09-01 00:37 1513 2013-09-01 00:19 2013-09-01 00:41 1345 2013-09-01 00:21 2013-09-01 00:36 931 2013-09-01 00:23 2013-09-01 00:25 159 2013-09-01 00:24 2013-09-01 00:42 1033 2013-09-01 00:29 2013-09-01 00:35 323 2013-09-01 00:38 2013-09-01 00:52 825 2013-09-01 00:38 2013-09-01 00:52 819 ====== 11) Find stations with the smallest number of trips. Consider both trips that starts or ends at a station. Expected Result: Washington at Kearney 00 7.9 7.79 7.9 7.95 Washington at Kearney 01 7.73 Washington at Kearney 02 7.84 Washington at Kearney 03 Washington at Kearney 04 Washington at Kearney 05 7.73 Washington at Kearney 06 7.9 Washington at Kearney 07 Washington at Kearney 08 7.81 Washington at Kearney 09 8.04 Washington at Kearney 10 5.54 Washington at Kearney 11 2.29 Washington at Kearney 12 2.91 Washington at Kearney 13 4.22 Washington at Kearney 14 3.08 Washington at Kearney 4.4 San Antonio Caltrain Station South Van Ness at Market South Van Ness at Market South Van Ness at Market South Van Ness at Market ==== 7.97 7.89 8.41 0.37 0.04 0.02 0.02 0.02 0.02 0.02 0.02 0.32 0.14 number_of_trips Beale at Market Beale at Market Harry Bridges Plaza (Ferry Building) Mechanics Plaza (Market at Battery) Franklin at Maple 0.28 0.02 0.03 0.02 0.02 0.05 0.02 Harry Bridges Plaza (Ferry Building) San Francisco Caltrain 2 (330 Townsend) Harry Bridges Plaza (Ferry Building) Harry Bridges Plaza (Ferry Building) num 8.01 end_d 8.87 9.85 9.79 9.73 ===== cnt 10 5404.80 10 1349.60 10 1477.10 11 7733.73 12 8081.08 12 5274.67 12 858.08 14 1157.21 16 3212.38 18 1435.44 end_station duration avg_duration 12) Find any bike that had trips to both 'Mountain View Caltrain Station' and 'Evelyn Park and Ride' (i.e., the bike used in trips to these stations). hour avg_bikes_available avg_docks_available 13) Find the average bikes_available and the average docks_available in every hour for the station with id 46. 7.1 7.27 7.16 7.21 7.1 7.27 7.1 7.05 7.19 6.96 9.46 euclidean_distance 0.000211 0.001228 0.001274 0.001749 0.002013 0.002183 0.002197 0.002202 0.002392 0.00243 12.71 12.09 10.78 11.92 10.6 7.03 7.11 6.99 6.59 6.13 5.15 5.21 5.27 euclidean_distance duration 28226 1792 1788 1651 1623 0.468929 0.03383 0.03383 0.03383 0.03383 San Antonio Caltrain 0.469227 Embarcadero at Sanso 0.03383 Embarcadero at Sanso 0.03383 Embarcadero at Sanso 0.03383 Embarcadero at Sanso 0.03383 euclidean_distance

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image

Step: 3

blur-text-image

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Introduction to Data Mining

Authors: Pang Ning Tan, Michael Steinbach, Vipin Kumar

1st edition

321321367, 978-0321321367

More Books

Students also viewed these Algorithms questions

Question

What is an access control list?

Answered: 1 week ago