
Question


Part of your job as a data analyst will be to find information on the web, clean it, and present the data in a meaningful format other people can understand. In this assignment, you are going to do this with a tourism dataset from the World Bank. You will download the data, clean it, and then write a function that lets a user input a variable number of countries and years; the output will be a graph of the countries vs. years. This assignment directly maps to the following learning outcomes:

 

  1. Utilize the R programming language to write functions and loops, examine and explore data, and utilize libraries for added functionality for data analysis such as: dplyr, ggplot2, lubridate, and tidyr.

  2. Demonstrate how to turn unstructured data (messy data) into structured data (tidy data).

  3. Demonstrate how to search for online databases, find open data sources on the internet, and utilize the data.

  4. Retrieve data from the web, clean it, and present the data to a user in a readable, often visual, format which utilizes tools and techniques learned throughout the course.


 

 

Directions

  1. Download the dataset from:
    http://data.worldbank.org/indicator/ST.INT.RCPT.CD

  2. Unzip the file and load it into RStudio
    1. You can use read.csv(), or read.xlsx() from library(xlsx)
  3. Clean and tidy the data
    1. Note: you need to convert the data from wide format to long format.
  4. Plot a graph of tourism dollars vs. time for 3 countries
    1. You need to use ggplot() for this part of the problem. If you use another plotting function, e.g. plot() or qplot(), you will only receive 50% credit for this part of the assignment.
    2. Convert your y-axis to a log axis.
  5. Make a function by wrapping your code in a function definition with arguments
    1. Your arguments should be three countries
    2. Extra Credit part 1 - Use the "..." argument to pass multiple countries and multiple years to the function. This will allow the user to plot as many countries, and for whatever years, they want.
    3. Extra Credit part 2 - Create an argument that allows you to select a sequential range of years, e.g. 1997:2005
  6. Save the code as a .R file or a .Rmd file and upload the file to Moodle
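
Steps 3 and 4 of the directions (wide to long, then a ggplot() line chart with a log y-axis) could be sketched roughly as follows. This is only one possible approach, not the graded solution: the data frame tourism_wide is a made-up stand-in for the real World Bank file, its dollar values are invented, and the column names Country.Name and X1995... assume the file was loaded with read.csv(), which prefixes numeric column names with "X".

```r
library(tidyr)
library(dplyr)
library(ggplot2)

# Toy stand-in for the World Bank file; the values are made up for illustration.
# read.csv() would name the year columns X1995, X1996, ... (assumed here).
tourism_wide <- data.frame(
  Country.Name = c("China", "Ghana", "United States"),
  X1995 = c(8.7e9, 3.0e7, 9.3e10),
  X1996 = c(1.0e10, 3.2e7, 1.0e11),
  X1997 = c(1.2e10, 3.5e7, 1.1e11)
)

# Wide -> long: one row per country-year pair.
tourism_long <- tourism_wide %>%
  pivot_longer(
    cols = starts_with("X"),   # every year column
    names_to = "Year",
    names_prefix = "X",        # strip the "X" that read.csv() adds
    values_to = "Dollars"
  ) %>%
  mutate(Year = as.integer(Year)) %>%
  filter(!is.na(Dollars))      # drop years with no reported value

# Line chart of dollars vs. years, one line per country, log10 y-axis.
p <- ggplot(tourism_long, aes(x = Year, y = Dollars, colour = Country.Name)) +
  geom_line() +
  scale_y_log10() +
  labs(x = "Years", y = "Dollars", colour = "Countries")
print(p)
```

With the real file you would also drop the non-year columns (country code, indicator name, and so on) before or during the reshape; which columns those are depends on the download, so check names() on your own data first.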

 

Note: Your R function call and plot should look like the example below.

 

 

 


tourism_plot("China", "Ghana", "United States")
 

[Figure: line plot of Dollars (log-scale y-axis, ticks from 1e+08 to 1e+11) against Years (x-axis, 1995 to 2017), with one line each for the Countries China, Ghana, and United States.]
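
One way a function with this call signature might be written is sketched below. It is an illustration under stated assumptions rather than the expected answer: tourism_long here is a hypothetical, already-tidied data frame filled with invented values, and the argument names years and data are made up for this example. The "..." argument collects any number of country names, and years accepts a sequence such as 1997:2005.

```r
library(dplyr)
library(ggplot2)

# Hypothetical tidy data (made-up values): one row per country-year pair.
tourism_long <- expand.grid(
  Country.Name = c("China", "Ghana", "United States", "France"),
  Year = 1995:2005,
  stringsAsFactors = FALSE
)
tourism_long$Dollars <- 1e8 * (1 + seq_len(nrow(tourism_long)))

tourism_plot <- function(..., years = NULL, data = tourism_long) {
  countries <- c(...)  # collect however many country names were passed

  # Fail early on names that do not appear in the data.
  unknown <- setdiff(countries, unique(data$Country.Name))
  if (length(unknown) > 0) {
    stop("Unknown countries: ", paste(unknown, collapse = ", "))
  }

  # Default to every year available when no range is given.
  if (is.null(years)) years <- sort(unique(data$Year))

  subset_data <- data %>%
    filter(Country.Name %in% countries, Year %in% years)

  ggplot(subset_data, aes(x = Year, y = Dollars, colour = Country.Name)) +
    geom_line() +
    scale_y_log10() +
    labs(x = "Years", y = "Dollars", colour = "Countries")
}

# Three countries, restricted to a sequential range of years.
p <- tourism_plot("China", "Ghana", "United States", years = 1997:2005)
print(p)
```

Because years is a named argument placed after "...", it must be passed by name (years = 1997:2005); every unnamed argument is treated as a country.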

Step by Step Solution

Step 1: Download and load the dataset

Assuming you've downloaded the dataset and saved it as touri...
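
As a hedged illustration of the loading step, the sketch below writes a tiny temporary file that imitates the layout World Bank CSV downloads are assumed to have here (four metadata lines ahead of the header row) and reads it back with read.csv(). The path csv_path and all values are hypothetical; with the real download, point read.csv() at your unzipped file and confirm the number of lines to skip by inspecting the first few lines with readLines().

```r
# Build a tiny file that imitates the assumed layout: four metadata lines,
# then the header row, then the data. All values are made up.
csv_path <- tempfile(fileext = ".csv")  # hypothetical path; use your unzipped file
writeLines(c(
  '"Data Source","World Development Indicators"',
  '',
  '"Last Updated Date","2020-01-01"',
  '',
  '"Country Name","Country Code","Indicator Name","Indicator Code","1995","1996"',
  '"China","CHN","International tourism, receipts (current US$)","ST.INT.RCPT.CD","8.7e9","1.0e10"',
  '"Ghana","GHA","International tourism, receipts (current US$)","ST.INT.RCPT.CD","3.0e7","3.2e7"'
), csv_path)

# Peek at the top of the file to confirm where the header row sits.
print(readLines(csv_path, n = 6))

# Skip the metadata lines so that the header row supplies the column names.
tourism_raw <- read.csv(csv_path, skip = 4, stringsAsFactors = FALSE)

# read.csv() makes the names syntactic: "Country Name" -> Country.Name, "1995" -> X1995.
print(names(tourism_raw))
```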
