
Question


I need help completing the remaining five questions of an assignment that must be solved using Scala in the Spark shell. The full list of questions is below; I only need help with questions 10 to 14:

1. Make a new HDFS directory called /loudacre/weblogs.

2. Move all of the files in the Linux directory /home/training/training_materials/data/weblogs/ to the HDFS directory created in step 1.

3. Create a val logfiles that refers to the weblogs HDFS directory and use this in subsequent commands to save some typing.

4. Create a val input that is an RDD containing all of the records in the logfiles directory.

5. Create a val inputJPG that contains only the records from input that contain jpg requests.

6. View the first 10 records using take.

7. Combine the functions of steps 4 and 5 in one line of code, adding a count of the records too. What is that count?

8. Use the map function to get the length of each record, and show the first 10 results.

9. Use the map function, splitting on space, to show the individual fields in each record, and show the first 10 results.

10. Use the previous results to show just the IP address in each record, and show the first 10 results.

11. Use the .foreach(println) technique to make the results in step 10 easier to read.

12. Save the results from step 10 in the HDFS file /loudcare/iplist

13. Use the -ls and -cat command options (and optionally the Hue browser) to show the HDFS directory /loudcare/iplist as well as the contents of the part file containing the IP addresses.

14. Write a program to display just the IP addresses and timestamps for all the records that contain jpg requests in the format IPAddress/Timestamp and show the first 10 results.
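Steps 10 to 13 could be approached along the following lines. This is only a sketch, not a verified solution. It assumes the IP address is the first space-separated field of each record, as in the sample log, and it runs on two shortened sample records in plain Scala so it can be tried without a cluster. The names `sample`, `ipOf`, and `ips` are made up for illustration, and the comments show what the corresponding RDD calls might look like, assuming `input` is the RDD from step 4.

```scala
// Two records from the sample log, with the trailing referrer and
// user-agent fields omitted for brevity (stand-in for the RDD `input`).
val sample = Seq(
  "3.94.78.5 - 69827 [15/Sep/2013:23:58:36 +0100] \"GET /KBDOC-00033.html HTTP/1.0\" 200 14417",
  "217.150.149.167 - 4712 [15/Sep/2013:23:56:06 +0100] \"GET /ronin_s4.jpg HTTP/1.0\" 200 5552"
)

// Hypothetical helper: the IP address is assumed to be the first field.
def ipOf(line: String): String = line.split(" ")(0)

// Step 10: keep only the first field of each record.
// spark-shell equivalent (assumption: `input` exists from step 4):
//   val ips = input.map(line => line.split(" ")(0))
//   ips.take(10)
val ips = sample.map(ipOf)

// Step 11: print one IP address per line instead of a single Array dump.
// spark-shell equivalent: ips.take(10).foreach(println)
ips.take(10).foreach(println)

// Step 12 (spark-shell only): saveAsTextFile creates a directory of part
// files and fails if the target already exists.
//   ips.saveAsTextFile("/loudcare/iplist")

// Step 13 (from a terminal; the part file name is typically part-00000):
//   hdfs dfs -ls /loudcare/iplist
//   hdfs dfs -cat /loudcare/iplist/part-00000
```

Calling `take(10)` before `foreach(println)` keeps the printing on the driver, so the output appears in the shell rather than in executor logs.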

This is a sample of what the log file looks like:

3.94.78.5 - 69827 [15/Sep/2013:23:58:36 +0100] "GET /KBDOC-00033.html HTTP/1.0" 200 14417 "http://www.loudacre.com" "Loudacre Mobile Browser iFruit 1"
3.94.78.5 - 69827 [15/Sep/2013:23:58:36 +0100] "GET /theme.JPG HTTP/1.0" 200 3576 "http://www.loudacre.com" "Loudacre Mobile Browser iFruit 1"
19.38.140.62 - 21475 [15/Sep/2013:23:58:34 +0100] "GET /KBDOC-00277.html HTTP/1.0" 200 15517 "http://www.loudacre.com" "Loudacre Mobile Browser Ronin S1"
19.38.140.62 - 21475 [15/Sep/2013:23:58:34 +0100] "GET /theme.css HTTP/1.0" 200 13353 "http://www.loudacre.com" "Loudacre Mobile Browser Ronin S1"
129.133.56.105 - 2489 [15/Sep/2013:23:58:34 +0100] "GET /KBDOC-00033.html HTTP/1.0" 200 10590 "http://www.loudacre.com" "Loudacre Mobile Browser Sorrento F00L"
129.133.56.105 - 2489 [15/Sep/2013:23:58:34 +0100] "GET /theme.css HTTP/1.0" 200 12295 "http://www.loudacre.com" "Loudacre Mobile Browser Sorrento F00L"
217.150.149.167 - 4712 [15/Sep/2013:23:56:06 +0100] "GET /ronin_s4_sales.html HTTP/1.0" 200 845 "http://www.loudacre.com" "Loudacre Mobile Browser MeeToo 1.0"
217.150.149.167 - 4712 [15/Sep/2013:23:56:06 +0100] "GET /theme.css HTTP/1.0" 200 738 "http://www.loudacre.com" "Loudacre Mobile Browser MeeToo 1.0"
217.150.149.167 - 4712 [15/Sep/2013:23:56:06 +0100] "GET /code.js HTTP/1.0" 200 938 "http://www.loudacre.com" "Loudacre Mobile Browser MeeToo 1.0"
217.150.149.167 - 4712 [15/Sep/2013:23:56:06 +0100] "GET /ronin_s4.jpg HTTP/1.0" 200 5552 "http://www.loudacre.com" "Loudacre Mobile Browser MeeToo 1.0"
209.151.12.34 - 45922 [15/Sep/2013:23:55:09 +0100] "GET /KBDOC-00259.html HTTP/1.0" 200 19362 "http://www.loudacre.com" "Loudacre Mobile Browser Sorrento F11L"
209.151.12.34 - 45922 [15/Sep/2013:23:55:09 +0100] "GET /theme.css HTTP/1.0" 200 17795 "http://www.loudacre.com" "Loudacre Mobile Browser Sorrento F11L"
184.97.84.245 - 144 [15/Sep/2013:23:54:55 +0100] "GET /KBDOC-00052.html HTTP/1.0" 200 12499 "http://www.loudacre.com" "Loudacre CSR Browser"
184.97.84.245 - 144 [15/Sep/2013:23:54:55 +0100] "GET /theme.css HTTP/1.0" 200 4979 "http://www.loudacre.com" "Loudacre CSR Browser"
233.60.251.2 - 33908 [15/Sep/2013:23:51:43 +0100] "GET /KBDOC-00292.html HTTP/1.0" 200 4779 "http://www.loudacre.com" "Loudacre Mobile Browser Ronin S2"
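For step 14, one possible approach is to filter for jpg requests and then join the first field (the IP address) with the fourth field (the timestamp). The sketch below makes several assumptions that are not stated in the assignment: the timestamp is taken without its leading bracket and without the `+0100` offset, and the jpg match is case-insensitive so that `/theme.JPG` also counts. The helper names `isJpgRequest` and `ipAndTimestamp` are hypothetical, and the code runs on three shortened records from the sample above rather than on the real RDD.

```scala
// Three records from the sample log, with the trailing referrer and
// user-agent fields omitted for brevity.
val sampleRecords = Seq(
  "3.94.78.5 - 69827 [15/Sep/2013:23:58:36 +0100] \"GET /theme.JPG HTTP/1.0\" 200 3576",
  "19.38.140.62 - 21475 [15/Sep/2013:23:58:34 +0100] \"GET /theme.css HTTP/1.0\" 200 13353",
  "217.150.149.167 - 4712 [15/Sep/2013:23:56:06 +0100] \"GET /ronin_s4.jpg HTTP/1.0\" 200 5552"
)

// Assumption: a "jpg request" is any record containing ".jpg" in any case.
// Drop toLowerCase if only lowercase ".jpg" should match.
def isJpgRequest(line: String): Boolean = line.toLowerCase.contains(".jpg")

// Assumption: field 0 is the IP address and field 3 is "[dd/Mon/yyyy:HH:mm:ss".
// The leading "[" is removed; the timezone offset (field 4) is left out.
def ipAndTimestamp(line: String): String = {
  val fields = line.split(" ")
  fields(0) + "/" + fields(3).stripPrefix("[")
}

val jpgHits = sampleRecords.filter(isJpgRequest).map(ipAndTimestamp)
jpgHits.take(10).foreach(println)
// prints:
// 3.94.78.5/15/Sep/2013:23:58:36
// 217.150.149.167/15/Sep/2013:23:56:06

// spark-shell equivalent (assumption: `logfiles` exists from step 3):
//   sc.textFile(logfiles)
//     .filter(line => line.toLowerCase.contains(".jpg"))
//     .map { line => val f = line.split(" "); f(0) + "/" + f(3).stripPrefix("[") }
//     .take(10)
//     .foreach(println)
```

If the timezone offset should be part of the timestamp, `fields(4).stripSuffix("]")` could be appended after a space.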
