
Question


Write a single-line shell command to download the source code of the webpage and save the file as news.html


1) (2 points) Assume you are given a webpage with the URL below:

money.cnn.com/2018/03/28/news/companies/amazon-stock/index.html

The webpage is a news article about "amazon". Please write a single-line shell command to download the source code of the webpage and save the file as news.html. [Hint: you can use wget on Ubuntu and curl on macOS.]

2) (2 points) HTML is a markup language in which tags tell the web browser how to format the content. You can refer to the link below for more details:

https://www.tutorialspoint.com/html/html_overview.htm

Your task is to extract the actual article from the webpage source code news.html by extracting the text tagged by <p>. Please use a single-line shell command to finish this task and save the article to the file news.txt. Note: the extracted article looks like the following:

Trump also tweeted about Amazon hurting the US Post Office. He has even called Amazon a "no-tax monopoly." Trump wants Amazon to pay more tax, because he's concerned about small retailers [illegible] Today's news adds gasoline to the fire that Amazon could see more regulation ahead, said Daniel Ives, head of technology research at GBH Insights. The last thing nervous tech [illegible] Trump had to do with that decision. Both he and the Justice Department deny the president's involvement

3) (3 points) The first two steps helped us collect data from a web resource. Next, we preprocess the article by removing some less important words from the text file, such as "a", "the", and so on. These words are not used for text mining. Please write a C program removeChar.c to finish this task. To simplify the problem, your C program only needs to read its input from standard input and remove only word [...]

Step by Step Solution

There are 3 steps involved.

Step: 1
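
For part 1 of the question, the following is a minimal sketch rather than the graded answer. It assumes that wget or curl is installed and that the page is still reachable. The URL is taken from the question as best it can be read, and the http:// scheme is an added assumption because the question gives none. The function names fetch_wget and fetch_curl are hypothetical wrappers used only to keep the example reusable; the one-line body of each is the actual command.

```shell
# Assumption: http:// is prepended, since the question's URL has no scheme.
URL="http://money.cnn.com/2018/03/28/news/companies/amazon-stock/index.html"

# Ubuntu: -q silences progress output, -O names the output file.
fetch_wget() { wget -q -O news.html "$1"; }

# macOS: -s silences progress output, -L follows redirects, -o names the output file.
fetch_curl() { curl -sL -o news.html "$1"; }

# The bare single-line forms would be:
#   wget -q -O news.html "$URL"
#   curl -sL -o news.html "$URL"
```

Calling fetch_wget "$URL" or fetch_curl "$URL" writes the page source to news.html in the current directory. The -L flag is included for curl because older news URLs often redirect, and curl does not follow redirects by default.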


Step: 2
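
For part 2 of the question, one possible sketch pipes grep into sed. It assumes each paragraph sits on a single line of news.html, which the question does not guarantee. The function name extract_paragraphs is a hypothetical wrapper; the pipeline inside it is the one-line command.

```shell
# Hypothetical helper wrapping the one-liner so it can be run on any file.
# grep -o prints only the matched <p ...>...</p> span from each line;
# '<p[ >]' accepts <p> and <p attr="..."> but not <pre>.
# sed then deletes every remaining tag, leaving only the text.
extract_paragraphs() {
  grep -o '<p[ >].*</p>' "$1" | sed 's/<[^>]*>//g'
}

# The bare single-line form for the question's file names would be:
#   grep -o '<p[ >].*</p>' news.html | sed 's/<[^>]*>//g' > news.txt
```

Running extract_paragraphs news.html > news.txt saves the article text. Two limitations are worth noting: a paragraph that spans several lines is missed, and several paragraphs on one line are joined without a separator because the match is greedy. A real HTML parser would be more robust, but the question asks for a single shell line.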


Step: 3
