Question
This will help hone your regex skills, as well as introduce some Internet-related and module-related things.

Let's say that there was a particular section of a particular web page that you wanted delivered directly to your mailbox every week. You don't want to have to actually visit the web page every day because you're only interested in one small part of the page, and the site doesn't provide a useful RSS feed. Having the data mailed to you on a regular basis would be great. For this program, you're going to write a very basic web scraper that will accomplish this for you.

The first part is the hardest part, and that is extracting the relevant information out of the web page. To do this, we're going to use the backtick method (used in the last summary program and discussed later in the handout) of grabbing system output from a text-based web browser. Later in the semester we may look at using Perl libraries/modules to assist us in pulling web data into our programs for processing.

First, start off by visiting the web page at http://boxofficemojo.com/weekend/chart/ in your favorite web browser. The information that we're interested in is whatever movie information happens to be listed. When you're done examining the page, log in to your Unix system and issue the following command:

    lynx http://boxofficemojo.com/weekend/chart/

What you'll get is a text-based representation of the web page you just looked at in a graphical web browser. It's not very pretty, but all of the same information is there. You may have to allow or block cookies from the site before you get to a page that displays the information we're looking for. Try using the cursor keys to move around a little bit and get a feel for what it's like to use lynx to browse the web. Use "q" to quit when you're done.

Figure 1: Using lynx to browse the web

Now, issue the following command at the Unix prompt:

    lynx -dump http://boxofficemojo.com/weekend/chart/

Notice how this variation will literally dump the contents of the web page you were just looking at to the screen. It's possible that some of the output may be too long to display on one line of the screen and will automatically wrap at the end. To rectify this (because you're ultimately going to want it to not wrap), you can start up lynx with the -width option, to set the virtual screen width a little wider so text won't wrap:

    lynx -dump -width=300 http://boxofficemojo.com/weekend/chart/

If, when using the -dump option, the output scrolls by too fast (which it probably will), you can use the following command to save the output to a file called page.output:

    lynx -dump -width=300 http://boxofficemojo.com/weekend/chart/ > page.output

Bring up page.output in your favorite editor and scroll through the file. The numbers in brackets, such as [58], are the hyperlinks on the page, numbered starting at one from the top of the page. There will be a "bibliography" of links at the bottom of the page.output file with the actual web URLs for each numbered link.

    4 5 [56]Everest (2015) [57]Uni. $13,242,895 +83.4% 3,006 +2,461 $4,405 $23,282,700 $55 2
    5 2 [58]Black Mass [59]WB $11,031,215 -51.3% 3,188 - $3,460 $42,129,394 $53 2
    6 3 [60]The Visit [61]Uni. $6,674,280 -42.3% 2,967 -181 $2,250 $52,184,860 $5 3
    7 4 [62]The Perfect Guy [63]SGem $4,774,505 -51.0% 1,889 -341 $2,528 $48,895,640 $12 3

Figure 2: Sample page.output results

Please note that you cannot rely on the information being in the same spot of the output all the time, as the content of the web page changes day-to-day. You also cannot rely on the same bracketed link number (such as 63) being present in the same place every day, for the same reason. The examples above have been space-compressed, meaning that extra spaces have been condensed for purposes of displaying them here.

MORE INFORMATION

Now, try out the following program, which will grab the web page and put all of the output into a scalar called $allData. If you wanted to, you could put the output, line by line, into an array also. Please notice the use of backticks (found on the ~ key on the keyboard) here:

    #!/usr/bin/env perl
    use Modern::Perl;

    my $pageToGrab = "http://boxofficemojo.com/weekend/chart/";
    my $command    = "/usr/bin/lynx -dump -width=300 $pageToGrab";
    my $allData    = `$command`;  # capture output of $command to $allData
                                  # notice that those are backticks, not
                                  # single quotes
    print "$allData";             # This obviously prints out the output, as it
                                  # would if you executed the lynx command
                                  # normally

Doing this, you have the entire output of the web page in $allData, and now, if you're careful, you can extract the entire box office section out of that data using one well-crafted regex: a match, some parenthetical back-referencing and some well-placed modifiers to the match operator.

You'll need to use some regexes to identify the relevant data for each movie. Here's a generic format for that data. Make sure to take into account that the TITLE could contain punctuation and spaces.
This is just a basic format for the data; please note that, if you've used the -width option of lynx, all of these items probably appear on a "single line" for each movie, like this:

    CURWK LSTWK [JNK] TITLE [JNK] STUDIO WEEKEND CHNG1 SCREENS CHNG2 PERSCR CUME BUDGET WEEKS

Here is a description of these fields:

    CURWK    a number or a dash (-)
    LSTWK    a number or the letter N (the examples below also show a dash
             when there was no last-week position)
    [JNK]    typically a hyperlink reference from the lynx -dump command
    TITLE    the title; be aware that it will likely contain spaces
    [JNK]    typically a hyperlink reference from the lynx -dump command
    STUDIO   an abbreviation for the studio that distributed the movie
    WEEKEND  the weekend gross in dollars
    CHNG1    a percentage change from last week's weekend gross
    SCREENS  how many screens the movie played on last week
    CHNG2    the difference from last week's screen count
    PERSCR   the per-screen average in dollars
    CUME     the cumulative gross since release
    BUDGET   if known, the budget for the movie
    WEEKS    the number of weeks the movie has been in release

An actual example, pulled from the September 25th-27th, 2015 weekend box office report:

    4 5 [56]Everest (2015) [57]Uni. $13,242,895 +83.4% 3,006 +2,461 $4,405 $23,282,700 $55 2

In this example, here is what each of the pieces of the generic format would map to:

    CURWK    4
    LSTWK    5
    [JNK]    56
    TITLE    Everest (2015)
    [JNK]    57
    STUDIO   Uni.
    WEEKEND  $13,242,895
    CHNG1    +83.4%
    SCREENS  3,006
    CHNG2    +2,461
    PERSCR   $4,405
    CUME     $23,282,700
    BUDGET   $55
    WEEKS    2

An example of a debut from the same weekend:

    1 N [50]Hotel Transylvania 2 [51]Sony $48,464,322 - 3,754 - $12,910 $48,464,322 - 1

In this example, here is what each of the pieces of the generic format would map to:

    CURWK    1
    LSTWK    N
    [JNK]    50
    TITLE    Hotel Transylvania 2
    [JNK]    51
    STUDIO   Sony
    WEEKEND  $48,464,322
    CHNG1    -
    SCREENS  3,754
    CHNG2    -
    PERSCR   $12,910
    CUME     $48,464,322
    BUDGET   -
    WEEKS    1

An example of a movie without a LSTWK position from the same weekend:

    20 - [88]Jurassic World [89]Uni. $390,450 -38.7% 347 -406 $1,125 $650,442,741 $150 16

In this example, here is what each of the pieces of the generic format would map to:

    CURWK    20
    LSTWK    -
    [JNK]    88
    TITLE    Jurassic World
    [JNK]    89
    STUDIO   Uni.
    WEEKEND  $390,450
    CHNG1    -38.7%
    SCREENS  347
    CHNG2    -406
    PERSCR   $1,125
    CUME     $650,442,741
    BUDGET   $150
    WEEKS    16

MAIL REQUIREMENTS

Once you have obtained all of the movies from the Box Office report (process all entries on the page), you need to create an e-mail message containing most of this information. Your e-mail should look almost identical in format to the sample one:

- Your e-mail must display the URL that the data was scraped from at the top.
- Each movie's entry must display all information on one line; truncate movie titles at a maximum of 35 characters.
- Each movie's entry will contain the current week's position and the previous week's position.
- Each movie's entry will display the amounts for the weekend gross and cumulative total.
- Your e-mail must contain, at the end, a summary that includes:
    o the best debut of the week, including the debut's position
    o the "lowest"/"weakest" debut of the week, including the debut's position if known, or - if unknown
    o the biggest non-debut gain of the report, including how many spots it moved up the chart
    o the biggest loss of the report, including how many spots it moved down the chart

Movement must be represented as positive numbers. In the off chance that no losses or gains occurred in the report, use "NONE". If multiple movies had the biggest gain or loss, you must list them all in the order they appear in the list.

The use of sprintf and string concatenation will be very useful in formatting each line of text to build your message. Please see the sample e-mail message in this assignment to get an idea of how to format the information.

NOTE: You must process all movies at the boxofficemojo.com URL. The example e-mail lists all of the movies that were available when the program was run.
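The field layout and the sprintf advice above could fit together roughly as follows. This is a sketch, not the assignment's reference solution: the helper names parse_row and format_row are hypothetical, the regex is written only against the space-compressed sample rows shown in this handout (real lynx output may need adjusting), and plain strict/warnings stand in for Modern::Perl so the sketch runs without extra modules.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: split one "lynx -dump" chart row into named fields.
# Returns a hash reference, or nothing if the line is not a chart row.
sub parse_row {
    my ($line) = @_;
    return unless $line =~ m{
        ^\s*
        (\d+|-)      \s+     # CURWK: a number or a dash
        (\d+|N|-)    \s*     # LSTWK: a number, N for a debut, or a dash
        \[\d+\]              # link number before the title
        (.+?)        \s*     # TITLE: may contain spaces and punctuation
        \[\d+\]              # link number before the studio
        (\S+)        \s+     # STUDIO
        (\$[\d,]+)   \s+     # WEEKEND gross
        (\S+)        \s+     # CHNG1
        ([\d,]+)     \s+     # SCREENS
        (\S+)        \s+     # CHNG2
        (\$[\d,]+)   \s+     # PERSCR
        (\$[\d,]+)   \s+     # CUME
        (\S+)        \s+     # BUDGET
        (\d+)        \s*\z   # WEEKS
    }x;
    return {
        curwk   => $1, lstwk => $2, title => $3, studio => $4,
        weekend => $5, cume  => $10,
    };
}

# Hypothetical helper: build one e-mail line. "%-35.35s" both pads and
# truncates the title to 35 characters; the column widths are assumptions.
sub format_row {
    my ($row) = @_;
    return sprintf "%3s %3s  %-35.35s %12s %13s",
        $row->{curwk}, $row->{lstwk}, $row->{title},
        $row->{weekend}, $row->{cume};
}

# Single quotes keep Perl from interpolating the dollar amounts.
my $sample = '4 5 [56]Everest (2015) [57]Uni. $13,242,895 +83.4% 3,006 +2,461 $4,405 $23,282,700 $55 2';
my $row = parse_row($sample);
print format_row($row), "\n" if $row;
```

With these widths each formatted line is 71 characters, which stays inside the 78-character limit the assignment mentions. Lines that are not chart rows simply fail the match and can be skipped.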
On some days there may be more movies listed and on some days there may be fewer movies listed. Your program must handle however many movies are found at the URL provided.

SAMPLE E-MAIL MESSAGE:

    Data scraped from http://boxofficemojo.com/weekend/chart/
    Data for the week of September 30-October 2, 2016

     ##  ##  Movie Title                              Weekend          Cume
      1   N  Miss Peregrine's Home for Peculiar   $28,500,000   $28,500,000
      2   N  Deepwater Horizon                    $20,600,000   $20,600,000
      3   1  The Magnificent Seven (2016)         $15,700,000   $61,605,901
      4   2  Storks                               $13,800,000   $38,811,274
      5   3  Sully                                 $8,400,000  $105,387,463
      6   N  Masterminds (2016)                    $6,600,000    $6,600,000
      7  22  Queen of Katwe
      9   4  Bridget Jones's Baby
     10   5  Snowden
     11   8  Suicide Squad
     12   6  Blair Witch
     13   N  M.S. Dhoni: The Untold Story
     14   9  When the Bough Breaks
     15  11  Hell or High Water
     17  10  Kubo and the Two Strings
     18  15  The Secret Life of Pets
     19  14  No Manches Frida
     20  28  The Dressmaker
     21  19  The Beatles: Eight Days a Week - Th
     22   N  I Belonged to You
     23  13  Pete's Dragon (2016)
     24  23  Star Trek Beyond
     25  16  Sausage Party
     26  27  Finding Dory
     27  25  The Light Between Oceans
     28   N  Denial
     29  29  Florence Foster Jenkins
     30  33  Don't Think Twice
     31  26  The Hollars
     32  24  Mechanic: Resurrection
     33   N  American Honey
     34  36  Nerve
     35   N  A Man Called Ove
     36   N  Harry & Snowman
     37  20  The Wild Life (2016)
     38  35  Ben-Hur (2016)
     40  39  Hunt for the Wilderpeople
     41  41  Cafe Society
     42  18  Hillsong: Let Hope Rise
     43  43  Greater
     44  52  Indignation
     45  75  Demon
     46   -  Command and Control
     47  53  White Girl
     48  61  A Tale of Love and Darkness
     49  72  The Best Democracy Money Can Buy: A
     50   N  Do Not Resist
     51  78  Is That a Gun in Your Pocket?
     52  97  Girl Asleep
     53  83  Generation Startup
     54  99  Chronic

    Biggest Debut: Miss Peregrine's Home for Peculiar Children (1)
    Weakest Debut: Do Not Resist (50)
    Biggest Gain: Girl Asleep, Chronic (45 places)
    Biggest Loss: Hillsong: Let Hope Rise (24 places)

[Rows 8, 16, and 39, and the Weekend and Cume amounts from row 7 onward, are not recoverable from this copy of the sample.]

USING Mail::Sendmail TO SEND E-MAIL FROM Perl

Try this program out in your Unix account:

    #!/usr/bin/env perl
    use Modern::Perl;
    use Mail::Sendmail;

    # what gets shifted off of command line should be a valid e-mail addy
    my $mailTo = shift;

    # use your address here instead of me@nowhere.com!!!
    my $mailFrom    = 'me@nowhere.com';
    my $subjectLine = "Weekend Box Office Report";
    my $message     = "<pre>Hello\nHow\nAre You?\n\n";  # must be a string, not an array

    # be careful to get the Content-Type line correct, with
    # double quotes for the charset, etc
    my %mail = (
        To             => $mailTo,
        From           => $mailFrom,
        Subject        => $subjectLine,
        Message        => $message,
        'Content-Type' => 'text/html; charset="utf-8"',
    );

    if (sendmail %mail) {
        print "Successfully sent mail to $mailTo. Check your box!\n";
    }
    else {
        print "Error sending mail: $Mail::Sendmail::error\n";
    }

This program will send e-mail to the e-mail address listed in $mailTo and from the e-mail address listed in $mailFrom.

Goals

- Write a moderately involved program with Perl that grabs a web page, parses out specific information, and then e-mails that information in a formatted way to an e-mail address.
- Use external programs (lynx) and capture their output with backticks.
- Use Perl modules (Mail::Sendmail) to send e-mail.

- Your program should end with a usage statement if no command-line argument is provided.
- Remember to use use Modern::Perl; in your program.
- Remember to use #!/usr/bin/env perl as the very first line of your program.

Sample Program Available

You may run a version of my program by typing ~rfulkerson/samples/bomscraper on Loki. Please do so to get an idea about how the basic program needs to run and work. You can also run the sample program to compare your results to my results.

- You must have an Honor Pledge on file for the course for the program to be graded.
- You must have complete header documentation as outlined in the Course Materials section of Blackboard.
- You may only use material through lecture #10 for this assignment. You should only use regex material we've covered; if you use anything else, you must document where you obtained that material from.
- Your program must display all movies from the available data on the web page.
- Your e-mail must be sent from your @unomaha.edu address so that it's easy for me to tell whose e-mail results I'm grading, rather than trying to reverse engineer every e-mail to figure out which of the programs it came from.
- Your program must print each movie entry, one per line. If this requires you to truncate long movie names at 35 characters, then make sure that it happens. Assume that a line is no longer than 78 characters.
- Your program must use the use Modern::Perl; directive.
- Your program must not produce any errors with the use Modern::Perl; pragma. If it does, 2 points per instance will be taken off, to a maximum of 10 points. Remember the use of Data::Dumper discussed earlier this semester. This can help you debug uninitialized variables/array elements, etc. If you forgot how to use it, use perldoc Data::Dumper to find out.
- Your program must be able to be executed without explicitly stating the perl interpreter on the command line (i.e., you should be able to type ./scraper.pl address instead of perl ./scraper.pl address).
- Your program must have a usage statement that is displayed if no command-line argument is given or if too many command-line arguments are given, and your program should end after the usage statement is displayed. Make sure your usage statement displays the name of the currently running program (use $0) and gives an example of how to use the program.
- Your program must send e-mail to the e-mail address given as the command-line argument; do not hardcode e-mail addresses like 28508robertfulkerson.com or your own e-mail address into the recipient To: portion of your program.
- A portion of your grade will be on the aesthetic formatting of your e-mail, specifically how each entry looks. If your message looks funny in an e-mail service like Hotmail, Yahoo or GMail, try prepending the text <pre> to your message body. This will do a very basic HTML formatting of your text to preformat it in a Courier font.
- Your program must identify the movie with the best debut.
- Your program must identify the movie with the weakest debut.
- Your program must identify the movie(s) with the largest non-debut gain.
- Your program must identify the movie(s) with the largest non-debut loss.
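Two of the requirements above, the usage statement and the "largest non-debut gain" summary with ties, can be sketched as follows. The helper names (usage_text, args_ok, biggest_gain, gain_line), the example address, and the ", " separator between tied titles are all assumptions made for illustration, and plain strict/warnings stand in for Modern::Perl. The rows are assumed to be hashes with curwk, lstwk, and title keys, taken in chart order.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: usage text that names the running program via $0.
sub usage_text {
    return "Usage: $0 recipient\@example.com\n";
}

# Hypothetical helper: true only when exactly one argument was given.
# In a real program: die usage_text() unless args_ok(@ARGV);
sub args_ok {
    return @_ == 1;
}

# Return the largest upward move and every title that achieved it,
# in the order the rows were given.
sub biggest_gain {
    my @rows = @_;
    my ($best, @titles) = (0);
    for my $row (@rows) {
        # Skip debuts (N) and rows with an unknown position (-).
        next unless $row->{lstwk} =~ /^\d+$/ && $row->{curwk} =~ /^\d+$/;
        my $moved = $row->{lstwk} - $row->{curwk};   # positive = moved up
        next if $moved <= 0;
        if    ($moved > $best)  { $best = $moved; @titles = ($row->{title}); }
        elsif ($moved == $best) { push @titles, $row->{title}; }
    }
    return ($best, @titles);
}

# Build the summary line, falling back to NONE when nothing moved up.
sub gain_line {
    my ($places, @titles) = biggest_gain(@_);
    return "Biggest Gain: NONE" unless @titles;
    return sprintf "Biggest Gain: %s (%d places)", join(", ", @titles), $places;
}

# A few rows taken from the sample e-mail in this handout.
my @rows = (
    { curwk => 42, lstwk => 18,  title => 'Hillsong: Let Hope Rise' },
    { curwk => 50, lstwk => 'N', title => 'Do Not Resist' },
    { curwk => 52, lstwk => 97,  title => 'Girl Asleep' },
    { curwk => 54, lstwk => 99,  title => 'Chronic' },
);
print gain_line(@rows), "\n";   # prints "Biggest Gain: Girl Asleep, Chronic (45 places)"
```

The biggest loss could be handled the same way with the subtraction reversed (curwk minus lstwk), which keeps the reported movement positive as the requirements ask.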