Question

Regular expression Python question:

A webserver's access.log file lists all the attempts to access my webserver. A typical line from the log file looks like this:

35.211.184.102 - - [17/Jul/2021:00:20:14 +0000] "GET /home/ HTTP/1.1" 200 9731 "https://thackston.me/home/" "WordPress/5.7.2; https://thackston.me/home"

  • {hostname} : 35.211.184.102
  • {logname} : -
  • {username} : -
  • {date/time} : [17/Jul/2021:00:20:14 +0000]
  • {first-line-of-request} : "GET /home/ HTTP/1.1" (usually the word GET or POST, followed by a space, the URL, followed by a space, ending with HTTP/1.x, all inside double quotes)
  • {status} : 200
  • {bytes-sent} : 9731
  • {referrer} : "https://thackston.me/home/"
  • {user-agent} : "WordPress/5.7.2; https://thackston.me/home"
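The field layout above can be captured with a single regular expression. The following is a minimal sketch, not the official answer: the pattern, the group names, and the helper `parse_line` are all illustrative assumptions, and it assumes the fields are separated by single spaces as in the typical line.

```python
import re

# Assumed pattern for the nine fields listed above; group names are illustrative.
LOG_LINE = re.compile(
    r'(?P<hostname>\S+) (?P<logname>\S+) (?P<username>\S+) '
    r'\[(?P<datetime>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes_sent>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of the nine fields, or None if the line does not match."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


# The typical line from the question, split across literals for readability.
sample = (
    '35.211.184.102 - - [17/Jul/2021:00:20:14 +0000] '
    '"GET /home/ HTTP/1.1" 200 9731 "https://thackston.me/home/" '
    '"WordPress/5.7.2; https://thackston.me/home"'
)

fields = parse_line(sample)
print(fields["request"])  # the first line of the request, without the quotes
print(fields["status"])
```

Named groups keep each field addressable by name, which makes it easier to filter on the request or the status later.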

Write a Python script to extract a unique list of the URLs being accessed via a GET request. Display the results.

Modify the script to extract a unique list of the URLs being accessed via POST requests. Display the results.
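One possible approach to both tasks is sketched below. It is an illustration under stated assumptions rather than the expert's solution: the function name `unique_urls` is hypothetical, the pattern assumes the request has the form method, space, URL, space, HTTP/1.x inside double quotes, and the sample lines are taken from the example log with single spaces between fields.

```python
import re


def unique_urls(lines, method="GET"):
    """Return the distinct URLs requested with `method`, in first-seen order."""
    # Matches e.g. "GET /home/ HTTP/1.1" and captures only the URL part.
    pattern = re.compile(rf'"{re.escape(method)} (\S+) HTTP/1\.\d"')
    urls = {}  # dict keys give uniqueness while preserving insertion order
    for line in lines:
        match = pattern.search(line)
        if match:
            urls[match.group(1)] = None
    return list(urls)


# A few entries from the example log (single spaces between fields assumed).
sample_lines = [
    '209.141.35.200 - - [04/Jul/2021:00:21:39 +0000] "GET / HTTP/1.0" 400 0 "-" "-"',
    '162.62.123.46 - - [04/Jul/2021:01:03:33 +0000] "GET / HTTP/1.1" 302 4736 "-" "-"',
    '209.17.96.194 - - [04/Jul/2021:01:05:20 +0000] "GET /home/index.php HTTP/1.1" '
    '301 276 "http://35.211.184.102:80/" "Go http package"',
    '35.211.184.102 - - [04/Jul/2021:01:53:36 +0000] '
    '"POST /home/wp-admin/admin-ajax.php?action=wordfence_testAjax HTTP/1.1" 200 5170 '
    '"https://thackston.me/home/wp-admin/admin-ajax.php?action=wordfence_testAjax" '
    '"WordPress/5.7.2; https://thackston.me/home"',
    '116.75.229.201 - - [04/Jul/2021:02:12:15 +0000] '
    '"GET /boaform/admin/formLogin?username=ec8&psd=ec8 HTTP/1.0" 404 498 "-" "-"',
]

get_urls = unique_urls(sample_lines, "GET")
post_urls = unique_urls(sample_lines, "POST")

print("GET URLs:")
for url in get_urls:
    print(" ", url)

print("POST URLs:")
for url in post_urls:
    print(" ", url)
```

Because a file object yields its lines when iterated, the same function should work on the real log by passing the result of `open("access.log")` in place of `sample_lines`. Switching from GET to POST is then only a change to the `method` argument.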

Example of the access.log file:

35.190.168.80 - - [04/Jul/2021:00:03:17 +0000] "GET /home/?feed=rss2&page_id=2 HTTP/1.0" 200 5568 "-" "ZoominfoBot (zoominfobot at zoominfo dot com)"
205.185.119.153 - - [04/Jul/2021:00:06:19 +0000] "GET / HTTP/1.1" 302 196 "-" "Mozilla/5.0 (compatible, MSIE 10.0, Windows NT, DigExt)"
205.185.119.153 - - [04/Jul/2021:00:06:19 +0000] "GET /home/index.php HTTP/1.1" 301 295 "http://35.211.184.102" "Mozilla/5.0 (compatible, MSIE 10.0, Windows NT, DigExt)"
205.185.119.153 - - [04/Jul/2021:00:06:19 +0000] "GET /home/ HTTP/1.1" 200 5453 "http://35.211.184.102/home/index.php" "Mozilla/5.0 (compatible, MSIE 10.0, Windows NT, DigExt)"
138.128.118.130 - - [04/Jul/2021:00:06:58 +0000] "GET / HTTP/1.1" 301 565 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.190 Safari/537.36"
138.128.118.130 - - [04/Jul/2021:00:06:58 +0000] "GET / HTTP/1.1" 302 4748 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.190 Safari/537.36"
138.128.118.130 - - [04/Jul/2021:00:06:59 +0000] "GET /home/index.php HTTP/1.1" 301 502 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.190 Safari/537.36"
138.128.118.130 - - [04/Jul/2021:00:06:59 +0000] "GET /home/ HTTP/1.1" 200 5704 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.190 Safari/537.36"
138.128.118.130 - - [04/Jul/2021:00:07:00 +0000] "GET /robots.txt HTTP/1.1" 404 662 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.190 Safari/537.36"
138.128.118.130 - - [04/Jul/2021:00:07:02 +0000] "GET /ads.txt HTTP/1.1" 404 662 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.190 Safari/537.36"
35.211.184.102 - - [04/Jul/2021:00:10:30 +0000] "POST /home/wp-cron.php?doing_wp_cron=1625357430.4367249011993408203125 HTTP/1.1" 200 4929 "https://thackston.me/home/wp-cron.php?doing_wp_cron=1625357430.4367249011993408203125" "WordPress/5.7.2; https://thackston.me/home"
35.190.168.80 - - [04/Jul/2021:00:10:30 +0000] "GET /home/xmlrpc.php?rsd HTTP/1.0" 200 5044 "-" "ZoominfoBot (zoominfobot at zoominfo dot com)"
35.190.168.80 - - [04/Jul/2021:00:12:05 +0000] "GET /home/?page_id=14 HTTP/1.0" 200 9189 "-" "ZoominfoBot (zoominfobot at zoominfo dot com)"
35.190.168.80 - - [04/Jul/2021:00:12:27 +0000] "GET /home HTTP/1.0" 301 5054 "-" "ZoominfoBot (zoominfobot at zoominfo dot com)"
35.190.168.80 - - [04/Jul/2021:00:12:27 +0000] "GET /home/ HTTP/1.0" 200 5666 "-" "ZoominfoBot (zoominfobot at zoominfo dot com)"
45.41.104.178 - - [04/Jul/2021:00:14:07 +0000] "GET / HTTP/1.1" 302 177 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36"
35.190.168.80 - - [04/Jul/2021:00:17:33 +0000] "GET /eagle/ HTTP/1.0" 200 5236 "-" "ZoominfoBot (zoominfobot at zoominfo dot com)"
209.141.35.200 - - [04/Jul/2021:00:21:39 +0000] "GET / HTTP/1.0" 400 0 "-" "-"
35.211.184.102 - - [04/Jul/2021:00:38:20 +0000] "POST /home/wp-cron.php?doing_wp_cron=1625359100.0295848846435546875000 HTTP/1.1" 200 4929 "https://thackston.me/home/wp-cron.php?doing_wp_cron=1625359100.0295848846435546875000" "WordPress/5.7.2; https://thackston.me/home"
35.190.168.80 - - [04/Jul/2021:00:38:19 +0000] "GET /home/?feed=comments-rss2 HTTP/1.0" 200 5348 "-" "ZoominfoBot (zoominfobot at zoominfo dot com)"
35.190.168.80 - - [04/Jul/2021:00:39:03 +0000] "GET /home/wp-content/uploads/2019/12/d2l_enhancement_suite-2.1.0-fx.xpi HTTP/1.0" 200 19635 "-" "ZoominfoBot (zoominfobot at zoominfo dot com)"
35.190.168.80 - - [04/Jul/2021:00:54:57 +0000] "GET /home/index.php?rest_route=/ HTTP/1.0" 200 106317 "-" "ZoominfoBot (zoominfobot at zoominfo dot com)"
35.190.168.80 - - [04/Jul/2021:00:55:24 +0000] "GET /home/index.php?rest_route=%2Foembed%2F1.0%2Fembed&url=https%3A%2F%2Fthackston.me%2Fhome%2F&format=xml HTTP/1.0" 200 6188 "-" "ZoominfoBot (zoominfobot at zoominfo dot com)"
35.190.168.80 - - [04/Jul/2021:00:58:44 +0000] "GET /home/index.php?rest_route=/wp/v2/pages/2 HTTP/1.0" 200 10049 "-" "ZoominfoBot (zoominfobot at zoominfo dot com)"
209.17.97.34 - - [04/Jul/2021:01:01:42 +0000] "GET / HTTP/1.0" 302 196 "-" "Mozilla/5.0 (compatible; Nimbostratus-Bot/v1.3.2; http://cloudsystemnetworks.com)"
162.62.123.46 - - [04/Jul/2021:01:03:33 +0000] "GET / HTTP/1.1" 302 4736 "-" "-"
209.17.96.194 - - [04/Jul/2021:01:05:20 +0000] "GET / HTTP/1.1" 302 177 "-" "Mozilla/5.0 (compatible; Nimbostratus-Bot/v1.3.2; http://cloudsystemnetworks.com)"
209.17.96.194 - - [04/Jul/2021:01:05:20 +0000] "GET /home/index.php HTTP/1.1" 301 276 "http://35.211.184.102:80/" "Go http package"
209.17.96.194 - - [04/Jul/2021:01:05:20 +0000] "GET /home/ HTTP/1.1" 200 14631 "http://35.211.184.102:80/home/index.php" "Go http package"
194.48.199.78 - - [04/Jul/2021:01:24:00 +0000] "GET /sso/js/auth.js HTTP/1.1" 404 4978 "-" "() { :; }; echo ; /bin/bash -c 'expr 16356 * 999'"
194.48.199.78 - - [04/Jul/2021:01:36:04 +0000] "GET /sso/js/openam.js HTTP/1.1" 404 4978 "-" "() { :; }; echo ; /bin/bash -c 'expr 16356 * 999'"
207.46.13.0 - - [04/Jul/2021:01:53:05 +0000] "GET /robots.txt HTTP/1.1" 301 585 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
207.46.13.0 - - [04/Jul/2021:01:53:05 +0000] "GET /robots.txt HTTP/1.1" 404 5497 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
72.81.249.159 - - [04/Jul/2021:01:53:32 +0000] "GET / HTTP/1.1" 302 4980 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0"
72.81.249.159 - - [04/Jul/2021:01:53:33 +0000] "GET /home/index.php HTTP/1.1" 301 353 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0"
72.81.249.159 - - [04/Jul/2021:01:53:34 +0000] "GET /home/ HTTP/1.1" 200 5540 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0"
72.81.249.159 - - [04/Jul/2021:01:53:34 +0000] "GET /home/wp-includes/css/dist/block-library/style.min.css?ver=5.7.2 HTTP/1.1" 200 9110 "https://thackston.me/home/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0"
72.81.249.159 - - [04/Jul/2021:01:53:34 +0000] "GET /home/wp-includes/css/dist/block-library/theme.min.css?ver=5.7.2 HTTP/1.1" 200 1105 "https://thackston.me/home/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0"
72.81.249.159 - - [04/Jul/2021:01:53:34 +0000] "GET /home/wp-content/themes/twentysixteen/js/skip-link-focus-fix.js?ver=20170530 HTTP/1.1" 200 959 "https://thackston.me/home/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0"
72.81.249.159 - - [04/Jul/2021:01:53:34 +0000] "GET /home/wp-content/themes/twentysixteen/genericons/genericons.css?ver=20201208 HTTP/1.1" 200 17241 "https://thackston.me/home/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0"
72.81.249.159 - - [04/Jul/2021:01:53:34 +0000] "GET /home/wp-content/themes/twentysixteen/css/blocks.css?ver=20190102 HTTP/1.1" 200 2590 "https://thackston.me/home/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0"
72.81.249.159 - - [04/Jul/2021:01:53:34 +0000] "GET /home/wp-content/themes/twentysixteen/js/functions.js?ver=20181217 HTTP/1.1" 200 2506 "https://thackston.me/home/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0"
72.81.249.159 - - [04/Jul/2021:01:53:34 +0000] "GET /home/wp-includes/js/jquery/jquery-migrate.min.js?ver=3.3.2 HTTP/1.1" 200 4906 "https://thackston.me/home/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0"
72.81.249.159 - - [04/Jul/2021:01:53:34 +0000] "GET /home/wp-content/themes/twentysixteen/style.css?ver=20201208 HTTP/1.1" 200 14332 "https://thackston.me/home/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0"
72.81.249.159 - - [04/Jul/2021:01:53:34 +0000] "GET /home/wp-includes/js/jquery/jquery.min.js?ver=3.5.1 HTTP/1.1" 200 31721 "https://thackston.me/home/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0"
72.81.249.159 - - [04/Jul/2021:01:53:34 +0000] "GET /home/wp-includes/js/wp-emoji-release.min.js?ver=5.7.2 HTTP/1.1" 200 5079 "https://thackston.me/home/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0"
72.81.249.159 - - [04/Jul/2021:01:53:34 +0000] "GET /home/wp-includes/js/wp-embed.min.js?ver=5.7.2 HTTP/1.1" 200 1136 "https://thackston.me/home/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0"
72.81.249.159 - - [04/Jul/2021:01:53:34 +0000] "GET /favicon.ico HTTP/1.1" 404 513 "https://thackston.me/home/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0"
72.81.249.159 - - [04/Jul/2021:01:53:35 +0000] "GET /home/wp-content/uploads/2018/02/Thackston_Russell_Headshot_medium.jpg HTTP/1.1" 200 768499 "https://thackston.me/home/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0"
72.81.249.159 - - [04/Jul/2021:01:53:35 +0000] "GET /home/?wordfence_lh=1&hid=745286DB5824692721E5CBD7408F15FD&r=0.8443106612550717 HTTP/1.1" 200 514 "https://thackston.me/home/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0"
35.211.184.102 - - [04/Jul/2021:01:53:36 +0000] "POST /home/wp-admin/admin-ajax.php?action=wordfence_testAjax HTTP/1.1" 200 5170 "https://thackston.me/home/wp-admin/admin-ajax.php?action=wordfence_testAjax" "WordPress/5.7.2; https://thackston.me/home"
35.211.184.102 - - [04/Jul/2021:01:53:36 +0000] "GET /home/wp-admin/admin-ajax.php?action=wordfence_doScan&isFork=0&scanMode=quick&cronKey=43752554214f9c2fb1e02ce2f343d6a2&signature=e5b8916ea31ec15b5e9c6aab5bd2a26e6d7b5449af32987a66de368092cff84a HTTP/1.1" 200 5157 "-" "WordPress/5.7.2; https://thackston.me/home"
35.211.184.102 - - [04/Jul/2021:01:53:33 +0000] "POST /home/wp-cron.php?doing_wp_cron=1625363613.6638200283050537109375 HTTP/1.1" 200 4929 "https://thackston.me/home/wp-cron.php?doing_wp_cron=1625363613.6638200283050537109375" "WordPress/5.7.2; https://thackston.me/home"
72.81.249.159 - - [04/Jul/2021:02:00:26 +0000] "GET / HTTP/1.1" 302 4980 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0"
35.211.184.102 - - [04/Jul/2021:02:00:27 +0000] "POST /home/wp-cron.php?doing_wp_cron=1625364026.9680650234222412109375 HTTP/1.1" 200 4929 "https://thackston.me/home/wp-cron.php?doing_wp_cron=1625364026.9680650234222412109375" "WordPress/5.7.2; https://thackston.me/home"
72.81.249.159 - - [04/Jul/2021:02:00:26 +0000] "GET /home/ HTTP/1.1" 200 5026 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0"
72.81.249.159 - - [04/Jul/2021:02:00:27 +0000] "GET /home/wp-content/themes/twentysixteen/genericons/genericons.css?ver=20201208 HTTP/1.1" 200 16921 "https://thackston.me/home/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0"
72.81.249.159 - - [04/Jul/2021:02:00:27 +0000] "GET /favicon.ico HTTP/1.1" 404 833 "https://thackston.me/home/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0"
72.81.249.159 - - [04/Jul/2021:02:00:31 +0000] "GET /home/wp-content/uploads/2018/09/ThackstonCV.pdf HTTP/1.1" 200 235563 "https://thackston.me/home/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0"
114.119.148.70 - - [04/Jul/2021:02:02:51 +0000] "GET /robots.txt HTTP/1.1" 301 585 "-" "Mozilla/5.0 (compatible;PetalBot;+https://webmaster.petalsearch.com/site/petalbot)"
114.119.148.70 - - [04/Jul/2021:02:02:52 +0000] "GET /robots.txt HTTP/1.1" 404 5003 "-" "Mozilla/5.0 (compatible;PetalBot;+https://webmaster.petalsearch.com/site/petalbot)"
23.228.109.147 - - [04/Jul/2021:02:07:07 +0000] "GET /wp-content/plugins/Tevolution/tmplconnector/monetize/templatic-custom_fields/css/jquery.lightbox-0.5.css HTTP/1.1" 301 736 "http://www.google.com/" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2896.3 Safari/537.36"
23.228.109.147 - - [04/Jul/2021:02:07:07 +0000] "GET /wp-content/plugins/Tevolution/tmplconnector/monetize/templatic-custom_fields/css/jquery.lightbox-0.5.css HTTP/1.1" 404 4970 "http://www.google.com/" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2896.3 Safari/537.36"
116.75.229.201 - - [04/Jul/2021:02:12:15 +0000] "GET /boaform/admin/formLogin?username=ec8&psd=ec8 HTTP/1.0" 404 498 "-" "-"
128.14.134.170 - - [04/Jul/2021:02:13:51 +0000] "GET /Telerik.Web.UI.WebResource.axd?type=rau HTTP/1.1" 404 437 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36"
66.249.88.26 - - [04/Jul/2021:02:44:23 +0000] "GET / HTTP/1.1" 301 565 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.75 Safari/537.36 Google Favicon"
66.249.88.28 - - [04/Jul/2021:02:44:23 +0000] "GET / HTTP/1.1" 302 4965 "-" "Mozilla/5.0 (X11; Linux x86_64)

