
Question

My problem is with my Python code. I need to ask the user for a word, search the news titles for that word, and print the titles that contain it.

But when I try to get the news titles that contain user_in, it doesn't work. Any help would be greatly appreciated!

#first import modules (re, urllib.request, webbrowser)

#webbrowser is needed for bonus

#open the webpage that is given in the assignment sheet
#read contents in the webpage, set a variable to recall it later
#close the webpage
#print the beginning phrase, searching ....
#use re.findall to find every title of the news in the contents variable (that was set previously)
#need to use a for loop to filter out irrelevant (not needed) contents
#print outcome of for loop
#ask for user input to search that input word in titles
#find the titles that have that word from user
#print outcomes of it
#for bonus, I need to use re.findall to find the URLs of the titles that were found in the previous step.
#set a variable for those URLs to be able to use them later
#use the webbrowser.open_new_tab function to open those URLs in the browser
import re, urllib.request, webbrowser
web_page = urllib.request.urlopen('http://cgi.soic.indiana.edu/~dpierz/news.html')
contents = web_page.read().decode(errors="replace")
web_page.close()
print("Searching: http://cgi.soic.indiana.edu/~dpierz/news.html ")
title = re.findall('(?<=).+?(?=)',contents,re.DOTALL)
for i in title:
    if ".edu" not in title:
        print("\t",i," ")
web_page = urllib.request.urlopen('http://cgi.soic.indiana.edu/~dpierz/news.html')
contents = web_page.read().decode(errors="replace")
web_page.close()
user_in = input("Please enter a word to search for: ")
title = re.findall('(?<=).+?(?=)',contents,re.DOTALL)
user_title = re.findall('(?<=)user_in.+?(?=")',title,re.DOTALL)
url = re.findall('(?<=)',contents,re.DOTALL)

#find url of those webpages that has user_in in title. It has to be brought from original contents, because it is out of range of title.
for word in user_in:
    if word == user_title:
        print(word)
#I tried to get those News titles that contains the user_in, but it is not running properly.
webbrowser.open_new_tab(url)
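
A few things in the code above would stop the search from working regardless of the page's markup: `title` is a list, so passing it to `re.findall` raises a `TypeError`; writing `user_in` inside the quoted pattern searches for the literal text "user_in" rather than the word that was typed; `for word in user_in` loops over the characters of the input, not over the titles; and `webbrowser.open_new_tab` expects a single URL string, not a list. Below is a minimal sketch of one way to do the filtering. The markup of news.html is not shown here, so `SAMPLE_HTML`, the anchor-tag pattern, and the helper names `find_links` and `titles_containing` are all assumptions made for illustration, and the pattern would need adjusting to the real page.

```python
import re

# Hypothetical stand-in for the downloaded page. The real structure of
# news.html is not shown in the question, so this markup is an assumption.
SAMPLE_HTML = '''
<a href="http://example.com/a">Rain expected this weekend</a>
<a href="http://example.com/b">Local team wins final</a>
<a href="http://example.edu/c">Weekend market reopens</a>
'''

def find_links(html):
    """Return a list of (url, title) pairs, one per anchor tag in html."""
    # Two capture groups make re.findall return tuples, which keeps each
    # title paired with its URL for the bonus step.
    return re.findall(r'<a\s+href="([^"]+)"[^>]*>(.+?)</a>', html, re.DOTALL)

def titles_containing(links, word):
    """Keep only the pairs whose title contains word, ignoring case."""
    word = word.lower()
    return [(url, title) for url, title in links if word in title.lower()]

links = find_links(SAMPLE_HTML)
# In the real program this word would come from input(...).
matches = titles_containing(links, "weekend")
for url, title in matches:
    print("\t", title, "->", url)
    # webbrowser.open_new_tab(url)  # would open one URL per call
```

Capturing the URL and the title in the same `re.findall` call avoids a second search over `contents` to recover the URLs, and a plain `in` test on each title is enough to check whether it contains the user's word.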
