Question
My problem is with my Python code. I need to ask the user for a word and then find the news titles that contain that word. But when I try to get the titles that contain user_in, it doesn't seem to work. Any help would be greatly appreciated!
#first import modules (re, urllib.request, webbrowser) #webbrowser is needed for bonus
#open the webpage that is given in the assignment sheet
#read the contents of the webpage and store them in a variable to use later
#close the webpage
#print the beginning phrase, searching ....
#use re.findall to find every news title in the contents variable (that was set previously)
#use a for loop to filter out irrelevant (not needed) content
#print the outcome of the for loop
#ask for user input to search for that word in the titles
#find the titles that contain the word from the user
#print the outcome
#for the bonus, use re.findall to find the URLs of the titles found in the previous step
#store those URLs in a variable to use later
#use webbrowser.open_new_tab to open those URLs in the browser
import re, urllib.request, webbrowser
web_page = urllib.request.urlopen('http://cgi.soic.indiana.edu/~dpierz/news.html')
contents = web_page.read().decode(errors="replace")
web_page.close()
print("Searching: http://cgi.soic.indiana.edu/~dpierz/news.html ")
title = re.findall('(?<=).+?(?=)',contents,re.DOTALL)
for i in title:
    if ".edu" not in title:
        print("\t",i," ")
web_page = urllib.request.urlopen('http://cgi.soic.indiana.edu/~dpierz/news.html')
contents = web_page.read().decode(errors="replace")
web_page.close()
user_in = input("Please enter a word to search for: ")
title = re.findall('(?<=).+?(?=)',contents,re.DOTALL)
user_title = re.findall('(?<=)user_in.+?(?=")',title,re.DOTALL)
url = re.findall('(?<=)',contents,re.DOTALL)
#find url of those webpages that has user_in in title. It has to be brought from original contents, because it is out of range of title.
for word in user_in:
    if word == user_title:
        print(word)
#I tried to get those News titles that contains the user_in, but it is not running properly.
webbrowser.open_new_tab(url)
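A few things in the code above are likely to be what stops it from working. re.findall expects a string as its second argument, so passing the list title raises a TypeError. Writing user_in inside the pattern searches for the literal text "user_in", not the value of the variable. The loop "for word in user_in" walks over the characters of the input, so no single character will ever equal a list of titles. The check ".edu" not in title tests the whole list rather than the current item i. Finally, webbrowser.open_new_tab takes one URL string, not a list. Below is a minimal sketch of one way to do the filtering. It assumes each news item is a plain <a href="...">title</a> link, which may not match the real page; SAMPLE_HTML, extract_links and titles_containing are hypothetical names made up for this illustration.

```python
import re

# Hypothetical sample markup standing in for the downloaded page contents.
# The real page's structure is not shown in the question, so this is an assumption.
SAMPLE_HTML = '''
<a href="http://example.com/a">Python tops language survey</a>
<a href="http://example.com/b">Local team wins final</a>
<a href="http://example.com/c">New python species found</a>
'''


def extract_links(html):
    """Return a list of (url, title) pairs, one per <a href="...">title</a> link."""
    pattern = r'<a\s+href="([^"]+)"[^>]*>(.+?)</a>'
    return re.findall(pattern, html, re.DOTALL | re.IGNORECASE)


def titles_containing(links, word):
    """Keep only the (url, title) pairs whose title contains word, ignoring case."""
    needle = word.lower()
    return [(url, title) for url, title in links if needle in title.lower()]


links = extract_links(SAMPLE_HTML)

# In the assignment this word would come from input(...); it is fixed here
# so the sketch runs without waiting for the user.
matches = titles_containing(links, "python")

for url, title in matches:
    print("\t", title, "->", url)
```

To adapt this, pass contents instead of SAMPLE_HTML, replace "python" with user_in, and adjust the pattern to the tags the page really uses. For the bonus, calling webbrowser.open_new_tab(url) once per matching URL inside the final loop should open each result in its own tab.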