Question
The TripAdvisor web page "Day Trips in Yellowstone National Park" lists the top things to do in the park. For your convenience the code to retrieve the contents of the first page of this list is provided below:
import requests
import re
import pandas as pd
from bs4 import BeautifulSoup
import cloudscraper

# cloudscraper builds on requests to get past Cloudflare's anti-bot check
scraper = cloudscraper.create_scraper()
url = 'https://www.tripadvisor.com/Attraction_Products-g60999-t11889-zfg11867-Yellowstone_National_Park_Wyoming.html'
response = scraper.get(url)

# results_page stays None if the request or the parse fails
results_page = None
if response.status_code == 200:
    try:
        results_page = BeautifulSoup(response.content, 'lxml')
    except Exception:
        results_page = None
1.1 Use these contents to scrape the rank, name, rating, number of reviews, and cost for each destination on the first page, and populate a dataframe with the columns Rank, Name, Rating, #Reviews, and Cost.
1.2 If your criterion for a day trip were the lowest-cost destination with at least 50 reviews, how much would you have to spend? (You must apply the appropriate manipulations to the dataframe to arrive at your answer.)
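One possible way to approach the dataframe manipulation in 1.2 is sketched below. The rows are placeholder values invented for illustration, not scraped TripAdvisor data, and the helper name cheapest_with_min_reviews is hypothetical. The sketch assumes #Reviews and Cost have already been converted to numeric columns.

```python
import pandas as pd

# Placeholder rows for illustration only: these are NOT real TripAdvisor values.
df = pd.DataFrame(
    {
        "Rank": [1, 2, 3, 4],
        "Name": ["Tour A", "Tour B", "Tour C", "Tour D"],
        "Rating": [5.0, 4.5, 4.5, 4.0],
        "#Reviews": [120, 35, 64, 210],
        "Cost": [249.0, 89.0, 135.0, 310.0],
    }
)


def cheapest_with_min_reviews(frame, min_reviews=50):
    """Return the lowest-cost row with at least min_reviews reviews, or None."""
    # Keep only destinations that meet the review threshold.
    eligible = frame[frame["#Reviews"] >= min_reviews]
    if eligible.empty:
        return None
    # idxmin gives the index label of the smallest Cost among eligible rows.
    return eligible.loc[eligible["Cost"].idxmin()]


best = cheapest_with_min_reviews(df)
print(best["Name"], best["Cost"])
```

Filtering before taking the minimum matters here: in the placeholder data the cheapest row overall has too few reviews, so it is excluded before the cost comparison.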
Hints:
To find a tag named 'foo' whose 'class' attribute has a value that starts with 'Bar', use: find('foo', {'class': re.compile('^Bar')})
To extract the first number from a string mystring, use: re.findall(r'\d+', mystring)[0] (this raises an IndexError if mystring contains no digits)
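The two hints can be tried out on a small fragment before applying them to the real page. The markup below is hypothetical: its tag and class names are invented and do not reflect TripAdvisor's actual page structure, and first_number is an illustrative helper. Note that a pattern allowing zero digits, such as '\d*', returns an empty string as its first match when the text does not begin with a digit (for example 'from $89.50'), so this sketch searches for one or more digits and also tolerates thousands separators and decimals.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical markup: tag and class names are invented for illustration
# and are NOT TripAdvisor's real ones.
html = """
<div class="listing_card_01">
  <span class="Title-abc123">Sample Tour</span>
  <span class="reviewCount-x9">1,204 reviews</span>
  <span class="Price-77">from $89.50</span>
</div>
"""
# html.parser is used so the sketch needs no extra parser installed.
soup = BeautifulSoup(html, "html.parser")


def first_number(text):
    """Return the first number in text as a float, or None if there is none."""
    # One leading digit, then digits or commas, then an optional decimal part.
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        return None
    return float(match.group().replace(",", ""))


# BeautifulSoup applies the regex with search(), so '^' anchors the match
# to the start of the class value.
title = soup.find("span", {"class": re.compile(r"^Title")}).get_text(strip=True)
reviews = first_number(soup.find("span", {"class": re.compile(r"^reviewCount")}).get_text())
price = first_number(soup.find("span", {"class": re.compile(r"^Price")}).get_text())

print(title, reviews, price)
```

On the real page, the class prefixes would have to be read from the page source, since they are not given in the question.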