Question
Your team must create a Python class called AIWebCrawler that fulfills the following requirements:
Web Crawling:
Your crawler must visit all pages within the given domain.
The crawler must not navigate to external domains.
Handle different types of web pages and links.
Visiting Strategies:
Implement the visiting strategies: preorder, inorder, and postorder.
The visiting strategy must be specified as a parameter during class instantiation.
Output:
Generate a corpus of text documents containing the content of each visited page.
Ensure the text is free of HTML tags, JavaScript, menu items, and other non-essential elements.
The title of each textual document is the title of the page visited during the crawling phase.
Handling Dynamic Content:
Use a browser automation tool such as Selenium WebDriver with Chrome to crawl and extract content from dynamic pages.
Ensure the crawler can interpret and navigate JavaScript-rendered content.
Integration with AI (ChatGPT or Google Colab):
Utilize AI capabilities in your crawler for tasks such as parsing, text extraction, or decision-making.
Document all the prompts used to generate the web crawler, and keep track of the number of times the generated code did not work and how you solved the issue (additional prompt or manual intervention).
Keep track of this information using the following table:
Step by Step Solution
This solution involves 3 steps.
Step: 1
Step: 2
Step: 3