Question

Hi,

I asked the following question:

""

This is a question about data analysis/web scraping in R.

I'm looking at a Wikipedia webpage. The title is "List of French supercentenarians".

I want to load the data into R and delete all columns except for the age column. The first entry is "122 years, 164 days". The problem is that this is text. I want to convert that into 122 + 164/365 years (basically a decimal).

And I want to do this for every row of that column and have it as a dataframe or a vector.

I'm guessing I have to use regular expressions? I'm not sure how.

Please provide code and explain how it works.

""

A nice person provided an answer, but when I ran it, it didn't work. I got a data frame with only 2 rows instead of 100. Please explain what is wrong with it and how to fix it. Also, how do I transform that column into a vector? (I guess it's already a vector.)


Also, there used to be a reply button on Chegg months ago for asking clarifying questions. Has that been done away with?

This code will scrape the webpage using read_html(), extract the table from the page using html_nodes() and html_table(), extract the age column using subsetting, convert the ages to numeric values using str_extract_all() and as.numeric(), and then calculate the decimal value for each age using element-wise division and addition. Finally, it creates a new dataframe with the decimal ages.

Step 3/3

Explanation: Here's a more detailed explanation of the code:

1. First, we start by installing and loading the required packages: rvest for web scraping and stringr for string manipulation.
2. Next, we specify the URL of the Wikipedia page that we want to scrape, and use read_html() to scrape the page and store the results in a variable called page.
3. We use html_nodes() to find the table element in the HTML code of the page, and html_table() to extract the table data and store it in a variable called table.
4. We use subsetting to extract the "Age" column from the table, and store the results in a variable called ages.
5. We use str_extract_all() to extract all the digits from the ages character vector, and as.numeric() to convert the results to numeric values. The extracted digits are stored in a variable called ages_numeric.
6. We use element-wise division and addition to calculate the decimal value for each age, and store the results in a variable called decimal_ages.
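The answer's actual code is not reproduced here, so the cause of the 2-row result cannot be pinned down from this text. A minimal sketch of the conversion step, under stated assumptions, might look like the following. The helper names extract_number() and age_to_decimal() are hypothetical, the function uses base R only, and it assumes each age cell looks like "122 years, 164 days" (possibly with non-breaking spaces or a trailing footnote marker). The commented scraping lines assume the page URL and an "Age" column name, neither of which is confirmed by the text above. One plausible cause of a short result is that html_table() returns a list with one data frame per table, so picking the wrong element yields a small table; checking the row counts first helps rule that out.

```r
# Pull the integer that precedes a unit word ("year" or "day") from each string.
# Returns NA where the unit does not appear.
extract_number <- function(x, unit) {
  # [^0-9A-Za-z]* tolerates ordinary spaces, non-breaking spaces, and commas
  pattern <- paste0("([0-9]+)[^0-9A-Za-z]*", unit)
  hits <- regmatches(x, regexec(pattern, x))
  # Each hit is c(full_match, captured_digits), or character(0) if no match
  vapply(
    hits,
    function(hit) if (length(hit) == 2) as.numeric(hit[2]) else NA_real_,
    numeric(1)
  )
}

# Convert "122 years, 164 days" to 122 + 164/365 (the formula from the question).
age_to_decimal <- function(x) {
  x <- as.character(x)
  years <- extract_number(x, "year")
  days <- extract_number(x, "day")
  days[is.na(days)] <- 0  # treat a missing day count as zero days
  years + days / 365
}

# Works element-wise on a whole column, so the result is already a numeric vector.
sample_ages <- c("122 years, 164 days", "115 years, 42 days[1]", "110 years")
decimal_ages <- age_to_decimal(sample_ages)
print(decimal_ages)

# Hypothetical scraping usage (not run here; needs the rvest package and network access).
# The URL and the column name "Age" are assumptions, and k is a placeholder index.
# library(rvest)
# page   <- read_html("https://en.wikipedia.org/wiki/List_of_French_supercentenarians")
# tables <- html_table(html_nodes(page, "table.wikitable"))
# sapply(tables, nrow)                 # inspect row counts to find the 100-row table
# ages   <- tables[[k]]$Age            # choose k from the row counts above
# decimal_ages <- age_to_decimal(ages)
```

Because age_to_decimal() is vectorised, data.frame(age = decimal_ages) gives the one-column data frame, and decimal_ages itself is the vector the question asks about.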
