Question
Hi,
I asked the following question:
""
This is a question about data analysis/web scraping in R.
I'm looking at a Wikipedia webpage. The title is "List of French supercentenarians".
I want to load the data into R and delete all columns except for the age column. The first entry is 122 years, 164 days. The problem is that this is text. I want to convert it into 122+164/365 years (basically a decimal).
And I want to do this for every row of that column and have it as a dataframe or a vector.
I'm guessing I have to use regular expressions? I'm not sure how.
Please provide code and explain how it works.
""
A nice person provided an answer, but it didn't work when I ran it: I got a data frame with only 2 rows instead of 100. Please explain what is wrong with it and how to fix it. How do I transform that column into a vector? (I guess it's already a vector.)
Also, there used to be a reply button on Chegg months ago for asking clarifying questions. Has that been done away with?
This code will scrape the webpage using read_html(), extract the table from the page using html_nodes() and html_table(), extract the age column using subsetting, convert the ages to numeric values using str_extract_all() and as.numeric(), and then calculate the decimal value for each age using element-wise division and addition. Finally, it creates a new dataframe with the decimal ages.

Explanation:

Here's a more detailed explanation of the code:

1. First, we start by installing and loading the required packages: rvest for web scraping and stringr for string manipulation.
2. Next, we specify the URL of the Wikipedia page that we want to scrape, and use read_html() to scrape the page and store the results in a variable called page.
3. We use html_nodes() to find the table element in the HTML code of the page, and html_table() to extract the table data and store it in a variable called table.
4. We use subsetting to extract the "Age" column from the table, and store the results in a variable called ages.
5. We use str_extract_all() to extract all the digits from the ages character vector, and as.numeric() to convert the results to numeric values. The extracted digits are stored in a variable called ages_numeric.
6. We use element-wise division and addition to calculate the decimal value for each age, and store the results in a variable called decimal ages.
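The answer's own code is not shown here, so the following is only a minimal sketch of the steps it describes, not a reconstruction of it. Several things are assumptions: the helper names scrape_age_column and age_to_decimal are hypothetical, the CSS selector "table.wikitable" and the column name "Age" are guesses about the page layout, and the URL in the comment is inferred from the page title. Picking the table with the most rows is one possible way to avoid ending up with a small 2-row table when a page holds several tables, but that may not be what went wrong in the original attempt. The second sample age is made up for illustration.

```r
library(stringr)

# Hypothetical helper: fetch the page and return the age column as a
# character vector. The selector "table.wikitable" and the column name
# "Age" are assumptions about the page, not taken from the original answer.
scrape_age_column <- function(url) {
  page   <- rvest::read_html(url)
  tables <- rvest::html_table(rvest::html_elements(page, "table.wikitable"))
  # html_table() on several nodes returns a list of tables;
  # keep the one with the most rows.
  biggest <- tables[[which.max(vapply(tables, nrow, integer(1)))]]
  biggest[["Age"]]  # a single column extracted with [[ ]] is already a vector
}

# Convert strings such as "122 years, 164 days" to decimal years.
age_to_decimal <- function(age_text) {
  # str_extract_all() returns a list with one character vector per input
  # string, so the pieces have to be picked out element by element.
  digits <- str_extract_all(age_text, "\\d+")
  years  <- as.numeric(vapply(digits, function(d) d[1], character(1)))
  days   <- as.numeric(vapply(digits, function(d) d[2], character(1)))
  years + days / 365  # element-wise; NA where no numbers were found
}

# Offline example. With network access the input could instead come from
# scrape_age_column("https://en.wikipedia.org/wiki/List_of_French_supercentenarians")
# (URL assumed from the page title).
sample_ages  <- c("122 years, 164 days", "110 years, 0 days")
decimal_ages <- age_to_decimal(sample_ages)
age_df       <- data.frame(age = decimal_ages)
```

Here decimal_ages is a plain numeric vector and age_df holds the same values as a one-column data frame. Taking only the first two digit runs per row means a trailing footnote marker such as "[1]" would not affect the result.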