Question
USE PYTHON and can ONLY IMPORT PANDAS
Hi, I am having trouble cleaning a specific column in my dataframe, and it is keeping me from finishing my project.
I have a column that lists the hours of different schools. The data isn't clean, so the hours are written in several different formats. Below is a sample of the data in this column.
School_Hours |
08:00 AM-03:00 PM |
08:15 AM-03:15 PM |
08:30 AM-03:30 PM |
7:45 AM - 2:45 PM |
7:30 AM-2:30 PM |
7:45 AM - 2:45 PM |
8:30 AM - 2:55 PM |
08:30 AM-03:30 PM |
8:30 AM-3:30 PM |
08:00 AM-03:00 PM |
8:00 am-3:30 pm |
9:00 AM - 4:15 PM |
7:00 am-3:00 pm |
08:30 AM-03:30 PM |
08:00 AM-03:00 PM |
08:45 AM-03:45 PM |
8:00 AM-3:30 PM |
8:00 AM-3:30 PM |
7:45 AM-2:45 PM |
07:45 AM-02:45 PM |
8:00 am-3:30 pm |
9:00 AM - 4:08 PM |
7:50 am-3:30 pm |
8:00 AM - 3:30 PM |
08:30 AM-03:30 PM |
7:15a.m.-2:45p.m. |
8:00 am-3:30 pm |
9:00 AM - 4:00 Pm |
08:00 AM-03:00 PM |
8:00 AM - 3:13 PM |
08:15 AM-03:15 PM |
M, T, W, Th: 7:45 AM-3:05 PM F: 7:45 AM-2:07 PM |
08:45 AM-03:45 PM |
7:30 AM-3:00 PM |
7:45 AM - 2:45 PM |
08:00 AM-03:00 PM |
7:45 AM - 3:00 PM |
08:45 AM-03:45 PM |
07:45 AM-02:45 PM |
8:00 am-3:00 pm |
:
I want to take this column from another dataframe called df and add it to my dataframe school_df as a new column. The new column should hold each school's start time rounded down to the hour. For example, 8:45 AM would be 8 and 7:30 AM would be 7. All blanks/nulls should be replaced with the mean of the column.
------------------------------------------------------------------------------------------------------------------------------------
This is my current script for the column:
school_df['Starting Hour'] = df['School_Hours'].str.extract("(^\d*)")
school_df['Starting Hour'] = school_df['Starting Hour'].str.replace('0', '')
These are the unique values that I get for the column:
['8', '7', '9', '', nan]
---------------------------------------------------------------------------------------------------------------------------------------------------
I would prefer not to have to replace the 0. A script that grabs the first non-zero digit in the column would be best, but I couldn't get that to work. The expected results should be 8, 7, and 9. The nan and empty-string values should be replaced with the mean of the column. The column should also have an int dtype.
Thanks
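One possible approach, offered as a sketch rather than a definitive answer: the empty string in the results most likely comes from rows such as the "M, T, W, Th: ..." entry, where `^\d*` matches zero digits at the start of the string. Replacing every '0' would also turn a start hour of 10 into 1. The pattern below instead consumes any leading zeros outside the capture group and captures the first run of digits followed by a colon. The function name `starting_hour` and the `sample` series are hypothetical names used for illustration. Rounding the mean to the nearest whole hour is an assumption, since the question asks for an int dtype but does not say how a fractional mean should be handled.

```python
import pandas as pd


def starting_hour(hours: pd.Series) -> pd.Series:
    """Return the first hour found in each entry as an int, filling gaps with the mean."""
    # "0*" swallows leading zeros outside the group, so "08:00" gives "8"
    # while "10:00" still gives "10". Rows with no "digits:" pattern give NaN.
    extracted = hours.str.extract(r"0*(\d+):", expand=False)

    # Convert the captured text to numbers; anything unparseable becomes NaN.
    numeric = pd.to_numeric(extracted, errors="coerce")

    # Assumption: the fill value is the mean rounded to the nearest whole hour.
    # This raises if every row is missing, because the mean is then NaN.
    fill_value = int(round(numeric.mean()))

    return numeric.fillna(fill_value).astype(int)


# Hypothetical sample built from rows in the question, plus a blank and a null.
sample = pd.Series([
    "08:00 AM-03:00 PM",
    "7:45 AM - 2:45 PM",
    "9:00 AM - 4:15 PM",
    "7:15a.m.-2:45p.m.",
    "M, T, W, Th: 7:45 AM-3:05 PM F: 7:45 AM-2:07 PM",
    "",
    None,
])

result = starting_hour(sample)
print(result.tolist())  # [8, 7, 9, 7, 7, 8, 8]
```

With the dataframes from the question, this would be used as `school_df['Starting Hour'] = starting_hour(df['School_Hours'])`, assuming df and school_df share the same index so the rows line up. The sketch ignores AM/PM, which matches the sample data, where every start time is in the morning.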