Question
Python
The first task is to read the external data into the Python environment. The external files are named "cleveland" and "va", indicating data collected from two different locations, and both are .csv files with the same variable structure. Since the first row of each .csv file does not provide the variable names, you will have to create the variable names on your own. They are:

age: age of the patient in years
sex: 1 = male, 0 = female
cp: chest pain type: 1 = typical angina, 2 = atypical angina, 3 = non-anginal pain, 4 = asymptomatic
trestbps: resting blood pressure (in mm Hg on admission to the hospital)
chol: serum cholesterol in mg/dl
fbs: fasting blood sugar > 120 mg/dl (1 = true, 0 = false)
restecg: resting electrocardiographic results: 0 = normal, 1 = having ST-T wave abnormality (T wave inversions and/or ST elevation or depression of > 0.05 mV), 2 = showing probable or definite left ventricular hypertrophy by Estes' criteria
thalach: maximum heart rate achieved
exang: exercise-induced angina (1 = yes, 0 = no)
oldpeak: ST depression induced by exercise relative to rest
slope: the slope of the peak exercise ST segment: 1 = upsloping, 2 = flat, 3 = downsloping
ca: number of major vessels (0-3) colored by fluoroscopy
thal: 3 = normal, 6 = fixed defect, 7 = reversible defect
num: diagnosis of heart disease (angiographic disease status): 0 = < 50% diameter narrowing, 1 = > 50% diameter narrowing

Create a DataFrame for each location, read the external data into it, and add a new variable, say "location", to indicate which location the data come from: either 'cleveland' or 'va'.
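A minimal sketch of the first task with pandas might look like the following. The helper name `read_location` and the file paths in the usage comment are assumptions for illustration, as is the `na_values="?"` setting, which only matters if the files mark missing values with a question mark.

```python
import pandas as pd

# Variable names, in the order listed in the question.
COLUMNS = [
    "age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
    "thalach", "exang", "oldpeak", "slope", "ca", "thal", "num",
]


def read_location(path, location):
    """Read one headerless .csv file and tag each row with its location.

    `read_location` is a hypothetical helper, not part of pandas.
    """
    # header=None tells pandas the first row is data, not column names.
    # na_values="?" is an assumption about how missing values are coded.
    df = pd.read_csv(path, header=None, names=COLUMNS, na_values="?")
    df["location"] = location
    return df


# Usage, assuming the files sit in the working directory under these names:
# cleveland = read_location("cleveland.csv", "cleveland")
# va = read_location("va.csv", "va")
```

Passing `names=` together with `header=None` keeps the first data row from being consumed as a header.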
The second task is to get some useful information about each DataFrame, such as its dimensions and data types, and to extract only the variables needed for future use. The variables we are interested in at this moment are: age, sex, cp, trestbps, chol, num, along with the location variable. Create a new DataFrame that combines the two locations and consists of all observations and only these 7 variables, and save the new DataFrame to an external .csv file (let's call it full.csv).
Could you show me the commands to run this?
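For the second task (inspecting, subsetting, combining, and saving), one possible sketch is shown below. It assumes each location has already been read into a DataFrame that carries the named columns plus "location"; the helper names `summarize` and `combine_and_save` are hypothetical.

```python
import pandas as pd

# The 7 variables the question asks to keep.
KEEP = ["age", "sex", "cp", "trestbps", "chol", "num", "location"]


def summarize(df):
    """Print basic information about a DataFrame."""
    print(df.shape)    # (number of rows, number of columns)
    print(df.dtypes)   # data type of each column
    df.info()          # column names, non-null counts, memory usage
    print(df.head())   # first five rows


def combine_and_save(frames, out_path="full.csv"):
    """Keep only the KEEP columns, stack the frames, and write a .csv file."""
    # ignore_index=True renumbers the rows 0..n-1 in the combined frame.
    full = pd.concat([df[KEEP] for df in frames], ignore_index=True)
    # index=False keeps the row index out of the output file.
    full.to_csv(out_path, index=False)
    return full


# Usage, assuming `cleveland` and `va` are the two per-location DataFrames:
# summarize(cleveland)
# summarize(va)
# full = combine_and_save([cleveland, va], "full.csv")
```

Selecting the columns before concatenating keeps the combined frame to exactly the seven requested variables, and `pd.concat` preserves every observation from both locations.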