

Question


Posted on Aug 05, 2024

The code runs free of errors but always gives NaN correlation

The code runs free of errors but always gives NaN correlation, and I don't understand what the problem is. I have tried every solution I could find, but it keeps producing the same output. I have attached the 4 datasets I have to use to write the code and compute a correlation for the four hypotheses, and I have attached the result as well. My DataFrame is always empty no matter how I edit the code. There are no null values or string values in my datasets either.

Please help me with code that is free of errors and gives the right correlation. Removing rows, merging, everything is done, but it still gives NaN correlation. Please give the correct code; there is no need for an explanation if the code still gives NaN output.

import pandas as pd
import numpy as np
from scipy.stats import pearsonr
import matplotlib.pyplot as plt

# Read data files
house_prices_df = pd.read_excel("MeanHousePricesClean-1.xlsx")
crime_df = pd.read_excel("CrimeClean-1-1.xlsx")
population_df = pd.read_excel("PopulationClean.xlsx")
area_df = pd.read_excel("SuburbAreas-1.xlsx", header=None)

# Step B: Clean and prepare data
def prepare_data(df, columns):
    df = df.dropna(subset=columns)  # Remove rows with missing values in key columns
    return df

# Rename columns for consistency
house_prices_df = house_prices_df.rename(columns={'Year': 'year'})
crime_df = crime_df.rename(columns={'Year': 'year', 'Crime rate per 100,000 population': 'crime_rate',
                                    'Local Government Area': 'local_government_area'})
population_df = population_df.rename(columns={'Year': 'year'})
area_df = area_df.rename(columns={0: 'local_government_area', 1: 'area'})

# Clean the area DataFrame to remove non-relevant rows
area_df = area_df[area_df['local_government_area'] != 'Property']

# Step C: Analysis functions
def analyze_correlation(df, col1, col2):
    df = df.dropna(subset=[col1, col2])
    if len(df) < 2:
        return np.nan
    correlation, _ = pearsonr(df[col1], df[col2])
    return correlation

# Reshape data to long format
house_prices_long = pd.melt(house_prices_df, id_vars=['year'], var_name='local_government_area', value_name='mean_house_price')
population_long = pd.melt(population_df, id_vars=['year'], var_name='local_government_area', value_name='population')

# Merge the datasets on 'year' and 'local_government_area'
merged_df = pd.merge(crime_df, house_prices_long, on=['year', 'local_government_area'], how='inner')
merged_df = pd.merge(merged_df, population_long, on=['year', 'local_government_area'], how='inner')
merged_df = pd.merge(merged_df, area_long, on='local_government_area', how='inner')

# Calculate population density
merged_df['population_density'] = merged_df['population'] / merged_df['area']

# Step D: Prepare the data by cleaning
merged_df = prepare_data(merged_df, ['mean_house_price', 'crime_rate', 'population_density'])

# Step E: Perform correlation analysis
house_price_population_corr = analyze_correlation(merged_df, 'mean_house_price', 'population_density')
crime_house_price_corr = analyze_correlation(merged_df, 'crime_rate', 'mean_house_price')
crime_population_density_corr = analyze_correlation(merged_df, 'crime_rate', 'population_density')

# Step F: Print the results
print(f"Correlation between house prices and population density: {house_price_population_corr}")
print(f"Correlation between crime rate and house prices: {crime_house_price_corr}")
print(f"Correlation between crime rate and population density: {crime_population_density_corr}")

# Plotting for visual analysis
plt.figure(figsize=(10, 6))
plt.scatter(merged_df['population_density'], merged_df['mean_house_price'])
plt.title('House Price vs Population Density')
plt.xlabel('Population Density (people per square km)')
plt.ylabel('Mean House Price')
plt.grid(True)
plt.show()

plt.figure(figsize=(10, 6))
plt.scatter(merged_df['mean_house_price'], merged_df['crime_rate'])
plt.title('Crime Rate vs House Price')
plt.xlabel('Mean House Price')
plt.ylabel('Crime Rate (per 100,000 population)')
plt.grid(True)
plt.show()

plt.figure(figsize=(10, 6))
plt.scatter(merged_df['population_density'], merged_df['crime_rate'])
plt.title('Crime Rate vs Population Density')
plt.xlabel('Population Density (people per square km)')
plt.ylabel('Crime Rate (per 100,000 population)')
plt.grid(True)
plt.show()

Output the above code gives:

Correlation between house prices and population density: nan
Correlation between crime rate and house prices: nan
Correlation between crime rate and population density: nan
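
The NaN values are consistent with the guard inside analyze_correlation: when fewer than two complete rows reach it, the function returns np.nan without computing anything. The sketch below reproduces that behaviour on invented data. It is only an illustration: the frames are hypothetical, and pandas' Series.corr (Pearson by default) is swapped in for scipy's pearsonr to keep the example small.

```python
import numpy as np
import pandas as pd


def analyze_correlation(df, col1, col2):
    # Same guard as in the posted function: fewer than two complete rows -> NaN.
    df = df.dropna(subset=[col1, col2])
    if len(df) < 2:
        return np.nan
    # Stand-in for scipy.stats.pearsonr: Series.corr defaults to Pearson.
    return df[col1].corr(df[col2])


# Hypothetical frames; the numbers are invented for illustration only.
empty = pd.DataFrame({"crime_rate": pd.Series(dtype=float),
                      "mean_house_price": pd.Series(dtype=float)})
filled = pd.DataFrame({"crime_rate": [1.0, 2.0, 3.0],
                       "mean_house_price": [2.0, 4.0, 6.0]})

result_empty = analyze_correlation(empty, "crime_rate", "mean_house_price")
result_filled = analyze_correlation(filled, "crime_rate", "mean_house_price")

print(len(empty), result_empty)    # zero rows, so the guard returns NaN
print(len(filled), result_filled)  # three rows, so a real coefficient is computed
```

So the NaN is probably a symptom rather than the cause: the thing to investigate is why no rows survive up to the correlation step.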

Problem:

Empty DataFrame
Columns: [Incidents recorded, crime_rate, mean_house_price, year, population, local_government_area, area, population_density]
Index: []
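
An empty frame after a chain of inner merges usually means the key values never match exactly between the tables. One plausible cause, which cannot be confirmed without the spreadsheets, is that the area names produced by melt (taken from column headers) differ in case or whitespace from the names in the crime table. The sketch below uses made-up toy frames to show how an outer merge with indicator=True exposes the mismatch and how normalising the key restores the rows. The helper normalise_key and all the values are hypothetical.

```python
import pandas as pd

# Toy stand-ins for the real spreadsheets; every value here is invented.
crime = pd.DataFrame({
    "year": [2019, 2020],
    "local_government_area": ["Ballarat", "Ballarat"],
    "crime_rate": [6100.0, 5900.0],
})
# Wide layout, one column per area; the header has different case and a trailing space.
prices_wide = pd.DataFrame({
    "year": [2019, 2020],
    "BALLARAT ": [410000.0, 455000.0],
})
prices_long = prices_wide.melt(id_vars=["year"],
                               var_name="local_government_area",
                               value_name="mean_house_price")

keys = ["year", "local_government_area"]

# The inner merge silently drops every row because no key pair matches exactly.
naive = crime.merge(prices_long, on=keys, how="inner")

# An outer merge with indicator=True shows which side each unmatched row came from.
check = crime.merge(prices_long, on=keys, how="outer", indicator=True)
n_both = int((check["_merge"] == "both").sum())
n_left_only = int((check["_merge"] == "left_only").sum())
n_right_only = int((check["_merge"] == "right_only").sum())


def normalise_key(series):
    # Hypothetical helper: make names comparable by trimming and lower-casing.
    return series.astype(str).str.strip().str.lower()


crime["local_government_area"] = normalise_key(crime["local_government_area"])
prices_long["local_government_area"] = normalise_key(prices_long["local_government_area"])

fixed = crime.merge(prices_long, on=keys, how="inner")

print(len(naive), n_both, n_left_only, n_right_only, len(fixed))
```

Running the same indicator check after each of the three merges in the posted code should show which merge first loses all rows. Separately, the posted code merges area_long, which is never defined in the snippet; area_df is presumably what was meant, though that is a guess.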
