Question

Hypothesis:
The mean house prices are higher in more populated areas.
A higher population density (people per square km) results in a higher number of crimes committed.
The crime rate is higher in areas with low mean house prices.
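
Each hypothesis compares two measurable quantities, so one way to probe it is a correlation coefficient. The sketch below is a minimal illustration for the second hypothesis, written in plain Python so the arithmetic is visible. The figures are hypothetical toy values, not taken from the spreadsheets in the question, and the helper name `pearson` is an assumption made for this example. A strong positive coefficient would be consistent with the hypothesis, but it would not by itself show that density causes crime.

```python
from math import sqrt

# Hypothetical toy figures for four areas (NOT from the question's data files).
population = [12000, 45000, 150000, 98000]   # residents
area_sq_km = [300.0, 90.0, 50.0, 70.0]       # land area in square km
incidents = [400, 2100, 9800, 5200]          # recorded crimes


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Population density = people per square km.
density = [p / a for p, a in zip(population, area_sq_km)]

# Correlation between density and number of recorded crimes.
r_density_crime = pearson(density, incidents)

print(density)
print(round(r_density_crime, 3))
```

The same helper could be pointed at house price versus population, or crime rate versus house price, for the other two hypotheses.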
Code:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Read data files
house_prices_df = pd.read_excel("MeanHousePricesClean-1.xlsx")
crime_df = pd.read_excel("CrimeClean-1-1.xlsx")
population_df = pd.read_excel("PopulationClean.xlsx")
area_df = pd.read_excel("SuburbAreas-1.xlsx", header=None)

# Transform area_df to long format
area_df.columns = area_df.iloc[0]  # Set the first row as the header
area_df = area_df[1:]  # Remove the first row from the dataframe
area_df = area_df.set_index('Property').transpose().reset_index()
area_df.columns = ['local_government_area', 'area_sq_km']

# Convert 'area_sq_km' to numeric
area_df['area_sq_km'] = pd.to_numeric(area_df['area_sq_km'], errors='coerce')

# Rename columns in house_prices_df and crime_df to ensure consistent naming
house_prices_df = house_prices_df.rename(columns={'Year': 'year'})
crime_df = crime_df.rename(columns={'Year': 'year', 'Local Government Area': 'local_government_area',
                                    'Incidents recorded': 'incidents_recorded',
                                    'Crime rate per 100,000 population': 'crime_rate'})
population_df = population_df.rename(columns={'Year': 'year'})

# Function to normalize LGA names
def normalize_lga_names(df, lga_column):
    if lga_column in df.columns:
        df[lga_column] = (df[lga_column].astype(str).str.strip()
                          .str.replace('Shire', '').str.replace('City', '').str.strip())
    return df

# Normalize LGA names in all relevant DataFrames
house_prices_df = normalize_lga_names(house_prices_df, 'local_government_area')
crime_df = normalize_lga_names(crime_df, 'local_government_area')
for col in population_df.columns[1:]:
    population_df = normalize_lga_names(population_df, col)
area_df = normalize_lga_names(area_df, 'local_government_area')

# Transform house_prices_df to long format
house_prices_long_df = pd.melt(house_prices_df, id_vars=['year'],
                               var_name='local_government_area', value_name='house_price')

# Normalize 'local_government_area' column in house_prices_long_df
house_prices_long_df = normalize_lga_names(house_prices_long_df, 'local_government_area')

# Transform population_df to long format
population_long_df = pd.melt(population_df, id_vars=['year'],
                             var_name='local_government_area', value_name='population')

# Normalize 'local_government_area' column in population_long_df
population_long_
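
The reshaping and name-normalization steps in the listing can be tried on a small table before running them on the spreadsheets. The sketch below is a hedged illustration only: the area names, prices, and areas are hypothetical toy values, and the frames `wide`, `long_df`, `areas`, and `merged` are names invented for this example. It normalizes the names after melting, when they are values in a single column. In a wide table the area names are column labels, so a value-level cleaner applied column by column would act on the numbers (converting them to strings) rather than on the names.

```python
import pandas as pd

# Hypothetical wide table: one row per year, one column per area (toy values).
wide = pd.DataFrame({
    "year": [2020, 2021],
    "Alpha City": [500000, 520000],
    "Beta Shire": [300000, 310000],
})

# Wide -> long: each (year, area) pair becomes its own row.
long_df = pd.melt(wide, id_vars=["year"],
                  var_name="local_government_area", value_name="house_price")

# Normalize names after melting, when they are values in a single column.
long_df["local_government_area"] = (
    long_df["local_government_area"]
    .astype(str)
    .str.replace("Shire", "", regex=False)
    .str.replace("City", "", regex=False)
    .str.strip()
)

# Hypothetical area table keyed by the same normalized names.
areas = pd.DataFrame({
    "local_government_area": ["Alpha", "Beta"],
    "area_sq_km": [50.0, 300.0],
})

# A left merge keeps every price row; unmatched names would show NaN areas.
merged = long_df.merge(areas, on="local_government_area", how="left")
print(merged)
```

Checking `merged` for missing `area_sq_km` values is a quick way to spot names that still differ between tables after normalization.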
