

Question

1 Approved Answer

Posted on Jul 28, 2024

from math import sqrt
from matplotlib import pyplot as plot
from random import seed
from random import randrange
from csv import reader

# Step 2: Load the csv file
def load_csv(filename, skip=False):
    dataset = list()
    with open(filename, 'r') as file:
        csv_reader = reader(file)
        if skip:
            # Skip the header row
            next(csv_reader, None)
        for row in csv_reader:
            dataset.append(row)
    return dataset

# Step 3: Convert any string column to a float column
def string_column_to_float(dataset, column):
    for row in dataset:
        row[column] = float(row[column].strip())
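
As a quick check of Steps 2 and 3, the sketch below writes a small temporary CSV, loads it with the header skipped, and converts every column to float. The sample values and the temporary file are made up for illustration; they are not the contents of the CSV file named in the question.

```python
import os
import tempfile
from csv import reader

def load_csv(filename, skip=False):
    dataset = list()
    with open(filename, 'r') as file:
        csv_reader = reader(file)
        if skip:
            next(csv_reader, None)  # discard the header row
        for row in csv_reader:
            dataset.append(row)
    return dataset

def string_column_to_float(dataset, column):
    for row in dataset:
        row[column] = float(row[column].strip())

# Hypothetical sample data (assumption, for illustration only)
sample = "fertility_rate,worker_percent\n1.5, 60.0\n2.0,55.5\n"
with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False) as tmp:
    tmp.write(sample)
    path = tmp.name

rows = load_csv(path, skip=True)
for i in range(len(rows[0])):
    string_column_to_float(rows, i)
os.remove(path)  # clean up the temporary file

print(rows)  # [[1.5, 60.0], [2.0, 55.5]]
```

Note that a blank line in the CSV is read as an empty row, so the conversion loop would raise an IndexError on it; filtering out empty rows before converting may be worth considering.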

# Step 4: Calculate the mean value of a list of numbers
def mean(values):
    return sum(values) / float(len(values))

# Step 5: Calculate a regularisation value for the parameter
# Note: this function is not called anywhere else in this listing.
def regularisation(parameter, lambda_value=0.01):
    return lambda_value + parameter

# Sum of squared deviations from the mean (not divided by n)
def variance(values, mean):
    return sum([(x - mean) ** 2 for x in values])

# Sum of products of deviations of x and y (not divided by n)
def covariance(x, x_mean, y, y_mean):
    covar = 0.0
    for i in range(len(x)):
        covar += (x[i] - x_mean) * (y[i] - y_mean)
    return covar

# Step 6: Calculate least squares between x and y
def leastSquares(dataset):
    x = [row[0] for row in dataset]
    y = [row[1] for row in dataset]
    x_mean = mean(x)
    y_mean = mean(y)
    b1 = covariance(x, x_mean, y, y_mean) / variance(x, x_mean)
    b0 = y_mean - b1 * x_mean
    return [b0, b1]
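
To see what Step 6 computes, here is a minimal worked sketch on a toy dataset. The five data points are invented for illustration. Because `variance` and `covariance` are both un-normalised sums, the missing 1/n factors cancel in the ratio that gives the slope `b1`.

```python
def mean(values):
    return sum(values) / float(len(values))

def variance(values, mean_value):
    # Sum of squared deviations (no division by n)
    return sum((x - mean_value) ** 2 for x in values)

def covariance(x, x_mean, y, y_mean):
    # Sum of products of deviations (no division by n)
    return sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))

def leastSquares(dataset):
    x = [row[0] for row in dataset]
    y = [row[1] for row in dataset]
    x_mean, y_mean = mean(x), mean(y)
    b1 = covariance(x, x_mean, y, y_mean) / variance(x, x_mean)  # slope
    b0 = y_mean - b1 * x_mean                                    # intercept
    return [b0, b1]

# Toy data (assumption): x mean is 3, y mean is 2.8
toy = [[1, 1], [2, 3], [4, 3], [3, 2], [5, 5]]
b0, b1 = leastSquares(toy)
print('b0=%.3f, b1=%.3f' % (b0, b1))  # b0=0.400, b1=0.800
```

Here the covariance sum is 8 and the variance sum is 10, so `b1 = 8 / 10 = 0.8` and `b0 = 2.8 - 0.8 * 3 = 0.4`.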

# Step 7: Calculate root mean squared error
def root_mean_square_error(actual, predicted):
    sum_error = 0.0
    for i in range(len(actual)):
        prediction_error = predicted[i] - actual[i]
        sum_error += (prediction_error ** 2)
    mean_error = sum_error / float(len(actual))
    return sqrt(mean_error)
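
A small sketch of the RMSE calculation in Step 7, using hypothetical actual and predicted values chosen only for illustration:

```python
from math import sqrt

def root_mean_square_error(actual, predicted):
    sum_error = 0.0
    for i in range(len(actual)):
        prediction_error = predicted[i] - actual[i]
        sum_error += prediction_error ** 2
    mean_error = sum_error / float(len(actual))
    return sqrt(mean_error)

# Hypothetical values (assumption)
actual = [1, 3, 3, 2, 5]
predicted = [1.2, 2.0, 3.6, 2.8, 4.4]

rmse = root_mean_square_error(actual, predicted)
print('%.3f' % rmse)  # 0.693
```

The squared errors are 0.04, 1.0, 0.36, 0.64 and 0.36, which sum to 2.4; dividing by 5 gives 0.48, and the square root of that is about 0.693.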

# Step 8: Make predictions
def simple_linear_regression(train, test):
    predictions = list()
    b0, b1 = leastSquares(train)
    for row in test:
        yhat = b0 + b1 * row[0]
        predictions.append(yhat)
    return predictions

# Step 9: Split the data into training and test sets
def train_test_split(dataset, split):
    train = list()
    test = list(dataset)
    train_size = split * len(dataset)
    while len(train) < train_size:
        index = randrange(len(test))
        train.append(test.pop(index))
    return train, test
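
The split in Step 9 can be checked on a small synthetic dataset. The data below is made up for illustration. With 10 rows and `split = 0.6`, six rows are moved at random into the training set and four stay in the test set; the original list is left untouched because `list(dataset)` makes a shallow copy.

```python
from random import seed, randrange

def train_test_split(dataset, split):
    train = list()
    test = list(dataset)               # shallow copy; dataset itself is not modified
    train_size = split * len(dataset)  # may be a non-integer float
    while len(train) < train_size:
        index = randrange(len(test))
        train.append(test.pop(index))
    return train, test

seed(1)
data = [[i, 2 * i] for i in range(10)]  # synthetic rows (assumption)
train, test = train_test_split(data, 0.6)
print(len(train), len(test))  # 6 4
```

Since `train_size` is a float, a non-integer product such as `0.6 * 7 = 4.2` leads to five training rows, because the loop keeps going while `len(train)` is below that value.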

# Seed the random value
seed(1)

# Load and prepare data
filename = 'fertility_rate-worker_percent.csv'
dataset = load_csv(filename, skip=True)
for i in range(len(dataset[0])):
    string_column_to_float(dataset, i)

# Evaluate algorithm
# Note: evaluate_simple_linear_regression is not defined anywhere in this
# listing, so this call raises a NameError as written.
split = 0.6
rmse = evaluate_simple_linear_regression(dataset, split)
print('Root Mean Square Error: %.3f' % rmse)
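
The listing calls `evaluate_simple_linear_regression(dataset, split)` but never defines it. One plausible reading, and it is only an assumption, is that the function splits the data, hides the target column of the test rows (the same `row_copy[-1] = None` pattern that appears in `visualise_dataset`), makes predictions, and returns the RMSE. The sketch below follows that reading on synthetic data; the body of `evaluate_simple_linear_regression` and the data are hypothetical, not the original author's code.

```python
from math import sqrt
from random import seed, randrange

def mean(values):
    return sum(values) / float(len(values))

def leastSquares(dataset):
    x = [row[0] for row in dataset]
    y = [row[1] for row in dataset]
    x_mean, y_mean = mean(x), mean(y)
    covar = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    var = sum((xi - x_mean) ** 2 for xi in x)
    b1 = covar / var
    return [y_mean - b1 * x_mean, b1]

def root_mean_square_error(actual, predicted):
    sum_error = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    return sqrt(sum_error / float(len(actual)))

def simple_linear_regression(train, test):
    b0, b1 = leastSquares(train)
    return [b0 + b1 * row[0] for row in test]

def train_test_split(dataset, split):
    train, test = list(), list(dataset)
    train_size = split * len(dataset)
    while len(train) < train_size:
        train.append(test.pop(randrange(len(test))))
    return train, test

# Hypothetical definition (assumption): not part of the original listing
def evaluate_simple_linear_regression(dataset, split):
    train, test = train_test_split(dataset, split)
    test_set = list()
    for row in test:
        row_copy = list(row)
        row_copy[-1] = None  # hide the target value from the model
        test_set.append(row_copy)
    predicted = simple_linear_regression(train, test_set)
    actual = [row[-1] for row in test]
    return root_mean_square_error(actual, predicted)

seed(1)
# Synthetic, perfectly linear data (assumption): y = 2x + 1
dataset = [[float(x), 2.0 * x + 1.0] for x in range(10)]
rmse = evaluate_simple_linear_regression(dataset, 0.6)
print('Root Mean Square Error: %.3f' % rmse)  # Root Mean Square Error: 0.000
```

Because the synthetic data lies exactly on a line, the fitted model reproduces it and the error is zero up to floating-point rounding; real data would give a positive RMSE.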

# Visualise the dataset
def visualise_dataset(dataset):
    test_set = list()
    for row in dataset:
        row_copy = list(row)
        row_copy[-1] = None
        test_set.append(row_copy)
    sizes, prices = [], []
    for i in range(len(dataset)):
        sizes.append(dataset[i][0])
        prices.append(dataset[i][1])
    plot.figure()
    plot.plot(sizes, prices, 'x')
    plot.xlabel('Fertility rate')
    plot.ylabel('Worker percent')
    plot.grid()
    plot.tight_layout()
    plot.show()

visualise_dataset(dataset)
