Question
Normalize the data

There are three different units in these data. Files A 2017 Soybeans Harvest.csv and A 2019 Soybeans Harvest.csv are soybean harvests (usually around 60 bu/acre), A 2018 Corn Harvest.csv and A 2020 Corn Harvest.csv are corn harvests (usually around 180 bu/acre), while A 2018 Corn Seeding.csv and A 2020 Corn Seeding.csv contain seeding rate data. We want to normalize each of these so that they can be compared more fairly. There may also be outliers in these data; normalization may reduce the impact of the outliers on the analysis.

Repeat the process above, but this time, normalize the data by one of the methods below before aggregating and merging the data.

Denote the $i$th Yield observation for Year $j$ as $y_{ij}$. We normalize yield by one of the following methods, in each case holding $j$ constant and iterating over $i$ only within years. If we assume 20 rows and 6 columns, then $y_{ij} = \{y_{1j}, y_{2j}, \ldots, y_{N_i j}\}$ where $N_i = 120$. Similarly, we would denote the successive yield estimates for grid cell $i$ as $y_{ij} = \{y_{i1}, y_{i2}, \ldots, y_{i N_j}\}$ where $N_j = 5$.

Note that we do not have an index for the yield samples within each cell. You may, but are not required to, compare normalization of the grid cell estimates with normalization of the yield sample values.

You may choose a normalization method at your discretion. I've listed some possible normalization formulas below. You are not required to implement all three, but you must use some method to convert yield or seeding rates to a common scale. You may choose to compare the different methods; they have different statistical properties and may lead to different conclusions.
Option 1. Rank

Replace $y_{ij}$ with $r_{ij} = \operatorname{rank}(y_{ij})$.

Determine ranks independently for $j = 1, 2, \ldots, N_j$ for years {2013, 2015, ..., 2018}.

Option 2. Z-score

Calculate

$$\bar{y}_{\cdot j} = \frac{\sum_{i=1}^{N_i} y_{ij}}{N_i} \quad \text{and} \quad s^2_{\cdot j} = \frac{\sum_{i=1}^{N_i} (y_{ij} - \bar{y}_{\cdot j})^2}{N_i - 1},$$

where $N_i$ is the number of Yield values for year $j$. Replace $y_{ij}$ with

$$z_{ij} = \frac{y_{ij} - \bar{y}_{\cdot j}}{s_{\cdot j}}.$$

Calculate $\bar{y}_{\cdot j}$ and $s^2_{\cdot j}$ independently for $j = 1, 2, \ldots, N_j$ for each original data column. Note that this method makes use of the first (mean) and second (variance) moments. It would be best practice to check for skewness or kurtosis of these data.

Option 3. Percent

Replace $y_{ij}$ with

$$100 \, \frac{y_{ij}}{\bar{y}_{\cdot j}}.$$

Calculate $\bar{y}_{\cdot j}$ independently for $j = 1, 2, \ldots, N_j$ for each original data column. Note that this method assumes the arithmetic mean is a reasonable estimate of central tendency. It would be best practice to check for skewness or kurtosis of these data.
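The three options can be sketched in a few lines of code. What follows is a minimal illustration in plain Python, not a reference solution. The function names, the `normalize_by_year` helper, and the toy yield values are all hypothetical and are not taken from the assignment or its CSV files. Ties in the rank method are given their average rank, which is one common convention; the assignment does not specify how ties should be handled.

```python
from statistics import mean, stdev


def rank_normalize(values):
    """Option 1: replace each value with its rank (1 = smallest).

    Tied values share the average of the ranks they span (an assumption;
    the assignment does not say how ties should be handled).
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1  # mean of 1-based ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg_rank
        pos = end + 1
    return ranks


def z_score_normalize(values):
    """Option 2: subtract the mean, divide by the sample standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation, N - 1 denominator
    return [(v - m) / s for v in values]


def percent_normalize(values):
    """Option 3: express each value as a percent of the mean."""
    m = mean(values)
    return [100 * v / m for v in values]


def normalize_by_year(yields_by_year, method):
    """Apply one method separately within each year (j held constant)."""
    return {year: method(vals) for year, vals in yields_by_year.items()}


# Hypothetical toy values, chosen only to echo the ~60 and ~180 bu/acre scales.
toy = {
    2017: [55.0, 60.0, 65.0],     # soybean-like yields
    2018: [170.0, 180.0, 190.0],  # corn-like yields
}

for name, method in [("rank", rank_normalize),
                     ("z-score", z_score_normalize),
                     ("percent", percent_normalize)]:
    print(name, normalize_by_year(toy, method))
```

With these toy values, the z-scores for both years come out as approximately -1, 0, and 1, which shows how crops measured on very different scales end up on a common one. Because each method is applied within a year, the same helper could be reused for seeding rates by passing in seeding values instead of yields.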