Question
Kindly make two datasets: one that is not tidy and one that is tidy. Comment on each dataset, explaining why it is not tidy or why it is tidy. I have attached pictures.
Text of the attached pictures (page header: 4.1. WHAT IS TIDY DATA? (PREPARING THE DATA FOR GGPLOT)):

[...] code, or date) it will be easy to combine the information later, and there are a lot of benefits of not having to enter the same airport measurement repeatedly in the flights dataset (fewer mistakes, less tedious, easier to make corrections later). We will return to this third characteristic of tidy data in the next chapter as we think about data management. For effective visualization it is enough to know that the dataset satisfies requirements 1 and 2.

Exercise 4.3. Make your own (made-up) dataset that is not tidy.

Exercise 4.4. Adjust the above dataset to make a dataset that is tidy.

You can think about tidy data as "long". In these datasets, there is a row for each observation. And there are (often) multiple measurements of that unit (a person, a day, a race) that are stored as different variables in the columns (like temperature, height, duration). We will follow Hadley Wickham's definition of tidy data here (?):

    A dataset is a collection of values, usually either numbers (if quantitative) or strings (if qualitative). Values are organised in two ways. Every value belongs to a variable and an observation. A variable contains all values that measure the same underlying attribute (like height, temperature, duration) across units. An observation contains all values measured on the same unit (like a person, or a day, or a race) across attributes.

In tidy data:

1. Each variable forms a column.
2. Each observation forms a row.
3. Each type of observational unit forms a table.

Your instinct in making datasets is probably not to make tidy datasets. This is because they work really well when we work with them in a computer but would be a pain to write in, say, a lab notebook. So if you were studying the heights of plants grown under two different types of light and in three types of soil, and you grew one plant in each combination of conditions and measured the height, you probably learned in a science class to organize your data similarly to this:

Table 4.1: Example of a dataset that is not "tidy"

           Soil 1   Soil 2   Soil 3
Light A    14 cm    9 cm     17 cm
Light B    8 cm     6 cm     7 cm

In this experiment, we are measuring individual plants, so each plant would be an observation. To create a tidy version of the same dataset we would restructure it to have each row contain only measurements about a single plant. For each plant, we would then need to record certain attributes/measurements (i.e. the light treatment, the soil type, and the height of the plant). Here is what the dataset would look like as a tidy dataset.

Table 4.2: Example of the above dataset in tidy format.

Plant Number   Light Type   Soil Type   Height
1              A            1           14 cm
2              A            2           9 cm
3              A            3           17 cm
4              B            1           8 cm
5              B            2           6 cm
6              B            3           7 cm

Sometimes you will find that there are multiple ways that you could make a dataset tidy depending on how you define the unit of observation. In these cases, you will need to choose which makes more sense/works better for what you are doing.
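One way to approach Exercises 4.3 and 4.4 is sketched below, reusing the plant heights from Tables 4.1 and 4.2 rather than inventing new numbers. The book works with ggplot in R; plain Python is used here only as an illustration, and the names wide, tidy, to_tidy and height_cm are made up for this sketch. The comments say why the first layout is not tidy and why the second one is.

```python
# NOT tidy (layout of Table 4.1): one row per light type, one column per
# soil type. The soil type is a variable, but it is hidden in the column
# names, and each row holds three plants, so a row is not one observation.
wide = {
    "A": {"Soil 1": 14, "Soil 2": 9, "Soil 3": 17},
    "B": {"Soil 1": 8, "Soil 2": 6, "Soil 3": 7},
}


def to_tidy(wide_table):
    """Reshape the wide table into one record per plant (hypothetical helper)."""
    rows = []
    for light, heights_by_soil in wide_table.items():
        for soil_label, height_cm in heights_by_soil.items():
            rows.append({
                "plant": len(rows) + 1,                # running plant number
                "light": light,                        # variable 1: light type
                "soil": int(soil_label.split()[-1]),   # variable 2: soil type
                "height_cm": height_cm,                # variable 3: height in cm
            })
    return rows


# TIDY (layout of Table 4.2): each variable (light, soil, height) is its
# own column and each row describes exactly one plant.
tidy = to_tidy(wide)
for row in tidy:
    print(row)
```

The unit of observation here is the individual plant, which is why the six height values end up in six separate rows.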