
A dataset is a set of data identified with a particular experiment, scenario, or circumstance.

Datasets are typically displayed in tables, in which rows represent individuals and columns represent variables.

EXAMPLE: Medical Records

The following dataset shows medical records for a sample of patients.

(Each column is a VARIABLE; each row is an INDIVIDUAL.)

Respondents   Gender (M/F)   Age   Weight (lbs.)   Height (in.)   Smoking Behavior (Yes=1, No=2)   Educational Attainment
Patient 1     M              49    170             67             1                                College Grad.
Patient 2     F              57    140             69             2                                High School
Patient 3     M              65    175             59             1                                College Grad.
Patient 4     M              43    155             62             1                                Post Grad.
...           ...            ...   ...             ...            ...                              ...
Patient 90    M              59    173             60             2                                College Grad.

In this example,

  • the individuals are patients,
  • and the variables are Gender, Age, Weight, Height, Smoking, and Educational level.
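As a minimal sketch, the table above could be held in Python as a list of rows, with one dictionary per individual and the dictionary keys playing the role of variables. Only the five patients shown in the table are included, and the names `patients` and `variables` are illustrative choices, not part of the course material.

```python
# Each dictionary is one row (an individual); each key is one column (a variable).
# Only the five patients shown in the table are included here.
patients = [
    {"respondent": "Patient 1", "gender": "M", "age": 49, "weight_lbs": 170,
     "height_in": 67, "smoking": 1, "education": "College Grad."},
    {"respondent": "Patient 2", "gender": "F", "age": 57, "weight_lbs": 140,
     "height_in": 69, "smoking": 2, "education": "High School"},
    {"respondent": "Patient 3", "gender": "M", "age": 65, "weight_lbs": 175,
     "height_in": 59, "smoking": 1, "education": "College Grad."},
    {"respondent": "Patient 4", "gender": "M", "age": 43, "weight_lbs": 155,
     "height_in": 62, "smoking": 1, "education": "Post Grad."},
    {"respondent": "Patient 90", "gender": "M", "age": 59, "weight_lbs": 173,
     "height_in": 60, "smoking": 2, "education": "College Grad."},
]

# The variables are the columns, excluding the row label itself.
variables = [name for name in patients[0] if name != "respondent"]

print(len(patients))  # number of individuals in this excerpt
print(variables)      # names of the variables
```

Reading across one dictionary gives everything recorded for one patient; reading the same key across all dictionaries gives every value of one variable.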

Individuals, Observations, or Cases

The rows in a dataset (representing individuals) might also be called observations, cases, or a description that is specific to the individuals and the scenario.

For example, if we were interested in studying flu vaccinations in school children across the Philippines, we could collect data where each observation was a:

  • school child
  • school
  • school district
  • city/municipality

Each of these would result in a different way to investigate questions about flu vaccinations in school children.

Independent Observations

In our course, we will present methods which can be used when the observations being analyzed are independent of each other. If the observations (rows in our dataset) are not independent, a more complex analysis is needed. Clear violations of independent observations occur when

  • we have more than one row for a given individual, such as when we gather the same measurements at many different times for individuals in our study, or
  • individuals are paired or matched in some way.
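The first kind of violation can often be spotted mechanically. The sketch below assumes each row carries an identifier for the individual (the key `patient_id` and the sample rows are hypothetical) and reports identifiers that appear on more than one row. It cannot detect pairing or matching, which depends on how the study was designed.

```python
from collections import Counter

def repeated_individuals(rows, id_key="patient_id"):
    """Return the IDs that appear on more than one row, in sorted order."""
    counts = Counter(row[id_key] for row in rows)
    return sorted(pid for pid, n in counts.items() if n > 1)

# Hypothetical rows: patient "P2" was measured at two visits.
visits = [
    {"patient_id": "P1", "visit": 1},
    {"patient_id": "P2", "visit": 1},
    {"patient_id": "P2", "visit": 2},
    {"patient_id": "P3", "visit": 1},
]

print(repeated_individuals(visits))
```

An empty result only rules out repeated rows; it does not by itself show that the observations are independent.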

As we begin this course, you should start with an awareness of the types of data we will be working with and learn to recognize situations which are more complex than those covered in this course.

Variables

The columns in a dataset (representing variables) are often grouped and labeled by their role in our analysis.

For example, in many studies involving people, we often collect demographic variables such as gender, age, race, ethnicity, socioeconomic status, marital status, and many more.

The role a variable plays in our analysis must also be considered.

In studies where we wish to predict one variable using one or more of the remaining variables, the variable we wish to predict is commonly called the response variable, the outcome variable, or the dependent variable.

Any variable we are using to predict or explain differences in the outcome is commonly called an explanatory variable, an independent variable, a predictor variable, or a covariate.

Note: The word "independent" is used in statistics in numerous ways. Be careful to understand in what way the words "independent" or "independence" (as well as dependent or dependence) are used.

Here we have discussed independent observations (also called cases, individuals, or subjects).

We have also used the term independent variable as another term for our explanatory variables.

And when comparing groups we will define independent samples and dependent samples.

Variables are classified into one of two types:

  • Quantitative
  • Categorical/Qualitative

Quantitative variables are attributes that are numerical in nature.

Categorical variables take category or label values, and place an individual into one of several groups.

Categorical variables are often further classified as either:

Nominal, when there is no natural ordering among the categories.

Common examples would be gender, race, or ethnicity.

Ordinal, when there is a natural order among the categories, such as ranking scales or letter grades.

Ordinal variables are still categorical and do not provide precise measurements.

Differences between categories are not precisely meaningful. For example, if one student receives an A and another a B on a research project, we cannot say precisely how far apart their scores are, only that an A is higher than a B.

Quantitative variables take numerical values, and represent some kind of measurement.

Quantitative variables are classified as:

Discrete, when the variable takes on a countable number of values.

These variables represent some kind of count, such as the number of individuals who test positive for COVID-19.

Continuous, when the variable can take on any value in some range of values.

Our precision in measuring these variables is often limited by our instruments.

Common examples would be height (inches), weight (pounds), or time to recovery (days).

One special variable type occurs when a variable has only two possible values.

Binary or dichotomous variable, when there are only two possible levels.

Such variables can usually be phrased as a "yes/no" question. In the dataset above, Gender (recorded as M/F) is an example of a binary variable.

Why Does the Type of Variable Matter?

The types of variables you are analyzing directly determine which descriptive and inferential statistical methods are available. The variable type is therefore important because:

  • it determines how you will measure the effect of interest, and
  • it guides you in choosing the statistical methods you need.

In this course, we will continually emphasize the types of variables that are appropriate for each method we discuss.

For example:

To compare the number of COVID-19 cases between two rural areas, you could use:

  • Fisher's Exact Test
  • Chi-Square Test

To compare haemoglobin count in a clinical trial evaluating two types of medication, you could use:

  • Two-sample t-Test
  • Wilcoxon Rank-Sum Test
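The pairing of outcome type with candidate methods in these two examples can be written down as a small lookup. This is only a sketch: it covers the comparison of two independent groups, it lists only the four methods named above, and reading COVID-19 case status as a categorical outcome is our interpretation of the first example. The names `CANDIDATE_TESTS` and `candidate_tests` are hypothetical.

```python
# Candidate methods for comparing two independent groups, keyed by the type
# of the outcome variable. The entries are the examples named in the text.
CANDIDATE_TESTS = {
    "categorical": ["Fisher's Exact Test", "Chi-Square Test"],
    "quantitative": ["Two-sample t-Test", "Wilcoxon Rank-Sum Test"],
}

def candidate_tests(outcome_type):
    """Return the example methods for an outcome of the given type."""
    try:
        return CANDIDATE_TESTS[outcome_type]
    except KeyError:
        raise ValueError(f"unknown variable type: {outcome_type!r}") from None

print(candidate_tests("categorical"))   # e.g. case status (yes/no) in two areas
print(candidate_tests("quantitative"))  # e.g. haemoglobin under two medications
```

The lookup narrows the options but does not pick one; choosing between the two methods in each pair depends on further conditions that are outside this section.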

We will discuss this further in the succeeding topic.

Application

EXAMPLE: Medical Records

Let's revisit the dataset showing medical records for a sample of patients.

(Each column is a VARIABLE; each row is an INDIVIDUAL.)

Respondents   Gender (M/F)   Age   Weight (lbs.)   Height (in.)   Smoking Behavior (Yes=1, No=2)   Educational Attainment
Patient 1     M              49    170             67             1                                College Grad.
Patient 2     F              57    140             69             2                                High School
Patient 3     M              65    175             59             1                                College Grad.
Patient 4     M              43    155             62             1                                Post Grad.
...           ...            ...   ...             ...            ...                              ...
Patient 90    M              59    173             60             2                                College Grad.

In our example of medical records, there are several variables of each type:

  • Age, Weight, and Height are quantitative variables.
  • Education, Gender, and Smoking behavior are categorical variables.
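One way to keep this classification next to the data is a small mapping from variable name to type. In the sketch below the main types (quantitative or categorical) follow the list above, while the subtypes are assumptions made for illustration: Age, Weight, and Height are treated as continuous, Gender and Smoking as nominal, and Education as ordinal. The names `VARIABLE_TYPES` and `variables_of` are hypothetical.

```python
# Main types follow the text; the subtypes are assumptions for illustration.
VARIABLE_TYPES = {
    "Gender": ("categorical", "nominal"),
    "Age": ("quantitative", "continuous"),
    "Weight": ("quantitative", "continuous"),
    "Height": ("quantitative", "continuous"),
    "Smoking": ("categorical", "nominal"),
    "Education": ("categorical", "ordinal"),
}

def variables_of(main_type):
    """List the variables whose main type matches, in table order."""
    return [name for name, (kind, _subtype) in VARIABLE_TYPES.items()
            if kind == main_type]

print(variables_of("quantitative"))
print(variables_of("categorical"))
```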

Comments:

  • Notice that the values of the categorical variable Smoking have been coded as the numbers 1 (Yes) and 2 (No).

It is quite common to code the values of a categorical variable as numbers, but you should remember that these are just codes.

They have no arithmetic meaning (i.e., it does not make sense to add, subtract, multiply, divide, or compare the magnitude of such values).

Usually, if such a coding is used, all categorical variables will be coded, and we will tend to use this type of coding for datasets in this course.

  • Sometimes, quantitative variables are divided into groups for analysis. In such a situation, although the original variable was quantitative, the variable analyzed is categorical.
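Both comments can be illustrated with a short sketch that uses the five patients shown in the table. The Smoking codes are translated back to labels and counted rather than averaged, and Age is divided into groups. The age cut points are hypothetical, chosen only to show a quantitative variable turning into a categorical one.

```python
from collections import Counter

# Codes from the table: 1 = Yes, 2 = No. They are labels, not quantities.
SMOKING_LABELS = {1: "Yes", 2: "No"}

# Smoking codes and ages for the five patients shown in the table.
smoking_codes = [1, 2, 1, 1, 2]
ages = [49, 57, 65, 43, 59]

# Summarize a coded categorical variable by counting categories,
# not by averaging the codes.
smoking_counts = Counter(SMOKING_LABELS[code] for code in smoking_codes)

def age_group(age):
    """Assign an age to a group. Hypothetical cut points, for illustration only."""
    if age < 50:
        return "under 50"
    if age < 60:
        return "50-59"
    return "60 and over"

# After grouping, the analyzed variable is categorical.
age_group_counts = Counter(age_group(a) for a in ages)

print(dict(smoking_counts))
print(dict(age_group_counts))
```

Swapping the codes, for example to Yes=2 and No=1, would change a mean of the codes but not the counts, which is one way to see that the codes carry no arithmetic meaning.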

Task

1. Copy 5 research objectives from different research studies, then identify and classify the variables in each.

2. Briefly discuss the stages of the statistical process.

3. Briefly explain the importance of biostatistics.
