Question
ANALYSIS You are going to analyse a dataset on the past winners of the Nobel Prize. Let's see what patterns you can uncover in the past Nobel Laureates and what we can learn about the Nobel Prize and our world generally. The data are in an Excel file named "nobel_prize_data".
The file includes:
birth_date: date in string format.
motivation: description of what the prize is for.
prize_share: given as a fraction.
laureate_type: individual or organisation.
birth_country: has countries that no longer exist.
birth_country_current: current name of the country where the birth city is located.
ISO: three-letter international country code.
organization_name: research institute where the discovery was made.
organization_city: location of the institution.
Part 2:
1. Construct a horizontal bar chart showing how many prizes were awarded by country (only the top 20 countries are needed). Is it best to use birth_country or birth_country_current? What are some potential problems with using birth_country? Construct a cross-classification table of prizes by category and country. Plot a vertical bar chart of prizes (Y variable) against category (X variable), then group the category bars for the top 6 countries and comment on the relationship between category and country. In which category does Germany have more prizes than the UK? In which categories does France have more prizes than Germany? In which category are Germany and Japan the weakest compared to the United States?
2. Calculate the age of each laureate in the year of the ceremony. What are the names of the youngest and oldest Nobel Laureates? What did they win the prize for? What is the average age of a winner? 75% of laureates are younger than what age when they received the prize? Investigate the prizes by country over time. Use a histogram to present the distribution of laureate age at the time of winning and the number of prizes per age interval.
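The counting and age steps in Part 2 could be approached roughly as follows. This is only a sketch using the Python standard library, and it assumes the sheet has already been loaded into a list of dictionaries. The keys "full_name", "year" and "category" are assumed column names that do not appear in the column list above, the birth_date strings are assumed to be in ISO "YYYY-MM-DD" form, and the rows shown are invented placeholders rather than real laureates.

```python
from collections import Counter
from datetime import date
from statistics import mean, quantiles

# Hypothetical placeholder rows, NOT real laureates. The keys "full_name",
# "year" and "category" are assumed column names.
rows = [
    {"full_name": "Laureate A", "year": 1950, "category": "Physics",
     "birth_date": "1900-05-01", "birth_country_current": "Country X"},
    {"full_name": "Laureate B", "year": 1960, "category": "Chemistry",
     "birth_date": "1935-01-15", "birth_country_current": "Country X"},
    {"full_name": "Laureate C", "year": 1970, "category": "Physics",
     "birth_date": "1890-11-30", "birth_country_current": "Country Y"},
    {"full_name": "Laureate D", "year": 1980, "category": "Physics",
     "birth_date": "1920-07-04", "birth_country_current": "Country X"},
    {"full_name": "Organisation E", "year": 1990, "category": "Peace",
     "birth_date": None, "birth_country_current": None},
]

# Organisations have no birth date, so keep only the rows that have one.
individuals = [r for r in rows if r["birth_date"]]


def winning_age(row):
    """Award year minus birth year (assumes ISO 'YYYY-MM-DD' strings)."""
    return row["year"] - date.fromisoformat(row["birth_date"]).year


ages = [winning_age(r) for r in individuals]
youngest = min(individuals, key=winning_age)
oldest = max(individuals, key=winning_age)
average_age = mean(ages)
# Third quartile: roughly 75% of laureates were younger than this age.
third_quartile = quantiles(ages, n=4)[2]

# Prize counts per current country (top 20), and a country-by-category
# cross-classification stored as counts keyed by (country, category).
top_countries = Counter(
    r["birth_country_current"] for r in individuals
).most_common(20)
crosstab = Counter(
    (r["birth_country_current"], r["category"]) for r in individuals
)
```

The pairs in top_countries can then be passed to a plotting library for the horizontal bar chart, and the ages list to a histogram function. Using award year minus birth year ignores whether the birthday falls before the ceremony date, which may or may not be what the assignment intends.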
Part 3:
Answer the following hypothesis questions:
1. How does the winning age differ across categories? Calculate the mean, median, mode, standard deviation and coefficient of variation of winning age for each category. Compare the figures and explain what conclusions you can draw from these analyses. Draw a box-and-whisker plot for each category and comment on the shape of the graph.
2. Determine whether the average age for Physics is less than the average age for Chemistry. Compare the result with question 1. Does the result confirm your previous findings? (Follow the 5 hypothesis testing steps, use a 0.05 level of significance, and assume "equal variances" of the populations.)
3. To get a more complete picture, we should look at how the age of winners has changed over time. Use a line chart to present the age of the winner (Y variable, vertical axis) against year (X variable, horizontal axis) for each category, and comment on the trend in each category.
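For the descriptive statistics and the Physics versus Chemistry comparison in Part 3, a minimal sketch is given here. It assumes the winning ages have already been grouped by category, and the ages shown are invented placeholders, not values from the dataset. The function names describe and pooled_t are hypothetical helpers introduced for illustration. The test is set up as left-tailed, with H0: mean Physics age >= mean Chemistry age and H1: mean Physics age < mean Chemistry age, using the pooled (equal variances) two-sample t statistic.

```python
from math import sqrt
from statistics import mean, median, multimode, stdev

# Hypothetical placeholder ages, NOT real laureate data.
ages_by_category = {
    "Physics": [40, 50, 50, 60],
    "Chemistry": [50, 60, 60, 70],
}


def describe(ages):
    """Mean, median, mode(s), sample SD and coefficient of variation."""
    m = mean(ages)
    s = stdev(ages)  # sample standard deviation (n - 1 denominator)
    return {
        "mean": m,
        "median": median(ages),
        "mode": multimode(ages),  # a list, since several modes can tie
        "sd": s,
        "cv": s / m,  # coefficient of variation as a proportion
    }


def pooled_t(sample_1, sample_2):
    """Two-sample t statistic and degrees of freedom, equal variances assumed."""
    n1, n2 = len(sample_1), len(sample_2)
    var_1, var_2 = stdev(sample_1) ** 2, stdev(sample_2) ** 2
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var_1 + (n2 - 1) * var_2) / df
    t = (mean(sample_1) - mean(sample_2)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df


summary = {cat: describe(ages) for cat, ages in ages_by_category.items()}
t_stat, df = pooled_t(ages_by_category["Physics"], ages_by_category["Chemistry"])
```

H0 would be rejected at the 0.05 level if t_stat falls below the lower-tail critical value of the t distribution with df degrees of freedom, which can be read from a t table.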
The Excel file: https://docs.google.com/spreadsheets/d/1vgLaFIr_ZGk5KuW0Xy-PAPIBRYzltDTX/edit?usp=sharing&ouid=110467223272557658393&rtpof=true&sd=true