Question
Background
Data-driven apps are garnering a lot of attention in the modern world. The data I/O component and the data analysis component are two essential parts of a typical data-driven application.
The data I/O component concentrates on the methods for loading the data and producing the output. One solution is to load from and save to the same data persistence module. One of the most straightforward methods for data persistence is file I/O on CSV (comma-separated values) files.
The data analysis component makes an effort to identify trends in the data and produce insightful reports. Data analysis can be performed using statistical functions or more complicated machine learning models. We will focus on statistical functions here.
In this project, we will develop an app that loads a real-world morbidity database from a CSV file and allows the user to query statistical metrics for data analysis purposes.
Project Objective
We'll be using the above statistical functions on a set of input data to identify unusual morbidity data reports grouped by state. The data from the CDC was sourced here and here.
Project Requirements:
Your application must function as described below:
In order to ensure that your statistical calculations are correct, your program must pass all of the given tests in the cpp files in the test directory.
The test suites are:
test-week - This tests the object used to store the data for a given week.
test-stats - This tests the static methods needed to do the necessary calculations.
test-state - This tests the object used to store the data for a given state.
test-morbidity - This tests your file reading functions and is essentially an integration test. Once it is running, you are ready to build the main program (which should at that point be rather trivial).
Must be able to pass all tests with make test-all.
Must be able to compile a main executable using the command make main.
Must be able to run your program using ./main after it is compiled.
Running the main program should prompt a user for the input file name. If the input file is not present, Unable to read input file should be displayed and the user should be prompted again to enter a valid file name.
Once the file has been loaded, your program should offer the ability to query for the following outputs:
All-time average (mean) of a given state.
List of weeks that are statistical outliers of a given state.
List of all states and their respective counts of outlier weeks, that is, weeks whose death count is more than two standard deviations from the mean.
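The outlier rule above can be sketched as a small predicate. This is a minimal illustration rather than the required design: the names isOutlier and countOutliers are hypothetical, and a std::vector of plain counts stands in for the project's WeekData array.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// True when value lies more than two standard deviations from the mean.
// Hypothetical helper; not part of the project's required interface.
bool isOutlier(double value, double mean, double stdev) {
    assert(stdev >= 0.0);
    return std::fabs(value - mean) > 2.0 * stdev;
}

// Counts the entries of deaths that are outliers for the given mean/stdev.
int countOutliers(const std::vector<int> &deaths, double mean, double stdev) {
    int count = 0;
    for (int d : deaths) {
        if (isOutlier(d, mean, stdev)) {
            ++count;
        }
    }
    return count;
}
```

Note the strict comparison: a week exactly two standard deviations away is not counted here. Whether the tests expect strict or inclusive comparison is something to confirm against test-state.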
Makefile:
CXX = g++
CXXFLAGS = -g -std=c++14 -Wall -Werror=return-type \
           -Werror=uninitialized -Wno-sign-compare
TESTS = test-stats test-week test-state test-morbidity
CATCH = test/catch/catch.o
RM = rm -rf

main: main.o morbidity.o state.o stats.o week-data.o
	$(CXX) $(CXXFLAGS) -o $@ $^

%.o: %.cpp
	$(CXX) $(CXXFLAGS) -c -o $@ $<

test-all: $(TESTS)

test-week: week-data.o $(CATCH) test/test-week.o
test-stats: stats.o week-data.o $(CATCH) test/test-stats.o
test-state: state.o week-data.o stats.o $(CATCH) test/test-state.o
test-morbidity: morbidity.o state.o week-data.o stats.o $(CATCH) \
                test/test-morbidity.o

$(TESTS):
	$(CXX) $(CXXFLAGS) -o $@ $^
	./$@ --success

clean:
	$(RM) *.dSYM test/*.dSYM *.o *.gc* $(CATCH) \
	$(TESTS) test/*.o main
Class descriptions and UML Class diagrams
The stats library (not a class!)
Two statistical functions to calculate the mean and standard deviation of deaths out of an array of WeekData objects.
double getMean(WeekData *weeks, int count);
double stDev(WeekData *weeks, int count);
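A minimal sketch of these two functions follows. The WeekData struct here is a hypothetical stand-in (the real class comes from the UML diagram and tests, and its accessor may be named differently), and the sketch assumes the population form of the standard deviation, dividing by count. If the provided tests expect the sample form, the divisor would be count - 1 instead.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the project's WeekData class.
struct WeekData {
    int deaths;                               // assumed field name
    int getDeaths() const { return deaths; }  // assumed accessor name
};

// Arithmetic mean of the death counts in weeks[0..count-1].
double getMean(WeekData *weeks, int count) {
    assert(weeks != nullptr && count > 0);
    double sum = 0.0;
    for (int i = 0; i < count; ++i) {
        sum += weeks[i].getDeaths();
    }
    return sum / count;
}

// Population standard deviation (divides by count, an assumption).
double stDev(WeekData *weeks, int count) {
    double mean = getMean(weeks, count);
    double sumSquares = 0.0;
    for (int i = 0; i < count; ++i) {
        double diff = weeks[i].getDeaths() - mean;
        sumSquares += diff * diff;
    }
    return std::sqrt(sumSquares / count);
}
```

For the counts 2, 4, 4, 4, 5, 5, 7, 9 the mean is 5 and the squared deviations sum to 32, so this version returns a standard deviation of 2.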
The WeekData class
The State class
- You can allocate 500 as the size for the weeks array.
The Morbidity class
- The states instance variable is a one-dimensional array of pointers to the objects rather than a two-dimensional array of the objects! You can assume that there will be at most 65 states and allocate the array using this size.
- The load() method should return false on a file that cannot be opened.
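The two notes above can be illustrated with a rough sketch. Everything here other than the states array, the size 65, and the load() return value is an assumption: MorbiditySketch, findOrAdd, numStates, and the one-field State are hypothetical names chosen for illustration, and the row parsing inside load() is left out.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical minimal State; the project's real State also stores weeks.
struct State {
    std::string name;
};

class MorbiditySketch {
public:
    MorbiditySketch() : numStates(0) {
        for (int i = 0; i < MAX_STATES; ++i) {
            states[i] = nullptr;
        }
    }
    ~MorbiditySketch() {
        for (int i = 0; i < numStates; ++i) {
            delete states[i];
        }
    }
    // The class owns raw pointers, so copying is disabled in this sketch.
    MorbiditySketch(const MorbiditySketch &) = delete;
    MorbiditySketch &operator=(const MorbiditySketch &) = delete;

    // Returns false when the file cannot be opened; row parsing is omitted.
    bool load(const std::string &filename) {
        std::ifstream in(filename);
        return in.is_open();
    }

    // Returns the State with this name, creating it on first use.
    // Returns nullptr when all 65 slots are already taken.
    State *findOrAdd(const std::string &name) {
        assert(!name.empty());
        for (int i = 0; i < numStates; ++i) {
            if (states[i]->name == name) {
                return states[i];
            }
        }
        if (numStates == MAX_STATES) {
            return nullptr;
        }
        states[numStates] = new State{name};
        return states[numStates++];
    }

private:
    static const int MAX_STATES = 65;
    State *states[MAX_STATES];  // one-dimensional array of pointers
    int numStates;
};
```

Storing pointers means each State is allocated only when its name first appears in the file, and the destructor is responsible for releasing them.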
Sample data
The data will be in the following format. For this project, it is safe to assume the data is properly formatted, that is, exactly 3 columns will be present in each row of data.
State,Week Ending Date,All Cause
Florida,2014-01-04,2101
Florida,2014-01-11,3877
Florida,2014-01-18,3800
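One possible way to split a data row in this format is std::getline with a comma delimiter. The Row struct and parseRow function below are hypothetical names for illustration only; the project's own classes and the provided tests define the real interface.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical plain record for one parsed row (not the project's WeekData).
struct Row {
    std::string state;
    std::string date;
    int deaths;
};

// Splits a data row such as "Florida,2014-01-04,2101" on commas.
// Assumes exactly three well-formed columns, as the project statement allows.
// The header row must be skipped by the caller, since its third column
// ("All Cause") is not a number and std::stoi would throw on it.
Row parseRow(const std::string &line) {
    assert(!line.empty());
    std::istringstream in(line);
    Row row;
    std::string count;
    std::getline(in, row.state, ',');
    std::getline(in, row.date, ',');
    std::getline(in, count);
    row.deaths = std::stoi(count);
    return row;
}
```

Because the state name is read up to the first comma, names containing spaces such as North Carolina are kept intact.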
Sample Run
Welcome to the data viewer!
Enter the file name with the morbidity data: foo.bar
Unable to read input file!
Enter the file name with the morbidity data: data.csv
1 - Output the mean for a state
2 - Get a list of outliers for a state
3 - List all states with outlier counts
anything other than 1-3 will end the application.
Please choose an option from the above menu: 1
Enter the name of the state to search: Texas
-------------------------
The mean deaths for Texas is 4000.91
-------------------------
1 - Output the mean for a state
2 - Get a list of outliers for a state
3 - List all states with outlier counts
anything other than 1-3 will end the application.
Please choose an option from the above menu: 2
Enter the name of the state to search: North Carolina
-------------------------
Statistical outliers for North Carolina
2020-12-19 - total deaths: 2588
2020-12-26 - total deaths: 2670
2021-01-02 - total deaths: 2837
... Sample output truncated for brevity
2021-07-24 - total deaths: 495
2021-07-31 - total deaths: 434
2021-08-07 - total deaths: 319
-------------------------
1 - Output the mean for a state
2 - Get a list of outliers for a state
3 - List all states with outlier counts
anything other than 1-3 will end the application.
Please choose an option from the above menu: 3
-------------------------
Alabama: 16 outlying weeks
Alaska: 18 outlying weeks
... Sample output truncated for brevity
Wisconsin: 14 outlying weeks
Wyoming: 18 outlying weeks
-------------------------
1 - Output the mean for a state
2 - Get a list of outliers for a state
3 - List all states with outlier counts
anything other than 1-3 will end the application.
Please choose an option from the above menu: 9999
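The file-name prompt at the start of the sample run can be sketched as a retry loop. This is an illustration under assumptions: promptForFile is a hypothetical helper, and the load parameter is any callable returning true on success, for example a lambda that forwards to the Morbidity object's load() method.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>  // only needed to drive the function from strings in tests
#include <string>

// Prompts on out and reads file names from in until load(name) succeeds.
// Returns the accepted name, or an empty string if the input runs out.
template <typename Loader>
std::string promptForFile(std::istream &in, std::ostream &out, Loader load) {
    std::string name;
    while (true) {
        out << "Enter the file name with the morbidity data: ";
        if (!std::getline(in, name)) {
            return "";
        }
        if (load(name)) {
            return name;
        }
        out << "Unable to read input file!\n";
    }
}
```

In a main program this would typically be called with std::cin and std::cout. Taking the streams as parameters keeps the loop testable without a terminal.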