Questions and Answers of Statistical Techniques in Business
Assuming that \(\boldsymbol{X}_{1}, \ldots, \boldsymbol{X}_{n} \stackrel{\text { iid }}{\sim} f\), show that (2.48) holds and that \(\ell_{n}^{*}=-n \mathbb{E} \ln f(\boldsymbol{X})\).
Suppose that \(\tau=\left\{x_{1}, \ldots, x_{n}\right\}\) are observations of iid continuous and strictly positive random variables, and that there are two possible models for their pdf. The first
Suppose that we have a total of \(m\) possible models with prior probabilities \(g(p), p=\) \(1, \ldots, m\). Show that the posterior probability of model \(g(p \mid \tau)\) can be expressed in terms
Given the data \(\tau=\left\{x_{1}, \ldots, x_{n}\right\}\), suppose that we use the likelihood \((X \mid \boldsymbol{\theta}) \sim \mathscr{N}\left(\mu, \sigma^{2}\right)\) with parameter
Visit the UCI Repository https://archive.ics.uci.edu/. Read the description of the data and download the Mushroom data set agaricus-lepiota.data. Using pandas, read the data into a DataFrame called
Change the type and value of variables in the nutri data set according to Table 1.2 and save the data as a CSV file. The modified data should have eight categorical features, three floats, and two
It frequently happens that a table with data needs to be restructured before the data can be analyzed using standard statistical software. As an example, consider the test scores in Table 1.3 of 5
Create a barplot similar to Figure 1.5, but now plot the corresponding proportions of males and females in each of the three situation categories. That is, the heights of the bars should sum up to
The iris data set, mentioned in Section 1.1, contains various features, including 'Petal.Length' and 'Sepal.Length', of three species of iris: setosa, versicolor, and virginica. (a) Load the data set
Import the data set EuStockMarkets from the same website as the iris data set above. The data set contains the daily closing prices of four European stock indices during the 1990s, for 260 working
Consider the KASANDR data set from the UCI Machine Learning Repository, which can be downloaded from https://archive.ics.uci.edu/ml/machine-learning-databases/00385/de.tar.bz2. This archive file has a
Visualizing data involving more than two features requires careful design, which is often more of an art than a science. (a) Go to Vincent Arel-Bundock's website (URL given in Section 1.1) and read
Prove Eq. (1.16) and discuss the increase in the variance of the sample mean with respect to the independent case when the series has a non-zero first-order autocorrelation coefficient and zero
Using the Taiwan AirBox Data in Example 5.5, compare the clustering results obtained using the ACF and using the coefficients of an AR fitting to the series. Data From Example 5.5: We used the mclust
Again, consider the US monthly macroeconomic data set used in Example 8.4, but use the unemployment rate (UNRATE) as the dependent variable. Apply a DL network to obtain forecasts in the testing
Consider the 99 world financial market indexes. Compute the log returns of the indexes. Obtain a time plot of all series and perform a PCA of the log returns. Summarize the results of PCA, including
Consider the clothing data set of Figure 1.8. Perform PCA on the sales data and summarize the results. Obtain time plots of the first 12 PCs with 6 series on one page. [Figure 1.8; axis label: ln(sales)]
Consider, again, the clothing data set. Obtain the three summary plots of the sample cross-correlations for lags 1 to 21.
Consider the temperature data of Figure 1.1. (a) Obtain the sample mean and sample covariance matrix of the data. (b) Obtain the lag-1 to lag-10 sample CCMs of the data. Figure 1.1: South Amer North
Consider the hourly \(\mathrm{PM}_{2.5}\) measurements at 15 monitoring stations in southern Taiwan; columns 4 to 18 of the file TaiwanPM25.csv. (a) Compute the sample mean and sample
Compute the variance and the ACF (lag-1 to lag-4) of the following ARMA models with \(\operatorname{Var}\left(a_{t}\right)=1\) : (a) \(z_{t}=0.7 z_{t-1}+a_{t}\); (b) \(z_{t}=0.4 a_{t-1}+a_{t}\); (c)
Simulate the three ARIMA models of Exercise 1 with the command arima.sim and compare the theoretical ACFs with the sample ACFs. Data From Exercise 1: Compute the variance and the ACF (lag-1 to lag-4)
Compare the EDQ and the TWQ for probabilities \((0.05,0.5,0.95)\) of the logs of CPI series in file CPIEurope2000-15.csv. Compute these quantiles in levels and first differences. Note the extreme
Compute the ACF of the process \(z_{t}=y_{t}+v_{t}\), where \(y_{t}=0.4 a_{t-1}+a_{t}\) with \(\operatorname{Var}\left(a_{t}\right)=4\) and \(v_{t}\) a white noise process with
Find the roots of the characteristic equation of the following ARMA models: (a) \((1-6 B) z_{t}=a_{t}\); (b) \(\left(1-1.4 B+0.8 B^{2}\right) z_{t}=a_{t}\); (c) \(\left(1-0.6 B+1.2 B^{2}\right)
Compute the periodogram of the first three series identified as seasonal in Exercise 2 (Series 4th, 5th, and 9th) with the transformation \(\nabla \log \left(\mathrm{CPI}_{t}\right)\) and compare the
Write the Kalman filter equations for an \(\operatorname{AR}(1)\) process written in state-space form with \(H_{t}=1, \alpha_{t}=z_{t}\) and \(V_{t}=0, \Omega_{t}=\phi, R_{t}=\sigma_{a}^{2}\). Show
Compare the 1-step and 2-step ahead forecast error variances of the ARMA models of Exercise 1. Data From Exercise 1: Compute the variance and the ACF (lag-1 to lag-4) of the following ARMA models with
Consider the log series of the US monthly exports and imports data in the file m-expimpcnus.csv. (a) Are the two log series unit-root non-stationary? Perform unit-root tests to draw
Consider, again, the US monthly export and import series of Problem 1. (a) Build a bivariate time series model for the two series. Perform model simplification and model checking to justify the
Consider the Taiwan AirBox data in TaiwanAirBox032017.csv. The file contains 514 series with 744 observations. Focus on Series 2, 3, and 4. Build a multivariate time series for the three-dimensional
Consider the November temperatures of Europe, North America, and South America. See the file temperatures.txt. (a) Is there a unit root in the individual time series? Why? (b) Are the three series
Again consider the three temperature series of Europe, North America, and South America. Build a vector ARMA model (AR or MA model is allowed) for the three series. Perform model checking. Obtain
Consider the World Stock Indexes. Apply a hierarchical clustering using the cross linear dependency as the dissimilarity measure. Compare the results obtained with different numbers of lags in the
Apply \(k\)-means and \(k\)-medoids to cluster the World Stock Indexes using the Euclidean distance among the standardized series. Comment on the differences between the clusters found and those
Consider quarterly economic series of the European Union from 2000 to 2019 in the file UMEdata2000_2018.csv. Compare the results of a hierarchical clustering using dissimilarities between univariate
Follow the analysis of Example 5.7 to discriminate between the five files with EEG data. Apply a similar analysis to discriminate between the data in the files EEGsetB.csv, EEGsetC.csv and
Apply SVM to discriminate between the data in the files EEGsetA.csv and EEGsetB.csv, EEGsetC.csv, EEGsetD.csv and EEGsetE.csv in order to classify the results from healthy individuals or seizure
Compute the optimal interpolation for the univariate ARMA process \(\left(1-0.6 B-0.3 B^{2}\right) z_{t}=5+a_{t}\) at time \(h\) as a function of the observations before and after \(t=h\). How
Prove that the optimal interpolation of the vector process \((\boldsymbol{I}-\boldsymbol{\Phi} B) \boldsymbol{z}_{t}=\boldsymbol{a}_{t}\) at time \(t=h\) is given by
Use the package tsoutliers to detect outliers in the 9th and 10th series of the data set TaiwanAirBox032017.csv. Then, use the Lasso approach to detect level shifts and AOs in the same two time
Compare the results of Exercise 3 with those obtained via the program arima.rob. Data From Exercise 3: Use the package tsoutliers to detect outliers in the 9th and 10th series of the data set
With the three series of world temperature (Temperatures.csv), find outliers using the programs tso, arima.rob and outlierLasso.
Simulate, as in Example 2.2, 100 values of the three series that follow an ARMA model. For instance, an AR(2) or ARMA(1,1). Introduce in the three series an outlier of size 3 and compare the results
In a white noise series \(a_{t}\) of variance \(\sigma^{2}\), an outlier of size \(\omega\) is identified by the ratio \(\omega / \sigma\). In a vector white noise \(\boldsymbol{a}_{t}\) of \(k\)
Suppose that \(k\) is large and that the series follows a VAR(1) with a sparse parameter matrix with many coefficients equal to zero and rank \(r
In the DFM \(\boldsymbol{z}_{t}=\boldsymbol{P} \boldsymbol{f}_{t}+\boldsymbol{a}_{t}\) with \(\boldsymbol{P}=\frac{1}{\sqrt{k}} \mathbf{1}\) is a \(k \times 1\) vector, \(\mathbf{1}=(1, \ldots,
Suppose the DFM \(z_{t}=\boldsymbol{P} \boldsymbol{f}_{t}+\boldsymbol{n}_{t}\), where \(\boldsymbol{n}_{t}\) follows a diagonal VAR model. Under what conditions are the factors \(f_{t}\) linear
Consider the GDFM with one factor and two lags, \(\boldsymbol{z}_{t}=\boldsymbol{P}_{0} f_{t}+\boldsymbol{P}_{1} f_{t-1}+\boldsymbol{P}_{2} f_{t-2}+\boldsymbol{n}_{t}\), where the factor follows
Suppose that the GDFM \(z_{t}=\boldsymbol{P}_{0} f_{t}+\boldsymbol{P}_{1} f_{t-1}+\boldsymbol{a}_{t}\), where \(\boldsymbol{a}_{t}\) is white noise, is estimated by the ODPC with one lag, and
Fit a DFM to the EUUS (CPI) price indexes in file CPIEurope2000-15.csv. Apply the transformation \(\nabla \nabla_{12} \log z_{t}\) and the command dfmpc to fit the model. Analyze the properties of the
Fit a DFM to the data in levels of price indexes EUUS (CPI) in file CPIEurope2000-15.csv. Apply the transformation \(\log \left(z_{t}\right)\) and the command dfmpc to fit the model. Analyze the
Use the data in file gdpsimple6c8010.txt of the GDP of six countries to fit a FM. Compare the results with those of Example 3.2 where a VAR model was fitted. Example 3.2: To illustrate the analysis
Modify the commands given in the appendix to check the precision of \(\mathrm{PC}\) in the estimation of the EDFM for different values of the signal-to-noise ratio and see the decrease in the
The dependent variable of interest is the inflation, consumer price index all items, which is CPIAUCSL. The predictors consist of the first 6 lagged values of all 122 variables available. Perform a
Repeat the analysis of Problem 1 but using glmnet with \(\alpha=0.75\). Data From Problem 1: The dependent variable of interest is the inflation, consumer price index all items, which is CPIAUCSL. The
Consider, again, the monthly macroeconomic data set of Problem 1. Apply Group Lasso by letting (a) lagged predictors of CPIAUCSL as group 1, and (b) predictors of other variables with the same lag
Again, consider the monthly macroeconomic data set of Problem 1. Apply boosting to the problem with subcommands n.trees = 10000 and shrinkage = 0.001. Data From Problem 1: The
Consider the hourly \(\mathrm{PM}_{2.5}\) measurements of Station 2 (Column 5) in the data file TaiwanPM25.csv. Obtain the series \(y_{t}\) of the square-root transform of daily maximum
Consider the US monthly macroeconomic data used in Example 8.4. Use the same training and testing subsamples, and apply a DL network below: mod
Apply the random forest to the monthly macroeconomic data set as in Example 8.2, but using the unemployment rate as the dependent variable. You may try different choices of mtry and ntree. Example
Consider the US monthly unemployment rate series from 1948 to 2020 in the file unrate4820.txt. Use lag-1 to lag-24 of the series as predictors to build a tree model for the US unemployment rate.
Consider the IMDB data set. Use the first 4500 observations as the training subsample and the last 1236 observations as the testing sample. Repeat the analyses of Section 8.4.4. Is the LSTM network
Consider the ozone data of the Midwestern United States. Obtain a plot similar to Figure 9.3 for the data of 25 June 1987. [Figure 9.3; labels: Iowa, Missouri, Wisconsin, Michigan (south)]
Consider, again, the ozone data of the Midwestern United States. Obtain the counterpart plot of Exercise 1 for the trend-adjusted data. Data From Exercise 1: Consider the ozone data of the Midwestern United
Perform the analysis of Example 9.4 but on the daily maximum temperature of July 1993. Example 9.4: Consider again the ozone measurements of the Midwestern United States. In this particular instance, we
Perform the universal S-T kriging as in Example 9.2 but using the daily maximum temperature of July 1993. Example 9.2: Here we only employ the data of August 1993, which are available from the
For the data given in Table 6.1, calculate the sample statistics \(\bar{x}, \bar{y}, S_{x x}, S_{y y}\), and \(S_{x y}\). Table 6.1: SAT Mathematics Examination Scores (SATM) and
Researchers have collected data from a random sample of six students on the number of hours spent studying for an exam and the grade received on the exam as given in Table 6.5. Using this data,
Does using social media stress you out? A researcher is interested in whether college students who extensively use social media tend to have higher levels of stress. To test this, a sample of nine
Researchers have collected data from a sample of nine individuals on the number of hours of television watched in a day and the age of the individual. They are interested in estimating if age is
Do towns in Massachusetts with higher elevations tend to get more snowfall? To answer this question, a random sample of five towns in Massachusetts, their average yearly snowfall (in inches), and
An organizational psychologist is interested in determining what factors in the workplace are related to employee satisfaction. Survey data were collected from 20 employees regarding measures of
What a difference a single outlier can make! The data set in Table 7.7 presents a collection of ordered pairs \((x, y)\). a. Using Minitab, draw a scatterplot and run a simple linear regression
Can the use of social media stress you out? A researcher is interested in finding out whether college students who extensively use social media tend to have higher levels of stress. To test this, a
A plant manager wants to estimate how the number of units produced at a plant is affected by the number of employees. The data in Table 7.9 gives a random sample of the number of units produced and
Is there a relationship between buying an expensive car and the level of satisfaction with the car? To answer this question, a consumer researcher randomly sampled 15 individuals, asked them about
The correlation inference that was described in this chapter can only test whether a population coefficient of correlation is significantly different from 0. We may also be interested in testing
If samples are taken from two different populations, we can also test whether the two population coefficients of correlation are significantly different from each other. To do this, we first need to
The data in Table 8.6 give the amount of credit card debt for a sample of 24 individuals along with their age, education level, and yearly salary. a. Write the population linear regression
Do your age and weight affect your blood pressure? To test this, a medical researcher collected measures of systolic blood pressure (in \(\mathrm{mmHg}\), millimeters of mercury), age (in years),
The asking price of a home is influenced by many different factors such as the number of bedrooms, number of bathrooms, square footage, and the lot size. A random sample of 13 recent home sales
What factors influence how much people are willing to spend on a mortgage or rent payment? The data set in Table 8.9 consists of a random sample of 20 residences and it contains measures of the
You may have noticed in the data set in Table 8.9 that many of the variables have missing values. In fact, only eight of the rows have measures on all six predictor variables. Data sets with missing
An island ferry company wants to determine what factors impact the revenue it makes from running its ferry boats to a local island. The passengers are charged to ride the ferry and also for
The data in Figure 9.19 are from running a multiple regression analysis to develop a model that predicts the selling price of homes based on the number of bedrooms, the number of bathrooms, square
A financial advisor wants to determine what factors influence a mutual fund's rate of return. A random sample of 1-year return rates (expressed as a percentage) for 78 mutual funds was collected along
Can you predict final examination scores based on the number of hours studying, major, gender, and current GPA? For a random sample of 17 students, their exam grade (on a scale of \(0-100\)), the
Can your diet make you happier? A researcher wants to know if the food you eat has an effect on your happiness. A random sample of 41 participants was asked to complete a survey on happiness and a
In problem #1, we used the given sample data to estimate whether there is a difference in happiness scores based on the type of diet. Using Fisher's LSD, describe and interpret any and all
A researcher wants to investigate whether the amount of alcohol consumed by first-year college students has an impact on their grades. To study this, the researcher first categorized alcohol
In problem #3, we used the given sample data to estimate whether there is a difference in the first-year GPAs based on the amount of alcohol consumed per week. Using Fisher's LSD, describe and
A researcher wants to know if there is a difference in the average number of hours a person sleeps based on four different types of natural sleep supplements (melatonin, herbal tea, tart cherry
In problem #5, we used the given sample data to estimate whether there is a difference in the mean number of hours slept based on the type of natural supplement used. Using Fisher's LSD, describe
A study was conducted where 36 patients were randomly assigned to use one of the three over-the-counter pain medications and were asked how long the medicine provided pain relief (rounded to the
Is there a difference in the final examination grades for courses that are offered on-ground, online, or as a hybrid class (a hybrid class is half on-ground and half online)? The data provided in
Using the data given in Table 11.3, perform a Wilcoxon signed-rank test manually by completing Table 11.12. Tables 11.3 and 11.12: Amount Spent Per Month (in Dollars) on Car Payments for a Random
The dosages for the supplement melatonin can vary between 1 and 60 mg, where a dosage of 5 mg is a typical starting dosage. A random sample of 15 individuals using
The Wilcoxon signed-rank test can also be used for paired data by testing if the median difference is different from 0. The data presented in Table 11.14 gives the before and after cholesterol
An educational researcher is interested in whether students do better in online classes as compared to on-ground classes. A random sample of 50 students was randomly assigned to participate in either
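As an illustration of the ARMA exercise listed above (variance and lag-1 to lag-4 ACF with \(\operatorname{Var}\left(a_{t}\right)=1\)), the following is a minimal Python sketch for the two models that appear in full in the listing: (a) \(z_{t}=0.7 z_{t-1}+a_{t}\) and (b) \(z_{t}=0.4 a_{t-1}+a_{t}\). It uses the standard closed-form results for a stationary AR(1) and an MA(1). The function names are hypothetical and chosen only for this example, and the plus-sign MA convention follows the way model (b) is written in the exercise.

```python
# Closed-form variance and ACF for AR(1) and MA(1) processes.
# Function names are hypothetical; they are not taken from any textbook or package.

def ar1_variance_and_acf(phi, sigma2=1.0, max_lag=4):
    """AR(1): z_t = phi * z_{t-1} + a_t, stationary only when |phi| < 1."""
    if abs(phi) >= 1:
        raise ValueError("AR(1) is stationary only for |phi| < 1")
    variance = sigma2 / (1.0 - phi ** 2)              # Var(z_t) = sigma^2 / (1 - phi^2)
    acf = [phi ** k for k in range(1, max_lag + 1)]   # rho_k = phi^k
    return variance, acf


def ma1_variance_and_acf(theta, sigma2=1.0, max_lag=4):
    """MA(1) written as z_t = theta * a_{t-1} + a_t (plus-sign convention)."""
    variance = sigma2 * (1.0 + theta ** 2)            # Var(z_t) = sigma^2 * (1 + theta^2)
    rho1 = theta / (1.0 + theta ** 2)                 # only lag 1 is non-zero
    acf = [rho1 if k == 1 else 0.0 for k in range(1, max_lag + 1)]
    return variance, acf


# Model (a): z_t = 0.7 z_{t-1} + a_t
ar_var, ar_acf = ar1_variance_and_acf(0.7)
# Model (b): z_t = 0.4 a_{t-1} + a_t
ma_var, ma_acf = ma1_variance_and_acf(0.4)

print("AR(1) variance:", round(ar_var, 4), "ACF:", [round(r, 4) for r in ar_acf])
print("MA(1) variance:", round(ma_var, 4), "ACF:", [round(r, 4) for r in ma_acf])
```

The sample ACFs from a simulated series (for instance via arima.sim, as the follow-up exercise suggests) can then be compared against these theoretical values.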
Showing 4600 - 4700 of 5757