Questions and Answers of
Principles Algorithms And Systems
2. Is the approach suitable for the type of prediction we want to make and the types of descriptive features we are using?
1. Does a machine learning approach match the requirements of the project?
Which pieces of the network infrastructure were likely to fail in the near future?
What retention offer would a particular customer best respond to?
Which customers were most likely to churn in the near future?
What is the overall lifetime value of a customer?
7. A prediction model is going to be built for in-line quality assurance in a factory that manufactures electronic components for the automotive industry. The system will be integrated into the
6. A marketing company working for a charity has developed two different models that predict the likelihood that donors will respond to a mail-shot asking them to make a special extra donation. The
5. Explain the problem associated with measuring the performance of a predictive model using a single accuracy figure.
4. A retail supermarket chain has built a prediction model that recognizes the household that a customer comes from as being one of single, business, or family. After deployment, the analytics team
3. A credit card issuer has built two different credit scoring models that predict the propensity of customers to default on their loans. The outputs of the first model for a test dataset are shown
2. The table below shows the predictions made for a continuous target feature by two different prediction models for a test dataset. a. Based on these predictions, calculate the evaluation measures
1. The table below shows the predictions made for a categorical target feature by a model for a test dataset. Based on this test set, calculate the evaluation measures listed below. a. A confusion
2. The table below gives details of symptoms that patients presented and whether they were suffering from meningitis. Using this dataset calculate the following probabilities: a. P(VOMITING = true) b.
8. A support vector machine has been built to predict whether a patient is at risk of cardiovascular disease. In the dataset used to train the model there are two target levels—high risk (the
7. The following multinomial logistic regression model predicts the TYPE of a retail customer (single, family, or business) based on the average amount that they spend per visit, SPEND, and the
6. The effects that can occur when different drugs are taken together can be difficult for doctors to predict. Machine learning models can be built to help predict optimal dosages of drugs so as to
5. When building multivariate logistic regression models, it is recommended that all continuous descriptive features be normalized to the range [−1, 1]. The table below shows a data quality report
4. The use of the kernel trick is key in writing efficient implementations of the support vector machine approach to predictive modelling. The kernel trick is based on the fact that the result of a
3. A multivariate logistic regression model has been built to predict the propensity of shoppers to perform a repeat purchase of a free gift that they are given. The descriptive features used by the
2. You have been hired by the European Space Agency to build a model that predicts the amount of oxygen that an astronaut consumes when performing five minutes of intense physical work. The
1. A multivariate linear regression model has been built to predict the heating load in a residential building based on a set of descriptive features describing the characteristics of the building.
6. Imagine that you have been given a dataset of 1,000 documents that have been classified as being about entertainment or education. There are 700 entertainment documents in the dataset and 300
5. The table below lists a dataset containing details of policy holders at an insurance company. The descriptive features included in the table describe each policy holder’s ID, occupation, gender,
4. The following is a description of the causal relationship between storms, the behavior of burglars and cats, and house alarms: Stormy nights are rare. Burglary is also rare, and if it is a stormy
3. Predictive data analytics models are often used as tools for process quality control and fault detection. The task in this question is to create a naive Bayes model to monitor a waste water
1. a. Three people flip a fair coin. What is the probability that exactly two of them will get heads? b. Twenty people flip a fair coin. What is the probability that exactly eight of them will get
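Part (a) of the coin-flip question above is a direct application of the binomial formula $P(k) = \binom{n}{k} p^k (1-p)^{n-k}$. A minimal Python sketch follows; the function name `binomial_pmf` is our own choice for illustration and does not come from the textbook.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float = 0.5) -> float:
    """Probability of exactly k successes in n independent trials,
    each succeeding with probability p (hypothetical helper name)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Part (a): exactly two heads when three people each flip a fair coin.
print(binomial_pmf(2, 3))  # 0.375
```

The same function covers any number of flips by changing `n` and `k`.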
6. You have been asked by a San Francisco property investment company to create a predictive model that will generate house price estimates for properties they are considering purchasing as rental
5. You are working as an assistant biologist to Charles Darwin on the Beagle voyage. You are at the Galápagos Islands, and you have just discovered a new animal that has not yet been classified. Mr.
4. You have been given the job of building a recommender system for a large online shop that has a stock of over 100,000 items. In this domain the behavior of customers is captured in terms of what
3. The predictive task in this question is to predict the level of corruption in a country based on a range of macro-economic and social features. The table below lists some countries described by the
2. Email spam filtering models often use a bag-of-words representation for emails. In a bag-of-words representation, the descriptive features that describe a document (in our case, an email) each
1. The table below lists a dataset that was used to create a nearest neighbour model that predicts whether it will be a good day to go surfing. Assuming that the model uses Euclidean distance to find
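The surfing question above relies on a nearest neighbour lookup under Euclidean distance. The dataset table is not reproduced on this page, so the sketch below uses made-up instances with two numeric features; the function name `nearest_neighbour` and all values are assumptions for illustration only.

```python
from math import dist

def nearest_neighbour(query, training):
    """Return the target of the training instance closest to `query`
    under Euclidean distance (1-nearest-neighbour)."""
    _features, target = min(training, key=lambda row: dist(row[0], query))
    return target

# Hypothetical instances: (descriptive features, target).
# These are NOT the values from the textbook's table.
training = [((6.0, 15.0), "yes"), ((1.0, 6.0), "no"), ((7.0, 10.0), "yes")]
print(nearest_neighbour((2.0, 7.0), training))  # no
```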
9. Calculate the probability of a model ensemble that uses simple majority voting making an incorrect prediction in the following scenarios. (Hint: Understanding how to use the binomial distribution
8. This table lists a dataset of the scores students achieved on an exam described in terms of whether the student studied for the exam (STUDIED) and the energy level of the lecturer when grading the
7. The following table lists a dataset collected in an electronics shop showing details of customers and whether they responded to a special offer to buy a new laptop. This dataset has been used to
6. The following table lists a dataset containing the details of six patients. Each patient is described in terms of three binary descriptive features (OBESE, SMOKER, and DRINKS ALCOHOL) and a target
5. The following table lists a dataset containing the details of five participants in a heart disease study, and a target feature RISK which describes their risk of heart disease. Each patient is
4. The diagram below shows a decision tree for the task of predicting heart disease. The descriptive features in this domain describe whether the patient suffers from chest pain (CHEST PAIN) as
3. The table below lists a sample of data from a census. There are four descriptive features and one target feature in this dataset: AGE, a continuous feature listing the age of the individual
2. A convicted criminal who reoffends after release is known as a recidivist. The table below lists a dataset that describes prisoners released on parole, and whether they reoffended within two years
1. The image below shows a set of eight Scrabble pieces. a. What is the entropy in bits of the letters in this set? b. What would be the reduction in entropy (i.e., the information gain) in bits if we
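The Scrabble question above asks for Shannon entropy, $H = -\sum_i p_i \log_2 p_i$, over the letters in a set. The image is not reproduced on this page, so the sketch below uses a made-up set of eight letters; the function name `entropy_bits` and the letters themselves are assumptions for illustration.

```python
from collections import Counter
from math import log2

def entropy_bits(items) -> float:
    """Shannon entropy, in bits, of the empirical distribution of `items`."""
    counts = Counter(items)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())

# Hypothetical set of eight letters (NOT the set shown in the missing image):
# four distinct letters, each appearing twice.
letters = list("AABBCCDD")
print(entropy_bits(letters))  # 2.0
```

Information gain for a split is then the entropy of the full set minus the size-weighted average entropy of the resulting subsets.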
10. The following data visualizations are based on the tachycardia prediction dataset from Question 9 (after the instances with missing TACHYCARDIA values have been removed and all outliers have been
9. Tachycardia is a condition that causes the heart to beat faster than normal at rest. The occurrence of tachycardia can have serious implications including increased risk of stroke or sudden
8. The table below shows socio-economic data for a selection of countries for the year 2009, using the following features: COUNTRY: The name of the country. LIFEEXPECTANCY: The average life
7. Comment on the distributions of the features shown in each of the following histograms. a. The height of employees in a truck driving company. b. The number of prior criminal convictions held by
6. The following table shows the IQs for a group of people who applied to take part in a television general knowledge quiz. Using this dataset, generate the following binned versions of the IQ
5. The table below shows the scores achieved by a group of students on an exam. Using this data, perform the following tasks on the SCORE feature: a. A range normalization that generates data in the
4. The following data visualizations are based on the channel prediction dataset given in Question 3. Each visualization illustrates the relationship between a descriptive feature and the target
3. An analytics consultant at an insurance company has built an ABT that will be used to train a model to predict the best communications channel to use to contact a potential customer with an offer
2. The table below shows the policy type held by customers at a life assurance company. a. Based on this data calculate the following summary statistics for the POLICY feature: i. Mode and 2nd mode ii.
1. The table below shows the age of each employee at a cardboard box factory. Based on this data calculate the following summary statistics for the AGE feature: a. Minimum, maximum and range b. Mean and
7. Select one of the predictive analytics models that you proposed in your answer to the previous question about the oil exploration company for exploration of the design of its analytics base
6. An oil exploration company is struggling to cope with the number of exploratory sites that they need to drill in order to find locations for viable oil wells. There are many potential sites that
5. Although their sales are reasonable, an online fashion retailer is struggling to generate the volume of sales that they had originally hoped for when launching their site. List a number of ways in
4. Select one of the predictive analytics models that you proposed in your answer to Question 2 about the revenue commission for exploration of the design of its analytics base table (ABT). a. What is
3. The table below shows a sample of a larger dataset containing details of policy holders at an insurance company. The descriptive features included in the table describe each policy holder’s ID,
2. A national revenue commission performs audits on public companies to find and fine tax defaulters. To perform an audit, a tax inspector visits a company and spends a number of days scrutinizing
1. An online movie streaming company has a business problem of growing customer churn—subscription customers canceling their subscriptions to join a competitor. Create a list of ways in which
In what ways could a predictive analytics model help to address the business problem?
How does the business currently work?
What is the business problem? What are the goals that the business wants to achieve?
8. It is often said that 80% of the work done on predictive data analytics projects is done in the Business Understanding, Data Understanding, and Data Preparation phases of CRISP-DM, and just 20% is
7. What can go wrong when an inappropriate inductive bias is used?
6. How do machine learning algorithms deal with the fact that machine learning is an ill-posed problem?
5. What is meant by the term inductive bias?
4. The following table lists a dataset from the credit scoring domain we discussed in the chapter. Underneath the table we list two prediction models that are consistent with this dataset, Model 1
3. Machine learning is often referred to as an ill-posed problem. What does this mean?
2. What is supervised machine learning?
1. What is predictive data analytics?
1.3 How Does Machine Learning Work?
1.2 What Is Machine Learning?
What Is Predictive Data Analytics?
1. State whether True or False for each of the following. Justify your answers. (a) Possibly():-Definitely() (b) Possibly(): Definitely() (c) Possibly(): Definitely(-6) (d) Possibly():-Definitely (0)
6. Prove the following. For the equalities, you need to prove the implication in both directions. For each part, first prove the results using the interleaving model, and then prove the results using
3. If events $e_i$ and $e_j$ respectively occurred at processes $p_i$ and $p_j$ and are assigned vector timestamps $VT_{e_i}$ and $VT_{e_j}$, respectively, then show that $e_i \rightarrow e_j \Leftrightarrow VT_{e_i}[i] < VT_{e_j}[i]$.
2. If events corresponding to vector timestamps $Vt_1, Vt_2, \ldots, Vt_n$ are mutually concurrent, then prove that $(Vt_1[1], Vt_2[2], \ldots, Vt_n[n]) = \max(Vt_1, Vt_2, \ldots, Vt_n)$.
31. Modify the rules of the expansion, contraction, and switch tests in the adaptive dynamic replication algorithm of Section 5.12 to adapt to tree overlays on arbitrary graphs, rather than to tree
30. (Adaptive Data Replication.) In the adaptive data replication scheme (Section 5.12), consider a node that is both an R-neighbour and an R-fringe node. • Can the expansion test and the reduction
29. Examine the impact of both fail-stop process failures and of crash process failures on all the algorithms described in this chapter. Explain your answers in each case.
28. Examine all the algorithms in this chapter, and classify them using the classifications introduced in Section 5.2 (5.2.1-5.2.10).
27. (a) For the tree labeling schemes, show that there is no uniform bound on the dilation, which is defined as the ratio of the length of the tree path to the optimal path, between any pair of
26. (a) For the tree labeling scheme for compact routing, show that a pre-order traversal of the tree generates a numbering that always permits tree-labeled routing. (b) Will post-order traversal
25. For the γ-synchronizer, significant flexibility can be achieved by varying a parameter k that is used to give a bound on Lc (sum of the number of tree edges and clustering edges) and hc (maximum
24. Consider the simple, the α, and the β synchronizers. Identify some algorithms or application areas where you can identify one synchronizer as being more efficient than the others.
23. Identify how the complexity of the synchronous GHS algorithm can be reduced from O((n+|L|)log n) to O((n log n) + |L|). Explain and prove your answer.
22. In the synchronous GHS MST algorithm, prove that when several components join to form a single component, there must exist a cycle of length two in the component graph of MWOE edges.
21. In the synchronous distributed GHS algorithm, it was assumed that all the edge weights were unique. Explain why this assumption was necessary, and give a way to make the weights unique if they
20. In the distributed Floyd-Warshall algorithm of Figure 5.14, (a) show that the parameter pivot is redundant on all the message types when the communication channels are FIFO. (b) show that the
19. In the distributed Floyd-Warshall algorithm of Figure 5.14, consider iteration k at node i and iteration k + 1 at node j. Examine the dependencies in the code of i and j in these two iterations.
18. For the asynchronous Bellman-Ford algorithm of Figure 5.11, (a) if some of the links may have negative weights, what would be the impact on the shortest paths? Explain your answer. (b) if the link
17. For the asynchronous Bellman-Ford algorithm of Figure 5.11, if all links are assumed to have equal weight, the algorithm effectively computes the minimum-hop path. Show that under this
16. For the asynchronous Bellman-Ford algorithm of Figure 5.11 show that it has an exponential ($c^n$) number of messages and exponential ($c^n \cdot d$) time complexity in the worst case, where c is some constant
15. Modify the asynchronous Bellman-Ford algorithm to devise the Distance Vector Routing algorithm outlined in Section ??.
14. In the asynchronous Bellman-Ford algorithm of Figure 5.11 what can be said about the termination conditions when (i) n is not known, and when (ii) n is known? For each of these two cases, modify
13. In the synchronous distributed Bellman-Ford algorithm in Figure 5.10, the termination condition for the algorithm assumed that each process knew the number of nodes in the graph. If this number is
12. Adapt Algorithms 5.10 and 5.17 to design a synchronous algorithm that achieves the following property: “In each round, each node may or may not generate a new update that it wants to distribute
11. Modify the synchronous flooding algorithm of Figure 5.17 so as to reduce the complexity, assuming that all the processes only need to know the highest process identifier among all the processes
10. Formally write the convergecast algorithm of Section 5.5.5 using the style for the other algorithms in this chapter. Modify your algorithm to satisfy the following property. Each node has a
9. (based on [2]) Modify the algorithm derived in Exercise 8 to obtain a depth-first search tree but with time complexity O(n). (Assuming a single initiator for simplicity does not reduce the time