Question
1. You are given a dataset on the number of products manufactured in two factories, A and B. You are asked to examine the data and decide which factory outperforms the other. The dataset contains the number of products manufactured per day in each factory. Your data cover a year of information.
What approach will you select?
A. You can do time series forecasting to see which factory will perform better in the future.
B. You can look at the average number of sales to see which factory performs better on average.
C. You can build a hypothesis and perform a test to see which factory is statistically better.
D. We need more information.
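Option C refers to a formal hypothesis test. As a minimal sketch of what such a comparison could look like, the following runs a two-sided permutation test on the difference in mean daily output. The daily counts are synthetic, the function name `permutation_test` is hypothetical, and the test treats days as exchangeable, which real production data with trends or seasonality may violate.

```python
import random
from statistics import fmean


def permutation_test(a, b, n_resamples=2000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns (observed difference mean(a) - mean(b), approximate p-value).
    """
    rng = random.Random(seed)
    observed = fmean(a) - fmean(b)
    pooled = list(a) + list(b)
    n_a = len(a)
    extreme = 0
    for _ in range(n_resamples):
        # Reassign days to factories at random and recompute the difference.
        rng.shuffle(pooled)
        diff = fmean(pooled[:n_a]) - fmean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # The +1 terms keep the estimated p-value strictly above zero.
    return observed, (extreme + 1) / (n_resamples + 1)


# Synthetic daily output for one year (assumed values, not from the question).
data_rng = random.Random(42)
factory_a = [data_rng.gauss(500, 30) for _ in range(365)]
factory_b = [data_rng.gauss(510, 30) for _ in range(365)]

observed_diff, p_value = permutation_test(factory_a, factory_b)
print(f"mean(A) - mean(B) = {observed_diff:.2f}, p-value ~ {p_value:.4f}")
```

A small p-value would suggest the difference in means is unlikely to be due to chance alone. Whether a difference in mean daily output is the right definition of "outperforms" is a separate question that the data alone may not settle.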
2. You have a SQL database that contains the names of countries and their populations. You would like to get ONLY the names of the top 3 most populated countries. Which query will do it for you?
A. SELECT name FROM TABLE ORDER BY population LIMIT 3
B. SELECT countries FROM TABLE ORDER BY name LIMIT 3
C. SELECT ONLY TOP 3 name FROM TABLE ORDER BY population
D. SELECT ONLY TOP 3 population FROM TABLE ORDER BY name
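To check a query of this kind, the sketch below runs one against an in-memory SQLite database from Python. The table name `countries`, the column names, and the rows are made up for illustration. Note that `ORDER BY` sorts in ascending order by default, so selecting the most populated rows requires `DESC`. `LIMIT` is the spelling used by SQLite, MySQL, and PostgreSQL, while SQL Server uses `SELECT TOP 3 ...` instead.

```python
import sqlite3

# In-memory database with a hypothetical table and made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE countries (name TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO countries (name, population) VALUES (?, ?)",
    [("Alpha", 50), ("Beta", 900), ("Gamma", 300), ("Delta", 700), ("Epsilon", 10)],
)

# Select only the name column, largest population first, first three rows.
top3 = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM countries ORDER BY population DESC LIMIT 3"
    )
]
print(top3)  # ['Beta', 'Delta', 'Gamma']
conn.close()
```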
3. You are building a linear regression model. While looking into the data, you realize that two of your features are highly correlated. What is the recommended action?
A. Get rid of one of them
B. Use both of them
C. Multiply them and use the resulting feature instead
D. We need more information
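Before deciding what to do with the two features, it helps to quantify how strongly they are correlated. A minimal sketch, assuming synthetic data and a hypothetical cut-off of 0.9 for "highly correlated" (the question does not define one):

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(0)
feature_1 = [rng.uniform(0, 100) for _ in range(200)]
# feature_2 is a scaled copy of feature_1 plus a little noise,
# so the two features are nearly collinear.
feature_2 = [2 * v + rng.gauss(0, 1) for v in feature_1]

r = pearson(feature_1, feature_2)
THRESHOLD = 0.9  # assumed cut-off, not given in the question
print(f"r = {r:.3f}, highly correlated: {abs(r) > THRESHOLD}")
```

In a linear regression, near-collinear features like these tend to make the individual coefficient estimates unstable, which is why the question asks what to do about them.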
4. You are trying to cluster the customers of your online market, more than 1,000,000 in total, into several groups based on their purchase behaviour. How many clusters will you choose?
A. 5
B. 10
C. 20
D. There is no right or wrong answer before looking into the data and learning more about the context
5. You work for Amazon Web Services. Part of your job is to provide HPC facilities to your customers. You notice that some customers use the resources extensively for some "illegal" purposes, and you want to stop them. So you build a model that can detect, based on a user's activity, whether the user is using the resources properly or not. Which is worse for such a model, a false negative (FN), a false positive (FP), or both, considering these facts:
AWS is the number one platform of its kind in the world. It is famous enough that it does not need to acquire new customers. At the same time, it has customers it never wants to lose, as those customers bring millions of dollars to the platform. So AWS really does not want to mistakenly tell a valuable customer that they are doing illegal work.
Treat a "nice" user as "negative" and an "illegal" user as "positive".
A. FP is more important, as we don't want to tell a user they are committing fraud when they are not
B. FN is more important, as we don't want to predict that a user is a nice one when this is not true
C. Both
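With "illegal" as the positive class, a false positive is a nice user flagged as illegal, and a false negative is an illegal user who goes undetected. The sketch below counts both from labels and predictions and weights them by cost. The labels, predictions, and cost values are made up for illustration; in practice the weights would come from the business context described in the question.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, FN, TN where 1 = "illegal" (positive), 0 = "nice" (negative)."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            counts["TP"] += 1
        elif pred == 1 and truth == 0:
            counts["FP"] += 1  # nice user wrongly flagged as illegal
        elif pred == 0 and truth == 1:
            counts["FN"] += 1  # illegal user missed by the model
        else:
            counts["TN"] += 1
    return counts


# Made-up ground truth and model output for ten users.
y_true = [0, 0, 0, 0, 1, 1, 0, 1, 0, 0]
y_pred = [0, 1, 0, 0, 1, 0, 0, 1, 0, 1]

# Hypothetical costs: wrongly accusing a valuable customer is assumed
# to be ten times as costly as missing an illegal user.
COST_FP = 10
COST_FN = 1

counts = confusion_counts(y_true, y_pred)
total_cost = counts["FP"] * COST_FP + counts["FN"] * COST_FN
print(counts, "total cost:", total_cost)
```

Changing the ratio of `COST_FP` to `COST_FN` shows how the preferred decision threshold of the model depends on which error type the business considers more expensive.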