Comprehensive Guide to Decision Trees, KNN, PCA, and Neural Networks in Machine Learning

Computer Science - Software Engineering

Created by user_jevbwl, 7 months ago

Cards in this deck (100)
What is a decision tree, and how is it used in data classification tasks?
Describe the components of a decision tree, including internal nodes, branches, and leaf nodes.
Explain the steps involved in building a decision tree.
Define entropy and information gain in the context of decision trees and explain how they influence the construction of the tree.
What is the Gini index, and how is it used in building decision trees?
Describe the ID3 algorithm and its role in decision tree construction.
How are attributes selected for splitting at each node in a decision tree?
Discuss the techniques used to prevent overfitting in decision trees.
Explain the difference between pre-pruning and post-pruning in decision tree algorithms.
List and explain the advantages and limitations of using decision trees.
Compare and contrast decision trees with Random Forests.
How do decision trees handle missing values during the training process?
Provide examples of real-world applications where decision trees are effectively used.
How does a decision tree algorithm handle continuous and categorical variables differently?
What methods are used to evaluate the performance of a decision tree?
What is the purpose of evaluation metrics in machine learning?
Define accuracy in the context of binary classification.
Why might accuracy not be a good metric in cases of class imbalance?
Define precision and recall. Why are these metrics important in binary classification tasks?
What is the F1 score and how does it combine precision and recall?
Explain the components of a confusion matrix.
Describe the process and purpose of k-fold cross-validation in model evaluation.
Explain what leave-one-out cross-validation is and in what scenario it might be particularly useful.
Describe how a paired t-test is used to compare the performance of two models.
How do evaluation metrics help in assessing the generalization performance of a machine learning model?
What strategies might be used to handle evaluation in imbalanced datasets?
Discuss the trade-off between precision and recall and how it affects model performance evaluation.
What factors should be considered when using evaluation metrics to compare different machine learning models?
What are overfitting and underfitting in the context of machine learning?
Explain the concept of generalization and its importance in machine learning models.
What is meant by high bias in a machine learning model, and what are its typical consequences?
Describe the symptoms and consequences of high variance in a machine learning model.
Discuss the trade-off between bias and variance in machine learning models.
What indicators would suggest that a model is overfitting?
What indicators would suggest that a model is underfitting?
List and describe three methods that can be used to prevent or reduce overfitting in machine learning models.
Explain the concept of early stopping and how it helps prevent overfitting.
What is the purpose of regularization, and how does it help in managing overfitting?
Compare and contrast Lasso and Ridge regression in the context of regularization.
How does increasing the complexity of a model affect its likelihood of overfitting?
How does cross-validation help prevent overfitting?
What strategies can be employed to address underfitting in a machine learning model?
Provide an example each of overfitting and underfitting in real-world machine learning applications.
What is the K-Nearest Neighbors (KNN) algorithm, and in which scenarios is it commonly used?
Explain why KNN is considered a non-parametric algorithm. How does this characteristic impact its performance?
How does the choice of the parameter K affect the performance of a KNN model? What are the risks of choosing a value of K that is too small or too large?
Name and describe at least three distance metrics used in KNN. In what situations would you use each one?
Why is data scaling important in KNN? Describe the difference between normalization and standardization and their roles in KNN.
What is meant by the term 'lazy learner,' and how does it apply to the KNN algorithm? Discuss its advantages and disadvantages.
List the steps involved in the KNN algorithm for classification. Use an example to illustrate these steps.
Explain the 'curse of dimensionality' and how it affects KNN. What techniques can be used to mitigate this issue?
How do outliers affect the predictions made by KNN? What strategies can be employed to handle outliers in the dataset?
Suppose you are tasked with classifying emails as spam or not spam using KNN. Describe how you would preprocess the data and select an appropriate K value for the task.
How can cross-validation be used to determine the optimal value of K in KNN? Explain with an example.
KNN is often computationally intensive for large datasets. Why is this the case, and what strategies can be used to improve its efficiency?
How does KNN perform regression tasks? Explain the role of averaging in this context and provide an example.
Compare KNN with a decision tree-based classifier. What are the strengths and weaknesses of each algorithm in terms of interpretability and performance?
What is Principal Component Analysis (PCA), and why is it used in machine learning?
How does PCA achieve dimensionality reduction? Why is dimensionality reduction important?
Outline the steps involved in performing PCA on a dataset.
Explain the concepts of variance and covariance. How are they used in the PCA algorithm?
What roles do eigenvalues and eigenvectors play in PCA? How do they determine the principal components?
Describe how principal components are derived. What properties do they have?
How do you decide how many principal components to retain after performing PCA?
What is the curse of dimensionality, and how does PCA address this issue?
Why is it important to standardize data before applying PCA? What happens if you skip this step?
You are given a dataset with 50 features. Describe how you would use PCA to reduce the number of features while retaining 90% of the variance.
Explain how PCA can be used for visualizing high-dimensional datasets in two or three dimensions.
What are the limitations of PCA? In what situations might PCA not perform well?
Given a dataset with missing values, noisy features, and a large number of correlated variables, explain how PCA can be used to preprocess the data for machine learning.
Compare PCA with feature selection techniques. What are the advantages and disadvantages of using PCA over traditional feature selection?
What is clustering, and how is it used in unsupervised learning? Provide examples of its applications.
Explain the primary goals of clustering. What is meant by maximizing intra-cluster similarity and minimizing inter-cluster similarity?
Describe the key differences between hierarchical and partitional clustering. Provide an example of each.
Name and explain three common distance metrics used in clustering algorithms. In what scenarios would each metric be preferred?
What is hierarchical clustering? Differentiate between agglomerative (bottom-up) and divisive (top-down) approaches.
What is a dendrogram, and how is it used in hierarchical clustering? How can you determine the number of clusters from a dendrogram?
Outline the steps of the K-Means clustering algorithm. Provide an example to illustrate the process.
How does the Elbow Method help determine the optimal number of clusters in K-Means clustering? Explain with a diagram.
How do clustering algorithms handle outliers? Discuss the strengths and weaknesses of K-Means and DBSCAN in this context.
Explain the curse of dimensionality and its impact on clustering algorithms. What techniques can be used to address this issue?
You are tasked with grouping customers based on purchasing behavior. How would you apply clustering to solve this problem? Discuss the preprocessing steps and the choice of algorithm.
What are parametric and non-parametric methods in statistical analysis? Highlight the key difference between their assumptions.
Provide two examples each of parametric and non-parametric methods. Explain how their assumptions influence their application.
Compare the advantages and disadvantages of parametric and non-parametric methods. When would you prefer one over the other?
What are the main assumptions underlying parametric methods? How do these assumptions differ from those of non-parametric methods?
You are given a dataset that does not follow a normal distribution and contains many outliers. Which method—parametric or non-parametric—would you choose for analysis? Justify your choice with reasoning.
What are ensemble models in machine learning? Explain how combining multiple weak learners can lead to a strong learner.
Compare and contrast bagging and boosting. Highlight their objectives, how they combine models, and their impact on bias and variance.
What is bootstrap sampling in bagging? Describe the steps involved in training a bagging model and how it reduces variance.
How does gradient boosting work? Explain the iterative process of improving predictions by minimizing residuals.
Describe a scenario where boosting would be preferred over bagging. Include examples of algorithms like AdaBoost or Gradient Boosting and their advantages in that context.
Why are activation functions important in neural networks? Compare ReLU and sigmoid functions, including their advantages and limitations.
Outline the steps of forward propagation in a neural network. Use an example to demonstrate how inputs are transformed through layers.
What is the purpose of bias in a neural network? How does it enhance the learning process?
What is the softmax function, and why is it commonly used in multi-class classification tasks?
Define overfitting in the context of neural networks. What are its symptoms, and why is it undesirable?
Compare L1 and L2 regularization. How do they help prevent overfitting in neural networks?
Explain the dropout technique in neural networks. How does it improve generalization and prevent overfitting?
What is batch normalization, and how does it stabilize training and reduce overfitting in neural networks?
