Machine Learning Fundamentals

A sourced reference on Machine Learning Fundamentals.

What is machine learning?

Machine learning is a subfield of artificial intelligence in which computer systems learn from data to improve performance on tasks without being explicitly programmed. Algorithms identify patterns, make decisions, and refine predictions through experience rather than hard-coded rules. [Source: MIT CSAIL]

Sources
Machine Learning Research — MIT CSAIL
academic · MIT Computer Science and Artificial Intelligence Laboratory · 2024-01-01
The Language of Trustworthy AI: An In-Depth Glossary of Terms — NIST AI RMF
primary · National Institute of Standards and Technology (NIST) · 2023-03-30

What is deep learning and how does it differ from machine learning?

Deep learning is a subset of machine learning that uses artificial neural networks with many layers to learn hierarchical representations of data. While all deep learning is machine learning, standard ML often requires hand-crafted features; deep learning discovers features automatically from raw data. [Source: Stanford HAI]

Sources
Artificial Intelligence Definitions FAQ — Stanford HAI
academic · Stanford Human-Centered Artificial Intelligence Institute · 2023-08-01
The Language of Trustworthy AI: An In-Depth Glossary of Terms — NIST AI RMF
primary · National Institute of Standards and Technology (NIST) · 2023-03-30

What is the difference between supervised and unsupervised learning?

Supervised learning trains models on labeled input-output pairs so predictions can be made on new data, while unsupervised learning finds hidden structure in unlabeled data without predefined targets. A third paradigm, reinforcement learning, learns via reward signals rather than labeled examples. [Source: NIST]

Sources
The Language of Trustworthy AI: An In-Depth Glossary of Terms — NIST AI RMF
primary · National Institute of Standards and Technology (NIST) · 2023-03-30
Machine Learning Research Group — The Alan Turing Institute
academic · The Alan Turing Institute · 2024-01-01

What is an artificial neural network?

An artificial neural network is a computational model loosely inspired by biological brains, composed of interconnected nodes (neurons) organized in layers. Each connection carries a weight adjusted during training; the network learns by propagating errors backward through layers to minimize prediction loss. [Source: IEEE]

Sources
The Language of Trustworthy AI: An In-Depth Glossary of Terms — NIST AI RMF
primary · National Institute of Standards and Technology (NIST) · 2023-03-30

How does backpropagation work in training neural networks?

Backpropagation computes the gradient of a loss function with respect to each weight by applying the chain rule of calculus layer by layer from output to input. An optimizer such as stochastic gradient descent then updates weights in the direction that reduces loss, iterating until convergence. [Source: MIT OpenCourseWare]
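
The chain-rule computation described above can be sketched on a very small network. This is a minimal illustration, assuming a hypothetical two-weight model (one tanh hidden unit feeding one linear output) and a squared-error loss; the function names, starting weights, and learning rate are illustrative choices, not taken from the cited sources.

```python
import math

def forward(w1, w2, x):
    """Tiny network: x -> h = tanh(w1 * x) -> y_hat = w2 * h."""
    h = math.tanh(w1 * x)
    y_hat = w2 * h
    return h, y_hat

def loss(w1, w2, x, y):
    """Squared-error loss 0.5 * (y_hat - y)^2."""
    _, y_hat = forward(w1, w2, x)
    return 0.5 * (y_hat - y) ** 2

def backward(w1, w2, x, y):
    """Apply the chain rule from the output back toward the input."""
    h, y_hat = forward(w1, w2, x)
    dL_dyhat = y_hat - y                 # derivative of the loss w.r.t. the output
    dL_dw2 = dL_dyhat * h                # output-layer weight
    dL_dh = dL_dyhat * w2                # error passed back to the hidden unit
    dL_dw1 = dL_dh * (1.0 - h * h) * x   # tanh'(z) = 1 - tanh(z)^2
    return dL_dw1, dL_dw2

def sgd_step(w1, w2, x, y, lr=0.1):
    """Move each weight a small step against its gradient."""
    g1, g2 = backward(w1, w2, x, y)
    return w1 - lr * g1, w2 - lr * g2

# Illustrative values (assumptions): one training example and two starting weights.
w1, w2 = 0.5, 0.3
x, y = 1.0, 0.8
initial_loss = loss(w1, w2, x, y)
for _ in range(200):
    w1, w2 = sgd_step(w1, w2, x, y)
final_loss = loss(w1, w2, x, y)
```

Comparing `backward` against a finite-difference estimate of the same derivatives is a common sanity check when writing gradients by hand.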

Sources
6.034 Artificial Intelligence Lecture Notes — MIT OpenCourseWare
academic · MIT OpenCourseWare · 2022-01-01
CS231n: Optimization — Gradient Descent — Stanford University
academic · Stanford University · 2023-01-01

What is gradient descent in machine learning?

Gradient descent is an iterative optimization algorithm that minimizes a loss function by repeatedly adjusting model parameters in the direction opposite to the gradient. Variants include batch, stochastic, and mini-batch gradient descent, each trading computation cost for convergence stability during training. [Source: Stanford University]
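
The three variants differ only in how many examples feed each update. The sketch below assumes a hypothetical one-parameter model `y_hat = w * x` with mean squared error and noise-free toy data; the function names, learning rate, and epoch count are illustrative.

```python
import random

def gradient(w, batch):
    """Gradient of mean squared error for the model y_hat = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)

def gradient_descent(data, batch_size, lr=0.05, epochs=200, seed=0):
    """Run gradient descent; batch_size selects the variant."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(epochs):
        shuffled = data[:]
        rng.shuffle(shuffled)
        for start in range(0, len(shuffled), batch_size):
            batch = shuffled[start:start + batch_size]
            w -= lr * gradient(w, batch)   # step opposite to the gradient
    return w

# Toy data generated from y = 2x, so the best weight is 2.0.
data = [(x, 2.0 * x) for x in (0.5, 1.0, 1.5, 2.0)]
w_batch = gradient_descent(data, batch_size=len(data))  # batch: all examples per update
w_sgd = gradient_descent(data, batch_size=1)            # stochastic: one example per update
w_mini = gradient_descent(data, batch_size=2)           # mini-batch: a small subset per update
```

On this noise-free data all three settings approach the same weight; with noisy data the smaller batches would produce noisier but cheaper updates.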

Sources
CS229: Machine Learning Lecture Notes — Stanford University
academic · Stanford University Department of Computer Science · 2022-09-01
6.034 Artificial Intelligence Lecture Notes — MIT OpenCourseWare
academic · MIT OpenCourseWare · 2022-01-01

What is the learning rate and why does it matter?

The learning rate is a hyperparameter that controls how much model weights are adjusted after each gradient update. Too large a value causes training to oscillate or diverge; too small a value slows convergence and can leave training stalled on a plateau or in a poor local minimum. Adaptive optimizers like Adam start from a base learning rate and scale the effective step size per parameter during training. [Source: Stanford University]
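
The effect of the step size is easy to see on a one-dimensional problem. This sketch assumes the toy objective f(w) = w^2, whose gradient is 2w; the three learning rates are illustrative values chosen to show each regime.

```python
def run(lr, steps=50, w=1.0):
    """Minimize f(w) = w^2 with plain gradient descent; return the final |w|."""
    for _ in range(steps):
        w -= lr * 2.0 * w   # gradient of w^2 is 2w
    return abs(w)

too_large = run(lr=1.1)    # each step multiplies w by -1.2, so |w| grows
too_small = run(lr=0.001)  # each step multiplies w by 0.998, so progress is slow
reasonable = run(lr=0.1)   # each step multiplies w by 0.8, so w shrinks quickly
```

After the same 50 steps, the large rate has moved away from the minimum at 0, the small rate has barely moved, and the moderate rate is close to it.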

Sources
CS229: Machine Learning Lecture Notes — Stanford University
academic · Stanford University Department of Computer Science · 2022-09-01

What is overfitting in machine learning and how is it prevented?

Overfitting occurs when a model learns training data noise rather than generalizable patterns, performing well on training examples but poorly on unseen data. Prevention techniques include regularization (L1/L2), dropout, early stopping, cross-validation, and increasing training data volume or diversity. [Source: NIST]

Sources
The Language of Trustworthy AI: An In-Depth Glossary of Terms — NIST AI RMF
primary · National Institute of Standards and Technology (NIST) · 2023-03-30
Machine Learning Research Group — The Alan Turing Institute
academic · The Alan Turing Institute · 2024-01-01

What is regularization in machine learning?

Regularization adds a penalty term to a model's loss function to discourage overly complex solutions and improve generalization. L1 regularization (Lasso) promotes sparsity by driving the weights of weak features to exactly zero; L2 (Ridge) shrinks all weights toward zero without eliminating them. Both reduce overfitting without requiring additional training data. [Source: MIT OpenCourseWare]
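
The difference between the two penalties can be sketched by applying each one to a weight vector on its own, ignoring the data loss. All names and values here are illustrative assumptions. The L1 update uses soft-thresholding (a proximal step), swapped in because a plain gradient step on |w| does not by itself produce exact zeros.

```python
def l1_penalty(weights, lam):
    """L1 (Lasso) penalty: lam * sum of absolute weights."""
    return lam * sum(abs(w) for w in weights)

def l2_penalty(weights, lam):
    """L2 (Ridge) penalty: lam * sum of squared weights."""
    return lam * sum(w * w for w in weights)

def l2_shrink(weights, lam, lr):
    """One gradient step on the L2 penalty alone: every weight scales by the same factor."""
    return [w * (1.0 - 2.0 * lr * lam) for w in weights]

def l1_soft_threshold(weights, lam, lr):
    """One proximal step on the L1 penalty alone: small weights become exactly zero."""
    t = lr * lam
    return [max(abs(w) - t, 0.0) * (1.0 if w > 0 else -1.0) for w in weights]

weights = [2.0, -0.05, 0.5]
after_l2 = l2_shrink(weights, lam=0.5, lr=0.1)          # all weights scaled by 0.9
after_l1 = l1_soft_threshold(weights, lam=1.0, lr=0.1)  # the small middle weight is zeroed
```

In a full training loop these updates would be combined with the gradient of the data loss; they are isolated here only to show the shrinkage behavior.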

Sources
6.034 Artificial Intelligence Lecture Notes — MIT OpenCourseWare
academic · MIT OpenCourseWare · 2022-01-01
CS229: Machine Learning Lecture Notes — Stanford University
academic · Stanford University Department of Computer Science · 2022-09-01

What is the bias-variance tradeoff?

The bias-variance tradeoff describes a fundamental tension: high-bias models underfit by making simplistic assumptions, while high-variance models overfit by being too sensitive to training data. Optimal model complexity balances these two sources of error to minimize total generalization error on unseen examples. [Source: Stanford HAI]
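
For squared-error loss, the tradeoff has a standard decomposition. The notation is an assumption introduced here for illustration: the data follow y = f(x) + ε with zero-mean noise of variance σ², f̂ is the model fitted on a random training set, and expectations are taken over training sets and noise.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

Increasing model complexity typically lowers the first term and raises the second, while the noise term is unaffected by the choice of model.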

Sources
CS229: Machine Learning Lecture Notes — Stanford University
academic · Stanford University Department of Computer Science · 2022-09-01
Machine Learning Research Group — The Alan Turing Institute
academic · The Alan Turing Institute · 2024-01-01

What is cross-validation and when should it be used?

Cross-validation is a model evaluation technique that partitions data into k subsets, trains on k-1 folds, and tests on the remaining fold, rotating until every subset has served as a test set. It provides a reliable generalization estimate when data is limited and prevents evaluation on training data. [Source: NIST]
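
The fold rotation described above can be sketched as an index generator. This is a minimal version with a hypothetical function name; it assumes the samples are already in random order (it does not shuffle) and omits the model training and scoring that would happen inside the loop.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

folds = list(k_fold_indices(n_samples=10, k=3))
# Each sample index appears in exactly one test fold and in the other k-1 training sets.
```

A typical use is to train on `train`, score on `test` for each fold, and average the k scores.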

Sources
The Language of Trustworthy AI: An In-Depth Glossary of Terms — NIST AI RMF
primary · National Institute of Standards and Technology (NIST) · 2023-03-30
CS229: Machine Learning Lecture Notes — Stanford University
academic · Stanford University Department of Computer Science · 2022-09-01

What is an activation function in a neural network?

An activation function introduces non-linearity into a neural network, enabling it to model complex relationships. Common choices include ReLU (Rectified Linear Unit), sigmoid, and softmax. Without non-linear activations, stacking layers would be mathematically equivalent to a single linear transformation, severely limiting model capacity. [Source: MIT OpenCourseWare]
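
The three functions named above, and the collapse of stacked linear layers, can be sketched in a few lines. The scalar "layers" and weight values below are illustrative assumptions; real layers use weight matrices, but the same argument applies.

```python
import math

def relu(x):
    """Rectified Linear Unit: passes positive values, zeroes negative ones."""
    return max(0.0, x)

def sigmoid(x):
    """Squashes any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def softmax(xs):
    """Turns a list of scores into probabilities that sum to 1."""
    m = max(xs)                            # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

inputs = (-1.0, 0.5, 2.0)
w1, w2 = 3.0, -2.0

# Two stacked linear layers without an activation collapse into one:
# w2 * (w1 * x) equals (w2 * w1) * x for every x.
stacked = [w2 * (w1 * x) for x in inputs]
single = [(w2 * w1) * x for x in inputs]

# With ReLU in between, the composition is no longer a single linear map.
with_relu = [w2 * relu(w1 * x) for x in inputs]
```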

Sources
6.034 Artificial Intelligence Lecture Notes — MIT OpenCourseWare
academic · MIT OpenCourseWare · 2022-01-01

What is reinforcement learning?

Reinforcement learning trains an agent to maximize cumulative reward by interacting with an environment, learning which actions lead to favorable outcomes through trial and error. Unlike supervised learning, no labeled dataset is required; the reward signal itself guides learning via policies updated over many episodes. [Source: DeepMind / Nature]
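
Learning from a reward signal by trial and error can be sketched with a multi-armed bandit, which is a deliberately simplified setting: there are no states or episodes, only a choice among actions. The epsilon-greedy strategy, the fixed per-arm rewards, and all names below are assumptions made for illustration.

```python
import random

def run_bandit(rewards, steps=500, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a bandit whose arms pay fixed rewards."""
    rng = random.Random(seed)
    q = [0.0] * len(rewards)       # estimated value of each action
    counts = [0] * len(rewards)
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(rewards))                    # explore
        else:
            action = max(range(len(rewards)), key=lambda a: q[a])   # exploit
        reward = rewards[action]   # the environment's reward signal
        counts[action] += 1
        q[action] += (reward - q[action]) / counts[action]          # running mean
        total += reward
    return q, total

q_values, total_reward = run_bandit(rewards=[0.0, 1.0])
```

No labels are provided: the agent only sees the reward for the action it took, and its value estimates come to favor the better arm once exploration has tried it.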

Sources
The Language of Trustworthy AI: An In-Depth Glossary of Terms — NIST AI RMF
primary · National Institute of Standards and Technology (NIST) · 2023-03-30

What are transformer models in machine learning?

Transformers are neural network architectures introduced in the 2017 paper 'Attention Is All You Need' that rely on self-attention mechanisms rather than recurrence to model relationships across sequences. They underpin large language models like GPT and BERT and have been extended to vision, audio, and multimodal tasks. [Source: Google Research / arXiv]
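
The core self-attention operation from that paper, softmax(QKᵀ/√d_k)V, can be sketched without any framework. This is a single-head illustration under stated assumptions: the learned query, key, and value projections are omitted (treated as the identity), and the token vectors are made-up values.

```python
import math

def softmax(xs):
    """Turns a list of scores into weights that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)   # how strongly this position attends to each position
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Self-attention: queries, keys, and values all come from the same sequence.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = attention(tokens, tokens, tokens)
```

Every position is compared with every other position in one step, which is what lets the architecture relate distant elements of a sequence without recurrence.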

Sources
Attention Is All You Need — arXiv:1706.03762
academic · arXiv / Google Research · 2017-06-12
Artificial Intelligence Definitions FAQ — Stanford HAI
academic · Stanford Human-Centered Artificial Intelligence Institute · 2023-08-01

What is transfer learning and why is it useful?

Transfer learning reuses a model pre-trained on a large dataset as a starting point for a related task, significantly reducing the labeled data and compute needed for good performance. Fine-tuning, often of only the final layers, adapts the general representations to domain-specific tasks such as medical imaging or legal text classification. [Source: Stanford HAI]

Sources
Artificial Intelligence Definitions FAQ — Stanford HAI
academic · Stanford Human-Centered Artificial Intelligence Institute · 2023-08-01
Machine Learning Research Group — The Alan Turing Institute
academic · The Alan Turing Institute · 2024-01-01

What is feature engineering in machine learning?

Feature engineering is the process of selecting, transforming, or creating input variables from raw data to improve model performance. Effective features encode domain knowledge, reduce dimensionality, and help algorithms identify meaningful patterns. It remains critical in classical ML, though deep learning automates much of this process from raw inputs. [Source: NIST]
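
Two common transformations, shown here only as examples of the idea, are standardizing a numeric column and one-hot encoding a categorical one. The column names and values are hypothetical.

```python
import math

def standardize(values):
    """Rescale a numeric column to zero mean and unit variance (z-scores).

    Assumes the column is not constant, otherwise the standard deviation is zero.
    """
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values]

def one_hot(values):
    """Turn a categorical column into one binary column per category (sorted order)."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]

ages = [20.0, 30.0, 40.0]
colors = ["red", "blue", "red"]
scaled_ages = standardize(ages)
encoded_colors = one_hot(colors)   # columns are ordered: blue, red
```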

Sources
The Language of Trustworthy AI: An In-Depth Glossary of Terms — NIST AI RMF
primary · National Institute of Standards and Technology (NIST) · 2023-03-30
Machine Learning Research Group — The Alan Turing Institute
academic · The Alan Turing Institute · 2024-01-01

What is dimensionality reduction and what techniques are commonly used?

Dimensionality reduction compresses high-dimensional data into fewer features while preserving important structure, reducing computational cost and mitigating the curse of dimensionality. Principal Component Analysis (PCA) and t-SNE are widely used: PCA finds the linear projections that retain maximum variance, while t-SNE is a nonlinear method used mainly to visualize cluster structure in two or three dimensions. [Source: MIT OpenCourseWare]
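
The "direction of maximum variance" that PCA looks for can be sketched for two-dimensional data. This illustration uses power iteration on the covariance matrix, which is one simple way to find the leading component and is an assumption made here (library implementations usually use an eigendecomposition or SVD); the data points are made up.

```python
import math

def first_principal_component(points, iterations=100):
    """Direction of maximum variance for 2-D points, via power iteration."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(p[0] - mean_x, p[1] - mean_y) for p in points]
    # Entries of the 2x2 sample covariance matrix [[cxx, cxy], [cxy, cyy]].
    cxx = sum(x * x for x, _ in centered) / (n - 1)
    cyy = sum(y * y for _, y in centered) / (n - 1)
    cxy = sum(x * y for x, y in centered) / (n - 1)
    vx, vy = 1.0, 0.0   # arbitrary starting direction
    for _ in range(iterations):
        # Multiply by the covariance matrix, then renormalize to unit length.
        vx, vy = cxx * vx + cxy * vy, cxy * vx + cyy * vy
        norm = math.hypot(vx, vy)
        vx, vy = vx / norm, vy / norm
    return (vx, vy), centered

# Points scattered close to the line y = x.
points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.0)]
component, centered = first_principal_component(points)
# Project each centered point onto the component: two features become one.
projected = [x * component[0] + y * component[1] for x, y in centered]
```

Because the points lie near the diagonal, the recovered direction is close to it, and the single projected coordinate keeps most of the spread in the data.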

Sources
6.034 Artificial Intelligence Lecture Notes — MIT OpenCourseWare
academic · MIT OpenCourseWare · 2022-01-01
CS229: Machine Learning Lecture Notes — Stanford University
academic · Stanford University Department of Computer Science · 2022-09-01

How do you evaluate the performance of a machine learning model?

Model evaluation depends on the task: classification uses accuracy, precision, recall, F1-score, and ROC-AUC; regression uses MAE, RMSE, and R². All metrics should be computed on a held-out test set or via cross-validation to reflect true generalization rather than memorization of training examples. [Source: NIST]
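
Several of the metrics listed above follow directly from counts of correct and incorrect predictions. The sketch below covers accuracy, precision, recall, F1, MAE, and RMSE for binary labels and numeric targets; the function names and sample predictions are illustrative, and ROC-AUC and R² are left out for brevity.

```python
import math

def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}

def regression_metrics(y_true, y_pred):
    """Mean absolute error and root mean squared error."""
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return {"mae": mae, "rmse": rmse}

# Hypothetical held-out labels and predictions.
cls = classification_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
reg = regression_metrics([3.0, 5.0, 2.0], [2.0, 5.0, 4.0])
```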

Sources
The Language of Trustworthy AI: An In-Depth Glossary of Terms — NIST AI RMF
primary · National Institute of Standards and Technology (NIST) · 2023-03-30
Machine Learning Research Group — The Alan Turing Institute
academic · The Alan Turing Institute · 2024-01-01

What is fairness in machine learning and why does it matter?

ML fairness refers to ensuring that model predictions do not systematically disadvantage individuals based on protected attributes such as race, gender, or age. NIST's AI Risk Management Framework identifies fairness as a core trustworthy-AI property and recommends bias testing throughout the model development lifecycle. [Source: NIST]

Sources
AI Risk Management Framework (AI RMF 1.0) — NIST
primary · National Institute of Standards and Technology (NIST) · 2023-01-26
The Language of Trustworthy AI: An In-Depth Glossary of Terms — NIST AI RMF
primary · National Institute of Standards and Technology (NIST) · 2023-03-30

What is model interpretability and why is it important?

Model interpretability describes the degree to which humans can understand why a model produces a given prediction. It is critical for debugging, regulatory compliance, and building user trust. Techniques include SHAP values, LIME, saliency maps, and attention visualization, each suited to different model types and explanation goals. [Source: NIST AI RMF]

Sources
AI Risk Management Framework (AI RMF 1.0) — NIST
primary · National Institute of Standards and Technology (NIST) · 2023-01-26
Artificial Intelligence Definitions FAQ — Stanford HAI
academic · Stanford Human-Centered Artificial Intelligence Institute · 2023-08-01