Topic 22 – Introduction to Machine Learning

Why do I need to learn about machine learning?

Machine learning has been used to solve many important and difficult problems, including speech recognition, speech synthesis, image recognition, autonomous driving, and chatbots. Today, a key skill for software developers is the ability to use machine learning algorithms to solve real-world problems.

What can I do after finishing learning about machine learning?

You will be able to create software that can, for example, recognize a car's license plate number from an image or estimate the probability that a patient has breast cancer.

That sounds useful! What should I do now?

First, please audit these courses to learn the core concepts of machine learning and gain hands-on experience with them:

After that, please read the following books to reinforce your theoretical understanding and practical competence in machine learning:

After that, please audit this course and read its readings to learn the core approaches and algorithms for building artificial intelligence systems: MIT 6.034 – Artificial Intelligence, Fall 2010 (Readings).

After that, please read the following books to study the mathematical foundations underlying machine learning algorithms:

After that, please audit the following courses and read the book below to learn the core concepts and algorithms of reinforcement learning:

Supervised Learning Terminology Review:

  • Artificial Intelligence.
  • Machine Learning.
  • Deep Learning.
  • Linear Regression: Y = θX + Ε.
  • Cost Function measures how good/bad your model is.
  • Mean Square Error (MSE) measures the average of the squares of the errors.
  • Gradient Descent, Learning Rate.
  • Batch Gradient Descent.
  • The R-Squared Test measures the proportion of the total variance in the output (y) that can be explained by the variation in x. It can be used to evaluate how good a “fit” some model is on the given data.
  • Stochastic Gradient Descent.
  • Mini-Batch Gradient Descent.
  • Overfitting: A machine learning model gives accurate predictions for the training data but not for new data.
  • Regularization: Ridge Regression, Lasso Regression, Elastic Net, Early Stopping.
  • Normalization.
  • Logistic Regression.
  • Sigmoid Function.
  • Binary Cross Entropy Loss Function, Log Loss Function.
  • One Hot Encoding.
  • The Softmax function takes an N-dimensional vector of arbitrary real values and produces another N-dimensional vector with real values in the range (0, 1) that add up to 1.0.
  • Softmax Regression.
  • Gradient Ascent.
  • Newton’s Method.
  • Support Vector Machines.
  • Decision Trees.
  • Parametric vs. Non-parametric Models.
  • K-Nearest Neighbors.
  • Locally Weighted Regression.
  • McCulloch-Pitts Neuron.
  • Linear Threshold Unit with threshold T calculates the weighted sum of its inputs, and then outputs 0 if this sum is less than T, and 1 if the sum is greater than T.
  • Perceptron.
  • Artificial Neural Networks.
  • Backpropagation.
  • Activation Functions: Rectified Linear Unit (ReLU), Leaky ReLU, Sigmoid, Hyperbolic Tangent.
  • Batch Normalization.
  • Learning Rate Decay.
  • Exponentially Weighted Averages.
  • Gradient Descent Optimization Algorithms: Momentum, Adagrad, Adadelta, RMSprop, Adam.
  • Regularization: Dropout.
  • The Joint Probability Table.
  • Bayesian Networks.
  • Naive Bayes Inference.
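
Several of the definitions above can be made concrete in a few lines of code. The following is a minimal sketch in plain Python, not a reference implementation. All function names are chosen here for illustration. The gradient descent routine assumes the single-parameter model Y = θX + Ε with no intercept, and the threshold unit assumes that a sum exactly equal to T outputs 1, a case the definition above leaves open.

```python
import math


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Proportion of the variance in y_true explained by the predictions."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def batch_gradient_descent(xs, ys, learning_rate=0.05, steps=200):
    """Fit theta in y = theta * x by minimizing MSE over the whole data set."""
    theta = 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of MSE with respect to theta, using every sample (one batch).
        gradient = (2.0 / n) * sum((theta * x - y) * x for x, y in zip(xs, ys))
        theta -= learning_rate * gradient
    return theta


def softmax(values):
    """Map N real values to N positive values that sum to 1."""
    # Subtracting the maximum leaves the result unchanged but avoids overflow.
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def linear_threshold_unit(inputs, weights, threshold):
    """Output 1 if the weighted sum reaches the threshold, otherwise 0."""
    # Assumption: a sum exactly equal to the threshold outputs 1.
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return 1 if weighted_sum >= threshold else 0
```

For example, `batch_gradient_descent([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])` should move θ toward 2, and `softmax([1.0, 2.0, 3.0])` returns three probabilities, the largest of which belongs to the input 3.0.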

Unsupervised Learning Terminology Review:

  • K-Means.
  • Principal Component Analysis.
  • User-Based Collaborative Filtering.
  • Item-based Collaborative Filtering.
  • Matrix Factorization.

Reinforcement Learning Terminology Review:

    • k-armed Bandit Problem.
    • Sample-Average Method.
    • Greedy Action.
    • Exploration and Exploitation.
    • ϵ-Greedy Action Selection.
      • Bandit Algorithm.
      • Exponential Recency-Weighted Average.
      • Optimistic Initial Values.
      • Upper-Confidence-Bound Action Selection.
      • Rewards.
      • Agent, Actions, World or Environment.
      • History, States, Terminal State, Environment State, Agent State, Information State.
      • Fully Observable Environments.
      • Partially Observable Environments.
      • Policy, Value Function, Model.
      • Value Based RL Agent, Policy Based RL Agent, Actor Critic RL Agent.
      • Model Free RL Agent, Model Based RL Agent.
      • Learning Problem and Planning Problem.
      • Prediction and Control.
      • Markov Property.
      • State Transition Matrix.
      • Markov Process.
      • Episodic Tasks.
      • Continuing Tasks.
      • Horizon (H): Number of time steps in each episode, can be infinite.
      • Markov Reward Process.
      • Discount Factor, Discount Rate: 0 ≤ γ ≤ 1.
      • Return.
      • Discounted Return: Discounted sum of rewards from time step t to horizon H.
      • State-Value Function.
      • Bellman Equation for Markov Reward Processes.
      • Markov Decision Process.
      • Policy: Mapping from states to actions. Deterministic policy: π(s) = a. Stochastic policy: π(a|s) = P(aₜ=a|sₜ=s).
      • State Value Function – Vπ(s): The expected return starting from state s following policy π.
      • Bellman Expectation Equation for Vπ.
      • Action Value Function (also known as State-Action Value Function or the Quality Function) – Qπ(s, a): The expected return starting from state s, taking action a, then following policy π.
      • Bellman Expectation Equation for Qπ.
      • Optimal State Value Function.
      • Optimal Action Value Function.
      • Bellman Optimality Equation for v*.
      • Bellman Optimality Equation for q*.
      • Optimal Policies.
      • Dynamic Programming.
      • Iterative Policy Evaluation.
      • Policy Improvement.
      • Policy Improvement Theorem.
      • Policy Iteration.
      • Value Iteration.
      • Synchronous Dynamic Programming.
      • Asynchronous Dynamic Programming.
      • Generalized Policy Iteration.
      • Bootstrapping: Updating estimates on the basis of other estimates.
      • Monte-Carlo Policy Evaluation.
      • First-Visit Monte-Carlo Policy Evaluation.
      • Every-Visit Monte-Carlo Policy Evaluation.
      • Incremental Mean.
      • Incremental Monte-Carlo Updates.
      • Temporal-Difference Learning.
      • Forward-View TD(λ).
      • Eligibility Traces.
      • Backward-View TD(λ).
      • On-Policy Learning.
      • Off-Policy Learning.
      • ϵ-Greedy Exploration.
      • ϵ-greedy Policies: Most of the time they choose an action that has maximal estimated action value, but with probability ϵ they instead select an action at random.
      • Monte-Carlo Policy Iteration. Policy evaluation: Monte-Carlo policy evaluation, Q = qπ. Policy improvement: ϵ-greedy policy improvement.
      • Monte-Carlo Control. Policy evaluation: Monte-Carlo policy evaluation, Q ≈ qπ. Policy improvement: ϵ-greedy policy improvement.
      • Exploring Starts: Specify that the episodes start in a state–action pair, and that every pair has a nonzero probability of being selected as the start.
      • Monte Carlo Control Exploring Starts.
      • Greedy in the Limit with Infinite Exploration (GLIE) Monte-Carlo Control.
      • ϵ-soft Policies: Policies for which π(a|s) ≥ ϵ/|A(s)| for all states and actions, for some ϵ > 0.
      • On-Policy First-Visit MC Control.
      • SARSA: State (S), Action (A), Reward (R), State (S’), Action (A’).
      • On-Policy Control with SARSA. Policy evaluation: SARSA evaluation, Q ≈ qπ. Policy improvement: ϵ-greedy policy improvement.
      • Forward-View SARSA (λ).
      • Backward-View SARSA (λ).
      • Target Policy.
      • Behavior Policy.
      • Importance Sampling: Use samples from one distribution to estimate the expectation of a different distribution.
      • Importance Sampling for Off-Policy Monte-Carlo.
      • Importance Sampling for Off-Policy TD.
      • Q-Learning: Next action is chosen using behaviour policy. Q is updated using alternative successor action.
      • Off-Policy Control with Q-Learning.
      • Expected SARSA.
      • Value Function Approximation.
      • Function Approximators.
      • Differentiable Function Approximators.
      • Feature Vectors.
      • State Aggregation.
      • Coarse Coding.
      • Tile Coding.
      • Continuous States.
      • Incremental Prediction Algorithms.
      • Control with Value Function Approximation. Policy evaluation: Approximate policy evaluation, q(.,., w) ≈ qπ. Policy improvement: ϵ-greedy policy improvement.
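      • Learning the State-Action Value Function: The Replay Buffer holds the 10,000 most recent tuples (s, a, R(s), s’). x = (s, a) → Q(θ) → y = R(s) + γ max Q(s’, a’; θ). TD error = [R(s) + γ max Q(s’, a’; θ)] − Q(s, a; θ), where the maximum is taken over the next action a’; the loss is typically the mean of the squared TD error.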
      • Expected SARSA with Function Approximation.
      • Target Network: A separate neural network for generating the y targets. It has the same architecture as the original Q-Network. TD error = [R(s) + γ max TargetQ(s’, a’; θ′)] − Q(s, a; θ), where the maximum is taken over the next action a’. The TargetQ-Network generates the y targets, and every C time steps its weights are updated using the weights of the Q-Network.
      • Soft Updates: θ′ ← 0.001θ + 0.999θ′, where θ′ and θ represent the weights of the target network and the current network, respectively.
      • Deep Q-learning.
      • Linear Least Squares Prediction Algorithms.
      • Least Squares Policy Iteration. Policy evaluation: Least squares Q-Learning. Policy improvement: Greedy policy improvement.
      • Average Reward.
      • Discounted Returns, Returns for Average Reward.
      • Stochastic Policies.
      • Softmax Policies.
      • Gaussian Policies.
      • Policy Objective Functions: Start State Objective, Average Reward Objective and Average Value Objective.
      • Score Function.
      • Policy Gradient Theorem.
      • Monte-Carlo Policy Gradient (REINFORCE).
      • Action-Value Actor-Critic: Critic updates w by linear TD(0). Actor updates θ by policy gradient.
      • The Tabular Dyna-Q Algorithm.
      • The Dyna-Q+ Algorithm.
      • Forward Search.
      • Simulation-Based Search.
      • Monte-Carlo Tree Search.
      • Temporal-Difference Search.
      • Dyna-2.
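
A handful of the reinforcement learning terms above describe small update rules that are easy to express in code. The following is a minimal sketch in plain Python under stated assumptions, not a definitive implementation. The function names are illustrative, the Q-table is assumed to be a dictionary mapping each state to a list of action values, the network weights are assumed to be flat lists of numbers, and the Q-learning step assumes the next state is not terminal. The default `tau=0.001` mirrors the 0.001/0.999 split in the soft update entry.

```python
import random


def incremental_mean(old_mean, new_value, count):
    """Update a running mean after seeing the count-th value (count >= 1)."""
    return old_mean + (new_value - old_mean) / count


def discounted_return(rewards, gamma):
    """Discounted sum r_0 + gamma * r_1 + gamma^2 * r_2 + ... of the rewards."""
    g = 0.0
    # Work backwards so each step computes G_t = r_t + gamma * G_{t+1}.
    for reward in reversed(rewards):
        g = reward + gamma * g
    return g


def epsilon_greedy(action_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(action_values))
    # Ties between maximal actions are broken by the lowest index.
    return action_values.index(max(action_values))


def q_learning_update(q, state, action, reward, next_state, alpha, gamma):
    """One tabular Q-learning step, modifying q in place."""
    # The target uses the best next action, whatever the behaviour policy does.
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


def soft_update(target_weights, current_weights, tau=0.001):
    """Blend a small fraction of the current weights into the target weights."""
    return [tau * w + (1.0 - tau) * tw
            for tw, w in zip(target_weights, current_weights)]
```

Setting `epsilon` to 0 makes `epsilon_greedy` purely greedy, while setting it to 1 makes every choice uniformly random, which is one way to see the exploration and exploitation trade-off in the list above.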

Probabilistic Machine Learning Terminology Review:

      • Probabilistic Machine Learning.
      • Non-Probabilistic Machine Learning.
      • Algorithmic Machine Learning.
      • Array Programming.
      • Frequentist and Bayesian Approaches.

After finishing machine learning, please click on Topic 23 – Introduction to Computer Vision to continue.

       
