Keras | Software Development

Why do I need to learn about computer vision?

Computer vision has become an increasingly interesting field, with achievements such as image recognition, autonomous driving, and disease detection.

Nowadays, a key skill for software developers is the ability to use computer vision algorithms, foundation models, and tools to solve real-world problems involving images and videos.

What can I do after finishing learning about applied computer vision?

You will be able to create software that can recognize a face or transform a picture of a young person into an older person.

That sounds fun! What should I do now?

First, please take a quick look at the following two books to grasp the core concepts and methods in computer vision:

After that, please audit the course and read the book below to solidify your knowledge and gain hands-on experience with computer vision algorithms:

After that, please audit the following courses to grasp the core concepts of generative adversarial networks and gain hands-on experience with them:

After that, please audit the following courses and read the book below to grasp the core concepts of generative models, including diffusion models, and to gain hands-on experience with these models:

After that, please audit this course to learn how to efficiently represent, compress, and train large generative models: TinyML and Efficient Deep Learning Computing.

Terminology Review:

Digital Image: f(x, y)
Intensity (Gray Level): ℓ = f(x, y)
Gray Scale: ℓ = 0 is considered black and ℓ = L – 1 is considered white.
Quantization: Digitizing the amplitude values.
Sampling: Digitizing the coordinate values.
Representing Digital Images: Matrix or Vector.
Pixel or Picture Element: An element of the matrix or vector.
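To make sampling and quantization concrete, here is a minimal sketch in plain Python. It assumes a continuous image f(x, y) that returns amplitudes in [0, 1]; the helper names `sample_image`, `quantize`, and `ramp` are illustrative and do not come from any library.

```python
def sample_image(f, rows, cols):
    """Sampling: evaluate f(x, y) on a rows x cols grid over [0, 1) x [0, 1)."""
    return [[f(r / rows, c / cols) for c in range(cols)] for r in range(rows)]


def quantize(image, levels):
    """Quantization: map amplitudes in [0, 1] to gray levels 0 .. levels - 1."""
    return [[min(int(v * levels), levels - 1) for v in row] for row in image]


def ramp(x, y):
    """Hypothetical continuous image: brightness grows from left to right."""
    return y


# A 2 x 4 digital image with L = 4 gray levels (0 is black, 3 is white).
digital_image = quantize(sample_image(ramp, rows=2, cols=4), levels=4)
```

The result is a matrix whose elements are the pixels, each holding one of the L gray levels.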
∞×∞
Computer Vision Tasks: Image Classification, Object Segmentation, Style Transfer, Image Colorization, Image Reconstruction, Image Super-Resolution, Generating Images.
Deep Learning.
Artificial Neural Networks.
∞×∞
Filter: A 2-dimensional matrix, commonly square, containing weights that are shared across the entire input space.
The Convolution Operation: Multiply the filter and the input patch under it element-wise, then sum the products.
Stride: Filter step size.
Padding.
Upsampling: Nearest Neighbors, Linear Interpolation, Bilinear Interpolation.
Max Pooling, Average Pooling, Min Pooling.
Convolutional Layers.
Feature Maps.
Convolutional Neural Networks (CNNs), ResNet.
Receptive Field, Strided Convolution Layer, Grouped Convolution Layer.
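The filter, convolution operation, stride, and padding terms above can be sketched together in plain Python. This is a minimal illustration, assuming a square filter and zero padding; like most deep learning libraries it does not flip the filter, so strictly speaking it computes a cross-correlation. The function name `conv2d` is an illustrative choice.

```python
def conv2d(image, kernel, stride=1, padding=0):
    """Slide `kernel` over `image`, multiplying element-wise and summing."""
    k = len(kernel)  # assumes a square k x k filter
    width = len(image[0]) + 2 * padding
    # Zero padding: surround the image with rows and columns of zeros.
    padded = [[0] * width for _ in range(padding)]
    padded += [[0] * padding + list(row) + [0] * padding for row in image]
    padded += [[0] * width for _ in range(padding)]
    out_h = (len(padded) - k) // stride + 1
    out_w = (width - k) // stride + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            top, left = i * stride, j * stride  # stride is the filter step size
            row.append(sum(padded[top + a][left + b] * kernel[a][b]
                           for a in range(k) for b in range(k)))
        output.append(row)
    return output


feature_map = conv2d([[1, 2, 3], [4, 5, 6], [7, 8, 9]], [[1, 0], [0, 1]])
```

The output of one filter is one feature map; a convolutional layer applies several filters and stacks their feature maps.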
∞×∞
Object Localization.
Bounding Box.
Landmark Detection.
Sliding Windows Detection.
Bounding Box Predictions.
Intersection over Union.
Non-max Suppression Algorithm.
Anchor Box Algorithm.
∞×∞
Object Detection.
YOLO Algorithm.
∞×∞
Semantic Segmentation.
Transpose Convolution.
U-Net.
∞×∞
Face Verification.
Face Recognition.
One-shot Learning.
Siamese Network.
Triplet Loss.
∞×∞
Neural Style Transfer.
Content Cost Function.
Style Cost Function.
∞×∞
1D Convolution.
3D Convolution.
∞×∞
Latent Variable.
Autoencoders.
Variational Autoencoders.
Generators.
Discriminators.
Binary Cross Entropy Loss Function, Log Loss Function.
Generative Adversarial Networks (GANs).
Deep Convolutional Generative Adversarial Networks.
Mode Collapse.
Earth Mover’s Distance.
Wasserstein Loss (W-Loss).
1-Lipschitz Continuous Function.
Wasserstein GANs.
Conditional GANs.
Pixel Distance.
Feature Distance.
Fréchet Inception Distance (FID).
Inception Score (IS).
Autoregressive Models.
Variational Autoencoders (VAEs).
Flow Models.
StyleGAN.
Pix2Pix.
CycleGAN.
∞×∞
Diffusion Models.
∞×∞
Tokenizer.
Embeddings.
Self-Attention.
Multi-Head Attention.
Attention Masking.
Transformer Block.
Positional Encoding.
Vision Transformer.
Contrastive Language-Image Pre-training (CLIP) Models.
Visual Language Models: Flamingo.
∞×∞
Magnitude-based Pruning.
K-Means-based Weight Quantization.
Linear Quantization.
∞×∞
Neural Architecture Search.
∞×∞
Knowledge Distillation.
Self and Online Distillation.
Network Augmentation.
∞×∞
Loop Reordering, Loop Tiling, Loop Unrolling.
SIMD (Single Instruction, Multiple Data) Programming.
Multithreading.
CUDA Programming.
∞×∞
Data Parallelism.
Pipeline Parallelism.
Tensor Parallelism.
Hybrid Parallelism.
Automated Parallelism.
Gradient Pruning: Sparse Communication, Deep Gradient Compression, PowerSGD.
Gradient Quantization: 1-Bit SGD, Threshold Quantization, TernGrad.
Delayed Gradient Averaging.

After finishing computer vision, please click on Topic 24 – Introduction to Natural Language Processing to continue.

Why do I need to learn about machine learning?

Machine learning has been used to solve many important and difficult problems, including speech recognition, speech synthesis, image recognition, autonomous driving, and chatbots. Today, a key skill for software developers is the ability to use machine learning algorithms to solve real-world problems.

What can I do after finishing learning about machine learning?

You will be able to create software that can recognize a car's license plate number from an image or estimate the probability that a patient has breast cancer.

That sounds useful! What should I do now?

First, please audit these courses to learn the core concepts of machine learning and gain hands-on experience with them:

After that, please read the following books to reinforce your theoretical understanding and practical competence in machine learning:

After that, please audit this course and read its readings to learn the core approaches and algorithms for building artificial intelligence systems: MIT 6.034 – Artificial Intelligence, Fall 2010 (Readings).

After that, please read the following books to study the mathematical foundations underlying machine learning algorithms:

After that, please audit the following courses and read the book below to learn the core concepts and algorithms of reinforcement learning:

Supervised Learning Terminology Review:

Artificial Intelligence.
Machine Learning.
Deep Learning.
Linear Regression: Y = θᵀX + ε.
Cost Function measures how good or bad your model is.
Mean Square Error (MSE) measures the average of the squares of the errors.
Gradient Descent, Learning Rate.
Batch Gradient Descent.
The R-Squared Test measures the proportion of the total variance in the output (y) that can be explained by the variation in x. It can be used to evaluate how good a “fit” some model is on the given data.
Stochastic Gradient Descent.
Mini-Batch Gradient Descent.
Overfitting: A machine learning model gives accurate predictions for the training data but not for new data.
Regularization: Ridge Regression, Lasso Regression, Elastic Net, Early Stopping.
Normalization.
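Linear regression, the mean squared error, batch gradient descent, and the R-squared test from the list above fit together as in the following sketch. It assumes a single input feature and a tiny hand-made dataset; the function names and the default learning rate and epoch count are illustrative choices, not recommended settings.

```python
def fit_line(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y ≈ w * x + b by batch gradient descent on the mean squared error."""
    w, b, n = 0.0, 0.0, len(xs)
    for _ in range(epochs):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of the MSE, computed over the whole batch.
        grad_w = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2 / n * sum(errors)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def mean_squared_error(xs, ys, w, b):
    """Average of the squared prediction errors."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def r_squared(xs, ys, w, b):
    """Proportion of the variance in y explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (w * x + b)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical data generated from y = 2x + 1 without noise.
xs, ys = [0, 1, 2, 3, 4], [1, 3, 5, 7, 9]
w, b = fit_line(xs, ys)
```

Stochastic and mini-batch gradient descent would use one example or a small subset per update instead of the full batch used here.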
∞×∞
Logistic Regression.
Sigmoid Function.
Binary Cross Entropy Loss Function, Log Loss Function.
One Hot Encoding.
The Softmax function takes an N-dimensional vector of arbitrary real values and produces another N-dimensional vector with real values in the range (0, 1) that add up to 1.0.
Softmax Regression.
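The softmax definition above translates directly into a few lines of Python. This is a minimal sketch; subtracting the maximum before exponentiating is a common numerical-stability step that does not change the result.

```python
import math


def softmax(values):
    """Map N real values to N values in (0, 1) that add up to 1.0."""
    shift = max(values)  # avoids overflow in exp() for large inputs
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


probabilities = softmax([1.0, 2.0, 3.0])
```

Larger inputs receive larger probabilities, which is why softmax regression can read the outputs as class probabilities.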
∞×∞
Gradient Ascent.
Newton’s Method.
∞×∞
Support Vector Machines.
∞×∞
Decision Trees.
Parametric vs. Non-parametric Models.
Iterative Dichotomiser 3 (ID3).
Classification and Regression Trees (CART).
K-Nearest Neighbors.
Locally Weighted Regression.
∞×∞
McCulloch-Pitts Neuron.
Linear Threshold Unit with threshold T calculates the weighted sum of its inputs, and then outputs 0 if this sum is less than T, and 1 if the sum is greater than T.
Perceptron.
Artificial Neural Networks.
Forward Propagation.
Activation Functions: Rectified Linear Unit (ReLU), Leaky ReLU, Sigmoid, Hyperbolic Tangent.
Softmax Layer.
Gradient: How much would the loss move if I nudged this one weight?
Chain Rule.
Automatic Differentiation: Record each operation going forward, then replay the graph backward.
Backpropagation: How much did each weight contribute to the error?
Batch Normalization.
Learning Rate Decay.
Exponentially Weighted Averages.
Gradient Descent Optimization Algorithms: Momentum, Adagrad, Adadelta, RMSprop, Adam.
Regularization: Dropout.
The Joint Probability Table.
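Returning to the linear threshold unit defined earlier in this group, here is a minimal sketch. The definition above does not say what happens when the weighted sum equals T exactly; this sketch assumes the unit outputs 1 in that case. The weights and thresholds for the AND and OR gates are illustrative choices.

```python
def linear_threshold_unit(inputs, weights, threshold):
    """Output 1 if the weighted sum of the inputs reaches the threshold, else 0."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0  # assumption: a tie counts as 1


def logical_and(a, b):
    """AND gate: both inputs must be 1 for the sum to reach 1.5."""
    return linear_threshold_unit([a, b], [1, 1], threshold=1.5)


def logical_or(a, b):
    """OR gate: a single input of 1 is enough to reach 0.5."""
    return linear_threshold_unit([a, b], [1, 1], threshold=0.5)
```

A perceptron uses the same unit but learns the weights from data instead of setting them by hand.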
∞×∞
Bayesian Networks.
Naive Bayes Inference.

Unsupervised Learning Terminology Review:

K-Means.
Principal Component Analysis.
User-Based Collaborative Filtering.
Item-based Collaborative Filtering.
Matrix Factorization.

Artificial Intelligence Terminology Review:

Representations, Generate and Test Method.
Minimax, Alpha-Beta.

Reinforcement Learning Terminology Review:

k-armed Bandit Problem.
Sample-Average Method.
Greedy Action.
Exploration and Exploitation.
ϵ-Greedy Action Selection.
Bandit Algorithm.
Exponential Recency-Weighted Average.
Optimistic Initial Values.
Upper-Confidence-Bound Action Selection.
∞×∞
Rewards.
Agent, Actions, World or Environment.
History, States, Terminal State, Environment State, Agent State, Information State.
Fully Observable Environments.
Partially Observable Environments.
Policy, Value Function, Model.
Value Based RL Agents, Policy Based RL Agents, Actor Critic RL Agents.
Model Free RL Agents, Model Based RL Agents.
Learning Problem and Planning Problems.
Prediction and Control.
∞×∞
Markov State.
State Transition Matrix.
Markov Process.
Episodic Tasks.
Continuing Tasks.
Horizon (H): Number of time steps in each episode, can be infinite.
Markov Reward Process.
Discount Factor, Discount Rate: 0 ≤ γ ≤ 1.
Return.
Discounted Return: Discounted sum of rewards from time step t to horizon H.
State-Value Function of a Markov Reward Process.
Bellman Equation for Markov Reward Processes.
Markov Decision Process.
Policy: Mapping from states to actions.
Deterministic policy: π (s) = a.
Stochastic policy: π (a|s) = P(aₜ=a|sₜ=s).
State-Value Function – Vπ(s): The expected return starting from state s following policy π.
Action-Value Function (also known as State-Action Value Function or the Quality Function) – Qπ(s, a): The expected return starting from state s, taking action a, then following policy π.
Bellman Expectation Equation for Vπ.
Bellman Expectation Equation for Qπ.
Optimal State-Value Function – v*.
Optimal Action-Value Function – q*.
Optimal Policies.
Bellman Optimality Equation for v*.
Bellman Optimality Equation for q*.
Bellman Optimality Equation is non-linear. No closed form solution in general.
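The discounted return defined earlier in this group can be computed with a short backward loop. This sketch assumes a finite horizon and that `rewards[k]` is the reward received k steps after time step t; the function name is an illustrative choice.

```python
def discounted_return(rewards, gamma):
    """Return the sum of gamma**k * rewards[k] over the episode."""
    g = 0.0
    # Work backward so that each step is G_t = R + gamma * G_(t+1).
    for reward in reversed(rewards):
        g = reward + gamma * g
    return g


g = discounted_return([1.0, 1.0, 1.0], gamma=0.5)  # 1 + 0.5 + 0.25
```

With γ = 0 only the immediate reward counts, and with γ = 1 the return is the plain sum of rewards.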
∞×∞
Dynamic Programming.
Iterative Policy Evaluation.
Policy Improvement.
Policy Improvement Theorem.
Policy Iteration.
Value Iteration.
Synchronous Dynamic Programming.
Asynchronous Dynamic Programming.
Generalized Policy Iteration.
Bootstrapping: Updating estimates on the basis of other estimates.
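Value iteration is a compact example of bootstrapping: each state's value is updated from the current estimates of its successor states. The sketch below assumes a small, fully known MDP stored as a dictionary, where `transitions[state][action]` is a list of (probability, next_state, reward) tuples and a terminal state has no actions. This data layout and the toy MDP are assumptions made for illustration.

```python
def value_iteration(transitions, gamma=0.9, tolerance=1e-8):
    """Apply the Bellman optimality backup until the values stop changing."""
    values = {state: 0.0 for state in transitions}
    while True:
        delta = 0.0
        for state, actions in transitions.items():
            if not actions:  # terminal state keeps value 0
                continue
            # Bootstrapping: new estimate built from the current estimates.
            best = max(
                sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)
                for outcomes in actions.values()
            )
            delta = max(delta, abs(best - values[state]))
            values[state] = best  # in-place update within the same sweep
        if delta < tolerance:
            return values


# Hypothetical chain: A -> B -> end, with reward 1 for reaching "end".
toy_mdp = {
    "A": {"go": [(1.0, "B", 0.0)], "stay": [(1.0, "A", 0.0)]},
    "B": {"go": [(1.0, "end", 1.0)], "stay": [(1.0, "B", 0.0)]},
    "end": {},
}
optimal_values = value_iteration(toy_mdp)
```

Because the values are overwritten during a sweep, this variant is closer to asynchronous than to synchronous dynamic programming.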
∞×∞
Monte-Carlo Policy Evaluation.
First-Visit Monte-Carlo Policy Evaluation.
Every-Visit Monte-Carlo Policy Evaluation.
Incremental Mean.
Incremental Monte-Carlo Updates.
Temporal-Difference Learning.
Forward-View TD(λ).
Eligibility Traces.
Backward-View TD(λ).
∞×∞
On-Policy Learning.
ϵ-Greedy Exploration.
ϵ-greedy Policies: Most of the time they choose an action that has maximal estimated action value, but with probability ϵ they instead select an action at random.
Monte-Carlo Policy Iteration. Policy evaluation: Monte-Carlo policy evaluation, Q = qπ. Policy improvement: ϵ-greedy policy improvement.
Monte-Carlo Control. Policy evaluation: Monte-Carlo policy evaluation, Q ≈ qπ. Policy improvement: ϵ-greedy policy improvement.
Exploring Starts: Specify that the episodes start in a state–action pair, and that every pair has a nonzero probability of being selected as the start.
Monte Carlo Control Exploring Starts.
Greedy in the Limit with Infinite Exploration (GLIE) Monte-Carlo Control.
ϵ-soft Policies: Policies for which π(a|s) ≥ ϵ/|A(s)| for all states and actions, for some ϵ > 0.
On-Policy First-Visit MC Control.
SARSA: State (S), Action (A), Reward (R), State (S’), Action (A’).
On-Policy Control with SARSA. Policy evaluation: SARSA evaluation, Q ≈ qπ. Policy improvement: ϵ-greedy policy improvement.
Forward-View SARSA (λ).
Backward-View SARSA (λ).
Expected SARSA.
Off-Policy Learning.
Target Policy: The policy you are trying to evaluate or improve.
Behavior Policy: The policy that actually generates the data.
Importance Sampling: Use samples from one distribution to estimate the expectation of a different distribution.
Importance Sampling for Off-Policy Monte-Carlo.
Importance Sampling for Off-Policy TD.
Off-Policy Control with Q-Learning: Next action is chosen using a behaviour policy (an exploratory policy, often ϵ-greedy). Q is updated using the maximum Q-value over all possible next actions, not necessarily the action selected by the exploratory policy.
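Here is a minimal tabular sketch of off-policy control with Q-learning, using an ϵ-greedy behaviour policy as described above. The four-state chain environment, the hyperparameters, and all function names are assumptions made for illustration.

```python
import random

N_STATES = 4      # hypothetical chain: states 0, 1, 2 and terminal state 3
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Deterministic toy environment: reward 1 only on reaching state 3."""
    next_state = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def epsilon_greedy(q_row, epsilon, rng):
    """With probability epsilon act at random, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    best = max(q_row)
    return rng.choice([a for a, q in enumerate(q_row) if q == best])


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = epsilon_greedy(q[state], epsilon, rng)  # behaviour policy
            next_state, reward = step(state, action)
            # The target takes the max over next actions, whatever is chosen next.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = q_learning()
greedy_policy = [row.index(max(row)) for row in q_table[:-1]]
```

Replacing `max(q[next_state])` with the value of the action actually selected in the next state would turn this update into SARSA, which is on-policy.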
∞×∞
Types of Value Function Approximation: v(s, w), q(s, a, w), [q(s, a1, w), q(s, a2, w), …, q(s, an, w)]
Representing Value Functions.
Value Function Approximation.
Function Approximators.
Feature Vectors.
The Value Error Objective.
Gradient Monte Carlo for Policy Evaluation.
State Aggregation.
Semi-Gradient TD for Policy Evaluation.
Coarse Coding.
Tile Coding.
Continuous States.
Incremental Prediction Algorithms.
Control with Value Function Approximation. Policy evaluation: Approximate policy evaluation, q(.,., w) ≈ qπ. Policy improvement: ϵ-greedy policy improvement.
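Learning the State-Action Value Function: The replay buffer stores the 10,000 most recent tuples (s, a, R(s), s’). Each tuple gives one training example with input x = (s, a) and target y = R(s) + γ max_a’ Q(s’, a’; θ), and the network Q(θ) is trained to map x to y. Loss = ([R(s) + γ max_a’ Q(s’, a’; θ)] − Q(s, a; θ))².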
Expected SARSA with Function Approximation.
Target Network: A separate neural network for generating the y targets. It has the same architecture as the original Q-Network. Loss = ([R(s) + γ max_a’ TargetQ(s’, a’; θ′)] − Q(s, a; θ))². The TargetQ-Network generates the y targets at every training step, and every C time steps its weights are updated using the weights of the Q-Network.
Soft Updates: θ′ ← τθ + (1 − τ)θ′, where θ′ and θ represent the weights of the target network and the current network, respectively.
Deep Q-learning.
Linear Least Squares Prediction Algorithms.
Least Squares Policy Iteration. Policy evaluation: Least squares Q-Learning. Policy improvement: Greedy policy improvement.
Average Reward.
Discounted Returns, Returns for Average Reward.
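The replay buffer and the soft update rule from this group can be sketched without a deep learning framework. This is a simplified illustration that treats network weights as a flat list of floats; the class name `ReplayBuffer` and the function `soft_update` are illustrative, not part of any library.

```python
import random
from collections import deque


class ReplayBuffer:
    """Keep only the most recent experience tuples (s, a, R(s), s')."""

    def __init__(self, capacity=10_000):
        self.memory = deque(maxlen=capacity)  # oldest tuples drop out automatically

    def add(self, state, action, reward, next_state):
        self.memory.append((state, action, reward, next_state))

    def sample(self, batch_size, rng=random):
        """Draw a random mini-batch without replacement."""
        return rng.sample(list(self.memory), batch_size)


def soft_update(target_weights, online_weights, tau):
    """Move each target weight a fraction tau toward the online weight."""
    return [tau * w + (1 - tau) * w_target
            for w, w_target in zip(online_weights, target_weights)]


buffer = ReplayBuffer(capacity=3)
for t in range(5):
    buffer.add(state=t, action=0, reward=0.0, next_state=t + 1)

new_target = soft_update([0.0, 0.0], [1.0, 2.0], tau=0.5)
```

A small τ makes the target network change slowly, which keeps the y targets more stable than copying the weights all at once.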
∞×∞
Parameterized Policies.
Stochastic Policies.
Softmax Policies.
Gaussian Policies.
Policy Objective Functions: Start State Objective, Average Reward Objective and Average Value Objective.
Score Function.
Policy Gradient Theorem.
Monte-Carlo Policy Gradient (REINFORCE).
Action-Value Actor-Critic: Critic updates w by linear TD(0). Actor updates θ by policy gradient.
∞×∞
Random Tabular Q-planning.
Sample-Based Planning.
The Tabular Dyna-Q Algorithm.
The Dyna-Q+ Algorithm.
Forward Search.
Simulation-Based Search.
Monte-Carlo Tree Search.
Temporal-Difference Search.
Dyna-2.

Probabilistic Machine Learning Terminology Review:

Probabilistic Machine Learning.
Non-Probabilistic Machine Learning.
Algorithmic Machine Learning.
Array Programming.
Frequentist and Bayesian Approaches.

After finishing machine learning, please click on Topic 23 – Introduction to Computer Vision to continue.

Tag Archives: Keras

Topic 23 – Introduction to Computer Vision

Topic 22 – Introduction to Machine Learning