All posts by admin

Topic 22 – Introduction to Machine Learning

Why do I need to learn about machine learning?

Machine learning has recently solved many important and difficult problems, including speech recognition, speech synthesis, image recognition, autonomous driving and chat bots.
Nowadays a key skill of a software developer is the ability to use machine learning algorithms to solve real-world problems.

What can I do after finishing learning about machine learning?

You will be able to create software that can recognize a car plate number from an image or estimate the probability of breast cancer for a patient.

That sounds useful! What should I do now?

Please audit
– these Machine Learning Specialization (Coursera) courses and
– this Applied Machine Learning in Python (Coursera) course.

At the same time, please read
– this Aurelien Geron (2022). Hands-On Machine Learning with Scikit-Learn, Keras and TensorFlow. O’Reilly Media book and
– this Brett Lantz (2023). Machine Learning with R. Packt Publishing book.

After that please watch
– this MIT 6.034 – Artificial Intelligence, Fall 2010 course (Readings).

After that please read
– this Tom M. Mitchell (1997). Machine Learning. McGraw-Hill Education book, and
– this Christopher M. Bishop (2006). Pattern Recognition and Machine Learning. Springer book.

After that please audit these Reinforcement Learning Specialization (Coursera) courses.
At the same time, please read this Richard S. Sutton and Andrew G. Barto (2020). Reinforcement Learning. The MIT Press book.

Supervised Learning Terminology Review:

  • Artificial Intelligence.
  • Machine Learning.
  • Deep Learning.
  • Linear Regression: Y = θX + Ε.
  • Cost Function measures how good/bad your model is.
  • Mean Square Error (MSE) measures the average of the squares of the errors.
  • Gradient Descent, Learning Rate.
  • Batch Gradient Descent.
  • The R-Squared Test measures the proportion of the total variance in the output (y) that can be explained by the variation in x. It can be used to evaluate how good a “fit” some model is on the given data.
  • Stochastic Gradient Descent.
  • Mini-Batch Gradient Descent.
  • Overfitting: machine learning model gives accurate predictions for training data but not for new data.
  • Regularization: Ridge Regression, Lasso Regression, Elastic Net, Early Stopping.
  • Normalization.
  • Logistic Regression.
  • Sigmoid Function.
  • Binary Cross Entropy Loss Function, Log Loss Function.
  • One Hot Encoding.
  • The Softmax function takes an N-dimensional vector of arbitrary real values and produces another N-dimensional vector with real values in the range (0, 1) that add up to 1.0.
  • Softmax Regression.
  • Support Vector Machines.
  • Decision Trees.
  • K-Nearest Neighbors.
  • McCulloch-Pitts Neuron.
  • Linear Threshold Unit with threshold T calculates the weighted sum of its inputs, and then outputs 0 if this sum is less than T, and 1 if the sum is greater than or equal to T.
  • Perceptron.
  • Artificial Neural Networks.
  • Backpropagation.
  • Activation Functions: Rectified Linear Unit (ReLU), Leaky ReLU, Sigmoid, Hyperbolic Tangent.
  • Batch Normalization.
  • Gradient Descent Optimization Algorithms: Momentum, Adagrad, Adadelta, RMSprop, Adam.
  • Regularization: Dropout.
  • The Joint Probability Table.
  • Bayesian Networks.
  • Naive Bayes Inference.
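
Several of the terms above (Linear Regression, Cost Function, MSE, Batch Gradient Descent, Learning Rate, R-Squared) fit together in one small example. The following is a minimal sketch in plain Python, not a reference implementation: the function names, the toy data set, the learning rate of 0.05 and the 2000 epochs are all assumptions chosen for illustration.

```python
def mse(theta0, theta1, xs, ys):
    """Cost function: mean of the squared errors of the line y = theta0 + theta1 * x."""
    n = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / n


def batch_gradient_descent(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = theta0 + theta1 * x by following the MSE gradient over the full batch."""
    theta0, theta1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of the MSE with respect to theta0 and theta1.
        grad0 = 2.0 / n * sum(errors)
        grad1 = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        theta0 -= learning_rate * grad0
        theta1 -= learning_rate * grad1
    return theta0, theta1


def r_squared(theta0, theta1, xs, ys):
    """Proportion of the variance in y explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (theta0 + theta1 * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy data lying exactly on y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
theta0, theta1 = batch_gradient_descent(xs, ys)
print(theta0, theta1, mse(theta0, theta1, xs, ys), r_squared(theta0, theta1, xs, ys))
```

Stochastic and mini-batch gradient descent differ only in how many examples feed each gradient estimate: one example, or a small random subset, instead of the whole data set.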

Unsupervised Learning Terminology Review:

  • K-Means.
  • Principal Component Analysis.
  • User-Based Collaborative Filtering.
  • Item-based Collaborative Filtering.
  • Matrix Factorization.
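
As an illustration of K-Means, here is a minimal sketch of Lloyd's algorithm on 2-D points. The seeding rule (use the first k points as initial centroids), the fixed iteration count and the sample points are assumptions made to keep the example deterministic; practical implementations usually use random or k-means++ initialization.

```python
def k_means(points, k, iterations=10):
    """Lloyd's algorithm on 2-D points; the first k points seed the centroids."""
    centroids = [points[i] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:  # keep the old centroid if a cluster is empty
                centroids[i] = (
                    sum(p[0] for p in cluster) / len(cluster),
                    sum(p[1] for p in cluster) / len(cluster),
                )
    return centroids, clusters


# Hypothetical data: two well-separated groups of three points each.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.0), (1.0, 0.5), (9.0, 8.5)]
centroids, clusters = k_means(points, k=2)
print(centroids)
```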

Reinforcement Learning Terminology Review:

  • k-armed Bandit Problem.
  • Bandit Algorithm.
  • Exponential Recency-Weighted Average.
  • Optimistic Initial Values.
  • Upper-Confidence-Bound Action Selection.
  • Agent.
  • World.
  • States, Terminal State.
  • Actions.
  • Rewards.
  • Markov Decision Processes: Agent (π) >> Action (a) >> World >> State (s), Reward >> Agent (π). Model: (current state, action, reward of current state, next state) = (s, a, R(s), s’).
  • Episodes.
  • Continuing Tasks.
  • Horizon (H): Number of time steps in each episode, can be infinite.
  • Expected Return: Sum of rewards from time step t to horizon H.
  • Discounted Return: Discounted sum of rewards from time step t to horizon H.
  • Discount Factor, Discount Rate: 0 ≤ γ ≤ 1.
  • Policy: Mapping from states to actions: π (s) = a or π (a|s) = P(aₜ=a|sₜ=s).
  • State Value Function – Vπ(s): The expected return starting from state s following policy π.
  • State-Action Value function, also known as the quality function – Qπ(s, a): The expected return starting from state s, taking action a, then following policy π.
  • Bellman Equations.
  • Optimal Value Functions.
  • Optimal Policies.
  • Bellman Optimality Equations.
  • Policy Evaluation: (MDP, π) → Linear System Solver, Dynamic Programming → Vπ.
  • Iterative Policy Evaluation.
  • Policy Control, Policy Improvement.
  • Policy Improvement Theorem.
  • Greedy Policy.
  • Policy Iteration: (MDP) → Dynamic Programming → Vπ-optimal.
  • Value Iteration: MDP → (Qopt, πopt).
  • Asynchronous Dynamic Programming.
  • Generalized Policy Iteration.
  • Bootstrapping: Updating estimates on the basis of other estimates.
  • First-Visit Monte Carlo Prediction.
  • Exploring Starts.
  • Monte Carlo Control Exploring Starts.
  • On-Policy Methods.
  • ϵ-greedy Policies: Most of the time they choose an action that has maximal estimated action value, but with probability ϵ they instead select an action at random.
  • ϵ-soft Policies: Policies for which π(a|s) ≥ ϵ/|A(s)| for all states and actions, for some ϵ > 0.
  • On-Policy First-Visit MC Control.
  • Off-Policy Learning.
  • Target Policy.
  • Behavior Policy.
  • Importance Sampling.
  • Off-Policy Monte Carlo Prediction.
  • Off-Policy Monte Carlo Control.
  • Temporal-Difference Learning.
  • SARSA: On-Policy TD Control.
  • Q-Learning: Off-Policy TD Control.
  • Function Approximation.
  • Continuous States.
  • Learning the State-Action Value function: Replay Buffer: the 10,000 most recent tuples (s, a, R(s), s’). x = (s, a) → Q(θ) → y = R(s) + γ max Q(s’, a’; θ), where the max is taken over a’. Error = [R(s) + γ max Q(s’, a’; θ)] − Q(s, a; θ); the loss is the mean squared value of this error.
  • Target Network: A separate neural network for generating the y targets. It has the same architecture as the original Q-Network. Error = [R(s) + γ max TargetQ(s’, a’; θ′)] − Q(s, a; θ). We use the TargetQ-Network to generate the y targets and, every C time steps, update the weights of the TargetQ-Network using the weights of the Q-Network.
  • Soft Updates: θ′ ← 0.001θ + 0.999θ′, where θ′ and θ represent the weights of the target network and the current network, respectively.
  • Deep Reinforcement Learning, Deep Q-learning.
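
Three of the rules above are short enough to write out directly: ϵ-greedy action selection, the tabular Q-Learning update, and the soft update of target-network weights. This is a minimal sketch under stated assumptions: Q is stored as a list of lists indexed by state and action, weights are plain lists of floats, and the step size alpha = 0.1 and discount gamma = 0.9 are illustrative values, not recommendations.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, otherwise a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def q_learning_update(q, s, a, reward, s_next, alpha=0.1, gamma=0.9, terminal=False):
    """Off-policy TD control: move Q(s, a) toward reward + gamma * max over a' of Q(s', a')."""
    target = reward if terminal else reward + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])
    return q[s][a]


def soft_update(target_weights, online_weights, tau=0.001):
    """Element-wise theta_target <- tau * theta + (1 - tau) * theta_target."""
    return [tau * w + (1.0 - tau) * tw for tw, w in zip(target_weights, online_weights)]


# Hypothetical two-state, two-action table.
q = [[0.0, 0.0], [1.0, 2.0]]
q_learning_update(q, s=0, a=1, reward=1.0, s_next=1)
print(q[0][1])
```

SARSA would differ only in the target: it uses the value of the action actually taken in the next state instead of the maximum, which is what makes it on-policy.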

After finishing learning about machine learning please click Topic 23 – Introduction to Computer Vision to continue.

 

How to Change Language of an EPUB File

Problem:

You have an EPUB file encoded with a wrong language tag.
Therefore when you use the Read aloud feature of the Google Play Books application the book is read aloud in a wrong language.

Solution:
  1. Download the EPUB file to a PC.
  2. Change the extension from EPUB to ZIP.
  3. Open the .ZIP file.
  4. Open the content.opf file using the Notepad app.
  5. If you cannot find the content.opf file at the top level, look for it in the OEBPS folder.
  6. Find the tag <dc:language> and change its value (e.g. from <dc:language>en</dc:language> to <dc:language>vi</dc:language>).
  7. If you cannot find the tag <dc:language> then add a new tag right above the </metadata> tag, e.g.
    <dc:language>vi</dc:language>
    </metadata>
  8. Save the content.opf file and rezip the EPUB file.
  9. Change the file extension from ZIP to EPUB.
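
The manual steps above can also be scripted. The following is a rough sketch using only the Python standard library, not a tested tool: the function names are hypothetical, it assumes the package document ends in .opf and is UTF-8 encoded, and it keeps the mimetype entry uncompressed because EPUB readers expect that entry to be stored without compression.

```python
import re
import zipfile


def replace_language(opf_text, lang):
    """Return the OPF text with <dc:language> set to lang (added if missing)."""
    tag = f"<dc:language>{lang}</dc:language>"
    pattern = re.compile(r"<dc:language[^>]*>.*?</dc:language>", re.DOTALL)
    if pattern.search(opf_text):
        return pattern.sub(lambda match: tag, opf_text, count=1)
    # No language tag yet: add one right above the closing </metadata> tag.
    return opf_text.replace("</metadata>", f"  {tag}\n</metadata>", 1)


def set_epub_language(src_path, dst_path, lang):
    """Copy an EPUB entry by entry, rewriting the language in every .opf file."""
    with zipfile.ZipFile(src_path) as src, zipfile.ZipFile(dst_path, "w") as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename.endswith(".opf"):
                data = replace_language(data.decode("utf-8"), lang).encode("utf-8")
            # Keep "mimetype" stored (uncompressed); deflate everything else.
            if info.filename == "mimetype":
                compression = zipfile.ZIP_STORED
            else:
                compression = zipfile.ZIP_DEFLATED
            dst.writestr(info, data, compress_type=compression)


# Example call (file names are placeholders):
# set_epub_language("book.epub", "book-vi.epub", "vi")
```

Writing to a new file instead of modifying the original in place means the source EPUB stays untouched if anything goes wrong.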

 

How to transfer Photos from iPhone to PC with highest quality

Problem: Images copied directly from iPhone to a PC usually have lower quality in comparison with the original quality due to format conversion.

You want to preserve the quality as high as possible.

Solution:

  1. Connect iPhone to a MacBook.
  2. Open Photos app.
  3. Click on iPhone’s name under Devices section on the left.
  4. Select photos on the right.
  5. Select Import to = Library or New Album.
  6. Click on Import N Selected button, where N is the number of selected photos, to import photos from iPhone to the Photos app.
    • The imported photos will be copied from iPhone to the Photos library.
    • You have to manually delete the photos directly from iPhone if you want to permanently remove them from iPhone.
  7. Click on an album name on the left menu in Photos.
  8. Select the imported images in the album.
  9. Click File > Export > Export N Photos… (N is a number) to export photos from Photos to a folder on MacBook.
    • Select Arrow icon at the end of Photo Kind
    • Select PNG
    • Select Color Profile = Original
    • Select Size = Full Size
    • Select Movie Quality = 4K
  10. Click the Export button.
  11. Enter a folder name.
  12. Click the Export button. Wait for the exporting process to be completed by reviewing the circle icon in the toolbar.
  13. Share the folder in a LAN.
  14. Copy the folder to a PC.

 

Topic 21 – Introduction to Computation and Programming using Python

Why do I need to learn about computation and programming using Python?

Computational thinking and Python are fundamental tools for understanding many modern theories and techniques such as artificial intelligence, machine learning, deep learning, data mining, security, digital image processing and natural language processing.

What can I do after finishing learning about computation and programming using Python?

You will be prepared to learn modern theories and techniques to create modern security, machine learning, data mining, image processing or natural language processing software.

That sounds useful! What should I do now?

Please read this John V. Guttag (2013). Introduction to Computation and Programming using Python. 2nd Edition. The MIT Press book.

Alternatively, please watch
– this 6.0001 Introduction to Computer Science and Programming in Python. Fall 2016 course (Lecture Notes) and

– this MIT 6.0002 Introduction to Computational Thinking and Data Science, Fall 2016 course (Lecture Notes).

Terminology Review:

  • Big O notation.
  • Monte Carlo Simulation.
  • Random Walk.
  • K-means Clustering.
  • k-Nearest Neighbors Algorithm.
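
Monte Carlo Simulation and Random Walk combine naturally: simulate many walks and average the result. The sketch below is one possible setup, not taken from the book: a walker takes unit steps north, south, east or west, and the function names, step count and trial count are illustrative assumptions.

```python
import random


def random_walk_distance(steps, rng):
    """Distance from the origin after a walk of unit steps in four directions."""
    x = y = 0
    for _ in range(steps):
        dx, dy = rng.choice([(0, 1), (0, -1), (1, 0), (-1, 0)])
        x += dx
        y += dy
    return (x * x + y * y) ** 0.5


def mean_distance(steps, trials, seed=0):
    """Monte Carlo estimate of the expected final distance, using a seeded generator."""
    rng = random.Random(seed)
    return sum(random_walk_distance(steps, rng) for _ in range(trials)) / trials


estimate = mean_distance(steps=100, trials=2000)
print(estimate)
```

The estimate grows roughly with the square root of the number of steps, so a 100-step walk ends much closer to the origin than 100 units away on average.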

After finishing reading the book please click Topic 22 – Introduction to Machine Learning to continue.

 

Topic 19 – Probability & Statistics

Why do I need to learn about probability and statistics?

Probability and statistics are fundamental tools for understanding many modern theories and techniques such as artificial intelligence, machine learning, deep learning, data mining, security, digital image processing and natural language processing.

What can I do after finishing learning about probability and statistics?

You will be prepared to learn modern theories and techniques to create modern security, machine learning, data mining, image processing or natural language processing software.

That sounds useful! What should I do now?

Please read
– this Dimitri P. Bertsekas and John N. Tsitsiklis (2008). Introduction to Probability. Athena Scientific book, or
– this Hossein Pishro-Nik (2014). Introduction to Probability, Statistics, and Random Processes. Kappa Research, LLC book.

Alternatively, please read these notes, then watch
– this MIT 6.041SC – Probabilistic Systems Analysis and Applied Probability, Fall 2011 course (Lecture Notes), and
– this MIT RES.6-012 – Introduction to Probability, Spring 2018 course (Lecture Notes).

Probability and statistics are difficult topics, so you may need to learn them 2 or 3 times using different sources to actually master the concepts. For example, you may audit this Probability & Statistics for Machine Learning & Data Science course (Coursera) to get more examples and intuitions about core concepts.

Terminology Review:

  • Sample Space (Ω): Set of possible outcomes.
  • Event: Subset of the sample space.
  • Probability Law: Law specified by giving the probabilities of all possible outcomes.
  • Probability Model = Sample Space + Probability Law.
  • Probability Axioms: Nonnegativity: P(A) ≥ 0; Normalization: P(Ω)=1; Additivity: If A ∩ B = Ø, then P(A ∪ B)= P(A)+ P(B).
  • Conditional Probability: P(A|B) = P (A ∩ B) / P(B).
  • Multiplication Rule.
  • Total Probability Theorem.
  • Bayes’ Rule: Given P(Aᵢ) (initial “beliefs” ) and P (B|Aᵢ). P(Aᵢ|B) = ? (revise “beliefs”, given that B occurred).
  • The Monty Hall Problem: 3 doors, behind which are two goats and a car.
  • The Spam Detection Problem: “Lottery” word in spam emails.
  • Independence of Two Events: P(B|A) = P(B)  or P(A ∩ B) = P(A) · P(B).
  • The Birthday Problem: P(Same Birthday of 23 People) > 50%.
  • The Naive Bayes Model: “Naive” means features independence assumption.
  • Discrete Uniform Law: P(A) = Number of elements of A / Total number of sample points = |A| / |Ω|
  • Basic Counting Principle: r stages, nᵢ choices at stage i, number of choices = n₁ n₂ · · · nᵣ
  • Permutations: Number of ways of ordering n elements. No repetition for n slots: n · (n − 1) · (n − 2) · · · 1 = n!.
  • Combinations: number of k-element subsets of a given n-element set: n! / (k! (n − k)!).
  • Binomial Probabilities. P(any particular sequence) = p^(# heads) · (1 − p)^(# tails).
  • Random Variable: A function from the sample space to the real numbers. It is not random. It is not a variable. It is a function: f: Ω → ℝ. A random variable is used to model the whole experiment at once.
  • Discrete Random Variables.
  • Probability Mass Function: P(X = 𝑥) or Pₓ(𝑥): A function from the possible values of X to [0, 1] that gives the probability that X equals 𝑥. PMF gives probabilities. 0 ≤ PMF ≤ 1. All the values of PMF must sum to 1. PMF is used to model a random variable.
  • Bernoulli Random Variable (Indicator Random Variable): f: Ω → {1, 0}. Only 2 outcomes: 1 and 0. p(1) = p and p(0) = 1 – p.
  • Binomial Random Variable: X = Number of successes in n trials. X = Number of heads in n independent coin tosses.
  • Binomial Probability Mass Function: Pₓ(k) = C(n, k) pᵏ(1 − p)ⁿ⁻ᵏ, where C(n, k) is the number of combinations of k elements out of n.
  • Geometric Random Variable: X = Number of coin tosses until first head.
  • Geometric Probability Mass Function: Pₓ(k) = (1 − p)ᵏ⁻¹p.
  • Expectation: E[X] = Sum of xpₓ(x).
  • Let Y=g(X): E[Y] = E[g(X)] = Sum of g(x)pₓ(x). Caution: E[g(X)] ≠ g(E[X]) in general.
  • Variance: var(X) = E[(X−E[X])²].
  • var(aX)=a²var(X).
  • X and Y are independent: var(X+Y) = var(X) + var(Y). Caution: var(X+Y) ≠ var(X) + var(Y) in general.
  • Standard Deviation: Square root of var(X).
  • Conditional Probability Mass Function: P(X=x|A).
  • Conditional Expectation: E[X|A].
  • Joint Probability Mass Function: Pₓᵧ(x,y) = P(X=x, Y=y) = P((X=x) and (Y=y)).
  • Marginal Probability Mass Function: Pₓ(x) = Σᵧ Pₓᵧ(x,y).
  • Total Expectation Theorem: E[X] = Σᵧ Pᵧ(y) E[X|Y = y].
  • Independent Random Variables: P(X=x, Y=y) = P(X=x) · P(Y=y).
  • Expectation of Multiple Random Variables: E[X + Y + Z] = E[X] + E[Y] + E[Z].
  • Binomial Random Variable: X = Sum of Bernoulli Random Variables.
  • The Hat Problem.
  • Continuous Random Variables.
  • Probability Density Function: fₓ(𝑥), defined so that P(a ≤ X ≤ b) is the area under fₓ between a and b. (a ≤ X ≤ b) means the X function produces a real number value within the [a, b] range. Programming language: X(outcome) = 𝑥, where a ≤ 𝑥 ≤ b. PDF does NOT give probabilities. PDF does NOT have to be less than 1. PDF gives probabilities per unit length. The total area under PDF must be 1. PDF is used to compute the probability that the random variable falls within a given range of values.
  • Cumulative Distribution Function: Fₓ(b) = P(X ≤ b). (X ≤ b) means the X function produces a real number value within the (−∞, b] range. Programming language: X(outcome) = 𝑥, where 𝑥 ≤ b.
  • Continuous Uniform Random Variables: fₓ(x) = 1/(b – a) if a ≤ x ≤ b, otherwise fₓ(x) = 0.
  • Normal Random Variable, Gaussian Distribution, Normal Distribution: Fitting bell shaped data.
  • Chi-Squared Distribution: Modelling communication noise.
  • Sampling from a Distribution: The process of drawing a random value (or set of values) from a probability distribution.
  • Joint Probability Density Function.
  • Conditional Probability Density Function.
  • Marginal Probability Density Function.
  • Derived Distributions.
  • Convolution: A mathematical operation on two functions (f and g) that produces a third function.
  • Covariance.
  • Correlation Coefficient.
  • Conditional Expectation: E[X | Y = y] = Sum of xpₓ|ᵧ(x|y). If Y is unknown then E[X | Y] is a random variable, i.e. a function of Y. So E[X | Y] also has its expectation and variance.
  • Law of Iterated Expectations: E[E[X | Y]] = E[X].
  • Conditional Variance: var(X | Y) is a function of Y.
  • Law of Total Variance: var(X) = E[var(X | Y)] + var(E[X | Y]).
  • Bernoulli Process:  A sequence of independent Bernoulli trials. At each trial, i: P(Xᵢ=1)=p, P(Xᵢ=0)=1−p.
  • Poisson Process.
  • Markov Chain.
  • Population: N.
  • Sample: n.
  • Random Sampling.
  • Population Mean: μ.
  • Sample Mean: x̄.
  • Population Proportion: p.
  • Sample Proportion: p̂.
  • Population Variance: σ².
  • Sample Variance: s².
  • Markov’s Inequality: P(X ≥ a) ≤ E(X)/a (X ≥ 0, a > 0).
  • Chebyshev’s Inequality: P(|X – E(X)| ≥ a) ≤ var(X)/a² (a > 0).
  • Weak Law of Large Numbers: The average of the samples will get closer to the population mean as the sample size (not number of items) increases.
  • Central Limit Theorem: The distribution of sample means approximates a normal distribution as the sample size (not number of items) gets larger, regardless of the population’s distribution.
  • Sampling Distributions: Distribution of Sample Mean, Distribution of Sample Proportion, Distribution of Sample Variance.
  • Point Estimate: A single number, calculated from a sample, that estimates a parameter of the population.
  • Maximum Likelihood Estimation: Given data the maximum likelihood estimate (MLE) for the parameter p is the value of p that maximizes the likelihood P (data | p). P (data | p) is the likelihood function. For continuous distributions, we use the probability density function to define the likelihood.
  • Log likelihood: the natural log of the likelihood function.
  • Frequentists: Assume no prior belief, the goal is to find the model that most likely generated observed data.
  • Bayesians: Assume prior belief, the goal is to update prior belief based on observed data.
  • Maximum A Posteriori (MAP): Good for instances when you have limited data or strong prior beliefs. Wrong priors, wrong conclusions. MAP with uninformative priors is just MLE.
  • Margin of Error: A bound that we can confidently place on the difference between an estimate of something and the true value.
  • Significance Level: α, the probability that the event could have occurred by chance.
  • Confidence Level: 1 − α,  a measure of how confident we are in a given margin of error.
  • Confidence Interval: A 95% confidence interval (CI) of the mean is a range with an upper and lower number calculated from a sample. Because the true population mean is unknown, this range describes possible values that the mean could be. If multiple samples were drawn from the same population and a 95% CI calculated for each sample, we would expect the population mean to be found within 95% of these CIs.
  • z-score: the number of standard deviations from the mean value of the reference population.
  • Confidence Interval: Unknown σ.
  • Confidence Interval for Proportions.
  • Hypothesis: A statement about a population developed for the purpose of testing.
  • Hypothesis Testing.
  • Null Hypothesis (H₀): A statement about the value of a population parameter, contains equal sign.
  • Alternative Hypothesis (H₁): A statement that is accepted if the sample data provide sufficient evidence that the null hypothesis is false, never contains equal sign.
  • Type I Error: Reject the null hypothesis when it is true.
  • Type II Error: Do not reject the null hypothesis when it is false.
  • Significance Level, α: The maximum probability of rejecting the null hypothesis when it is true.
  • Test Statistic:  A number, calculated from samples, used to find if your data could have occurred under the null hypothesis.
  • Right-Tailed Test: The alternative hypothesis states that the true value of the parameter specified in the null hypothesis is greater than the null hypothesis claims.
  • Left-Tailed Test: The alternative hypothesis states that the true value of the parameter specified in the null hypothesis is less than the null hypothesis claims.
  • Two-Tailed Test: The alternative hypothesis which does not specify a direction, i.e. when the alternative hypothesis states that the null hypothesis is wrong.
  • p-value: The probability of obtaining test results at least as extreme as the result actually observed, under the assumption that the null hypothesis is correct. μ₀ is assumed to be known and H₀ is assumed to be true.
  • Decision Rules: If H₀ is true then acceptable x̄ must fall in (1 − α) region.
  • Critical Value or k-value: A value on a test distribution that is used to decide whether the null hypothesis should be rejected or not.
  • Power of a Test: The probability of rejecting the null hypothesis when it is false; in other words, it is the probability of avoiding a type II error.
  • t-Distribution.
  • T-Statistic.
  • t-Tests: Unknown σ, use T-Statistic.
  • Independent Two-Sample t-Tests.
  • Paired t-Tests.
  • A/B testing: A methodology for comparing two variations (A/B) that uses t-Tests for statistical analysis and making a decision.
  • Model Building: X = a·S + W, where X: output, S: “signal”, a: parameters, W: noise. Know S, assume W, observe X, find a.
  • Inferring: X = a·S + W. Know a, assume W, observe X, find S.
  • Hypothesis Testing: X = a·S + W. Know a, observe X, find S. S can take one of few possible values.
  • Estimation: X = a·S + W. Know a, observe X, find S. S can take unlimited possible values.
  • Bayesian Inference can be used for both Hypothesis Testing and Estimation by leveraging Bayes rule. Output is posterior distribution. Single answer can be Maximum a posteriori probability (MAP) or Conditional Expectation.
  • Least Mean Squares Estimation of Θ based on X.
  • Classical Inference can be used for both Hypothesis Testing and Estimation.
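
A few of the formulas above are easy to check numerically: the Birthday Problem, the Binomial Probability Mass Function, and Bayes' Rule applied to the spam detection example. This is a small sketch with assumed function names; the spam numbers (20% of emails are spam, the word appears in 50% of spam and 5% of other emails) are invented for illustration only.

```python
from math import comb


def birthday_collision_probability(people, days=365):
    """P(at least two people share a birthday), assuming uniform, independent birthdays."""
    p_all_distinct = 1.0
    for i in range(people):
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct


def binomial_pmf(k, n, p):
    """P(X = k) for X = number of successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def bayes_posterior(prior, likelihood, likelihood_given_not):
    """P(A|B) from P(A), P(B|A) and P(B|not A), using the total probability theorem."""
    evidence = likelihood * prior + likelihood_given_not * (1 - prior)
    return likelihood * prior / evidence


print(birthday_collision_probability(23))          # just above one half
print(sum(binomial_pmf(k, 10, 0.3) for k in range(11)))  # PMF values sum to 1
print(bayes_posterior(prior=0.2, likelihood=0.5, likelihood_given_not=0.05))
```

With the invented spam numbers, seeing the word raises the probability of spam from the prior 0.2 to 0.1 / 0.14, about 0.71, which is the "revise beliefs" step of Bayes' Rule.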

After finishing learning about probability and statistics please click Topic 20 – Discrete Mathematics to continue.

 

Topic 18 – Linear Algebra

Why do I need to learn about linear algebra?

Linear algebra is a fundamental tool for understanding many modern theories and techniques such as artificial intelligence, machine learning, deep learning, data mining, security, digital image processing, and natural language processing.

What can I do after finishing learning about linear algebra?

You will be prepared to learn modern theories and techniques to create modern security, machine learning, data mining, image processing or natural language processing software.

That sounds useful! What should I do now?

Please read this David C. Lay et al. (2022). Linear Algebra and Its Applications. Pearson Education book.

Alternatively, please watch this MIT 18.06 – Linear Algebra, Spring 2005 course. While watching this course, please also read the Lecture Notes and this Gilbert Strang (2016). Introduction to Linear Algebra. Wellesley-Cambridge Press book to better understand some complex topics.

Terminology Review:

  • Linear Equations.
  • Row Picture.
  • Column Picture.
  • Triangular matrix is a square matrix where all the values above or below the diagonal are zero.
  • Lower Triangular Matrix.
  • Upper Triangular Matrix.
  • Diagonal matrix is a matrix in which the entries outside the main diagonal are all zero.
  • Tridiagonal Matrix.
  • Identity Matrix.
  • Transpose of a Matrix.
  • Symmetric Matrix.
  • Pivot Columns.
  • Pivot Variables.
  • Augmented Matrix.
  • Echelon Form.
  • Reduced Row Echelon Form.
  • Elimination Matrices.
  • Inverse Matrix.
  • Factorization into A = LU.
  • Free Columns.
  • Free Variables.
  • Gauss-Jordan Elimination.
  • Vector Spaces.
  • Rank of a Matrix.
  • Permutation Matrix.
  • Subspaces.
  • Column space, C(A) consists of all combinations of the columns of A and is a vector space in ℝᵐ.
  • Nullspace, N(A) consists of all solutions x of the equation Ax = 0 and lies in ℝⁿ.
  • Row space, C(Aᵀ) consists of all combinations of the row vectors of A and forms a subspace of ℝⁿ. It equals the column space of the transpose of A, hence the notation C(Aᵀ).
  • The left nullspace of A, N(Aᵀ) is the nullspace of Aᵀ. This is a subspace of ℝᵐ.
  • Linearly Dependent Vectors.
  • Linearly Independent Vectors.
  • Linear Span of Vectors.
  • A basis for a vector space is a sequence of vectors with two properties:
    • They are independent.
    • They span the vector space.
  • Given a space, every basis for that space has the same number of vectors; that number is the dimension of the space.
  • Dimension of a Vector Space.
  • Dot Product.
  • Orthogonal Vectors.
  • Orthogonal Subspaces.
  • Row space of A is orthogonal to  nullspace of A.
  • Matrix Spaces.
  • Rank-One Matrix.
  • Orthogonal Complements.
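  • Projection matrix: P = A(AᵀA)⁻¹Aᵀ. Properties of projection matrix: Pᵀ = P and P² = P. Projection of b onto the column space of A: p = Pb = A(AᵀA)⁻¹Aᵀb = Ax̂, where x̂ = (AᵀA)⁻¹Aᵀb. When A is a single column a, this reduces to p = a(aᵀb)/(aᵀa).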
  • Linear regression, least squares, and normal equations: Instead of solving Ax = b we solve Ax̂ = p or AᵀAx̂ = Aᵀb.
  • Linear Regression.
  • Orthogonal Matrix.
  • Orthogonal Basis.
  • Orthonormal Vectors.
  • Orthonormal Basis.
  • Gram–Schmidt process.
  • Determinant: A number associated with any square matrix letting us know whether the matrix is invertible, the formula for the inverse matrix, the volume of the parallelepiped whose edges are the column vectors of A. The determinant of a triangular matrix is the product of the diagonal entries (pivots).
  • The big formula for computing the determinant.
  • The cofactor formula rewrites the big formula for the determinant of an n by n matrix in terms of the determinants of smaller matrices.
  • Formula for Inverse Matrix.
  • Cramer’s Rule.
  • Eigenvectors are vectors for which Ax is parallel to x: Ax = λx. λ is an eigenvalue of A, det(A − λI)= 0.
  • Diagonalizing a matrix: AS = SΛ ⇒ S⁻¹AS = Λ ⇒ A = SΛS⁻¹. S: matrix of n linearly independent eigenvectors. Λ: matrix of eigenvalues on diagonal.
  • Matrix exponential eᴬᵗ.
  • Markov matrices: All entries are non-negative and each column adds to 1.
  • Symmetric matrices: Aᵀ = A.
  • Positive definite matrices: symmetric matrices for which all eigenvalues are positive, or equivalently all pivots are positive, or equivalently all upper-left determinants are positive.
  • Similar matrices: A and B = M⁻¹AM.
  • Singular value decomposition (SVD) of a matrix: A = UΣVᵀ, where U is orthogonal, Σ is diagonal, and V is orthogonal.
  • Linear Transformations: T(v + w) = T(v)+ T(w) and T(cv)= cT(v) . For any linear transformation T we can find a matrix A so that T(v) = Av.
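
The least squares entry above (solve AᵀAx̂ = Aᵀb instead of Ax = b) can be tried on a tiny line-fitting problem. The sketch below assumes A has two columns, a column of ones and a column of t values, so the normal equations form a 2 by 2 system that can be solved by hand; the function name and the three data points are illustrative choices.

```python
def least_squares_line(ts, bs):
    """Solve the normal equations AᵀA x̂ = Aᵀb for A = [ones, ts]; returns (C, D)."""
    n = len(ts)
    # Entries of AᵀA = [[n, s_t], [s_t, s_tt]] and Aᵀb = [s_b, s_tb].
    s_t = sum(ts)
    s_tt = sum(t * t for t in ts)
    s_b = sum(bs)
    s_tb = sum(t * b for t, b in zip(ts, bs))
    det = n * s_tt - s_t * s_t  # zero only if all t values coincide
    c = (s_tt * s_b - s_t * s_tb) / det
    d = (n * s_tb - s_t * s_b) / det
    return c, d


# Three points that do not lie on one line, so Ax = b has no exact solution.
ts = [1.0, 2.0, 3.0]
bs = [1.0, 2.0, 2.0]
c, d = least_squares_line(ts, bs)
projection = [c + d * t for t in ts]             # p = A x̂
error = [b - p for b, p in zip(bs, projection)]  # e = b − p
print(c, d, projection, error)
```

The error vector e is orthogonal to both columns of A, which is exactly what the normal equations Aᵀ(b − Ax̂) = 0 state.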

After finishing learning about linear algebra please click Topic 19 – Probability & Statistics to continue.

 

Talking In A Shop

Customer: Excuse me, do you sell swimsuits for girls?
Assistant: Yes, they’re in Aisle 12.

Customer: Excuse me, I am looking for a shirt.
Assistant: They’re in Aisle 12.

Assistant: Can I help you?
Customer: Yes please, I am looking for washing up liquid.

Assistant: Are you looking for something in particular?
Customer: Yes please, I am looking for a pair of jeans.
Customer: I’m fine thanks, just browsing.
Customer: I’m only looking today.

Anti-Virus vs. Anti-Malware

What is the difference between anti-virus software and an anti-malware software?

A virus is a piece of code that is capable of copying itself in order to do damage to your computer, including corrupting your system or destroying data.

Malware, on the other hand, is an umbrella term that stands for a variety of malicious software doing damage to your computer or stealing your information, including Trojans, spyware, worms, adware, ransomware, and yes, viruses.

So the logic follows: all viruses are malware. Not all malware are viruses.

Anti-virus software generally scans for infectious malware which includes viruses, worms, Trojans, rootkits and bots.

Anti-malware software generally tends to focus more on adware, spyware, unwanted toolbars, browser hijackers, potentially unwanted programs and potentially unsafe applications.

Therefore, you need both an anti-virus and an anti-malware solution for maximum protection.

Built-in Windows Defender provides both anti-virus and anti-malware protection, and IMO, is enough for non-tech-savvy users.

A comprehensive FREE anti-virus software is AVG.

A comprehensive FREE anti-malware software is Malwarebytes.

An ad blocker for the Edge browser is uBlock Origin.

An ad blocker for the Firefox browser is Adblock Plus.

How to Manually Install PHP 8.1 on Windows 10

Motivation:

  • You want to understand how PHP works with IIS.
  • You want to prepare a PHP or WordPress development environment on Windows.
  • You want to update PHP to any version to address compatibility or security issues on Windows.

Solution:

  • Enable CGI for IIS via Turn Windows features on or off > Internet Information Services > World Wide Web Services > Application Development Features > CGI.
  • Download VS16 x64 Non Thread Safe package under PHP 8.1 section on https://windows.php.net/download/.
  • Extract the ZIP file to C:\Program Files\php-8.1.10-nts-Win32-vs16-x64 folder.
  • Download the cacert.pem file and move it to the C:\Program Files\php-8.1.10-nts-Win32-vs16-x64\extras\ssl folder.
  • Copy the php.ini-development file to php.ini.
  • Open the php.ini file, uncomment the following lines and set the values shown.
    fastcgi.impersonate = 1
    
    cgi.fix_pathinfo = 1
    cgi.force_redirect = 0
    
    extension_dir = "C:\Program Files\php-8.1.10-nts-Win32-vs16-x64\ext"
    
    extension=curl
    extension=fileinfo
    extension=gd
    extension=gettext
    extension=mbstring
    extension=exif ; Must be after mbstring as it depends on it
    extension=mysqli
    extension=openssl
    extension=pdo_mysql
    
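    ; Keep only one error_log line; "error_log = syslog" would send errors to the Windows Event Log instead of this file.
    error_log = "C:\Program Files\php-8.1.10-nts-Win32-vs16-x64\php_errors.log"
    
    curl.cainfo = "C:\Program Files\php-8.1.10-nts-Win32-vs16-x64\extras\ssl\cacert.pem"
    
    ; openssl.cafile takes a certificate file; openssl.capath expects a directory.
    openssl.cafile = "C:\Program Files\php-8.1.10-nts-Win32-vs16-x64\extras\ssl\cacert.pem"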
  • Add C:\Program Files\php-8.1.10-nts-Win32-vs16-x64 to SYSTEM PATH.
  • Restart your machine.
  • Open cmd and execute the command below to ensure that the PHP executable can be found in the SYSTEM PATH.
    php --version
  • Open IIS, click on Server name, double click on Handler Mappings > Add Module Mapping with below information.
    Request path = *.php
    
    Module = FastCgiModule
    
    Executable = "C:\Program Files\php-8.1.10-nts-Win32-vs16-x64\php-cgi.exe"
    
    Name = PHP 8.1
    
    Request Restrictions = File or folder
  • Create a phpinfo.php file with the content below in the root website folder, then open it in a browser to confirm that PHP is working.
    <?php phpinfo(); ?>

 

 

How to Remove Microsoft .NET Framework

Problem:

Some applications do not work with the latest Microsoft .NET Framework. You need to remove it and install an older version of Microsoft .NET Framework.

Solution:

1. Use the utility below to remove the current Microsoft .NET Framework.

.NET Framework Cleanup Tool

2. Download an appropriate version of Microsoft .NET Framework using one of the links below and install it.

3. Use the utility below to detect the installed Microsoft .NET Framework versions.

.NET Detector