All posts by admin

How to Copy, Move, Replicate, Augment or Delete Files and Folders using Commands on Windows

Motivation:

  • You have a web application that needs to be backed up daily.
  • You have a web application whose content needs to be replicated daily.
  • You have a data folder whose content needs to be augmented daily.

Commands:

  • Copying files and folders inside one folder to another:
robocopy E:\inetpub\wwwroot\website.domain.com E:\inetpub\wwwroot\backup.domain.com /e

/e Copies subdirectories. This option includes empty directories. 

robocopy \\192.168.1.49\E\inetpub E:\inetpub /e
  • Moving entire folder to another location:
PS C:\> Move-Item -path \\192.168.1.15\e\inetpub\ -destination E:\ -force

where PS C:\> is the PowerShell prompt.
  • Moving new files and folders inside one folder to another:
robocopy E:\inetpub\wwwroot\website.domain.com E:\inetpub\wwwroot\archive.domain.com /move /e

/move Moves files and directories, and deletes them from the source after they are copied.
  • Copying (mirroring) entire data from one drive to another, including file and folder permissions:
robocopy E:\ G:\ /MIR /COPYALL /ZB /W:1 /R:2 /XO

or

robocopy E:\ G:\ /TEE /LOG+:F:\robolog.txt /MIR /COPYALL /ZB /W:1 /R:2 /XO

E:\
 - Source folder. This can be a UNC path.

G:\
 - Destination folder. This can be a UNC path.

/TEE
 - Display the output of the command in the console window and write it to a log file.

/LOG+:F:\robolog.txt
 - Write the logs to F:\robolog.txt. The + sign means appending the content to the log file.

/MIR
 - Copy all files and subfolders, remove files and folders from the destination if they no longer exist on the source.

/COPYALL
 - Copy all of the NTFS permissions and attributes (security permissions, timestamps, owner info, etc.)

/ZB
 - Use restartable mode when copying files. If a file is in use, retry after a set amount of time (see /W:1 and /R:2). If access is denied then try to copy in backup mode.

/W:1
 - Wait for 1 second between retries when copying files.

/R:2
 - The number of retries on failed copies.

/XO
 - eXclude Older files: skip a source file if the destination file exists and is newer than the source.
Files whose timestamp and size are unchanged are skipped by default, so an up-to-date destination file is never overwritten.
  • Augmenting files and folders (making an incremental backup) from one drive to another, including file and folder permissions:
robocopy E:\ G:\ /E /COPYALL /ZB /W:1 /R:2 /XO /XX

or

robocopy E:\ G:\ /TEE /LOG+:F:\robolog2.txt /E /COPYALL /ZB /W:1 /R:2 /XO /XX

/E
 - Copy Subfolders, including Empty Subfolders.

/XX
 - eXclude "eXtra" files and dirs (present in destination but not source). This will prevent any deletions from the destination.
  • Granting Full control to a user or group:
icacls "E:\inetpub\wwwroot\website.domain.com\App_Data" /grant "IUSR":(OI)(CI)F /T

icacls "E:\inetpub\wwwroot\website.domain.com\App_Data" /grant "IIS_IUSRS":(OI)(CI)F /T

CI
 - Container Inherit - This flag indicates that subordinate containers will inherit this ACE (access control entry).

OI
 - Object Inherit - This flag indicates that subordinate files will inherit the ACE.

OI and CI only apply to new files and sub-folders.

F
 - Full Control

/T
 - Apply recursively to existing files and sub-folders.
  • Deleting and creating a folder:
rmdir "E:\inetpub\wwwroot\website.domain.com\Temp\" /S /Q
mkdir "E:\inetpub\wwwroot\website.domain.com\Temp\"
  • Recursively deleting all files in a folder and all files in its sub-folders:
cd C:\inetpub\wwwroot

del /s *.log

/s
 - delete all the files in the sub-folders.

del /s /f /q *.*

/f
 - force deletion of read-only files.

/q
 - do not ask to confirm when deleting via wildcard.
  • Recursively deleting a folder, its files and its sub-folders:
rmdir .\force-app\main\default\objects /s /q

/s
 - delete the folder together with all of its files and sub-folders.

/q
 - do not ask to confirm.
  • Enabling long paths and file names: For Windows 10, version 1607 and later, open the Group Policy editor (gpedit.msc) and go to Computer Configuration > Administrative Templates > System > Filesystem. Set “Enable Win32 long paths” to “Enabled”. Restart the machine. Then use the command below:
PS C:\> Move-Item -path \\?\UNC\192.168.101.157\e\Files\ -destination \\?\E:\ -force
  • Removing a drive letter from a volume:
mountvol F: /D

/D
- remove the drive letter from the selected volume.
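
For the daily backup scenarios listed under Motivation, the robocopy commands above can be wrapped in a small script and run from a scheduled task. The Python sketch below is one possible way to do that, not a definitive tool: the helper names (build_robocopy_args, robocopy_succeeded, run_backup) and the example paths are illustrative assumptions. It relies on the robocopy convention that exit codes below 8 mean that no copy failure occurred.

```python
import subprocess


def build_robocopy_args(source, destination, mirror=False, log_file=None):
    """Build the argument list for an incremental (default) or mirror copy."""
    args = ["robocopy", source, destination]
    if log_file:
        # Show progress in the console and append it to the log file.
        args += ["/TEE", f"/LOG+:{log_file}"]
    # /MIR deletes extra files at the destination; /E combined with /XX never deletes.
    args += ["/MIR"] if mirror else ["/E"]
    args += ["/COPYALL", "/ZB", "/W:1", "/R:2", "/XO"]
    if not mirror:
        args.append("/XX")
    return args


def robocopy_succeeded(exit_code):
    """Robocopy exit codes are bit flags; values of 8 or more signal a failure."""
    return 0 <= exit_code < 8


def run_backup(source, destination, mirror=False, log_file=None):
    """Run robocopy (Windows only) and report whether it finished without failures."""
    completed = subprocess.run(
        build_robocopy_args(source, destination, mirror, log_file)
    )
    return robocopy_succeeded(completed.returncode)


if __name__ == "__main__":
    # Only prints the command; call run_backup(...) on a Windows machine to execute it.
    print(" ".join(build_robocopy_args("E:\\", "G:\\", log_file="F:\\robolog2.txt")))
```

Keeping the argument builder separate from the call to subprocess makes it possible to review the exact command line before anything is copied or deleted.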

Topic 22 – Introduction to Machine Learning

Why do I need to learn about machine learning?

Machine learning has been solving many important difficult problems. A few of them include speech recognition, speech synthesis, image recognition, autonomous driving and chat bots.
Nowadays a key skill of a software developer is the ability to use machine learning algorithms to solve real-world problems.

What can I do after finishing learning about machine learning?

You will be able to create software that can, for example, recognize a car plate number from an image or estimate the probability of breast cancer for a patient.

That sounds useful! What should I do now?

Please audit
– these Machine Learning Specialization (Coursera) courses and
– this Applied Machine Learning in Python (Coursera) course.

At the same time, please read
– this Aurelien Geron (2022). Hands-On Machine Learning with Scikit-Learn, Keras and TensorFlow. O’Reilly Media book and
– this Brett Lantz (2023). Machine Learning with R. Packt Publishing book.

At the same time, please audit
– this Neural Networks and Deep Learning course,
– this Improving Deep Neural Networks: Hyperparameter Tuning, Regularization and Optimization course and
– this Structuring Machine Learning Projects course.

After that please read this Michael A. Nielsen (2015). Neural Networks and Deep Learning. Determination Press book.

After that please watch
– this MIT 6.034 – Artificial Intelligence, Fall 2010 course videos (Readings).

After that please read
– this Tom M. Mitchell (1997). Machine Learning. McGraw-Hill Education book, and
– this Christopher M. Bishop (2006). Pattern Recognition and Machine Learning. Springer book.

After that please audit this RL Course by David Silver course (Slides) and these Reinforcement Learning Specialization (Coursera) courses, and read this Richard S. Sutton and Andrew G. Barto (2018). Reinforcement Learning. The MIT Press book at the same time.

Supervised Learning Terminology Review:

  • Artificial Intelligence.
  • Machine Learning.
  • Deep Learning.
  • Linear Regression: Y = θX + Ε.
  • Cost Function measures how good/bad your model is.
  • Mean Square Error (MSE) measures the average of the squares of the errors.
  • Gradient Descent, Learning Rate.
  • Batch Gradient Descent.
  • The R-Squared Test measures the proportion of the total variance in the output (y) that can be explained by the variation in x. It can be used to evaluate how good a “fit” some model is on the given data.
  • Stochastic Gradient Descent.
  • Mini-Batch Gradient Descent.
  • Overfitting: A machine learning model gives accurate predictions for training data but not for new data.
  • Regularization: Ridge Regression, Lasso Regression, Elastic Net, Early Stopping.
  • Normalization.
  • Logistic Regression.
  • Sigmoid Function.
  • Binary Cross Entropy Loss Function, Log Loss Function.
  • One Hot Encoding.
  • The Softmax function takes an N-dimensional vector of arbitrary real values and produces another N-dimensional vector with real values in the range (0, 1) that add up to 1.0.
  • Softmax Regression.
  • Support Vector Machines.
  • Decision Trees.
  • K-Nearest Neighbors.
  • McCulloch-Pitts Neuron.
  • Linear Threshold Unit with threshold T calculates the weighted sum of its inputs, and then outputs 0 if this sum is less than T, and 1 if the sum is greater than or equal to T.
  • Perceptron.
  • Artificial Neural Networks.
  • Backpropagation.
  • Activation Functions: Rectified Linear Unit (ReLU), Leaky ReLU, Sigmoid, Hyperbolic Tangent.
  • Batch Normalization.
  • Learning Rate Decay.
  • Exponentially Weighted Averages.
  • Gradient Descent Optimization Algorithms: Momentum, Adagrad, Adadelta, RMSprop, Adam.
  • Regularization: Dropout.
  • The Joint Probability Table.
  • Bayesian Networks.
  • Naive Bayes Inference.
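
A few of the definitions in the list above (Mean Square Error, the Softmax function and the Linear Threshold Unit) are short enough to write out directly. The following Python sketch is only an illustration of those three definitions; the function names are hypothetical, and the choice that a weighted sum exactly equal to the threshold outputs 1 is an assumption.

```python
import math


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def softmax(values):
    """Map N real values to N values in (0, 1) that add up to 1."""
    shift = max(values)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def linear_threshold_unit(inputs, weights, threshold):
    """Output 1 when the weighted sum of the inputs reaches the threshold, else 0.

    Assumption: a sum exactly equal to the threshold outputs 1.
    """
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return 1 if weighted_sum >= threshold else 0


if __name__ == "__main__":
    print(mean_squared_error([1, 2, 3], [1, 2, 5]))
    print(softmax([1.0, 2.0, 3.0]))
    # With weights (1, 1) and threshold 2 the unit behaves like a logical AND.
    print([linear_threshold_unit([a, b], [1, 1], 2) for a in (0, 1) for b in (0, 1)])
```

With weights (1, 1) and threshold 2 the unit only outputs 1 when both inputs are 1, which is the classic way to show that a single threshold unit can compute AND.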

Unsupervised Learning Terminology Review:

  • K-Means.
  • Principal Component Analysis.
  • User-Based Collaborative Filtering.
  • Item-based Collaborative Filtering.
  • Matrix Factorization.

Reinforcement Learning Terminology Review:

  • k-armed Bandit Problem.
  • Sample-Average Method.
  • Greedy Action.
  • Exploration and Exploitation.
  • ϵ-Greedy Action Selection.
    • Bandit Algorithm.
    • Exponential Recency-Weighted Average.
    • Optimistic Initial Values.
    • Upper-Confidence-Bound Action Selection.
    • Rewards.
    • Agent, Actions, World or Environment.
    • History, States, Terminal State, Environment State, Agent State, Information State.
    • Fully Observable Environments.
    • Partially Observable Environments.
    • Policy,  Value Function, Model.
    • Value Based RL Agent, Policy Based RL Agent, Actor Critic RL Agent.
    • Model Free RL Agent, Model Based RL Agent.
    • Learning Problem and Planning Problem.
    • Prediction and Control.
    • Markov Property.
    • State Transition Matrix.
    • Markov Process.
    • Episodic Tasks.
    • Continuing Tasks.
    • Horizon (H): Number of time steps in each episode, can be infinite.
    • Markov Reward Process.
    • Discount Factor, Discount Rate: 0 ≤ γ ≤ 1.
    • Return.
    • Discounted Return: Discounted sum of rewards from time step t to horizon H.
    • State-Value Function.
    • Bellman Equation for Markov Reward Processes.
    • Markov Decision Process.
    • Policy: Mapping from states to actions. Deterministic policy: π (s) = a. Stochastic policy: π (a|s) = P(aₜ=a|sₜ=s).
    • State Value Function – Vπ(s): The expected return starting from state s following policy π.
    • Bellman Expectation Equation for Vπ.
    • Action Value Function (also known as State-Action Value Function or the Quality Function) – Qπ(s, a): The expected return starting from state s, taking action a, then following policy π.
    • Bellman Expectation Equation for Qπ.
    • Optimal State Value Function.
    • Optimal Action Value Function.
    • Bellman Optimality Equation for v*.
    • Bellman Optimality Equation for q*.
    • Optimal Policies.
    • Dynamic Programming.
    • Iterative Policy Evaluation.
    • Policy Improvement.
    • Policy Improvement Theorem.
    • Policy Iteration.
    • Value Iteration.
    • Synchronous Dynamic Programming.
    • Asynchronous Dynamic Programming.
    • Generalized Policy Iteration.
    • Bootstrapping: Updating estimates on the basis of other estimates.
    • Monte-Carlo Policy Evaluation.
    • First-Visit Monte-Carlo Policy Evaluation.
    • Every-Visit Monte-Carlo Policy Evaluation.
    • Incremental Mean.
    • Incremental Monte-Carlo Updates.
    • Temporal-Difference Learning.
    • Forward-View TD(λ).
    • Eligibility Traces.
    • Backward-View TD(λ).
    • On-Policy Learning.
    • Off-Policy Learning.
    • ϵ-Greedy Exploration.
    • ϵ-greedy Policies: Most of the time they choose an action that has maximal estimated action value, but with probability ϵ they instead select an action at random.
    • Monte-Carlo Policy Iteration. Policy evaluation: Monte-Carlo policy evaluation, Q = qπ. Policy improvement: ϵ-greedy policy improvement.
    • Monte-Carlo Control. Policy evaluation: Monte-Carlo policy evaluation, Q ≈ qπ. Policy improvement: ϵ-greedy policy improvement.
    • Exploring Starts: Specify that the episodes start in a state–action pair, and that every pair has a nonzero probability of being selected as the start.
    • Monte Carlo Control Exploring Starts.
    • Greedy in the Limit with Infinite Exploration (GLIE) Monte-Carlo Control.
    • ϵ-soft Policies: Policies for which π(a|s) ≥ ϵ/|A(s)| for all states and actions, for some ϵ > 0.
    • On-Policy First-Visit MC Control.
    • SARSA: State (S), Action (A), Reward (R), State (S’), Action (A’).
    • On-Policy Control with SARSA. Policy evaluation: SARSA evaluation, Q ≈ qπ. Policy improvement: ϵ-greedy policy improvement.
    • Forward-View SARSA (λ).
    • Backward-View SARSA (λ).
    • Target Policy.
    • Behavior Policy.
    • Importance Sampling: Use samples from one distribution to estimate the expectation of a different distribution.
    • Importance Sampling for Off-Policy Monte-Carlo.
    • Importance Sampling for Off-Policy TD.
    • Q-Learning: Next action is chosen using behaviour policy. Q is updated using alternative successor action.
    • Off-Policy Control with Q-Learning.
    • Expected SARSA.
    • Value Function Approximation.
    • Function Approximators.
    • Differentiable Function Approximators.
    • Feature Vectors.
    • State Aggregation.
    • Coarse Coding.
    • Tile Coding.
    • Continuous States.
    • Incremental Prediction Algorithms.
    • Control with Value Function Approximation. Policy evaluation: Approximate policy evaluation, q(.,., w) ≈ qπ. Policy improvement: ϵ-greedy policy improvement.
    • Learning the State-Action Value Function: Replay Buffer: the 10,000 most recent tuples (s, a, R(s), s’). x = (s, a) → Q(θ) → y = R(s) + γmaxQ(s’, a’; θ). Loss = [R(s) + γmaxQ(s’, a’; θ)] − Q(s, a; θ).
    • Expected SARSA with Function Approximation.
    • Target Network: A separate neural network for generating the y targets. It has the same architecture as the original Q-Network. Loss = [R(s) + γmaxTargetQ(s’, a’; θ′)] − Q(s, a; θ). Every C time steps we will use the TargetQ-Network to generate the y targets and update the weights of the TargetQ-Network using the weights of the Q-Network.
    • Soft Updates: θ′ ← 0.001θ + 0.999θ′, where θ′ and θ represent the weights of the target network and the current network, respectively.
    • Deep Q-learning.
    • Linear Least Squares Prediction Algorithms.
    • Least Squares Policy Iteration. Policy evaluation: Least squares Q-Learning. Policy improvement: Greedy policy improvement.
    • Average Reward.
    • Discounted Returns, Returns for Average Reward.
    • Stochastic Policies.
    • Softmax Policies.
    • Gaussian Policies.
    • Policy Objective Functions: Start State Objective, Average Reward Objective and Average Value Objective.
    • Score Function.
    • Policy Gradient Theorem.
    • Monte-Carlo Policy Gradient (REINFORCE).
    • Action-Value Actor-Critic: Critic updates w by linear TD(0). Actor updates θ by policy gradient.
    • The Tabular Dyna-Q Algorithm.
    • The Dyna-Q+ Algorithm.
    • Forward Search.
    • Simulation-Based Search.
    • Monte-Carlo Tree Search.
    • Temporal-Difference Search.
    • Dyna-2.
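
Several entries in the list above are small update rules: the discounted return, ϵ-greedy action selection, the incremental sample-average estimate and the soft update of target-network weights. The Python sketch below puts them together on a k-armed bandit. It is a minimal illustration under stated assumptions: all function names are hypothetical, rewards are assumed to be Gaussian with unit variance, and network weights are represented as plain lists of numbers.

```python
import random


def discounted_return(rewards, gamma):
    """Sum of gamma**k * rewards[k], accumulated from the last reward backwards."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a random action, otherwise a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    best = max(q_values)
    return rng.choice([a for a, q in enumerate(q_values) if q == best])


def incremental_mean(old_estimate, sample, count):
    """Sample-average update: new = old + (sample - old) / count."""
    return old_estimate + (sample - old_estimate) / count


def soft_update(target_weights, online_weights, tau=0.001):
    """Element-wise: target <- tau * online + (1 - tau) * target."""
    return [tau * w + (1.0 - tau) * t for t, w in zip(target_weights, online_weights)]


def run_bandit(true_means, steps=2000, epsilon=0.1, seed=0):
    """Estimate the value of each arm with epsilon-greedy and sample averages."""
    rng = random.Random(seed)
    q = [0.0] * len(true_means)  # estimated action values
    n = [0] * len(true_means)    # number of times each arm was pulled
    for _ in range(steps):
        action = epsilon_greedy(q, epsilon, rng)
        reward = rng.gauss(true_means[action], 1.0)  # assumed reward model
        n[action] += 1
        q[action] = incremental_mean(q[action], reward, n[action])
    return q, n


if __name__ == "__main__":
    estimates, pulls = run_bandit([0.0, 1.0, 3.0])
    print(estimates, pulls)
    print(discounted_return([1, 1, 1], 0.5))
```

The value tau = 0.001 mirrors the 0.001 and 0.999 coefficients of the Soft Updates entry; larger values make the target network follow the current network more quickly.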

    After finishing learning about machine learning please click Topic 23 – Introduction to Computer Vision to continue.

     

    How to Change Language of an EPUB File

    Problem:

    You have an EPUB file encoded with a wrong language tag.
    Therefore when you use the Read aloud feature of the Google Play Books application the book is read aloud in a wrong language.

    Solution:
    1. Download the EPUB file to a PC.
    2. Change the extension from EPUB to ZIP.
    3. Open the .ZIP file.
    4. Open the content.opf file using the Notepad app.
    5. If you cannot find the content.opf file then please navigate to the OEBPS folder.
    6. Find the tag <dc:language> and change its value (e.g. from <dc:language>en</dc:language> to <dc:language>vi</dc:language>).
    7. If you cannot find the tag <dc:language> then just add a new tag right above the </metadata> tag, e.g.
      <dc:language>vi</dc:language>
      </metadata>
    8. Save the content.opf file and rezip the EPUB file.
    9. Change the file extension from ZIP to EPUB.
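
The manual steps above can also be scripted. The following Python sketch is one hedged way to do it with the standard zipfile module: it copies the EPUB to a new file and rewrites the <dc:language> tag in every .opf file it finds. The function name set_epub_language is hypothetical, and the sketch assumes that the .opf file is UTF-8 encoded and that the closing tag is written exactly as </metadata>.

```python
import re
import zipfile


def set_epub_language(src_path, dst_path, language):
    """Copy an EPUB to dst_path, setting <dc:language> in every .opf file."""
    tag = f"<dc:language>{language}</dc:language>"
    with zipfile.ZipFile(src_path) as src, zipfile.ZipFile(dst_path, "w") as dst:
        for item in src.infolist():  # keeps the original order of the entries
            data = src.read(item.filename)
            if item.filename.lower().endswith(".opf"):
                text = data.decode("utf-8")  # assumption: the file is UTF-8
                text, count = re.subn(
                    r"<dc:language[^>]*>.*?</dc:language>",
                    lambda _match: tag,
                    text,
                    flags=re.DOTALL,
                )
                if count == 0:
                    # No language tag yet: add one right above </metadata>.
                    text = text.replace("</metadata>", tag + "\n</metadata>", 1)
                data = text.encode("utf-8")
            # The "mimetype" entry is stored uncompressed, as EPUB readers expect.
            if item.filename == "mimetype":
                method = zipfile.ZIP_STORED
            else:
                method = zipfile.ZIP_DEFLATED
            dst.writestr(item.filename, data, compress_type=method)


if __name__ == "__main__":
    # Hypothetical file names; replace them with your own paths.
    set_epub_language("book.epub", "book-vi.epub", "vi")
```

Writing to a new file instead of modifying the original keeps an untouched copy in case the result does not open correctly in the reading application.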

     

    How to transfer Photos from iPhone to PC with highest quality

    Problem: Images copied directly from iPhone to a PC usually have lower quality in comparison with the original quality due to format conversion.

    You want to preserve the quality as high as possible.

    Solution:

    1. Connect iPhone to a MacBook.
    2. Open Photos app.
    3. Click on iPhone’s name under Devices section on the left.
    4. Select photos on the right.
    5. Select Import to = Library or New Album.
    6. Click on the Import N Selected button, where N is the number of selected photos, to import photos from iPhone to Photos.
      • The imported photos will be copied from iPhone to Photos.
      • You have to manually delete the photos directly from iPhone if you want to permanently remove them from iPhone.
    7. Click on an album name on the left menu in Photos.
    8. Select the imported images in the album.
    9. Click File > Export > Export N Photos… (N is a number) to export photos from Photos to a folder on MacBook.
      • Select Arrow icon at the end of Photo Kind
      • Select PNG
      • Select Color Profile = Original
      • Select Size = Full Size
      • Select Movie Quality = 4K
    10. Click the Export button.
    11. Enter a folder name.
    12. Click the Export button. Wait for the exporting process to be completed by reviewing the circle icon in the toolbar.
    13. Share the folder in a LAN.
    14. Copy the folder to a PC.

     

    Topic 21 – Introduction to Computation and Programming using Python

    Why do I need to learn about computation and programming using Python?

    Computational thinking and Python are fundamental tools for understanding many modern theories and techniques such as artificial intelligence, machine learning, deep learning, data mining, security, digital image processing and natural language processing.

    What can I do after finishing learning about computation and programming using Python?

    You will be prepared to learn modern theories and techniques to create modern security, machine learning, data mining, image processing or natural language processing software.

    That sounds useful! What should I do now?

    Please read this John V. Guttag (2013). Introduction to Computation and Programming using Python. 2nd Edition. The MIT Press book.

    Alternatively, please watch
    – this 6.0001 Introduction to Computer Science and Programming in Python. Fall 2016 course (Lecture Notes) and

    – this MIT 6.0002 Introduction to Computational Thinking and Data Science, Fall 2016 course (Lecture Notes).

    Terminology Review:

    • Big O notation.
    • Monte Carlo Simulation.
    • Random Walk.
    • K-means Clustering.
    • k-Nearest Neighbors Algorithm.

    After finishing reading the book please click Topic 22 – Introduction to Machine Learning to continue.

     

    Topic 19 – Probability & Statistics

    Why do I need to learn about probability and statistics?

    Probability and statistics are fundamental tools for understanding many modern theories and techniques such as artificial intelligence, machine learning, deep learning, data mining, security, digital image processing and natural language processing.

    What can I do after finishing learning about probability and statistics?

    You will be prepared to learn modern theories and techniques to create modern security, machine learning, data mining, image processing or natural language processing software.

    That sounds useful! What should I do now?

    Please read
    – this Dimitri P. Bertsekas and John N. Tsitsiklis (2008). Introduction to Probability. Athena Scientific book, or
    – this Hossein Pishro-Nik (2014). Introduction to Probability, Statistics, and Random Processes. Kappa Research, LLC book.

    Alternatively, please read these notes, then watch
    – this MIT 6.041SC – Probabilistic Systems Analysis and Applied Probability, Fall 2011 course videos (Lecture Notes), and
    – this MIT RES.6-012 – Introduction to Probability, Spring 2018 course videos (Lecture Notes).

    Probability and statistics are difficult topics, so you may need to study them 2 or 3 times using different sources to actually master the concepts. For example, you may audit this Probability & Statistics for Machine Learning & Data Science course (Coursera) to get more examples and intuitions about core concepts.

    Terminology Review:

    • Sample Space (Ω): Set of possible outcomes.
    • Event: Subset of the sample space.
    • Probability Law: Law specified by giving the probabilities of all possible outcomes.
    • Probability Model = Sample Space + Probability Law.
    • Probability Axioms: Nonnegativity: P(A) ≥ 0; Normalization: P(Ω)=1; Additivity: If A ∩ B = Ø, then P(A ∪ B)= P(A)+ P(B).
    • Conditional Probability: P(A|B) = P (A ∩ B) / P(B).
    • Multiplication Rule.
    • Total Probability Theorem.
    • Bayes’ Rule: Given P(Aᵢ) (initial “beliefs”) and P(B|Aᵢ), find P(Aᵢ|B) (revised “beliefs”, given that B occurred): P(Aᵢ|B) = P(Aᵢ) · P(B|Aᵢ) / Σⱼ P(Aⱼ) · P(B|Aⱼ).
    • The Monty Hall Problem: 3 doors, behind which are two goats and a car.
    • The Spam Detection Problem: “Lottery” word in spam emails.
    • Independence of Two Events: P(B|A) = P(B)  or P(A ∩ B) = P(A) · P(B).
    • The Birthday Problem: P(Same Birthday of 23 People) > 50%.
    • The Naive Bayes Model: “Naive” means features independence assumption.
    • Discrete Uniform Law: P(A) = Number of elements of A / Total number of sample points = |A| / |Ω|
    • Basic Counting Principle: r stages, nᵢ choices at stage i, number of choices = n₁ n₂ · · · nᵣ
    • Permutations: Number of ways of ordering n elements. No repetition for n slots: n · (n − 1) · (n − 2) · · · 1 = n!.
    • Combinations: number of k-element subsets of a given n-element set.
    • Binomial Probabilities. P(any particular sequence) = p^(# heads) · (1 − p)^(# tails).
    • Random Variable: A function from the sample space to the real numbers. It is not random. It is not a variable. It is a function: f: Ω → ℝ. A random variable is used to model the whole experiment at once.
    • Discrete Random Variables.
    • Probability Mass Function: P(X = 𝑥) or Pₓ(𝑥): A function from the possible values of X to [0, 1] that gives the probability that the value of X equals 𝑥. PMF gives probabilities. 0 ≤ PMF ≤ 1. All the values of PMF must sum to 1. PMF is used to model a discrete random variable.
    • Bernoulli Random Variable (Indicator Random Variable): f: Ω → {1, 0}. Only 2 outcomes: 1 and 0. p(1) = p and p(0) = 1 – p.
    • Binomial Random Variable: X = Number of successes in n trials. X = Number of heads in n independent coin tosses.
    • Binomial Probability Mass Function: Pₓ(k) = C(n, k) · pᵏ(1 − p)ⁿ⁻ᵏ, where C(n, k) is the number of combinations of k elements out of n.
    • Geometric Random Variable: X = Number of coin tosses until first head.
    • Geometric Probability Mass Function: Pₓ(k) = (1 − p)ᵏ⁻¹p.
    • Expectation: E[X] = Sum of xpₓ(x).
    • Let Y=g(X): E[Y] = E[g(X)] = Sum of g(x)pₓ(x). Caution: E[g(X)] ≠ g(E[X]) in general.
    • Variance: var(X) = E[(X−E[X])²].
    • var(aX)=a²var(X).
    • X and Y are independent: var(X+Y) = var(X) + var(Y). Caution: var(X+Y) ≠ var(X) + var(Y) in general.
    • Standard Deviation: Square root of var(X).
    • Conditional Probability Mass Function: P(X=x|A).
    • Conditional Expectation: E[X|A].
    • Joint Probability Mass Function: Pₓᵧ(x,y) = P(X=x, Y=y) = P((X=x) and (Y=y)).
    • Marginal Distribution: Distribution of one variable while ignoring the other.
    • Marginal Probability Mass Function: Pₓ(x) = Σᵧ Pₓᵧ(x, y).
    • Total Expectation Theorem: E[X] = Σᵧ Pᵧ(y) · E[X | Y = y].
    • Independent Random Variables: P(X=x, Y=y) = P(X=x) · P(Y=y).
    • Expectation of Multiple Random Variables: E[X + Y + Z] = E[X] + E[Y] + E[Z].
    • Binomial Random Variable: X = Sum of Bernoulli Random Variables.
    • The Hat Problem.
    • Continuous Random Variables.
    • Probability Density Function: fₓ(𝑥), defined so that P(a ≤ X ≤ b) is the area under fₓ between a and b. (a ≤ X ≤ b) means the function X produces a real number value within the [a, b] range. In programming terms: X(outcome) = 𝑥, where a ≤ 𝑥 ≤ b. PDF does NOT give probabilities. PDF does NOT have to be less than 1. PDF gives probabilities per unit length. The total area under PDF must be 1. PDF is used to compute the probability that a random variable falls within a given range of values.
    • Cumulative Distribution Function: Fₓ(b) = P(X ≤ b). (X ≤ b) means the function X produces a real number value within the (−∞, b] range. In programming terms: X(outcome) = 𝑥, where 𝑥 ≤ b.
    • Continuous Uniform Random Variables: fₓ(x) = 1/(b – a) if a ≤ x ≤ b, otherwise fₓ(x) = 0.
    • Normal Random Variable, Gaussian Distribution, Normal Distribution: Fitting bell shaped data.
    • Chi-Squared Distribution: Modelling communication noise.
    • Sampling from a Distribution: The process of drawing a random value (or set of values) from a probability distribution.
    • Joint Probability Density Function.
    • Marginal Probability Density Function.
    • Conditional Probability Density Function.
    • Derived Distributions.
    • Convolution: A mathematical operation on two functions (f and g) that produces a third function.
    • The Distribution of W = X + Y.
    • The Distribution of X + Y where X, Y: Independent Normal Random Variables.
    • Covariance.
    • Covariance Matrix.
    • Correlation Coefficient.
    • Conditional Expectation: E[X | Y = y] = Sum of xpₓ|ᵧ(x|y). If Y is unknown then E[X | Y] is a random variable, i.e. a function of Y. So E[X | Y] also has its expectation and variance.
    • Law of Iterated Expectations: E[E[X | Y]] = E[X].
    • Conditional Variance: var(X | Y) is a function of Y.
    • Law of Total Variance: var(X) = E[var(X | Y)] + var(E[X | Y]).
    • Bernoulli Process:  A sequence of independent Bernoulli trials. At each trial, i: P(Xᵢ=1)=p, P(Xᵢ=0)=1−p.
    • Poisson Process.
    • Markov Chain.
    • Mean, Median, Mode.
    • Moments of a Distribution.
    • Skewness: E[((X – μ)/σ)³].
    • Kurtosis: E[((X – μ)/σ)⁴].
    • k% Quantile: Value qₖ/₁₀₀ such that P(X ≤ qₖ/₁₀₀) = k/100.
    • Interquartile Range: IQR = Q₃ − Q₁.
    • Box-Plots: Q₁, Q₂, Q₃, IQR, min, max.
    • Kernel Density Estimation.
    • Violin Plot = Box-Plot + Kernel Density Estimation.
    • Quantile-Quantile Plots (QQ Plots).
    • Population: N.
    • Sample: n.
    • Random Sampling.
    • Population Mean: μ.
    • Sample Mean: x̄.
    • Population Proportion: p.
    • Sample Proportion: p̂.
    • Population Variance: σ².
    • Sample Variance: s².
    • Markov’s Inequality: P(X ≥ a) ≤ E(X)/a (X ≥ 0, a > 0).
    • Chebyshev’s Inequality: P(|X – E(X)| ≥ a) ≤ var(X)/a².
    • Weak Law of Large Numbers: The average of the samples will get closer to the population mean as the sample size (not number of items) increases.
    • Central Limit Theorem: The distribution of sample means approximates a normal distribution as the sample size (not number of items) gets larger, regardless of the population’s distribution.
    • Sampling Distributions: Distribution of Sample Mean, Distribution of Sample Proportion, Distribution of Sample Variance.
    • Point Estimate: A single number, calculated from a sample, that estimates a parameter of the population.
    • Maximum Likelihood Estimation: Given data the maximum likelihood estimate (MLE) for the parameter p is the value of p that maximizes the likelihood P (data | p). P (data | p) is the likelihood function. For continuous distributions, we use the probability density function to define the likelihood.
    • Log likelihood: the natural log of the likelihood function.
    • Frequentists: Assume no prior belief, the goal is to find the model that most likely generated observed data.
    • Bayesians: Assume prior belief, the goal is to update prior belief based on observed data.
    • Maximum A Posteriori (MAP): Good for instances when you have limited data or strong prior beliefs. Wrong priors, wrong conclusions. MAP with uninformative priors is just MLE.
    • Margin of Error: A bound that we can confidently place on the difference between an estimate of something and the true value.
    • Significance Level: α, the probability that the event could have occurred by chance.
    • Confidence Level: 1 − α,  a measure of how confident we are in a given margin of error.
    • Confidence Interval: A 95% confidence interval (CI) of the mean is a range with an upper and lower number calculated from a sample. Because the true population mean is unknown, this range describes possible values that the mean could be. If multiple samples were drawn from the same population and a 95% CI calculated for each sample, we would expect the population mean to be found within 95% of these CIs.
    • z-score: the number of standard deviations from the mean value of the reference population.
    • Confidence Interval: Unknown σ.
    • Confidence Interval for Proportions.
    • Hypothesis: A statement about a population developed for the purpose of testing.
    • Hypothesis Testing.
    • Null Hypothesis (H₀): A statement about the value of a population parameter, contains equal sign.
    • Alternate Hypothesis (H₁): A statement that is accepted if the sample data provide sufficient evidence that the null hypothesis is false, never contains equal sign.
    • Type I Error: Reject the null hypothesis when it is true.
    • Type II Error: Do not reject the null hypothesis when it is false.
    • Significance Level, α: The maximum probability of rejecting the null hypothesis when it is true.
    • Test Statistic:  A number, calculated from samples, used to find if your data could have occurred under the null hypothesis.
    • Right-Tailed Test: The alternative hypothesis states that the true value of the parameter specified in the null hypothesis is greater than the null hypothesis claims.
    • Left-Tailed Test: The alternative hypothesis states that the true value of the parameter specified in the null hypothesis is less than the null hypothesis claims.
    • Two-Tailed Test: The alternative hypothesis which does not specify a direction, i.e. when the alternative hypothesis states that the null hypothesis is wrong.
    • p-value: The probability of obtaining test results at least as extreme as the result actually observed, under the assumption that the null hypothesis is correct. μ₀ is assumed to be known and H₀ is assumed to be true.
    • Decision Rules: If H₀ is true then acceptable x̄ must fall in (1 − α) region.
    • Critical Value or k-value: A value on a test distribution that is used to decide whether the null hypothesis should be rejected or not.
    • Power of a Test: The probability of rejecting the null hypothesis when it is false; in other words, it is the probability of avoiding a type II error.
    • t-Distribution.
    • T-Statistic.
    • t-Tests: Unknown σ, use T-Statistic.
    • Independent Two-Sample t-Tests.
    • Paired t-Tests.
    • A/B testing: A methodology for comparing two variations (A/B) that uses t-Tests for statistical analysis and making a decision.
    • Model Building: X = a·S + W, where X: output, S: “signal”, a: parameters, W: noise. Know S, assume W, observe X, find a.
    • Inferring: X = a·S + W. Know a, assume W, observe X, find S.
    • Hypothesis Testing: X = a·S + W. Know a, observe X, find S. S can take one of few possible values.
    • Estimation: X = a·S + W. Know a, observe X, find S. S can take unlimited possible values.
    • Bayesian Inference can be used for both Hypothesis Testing and Estimation by leveraging Bayes rule. Output is posterior distribution. Single answer can be Maximum a posteriori probability (MAP) or Conditional Expectation.
    • Least Mean Squares Estimation of Θ based on X.
    • Classical Inference can be used for both Hypothesis Testing and Estimation.
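
Three of the formulas in the list above (Bayes’ Rule, the Birthday Problem and the Binomial Probability Mass Function) can be checked numerically with a few lines of Python. The sketch below is only an illustration; the function names are hypothetical and the spam-filter probabilities used in the example are made-up numbers, not measured data.

```python
from math import comb


def bayes_posterior(priors, likelihoods):
    """P(A_i | B) = P(A_i) P(B | A_i) / sum_j P(A_j) P(B | A_j)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)  # total probability theorem: this is P(B)
    return [j / total for j in joint]


def birthday_collision_probability(people, days=365):
    """Probability that at least two of `people` share a birthday."""
    p_all_distinct = 1.0
    for i in range(people):
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct


def binomial_pmf(k, n, p):
    """P(X = k) for X = number of successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


if __name__ == "__main__":
    # Made-up numbers: P(spam) = 0.2, P("lottery" | spam) = 0.4, P("lottery" | not spam) = 0.01.
    print(bayes_posterior([0.2, 0.8], [0.4, 0.01]))
    # Monty Hall: you pick door 1 and the host opens door 3.
    # Likelihood of that event if the car is behind door 1, 2 or 3: 1/2, 1, 0.
    print(bayes_posterior([1 / 3, 1 / 3, 1 / 3], [0.5, 1.0, 0.0]))
    print(birthday_collision_probability(23))
    print(sum(binomial_pmf(k, 10, 0.3) for k in range(11)))
```

In the Monty Hall call the posterior for door 2 comes out at 2/3, which is why switching doors is the better strategy, and the birthday probability first exceeds 50% at 23 people.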

    After finishing learning about probability and statistics please click Topic 20 – Discrete Mathematics to continue.

     

    Topic 18 – Linear Algebra

    Why do I need to learn about linear algebra?

    Linear algebra is a fundamental tool for understanding many modern theories and techniques such as artificial intelligence, machine learning, deep learning, data mining, security, digital imagine processing, and natural language processing.

    What can I do after finishing learning about linear algebra?

    You will be prepared to learn modern theories and techniques to create modern security, machine learning, data mining, image processing or natural language processing software.

    That sounds useful! What should I do now?

    Please read this David C. Lay et al. (2022). Linear Algebra and Its Applications. Pearson Education book.

    Alternatively, please watch these MIT 18.06 – Linear Algebra, Spring 2005 course videos. While watching these course videos please do read the Lecture Notes, and this Gilbert Strang (2016). Introduction to Linear Algebra. Wellesley-Cambridge Press book to better understand some complex topics.

    Terminology Review:

    • Linear Equations.
    • Row Picture.
    • Column Picture.
    • Triangular matrix is a square matrix where all the values above or below the diagonal are zero.
    • Lower Triangular Matrix.
    • Upper Triangular Matrix.
    • Diagonal matrix is a matrix in which the entries outside the main diagonal are all zero.
    • Tridiagonal Matrix.
    • Identity Matrix.
    • Transpose of a Matrix.
    • Symmetric Matrix.
    • Pivot Columns.
    • Pivot Variables.
    • Augmented Matrix.
    • Echelon Form.
    • Reduced Row Echelon Form.
    • Elimination Matrices.
    • Inverse Matrix.
    • Factorization into A = LU.
    • Free Columns.
    • Free Variables.
    • Gauss-Jordan Elimination.
    • Vector Spaces.
    • Rank of a Matrix.
    • Permutation Matrix.
    • Subspaces.
    • Column space, C(A) consists of all combinations of the columns of A and is a subspace of ℝᵐ.
    • Nullspace, N(A) consists of all solutions x of the equation Ax = 0 and lies in ℝⁿ.
    • Row space, C(Aᵀ) consists of all combinations of the row vectors of A and forms a subspace of ℝⁿ. It is the same as the column space of the transpose of A, hence the notation C(Aᵀ).
    • The left nullspace of A, N(Aᵀ) is the nullspace of Aᵀ. This is a subspace of ℝᵐ.
    • Linearly Dependent Vectors.
    • Linearly Independent Vectors.
    • Linear Span of Vectors.
    • A basis for a vector space is a sequence of vectors with two properties:
      • They are independent.
      • They span the vector space.
    • Given a space, every basis for that space has the same number of vectors; that number is the dimension of the space.
    • Dimension of a Vector Space.
    • Dot Product.
    • Orthogonal Vectors.
    • Orthogonal Subspaces.
    • The row space of A is orthogonal to the nullspace of A.
    • Matrix Spaces.
    • Rank-One Matrix.
    • Orthogonal Complements.
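    • Projection matrix: P = A(AᵀA)⁻¹Aᵀ. Properties of projection matrix: Pᵀ = P and P² = P. Projection of b onto the column space of A: p = Pb = A(AᵀA)⁻¹Aᵀb = Ax̂, where x̂ = (AᵀA)⁻¹Aᵀb. When A is a single column a, this reduces to p = a(aᵀb)/(aᵀa).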
    • Linear regression, least squares, and normal equations: Instead of solving Ax = b we solve Ax̂ = p or AᵀAx̂ = Aᵀb.
    • Linear Regression.
    • Orthogonal Matrix.
    • Orthogonal Basis.
    • Orthonormal Vectors.
    • Orthonormal Basis.
    • Gram–Schmidt process.
    • Determinant: A number associated with any square matrix that tells us whether the matrix is invertible, appears in the formula for the inverse matrix, and gives the volume of the parallelepiped whose edges are the column vectors of A. The determinant of a triangular matrix is the product of the diagonal entries (pivots).
    • The big formula for computing the determinant.
    • The cofactor formula rewrites the big formula for the determinant of an n by n matrix in terms of the determinants of smaller matrices.
    • Formula for Inverse Matrix.
    • Cramer’s Rule.
    • Eigenvectors are nonzero vectors x for which Ax is parallel to x: Ax = λx. λ is an eigenvalue of A and satisfies det(A − λI) = 0.
    • Diagonalizing a matrix: AS = SΛ 🡲 S⁻¹AS = Λ 🡲 A = SΛS⁻¹. S: matrix of n linearly independent eigenvectors. Λ: matrix of eigenvalues on diagonal.
    • Matrix exponential eᴬᵗ.
    • Markov matrices: All entries are non-negative and each column adds to 1.
    • Symmetric matrices: Aᵀ = A.
    • Positive definite matrices: symmetric matrices for which all eigenvalues are positive; equivalently, all pivots are positive, or all leading (upper-left) determinants are positive.
    • Similar matrices: A and B = M⁻¹AM.
    • Singular value decomposition (SVD) of a matrix: A = UΣVᵀ, where U is orthogonal, Σ is diagonal, and V is orthogonal.
    • Linear Transformations: T(v + w) = T(v) + T(w) and T(cv) = cT(v). For any linear transformation T we can find a matrix A so that T(v) = Av.
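
    The projection matrix and normal equations from the list above can be checked on a small example. The sketch below is only an illustration: it fits a line to three hypothetical points, (1, 1), (2, 2) and (3, 2), using exact fractions from the Python standard library. The helper functions transpose, matmul and inverse_2x2 are written for this example and are not part of any library.

```python
from fractions import Fraction as F


def transpose(M):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]


def inverse_2x2(M):
    """Invert a 2 by 2 matrix of Fractions using the determinant formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


# Fit a line C + D*t to the hypothetical points (1, 1), (2, 2), (3, 2).
# Ax = b has no exact solution, so we solve the normal equations instead.
A = [[F(1), F(1)], [F(1), F(2)], [F(1), F(3)]]
b = [[F(1)], [F(2)], [F(2)]]

At = transpose(A)
AtA_inv = inverse_2x2(matmul(At, A))

x_hat = matmul(AtA_inv, matmul(At, b))   # solves AᵀA x̂ = Aᵀb
P = matmul(matmul(A, AtA_inv), At)       # P = A (AᵀA)⁻¹ Aᵀ
p = matmul(P, b)                         # projection of b onto C(A)
e = [[bi[0] - pi[0]] for bi, pi in zip(b, p)]  # error vector e = b − p

print("x_hat =", [str(v[0]) for v in x_hat])
print("p     =", [str(v[0]) for v in p])
print("e     =", [str(v[0]) for v in e])
print("P² = P:", matmul(P, P) == P, " Pᵀ = P:", transpose(P) == P)
```

    The error vector e is orthogonal to every column of A (Aᵀe = 0), which is exactly what the normal equations AᵀAx̂ = Aᵀb state.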

    After finishing learning about linear algebra please click Topic 19 – Probability & Statistics to continue.

     

    Talking In A Shop

    Customer: Excuse me, do you sell swimsuits for girls?
    Assistant: They’re in Aisle 12.

    Customer: Excuse me, I am looking for a shirt.
    Assistant: It’s in Aisle 12.

    Assistant: Can I help you?
    Customer: Yes please, I am looking for washing up liquid.

    Assistant: Are you looking for something in particular?
    Customer: Yes please, I am looking for a pair of jeans.
    Customer: I’m fine thanks, just browsing.
    Customer: I’m only looking today.

    Anti-Virus vs. Anti-Malware

    What is the difference between anti-virus software and anti-malware software?

    A virus is a piece of code that is capable of copying itself in order to do damage to your computer, including corrupting your system or destroying data.

    Malware, on the other hand, is an umbrella term that stands for a variety of malicious software doing damage to your computer or stealing your information, including Trojans, spyware, worms, adware, ransomware, and yes, viruses.

    So the logic follows: all viruses are malware, but not all malware is a virus.

    Anti-virus software generally scans for infectious malware, which includes viruses, worms, Trojans, rootkits and bots.

    Anti-malware software generally tends to focus more on adware, spyware, unwanted toolbars, browser hijackers, potentially unwanted programs and potentially unsafe applications.

    Therefore, you need both an anti-virus and an anti-malware solution for maximum protection.

    Built-in Windows Defender provides both anti-virus and anti-malware protection, and IMO, is enough for non-tech-savvy users.

    A comprehensive FREE anti-virus software is AVG.

    A comprehensive FREE anti-malware software is Malwarebytes.

    An ad blocker for the Edge browser is uBlock Origin.

    An ad blocker for the Firefox browser is Adblock Plus.

    How to Manually Install PHP 8.1 on Windows 10

    Motivation:

    • You want to understand how PHP works with IIS.
    • You want to prepare a PHP or WordPress development environment on Windows.
    • You want to update PHP to any version to address compatibility or security issues on Windows.

    Solution:

    • Install CGI for IIS via Turn Windows features on or off > Internet Information Services > World Wide Web Services > Application Development Features > CGI.
    • Download VS16 x64 Non Thread Safe package under PHP 8.1 section on https://windows.php.net/download/.
    • Extract the ZIP file to C:\Program Files\php-8.1.10-nts-Win32-vs16-x64 folder.
    • Download the cacert.pem file and move it to the C:\Program Files\php-8.1.10-nts-Win32-vs16-x64\extras\ssl folder.
    • Copy the php.ini-development file to php.ini.
    • Open the php.ini file and uncomment the following lines:
      fastcgi.impersonate = 1
      
      cgi.fix_pathinfo=1
      cgi.force_redirect = 1 (and change the value to 0, i.e. cgi.force_redirect = 0)
      
      extension_dir = "C:\Program Files\php-8.1.10-nts-Win32-vs16-x64\ext"
      
      extension=curl
      extension=fileinfo
      extension=gd
      extension=gettext
      extension=mbstring
      extension=exif ; Must be after mbstring as it depends on it
      extension=mysqli
      extension=openssl
      extension=pdo_mysql
      
      error_log = "C:\Program Files\php-8.1.10-nts-Win32-vs16-x64\php_errors.log"
      
      error_log = syslog (alternative to the file path above; enable only one of the two error_log lines)
      
      curl.cainfo = "C:\Program Files\php-8.1.10-nts-Win32-vs16-x64\extras\ssl\cacert.pem"
      
      openssl.cafile="C:\Program Files\php-8.1.10-nts-Win32-vs16-x64\extras\ssl\cacert.pem"
    • Add C:\Program Files\php-8.1.10-nts-Win32-vs16-x64 to SYSTEM PATH.
    • Restart your machine.
    • Open cmd and execute the command below to ensure that the PHP file can be found in the SYSTEM PATH.
      php --version
    • Open IIS, click on Server name, double click on Handler Mappings > Add Module Mapping with below information.
      Request path = *.php
      
      Module = FastCgiModule
      
      Executable = "C:\Program Files\php-8.1.10-nts-Win32-vs16-x64\php-cgi.exe"
      
      Name = PHP 8.1
      
      Request Restrictions = File or folder
    • Create phpinfo.php file with below content in the root website folder.
      <?php phpinfo(); ?>