
How to Completely Uninstall Python on macOS

Problem:

  • You have an issue with a specific Python version (e.g. version 3.7).
  • You want to install a different version (e.g. version 3.6).
  • You want to completely uninstall the current version before installing the new one.

Solution:

The steps below assume the current version is 3.7; replace 3.7 with the version you actually have installed (e.g. version 3.8).

Follow the 3 steps below.

1. Remove the third-party Python 3.7 framework

sudo rm -rf /Library/Frameworks/Python.framework/Versions/3.7

2. Remove the Python 3.7 applications directory

sudo rm -rf "/Applications/Python 3.7"

3. Remove the symbolic links in /usr/local/bin that point to this Python version. First, verify that the links exist using the command below:

ls -l /usr/local/bin | grep '../Library/Frameworks/Python.framework/Versions/3.7'

then run the following commands to remove all the links:

cd /usr/local/bin/
ls -l /usr/local/bin | grep '../Library/Frameworks/Python.framework/Versions/3.7' | awk '{print $9}' | tr -d @ | xargs rm

Installing a new version:

  1. Download a version from https://www.python.org/downloads/mac-osx/, double-click the file, and follow the instructions.
  2. Verify the installation: python3 --version
  3. Install Homebrew from https://brew.sh
  4. Install virtualenv:
    pip3 install virtualenv
    pip3 install virtualenvwrapper
  5. Create and activate a virtual environment:
    cd /Users/admin/Downloads/training_model/model
    python3 -m virtualenv /Users/admin/Downloads/training_model/model
    source bin/activate
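
To confirm that the activated environment uses the newly installed interpreter, you can run a quick check from Python (a minimal sketch; the path in the comment assumes the example folder used above):

# check_env.py - run with: python3 check_env.py (inside the activated environment)
import sys

print(sys.executable)  # should point inside the virtual environment, e.g. .../model/bin/python3
print(sys.version)     # should report the newly installed version, e.g. 3.8.x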

     

How to Fix the Dell Laptop “Hard Drive Not Installed” issue

Problem:

When you turn on your Dell laptop, you suddenly get the error message “Hard Drive Not Installed” and cannot boot into Windows.

Solution:
  1. Power on your laptop and quickly press the F2 key to enter the BIOS.
  2. Expand System Configuration node.
  3. Click SATA Operation.
  4. Select AHCI option.
  5. Click Apply button.
  6. Click Exit button.
  7. If the problem persists, restore the BIOS settings to their defaults, then try the procedure again.
More information:
  • PCI Express (Peripheral Component Interconnect Express), officially abbreviated as PCIe or PCI-e, is a high-speed serial computer expansion bus standard. It is the common motherboard interface for personal computers’ graphics cards, hard drives, SSDs, Wi-Fi and Ethernet hardware connections.
  • NVM Express (NVMe) or Non-Volatile Memory Host Controller Interface Specification (NVMHCIS) is an open logical-device interface specification for accessing non-volatile storage media attached via PCI Express (PCIe) bus. By its design, NVM Express allows host hardware and software to fully exploit the levels of parallelism possible in modern SSDs. As a result, NVM Express reduces I/O overhead and brings various performance improvements relative to previous logical-device interfaces, including multiple long command queues, and reduced latency.
  • Serial ATA (SATA, abbreviated from Serial AT Attachment) is a computer bus interface that connects host bus adapters to mass storage devices such as hard disk drives, optical drives, and solid-state drives.
  • The Advanced Host Controller Interface (AHCI) is a technical standard defined by Intel that specifies the operation of Serial ATA (SATA) host controllers in a non-implementation-specific manner in its motherboard chipsets. AHCI is mainly recommended for SSDs that have better NVMe drivers from their factories.
  • RAID (“Redundant Array of Inexpensive Disks” or “Redundant Array of Independent Disks”) is a data storage virtualization technology that combines multiple physical disk drive components into one or more logical units for the purposes of data redundancy, performance improvement, or both.

 

How to Set File Permissions for WordPress on Windows IIS

Motivation:

  • You have a WordPress instance on Windows IIS.
  • You upload a file. Its thumbnail is not shown in Media Library.
  • You change the file permission. Its thumbnail now is shown correctly in Media Library.
  • You upload another file and have to change the file permission manually again.
  • How can we make WordPress automatically set the correct permission for newly uploaded files?

Procedure:

  1. Ensure that the Identity of the Application Pool that the website runs under is ApplicationPoolIdentity.
  2. Execute the commands below as Administrator:
icacls "C:\inetpub\wwwroot\domain.com" /grant "IUSR":(OI)(CI)F /T 
icacls "C:\inetpub\wwwroot\domain.com" /grant "IIS_IUSRS":(OI)(CI)F /T

3. Set up IIS.

  • Open IIS Manager.
  • Click on your website.
  • Click Authentication.
  • Click Anonymous Authentication (which should be the only one enabled).
  • Click Edit.
  • Select Application pool identity if it is not selected.
  • Click OK.

 

 

How to Count the Number of Source Lines of Code, Find and Replace Content in Multiple Files

Motivation:

  • You have a source code folder and want to know the number of source lines of code.
  • You want to find and replace a string with another string in multiple files.

Procedure:

  1. Right-click the Start icon, then click Windows PowerShell (Admin). The commands below use PowerShell cmdlets, so they will not work in a plain Command Prompt.
  2. Assume that the source code folder location is C:\Users\admin\Downloads\test; type the command below and press Enter:
cd C:\Users\admin\Downloads\test

3. Assume that the source code file extension is .py; type the command below and press Enter:

type *.py | Measure-Object -line

The Lines value reported by Measure-Object is the number of source lines of code (SLOC).

4. Assume that you want to replace the string “.flac” with the string “.wav” in all .cue files in the “E:\New Music\” directory; type the command below, then press Enter.

Get-ChildItem "E:\New Music\" *.cue -recurse | ForEach { (Get-Content -Path $_.FullName).Replace(".flac", ".wav") | Set-Content -Path $_.FullName }

How to Copy, Move, Replicate, Augment or Delete Files and Folders using Commands on Windows

Motivation:

  • You have a web application whose backup needs to be created daily.
  • You have a web application whose content needs to be replicated daily.
  • You have a data folder whose content needs to be augmented daily.

Commands:

  • Copying files and folders inside one folder to another:
robocopy E:\inetpub\wwwroot\website.domain.com E:\inetpub\wwwroot\backup.domain.com /e

/e Copies subdirectories. This option includes empty directories. 

robocopy \\192.168.1.49\E\inetpub E:\inetpub /e
  • Moving entire folder to another location:
PS C:\> Move-Item -path \\192.168.1.15\e\inetpub\ -destination E:\ -force

where PS C:\> is the PowerShell prompt.
  • Moving new files and folders inside one folder to another:
robocopy E:\inetpub\wwwroot\website.domain.com E:\inetpub\wwwroot\archive.domain.com /move /e

/move Moves files and directories, and deletes them from the source after they are copied.
  • Copying (mirroring) entire data from one drive to another, including file and folder permissions:
robocopy E:\ G:\ /MIR /COPYALL /ZB /W:1 /R:2 /XO

or

robocopy E:\ G:\ /TEE /LOG+:F:\robolog.txt /MIR /COPYALL /ZB /W:1 /R:2 /XO

E:\
 - Source folder. This can be a UNC path.

G:\
 - Destination folder. This can be a UNC path.

/TEE
 - Display the output of the command in the console window and write it to a log file.

/LOG+:F:\robolog.txt
 - Write the logs to F:\robolog.txt. The + sign means appending the content to the log file.

/MIR
 - Copy all files and subfolders, remove files and folders from the destination if they no longer exist on the source.

/COPYALL
 - Copy all of the NTFS permissions and attributes (security permissions, timestamps, owner info, etc.)

/ZB
 - Use restartable mode when copying files. If a file is in use, retry after a set amount of time (see /W:1 and /R:2). If access is denied then try to copy in backup mode.

/W:1
 - Wait for 1 second between retries when copying files.

/R:2
 - The number of retries on failed copies.

/XO
 - eXclude Older files/folders if the destination file or folder exists and has the same date.
If destination file exists and is the same date or newer than the source - don't bother to overwrite it.
  • Augmenting files and folders (making an incremental backup) from one drive to another, including file and folder permissions:
robocopy E:\ G:\ /E /COPYALL /ZB /W:1 /R:2 /XO /XX

or

robocopy E:\ G:\ /TEE /LOG+:F:\robolog2.txt /E /COPYALL /ZB /W:1 /R:2 /XO /XX

/E
 - Copy Subfolders, including Empty Subfolders.

/XX
 - eXclude "eXtra" files and dirs (present in destination but not source). This will prevent any deletions from the destination.
  • Granting Full control to a user or group:
icacls "E:\inetpub\wwwroot\website.domain.com\App_Data" /grant "IUSR":(OI)(CI)F /T

icacls "E:\inetpub\wwwroot\website.domain.com\App_Data" /grant "IIS_IUSRS":(OI)(CI)F /T

CI
 - Container Inherit - This flag indicates that subordinate containers will inherit this ACE (access control entry).

OI
 - Object Inherit - This flag indicates that subordinate files will inherit the ACE.

OI and CI only apply to new files and sub-folders.

F
 - Full Control

/T
 - Apply recursively to existing files and sub-folders.
  • Deleting and creating a folder:
rmdir "E:\inetpub\wwwroot\website.domain.com\Temp\" /S /Q 
mkdir "E:\inetpub\wwwroot\website.domain.com\Temp\
  • Recursively deleting all files in a folder and all files in its sub-folders:
cd C:\inetpub\wwwroot

del /s *.log

/s
 - delete all the files in the sub-folders.


del /s /f /q *.*

/f
 - force deletion of read-only files.

/q
 - do not ask to confirm when deleting via wildcard.
  • Recursively deleting a folder, its files and its sub-folders:
rmdir .\force-app\main\default\objects /s /q

/s
 - delete all the files in the sub-folders.
  • Enabling long paths and file names: For Windows 10, Version 1607, and later: Open Group Policy (gpedit.msc) and go to Computer Configuration > Administrative Templates > System > Filesystem. Set “Enable Win32 long paths” to “Enabled”. Restart the machine. Then use the command below:
PS C:\> Move-Item -path \\192.168.101.157\e\Files\ -destination E:\ -force
  • Removing a drive letter from a volume:
mountvol F: /D

/D
- remove the drive letter from the selected volume.

Topic 22 – Introduction to Machine Learning

Why do I need to learn about machine learning?

Machine learning has been used to solve many important and difficult problems, including speech recognition, speech synthesis, image recognition, autonomous driving, and chatbots. Today, a key skill for software developers is the ability to use machine learning algorithms to solve real-world problems.

What can I do after finishing learning about machine learning?

You will be able to create software that can, for example, recognize the license plate number of a car in an image or estimate the probability of breast cancer for a patient.

That sounds useful! What should I do now?

First, please audit these courses to learn the core concepts of machine learning and gain hands-on experience with them:

After that, please read the following books to reinforce your theoretical understanding and practical competence in machine learning:

After that, please audit this course and read its readings to learn the core approaches and algorithms for building artificial intelligence systems: MIT 6.034 – Artificial Intelligence, Fall 2010 (Readings).

After that, please read the following books to study the mathematical foundations underlying machine learning algorithms:

After that, please audit the following courses and read the book below to learn the core concepts and algorithms of reinforcement learning:

Supervised Learning Terminology Review:

  • Artificial Intelligence.
  • Machine Learning.
  • Deep Learning.
  • Linear Regression: Y = θX + Ε.
  • Cost Function measures how good/bad your model is.
  • Mean Square Error (MSE) measures the average of the squares of the errors.
  • Gradient Descent, Learning Rate (a minimal sketch follows this list).
  • Batch Gradient Descent.
  • The R-Squared Test measures the proportion of the total variance in the output (y) that can be explained by the variation in x. It can be used to evaluate how good a “fit” some model is on the given data.
  • Stochastic Gradient Descent.
  • Mini-Batch Gradient Descent.
  • Overfitting: machine learning model gives accurate predictions for training data but not for new data.
  • Regularization: Ridge Regression, Lasso Regression, Elastic Net, Early Stopping.
  • Normalization.
  • Logistic Regression.
  • Sigmoid Function.
  • Binary Cross Entropy Loss Function, Log Loss Function.
  • One Hot Encoding.
  • The Softmax function takes an N-dimensional vector of arbitrary real values and produces another N-dimensional vector with real values in the range (0, 1) that add up to 1.0.
  • Softmax Regression.
  • Gradient Ascent.
  • Newton’s Method.
  • Support Vector Machines.
  • Decision Trees.
  • Parametric vs. Non-parametric Models.
  • K-Nearest Neighbors.
  • Locally Weighted Regression.
  • McCulloch-Pitts Neuron.
  • Linear Threshold Unit with threshold T calculates the weighted sum of its inputs, and then outputs 0 if this sum is less than T, and 1 if the sum is greater than T.
  • Perceptron.
  • Artificial Neural Networks.
  • Backpropagation.
  • Activation Functions: Rectified Linear Unit (ReLU), Leaky ReLU, Sigmoid, Hyperbolic Tangent.
  • Batch Normalization.
  • Learning Rate Decay.
  • Exponentially Weighted Averages.
  • Gradient Descent Optimization Algorithms: Momentum, Adagrad, Adadelta, RMSprop, Adam.
  • Regularization: Dropout.
  • The Joint Probability Table.
  • Bayesian Networks.
  • Naive Bayes Inference.
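
As referenced above, the following minimal sketch illustrates several of the items in this list (Linear Regression, Cost Function, MSE, Batch Gradient Descent, Learning Rate); the synthetic data, learning rate, and iteration count are arbitrary choices for illustration:

# gradient_descent_sketch.py - batch gradient descent for linear regression with an MSE cost
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(scale=0.5, size=100)   # y ≈ 3x + 2 plus noise

Xb = np.hstack([np.ones((len(X), 1)), X])   # prepend a bias column
theta = np.zeros(2)                         # parameters θ = [intercept, slope]
learning_rate = 0.01

for _ in range(5000):                       # batch gradient descent
    errors = Xb @ theta - y
    mse = (errors ** 2).mean()              # the MSE cost being minimized
    gradient = 2.0 / len(y) * (Xb.T @ errors)
    theta -= learning_rate * gradient

print(theta, mse)   # theta should be close to [2, 3]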

Unsupervised Learning Terminology Review:

  • K-Means (a minimal sketch follows this list).
  • Principal Component Analysis.
  • User-Based Collaborative Filtering.
  • Item-based Collaborative Filtering.
  • Matrix Factorization.
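
As referenced above, here is a minimal K-Means sketch (the synthetic data, k, and the iteration count are arbitrary illustration choices):

# k_means_sketch.py - a minimal k-means clustering sketch in NumPy
import numpy as np

rng = np.random.default_rng(0)
# three synthetic clusters of 2-D points
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(100, 2))
               for c in ([0, 0], [5, 5], [0, 5])])

def k_means(X, k=3, iterations=20):
    centroids = X[rng.choice(len(X), size=k, replace=False)]   # random initial centroids
    for _ in range(iterations):
        # assign every point to its nearest centroid
        distances = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = distances.argmin(axis=1)
        # move each centroid to the mean of the points assigned to it
        centroids = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                              else centroids[j] for j in range(k)])
    return centroids, labels

centroids, labels = k_means(X)
print(centroids)   # should be close to the three cluster centers used above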

    Reinforcement Learning Terminology Review:

    • k-armed Bandit Problem.
    • Sample-Average Method.
    • Greedy Action.
    • Exploration and Exploitation.
    • ϵ-Greedy Action Selection (a minimal bandit sketch follows this list).
      • Bandit Algorithm.
      • Exponential Recency-Weighted Average.
      • Optimistic Initial Values.
      • Upper-Confidence-Bound Action Selection.
      • Rewards.
      • Agent, Actions, World or Environment.
      • History, States, Terminal State, Environment State, Agent State, Information State.
      • Fully Observable Environments.
      • Partially Observable Environments.
      • Policy,  Value Function, Model.
      • Value Based RL Agent, Policy Based RL Agent, Actor Critic RL Agent.
      • Model Free RL Agent, Model Based RL Agent.
      • Learning Problem and Planning Problem.
      • Prediction and Control.
      • Markov Property.
      • State Transition Matrix.
      • Markov Process.
      • Episodic Tasks.
      • Continuing Tasks.
      • Horizon (H): Number of time steps in each episode, can be infinite.
      • Markov Reward Process.
      • Discount Factor, Discount Rate: 0 ≤ γ ≤ 1.
      • Return.
      • Discounted Return: Discounted sum of rewards from time step t to horizon H.
      • State-Value Function.
      • Bellman Equation for Markov Reward Processes.
      • Markov Decision Process.
      • Policy: Mapping from states to actions. Deterministic policy: π (s) = a. Stochastic policy: π (a|s) = P(aₜ=a|sₜ=s).
      • State Value Function – Vπ(s): The expected return starting from state s following policy π.
      • Bellman Expectation Equation for Vπ.
      • Action Value Function (also known as the State-Action Value Function or the Quality Function) – Qπ(s, a): The expected return starting from state s, taking action a, then following policy π.
      • Bellman Expectation Equation for Qπ.
      • Optimal State Value Function.
      • Optimal Action Value Function.
      • Bellman Optimality Equation for v*.
      • Bellman Optimality Equation for q*.
      • Optimal Policies.
      • Dynamic Programming.
      • Iterative Policy Evaluation.
      • Policy Improvement.
      • Policy Improvement Theorem.
      • Policy Iteration.
      • Value Iteration.
      • Synchronous Dynamic Programming.
      • Asynchronous Dynamic Programming.
      • Generalized Policy Iteration.
      • Bootstrapping: Updating estimates on the basis of other estimates.
      • Monte-Carlo Policy Evaluation.
      • First-Visit Monte-Carlo Policy Evaluation.
      • Every-Visit Monte-Carlo Policy Evaluation.
      • Incremental Mean.
      • Incremental Monte-Carlo Updates.
      • Temporal-Difference Learning.
      • Forward-View TD(λ).
      • Eligibility Traces.
      • Backward-View TD(λ).
      • On-Policy Learning.
      • Off-Policy Learning.
      • ϵ-Greedy Exploration.
      • ϵ-greedy Policies: Most of the time they choose an action that has maximal estimated action value, but with probability ϵ they instead select an action at random.
      • Monte-Carlo Policy Iteration. Policy evaluation: Monte-Carlo policy evaluation, Q = qπ. Policy improvement: ϵ-greedy policy improvement.
      • Monte-Carlo Control. Policy evaluation: Monte-Carlo policy evaluation, Q ≈ qπ. Policy improvement: ϵ-greedy policy improvement.
      • Exploring Starts: Specify that the episodes start in a state–action pair, and that every pair has a nonzero probability of being selected as the start.
      • Monte Carlo Control Exploring Starts.
      • Greedy in the Limit with Infinite Exploration (GLIE) Monte-Carlo Control.
      • ϵ-soft Policies: Policies for which π(a|s) ≥ ϵ/|A(s)| for all states and actions, for some ϵ > 0.
      • On-Policy First-Visit MC Control.
      • SARSA: State (S), Action (A), Reward (R), State (S’), Action (A’).
      • On-Policy Control with SARSA. Policy evaluation: SARSA evaluation, Q ≈ qπ. Policy improvement: ϵ-greedy policy improvement.
      • Forward-View SARSA (λ).
      • Backward-View SARSA (λ).
      • Target Policy.
      • Behavior Policy.
      • Importance Sampling: Use samples from one distribution to estimate the expectation of a different distribution.
      • Importance Sampling for Off-Policy Monte-Carlo.
      • Importance Sampling for Off-Policy TD.
      • Q-Learning: Next action is chosen using behaviour policy. Q is updated using alternative successor action.
      • Off-Policy Control with Q-Learning.
      • Expected SARSA.
      • Value Function Approximation.
      • Function Approximators.
      • Differentiable Function Approximators.
      • Feature Vectors.
      • State Aggregation.
      • Coarse Coding.
      • Tile Coding.
      • Continuous States.
      • Incremental Prediction Algorithms.
      • Control with Value Function Approximation. Policy evaluation: Approximate policy evaluation, q(.,., w) ≈ qπ. Policy improvement: ϵ-greedy policy improvement.
      • Learning the State-Action Value Function: Replay Buffer: the 10,000 most recent (s, a, R(s), s’) tuples. x = (s, a) → Q(θ) → y = R(s) + γ maxₐ′ Q(s’, a’; θ). Loss = [R(s) + γ maxₐ′ Q(s’, a’; θ)] − Q(s, a; θ).
      • Expected SARSA with Function Approximation.
      • Target Network: A separate neural network for generating the y targets. It has the same architecture as the original Q-Network. Loss = [R(s) + γ maxₐ′ TargetQ(s’, a’; θ′)] − Q(s, a; θ). Every C time steps we will use the TargetQ-Network to generate the y targets and update the weights of the TargetQ-Network using the weights of the Q-Network.
      • Soft Updates: θ′ ← 0.001θ + 0.999θ′, where θ′ and θ represent the weights of the target network and the current network, respectively.
      • Deep Q-learning.
      • Linear Least Squares Prediction Algorithms.
      • Least Squares Policy Iteration. Policy evaluation: Least squares Q-Learning. Policy improvement: Greedy policy improvement.
      • Average Reward.
      • Discounted Returns, Returns for Average Reward.
      • Stochastic Policies.
      • Softmax Policies.
      • Gaussian Policies.
      • Policy Objective Functions: Start State Objective, Average Reward Objective and Average Value Objective.
      • Score Function.
      • Policy Gradient Theorem.
      • Monte-Carlo Policy Gradient (REINFORCE).
      • Action-Value Actor-Critic: Critic updates w by linear TD(0). Actor updates θ by policy gradient.
      • The Tabular Dyna-Q Algorithm.
      • The Dyna-Q+ Algorithm.
      • Forward Search.
      • Simulation-Based Search.
      • Monte-Carlo Tree Search.
      • Temporal-Difference Search.
      • Dyna-2.
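
As referenced above, here is a minimal sketch of ϵ-greedy action selection on a k-armed bandit with incremental sample-average updates (the arm means, ϵ, and the number of steps are arbitrary illustration choices):

# bandit_sketch.py - epsilon-greedy action selection with sample-average estimates
import random

true_means = [0.2, 0.5, 0.8]      # expected reward of each arm (unknown to the agent)
Q = [0.0] * len(true_means)       # action-value estimates
N = [0] * len(true_means)         # number of times each arm has been selected
epsilon = 0.1

for _ in range(10000):
    if random.random() < epsilon:                       # explore
        action = random.randrange(len(Q))
    else:                                               # exploit the greedy action
        action = max(range(len(Q)), key=lambda a: Q[a])
    reward = random.gauss(true_means[action], 1.0)      # sample a noisy reward
    N[action] += 1
    Q[action] += (reward - Q[action]) / N[action]       # incremental sample-average update

print([round(q, 2) for q in Q])   # estimates should approach [0.2, 0.5, 0.8]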

      Probabilistic Machine Learning Terminology Review:

      • Probabilistic Machine Learning
      • Non-Probabilistic Machine Learning
      • Algorithmic Machine Learning.
      • Array Programming.
      • Frequentist and Bayesian Approaches.

      After finishing machine learning, please click on Topic 23 – Introduction to Computer Vision to continue.

       

      How to Change Language of an EPUB File

      Problem:

      You have an EPUB file encoded with a wrong language tag.
      Therefore, when you use the Read Aloud feature of the Google Play Books application, the book is read aloud in the wrong language.

      Solution:
      1. Download the EPUB file to a PC.
      2. Change the extension from EPUB to ZIP.
      3. Open the .ZIP file.
      4. Open the content.opf file using the Notepad app.
      5. If you cannot find the content.opf file, navigate to the OEBPS folder.
      6. Find the tag <dc:language> and change its value (e.g. from <dc:language>en</dc:language> to <dc:language>vi</dc:language>).
      7. If you cannot find the tag <dc:language>, add a new one right above the </metadata> tag, e.g.:
        <dc:language>vi</dc:language>
        </metadata>
      8. Save the content.opf file and rezip the EPUB file.
      9. Change the file extension from ZIP to EPUB.
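
If you prefer to script steps 2 to 9, the following Python sketch (the function name and file path are hypothetical, and the book is assumed to follow the usual EPUB packaging) rewrites the <dc:language> tag directly inside the EPUB:

# set_epub_language.py - a minimal sketch, assuming a standard EPUB layout
import re
import shutil
import zipfile

def set_epub_language(epub_path, lang="vi"):
    backup = epub_path + ".bak"
    shutil.copyfile(epub_path, backup)              # keep a backup of the original file
    with zipfile.ZipFile(backup) as src, zipfile.ZipFile(epub_path, "w") as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename.endswith(".opf"):      # content.opf may live in the OEBPS folder
                text = data.decode("utf-8")
                if "<dc:language>" in text:
                    text = re.sub(r"<dc:language>[^<]*</dc:language>",
                                  "<dc:language>" + lang + "</dc:language>", text)
                else:                                # add the tag right above </metadata>
                    text = text.replace("</metadata>",
                                        "<dc:language>" + lang + "</dc:language>\n</metadata>")
                data = text.encode("utf-8")
            # the mimetype entry must stay uncompressed per the EPUB specification
            compression = zipfile.ZIP_STORED if item.filename == "mimetype" else zipfile.ZIP_DEFLATED
            dst.writestr(item, data, compress_type=compression)

set_epub_language("book.epub", "vi")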

       

      How to Transfer Photos from iPhone to PC with the Highest Quality

      Problem: Images copied directly from an iPhone to a PC usually have lower quality than the originals due to format conversion.

      You want to preserve as much of the original quality as possible.

      Solution:

      1. Connect iPhone to a MacBook.
      2. Open Photos app.
      3. Click on iPhone’s name under Devices section on the left.
      4. Select photos on the right.
      5. Select Import to = Library or New Album.
      6. Click on the Import N Selected button, where N is the number of selected photos, to import the photos from the iPhone into Photos.
        • The imported photos will be copied from the iPhone into the Photos library.
        • You have to manually delete the photos from the iPhone if you want to permanently remove them from it.
      7. Click on an album name in the left menu in Photos.
      8. Select the imported images in the album.
      9. Click File > Export > Export N Photos… (N is a number) to export the photos from Photos to a folder on the MacBook.
        • Select Arrow icon at the end of Photo Kind
        • Select PNG
        • Select Color Profile = Original
        • Select Size = Full Size
        • Select Movie Quality = 4K
      10. Click the Export button.
      11. Enter a folder name.
      12. Click the Export button. Wait for the exporting process to be completed by reviewing the circle icon in the toolbar.
      13. Share the folder in a LAN.
      14. Copy the folder to a PC.

       

      Topic 21 – Introduction to Computational Thinking

      Why do I need to learn about computational thinking?

      Computational thinking is a fundamental tool for understanding, implementing, and evaluating modern theories in artificial intelligence, machine learning, deep learning, data mining, security, digital image processing, and natural language processing.

      What can I do after finishing learning about computational thinking?

      You will be able to:

      • use a programming language to express computations,
      • apply systematic problem-solving strategies such as decomposition, pattern recognition, abstraction, and algorithmic thinking to turn an ambiguous problem statement into a computational solution method,
      • apply algorithmic and problem-reduction techniques,
      • use randomness and simulations to address problems that cannot be solved with closed-form solutions,
      • use computational tools, including basic statistical, visualization, and machine learning tools, to model and understand data.

      These skills foster abstract thinking that enables you not only to use technology effectively but also to understand what is possible, recognize inherent trade-offs, and account for computational constraints that shape the software you design.

      You will also be prepared to learn how to design and build compilers, operating systems, database management systems, and distributed systems.

      That sounds useful! What should I do now?

      First, please read this book to learn how to apply computational methods such as simulation, randomized algorithms, and statistical analysis to solve problems such as modeling disease spread, simulating physical systems, analyzing biological data, optimizing transportation, and designing communication networks: John V. Guttag (2021). Introduction to Computation and Programming using Python. 3rd Edition. The MIT Press.

      Alternatively, if you want to gain the same concepts through interactive explanations, please audit the following courses:

      After that, please read chapters 5 and 6 of the following book to learn about the theory of computing and how a machine performs computations: Robert Sedgewick and Kevin Wayne (2016). Computer Science – An Interdisciplinary Approach. Addison-Wesley Professional.

      Alternatively, if you want to gain the same concepts through interactive explanations, please audit the following courses: Computer Science: Algorithms, Theory, and Machines.

      After that, please read the following book to learn what is going on “under the hood” of a computer system: Randal E. Bryant and David R. O’Hallaron (2015). Computer Systems. A Programmer’s Perspective. Pearson.

      After that, please audit this course to learn how to build scalable and high-performance software systems: MIT 6.172 Performance Engineering of Software Systems, Fall 2018 (Lecture Notes).

      Terminology Review:

      • Algorithms.
      • Fixed Program Computer, Stored Program Computer.
      • Computer Architecture.
      • Hardware or Computer Architecture Primitives, Programming Language Primitives, Theoretical or Computability Primitives
      • Mathematical Abstraction of a Computing Machine (Turing Machine, Abstract Device), Turing’s Primitives.
      • Programming Languages.
      • Expressions, Syntax, Static Semantics, Semantics, Variables, Bindings.
      • Programming vs. Math.
      • Programs.
      • Big O notation.
      • Optimization Models: Knapsack Problem.
      • Graph-Theoretic Models: Shortest Path Problems.
      • Simulation Models: Monte Carlo Simulation, Random Walk (a minimal sketch follows this list).
      • Statistical Models.
      • K-means Clustering.
      • k-Nearest Neighbors Algorithm.
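
As referenced above, here is a minimal Monte Carlo simulation sketch (estimating π by random sampling; the sample count is an arbitrary illustration choice):

# monte_carlo_pi.py - estimate pi with a Monte Carlo simulation
import random

def estimate_pi(num_samples=1000000):
    inside = 0
    for _ in range(num_samples):
        x, y = random.random(), random.random()   # a random point in the unit square
        if x * x + y * y <= 1.0:                  # does it fall inside the quarter circle?
            inside += 1
    return 4 * inside / num_samples

print(estimate_pi())   # prints a value close to 3.14159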

      After finishing computational thinking, please click on Topic 22 – Introduction to Machine Learning to continue.

       

      Topic 19 – Probability & Statistics

      Why do I need to learn about probability and statistics?

      Probability and statistics are fundamental tools for understanding many modern theories and techniques such as artificial intelligence, machine learning, deep learning, data mining, security, digital image processing and natural language processing.

      What can I do after finishing learning about probability and statistics?

      You will be prepared to learn modern theories and techniques to create modern security, machine learning, data mining, image processing or natural language processing software.

      That sounds useful! What should I do now?

      Please read one of the following books to grasp the core concepts of probability and statistics:

      Alternatively, please read these notes first, and then audit the courses below if you would like to learn through interactive explanations:

      Perhaps probability and statistics are among the most difficult topics in mathematics, so you may need to study them two or three times using different sources to truly master the concepts. For example, you may audit the course and read the books below to gain additional examples and intuition about the concepts:

      Learning probability and statistics requires patience. However, the rewards will be worthwhile: you will be able to master AI algorithms more quickly and with greater confidence.

      Terminology Review:

      • Sample Space (Ω): Set of possible outcomes.
      • Event: Subset of the sample space.
      • Probability Law: Law specified by giving the probabilities of all possible outcomes.
      • Probability Model = Sample Space + Probability Law.
      • Probability Axioms: Nonnegativity: P(A) ≥ 0; Normalization: P(Ω)=1; Additivity: If A ∩ B = Ø, then P(A ∪ B)= P(A)+ P(B).
      • Conditional Probability: P(A|B) = P (A ∩ B) / P(B).
      • Multiplication Rule.
      • Total Probability Theorem.
      • Bayes’ Rule: Given P(Aᵢ) (initial “beliefs” ) and P (B|Aᵢ). P(Aᵢ|B) = ? (revise “beliefs”, given that B occurred).
      • The Monty Hall Problem: 3 doors, behind which are two goats and a car.
      • The Spam Detection Problem: the word “Lottery” in spam emails (a worked Bayes’ rule sketch follows this list).
      • Independence of Two Events: P(B|A) = P(B)  or P(A ∩ B) = P(A) · P(B).
      • The Birthday Problem: P(Same Birthday of 23 People) > 50%.
      • The Naive Bayes Model: “Naive” means features independence assumption.
      • Discrete Uniform Law: P(A) = Number of elements of A / Total number of sample points = |A| / |Ω|
      • Basic Counting Principle: r stages, nᵢ choices at stage i, number of choices = n₁ n₂ · · · nᵣ
      • Permutations: Number of ways of ordering elements. With no repetition, n slots can be filled in n · (n − 1) · (n − 2) · · · 1 = n! ways.
      • Combinations: number of k-element subsets of a given n-element set.
      • Binomial Probabilities: P(any particular sequence) = pᵏ(1 − p)ⁿ⁻ᵏ, where k is the number of heads in n tosses.
      • Random Variable: A function from the sample space to the real numbers. It is not random. It is not a variable. It is a function: f: Ω → ℝ. A random variable is used to model the whole experiment at once.
      • Discrete Random Variables.
      • Probability Mass Function: P(X = 𝑥) or Pₓ(𝑥): A function from the sample space to [0..1] that produces the likelihood that the value of X equals to 𝑥. PMF gives probabilities. 0 ≤ PMF ≤ 1. All the values of PMF must sum to 1. PMF is used to model a random variable.
      • Bernoulli Random Variable (Indicator Random Variable): f: Ω → {1, 0}. Only 2 outcomes: 1 and 0. p(1) = p and p(0) = 1 – p.
      • Binomial Random Variable: X = Number of successes in n trials. X = Number of heads in n independent coin tosses.
      • Binomial Probability Mass Function: C(n, k) pᵏ(1 − p)ⁿ⁻ᵏ.
      • Geometric Random Variable: X = Number of coin tosses until first head.
      • Geometric Probability Mass Function: (1 − p)ᵏ−¹p.
      • Expectation: E[X] = Sum of xpₓ(x).
      • Let Y=g(X): E[Y] = E[g(X)] = Sum of g(x)pₓ(x). Caution: E[g(X)] ≠ g(E[X]) in general.
      • Variance: var(X) = E[(X−E[X])²].
      • var(aX)=a²var(X).
      • X and Y are independent: var(X+Y) = var(X) + var(Y). Caution: var(X+Y) ≠ var(X) + var(Y) in general.
      • Standard Deviation: Square root of var(X).
      • Conditional Probability Mass Function: P(X=x|A).
      • Conditional Expectation: E[X|A].
      • Joint Probability Mass Function: Pₓᵧ(x,y) = P(X=x, Y=y) = P((X=x) and (Y=y)).
      • Marginal Distribution: Distribution of one variable while ignoring the other.
      • Marginal Probability Mass Function: P(x) = Σy Pₓᵧ(x,y).
      • Total Expectation Theorem: E[X|Y = y].
      • Independent Random Variables: P(X=x, Y=y) = P(X=x)·P(Y=y).
      • Expectation of Multiple Random Variables: E[X + Y + Z] = E[X] + E[Y] + E[Z].
      • Binomial Random Variable: X = Sum of Bernoulli Random Variables.
      • The Hat Problem.
      • Continuous Random Variables.
      • Probability Density Function: P(a ≤ X ≤ b) or Pₓ(𝑥). (a ≤ X ≤ b) means the X function produces a real number value within the [a, b] range. Programming language: X(outcome) = 𝑥, where a ≤ 𝑥 ≤ b. PDF does NOT give probabilities. PDF does NOT have to be less than 1. PDF gives probabilities per unit length. The total area under the PDF must be 1. The PDF is used to describe the probability of the random variable falling within a particular range of values.
      • Cumulative Distribution Function: P(X ≤ b). (X ≤ b) means X function produces a real number value within the [-∞, b] range. Programming language: X(outcome) = 𝑥, where 𝑥 ≤ b.
      • Continuous Uniform Random Variables: fₓ(x) = 1/(b – a) if a ≤ X ≤ b, otherwise f = 0.
      • Normal Random Variable, Gaussian Distribution, Normal Distribution: Fitting bell shaped data.
      • Chi-Squared Distribution: Modelling communication noise.
      • Sampling from a Distribution: The process of drawing a random value (or set of values) from a probability distribution.
      • Joint Probability Density Function.
      • Marginal Probability Density Function.
      • Conditional Probability Density Function.
      • Derived Distributions.
      • Convolution: A mathematical operation on two functions (f and g) that produces a third function.
      • The Distribution of W = X + Y.
      • The Distribution of X + Y where X, Y: Independent Normal Random Variables.
      • Covariance.
      • Covariance Matrix.
      • Correlation Coefficient.
      • Conditional Expectation: E[X | Y = y] = Sum of xpₓ|ᵧ(x|y). If Y is unknown then E[X | Y] is a random variable, i.e. a function of Y. So E[X | Y] also has its expectation and variance.
      • Law of Iterated Expectations: E[E[X | Y]] = E[X].
      • Conditional Variance: var(X | Y) is a function of Y.
      • Law of Total Variance: var(X) = E[var(X | Y)] + var(E[X | Y]).
      • Bernoulli Process:  A sequence of independent Bernoulli trials. At each trial, i: P(Xᵢ=1)=p, P(Xᵢ=0)=1−p.
      • Poisson Process.
      • Markov Chain.

      • Bar Chart, Line Charts, Scatter Plots, Histograms.
      • Mean, Median, Mode.
      • Moments of a Distribution.
      • Skewness: E[((X – μ)/σ)³].
      • Kurtosis: E[((X – μ)/σ)⁴].
      • k% Quantile: Value qₖ/₁₀₀ such that P(X ≤ qₖ/₁₀₀) = k/100.
      • Interquartile Range: IQR = Q₃ − Q₁.
      • Box-Plots: Q₁, Q₂, Q₃, IQR, min, max.
      • Kernel Density Estimation.
      • Violin Plot = Box-Plot + Kernel Density Estimation.
      • Quantile-Quantile Plots (QQ Plots).
      • Population: N.
      • Sample: n.
      • Random Sampling.
      • Population Mean: μ.
      • Sample Mean: x̄.
      • Population Proportion: p.
      • Sample Proportion: p̂.
      • Population Variance: σ².
      • Sample Variance: s².
      • Sampling Distributions.
      • Sampling from a Distribution: Drawing random values directly from a probability distribution. Purpose: Simulating or modeling real-world processes when the underlying distribution is known.
      • Markov’s Inequality: P(X ≥ a) ≤ E(X)/a (X > 0, a > 0).
      • Chebyshev’s Inequality: P(|X – E(X)| ≥ a) ≤ var(X)/a².
      • Weak Law of Large Numbers: The average of the samples will get closer to the population mean as the sample size (not number of items) increases.
      • Central Limit Theorem: The distribution of sample means approximates a normal distribution as the sample size (not number of items) gets larger, regardless of the population’s distribution.
      • Sampling Distributions: Distribution of Sample Mean, Distribution of Sample Proportion, Distribution of Sample Variance.
      • Point Estimate: A single number, calculated from a sample, that estimates a parameter of the population.
      • Maximum Likelihood Estimation: Given data the maximum likelihood estimate (MLE) for the parameter p is the value of p that maximizes the likelihood P (data | p). P (data | p) is the likelihood function. For continuous distributions, we use the probability density function to define the likelihood.
      • Log likelihood: the natural log of the likelihood function.
      • Frequentists: Assume no prior belief, the goal is to find the model that most likely generated observed data.
      • Bayesians: Assume prior belief, the goal is to update prior belief based on observed data.
      • Maximum A Posteriori (MAP): Good for instances when you have limited data or strong prior beliefs. Wrong priors, wrong conclusions. MAP with uninformative priors is just MLE.
      • Margin of Error: A bound that we can confidently place on the difference between an estimate of something and the true value.
      • Significance Level: α, the probability that the event could have occurred by chance.
      • Confidence Level: 1 − α,  a measure of how confident we are in a given margin of error.
      • Confidence Interval: A 95% confidence interval (CI) of the mean is a range with an upper and lower number calculated from a sample. Because the true population mean is unknown, this range describes possible values that the mean could be. If multiple samples were drawn from the same population and a 95% CI calculated for each sample, we would expect the population mean to be found within 95% of these CIs.
      • z-score: the number of standard deviations from the mean value of the reference population.
      • Confidence Interval: Unknown σ.
      • Confidence Interval for Proportions.
      • Hypothesis: A statement about a population developed for the purpose of testing.
      • Hypothesis Testing.
      • Null Hypothesis (H₀): A statement about the value of a population parameter, contains equal sign.
      • Alternate Hypothesis (H₁): A statement that is accepted if the sample data provide sufficient evidence that the null hypothesis is false, never contains equal sign.
      • Type I Error: Reject the null hypothesis when it is true.
      • Type II Error: Do not reject the null hypothesis when it is false.
      • Significance Level, α: The maximum probability of rejecting the null hypothesis when it is true.
      • Test Statistic:  A number, calculated from samples, used to find if your data could have occurred under the null hypothesis.
      • Right-Tailed Test: The alternative hypothesis states that the true value of the parameter specified in the null hypothesis is greater than the null hypothesis claims.
      • Left-Tailed Test: The alternative hypothesis states that the true value of the parameter specified in the null hypothesis is less than the null hypothesis claims.
      • Two-Tailed Test: The alternative hypothesis which does not specify a direction, i.e. when the alternative hypothesis states that the null hypothesis is wrong.
      • p-value: The probability of obtaining test results at least as extreme as the result actually observed, under the assumption that the null hypothesis is correct. μ₀ is assumed to be known and H₀ is assumed to be true.
      • Decision Rules: If H₀ is true then acceptable x̄ must fall in (1 − α) region.
      • Critical Value or k-value: A value on a test distribution that is used to decide whether the null hypothesis should be rejected or not.
      • Power of a Test: The probability of rejecting the null hypothesis when it is false; in other words, it is the probability of avoiding a type II error.
      • t-Distribution.
      • T-Statistic.
      • t-Tests: Unknown σ, use T-Statistic.
      • Independent Two-Sample t-Tests.
      • Paired t-Tests.
      • A/B testing: A methodology for comparing two variations (A/B) that uses t-Tests for statistical analysis and making a decision.
      • Model Building: X = a·S + W, where X: output, S: “signal”, a: parameters, W: noise. Know S, assume W, observe X, find a.
      • Inferring: X = a·S + W. Know a, assume W, observe X, find S.
      • Hypothesis Testing: X = a·S + W. Know a, observe X, find S. S can take one of few possible values.
      • Estimation: X = a·S + W. Know a, observe X, find S. S can take unlimited possible values.
      • Bayesian Inference can be used for both Hypothesis Testing and Estimation by leveraging Bayes rule. Output is posterior distribution. Single answer can be Maximum a posteriori probability (MAP) or Conditional Expectation.
      • Least Mean Squares Estimation of Θ based on X.
      • Classical Inference can be used for both Hypothesis Testing and Estimation.
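
As referenced in the Spam Detection item above, here is a worked Bayes’ rule sketch; the prior and the likelihoods are made-up numbers, purely for illustration:

# bayes_spam_sketch.py - Bayes' rule applied to the spam detection example
p_spam = 0.2                 # P(spam): prior belief that an email is spam
p_word_given_spam = 0.6      # P("lottery" | spam)
p_word_given_ham = 0.01      # P("lottery" | not spam)

# Total Probability Theorem: P("lottery")
p_word = p_word_given_spam * p_spam + p_word_given_ham * (1 - p_spam)

# Bayes' rule: P(spam | "lottery") = P("lottery" | spam) * P(spam) / P("lottery")
p_spam_given_word = p_word_given_spam * p_spam / p_word
print(round(p_spam_given_word, 4))   # 0.9375: seeing "lottery" makes spam much more likely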

      After finishing probability and statistics, please click on Topic 20 – Discrete Mathematics to continue.