Category Archives: Machine Learning

Topic 24 – Introduction to Natural Language Processing

Why do I need to learn about natural language processing?

Natural language processing (NLP) has advanced rapidly in recent years. Speech recognition, speech synthesis, machine translation and chatbots are examples of breakthrough achievements in the field.

Nowadays a key skill of a software developer is the ability to use natural language processing algorithms and tools to solve real-world problems related to text, audio, natural language sentences and speech.

What can I do after finishing learning about natural language processing?

You will be able to create software that can recognize speech, convert text to speech, translate a sentence from English to French, or answer a customer’s question.

That sounds fun! What should I do now?

Please read
– this Daniel Jurafsky and James H. Martin (2014). Speech and Language Processing. Pearson book, and
– this Christopher D. Manning and Hinrich Schütze (1999). Foundations of Statistical Natural Language Processing. MIT Press book first.

After that please audit these Natural Language Processing Specialization courses and this Stanford CS224N – NLP with Deep Learning, Winter 2023 course (Lecture Notes).

Terminology Review:

  • Natural Language Processing.
  • Text Classification (e.g. Spam Detection).
  • Named Entity Recognition.
  • Chatbots.
  • Speech Processing.
  • Speech Recognition.
  • Speech Synthesis.
  • Machine Translation.
  • Corpus: A body of texts.
  • Token: a word, a number, or a punctuation mark.
  • Collocation: compounds (e.g. disk drive), phrasal verbs (e.g. make up), and other stock phrases (e.g. bacon and eggs).
  • Unigram: a single token.
  • Bigram: a sequence of two adjacent tokens.
  • Trigram: a sequence of three adjacent tokens.
  • N-gram: a sequence of n adjacent tokens.
  • Hypothesis Testing.
  • t-Test.
  • Likelihood Ratios.
  • Language Model: a statistical model that assigns probabilities to word sequences.
  • Naive Bayes.
  • Hidden Markov Models.
  • Bag-of-Words Model.
  • Term Frequency–Inverse Document Frequency (TF–IDF).
  • Bag-of-n-Grams.
  • One-Hot Representation: You have a vocabulary of n words and you represent each word using a vector that is n bits long, in which all bits are zero except for one bit that is set to 1.
  • Word Embedding (Featurized Representation): a mapping from words to dense vectors.
  • Euclidean Distance, Dot Product Similarity, Cosine Similarity.
  • Embedding Matrix.
  • Neural Language Model.
  • Word2Vec: Skip-Gram Model, Continuous Bag-of-Words (CBOW) Model.
  • Negative Sampling.
  • GloVe, Global Vectors.
  • Recurrent Neural Networks.
  • Backpropagation Through Time.
  • Recurrent Neural Net Language Model (RNNLM).
  • Gated Recurrent Unit (GRU).
  • Long Short-Term Memory (LSTM).
  • Bidirectional RNN.
  • Deep RNNs.
  • Sequence to Sequence Model.
  • Teacher Forcing.
  • Image Captioning.
  • Greedy Search.
  • Beam Search, Length Normalization.
  • BLEU (BiLingual Evaluation Understudy) Score.
  • ROUGE (Recall-Oriented Understudy for Gisting Evaluation) Score.
  • F1 Score.
  • Minimum Bayes-Risk.
  • Attention Mechanism.
  • Self-Attention (Scaled Dot-Product Attention): Queries, Keys and Values.
  • Positional Encoding.
  • Masked Self-Attention.
  • Multi-Head Attention.
  • Residual Dropout.
  • Label Smoothing.
  • Transformer Encoder.
  • Transformer Decoder.
  • Transformer Encoder-Decoder.
  • Cross-Attention.
  • Byte Pair Encoding.
  • BERT (Bidirectional Encoder Representations from Transformers).
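
A few of the terms above (token, n-gram, one-hot representation, cosine similarity) can be tried out in a few lines of Python. This is only a minimal sketch using the standard library: the helper names (`tokenize`, `ngrams`, `one_hot`, `cosine_similarity`), the sample sentence and the two-dimensional "embedding" values are all made up for illustration, and the tokenizer is a deliberate simplification rather than what a real NLP toolkit does.

```python
import math


def tokenize(text):
    """Very rough tokenizer: split off a few punctuation marks, lowercase, split on spaces."""
    for mark in ".,!?":
        text = text.replace(mark, f" {mark} ")
    return text.lower().split()


def ngrams(tokens, n):
    """Return every run of n adjacent tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def one_hot(word, vocabulary):
    """Vector with one position per vocabulary word; only the word's own position is 1."""
    return [1 if entry == word else 0 for entry in vocabulary]


def cosine_similarity(u, v):
    """Dot product of u and v divided by the product of their lengths."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


tokens = tokenize("The cat sat. The dog sat.")
# tokens == ['the', 'cat', 'sat', '.', 'the', 'dog', 'sat', '.']
bigrams = ngrams(tokens, 2)
vocabulary = sorted(set(tokens))

# One-hot vectors of two different words never overlap, so their similarity is 0.
cat_one_hot = one_hot("cat", vocabulary)
dog_one_hot = one_hot("dog", vocabulary)
one_hot_similarity = cosine_similarity(cat_one_hot, dog_one_hot)

# Hypothetical dense vectors (invented numbers, not trained embeddings).
cat_dense = [0.9, 0.1]
dog_dense = [0.8, 0.2]
dense_similarity = cosine_similarity(cat_dense, dog_dense)

print(bigrams[:3])
print(one_hot_similarity, dense_similarity)
```

The comparison at the end hints at why word embeddings are useful: one-hot vectors treat every pair of distinct words as equally unrelated, while dense vectors can place related words close together.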

After finishing learning about natural language processing please click Topic 25 – Introduction to Blockchain to continue.

 

 

Topic 23 – Introduction to Computer Vision

Why do I need to learn about computer vision?

Computer vision has advanced rapidly in recent years. Image recognition and autonomous driving are examples of breakthrough achievements in the field.

Nowadays a key skill that is often required from a software developer is the ability to use computer vision algorithms and tools to solve real-world problems related to images and videos.

What can I do after finishing learning about applied computer vision?

You will be able to create software that can recognize a face or transform a picture of a person from a young age to an old age.

That sounds fun! What should I do now?

Please read
– this Rafael C. Gonzalez and Richard E. Woods (2018). Digital Image Processing. 4th Edition. Pearson book, and
– this Richard Szeliski (2022). Computer Vision: Algorithms and Applications. Springer book.

After that please
– audit these Deep Learning Specialization courses and
– read this Francois Chollet (2021). Deep Learning with Python. Manning Publications book at the same time.

After that please read this Ian Goodfellow et al. (2016). Deep Learning. The MIT Press book.

Terminology Review:

  • Deep Learning.
  • Artificial Neural Networks.
  • Convolutional Neural Networks (CNNs).
  • Object Detection.
  • Face Recognition.
  • YOLO Algorithm.
  • Neural Style Transfer.
  • Generative Adversarial Networks (GANs).

After finishing learning about computer vision please click Topic 24 – Introduction to Natural Language Processing to continue.

 

 

Topic 22 – Introduction to Machine Learning

Why do I need to learn about machine learning?

Machine learning has recently solved many important and difficult problems, including speech recognition, speech synthesis, image recognition, autonomous driving and chatbots.

Nowadays a key skill of a software developer is the ability to use machine learning algorithms to solve real-world problems.

What can I do after finishing learning about applied machine learning?

You will be able to create software that can recognize a car’s license plate number from an image or estimate a patient’s probability of breast cancer.

That sounds useful! What should I do now?

Please audit
– these Machine Learning Specialization (Coursera) courses and
– this Applied Machine Learning in Python (Coursera) course.

At the same time, please read
– this Aurelien Geron (2022). Hands-On Machine Learning with Scikit-Learn, Keras and TensorFlow. O’Reilly Media book,
– this Brett Lantz (2019). Machine Learning with R – Expert Techniques for Predictive Modeling. Packt Publishing book, and
– this Michael A. Nielsen (2015). Neural Networks and Deep Learning. Determination Press book.

After that please watch
– this MIT 6.034 – Artificial Intelligence, Fall 2010 course (Readings).

After that please read
– this Tom M. Mitchell (1997). Machine Learning. McGraw-Hill Education book, and
– this Christopher M. Bishop (2006). Pattern Recognition and Machine Learning. Springer book.

Terminology Review:

  • Artificial Intelligence.
  • Machine Learning.
  • Deep Learning.
  • Linear Regression: Y = θX + ε.
  • Cost Function: measures how well or badly a model fits the data.
  • Mean Squared Error (MSE): the average of the squares of the errors.
  • Gradient Descent, Learning Rate.
  • Batch Gradient Descent.
  • R-Squared (Coefficient of Determination): the proportion of the total variance in the output (y) that can be explained by the variation in x. It indicates how well a model fits the given data.
  • Stochastic Gradient Descent.
  • Mini-Batch Gradient Descent.
  • Overfitting: a machine learning model gives accurate predictions for training data but not for new data.
  • Regularization: Ridge Regression, Lasso Regression, Elastic Net, Early Stopping.
  • Logistic Regression.
  • Sigmoid Function.
  • Binary Cross Entropy Loss Function, Log Loss Function.
  • One Hot Encoding.
  • The Softmax Function takes an N-dimensional vector of arbitrary real values and produces another N-dimensional vector with real values in the range (0, 1) that add up to 1.0.
  • Softmax Regression.
  • Support Vector Machines.
  • Decision Trees.
  • K-Nearest Neighbors.
  • McCulloch-Pitts Neuron.
  • Linear Threshold Unit: with threshold T, it calculates the weighted sum of its inputs, and then outputs 0 if this sum is less than T, and 1 if the sum is greater than or equal to T.
  • Perceptron.
  • Activation Functions: Sigmoid, Hyperbolic Tangent, Rectified Linear Unit (ReLU).
  • Artificial Neural Networks.
  • Backpropagation.
  • Gradient Descent Optimization Algorithms: Momentum, Adagrad, Adadelta, RMSprop, Adam.
  • Regularization: Dropout.
  • K-Means.
  • Principal Component Analysis.
  • User-Based Collaborative Filtering.
  • Item-Based Collaborative Filtering.
  • Matrix Factorization.
  • The Joint Probability Table.
  • Bayesian Networks.
  • Naive Bayes Inference.
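
Three of the definitions above (mean squared error, the softmax function and the linear threshold unit) are short enough to write out directly. The following is a minimal sketch in plain Python, not a library implementation: the function names, the sample numbers and the AND-gate weights are invented for illustration, and the sketch assumes that a weighted sum exactly equal to the threshold T produces an output of 1.

```python
import math


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def softmax(scores):
    """Map arbitrary real scores to values in (0, 1) that add up to 1."""
    largest = max(scores)
    # Subtracting the largest score does not change the result
    # but keeps math.exp from overflowing on big inputs.
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear_threshold_unit(inputs, weights, threshold):
    """Output 1 if the weighted sum reaches the threshold, otherwise 0.

    Assumption of this sketch: a sum exactly equal to the threshold outputs 1.
    """
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return 1 if weighted_sum >= threshold else 0


# Errors are 1, 0 and -2, so the MSE is (1 + 0 + 4) / 3.
mse = mean_squared_error([3, 5, 7], [2, 5, 9])

# Larger scores receive larger probabilities.
probabilities = softmax([2.0, 1.0, 0.1])

# With weights (1, 1) and threshold 2 the unit behaves like a logical AND.
and_outputs = [
    linear_threshold_unit([a, b], [1, 1], 2)
    for a in (0, 1)
    for b in (0, 1)
]

print(mse)
print(probabilities, sum(probabilities))
print(and_outputs)  # [0, 0, 0, 1]
```

Changing the threshold from 2 to 1 in the last example turns the same unit into a logical OR, which is a quick way to see how the weights and threshold together decide what a single neuron computes.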

After finishing learning about machine learning please click Topic 23 – Introduction to Computer Vision to continue.