
Topic 24 – Introduction to Natural Language Processing

Why do I need to learn about natural language processing?

Natural language processing (NLP) has become more and more interesting. Speech recognition, speech synthesis, autonomous driving and chat bots are examples of breakthrough achievements in the field.

Nowadays, a key skill for a software developer is the ability to use natural language processing algorithms and tools to solve real-world problems related to text, audio, natural language sentences, and speech.

What can I do after finishing learning about natural language processing?

You will be able to create software that can recognize speech, convert text to speech, translate a sentence from English to French, or answer a customer’s question.

That sounds fun! What should I do now?

Please read
– this Daniel Jurafsky and James H. Martin (2014). Speech and Language Processing. Pearson book, and
– this Christopher D. Manning and Hinrich Schütze (1999). Foundations of Statistical Natural Language Processing. MIT Press book first.

After that please audit these Natural Language Processing Specialization courses and this Stanford CS224N – NLP with Deep Learning, Winter 2023 course (Lecture Notes).

Terminology Review:

  • Natural Language Processing.
  • Text Classification (e.g. Spam Detection).
  • Named Entity Recognition.
  • Chatbots.
  • Speech Processing.
  • Speech Recognition.
  • Speech Synthesis.
  • Machine Translation.
  • Corpus: A body of texts.
  • Token: a word or a number or a punctuation mark.
  • Collocation: compounds (e.g. disk drive), phrasal verbs (e.g. make up), and other stock phrases (e.g. bacon and eggs).
  • Unigram: a single word.
  • Bigram: a sequence of two consecutive words.
  • Trigram: a sequence of three consecutive words.
  • N-gram: a sequence of n consecutive words.
  • Hypothesis Testing.
  • t-Test.
  • Likelihood Ratios.
  • Language Model: statistical model of word sequences.
  • Naive Bayes.
  • Hidden Markov Models.
  • Bag-of-Words Model.
  • Term Frequency–Inverse Document Frequency (TF–IDF).
  • Bag-of-n-Grams.
  • One-Hot Representation: You have a vocabulary of n words and you represent each word using a vector that is n bits long, in which all bits are zero except for one bit that is set to 1.
  • Word Embedding (Featurized Representation): the transformation from words to dense vectors.
  • Euclidean Distance, Dot Product Similarity, Cosine Similarity.
  • Embedding Matrix.
  • Neural Language Model.
  • Word2Vec: Skip-Gram Model, Continuous Bag-of-Words (CBOW) Model.
  • Negative Sampling.
  • GloVe, Global Vectors.
  • Recurrent Neural Networks.
  • Backpropagation Through Time.
  • Recurrent Neural Net Language Model (RNNLM).
  • Gated Recurrent Unit (GRU).
  • Long Short Term Memory (LSTM).
  • Bidirectional RNN.
  • Deep RNNs.
  • Sequence to Sequence Model.
  • Teacher Forcing.
  • Image Captioning.
  • Greedy Search.
  • Beam Search, Length Normalization.
  • BLEU (BiLingual Evaluation Understudy) Score.
  • ROUGE (Recall-Oriented Understudy for Gisting Evaluation) Score.
  • F1 Score.
  • Minimum Bayes-Risk.
  • Attention Mechanism.
  • Self-Attention (Scaled Dot-Product Attention): Queries, Keys and Values.
  • Positional Encoding.
  • Masked Self-Attention.
  • Multi-Head Attention.
  • Residual Dropout.
  • Label Smoothing.
  • Transformer Encoder.
  • Transformer Decoder.
  • Transformer Encoder-Decoder.
  • Cross-Attention.
  • Byte Pair Encoding.
  • BERT (Bidirectional Encoder Representations from Transformers).
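
A few of the terms above (token, n-gram, one-hot representation, bag-of-words, cosine similarity) can be illustrated with a short Python sketch. The two example sentences and all function names are made up for illustration, and the whitespace tokenizer is a simplifying assumption; real tokenizers also split off punctuation.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def one_hot(word, vocabulary):
    """Vector of len(vocabulary) zeros with a single 1 at the word's index."""
    return [1 if w == word else 0 for w in vocabulary]


def bag_of_words(tokens, vocabulary):
    """Count how often each vocabulary word occurs in the token list."""
    counts = Counter(tokens)
    return [counts[w] for w in vocabulary]


def cosine_similarity(a, b):
    """Dot product divided by the product of the two vector lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Two tiny example "documents" (hypothetical data).
doc_a = tokenize("the cat sat on the mat")
doc_b = tokenize("the cat lay on the rug")
vocabulary = sorted(set(doc_a) | set(doc_b))

bigrams_a = ngrams(doc_a, 2)             # first bigram is ('the', 'cat')
vec_a = bag_of_words(doc_a, vocabulary)  # word counts for document A
vec_b = bag_of_words(doc_b, vocabulary)  # word counts for document B
similarity = cosine_similarity(vec_a, vec_b)
print(bigrams_a)
print(one_hot("cat", vocabulary))
print(similarity)
```

Bag-of-words vectors ignore word order, which is why the two sentences score as fairly similar even though they describe different events; bag-of-n-grams and embeddings are ways of recovering some of that lost information.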

After finishing learning about natural language processing please click Topic 25 – Introduction to Distributed Systems to continue.

 

 

Topic 23 – Introduction to Computer Vision

Why do I need to learn about computer vision?

Computer vision has become more and more interesting. Image recognition, autonomous driving, and disease detection are examples of breakthrough achievements in the field.

Nowadays a key skill that is often required from a software developer is the ability to use computer vision algorithms and tools to solve real-world problems related to images and videos.

What can I do after finishing learning about applied computer vision?

You will be able to create software that can recognize a face or transform a picture of a young person into a picture of an old person.

That sounds fun! What should I do now?

Please read
– this Rafael C. Gonzalez and Richard E. Woods (2018). Digital Image Processing. 4th Edition. Pearson book, and
– this Richard Szeliski (2022). Computer Vision: Algorithms and Applications. Springer book.

At the same time, please
– audit these Deep Learning Specialization courses and
– read this Francois Chollet (2021). Deep Learning with Python. Manning Publications book, and
– this Michael A. Nielsen (2015). Neural Networks and Deep Learning. Determination Press book.

After that please read this David Foster (2023). Generative Deep Learning – Teaching Machines To Paint, Write, Compose, and Play. O’Reilly Media book.

After that please read this Ian Goodfellow et al. (2016). Deep Learning. The MIT Press book.

Terminology Review:

  • Digital Image: f(x, y)
  • Intensity (Gray Level): ℓ = f(x, y)
  • Gray Scale: ℓ = 0 is considered black and ℓ = L – 1 is considered white.
  • Quantization: Digitizing the amplitude values.
  • Sampling: Digitizing the coordinate values.
  • Representing Digital Images: Matrix or Vector.
  • Pixel or Picture Element: An element of matrix or vector.
  • Deep Learning.
  • Artificial Neural Networks.
  • Filter: 2-dimensional matrix commonly square in size containing weights shared all over the input space.
  • The Convolution Operation: Element-wise multiply, and add the outputs.
  • Stride: Filter step size.
  • Padding.
  • Upsampling: Nearest Neighbors, Linear Interpolation, Bilinear Interpolation.
  • Max Pooling, Average Pooling, Min Pooling.
  • Convolutional Layers.
  • Feature Maps.
  • Convolutional Neural Networks (CNNs).
  • Object Detection.
  • Face Recognition.
  • YOLO Algorithm.
  • Latent Variable.
  • Autoencoders.
  • Variational Autoencoders.
  • Generators.
  • Discriminators.
  • Binary Cross Entropy Loss Function, Log Loss Function.
  • Generative Adversarial Networks (GANs).
  • Mode Collapse.
  • CycleGAN.
  • Neural Style Transfer.
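
The filter, convolution, stride, padding and max pooling entries above can be made concrete with a small pure-Python sketch. It assumes a square filter and zero padding, and, like most deep learning libraries, it does not flip the filter (strictly speaking this is cross-correlation). The 4×4 image and the 2×2 filter are made-up values.

```python
def pad(image, p):
    """Surround the image with p rows and p columns of zeros."""
    width = len(image[0]) + 2 * p
    rows = [[0] * width for _ in range(p)]
    for row in image:
        rows.append([0] * p + list(row) + [0] * p)
    rows.extend([0] * width for _ in range(p))
    return rows


def convolve2d(image, kernel, stride=1, padding=0):
    """Slide the filter over the image; multiply element-wise and add the results."""
    image = pad(image, padding)
    k = len(kernel)  # assumes a square k x k filter
    out_h = (len(image) - k) // stride + 1
    out_w = (len(image[0]) - k) // stride + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for u in range(k):
                for v in range(k):
                    total += image[i * stride + u][j * stride + v] * kernel[u][v]
            row.append(total)
        output.append(row)
    return output


def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    return [
        [
            max(feature_map[i + u][j + v] for u in range(size) for v in range(size))
            for j in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for i in range(0, len(feature_map) - size + 1, size)
    ]


image = [
    [1, 2, 3, 0],
    [4, 5, 6, 1],
    [7, 8, 9, 2],
    [0, 1, 2, 3],
]
kernel = [
    [1, 0],
    [0, -1],
]

feature_map = convolve2d(image, kernel)        # 3 x 3 output
strided = convolve2d(image, kernel, stride=2)  # 2 x 2 output
padded = convolve2d(image, kernel, padding=1)  # 5 x 5 output
pooled = max_pool(image, size=2)               # 2 x 2 output
print(feature_map, strided, pooled)
```

The output size follows (input + 2 × padding − filter) // stride + 1 in each dimension, which is why stride shrinks the feature map and padding enlarges it.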

After finishing learning about computer vision please click Topic 24 – Introduction to Natural Language Processing to continue.

 

 

How to Discover Requirements

Problem:

You want to quickly capture and analyze requirements for a project developed using the Scrum or Kanban method; however, misunderstandings happen too frequently among your team members.

Suggestion:
  1. Define the terminology (terms) and get agreement on it.
  2. Decompose a user story into an end-to-end workflow with screenshots or mock-ups.
  3. Define test scenarios for a user story.
  4. Use a tool such as Confluence pages for documenting and clarifying user stories.
  5. If the problem persists, try elaborating a user story into a use case, and/or a flow chart, and/or a domain model, and/or a mind map, and/or a sequence diagram.
  6. If possible, always use face-to-face meetings for communication.

Software Configuration Management Example

Context:

  • Google Play requires you to update your app to target Android 12 (API level 31).
  • Your application was developed 7 years ago and there have been no updates since then.
  • Your application was developed with cocos2d-x 3.13.1 (sadly, its development has been discontinued), Java 8, Python 2.7, Android SDK 7.0 (API level 24), tools_r25.2.5, ndk-r13 and ant 1.9.4.

Solution:

1. You tried installing all the tools and compiling the code. Luckily no problem happened.

2. You checked Java and Gradle compatibility. You decided to stay with Java 8 because cocos2d-x works best with Java 8. This meant that your Gradle version had to be lower than 4.3, so you selected Gradle 4.1.

3. You checked Gradle and Android Gradle plugin compatibility. You realized that you could only use Android Gradle plugin version 3.0.0 or lower, so you selected version 3.0.0.

4. You searched for “gradle-wrapper.properties” and updated Gradle to version 4.1.

#distributionUrl=https\://services.gradle.org/distributions/gradle-2.4-all.zip
distributionUrl=https\://services.gradle.org/distributions/gradle-4.1-all.zip
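
After an edit like this, a quick way to confirm which Gradle version the wrapper will actually download is to read it back from the file. The sketch below is only an illustration: the function names are hypothetical, and it assumes the distribution file follows the gradle-<version>-all.zip or gradle-<version>-bin.zip naming seen above. Commented-out lines are skipped.

```python
import re


def gradle_version(properties_text):
    """Return the version named by the active distributionUrl line, or None."""
    for line in properties_text.splitlines():
        line = line.strip()
        # Skip comments and unrelated keys.
        if line.startswith("#") or not line.startswith("distributionUrl"):
            continue
        match = re.search(r"gradle-(\d+(?:\.\d+)*)-(?:all|bin)\.zip", line)
        if match:
            return match.group(1)
    return None


def version_tuple(version):
    """Turn '4.1' into (4, 1) so that versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


# The same two lines as in gradle-wrapper.properties above.
sample = (
    "#distributionUrl=https\\://services.gradle.org/distributions/gradle-2.4-all.zip\n"
    "distributionUrl=https\\://services.gradle.org/distributions/gradle-4.1-all.zip\n"
)

version = gradle_version(sample)
print(version)
# The limit chosen in step 2: the Gradle version must be lower than 4.3.
print(version_tuple(version) < (4, 3))
```

Comparing tuples of integers rather than strings avoids mistakes such as treating "4.10" as lower than "4.3".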

5. You searched for “build.gradle” and updated Android Gradle plugin to version 3.0.0.

buildscript {
    repositories {
        jcenter()
        maven {
            url "https://maven.google.com"
        }
    }
    dependencies {
        // classpath 'com.android.tools.build:gradle:1.3.0'
        classpath 'com.android.tools.build:gradle:3.0.0'
        ...

6. You set your app targetSdkVersion to 31, created APK file and uploaded it to Google Play.

7. Google Play complained about the “android:exported” property for activities with an “intent-filter”. You added the “android:exported” property to your AndroidManifest.xml but the problem still persisted.

8. You suspected that Google Play Services library caused the issue because your app uses Firebase Ads that depends on Google Play Services library. So you tried removing Firebase Ads and you were happy when Google Play did not complain anymore.

9. Then you decided to add Google AdMob SDK to your app to replace Firebase Ads. You set your minSdkVersion to 19 and your targetSdkVersion to 33, and followed the instructions until you got many build errors with play-services-ads:22.4.0.

10. You tried using the latest SDK tools (26.1.1) but you got the “The ‘android’ command is deprecated.” issue. cocos2d-x development had been discontinued, so you had to revert to SDK tools 25.2.5.

11. You tried downgrading play-services-ads to a lower version by trial and error, and luckily you found that version 17.2.1 worked.

12. You created the APK file, uploaded it to Google Play, and sent it for review. Google Play complained that your app must include both 64-bit and 32-bit native code.

13. Luckily, you could build cocos2d-x 3.13.1 for 64-bit architectures by making three changes: in “Application.mk”, set APP_ABI := armeabi-v7a arm64-v8a; in “gradle.properties”, set PROP_APP_ABI=armeabi-v7a:arm64-v8a; and in “build.gradle”, set ndk.abiFilters 'armeabi-v7a', 'arm64-v8a' under defaultConfig. You also noticed that cocos2d-x 3.13.1 does not support the x86_64 architecture, but at least you were finally able to publish your app to Google Play.
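
Before uploading again, it can save a review round-trip to check locally that the APK really contains native libraries for both ABIs. An APK is a zip archive that stores native code under lib/<abi>/, so a minimal sketch is possible with Python's zipfile module. The function names and the file name app-release.apk are hypothetical.

```python
import zipfile

# The two ABIs configured in step 13.
REQUIRED_ABIS = {"armeabi-v7a", "arm64-v8a"}


def abis_in_apk(apk):
    """Return the ABI folder names under lib/ that contain at least one .so file.

    `apk` may be a file path or a binary file-like object.
    """
    abis = set()
    with zipfile.ZipFile(apk) as archive:
        for name in archive.namelist():
            parts = name.split("/")
            # Native libraries are stored as lib/<abi>/<name>.so
            if len(parts) == 3 and parts[0] == "lib" and parts[2].endswith(".so"):
                abis.add(parts[1])
    return abis


def missing_abis(apk, required=REQUIRED_ABIS):
    """Return the required ABIs that have no native library in the APK."""
    return set(required) - abis_in_apk(apk)


# Example usage (hypothetical file name):
# print(missing_abis("app-release.apk"))  # an empty set means both ABIs are present
```

An empty result only shows that the libraries are packaged; it does not prove that they load correctly on a 64-bit device.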

Lessons learned:

1. Select mature and long-term support tools for your application development.

2. Always package the tools that you used together with the application source code.

3. Always document source code compilation steps and configuration setting locations.

4. Regularly update your application if possible.

How to Set File Permissions for ASP.NET Website on Windows

Problem:

  • You have an ASP.NET website on Windows.
  • Your website application pool name is mysite.com.
  • Your website physical location is D:\inetpub\wwwroot\mysite.com.
  • Your website physical data location is D:\mysite_data.
  • Your website users cannot upload or modify website files.
  • Your website users cannot upload or modify website data files.

Solution:

1. Open cmd.exe as Administrator and execute the commands below.

icacls "D:\inetpub\wwwroot\mysite.com" /grant "IIS AppPool\mysite.com":(OI)(CI)F /T
icacls "D:\mysite_data" /grant "IIS AppPool\mysite.com":(OI)(CI)F /T

These commands grant the mysite.com application pool identity full control of D:\inetpub\wwwroot\mysite.com and D:\mysite_data, including all their sub-directories and files.

2. Alternatively, you can execute the commands below.

icacls "D:\inetpub\wwwroot\mysite.com" /grant IIS_IUSRS:F /t
icacls "D:\mysite_data" /grant IIS_IUSRS:F /t

These commands grant the IIS_IUSRS group full control of D:\inetpub\wwwroot\mysite.com and D:\mysite_data, including all their sub-directories and files.

The mysite.com application pool identity is a member of the IIS_IUSRS group.
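
If you manage several sites, the same icacls calls can be generated from a script. The sketch below only builds and prints the command lines by default; the function names and the dry_run switch are illustrative assumptions. In the permission string, (OI) and (CI) make files and sub-directories created later inherit the entry, F means full control, and /T applies the change to everything that already exists under the path.

```python
import subprocess


def icacls_grant_command(path, identity, rights="F", inherit=True):
    """Build the argument list for: icacls <path> /grant <identity>:<perms> /T"""
    perms = ("(OI)(CI)" if inherit else "") + rights
    return ["icacls", path, "/grant", f"{identity}:{perms}", "/T"]


def grant(path, identity, rights="F", dry_run=True):
    """Print the command, or run it when dry_run is False (Windows, elevated prompt)."""
    command = icacls_grant_command(path, identity, rights)
    if dry_run:
        print(subprocess.list2cmdline(command))
        return None
    return subprocess.run(command, check=True)


# The two folders and the application pool identity from the problem statement.
for folder in (r"D:\inetpub\wwwroot\mysite.com", r"D:\mysite_data"):
    grant(folder, r"IIS AppPool\mysite.com")
```

Passing the arguments as a list avoids quoting mistakes around the space in "IIS AppPool".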

 

How to Restart a Windows Server in a Domain Remotely

Problem:

You need to restart a Windows server in a domain remotely because it seems to be stuck and you cannot remotely connect to it via RDP.

Solution:

1. Open cmd.exe and execute the two commands below.

ping server_name
shutdown /r /m \\server_name /t 0

2. You may get the error below.

server_name: A system shutdown is in progress.(1115)

3. Download and extract PSTools.

4. Open cmd.exe and execute the three commands below.

cd C:\Users\admin\Downloads\PSTools
pskill \\server_name winlogon
pskill \\server_name TrustedInstaller

 

How to Be Creative in Software Engineering Research

Motivation:

You plan to do software engineering research and want to make some minor authentic contributions.

Guidelines:

1. You may present a problem and its corresponding existing solution in your own words.

A method to do this is to

  • Read papers and books about a concept (for example backpropagation or distributed transactions), then
  • Write down the concept and some related terminologies, then
  • Try to explain the concept aloud, with examples, and
  • Record your presentation, then
  • Write down your transcript, then
  • Rephrase your transcript.

2. You may try to replicate an existing result. When doing this you may need to make minor changes due to specific technology or environment conditions. Then you can compare your result with the original result.

For example, you may compare your business workflows with existing business workflows to determine which solution may solve a specific problem faster or more reliably.

Even if you do not make any changes, the replication process may still inspire some technical ideas. You may get errors while replicating the result. Try to fix these errors and document your experience.

For example you may get errors when upgrading an existing system from Node.js 12 to Node.js 18, or when upgrading an existing deep learning model code from Python 3.9 to Python 3.11. Try to fix the errors, then document your inputs, errors and solution.

3. The core idea of being creative is to do something that you have not done before. You may use the trial and error method, but be sure that you have a hard unsolved problem first. Searching for partial solutions to a problem will inspire ideas that may be the starting point for your minor authentic contributions.

4. Each individual’s creativity will need to be developed over time rather than in accordance with any kind of set formula.

 

How to Set File Permissions for WordPress on Ubuntu

Motivation:

  • You have a WordPress instance on Ubuntu with Nginx.
  • You want to ensure that only the Nginx process can modify WordPress files.

Procedure:

1. View the current file owner and group:

ls -l /var/html

The root folder should be owned by the www-data user. www-data is the user that web servers such as Apache and Nginx on Ubuntu use by default for their normal operation.

2. Change file owner and group to www-data if necessary:

sudo chown -R www-data:www-data /var/html

3. Set minimum permissions for folders:

cd /var/html
sudo find . -type d -exec chmod 755 {} \; # directory permissions rwxr-xr-x

4. Set minimum permissions for files:

cd /var/html
sudo find . -type f -exec chmod 644 {} \; # file permissions rw-r--r--

5. Verify the changes:

ls -l /var/html
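
Steps 3 and 4 can also be expressed as one small Python script, which may be handy if you want to run the same check from a deployment tool. This is a sketch rather than a replacement for the find commands: the function names are hypothetical, symbolic links are skipped (as find -type d and -type f do), and changing ownership (step 2) is left to chown.

```python
import os
import stat

DIR_MODE = 0o755   # rwxr-xr-x
FILE_MODE = 0o644  # rw-r--r--


def set_permissions(root, dir_mode=DIR_MODE, file_mode=FILE_MODE):
    """Apply dir_mode to root and every directory below it, file_mode to every file."""
    os.chmod(root, dir_mode)
    for current, dirnames, filenames in os.walk(root):
        for name in dirnames:
            path = os.path.join(current, name)
            if not os.path.islink(path):
                os.chmod(path, dir_mode)
        for name in filenames:
            path = os.path.join(current, name)
            if not os.path.islink(path):
                os.chmod(path, file_mode)


def mode_of(path):
    """Return only the permission bits of a path, for example 0o644."""
    return stat.S_IMODE(os.stat(path).st_mode)


# Example usage (run with sufficient privileges):
# set_permissions("/var/html")
# print(oct(mode_of("/var/html")))
```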

How to Back Up and Restore IIS App Pool and Site Settings

Motivation:

You want to back up and restore IIS app pool and site settings.

Solution:

1. Open Command Prompt (Admin) and execute the commands below to export IIS app pool and site settings to XML files.

%windir%\system32\inetsrv\appcmd list apppool /config /xml > \\192.168.101.140\BCDR\WEBBC02\E\inetpub\apppools.xml
%windir%\system32\inetsrv\appcmd list site /config /xml > \\192.168.101.140\BCDR\WEBBC02\E\inetpub\sites.xml

2. Remove all existing IIS app pools and sites, then execute the commands below to import IIS app pool and site settings from the XML files.

%windir%\system32\inetsrv\appcmd add apppool /in < \\192.168.101.140\BCDR\WEBBC02\E\inetpub\apppools.xml
%windir%\system32\inetsrv\appcmd add site /in < \\192.168.101.140\BCDR\WEBBC02\E\inetpub\sites.xml
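
Because step 2 starts by removing the existing app pools and sites, it may be worth confirming first that the exported files are well-formed XML and not empty. The sketch below makes no assumption about the element names appcmd writes; it only counts the entries directly under the root element. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def count_exported_entries(source):
    """Parse an exported XML file and return how many entries sit under the root.

    `source` may be a file path or a binary file-like object.
    """
    root = ET.parse(source).getroot()
    return len(root)


def is_usable_export(source):
    """True if the export parses as XML and contains at least one entry."""
    try:
        return count_exported_entries(source) > 0
    except ET.ParseError:
        # Covers empty or truncated files, e.g. after a failed redirect.
        return False


# Example usage with the export paths from step 1:
# is_usable_export(r"\\192.168.101.140\BCDR\WEBBC02\E\inetpub\apppools.xml")
# is_usable_export(r"\\192.168.101.140\BCDR\WEBBC02\E\inetpub\sites.xml")
```

A passing check does not guarantee that the import will succeed, but it catches the most damaging case: discovering an empty backup file after the live configuration has already been removed.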