Why do I need to learn about computer vision?
Computer vision is a rapidly advancing field. Image recognition, autonomous driving, and disease detection are examples of its breakthrough achievements.
Software developers are now often expected to be able to use computer vision algorithms and tools to solve real-world problems involving images and videos.
What can I do after finishing learning about applied computer vision?
You will be able to create software that can recognize a face or transform a picture of a young person into a picture of an old person.
That sounds fun! What should I do now?
Please read
– Rafael C. Gonzalez and Richard E. Woods (2018). Digital Image Processing. 4th Edition. Pearson, and
– Richard Szeliski (2022). Computer Vision: Algorithms and Applications. Springer.
At the same time, please
– audit the Deep Learning Specialization courses,
– read Francois Chollet (2021). Deep Learning with Python. Manning Publications, and
– read Michael A. Nielsen (2015). Neural Networks and Deep Learning. Determination Press.
After that, please read David Foster (2023). Generative Deep Learning – Teaching Machines To Paint, Write, Compose, and Play. O’Reilly Media.
Finally, please read Ian Goodfellow et al. (2016). Deep Learning. The MIT Press.
Terminology Review:
- Digital Image: f(x, y)
- Intensity (Gray Level): ℓ = f(x, y)
- Gray Scale: ℓ = 0 is considered black and ℓ = L - 1 is considered white.
- Quantization: Digitizing the amplitude values.
- Sampling: Digitizing the coordinate values.
- Representing Digital Images: Matrix or Vector.
- Pixel or Picture Element: An element of matrix or vector.
- Deep Learning.
- Artificial Neural Networks.
- Filter: A 2-dimensional matrix, commonly square, containing weights that are shared across the whole input space.
- The Convolution Operation: Multiply the filter and the input window element-wise, then add up the products.
- Stride: Filter step size.
- Padding.
- Upsampling: Nearest Neighbors, Linear Interpolation, Bilinear Interpolation.
- Max Pooling, Average Pooling, Min Pooling.
- Convolutional Layers.
- Feature Maps.
- Convolutional Neural Networks (CNNs).
- Object Detection.
- Face Recognition.
- YOLO Algorithm.
- Latent Variable.
- Autoencoders.
- Variational Autoencoders.
- Generators.
- Discriminators.
- Binary Cross Entropy Loss Function, Log Loss Function.
- Generative Adversarial Networks (GANs).
- Mode Collapse.
- Earth Mover’s Distance.
- Wasserstein Loss (W-Loss).
- 1-Lipschitz Continuous Function.
- CycleGAN.
- Neural Style Transfer.
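To make the filter, convolution, stride, padding, and max pooling terms above more concrete, here is a minimal sketch in plain Python. The function names (pad2d, conv2d, max_pool2d) and the sample image and filter are made up for illustration and do not come from any library. Like most deep learning libraries, the sketch does not flip the filter, so strictly speaking it computes a cross-correlation.

```python
def pad2d(image, pad):
    """Surround a 2-D list with `pad` rows and columns of zeros."""
    width = len(image[0]) + 2 * pad
    padded = [[0] * width for _ in range(pad)]
    for row in image:
        padded.append([0] * pad + list(row) + [0] * pad)
    padded.extend([0] * width for _ in range(pad))
    return padded


def conv2d(image, kernel, stride=1, pad=0):
    """Slide the filter over the image; multiply element-wise and sum each window."""
    x = pad2d(image, pad)
    kh, kw = len(kernel), len(kernel[0])
    # Output size: floor((input + 2 * pad - filter) / stride) + 1
    out_h = (len(x) - kh) // stride + 1
    out_w = (len(x[0]) - kw) // stride + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for a in range(kh):
                for b in range(kw):
                    total += x[i * stride + a][j * stride + b] * kernel[a][b]
            row.append(total)
        out.append(row)
    return out


def max_pool2d(feature_map, size=2, stride=2):
    """Keep the largest value in each size x size window."""
    out_h = (len(feature_map) - size) // stride + 1
    out_w = (len(feature_map[0]) - size) // stride + 1
    return [
        [
            max(
                feature_map[i * stride + a][j * stride + b]
                for a in range(size)
                for b in range(size)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Hypothetical 4 x 4 gray-level image and 2 x 2 filter, chosen only for illustration.
image = [
    [1, 2, 3, 0],
    [4, 5, 6, 1],
    [7, 8, 9, 2],
    [0, 1, 2, 3],
]
kernel = [
    [1, 0],
    [0, -1],
]

feature_map = conv2d(image, kernel, stride=1, pad=0)   # 3 x 3 feature map
pooled = max_pool2d(feature_map, size=2, stride=1)     # 2 x 2 after pooling
strided = conv2d(image, kernel, stride=2, pad=1)       # 3 x 3 with padding and stride 2

print(feature_map)  # [[-4, -4, 2], [-4, -4, 4], [6, 6, 6]]
print(pooled)       # [[-4, 4], [6, 6]]
```

In practice you would use a library such as NumPy, PyTorch, or TensorFlow instead of nested loops; the loops are written out here only so that each term in the list maps to a visible line of code.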
After you finish learning about computer vision, please click Topic 24 – Introduction to Natural Language Processing to continue.