10. Computer Vision
Many computer vision applications are closely tied to our daily lives, now and in the future: medical diagnostics, driverless vehicles, camera monitoring, and smart filters, to name a few. In recent years, deep learning has greatly improved the performance of computer vision systems, and today the most advanced computer vision applications are nearly inseparable from it.
In the chapter “Convolutional Neural Networks”, we introduced deep learning models commonly used in computer vision and practiced simple image classification tasks. In this chapter, we first introduce image augmentation and fine tuning and apply them to image classification. We then explore several methods of object detection and learn how to use fully convolutional networks to perform semantic segmentation on images. Next, we explain how to use style transfer to generate images that look like the cover of this book. Finally, we work through practice exercises on two important computer vision data sets to review the content of this chapter and the previous chapters.
- 10.1. Image Augmentation
- 10.2. Fine Tuning
- 10.3. Object Detection and Bounding Boxes
- 10.4. Anchor Boxes
- 10.5. Multiscale Object Detection
- 10.6. Object Detection Data Set (Pikachu)
- 10.7. Single Shot Multibox Detection (SSD)
- 10.8. Region-based CNNs (R-CNNs)
- 10.9. Semantic Segmentation and Data Sets
- 10.10. Fully Convolutional Networks (FCN)
- 10.11. Neural Style Transfer
  - 10.11.1. Technique
  - 10.11.2. Read the Content and Style Images
  - 10.11.3. Preprocessing and Postprocessing
  - 10.11.4. Extract Features
  - 10.11.5. Define the Loss Function
  - 10.11.6. Create and Initialize the Composite Image
  - 10.11.7. Training
  - 10.11.8. Summary
  - 10.11.9. Exercises
  - 10.11.10. Reference
  - 10.11.11. Scan the QR Code to Discuss
- 10.12. Image Classification (CIFAR-10) on Kaggle
  - 10.12.1. Obtain and Organize the Data Sets
  - 10.12.2. Image Augmentation
  - 10.12.3. Read the Data Set
  - 10.12.4. Define the Model
  - 10.12.5. Define the Training Functions
  - 10.12.6. Train and Validate the Model
  - 10.12.7. Classify the Testing Set and Submit Results on Kaggle
  - 10.12.8. Summary
  - 10.12.9. Exercises
  - 10.12.10. Scan the QR Code to Discuss
- 10.13. Dog Breed Identification (ImageNet Dogs) on Kaggle
  - 10.13.1. Obtain and Organize the Data Sets
  - 10.13.2. Image Augmentation
  - 10.13.3. Read the Data Set
  - 10.13.4. Define the Model
  - 10.13.5. Define the Training Functions
  - 10.13.6. Train and Validate the Model
  - 10.13.7. Classify the Testing Set and Submit Results on Kaggle
  - 10.13.8. Summary
  - 10.13.9. Exercises
  - 10.13.10. Reference
  - 10.13.11. Scan the QR Code to Discuss