Each year, new deep learning algorithms arrive that are faster, smarter, and lighter than their predecessors. Find out how to focus on the most promising ones and which are worth exploring.
Hundreds of new deep learning algorithms are presented each year. Developers and engineers working in the field of AI are constantly bombarded with models that are faster, smarter, and lighter than their predecessors. How do you focus on the most promising ones, and which are worth exploring? In this article, we share our view on which ones you should follow or try.
The first algorithm worth exploring is one of the most popular in the field of computer vision: YOLO (“You Only Look Once”). It is a number one choice for real-time object detection and classification. The main idea behind the YOLO algorithm is to extract features from the input image, then feed these features through a prediction system that draws boxes around objects and finally predicts their classes. YOLOv5 outperforms not only previous YOLO versions but also Faster R-CNN. The model runs about 2.5 times faster and detects smaller objects more accurately. The results are also cleaner, with little to no overlapping boxes.
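One common way object detectors end up with few overlapping boxes is non-maximum suppression (NMS): keep the highest-scoring box and drop any lower-scoring box that overlaps it too much. The snippet below is a minimal, dependency-free sketch of that idea, not YOLOv5's actual implementation. The function names, the `(x1, y1, x2, y2)` box format, and the 0.5 threshold are assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap an already kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Hypothetical detections: the first two boxes cover the same object.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
kept = non_max_suppression(boxes, scores)
print(kept)
```

In this toy input the second box overlaps the first heavily and has a lower score, so only the first and third boxes survive.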
When looking for a picture for a presentation, we can usually find a suitable one quickly. However, many of these photos have very low resolution. A super-resolution algorithm can solve that problem. It learns to add new detail pixel by pixel, increasing the resolution and sharpening and upscaling the image without losing its content. The whole algorithm is based on Generative Adversarial Networks (GANs). Upsampling an image to twice its original size takes less than a second, and we can go up to eight times the original resolution.
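To make the scale factors concrete, here is a naive baseline that is deliberately not the GAN approach: nearest-neighbour upscaling simply repeats each pixel, so the image gets larger but gains no new detail. A learned super-resolution model aims to predict plausible detail instead. The function name `upscale_nearest` and the list-of-lists image format are assumptions made for this sketch.

```python
def upscale_nearest(image, factor):
    """Upscale a 2D grid of pixel values by an integer factor by repeating pixels."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    out = []
    for row in image:
        # Repeat every pixel `factor` times horizontally...
        wide = [px for px in row for _ in range(factor)]
        # ...and repeat the widened row `factor` times vertically.
        out.extend(list(wide) for _ in range(factor))
    return out


small = [[0, 255],
         [255, 0]]
big = upscale_nearest(small, 2)   # 2x2 -> 4x4
huge = upscale_nearest(small, 8)  # 2x2 -> 16x16
print(len(big), len(big[0]), len(huge), len(huge[0]))
```

Doubling the resolution quadruples the number of pixels, and a factor of eight multiplies it by 64, which is why the amount of detail a model has to invent grows quickly with the scale factor.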
According to Wikipedia, ‘Inpainting is a conservation process where damaged, deteriorating, or missing parts of an artwork are filled in to present a complete image. This process can be applied to both physical and digital art mediums such as oil or acrylic paintings, chemical photographic prints, 3-dimensional sculptures, or digital images and video.’
In the field of AI, generative adversarial networks are typically used for this sort of task, given their ability to “generate” new data, or in this case, the missing information.
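One common convention in digital inpainting is to mark the damaged region with a binary mask, let a generator propose content, and then keep the original pixels everywhere outside the mask. The sketch below illustrates only that final compositing step. There is no real network here: `generated` is a hand-written stand-in for generator output, and the function name `composite` is hypothetical.

```python
def composite(original, generated, mask):
    """Keep known pixels from `original`; use `generated` where mask is 1."""
    return [
        [g if m else o for o, g, m in zip(o_row, g_row, m_row)]
        for o_row, g_row, m_row in zip(original, generated, mask)
    ]


# 3x3 grayscale image whose centre pixel is missing (stored as 0).
original = [[10, 20, 30],
            [40,  0, 60],
            [70, 80, 90]]
# 1 marks the pixel to be filled in.
mask = [[0, 0, 0],
        [0, 1, 0],
        [0, 0, 0]]
# Stand-in for what a trained generator might output for the whole image.
generated = [[11, 19, 31],
             [42, 50, 58],
             [69, 81, 88]]

restored = composite(original, generated, mask)
print(restored)
```

Only the masked pixel is taken from the generated image, so the undamaged parts of the picture are left untouched.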
Thanks to AI, you can colorize black-and-white images, generate faces, or even translate lip motion to text. Most of those algorithms are based on Generative Adversarial Networks, which are among the most powerful neural network architectures available today.