CV-Tricks.com – Learn Machine Learning, AI & Computer vision

ReceiptNinja: Using Google Gemini to extract information from Retail Receipts

by Ankit Sachan • September 30, 2024

Building ReceiptNinja: An Intelligent Receipt Processing Demo App. In today’s digital-first world, managing receipts, whether physical or digital, can be a daunting task for individuals and businesses alike. Manual data entry for expense tracking or finance management is time-consuming, error-prone, and tedious. Enter ReceiptNinja, an intelligent demo application designed to automate this process by extracting key fields […]

Key Considerations for Implementing Object Detection on Edge Devices

by Ankit Sachan • August 2, 2024

When starting an object detection project, the initial focus is often on building the most accurate model possible. However, highly accurate models are usually not deployable in production scenarios due to the trade-off between accuracy and computational demands. These models tend to be resource-intensive and can be impractical and costly to deploy. Deploying models on […]

How Transformers Are Shaping the Future of Object Detection

by Ankit Sachan • July 8, 2024

The world of computer vision changed forever from 2011 onwards, when convolutional neural networks (CNNs) revolutionized object detection by providing a significant leap in accuracy and efficiency compared to earlier methods like the Viola-Jones framework, which primarily relied on handcrafted features and boosted classifiers. CNN-based models like Faster R-CNN, YOLO, and CenterNet brought about groundbreaking […]

Technical overview of Image Synthesis : Stable Diffusion

by Ankit Sachan • March 2, 2023

Text-to-image models like DALL-E, Imagen, and Stable Diffusion have recently attracted a lot of attention to image synthesis models. These models can generate impressive-looking images from benign-looking prompts. Here are a few typical examples of images from Stable Diffusion: Looking under the hood […]

MOTR: End-to-End Multi-Object Tracking with Transformers

by Ankit Sachan • January 15, 2023

MOTR is a state-of-the-art end-to-end multiple object tracker that does not require any temporal association between objects in adjacent frames. It directly outputs the tracks of objects in a sequence of input images (video). MOTR uses Deformable DETR for object detection on a single image. To understand the architecture of MOTR it […]
