ReceiptNinja: Using Google Gemini to extract information from Retail Receipts
Building ReceiptNinja: An Intelligent Receipt Processing Demo App
In today’s digital-first world, managing receipts, whether physical or digital, can be a daunting task for individuals and businesses alike. Manual data entry for expense tracking or finance management is time-consuming, error-prone, and tedious. Enter ReceiptNinja, an intelligent demo application designed to automate this process by extracting key fields […]
Key Considerations for Implementing Object Detection on Edge Devices
When starting an object detection project, the initial focus is often on building the most accurate model possible. However, highly accurate models are usually not deployable in production scenarios due to the trade-off between accuracy and computational demands. These models tend to be resource-intensive and can be impractical and costly to deploy. Deploying models on […]
How Transformers Are Shaping the Future of Object Detection
The world of computer vision changed forever from 2011 onwards, when convolutional neural networks (CNNs) revolutionized object detection by providing a significant leap in accuracy and efficiency over earlier methods like the Viola-Jones framework, which relied primarily on handcrafted features and boosted classifiers. CNN-based models like Faster R-CNN, YOLO, and CenterNet brought about groundbreaking […]
Technical Overview of Image Synthesis: Stable Diffusion
Text-to-image models like DALL-E, Imagen, and Stable Diffusion have recently attracted a lot of attention to image synthesis. These models can generate impressive-looking images from benign-looking prompts. Here are a few typical examples of images from Stable Diffusion: Looking under the hood […]
MOTR: End-to-End Multi-Object Tracking with Transformers
MOTR is a state-of-the-art end-to-end multiple-object tracker that does not require any explicit temporal association of objects between adjacent frames. It directly outputs the tracks of objects in a sequence of input images (video). MOTR uses Deformable DETR for object detection on a single image. To understand the architecture of MOTR it […]