AI OCR
BBox-guided OCR with Paddle, Qwen, and Gemma
📷 Upload an Image
Upload an image and AI OCR will detect text boxes, read each crop, and keep every result tied to its original location.
💡 How it works
PaddleOCR first dewarps the image and supplies authoritative text boxes. Qwen3VL:8b reads each rotated crop, then Gemma4:31B reviews the image and candidates to finalize text while preserving every Paddle bbox.
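The "rotated crop" idea can be pictured with a little geometry. The following is a minimal sketch, assuming each text box arrives as four corner points ordered top-left, top-right, bottom-right, bottom-left. The names `crop_geometry` and `to_crop_coords` are hypothetical and not part of PaddleOCR or this app; they only show how a tilted box can be described by an angle and a size so the crop can be straightened before it is read.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def crop_geometry(quad: Sequence[Point]) -> Tuple[float, float, float]:
    """Return (angle_degrees, width, height) of a four-point text box.

    Assumes the corner order top-left, top-right, bottom-right, bottom-left.
    """
    (x0, y0), (x1, y1), _, (x3, y3) = quad
    angle = math.degrees(math.atan2(y1 - y0, x1 - x0))  # tilt of the top edge
    width = math.hypot(x1 - x0, y1 - y0)                # length of the top edge
    height = math.hypot(x3 - x0, y3 - y0)               # length of the left edge
    return angle, width, height


def to_crop_coords(point: Point, quad: Sequence[Point]) -> Point:
    """Map an image point into the straightened crop's coordinate frame."""
    angle, _, _ = crop_geometry(quad)
    theta = math.radians(-angle)  # undo the box's tilt
    dx, dy = point[0] - quad[0][0], point[1] - quad[0][1]
    return (
        dx * math.cos(theta) - dy * math.sin(theta),
        dx * math.sin(theta) + dy * math.cos(theta),
    )
```

In the straightened frame the top-left corner lands at the origin and the top edge lies along the x axis, which is the orientation a crop-level reader would typically expect.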
Product Overview
AI OCR uses a bbox-guided pipeline: PaddleOCR provides stable geometry, Qwen3VL:8b performs crop-level OCR, and Gemma4:31B finalizes the text with image context. The original Paddle boxes remain the overlay geometry so results stay anchored to the image.
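The separation between geometry and text can be sketched as a small data flow. This is an illustration under stated assumptions, not the app's actual code: `Line`, `finalize`, `read_crop`, and `review` are hypothetical names, and the two model stages are passed in as plain callables so that no real PaddleOCR, Qwen, or Gemma API is implied.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Sequence, Tuple

Quad = Tuple[Tuple[float, float], ...]


@dataclass(frozen=True)
class Line:
    """One detected text line. The quad is set once and never rewritten."""
    quad: Quad             # box corners from the detector (assumed 4 points)
    paddle_text: str       # text from the detection stage
    qwen_text: str = ""    # crop-level reading
    final_text: str = ""   # reviewed text


def finalize(
    lines: Sequence[Line],
    read_crop: Callable[[Quad], str],
    review: Callable[[List[Line]], List[str]],
) -> List[Line]:
    """Run crop reading and review while leaving every box untouched."""
    candidates = [replace(line, qwen_text=read_crop(line.quad)) for line in lines]
    texts = review(candidates)
    # The reviewer may only return text, one string per box, in order.
    if len(texts) != len(candidates):
        raise ValueError("reviewer must return exactly one text per box")
    return [replace(c, final_text=t) for c, t in zip(candidates, texts)]
```

Because the reviewer returns only strings, it has no way to move, merge, or drop a box, which is one simple way to keep overlays anchored to the original geometry.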
How It Works
1. Upload Image
Select or drag-and-drop an image containing text, such as a document, sign, handwritten note, or screenshot.
2. Paddle Boxes
PaddleOCR dewarps the image, detects text lines, and provides the bounding boxes used for overlays.
3. Qwen + Gemma Review
Qwen3VL:8b reads each rotated crop. Gemma4:31B checks those candidates against the full image and updates only the text, never the boxes.
4. Compare & Copy
The extracted text is listed below the image with source labels for Paddle, Qwen, and Gemma. Use the toggle to compare engines or copy any result.
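The compare step above can be sketched as follows. This is a minimal example, assuming each box's result is stored as a dictionary keyed by the source labels `paddle`, `qwen`, and `gemma`; that layout and the function names `view` and `disagreements` are assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple

SOURCES = ("paddle", "qwen", "gemma")  # assumed source labels


def view(results: Sequence[Dict[str, str]], source: str) -> List[Tuple[str, str]]:
    """Return (source label, text) for every box, as one toggle position would show."""
    if source not in SOURCES:
        raise ValueError(f"unknown source: {source}")
    return [(source, box.get(source, "")) for box in results]


def disagreements(results: Sequence[Dict[str, str]]) -> List[int]:
    """Return the indices of boxes where the engines do not all agree."""
    differing = []
    for index, box in enumerate(results):
        texts = {box.get(s, "").strip() for s in SOURCES}
        if len(texts) > 1:
            differing.append(index)
    return differing


results = [
    {"paddle": "He1lo", "qwen": "Hello", "gemma": "Hello"},
    {"paddle": "World", "qwen": "World", "gemma": "World"},
]
print(view(results, "paddle"))
print(disagreements(results))
```

Flagging only the boxes where the sources differ keeps the comparison short on images with many lines of text.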