OCR FULL TEXT API

Full text + searchable PDF.

Extract text from PDFs, TIFFs, scans, and photos. Returns both raw text and a searchable PDF. Supports 9 languages: English, Hindi, Tamil, Telugu, Bengali, Arabic, French, Portuguese, and Swahili.

WHAT IT DOES

Raw text + searchable PDF in one call

✓ Auto-deskew + orientation detection
✓ Multi-page documents in one call
✓ 9 languages, including Indian scripts and African languages
✓ Text + bounding box coordinates returned
✓ Confidence per page in response
✓ Searchable PDF with text overlaid on the original image
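
The per-page confidence listed above lends itself to a simple review gate. The sketch below assumes a hypothetical response shape: a list of pages, each with `page`, `confidence`, and `words` entries carrying `text` and `bbox`. These field names are placeholders for illustration and are not taken from the API reference.

```python
from typing import Any

# Hypothetical response shape, for illustration only; the real field
# names returned by the API may differ.
sample_pages: list[dict[str, Any]] = [
    {"page": 1, "confidence": 0.97,
     "words": [{"text": "Invoice", "bbox": [72, 40, 160, 62]}]},
    {"page": 2, "confidence": 0.58,
     "words": [{"text": "Tota1", "bbox": [70, 500, 120, 520]}]},
]


def pages_needing_review(pages: list[dict[str, Any]], threshold: float = 0.80) -> list[int]:
    """Return the page numbers whose confidence is below the threshold."""
    return [p["page"] for p in pages if p["confidence"] < threshold]


print(pages_needing_review(sample_pages))  # [2]
```

Pages flagged this way could be routed to manual review while the rest go straight into the archive. The 0.80 threshold is an arbitrary starting point, not a recommended value.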

python

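from abscode import DocumentAI

# Authenticate with your secret API key.
client = DocumentAI(api_key="abs_sk_...")

# Extract English and Hindi text and request a searchable PDF in the same call.
result = client.ocr.extract(
    file="document.pdf",
    languages=["en", "hi"],
    output_searchable_pdf=True,
)

print(result.text)                # raw extracted text
print(result.searchable_pdf_url)  # link to the searchable PDF
# https://cdn.abscode.com/.../doc.pdf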

USE CASES

Common scenarios

Document archive search

OCR legacy scanned PDFs so they become searchable in your archive.

Mobile scan → searchable

Pair with the Scanning SDK. Upload a phone-captured document, get a searchable PDF back.

Compliance keyword scan

OCR contracts, then grep for terms. Fast batch processing via async API.
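
Once the OCR text is in hand, the "grep for terms" step can be sketched with the Python standard library alone. The function name `find_terms` and the sample text are illustrative. In practice the input would be the extracted text of each contract.

```python
import re


def find_terms(text: str, terms: list[str]) -> dict[str, list[int]]:
    """Map each term to the 1-based line numbers where it appears.

    Matching is case-insensitive and restricted to whole words.
    """
    hits: dict[str, list[int]] = {term: [] for term in terms}
    for lineno, line in enumerate(text.splitlines(), start=1):
        for term in terms:
            pattern = rf"\b{re.escape(term)}\b"
            if re.search(pattern, line, flags=re.IGNORECASE):
                hits[term].append(lineno)
    return hits


ocr_text = (
    "This Agreement includes an indemnity clause.\n"
    "Termination requires 30 days notice.\n"
    "Indemnity survives termination."
)

print(find_terms(ocr_text, ["indemnity", "termination"]))
# {'indemnity': [1, 3], 'termination': [2, 3]}
```

Whole-word matching avoids false hits inside longer words. OCR errors such as `1` for `l` will still cause misses, so low-confidence pages may deserve a fuzzier match.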

Vernacular content

Hindi / Tamil / Telugu / Bengali support for India-local document workflows.

Educational worksheets

OCR student answer sheets, printed or handwritten, for grading workflows.

Govt records digitization

Bulk scan + OCR for state digitization initiatives. Async + webhook callback.
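
For the async + webhook flow, the receiving side mostly has to parse the callback body and branch on the job status. The sketch below is an assumption-heavy illustration: the payload fields `job_id`, `status`, and `page_count` are hypothetical and not taken from the API reference, so check the real callback schema before relying on them.

```python
import json


def handle_ocr_callback(body: bytes) -> str:
    """Parse a webhook body (hypothetical schema) and return a short status line."""
    event = json.loads(body)
    job_id = event["job_id"]  # hypothetical field name
    status = event["status"]  # hypothetical field name
    if status == "completed":
        # "page_count" is also a hypothetical field.
        return f"{job_id}: done, {event['page_count']} pages"
    return f"{job_id}: {status}"


# Example callback body, invented for illustration.
sample_body = json.dumps(
    {"job_id": "job_123", "status": "completed", "page_count": 42}
).encode("utf-8")

print(handle_ocr_callback(sample_body))  # job_123: done, 42 pages
```

A function like this would be called from whatever web framework receives the POST. Keeping the parsing separate from the HTTP layer makes it easy to unit test against recorded callbacks.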

PRICING

Flat per-page rate

Same flat rate per page. No surge pricing. No tier overages. Pricing is shown in your local currency; change country in the top-right to switch.

See full pricing + top-up packs →