Getting Started

The Gemina API lets you extract structured data from documents programmatically. Upload documents by file or URL and receive structured JSON responses with extracted fields, coordinates, and confidence scores.

Base URL: https://api.gemina.co
Authentication: X-API-Key: your-api-key

Quick Start

Install the required package:

pip install requests

Then set up your environment in Python:

import os

BASE_URL = os.getenv("GEMINA_BASE_URL", "https://api.gemina.co")
API_KEY = os.getenv("GEMINA_API_KEY", "")
HEADERS = {"X-API-Key": API_KEY}

Authentication

All API requests require authentication via the X-API-Key header:

import requests

headers = {"X-API-Key": "your-api-key"}
response = requests.get(
    "https://api.gemina.co/api/v1/documents/",
    headers=headers
)

Upload Document

Upload a document for extraction using multipart form data:

import requests

url = "https://api.gemina.co/api/v1/documents/uploads"
headers = {"X-API-Key": "your-api-key"}

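# Repeat the extraction_types key once per requested extraction type.
form_data = [
    ("extraction_types", "invoice_headers"),
    ("extraction_types", "invoice_line_items"),
    ("external_id", "inv-2026-0001"),  # echoed back as meta.externalId
    ("model_type", "invictus"),
]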

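# Open the file in a context manager so the handle is closed after the upload.
with open("./invoice.pdf", "rb") as pdf:
    files = {"file": ("invoice.pdf", pdf, "application/pdf")}
    response = requests.post(url, headers=headers, data=form_data, files=files)

response.raise_for_status()  # raise on HTTP 4xx/5xx before parsing the body
result = response.json()
print(result)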

Response Format

Successful extractions return structured JSON with field values and confidence scores:

{
  "status": "success",
  "meta": {
    "documentId": "9860df92-64fe-4b53-9663-5c11b38a3051",
    "externalId": "inv-2026-0001",
    "filename": "invoice.pdf"
  },
  "data": {
    "extractions": [
      {
        "extractionType": "invoice_headers",
        "status": "success",
        "values": {
          "vendorName": {"value": "Acme Beverages Ltd.", "confidence": "high"},
          "invoiceNumber": {"value": "IL-2026-04812", "confidence": "high"},
          "invoiceDate": {"value": "2026-05-12", "confidence": "high"},
          "currency": {"value": "ILS", "confidence": "high"},

          "grossSubtotalAmount": {"value": 7663.63, "confidence": "high"},
          "discountAmount":      {"value": 229.91,  "confidence": "high"},
          "discountPercentage":  {"value": 3.0,     "confidence": "high"},
          "roundingAmount":      {"value": -0.18,   "confidence": "high"},

          "subtotalAmount": {"value": 7433.90, "confidence": "high"},
          "taxes": [
            {"type": "vat", "name": "VAT 18%", "rate": 18.0, "amount": 1338.10, "confidence": "high"}
          ],
          "totalAmount": {"value": 8772.00, "confidence": "high"}
        }
      },
      {
        "extractionType": "invoice_line_items",
        "status": "success",
        "values": {
          "line_items": [
            {
              "lineNumber": 1,
              "description": "Premium 6-pack 330ml beer cans",
              "itemCode": "BV-330-6",
              "quantity": 12.0,
              "listPrice": 65.00,
              "unitPrice": 58.50,
              "discountAmount": 6.50,
              "discountPercentage": 10.0,
              "taxRate": 18.0,
              "packagingAmount": 0.30,
              "depositAmount": 1.20,
              "unitsPerPackage": 6,
              "packageQuantity": 2.0,
              "lineTotal": 703.50
            },
            {
              "lineNumber": 2,
              "description": "Olive oil 1L",
              "itemCode": "OO-1L",
              "quantity": 0.5,
              "listPrice": null,
              "unitPrice": 45.00,
              "discountAmount": null,
              "discountPercentage": null,
              "taxRate": 18.0,
              "packagingAmount": null,
              "depositAmount": null,
              "unitsPerPackage": null,
              "packageQuantity": null,
              "lineTotal": 22.50
            }
          ],
          "total_lines": 2
        }
      }
    ]
  }
}
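
To make the nesting concrete, here is a minimal sketch that indexes the entries of data.extractions by their extractionType. The helper name summarize_extractions and the trimmed sample payload are illustrative assumptions, not part of the API.

```python
def summarize_extractions(response: dict) -> dict:
    """Map each extractionType to its status and values (hypothetical helper)."""
    summary = {}
    for extraction in response.get("data", {}).get("extractions", []):
        summary[extraction["extractionType"]] = {
            "status": extraction.get("status"),
            "values": extraction.get("values") or {},
        }
    return summary


# Trimmed-down payload shaped like the example response above.
sample = {
    "status": "success",
    "data": {
        "extractions": [
            {
                "extractionType": "invoice_headers",
                "status": "success",
                "values": {"totalAmount": {"value": 8772.00, "confidence": "high"}},
            },
            {
                "extractionType": "invoice_line_items",
                "status": "success",
                "values": {"line_items": [{"lineNumber": 1, "lineTotal": 703.50}], "total_lines": 1},
            },
        ]
    },
}

by_type = summarize_extractions(sample)
print(by_type["invoice_headers"]["values"]["totalAmount"]["value"])
```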

Response Fields

The response groups extracted values by extraction type: each entry in data.extractions carries its extractionType, a status, and a values object. Unpopulated fields are null.

invoice_headers fields

Each header field uses an envelope shape: { value, coordinates, confidence }. When the invoice doesn't print the value, the whole envelope is null — a defensive client can safely guard with if (response.discountAmount) { … }.
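
The same guard can be sketched in Python. The helper name field_value is hypothetical; it only assumes the envelope shape described above, where a missing field arrives as null (None after JSON parsing).

```python
from typing import Any, Optional


def field_value(values: dict, name: str) -> Optional[Any]:
    """Return the envelope's value, or None when the whole envelope is null or absent."""
    envelope = values.get(name)
    if envelope is None:
        return None
    return envelope.get("value")


# Header values shaped like the invoice_headers example, with one null envelope.
header_values = {
    "discountAmount": None,
    "totalAmount": {"value": 8772.00, "confidence": "high"},
}

print(field_value(header_values, "totalAmount"))
print(field_value(header_values, "discountAmount"))
```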

  • grossSubtotalAmount — Sum of line items before any header-level discount or rounding.
  • discountAmount — Header-level discount in the document's currency. Sign is verbatim from the invoice: some templates print positives (229.91), others negatives (-229.91) or parenthesized values. Clients that subtract on their side must handle both.
  • discountPercentage — Header-level discount as a percentage (e.g. 3.0 means 3%). Only populated when the invoice prints it.
  • roundingAmount — Rounding adjustment (e.g. "round off", agorot rounding). Signed as printed; magnitude typically < 1.0 in document currency.
  • subtotalAmount — The tax base: the value the invoice's VAT/tax percentage is calculated against, after any header-level discount and rounding, before tax.
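
Because the discount sign is verbatim from the invoice, a client that subtracts the discount may want to normalize it first. The sketch below is one possible approach; discount_magnitude is a hypothetical helper, and the string branch assumes that parenthesized values could arrive as text such as "(229.91)", which this document does not confirm.

```python
def discount_magnitude(raw):
    """Return the discount as a non-negative float, whatever sign was printed."""
    if raw is None:
        return None
    if isinstance(raw, str):
        text = raw.strip()
        # Assumption: accounting-style negatives may be wrapped in parentheses.
        if text.startswith("(") and text.endswith(")"):
            text = text[1:-1]
        raw = float(text)
    return abs(float(raw))


for printed in (229.91, -229.91, "(229.91)"):
    print(discount_magnitude(printed))
```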

The reconciliation identity (modulo printing artifacts):

subtotalAmount + Σ taxes[].amount ≈ totalAmount
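
A client-side check of this identity might look like the sketch below. The function name reconciles and the 0.05 tolerance are assumptions chosen for illustration; pick a tolerance that suits your documents.

```python
from decimal import Decimal


def reconciles(subtotal, taxes, total, tolerance=Decimal("0.05")):
    """Check subtotalAmount + sum(taxes[].amount) against totalAmount within a tolerance."""
    tax_sum = sum((Decimal(str(tax["amount"])) for tax in taxes), Decimal("0"))
    difference = Decimal(str(subtotal)) + tax_sum - Decimal(str(total))
    return abs(difference) <= tolerance


# Figures from the example response: 7433.90 + 1338.10 against 8772.00.
taxes = [{"type": "vat", "name": "VAT 18%", "rate": 18.0, "amount": 1338.10}]
print(reconciles(7433.90, taxes, 8772.00))
```

Decimal is used instead of float so that cent-level comparisons are not affected by binary rounding.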

invoice_line_items fields

Each item in the line_items array is a flat object (no envelope). Unpopulated fields are null.

  • listPrice — Gross/catalog unit price before any line-level discount. Populated only when the invoice prints a dedicated "list price" / "catalog price" / "MSRP" column. Documentation-only — do not use it in line-total math.
  • unitPrice — NET price per unit, after any line-level discount. The lineTotal math uses this value, so the per-line discountAmount and discountPercentage should not be subtracted again. For the gross/catalog price, use listPrice when populated.
  • packagingAmount — Additive packaging charge (crate fee, palletizing fee). Positive. Contributes to lineTotal.
  • depositAmount — Additive deposit/refund charge (bottle deposit, container deposit). Positive. Contributes to lineTotal.
  • unitsPerPackage — Structural pack size: whole-number count of units per package (e.g. 24 cans per case). Informational; never a volume or weight.
  • packageQuantity — Order quantity in package units; may be fractional (e.g. 2.1 cartons). Informational. Most invoices print only one of unitsPerPackage or packageQuantity — both can be null independently.

Line-total math contract:

lineTotal ≈ quantity × unitPrice
          + taxAmount        (if present)
          + packagingAmount  (if present)
          + depositAmount    (if present)
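
The contract can be sketched as a small helper that adds only the components that are present. expected_line_total is a hypothetical name, and the sketch assumes quantity and unitPrice are populated on the item.

```python
from decimal import Decimal


def expected_line_total(item: dict) -> Decimal:
    """quantity x unitPrice plus any populated taxAmount, packagingAmount, depositAmount."""
    total = Decimal(str(item["quantity"])) * Decimal(str(item["unitPrice"]))
    for key in ("taxAmount", "packagingAmount", "depositAmount"):
        amount = item.get(key)
        if amount is not None:
            total += Decimal(str(amount))
    return total


# Line 1 of the example response: 12 x 58.50 + 0.30 + 1.20.
line_one = {
    "quantity": 12.0,
    "unitPrice": 58.50,
    "packagingAmount": 0.30,
    "depositAmount": 1.20,
    "lineTotal": 703.50,
}
print(expected_line_total(line_one))
```

Note that discountAmount and discountPercentage are deliberately left out, since unitPrice is already net of the line-level discount.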

When both pack-size fields are present, quantity ≈ packageQuantity × unitsPerPackage — the relationship is approximate, not enforced.
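
Since the relationship is approximate, a consistency check should use a tolerance and should skip items where either pack-size field is null. The helper name pack_size_consistent and the 1% relative tolerance below are illustrative assumptions.

```python
import math


def pack_size_consistent(item: dict, rel_tol: float = 0.01):
    """Return True/False when both pack-size fields are present, else None."""
    units = item.get("unitsPerPackage")
    packages = item.get("packageQuantity")
    if units is None or packages is None:
        return None  # nothing to compare
    return math.isclose(item["quantity"], packages * units, rel_tol=rel_tol)


# Line 1 of the example response: 2.0 packages x 6 units = 12 units.
print(pack_size_consistent({"quantity": 12.0, "unitsPerPackage": 6, "packageQuantity": 2.0}))
```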

API Reference

Extraction Types

  • invoice_headers - Invoice header fields
  • invoice_line_items - Line item details
  • ocr - Full text extraction
  • document_details_hebrew - Hebrew documents

Model Types

  • velox - Fast processing
  • praetorian - Balanced accuracy
  • invictus - Highest accuracy
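
One way to catch typos before sending an upload is to validate the form fields against the extraction and model types listed above. This is a client-side sketch only: build_form_data is a hypothetical helper, and treating external_id as optional is an assumption this document does not confirm.

```python
EXTRACTION_TYPES = {"invoice_headers", "invoice_line_items", "ocr", "document_details_hebrew"}
MODEL_TYPES = {"velox", "praetorian", "invictus"}


def build_form_data(extraction_types, model_type, external_id=None):
    """Build the multipart form fields, repeating extraction_types once per type."""
    unknown = [t for t in extraction_types if t not in EXTRACTION_TYPES]
    if unknown:
        raise ValueError(f"unknown extraction types: {unknown}")
    if model_type not in MODEL_TYPES:
        raise ValueError(f"unknown model type: {model_type}")
    form = [("extraction_types", t) for t in extraction_types]
    if external_id is not None:  # assumption: external_id may be omitted
        form.append(("external_id", external_id))
    form.append(("model_type", model_type))
    return form


form_data = build_form_data(["invoice_headers", "invoice_line_items"], "invictus", "inv-2026-0001")
print(form_data)
```

The resulting list can be passed as the data argument of requests.post, in the same way as the hand-written form_data in the upload example.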

Endpoints

  • POST /api/v1/documents/uploads
  • POST /api/v1/documents/uploads/web
  • GET /api/v1/documents/{id}
  • GET /api/v1/documents/results/{id}

Response Statuses

  • success - Extraction completed
  • pending - Job queued
  • in_process - Processing
  • failed - Error occurred
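
Since pending and in_process are not final, a client will typically poll until it sees success or failed. The sketch below shows one way to do that with an injectable fetch callable. It assumes the fetched JSON carries a top-level status field using the values listed above; wait_for_result, the interval, and the attempt limit are all hypothetical choices.

```python
import time

TERMINAL_STATUSES = {"success", "failed"}


def wait_for_result(fetch, interval=2.0, max_attempts=30, sleep=time.sleep):
    """Call fetch() until its payload reports a terminal status, then return the payload."""
    for attempt in range(max_attempts):
        payload = fetch()
        if payload.get("status") in TERMINAL_STATUSES:
            return payload
        if attempt < max_attempts - 1:
            sleep(interval)  # wait before the next poll
    raise TimeoutError(f"no terminal status after {max_attempts} attempts")


# Simulated status sequence standing in for real API calls.
_responses = iter([{"status": "pending"}, {"status": "in_process"}, {"status": "success"}])
final = wait_for_result(lambda: next(_responses), sleep=lambda seconds: None)
print(final["status"])
```

In real use, fetch could wrap a requests.get call against GET /api/v1/documents/results/{id} with the X-API-Key header and return response.json(); whether that endpoint reports status in exactly this shape is an assumption to verify against an actual response.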

