File Formats

Falara supports 9 file formats across 6 format families. Each format has a dedicated processor that extracts translatable text, preserves structure and formatting, and reassembles the translated output.


Supported Formats

| Format | Extensions | Description |
|---|---|---|
| HTML | .html, .htm | Markup is extracted and replaced with typed placeholders; reassembled after translation |
| Markdown | .md, .markdown | Document structure (headings, lists, code blocks) is preserved; only prose is translated |
| Word | .docx | Paragraphs and table cells are extracted; character-level formatting (bold, italic) is preserved |
| Excel | .xlsx | Individual cells are translated; formulas and cell references are skipped |
| PowerPoint | .pptx | Text boxes and speaker notes are extracted and translated |
| XLIFF 1.2 | .xlf, .xliff | Standard exchange format; <trans-unit> source/target pairs are translated |
| XLIFF 2.0 | .xlf, .xliff | Newer XLIFF standard; <segment> elements are translated |
| Eurotext-XLIFF | .xlf | Proprietary Eurotext format with extended metadata and CDATA-embedded HTML |
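
The placeholder approach described for HTML can be illustrated with a minimal sketch. The `[[open:N]]` / `[[close:N]]` placeholder syntax and the `protect` and `restore` helpers are assumptions made for illustration, not Falara's actual placeholder format, and a real processor would more likely use an HTML parser than a regular expression.

```python
import re

# Matches opening, closing, and self-closing tags (a simplification of real HTML).
TAG_RE = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>")
# Hypothetical placeholder syntax: [[open:0]], [[close:1]], ...
PLACEHOLDER_RE = re.compile(r"\[\[(?:open|close):(\d+)\]\]")


def protect(html: str) -> tuple[str, list[str]]:
    """Replace each tag with a typed placeholder and return the stashed tags."""
    tags: list[str] = []

    def stash(match: re.Match) -> str:
        kind = "close" if match.group(1) else "open"
        tags.append(match.group(0))  # keep the tag exactly as written
        return f"[[{kind}:{len(tags) - 1}]]"

    return TAG_RE.sub(stash, html), tags


def restore(text: str, tags: list[str]) -> str:
    """Swap each placeholder back for the original tag it stands for."""
    return PLACEHOLDER_RE.sub(lambda m: tags[int(m.group(1))], text)


protected, stashed = protect("<p>Hello <b>world</b></p>")
# The translator only ever sees the text and the opaque placeholders.
translated = protected.replace("Hello", "Hallo").replace("world", "Welt")
print(restore(translated, stashed))
```

Because the tags are stored verbatim and only referenced by index, the translated text can reorder placeholders without altering the markup itself.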

Limits

| Limit | Value |
|---|---|
| Single file upload | 10 MB |
| Batch total | 50 MB |
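
A client can check these limits before uploading. The sketch below assumes that 1 MB means 1024 × 1024 bytes, which this page does not specify, and `check_batch` is a hypothetical helper rather than part of any Falara SDK.

```python
# Assumption: 1 MB = 1024 * 1024 bytes; the service may count 1,000,000 instead.
MAX_FILE_BYTES = 10 * 1024 * 1024
MAX_BATCH_BYTES = 50 * 1024 * 1024


def check_batch(sizes: dict[str, int]) -> list[str]:
    """Return human-readable limit violations for a batch of {filename: size_in_bytes}."""
    errors = []
    for name, size in sizes.items():
        if size > MAX_FILE_BYTES:
            errors.append(f"{name}: exceeds the 10 MB single-file limit")
    if sum(sizes.values()) > MAX_BATCH_BYTES:
        errors.append("batch: exceeds the 50 MB total limit")
    return errors


print(check_batch({"guide.docx": 2 * 1024 * 1024, "deck.pptx": 11 * 1024 * 1024}))
```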

How File Processing Works

  1. Upload — File is received and MIME type is validated against the declared extension
  2. Extraction — Format processor extracts translatable segments; non-translatable content (code, formulas, metadata) is preserved as-is
  3. Translation — Segments pass through the multi-agent pipeline
  4. Injection — Translated segments are written back into the original document structure
  5. Download — Reassembled file is served via GET /v1/jobs/{job_id}/download

The injection step runs on demand at download time rather than during the pipeline, so the original file structure is never permanently modified.
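
The split between extraction, translation, and on-demand injection can be sketched with a toy document model. Everything here is hypothetical: the `Job` record, the line-based document, and the convention that lines starting with `=` stand in for non-translatable content such as formulas. Only the download path comes from this page.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """Hypothetical job record: the original lines plus translations keyed by line index."""
    original: tuple[str, ...]
    translations: dict[int, str] = field(default_factory=dict)


def extract(job: Job) -> dict[int, str]:
    """Return translatable lines by index; empty lines and '=' lines are left out."""
    return {
        i: line
        for i, line in enumerate(job.original)
        if line and not line.startswith("=")
    }


def build_download(job: Job) -> str:
    """Assemble the translated document at download time; job.original is not mutated."""
    return "\n".join(
        job.translations.get(i, line) for i, line in enumerate(job.original)
    )


def download_path(job_id: str) -> str:
    """Path of the download endpoint documented above."""
    return f"/v1/jobs/{job_id}/download"


job = Job(original=("Hello", "=SUM(A1:A2)", "World"))
segments = extract(job)                      # only lines 0 and 2 are translatable
job.translations = {0: "Hallo", 2: "Welt"}   # stand-in for the translation pipeline
print(build_download(job))
print(download_path("job_123"))              # "job_123" is a made-up ID
```

Storing translations separately from the original is what lets the output be rebuilt on every download without touching the uploaded file.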


XLIFF Handling

For XLIFF files, the processor reads <source> elements and writes translations back into <target> elements. Existing <target> content is replaced.
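
As a rough illustration of this source-to-target behavior for XLIFF 1.2, the sketch below uses Python's standard `xml.etree.ElementTree`. The `fill_targets` function and the `translate` callback are hypothetical, and the sketch flattens inline markup inside `<source>` to plain text, which a production processor would need to handle more carefully.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"
ET.register_namespace("", XLIFF_NS)  # serialize without an ns0: prefix


def fill_targets(xliff_text: str, translate) -> str:
    """Write translate(source text) into each trans-unit's <target>, replacing old content."""
    root = ET.fromstring(xliff_text)
    for unit in root.iter(f"{{{XLIFF_NS}}}trans-unit"):
        source = unit.find(f"{{{XLIFF_NS}}}source")
        if source is None:
            continue
        target = unit.find(f"{{{XLIFF_NS}}}target")
        if target is None:
            # Create the target directly after the source element.
            target = ET.Element(f"{{{XLIFF_NS}}}target")
            unit.insert(list(unit).index(source) + 1, target)
        else:
            # Drop existing child elements; the text is overwritten below.
            for child in list(target):
                target.remove(child)
        target.text = translate("".join(source.itertext()))
    return ET.tostring(root, encoding="unicode")


sample = (
    '<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">'
    '<file original="a.txt" source-language="en" target-language="de" datatype="plaintext">'
    "<body>"
    '<trans-unit id="1"><source>Hello</source><target>old</target></trans-unit>'
    '<trans-unit id="2"><source>World</source></trans-unit>'
    "</body></file></xliff>"
)
print(fill_targets(sample, str.upper))  # str.upper stands in for a real translator
```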

For Eurotext-XLIFF files, the processor preserves:

  - CDATA wrappers on text content
  - HTML markup embedded inside CDATA (block tags, attribute order)
  - Proprietary metadata attributes
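
One way to keep CDATA wrappers and the embedded markup intact is sketched below with Python's standard `xml.dom.minidom`, which keeps CDATA sections as distinct nodes. The element names in the sample, the `translate_cdata` helper, and the choice to rewrite every CDATA section in the document are assumptions for illustration; the actual Eurotext schema is not described on this page.

```python
import re
from xml.dom import minidom

# Capturing group so re.split keeps the tags in the result list.
TAG_SPLIT_RE = re.compile(r"(<[^>]+>)")


def translate_html_fragment(fragment: str, translate) -> str:
    """Translate only the text between tags; tags are copied through byte for byte."""
    parts = TAG_SPLIT_RE.split(fragment)
    return "".join(
        part if part.startswith("<") or not part.strip() else translate(part)
        for part in parts
    )


def translate_cdata(xml_text: str, translate) -> str:
    """Rewrite the content of every CDATA section while keeping the CDATA wrapper."""
    doc = minidom.parseString(xml_text)

    def walk(node) -> None:
        for child in node.childNodes:
            if child.nodeType == child.CDATA_SECTION_NODE:
                child.data = translate_html_fragment(child.data, translate)
            else:
                walk(child)

    walk(doc.documentElement)
    return doc.documentElement.toxml()


sample = '<unit><target><![CDATA[<p class="a" id="b">Hello</p>]]></target></unit>'
print(translate_cdata(sample, str.upper))  # str.upper stands in for a real translator
```

Since the tags inside the CDATA section are never parsed, their attribute order and spelling survive unchanged.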


Format Detection

Format is detected from the file extension, then validated against the actual MIME type of the uploaded file. A mismatch returns 415 Unsupported Media Type.
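
The extension-then-MIME check might look like the following sketch. The extension mapping follows the Supported Formats table, but the list of accepted MIME types per family, the `UnsupportedMediaType` exception, and the decision to also reject unknown extensions with 415 are assumptions. How the service determines the actual MIME type is not covered here, so the function takes it as an argument.

```python
from pathlib import PurePath

# Extension-to-family mapping, following the Supported Formats table.
FAMILY_BY_EXTENSION = {
    ".html": "HTML", ".htm": "HTML",
    ".md": "Markdown", ".markdown": "Markdown",
    ".docx": "Word", ".xlsx": "Excel", ".pptx": "PowerPoint",
    ".xlf": "XLIFF", ".xliff": "XLIFF",
}

# Assumed acceptable MIME types per family; the service's real list may differ.
OOXML = "application/vnd.openxmlformats-officedocument."
MIME_BY_FAMILY = {
    "HTML": {"text/html"},
    "Markdown": {"text/markdown", "text/plain"},
    "Word": {OOXML + "wordprocessingml.document"},
    "Excel": {OOXML + "spreadsheetml.sheet"},
    "PowerPoint": {OOXML + "presentationml.presentation"},
    "XLIFF": {"application/xliff+xml", "application/xml", "text/xml"},
}


class UnsupportedMediaType(Exception):
    """Hypothetical error that an API layer would turn into a 415 response."""
    status_code = 415


def detect_format(filename: str, actual_mime: str) -> str:
    """Pick the family from the extension, then require the MIME type to agree."""
    family = FAMILY_BY_EXTENSION.get(PurePath(filename).suffix.lower())
    if family is None or actual_mime not in MIME_BY_FAMILY[family]:
        raise UnsupportedMediaType(f"{filename}: {actual_mime} does not match extension")
    return family


print(detect_format("guide.HTML", "text/html"))
```

Note that the extension alone cannot distinguish XLIFF 1.2, XLIFF 2.0, and Eurotext-XLIFF, since all three can use `.xlf`; telling them apart would require inspecting the file content.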