File Formats
Falara supports 9 file formats across 6 format families. Each format has a dedicated processor that extracts translatable text, preserves structure and formatting, and reassembles the translated output.
Supported Formats
| Format | Extensions | Description |
|---|---|---|
| HTML | .html, .htm | Markup is extracted and replaced with typed placeholders; reassembled after translation |
| Markdown | .md, .markdown | Document structure (headings, lists, code blocks) is preserved; only prose is translated |
| Word | .docx | Paragraphs and table cells are extracted; character-level formatting (bold, italic) is preserved |
| Excel | .xlsx | Individual cells are translated; formulas and cell references are skipped |
| PowerPoint | .pptx | Text boxes and speaker notes are extracted and translated |
| XLIFF 1.2 | .xlf, .xliff | Standard exchange format; <trans-unit> source/target pairs are translated |
| XLIFF 2.0 | .xlf, .xliff | Newer XLIFF standard; <segment> elements are translated |
| Eurotext-XLIFF | .xlf | Proprietary Eurotext format with extended metadata and CDATA-embedded HTML |
Limits
| Limit | Value |
|---|---|
| Single file upload | 10 MB |
| Batch total | 50 MB |
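
A client can check a batch against these limits before uploading. The following is a minimal sketch, not part of Falara itself: the `check_batch` helper is hypothetical, and it assumes "MB" means 1024 × 1024 bytes, which this page does not specify.

```python
# Assumption: binary megabytes. The documentation does not say whether
# "MB" means 1,000,000 or 1,048,576 bytes.
MB = 1024 * 1024
MAX_FILE_BYTES = 10 * MB   # single file upload limit
MAX_BATCH_BYTES = 50 * MB  # batch total limit


def check_batch(sizes: dict[str, int]) -> list[str]:
    """Return a list of problems; an empty list means the batch fits.

    `sizes` maps a file name to its size in bytes (hypothetical helper).
    """
    problems = []
    for name, size in sizes.items():
        if size > MAX_FILE_BYTES:
            problems.append(f"{name}: {size} bytes exceeds the 10 MB single-file limit")
    total = sum(sizes.values())
    if total > MAX_BATCH_BYTES:
        problems.append(f"batch total of {total} bytes exceeds the 50 MB limit")
    return problems
```

Note that a batch can fail on its total even when every file is individually within the single-file limit.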
How File Processing Works
- Upload: the file is received and its MIME type is validated against the declared extension
- Extraction: the format processor extracts translatable segments; non-translatable content (code, formulas, metadata) is preserved as-is
- Translation: segments pass through the multi-agent pipeline
- Injection: translated segments are written back into the original document structure
- Download: the reassembled file is served via GET /v1/jobs/{job_id}/download
The injection step happens on demand at download time, not as part of the pipeline run. The original file structure is never permanently modified.
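
The download step could be called from a client roughly as follows. This is a sketch under stated assumptions: only the path `/v1/jobs/{job_id}/download` comes from this page, while the base URL, the bearer-token header, and the helper names `download_url` and `download_job` are hypothetical placeholders.

```python
import shutil
import urllib.request
from urllib.parse import quote

# Hypothetical base URL; substitute the real API host.
BASE_URL = "https://api.example.com"


def download_url(job_id: str, base_url: str = BASE_URL) -> str:
    """Build the download URL for a job, percent-encoding the job ID."""
    return f"{base_url.rstrip('/')}/v1/jobs/{quote(job_id, safe='')}/download"


def download_job(job_id: str, dest_path: str, token: str, base_url: str = BASE_URL) -> None:
    """Stream the reassembled file for `job_id` to `dest_path`."""
    request = urllib.request.Request(
        download_url(job_id, base_url),
        # Assumption: bearer-token auth. This page does not describe authentication.
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
```

Because injection runs at download time, each call to this endpoint presumably reassembles the file from the stored original and the translated segments.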
XLIFF Handling
For XLIFF files, the processor reads <source> elements and writes translations back into <target> elements. Existing <target> content is replaced.
Eurotext-XLIFF files preserve:

- CDATA wrappers on text content
- HTML markup embedded inside CDATA (block tags, attribute order)
- Proprietary metadata attributes
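
The source-to-target behaviour for standard XLIFF 1.2 can be illustrated with Python's standard library. This is a simplified sketch, not Falara's processor: `translate_xliff12` and its `translate` callback are hypothetical, inline markup inside `<source>` is flattened to plain text, and `xml.etree.ElementTree` does not keep CDATA sections, so the approach would not be suitable for Eurotext-XLIFF as written.

```python
import xml.etree.ElementTree as ET
from typing import Callable

XLIFF_12_NS = "urn:oasis:names:tc:xliff:document:1.2"


def translate_xliff12(xml_text: str, translate: Callable[[str], str]) -> str:
    """Fill every <target> from its <source>, replacing existing content.

    `translate` is a stand-in for the real translation step.
    """
    # Serialize with XLIFF 1.2 as the default namespace (no ns0: prefixes).
    ET.register_namespace("", XLIFF_12_NS)
    ns = {"x": XLIFF_12_NS}
    root = ET.fromstring(xml_text)

    for unit in root.iterfind(".//x:trans-unit", ns):
        source = unit.find("x:source", ns)
        if source is None:
            continue
        target = unit.find("x:target", ns)
        if target is None:
            # Create the missing <target> directly after <source>.
            target = ET.Element(f"{{{XLIFF_12_NS}}}target")
            unit.insert(list(unit).index(source) + 1, target)
        # Drop any existing children so old content is fully replaced.
        for child in list(target):
            target.remove(child)
        # Simplification: inline tags in <source> are reduced to their text.
        target.text = translate("".join(source.itertext()))

    return ET.tostring(root, encoding="unicode")
```

For example, passing `str.upper` as the `translate` callback yields a document whose targets are the upper-cased source strings, which makes the replacement behaviour easy to verify.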
Format Detection
Format is detected from the file extension, then validated against the actual MIME type of the uploaded file. A mismatch returns 415 Unsupported Media Type.
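
The two-stage check described above might look like the following sketch. The extension list comes from the Supported Formats table; the accepted MIME types per extension, the `check_upload` helper, and the choice to return 415 for unknown extensions are assumptions for illustration, not Falara's documented behaviour.

```python
from http import HTTPStatus
from pathlib import PurePath

_XLIFF_MIME = {"application/xliff+xml", "application/xml", "text/xml"}

# Assumption: which MIME types are accepted for each extension.
EXPECTED_MIME = {
    ".html": {"text/html"},
    ".htm": {"text/html"},
    ".md": {"text/markdown", "text/plain"},
    ".markdown": {"text/markdown", "text/plain"},
    ".docx": {"application/vnd.openxmlformats-officedocument.wordprocessingml.document"},
    ".xlsx": {"application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"},
    ".pptx": {"application/vnd.openxmlformats-officedocument.presentationml.presentation"},
    ".xlf": _XLIFF_MIME,
    ".xliff": _XLIFF_MIME,
}


def check_upload(filename: str, detected_mime: str) -> HTTPStatus:
    """Detect the format from the extension, then validate the MIME type.

    `detected_mime` is the type sniffed from the file content by the caller.
    """
    extension = PurePath(filename).suffix.lower()
    allowed = EXPECTED_MIME.get(extension)
    # Assumption: an unknown extension is rejected the same way as a mismatch.
    if allowed is None or detected_mime not in allowed:
        return HTTPStatus.UNSUPPORTED_MEDIA_TYPE  # 415
    return HTTPStatus.OK
```

Since `.xlf` is shared by XLIFF 1.2, XLIFF 2.0, and Eurotext-XLIFF, the extension and MIME type alone cannot tell those three apart; a real processor would need to inspect the document itself.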