User Guide
This guide covers the main concepts and usage patterns of Kreuzberg.
Contents
- Basic Usage - Essential usage patterns and concepts (API)
- Extraction Configuration - Configure the extraction process (API)
- Metadata Extraction - Document metadata extraction (API)
- Content Chunking - Split documents into manageable chunks
- OCR Configuration - Configure OCR settings (API)
- OCR Backends - Choose and configure different OCR engines
- Supported Formats - All supported document formats
Best Practices
- Use the async API in web applications and when extracting several documents concurrently, so that extraction does not block other work
- Set the OCR language to match the languages in your documents; a mismatched language setting reduces recognition accuracy
- For large documents, consider file streaming methods to reduce memory usage
- When processing many similar documents, reuse configuration objects for consistency
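The async and configuration-reuse recommendations can be combined. The following is a minimal sketch of that pattern using only the standard library. `DemoConfig` and `demo_extract` are hypothetical stand-ins, not part of Kreuzberg's API; in real code they would be replaced by the library's configuration object and async extraction function described in the linked guide pages. The `limit` parameter is likewise an assumption, added to cap how many extractions run at once.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class DemoConfig:
    """Hypothetical stand-in for an extraction configuration object."""
    ocr_language: str = "eng"
    chunk_content: bool = False


async def demo_extract(path: str, config: DemoConfig) -> str:
    """Placeholder for an async extraction call; swap in the real function."""
    await asyncio.sleep(0)  # yields to the event loop, as real I/O would
    return f"{path} [{config.ocr_language}]"


async def extract_many(paths: list[str], config: DemoConfig, limit: int = 4) -> list[str]:
    """Extract all paths concurrently, at most `limit` at a time, sharing one config."""
    semaphore = asyncio.Semaphore(limit)

    async def worker(path: str) -> str:
        async with semaphore:
            return await demo_extract(path, config)

    # asyncio.gather returns results in the same order as the inputs
    return await asyncio.gather(*(worker(p) for p in paths))


if __name__ == "__main__":
    shared = DemoConfig(ocr_language="deu")  # built once, reused for every file
    print(asyncio.run(extract_many(["a.pdf", "b.pdf"], shared)))
```

Making the configuration immutable (`frozen=True` here) is one way to keep settings consistent across a batch, since no task can change them partway through.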
Common Use Cases
Document Analysis: