OCR Configuration
Configuration classes for the supported OCR engines.
TesseractConfig
Default OCR engine configuration:
kreuzberg.TesseractConfig
dataclass
Configuration options for Tesseract OCR engine.
Source code in kreuzberg/_ocr/_tesseract.py
Attributes
classify_use_pre_adapted_templates: bool = True
Whether to use pre-adapted templates during classification to improve recognition accuracy.
language: str = 'eng'
Language code to use for OCR. Examples: 'eng' for English, 'deu' for German, or multiple languages combined with '+' (e.g. 'eng+deu').
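The '+' convention for combining languages can be sketched with a small helper. `join_languages` is a hypothetical name used only for illustration; it is not part of kreuzberg.

```python
from __future__ import annotations


def join_languages(codes: list[str]) -> str:
    """Combine Tesseract language codes into one '+'-separated string."""
    # Strip whitespace, drop empty entries, and keep the first occurrence of each code.
    seen: list[str] = []
    for code in codes:
        code = code.strip()
        if code and code not in seen:
            seen.append(code)
    if not seen:
        raise ValueError("at least one language code is required")
    return "+".join(seen)


print(join_languages(["eng", "deu"]))  # eng+deu
```

The resulting string is the kind of value the `language` attribute expects.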
language_model_ngram_on: bool = True
Enable or disable the use of n-gram-based language models for improved text recognition.
psm: PSMMode = PSMMode.AUTO
Page segmentation mode (PSM) to guide Tesseract on how to segment the image (e.g., single block, single line).
tessedit_dont_blkrej_good_wds: bool = True
If True, prevents block rejection of words identified as good, improving text output quality.
tessedit_dont_rowrej_good_wds: bool = True
If True, prevents row rejection of words identified as good, avoiding unnecessary omissions.
tessedit_enable_dict_correction: bool = True
Enable or disable dictionary-based correction for recognized text to improve word accuracy.
tessedit_use_primary_params_model: bool = True
If True, forces the use of the primary parameters model for text recognition.
textord_space_size_is_variable: bool = True
Allow variable spacing between words, useful for text with irregular spacing.
thresholding_method: bool = True
Enable or disable specific thresholding methods during image preprocessing for better OCR accuracy.
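Several of the boolean attributes above share their names with Tesseract configuration variables, which the `tesseract` command line accepts as `-c NAME=VALUE` pairs. The following is a minimal sketch of such a mapping, not a description of how kreuzberg passes these options internally. `TesseractFlagsSketch` and `to_cli_args` are hypothetical stand-ins covering only three of the documented fields, and writing booleans as 1/0 is an assumption of this sketch.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass
class TesseractFlagsSketch:
    """Stand-in holding a few of the boolean options documented above."""

    language_model_ngram_on: bool = True
    tessedit_enable_dict_correction: bool = True
    textord_space_size_is_variable: bool = True


def to_cli_args(flags: TesseractFlagsSketch) -> list[str]:
    """Render each boolean field as a `-c name=1` or `-c name=0` pair."""
    args: list[str] = []
    for field in fields(flags):
        value = getattr(flags, field.name)
        args += ["-c", f"{field.name}={int(value)}"]
    return args


# Override one default and inspect the resulting argument list.
print(to_cli_args(TesseractFlagsSketch(tessedit_enable_dict_correction=False)))
```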
PSMMode
Page Segmentation Mode options for Tesseract:
kreuzberg.PSMMode
Bases: Enum
Enum for Tesseract Page Segmentation Modes (PSM) with human-readable values.
Source code in kreuzberg/_ocr/_tesseract.py
Attributes
AUTO = 3
Fully automatic page segmentation without orientation and script detection (default).
AUTO_ONLY = 2
Automatic page segmentation without orientation and script detection (OSD) or OCR.
AUTO_OSD = 1
Automatic page segmentation with orientation and script detection.
CIRCLE_WORD = 9
Treat the image as a single word in a circle.
OSD_ONLY = 0
Orientation and script detection only.
SINGLE_BLOCK = 6
Assume a single uniform block of text.
SINGLE_BLOCK_VERTICAL = 5
Assume a single uniform block of vertically aligned text.
SINGLE_CHAR = 10
Treat the image as a single character.
SINGLE_COLUMN = 4
Assume a single column of text.
SINGLE_LINE = 7
Treat the image as a single text line.
SINGLE_WORD = 8
Treat the image as a single word.
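The integer values above correspond to the numbers accepted by the `--psm` option of the `tesseract` command line. As a rough illustration, the sketch below uses a local stand-in enum, `PSMModeSketch`, that mirrors three of the listed members so the example runs without kreuzberg installed. `psm_args` is a hypothetical helper, not a kreuzberg function.

```python
from __future__ import annotations

from enum import Enum


class PSMModeSketch(Enum):
    """Stand-in mirroring a few of the values listed above."""

    AUTO = 3
    SINGLE_BLOCK = 6
    SINGLE_LINE = 7


def psm_args(mode: PSMModeSketch) -> list[str]:
    """Build the `--psm N` argument pair for the tesseract command line."""
    return ["--psm", str(mode.value)]


print(psm_args(PSMModeSketch.SINGLE_LINE))  # ['--psm', '7']
```

A single-line mode such as `SINGLE_LINE` suits cropped inputs like one row of a form, while `AUTO` remains the general-purpose default.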
EasyOCRConfig
Configuration for the EasyOCR engine:
kreuzberg.EasyOCRConfig
dataclass
Configuration options for EasyOCR.
Source code in kreuzberg/_ocr/_easyocr.py
Attributes
add_margin: float = 0.1
Extend bounding boxes in all directions.
adjust_contrast: float = 0.5
Target contrast level for low contrast text.
beam_width: int = 5
Beam width for beam search in recognition.
canvas_size: int = 2560
Maximum image dimension for detection.
contrast_ths: float = 0.1
Contrast threshold for preprocessing.
decoder: Literal['greedy', 'beamsearch', 'wordbeamsearch'] = 'greedy'
Decoder method. Options: 'greedy', 'beamsearch', 'wordbeamsearch'.
height_ths: float = 0.5
Maximum difference in box height for merging.
language: str | list[str] = 'en'
Language or languages to use for OCR.
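Because `language` accepts either a single string or a list, code that consumes it usually needs one canonical form. EasyOCR's `Reader` takes a list of language codes, so a list is the natural target. The helper below is a minimal sketch of that normalization; `normalize_languages` is a hypothetical name and not part of kreuzberg or EasyOCR.

```python
from __future__ import annotations


def normalize_languages(language: str | list[str]) -> list[str]:
    """Return the language setting as a list of lower-case codes."""
    # A bare string is treated as a single language code.
    codes = [language] if isinstance(language, str) else list(language)
    cleaned = [code.strip().lower() for code in codes if code.strip()]
    if not cleaned:
        raise ValueError("at least one language code is required")
    return cleaned


print(normalize_languages("en"))          # ['en']
print(normalize_languages(["en", "de"]))  # ['en', 'de']
```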
link_threshold: float = 0.4
Link confidence threshold.
low_text: float = 0.4
Text low-bound score.
mag_ratio: float = 1.0
Image magnification ratio.
min_size: int = 10
Minimum text box size in pixels.
rotation_info: list[int] | None = None
List of angles to try for detection.
slope_ths: float = 0.1
Maximum slope for merging text boxes.
text_threshold: float = 0.7
Text confidence threshold.
use_gpu: bool = False
Whether to use GPU for inference.
width_ths: float = 0.5
Maximum horizontal distance for merging boxes.
x_ths: float = 1.0
Maximum horizontal distance for paragraph merging.
y_ths: float = 0.5
Maximum vertical distance for paragraph merging.
ycenter_ths: float = 0.5
Maximum shift in y direction for merging.
PaddleOCRConfig
Configuration for the PaddleOCR engine:
kreuzberg.PaddleOCRConfig
dataclass
Configuration options for PaddleOCR.
This dataclass provides type hints and documentation for all PaddleOCR parameters.
Source code in kreuzberg/_ocr/_paddleocr.py
Attributes
cls_image_shape: str = '3,48,192'
Image shape for classification algorithm in format 'channels,height,width'.
det_algorithm: Literal['DB', 'EAST', 'SAST', 'PSE', 'FCE', 'PAN', 'CT', 'DB++', 'Layout'] = 'DB'
Detection algorithm.
det_db_box_thresh: float = 0.5
Score threshold for detected boxes. Boxes below this value are discarded.
det_db_thresh: float = 0.3
Binarization threshold for DB output map.
det_db_unclip_ratio: float = 2.0
Expansion ratio for detected text boxes.
det_east_cover_thresh: float = 0.1
Score threshold for EAST output boxes.
det_east_nms_thresh: float = 0.2
NMS threshold for EAST model output boxes.
det_east_score_thresh: float = 0.8
Binarization threshold for EAST output map.
det_max_side_len: int = 960
Maximum size of image long side. Images exceeding this will be proportionally resized.
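Proportional resizing means both dimensions are scaled by the same factor so that the longer side equals the limit. The sketch below shows only that arithmetic. `fit_long_side` is a hypothetical helper, and rounding to the nearest pixel is an assumption; the engine itself may round differently or apply further constraints.

```python
from __future__ import annotations


def fit_long_side(width: int, height: int, max_side_len: int = 960) -> tuple[int, int]:
    """Scale (width, height) so the longer side does not exceed max_side_len."""
    long_side = max(width, height)
    if long_side <= max_side_len:
        # Already within the limit: leave the size unchanged.
        return width, height
    scale = max_side_len / long_side
    # Apply one scale factor to both sides to preserve the aspect ratio.
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_long_side(1920, 1080))  # (960, 540)
print(fit_long_side(800, 600))    # (800, 600)
```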
drop_score: float = 0.5
Filter recognition results by confidence score. Results below this are discarded.
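The effect of `drop_score` can be pictured as a simple confidence filter. This is an illustrative sketch on hand-written sample data, not PaddleOCR's own code. `filter_by_score` and the `(text, score)` tuple layout are assumptions made for the example.

```python
from __future__ import annotations


def filter_by_score(
    results: list[tuple[str, float]], drop_score: float = 0.5
) -> list[tuple[str, float]]:
    """Keep only (text, score) pairs whose score is at least drop_score."""
    return [(text, score) for text, score in results if score >= drop_score]


sample = [("Invoice", 0.98), ("l1|", 0.31), ("Total", 0.50)]
print(filter_by_score(sample))  # [('Invoice', 0.98), ('Total', 0.5)]
```

Raising the threshold trades recall for precision: fewer noisy fragments survive, but faint or unusual text is more likely to be dropped.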
enable_mkldnn: bool = False
Whether to enable MKL-DNN acceleration (Intel CPU only).
gpu_mem: int = 8000
GPU memory size (in MB) to use for initialization.
language: str = 'en'
Language to use for OCR.
max_text_length: int = 25
Maximum text length that the recognition algorithm can recognize.
rec: bool = True
Enable text recognition when using the ocr() function.
rec_algorithm: Literal['CRNN', 'SRN', 'NRTR', 'SAR', 'SEED', 'SVTR', 'SVTR_LCNet', 'ViTSTR', 'ABINet', 'VisionLAN', 'SPIN', 'RobustScanner', 'RFL'] = 'CRNN'
Recognition algorithm.
rec_image_shape: str = '3,32,320'
Image shape for recognition algorithm in format 'channels,height,width'.
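Both `cls_image_shape` and `rec_image_shape` use the same 'channels,height,width' string format. A small parser makes the format concrete and catches malformed values early. `parse_image_shape` is a hypothetical helper written for this illustration.

```python
from __future__ import annotations


def parse_image_shape(shape: str) -> tuple[int, int, int]:
    """Parse a 'channels,height,width' string into a tuple of three integers."""
    parts = [int(part) for part in shape.split(",")]
    if len(parts) != 3:
        raise ValueError(f"expected 'channels,height,width', got {shape!r}")
    if any(value <= 0 for value in parts):
        raise ValueError(f"all dimensions must be positive, got {shape!r}")
    channels, height, width = parts
    return channels, height, width


print(parse_image_shape("3,32,320"))  # (3, 32, 320)
print(parse_image_shape("3,48,192"))  # (3, 48, 192)
```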
table: bool = True
Whether to enable table recognition.
use_angle_cls: bool = True
Whether to use text orientation classification model.
use_gpu: bool = False
Whether to use GPU for inference. Requires installing the paddlepaddle-gpu package.
use_space_char: bool = True
Whether to recognize spaces.
use_zero_copy_run: bool = False
Whether to enable zero_copy_run for inference optimization.