OCR Configuration¶

Configuration classes for the supported OCR engines.

TesseractConfig¶

Default OCR engine configuration:

`kreuzberg.TesseractConfig` `dataclass` ¶

Configuration options for Tesseract OCR engine.

Source code in kreuzberg/_ocr/_tesseract.py

```python
@dataclass(unsafe_hash=True, frozen=True)
class TesseractConfig:
    """Configuration options for Tesseract OCR engine."""

    classify_use_pre_adapted_templates: bool = True
    """Whether to use pre-adapted templates during classification to improve recognition accuracy."""
    language: str = "eng"
    """Language code to use for OCR.

    Examples:
        - 'eng' for English
        - 'deu' for German
        - multiple languages combined with '+', e.g. 'eng+deu'
    """
    language_model_ngram_on: bool = True
    """Enable or disable the use of n-gram-based language models for improved text recognition."""
    psm: PSMMode = PSMMode.AUTO
    """Page segmentation mode (PSM) to guide Tesseract on how to segment the image (e.g., single block, single line)."""
    tessedit_dont_blkrej_good_wds: bool = True
    """If True, prevents block rejection of words identified as good, improving text output quality."""
    tessedit_dont_rowrej_good_wds: bool = True
    """If True, prevents row rejection of words identified as good, avoiding unnecessary omissions."""
    tessedit_enable_dict_correction: bool = True
    """Enable or disable dictionary-based correction for recognized text to improve word accuracy."""
    tessedit_use_primary_params_model: bool = True
    """If True, forces the use of the primary parameters model for text recognition."""
    textord_space_size_is_variable: bool = True
    """Allow variable spacing between words, useful for text with irregular spacing."""
    thresholding_method: bool = False
    """Enable or disable specific thresholding methods during image preprocessing for better OCR accuracy."""
```

Attributes¶

`classify_use_pre_adapted_templates: bool = True` `class-attribute` `instance-attribute` ¶

Whether to use pre-adapted templates during classification to improve recognition accuracy.

`language: str = 'eng'` `class-attribute` `instance-attribute` ¶

Language code to use for OCR. Examples:

- 'eng' for English
- 'deu' for German
- multiple languages combined with '+', e.g. 'eng+deu'
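
As a small illustration of the `+` convention described above, the sketch below builds a combined language string from a list of codes. `join_languages` is a hypothetical helper written for this example, not part of kreuzberg.

```python
def join_languages(codes: list[str]) -> str:
    """Join Tesseract language codes with '+', e.g. ['eng', 'deu'] -> 'eng+deu'."""
    # Drop empty entries and surrounding whitespace before joining.
    cleaned = [code.strip() for code in codes if code.strip()]
    if not cleaned:
        raise ValueError("at least one language code is required")
    return "+".join(cleaned)


language = join_languages(["eng", "deu"])
```

The resulting string could then be passed as the `language` argument of `TesseractConfig`.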

`language_model_ngram_on: bool = True` `class-attribute` `instance-attribute` ¶

Enable or disable the use of n-gram-based language models for improved text recognition.

`psm: PSMMode = PSMMode.AUTO` `class-attribute` `instance-attribute` ¶

Page segmentation mode (PSM) to guide Tesseract on how to segment the image (e.g., single block, single line).

`tessedit_dont_blkrej_good_wds: bool = True` `class-attribute` `instance-attribute` ¶

If True, prevents block rejection of words identified as good, improving text output quality.

`tessedit_dont_rowrej_good_wds: bool = True` `class-attribute` `instance-attribute` ¶

If True, prevents row rejection of words identified as good, avoiding unnecessary omissions.

`tessedit_enable_dict_correction: bool = True` `class-attribute` `instance-attribute` ¶

Enable or disable dictionary-based correction for recognized text to improve word accuracy.
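
This field, like most of the boolean fields here, appears to share its name with a Tesseract control parameter, and the `tesseract` command-line tool accepts such parameters as `-c NAME=VALUE`. How kreuzberg forwards them is not shown on this page, so the helper below (`to_cli_flags`, a hypothetical name) is only a sketch of one plausible mapping, assuming booleans are written as `1` and `0`.

```python
def to_cli_flags(options: dict[str, bool]) -> list[str]:
    """Render boolean options as repeated '-c NAME=VALUE' arguments."""
    flags: list[str] = []
    # Sort by name so the generated command line is deterministic.
    for name, enabled in sorted(options.items()):
        flags += ["-c", f"{name}={1 if enabled else 0}"]
    return flags


# Two of the documented defaults, copied by hand for illustration.
flags = to_cli_flags(
    {
        "tessedit_enable_dict_correction": True,
        "thresholding_method": False,
    }
)
```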

`tessedit_use_primary_params_model: bool = True` `class-attribute` `instance-attribute` ¶

If True, forces the use of the primary parameters model for text recognition.

`textord_space_size_is_variable: bool = True` `class-attribute` `instance-attribute` ¶

Allow variable spacing between words, useful for text with irregular spacing.

`thresholding_method: bool = False` `class-attribute` `instance-attribute` ¶

Enable or disable specific thresholding methods during image preprocessing for better OCR accuracy.
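
Because the class is declared with `@dataclass(unsafe_hash=True, frozen=True)`, instances cannot be modified after creation and can be used as dictionary keys. The sketch below demonstrates both properties on a trimmed-down stand-in (`DemoTesseractConfig`, a hypothetical class with only two of the fields above) so that it runs without kreuzberg installed; the same pattern should carry over to the real class.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


# Hypothetical stand-in with the same decorator as TesseractConfig,
# reduced to two fields so the example runs without kreuzberg installed.
@dataclass(unsafe_hash=True, frozen=True)
class DemoTesseractConfig:
    language: str = "eng"
    tessedit_enable_dict_correction: bool = True


base = DemoTesseractConfig()

# Frozen instances reject attribute assignment.
try:
    base.language = "deu"  # type: ignore[misc]
    mutated = True
except FrozenInstanceError:
    mutated = False

# dataclasses.replace builds a modified copy instead.
german = replace(base, language="deu")

# Equal field values give equal hashes, so a config can key a cache.
cache = {base: "default settings"}
lookup = cache[DemoTesseractConfig()]
```

To change a setting, build a new instance (directly or with `dataclasses.replace`) rather than assigning to a field.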

PSMMode¶

Page Segmentation Mode options for Tesseract:

`kreuzberg.PSMMode` ¶

Bases: Enum

Enum for Tesseract Page Segmentation Modes (PSM) with human-readable values.

Source code in kreuzberg/_ocr/_tesseract.py

```python
class PSMMode(Enum):
    """Enum for Tesseract Page Segmentation Modes (PSM) with human-readable values."""

    OSD_ONLY = 0
    """Orientation and script detection only."""
    AUTO_OSD = 1
    """Automatic page segmentation with orientation and script detection."""
    AUTO_ONLY = 2
    """Automatic page segmentation without OSD."""
    AUTO = 3
    """Fully automatic page segmentation (default)."""
    SINGLE_COLUMN = 4
    """Assume a single column of text."""
    SINGLE_BLOCK_VERTICAL = 5
    """Assume a single uniform block of vertically aligned text."""
    SINGLE_BLOCK = 6
    """Assume a single uniform block of text."""
    SINGLE_LINE = 7
    """Treat the image as a single text line."""
    SINGLE_WORD = 8
    """Treat the image as a single word."""
    CIRCLE_WORD = 9
    """Treat the image as a single word in a circle."""
    SINGLE_CHAR = 10
    """Treat the image as a single character."""
```

Attributes¶

`AUTO = 3` `class-attribute` `instance-attribute` ¶

Fully automatic page segmentation (default).

`AUTO_ONLY = 2` `class-attribute` `instance-attribute` ¶

Automatic page segmentation without OSD.

`AUTO_OSD = 1` `class-attribute` `instance-attribute` ¶

Automatic page segmentation with orientation and script detection.

`CIRCLE_WORD = 9` `class-attribute` `instance-attribute` ¶

Treat the image as a single word in a circle.

`OSD_ONLY = 0` `class-attribute` `instance-attribute` ¶

Orientation and script detection only.

`SINGLE_BLOCK = 6` `class-attribute` `instance-attribute` ¶

Assume a single uniform block of text.

`SINGLE_BLOCK_VERTICAL = 5` `class-attribute` `instance-attribute` ¶

Assume a single uniform block of vertically aligned text.

`SINGLE_CHAR = 10` `class-attribute` `instance-attribute` ¶

Treat the image as a single character.

`SINGLE_COLUMN = 4` `class-attribute` `instance-attribute` ¶

Assume a single column of text.

`SINGLE_LINE = 7` `class-attribute` `instance-attribute` ¶

Treat the image as a single text line.

`SINGLE_WORD = 8` `class-attribute` `instance-attribute` ¶

Treat the image as a single word.
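
The integer values of the members match the numbers that the `tesseract` command-line tool takes for its `--psm` option. The sketch below uses a trimmed copy of the enum (`DemoPSMMode`, a hypothetical name with three of the members above) to show how a member maps to that option and how a member can be looked up by number. `psm_args` is likewise a hypothetical helper, not a kreuzberg function.

```python
from enum import Enum


class DemoPSMMode(Enum):
    """Trimmed copy of three PSMMode members, for illustration only."""

    AUTO = 3
    SINGLE_BLOCK = 6
    SINGLE_LINE = 7


def psm_args(mode: DemoPSMMode) -> list[str]:
    """Return the '--psm N' arguments for the tesseract CLI."""
    return ["--psm", str(mode.value)]


args = psm_args(DemoPSMMode.SINGLE_LINE)
by_number = DemoPSMMode(6)  # look a member up by its integer value
```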

EasyOCRConfig¶

Configuration for the EasyOCR engine:

`kreuzberg.EasyOCRConfig` `dataclass` ¶

Configuration options for EasyOCR.

Source code in kreuzberg/_ocr/_easyocr.py

```python
@dataclass(unsafe_hash=True, frozen=True)
class EasyOCRConfig:
    """Configuration options for EasyOCR."""

    add_margin: float = 0.1
    """Extend bounding boxes in all directions."""
    adjust_contrast: float = 0.5
    """Target contrast level for low contrast text."""
    beam_width: int = 5
    """Beam width for beam search in recognition."""
    canvas_size: int = 2560
    """Maximum image dimension for detection."""
    contrast_ths: float = 0.1
    """Contrast threshold for preprocessing."""
    decoder: Literal["greedy", "beamsearch", "wordbeamsearch"] = "greedy"
    """Decoder method. Options: 'greedy', 'beamsearch', 'wordbeamsearch'."""
    height_ths: float = 0.5
    """Maximum difference in box height for merging."""
    language: str | list[str] = "en"
    """Language or languages to use for OCR."""
    link_threshold: float = 0.4
    """Link confidence threshold."""
    low_text: float = 0.4
    """Text low-bound score."""
    mag_ratio: float = 1.0
    """Image magnification ratio."""
    min_size: int = 10
    """Minimum text box size in pixels."""
    rotation_info: list[int] | None = None
    """List of angles to try for detection."""
    slope_ths: float = 0.1
    """Maximum slope for merging text boxes."""
    text_threshold: float = 0.7
    """Text confidence threshold."""
    use_gpu: bool = False
    """Whether to use GPU for inference."""
    width_ths: float = 0.5
    """Maximum horizontal distance for merging boxes."""
    x_ths: float = 1.0
    """Maximum horizontal distance for paragraph merging."""
    y_ths: float = 0.5
    """Maximum vertical distance for paragraph merging."""
    ycenter_ths: float = 0.5
    """Maximum shift in y direction for merging."""
```

Attributes¶

`add_margin: float = 0.1` `class-attribute` `instance-attribute` ¶

Extend bounding boxes in all directions.

`adjust_contrast: float = 0.5` `class-attribute` `instance-attribute` ¶

Target contrast level for low contrast text.

`beam_width: int = 5` `class-attribute` `instance-attribute` ¶

Beam width for beam search in recognition.

`canvas_size: int = 2560` `class-attribute` `instance-attribute` ¶

Maximum image dimension for detection.

`contrast_ths: float = 0.1` `class-attribute` `instance-attribute` ¶

Contrast threshold for preprocessing.

`decoder: Literal['greedy', 'beamsearch', 'wordbeamsearch'] = 'greedy'` `class-attribute` `instance-attribute` ¶

Decoder method. Options: 'greedy', 'beamsearch', 'wordbeamsearch'.
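
A `Literal` annotation is checked by static type checkers but not enforced by `dataclass` at runtime, so a misspelled decoder name would not be rejected when the config is constructed. The sketch below shows one way to validate the value up front; `check_decoder` and the `Decoder` alias are hypothetical names introduced for this example.

```python
from typing import Literal, get_args

# Same set of options as the decoder field documented above.
Decoder = Literal["greedy", "beamsearch", "wordbeamsearch"]


def check_decoder(value: str) -> str:
    """Raise ValueError unless value is one of the documented decoders."""
    allowed = get_args(Decoder)
    if value not in allowed:
        raise ValueError(f"decoder must be one of {allowed}, got {value!r}")
    return value


decoder = check_decoder("beamsearch")
```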

`height_ths: float = 0.5` `class-attribute` `instance-attribute` ¶

Maximum difference in box height for merging.

`language: str | list[str] = 'en'` `class-attribute` `instance-attribute` ¶

Language or languages to use for OCR.
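
Since the field accepts either a single code or a list of codes, code that consumes it usually needs one uniform shape. The sketch below normalizes both forms to a list; `as_language_list` is a hypothetical helper, and whether kreuzberg normalizes the value this way internally is an assumption.

```python
from typing import Union


def as_language_list(language: Union[str, list[str]]) -> list[str]:
    """Return the language setting as a list of codes."""
    if isinstance(language, str):
        return [language]
    # Copy so the caller's list is not shared with the result.
    return list(language)


single = as_language_list("en")
several = as_language_list(["en", "de"])
```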

`link_threshold: float = 0.4` `class-attribute` `instance-attribute` ¶

Link confidence threshold.

`low_text: float = 0.4` `class-attribute` `instance-attribute` ¶

Text low-bound score.

`mag_ratio: float = 1.0` `class-attribute` `instance-attribute` ¶

Image magnification ratio.

`min_size: int = 10` `class-attribute` `instance-attribute` ¶

Minimum text box size in pixels.

`rotation_info: list[int] | None = None` `class-attribute` `instance-attribute` ¶

List of angles to try for detection.

`slope_ths: float = 0.1` `class-attribute` `instance-attribute` ¶

Maximum slope for merging text boxes.

`text_threshold: float = 0.7` `class-attribute` `instance-attribute` ¶

Text confidence threshold.

`use_gpu: bool = False` `class-attribute` `instance-attribute` ¶

Whether to use GPU for inference.

`width_ths: float = 0.5` `class-attribute` `instance-attribute` ¶

Maximum horizontal distance for merging boxes.

`x_ths: float = 1.0` `class-attribute` `instance-attribute` ¶

Maximum horizontal distance for paragraph merging.

`y_ths: float = 0.5` `class-attribute` `instance-attribute` ¶

Maximum vertical distance for paragraph merging.

`ycenter_ths: float = 0.5` `class-attribute` `instance-attribute` ¶

Maximum shift in y direction for merging.

PaddleOCRConfig¶

Configuration for the PaddleOCR engine:

`kreuzberg.PaddleOCRConfig` `dataclass` ¶

Configuration options for PaddleOCR.

This dataclass provides type hints and documentation for all PaddleOCR parameters.

Source code in kreuzberg/_ocr/_paddleocr.py

```python
@dataclass(unsafe_hash=True, frozen=True)
class PaddleOCRConfig:
    """Configuration options for PaddleOCR.

    This dataclass provides type hints and documentation for all PaddleOCR parameters.
    """

    cls_image_shape: str = "3,48,192"
    """Image shape for classification algorithm in format 'channels,height,width'."""
    det_algorithm: Literal["DB", "EAST", "SAST", "PSE", "FCE", "PAN", "CT", "DB++", "Layout"] = "DB"
    """Detection algorithm."""
    det_db_box_thresh: float = 0.5
    """Score threshold for detected boxes. Boxes below this value are discarded."""
    det_db_thresh: float = 0.3
    """Binarization threshold for DB output map."""
    det_db_unclip_ratio: float = 2.0
    """Expansion ratio for detected text boxes."""
    det_east_cover_thresh: float = 0.1
    """Score threshold for EAST output boxes."""
    det_east_nms_thresh: float = 0.2
    """NMS threshold for EAST model output boxes."""
    det_east_score_thresh: float = 0.8
    """Binarization threshold for EAST output map."""
    det_max_side_len: int = 960
    """Maximum size of image long side. Images exceeding this will be proportionally resized."""
    drop_score: float = 0.5
    """Filter recognition results by confidence score. Results below this are discarded."""
    enable_mkldnn: bool = False
    """Whether to enable MKL-DNN acceleration (Intel CPU only)."""
    gpu_mem: int = 8000
    """GPU memory size (in MB) to use for initialization."""
    language: str = "en"
    """Language to use for OCR."""
    max_text_length: int = 25
    """Maximum text length that the recognition algorithm can recognize."""
    rec: bool = True
    """Enable text recognition when using the ocr() function."""
    rec_algorithm: Literal[
        "CRNN",
        "SRN",
        "NRTR",
        "SAR",
        "SEED",
        "SVTR",
        "SVTR_LCNet",
        "ViTSTR",
        "ABINet",
        "VisionLAN",
        "SPIN",
        "RobustScanner",
        "RFL",
    ] = "CRNN"
    """Recognition algorithm."""
    rec_image_shape: str = "3,32,320"
    """Image shape for recognition algorithm in format 'channels,height,width'."""
    table: bool = True
    """Whether to enable table recognition."""
    use_angle_cls: bool = True
    """Whether to use text orientation classification model."""
    use_gpu: bool = False
    """Whether to use GPU for inference. Requires installing the paddlepaddle-gpu package."""
    use_space_char: bool = True
    """Whether to recognize spaces."""
    use_zero_copy_run: bool = False
    """Whether to enable zero_copy_run for inference optimization."""
```

Attributes¶

`cls_image_shape: str = '3,48,192'` `class-attribute` `instance-attribute` ¶

Image shape for classification algorithm in format 'channels,height,width'.
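
The shape is stored as a comma-separated string rather than a tuple. As a minimal sketch of how such a value can be checked and unpacked, the hypothetical helper `parse_image_shape` below converts the `'channels,height,width'` format into three integers; it is not part of kreuzberg or PaddleOCR.

```python
def parse_image_shape(shape: str) -> tuple[int, int, int]:
    """Parse 'channels,height,width' into a tuple of three ints."""
    # int() tolerates surrounding whitespace, so '3, 48, 192' also parses.
    parts = [int(part) for part in shape.split(",")]
    if len(parts) != 3:
        raise ValueError(f"expected 3 comma-separated integers, got {shape!r}")
    channels, height, width = parts
    return channels, height, width


cls_shape = parse_image_shape("3,48,192")
```

The same format is used by `rec_image_shape`, so the helper would apply to that field as well.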

`det_algorithm: Literal['DB', 'EAST', 'SAST', 'PSE', 'FCE', 'PAN', 'CT', 'DB++', 'Layout'] = 'DB'` `class-attribute` `instance-attribute` ¶

Detection algorithm.

`det_db_box_thresh: float = 0.5` `class-attribute` `instance-attribute` ¶

Score threshold for detected boxes. Boxes below this value are discarded.

`det_db_thresh: float = 0.3` `class-attribute` `instance-attribute` ¶

Binarization threshold for DB output map.

`det_db_unclip_ratio: float = 2.0` `class-attribute` `instance-attribute` ¶

Expansion ratio for detected text boxes.

`det_east_cover_thresh: float = 0.1` `class-attribute` `instance-attribute` ¶

Score threshold for EAST output boxes.

`det_east_nms_thresh: float = 0.2` `class-attribute` `instance-attribute` ¶

NMS threshold for EAST model output boxes.

`det_east_score_thresh: float = 0.8` `class-attribute` `instance-attribute` ¶

Binarization threshold for EAST output map.

`det_max_side_len: int = 960` `class-attribute` `instance-attribute` ¶

Maximum size of image long side. Images exceeding this will be proportionally resized.
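
To make "proportionally resized" concrete, the sketch below computes the target dimensions for an image whose longer side exceeds the limit. `resized_dimensions` is a hypothetical helper; PaddleOCR's actual preprocessing may round differently, so treat the numbers as an approximation of the idea rather than the library's exact output.

```python
def resized_dimensions(width: int, height: int, max_side_len: int = 960) -> tuple[int, int]:
    """Scale (width, height) so the longer side is at most max_side_len."""
    longest = max(width, height)
    if longest <= max_side_len:
        # Already within the limit: leave the size unchanged.
        return width, height
    scale = max_side_len / longest
    # Apply the same factor to both sides to keep the aspect ratio.
    return max(1, round(width * scale)), max(1, round(height * scale))


size = resized_dimensions(1920, 1080)
```

With the default limit of 960, a 1920x1080 image is halved on both axes.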

`drop_score: float = 0.5` `class-attribute` `instance-attribute` ¶

Filter recognition results by confidence score. Results below this are discarded.
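
The filtering rule can be sketched as a simple comparison against the threshold. The helper `filter_by_score` and the sample `(text, confidence)` pairs below are made up for illustration, and the assumption that a score exactly equal to the threshold is kept follows from the wording "below this are discarded".

```python
def filter_by_score(
    results: list[tuple[str, float]], drop_score: float = 0.5
) -> list[tuple[str, float]]:
    """Keep (text, confidence) pairs whose confidence is at least drop_score."""
    return [(text, score) for text, score in results if score >= drop_score]


# Illustrative sample data, not real OCR output.
kept = filter_by_score([("Invoice", 0.98), ("l1l", 0.31), ("Total", 0.5)])
```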

`enable_mkldnn: bool = False` `class-attribute` `instance-attribute` ¶

Whether to enable MKL-DNN acceleration (Intel CPU only).

`gpu_mem: int = 8000` `class-attribute` `instance-attribute` ¶

GPU memory size (in MB) to use for initialization.

`language: str = 'en'` `class-attribute` `instance-attribute` ¶

Language to use for OCR.

`max_text_length: int = 25` `class-attribute` `instance-attribute` ¶

Maximum text length that the recognition algorithm can recognize.

`rec: bool = True` `class-attribute` `instance-attribute` ¶

Enable text recognition when using the ocr() function.

`rec_algorithm: Literal['CRNN', 'SRN', 'NRTR', 'SAR', 'SEED', 'SVTR', 'SVTR_LCNet', 'ViTSTR', 'ABINet', 'VisionLAN', 'SPIN', 'RobustScanner', 'RFL'] = 'CRNN'` `class-attribute` `instance-attribute` ¶

Recognition algorithm.

`rec_image_shape: str = '3,32,320'` `class-attribute` `instance-attribute` ¶

Image shape for recognition algorithm in format 'channels,height,width'.

`table: bool = True` `class-attribute` `instance-attribute` ¶

Whether to enable table recognition.

`use_angle_cls: bool = True` `class-attribute` `instance-attribute` ¶

Whether to use text orientation classification model.

`use_gpu: bool = False` `class-attribute` `instance-attribute` ¶

Whether to use GPU for inference. Requires installing the paddlepaddle-gpu package.

`use_space_char: bool = True` `class-attribute` `instance-attribute` ¶

Whether to recognize spaces.

`use_zero_copy_run: bool = False` `class-attribute` `instance-attribute` ¶

Whether to enable zero_copy_run for inference optimization.
