medkit.text.preprocessing.char_replacer#
Classes#
CharReplacer: Generic character replacer to be used as a pre-processing module.
Module Contents#
- class medkit.text.preprocessing.char_replacer.CharReplacer(output_label: str, rules: list[tuple[str, str]] | None = None, name: str | None = None, uid: str | None = None)#
Bases: medkit.core.operation.Operation
Generic character replacer to be used as a pre-processing module.
This module is non-destructive: it replaces selected single-character strings with the desired strings of n characters. It keeps track of span modifications by creating a new text-bound annotation that records how its spans relate to the input text.
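To make the idea of a non-destructive replacement concrete, here is a minimal pure-Python sketch. It is not medkit's implementation: the helper `replace_chars` and its return format are hypothetical, chosen only to show how a 1-char to n-chars replacement can keep a mapping back to the original offsets.

```python
def replace_chars(text, rules):
    """Illustrative helper (hypothetical, not part of medkit).

    Replaces each single character found in `rules` and records, for every
    original character, which slice of the new text it produced.
    Returns (new_text, spans) where each span is
    (new_start, new_end, orig_start, orig_end).
    """
    mapping = dict(rules)
    pieces = []
    spans = []
    pos = 0  # current offset in the new text
    for i, ch in enumerate(text):
        repl = mapping.get(ch, ch)  # unchanged characters map to themselves
        pieces.append(repl)
        spans.append((pos, pos + len(repl), i, i + 1))
        pos += len(repl)
    return "".join(pieces), spans


new_text, spans = replace_chars("cœur", [("œ", "oe")])
```

Because every slice of the new text points back to a range in the original, the original text is never lost, which is the property the description above refers to.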
- Parameters:
- output_label : str
The output label of the created annotations
- rules : list of tuple, optional
The list of replacement rules. Default: ALL_CHAR_RULES
- name : str, optional
Name describing the pre-processing module (defaults to the class name)
- uid : str, optional
Identifier of the pre-processing module
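Given the signature `rules: list[tuple[str, str]]` and the description above, each rule can be read as a pair (single character to replace, replacement string). The sketch below builds such a list and checks its shape. The helper `check_rules` and the constant `CUSTOM_RULES` are hypothetical names used for illustration; whether medkit performs a similar validation is an assumption.

```python
def check_rules(rules):
    """Hypothetical validator: ensure each rule maps exactly one character."""
    checked = []
    for old, new in rules:
        if len(old) != 1:
            raise ValueError(f"rule {old!r} must replace exactly one character")
        checked.append((old, new))
    return checked


# Example rule list: (character to replace, replacement string)
CUSTOM_RULES = [
    ("œ", "oe"),       # ligature expanded to two characters
    ("«", '"'),        # opening French quotation mark
    ("\u00a0", " "),   # non-breaking space
]

validated = check_rules(CUSTOM_RULES)
```

A list of this shape is what would be passed as the `rules` argument, for example `CharReplacer(output_label="clean_text", rules=CUSTOM_RULES)`, where the label value is only an example.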
- init_args#
- output_label#
- rules#
- run(segments: list[medkit.core.text.Segment]) → list[medkit.core.text.Segment]#
Run the module on a list of input segments and return a new list of segments.
- Parameters:
- segments : list of Segment
List of segments to process
- Returns:
- list of Segment
List of new segments
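The contract of `run` (new segments out, inputs left untouched) can be sketched with a stand-in type. `ToySegment` and the free function `run` below are hypothetical and only mimic the documented behaviour; they do not use medkit's `Segment` class and omit span tracking entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToySegment:
    """Stand-in for a text segment (hypothetical, not medkit's Segment)."""
    label: str
    text: str


def run(segments, output_label, rules):
    """Return new segments carrying `output_label`; inputs are not modified."""
    # str.maketrans accepts a dict mapping single characters to strings
    table = str.maketrans(dict(rules))
    return [ToySegment(output_label, seg.text.translate(table)) for seg in segments]


raw = [ToySegment("RAW_TEXT", "œuvre n° 3")]
clean = run(raw, "clean_text", [("œ", "oe"), ("°", "o")])
```

The input list `raw` still holds the original text after the call, while `clean` holds one new segment per input segment, labelled with the output label.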
- _process_segment_text(segment: medkit.core.text.Segment)#