medkit.text.preprocessing.regexp_replacer#
Classes#
RegexpReplacer | Generic pattern replacer to be used as a pre-processing module.
Module Contents#
- class medkit.text.preprocessing.regexp_replacer.RegexpReplacer(output_label: str, rules: list[tuple[str, str]] | None = None, name: str | None = None, uid: str | None = None)#
Bases: medkit.core.operation.Operation
Generic pattern replacer to be used as a pre-processing module.
This module is non-destructive: it replaces each match of a regex pattern with new text without altering the input text. Span modifications are preserved by creating a new text-bound annotation that records how its spans map back to the input text.
- Parameters:
- output_label : str
The output label of the created annotations
- rules : list of tuple, optional
The list of replacement rules [(pattern_to_replace, new_text)]
- name : str, optional
Name describing the pre-processing module (defaults to the class name)
- uid : str, optional
Identifier of the pre-processing module
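To make the "non-destructive" idea concrete, here is a minimal standard-library sketch of rule-based replacement that also records where each replacement came from in the original text. It is not medkit's implementation: the helper name `replace_with_spans` and the returned mapping format are assumptions made for illustration, and it assumes simple patterns without backreferences.

```python
import re


def replace_with_spans(text, rules):
    """Apply (pattern_to_replace, new_text) rules in a single pass.

    Returns the new text and a list of (new_span, original_span) pairs,
    one per replacement. Hypothetical helper, not part of medkit.
    """
    if not rules:
        return text, []
    # One named group per rule: the name of the group that matched
    # tells us which rule to apply.
    combined = re.compile(
        "|".join(f"(?P<r{i}>{pattern})" for i, (pattern, _) in enumerate(rules))
    )
    parts, mapping = [], []
    cursor = 0   # position in the original text
    new_len = 0  # length of the new text built so far
    for match in combined.finditer(text):
        unchanged = text[cursor:match.start()]
        replacement = rules[int(match.lastgroup[1:])][1]
        parts += [unchanged, replacement]
        new_start = new_len + len(unchanged)
        new_len = new_start + len(replacement)
        mapping.append(((new_start, new_len), match.span()))
        cursor = match.end()
    parts.append(text[cursor:])
    return "".join(parts), mapping


original = "n° 5 ± 2"
new_text, mapping = replace_with_spans(original, [("n°", "number"), ("±", "+/-")])
print(new_text)  # number 5 +/- 2
print(mapping)   # [((0, 6), (0, 2)), ((9, 12), (5, 6))]
```

The original string is never modified, and each entry in the mapping ties a span of the new text to the span it replaced in the input. This is the kind of information a text-bound annotation needs to stay aligned with the source document.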
- init_args#
- output_label#
- rules#
- regex_rules#
- regex_rule#
- _pattern#
- run(segments: list[medkit.core.text.Segment]) -> list[medkit.core.text.Segment] #
Run the module on a list of input segments and return a new list of segments.
- Parameters:
- segments : list of Segment
List of segments to normalize
- Returns:
- list of Segment
List of normalized segments
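The list-in, list-out contract of `run` can be sketched with the standard library alone. The `SimpleSegment` class below is a hypothetical stand-in for `medkit.core.text.Segment` (it keeps only a label and a text), and applying the rules one after another with `re.sub` is an assumption of this sketch, not a description of how medkit applies them.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SimpleSegment:
    """Hypothetical stand-in for medkit.core.text.Segment (label and text only)."""
    label: str
    text: str


def run(segments, rules, output_label):
    """Return new segments with the rules applied; inputs are left untouched."""
    normalized = []
    for segment in segments:
        text = segment.text
        for pattern, new_text in rules:
            # A callable replacement keeps new_text literal (no backslash expansion).
            text = re.sub(pattern, lambda _match, repl=new_text: repl, text)
        normalized.append(SimpleSegment(label=output_label, text=text))
    return normalized


raw = [SimpleSegment(label="raw", text="Room n°  4")]
clean = run(raw, rules=[("n°", "number"), (r"\s{2,}", " ")], output_label="clean")
print(clean[0].text)  # Room number 4
print(raw[0].text)    # Room n°  4
```

Each input segment yields a new segment carrying the output label, and the original segments remain available for later steps that need the unmodified text.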
- _normalize_segment_text(segment: medkit.core.text.Segment)#