medkit.text.preprocessing.regexp_replacer

Classes

RegexpReplacer

Generic pattern replacer to be used as a pre-processing module.

Module Contents

class medkit.text.preprocessing.regexp_replacer.RegexpReplacer(output_label: str, rules: list[tuple[str, str]] | None = None, name: str | None = None, uid: str | None = None)

Bases: medkit.core.operation.Operation

Generic pattern replacer to be used as a pre-processing module.

This module is non-destructive: it replaces text matching a regex pattern with new text without altering the input. It keeps track of span modifications by creating a new text-bound annotation that carries the span modification information relative to the input text.
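The idea of a non-destructive replacement that remembers where each piece of output came from can be illustrated with the standard library alone. The sketch below is not medkit's implementation: the function name `replace_keeping_spans` and the tuple layout used for the recorded pieces are hypothetical, chosen only to show the principle.

```python
import re


def replace_keeping_spans(text, pattern, new_text):
    """Return (output_text, pieces); pieces map the output back to input offsets.

    Hypothetical layout: ("kept", start, end) for text copied unchanged,
    ("replaced", length, (start, end)) for inserted text and the input
    range it stands for.
    """
    out = []
    pieces = []
    cursor = 0
    for match in re.finditer(pattern, text):
        start, end = match.span()
        if start > cursor:
            # Text between two matches is copied as-is.
            out.append(text[cursor:start])
            pieces.append(("kept", cursor, start))
        # Insert the replacement and remember which input range it replaces.
        out.append(new_text)
        pieces.append(("replaced", len(new_text), (start, end)))
        cursor = end
    if cursor < len(text):
        # Copy whatever follows the last match.
        out.append(text[cursor:])
        pieces.append(("kept", cursor, len(text)))
    return "".join(out), pieces


original = "Dr. Smith, Dr. Lee"
result, pieces = replace_keeping_spans(original, r"Dr\.", "Doctor")
```

Here `original` is left untouched, and every entry in `pieces` can be traced back to character offsets in it, which is what lets an annotation made on the replaced text be located in the input text.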

Parameters:
output_label : str

The output label of the created annotations

rules : list of tuple, optional

The list of replacement rules, each given as a (pattern_to_replace, new_text) tuple

name : str, optional

Name describing the pre-processing module (defaults to the class name)

uid : str, optional

Identifier of the pre-processing module
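To make the shape of `rules` concrete, here is a minimal standard-library sketch of applying a list of (pattern_to_replace, new_text) tuples in a single left-to-right pass. It is an illustration under assumptions, not the class's actual code: the helper `apply_rules` and the group names `rule0`, `rule1`, ... are hypothetical, and the patterns are assumed not to rely on numbered backreferences.

```python
import re


def apply_rules(text, rules):
    """Apply (pattern, new_text) rules in one pass over the text."""
    if not rules:
        return text
    # Wrap each pattern in its own named group so that the rule
    # responsible for a given match can be identified afterwards.
    combined = re.compile(
        "|".join(f"(?P<rule{i}>{pattern})" for i, (pattern, _) in enumerate(rules))
    )
    replacements = {f"rule{i}": new_text for i, (_, new_text) in enumerate(rules)}
    # A function is used as the replacement so that new_text is inserted
    # literally, without processing backslash escapes.
    return combined.sub(lambda m: replacements[m.lastgroup], text)


rules = [(r"n°", "number"), (r"\s{2,}", " ")]
cleaned = apply_rules("Room n° 12   is  ready", rules)
```

With these two rules, `cleaned` is `"Room number 12 is ready"`. When several patterns could match at the same position, the alternation picks the one listed first, so rule order matters in this sketch.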

init_args
output_label
rules
regex_rules
regex_rule
_pattern
run(segments: list[medkit.core.text.Segment]) -> list[medkit.core.text.Segment]

Run the module on a list of input segments and return a new list of segments.

Parameters:
segments : list of Segment

List of segments to normalize

Returns:
list of Segment

List of normalized segments

_normalize_segment_text(segment: medkit.core.text.Segment)