medkit.text.preprocessing.char_replacer

Classes

CharReplacer

Generic character replacer to be used as a pre-processing module.

Module Contents

class medkit.text.preprocessing.char_replacer.CharReplacer(output_label: str, rules: list[tuple[str, str]] | None = None, name: str | None = None, uid: str | None = None)

Bases: medkit.core.operation.Operation

Generic character replacer to be used as a pre-processing module.

This module is non-destructive: it replaces each selected single character with the desired string of n characters without altering the input text. It preserves span information by creating a new text-bound annotation that records how the modified spans map back to the input text.
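The idea of replacing one character with several while keeping track of where each replacement came from can be sketched in plain Python. This is only an illustration under stated assumptions, not medkit's implementation: the function name `replace_chars` and the tuple layout of the mapping are hypothetical, and medkit stores this information in its own span objects rather than in tuples.

```python
def replace_chars(text, rules):
    """Replace single characters and record where each replacement came from.

    Hypothetical helper, not part of medkit.
    Returns (new_text, mapping), where mapping holds one
    (new_start, new_end, orig_start, orig_end) tuple per replaced character.
    """
    table = dict(rules)  # 1-char source -> n-char target
    pieces = []
    mapping = []
    cursor = 0  # current offset in the output text
    for i, ch in enumerate(text):
        out = table.get(ch, ch)
        if ch in table:
            # Remember which original character produced this output range.
            mapping.append((cursor, cursor + len(out), i, i + 1))
        pieces.append(out)
        cursor += len(out)
    return "".join(pieces), mapping


original = "cœur ½"
new_text, mapping = replace_chars(original, [("œ", "oe"), ("½", "1/2")])
print(new_text)   # coeur 1/2
print(mapping)    # [(1, 3, 1, 2), (6, 9, 5, 6)]
print(original)   # the input string is left untouched
```

Because the original string is never modified and every replaced range points back to its source offsets, any annotation made on the cleaned text can still be located in the input.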

Parameters:
output_label : str

The output label of the created annotations

rules : list of tuple, optional

The list of replacement rules. Default: ALL_CHAR_RULES

name : str, optional

Name describing the pre-processing module (defaults to the class name)

uid : str, optional

Identifier of the pre-processing module
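A custom `rules` list has the shape given in the signature, `list[tuple[str, str]]`, with a single character as the first element of each pair. The sketch below builds such a list and checks that shape before use. The helper `check_rules` is hypothetical, and whether `CharReplacer` performs an equivalent check itself is an assumption this page does not confirm.

```python
def check_rules(rules):
    """Return the rules as a list, or raise ValueError on a malformed entry.

    Hypothetical helper, not part of medkit.
    """
    checked = []
    for source, target in rules:
        # The module replaces 1-char strings, so the source must be one character.
        if not isinstance(source, str) or len(source) != 1:
            raise ValueError(f"rule source must be exactly 1 character, got {source!r}")
        if not isinstance(target, str):
            raise ValueError(f"rule target must be a string, got {target!r}")
        checked.append((source, target))
    return checked


custom_rules = check_rules([
    ("\u00a0", " "),   # non-breaking space -> regular space
    ("«", '"'),        # French opening quote -> straight quote
    ("»", '"'),        # French closing quote -> straight quote
    ("…", "..."),      # ellipsis character -> three dots
])
```

A list validated this way could then be passed as the `rules` argument in place of the default.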

init_args
output_label
rules
run(segments: list[medkit.core.text.Segment]) -> list[medkit.core.text.Segment]

Run the module on a list of input segments and return a new list of segments.

Parameters:
segments : list of Segment

List of segments to process

Returns:
list of Segment

List of new segments
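The contract of `run` (a list of segments in, a new list of segments out, inputs left as they were) can be mimicked with a stand-in type. The sketch below does not use medkit: `SketchSegment` and `run_sketch` are hypothetical names, and Python's built-in `str.translate` is swapped in for the replacement step, so no span information is kept here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchSegment:
    """Hypothetical stand-in for a text segment: a label plus its text."""
    label: str
    text: str


def run_sketch(segments, output_label, rules):
    """Return one new segment per input segment, leaving the inputs untouched."""
    # str.maketrans accepts a dict mapping single characters to strings of any length.
    table = str.maketrans(dict(rules))
    return [
        SketchSegment(label=output_label, text=seg.text.translate(table))
        for seg in segments
    ]


inputs = [SketchSegment("raw", "œuf"), SketchSegment("raw", "1 ≤ 2")]
outputs = run_sketch(inputs, "clean_text", [("œ", "oe"), ("≤", "<=")])
print([seg.text for seg in outputs])  # ['oeuf', '1 <= 2']
print([seg.text for seg in inputs])   # ['œuf', '1 ≤ 2']
```

Returning fresh objects under a separate output label keeps the raw and cleaned versions side by side, which is what makes a replacement step of this kind non-destructive.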

_process_segment_text(segment: medkit.core.text.Segment)