medkit.text.spacy.pipeline

Classes

SpacyPipeline

Segment annotator relying on a spaCy pipeline.

Module Contents

class medkit.text.spacy.pipeline.SpacyPipeline(nlp: spacy.Language, spacy_entities: list[str] | None = None, spacy_span_groups: list[str] | None = None, spacy_attrs: list[str] | None = None, medkit_attribute_factories: dict[str, Callable[[spacy.tokens.Span, str], medkit.core.Attribute]] | None = None, name: str | None = None, uid: str | None = None)

Bases: medkit.core.operation.Operation

Segment annotator relying on a spaCy pipeline.

init_args
nlp
spacy_entities
spacy_span_groups
spacy_attrs
medkit_attribute_factories
run(segments: list[medkit.core.text.Segment]) -> list[medkit.core.text.Segment]

Run the operation.

Run a spaCy pipeline on the input segments and return a new list of segments. Each segment is first converted to a spaCy document (a Doc object). The spaCy pipeline is then executed on that document, and the resulting annotations and attributes are converted into medkit annotations.

Parameters:
segments : list of Segment

List of segments on which to run the spaCy pipeline

Returns:
list of Segment

List of new annotations
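
The three steps described for run() (convert each segment to a document, execute the pipeline, convert the results back into segments) can be sketched in plain Python. This is only an illustration of the flow, not the medkit implementation: ToySegment, toy_pipeline and toy_run are hypothetical stand-ins that use neither medkit nor spaCy, and the way offsets are shifted back to the source segment is an assumption of this sketch.

```python
import re
from dataclasses import dataclass


@dataclass
class ToySegment:
    """Hypothetical stand-in for a medkit Segment (not the real class)."""
    label: str
    text: str
    start: int  # character offset in the original document
    end: int


def toy_pipeline(text):
    """Hypothetical stand-in for a spaCy pipeline.

    Returns (label, start, end) triples for every capitalized word,
    with offsets relative to the text it receives.
    """
    pattern = re.compile(r"\b[A-Z][a-z]+\b")
    return [("CAPITALIZED", m.start(), m.end()) for m in pattern.finditer(text)]


def toy_run(segments, pipeline=toy_pipeline):
    """Mimic the flow described for run(): one pass per input segment."""
    new_segments = []
    for seg in segments:
        # Step 1: the segment text plays the role of the spaCy Doc.
        doc_text = seg.text
        # Step 2: execute the pipeline on that document.
        results = pipeline(doc_text)
        # Step 3: convert each result into a new segment. Shifting offsets
        # by seg.start is an assumption made for this sketch.
        for label, start, end in results:
            new_segments.append(
                ToySegment(label, doc_text[start:end], seg.start + start, seg.start + end)
            )
    return new_segments


# Usage: one input sentence that starts at character 10 of its document.
segments = [ToySegment("sentence", "Patient saw Doctor Smith.", 10, 35)]
new_segments = toy_run(segments)
for s in new_segments:
    print(s.label, s.text, s.start, s.end)
```

In this sketch the input list is never modified: toy_run only reads the input segments and returns freshly created ones, mirroring the statement that run() returns a new list of segments.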

_find_segments_in_spacy_doc(spacy_doc: spacy.tokens.Doc, medkit_source_ann: medkit.core.text.Segment)