medkit.text.spacy#
Submodules#
Classes#
SpacyDocPipeline | DocPipeline to obtain annotations created using spacy.
SpacyPipeline | Segment annotator relying on a Spacy pipeline.
Package Contents#
- class medkit.text.spacy.SpacyDocPipeline(nlp: spacy.Language, medkit_labels_anns: list[str] | None = None, medkit_attrs: list[str] | None = None, spacy_entities: list[str] | None = None, spacy_span_groups: list[str] | None = None, spacy_attrs: list[str] | None = None, medkit_attribute_factories: dict[str, Callable[[spacy.tokens.Span, str], medkit.core.Attribute]] | None = None, name: str | None = None, uid: str | None = None)#
Bases: medkit.core.DocOperation
DocPipeline to obtain annotations created using spacy.
- init_args#
- nlp#
- medkit_labels_anns#
- medkit_attrs#
- spacy_entities#
- spacy_span_groups#
- spacy_attrs#
- medkit_attribute_factories#
- run(medkit_docs: list[medkit.core.text.TextDocument]) -> None#
Run a spacy pipeline on a list of medkit documents.
Each medkit document is converted to a spaCy document (Doc object) carrying the selected annotations and attributes. The spaCy pipeline is then executed, and the new annotations and attributes it produces are converted back into medkit annotations.
- Parameters:
- medkit_docs : list of TextDocument
List of TextDocuments on which to run the pipeline
- class medkit.text.spacy.SpacyPipeline(nlp: spacy.Language, spacy_entities: list[str] | None = None, spacy_span_groups: list[str] | None = None, spacy_attrs: list[str] | None = None, medkit_attribute_factories: dict[str, Callable[[spacy.tokens.Span, str], medkit.core.Attribute]] | None = None, name: str | None = None, uid: str | None = None)#
Bases: medkit.core.operation.Operation
Segment annotator relying on a Spacy pipeline.
- init_args#
- nlp#
- spacy_entities#
- spacy_span_groups#
- spacy_attrs#
- medkit_attribute_factories#
- run(segments: list[medkit.core.text.Segment]) -> list[medkit.core.text.Segment]#
Run the operation.
Runs a spaCy pipeline on the list of segments provided as input and returns a new list of segments. Each segment is converted to a spaCy document (Doc object). The spaCy pipeline is then executed, and the new annotations and attributes it produces are converted into medkit annotations.
- Parameters:
- segments : list of Segment
List of segments on which to run the spaCy pipeline
- Returns:
- list of Segment
List of new annotations
- _find_segments_in_spacy_doc(spacy_doc: spacy.tokens.Doc, medkit_source_ann: medkit.core.text.Segment)#