medkit.text.spacy.doc_pipeline

Classes

SpacyDocPipeline

DocPipeline to obtain annotations created using spaCy.

Module Contents

class medkit.text.spacy.doc_pipeline.SpacyDocPipeline(nlp: spacy.Language, medkit_labels_anns: list[str] | None = None, medkit_attrs: list[str] | None = None, spacy_entities: list[str] | None = None, spacy_span_groups: list[str] | None = None, spacy_attrs: list[str] | None = None, medkit_attribute_factories: dict[str, Callable[[spacy.tokens.Span, str], medkit.core.Attribute]] | None = None, name: str | None = None, uid: str | None = None)

Bases: medkit.core.DocOperation

DocPipeline to obtain annotations created using spaCy.

init_args
nlp
medkit_labels_anns
medkit_attrs
spacy_entities
spacy_span_groups
spacy_attrs
medkit_attribute_factories
run(medkit_docs: list[medkit.core.text.TextDocument]) → None

Run a spaCy pipeline on a list of medkit documents.

Each medkit document is converted to a spaCy document (a Doc object) carrying the selected annotations and attributes. The spaCy pipeline is then executed, and the new annotations and attributes it produces are converted back into medkit annotations.

Parameters:
medkit_docs : list of TextDocument

List of TextDocuments on which to run the pipeline.
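
The three steps described for run (convert, execute, convert back) can be illustrated with a minimal, standard-library-only sketch. It is not medkit's implementation and uses neither medkit nor spaCy: ToyDocument, ToyEntity, to_working_doc, toy_nlp and run_pipeline are hypothetical stand-ins invented for illustration. It also assumes that new annotations are attached to the input documents in place, which is consistent with the None return type but not stated explicitly above.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class ToyEntity:
    # Hypothetical stand-in for a medkit annotation.
    label: str
    start: int
    end: int


@dataclass
class ToyDocument:
    # Hypothetical stand-in for a medkit TextDocument.
    text: str
    entities: List[ToyEntity] = field(default_factory=list)


def to_working_doc(doc: ToyDocument, labels: Optional[List[str]]) -> dict:
    """Step 1: copy the text and the selected annotations into a working object.

    The dict plays the role of the spaCy Doc; labels=None selects everything.
    """
    selected = [e for e in doc.entities if labels is None or e.label in labels]
    return {"text": doc.text, "spans": [(e.label, e.start, e.end) for e in selected]}


def toy_nlp(working: dict) -> dict:
    """Step 2: stand-in 'pipeline' that tags every occurrence of 'aspirin' as DRUG."""
    needle = "aspirin"
    lowered = working["text"].lower()
    start = lowered.find(needle)
    while start != -1:
        working["spans"].append(("DRUG", start, start + len(needle)))
        start = lowered.find(needle, start + 1)
    return working


def run_pipeline(
    docs: List[ToyDocument],
    nlp: Callable[[dict], dict],
    labels: Optional[List[str]] = None,
) -> None:
    """Convert each document, run the pipeline, and attach only the new spans."""
    for doc in docs:
        working = to_working_doc(doc, labels)
        n_before = len(working["spans"])  # spans that came from the input document
        working = nlp(working)
        # Step 3: convert spans added by the pipeline back into document annotations.
        for label, start, end in working["spans"][n_before:]:
            doc.entities.append(ToyEntity(label, start, end))


docs = [ToyDocument("Patient takes aspirin daily.", [ToyEntity("PERSON", 0, 7)])]
run_pipeline(docs, toy_nlp)
for entity in docs[0].entities:
    print(entity.label, docs[0].text[entity.start:entity.end])
```

Counting the spans before the pipeline runs lets the sketch convert back only what the pipeline added, so annotations that were already on the document are not duplicated.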