medkit.text.ner.edsnlp_date_matcher
Classes
EDSNLPDateMatcher | Date matcher based on the EDS-NLP dates pipeline.
Module Contents
- class medkit.text.ner.edsnlp_date_matcher.EDSNLPDateMatcher(output_label: str = 'date', attrs_to_copy: list[str] | None = None, uid: str | None = None)
Bases: medkit.core.text.operation.NEROperation
Date matcher based on the EDS-NLP dates pipeline.
Note that this operation is designed to run on French documents.
Absolute dates (e.g. “23/08/2021”), relative dates (e.g. “la semaine dernière”) and durations (e.g. “pendant quatre jours”) will be matched.
For each date that is found, an entity will be created with an attribute attached to it containing normalized values of the date components. The attribute label will be either “date” or “duration”, and the class of the attribute will be either DateAttribute, RelativeDateAttribute or DurationAttribute.
- Parameters:
- output_label : str, default="date"
Label to use for the date entities created (the label of the attributes will always be “date” or “duration”).
- attrs_to_copy : list of str, optional
Labels of the attributes that should be copied from the input segment to the created date entity. Useful for propagating context attributes (negation, antecedent, etc.).
- uid : str, optional
Identifier of the matcher.
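As a hedged illustration of the parameters above, the sketch below builds a matcher that copies a context attribute onto the date entities it creates. The `is_negated` label, the `"sentence"` segment label and the helper name `match_dates_with_context` are assumptions made for illustration only. The `Segment`, `Span` and `Attribute` constructors are used as they are understood to exist in medkit and should be checked against the installed version.

```python
def match_dates_with_context(text: str, negated: bool):
    """Run the date matcher on one hand-built segment carrying a context attribute."""
    # Imported lazily so this sketch can be loaded without medkit/EDS-NLP installed.
    from medkit.core import Attribute
    from medkit.core.text import Segment, Span
    from medkit.text.ner.edsnlp_date_matcher import EDSNLPDateMatcher

    # Hypothetical input segment covering the whole text, with an assumed
    # "is_negated" context attribute attached to it.
    segment = Segment(label="sentence", text=text, spans=[Span(0, len(text))])
    segment.attrs.add(Attribute(label="is_negated", value=negated))

    # attrs_to_copy lists the attribute labels to propagate from the input
    # segment to each created date entity.
    matcher = EDSNLPDateMatcher(output_label="date", attrs_to_copy=["is_negated"])
    entities = matcher.run([segment])

    # For each entity, return its text and the values of the copied attribute.
    return [
        (entity.text, [attr.value for attr in entity.attrs.get(label="is_negated")])
        for entity in entities
    ]
```

Calling `match_dates_with_context("Pas de fièvre depuis le 23/08/2021", True)` would then be expected to return the matched date text together with the propagated attribute value, assuming EDS-NLP is installed.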
- output_label
- attrs_to_copy
- _edsnlp
- run(segments: list[medkit.core.text.Segment]) -> list[medkit.core.text.Entity]
Find and return date entities for all segments.
- Parameters:
- segments : list of Segment
List of segments in which to look for date mentions.
- Returns:
- list of Entity
Date entities found in segments, with DateAttribute, RelativeDateAttribute or DurationAttribute attributes.
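A minimal usage sketch of `run()` follows, assuming medkit is installed with its EDS-NLP dependency. The helper name `describe_dates` is hypothetical, and the use of `TextDocument.raw_segment` and `attrs.get(label=...)` reflects medkit's API as it is understood here rather than anything stated on this page.

```python
def describe_dates(text: str):
    """Return (matched text, attribute class names) for each date entity found."""
    # Imported lazily so this sketch can be loaded without medkit/EDS-NLP installed.
    from medkit.core.text import TextDocument
    from medkit.text.ner.edsnlp_date_matcher import EDSNLPDateMatcher

    doc = TextDocument(text=text)
    matcher = EDSNLPDateMatcher()

    # run() takes a list of segments; here the whole document text is used
    # as a single segment.
    entities = matcher.run([doc.raw_segment])

    described = []
    for entity in entities:
        # The attribute label is either "date" or "duration"; its class is one
        # of DateAttribute, RelativeDateAttribute or DurationAttribute.
        attrs = entity.attrs.get(label="date") + entity.attrs.get(label="duration")
        described.append((entity.text, [type(attr).__name__ for attr in attrs]))
    return described
```

For example, `describe_dates("Hospitalisé le 23/08/2021 pendant quatre jours.")` should yield one entry per matched mention, each paired with the class name of its normalized attribute.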
- _find_dates_in_segment(segment, spacy_doc) -> Iterator[medkit.core.text.Entity]
- _build_entity(segment, spacy_span, is_duration) -> medkit.core.text.Entity