medkit.text.ner.edsnlp_date_matcher

Classes

EDSNLPDateMatcher

Date matcher based on the EDS-NLP dates pipeline.

Module Contents

class medkit.text.ner.edsnlp_date_matcher.EDSNLPDateMatcher(output_label: str = 'date', attrs_to_copy: list[str] | None = None, uid: str | None = None)

Bases: medkit.core.text.operation.NEROperation

Date matcher based on the EDS-NLP dates pipeline.

Note that this operation is designed to run on French documents.

Absolute dates (e.g. “23/08/2021”), relative dates (e.g. “la semaine dernière”) and durations (e.g. “pendant quatre jours”) will be matched.

For each date found, an entity is created with an attached attribute containing the normalized values of the date components. The attribute label is either “date” or “duration”, and the attribute is an instance of DateAttribute, RelativeDateAttribute or DurationAttribute.

Parameters:
output_label : str, default="date"

Label to use for the created date entities (the label of the attributes will always be “date” or “duration”)

attrs_to_copy : list of str, optional

Labels of the attributes that should be copied from the input segment to the created date entity. Useful for propagating context attributes (negation, antecedent, etc.).

uid : str, optional

Identifier of the matcher
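
A minimal sketch of how these parameters might be passed, assuming medkit and its EDS-NLP dependencies are installed. The helper name `run_date_matcher` is hypothetical and not part of medkit; `TextDocument` and its `raw_segment` are taken from `medkit.core.text`.

```python
def run_date_matcher(text, output_label="date", attrs_to_copy=None):
    """Run EDSNLPDateMatcher on the raw segment of a new document.

    Hypothetical helper for illustration, not part of medkit.
    """
    # Imported lazily so this module can be loaded without medkit/edsnlp installed
    from medkit.core.text import TextDocument
    from medkit.text.ner.edsnlp_date_matcher import EDSNLPDateMatcher

    doc = TextDocument(text=text)
    matcher = EDSNLPDateMatcher(
        output_label=output_label,    # label given to the created entities
        attrs_to_copy=attrs_to_copy,  # labels of segment attributes to propagate
    )
    # The raw segment covers the full text of the document
    return matcher.run([doc.raw_segment])
```

Calling `run_date_matcher("Le patient a été hospitalisé le 23/08/2021 pendant quatre jours.")` would be expected to return date entities, although the exact matches depend on the installed EDS-NLP version.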

output_label
attrs_to_copy
_edsnlp
run(segments: list[medkit.core.text.Segment]) -> list[medkit.core.text.Entity]

Find and return date entities for all segments.

Parameters:
segments : list of Segment

List of segments in which to look for date mentions

Returns:
list of Entity

Date entities found in segments, with DateAttribute, RelativeDateAttribute or DurationAttribute attributes.
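
The returned entities can be inspected through their attributes. The sketch below assumes each entity exposes `text` and an `attrs` container with a `get(label=...)` method, as medkit entities do; the helper name `summarize_date_entities` is hypothetical.

```python
def summarize_date_entities(entities):
    """Return (entity text, attribute label, attribute class name) tuples.

    Hypothetical helper for illustration; it relies only on `entity.text`
    and `entity.attrs.get(label=...)`.
    """
    rows = []
    for entity in entities:
        # Attributes created by the matcher are labelled "date" or "duration"
        for label in ("date", "duration"):
            for attr in entity.attrs.get(label=label):
                rows.append((entity.text, label, type(attr).__name__))
    return rows
```

The class name in each tuple shows whether a mention was normalized as a DateAttribute, a RelativeDateAttribute or a DurationAttribute.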

_find_dates_in_segment(segment, spacy_doc) -> Iterator[medkit.core.text.Entity]
_build_entity(segment, spacy_span, is_duration) -> medkit.core.text.Entity