medkit.audio.transcription.transcribed_text_document#

Classes#

TranscribedTextDocument

Text document generated by audio transcription.

Module Contents#

class medkit.audio.transcription.transcribed_text_document.TranscribedTextDocument(text: str, text_spans_to_audio_spans: dict[medkit.core.text.Span, medkit.core.audio.Span], audio_doc_id: str | None, anns: Sequence[medkit.core.text.TextAnnotation] | None = None, attrs: Sequence[medkit.core.Attribute] | None = None, metadata: dict[str, Any] | None = None, uid: str | None = None)#

Bases: medkit.core.text.TextDocument

Text document generated by audio transcription.

Parameters:
text: str

The full transcribed text.

text_spans_to_audio_spans: dict of TextSpan to AudioSpan

Mapping between text character spans in this document and the corresponding audio spans in the original audio.

audio_doc_id: str, optional

Identifier for the original AudioDocument that was transcribed, if known.

anns: sequence of TextAnnotation, optional

Annotations of the document.

attrs: sequence of Attribute, optional

Attributes of the document.

metadata: dict of str to Any, optional

Document metadata.

uid: str, optional

Document identifier.

Attributes:
raw_segment: TextSegment

Auto-generated segment containing the raw full transcribed text.
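
As an illustration of the shape of text_spans_to_audio_spans, here is a minimal sketch that joins transcribed segments into one text and records which character range each audio range produced. It uses only the standard library: TextSpan and AudioSpan below are hypothetical stand-ins for medkit.core.text.Span and medkit.core.audio.Span, and build_transcript, its tuple input format, and the newline separator are assumptions made for this example, not part of medkit.

```python
from typing import NamedTuple


class TextSpan(NamedTuple):
    """Hypothetical stand-in for medkit.core.text.Span (character offsets)."""

    start: int
    end: int


class AudioSpan(NamedTuple):
    """Hypothetical stand-in for medkit.core.audio.Span (times in seconds)."""

    start: float
    end: float


def build_transcript(segments, separator="\n"):
    """Join transcribed segments and record where each one lands in the text.

    `segments` is a list of (audio_start, audio_end, transcribed_text) tuples.
    This helper is illustrative only; it is not a medkit function.
    """
    text = ""
    text_spans_to_audio_spans = {}
    for audio_start, audio_end, segment_text in segments:
        if text:
            text += separator
        start = len(text)
        text += segment_text
        # The key covers exactly the characters produced by this audio range
        text_spans_to_audio_spans[TextSpan(start, len(text))] = AudioSpan(
            audio_start, audio_end
        )
    return text, text_spans_to_audio_spans


segments = [(0.0, 2.5, "Hello doctor."), (3.0, 7.2, "I have a headache.")]
text, mapping = build_transcript(segments)
```

With real medkit objects, the resulting text and mapping would be the first two arguments of the constructor documented above, together with audio_doc_id.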

text_spans_to_audio_spans: dict[medkit.core.text.Span, medkit.core.audio.Span]#
audio_doc_id: str | None#
get_containing_audio_spans(text_ann_spans: list[medkit.core.text.AnySpan]) → list[medkit.core.audio.Span]#

Return the audio spans used to transcribe the text referenced by a text annotation.

For instance, if the audio ranging from 1.0 to 20.0 seconds is transcribed to some text ranging from character 10 to 56 in the transcribed document, and then a text annotation is created referencing the span 15 to 25, then the containing audio span will be the one ranging from 1.0 to 20.0 seconds.

Note that some text annotations may be contained in more than one audio span.

Parameters:
text_ann_spans: list of AnyTextSpan

Text spans of a text annotation referencing some characters in the transcribed document.

Returns:
list of AudioSpan

Audio spans used to transcribe the text referenced by text_ann_spans.
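
The lookup described above can be sketched as an overlap test between the annotation's spans and the keys of the mapping. This is a simplified illustration under stated assumptions, not medkit's actual implementation: TextSpan and AudioSpan are hypothetical stand-ins for the medkit span classes, containing_audio_spans is a hypothetical free function, and only plain spans are handled (the real method accepts AnySpan).

```python
from typing import NamedTuple


class TextSpan(NamedTuple):
    """Hypothetical stand-in for medkit.core.text.Span (character offsets)."""

    start: int
    end: int


class AudioSpan(NamedTuple):
    """Hypothetical stand-in for medkit.core.audio.Span (times in seconds)."""

    start: float
    end: float


def containing_audio_spans(text_spans_to_audio_spans, text_ann_spans):
    """Return the audio spans whose transcribed text overlaps the annotation.

    Results keep the mapping's insertion order and contain no duplicates.
    """
    result = []
    for text_span, audio_span in text_spans_to_audio_spans.items():
        # Half-open interval overlap: [start, end) ranges share a character
        overlaps = any(
            ann.start < text_span.end and text_span.start < ann.end
            for ann in text_ann_spans
        )
        if overlaps and audio_span not in result:
            result.append(audio_span)
    return result


# The case from the description: audio 1.0-20.0 s gives characters 10-56
mapping = {
    TextSpan(10, 56): AudioSpan(1.0, 20.0),
    TextSpan(57, 80): AudioSpan(21.0, 30.0),
}
single = containing_audio_spans(mapping, [TextSpan(15, 25)])
straddling = containing_audio_spans(mapping, [TextSpan(50, 60)])
```

Here single holds only the 1.0 to 20.0 second span, while straddling holds both audio spans because the annotation crosses the boundary between the two transcribed ranges.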

to_dict(with_anns: bool = True) → dict[str, Any]#
classmethod from_dict(doc_dict: dict[str, Any]) → typing_extensions.Self#

Create a TranscribedTextDocument from a dict.

Parameters:
doc_dict: dict of str to Any

A dictionary from a serialized TranscribedTextDocument, as generated by to_dict().