medkit.io.rttm#

Classes#

RTTMInputConverter

Class for conversions from Rich Transcription Time Marked (.rttm) into turn segments.

RTTMOutputConverter

Class for conversions to Rich Transcription Time Marked (.rttm).

Module Contents#

class medkit.io.rttm.RTTMInputConverter(turn_label: str = 'turn', speaker_label: str = 'speaker', converter_id: str | None = None)#

Bases: medkit.core.InputConverter

Class for conversions from Rich Transcription Time Marked (.rttm) into turn segments.

Convert Rich Transcription Time Marked (.rttm) files containing diarization information into turn segments.

For each turn in a .rttm file containing diarization information, a Segment will be created, with an associated Attribute holding the name of the turn speaker as value. The segments can be retrieved directly or as part of an AudioDocument instance.

If a ProvTracer is set, provenance information will be added for each segment and each attribute (referencing the input converter as the operation).

Parameters:
turn_label : str, default="turn"

Label of segments representing turns in the .rttm file.

speaker_label : str, default="speaker"

Label of speaker attributes to add to each segment.

converter_id : str, optional

Identifier of the converter.

Attributes:
description : OperationDescription

Description for the operation.

uid#
turn_label#
speaker_label#
_prov_tracer: medkit.core.ProvTracer | None = None#
property description: medkit.core.OperationDescription#

Contains all the input converter init parameters.

set_prov_tracer(prov_tracer: medkit.core.ProvTracer)#

Enable provenance tracing.

Parameters:
prov_tracer : ProvTracer

The provenance tracer used to trace the provenance.

load(rttm_dir: str | pathlib.Path, audio_dir: str | pathlib.Path | None = None, audio_ext: str = '.wav') -> list[medkit.core.audio.AudioDocument]#

Load all .rttm files in a directory into a list of audio documents.

For each .rttm file, there must be a corresponding audio file with the same basename, either in the same directory or in a separate audio directory.

Parameters:
rttm_dir : str or Path

Directory containing the .rttm files.

audio_dir : str or Path, optional

Directory containing the audio files corresponding to the .rttm files, if they are not in rttm_dir.

audio_ext : str, default=".wav"

File extension to use for audio files.

Returns:
list of AudioDocument

List of generated documents.
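The basename matching described above can be sketched with the standard library alone. This is a minimal illustration, not the converter's actual code: the helper name `pair_rttm_with_audio` is hypothetical, and raising `FileNotFoundError` when an audio file is missing is an assumption about reasonable behavior.

```python
from pathlib import Path


def pair_rttm_with_audio(rttm_dir, audio_dir=None, audio_ext=".wav"):
    """Return (rttm_file, audio_file) pairs matched by basename (hypothetical helper)."""
    rttm_dir = Path(rttm_dir)
    # Fall back to the .rttm directory when no separate audio directory is given
    audio_dir = Path(audio_dir) if audio_dir is not None else rttm_dir

    pairs = []
    for rttm_file in sorted(rttm_dir.glob("*.rttm")):
        # Same basename, different extension
        audio_file = audio_dir / (rttm_file.stem + audio_ext)
        if not audio_file.exists():
            # Assumption: a missing audio file is treated as an error
            raise FileNotFoundError(f"No audio file found for {rttm_file.name}")
        pairs.append((rttm_file, audio_file))
    return pairs
```

Each resulting pair has the shape expected by load_doc(rttm_file, audio_file).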

load_doc(rttm_file: str | pathlib.Path, audio_file: str | pathlib.Path) -> medkit.core.audio.AudioDocument#

Load a single .rttm file into an audio document.

Parameters:
rttm_file : str or Path

Path to the .rttm file.

audio_file : str or Path

Path to the corresponding audio file.

Returns:
AudioDocument

Generated document.

load_turns(rttm_file: str | pathlib.Path, audio_file: str | pathlib.Path) -> list[medkit.core.audio.Segment]#

Load a .rttm file as a list of segments.

Parameters:
rttm_file : str or Path

Path to the .rttm file.

audio_file : str or Path

Path to the corresponding audio file.

Returns:
list of Segment

Turn segments as found in the .rttm file.
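To make the content of a turn concrete, the sketch below parses RTTM rows by hand. It assumes the usual ten space-separated RTTM fields (type, file id, channel, onset, duration, orthography, speaker type, speaker name, confidence, lookahead). The `parse_rttm` helper and the dictionary layout are hypothetical and do not reflect how the converter builds Segment objects internally.

```python
RTTM_FIELDS = [
    "type", "file_id", "channel", "onset", "duration",
    "orthography", "speaker_type", "speaker_name", "confidence", "lookahead",
]


def parse_rttm(lines):
    """Extract speaker turns from RTTM lines (hypothetical helper)."""
    turns = []
    for line in lines:
        parts = line.split()
        # Skip blank lines and rows that are not speaker turns
        if not parts or parts[0] != "SPEAKER":
            continue
        row = dict(zip(RTTM_FIELDS, parts))
        onset = float(row["onset"])
        turns.append(
            {
                "file_id": row["file_id"],
                "start": onset,
                # RTTM stores a duration, so the end time is derived
                "end": onset + float(row["duration"]),
                "speaker": row["speaker_name"],
            }
        )
    return turns


turns = parse_rttm(["SPEAKER rec1 1 0.50 2.25 <NA> <NA> alice <NA> <NA>"])
```

Here the single row yields one turn for speaker "alice" starting at 0.5 s and ending at 2.75 s, which corresponds to one segment and one speaker attribute in the converter's output.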

static _load_rows(rttm_file: pathlib.Path)#
_build_turn_segment(row: dict[str, Any], full_audio: medkit.core.audio.FileAudioBuffer) -> medkit.core.audio.Segment#
class medkit.io.rttm.RTTMOutputConverter(turn_label: str = 'turn', speaker_label: str = 'speaker')#

Bases: medkit.core.OutputConverter

Class for conversions to Rich Transcription Time Marked (.rttm).

Build Rich Transcription Time Marked (.rttm) files containing diarization information from Segment objects.

There must be a segment for each turn, with an associated Attribute holding the name of the turn speaker as value. The segments can be passed directly or as part of AudioDocument instances.

Parameters:
turn_label : str, default="turn"

Label of segments representing turns in the audio documents.

speaker_label : str, default="speaker"

Label of speaker attributes attached to each turn segment.

turn_label#
speaker_label#
save(docs: list[medkit.core.audio.AudioDocument], rttm_dir: str | pathlib.Path, doc_names: list[str] | None = None)#

Save a collection of audio documents to RTTM files in a directory.

Parameters:
docs : list of AudioDocument

List of audio documents to save.

rttm_dir : str or Path

Directory into which the generated .rttm files will be stored.

doc_names : list of str, optional

Names to use as basenames and file ids (2nd column) for the generated .rttm files. If not provided, the document ids are used.
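The fallback from doc_names to document ids can be illustrated as follows. This is only a sketch of the naming rule stated above: the helper `rttm_output_paths` is hypothetical, and the length check is an assumption rather than documented behavior.

```python
from pathlib import Path


def rttm_output_paths(doc_ids, rttm_dir, doc_names=None):
    """Compute one .rttm path per document (hypothetical helper)."""
    rttm_dir = Path(rttm_dir)
    # Without explicit names, the document ids serve as basenames
    if doc_names is None:
        doc_names = doc_ids
    # Assumption: names and documents must match one to one
    if len(doc_names) != len(doc_ids):
        raise ValueError("doc_names must have one entry per document")
    return [rttm_dir / f"{name}.rttm" for name in doc_names]
```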

save_doc(doc: medkit.core.audio.AudioDocument, rttm_file: str | pathlib.Path, rttm_doc_id: str | None = None)#

Save a single audio document to an RTTM file.

Parameters:
doc : AudioDocument

Audio document to save.

rttm_file : str or Path

Path of the generated .rttm file.

rttm_doc_id : str, optional

File uid to use for the generated .rttm file (2nd column). If not provided, the document uid is used.
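A round trip through both converters might look like the sketch below. It relies only on the signatures documented on this page. The wrapper function `roundtrip_rttm` is hypothetical, and all paths are supplied by the caller.

```python
def roundtrip_rttm(rttm_in, audio_file, rttm_out, rttm_doc_id=None):
    """Load a .rttm file into an AudioDocument, then save it again (hypothetical helper)."""
    # Imported lazily so the sketch can be defined even where medkit is not installed
    from medkit.io.rttm import RTTMInputConverter, RTTMOutputConverter

    # Both converters must agree on the labels used for turns and speakers
    input_converter = RTTMInputConverter(turn_label="turn", speaker_label="speaker")
    output_converter = RTTMOutputConverter(turn_label="turn", speaker_label="speaker")

    doc = input_converter.load_doc(rttm_file=rttm_in, audio_file=audio_file)
    output_converter.save_doc(doc, rttm_file=rttm_out, rttm_doc_id=rttm_doc_id)
    return doc
```

Passing rttm_doc_id explicitly is useful when the file id in the 2nd column should match the original recording name rather than the document uid.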

save_turn_segments(turn_segments: list[medkit.core.audio.Segment], rttm_file: str | pathlib.Path, rttm_doc_id: str | None)#

Save Segment objects into a .rttm file.

Parameters:
turn_segments : list of Segment

Turn segments to save.

rttm_file : str or Path

Path of the generated .rttm file.

rttm_doc_id : str, optional

File uid to use for the generated .rttm file (2nd column).
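The shape of the rows written for each turn can be sketched without medkit. The ten-field layout is the usual RTTM one. The helper names, the fixed channel value `1`, and the three-decimal formatting are assumptions made for illustration, not the converter's documented output.

```python
def format_rttm_row(file_id, onset, duration, speaker):
    """Format one speaker turn as an RTTM line (hypothetical helper)."""
    # Unused fields are filled with the <NA> placeholder
    return (
        f"SPEAKER {file_id} 1 {onset:.3f} {duration:.3f} "
        f"<NA> <NA> {speaker} <NA> <NA>"
    )


def write_rttm(turns, rttm_file, file_id):
    """Write (start, end, speaker) tuples to a .rttm file (hypothetical helper)."""
    with open(rttm_file, "w", encoding="utf-8") as fp:
        for start, end, speaker in turns:
            # RTTM expects a duration rather than an end time
            fp.write(format_rttm_row(file_id, start, end - start, speaker) + "\n")


row = format_rttm_row("rec1", 0.5, 2.25, "alice")
```

The file id passed here plays the role of rttm_doc_id: it ends up in the 2nd column of every row.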

_build_rttm_row(turn_segment: medkit.core.audio.Segment, rttm_doc_id: str | None) -> dict[str, Any]#