medkit.io.srt
Classes
| SRTInputConverter | Convert .srt files containing transcription information into turn segments with transcription attributes. |
| SRTOutputConverter | Build .srt files containing transcription information from segments. |
Module Contents
- class medkit.io.srt.SRTInputConverter(turn_segment_label: str = 'turn', transcription_attr_label: str = 'transcribed_text', converter_id: str | None = None)
- Bases: medkit.core.InputConverter
- Convert .srt files containing transcription information into turn segments with transcription attributes.
- For each turn in a .srt file, a Segment is created, with an associated Attribute holding the transcribed text as its value. The segments can be retrieved directly or as part of an AudioDocument instance.
- If a ProvTracer is set, provenance information is added for each segment and each attribute (referencing the input converter as the operation).
- Parameters:
- turn_segment_label : str, default="turn"
- Label to use for segments representing turns in the .srt file.
- transcription_attr_label : str, default="transcribed_text"
- Label to use for segment attributes containing the transcribed text.
- converter_id : str, optional
- Identifier of the converter.
 
 - uid
 - turn_segment_label
 - transcription_attr_label
 - _prov_tracer: medkit.core.ProvTracer | None = None
 - property description: medkit.core.OperationDescription
- Contains all the input converter init parameters. 
 - set_prov_tracer(prov_tracer: medkit.core.ProvTracer)
- Enable provenance tracing.
- Parameters:
- prov_tracer : ProvTracer
- Provenance tracer used to record provenance information.
 
 
 - load(srt_dir: str | pathlib.Path, audio_dir: str | pathlib.Path | None = None, audio_ext: str = '.wav') -> list[medkit.core.audio.AudioDocument]
- Load all .srt files in a directory into a list of audio documents.
- For each .srt file, there must be a corresponding audio file with the same basename, either in the same directory or in a separate audio directory.
- Parameters:
- srt_dir : str or Path
- Directory containing the .srt files.
- audio_dir : str or Path, optional
- Directory containing the audio files corresponding to the .srt files, if they are not in srt_dir.
- audio_ext : str, default=".wav"
- File extension to use for audio files.
 
- Returns:
- list of AudioDocument
- List of generated documents. 
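The basename rule described for load() can be sketched as follows. This is only an illustration: `expected_audio_path` and `load_transcripts` are hypothetical helpers, not part of medkit, and whether medkit resolves audio paths exactly this way is an assumption. Only `SRTInputConverter.load()` and its parameters come from the reference above.

```python
from pathlib import Path


def expected_audio_path(srt_file, audio_dir=None, audio_ext=".wav"):
    """Hypothetical helper: path where the audio for a .srt file is expected."""
    srt_file = Path(srt_file)
    # Fall back to the .srt file's own directory when no audio_dir is given.
    base_dir = Path(audio_dir) if audio_dir is not None else srt_file.parent
    # Same basename as the .srt file, with the audio extension instead.
    return base_dir / (srt_file.stem + audio_ext)


def load_transcripts(srt_dir, audio_dir=None, audio_ext=".wav"):
    """Hypothetical wrapper around the documented load() method."""
    # Imported lazily so the helper above can be used without medkit installed.
    from medkit.io.srt import SRTInputConverter

    converter = SRTInputConverter()
    return converter.load(srt_dir, audio_dir=audio_dir, audio_ext=audio_ext)
```

Checking `expected_audio_path(...).exists()` for each .srt file before calling `load_transcripts` is one way to find missing audio files early.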
 
 
 - load_doc(srt_file: str | pathlib.Path, audio_file: str | pathlib.Path) -> medkit.core.audio.AudioDocument
- Load a single .srt file into an audio document containing turn segments with transcription attributes.
- Parameters:
- srt_file : str or Path
- Path to the .srt file.
- audio_file : str or Path
- Path to the corresponding audio file.
 
- Returns:
- AudioDocument
- Generated document. 
 
 
 - load_segments(srt_file: str | pathlib.Path, audio_file: str | pathlib.Path) -> list[medkit.core.audio.Segment]
- Load a .srt file and return a list of segments corresponding to turns with transcription attributes.
- Parameters:
- srt_file : str or Path
- Path to the .srt file.
- audio_file : str or Path
- Path to the corresponding audio file.
 
- Returns:
- list of Segment
- Turn segments as found in the .srt file, with transcription attributes attached. 
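To make the mapping from .srt turns to segments concrete, here is a standard-library sketch of the information each turn carries: a start time, an end time, and the transcribed text. It is not medkit's implementation (the `_build_segment` signature suggests medkit relies on pysrt). `SRT_TEXT`, `to_seconds`, and `parse_turns` are hypothetical names, and the dictionary layout is an assumption chosen for illustration.

```python
import re

# Two turns in SubRip format: index, time range, text, blank line.
SRT_TEXT = """\
1
00:00:00,500 --> 00:00:02,000
Hello, how are you?

2
00:00:02,250 --> 00:00:04,750
Fine, thank you.
"""

TIMESTAMP = re.compile(r"(\d+):(\d{2}):(\d{2}),(\d{3})")


def to_seconds(timestamp):
    """Convert an 'HH:MM:SS,mmm' timestamp to seconds."""
    hours, minutes, seconds, millis = map(int, TIMESTAMP.fullmatch(timestamp).groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000


def parse_turns(srt_text):
    """Return one dict per turn with its time span and transcribed text."""
    turns = []
    for block in srt_text.strip().split("\n\n"):
        lines = block.splitlines()
        # lines[0] is the turn index, lines[1] the time range, the rest is text.
        start, end = (to_seconds(part.strip()) for part in lines[1].split("-->"))
        turns.append({"start": start, "end": end, "transcribed_text": "\n".join(lines[2:])})
    return turns
```

In medkit, the time span would live on the Segment and the text in an Attribute labelled with `transcription_attr_label`.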
 
 
 - _build_segment(srt_item: pysrt.SubRipItem, full_audio: medkit.core.audio.FileAudioBuffer) -> medkit.core.audio.Segment
 
- class medkit.io.srt.SRTOutputConverter(segment_turn_label: str = 'turn', transcription_attr_label: str = 'transcribed_text')
- Bases: medkit.core.OutputConverter
- Build .srt files containing transcription information from segments.
- There must be a segment for each turn, with an associated Attribute holding the transcribed text as its value. The segments can be passed directly or as part of AudioDocument instances.
- Parameters:
- segment_turn_label : str, default="turn"
- Label of segments representing turns in the audio documents.
- transcription_attr_label : str, default="transcribed_text"
- Label of segment attributes containing the transcribed text.
 
 - segment_turn_label
 - transcription_attr_label
 - save(docs: list[medkit.core.audio.AudioDocument], srt_dir: str | pathlib.Path, doc_names: list[str] | None = None)
- Save multiple audio documents as .srt files in a directory.
- Parameters:
- docs : list of AudioDocument
- List of audio documents to save.
- srt_dir : str or Path
- Directory into which the generated .srt files will be stored.
- doc_names : list of str, optional
- Optional list of names to use as basenames for the generated .srt files.
 
 
 - save_doc(doc: medkit.core.audio.AudioDocument, srt_file: str | pathlib.Path)
- Save a single audio document as a .srt file.
- Parameters:
- doc : AudioDocument
- Audio document to save.
- srt_file : str or Path
- Path of the generated .srt file.
 
 
 - save_segments(segments: list[medkit.core.audio.Segment], srt_file: str | pathlib.Path)
- Save segments representing turns into a .srt file.
- Parameters:
- segments : list of Segment
- Turn segments to save.
- srt_file : str or Path
- Path of the generated .srt file.
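The output side can be pictured with a similar standard-library sketch of how turns are laid out in a .srt file, followed by a round trip through the two converters. `to_srt_timestamp`, `format_turns`, and `resave_segments` are hypothetical helpers, not part of medkit, and the exact text medkit writes is an assumption. Only `load_segments()` and `save_segments()` come from the reference above.

```python
def to_srt_timestamp(seconds):
    """Format a time in seconds as an 'HH:MM:SS,mmm' SubRip timestamp."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def format_turns(turns):
    """Render (start, end, text) tuples as SubRip text, one block per turn."""
    blocks = []
    for index, (start, end, text) in enumerate(turns, start=1):
        blocks.append(f"{index}\n{to_srt_timestamp(start)} --> {to_srt_timestamp(end)}\n{text}\n")
    # Blocks are separated by a blank line.
    return "\n".join(blocks)


def resave_segments(srt_file, audio_file, out_file):
    """Hypothetical round trip: load turn segments, then write them back out."""
    # Imported lazily so the helpers above can be used without medkit installed.
    from medkit.io.srt import SRTInputConverter, SRTOutputConverter

    segments = SRTInputConverter().load_segments(srt_file, audio_file)
    SRTOutputConverter().save_segments(segments, out_file)
    return segments
```

The round trip relies on both converters using the same default labels ('turn' and 'transcribed_text'). If custom labels are passed to one converter, the same labels should be passed to the other.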
 
 
 
