Audio Components

This page describes the core audio concepts of medkit.

For more details about public APIs, refer to medkit.core.audio.

Data Structures

The AudioDocument class implements the Document protocol. It stores instances of the Segment class, which implements the Annotation protocol.

classDiagram
    direction TB
    class Document~Annotation~{
        <<protocol>>
    }
    class Annotation{
        <<protocol>>
    }
    class AudioDocument{
        uid: str
        anns: AudioAnnotationContainer
    }
    class Segment {
        uid: str
        label: str
        attrs: AttributeContainer
    }
    Document <|.. AudioDocument: implements
    Annotation <|.. Segment: implements
    AudioDocument *-- Segment : contains\n(AudioAnnotationContainer)

Audio document and annotation hierarchy

Documents

AudioDocument relies on AudioAnnotationContainer, a subclass of AnnotationContainer, to manage the annotations.

For more details about the common interfaces provided by the core components, please refer to Document.

Annotations

For the audio modality, an AudioDocument can only contain Segment annotations.
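The document/annotation relationship described above can be sketched in plain Python. This is only an illustration: SketchSegment, SketchAnnotationContainer and SketchAudioDocument are hypothetical stand-ins that mirror the attributes shown in the diagram (uid, label, attrs, anns), not medkit's actual classes, and the add() and get() methods are assumptions made for the sketch. Refer to medkit.core.audio for the real constructors and methods.

```python
from dataclasses import dataclass, field
from uuid import uuid4


@dataclass
class SketchSegment:
    """Hypothetical stand-in for an audio annotation."""

    label: str
    attrs: list = field(default_factory=list)
    uid: str = field(default_factory=lambda: str(uuid4()))


class SketchAnnotationContainer:
    """Hypothetical container that only accepts segments."""

    def __init__(self):
        self._segments = {}

    def add(self, ann):
        # Audio documents hold a single kind of annotation: segments
        if not isinstance(ann, SketchSegment):
            raise TypeError("an audio document can only contain segments")
        self._segments[ann.uid] = ann

    def get(self, label=None):
        # Return all segments, optionally filtered by label
        return [
            seg
            for seg in self._segments.values()
            if label is None or seg.label == label
        ]

    def __len__(self):
        return len(self._segments)


@dataclass
class SketchAudioDocument:
    """Hypothetical stand-in for a document that delegates to its container."""

    uid: str = field(default_factory=lambda: str(uuid4()))
    anns: SketchAnnotationContainer = field(
        default_factory=SketchAnnotationContainer
    )


doc = SketchAudioDocument()
doc.anns.add(SketchSegment(label="speech"))
doc.anns.add(SketchSegment(label="silence"))
speech_segments = doc.anns.get(label="speech")
```

Keeping the type check inside the container, rather than in the document, means the document class itself stays a thin holder of a uid and a container.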

Spans

Like text annotations, audio annotations have a span that points to the annotated part of the audio document. Unlike text annotations, they do not support multiple discontinuous spans.

Important

An audio annotation can only have one continuous span, and there is no concept of “modified spans”.
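A single continuous span boils down to a start and an end boundary. The sketch below uses a hypothetical SketchSpan class with boundaries expressed in seconds; both the class and the choice of unit are assumptions made for illustration, so check medkit.core.audio.span for the actual definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchSpan:
    """Hypothetical continuous audio span (boundaries assumed to be in seconds)."""

    start: float
    end: float

    def __post_init__(self):
        # A continuous span cannot end before it starts
        if self.end < self.start:
            raise ValueError("end must not be smaller than start")

    @property
    def length(self) -> float:
        return self.end - self.start


span = SketchSpan(start=1.5, end=4.0)
duration = span.length
```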

For more details about public APIs, please refer to medkit.core.audio.span.

Audio Buffer

Access to the actual waveform data is handled through AudioBuffer instances.

Just as text annotations store the text they refer to in their text property, audio annotations store the portion of the audio signal they refer to in an audio property holding an AudioBuffer.

The contents of an AudioBuffer may differ from the initial raw signal if the signal has been preprocessed. If the contents are identical to the initial raw signal, a FileAudioBuffer can be used (with appropriate start and end boundaries). Otherwise, a MemoryAudioBuffer must be used, because no audio file contains the corresponding signal.

Creating a new AudioBuffer containing a portion of a pre-existing buffer is done through the trim() method.
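To make the distinction between file-backed and in-memory buffers concrete, here is a minimal sketch using only the Python standard library. SketchFileBuffer and SketchMemoryBuffer are hypothetical stand-ins, not medkit's classes; the sketch assumes 16-bit mono WAV files and assumes that trim() takes sample indices relative to the buffer it is called on. The real signatures are documented in medkit.core.audio.audio_buffer.

```python
import os
import struct
import tempfile
import wave


class SketchMemoryBuffer:
    """Hypothetical buffer holding samples in memory (for modified signals)."""

    def __init__(self, signal, sample_rate):
        self.signal = list(signal)
        self.sample_rate = sample_rate

    def read(self):
        return list(self.signal)

    def trim(self, start, end):
        # Return a new buffer; the original buffer is left untouched
        return SketchMemoryBuffer(self.signal[start:end], self.sample_rate)


class SketchFileBuffer:
    """Hypothetical buffer pointing at samples [start, end) of a WAV file."""

    def __init__(self, path, start=0, end=None):
        self.path = path
        with wave.open(path, "rb") as wav:
            self.sample_rate = wav.getframerate()
            nb_frames = wav.getnframes()
        self.start = start
        self.end = nb_frames if end is None else end

    def read(self):
        # Samples are only loaded from disk when actually requested
        with wave.open(self.path, "rb") as wav:
            wav.setpos(self.start)
            raw = wav.readframes(self.end - self.start)
        return list(struct.unpack(f"<{len(raw) // 2}h", raw))

    def trim(self, start, end):
        # Only the boundaries change; no audio data is copied
        return SketchFileBuffer(self.path, self.start + start, self.start + end)


# Write a tiny 16-bit mono WAV file holding the samples 0..9
path = os.path.join(tempfile.mkdtemp(), "raw.wav")
with wave.open(path, "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)
    wav.setframerate(16000)
    wav.writeframes(struct.pack("<10h", *range(10)))

file_buffer = SketchFileBuffer(path)
portion = file_buffer.trim(2, 6)  # unmodified signal: still backed by the file
nested = portion.trim(1, 3)  # boundaries are relative to `portion`

# Once the signal is modified, no file contains it: keep it in memory
louder = SketchMemoryBuffer(
    [2 * sample for sample in portion.read()], portion.sample_rate
)
```

In this sketch, trimming a file-backed buffer is cheap because it only shifts boundaries, whereas the in-memory buffer has to carry its samples around.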

For more details about public APIs, please refer to medkit.core.audio.audio_buffer.

Operations

Abstract subclasses of Operation are defined for audio to ease the development of audio operations. Each one corresponds to a kind of run operation.

classDiagram
    Operation <|-- DocOperation
    Operation <|-- PreprocessingOperation
    Operation <|-- SegmentationOperation

Operation hierarchy
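The hierarchy can be illustrated with a small sketch of two of its branches. All class names below are hypothetical stand-ins, and the run() signatures are assumptions: the sketch supposes that a preprocessing operation returns one transformed segment per input segment, while a segmentation operation returns zero or more new segments per input segment. See medkit.core.audio.operation for the actual abstract classes.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class SketchSignalSegment:
    """Hypothetical segment reduced to a label and a list of samples."""

    label: str
    samples: list


class SketchOperation(ABC):
    @abstractmethod
    def run(self, segments):
        """Take a list of segments and return a list of segments."""


class SketchPreprocessingOperation(SketchOperation):
    """Assumed contract: one output segment per input segment."""


class SketchSegmentationOperation(SketchOperation):
    """Assumed contract: zero or more output segments per input segment."""


class Gain(SketchPreprocessingOperation):
    def __init__(self, factor, output_label):
        self.factor = factor
        self.output_label = output_label

    def run(self, segments):
        # Build new segments instead of modifying the inputs in place
        return [
            SketchSignalSegment(
                self.output_label, [s * self.factor for s in seg.samples]
            )
            for seg in segments
        ]


class FixedLengthSplitter(SketchSegmentationOperation):
    def __init__(self, chunk_size, output_label):
        self.chunk_size = chunk_size
        self.output_label = output_label

    def run(self, segments):
        chunks = []
        for seg in segments:
            for i in range(0, len(seg.samples), self.chunk_size):
                chunk = seg.samples[i : i + self.chunk_size]
                chunks.append(SketchSignalSegment(self.output_label, chunk))
        return chunks


raw = SketchSignalSegment("raw", [1, 2, 3, 4, 5])
amplified = Gain(factor=2, output_label="amplified").run([raw])
chunks = FixedLengthSplitter(chunk_size=2, output_label="chunk").run(amplified)
```

Because both operations take and return lists of segments, the output of one can be passed directly to the next.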

For more details about public APIs, please refer to medkit.core.audio.operation.