medkit.io._brat_utils#
Attributes#
Classes#
BratEntity | A simple entity annotation data structure.
BratRelation | A simple relation data structure.
BratAttribute | A simple attribute data structure.
BratNote | A simple note data structure.
Grouping | A grouping data structure for entities of type And-Group and Or-Group.
BratAugmentedEntity | An augmented entity data structure with its relations and attributes.
RelationConf | Configuration data structure of a BratRelation.
AttributeConf | Configuration data structure of a BratAttribute.
BratAnnConfiguration | A data structure to represent 'annotation.conf' in brat documents.
Functions#
ensure_attr_value | Ensure that the attribute value is a string.
parse_file | Read an annotation file to get the Entities, Relations and Attributes in it.
parse_string | Read a string containing all annotations and extract Entities, Relations and Attributes.
_parse_entity | Parse the brat entity string into an Entity structure.
_parse_relation | Parse the annotation string into a Relation structure.
_parse_attribute | Parse the annotation string into an Attribute structure.
_parse_note | Parse the annotation string into a Note structure.
Module Contents#
- medkit.io._brat_utils.GROUPING_ENTITIES#
- medkit.io._brat_utils.GROUPING_RELATIONS#
- medkit.io._brat_utils.logger#
- class medkit.io._brat_utils.BratEntity#
A simple entity annotation data structure.
- uid: str#
- type: str#
- span: list[tuple[int, int]]#
- text: str#
- property start: int#
- property end: int#
- to_str() str #
- class medkit.io._brat_utils.BratRelation#
A simple relation data structure.
- uid: str#
- type: str#
- subj: str#
- obj: str#
- to_str() str #
- class medkit.io._brat_utils.BratAttribute#
A simple attribute data structure.
- uid: str#
- type: str#
- target: str#
- value: str = None#
- to_str() str #
- class medkit.io._brat_utils.BratNote#
A simple note data structure.
- uid: str#
- target: str#
- value: str#
- type: str = 'AnnotatorNotes'#
- to_str() str #
- medkit.io._brat_utils.ensure_attr_value(attr_value: Any) str #
Ensure that the attribute value is a string.
- class medkit.io._brat_utils.Grouping#
A grouping data structure for entities of type And-Group and Or-Group.
- uid: str#
- type: str#
- items: list[BratEntity]#
- property text#
- class medkit.io._brat_utils.BratAugmentedEntity#
An augmented entity data structure with its relations and attributes.
- uid: str#
- type: str#
- span: tuple[tuple[int, int], Ellipsis]#
- text: str#
- relations_from_me: tuple[BratRelation, Ellipsis]#
- relations_to_me: tuple[BratRelation, Ellipsis]#
- attributes: tuple[BratAttribute, Ellipsis]#
- property start: int#
- property end: int#
- class medkit.io._brat_utils.BratDocument#
- entities: dict[str, BratEntity]#
- relations: dict[str, BratRelation]#
- attributes: dict[str, BratAttribute]#
- get_augmented_entities() dict[str, BratAugmentedEntity] #
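The field names of BratAugmentedEntity suggest that each entity is enriched with the relations it takes part in and the attributes that target it. The following is a minimal sketch of that grouping, not medkit's implementation. It assumes that `relations_from_me` holds relations whose `subj` is the entity and `relations_to_me` those whose `obj` is the entity; `group_by_entity_sketch` and the plain-tuple records are hypothetical.

```python
from collections import defaultdict


def group_by_entity_sketch(entity_ids, relations, attributes):
    """Group relations and attributes by the entity they refer to.

    relations: iterable of (uid, type, subj, obj) tuples (hypothetical layout)
    attributes: iterable of (uid, type, target, value) tuples (hypothetical layout)
    """
    from_me = defaultdict(list)
    to_me = defaultdict(list)
    attrs = defaultdict(list)

    for rel in relations:
        _uid, _type, subj, obj = rel
        from_me[subj].append(rel)  # assumption: the entity is the subject
        to_me[obj].append(rel)     # assumption: the entity is the object

    for attr in attributes:
        target = attr[2]
        attrs[target].append(attr)

    # Tuples mirror the immutable field types listed for BratAugmentedEntity.
    return {
        uid: {
            "relations_from_me": tuple(from_me.get(uid, [])),
            "relations_to_me": tuple(to_me.get(uid, [])),
            "attributes": tuple(attrs.get(uid, [])),
        }
        for uid in entity_ids
    }
```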
- class medkit.io._brat_utils.RelationConf#
Bases: NamedTuple
Configuration data structure of a BratRelation.
- type: str#
- arg1: str#
- arg2: str#
- class medkit.io._brat_utils.AttributeConf#
Bases: NamedTuple
Configuration data structure of a BratAttribute.
- from_entity: bool#
- type: str#
- value: str#
- class medkit.io._brat_utils.BratAnnConfiguration(top_values_by_attr: int = 50)#
A data structure to represent 'annotation.conf' in brat documents.
This is necessary to generate a valid annotation project in brat. An 'annotation.conf' file has four sections. The 'events' section is not supported in medkit, so it is left empty.
- _entity_types: set[str]#
- _rel_types_arg_1: dict[str, set[str]]#
- _rel_types_arg_2: dict[str, set[str]]#
- _attr_entity_values: dict[str, list[str]]#
- _attr_relation_values: dict[str, list[str]]#
- top_values_by_attr#
- property entity_types: list[str]#
- property rel_types_arg_1: dict[str, list[str]]#
- property rel_types_arg_2: dict[str, list[str]]#
- property attr_relation_values: dict[str, list[str]]#
- property attr_entity_values: dict[str, list[str]]#
- add_entity_type(type: str)#
- add_relation_type(relation_conf: RelationConf)#
- add_attribute_type(attr_conf: AttributeConf)#
- to_str() str #
- static _attribute_to_str(type: str, values: list[str], from_entity: bool) str #
- static _relation_to_str(type: str, arg_1_types: list[str], arg_2_types: list[str]) str #
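To make the four-section layout concrete, here is a minimal sketch that renders an 'annotation.conf'-style string. It is not the output of `BratAnnConfiguration.to_str()`: the function `render_annotation_conf_sketch`, its argument layout, and the exact separators are assumptions based on the general brat configuration format, where alternatives are joined with `|` and attribute arguments use placeholders such as `<ENTITY>` and `<RELATION>`.

```python
def render_annotation_conf_sketch(entity_types, relations, attributes):
    """Render a brat-style annotation.conf string (illustrative only).

    entity_types: set of entity type names
    relations: dict mapping relation type -> (arg1 types, arg2 types)
    attributes: dict mapping attribute type -> (from_entity flag, values)
    """
    lines = ["[entities]"]
    lines.extend(sorted(entity_types))

    lines += ["", "[relations]"]
    for rel_type, (arg1_types, arg2_types) in sorted(relations.items()):
        arg1 = "|".join(arg1_types)
        arg2 = "|".join(arg2_types)
        lines.append(f"{rel_type}\tArg1:{arg1}, Arg2:{arg2}")

    # The events section is kept but left empty, as described above.
    lines += ["", "[events]"]

    lines += ["", "[attributes]"]
    for attr_type, (from_entity, values) in sorted(attributes.items()):
        arg = "<ENTITY>" if from_entity else "<RELATION>"
        line = f"{attr_type}\tArg:{arg}"
        if values:
            # Attributes without values are treated as binary flags.
            line += ", Value:" + "|".join(values)
        lines.append(line)

    return "\n".join(lines)
```

Sorting the types keeps the generated file stable between runs, which is convenient when the configuration is stored under version control.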
- medkit.io._brat_utils.parse_file(ann_path: str | pathlib.Path, detect_groups: bool = False) BratDocument #
Read an annotation file to get the Entities, Relations and Attributes in it.
All other lines are ignored.
- Parameters:
- ann_path : str or Path
The path to the annotation file to be processed.
- detect_groups : bool, default=False
If True, also parse groups of entities, which are detected from specific keywords.
- Returns:
- BratDocument
The dataclass object containing the entities, relations and attributes.
- medkit.io._brat_utils.parse_string(ann_string: str, detect_groups: bool = False) BratDocument #
Read a string containing all annotations and extract Entities, Relations and Attributes.
All other lines are ignored.
- Parameters:
- ann_string : str
The string containing all brat annotations.
- detect_groups : bool, default=False
If True, also parse groups of entities, which are detected from specific keywords.
- Returns:
- BratDocument
The dataclass object containing the entities, relations and attributes.
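The statement that all other lines are ignored can be illustrated with a small dispatch sketch. It assumes the usual brat standoff layout, where each line is `<ID>\t<content>` and the first character of the ID identifies the annotation kind (`T` for entities, `R` for relations, `A` for attributes). The function `split_annotations_sketch` is hypothetical and only sorts the raw content into buckets; it does not build a BratDocument.

```python
def split_annotations_sketch(ann_string: str) -> dict:
    """Sort brat annotation lines into buckets keyed by annotation kind."""
    prefixes = {"T": "entities", "R": "relations", "A": "attributes"}
    buckets = {kind: {} for kind in prefixes.values()}

    for line in ann_string.splitlines():
        if "\t" not in line:
            continue  # blank or malformed line: ignored
        ann_id, content = line.split("\t", maxsplit=1)
        kind = prefixes.get(ann_id[:1])
        if kind is None:
            continue  # events, notes and anything else: ignored
        buckets[kind][ann_id] = content

    return buckets
```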
- medkit.io._brat_utils._parse_entity(entity_id: str, entity_content: str) BratEntity #
Parse the brat entity string into an Entity structure.
- Parameters:
- entity_id : str
The ID defined in the brat annotation (e.g., 'T12')
- entity_content : str
The string content for this ID to parse (e.g., 'Temporal-Modifier 116 126\thistory of')
- Returns:
- BratEntity
The dataclass object representing the entity
- Raises:
- ValueError
Raised when the entity can't be parsed
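As an illustration of the expected input, the sketch below parses an entity content string of the form `<type> <offsets>\t<text>`. It is an assumption-laden stand-in, not the body of `_parse_entity`: `EntitySketch` and `parse_entity_sketch` are hypothetical names, and the handling of discontinuous spans (fragments separated by `;`) follows the general brat standoff format.

```python
from dataclasses import dataclass


@dataclass
class EntitySketch:
    """Hypothetical stand-in using the same field names as BratEntity."""

    uid: str
    type: str
    span: list
    text: str


def parse_entity_sketch(entity_id: str, entity_content: str) -> EntitySketch:
    try:
        # Split the "<type> <offsets>" header from the covered text.
        header, text = entity_content.strip().split("\t", maxsplit=1)
        entity_type, offsets = header.split(" ", maxsplit=1)
        span = []
        # Discontinuous entities list several fragments, e.g. "0 5;16 23".
        for fragment in offsets.split(";"):
            start, end = fragment.split()
            span.append((int(start), int(end)))
    except ValueError as err:
        raise ValueError(f"Could not parse entity {entity_id!r}") from err
    return EntitySketch(entity_id, entity_type, span, text)
```

Catching `ValueError` covers both a failed unpacking (missing tab or offset) and a non-numeric offset, so callers only need to handle one exception type.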
- medkit.io._brat_utils._parse_relation(relation_id: str, relation_content: str) BratRelation #
Parse the annotation string into a Relation structure.
- Parameters:
- relation_id : str
The ID defined in the brat annotation (e.g., 'R12')
- relation_content : str
The relation text content (e.g., 'Modified-By Arg1:T8 Arg2:T6\t')
- Returns:
- BratRelation
The dataclass object representing the relation
- Raises:
- ValueError
Raised when the relation can't be parsed
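A relation content string names the relation type followed by two `ArgN:<entity id>` pairs. The sketch below shows one way to split it; `RelationSketch` and `parse_relation_sketch` are hypothetical names, and mapping `Arg1` to `subj` and `Arg2` to `obj` is an assumption based on the BratRelation field names.

```python
from dataclasses import dataclass


@dataclass
class RelationSketch:
    """Hypothetical stand-in using the same field names as BratRelation."""

    uid: str
    type: str
    subj: str
    obj: str


def parse_relation_sketch(relation_id: str, relation_content: str) -> RelationSketch:
    try:
        # strip() drops the trailing tab that brat writes after the arguments.
        rel_type, arg1, arg2 = relation_content.strip().split()
        subj = arg1.split(":", maxsplit=1)[1]  # "Arg1:T8" -> "T8"
        obj = arg2.split(":", maxsplit=1)[1]   # "Arg2:T6" -> "T6"
    except (ValueError, IndexError) as err:
        raise ValueError(f"Could not parse relation {relation_id!r}") from err
    return RelationSketch(relation_id, rel_type, subj, obj)
```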
- medkit.io._brat_utils._parse_attribute(attribute_id: str, attribute_content: str) BratAttribute #
Parse the annotation string into an Attribute structure.
- Parameters:
- attribute_id : str
The attribute ID defined in the annotation (e.g., 'A1')
- attribute_content : str
The attribute text content (e.g., 'Tense T19 Past-Ended')
- Returns:
- BratAttribute
The dataclass object representing the attribute
- Raises:
- ValueError
Raised when the attribute can't be parsed
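Since `BratAttribute.value` defaults to `None`, an attribute line may carry only a type and a target. The sketch below handles both shapes; `AttributeSketch` and `parse_attribute_sketch` are hypothetical names, and treating a missing third field as a binary attribute is an assumption drawn from the brat standoff format.

```python
from typing import NamedTuple, Optional


class AttributeSketch(NamedTuple):
    """Hypothetical stand-in using the same field names as BratAttribute."""

    uid: str
    type: str
    target: str
    value: Optional[str] = None


def parse_attribute_sketch(attribute_id: str, attribute_content: str) -> AttributeSketch:
    # Expected shapes: "<type> <target>" or "<type> <target> <value>".
    parts = attribute_content.strip().split(maxsplit=2)
    if len(parts) < 2:
        raise ValueError(f"Could not parse attribute {attribute_id!r}")
    attr_type, target = parts[0], parts[1]
    value = parts[2] if len(parts) == 3 else None
    return AttributeSketch(attribute_id, attr_type, target, value)
```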
- medkit.io._brat_utils._parse_note(note_id: str, note_content: str) BratNote #
Parse the annotation string into a Note structure.
- Parameters:
- note_id : str
The note ID defined in the annotation (e.g., '#1')
- note_content : str
The note text content (e.g., 'AnnotatorNotes T10 C0011849')
- Returns:
- BratNote
The dataclass object representing the note
- Raises:
- ValueError
Raised when the note can't be parsed
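Unlike an attribute, a note always carries a value, and that value may be free text containing spaces. The sketch below splits on the first two whitespace runs only, so it accepts either a space or a tab before the value. `NoteSketch` and `parse_note_sketch` are hypothetical names, not medkit's implementation.

```python
from typing import NamedTuple


class NoteSketch(NamedTuple):
    """Hypothetical stand-in using the same field names as BratNote."""

    uid: str
    target: str
    value: str
    type: str = "AnnotatorNotes"


def parse_note_sketch(note_id: str, note_content: str) -> NoteSketch:
    # Expected shape: "<type> <target> <value>", where value may contain spaces.
    parts = note_content.strip().split(maxsplit=2)
    if len(parts) != 3:
        raise ValueError(f"Could not parse note {note_id!r}")
    note_type, target, value = parts
    return NoteSketch(uid=note_id, target=target, value=value, type=note_type)
```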