medkit.core.prov_tracer#

Classes#

Prov

Provenance information for a specific data item.

ProvTracer

Provenance tracing component.

Module Contents#

class medkit.core.prov_tracer.Prov#

Provenance information for a specific data item.

Parameters:
data_item: IdentifiableDataItem

Data item that was created (for instance an annotation or an attribute).

op_desc: OperationDescription, optional

Description of the operation that created the data item.

source_data_items: list of IdentifiableDataItem

Data items that were used by the operation to create the data item.

derived_data_items: list of IdentifiableDataItem

Data items that were created by other operations using this data item.

data_item: medkit.core.data_item.IdentifiableDataItem#
op_desc: medkit.core.operation_desc.OperationDescription | None#
source_data_items: list[medkit.core.data_item.IdentifiableDataItem]#
derived_data_items: list[medkit.core.data_item.IdentifiableDataItem]#
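As a rough illustration of how the four fields relate, the sketch below models them as a plain record. It is a stand-in, not medkit's own class: `ToyItem`, `ToyProv` and the `uid` attribute are hypothetical names, and the operation description is simplified to a string. It also assumes that `op_desc` is `None` for an item that no traced operation created, which is one plausible reading of the optional field.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToyItem:
    """Hypothetical stand-in for an identifiable data item."""

    uid: str


@dataclass
class ToyProv:
    """Hypothetical record mirroring the four documented Prov fields."""

    data_item: ToyItem
    op_desc: str | None  # simplified: a name instead of an OperationDescription
    source_data_items: list[ToyItem] = field(default_factory=list)
    derived_data_items: list[ToyItem] = field(default_factory=list)


sentence = ToyItem("sentence-1")
entity = ToyItem("entity-1")

# Assumption: the sentence was not created by a traced operation.
sentence_prov = ToyProv(sentence, None, derived_data_items=[entity])
# The entity was created from the sentence by a (hypothetical) matcher.
entity_prov = ToyProv(entity, "toy_matcher", source_data_items=[sentence])
```

The two records are reciprocal: the sentence appears among the entity's sources, and the entity appears among the sentence's derived items.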
class medkit.core.prov_tracer.ProvTracer(store: medkit.core.prov_store.ProvStore | None = None, _graph: medkit.core._prov_graph.ProvGraph | None = None)#

Provenance tracing component.

ProvTracer is intended to gather provenance information about how all data is generated by medkit. For each data item (for instance an annotation or an attribute), ProvTracer can tell which operation created it, which data items were used to create it, and reciprocally, which data items were derived from it (cf. Prov).

Provenance-compatible operations should inform the provenance tracer of each data item that they create through the add_prov() method.

Users wanting to gather provenance information should instantiate one unique ProvTracer object and provide it to all operations involved in their data processing flow. Once all operations have been executed, they may then retrieve provenance info for specific data items through get_prov(), or for all items with get_provs().
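The operation side of this contract could look roughly like the sketch below. Everything in it is hypothetical and self-contained: `RecordingTracer` only records the arguments of an `add_prov()`-like call, `ToyUppercaser` is an invented operation, and the operation description is reduced to a string. It does not reproduce how medkit operations actually receive their tracer.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ToyItem:
    """Hypothetical data item with an identifier and some text."""

    uid: str
    text: str


class RecordingTracer:
    """Hypothetical stand-in exposing only an add_prov()-like method."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, str, list[str]]] = []

    def add_prov(self, data_item: ToyItem, op_desc: str, source_data_items: list[ToyItem]) -> None:
        # Store ids only, to keep the recorded calls easy to inspect.
        self.calls.append((data_item.uid, op_desc, [s.uid for s in source_data_items]))


class ToyUppercaser:
    """Hypothetical operation creating one output item per input item."""

    def __init__(self, tracer: RecordingTracer | None = None) -> None:
        self.tracer = tracer

    def run(self, items: list[ToyItem]) -> list[ToyItem]:
        outputs = []
        for item in items:
            out = ToyItem(uid=item.uid + "-upper", text=item.text.upper())
            if self.tracer is not None:
                # Report each created item together with the item it was built from.
                self.tracer.add_prov(out, "toy_uppercaser", [item])
            outputs.append(out)
        return outputs


tracer = RecordingTracer()  # one tracer shared by every operation in the flow
first = ToyUppercaser(tracer).run([ToyItem("a", "hello")])
second = ToyUppercaser(tracer).run(first)
```

Because both operations share the same tracer, the recorded calls form a chain from `a` to `a-upper` to `a-upper-upper`.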

Composite operations relying on inner operations (such as pipelines) shouldn’t call the add_prov() method. Instead, they should instantiate their own internal ProvTracer and provide it to the operations they rely on, then use add_prov_from_sub_tracer() to integrate information from this internal sub-provenance tracer into the main provenance tracer that was provided to them.

This builds sub-provenance information, which can be retrieved later through get_sub_prov_tracer() or get_sub_prov_tracers(). The inner operations of a composite operation can themselves be composite operations, leading to a tree-like structure of nested provenance tracers.
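The composite pattern described above can be modelled with a small toy tracer. This is a sketch of the documented behaviour, not medkit's implementation: `ToyTracer` works on plain string ids rather than data item objects, and the way it resolves the inputs of the composite operation (walking back through the sub-tracer until it reaches items the sub-tracer did not create) is an assumption made for illustration.

```python
from __future__ import annotations


class ToyTracer:
    """Toy model of the documented behaviour, not medkit's implementation."""

    def __init__(self) -> None:
        self.sources: dict[str, list[str]] = {}  # item id -> source item ids
        self.ops: dict[str, str] = {}  # item id -> id of the creating operation
        self.sub_tracers: dict[str, ToyTracer] = {}  # operation id -> sub-tracer

    def add_prov(self, item_id: str, op_id: str, source_ids: list[str]) -> None:
        self.sources[item_id] = list(source_ids)
        self.ops[item_id] = op_id

    def has_prov(self, item_id: str) -> bool:
        return item_id in self.sources

    def get_sub_prov_tracer(self, op_id: str) -> ToyTracer:
        return self.sub_tracers[op_id]

    def add_prov_from_sub_tracer(self, item_ids: list[str], op_id: str, sub: ToyTracer) -> None:
        self.sub_tracers[op_id] = sub
        for item_id in item_ids:
            self.add_prov(item_id, op_id, self._external_inputs(item_id, sub))

    @staticmethod
    def _external_inputs(item_id: str, sub: ToyTracer) -> list[str]:
        # Walk back until reaching items that the sub-tracer did not create.
        inputs, seen, stack = [], set(), list(sub.sources[item_id])
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            if sub.has_prov(current):
                stack.extend(sub.sources[current])
            else:
                inputs.append(current)
        return inputs


def run_toy_pipeline(doc_id: str, outer: ToyTracer) -> str:
    """Hypothetical two-step pipeline: split into a sentence, then match an entity."""
    inner = ToyTracer()  # private tracer handed to the inner operations
    sentence_id = doc_id + "/sentence"
    inner.add_prov(sentence_id, "toy_splitter", [doc_id])
    entity_id = sentence_id + "/entity"
    inner.add_prov(entity_id, "toy_matcher", [sentence_id])
    # Only the final output is reported to the outer tracer.
    outer.add_prov_from_sub_tracer([entity_id], "toy_pipeline", inner)
    return entity_id


main = ToyTracer()
entity_id = run_toy_pipeline("doc-1", main)
```

In this model the main tracer sees the entity as created by `toy_pipeline` directly from `doc-1`. The intermediate sentence is visible only through the sub-tracer registered for `toy_pipeline`.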

Parameters:
store: ProvStore, optional

Store that will contain all traced data items.

store: medkit.core.prov_store.ProvStore#
_graph: medkit.core._prov_graph.ProvGraph#
add_prov(data_item: medkit.core.data_item.IdentifiableDataItem, op_desc: medkit.core.operation_desc.OperationDescription, source_data_items: list[medkit.core.data_item.IdentifiableDataItem])#

Append provenance information about a specific data item.

Parameters:
data_item: IdentifiableDataItem

Data item that was created.

op_desc: OperationDescription

Description of the operation that created the data item.

source_data_items: list of IdentifiableDataItem

Data items that were used by the operation to create the data item.

add_prov_from_sub_tracer(data_items: list[medkit.core.data_item.IdentifiableDataItem], op_desc: medkit.core.operation_desc.OperationDescription, sub_tracer: ProvTracer)#

Add provenance information about data items to a specific tracer.

Append provenance information about data items created by a composite operation relying on inner operations (such as a pipeline) having its own internal sub-provenance tracer.

Parameters:
data_items: list of IdentifiableDataItem

Data items created by the composite operation. Should not include internal intermediate data items, only the output of the operation.

op_desc: OperationDescription

Description of the composite operation that created the data items.

sub_tracer: ProvTracer

Internal sub-provenance tracer of the composite operation.

_add_prov_from_sub_tracer_for_data_item(data_item_id: str, operation_id: str, sub_graph: medkit.core._prov_graph.ProvGraph)#
has_prov(data_item_id: str) → bool#

Check whether a specific data item has provenance information.

Note

This will return False if provenance info about the data item exists only in a sub-provenance tracer.

Parameters:
data_item_id: str

Id of the data item.

Returns:
bool:

True if there is provenance info that can be retrieved with get_prov().

get_prov(data_item_id: str) → Prov#

Return provenance information about a specific data item.

Parameters:
data_item_id: str

Id of the data item.

Returns:
Prov:

Provenance info about the data item.

get_provs() → list[Prov]#

Return all provenance information about all data items known to the tracer.

Note

Nested provenance info from sub-provenance tracers will not be returned.

Returns:
list of Prov

Provenance info about all known data items.

has_sub_prov_tracer(operation_id: str) → bool#

Check whether the provenance tracer has a sub-provenance tracer for an operation.

Note

This will return False if there is a sub-provenance tracer for the operation but it is not a direct child of this tracer (i.e. it is deeper in the hierarchy).

Parameters:
operation_id: str

Id of the composite operation.

Returns:
bool

True if there is a sub-provenance tracer for the operation.

get_sub_prov_tracer(operation_id: str) → ProvTracer#

Return a sub-provenance tracer containing sub-provenance information for an operation.

Parameters:
operation_id: str

Id of the composite operation.

Returns:
ProvTracer

The sub-provenance tracer containing sub-provenance information from the operation.

get_sub_prov_tracers() → list[ProvTracer]#

Return all sub-provenance tracers of the provenance tracer.

Note

This will not return sub-provenance tracers that are not direct children of this tracer (i.e. that are deeper in the hierarchy).

Returns:
list of ProvTracer

All sub-provenance tracers of this provenance tracer.
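Since only direct children are returned, reaching deeper levels of the hierarchy takes a recursive walk. The sketch below shows one way to write it. `ToyTracer` and `iter_tracers` are hypothetical names; the toy class exposes only `get_provs()` and `get_sub_prov_tracers()`, and returns plain ids where the real method returns Prov objects.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToyTracer:
    """Hypothetical stand-in exposing the two documented getters."""

    item_ids: list[str]
    children: list[ToyTracer] = field(default_factory=list)

    def get_provs(self) -> list[str]:
        return list(self.item_ids)  # simplified: ids instead of Prov objects

    def get_sub_prov_tracers(self) -> list[ToyTracer]:
        return list(self.children)  # direct children only


def iter_tracers(tracer, depth=0):
    """Yield (depth, tracer) for a tracer and all of its descendants."""
    yield depth, tracer
    for child in tracer.get_sub_prov_tracers():
        yield from iter_tracers(child, depth + 1)


innermost = ToyTracer(["token-1"])
inner = ToyTracer(["sentence-1"], [innermost])
main = ToyTracer(["entity-1"], [inner])

# Collect every known item together with its nesting depth.
all_items = [(depth, uid) for depth, t in iter_tracers(main) for uid in t.get_provs()]
```

The helper relies only on `get_sub_prov_tracers()`, so the same traversal should carry over to real tracers, although that has not been checked against medkit here.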

_build_prov_from_node(node: medkit.core._prov_graph.ProvNode)#