medkit.tools.mtsamples
Tools for accessing examples of mtsamples files.
Refer to the mtsamplesFR repository for more information.
This repository contains:
- the original dataset from Kaggle (data/mtsamples.csv);
- a French translation of the dataset (data/mtsamples_translation.json).
Both are made available under the CC0-1.0 license.
Functions
load_mtsamples | Load mtsamples data as medkit text documents.
convert_mtsamples_to_medkit | Save mtsamples data as a medkit text file.
Module Contents
- medkit.tools.mtsamples.load_mtsamples(cache_dir: pathlib.Path | str = '.cache', translated: bool = True, nb_max: int | None = None) -> list[medkit.core.text.TextDocument]
 Load mtsamples data as medkit text documents.
- Parameters:
 - cache_dir : str or Path, default=".cache"
   Directory in which the mtsamples file is stored.
 - translated : bool, default=True
   If True, the mtsamples_translated.json file is used (FR). If False, mtsamples.csv is used (EN).
 - nb_max : int, optional
   Maximum number of documents to load.
- Returns:
 - list of TextDocument
   The medkit text documents corresponding to the mtsamples data.
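A minimal usage sketch for `load_mtsamples`, using only the parameters documented above. It assumes that medkit is installed and that the dataset can be placed in the cache directory on first use. The helpers `preview` and `print_preview` are hypothetical names introduced for illustration; `preview` relies only on the `text` attribute of each `TextDocument`.

```python
def preview(docs, width=60):
    """Return the first `width` characters of each document's text, on one line."""
    # Collapse runs of whitespace (including newlines) before truncating.
    return [" ".join(doc.text.split())[:width] for doc in docs]


def print_preview(nb_max=5, translated=True, cache_dir=".cache"):
    """Load a few mtsamples documents and print a short preview of each."""
    # Imported lazily so that `preview` stays usable without medkit installed.
    from medkit.tools.mtsamples import load_mtsamples

    # translated=True selects the French data; translated=False the English data.
    docs = load_mtsamples(cache_dir=cache_dir, translated=translated, nb_max=nb_max)
    for line in preview(docs):
        print(line)
```

Calling `print_preview(nb_max=3, translated=False)` would print a one-line preview of up to three English documents.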
- medkit.tools.mtsamples.convert_mtsamples_to_medkit(output_file: pathlib.Path | str, encoding: str | None = 'utf-8', cache_dir: pathlib.Path | str = '.cache', translated: bool = True)
 Save mtsamples data as a medkit text file.
- Parameters:
 - output_file : str or Path
   Path to the medkit jsonl file to generate.
 - encoding : str, default="utf-8"
   Encoding of the medkit file to generate.
 - cache_dir : str or Path, default=".cache"
   Directory in which the mtsamples file is cached.
 - translated : bool, default=True
   If True, the mtsamples_translated.json file is used (FR). If False, mtsamples.csv is used (EN).
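A minimal sketch of exporting the dataset with `convert_mtsamples_to_medkit` and reading the result back. It assumes that medkit is installed and that the generated file holds one JSON record per line, as the "jsonl" description suggests; the structure of each record is not documented here, so the sketch only counts records. The helpers `read_jsonl` and `export_and_count` are hypothetical names, not part of medkit.

```python
import json
from pathlib import Path


def read_jsonl(path, encoding="utf-8"):
    """Parse a JSON Lines file into a list of Python objects, skipping blank lines."""
    with Path(path).open(encoding=encoding) as fp:
        return [json.loads(line) for line in fp if line.strip()]


def export_and_count(output_file="mtsamples.jsonl", translated=True):
    """Write the mtsamples data to `output_file` and return the number of records."""
    # Imported lazily so that `read_jsonl` stays usable without medkit installed.
    from medkit.tools.mtsamples import convert_mtsamples_to_medkit

    convert_mtsamples_to_medkit(
        output_file=output_file,
        encoding="utf-8",
        cache_dir=".cache",
        translated=translated,
    )
    return len(read_jsonl(output_file))
```

Passing the same `cache_dir` to both functions in this module lets them share the cached dataset file rather than storing it twice.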