generative_ai.dataset_generation.orchestrate_generation module
Define functionality to orchestrate dataset generation.
- generate_json_dataset(raw_datasets: list[Dataset]) → JSONDataset
Convert raw documents into JSON format.
- Parameters:
raw_datasets (list[Dataset]) -- all retrieval and tuning documents for root package and its contents
- Returns:
all details for querying a package's documentation in JSON format
- Return type:
JSONDataset
- generate_raw_datasets(package_name: str) → list[Dataset]
Generate all retrieval and tuning documents for exploring documentation of a package.
- Parameters:
package_name (str) -- name used to import the root package
- Returns:
all retrieval and tuning documents for root package and its contents
- Return type:
list[Dataset]
- load_json_dataset(file_path: Path) → JSONDataset
Load JSON dataset from a JSON file.
- Parameters:
file_path (pathlib.Path) -- path to load JSON dataset from
- Returns:
all details for querying a package's documentation in JSON format
- Return type:
JSONDataset
- store_json_dataset(json_dataset: JSONDataset, file_path: Path) → None
Dump JSON dataset into a JSON file.
- Parameters:
json_dataset (JSONDataset) -- all details for querying a package's documentation in JSON format
file_path (pathlib.Path) -- path to store JSON dataset
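
The store and load functions above are meant to be used as a pair, so a round trip may help illustrate the intended contract. The following is a minimal sketch, not the module's actual implementation: it assumes JSONDataset can be modelled as a plain JSON-serializable dict, the two functions are simplified stand-ins that only mirror the documented signatures, and the `round_trip` helper and the keys in `example_dataset` are hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Any

# Assumption: JSONDataset is modelled as a plain JSON-serializable dict.
# The real type in generative_ai.dataset_generation may be stricter.
JSONDataset = dict[str, Any]


def store_json_dataset(json_dataset: JSONDataset, file_path: Path) -> None:
    """Dump a JSON dataset into a JSON file (stand-in for the documented function)."""
    with file_path.open("w", encoding="utf-8") as file:
        json.dump(json_dataset, file, indent=2)


def load_json_dataset(file_path: Path) -> JSONDataset:
    """Load a JSON dataset from a JSON file (stand-in for the documented function)."""
    with file_path.open(encoding="utf-8") as file:
        return json.load(file)


def round_trip(json_dataset: JSONDataset) -> JSONDataset:
    """Hypothetical helper: store a dataset in a temporary file and load it back."""
    with tempfile.TemporaryDirectory() as directory:
        file_path = Path(directory) / "dataset.json"
        store_json_dataset(json_dataset, file_path)
        return load_json_dataset(file_path)


# Hypothetical dataset contents, used only for illustration.
example_dataset: JSONDataset = {
    "retrieval_documents": ["Documentation of an example package."],
    "tuning_documents": [{"question": "What is it?", "answer": "An example."}],
}

restored_dataset = round_trip(example_dataset)
```

With the real module, the same flow would presumably start from `generate_raw_datasets(package_name)` and `generate_json_dataset(raw_datasets)` before calling `store_json_dataset`; the sketch skips those steps because their output structure is not specified here.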