generative_ai.top_level module

Define functionalities for top-level modules.

create_database(dataset_file: Path, embedding_model: str, database_directory: Path, force: bool) -> Path

Generate an embedding database for querying a package's documentation.

Parameters:
  • dataset_file (pathlib.Path) -- path to the JSON dataset

  • embedding_model (str) -- name of Sentence Transformers model from Hugging Face

  • database_directory (pathlib.Path) -- path to directory for storing vector store

  • force (bool) -- overwrite if database_directory already exists

Returns:

absolute path to directory storing vector store

Return type:

pathlib.Path

Raises:

create_dataset(package_name: str, dataset_file: Path, force: bool) -> Path

Generate a JSON dataset for querying a package's documentation.

Parameters:
  • package_name (str) -- name of the root package, as used when importing it

  • dataset_file (pathlib.Path) -- path to store JSON dataset

  • force (bool, optional) -- overwrite if dataset_file already exists

Returns:

absolute path to the JSON dataset

Return type:

pathlib.Path

Raises:

FileExistsError -- if dataset_file already exists and overwriting is not allowed
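
The force contract documented for create_dataset (and, by its parameter description, create_database) can be illustrated with a minimal sketch. The helper name resolve_output_path is hypothetical and is not part of generative_ai.top_level; the package may implement the check differently. The sketch only assumes what the documentation states: an existing target raises FileExistsError unless force is true, and the returned path is absolute.

```python
from pathlib import Path


def resolve_output_path(target: Path, force: bool) -> Path:
    """Hypothetical helper mirroring the documented ``force`` contract.

    Not the package's implementation: an illustration of the behaviour
    described in the Parameters, Returns, and Raises entries above.
    """
    target = Path(target)
    # Refuse to touch an existing target unless the caller opted in.
    if target.exists() and not force:
        raise FileExistsError(
            f"{target} already exists; pass force=True to overwrite"
        )
    # The documented return value is an absolute path.
    return target.resolve()
```

A caller that wants to keep an existing dataset could catch FileExistsError and reuse the file instead of regenerating it.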

get_response(question: str, embedding_model: str, database_directory: Path, search_type: RetrievalType, number_of_documents: int, initial_number_of_documents: int, diversity_level: float, language_model_type: TransformerType, standard_pipeline_type: PipelineType, standard_model_name: str, quantised_model_name: str, quantised_model_file: str, quantised_model_type: str) -> Response

Get an answer from a large language model.

Parameters:
  • question (str) -- query from user

  • embedding_model (str) -- name of Sentence Transformers model from Hugging Face

  • database_directory (pathlib.Path) -- path to load vector store from

  • search_type (RetrievalType) -- kind of retrieval algorithm for searching vector store

  • number_of_documents (int) -- number of documents to retrieve

  • initial_number_of_documents (int) -- initial number of documents to consider

  • diversity_level (float) -- similarity between retrieved documents

  • language_model_type (TransformerType) -- kind of language model

  • standard_pipeline_type (PipelineType) -- kind of Hugging Face pipeline

  • standard_model_name (str) -- name of transformers-compatible Hugging Face model

  • quantised_model_name (str) -- name of ctransformers-compatible Hugging Face model

  • quantised_model_file (str) -- name of quantised model file

  • quantised_model_type (str) -- type of quantised model

Returns:

answer from large language model with additional captured details

Return type:

Response

Raises:

FileNotFoundError -- if database_directory does not exist
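
The parameters number_of_documents, initial_number_of_documents, and diversity_level resemble the inputs of maximal marginal relevance (MMR) retrieval, although this page does not confirm which algorithms RetrievalType covers. The following is a minimal pure-Python sketch of MMR under that assumption. The function names cosine and mmr_select are hypothetical, and the convention that diversity_level = 1 means pure relevance while lower values favour diversity is also an assumption, not something stated by the package.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(
    query: Sequence[float],
    documents: Sequence[Sequence[float]],
    number_of_documents: int,
    initial_number_of_documents: int,
    diversity_level: float,
) -> List[int]:
    """Return indices of documents chosen by maximal marginal relevance.

    Assumed convention: diversity_level = 1.0 ranks by relevance only;
    smaller values penalise similarity to already selected documents.
    """
    # Step 1: shortlist the candidates most similar to the query.
    ranked = sorted(
        range(len(documents)),
        key=lambda i: cosine(query, documents[i]),
        reverse=True,
    )
    candidates = ranked[:initial_number_of_documents]

    # Step 2: greedily pick documents, trading relevance against redundancy.
    selected: List[int] = []
    while candidates and len(selected) < number_of_documents:

        def score(i: int) -> float:
            relevance = cosine(query, documents[i])
            redundancy = max(
                (cosine(documents[i], documents[j]) for j in selected),
                default=0.0,
            )
            return diversity_level * relevance - (1 - diversity_level) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy embeddings: documents 0 and 1 are near duplicates, document 2 differs.
query = [1.0, 0.0]
docs = [[1.0, 0.0], [0.99, 0.1], [0.6, 0.8]]
relevance_only = mmr_select(query, docs, 2, 3, 1.0)  # [0, 1]
diversified = mmr_select(query, docs, 2, 3, 0.3)     # [0, 2]
```

In this toy case, lowering diversity_level replaces the near-duplicate document with the less similar one, and shrinking initial_number_of_documents to 2 would remove document 2 from consideration entirely.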