cli module
Define command line interface using Typer.
- generate_dataset(package_name: str, dataset_file: Path = PosixPath('json_documents.json'), force: bool = False) -> None
Create JSON dataset for querying a package documentation.
- Parameters:
  - package_name (str) -- name of the root package to import with
  - dataset_file (pathlib.Path, optional) -- path to store JSON dataset, by default pathlib.Path("json_documents.json")
  - force (bool, optional) -- override if dataset_file already exists, by default False
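
The `force` flag decides what happens when `dataset_file` is already present. A minimal sketch of that kind of guard follows, using only the standard library. The helper `write_dataset` and its choice of raising `FileExistsError` are hypothetical and not taken from the module's source, which may behave differently (for example by printing a message and exiting).

```python
import json
from pathlib import Path


def write_dataset(documents: list, dataset_file: Path, force: bool = False) -> None:
    """Hypothetical helper: store documents as JSON, keeping an existing file unless force is set."""
    # Refuse to replace an existing dataset unless the caller opted in.
    if dataset_file.exists() and not force:
        raise FileExistsError(
            f"{dataset_file} already exists; pass force=True to replace it"
        )
    dataset_file.write_text(json.dumps(documents, indent=2), encoding="utf-8")
```

With this shape, a second call against the same path fails by default and only succeeds when `force=True` is passed.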
- generate_database(dataset_file: Path = PosixPath('json_documents.json'), embedding_model: str = 'sentence-transformers/all-MiniLM-L6-v2', database_directory: Path = PosixPath('embeddings_database'), force: bool = False) -> None
Generate embedding database for querying a package documentation.
- Parameters:
  - dataset_file (pathlib.Path, optional) -- path storing JSON dataset, by default pathlib.Path("json_documents.json")
  - embedding_model (str, optional) -- name of Sentence Transformers model, by default "sentence-transformers/all-MiniLM-L6-v2"
  - database_directory (pathlib.Path, optional) -- path to directory for storing vector store, by default pathlib.Path("embeddings_database")
  - force (bool, optional) -- override if database_directory already exists, by default False
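
To make the defaults above concrete, here is a standard-library stand-in that parses the same four options with `argparse`. It is not the module's Typer code. The program name and the option spellings (`--dataset-file`, `--embedding-model`, `--database-directory`, `--force`) are assumptions based on how Typer usually derives option names from parameter names.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Stand-in parser mirroring the documented defaults (illustrative only)."""
    parser = argparse.ArgumentParser(prog="generate-database")
    parser.add_argument("--dataset-file", type=Path, default=Path("json_documents.json"))
    parser.add_argument(
        "--embedding-model", default="sentence-transformers/all-MiniLM-L6-v2"
    )
    parser.add_argument(
        "--database-directory", type=Path, default=Path("embeddings_database")
    )
    # Off unless given, matching the documented default of False.
    parser.add_argument("--force", action="store_true")
    return parser


# Override one default and opt in to force; everything else keeps its default.
args = build_parser().parse_args(["--database-directory", "my_store", "--force"])
```

Here `args.dataset_file` and `args.embedding_model` keep their documented defaults, while `args.database_directory` and `args.force` reflect the values passed on the command line.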
- answer_query(query: str, embedding_model: str = 'sentence-transformers/all-MiniLM-L6-v2', database_directory: Path = PosixPath('embeddings_database'), search_type: RetrievalType = RetrievalType.SIMILARITY, number_of_documents: int = 5, initial_number_of_documents: int = 10, diversity_level: float = 0.5, language_model_type: TransformerType = TransformerType.STANDARD_TRANSFORMERS, standard_pipeline_type: PipelineType = PipelineType.TEXT2TEXT_GENERATION, standard_model_name: str = 'google/flan-t5-large', quantised_model_name: str = 'TheBloke/zephyr-7B-beta-GGUF', quantised_model_file: str = 'zephyr-7b-beta.Q4_K_M.gguf', quantised_model_type: str = 'mistral') -> None
Get response from large language model.
- Parameters:
  - query (str) -- question from user
  - embedding_model (str, optional) -- name of Sentence Transformers model, by default "sentence-transformers/all-MiniLM-L6-v2"
  - database_directory (pathlib.Path, optional) -- path to directory for storing vector store, by default pathlib.Path("embeddings_database")
  - search_type (RetrievalType, optional) -- kind of retrieval algorithm for searching vector store, by default RetrievalType.SIMILARITY
  - number_of_documents (int, optional) -- number of documents to retrieve, by default 5
  - initial_number_of_documents (int, optional) -- initial number of documents to consider, by default 10
  - diversity_level (float, optional) -- similarity between retrieved documents, by default 0.5
  - language_model_type (TransformerType, optional) -- kind of language model, by default TransformerType.STANDARD_TRANSFORMERS
  - standard_pipeline_type (PipelineType, optional) -- kind of Hugging Face pipeline, by default PipelineType.TEXT2TEXT_GENERATION
  - standard_model_name (str, optional) -- name of a transformers-compatible model, by default "google/flan-t5-large"
  - quantised_model_name (str, optional) -- name of a ctransformers-compatible model, by default "TheBloke/zephyr-7B-beta-GGUF"
  - quantised_model_file (str, optional) -- name of quantised model file, by default "zephyr-7b-beta.Q4_K_M.gguf"
  - quantised_model_type (str, optional) -- type of quantised model, by default "mistral"
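
The trio `number_of_documents`, `initial_number_of_documents` and `diversity_level` resembles the inputs of maximal marginal relevance (MMR) retrieval: shortlist a larger set of candidates, then pick a smaller set that balances relevance against redundancy. The page does not name the algorithm, so treating these parameters as MMR inputs is an assumption, and they presumably matter only for a non-default `search_type`. The pure-Python sketch below illustrates the idea. The function `mmr_select` is hypothetical, as is the convention that `diversity_level=1.0` means pure relevance and lower values favour diversity.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(query_vec, doc_vecs, number_of_documents=5,
               initial_number_of_documents=10, diversity_level=0.5):
    """Return indices of documents chosen by maximal marginal relevance.

    Assumed convention: diversity_level weights relevance to the query,
    and (1 - diversity_level) penalises similarity to documents already chosen.
    """
    # Step 1: shortlist the candidates most similar to the query.
    relevance = [cosine(query_vec, d) for d in doc_vecs]
    shortlist = sorted(range(len(doc_vecs)), key=lambda i: relevance[i], reverse=True)
    shortlist = shortlist[:initial_number_of_documents]

    # Step 2: greedily add the candidate with the best relevance/redundancy trade-off.
    selected = []
    while shortlist and len(selected) < number_of_documents:
        def score(i):
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0
            )
            return diversity_level * relevance[i] - (1 - diversity_level) * redundancy

        best = max(shortlist, key=score)
        selected.append(best)
        shortlist.remove(best)
    return selected


query = [1.0, 0.0]
documents = [[1.0, 0.1], [1.0, 0.2], [1.0, -1.0]]
by_relevance = mmr_select(query, documents, number_of_documents=2, diversity_level=1.0)
with_diversity = mmr_select(query, documents, number_of_documents=2, diversity_level=0.5)
```

In this toy data the first two vectors are near-duplicates, so the pure-relevance call returns both of them, while the balanced call swaps the second for the less similar third vector.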