generative_ai.information_retrieval package#
Submodules#
- generative_ai.information_retrieval.orchestrate_retrieval module
- generative_ai.information_retrieval.step_1_retrieval module
- generative_ai.information_retrieval.step_2_retrieval module
- generative_ai.information_retrieval.step_3_retrieval module
- generative_ai.information_retrieval.utils_retrieval module
Module contents#
Define functionalities for information retrieval.
- class CaptureDetailsCallback#
Bases:
BaseCallbackHandler
Capture details of question answering pipeline.
- effective_prompt#
exact prompt passed to large language model after successful retrieval
- Type:
str | None
- effective_duration#
time taken (in seconds) for large language model to generate response
- Type:
float | None
- on_llm_start(serialized: dict, prompts: list[str], *, run_id: uuid.UUID, parent_run_id: uuid.UUID | None = None, tags: list[str] | None = None, metadata: dict | None = None, **kwargs: Any) None#
Run when large language model starts generating response.
Notes
This method only uses the prompts argument; the rest are ignored.
This modifies the self.effective_prompt and self.effective_duration attributes.
self.effective_prompt is set to the first element of prompts.
self.effective_duration is set to the current time.
- on_llm_end(response: LLMResult, *, run_id: uuid.UUID, parent_run_id: uuid.UUID | None = None, **kwargs: Any) None#
Run when large language model finishes generating response.
Notes
This method ignores all of its arguments.
This modifies the self.effective_duration attribute.
It is updated to the difference between the current time and the stored value.
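The start/end timing pattern described in the notes above can be sketched as follows. This is a minimal stand-in, assuming nothing beyond the documented behaviour: the class name `TimingCapture` and the simplified method signatures are hypothetical, and it deliberately does not subclass `BaseCallbackHandler`.

```python
import time
from typing import Optional


class TimingCapture:
    """Hypothetical stand-in for the documented callback, not the package's class."""

    def __init__(self) -> None:
        self.effective_prompt: Optional[str] = None
        self.effective_duration: Optional[float] = None

    def on_llm_start(self, prompts: list) -> None:
        # Keep the first prompt and temporarily store the start timestamp.
        self.effective_prompt = prompts[0]
        self.effective_duration = time.monotonic()

    def on_llm_end(self) -> None:
        # Replace the stored timestamp with the elapsed time in seconds.
        self.effective_duration = time.monotonic() - self.effective_duration


capture = TimingCapture()
capture.on_llm_start(["Context: ...\nQuestion: ..."])
time.sleep(0.01)  # placeholder for the model generating a response
capture.on_llm_end()
```

Reusing one attribute for both the start timestamp and the final duration means `effective_duration` is only meaningful after `on_llm_end` has run.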
- class PipelineType(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)#
Define supported pipeline types.
- TEXT_GENERATION = 'text-generation'#
- TEXT2TEXT_GENERATION = 'text2text-generation'#
- class QuantisedModel(*, language_model_type: Literal[TransformerType.QUANTISED_CTRANSFORMERS], quantised_model_name: str, quantised_model_file: str, quantised_model_type: str)#
Bases:
BaseModel
Store details of a ctransformers library compatible Hugging Face model.
- language_model_type#
kind of language model
- Type:
typing.Literal[TransformerType.QUANTISED_CTRANSFORMERS]
- model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}#
A dictionary of computed field names and their corresponding ComputedFieldInfo objects.
- model_config: ClassVar[ConfigDict] = {}#
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- model_fields: ClassVar[dict[str, FieldInfo]] = {'language_model_type': FieldInfo(annotation=Literal[<TransformerType.QUANTISED_CTRANSFORMERS: 'quantised_ctransformers'>], required=True), 'quantised_model_file': FieldInfo(annotation=str, required=True), 'quantised_model_name': FieldInfo(annotation=str, required=True), 'quantised_model_type': FieldInfo(annotation=str, required=True)}#
Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo].
This replaces Model.__fields__ from Pydantic V1.
- class RetrievalType(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)#
Define supported retrieval types.
- MMR = 'mmr'#
- SIMILARITY = 'similarity'#
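As an illustration of how string-valued enumerations like this one are typically used, the sketch below looks up a member from its string value. The class is redefined here only to keep the example self-contained; whether the package's class also mixes in `str` is an assumption, and `"bm25"` is just an arbitrary unsupported value.

```python
from enum import Enum


class RetrievalType(str, Enum):
    """Mirror of the documented members, redefined for a self-contained sketch."""

    MMR = "mmr"
    SIMILARITY = "similarity"


# Look up a member by its documented string value, e.g. read from a config file.
search_type = RetrievalType("mmr")

# An unsupported value fails fast with ValueError.
try:
    RetrievalType("bm25")
except ValueError as error:
    rejected = str(error)
```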
- class StandardModel(*, language_model_type: Literal[TransformerType.STANDARD_TRANSFORMERS], standard_pipeline_type: PipelineType, standard_model_name: str)#
Bases:
BaseModel
Store details of a transformers library compatible Hugging Face model.
- language_model_type#
kind of language model
- Type:
typing.Literal[TransformerType.STANDARD_TRANSFORMERS]
- standard_pipeline_type#
kind of Hugging Face pipeline
- Type:
PipelineType
- standard_pipeline_type: PipelineType#
- model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}#
A dictionary of computed field names and their corresponding ComputedFieldInfo objects.
- model_config: ClassVar[ConfigDict] = {}#
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- model_fields: ClassVar[dict[str, FieldInfo]] = {'language_model_type': FieldInfo(annotation=Literal[<TransformerType.STANDARD_TRANSFORMERS: 'standard_transformers'>], required=True), 'standard_model_name': FieldInfo(annotation=str, required=True), 'standard_pipeline_type': FieldInfo(annotation=PipelineType, required=True)}#
Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo].
This replaces Model.__fields__ from Pydantic V1.
- class TransformerType(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)#
Define supported transformer types.
- STANDARD_TRANSFORMERS = 'standard_transformers'#
- QUANTISED_CTRANSFORMERS = 'quantised_ctransformers'#
- configure_language_model(language_model_type: TransformerType, standard_pipeline_type: PipelineType, standard_model_name: str, quantised_model_name: str, quantised_model_file: str, quantised_model_type: str) QuantisedModel | StandardModel#
Prepare configurations to load language model.
- Parameters:
language_model_type (TransformerType) -- kind of language model
standard_pipeline_type (PipelineType) -- kind of Hugging Face pipeline
standard_model_name (str) -- name of transformers compatible Hugging Face model
quantised_model_name (str) -- name of ctransformers compatible Hugging Face model
quantised_model_file (str) -- name of quantised model file
quantised_model_type (str) -- type of quantised model
- Returns:
configurations of language model
- Return type:
LanguageModel
- Raises:
ValueError -- if language model type is not supported
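The selection logic implied by the signature can be sketched as below. This is a simplified illustration, not the package's code: `StandardConfig`, `QuantisedConfig` and `configure` are hypothetical dataclass-based stand-ins for `StandardModel`, `QuantisedModel` and `configure_language_model`, and the model name used is a placeholder.

```python
from dataclasses import dataclass
from enum import Enum


class TransformerType(str, Enum):
    STANDARD_TRANSFORMERS = "standard_transformers"
    QUANTISED_CTRANSFORMERS = "quantised_ctransformers"


@dataclass
class StandardConfig:  # hypothetical stand-in for StandardModel
    standard_pipeline_type: str
    standard_model_name: str


@dataclass
class QuantisedConfig:  # hypothetical stand-in for QuantisedModel
    quantised_model_name: str
    quantised_model_file: str
    quantised_model_type: str


def configure(
    language_model_type,
    standard_pipeline_type,
    standard_model_name,
    quantised_model_name,
    quantised_model_file,
    quantised_model_type,
):
    # Keep only the fields relevant to the selected kind of model.
    if language_model_type is TransformerType.STANDARD_TRANSFORMERS:
        return StandardConfig(standard_pipeline_type, standard_model_name)
    if language_model_type is TransformerType.QUANTISED_CTRANSFORMERS:
        return QuantisedConfig(
            quantised_model_name, quantised_model_file, quantised_model_type
        )
    raise ValueError(f"unsupported language model type: {language_model_type!r}")


config = configure(
    TransformerType.STANDARD_TRANSFORMERS,
    "text-generation",
    "placeholder-model-name",  # placeholder, not a real model identifier
    "",
    "",
    "",
)
```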
- create_database_retriever(embedding_database: Chroma, search_type: RetrievalType, number_of_documents: int, initial_number_of_documents: int, diversity_level: float) VectorStoreRetriever#
Prepare a vector store retriever for the retrieval database.
- Parameters:
embedding_database (Chroma) -- vector store
search_type (RetrievalType) -- kind of retrieval algorithm for searching vector store
number_of_documents (int) -- number of documents to retrieve
initial_number_of_documents (int) -- initial number of documents to consider
diversity_level (float) -- similarity between retrieved documents
- Returns:
vector store retriever
- Return type:
VectorStoreRetriever
- Raises:
ValueError -- if retrieval type is not supported
Notes
If search_type is similarity, only number_of_documents is used.
For maximal marginal relevance (MMR), diversity_level must be in [0, 1].
0 means minimum diversity.
1 means maximum diversity.
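One way the documented parameters could map onto LangChain retriever search arguments is sketched below. The helper `build_search_kwargs` is hypothetical. The keys `k`, `fetch_k` and `lambda_mult` are the ones LangChain vector stores accept for MMR search; since `lambda_mult` treats 0 as maximum diversity, the inversion `1 - diversity_level` is an assumption about how the documented convention might be translated. Raising `ValueError` for an out-of-range `diversity_level` is likewise an assumption.

```python
def build_search_kwargs(
    search_type: str,
    number_of_documents: int,
    initial_number_of_documents: int,
    diversity_level: float,
) -> dict:
    """Hypothetical helper: translate documented parameters into search kwargs."""
    if search_type == "similarity":
        # Only the number of documents to retrieve is used.
        return {"k": number_of_documents}
    if search_type == "mmr":
        if not 0.0 <= diversity_level <= 1.0:
            raise ValueError("diversity_level must be in [0, 1]")
        return {
            "k": number_of_documents,
            "fetch_k": initial_number_of_documents,
            # Assumed inversion: documented 1 (maximum diversity) -> lambda_mult 0.
            "lambda_mult": 1.0 - diversity_level,
        }
    raise ValueError(f"unsupported retrieval type: {search_type!r}")


similarity_kwargs = build_search_kwargs("similarity", 3, 10, 0.5)
mmr_kwargs = build_search_kwargs("mmr", 3, 10, 0.25)
```

A dictionary like this would typically be passed as `search_kwargs` to the vector store's `as_retriever` method, together with the `search_type` string.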
- create_document_embedder(embedding_model: str) HuggingFaceEmbeddings#
Prepare a Sentence Transformers model for document embedding.
- Parameters:
embedding_model (str) -- name of Sentence Transformers model from Hugging Face
- Returns:
document embedder
- Return type:
HuggingFaceEmbeddings
- create_embedding_database(embedding_model: str, directory_path: pathlib.Path, source_documents: list[Document]) Chroma#
Prepare an embedding database.
- Parameters:
embedding_model (str) -- name of Sentence Transformers model from Hugging Face
directory_path (pathlib.Path) -- path to directory for storing vector store
source_documents (list[Document]) -- partitioned source documents
- Returns:
vector store
- Return type:
Chroma
- create_llm(language_model: QuantisedModel | StandardModel) CTransformers | HuggingFacePipeline#
Prepare a large language model.
- Parameters:
language_model (LanguageModel) -- details of large language model
- Returns:
loaded large language model
- Return type:
CTransformers | HuggingFacePipeline
- Raises:
ValueError -- if language model type is not supported
Notes
At most 256 new tokens will be generated.
Deterministic behaviour is ensured for both types of large language models.
For transformers compatible models, top_k is set to 1.
For ctransformers compatible models, temperature is set to 0.
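To see why restricting sampling to the single best candidate gives deterministic output, consider the toy sampler below. It is an illustration only, not the decoding code of either library; the function name and the example scores are made up.

```python
import random


def sample_top_k(scores: dict, top_k: int, rng: random.Random) -> str:
    """Toy top-k sampler: keep the top_k best tokens, then sample by weight."""
    candidates = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    candidates = candidates[:top_k]
    tokens = [token for token, _ in candidates]
    weights = [weight for _, weight in candidates]
    return rng.choices(tokens, weights=weights, k=1)[0]


scores = {"Paris": 0.6, "Lyon": 0.3, "Nice": 0.1}

# With top_k=1 only the best token survives, so every seed yields the same pick.
picks = {sample_top_k(scores, 1, random.Random(seed)) for seed in range(20)}
```

With `top_k=1` the random draw has a single candidate, so sampling reduces to greedy decoding regardless of the seed.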
- create_vector_store(embedder: HuggingFaceEmbeddings, directory_path: pathlib.Path) Chroma#
Initialise a Chroma vector store.
- Parameters:
embedder (HuggingFaceEmbeddings) -- document embedder
directory_path (pathlib.Path) -- path to directory for storing vector store
- Returns:
vector store
- Return type:
Chroma
- generate_retrieval_chain(database_retriever: VectorStoreRetriever, llm: CTransformers | HuggingFacePipeline) BaseRetrievalQA#
Prepare a retrieval chain for question answering.
- Parameters:
database_retriever (VectorStoreRetriever) -- vector store retriever
llm (CTransformers | HuggingFacePipeline) -- large language model
- Returns:
retrieval chain
- Return type:
BaseRetrievalQA
Notes
The prompt template instructs the model not to answer if the answer is missing from the context.
It also instructs the model to keep the answer as concise as possible.
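A prompt template with the two instructions described above might look like the following. The wording is hypothetical and is not the template used by the package; the placeholder names `context` and `question` are also assumptions.

```python
# Hypothetical wording; the package's actual template is not shown in this page.
PROMPT_TEMPLATE = (
    "Use only the context below to answer the question.\n"
    "If the answer is not in the context, say that you do not know.\n"
    "Keep the answer as concise as possible.\n\n"
    "Context: {context}\n\n"
    "Question: {question}\n\n"
    "Answer:"
)


def render_prompt(context: str, question: str) -> str:
    # Fill the retrieved context and the user question into the template.
    return PROMPT_TEMPLATE.format(context=context, question=question)


prompt = render_prompt(
    context="Chroma is a vector store.",
    question="What is Chroma?",
)
```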
- load_embedding_database(embedding_model: str, directory_path: pathlib.Path) Chroma#
Load vector store from the configured directory on disk.
- Parameters:
embedding_model (str) -- name of Sentence Transformers model from Hugging Face
directory_path (pathlib.Path) -- path to load vector store from
- Returns:
vector store
- Return type:
Chroma
Notes
embedding_model must match the one originally used for database creation.
- load_source_documents(file_path: pathlib.Path) list[Document]#
Load and partition source documents.
- Parameters:
file_path (pathlib.Path) -- path storing JSON dataset
- Returns:
partitioned source documents
- Return type:
list[Document]
- load_json_documents(file_path: pathlib.Path) list[Document]#
Load retrieval documents from a JSON file.
- Parameters:
file_path (pathlib.Path) -- path to JSON file
- Returns:
retrieval documents
- Return type:
list[Document]
- partition_documents(raw_documents: list[Document]) list[Document]#
Partition retrieval documents into chunks.
- Parameters:
raw_documents (list[Document]) -- retrieval documents
- Returns:
chunks of retrieval documents
- Return type:
list[Document]
Notes
Chunk length will be at most 512 tokens.
Different chunks from same document will overlap by 64 tokens.
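The chunk size and overlap stated above can be pictured with a plain sliding window over a token list. This is a simplified sketch: the splitter the package actually uses is not named here, and real text splitters usually also respect separators such as paragraph or sentence boundaries. The function name `chunk_tokens` is hypothetical.

```python
def chunk_tokens(tokens: list, chunk_size: int = 512, overlap: int = 64) -> list:
    """Split tokens into windows of at most chunk_size, sharing overlap tokens."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once the current window has reached the end of the document.
        if start + chunk_size >= len(tokens):
            break
    return chunks


# Integers stand in for the tokens of a single 1000-token document.
chunks = chunk_tokens(list(range(1000)))
```

Each window starts `chunk_size - overlap` tokens after the previous one, so consecutive chunks from the same document share exactly `overlap` tokens.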
- prepare_question_answer_chain(embedding_database: Chroma, search_type: RetrievalType, number_of_documents: int, initial_number_of_documents: int, diversity_level: float, language_model: QuantisedModel | StandardModel) RunnableSerializable#
Prepare a question answering pipeline.
- Parameters:
embedding_database (Chroma) -- vector store
search_type (RetrievalType) -- kind of retrieval algorithm for searching vector store
number_of_documents (int) -- number of documents to retrieve
initial_number_of_documents (int) -- initial number of documents to consider
diversity_level (float) -- similarity between retrieved documents
language_model (LanguageModel) -- configurations of language model
- Returns:
question answering pipeline
- Return type:
RunnableSerializable
- run_question_answer_chain(question_answer_chain: RunnableSerializable, question: str) tuple[dict, CaptureDetailsCallback]#
Run question answering pipeline for user input.
- Parameters:
question_answer_chain (RunnableSerializable) -- question answering pipeline
question (str) -- query from user
- Returns:
dict -- response from large language model
CaptureDetailsCallback -- callback capturing details of particular run of question answering pipeline
- Return type: