generative_ai.information_retrieval package#

Submodules#

Module contents#

Define functionalities for information retrieval.

class CaptureDetailsCallback#

Bases: BaseCallbackHandler

Capture details of question answering pipeline.

effective_prompt#

exact prompt passed to large language model after successful retrieval

Type:: str | None

effective_duration#

time taken (in seconds) for large language model to generate response

Type:: float | None

on_llm_start(serialized: dict, prompts: list[str], *, run_id: uuid.UUID, parent_run_id: uuid.UUID | None = None, tags: list[str] | None = None, metadata: dict | None = None, **kwargs: Any) → None#

Run when large language model starts generating response.

Notes

This method uses only the prompts argument; all other arguments are ignored.
This modifies the self.effective_prompt and self.effective_duration attributes.
- self.effective_prompt is set to the first element of prompts.
- self.effective_duration is set to the current time.

on_llm_end(response: LLMResult, *, run_id: uuid.UUID, parent_run_id: uuid.UUID | None = None, **kwargs: Any) → None#

Run when large language model finishes generating response.

Notes

This method ignores all of its arguments.
This modifies the self.effective_duration attribute.
- It is updated to the difference between the current time and the stored start time, i.e. the elapsed time in seconds.
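The attribute updates described in the notes above can be sketched without LangChain. The class below is a hypothetical stand-in (TimingCallbackSketch is not part of this package), and the use of time.perf_counter is an assumption; the real callback may read a different clock.

```python
import time


class TimingCallbackSketch:
    """Hypothetical stand-in mirroring the documented attribute updates."""

    def __init__(self) -> None:
        self.effective_prompt: str | None = None
        self.effective_duration: float | None = None

    def on_llm_start(self, serialized: dict, prompts: list, **kwargs) -> None:
        # Only `prompts` is used: keep the first prompt and store the start time.
        self.effective_prompt = prompts[0]
        self.effective_duration = time.perf_counter()

    def on_llm_end(self, response: object = None, **kwargs) -> None:
        # Replace the stored start time with the elapsed time in seconds.
        self.effective_duration = time.perf_counter() - self.effective_duration


callback = TimingCallbackSketch()
callback.on_llm_start({}, ["Context: ...\nQuestion: ..."])
callback.on_llm_end()
```

Between the two calls, effective_duration temporarily holds a start timestamp rather than a duration, so it is only meaningful after on_llm_end has run.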

class PipelineType(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)#

Bases: str, Enum

Define supported pipeline types.

TEXT_GENERATION = 'text-generation'#

TEXT2TEXT_GENERATION = 'text2text-generation'#
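Because the enumeration derives from both str and Enum, members can be looked up by value and compared with plain strings. The snippet below redefines the two documented members locally (PipelineTypeSketch is an illustrative name, not the package class) so that it runs on its own.

```python
from enum import Enum


class PipelineTypeSketch(str, Enum):
    """Local copy of the documented members, for illustration only."""

    TEXT_GENERATION = "text-generation"
    TEXT2TEXT_GENERATION = "text2text-generation"


# Look up a member by its value, e.g. when reading a configuration file.
pipeline_type = PipelineTypeSketch("text2text-generation")

# The str mixin lets a member compare equal to its plain string value.
is_plain_string_equal = pipeline_type == "text2text-generation"
```

Looking up a value that is not a member raises ValueError, which is standard Enum behaviour.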

class QuantisedModel(*, language_model_type: Literal[TransformerType.QUANTISED_CTRANSFORMERS], quantised_model_name: str, quantised_model_file: str, quantised_model_type: str)#

Bases: BaseModel

Store details of a ctransformers library compatible Hugging Face model.

language_model_type#

kind of language model

Type:: typing.Literal[TransformerType.QUANTISED_CTRANSFORMERS]

quantised_model_name#

name of the Hugging Face model

Type:: str

quantised_model_file#

name of quantised model file

Type:: str

quantised_model_type#

type of quantised model

Type:: str

language_model_type: Literal[TransformerType.QUANTISED_CTRANSFORMERS]#

quantised_model_name: str#

quantised_model_file: str#

quantised_model_type: str#

model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}#

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {}#

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[dict[str, FieldInfo]] = {'language_model_type': FieldInfo(annotation=Literal[<TransformerType.QUANTISED_CTRANSFORMERS: 'quantised_ctransformers'>], required=True), 'quantised_model_file': FieldInfo(annotation=str, required=True), 'quantised_model_name': FieldInfo(annotation=str, required=True), 'quantised_model_type': FieldInfo(annotation=str, required=True)}#

Metadata about the fields defined on the model, mapping of field names to pydantic.fields.FieldInfo.

This replaces Model.__fields__ from Pydantic V1.

class RetrievalType(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)#

Bases: str, Enum

Define supported retrieval types.

MMR = 'mmr'#

SIMILARITY = 'similarity'#

class StandardModel(*, language_model_type: Literal[TransformerType.STANDARD_TRANSFORMERS], standard_pipeline_type: PipelineType, standard_model_name: str)#

Bases: BaseModel

Store details of a transformers library compatible Hugging Face model.

language_model_type#

kind of language model

Type:: typing.Literal[TransformerType.STANDARD_TRANSFORMERS]

standard_pipeline_type#

kind of Hugging Face pipeline

Type:: PipelineType

standard_model_name#

name of the Hugging Face model

Type:: str

language_model_type: Literal[TransformerType.STANDARD_TRANSFORMERS]#

standard_pipeline_type: PipelineType#

standard_model_name: str#

model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}#

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {}#

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[dict[str, FieldInfo]] = {'language_model_type': FieldInfo(annotation=Literal[<TransformerType.STANDARD_TRANSFORMERS: 'standard_transformers'>], required=True), 'standard_model_name': FieldInfo(annotation=str, required=True), 'standard_pipeline_type': FieldInfo(annotation=PipelineType, required=True)}#

Metadata about the fields defined on the model, mapping of field names to pydantic.fields.FieldInfo.

This replaces Model.__fields__ from Pydantic V1.

class TransformerType(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)#

Bases: str, Enum

Define supported transformer types.

STANDARD_TRANSFORMERS = 'standard_transformers'#

QUANTISED_CTRANSFORMERS = 'quantised_ctransformers'#

configure_language_model(language_model_type: TransformerType, standard_pipeline_type: PipelineType, standard_model_name: str, quantised_model_name: str, quantised_model_file: str, quantised_model_type: str) → QuantisedModel | StandardModel#

Prepare configurations to load language model.

Parameters:

language_model_type (TransformerType) -- kind of language model
standard_pipeline_type (PipelineType) -- kind of Hugging Face pipeline
standard_model_name (str) -- name of transformers compatible Hugging Face model
quantised_model_name (str) -- name of ctransformers compatible Hugging Face model
quantised_model_file (str) -- name of quantised model file
quantised_model_type (str) -- type of quantised model

Returns:

configurations of language model

Return type:

LanguageModel

Raises:

ValueError -- if language model type is not supported
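The dispatch on language_model_type can be sketched with the standard library alone. Everything below is a hypothetical stand-in: the *Sketch names are illustrative, dataclasses replace the Pydantic models, and the model names and file name are placeholders. It assumes that each branch keeps only the arguments matching the fields of the corresponding model class.

```python
from dataclasses import dataclass
from enum import Enum


class TransformerTypeSketch(str, Enum):
    STANDARD_TRANSFORMERS = "standard_transformers"
    QUANTISED_CTRANSFORMERS = "quantised_ctransformers"


@dataclass(frozen=True)
class StandardModelSketch:
    language_model_type: TransformerTypeSketch
    standard_pipeline_type: str
    standard_model_name: str


@dataclass(frozen=True)
class QuantisedModelSketch:
    language_model_type: TransformerTypeSketch
    quantised_model_name: str
    quantised_model_file: str
    quantised_model_type: str


def configure_language_model_sketch(
    language_model_type,
    standard_pipeline_type,
    standard_model_name,
    quantised_model_name,
    quantised_model_file,
    quantised_model_type,
):
    """Keep only the arguments relevant to the selected model type."""
    if language_model_type == TransformerTypeSketch.STANDARD_TRANSFORMERS:
        return StandardModelSketch(
            language_model_type, standard_pipeline_type, standard_model_name
        )
    if language_model_type == TransformerTypeSketch.QUANTISED_CTRANSFORMERS:
        return QuantisedModelSketch(
            language_model_type,
            quantised_model_name,
            quantised_model_file,
            quantised_model_type,
        )
    # Mirrors the documented ValueError for unsupported types.
    raise ValueError(f"Unsupported language model type: {language_model_type!r}")


# Placeholder values only; substitute real Hugging Face identifiers.
config = configure_language_model_sketch(
    TransformerTypeSketch.QUANTISED_CTRANSFORMERS,
    "text-generation",
    "placeholder/standard-model",
    "placeholder/quantised-model",
    "placeholder-model.gguf",
    "llama",
)
```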

create_database_retriever(embedding_database: Chroma, search_type: RetrievalType, number_of_documents: int, initial_number_of_documents: int, diversity_level: float) → VectorStoreRetriever#

Prepare a vector store retriever for the retrieval database.

Parameters:

embedding_database (Chroma) -- vector store
search_type (RetrievalType) -- kind of retrieval algorithm for searching vector store
number_of_documents (int) -- number of documents to retrieve
initial_number_of_documents (int) -- initial number of documents to consider
diversity_level (float) -- level of diversity among retrieved documents (used only for MMR)

Returns:

vector store retriever

Return type:

VectorStoreRetriever

Raises:

ValueError -- if retrieval type is not supported

Notes

If search_type is similarity, only number_of_documents is used.
For maximal marginal relevance (MMR), diversity_level must be in [0, 1].
- 0 means minimum diversity.
- 1 means maximum diversity.
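The notes above can be read as a mapping from the function arguments to retriever search settings. The helper below is a hypothetical sketch (build_search_kwargs_sketch is not part of this package). The keys k, fetch_k and lambda_mult are the search keyword arguments LangChain vector store retrievers accept for MMR; since a lambda_mult of 1 corresponds to minimum diversity in LangChain, the conversion 1 - diversity_level is an assumption about how this package bridges the two conventions, as is raising ValueError for an out-of-range value.

```python
def build_search_kwargs_sketch(
    search_type: str,
    number_of_documents: int,
    initial_number_of_documents: int,
    diversity_level: float,
) -> dict:
    """Translate the documented arguments into retriever search settings."""
    if search_type == "similarity":
        # Only the number of documents matters for plain similarity search.
        return {"k": number_of_documents}
    if search_type == "mmr":
        if not 0.0 <= diversity_level <= 1.0:
            raise ValueError("diversity_level must be in [0, 1]")
        return {
            "k": number_of_documents,
            "fetch_k": initial_number_of_documents,
            # Assumption: invert so that 1 means maximum diversity.
            "lambda_mult": 1.0 - diversity_level,
        }
    raise ValueError(f"Unsupported retrieval type: {search_type!r}")


similarity_kwargs = build_search_kwargs_sketch("similarity", 4, 20, 0.25)
mmr_kwargs = build_search_kwargs_sketch("mmr", 4, 20, 0.25)
```

A dictionary like this would typically be passed as search_kwargs alongside search_type when calling as_retriever on the vector store.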

create_document_embedder(embedding_model: str) → HuggingFaceEmbeddings#

Prepare a Sentence Transformers model for document embedding.

Parameters:: embedding_model (str) -- name of Sentence Transformers model from Hugging Face
Returns:: document embedder
Return type:: HuggingFaceEmbeddings

create_embedding_database(embedding_model: str, directory_path: pathlib.Path, source_documents: list[Document]) → Chroma#

Prepare an embedding database.

Parameters:

embedding_model (str) -- name of Sentence Transformers model from Hugging Face
directory_path (pathlib.Path) -- path to directory for storing vector store
source_documents (list[Document]) -- partitioned source documents

Returns:

vector store

Return type:

Chroma

create_llm(language_model: QuantisedModel | StandardModel) → CTransformers | HuggingFacePipeline#

Prepare a large language model.

Parameters:: language_model (LanguageModel) -- details of large language model
Returns:: loaded large language model
Return type:: CTransformers | HuggingFacePipeline
Raises:: ValueError -- if language model type is not supported

Notes

At most 256 new tokens will be generated.
Deterministic behaviour is ensured for both types of large language models.
- For transformers compatible models, top_k is set to 1.
- For ctransformers compatible models, temperature is set to 0.
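To see why restricting sampling to a single candidate gives deterministic output, consider the toy sampler below. It is an illustrative sketch only (sample_top_k is a hypothetical helper and the logits are made up); it does not reproduce how either library implements decoding.

```python
import math
import random


def sample_top_k(logits: list, top_k: int, rng: random.Random) -> int:
    """Sample a token index from the top_k highest-scoring candidates."""
    # Keep the indices of the top_k largest logits.
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    candidates = ranked[:top_k]
    # Unnormalised softmax weights over the surviving candidates.
    best = logits[candidates[0]]
    weights = [math.exp(logits[i] - best) for i in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


logits = [0.1, 2.3, 1.7, -0.5]

# With top_k=1 only the highest-scoring token survives, whatever the seed.
picks = {sample_top_k(logits, top_k=1, rng=random.Random(seed)) for seed in range(20)}
```

Every seed selects the same index, the one with the largest logit, which is the effect the top_k setting relies on.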

create_vector_store(embedder: HuggingFaceEmbeddings, directory_path: pathlib.Path) → Chroma#

Initialise a Chroma vector store.

Parameters:

embedder (HuggingFaceEmbeddings) -- document embedder
directory_path (pathlib.Path) -- path to directory for storing vector store

Returns:

vector store

Return type:

Chroma

generate_retrieval_chain(database_retriever: VectorStoreRetriever, llm: CTransformers | HuggingFacePipeline) → BaseRetrievalQA#

Prepare a retrieval chain for question answering.

Parameters:

database_retriever (VectorStoreRetriever) -- vector store retriever
llm (CTransformers | HuggingFacePipeline) -- large language model

Returns:

retrieval chain

Return type:

BaseRetrievalQA

Notes

The prompt template instructs the model not to answer if the answer is missing from the context.
It also instructs the model to keep the answer as concise as possible.

load_embedding_database(embedding_model: str, directory_path: pathlib.Path) → Chroma#

Load the vector store from the configured directory on disk.

Parameters:

embedding_model (str) -- name of Sentence Transformers model from Hugging Face
directory_path (pathlib.Path) -- path to load vector store from

Returns:

vector store

Return type:

Chroma

Notes

embedding_model must match the one originally used for database creation.

load_source_documents(file_path: pathlib.Path) → list[Document]#

Load and partition source documents.

Parameters:: file_path (pathlib.Path) -- path to JSON dataset
Returns:: partitioned source documents
Return type:: list[Document]

load_json_documents(file_path: pathlib.Path) → list[Document]#

Load retrieval documents from a JSON file.

Parameters:: file_path (pathlib.Path) -- path to JSON file
Returns:: retrieval documents
Return type:: list[Document]

partition_documents(raw_documents: list[Document]) → list[Document]#

Partition retrieval documents into chunks.

Parameters:: raw_documents (list[Document]) -- retrieval documents
Returns:: chunks of retrieval documents
Return type:: list[Document]

Notes

Chunk length will be at most 512 tokens.
Different chunks from same document will overlap by 64 tokens.
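The chunk size and overlap described above can be illustrated with a sliding window over a token list. This is a simplified sketch under stated assumptions: chunk_tokens is a hypothetical helper, and the actual splitter used by this function is not documented here. Real text splitters usually break on separators, so chunks may be shorter than 512 tokens and the overlap may not be exactly 64.

```python
def chunk_tokens(tokens: list, chunk_size: int = 512, overlap: int = 64) -> list:
    """Split tokens into windows of at most chunk_size sharing overlap tokens."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once the current window has reached the end of the document.
        if start + chunk_size >= len(tokens):
            break
    return chunks


tokens = [f"t{i}" for i in range(1000)]
chunks = chunk_tokens(tokens)
```

With the defaults, each window starts 448 tokens after the previous one, so the last 64 tokens of one chunk reappear at the start of the next.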

prepare_question_answer_chain(embedding_database: Chroma, search_type: RetrievalType, number_of_documents: int, initial_number_of_documents: int, diversity_level: float, language_model: QuantisedModel | StandardModel) → RunnableSerializable#

Prepare a question answering pipeline.

Parameters:

embedding_database (Chroma) -- vector store
search_type (RetrievalType) -- kind of retrieval algorithm for searching vector store
number_of_documents (int) -- number of documents to retrieve
initial_number_of_documents (int) -- initial number of documents to consider
diversity_level (float) -- level of diversity among retrieved documents (used only for MMR)
language_model (LanguageModel) -- configurations of language model

Returns:

question answering pipeline

Return type:

RunnableSerializable

run_question_answer_chain(question_answer_chain: RunnableSerializable, question: str) → tuple[dict, CaptureDetailsCallback]#

Run question answering pipeline for user input.

Parameters:

question_answer_chain (RunnableSerializable) -- question answering pipeline
question (str) -- query from user

Returns:

dict -- response from large language model
CaptureDetailsCallback -- callback capturing details of particular run of question answering pipeline

Return type:

tuple[dict, CaptureDetailsCallback]

store_embedding_database(vector_store: Chroma) → None#

Dump the vector store to its configured directory on disk.

Parameters:: vector_store (Chroma) -- vector store