generative_ai.information_retrieval package#

Module contents#

Define functionalities for information retrieval.

class CaptureDetailsCallback#

Bases: BaseCallbackHandler

Capture details of question answering pipeline.

effective_prompt#

exact prompt passed to large language model after successful retrieval

Type:

str | None

effective_duration#

time taken (in seconds) for large language model to generate response

Type:

float | None

on_llm_start(serialized: dict, prompts: list[str], *, run_id: uuid.UUID, parent_run_id: uuid.UUID | None = None, tags: list[str] | None = None, metadata: dict | None = None, **kwargs: Any) None#

Run when large language model starts generating response.

Notes

  • This method uses only the prompts argument; all other arguments are ignored.

  • This modifies self.effective_prompt and self.effective_duration attributes.

    • self.effective_prompt is set to the first element of prompts.

    • self.effective_duration is temporarily set to the current time, which on_llm_end later replaces with the elapsed duration.

on_llm_end(response: LLMResult, *, run_id: uuid.UUID, parent_run_id: uuid.UUID | None = None, **kwargs: Any) None#

Run when large language model finishes generating response.

Notes

  • This method ignores all of its arguments.

  • This modifies self.effective_duration attribute.

    • It is updated to the difference between current time and the stored value.
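
The start/end bookkeeping described in the notes above can be sketched without LangChain. This is a minimal stand-in, not the package's implementation: the class name TimingCapture is hypothetical, it does not subclass BaseCallbackHandler, and the choice of time.perf_counter as the clock is an assumption.

```python
import time
from typing import Optional


class TimingCapture:
    """Hypothetical stand-in that mimics the documented attribute updates."""

    def __init__(self) -> None:
        self.effective_prompt: Optional[str] = None
        self.effective_duration: Optional[float] = None

    def on_llm_start(self, serialized: dict, prompts: list, **kwargs) -> None:
        # Only `prompts` is used: keep the first prompt and stash the start
        # timestamp in the duration attribute.
        self.effective_prompt = prompts[0]
        self.effective_duration = time.perf_counter()

    def on_llm_end(self, response: object, **kwargs) -> None:
        # Replace the stored start timestamp with the elapsed time in seconds.
        self.effective_duration = time.perf_counter() - self.effective_duration


capture = TimingCapture()
capture.on_llm_start({}, ["Context: (retrieved text)\nQuestion: (user query)"])
capture.on_llm_end(None)
```

Between the two calls, effective_duration holds a timestamp rather than a duration, so it is only meaningful after on_llm_end has run.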

class PipelineType(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)#

Bases: str, Enum

Define supported pipeline types.

TEXT_GENERATION = 'text-generation'#
TEXT2TEXT_GENERATION = 'text2text-generation'#
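
Because the enumeration derives from both str and Enum, its members can be built from, and compared with, plain strings. The sketch below redefines the two documented members locally so that it runs standalone; it is an illustration of the pattern, not the package source.

```python
from enum import Enum


class PipelineType(str, Enum):
    """Local mirror of the documented members, for illustration only."""

    TEXT_GENERATION = "text-generation"
    TEXT2TEXT_GENERATION = "text2text-generation"


# Look up a member from its string value, e.g. one read from a config file.
pipeline_type = PipelineType("text2text-generation")

# Because str is a base class, a member compares equal to its raw value.
is_plain_string_equal = pipeline_type == "text2text-generation"
```

A value that matches no member raises ValueError when passed to the enumeration.
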
class QuantisedModel(*, language_model_type: Literal[TransformerType.QUANTISED_CTRANSFORMERS], quantised_model_name: str, quantised_model_file: str, quantised_model_type: str)#

Bases: BaseModel

Store details of a ctransformers library compatible Hugging Face model.

language_model_type#

kind of language model

Type:

typing.Literal[TransformerType.QUANTISED_CTRANSFORMERS]

quantised_model_name#

name of the Hugging Face model

Type:

str

quantised_model_file#

name of quantised model file

Type:

str

quantised_model_type#

type of quantised model

Type:

str

language_model_type: Literal[TransformerType.QUANTISED_CTRANSFORMERS]#
quantised_model_name: str#
quantised_model_file: str#
quantised_model_type: str#
model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}#

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {}#

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[dict[str, FieldInfo]] = {'language_model_type': FieldInfo(annotation=Literal[<TransformerType.QUANTISED_CTRANSFORMERS: 'quantised_ctransformers'>], required=True), 'quantised_model_file': FieldInfo(annotation=str, required=True), 'quantised_model_name': FieldInfo(annotation=str, required=True), 'quantised_model_type': FieldInfo(annotation=str, required=True)}#

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

class RetrievalType(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)#

Bases: str, Enum

Define supported retrieval types.

MMR = 'mmr'#
SIMILARITY = 'similarity'#
class StandardModel(*, language_model_type: Literal[TransformerType.STANDARD_TRANSFORMERS], standard_pipeline_type: PipelineType, standard_model_name: str)#

Bases: BaseModel

Store details of a transformers library compatible Hugging Face model.

language_model_type#

kind of language model

Type:

typing.Literal[TransformerType.STANDARD_TRANSFORMERS]

standard_pipeline_type#

kind of Hugging Face pipeline

Type:

PipelineType

standard_model_name#

name of the Hugging Face model

Type:

str

language_model_type: Literal[TransformerType.STANDARD_TRANSFORMERS]#
standard_pipeline_type: PipelineType#
standard_model_name: str#
model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}#

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {}#

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[dict[str, FieldInfo]] = {'language_model_type': FieldInfo(annotation=Literal[<TransformerType.STANDARD_TRANSFORMERS: 'standard_transformers'>], required=True), 'standard_model_name': FieldInfo(annotation=str, required=True), 'standard_pipeline_type': FieldInfo(annotation=PipelineType, required=True)}#

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

class TransformerType(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)#

Bases: str, Enum

Define supported transformer types.

STANDARD_TRANSFORMERS = 'standard_transformers'#
QUANTISED_CTRANSFORMERS = 'quantised_ctransformers'#
configure_language_model(language_model_type: TransformerType, standard_pipeline_type: PipelineType, standard_model_name: str, quantised_model_name: str, quantised_model_file: str, quantised_model_type: str) QuantisedModel | StandardModel#

Prepare configurations to load language model.

Parameters:
  • language_model_type (TransformerType) -- kind of language model

  • standard_pipeline_type (PipelineType) -- kind of Hugging Face pipeline

  • standard_model_name (str) -- name of transformers compatible Hugging Face model

  • quantised_model_name (str) -- name of ctransformers compatible Hugging Face model

  • quantised_model_file (str) -- name of quantised model file

  • quantised_model_type (str) -- type of quantised model

Returns:

configurations of language model

Return type:

QuantisedModel | StandardModel

Raises:

ValueError -- if language model type is not supported
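
The selection between the two configuration classes can be sketched as a simple dispatch on language_model_type. Everything here is a stand-in under stated assumptions: the dataclasses replace the documented pydantic models, the function name configure_language_model_sketch is hypothetical, the model name is a placeholder, and the real function may be structured differently.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Union


class TransformerType(str, Enum):
    STANDARD_TRANSFORMERS = "standard_transformers"
    QUANTISED_CTRANSFORMERS = "quantised_ctransformers"


@dataclass(frozen=True)
class StandardModel:  # dataclass stand-in for the pydantic model
    language_model_type: TransformerType
    standard_pipeline_type: str
    standard_model_name: str


@dataclass(frozen=True)
class QuantisedModel:  # dataclass stand-in for the pydantic model
    language_model_type: TransformerType
    quantised_model_name: str
    quantised_model_file: str
    quantised_model_type: str


def configure_language_model_sketch(
    language_model_type: str,
    standard_pipeline_type: str,
    standard_model_name: str,
    quantised_model_name: str,
    quantised_model_file: str,
    quantised_model_type: str,
) -> Union[QuantisedModel, StandardModel]:
    """Keep only the fields relevant to the requested kind of model."""
    if language_model_type == TransformerType.STANDARD_TRANSFORMERS:
        return StandardModel(
            TransformerType.STANDARD_TRANSFORMERS,
            standard_pipeline_type,
            standard_model_name,
        )
    if language_model_type == TransformerType.QUANTISED_CTRANSFORMERS:
        return QuantisedModel(
            TransformerType.QUANTISED_CTRANSFORMERS,
            quantised_model_name,
            quantised_model_file,
            quantised_model_type,
        )
    raise ValueError(f"Unsupported language model type: {language_model_type!r}")


# Placeholder names; the quantised fields are unused for a standard model.
standard = configure_language_model_sketch(
    TransformerType.STANDARD_TRANSFORMERS,
    "text2text-generation",
    "example-org/example-model",
    "",
    "",
    "",
)
```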

create_database_retriever(embedding_database: Chroma, search_type: RetrievalType, number_of_documents: int, initial_number_of_documents: int, diversity_level: float) VectorStoreRetriever#

Prepare a vector store retriever for the retrieval database.

Parameters:
  • embedding_database (Chroma) -- vector store

  • search_type (RetrievalType) -- kind of retrieval algorithm for searching vector store

  • number_of_documents (int) -- number of documents to retrieve

  • initial_number_of_documents (int) -- initial number of documents to consider

  • diversity_level (float) -- degree of diversity among retrieved documents (used only for MMR)

Returns:

vector store retriever

Return type:

VectorStoreRetriever

Raises:

ValueError -- if retrieval type is not supported

Notes

  • If search_type is similarity, only number_of_documents is used.

  • For maximal marginal relevance (MMR), diversity_level must be in [0, 1].

    • 0 means minimum diversity.

    • 1 means maximum diversity.
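
One way to read the notes above is as a mapping from the documented parameters to the search_kwargs keys that LangChain vector store retrievers accept (k, fetch_k, lambda_mult). The helper below is hypothetical. In particular, LangChain documents lambda_mult as 1 for minimum diversity and 0 for maximum, so the inversion 1 - diversity_level is an assumption about how the scale documented here could be translated, not a confirmed detail of this package.

```python
def build_search_kwargs(
    search_type: str,
    number_of_documents: int,
    initial_number_of_documents: int,
    diversity_level: float,
) -> dict:
    """Hypothetical helper: translate documented parameters to search kwargs."""
    if search_type == "similarity":
        # Plain similarity search only needs the number of results.
        return {"k": number_of_documents}
    if search_type == "mmr":
        if not 0.0 <= diversity_level <= 1.0:
            raise ValueError("diversity_level must be in [0, 1]")
        return {
            "k": number_of_documents,
            "fetch_k": initial_number_of_documents,
            # Assumption: invert the scale so that 0 means minimum diversity.
            "lambda_mult": 1.0 - diversity_level,
        }
    raise ValueError(f"Unsupported retrieval type: {search_type!r}")


similarity_kwargs = build_search_kwargs("similarity", 4, 20, 0.5)
mmr_kwargs = build_search_kwargs("mmr", 4, 20, 0.25)
```

With MMR, initial_number_of_documents candidates are fetched first and number_of_documents of them are kept after re-ranking, so the former should be at least as large as the latter.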

create_document_embedder(embedding_model: str) HuggingFaceEmbeddings#

Prepare a Sentence Transformers model for document embedding.

Parameters:

embedding_model (str) -- name of Sentence Transformers model from Hugging Face

Returns:

document embedder

Return type:

HuggingFaceEmbeddings

create_embedding_database(embedding_model: str, directory_path: pathlib.Path, source_documents: list[Document]) Chroma#

Prepare an embedding database.

Parameters:
  • embedding_model (str) -- name of Sentence Transformers model from Hugging Face

  • directory_path (pathlib.Path) -- path to directory for storing vector store

  • source_documents (list[Document]) -- partitioned source documents

Returns:

vector store

Return type:

Chroma

create_llm(language_model: QuantisedModel | StandardModel) CTransformers | HuggingFacePipeline#

Prepare a large language model.

Parameters:

language_model (QuantisedModel | StandardModel) -- details of large language model

Returns:

loaded large language model

Return type:

CTransformers | HuggingFacePipeline

Raises:

ValueError -- if language model type is not supported

Notes

  • At most 256 new tokens will be generated.

  • Deterministic behaviour is ensured for both types of large language models.

    • For transformers compatible models, top_k is set to 1.

    • For ctransformers compatible models, temperature is set to 0.
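
To see why top_k set to 1 gives deterministic output, consider a toy top-k sampler. This is an illustration of the sampling idea only, not the transformers or ctransformers implementation; the function name sample_top_k and the logits are made up for the example.

```python
import math
import random


def sample_top_k(logits: list, top_k: int, rng: random.Random) -> int:
    """Sample a token index from the top_k highest-scoring candidates."""
    # Rank token indices by score and keep only the best top_k of them.
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    candidates = ranked[:top_k]
    # Softmax weights over the surviving candidates (shifted for stability).
    best = logits[candidates[0]]
    weights = [math.exp(logits[i] - best) for i in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


logits = [1.2, 3.4, 0.5, 2.8]

# With a single candidate, every random seed yields the same token.
picks = {sample_top_k(logits, top_k=1, rng=random.Random(seed)) for seed in range(20)}
```

With top_k of 1 the candidate list contains only the highest-scoring token, so sampling reduces to greedy selection regardless of the random state.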

create_vector_store(embedder: HuggingFaceEmbeddings, directory_path: pathlib.Path) Chroma#

Initialise a Chroma vector store.

Parameters:
  • embedder (HuggingFaceEmbeddings) -- document embedder

  • directory_path (pathlib.Path) -- path to directory for storing vector store

Returns:

vector store

Return type:

Chroma

generate_retrieval_chain(database_retriever: VectorStoreRetriever, llm: CTransformers | HuggingFacePipeline) BaseRetrievalQA#

Prepare a retrieval chain for question answering.

Parameters:
  • database_retriever (VectorStoreRetriever) -- vector store retriever

  • llm (CTransformers | HuggingFacePipeline) -- large language model

Returns:

retrieval chain

Return type:

BaseRetrievalQA

Notes

  • The prompt template instructs the model not to answer if the answer is missing from the context.

  • It also instructs the model to keep the answer as concise as possible.
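
A prompt with the two instructions described above might look like the following. The wording, the placeholder names context and question, and the helper build_prompt are all hypothetical; the package's actual template is not shown in this reference.

```python
# Hypothetical template wording; only the two instructions come from the notes.
PROMPT_TEMPLATE = (
    "Use only the context below to answer the question.\n"
    "If the answer is not in the context, say that you do not know.\n"
    "Keep the answer as concise as possible.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
    "Answer:"
)


def build_prompt(context_chunks: list, question: str) -> str:
    """Fill the template with retrieved chunks and the user's question."""
    return PROMPT_TEMPLATE.format(
        context="\n\n".join(context_chunks),
        question=question,
    )


prompt = build_prompt(
    ["Chunk one of retrieved text.", "Chunk two of retrieved text."],
    "What does chunk one say?",
)
```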

load_embedding_database(embedding_model: str, directory_path: pathlib.Path) Chroma#

Load a vector store from the given directory on disk.

Parameters:
  • embedding_model (str) -- name of Sentence Transformers model from Hugging Face

  • directory_path (pathlib.Path) -- path to load vector store from

Returns:

vector store

Return type:

Chroma

Notes

  • embedding_model must match the one originally used for database creation.

load_source_documents(file_path: pathlib.Path) list[Document]#

Load and partition source documents.

Parameters:

file_path (pathlib.Path) -- path to JSON dataset

Returns:

partitioned source documents

Return type:

list[Document]

load_json_documents(file_path: pathlib.Path) list[Document]#

Load retrieval documents from a JSON file.

Parameters:

file_path (pathlib.Path) -- path to JSON file

Returns:

retrieval documents

Return type:

list[Document]

partition_documents(raw_documents: list[Document]) list[Document]#

Partition retrieval documents into chunks.

Parameters:

raw_documents (list[Document]) -- retrieval documents

Returns:

chunks of retrieval documents

Return type:

list[Document]

Notes

  • Chunk length will be at most 512 tokens.

  • Different chunks from same document will overlap by 64 tokens.
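
The chunk size and overlap in the notes above correspond to a sliding window over a token sequence. The sketch below assumes the tokens are already available as a list and uses a hypothetical helper, partition_tokens. A real text splitter typically also respects separators, so its chunks can be shorter and its overlaps less exact than in this idealised version.

```python
def partition_tokens(
    tokens: list,
    chunk_size: int = 512,
    chunk_overlap: int = 64,
) -> list:
    """Split tokens into windows of chunk_size that share chunk_overlap tokens."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end of the document
    return chunks


# A toy document of 1000 placeholder tokens.
tokens = [f"t{i}" for i in range(1000)]
chunks = partition_tokens(tokens)
```

Each window starts 448 tokens after the previous one, which is what makes the trailing 64 tokens of one chunk reappear at the head of the next.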

prepare_question_answer_chain(embedding_database: Chroma, search_type: RetrievalType, number_of_documents: int, initial_number_of_documents: int, diversity_level: float, language_model: QuantisedModel | StandardModel) RunnableSerializable#

Prepare a question answering pipeline.

Parameters:
  • embedding_database (Chroma) -- vector store

  • search_type (RetrievalType) -- kind of retrieval algorithm for searching vector store

  • number_of_documents (int) -- number of documents to retrieve

  • initial_number_of_documents (int) -- initial number of documents to consider

  • diversity_level (float) -- degree of diversity among retrieved documents (used only for MMR)

  • language_model (QuantisedModel | StandardModel) -- configurations of language model

Returns:

question answering pipeline

Return type:

RunnableSerializable

run_question_answer_chain(question_answer_chain: RunnableSerializable, question: str) tuple[dict, CaptureDetailsCallback]#

Run question answering pipeline for user input.

Parameters:
  • question_answer_chain (RunnableSerializable) -- question answering pipeline

  • question (str) -- query from user

Returns:

  • dict -- response from large language model

  • CaptureDetailsCallback -- callback capturing details of particular run of question answering pipeline

Return type:

tuple[dict, CaptureDetailsCallback]

store_embedding_database(vector_store: Chroma) None#

Persist the vector store to disk in its configured directory.

Parameters:

vector_store (Chroma) -- vector store