generative_ai.dataset_generation.step_2_generation module

Define functionality for generating retrieval and tuning sources.

enumerate_array_elements(array: list, attribute: str | None = None) -> str

Store all elements of array, or a common attribute of them, in a numbered list.

Parameters:
  • array (list) -- original objects whose elements (or their property) are to be stored

  • attribute (str | None, optional) -- name of the common attribute of the array elements that needs to be stored, by default None

Returns:

concatenated string with all elements of array (or their property) in a numbered list

Return type:

str

Raises:

ValueError -- if elements of array are not strings and attribute is missing
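
The documented behaviour can be sketched as follows. This is a minimal illustration and not the module's actual implementation: the numbering format (one `1. item` entry per line) and the error message are assumptions, and the function is named `enumerate_array_elements_sketch` to mark it as hypothetical.

```python
from types import SimpleNamespace
from typing import Optional


def enumerate_array_elements_sketch(array: list, attribute: Optional[str] = None) -> str:
    """Return the elements of ``array`` (or one attribute of each) as a numbered list."""
    if attribute is None:
        # Without an attribute name, every element must already be a string.
        if not all(isinstance(element, str) for element in array):
            raise ValueError("array elements must be strings when attribute is None")
        items = array
    else:
        # Otherwise, read the same attribute from every element.
        items = [getattr(element, attribute) for element in array]

    # Assumed output format: "1. first", "2. second", ... joined by newlines.
    return "\n".join(f"{index}. {item}" for index, item in enumerate(items, start=1))


# Hypothetical inputs for illustration only.
parameters = [SimpleNamespace(name="array"), SimpleNamespace(name="attribute")]
print(enumerate_array_elements_sketch(["alpha", "beta"]))
print(enumerate_array_elements_sketch(parameters, attribute="name"))
```

Passing non-string elements without `attribute` (for example `[1, 2]`) raises `ValueError` in this sketch, matching the documented contract.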

generate_class_member_dataset(class_member: str, class_docstring: str, member_type_details: ClassDetails) -> tuple[Dataset, list[str]]

Create relevant questions and answers based on class member details.

Parameters:
  • class_member (str) -- name of the class member

  • class_docstring (str) -- __doc__ attribute of the class member, if any

  • member_type_details (ClassDetails) -- details of the class member

Returns:

  • Dataset -- all documents for retrieval and tuning for querying class member documentation

  • list[str] -- only retrieval documents

Return type:

tuple[Dataset, list[str]]

generate_enum_member_dataset(enum_member: str, enum_docstring: str, member_type_details: EnumDetails) -> tuple[Dataset, list[str]]

Create relevant questions and answers based on enum member details.

Parameters:
  • enum_member (str) -- name of the enum member

  • enum_docstring (str) -- __doc__ attribute of the enum member, if any

  • member_type_details (EnumDetails) -- details of the enum member

Returns:

  • Dataset -- all documents for retrieval and tuning for querying enum member documentation

  • list[str] -- only retrieval documents

Return type:

tuple[Dataset, list[str]]

generate_function_member_dataset(function_member: str, function_docstring: str, member_type_details: FunctionDetails) -> tuple[Dataset, list[str]]

Create relevant questions and answers based on function member details.

Parameters:
  • function_member (str) -- name of the function member

  • function_docstring (str) -- __doc__ attribute of the function member, if any

  • member_type_details (FunctionDetails) -- details of the function member

Returns:

  • Dataset -- all documents for retrieval and tuning for querying function member documentation

  • list[str] -- only retrieval documents

Return type:

tuple[Dataset, list[str]]

generate_member_dataset(member_details: MemberDetails) -> tuple[Dataset, ...]

Create a dataset for a member.

Parameters:

member_details (MemberDetails) -- all details of the member

Returns:

all documents for retrieval and tuning for querying member documentation

Return type:

tuple[Dataset, ...]

Raises:

ValueError -- if the member type is not supported

Notes

  • A single dataset is returned if the member type is not enum, class or function.

  • Otherwise, two datasets are returned: one for the member and one for the member type.
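
Because the length of the returned tuple depends on the member type, callers may want to normalise it before use. The helper below is a hypothetical sketch, not part of this module: `split_member_datasets` is an invented name, and plain strings stand in for `Dataset` objects, whose real type is not described here.

```python
from typing import Any, Optional


def split_member_datasets(datasets: tuple) -> tuple[Any, Optional[Any]]:
    """Normalise a 1- or 2-element result into (member_dataset, member_type_dataset)."""
    if len(datasets) == 1:
        # Member type is not enum, class or function: only the member dataset exists.
        return datasets[0], None
    if len(datasets) == 2:
        # Enum, class or function: member dataset plus member type dataset.
        return datasets[0], datasets[1]
    raise ValueError(f"expected 1 or 2 datasets, got {len(datasets)}")


# Placeholder values standing in for the tuples generate_member_dataset would return.
member_only = split_member_datasets(("member dataset",))
member_and_type = split_member_datasets(("member dataset", "member type dataset"))
print(member_only)
print(member_and_type)
```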

generate_module_dataset(module_contents: ModuleDetails) -> Dataset

Create relevant questions and answers based on module details.

Parameters:

module_contents (ModuleDetails) -- details of a Python module

Returns:

all documents for retrieval and tuning for querying module documentation

Return type:

Dataset

generate_package_dataset(package_contents: PackageDetails) -> Dataset

Create relevant questions and answers based on package details.

Parameters:

package_contents (PackageDetails) -- details of a Python package

Returns:

all documents for retrieval and tuning for querying package documentation

Return type:

Dataset