Vector Stores
MariaDBStore provides a LangChain-compatible vector store backed by MariaDB, supporting similarity search, metadata filtering, and maximal marginal relevance retrieval.
Version: langchain-mariadb v0.0.21
DistanceStrategy
Distance strategies for vector similarity.
Attributes
EUCLIDEAN:
COSINE:
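The two strategies correspond to standard vector distances. The following is a minimal pure-Python sketch of what each one measures. The helper names `euclidean_distance` and `cosine_distance` are illustrative and are not part of langchain-mariadb, which is expected to compute distances inside the database.

```python
import math
from typing import Sequence


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line (L2) distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; 0 means same direction, 1 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)
```

Cosine distance ignores vector length and compares direction only, while Euclidean distance is sensitive to magnitude.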
TableConfig
Configuration for database table names.
Constructor
__init__(embedding_table: Optional[str] = None, collection_table: Optional[str] = None) -> None
Initialize TableConfig with custom or default table names.
Parameters:
embedding_table (Optional[str]): Name for embedding table (default: langchain_embedding)
collection_table (Optional[str]): Name for collection table (default: langchain_collection)
Methods
default
Create TableConfig with default values.
Attributes
embedding_table (str):
collection_table (str):
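The "custom or default" behavior described above can be pictured with a small stand-in class. This is only a sketch: `TableNames` is a hypothetical class written for illustration, not the library's TableConfig, and only the two default table names are taken from this page.

```python
from typing import Optional

DEFAULT_EMBEDDING_TABLE = "langchain_embedding"
DEFAULT_COLLECTION_TABLE = "langchain_collection"


class TableNames:
    """Hypothetical stand-in that mimics the documented defaulting behavior."""

    def __init__(
        self,
        embedding_table: Optional[str] = None,
        collection_table: Optional[str] = None,
    ) -> None:
        # Fall back to the documented default when no name is supplied.
        self.embedding_table = embedding_table or DEFAULT_EMBEDDING_TABLE
        self.collection_table = collection_table or DEFAULT_COLLECTION_TABLE

    @classmethod
    def default(cls) -> "TableNames":
        """Build an instance that uses every default name."""
        return cls()


custom = TableNames(embedding_table="my_vectors")
```

Here `custom` overrides only the embedding table and keeps the default collection table.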
ColumnConfig
Configuration for database column names.
Constructor
Initialize ColumnConfig with custom or default column names.
Parameters:
embedding_id (Optional[str]): Name for embedding ID column (default: id)
embedding (Optional[str]): Name for embedding vector column (default: embedding)
content (Optional[str]): Name for content column (default: content)
metadata (Optional[str]): Name for metadata column (default: metadata)
collection_id (Optional[str]): Name for collection ID column (default: id)
collection_label (Optional[str]): Name for collection label column (default: label)
collection_metadata (Optional[str]): Name for collection metadata column (default: metadata)
Methods
default
Create ColumnConfig with default values.
Attributes
embedding_id (str):
embedding (str):
content (str):
metadata (str):
collection_id (str):
collection_label (str):
collection_metadata (str):
MariaDBStoreSettings
Configuration for MariaDBStore.
Constructor
Initialize MariaDBStoreSettings with custom or default configurations.
Parameters:
tables (Optional[TableConfig]): Table configuration
columns (Optional[ColumnConfig]): Column configuration
pre_delete_collection (bool): Delete existing collection (default: False)
Methods
default
Create MariaDBStoreSettings with default values.
Attributes
tables (TableConfig):
columns (ColumnConfig):
pre_delete_collection (bool):
lazy_init (bool):
MariaDBStore
MariaDB vector store integration for LangChain.
Constructor
Initialize the MariaDB vector store.
Parameters:
embeddings (Embeddings): Embeddings object for creating embeddings
embedding_length (Optional[int]): Length of embedding vectors (default: 1536)
datasource (Union[Engine | str]): Datasource (connection string, SQLAlchemy engine, or MariaDB connection pool)
collection_name (str): Name of the collection to store vectors
collection_metadata (Optional[dict]): Optional metadata for the collection
distance_strategy (DistanceStrategy): Strategy for distances (COSINE or EUCLIDEAN)
config (MariaDBStoreSettings): Store configuration for tables and columns
logger (Optional[logging.Logger]): Optional logger instance for debugging
relevance_score_fn (Optional[Callable[[float], float]]): Function to override relevance score calculation
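The `relevance_score_fn` parameter accepts any function that maps one float to another. The sketch below shows two functions of that shape that convert a distance into a relevance score. The function names are illustrative, and whether these match the store's built-in defaults is an assumption, not something stated on this page.

```python
import math
from typing import Callable


def cosine_relevance(distance: float) -> float:
    """Map a cosine distance (0 = identical direction) to a relevance score."""
    return 1.0 - distance


def euclidean_relevance(distance: float) -> float:
    """Map a Euclidean distance between unit-length vectors to a score.

    Orthogonal unit vectors are sqrt(2) apart, which maps to a score of 0.
    """
    return 1.0 - distance / math.sqrt(2)


# Either function matches the Callable[[float], float] shape documented above.
score_fn: Callable[[float], float] = cosine_relevance
```

A custom function is mainly useful when the embeddings are not normalized or when scores need rescaling for a downstream threshold.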
Methods
create_tables_if_not_exists
Create the necessary database tables if they don't exist.
drop_tables
Drop all tables used by the vector store.
create_collection
Create a new collection or retrieve an existing one.
delete_collection
Delete the current collection and its associated data.
delete
Delete vectors by their IDs.
Parameters:
ids (Optional[List[str]]): List of IDs to delete
**kwargs (Any): Additional arguments (not used)
get_by_ids
Get documents by their IDs.
add_embeddings
Add embeddings to the vectorstore.
Parameters:
texts (Sequence[str]): Sequence of strings to add
embeddings (List[List[float]]): List of embedding vectors
metadatas (Optional[List[dict]]): Optional list of metadata dicts for each text
ids (Optional[List[str]]): Optional list of IDs for the documents
**kwargs (Any): Additional arguments (not used)
Returns:
List[str] - List of IDs for the added documents
Raises:
ValueError: If any provided ID contains invalid characters
add_texts
Run more texts through the embeddings and add to the vectorstore.
Parameters:
texts (Iterable[str]): Iterable of strings to add to the vectorstore.
metadatas (Optional[List[dict]]): Optional list of metadatas associated with the texts.
ids (Optional[List[str]]): Optional list of ids for the texts. If not provided, a new id is generated for each text.
**kwargs (Any): Vectorstore-specific parameters
Returns:
List[str] - List of ids from adding the texts into the vectorstore.
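The way texts, metadatas, and ids line up can be sketched with a small helper. `prepare_rows` is a hypothetical function written for this illustration. The use of UUIDs for generated ids and the length check are assumptions about reasonable behavior, not a description of the library's internals.

```python
import uuid
from typing import Iterable, List, Optional, Tuple


def prepare_rows(
    texts: Iterable[str],
    metadatas: Optional[List[dict]] = None,
    ids: Optional[List[str]] = None,
) -> List[Tuple[str, str, dict]]:
    """Pair each text with an ID and a metadata dict (hypothetical helper)."""
    texts = list(texts)
    if metadatas is None:
        metadatas = [{} for _ in texts]
    if ids is None:
        # No IDs supplied: generate a fresh one per text.
        ids = [str(uuid.uuid4()) for _ in texts]
    if not (len(texts) == len(metadatas) == len(ids)):
        raise ValueError("texts, metadatas and ids must have the same length")
    return list(zip(ids, texts, metadatas))


rows = prepare_rows(["first text", "second text"], ids=["a", "b"])
```

Passing explicit ids makes later calls to delete or get_by_ids predictable, since the caller already knows which id belongs to which text.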
similarity_search
Run similarity search with MariaDB.
Parameters:
query (str): Query text to search for
k (int): Number of results to return (default: 4)
filter (Union[None, dict]): Optional filter by metadata
**kwargs (Any): Additional arguments passed to similarity_search_by_vector
Returns:
List[Document] - List of Documents most similar to the query
similarity_search_with_score
Return docs most similar to query along with scores.
Parameters:
query (str): Text to look up documents similar to
k (int): Number of Documents to return (default: 4)
filter (Union[None, dict]): Optional filter by metadata
Returns:
List[Tuple[Document, float]] - List of tuples of (Document, similarity_score)
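The combination of a top-k search, a metadata filter, and a returned score can be illustrated with a toy in-memory version. This sketch makes several assumptions: `search_with_score` is a hypothetical function, the filter is treated as exact key/value matching, and the score is a Euclidean distance (smaller means more similar). The real store runs the query in MariaDB and may support a richer filter syntax.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

# Toy in-memory rows: (content, metadata, embedding).
Row = Tuple[str, Dict[str, str], Sequence[float]]


def search_with_score(
    rows: List[Row],
    query_vector: Sequence[float],
    k: int = 4,
    filter: Optional[Dict[str, str]] = None,
) -> List[Tuple[str, float]]:
    """Return the k nearest contents with their Euclidean distance."""
    candidates = [
        row
        for row in rows
        if filter is None
        or all(row[1].get(key) == value for key, value in filter.items())
    ]
    scored = [
        (content, math.dist(query_vector, vector))
        for content, _, vector in candidates
    ]
    scored.sort(key=lambda pair: pair[1])  # smaller distance = more similar
    return scored[:k]


rows = [
    ("cats", {"topic": "pets"}, [1.0, 0.0]),
    ("dogs", {"topic": "pets"}, [0.8, 0.6]),
    ("stocks", {"topic": "finance"}, [0.0, 1.0]),
]
results = search_with_score(rows, [1.0, 0.0], k=2, filter={"topic": "pets"})
```

The filter is applied before ranking, so documents that do not match the metadata never compete for the k result slots.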
similarity_search_with_score_by_vector
Return docs most similar to embedding vector along with scores.
Parameters:
embedding (List[float]): Embedding vector to look up documents similar to
k (int): Number of Documents to return (default: 4)
filter (Union[None, dict]): Optional filter by metadata
Returns:
List[Tuple[Document, float]] - List of tuples of (Document, similarity_score)
similarity_search_by_vector
Return docs most similar to embedding vector.
Parameters:
embedding (List[float]): Embedding vector to look up documents similar to
k (int): Number of Documents to return (default: 4)
filter (Union[None, dict]): Optional metadata filter
**kwargs (Any): Additional arguments (not used)
Returns:
List[Document] - List of Documents most similar to the query vector
max_marginal_relevance_search
Return docs selected using maximal marginal relevance.
Parameters:
query (str): Text to look up documents similar to
k (int): Number of documents to return (default: 4)
fetch_k (int): Number of documents to fetch before selecting top-k (default: 20)
lambda_mult (float): Balance between relevance and diversity, from 0 to 1 (default: 0.5); 0 maximizes diversity, 1 maximizes relevance
filter (Union[None, dict]): Optional metadata filter
**kwargs (Any): Additional arguments passed to search_by_vector
Returns:
List[Document] - List of Documents selected by maximal marginal relevance
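The roles of `k`, `fetch_k`, and `lambda_mult` are easier to see in a small greedy implementation of maximal marginal relevance. This is a generic textbook-style sketch in pure Python, not the library's code. The names `cosine_similarity` and `mmr_select` are illustrative, and the candidate list stands in for the `fetch_k` documents fetched before selection.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mmr_select(
    query: Sequence[float],
    candidates: List[Sequence[float]],
    k: int = 4,
    lambda_mult: float = 0.5,
) -> List[int]:
    """Greedily pick up to k candidate indices, trading relevance for diversity."""
    remaining = list(range(len(candidates)))
    selected: List[int] = []
    while remaining and len(selected) < k:

        def mmr_score(i: int) -> float:
            relevance = cosine_similarity(query, candidates[i])
            # Similarity to the closest already-selected item (0 if none yet).
            redundancy = max(
                (cosine_similarity(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1.0 - lambda_mult) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = [1.0, 0.0]
candidates = [[0.96, 0.28], [0.936, 0.352], [0.8, -0.6]]
diverse = mmr_select(query, candidates, k=2, lambda_mult=0.5)
relevant = mmr_select(query, candidates, k=2, lambda_mult=1.0)
```

With `lambda_mult=1.0` the two closest candidates are chosen even though they are near duplicates. With `lambda_mult=0.5` the second pick skips the near duplicate in favor of the more distinct third candidate.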
amax_marginal_relevance_search
Return docs selected using maximal marginal relevance asynchronously.
Parameters:
query (str): Text to look up documents similar to
k (int): Number of documents to return (default: 4)
fetch_k (int): Number of documents to fetch before selecting top-k (default: 20)
lambda_mult (float): Balance between relevance and diversity, from 0 to 1 (default: 0.5); 0 maximizes diversity, 1 maximizes relevance
filter (Union[None, dict]): Optional metadata filter
**kwargs (Any): Additional arguments passed to search_by_vector
Returns:
List[Document] - List of Documents selected by maximal marginal relevance
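Methods prefixed with `a` are coroutines and must be awaited. The sketch below shows one common way to expose a blocking call to async code by running it in a worker thread. `blocking_search` and `async_search` are hypothetical stand-ins, and whether the library implements its async methods this way is an assumption.

```python
import asyncio
from typing import List


def blocking_search(query: str, k: int = 4) -> List[str]:
    """Hypothetical stand-in for a synchronous search call."""
    corpus = ["alpha", "beta", "gamma", "delta", "epsilon"]
    return [doc for doc in corpus if query in doc][:k]


async def async_search(query: str, k: int = 4) -> List[str]:
    """Run the blocking call in a worker thread so the event loop stays free."""
    return await asyncio.to_thread(blocking_search, query, k)


results = asyncio.run(async_search("a", k=2))
```

Inside an application that already runs an event loop, the call would be `await async_search(...)` rather than `asyncio.run(...)`.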
max_marginal_relevance_search_with_score
Return docs selected using maximal marginal relevance with scores.
Parameters:
query (str): Text to look up documents similar to
k (int): Number of documents to return (default: 4)
fetch_k (int): Number of documents to fetch before selecting top-k (default: 20)
lambda_mult (float): Balance between relevance and diversity, from 0 to 1 (default: 0.5); 0 maximizes diversity, 1 maximizes relevance
filter (Union[None, dict]): Optional metadata filter
**kwargs (Any): Additional arguments passed to search_by_vector
Returns:
List[Tuple[Document, float]] - List of tuples of (Document, score) selected by maximal marginal relevance
amax_marginal_relevance_search_with_score
Return docs selected using maximal marginal relevance with scores asynchronously.
Parameters:
query (str): Text to look up documents similar to
k (int): Number of documents to return (default: 4)
fetch_k (int): Number of documents to fetch before selecting top-k (default: 20)
lambda_mult (float): Balance between relevance and diversity, from 0 to 1 (default: 0.5); 0 maximizes diversity, 1 maximizes relevance
filter (Union[None, dict]): Optional metadata filter
**kwargs (Any): Additional arguments passed to search_by_vector
Returns:
List[Tuple[Document, float]] - List of tuples of (Document, score) selected by maximal marginal relevance
max_marginal_relevance_search_by_vector
Return docs selected using maximal marginal relevance.
Parameters:
embedding (List[float]): Query embedding vector
k (int): Number of documents to return (default: 4)
fetch_k (int): Number of documents to fetch before selecting top-k (default: 20)
lambda_mult (float): Balance between relevance and diversity, from 0 to 1 (default: 0.5); 0 maximizes diversity, 1 maximizes relevance
filter (Union[None, dict]): Optional metadata filter
**kwargs (Any): Additional arguments (not used)
Returns:
List[Document] - List of Documents selected by maximal marginal relevance
amax_marginal_relevance_search_by_vector
Return docs selected using maximal marginal relevance asynchronously.
Parameters:
embedding (List[float]): Query embedding vector
k (int): Number of documents to return (default: 4)
fetch_k (int): Number of documents to fetch before selecting top-k (default: 20)
lambda_mult (float): Balance between relevance and diversity, from 0 to 1 (default: 0.5); 0 maximizes diversity, 1 maximizes relevance
filter (Union[None, dict]): Optional metadata filter
**kwargs (Any): Additional arguments (not used)
Returns:
List[Document] - List of Documents selected by maximal marginal relevance
max_marginal_relevance_search_with_score_by_vector
Return docs selected using maximal marginal relevance with scores.
Parameters:
embedding (List[float]): Query embedding vector
k (int): Number of documents to return (default: 4)
fetch_k (int): Number of documents to fetch before selecting top-k (default: 20)
lambda_mult (float): Balance between relevance and diversity, from 0 to 1 (default: 0.5); 0 maximizes diversity, 1 maximizes relevance
filter (Union[None, dict]): Optional metadata filter
**kwargs (Any): Additional arguments (not used)
Returns:
List[Tuple[Document, float]] - List of tuples of (Document, score) selected by maximal marginal relevance
amax_marginal_relevance_search_with_score_by_vector
Return docs selected using maximal marginal relevance with scores asynchronously.
Parameters:
embedding (List[float]): Query embedding vector
k (int): Number of documents to return (default: 4)
fetch_k (int): Number of documents to fetch before selecting top-k (default: 20)
lambda_mult (float): Balance between relevance and diversity, from 0 to 1 (default: 0.5); 0 maximizes diversity, 1 maximizes relevance
filter (Union[None, dict]): Optional metadata filter
**kwargs (Any): Additional arguments (not used)
Returns:
List[Tuple[Document, float]] - List of tuples of (Document, score) selected by maximal marginal relevance
from_texts
Create a MariaDBStore instance from texts.
Parameters:
texts (List[str]): List of text strings to store
embedding (Embeddings): Embeddings object for creating embeddings
metadatas (Optional[List[dict]]): Optional list of metadata dicts for each text
ids (Optional[List[str]]): Optional list of unique IDs for each text
datasource (Optional[Union[Engine, str]]): Database connection (connection string or SQLAlchemy engine)
collection_name (str): Name of the collection to store vectors
distance_strategy (DistanceStrategy): Strategy for distances (COSINE or EUCLIDEAN)
embedding_length (Optional[int]): Length of embedding vectors (default: 1536)
config (MariaDBStoreSettings): Store configuration for tables and columns
logger (Optional[logging.Logger]): Optional logger instance for debugging
relevance_score_fn (Optional[Callable[[float], float]]): Optional function to override relevance score calculation
**kwargs (Any): Additional arguments passed to add_embeddings
Returns:
MariaDBStore - MariaDBStore instance initialized with the provided texts
Raises:
ValueError: If datasource is not provided.
from_embeddings
Create a MariaDBStore instance from text-embedding pairs.
Parameters:
text_embeddings (List[Tuple[str, List[float]]]): List of (text, embedding) tuples
ids (Optional[List[str]]): Optional list of IDs for the documents
metadatas (Optional[List[dict]]): Optional list of metadata dicts
embedding (Embeddings): Embeddings object for creating embeddings
distance_strategy (DistanceStrategy): Strategy for computing distances
relevance_score_fn (Optional[Callable[[float], float]]): Optional function to compute relevance scores
config (MariaDBStoreSettings): Store configuration for tables and columns
**kwargs (Any): Additional arguments including datasource, collection_name
Returns:
MariaDBStore - MariaDBStore instance
from_existing_index
Create a MariaDBStore instance from an existing index.
Parameters:
embedding (Embeddings): Embeddings object for creating embeddings
collection_name (str): Name of collection (default: langchain)
distance_strategy (DistanceStrategy): Strategy for computing distances
datasource (Union[Engine | str]): Datasource (connection string, SQLAlchemy engine, or MariaDB connection pool)
**kwargs (Any): Additional arguments passed to constructor
Returns:
MariaDBStore - MariaDBStore instance connected to existing index
from_documents
Create a MariaDBStore instance from documents.
Parameters:
documents (List[Document]): List of Document objects to store
embedding (Embeddings): Embeddings object for creating embeddings
ids (Optional[List[str]]): Optional list of IDs for the documents
datasource (Optional[Union[Engine, str]]): Database connection (connection string or SQLAlchemy engine)
collection_name (str): Name of the collection to store vectors
distance_strategy (DistanceStrategy): Strategy for distances (COSINE or EUCLIDEAN)
embedding_length (Optional[int]): Length of embedding vectors (default: 1536)
config (MariaDBStoreSettings): Store configuration for tables and columns
logger (Optional[logging.Logger]): Optional logger instance for debugging
relevance_score_fn (Optional[Callable[[float], float]]): Optional function to override relevance score calculation
**kwargs (Any): Additional arguments passed to from_texts
Returns:
MariaDBStore - MariaDBStore instance
as_retriever
Return VectorStoreRetriever initialized from this VectorStore.
Parameters:
**kwargs (Any): Keyword arguments to pass to the search function. Can include:
  search_type: Defines the type of search that the Retriever should perform. Can be 'similarity' (default), 'mmr', or 'similarity_score_threshold'.
  search_kwargs: Keyword arguments to pass to the search function. Can include:
    k: Number of documents to return (default: 4)
    score_threshold: Minimum relevance threshold for similarity_score_threshold
    fetch_k: Number of documents to pass to the MMR algorithm (default: 20)
    lambda_mult: Diversity of results returned by MMR; 1 for minimum diversity and 0 for maximum (default: 0.5)
    filter: Filter by document metadata
Returns:
VectorStoreRetriever - VectorStoreRetriever instance configured for this vector store.
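The nesting of `search_type` and `search_kwargs` can be written out as plain dictionaries. The keys below are the ones listed above. The metadata key `topic` and the threshold value 0.8 are made up for illustration, and `store` refers to an already connected MariaDBStore that this sketch does not create.

```python
# Options for an MMR retriever, using the keys documented above.
retriever_options = {
    "search_type": "mmr",
    "search_kwargs": {
        "k": 4,
        "fetch_k": 20,
        "lambda_mult": 0.5,
        "filter": {"topic": "pets"},  # hypothetical metadata key
    },
}

# Options for a threshold-based retriever.
threshold_options = {
    "search_type": "similarity_score_threshold",
    "search_kwargs": {"k": 4, "score_threshold": 0.8},  # illustrative threshold
}

# With a connected store (not created here), these would be passed as:
# retriever = store.as_retriever(**retriever_options)
```

Keeping `fetch_k` larger than `k` gives the MMR step a pool of candidates to diversify over.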
Attributes
embedding_function:
collection_name:
collection_metadata:
pre_delete_collection:
lazy_init:
logger:
override_relevance_score_fn:
embeddings (Embeddings):