JaguarDB Dual Indexing

Jonathan Yue, PhD
Nov 21, 2023

In the Jaguar Vector Database (JaguarDB), a user’s data is organized into entities known as ‘pods’. Each pod is a dedicated repository for storing and managing that user’s data, and a single pod may contain multiple vector stores. Whereas conventional database systems are predominantly limited to scalar data, a JaguarDB pod can hold scalar and vector data side by side, which widens the range of data handling scenarios it can serve.

Vector stores are comparable to tables or collections in traditional database systems. They go further, however: a vector store holds rows of scalar data along with multiple vector indexes, and it also maintains scalar indexes over that scalar data. This dual-indexing mechanism is what improves data retrieval efficiency and query performance in JaguarDB.

When a vector store is created in JaguarDB without a key column, the system automatically generates a ‘zeromove unique ID’ to serve as the primary key. This key is named ‘zid’ (short for ‘zeromove unique ID’) and has the type ‘zuid’.
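The key-defaulting rule can be illustrated with a small in-memory toy model. This is only a sketch of the behavior described above, not JaguarDB's API or implementation: the class `ToyVectorStore`, its constructor arguments, and the use of a UUID hex string in place of the real ‘zuid’ type are all assumptions made for illustration.

```python
import uuid


class ToyVectorStore:
    """Toy in-memory stand-in for a vector store (hypothetical, not JaguarDB code)."""

    def __init__(self, columns, key=None):
        # If the caller names no key column, fall back to an auto-generated 'zid'.
        self.auto_key = key is None
        self.key = "zid" if self.auto_key else key
        self.columns = list(columns)
        self.rows = {}  # primary key value -> scalar fields

    def insert(self, **fields):
        if self.auto_key:
            # Assumption: a UUID4 hex string stands in for the real 'zuid' type.
            key_value = uuid.uuid4().hex
        else:
            key_value = fields[self.key]
        self.rows[key_value] = dict(fields)
        return key_value


# No key column given: the store generates a 'zid' for each row.
auto_store = ToyVectorStore(columns=["title"])
generated = auto_store.insert(title="red shoes")

# Key column given: the caller's value is used as the primary key.
keyed_store = ToyVectorStore(columns=["sku", "title"], key="sku")
explicit = keyed_store.insert(sku="A1", title="blue hat")
```

In this sketch the caller never supplies the ‘zid’; it is produced at insert time and returned so that the row can be fetched by it later.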

A vector store in JaguarDB can host multiple vector indexes alongside various scalar columns, or fields. When a vector store is created, JaguarDB automatically creates scalar indexes that associate each vector ID in the vector indexes with the corresponding ‘zeromove ID’ (zid). Retrieval then takes two steps: the vector ID returned by a vector index is used to look up the ‘zid’, and the ‘zid’ in turn gives rapid access to all associated fields in the vector store through these automatically generated indexes.
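The lookup chain from vector ID to ‘zid’ to row can be sketched as a toy model. Everything here is an assumption for illustration: the class `ToyDualIndexStore`, the brute-force cosine search standing in for a real vector index, list positions serving as vector IDs, and a UUID hex string standing in for the ‘zuid’ type. It shows the shape of the mechanism, not how JaguarDB implements it.

```python
import math
import uuid


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyDualIndexStore:
    """Hypothetical store pairing a vector index with a scalar index."""

    def __init__(self):
        self.vectors = []     # vector index: list position is the vector ID
        self.vid_to_zid = {}  # scalar index: vector ID -> zid
        self.rows = {}        # row storage: zid -> scalar fields

    def add(self, vector, **fields):
        zid = uuid.uuid4().hex  # assumption: stand-in for a real 'zuid'
        vid = len(self.vectors)
        self.vectors.append(list(vector))
        self.vid_to_zid[vid] = zid  # maintained automatically on every insert
        self.rows[zid] = fields
        return zid

    def search(self, query, k=1):
        # Step 1: the vector index yields the nearest vector IDs.
        ranked = sorted(
            range(len(self.vectors)),
            key=lambda vid: cosine(query, self.vectors[vid]),
            reverse=True,
        )
        results = []
        for vid in ranked[:k]:
            zid = self.vid_to_zid[vid]  # Step 2: vector ID -> zid
            results.append((vid, zid, self.rows[zid]))  # Step 3: zid -> fields
        return results


store = ToyDualIndexStore()
store.add([1.0, 0.0], title="red shoes", price=40)
store.add([0.0, 1.0], title="blue hat", price=15)
hits = store.search([0.9, 0.1], k=1)
```

Here `hits` holds one tuple of vector ID, ‘zid’, and the row's scalar fields, so a similarity search and the fetch of the matching business data both happen inside the same store.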

Keeping vector data, scalar data, and their indexes in a single vector store can significantly speed up retrieval and search over complex business data. By contrast, storing such data in separate locations often leads to reduced efficiency and slower access. With the data localized in one store, businesses can streamline their data handling and support quicker, more effective analysis and decision-making.


Jonathan Yue, PhD

Enthusiast of vector databases, AI, RAG, data science, consensus algorithms, and distributed systems. Initiator and developer of the JaguarDB vector database.