A collection is the primary container for storing and querying documents in Zvec. Each collection has a fixed schema that defines its structure, including scalar fields and vector fields.
What is a Collection?
A collection in Zvec is similar to a table in a traditional database. It:
- Stores documents with a consistent schema
- Persists data to disk for durability
- Supports CRUD operations (Create, Read, Update, Delete)
- Enables vector similarity search with optional filtering
- Manages indexes for efficient querying
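To make "vector similarity search with optional filtering" concrete, here is a minimal brute-force sketch in plain Python. It is a conceptual illustration only and does not use or describe Zvec's actual API or index structures; the document layout (`id`, `fields`, `vector`) and the `search` function are hypothetical names chosen for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


def search(docs, query_vector, topk=3, predicate=None):
    """Return up to `topk` (id, score) pairs, best match first.

    docs: list of dicts with the hypothetical keys "id", "fields", "vector".
    predicate: optional function applied to a document's scalar fields;
               documents for which it returns False are skipped.
    """
    candidates = [d for d in docs if predicate is None or predicate(d["fields"])]
    scored = [(d["id"], cosine_similarity(d["vector"], query_vector)) for d in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topk]


# Three toy documents, each with one scalar field and one 2-dimensional vector.
docs = [
    {"id": "a", "fields": {"lang": "en"}, "vector": [1.0, 0.0]},
    {"id": "b", "fields": {"lang": "de"}, "vector": [0.9, 0.1]},
    {"id": "c", "fields": {"lang": "en"}, "vector": [0.0, 1.0]},
]

# Restrict the search to English documents, then rank by similarity.
results = search(docs, [1.0, 0.0], topk=2, predicate=lambda f: f["lang"] == "en")
```

A real collection would typically answer the same kind of query through an index rather than a full scan, which is why index management appears in the list above.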
Collection Lifecycle
Creating a New Collection
Use `create_and_open()` to create a new collection with a defined schema.
Opening an Existing Collection
Use `open()` to access a previously created collection.
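Since creating and opening are separate steps, applications often wrap them in an "open if present, otherwise create" helper. The sketch below is one way to do that, under stated assumptions: `open_or_create` is a hypothetical helper, and the open and create functions are passed in as parameters so the code makes no claim about the exact signatures of Zvec's `open()` and `create_and_open()`. The demonstration uses stand-in functions, not the real library.

```python
import os
import tempfile


def open_or_create(path, schema, open_fn, create_fn):
    """Open the collection at `path` if it exists, otherwise create it.

    open_fn(path) and create_fn(path, schema) are injected by the caller,
    so this helper assumes nothing about the library's own signatures.
    """
    if os.path.exists(path):
        return open_fn(path)
    return create_fn(path, schema)


# Demonstration with stand-in functions (not the real Zvec calls).
calls = []


def fake_open(path):
    calls.append("open")
    return {"path": path}


def fake_create(path, schema):
    calls.append("create")
    os.makedirs(path)  # simulate the collection directory being written
    return {"path": path, "schema": schema}


with tempfile.TemporaryDirectory() as base:
    target = os.path.join(base, "my_collection")
    open_or_create(target, {"name": "demo"}, fake_open, fake_create)  # creates
    open_or_create(target, {"name": "demo"}, fake_open, fake_create)  # opens
```

With Zvec, `open_fn` and `create_fn` would be thin wrappers around the library's `open()` and `create_and_open()`; check the API reference for their exact parameters before relying on this pattern.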
The collection must have been previously created with `create_and_open()`. Opening a non-existent collection raises an error.
Collection Properties
The `Collection` class exposes several read-only properties:
| Property | Type | Description |
|---|---|---|
| `path` | `str` | Filesystem path where collection data is stored |
| `schema` | `CollectionSchema` | The schema defining the collection structure |
| `stats` | `CollectionStats` | Runtime statistics (document count, size, etc.) |
| `option` | `CollectionOption` | Configuration options used to open the collection |
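As a small illustration of reading these properties, the sketch below gathers them into a dictionary, for example for logging. The `describe` helper is hypothetical, and a `SimpleNamespace` stands in for a real collection object so the snippet runs without Zvec installed.

```python
from types import SimpleNamespace


def describe(collection):
    """Collect the four read-only properties from the table into a dict."""
    return {
        "path": collection.path,
        "schema": collection.schema,
        "stats": collection.stats,
        "option": collection.option,
    }


# Stand-in object for illustration; a real Collection would supply
# CollectionSchema, CollectionStats, and CollectionOption instances.
fake = SimpleNamespace(
    path="./data/demo",
    schema="<schema>",
    stats="<stats>",
    option="<option>",
)
summary = describe(fake)
```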
Core Collection Operations
Data Manipulation (DML)
Collections support standard CRUD operations.
Data Retrieval (DQL)
Schema Modification (DDL)
Collections support dynamic schema changes.
Persistence and Durability
Flushing Data
By default, Zvec buffers writes in memory for performance. Use `flush()` to ensure data is persisted to disk.
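One way to make sure a group of writes ends with a flush is a small context manager. This is a sketch, not part of Zvec: `flushed` is a hypothetical helper, `FakeCollection` is a stand-in that mimics buffered writes, and its `insert` method is an assumed name. Only `flush()` is taken from the text above.

```python
from contextlib import contextmanager


@contextmanager
def flushed(collection):
    """Yield the collection, then call flush() once the block succeeds."""
    yield collection
    collection.flush()  # not reached if the block raised an exception


class FakeCollection:
    """Stand-in that buffers writes in memory until flush() is called."""

    def __init__(self):
        self.buffer = []
        self.persisted = []

    def insert(self, doc):  # hypothetical write method
        self.buffer.append(doc)

    def flush(self):
        self.persisted.extend(self.buffer)
        self.buffer.clear()


col = FakeCollection()
with flushed(col) as c:
    c.insert({"id": "1"})
    c.insert({"id": "2"})
# After the block exits normally, both documents have been flushed.
```

The helper deliberately skips the flush when the block raises, leaving the decision about partially written data to the caller; wrapping the `yield` in `try`/`finally` would flush unconditionally instead.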
Optimizing Performance
Periodically optimize the collection to merge segments and rebuild indexes.
Destroying a Collection
Multi-Collection Workflows
You can work with multiple collections simultaneously.
Best Practices
Design your schema carefully
Collection schemas are fixed at creation time. Plan your fields and vector dimensions in advance. Use `nullable=True` for fields that may not always have values.
Batch operations when possible
Inserting multiple documents in a single call is more efficient than inserting them one at a time.
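A batching helper along the following lines can split a large stream of documents into fixed-size groups. It is a sketch under assumptions: `batches` and `insert_in_batches` are hypothetical helpers, and `insert_fn` stands for whatever insert method the collection exposes, assumed here to accept a list of documents. The demonstration passes `list.append` as a stand-in.

```python
from itertools import islice


def batches(items, size):
    """Yield consecutive lists of at most `size` items from any iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def insert_in_batches(insert_fn, docs, size=1000):
    """Call insert_fn once per batch and return the number of calls made."""
    calls = 0
    for chunk in batches(docs, size):
        insert_fn(chunk)
        calls += 1
    return calls


# Stand-in for a collection's insert method: record each batch received.
received = []
n_calls = insert_in_batches(received.append, range(5), size=2)
```

Five items with a batch size of two result in three calls, the last carrying the single leftover item. A suitable batch size depends on document size and available memory, so treat the default of 1000 as a placeholder.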
Flush periodically for durability
If your application requires strong durability guarantees, call `flush()` after critical writes. However, excessive flushing can impact performance.
Create indexes before large-scale queries
Build appropriate indexes (HNSW, IVF) on vector fields before running similarity searches at scale. See Indexing for details.
Next Steps
- Schemas: Learn how to define collection schemas with fields and vectors
- Vectors: Understand dense and sparse vector types
- Indexing: Optimize search performance with indexes
- Querying: Execute vector similarity searches with filters