Troubleshooting

Installation Issues

Python Version Incompatibility

Problem: ERROR: Package 'zvec' requires a different Python: 3.9.x not in '>=3.10,<3.13'

Solution: Zvec requires Python 3.10, 3.11, or 3.12. Check your Python version:

python --version

If you’re on an older version, install a compatible Python:

# Using conda
conda create -n zvec_env python=3.11
conda activate zvec_env
pip install zvec

# Using pyenv
pyenv install 3.11.0
pyenv virtualenv 3.11.0 zvec_env
pyenv activate zvec_env
pip install zvec

Python 3.13+ is not yet supported. Stick to Python 3.10-3.12.
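The version check can also be scripted so that it fails early, before pip runs. The following is a minimal sketch: the helper name `is_supported_python` is hypothetical, and the bounds are copied from the pip error message above rather than read from the package.

```python
import sys

# Bounds taken from the pip error message: '>=3.10,<3.13'
MIN_VERSION = (3, 10)
MAX_VERSION_EXCLUSIVE = (3, 13)

def is_supported_python(version=None):
    """Return True if (major, minor) falls inside the supported range."""
    major_minor = tuple(version or sys.version_info)[:2]
    return MIN_VERSION <= major_minor < MAX_VERSION_EXCLUSIVE

if __name__ == "__main__":
    current = ".".join(map(str, sys.version_info[:3]))
    status = "supported" if is_supported_python() else "NOT supported"
    print(f"Python {current} is {status} by zvec")
```

Run it with the same interpreter you plan to install into, since `python` and `pip` can point at different installations.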

Platform Not Supported

Problem: ERROR: No matching distribution found for zvec

Solution: Zvec currently supports:

Linux (x86_64, ARM64)
macOS (ARM64 only - Apple Silicon)

Check your platform:

# Check OS
uname -s

# Check architecture
uname -m  # Should show x86_64, aarch64, or arm64

If you’re on an unsupported platform (Windows, macOS Intel), choose one of the following options:

Use a supported platform (Linux VM, Docker, etc.)
Wait for future platform support
Build from source (advanced)
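The platform check above can also be run from Python, which helps when the shell and the interpreter disagree (for example, an x86_64 Python running under emulation on Apple Silicon). This is a sketch only: `is_supported_platform` is a hypothetical helper, and the supported pairs are transcribed from the list above.

```python
import platform

# Supported (system, machine) pairs, taken from the list above.
# platform.machine() usually reports "aarch64" on ARM64 Linux
# and "arm64" on Apple Silicon macOS.
SUPPORTED = {
    ("Linux", "x86_64"),
    ("Linux", "aarch64"),
    ("Linux", "arm64"),
    ("Darwin", "arm64"),
}

def is_supported_platform(system=None, machine=None):
    """Check the current (or a given) OS and architecture against the list."""
    system = system or platform.system()
    machine = machine or platform.machine()
    return (system, machine) in SUPPORTED

if __name__ == "__main__":
    print(platform.system(), platform.machine(), is_supported_platform())
```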

Import Errors After Installation

Problem: ImportError: cannot import name 'zvec' from 'zvec' or ModuleNotFoundError: No module named 'zvec'

Solutions:

Verify installation:
```
pip show zvec
```
Check Python path:
```
import sys
print(sys.path)
```

Reinstall with cache clear:

pip uninstall zvec
pip install --no-cache-dir zvec

Check for naming conflicts:

# Make sure you don't have a file named zvec.py in your working directory
ls -la zvec.py

If using a virtual environment, ensure it’s activated before installing and running.
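To see which file Python would actually load for a module name, you can ask the import system directly. The sketch below uses only the standard library; `find_module_origin` is a hypothetical helper name.

```python
import importlib.util

def find_module_origin(name):
    """Return the file a top-level module would be imported from, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    if spec.origin is not None:
        return spec.origin
    # Namespace packages have no single file; report their search locations.
    return ", ".join(spec.submodule_search_locations or [])

if __name__ == "__main__":
    print(find_module_origin("zvec"))
```

If the printed path points into your working directory rather than your environment's `site-packages`, a local file or folder named `zvec` is shadowing the installed package. A result of `None` means the package is not visible to this interpreter at all.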

Build Errors When Installing from Source

Problem: CMake Error or C++ compiler error when building

Solutions:

Check CMake version (requires ≥ 3.26, < 4.0):
```
cmake --version
```
Install if needed:
```
pip install cmake==3.27.0
```

Check C++ compiler:

g++ --version  # Should be 11+

Install if needed:

# Ubuntu/Debian
sudo apt-get install g++-11

# macOS
xcode-select --install

Initialize submodules:

git submodule update --init --recursive

Clean build:

pip uninstall zvec
rm -rf build/ dist/ *.egg-info
pip install -e ".[dev]"

See the Building from Source guide for detailed build instructions.
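The CMake range above (≥ 3.26, < 4.0) is easy to misread when comparing by eye, because 3.9 is older than 3.26. A small sketch of the comparison follows; `cmake_version_ok` is a hypothetical helper that parses the text printed by `cmake --version`.

```python
import re

def cmake_version_ok(version_output):
    """Check `cmake --version` output against the range above (>= 3.26, < 4.0)."""
    match = re.search(r"(\d+)\.(\d+)", version_output)
    if match is None:
        return False
    major, minor = int(match.group(1)), int(match.group(2))
    # Compare as integer tuples so that 3.9 sorts below 3.26.
    return (3, 26) <= (major, minor) < (4, 0)

print(cmake_version_ok("cmake version 3.27.0"))
```

To check the installed tool, pass in the output of `subprocess.run(["cmake", "--version"], capture_output=True, text=True).stdout`.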

Runtime Errors

Collection Creation Failed

Problem: Status error when creating collection or Failed to create collection

Solutions:

Check directory permissions:
```
ls -la /path/to/collection
```
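Ensure you have write access.
Verify the directory doesn’t already exist (for create operations). The command below permanently deletes the existing collection, so run it only if you intend to recreate it: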
```
rm -rf ./my_collection  # If you want to recreate
```

Check schema validity:

import zvec

# Ensure schema is properly defined
schema = zvec.CollectionSchema(
    name="test",
    vectors=zvec.VectorSchema("embedding", zvec.DataType.VECTOR_FP32, 128)
)
print(schema)  # Verify schema

Check disk space:
```
df -h /path/to/collection
```
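The permission, existence, and disk-space checks above can be combined into one preflight step before creating a collection. This is a sketch using only the standard library; `preflight` is a hypothetical helper, and the 1 GiB free-space threshold is an arbitrary assumption rather than a Zvec requirement.

```python
import os
import shutil
from pathlib import Path

def preflight(collection_path, min_free_bytes=1 << 30):
    """Check a collection path that is about to be created."""
    path = Path(collection_path).resolve()
    # Walk up to the nearest directory that exists; that is where
    # the new directory would be created and where space is consumed.
    parent = path.parent
    while not parent.exists():
        parent = parent.parent
    free = shutil.disk_usage(parent).free
    return {
        "already_exists": path.exists(),
        "parent_writable": os.access(parent, os.W_OK | os.X_OK),
        "free_bytes": free,
        "enough_space": free >= min_free_bytes,
    }

if __name__ == "__main__":
    for key, value in preflight("./my_collection").items():
        print(f"{key}: {value}")
```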

Insert Operation Failed

Problem: Failed to insert documents or Invalid vector dimension

Solutions:

Verify vector dimensions match schema:

# Schema specifies dimension 768
schema = zvec.CollectionSchema(
    name="docs",
    vectors=zvec.VectorSchema("embedding", zvec.DataType.VECTOR_FP32, 768)
)

# Vector must be exactly 768 dimensions
doc = zvec.Doc(id="1", vectors={"embedding": [0.1] * 768})
collection.insert([doc])

Check vector data type:

# Ensure vector is a list of floats, not numpy array
import numpy as np
vector = np.random.rand(768)
doc = zvec.Doc(id="1", vectors={"embedding": vector.tolist()})  # Convert to list

Verify document ID is unique:

# Document IDs must be unique within a collection
# Use update() if you want to modify an existing document

Check field names and types:

# Field names must match schema
doc = zvec.Doc(
    id="1",
    vectors={"embedding": vector.tolist()},  # Plain list, not the numpy array
    fields={"title": "Text", "count": 42}  # Match schema field names
)

Batch inserts are more efficient than single inserts. Insert multiple documents at once when possible.
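Dimension and type problems are easier to diagnose if they are caught on the client before the insert call. The sketch below is a plain-Python check that does not call Zvec; `check_vector` is a hypothetical helper, and it is deliberately stricter than the library may be (for example, it rejects tuples).

```python
import math
from numbers import Real

def check_vector(vector, expected_dim):
    """Return a list of problems found in `vector`; an empty list means it looks fine."""
    problems = []
    if not isinstance(vector, list):
        problems.append(f"expected list, got {type(vector).__name__}")
        return problems
    if len(vector) != expected_dim:
        problems.append(f"expected {expected_dim} dimensions, got {len(vector)}")
    for i, x in enumerate(vector):
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(x, bool) or not isinstance(x, Real):
            problems.append(f"element {i} is not a number: {x!r}")
            break
        if not math.isfinite(x):
            problems.append(f"element {i} is not finite: {x!r}")
            break
    return problems

print(check_vector([0.1] * 512, 768))
```

Running the check over a batch before `collection.insert(...)` pinpoints the offending document instead of failing the whole batch with a generic error.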

Query Returns No Results

Problem: Query executes but returns empty results

Solutions:

Verify data was inserted:

stats = collection.stats()
print(f"Document count: {stats.doc_count}")  # Attribute access, as in Check Collection Stats

Check if optimize is needed:

# Optimize after bulk inserts
collection.optimize()

Verify query vector dimensions:

query_vector = [0.1] * 768  # Must match schema dimension
results = collection.query(
    zvec.VectorQuery("embedding", vector=query_vector),
    topk=10
)

Increase topk or adjust parameters:

# Try larger topk
results = collection.query(
    zvec.VectorQuery("embedding", vector=query_vector),
    topk=100  # Increased from 10
)

# Or adjust HNSW ef_search
params = zvec.HnswQueryParams(ef_search=100)
results = collection.query(
    zvec.VectorQuery("embedding", vector=query_vector, params=params),
    topk=10
)

Check filters aren’t too restrictive:

# Remove or relax filters temporarily
results = collection.query(
    zvec.VectorQuery("embedding", vector=query_vector),
    # filter="category == 'test'",  # Comment out temporarily
    topk=10
)

Memory Errors

Problem: MemoryError, std::bad_alloc, or process killed (OOM)

Solutions:

Check memory usage:

# Monitor memory while running
htop  # or top

Reduce batch size:

# Instead of inserting 100K docs at once
batch_size = 1000
for i in range(0, len(docs), batch_size):
    collection.insert(docs[i:i+batch_size])

Use lower precision vectors:

# Use FP16 instead of FP32 to halve memory usage
schema = zvec.CollectionSchema(
    name="docs",
    vectors=zvec.VectorSchema("embedding", zvec.DataType.VECTOR_FP16, 768)
)

Optimize collection regularly:

# Consolidate segments to reduce memory overhead
collection.optimize()

Consider IVF index for large datasets:

# IVF uses less memory than HNSW
from zvec import IVFIndexParams, MetricType

schema = zvec.CollectionSchema(
    name="large_collection",
    vectors=zvec.VectorSchema(
        "embedding",
        zvec.DataType.VECTOR_FP32,
        768,
        index_params=IVFIndexParams(metric_type=MetricType.L2)
    )
)

Memory requirements: roughly N × D × bytes_per_element × 1.2 where N = vector count, D = dimension.
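As a worked instance of the rule of thumb above, the sketch below applies it to one million 768-dimensional vectors. The helper name `estimate_memory_bytes` is hypothetical, and the element sizes are the standard widths of each type (4, 2, and 1 bytes); the 1.2 overhead factor comes from the formula above and is only an approximation.

```python
BYTES_PER_ELEMENT = {"FP32": 4, "FP16": 2, "INT8": 1}

def estimate_memory_bytes(num_vectors, dimension, dtype="FP32", overhead=1.2):
    """Apply N x D x bytes_per_element x overhead from the rule of thumb."""
    return num_vectors * dimension * BYTES_PER_ELEMENT[dtype] * overhead

for dtype in BYTES_PER_ELEMENT:
    gib = estimate_memory_bytes(1_000_000, 768, dtype) / 2**30
    print(f"{dtype}: {gib:.2f} GiB")
```

For FP32 this comes to roughly 3.43 GiB, so the same data in FP16 needs about half of that.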

File Lock or Corruption Errors

Problem: Failed to open collection, Lock file exists, or Corrupted data

Solutions:

Check for running processes:

# Find processes using the collection
lsof /path/to/collection

Close collection properly:

# Always close or use context manager
collection.close()

# Or use with statement
with zvec.open("./data") as collection:
    # Operations here
    pass
# Automatically closed

Remove stale lock files (if no process is running):
```
rm /path/to/collection/*.lock
```

Restore from backup:

# If data is corrupted, restore from backup
rm -rf ./corrupted_collection
cp -r ./backup/collection ./recovered_collection

Only remove lock files if you’re certain no other process is using the collection.
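Before deleting anything, it can help to list the lock files and see how old they are. The sketch below only reads the directory and never removes files; `list_lock_files` is a hypothetical helper, and the `*.lock` naming is assumed from the `rm` command above.

```python
import time
from pathlib import Path

def list_lock_files(collection_path):
    """Return (path, age_in_seconds) for each *.lock file directly inside the directory."""
    directory = Path(collection_path)
    if not directory.is_dir():
        return []
    now = time.time()
    return [
        (lock, now - lock.stat().st_mtime)
        for lock in sorted(directory.glob("*.lock"))
    ]

if __name__ == "__main__":
    for lock, age in list_lock_files("/path/to/collection"):
        print(f"{lock}  last modified {age:.0f}s ago")
```

A lock file that was modified long ago while `lsof` shows no process holding the collection is a likely leftover from a crash; a recently modified one suggests another process is still active.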

Performance Issues

Slow Query Performance

Problem: Queries take too long

Solutions:

Optimize the collection:

# Consolidate segments after bulk inserts
collection.optimize()

Tune HNSW ef_search (recall vs. speed tradeoff):

# Lower ef_search = faster but lower recall
params = zvec.HnswQueryParams(ef_search=50)  # Default is often 100+

results = collection.query(
    zvec.VectorQuery("embedding", vector=query_vector, params=params),
    topk=10
)

Check index parameters (set during schema creation):

# For faster queries, reduce M or increase ef_construction
from zvec import HnswIndexParams, MetricType

index_params = HnswIndexParams(
    metric_type=MetricType.IP,
    m=16,  # Reduce from default 32 for faster queries
    ef_construction=200
)

Use appropriate metric type:

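# IP (Inner Product) is fastest for normalized vectors
# Normalize vectors before insertion:
import numpy as np

def normalize(v):
    v = np.asarray(v, dtype=np.float32)
    norm = np.linalg.norm(v)
    if norm == 0:
        raise ValueError("Cannot normalize a zero vector")
    return (v / norm).tolist()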

Profile query patterns:

import time

start = time.perf_counter()  # Monotonic clock, better suited to timing than time.time()
results = collection.query(...)
print(f"Query took {time.perf_counter() - start:.3f}s")
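A single timed call can be misleading, because the first query often pays for cold caches. The sketch below repeats a call and reports median and tail latency; `time_calls` is a hypothetical helper, and the p95 value uses a simple index-based approximation.

```python
import statistics
import time

def time_calls(fn, repeats=20, warmup=3):
    """Call `fn` repeatedly and return latency statistics in milliseconds."""
    for _ in range(warmup):
        fn()  # Warm caches so the first cold call does not skew the numbers
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000)
    samples.sort()
    return {
        "median_ms": statistics.median(samples),
        "p95_ms": samples[min(len(samples) - 1, int(0.95 * len(samples)))],
        "max_ms": samples[-1],
    }

print(time_calls(lambda: sum(range(10_000))))
```

To profile a real query, pass a callable that wraps it, for example `time_calls(lambda: collection.query(zvec.VectorQuery("embedding", vector=query_vector), topk=10))`, and compare the numbers before and after changing `ef_search`.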

Slow Insert Performance

Problem: Insertions take too long

Solutions:

Use batch inserts:

# Bad: Insert one at a time
for doc in docs:
    collection.insert([doc])  # Slow

# Good: Batch insert
collection.insert(docs)  # Much faster

Optimize less frequently:

# Don't optimize after every insert
# Instead, optimize periodically
for batch_count, batch in enumerate(data_batches, start=1):
    collection.insert(batch)
    if batch_count % 10 == 0:  # Every 10 batches
        collection.optimize()

Adjust index construction parameters:

# Lower ef_construction for faster indexing (but lower recall)
from zvec import HnswIndexParams, MetricType

index_params = HnswIndexParams(
    metric_type=MetricType.IP,
    m=16,
    ef_construction=100  # Lower = faster inserts
)

Consider using Flat index initially:

# Build with Flat index, then convert to HNSW
from zvec import FlatIndexParams

# Start with Flat for fast ingestion
schema = zvec.CollectionSchema(
    name="temp",
    vectors=zvec.VectorSchema(
        "embedding",
        zvec.DataType.VECTOR_FP32,
        768,
        index_params=FlatIndexParams()
    )
)
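When documents come from a generator or a file rather than a list, slicing is not available for building batches. The sketch below splits any iterable into fixed-size batches; `batched` is a hypothetical helper written for this page (Python 3.12 ships `itertools.batched`, but the supported range also includes 3.10 and 3.11).

```python
from itertools import islice

def batched(items, batch_size):
    """Yield successive lists of at most `batch_size` items from any iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch

print(list(batched(range(7), 3)))
```

With a collection open, the loop becomes `for batch in batched(docs, 1000): collection.insert(batch)`, which keeps only one batch in memory at a time.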

High Memory Usage

Problem: The process uses too much memory

Solutions:

Switch to lower precision:

# Use FP16 instead of FP32 (half the memory)
zvec.DataType.VECTOR_FP16

# Or use quantized INT8 (1/4 the memory)
zvec.DataType.VECTOR_INT8

Use IVF instead of HNSW:

from zvec import IVFIndexParams, MetricType

# IVF uses significantly less memory
index_params = IVFIndexParams(
    metric_type=MetricType.L2,
    nlist=100  # Number of clusters
)

Enable memory-mapped storage:

# Let OS manage memory
collection_options = zvec.CollectionOptions(
    use_mmap=True  # Use memory-mapped files
)

Reduce HNSW M parameter:

# Lower M = less memory, but lower recall at the same ef_search
from zvec import HnswIndexParams, MetricType

index_params = HnswIndexParams(
    metric_type=MetricType.IP,
    m=8  # Default is often 16-32
)
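Lower precision saves memory at the cost of rounding error. The sketch below illustrates that tradeoff with the standard library's `struct` module, where format `f` is a 32-bit float and `e` is a 16-bit float. It does not use Zvec, and how Zvec converts values internally is not covered here; `packed_size_and_error` is a hypothetical helper.

```python
import struct

def packed_size_and_error(values, fmt_char):
    """Pack floats as 'f' (32-bit) or 'e' (16-bit); return (bytes used, max round-trip error)."""
    fmt = f"<{len(values)}{fmt_char}"
    packed = struct.pack(fmt, *values)
    restored = struct.unpack(fmt, packed)
    max_error = max(abs(a - b) for a, b in zip(values, restored))
    return len(packed), max_error

vector = [0.1 * (i % 10) for i in range(768)]
for label, fmt_char in (("FP32", "f"), ("FP16", "e")):
    size, err = packed_size_and_error(vector, fmt_char)
    print(f"{label}: {size} bytes, max round-trip error {err:.2e}")
```

The 16-bit encoding uses half the bytes per vector, with a noticeably larger rounding error. Whether that error matters depends on your embeddings, so compare recall on a sample of your own queries before switching.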

Debugging Tips

Enable Verbose Logging

import logging

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger("zvec")
logger.setLevel(logging.DEBUG)

Check Collection Stats

stats = collection.stats()
print(f"Documents: {stats.doc_count}")
print(f"Segments: {stats.segment_count}")
print(f"Index type: {stats.index_type}")

Validate Schema

# Print schema to verify
print(schema)
print(f"Vector dimension: {schema.vectors[0].dimension}")
print(f"Fields: {[f.name for f in schema.fields]}")

Test with Minimal Example

import zvec

# Minimal test to isolate issues
schema = zvec.CollectionSchema(
    name="test",
    vectors=zvec.VectorSchema("vec", zvec.DataType.VECTOR_FP32, 4),
)

coll = zvec.create_and_open("./test_db", schema)
coll.insert([zvec.Doc(id="1", vectors={"vec": [0.1, 0.2, 0.3, 0.4]})])
results = coll.query(zvec.VectorQuery("vec", vector=[0.1, 0.2, 0.3, 0.4]), topk=1)
print(results)
coll.close()

Getting Help

If you’re still experiencing issues:

GitHub Issues

Report bugs and get help from maintainers

Discord Community

Get real-time help from the community

When Reporting Issues

Please include:

Zvec version: pip show zvec
Python version: python --version
Operating system: uname -a
Minimal reproducible example
Error messages (full stack trace)
Steps you’ve already tried

The more details you provide, the faster we can help resolve your issue!
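The version details in the list above can be gathered in one step and pasted into an issue. This is a sketch using only the standard library; `collect_env_report` is a hypothetical helper and not part of Zvec.

```python
import platform
from importlib import metadata

def collect_env_report(package="zvec"):
    """Gather the version details requested above into a dict."""
    try:
        package_version = metadata.version(package)
    except metadata.PackageNotFoundError:
        package_version = "not installed"
    return {
        "package": package,
        "package_version": package_version,
        "python_version": platform.python_version(),
        "os": f"{platform.system()} {platform.release()}",
        "machine": platform.machine(),
    }

if __name__ == "__main__":
    for key, value in collect_env_report().items():
        print(f"{key}: {value}")
```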

Additional Resources

Documentation Index