Vector-io simplifies working with vector databases by providing a universal format for importing and exporting vector datasets. It supports a range of databases, and contributions that extend that range are welcome.
Vector-io is a vector data tooling library for working with vector databases and datasets. It provides a common interface for importing, exporting, backing up, and re-embedding vector data, aimed at practitioners in machine learning and data analysis.
Key Features
- Universal Integration: Easily interact with multiple vector databases using a standardized format that simplifies data handling.
- Multi-Database Support: Fully supports popular databases such as Pinecone, Qdrant, and Milvus; the complete list appears under Supported Vector Databases.
- Easy Export and Import: Export data from a vector database into the universal vector dataset format (VDF), then import it back into a supported database.
- Re-embedding Capability: Regenerate the embeddings in a vector dataset with a different model, so a dataset is not tied to the model it was first embedded with.
Usage Examples
Here are quick examples of how to leverage Vector-io:
# Exporting data from Pinecone with a specific model
export_vdf -m hkunlp/instructor-xl --push_to_hub pinecone --environment gcp-starter
# Importing data into Milvus from a local VDF dataset directory
import_vdf -d /path/to/vdf/dataset milvus
# Re-embedding a vector dataset with a new model
reembed_vdf -d /path/to/vdf/dataset -m sentence-transformers/all-MiniLM-L6-v2 -t title
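The re-embed and import commands above can be chained in a small script. The following is a minimal sketch, not part of Vector-io: the helper names `run` and `reembed_and_import` and the `DRY_RUN` variable are hypothetical, only the flags shown in the examples above are used, and it assumes that `reembed_vdf` writes the new embeddings back into the same dataset directory that `import_vdf` then reads.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical wrapper: re-embed a local VDF dataset, then import it into Milvus.
# Arguments: dataset directory, embedding model name, text column.
# Assumption: reembed_vdf updates the dataset in place, so the same
# directory can be passed to import_vdf afterwards.
reembed_and_import() {
  local dataset_dir model text_column
  dataset_dir="$1"
  model="$2"
  text_column="$3"
  run reembed_vdf -d "$dataset_dir" -m "$model" -t "$text_column"
  run import_vdf -d "$dataset_dir" milvus
}
```

With the functions loaded in a bash session, `DRY_RUN=1 reembed_and_import /path/to/vdf/dataset sentence-transformers/all-MiniLM-L6-v2 title` prints the two commands without running them, which is a cheap way to check the arguments before touching a database.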
Supported Vector Databases
Vector-io currently supports a range of vector databases across different stages of implementation:
- Fully supported:
  - Pinecone
  - Qdrant
  - Milvus
  - GCP Vertex AI Vector Search
  - KDB.AI
  - LanceDB
  - DataStax Astra DB
  - Chroma
  - Turbopuffer
- In development: Work is ongoing to extend support to Azure AI Search, Weaviate, and others.
Getting Involved
If you're interested in contributing to Vector-io, check out the Contributing section in the README. You can propose new vector databases for support or suggest enhancements to existing functionalities.
For questions, feel free to open an issue within the repository or connect with the community via Discord.
Vector-io gives you one set of tools for exporting, importing, backing up, and re-embedding vector data across the supported databases.