tiny-graphrag
Harness the power of local knowledge with GraphRAG.
Pitch

Experience a local, efficient implementation of the GraphRAG algorithm in just 1,000 lines of Python. This hackable, accessible tool lets you build knowledge graphs without relying on commercial models, using only the essentials. Extract insights from your documents right on your own machine.

Description

Overview

Tiny GraphRAG is a lightweight implementation of the GraphRAG algorithm written entirely in Python. At just 1,000 lines, the codebase is small enough to read end to end and easy to modify. Unlike many existing solutions, Tiny GraphRAG does not rely on OpenAI or any other commercial LLM provider, so developers can run everything locally for better privacy and control.

Key Features

  • Local Execution: Run everything locally without the need for external API calls.
  • Flexible Components: Utilize a mix of well-established tools such as pgvector for vector storage and sentence-transformers for embedding models.
  • Powerful Search Options: Choose among three search modes:
    • Local Search for context-aware queries based on entity relationships.
    • Global Search for thematic insights and community analysis of documents.
    • Naive RAG for straightforward text chunk retrieval using vector similarity and keyword matching.
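To make the Naive RAG idea concrete, here is a minimal sketch of ranking text chunks by a blend of vector similarity and keyword matching. It is an illustration only and not taken from the tiny-graphrag codebase: the function names, the `alpha` weight, and the toy two-dimensional vectors are all assumptions. In the real tool, embeddings would come from sentence-transformers and be stored in pgvector.

```python
import math
import re


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def keyword_overlap(query, chunk):
    """Fraction of distinct query words that also appear in the chunk."""
    query_words = set(re.findall(r"\w+", query.lower()))
    chunk_words = set(re.findall(r"\w+", chunk.lower()))
    if not query_words:
        return 0.0
    return len(query_words & chunk_words) / len(query_words)


def hybrid_rank(query, query_vec, chunks, alpha=0.5):
    """Rank (text, vector) chunks by a weighted mix of both scores.

    alpha is a hypothetical weighting knob, not a tiny-graphrag setting.
    """
    scored = []
    for text, vec in chunks:
        vector_score = cosine_similarity(query_vec, vec)
        keyword_score = keyword_overlap(query, text)
        scored.append((alpha * vector_score + (1 - alpha) * keyword_score, text))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [text for _, text in scored]


# Toy data: hand-written 2-d vectors stand in for real embeddings.
chunks = [
    ("Obama signed the Paris climate agreement", [1.0, 0.0]),
    ("Columbia University is in New York", [0.0, 1.0]),
]
ranked = hybrid_rank("climate change efforts", [0.9, 0.1], chunks)
print(ranked[0])
```

Blending the two scores lets an exact term match lift a chunk whose embedding alone ranks it lower. How tiny-graphrag actually combines them is not described here.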

Example Usage

The commands below show how to run each search mode from the command line:

# Local search
poetry run graphrag query local --graph graphs/1_graph.pkl "What did Barack Obama study at Columbia University?"

# Global search
poetry run graphrag query global --graph graphs/1_graph.pkl --doc-id 1 "What are the main themes of this document?"

# Naive RAG
poetry run graphrag query naive "What efforts did Barack Obama make to combat climate change?"

Performance Optimization

For better performance on larger datasets or more complex queries, take advantage of hardware acceleration:

  • Apple Silicon users can install thinc-apple-ops for optimized execution.
  • NVIDIA GPU users can install the necessary libraries to speed up embedding and inference tasks.
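As a rough illustration of the hardware options above, the sketch below checks which accelerator PyTorch can see. It assumes the embedding model runs on PyTorch, as sentence-transformers models do. The `pick_device` helper is hypothetical and not part of tiny-graphrag, and whether the tool exposes a device setting is not stated here.

```python
def pick_device():
    """Return "cuda", "mps", or "cpu" depending on what PyTorch can see."""
    try:
        import torch
    except ImportError:
        # Without PyTorch installed there is nothing to accelerate.
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # The MPS backend (Apple Silicon) only exists in newer PyTorch builds.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


device = pick_device()
print(f"Embeddings would run on: {device}")
# A sentence-transformers model accepts this value through its `device`
# argument, e.g. SentenceTransformer(model_name, device=device).
```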

Customization

Tiny GraphRAG lets you define custom entity and relation types so that your graphs follow specific semantic rules. For instance, you can list the entity types to extract and constrain which entity types each relation may connect:

# Define custom entity types
geography_entity_types = ["City", "Country", "Landmark", "River"]

# Define custom relation types with constraints
geography_relation_types = {
    "capital of": {
        "allowed_head": ["City"],
        "allowed_tail": ["Country"]
    },
    "flows through": {
        "allowed_head": ["River"],
        "allowed_tail": ["City", "Country"]
    }
}

# Store document with custom types (store_document is provided by tiny-graphrag)
store_document(
    filepath="data/geography.txt",
    title="World Geography",
    entity_types=geography_entity_types,
    relation_types=geography_relation_types,
)
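To show what the `allowed_head` and `allowed_tail` constraints mean in practice, here is a small sketch that checks a candidate relation against them. The `is_relation_allowed` helper is hypothetical and written only for illustration; how tiny-graphrag enforces these constraints internally is not shown here.

```python
# Same constraint dictionary as in the example above.
geography_relation_types = {
    "capital of": {
        "allowed_head": ["City"],
        "allowed_tail": ["Country"],
    },
    "flows through": {
        "allowed_head": ["River"],
        "allowed_tail": ["City", "Country"],
    },
}


def is_relation_allowed(relation_types, relation, head_type, tail_type):
    """Check whether (head_type) -[relation]-> (tail_type) satisfies the rules.

    Hypothetical helper: unknown relation names are rejected outright.
    """
    rule = relation_types.get(relation)
    if rule is None:
        return False
    return head_type in rule["allowed_head"] and tail_type in rule["allowed_tail"]


# "Paris" (City) -[capital of]-> "France" (Country) fits the constraints.
print(is_relation_allowed(geography_relation_types, "capital of", "City", "Country"))  # True
# A River cannot be the head of "capital of".
print(is_relation_allowed(geography_relation_types, "capital of", "River", "Country"))  # False
```

Under this reading, an extracted triple whose head or tail type falls outside the allowed lists would be discarded instead of being added to the graph.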

Conclusion

Tiny GraphRAG represents a significant step towards creating accessible, modular, and powerful local graph analysis tools. Dive into the project to explore its capabilities and start building your knowledge-driven applications today!