A cloud-native, AI-first graph database written in Rust that unifies property graphs, vector search, and graph neural networks in a single system.
AstraeaDB synthesizes the strengths of leading graph databases into a single, cohesive system built from the ground up in Rust.
Nodes carry labels, JSON properties, and float32 embedding vectors. The navigation links of the HNSW vector index map to graph edges, enabling semantic traversal: find the neighbors most similar to a concept, not just those that are structurally connected.
Index-free adjacency via pointer swizzling. Hot pages are promoted to direct memory pointers for nanosecond-level traversal—O(k) neighbor lookups instead of O(log N) index scans.
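To make the swizzling idea concrete, here is a minimal sketch. It is not AstraeaDB's actual code: the names `NodeRef`, `HotTier`, `load`, and `swizzle` are hypothetical, and arena slot indices stand in for the raw memory pointers a real engine would presumably use. It shows a neighbor reference starting out as a 64-bit disk ID and being rewritten in place once its target is resident, so later traversals skip the ID lookup.

```rust
use std::collections::HashMap;

/// A neighbor reference: an unresolved 64-bit disk ID, or a "swizzled"
/// slot index into the in-memory arena (a safe stand-in for a raw pointer).
#[derive(Debug, Clone, Copy, PartialEq)]
enum NodeRef {
    OnDisk(u64),
    InMemory(usize),
}

struct Node {
    id: u64,
    neighbors: Vec<NodeRef>,
}

/// Hypothetical hot tier: an arena of resident nodes plus an ID-to-slot map.
struct HotTier {
    arena: Vec<Node>,
    resident: HashMap<u64, usize>,
}

impl HotTier {
    fn new() -> Self {
        HotTier { arena: Vec::new(), resident: HashMap::new() }
    }

    /// Brings a node into memory; its neighbor references start unswizzled.
    fn load(&mut self, id: u64, neighbor_ids: &[u64]) -> usize {
        let slot = self.arena.len();
        let neighbors = neighbor_ids.iter().map(|&n| NodeRef::OnDisk(n)).collect();
        self.arena.push(Node { id, neighbors });
        self.resident.insert(id, slot);
        slot
    }

    /// Rewrites every on-disk reference whose target is already resident.
    fn swizzle(&mut self, slot: usize) {
        let resident = &self.resident;
        for r in self.arena[slot].neighbors.iter_mut() {
            if let NodeRef::OnDisk(id) = *r {
                if let Some(&target) = resident.get(&id) {
                    *r = NodeRef::InMemory(target);
                }
            }
        }
    }

    /// Follows only swizzled references: one array access per neighbor, no index lookup.
    fn resident_neighbor_ids(&self, slot: usize) -> Vec<u64> {
        self.arena[slot]
            .neighbors
            .iter()
            .filter_map(|r| match *r {
                NodeRef::InMemory(s) => Some(self.arena[s].id),
                NodeRef::OnDisk(_) => None,
            })
            .collect()
    }
}
```

References to nodes that are not resident stay as disk IDs, so a traversal can tell when it needs to fault a page in from a colder tier.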
Blend graph proximity with vector similarity using a configurable alpha. Semantic walks greedily traverse the graph toward a concept embedding, combining structural and semantic intelligence.
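One plausible way to read the alpha blend is a weighted sum of the two signals. The sketch below rests on assumptions rather than the database's actual scoring function: `hybrid_score` is a hypothetical name, graph proximity is modeled as `1 / (1 + hops)`, vector similarity is cosine similarity, and alpha weights the semantic side.

```rust
/// Cosine similarity between two vectors; returns 0.0 if either has zero length.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Blends semantic similarity with graph proximity (assumed formula).
/// alpha = 1.0 ranks purely by embedding similarity;
/// alpha = 0.0 ranks purely by hop distance from the start node.
fn hybrid_score(alpha: f32, hops: u32, embedding: &[f32], concept: &[f32]) -> f32 {
    let proximity = 1.0 / (1.0 + hops as f32);
    let similarity = cosine_similarity(embedding, concept);
    alpha * similarity + (1.0 - alpha) * proximity
}
```

Under this reading, a greedy semantic walk would repeatedly move to whichever neighbor has the highest score against the concept embedding.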
Built-in Retrieval-Augmented Generation: vector search finds the anchor, BFS extracts a subgraph, linearization converts it to text, and the result is fed to an LLM—all in one atomic operation.
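The pipeline described above can be sketched in a few functions. Everything here is illustrative: the `Graph` and `Node` types, the function names, and the `src -[type]-> dst` text format are assumptions, and a brute-force dot-product scan stands in for the HNSW index. The final LLM call is left out.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

struct Node {
    label: String,
    embedding: Vec<f32>,
}

/// Hypothetical in-memory graph: edges map a source slot to (edge type, destination).
struct Graph {
    nodes: Vec<Node>,
    edges: HashMap<usize, Vec<(String, usize)>>,
}

fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Step 1: vector search finds the anchor (brute force here, HNSW in the real system).
fn find_anchor(g: &Graph, query: &[f32]) -> Option<usize> {
    (0..g.nodes.len()).max_by(|&a, &b| {
        dot(&g.nodes[a].embedding, query).total_cmp(&dot(&g.nodes[b].embedding, query))
    })
}

/// Step 2: BFS collects every edge within `max_hops` of the anchor.
fn bfs_subgraph(g: &Graph, anchor: usize, max_hops: usize) -> Vec<(usize, String, usize)> {
    let mut seen = HashSet::from([anchor]);
    let mut queue = VecDeque::from([(anchor, 0usize)]);
    let mut triples = Vec::new();
    while let Some((node, depth)) = queue.pop_front() {
        if depth == max_hops {
            continue; // do not expand past the hop limit
        }
        for (rel, dst) in g.edges.get(&node).into_iter().flatten() {
            triples.push((node, rel.clone(), *dst));
            if seen.insert(*dst) {
                queue.push_back((*dst, depth + 1));
            }
        }
    }
    triples
}

/// Step 3: linearization turns the subgraph into one line of text per edge.
fn linearize(g: &Graph, triples: &[(usize, String, usize)]) -> String {
    triples
        .iter()
        .map(|(s, rel, d)| format!("{} -[{}]-> {}", g.nodes[*s].label, rel, g.nodes[*d].label))
        .collect::<Vec<_>>()
        .join("\n")
}

/// Chains the three steps; the returned text is what would be placed in the LLM prompt.
fn build_context(g: &Graph, query: &[f32], max_hops: usize) -> Option<String> {
    let anchor = find_anchor(g, query)?;
    Some(linearize(g, &bfs_subgraph(g, anchor, max_hops)))
}
```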
Differentiable tensors and message passing layers built in. Run a training loop for node classification directly inside the database—forward pass, loss computation, and backpropagation through edge weights.
Edges carry validity intervals. Query the graph as it existed at any point in time: neighbors_at(), bfs_at(), and shortest_path_at() with full temporal filtering.
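As an illustration of temporal filtering, the sketch below models a validity interval as a half-open range `[valid_from, valid_to)` with an open end for edges that are still current. The `TemporalEdge` struct, the integer timestamps, and this signature for `neighbors_at` are assumptions; only the function name comes from the feature list above.

```rust
/// Hypothetical edge record with a half-open validity interval [valid_from, valid_to).
struct TemporalEdge {
    dst: u64,
    valid_from: i64,
    /// `None` means the edge is still valid.
    valid_to: Option<i64>,
}

impl TemporalEdge {
    fn is_valid_at(&self, t: i64) -> bool {
        self.valid_from <= t && self.valid_to.map_or(true, |end| t < end)
    }
}

/// Returns the destinations of all edges that were valid at time `t`.
fn neighbors_at(edges: &[TemporalEdge], t: i64) -> Vec<u64> {
    edges
        .iter()
        .filter(|e| e.is_valid_at(t))
        .map(|e| e.dst)
        .collect()
}
```

A time-aware BFS or shortest-path search could apply the same `is_valid_at` check to every edge it expands.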
Hand-written recursive-descent parser for ISO GQL / Cypher. Full execution pipeline: MATCH with pattern matching, WHERE filtering, CREATE, DELETE, ORDER BY, LIMIT, aggregation functions, and more.
RBAC authentication, mutual TLS, and a homomorphic encryption engine that allows server-side label matching on encrypted data—the server never sees unencrypted node labels.
Apache Arrow Flight server for zero-copy data transfer. Stream GQL results directly into Pandas or Polars DataFrames without serialization overhead. JSON-TCP and gRPC transports also available.
A three-tier "hydrated" architecture that solves the cloud-native memory wall problem, written entirely in Rust for memory safety and no garbage-collection pauses.
Pointer swizzling promotes active subgraphs into RAM. 64-bit disk IDs are converted to direct memory pointers for nanosecond-level traversal. The HNSW index lives here.
LRU buffer pool caches 8 KiB pages with pin/unpin semantics. Pluggable I/O backends: memmap2 (cross-platform) and io_uring (Linux async I/O).
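The pin/unpin contract can be sketched as follows. This is a simplified model, not the engine's buffer pool: `BufferPool`, `Frame`, and the method names are hypothetical, pages are zero-filled instead of being read through an I/O backend, and the LRU order is kept in a plain `VecDeque`. The key property is that a pinned page is never chosen for eviction.

```rust
use std::collections::{HashMap, VecDeque};

const PAGE_SIZE: usize = 8 * 1024; // 8 KiB pages

struct Frame {
    data: Vec<u8>,
    pin_count: u32,
}

/// Hypothetical buffer pool; `lru` is ordered from least to most recently used.
struct BufferPool {
    capacity: usize,
    frames: HashMap<u64, Frame>,
    lru: VecDeque<u64>,
}

impl BufferPool {
    fn new(capacity: usize) -> Self {
        BufferPool { capacity, frames: HashMap::new(), lru: VecDeque::new() }
    }

    /// Pins a page, loading it if needed. Returns false when the pool is
    /// full and every resident page is pinned.
    fn pin(&mut self, page_id: u64) -> bool {
        if !self.frames.contains_key(&page_id) {
            if self.frames.len() >= self.capacity && !self.evict_one() {
                return false;
            }
            // A real pool would read the page from the storage backend here.
            let frame = Frame { data: vec![0u8; PAGE_SIZE], pin_count: 0 };
            self.frames.insert(page_id, frame);
        }
        self.frames.get_mut(&page_id).unwrap().pin_count += 1;
        // Move the page to the most-recently-used end.
        self.lru.retain(|&p| p != page_id);
        self.lru.push_back(page_id);
        true
    }

    /// Releases one pin; the page becomes evictable when the count reaches zero.
    fn unpin(&mut self, page_id: u64) {
        if let Some(frame) = self.frames.get_mut(&page_id) {
            frame.pin_count = frame.pin_count.saturating_sub(1);
        }
    }

    /// Evicts the least recently used unpinned page, if there is one.
    fn evict_one(&mut self) -> bool {
        let victim = self.lru.iter().copied().find(|p| self.frames[p].pin_count == 0);
        match victim {
            Some(p) => {
                self.frames.remove(&p);
                self.lru.retain(|&q| q != p);
                true
            }
            None => false,
        }
    }
}
```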
Data persists in JSON, Apache Parquet, or cloud object stores (S3, GCS, Azure). Open formats for interoperability and long-term archival.
Get up and running in minutes. Build from source, start the server, and connect with the interactive shell or your language of choice.
Synthesizing the best features from across the graph database ecosystem into one unified system.
| Capability | Current Leader | AstraeaDB |
|---|---|---|
| Native Graph Storage | Neo4j | Index-free adjacency with pointer swizzling |
| Massively Parallel Processing | TigerGraph | Hash/range partitioning, shard coordination |
| Multi-Model Flexibility | ArangoDB | Vector-Property Graph (JSON + embeddings) |
| In-Memory Speed | Memgraph | Pointer swizzling + HNSW in hot tier |
| Vector / AI Integration | Weaviate / Neo4j | Built-in HNSW, GNN training, GraphRAG |
| Query Standard | ISO GQL (2024) | GQL / Cypher parser + full executor |
| Privacy / Encryption | — | Homomorphic encryption for encrypted label matching |
Production-ready implementations of essential graph analytics, with optional GPU acceleration.
Power iteration with dangling node handling for node importance ranking.
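For reference, a textbook version of power iteration with dangling-node handling looks roughly like this. It is a generic sketch, not the database's implementation: the adjacency-list input, the `pagerank` signature, and the fixed iteration count (rather than a convergence threshold) are all assumptions. Rank held by nodes with no outgoing edges is redistributed evenly so the scores keep summing to 1.

```rust
/// PageRank by power iteration. `out_edges[v]` lists the nodes that `v` links to.
fn pagerank(out_edges: &[Vec<usize>], damping: f64, iterations: usize) -> Vec<f64> {
    let n = out_edges.len();
    if n == 0 {
        return Vec::new();
    }
    let mut rank = vec![1.0 / n as f64; n];
    for _ in 0..iterations {
        // Total rank currently sitting on dangling nodes (no outgoing edges).
        let dangling: f64 = out_edges
            .iter()
            .zip(&rank)
            .filter(|(outs, _)| outs.is_empty())
            .map(|(_, r)| *r)
            .sum();
        // Every node receives the teleport share plus an equal slice of dangling rank.
        let base = (1.0 - damping) / n as f64 + damping * dangling / n as f64;
        let mut next = vec![base; n];
        for (src, outs) in out_edges.iter().enumerate() {
            if outs.is_empty() {
                continue;
            }
            let share = damping * rank[src] / outs.len() as f64;
            for &dst in outs {
                next[dst] += share;
            }
        }
        rank = next;
    }
    rank
}
```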
Louvain algorithm for discovering densely connected clusters.
Degree and betweenness centrality (Brandes' algorithm) for identifying key nodes.
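Brandes' algorithm for unweighted graphs runs one BFS per source and then accumulates dependencies in reverse BFS order. The sketch below is a generic version under assumed conventions: an adjacency-list input, a hypothetical `betweenness` function name, and no normalization. Because it counts ordered source-target pairs, an undirected graph stored as symmetric adjacency lists yields values twice the conventional undirected score.

```rust
use std::collections::VecDeque;

/// Unnormalized betweenness centrality (Brandes) for an unweighted graph.
fn betweenness(adj: &[Vec<usize>]) -> Vec<f64> {
    let n = adj.len();
    let mut centrality = vec![0.0f64; n];
    for s in 0..n {
        let mut order = Vec::with_capacity(n); // nodes in non-decreasing distance from s
        let mut preds: Vec<Vec<usize>> = vec![Vec::new(); n];
        let mut sigma = vec![0.0f64; n]; // number of shortest paths from s
        let mut dist = vec![usize::MAX; n];
        sigma[s] = 1.0;
        dist[s] = 0;
        let mut queue = VecDeque::from([s]);
        while let Some(v) = queue.pop_front() {
            order.push(v);
            for &w in &adj[v] {
                if dist[w] == usize::MAX {
                    dist[w] = dist[v] + 1;
                    queue.push_back(w);
                }
                // v lies on a shortest path to w.
                if dist[w] == dist[v] + 1 {
                    sigma[w] += sigma[v];
                    preds[w].push(v);
                }
            }
        }
        // Accumulate dependencies from the farthest nodes back toward s.
        let mut delta = vec![0.0f64; n];
        for &w in order.iter().rev() {
            for &v in &preds[w] {
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w]);
            }
            if w != s {
                centrality[w] += delta[w];
            }
        }
    }
    centrality
}
```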
Connected components, and strongly connected components via Tarjan's algorithm.
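A compact recursive form of Tarjan's algorithm for strongly connected components is sketched below. It is a generic version with hypothetical names (`Tarjan`, `strongly_connected_components`) and an assumed adjacency-list input; because it recurses once per node on a path, a production version for large graphs would more likely use an explicit stack.

```rust
struct Tarjan<'a> {
    adj: &'a [Vec<usize>],
    index: Vec<Option<usize>>, // discovery order, None if unvisited
    lowlink: Vec<usize>,
    on_stack: Vec<bool>,
    stack: Vec<usize>,
    next_index: usize,
    components: Vec<Vec<usize>>,
}

impl<'a> Tarjan<'a> {
    fn visit(&mut self, v: usize) {
        self.index[v] = Some(self.next_index);
        self.lowlink[v] = self.next_index;
        self.next_index += 1;
        self.stack.push(v);
        self.on_stack[v] = true;

        let adj = self.adj; // copy the shared reference so `self` can be mutated in the loop
        for &w in &adj[v] {
            let w_index = self.index[w];
            match w_index {
                None => {
                    // Tree edge: recurse, then inherit the child's lowlink.
                    self.visit(w);
                    self.lowlink[v] = self.lowlink[v].min(self.lowlink[w]);
                }
                Some(i) if self.on_stack[w] => {
                    // Back edge into the component currently being built.
                    self.lowlink[v] = self.lowlink[v].min(i);
                }
                _ => {} // w belongs to an already completed component
            }
        }

        // v is the root of a component: pop the stack down to v.
        if Some(self.lowlink[v]) == self.index[v] {
            let mut component = Vec::new();
            while let Some(w) = self.stack.pop() {
                self.on_stack[w] = false;
                component.push(w);
                if w == v {
                    break;
                }
            }
            self.components.push(component);
        }
    }
}

/// Returns the strongly connected components of a directed graph.
fn strongly_connected_components(adj: &[Vec<usize>]) -> Vec<Vec<usize>> {
    let n = adj.len();
    let mut t = Tarjan {
        adj,
        index: vec![None; n],
        lowlink: vec![0; n],
        on_stack: vec![false; n],
        stack: Vec::new(),
        next_index: 0,
        components: Vec::new(),
    };
    for v in 0..n {
        if t.index[v].is_none() {
            t.visit(v);
        }
    }
    t.components
}
```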
Connect from your language of choice over three transport protocols.
JsonClient (zero deps), ArrowClient (pyarrow), and AstraeaClient (unified). DataFrame integration with Pandas and Polars. 23 tests.
AstraeaClient, ArrowClient, and UnifiedClient. Full feature parity with the Python client, including data.frame import/export.
JSONClient (zero deps), GRPCClient (protobuf), and unified Client that auto-selects gRPC when available. Functional options, context.Context support, and batch operations. 30 tests.
JsonClient (all 22 ops), GrpcClient (protobuf, 14 RPCs), FlightAstraeaClient (Arrow), and UnifiedClient (auto-transport). Java 17+ records, builder pattern, try-with-resources. 113 tests.
Use AstraeaDB as a library with no network overhead. Direct access to Graph, StorageEngine, and VectorIndex traits.
Explore the full documentation, browse the source code, or jump straight into building.