Chapter 11: GraphRAG
Combine the structural richness of graph traversals with the semantic power of large language models. AstraeaDB's GraphRAG pipeline retrieves context-aware subgraphs and feeds them to an LLM in a single server-side operation.
11.1 What Is GraphRAG?
RAG: The Foundation
Retrieval-Augmented Generation (RAG) is a technique that improves LLM answers by feeding relevant context into the prompt before asking the model to respond. Instead of relying solely on the model's training data, RAG retrieves up-to-date, domain-specific information at query time.
Traditional RAG vs. GraphRAG
Traditional RAG performs vector search over a flat collection of document chunks. It finds the most semantically similar passages to the user's question and includes them in the LLM prompt. This works well for simple lookups, but it has a fundamental limitation: it treats each document chunk as an isolated unit, with no awareness of how facts relate to one another.
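The retrieval step of traditional RAG can be sketched in a few lines of plain Python: rank isolated chunks by cosine similarity to the query embedding and return the top k. The chunks and 2-D "embeddings" below are toy illustrations, not an AstraeaDB API.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Toy corpus: each chunk is an isolated unit with its own embedding
chunks = [
    ("Alice is a Director.",          [0.9, 0.1]),
    ("The Matrix project is active.", [0.1, 0.9]),
    ("Bob is an Engineer.",           [0.6, 0.4]),
]

def flat_rag_retrieve(query_vec, k=2):
    # Rank chunks by similarity; note there is no notion of how facts connect
    ranked = sorted(chunks, key=lambda c: cosine(c[1], query_vec), reverse=True)
    return [text for text, _ in ranked[:k]]

print(flat_rag_retrieve([0.2, 0.8]))
# ['The Matrix project is active.', 'Bob is an Engineer.']
```

Each retrieved passage arrives with no edges to the others, which is exactly the limitation GraphRAG addresses.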
GraphRAG replaces the flat document store with a graph. Instead of retrieving isolated passages, it retrieves connected subgraphs -- networks of facts with explicit relationships between them. This enables the LLM to reason about:
- Relationships: "Alice manages Bob, who works on the Matrix project" -- not just "Alice" and "Matrix" appearing in the same document.
- Causality: "Event A triggered Event B, which led to Outcome C" -- with explicit causal edges.
- Multi-hop reasoning: "What is the connection between Alice and the Matrix project?" requires traversing through intermediate nodes (Alice -> manages -> Bob -> works_on -> Matrix).
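The multi-hop case above can be reproduced with a plain-Python breadth-first search over an adjacency list that records the relationship taken at each step. This is a toy graph for illustration, not the AstraeaDB client API.

```python
from collections import deque

# Toy adjacency list: node -> list of (relationship, target) pairs
graph = {
    "Alice":  [("MANAGES", "Bob")],
    "Bob":    [("WORKS_ON", "Matrix")],
    "Matrix": [],
}

def connection(start, goal):
    # BFS that records the relationship path between two nodes
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for rel, nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [f"--{rel}-->", nxt]))
    return None  # no path exists

print(" ".join(connection("Alice", "Matrix")))
# Alice --MANAGES--> Bob --WORKS_ON--> Matrix
```

A flat document store has no way to produce this chain unless some single chunk happens to mention all three entities together.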
The GraphRAG Pipeline
AstraeaDB implements the full GraphRAG pipeline as a single server-side operation. Here is the flow:
Question
    |
    v
Vector Search -----> Find the most relevant "anchor node"
    |
    v
Graph Traversal ---> BFS from the anchor, collecting N hops of context
    |
    v
Subgraph -----------> A local neighborhood of connected facts
    |
    v
Linearization ------> Convert the subgraph into text (structured, prose, triples, or JSON)
    |
    v
LLM Prompt ---------> Context + Question sent to the language model
    |
    v
Answer -------------> Grounded, context-aware response
The key advantage is that steps 1 through 4 happen inside the database, minimizing round-trip overhead and ensuring the LLM receives a coherent, connected context rather than a bag of unrelated snippets.
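The flow above can be sketched as a composition of stages. The helper names below (embed, vector_search, and so on) are hypothetical placeholders for what the server does internally, shown only to make the data flow concrete.

```python
def graph_rag_pipeline(question, embed, vector_search, extract_subgraph,
                       linearize, call_llm, hops=2, max_nodes=50):
    """Sketch of the GraphRAG flow; each argument is a pluggable stage."""
    # 1. Vector search: find the anchor node most relevant to the question
    anchor = vector_search(embed(question))
    # 2-3. Graph traversal: collect the anchor's N-hop neighborhood
    subgraph = extract_subgraph(anchor, hops=hops, max_nodes=max_nodes)
    # 4. Linearization: turn the subgraph into LLM-ready text
    context = linearize(subgraph)
    # 5-6. Prompt the LLM with context + question
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    return call_llm(prompt)
```

In AstraeaDB, stages 1-4 run inside the database; the sketch only illustrates why the LLM ends up seeing connected context rather than disconnected snippets.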
11.2 Subgraph Extraction
The first step in GraphRAG is extracting the relevant subgraph around a center node. AstraeaDB uses breadth-first search (BFS) to collect the local neighborhood.
Parameters
| Parameter | Type | Description |
|---|---|---|
| center | Node ID | The anchor node from which to start the traversal. |
| hops | Integer | Number of BFS hops from the center. More hops = broader context, but more tokens. |
| max_nodes | Integer | Maximum number of nodes to include. Caps the output to fit within LLM context window limits. |
| format | String | Linearization format: "structured", "prose", "triples", or "json". |
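The hops and max_nodes parameters interact: in a graph with average out-degree b, the N-hop neighborhood can contain up to 1 + b + b^2 + ... + b^N nodes, so max_nodes usually becomes the binding limit after two or three hops. A quick back-of-the-envelope with illustrative numbers:

```python
def neighborhood_upper_bound(branching, hops):
    # Upper bound on nodes reached: 1 + b + b^2 + ... + b^hops
    return sum(branching ** h for h in range(hops + 1))

for hops in range(1, 5):
    n = neighborhood_upper_bound(branching=5, hops=hops)
    print(f"hops={hops}: up to {n} nodes")
# hops=1: up to 6 nodes
# hops=2: up to 31 nodes
# hops=3: up to 156 nodes
# hops=4: up to 781 nodes
```

Real graphs are denser or sparser in places, but the exponential shape is why a modest max_nodes cap is worth setting even for small hop counts.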
Example: Extracting a Subgraph
from astraeadb import AstraeaClient

with AstraeaClient("127.0.0.1", 7687) as client:
    # Extract the 2-hop neighborhood around a node
    subgraph = client.extract_subgraph(
        center=alice_id,
        hops=2,
        max_nodes=50,
        format="structured"
    )

    print(f"Center: {subgraph['center']}")
    print(f"Nodes: {subgraph['node_count']}")
    print(f"Edges: {subgraph['edge_count']}")
    print(f"Text:\n{subgraph['text']}")
source("r_client.R")

client <- AstraeaClient$new("127.0.0.1", 7687)
client$connect()

# Extract the 2-hop neighborhood
subgraph <- client$extract_subgraph(
  center = alice_id,
  hops = 2,
  max_nodes = 50,
  format = "structured"
)

cat("Center:", subgraph$center, "\n")
cat("Nodes:", subgraph$node_count, "\n")
cat("Edges:", subgraph$edge_count, "\n")
cat("Text:\n", subgraph$text, "\n")

client$close()
package main

import (
	"context"
	"fmt"

	"github.com/AstraeaDB/AstraeaDB-Official"
)

func main() {
	client := astraeadb.NewClient(astraeadb.WithAddress("127.0.0.1", 7687))
	ctx := context.Background()
	client.Connect(ctx)
	defer client.Close()

	subgraph, _ := client.ExtractSubgraph(ctx, &astraeadb.SubgraphConfig{
		Center:   aliceID,
		Hops:     2,
		MaxNodes: 50,
		Format:   "structured",
	})

	fmt.Printf("Center: %s\n", subgraph.Center)
	fmt.Printf("Nodes: %d\n", subgraph.NodeCount)
	fmt.Printf("Edges: %d\n", subgraph.EdgeCount)
	fmt.Printf("Text:\n%s\n", subgraph.Text)
}
import com.astraeadb.unified.UnifiedClient;

try (var client = UnifiedClient.builder()
        .host("127.0.0.1").port(7687).build()) {
    client.connect();

    var subgraph = client.extractSubgraph(
        aliceId,       // center node
        2,             // hops
        50,            // max nodes
        "structured"   // format
    );

    System.out.println("Center: " + subgraph.getCenter());
    System.out.println("Nodes: " + subgraph.getNodeCount());
    System.out.println("Edges: " + subgraph.getEdgeCount());
    System.out.println("Text:\n" + subgraph.getText());
}
11.3 Linearization Formats
Once a subgraph is extracted, it must be converted into text that an LLM can process. AstraeaDB supports four linearization formats, each with different trade-offs between readability and token efficiency.
Consider a small graph: Alice (Person) --MANAGES--> Bob (Person) --WORKS_ON--> Matrix (Project), where Matrix has a property status: "active".
Structured Format
An indented tree with arrows. The most readable format and recommended for most LLM interactions.
Subgraph centered on Alice (Person):

Alice [Person] {name: "Alice", role: "Director"}
  --MANAGES--> Bob [Person] {name: "Bob", role: "Engineer"}
    --WORKS_ON--> Matrix [Project] {name: "Matrix", status: "active"}
Prose Format
Natural language paragraphs. Good for LLMs that perform better with conversational context.
The following describes the neighborhood of Alice.
Alice is a Person with role "Director". Alice manages Bob, who is a Person
with role "Engineer". Bob works on the Matrix project, which is currently
active.
Triples Format
Subject-predicate-object triples. Compact and token-efficient, ideal for contexts with strict token limits.
(Alice, type, Person)
(Alice, role, "Director")
(Alice, MANAGES, Bob)
(Bob, type, Person)
(Bob, role, "Engineer")
(Bob, WORKS_ON, Matrix)
(Matrix, type, Project)
(Matrix, status, "active")
JSON Format
Machine-readable structured format. Useful when the LLM output will be parsed programmatically or when you need exact property values.
{
"center": "Alice",
"nodes": [
{"id": "nd-1", "labels": ["Person"], "props": {"name": "Alice", "role": "Director"}},
{"id": "nd-2", "labels": ["Person"], "props": {"name": "Bob", "role": "Engineer"}},
{"id": "nd-3", "labels": ["Project"], "props": {"name": "Matrix", "status": "active"}}
],
"edges": [
{"from": "nd-1", "to": "nd-2", "type": "MANAGES"},
{"from": "nd-2", "to": "nd-3", "type": "WORKS_ON"}
]
}
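Because the JSON format carries the full node and edge lists, the other formats can be derived from it client-side when needed. The sketch below converts the JSON payload shown above into the triples format; it is plain Python over that payload, not a built-in client helper.

```python
# The JSON subgraph from the example above, as a Python dict
subgraph = {
    "center": "Alice",
    "nodes": [
        {"id": "nd-1", "labels": ["Person"], "props": {"name": "Alice", "role": "Director"}},
        {"id": "nd-2", "labels": ["Person"], "props": {"name": "Bob", "role": "Engineer"}},
        {"id": "nd-3", "labels": ["Project"], "props": {"name": "Matrix", "status": "active"}},
    ],
    "edges": [
        {"from": "nd-1", "to": "nd-2", "type": "MANAGES"},
        {"from": "nd-2", "to": "nd-3", "type": "WORKS_ON"},
    ],
}

def to_triples(sg):
    # Map internal IDs to display names so triples read naturally
    name = {n["id"]: n["props"].get("name", n["id"]) for n in sg["nodes"]}
    lines = []
    for n in sg["nodes"]:
        lines.append(f'({name[n["id"]]}, type, {n["labels"][0]})')
        for k, v in n["props"].items():
            if k != "name":
                lines.append(f'({name[n["id"]]}, {k}, "{v}")')
    for e in sg["edges"]:
        lines.append(f'({name[e["from"]]}, {e["type"]}, {name[e["to"]]})')
    return "\n".join(lines)

print(to_triples(subgraph))
```

The output matches the triples example above (modulo line ordering), which is why JSON is the right choice when you plan to post-process the subgraph yourself.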
Choosing a Format
| Format | Token Efficiency | Readability | Best For |
|---|---|---|---|
| Structured | Medium | High | General-purpose LLM queries. Recommended default. |
| Prose | Low | Very High | Conversational LLMs, user-facing explanations. |
| Triples | High | Medium | Token-limited contexts, large subgraphs. |
| JSON | Low | Medium | Programmatic parsing, structured output tasks. |
Use "structured" for most LLM interactions. It strikes the best balance between readability and token count. Switch to "triples" when you need to fit a large subgraph into a limited context window.
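A rough way to compare formats on your own data is the common heuristic of roughly four characters per token. This is only an approximation (real tokenizers vary by model), but it is good enough to decide whether a subgraph fits a context budget:

```python
def estimate_tokens(text):
    # Crude heuristic: ~4 characters per token for English-ish text
    return max(1, len(text) // 4)

# Compare two illustrative linearizations of the same facts
linearizations = {
    "triples": "(Alice, MANAGES, Bob)\n(Bob, WORKS_ON, Matrix)",
    "prose":   "Alice manages Bob. Bob works on the Matrix project.",
}
for fmt, text in linearizations.items():
    print(f"{fmt}: ~{estimate_tokens(text)} tokens")
```

For precise budgeting you would use the target model's actual tokenizer; the heuristic is just for quick format comparisons.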
11.4 LLM Integration
Built-In Providers
AstraeaDB includes built-in support for several LLM providers, configurable via the server configuration file or environment variables:
| Provider | Models | Configuration |
|---|---|---|
| OpenAI | GPT-4o, GPT-4, GPT-3.5-turbo | OPENAI_API_KEY environment variable |
| Anthropic | Claude 4 Opus, Claude 4 Sonnet | ANTHROPIC_API_KEY environment variable |
| Ollama | Llama 3, Mistral, any local model | OLLAMA_HOST (default: http://localhost:11434) |
| Mock | Echo provider for testing | No configuration required |
Server Configuration
Add the LLM provider to your astraeadb.toml:
# astraeadb.toml
[llm]
provider = "anthropic"   # "openai", "anthropic", "ollama", "mock"
model = "claude-sonnet-4-20250514"
max_tokens = 2048

# For Ollama (local models)
# provider = "ollama"
# model = "llama3"
# ollama_host = "http://localhost:11434"
The graph_rag Endpoint
The graph_rag method handles the entire pipeline -- vector search, subgraph extraction, linearization, and LLM invocation -- in a single call:
from astraeadb import AstraeaClient

with AstraeaClient("127.0.0.1", 7687) as client:
    result = client.graph_rag(
        question="What is Alice's relationship to the Matrix project?",
        anchor=alice_id,
        hops=2,
        max_nodes=50,
        format="structured"
    )

    print(result["answer"])
    print(f"Context included {result['nodes_in_context']} nodes")
    print(f"Estimated {result['estimated_tokens']} tokens")
source("r_client.R")

client <- AstraeaClient$new("127.0.0.1", 7687)
client$connect()

result <- client$graph_rag(
  question = "What is Alice's relationship to the Matrix project?",
  anchor = alice_id,
  hops = 2,
  max_nodes = 50,
  format = "structured"
)

cat(result$answer, "\n")
cat("Context included", result$nodes_in_context, "nodes\n")
cat("Estimated", result$estimated_tokens, "tokens\n")

client$close()
package main

import (
	"context"
	"fmt"

	"github.com/AstraeaDB/AstraeaDB-Official"
)

func main() {
	client := astraeadb.NewClient(astraeadb.WithAddress("127.0.0.1", 7687))
	ctx := context.Background()
	client.Connect(ctx)
	defer client.Close()

	result, _ := client.GraphRAG(ctx, &astraeadb.GraphRAGConfig{
		Question: "What is Alice's relationship to the Matrix project?",
		Anchor:   aliceID,
		Hops:     2,
		MaxNodes: 50,
		Format:   "structured",
	})

	fmt.Println(result.Answer)
	fmt.Printf("Context included %d nodes\n", result.NodesInContext)
	fmt.Printf("Estimated %d tokens\n", result.EstimatedTokens)
}
import com.astraeadb.unified.UnifiedClient;

try (var client = UnifiedClient.builder()
        .host("127.0.0.1").port(7687).build()) {
    client.connect();

    var result = client.graphRag(
        "What is Alice's relationship to the Matrix project?",
        aliceId,       // anchor node
        2,             // hops
        50,            // max nodes
        "structured"   // format
    );

    System.out.println(result.getAnswer());
    System.out.printf("Context included %d nodes%n", result.getNodesInContext());
    System.out.printf("Estimated %d tokens%n", result.getEstimatedTokens());
}
During development and testing, use the "mock" provider. It echoes back the context it receives, allowing you to verify that the correct subgraph is being extracted and linearized without incurring LLM API costs.
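Because the mock provider echoes the linearized context, a cheap smoke test is to assert that the entities you expect appear verbatim in its answer. The helper below is a sketch of that pattern; it assumes the server is configured with provider = "mock" and operates on the result dict returned by graph_rag:

```python
def assert_entities_in_context(result, expected=("Alice", "Bob", "Matrix")):
    # With provider = "mock", result["answer"] is the echoed context,
    # so every entity in the extracted subgraph should appear verbatim.
    missing = [name for name in expected if name not in result["answer"]]
    if missing:
        raise AssertionError(f"entities missing from context: {missing}")
    return True

# Usage against a running server (provider = "mock"):
# result = client.graph_rag(question="What does Alice work on?", anchor=alice_id,
#                           hops=2, max_nodes=50, format="structured")
# assert_entities_in_context(result)
```

If an expected entity is missing, the usual fixes are increasing hops or raising max_nodes before touching the prompt or the model.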
11.5 End-to-End Example
Let us walk through a complete GraphRAG pipeline. We will build a small knowledge graph about tech companies, store embeddings for semantic search, and then ask a question that requires multi-hop reasoning.
Step 1: Build the Knowledge Graph
from astraeadb import AstraeaClient

with AstraeaClient("127.0.0.1", 7687) as client:
    # --- Step 1: Create nodes with embeddings ---
    acme = client.create_node(
        labels=["Company"],
        properties={"name": "Acme Corp", "industry": "AI"},
        embedding=[0.1, 0.8, 0.3, 0.5]  # simplified 4D embedding
    )
    phoenix = client.create_node(
        labels=["Product"],
        properties={"name": "Phoenix Engine", "type": "ML Platform"},
        embedding=[0.2, 0.9, 0.4, 0.6]
    )
    alice = client.create_node(
        labels=["Person"],
        properties={"name": "Alice Chen", "role": "CTO"},
        embedding=[0.3, 0.7, 0.2, 0.4]
    )
    bob = client.create_node(
        labels=["Person"],
        properties={"name": "Bob Smith", "role": "Lead Engineer"},
        embedding=[0.25, 0.75, 0.35, 0.45]
    )
    nova = client.create_node(
        labels=["Company"],
        properties={"name": "Nova Labs", "industry": "Cloud"},
        embedding=[0.15, 0.6, 0.5, 0.7]
    )

    # --- Step 2: Create relationships ---
    client.create_edge(alice["node_id"], acme["node_id"], "WORKS_AT", {"since": 2019})
    client.create_edge(bob["node_id"], acme["node_id"], "WORKS_AT", {"since": 2021})
    client.create_edge(alice["node_id"], bob["node_id"], "MANAGES")
    client.create_edge(acme["node_id"], phoenix["node_id"], "DEVELOPS")
    client.create_edge(bob["node_id"], phoenix["node_id"], "LEADS")
    client.create_edge(acme["node_id"], nova["node_id"], "PARTNERS_WITH")

    # --- Step 3: Ask a multi-hop question via GraphRAG ---
    result = client.graph_rag(
        question="Who leads the development of Acme Corp's ML platform?",
        anchor=acme["node_id"],
        hops=2,
        max_nodes=50,
        format="structured"
    )

    print("=== GraphRAG Answer ===")
    print(result["answer"])
    print(f"\nContext: {result['nodes_in_context']} nodes, ~{result['estimated_tokens']} tokens")
source("r_client.R")

client <- AstraeaClient$new("127.0.0.1", 7687)
client$connect()

# Step 1: Create nodes with embeddings
acme <- client$create_node(
  labels = list("Company"),
  properties = list(name = "Acme Corp", industry = "AI"),
  embedding = c(0.1, 0.8, 0.3, 0.5)
)
phoenix <- client$create_node(
  labels = list("Product"),
  properties = list(name = "Phoenix Engine", type = "ML Platform"),
  embedding = c(0.2, 0.9, 0.4, 0.6)
)
alice <- client$create_node(
  labels = list("Person"),
  properties = list(name = "Alice Chen", role = "CTO"),
  embedding = c(0.3, 0.7, 0.2, 0.4)
)
bob <- client$create_node(
  labels = list("Person"),
  properties = list(name = "Bob Smith", role = "Lead Engineer"),
  embedding = c(0.25, 0.75, 0.35, 0.45)
)
nova <- client$create_node(
  labels = list("Company"),
  properties = list(name = "Nova Labs", industry = "Cloud"),
  embedding = c(0.15, 0.6, 0.5, 0.7)
)

# Step 2: Create relationships
client$create_edge(alice$node_id, acme$node_id, "WORKS_AT", list(since = 2019))
client$create_edge(bob$node_id, acme$node_id, "WORKS_AT", list(since = 2021))
client$create_edge(alice$node_id, bob$node_id, "MANAGES")
client$create_edge(acme$node_id, phoenix$node_id, "DEVELOPS")
client$create_edge(bob$node_id, phoenix$node_id, "LEADS")
client$create_edge(acme$node_id, nova$node_id, "PARTNERS_WITH")

# Step 3: GraphRAG query
result <- client$graph_rag(
  question = "Who leads the development of Acme Corp's ML platform?",
  anchor = acme$node_id,
  hops = 2,
  max_nodes = 50,
  format = "structured"
)

cat("=== GraphRAG Answer ===\n")
cat(result$answer, "\n")

client$close()
package main

import (
	"context"
	"fmt"

	"github.com/AstraeaDB/AstraeaDB-Official"
)

func main() {
	client := astraeadb.NewClient(astraeadb.WithAddress("127.0.0.1", 7687))
	ctx := context.Background()
	client.Connect(ctx)
	defer client.Close()

	// Step 1: Create nodes with embeddings
	acme, _ := client.CreateNode(ctx, []string{"Company"},
		map[string]any{"name": "Acme Corp", "industry": "AI"},
		[]float32{0.1, 0.8, 0.3, 0.5})
	phoenix, _ := client.CreateNode(ctx, []string{"Product"},
		map[string]any{"name": "Phoenix Engine", "type": "ML Platform"},
		[]float32{0.2, 0.9, 0.4, 0.6})
	alice, _ := client.CreateNode(ctx, []string{"Person"},
		map[string]any{"name": "Alice Chen", "role": "CTO"},
		[]float32{0.3, 0.7, 0.2, 0.4})
	bob, _ := client.CreateNode(ctx, []string{"Person"},
		map[string]any{"name": "Bob Smith", "role": "Lead Engineer"},
		[]float32{0.25, 0.75, 0.35, 0.45})
	nova, _ := client.CreateNode(ctx, []string{"Company"},
		map[string]any{"name": "Nova Labs", "industry": "Cloud"},
		[]float32{0.15, 0.6, 0.5, 0.7})

	// Step 2: Create relationships
	client.CreateEdge(ctx, alice.NodeID, acme.NodeID, "WORKS_AT", map[string]any{"since": 2019})
	client.CreateEdge(ctx, bob.NodeID, acme.NodeID, "WORKS_AT", map[string]any{"since": 2021})
	client.CreateEdge(ctx, alice.NodeID, bob.NodeID, "MANAGES", nil)
	client.CreateEdge(ctx, acme.NodeID, phoenix.NodeID, "DEVELOPS", nil)
	client.CreateEdge(ctx, bob.NodeID, phoenix.NodeID, "LEADS", nil)
	client.CreateEdge(ctx, acme.NodeID, nova.NodeID, "PARTNERS_WITH", nil)

	// Step 3: GraphRAG query
	result, _ := client.GraphRAG(ctx, &astraeadb.GraphRAGConfig{
		Question: "Who leads the development of Acme Corp's ML platform?",
		Anchor:   acme.NodeID,
		Hops:     2,
		MaxNodes: 50,
		Format:   "structured",
	})

	fmt.Println("=== GraphRAG Answer ===")
	fmt.Println(result.Answer)
}
import com.astraeadb.unified.UnifiedClient;
import java.util.List;
import java.util.Map;

try (var client = UnifiedClient.builder()
        .host("127.0.0.1").port(7687).build()) {
    client.connect();

    // Step 1: Create nodes with embeddings
    var acme = client.createNode(
        List.of("Company"),
        Map.of("name", "Acme Corp", "industry", "AI"),
        new float[]{0.1f, 0.8f, 0.3f, 0.5f});
    var phoenix = client.createNode(
        List.of("Product"),
        Map.of("name", "Phoenix Engine", "type", "ML Platform"),
        new float[]{0.2f, 0.9f, 0.4f, 0.6f});
    var alice = client.createNode(
        List.of("Person"),
        Map.of("name", "Alice Chen", "role", "CTO"),
        new float[]{0.3f, 0.7f, 0.2f, 0.4f});
    var bob = client.createNode(
        List.of("Person"),
        Map.of("name", "Bob Smith", "role", "Lead Engineer"),
        new float[]{0.25f, 0.75f, 0.35f, 0.45f});
    var nova = client.createNode(
        List.of("Company"),
        Map.of("name", "Nova Labs", "industry", "Cloud"),
        new float[]{0.15f, 0.6f, 0.5f, 0.7f});

    // Step 2: Create relationships
    client.createEdge(alice.getNodeId(), acme.getNodeId(), "WORKS_AT", Map.of("since", 2019));
    client.createEdge(bob.getNodeId(), acme.getNodeId(), "WORKS_AT", Map.of("since", 2021));
    client.createEdge(alice.getNodeId(), bob.getNodeId(), "MANAGES", Map.of());
    client.createEdge(acme.getNodeId(), phoenix.getNodeId(), "DEVELOPS", Map.of());
    client.createEdge(bob.getNodeId(), phoenix.getNodeId(), "LEADS", Map.of());
    client.createEdge(acme.getNodeId(), nova.getNodeId(), "PARTNERS_WITH", Map.of());

    // Step 3: GraphRAG query
    var result = client.graphRag(
        "Who leads the development of Acme Corp's ML platform?",
        acme.getNodeId(),  // anchor node
        2,                 // hops
        50,                // max nodes
        "structured");     // format

    System.out.println("=== GraphRAG Answer ===");
    System.out.println(result.getAnswer());
}
Step 2: What Happens Inside
When the graph_rag call executes, AstraeaDB performs these steps internally:
- Subgraph extraction: Starting from the Acme Corp node, BFS traverses 2 hops, collecting Alice, Bob, Phoenix Engine, and Nova Labs.
- Linearization: The subgraph is converted to structured text:
Acme Corp [Company] {name: "Acme Corp", industry: "AI"}
  --DEVELOPS--> Phoenix Engine [Product] {name: "Phoenix Engine", type: "ML Platform"}
  --PARTNERS_WITH--> Nova Labs [Company] {name: "Nova Labs", industry: "Cloud"}
  <--WORKS_AT-- Alice Chen [Person] {name: "Alice Chen", role: "CTO"}
    --MANAGES--> Bob Smith [Person] {name: "Bob Smith", role: "Lead Engineer"}
      --LEADS--> Phoenix Engine [Product]
- LLM prompt: The linearized text is combined with the user's question and sent to the configured LLM.
- Answer: The LLM responds with a grounded answer like: "Bob Smith, a Lead Engineer at Acme Corp, leads the Phoenix Engine, which is Acme's ML Platform. He is managed by Alice Chen, the CTO."
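The 2-hop collection in the first step can be checked on paper with a small BFS over the example's edges. The data below mirrors the graph built above (traversed as undirected, which matches the neighborhood the pipeline collects); it is a toy reproduction, not a client call:

```python
from collections import deque

# Undirected view of the example graph's edges
edges = [
    ("Alice Chen", "Acme Corp"), ("Bob Smith", "Acme Corp"),
    ("Alice Chen", "Bob Smith"), ("Acme Corp", "Phoenix Engine"),
    ("Bob Smith", "Phoenix Engine"), ("Acme Corp", "Nova Labs"),
]
adj = {}
for a, b in edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

def bfs_hops(center, hops):
    # Return every node within `hops` of the center, with its depth
    seen = {center: 0}
    queue = deque([center])
    while queue:
        node = queue.popleft()
        if seen[node] == hops:
            continue  # do not expand past the hop limit
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    return seen

print(sorted(bfs_hops("Acme Corp", 2)))
# ['Acme Corp', 'Alice Chen', 'Bob Smith', 'Nova Labs', 'Phoenix Engine']
```

All four neighbors are already reachable in one hop here; the second hop matters in sparser graphs where the LEADS and MANAGES context sits further from the anchor.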
Step 3: Why GraphRAG Beats Simple Vector Search
A traditional vector search for "Acme Corp's ML platform" would return the Phoenix Engine node (most similar embedding). But it would not tell you who leads it -- that information lives in the graph edges, not in any single node's properties. GraphRAG captures these multi-hop connections automatically.
| Approach | Context Retrieved | Can Answer "Who leads it?" |
|---|---|---|
| Simple Vector Search | Phoenix Engine node only | No -- no relationship context |
| GraphRAG (2 hops) | Phoenix + Bob + Alice + Acme + Nova | Yes -- LEADS edge connects Bob to Phoenix |
Start with hops=2 and max_nodes=50. If the LLM's answers lack context, increase hops. If responses are slow or token costs are high, reduce max_nodes or switch to "triples" format. The goal is to include enough context for accurate answers without overwhelming the LLM's context window.
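One way to operationalize this tuning advice is a small loop that grows hops until an estimated token budget would be exceeded. The sketch below uses the extract_subgraph call shown earlier; the ~4-characters-per-token math is a heuristic, and the budget and cap values are illustrative:

```python
def pick_hops(client, anchor, token_budget=4000, max_hops=4):
    """Grow the neighborhood until the context would exceed the budget."""
    best = 1
    for hops in range(1, max_hops + 1):
        subgraph = client.extract_subgraph(
            center=anchor, hops=hops, max_nodes=200, format="triples"
        )
        # ~4 characters per token is a rough heuristic
        if len(subgraph["text"]) // 4 > token_budget:
            break
        best = hops
    return best
```

Running this once per anchor (or per anchor class) and caching the result avoids paying the probing cost on every question.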