Chapter 11: GraphRAG
Combine the structural richness of graph traversals with the semantic power of large language models. AstraeaDB's GraphRAG pipeline retrieves context-aware subgraphs and feeds them to an LLM in a single server-side operation.
11.1 What Is GraphRAG?
RAG: The Foundation
Retrieval-Augmented Generation (RAG) is a technique that improves LLM answers by feeding relevant context into the prompt before asking the model to respond. Instead of relying solely on the model's training data, RAG retrieves up-to-date, domain-specific information at query time.
Traditional RAG vs. GraphRAG
Traditional RAG performs vector search over a flat collection of document chunks. It finds the most semantically similar passages to the user's question and includes them in the LLM prompt. This works well for simple lookups, but it has a fundamental limitation: it treats each document chunk as an isolated unit, with no awareness of how facts relate to one another.
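The retrieval step of traditional RAG can be sketched in a few lines of plain Python: rank isolated chunks by cosine similarity to the query embedding and return the top k. The chunks and 2-D "embeddings" below are toy illustrations, not an AstraeaDB API.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Toy corpus: each chunk is an isolated unit with its own embedding
chunks = [
    ("Alice is a Director.",          [0.9, 0.1]),
    ("The Matrix project is active.", [0.1, 0.9]),
    ("Bob is an Engineer.",           [0.6, 0.4]),
]

def flat_rag_retrieve(query_vec, k=2):
    # Rank chunks by similarity; note there is no notion of how facts connect
    ranked = sorted(chunks, key=lambda c: cosine(c[1], query_vec), reverse=True)
    return [text for text, _ in ranked[:k]]

print(flat_rag_retrieve([0.2, 0.8]))
# ['The Matrix project is active.', 'Bob is an Engineer.']
```

Each retrieved passage arrives with no edges to the others, which is exactly the limitation GraphRAG addresses.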
GraphRAG replaces the flat document store with a graph. Instead of retrieving isolated passages, it retrieves connected subgraphs -- networks of facts with explicit relationships between them. This enables the LLM to reason about:
- Relationships: "Alice manages Bob, who works on the Matrix project" -- not just "Alice" and "Matrix" appearing in the same document.
- Causality: "Event A triggered Event B, which led to Outcome C" -- with explicit causal edges.
- Multi-hop reasoning: "What is the connection between Alice and the Matrix project?" requires traversing through intermediate nodes (Alice -> manages -> Bob -> works_on -> Matrix).
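The multi-hop case above can be reproduced with a plain-Python breadth-first search over an adjacency list that records the relationship taken at each step. This is a toy graph for illustration, not the AstraeaDB client API.

```python
from collections import deque

# Toy adjacency list: node -> list of (relationship, target) pairs
graph = {
    "Alice":  [("MANAGES", "Bob")],
    "Bob":    [("WORKS_ON", "Matrix")],
    "Matrix": [],
}

def connection(start, goal):
    # BFS that records the relationship path between two nodes
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for rel, nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [f"--{rel}-->", nxt]))
    return None  # no path exists

print(" ".join(connection("Alice", "Matrix")))
# Alice --MANAGES--> Bob --WORKS_ON--> Matrix
```

A flat document store has no way to produce this chain unless some single chunk happens to mention all three entities together.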
The GraphRAG Pipeline
AstraeaDB implements the full GraphRAG pipeline as a single server-side operation. Here is the flow:
Question
    |
    v
Vector Search -----> Find the most relevant "anchor node"
    |
    v
Graph Traversal ---> BFS from the anchor, collecting N hops of context
    |
    v
Subgraph -----------> A local neighborhood of connected facts
    |
    v
Linearization ------> Convert the subgraph into text (structured, prose, triples, or JSON)
    |
    v
LLM Prompt ---------> Context + Question sent to the language model
    |
    v
Answer -------------> Grounded, context-aware response
The key advantage is that steps 1 through 4 happen inside the database, minimizing round-trip overhead and ensuring the LLM receives a coherent, connected context rather than a bag of unrelated snippets.
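The flow above can be sketched as a composition of stages. The helper names below (embed, vector_search, and so on) are hypothetical placeholders for what the server does internally, shown only to make the data flow concrete.

```python
def graph_rag_pipeline(question, embed, vector_search, extract_subgraph,
                       linearize, call_llm, hops=2, max_nodes=50):
    """Sketch of the GraphRAG flow; each argument is a pluggable stage."""
    # 1. Vector search: find the anchor node most relevant to the question
    anchor = vector_search(embed(question))
    # 2-3. Graph traversal: collect the anchor's N-hop neighborhood
    subgraph = extract_subgraph(anchor, hops=hops, max_nodes=max_nodes)
    # 4. Linearization: turn the subgraph into LLM-ready text
    context = linearize(subgraph)
    # 5-6. Prompt the LLM with context + question
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    return call_llm(prompt)
```

In AstraeaDB, stages 1-4 run inside the database; the sketch only illustrates why the LLM ends up seeing connected context rather than disconnected snippets.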
11.2 Subgraph Extraction
The first step in GraphRAG is extracting the relevant subgraph around a center node. AstraeaDB uses breadth-first search (BFS) to collect the local neighborhood.
Parameters
| Parameter | Type | Description |
|---|---|---|
| center | Node ID | The anchor node from which to start the traversal. |
| hops | Integer | Number of BFS hops from the center. More hops = broader context, but more tokens. |
| max_nodes | Integer | Maximum number of nodes to include. Caps the output to fit within LLM context window limits. |
| format | String | Linearization format: "structured", "prose", "triples", or "json". |
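The hops and max_nodes parameters interact: in a graph with average out-degree b, the N-hop neighborhood can contain up to 1 + b + b^2 + ... + b^N nodes, so max_nodes usually becomes the binding limit after two or three hops. A quick back-of-the-envelope with illustrative numbers:

```python
def neighborhood_upper_bound(branching, hops):
    # Upper bound on nodes reached: 1 + b + b^2 + ... + b^hops
    return sum(branching ** h for h in range(hops + 1))

for hops in range(1, 5):
    n = neighborhood_upper_bound(branching=5, hops=hops)
    print(f"hops={hops}: up to {n} nodes")
# hops=1: up to 6 nodes
# hops=2: up to 31 nodes
# hops=3: up to 156 nodes
# hops=4: up to 781 nodes
```

Real graphs are denser or sparser in places, but the exponential shape is why a modest max_nodes cap is worth setting even for small hop counts.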
Example: Extracting a Subgraph
from astraeadb import AstraeaClient

with AstraeaClient("127.0.0.1", 7687) as client:
    # Extract the 2-hop neighborhood around a node
    subgraph = client.extract_subgraph(
        center=alice_id,
        hops=2,
        max_nodes=50,
        format="structured"
    )

    print(f"Center: {subgraph['center']}")
    print(f"Nodes: {subgraph['node_count']}")
    print(f"Edges: {subgraph['edge_count']}")
    print(f"Text:\n{subgraph['text']}")
source("r_client.R")

client <- AstraeaClient$new("127.0.0.1", 7687)
client$connect()

# Extract the 2-hop neighborhood
subgraph <- client$extract_subgraph(
  center = alice_id,
  hops = 2,
  max_nodes = 50,
  format = "structured"
)

cat("Center:", subgraph$center, "\n")
cat("Nodes:", subgraph$node_count, "\n")
cat("Edges:", subgraph$edge_count, "\n")
cat("Text:\n", subgraph$text, "\n")

client$close()
package main

import (
	"context"
	"fmt"

	"github.com/AstraeaDB/AstraeaDB-Official"
)

func main() {
	client := astraeadb.NewClient(astraeadb.WithAddress("127.0.0.1", 7687))
	ctx := context.Background()
	client.Connect(ctx)
	defer client.Close()

	subgraph, _ := client.ExtractSubgraph(ctx, &astraeadb.SubgraphConfig{
		Center:   aliceID,
		Hops:     2,
		MaxNodes: 50,
		Format:   "structured",
	})

	fmt.Printf("Center: %s\n", subgraph.Center)
	fmt.Printf("Nodes: %d\n", subgraph.NodeCount)
	fmt.Printf("Edges: %d\n", subgraph.EdgeCount)
	fmt.Printf("Text:\n%s\n", subgraph.Text)
}
import com.astraeadb.unified.UnifiedClient;

try (var client = UnifiedClient.builder()
        .host("127.0.0.1").port(7687).build()) {
    client.connect();

    var subgraph = client.extractSubgraph(
        aliceId,       // center node
        2,             // hops
        50,            // max nodes
        "structured"   // format
    );

    System.out.println("Center: " + subgraph.getCenter());
    System.out.println("Nodes: " + subgraph.getNodeCount());
    System.out.println("Edges: " + subgraph.getEdgeCount());
    System.out.println("Text:\n" + subgraph.getText());
}
11.3 Linearization Formats
Once a subgraph is extracted, it must be converted into text that an LLM can process. AstraeaDB supports four linearization formats, each with different trade-offs between readability and token efficiency.
Consider a small graph: Alice (Person) --MANAGES--> Bob (Person) --WORKS_ON--> Matrix (Project), where Matrix has a property status: "active".
Structured Format
An indented tree with arrows. The most readable format and recommended for most LLM interactions.
Subgraph centered on Alice (Person):

Alice [Person] {name: "Alice", role: "Director"}
  --MANAGES--> Bob [Person] {name: "Bob", role: "Engineer"}
    --WORKS_ON--> Matrix [Project] {name: "Matrix", status: "active"}
Prose Format
Natural language paragraphs. Good for LLMs that perform better with conversational context.
The following describes the neighborhood of Alice.
Alice is a Person with role "Director". Alice manages Bob, who is a Person
with role "Engineer". Bob works on the Matrix project, which is currently
active.
Triples Format
Subject-predicate-object triples. Compact and token-efficient, ideal for contexts with strict token limits.
(Alice, type, Person)
(Alice, role, "Director")
(Alice, MANAGES, Bob)
(Bob, type, Person)
(Bob, role, "Engineer")
(Bob, WORKS_ON, Matrix)
(Matrix, type, Project)
(Matrix, status, "active")
JSON Format
Machine-readable structured format. Useful when the LLM output will be parsed programmatically or when you need exact property values.
{
"center": "Alice",
"nodes": [
{"id": "nd-1", "labels": ["Person"], "props": {"name": "Alice", "role": "Director"}},
{"id": "nd-2", "labels": ["Person"], "props": {"name": "Bob", "role": "Engineer"}},
{"id": "nd-3", "labels": ["Project"], "props": {"name": "Matrix", "status": "active"}}
],
"edges": [
{"from": "nd-1", "to": "nd-2", "type": "MANAGES"},
{"from": "nd-2", "to": "nd-3", "type": "WORKS_ON"}
]
}
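Because the JSON format carries the full node and edge lists, the other formats can be derived from it client-side when needed. The sketch below converts the JSON payload shown above into the triples format; it is plain Python over that payload, not a built-in client helper.

```python
# The JSON subgraph from the example above, as a Python dict
subgraph = {
    "center": "Alice",
    "nodes": [
        {"id": "nd-1", "labels": ["Person"], "props": {"name": "Alice", "role": "Director"}},
        {"id": "nd-2", "labels": ["Person"], "props": {"name": "Bob", "role": "Engineer"}},
        {"id": "nd-3", "labels": ["Project"], "props": {"name": "Matrix", "status": "active"}},
    ],
    "edges": [
        {"from": "nd-1", "to": "nd-2", "type": "MANAGES"},
        {"from": "nd-2", "to": "nd-3", "type": "WORKS_ON"},
    ],
}

def to_triples(sg):
    # Map internal IDs to display names so triples read naturally
    name = {n["id"]: n["props"].get("name", n["id"]) for n in sg["nodes"]}
    lines = []
    for n in sg["nodes"]:
        lines.append(f'({name[n["id"]]}, type, {n["labels"][0]})')
        for k, v in n["props"].items():
            if k != "name":
                lines.append(f'({name[n["id"]]}, {k}, "{v}")')
    for e in sg["edges"]:
        lines.append(f'({name[e["from"]]}, {e["type"]}, {name[e["to"]]})')
    return "\n".join(lines)

print(to_triples(subgraph))
```

The output matches the triples example above (modulo line ordering), which is why JSON is the right choice when you plan to post-process the subgraph yourself.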
Choosing a Format
| Format | Token Efficiency | Readability | Best For |
|---|---|---|---|
| Structured | Medium | High | General-purpose LLM queries. Recommended default. |
| Prose | Low | Very High | Conversational LLMs, user-facing explanations. |
| Triples | High | Medium | Token-limited contexts, large subgraphs. |
| JSON | Low | Medium | Programmatic parsing, structured output tasks. |
Use "structured" for most LLM interactions. It strikes the best balance between readability and token count. Switch to "triples" when you need to fit a large subgraph into a limited context window.
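A rough way to compare formats on your own data is the common heuristic of roughly four characters per token. This is only an approximation (real tokenizers vary by model), but it is good enough to decide whether a subgraph fits a context budget:

```python
def estimate_tokens(text):
    # Crude heuristic: ~4 characters per token for English-ish text
    return max(1, len(text) // 4)

# Compare two illustrative linearizations of the same facts
linearizations = {
    "triples": "(Alice, MANAGES, Bob)\n(Bob, WORKS_ON, Matrix)",
    "prose":   "Alice manages Bob. Bob works on the Matrix project.",
}
for fmt, text in linearizations.items():
    print(f"{fmt}: ~{estimate_tokens(text)} tokens")
```

For precise budgeting you would use the target model's actual tokenizer; the heuristic is just for quick format comparisons.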
11.4 LLM Integration
Built-In Providers
AstraeaDB includes built-in support for several LLM providers, configurable via the server configuration file or environment variables:
| Provider | Models | Configuration |
|---|---|---|
| OpenAI | GPT-4o, GPT-4, GPT-3.5-turbo | OPENAI_API_KEY environment variable |
| Anthropic | Claude 4 Opus, Claude 4 Sonnet | ANTHROPIC_API_KEY environment variable |
| Ollama | Llama 3, Mistral, any local model | OLLAMA_HOST (default: http://localhost:11434) |
| Mock | Echo provider for testing | No configuration required |
Server Configuration
Add the LLM provider to your astraeadb.toml:
# astraeadb.toml
[llm]
provider = "anthropic"   # "openai", "anthropic", "ollama", "mock"
model = "claude-sonnet-4-20250514"
max_tokens = 2048

# For Ollama (local models)
# provider = "ollama"
# model = "llama3"
# ollama_host = "http://localhost:11434"
The graph_rag Endpoint
The graph_rag method handles the entire pipeline -- vector search, subgraph extraction, linearization, and LLM invocation -- in a single call:
from astraeadb import AstraeaClient

with AstraeaClient("127.0.0.1", 7687) as client:
    result = client.graph_rag(
        question="What is Alice's relationship to the Matrix project?",
        anchor=alice_id,
        hops=2,
        max_nodes=50,
        format="structured"
    )

    print(result["answer"])
    print(f"Context included {result['nodes_in_context']} nodes")
    print(f"Estimated {result['estimated_tokens']} tokens")
source("r_client.R")

client <- AstraeaClient$new("127.0.0.1", 7687)
client$connect()

result <- client$graph_rag(
  question = "What is Alice's relationship to the Matrix project?",
  anchor = alice_id,
  hops = 2,
  max_nodes = 50,
  format = "structured"
)

cat(result$answer, "\n")
cat("Context included", result$nodes_in_context, "nodes\n")
cat("Estimated", result$estimated_tokens, "tokens\n")

client$close()
package main

import (
	"context"
	"fmt"

	"github.com/AstraeaDB/AstraeaDB-Official"
)

func main() {
	client := astraeadb.NewClient(astraeadb.WithAddress("127.0.0.1", 7687))
	ctx := context.Background()
	client.Connect(ctx)
	defer client.Close()

	result, _ := client.GraphRAG(ctx, &astraeadb.GraphRAGConfig{
		Question: "What is Alice's relationship to the Matrix project?",
		Anchor:   aliceID,
		Hops:     2,
		MaxNodes: 50,
		Format:   "structured",
	})

	fmt.Println(result.Answer)
	fmt.Printf("Context included %d nodes\n", result.NodesInContext)
	fmt.Printf("Estimated %d tokens\n", result.EstimatedTokens)
}
import com.astraeadb.unified.UnifiedClient;

try (var client = UnifiedClient.builder()
        .host("127.0.0.1").port(7687).build()) {
    client.connect();

    var result = client.graphRag(
        "What is Alice's relationship to the Matrix project?",
        aliceId,       // anchor node
        2,             // hops
        50,            // max nodes
        "structured"   // format
    );

    System.out.println(result.getAnswer());
    System.out.printf("Context included %d nodes%n", result.getNodesInContext());
    System.out.printf("Estimated %d tokens%n", result.getEstimatedTokens());
}
During development and testing, use the "mock" provider. It echoes back the context it receives, allowing you to verify that the correct subgraph is being extracted and linearized without incurring LLM API costs.
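Because the mock provider echoes the linearized context, a cheap smoke test is to assert that the entities you expect appear verbatim in its answer. The helper below is a sketch of that pattern; it assumes the server is configured with provider = "mock" and operates on the result dict returned by graph_rag:

```python
def assert_entities_in_context(result, expected=("Alice", "Bob", "Matrix")):
    # With provider = "mock", result["answer"] is the echoed context,
    # so every entity in the extracted subgraph should appear verbatim.
    missing = [name for name in expected if name not in result["answer"]]
    if missing:
        raise AssertionError(f"entities missing from context: {missing}")
    return True

# Usage against a running server (provider = "mock"):
# result = client.graph_rag(question="What does Alice work on?", anchor=alice_id,
#                           hops=2, max_nodes=50, format="structured")
# assert_entities_in_context(result)
```

If an expected entity is missing, the usual fixes are increasing hops or raising max_nodes before touching the prompt or the model.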
11.5 End-to-End Example
Let us walk through a complete GraphRAG pipeline. We will build a small knowledge graph about tech companies, store embeddings for semantic search, and then ask a question that requires multi-hop reasoning.
Step 1: Build the Knowledge Graph
from astraeadb import AstraeaClient

with AstraeaClient("127.0.0.1", 7687) as client:
    # --- Step 1: Create nodes with embeddings ---
    acme = client.create_node(
        labels=["Company"],
        properties={"name": "Acme Corp", "industry": "AI"},
        embedding=[0.1, 0.8, 0.3, 0.5]  # simplified 4D embedding
    )
    phoenix = client.create_node(
        labels=["Product"],
        properties={"name": "Phoenix Engine", "type": "ML Platform"},
        embedding=[0.2, 0.9, 0.4, 0.6]
    )
    alice = client.create_node(
        labels=["Person"],
        properties={"name": "Alice Chen", "role": "CTO"},
        embedding=[0.3, 0.7, 0.2, 0.4]
    )
    bob = client.create_node(
        labels=["Person"],
        properties={"name": "Bob Smith", "role": "Lead Engineer"},
        embedding=[0.25, 0.75, 0.35, 0.45]
    )
    nova = client.create_node(
        labels=["Company"],
        properties={"name": "Nova Labs", "industry": "Cloud"},
        embedding=[0.15, 0.6, 0.5, 0.7]
    )

    # --- Step 2: Create relationships ---
    client.create_edge(alice["node_id"], acme["node_id"], "WORKS_AT", {"since": 2019})
    client.create_edge(bob["node_id"], acme["node_id"], "WORKS_AT", {"since": 2021})
    client.create_edge(alice["node_id"], bob["node_id"], "MANAGES")
    client.create_edge(acme["node_id"], phoenix["node_id"], "DEVELOPS")
    client.create_edge(bob["node_id"], phoenix["node_id"], "LEADS")
    client.create_edge(acme["node_id"], nova["node_id"], "PARTNERS_WITH")

    # --- Step 3: Ask a multi-hop question via GraphRAG ---
    result = client.graph_rag(
        question="Who leads the development of Acme Corp's ML platform?",
        anchor=acme["node_id"],
        hops=2,
        max_nodes=50,
        format="structured"
    )

    print("=== GraphRAG Answer ===")
    print(result["answer"])
    print(f"\nContext: {result['nodes_in_context']} nodes, ~{result['estimated_tokens']} tokens")
source("r_client.R")

client <- AstraeaClient$new("127.0.0.1", 7687)
client$connect()

# Step 1: Create nodes with embeddings
acme <- client$create_node(
  labels = list("Company"),
  properties = list(name = "Acme Corp", industry = "AI"),
  embedding = c(0.1, 0.8, 0.3, 0.5)
)
phoenix <- client$create_node(
  labels = list("Product"),
  properties = list(name = "Phoenix Engine", type = "ML Platform"),
  embedding = c(0.2, 0.9, 0.4, 0.6)
)
alice <- client$create_node(
  labels = list("Person"),
  properties = list(name = "Alice Chen", role = "CTO"),
  embedding = c(0.3, 0.7, 0.2, 0.4)
)
bob <- client$create_node(
  labels = list("Person"),
  properties = list(name = "Bob Smith", role = "Lead Engineer"),
  embedding = c(0.25, 0.75, 0.35, 0.45)
)
nova <- client$create_node(
  labels = list("Company"),
  properties = list(name = "Nova Labs", industry = "Cloud"),
  embedding = c(0.15, 0.6, 0.5, 0.7)
)

# Step 2: Create relationships
client$create_edge(alice$node_id, acme$node_id, "WORKS_AT", list(since = 2019))
client$create_edge(bob$node_id, acme$node_id, "WORKS_AT", list(since = 2021))
client$create_edge(alice$node_id, bob$node_id, "MANAGES")
client$create_edge(acme$node_id, phoenix$node_id, "DEVELOPS")
client$create_edge(bob$node_id, phoenix$node_id, "LEADS")
client$create_edge(acme$node_id, nova$node_id, "PARTNERS_WITH")

# Step 3: GraphRAG query
result <- client$graph_rag(
  question = "Who leads the development of Acme Corp's ML platform?",
  anchor = acme$node_id,
  hops = 2,
  max_nodes = 50,
  format = "structured"
)

cat("=== GraphRAG Answer ===\n")
cat(result$answer, "\n")

client$close()
package main

import (
	"context"
	"fmt"

	"github.com/AstraeaDB/AstraeaDB-Official"
)

func main() {
	client := astraeadb.NewClient(astraeadb.WithAddress("127.0.0.1", 7687))
	ctx := context.Background()
	client.Connect(ctx)
	defer client.Close()

	// Step 1: Create nodes with embeddings
	acme, _ := client.CreateNode(ctx, []string{"Company"},
		map[string]any{"name": "Acme Corp", "industry": "AI"},
		[]float32{0.1, 0.8, 0.3, 0.5})
	phoenix, _ := client.CreateNode(ctx, []string{"Product"},
		map[string]any{"name": "Phoenix Engine", "type": "ML Platform"},
		[]float32{0.2, 0.9, 0.4, 0.6})
	alice, _ := client.CreateNode(ctx, []string{"Person"},
		map[string]any{"name": "Alice Chen", "role": "CTO"},
		[]float32{0.3, 0.7, 0.2, 0.4})
	bob, _ := client.CreateNode(ctx, []string{"Person"},
		map[string]any{"name": "Bob Smith", "role": "Lead Engineer"},
		[]float32{0.25, 0.75, 0.35, 0.45})
	nova, _ := client.CreateNode(ctx, []string{"Company"},
		map[string]any{"name": "Nova Labs", "industry": "Cloud"},
		[]float32{0.15, 0.6, 0.5, 0.7})

	// Step 2: Create relationships
	client.CreateEdge(ctx, alice.NodeID, acme.NodeID, "WORKS_AT", map[string]any{"since": 2019})
	client.CreateEdge(ctx, bob.NodeID, acme.NodeID, "WORKS_AT", map[string]any{"since": 2021})
	client.CreateEdge(ctx, alice.NodeID, bob.NodeID, "MANAGES", nil)
	client.CreateEdge(ctx, acme.NodeID, phoenix.NodeID, "DEVELOPS", nil)
	client.CreateEdge(ctx, bob.NodeID, phoenix.NodeID, "LEADS", nil)
	client.CreateEdge(ctx, acme.NodeID, nova.NodeID, "PARTNERS_WITH", nil)

	// Step 3: GraphRAG query
	result, _ := client.GraphRAG(ctx, &astraeadb.GraphRAGConfig{
		Question: "Who leads the development of Acme Corp's ML platform?",
		Anchor:   acme.NodeID,
		Hops:     2,
		MaxNodes: 50,
		Format:   "structured",
	})

	fmt.Println("=== GraphRAG Answer ===")
	fmt.Println(result.Answer)
}
import com.astraeadb.unified.UnifiedClient;
import java.util.List;
import java.util.Map;

try (var client = UnifiedClient.builder()
        .host("127.0.0.1").port(7687).build()) {
    client.connect();

    // Step 1: Create nodes with embeddings
    var acme = client.createNode(
        List.of("Company"),
        Map.of("name", "Acme Corp", "industry", "AI"),
        new float[]{0.1f, 0.8f, 0.3f, 0.5f});
    var phoenix = client.createNode(
        List.of("Product"),
        Map.of("name", "Phoenix Engine", "type", "ML Platform"),
        new float[]{0.2f, 0.9f, 0.4f, 0.6f});
    var alice = client.createNode(
        List.of("Person"),
        Map.of("name", "Alice Chen", "role", "CTO"),
        new float[]{0.3f, 0.7f, 0.2f, 0.4f});
    var bob = client.createNode(
        List.of("Person"),
        Map.of("name", "Bob Smith", "role", "Lead Engineer"),
        new float[]{0.25f, 0.75f, 0.35f, 0.45f});
    var nova = client.createNode(
        List.of("Company"),
        Map.of("name", "Nova Labs", "industry", "Cloud"),
        new float[]{0.15f, 0.6f, 0.5f, 0.7f});

    // Step 2: Create relationships
    client.createEdge(alice.getNodeId(), acme.getNodeId(), "WORKS_AT", Map.of("since", 2019));
    client.createEdge(bob.getNodeId(), acme.getNodeId(), "WORKS_AT", Map.of("since", 2021));
    client.createEdge(alice.getNodeId(), bob.getNodeId(), "MANAGES", Map.of());
    client.createEdge(acme.getNodeId(), phoenix.getNodeId(), "DEVELOPS", Map.of());
    client.createEdge(bob.getNodeId(), phoenix.getNodeId(), "LEADS", Map.of());
    client.createEdge(acme.getNodeId(), nova.getNodeId(), "PARTNERS_WITH", Map.of());

    // Step 3: GraphRAG query
    var result = client.graphRag(
        "Who leads the development of Acme Corp's ML platform?",
        acme.getNodeId(),  // anchor node
        2,                 // hops
        50,                // max nodes
        "structured");     // format

    System.out.println("=== GraphRAG Answer ===");
    System.out.println(result.getAnswer());
}
Step 2: What Happens Inside
When the graph_rag call executes, AstraeaDB performs these steps internally:
- Subgraph extraction: Starting from the Acme Corp node, BFS traverses 2 hops, collecting Alice, Bob, Phoenix Engine, and Nova Labs.
- Linearization: The subgraph is converted to structured text:
Acme Corp [Company] {name: "Acme Corp", industry: "AI"}
  --DEVELOPS--> Phoenix Engine [Product] {name: "Phoenix Engine", type: "ML Platform"}
  --PARTNERS_WITH--> Nova Labs [Company] {name: "Nova Labs", industry: "Cloud"}
  <--WORKS_AT-- Alice Chen [Person] {name: "Alice Chen", role: "CTO"}
    --MANAGES--> Bob Smith [Person] {name: "Bob Smith", role: "Lead Engineer"}
      --LEADS--> Phoenix Engine [Product]
- LLM prompt: The linearized text is combined with the user's question and sent to the configured LLM.
- Answer: The LLM responds with a grounded answer like: "Bob Smith, a Lead Engineer at Acme Corp, leads the Phoenix Engine, which is Acme's ML Platform. He is managed by Alice Chen, the CTO."
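The 2-hop collection in the first step can be checked on paper with a small BFS over the example's edges. The data below mirrors the graph built above (traversed as undirected, which matches the neighborhood the pipeline collects); it is a toy reproduction, not a client call:

```python
from collections import deque

# Undirected view of the example graph's edges
edges = [
    ("Alice Chen", "Acme Corp"), ("Bob Smith", "Acme Corp"),
    ("Alice Chen", "Bob Smith"), ("Acme Corp", "Phoenix Engine"),
    ("Bob Smith", "Phoenix Engine"), ("Acme Corp", "Nova Labs"),
]
adj = {}
for a, b in edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

def bfs_hops(center, hops):
    # Return every node within `hops` of the center, with its depth
    seen = {center: 0}
    queue = deque([center])
    while queue:
        node = queue.popleft()
        if seen[node] == hops:
            continue  # do not expand past the hop limit
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    return seen

print(sorted(bfs_hops("Acme Corp", 2)))
# ['Acme Corp', 'Alice Chen', 'Bob Smith', 'Nova Labs', 'Phoenix Engine']
```

All four neighbors are already reachable in one hop here; the second hop matters in sparser graphs where the LEADS and MANAGES context sits further from the anchor.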
Step 3: Why GraphRAG Beats Simple Vector Search
A traditional vector search for "Acme Corp's ML platform" would return the Phoenix Engine node (most similar embedding). But it would not tell you who leads it -- that information lives in the graph edges, not in any single node's properties. GraphRAG captures these multi-hop connections automatically.
| Approach | Context Retrieved | Can Answer "Who leads it?" |
|---|---|---|
| Simple Vector Search | Phoenix Engine node only | No -- no relationship context |
| GraphRAG (2 hops) | Phoenix + Bob + Alice + Acme + Nova | Yes -- LEADS edge connects Bob to Phoenix |
Start with hops=2 and max_nodes=50. If the LLM's answers lack context, increase hops. If responses are slow or token costs are high, reduce max_nodes or switch to "triples" format. The goal is to include enough context for accurate answers without overwhelming the LLM's context window.
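One way to operationalize this tuning advice is a small loop that grows hops until an estimated token budget would be exceeded. The sketch below uses the extract_subgraph call shown earlier; the ~4-characters-per-token math is a heuristic, and the budget and cap values are illustrative:

```python
def pick_hops(client, anchor, token_budget=4000, max_hops=4):
    """Grow the neighborhood until the context would exceed the budget."""
    best = 1
    for hops in range(1, max_hops + 1):
        subgraph = client.extract_subgraph(
            center=anchor, hops=hops, max_nodes=200, format="triples"
        )
        # ~4 characters per token is a rough heuristic
        if len(subgraph["text"]) // 4 > token_budget:
            break
        best = hops
    return best
```

Running this once per anchor (or per anchor class) and caching the result avoids paying the probing cost on every question.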