Chapter 6: Graph Traversals
While GQL lets you describe what you want to find, traversal algorithms give you fine-grained control over how the graph is explored. This chapter covers BFS, DFS, shortest path (unweighted and weighted), and neighbor queries—the building blocks for everything from social-network analysis to route planning.
6.1 Breadth-First Search (BFS)
Breadth-First Search explores the graph level by level. Starting from a given node, it first visits all immediate neighbors (depth 1), then all neighbors-of-neighbors (depth 2), and so on. This guarantees that nodes are discovered in order of increasing distance from the start.
How BFS Works
Consider this social network:
The BFS API
AstraeaDB exposes BFS through a dedicated traversal method:
result = client.bfs(start_node_id, max_depth=3)
The return value is a list of objects, each containing:
| Field | Type | Description |
|---|---|---|
node_id | Integer | The internal ID of the discovered node |
depth | Integer | How many hops from the start node |
When to Use BFS
- Shortest path (unweighted): BFS inherently finds the fewest-hop path between two nodes.
- Finding all nodes within N hops: Set
max_depthto limit exploration radius. - Level-order analysis: Analyze how influence or information spreads outward from a source.
- Connected component discovery: BFS from any node discovers its entire connected component.
BFS Code Examples
from astraeadb import AstraeaClient with AstraeaClient() as client: # First, find Alice's node ID alice = client.query('MATCH (n:Person {name: "Alice"}) RETURN id(n)') alice_id = alice["rows"][0][0] # Run BFS from Alice, up to 3 hops deep result = client.bfs(alice_id, max_depth=3) # Group results by depth print("BFS from Alice:") for entry in result: print(f" Depth {entry['depth']}: Node {entry['node_id']}") # Example output: # BFS from Alice: # Depth 0: Node 0 (Alice herself) # Depth 1: Node 1 (Bob) # Depth 1: Node 2 (Carol) # Depth 1: Node 3 (Dave) # Depth 2: Node 4 (Eve) # Depth 2: Node 5 (Frank) # Depth 2: Node 6 (Grace) # Depth 3: Node 7 (Hank) # Count nodes at each depth from collections import Counter depth_counts = Counter(e["depth"] for e in result) for depth, count in sorted(depth_counts.items()): print(f" Depth {depth}: {count} node(s)") # Depth 0: 1 node(s) # Depth 1: 3 node(s) # Depth 2: 3 node(s) # Depth 3: 1 node(s)
library(astraeadb) client <- astraea_client() client$connect() # Find Alice's node ID alice <- client$query('MATCH (n:Person {name: "Alice"}) RETURN id(n)') alice_id <- alice$rows[1, 1] # Run BFS from Alice, up to 3 hops deep result <- client$bfs(alice_id, max_depth = 3) # Print results cat("BFS from Alice:\n") for (i in seq_len(nrow(result))) { cat(sprintf(" Depth %d: Node %d\n", result[i, "depth"], result[i, "node_id"])) } # Count nodes at each depth depth_counts <- table(result$depth) print(depth_counts) # 0 1 2 3 # 1 3 3 1 client$close()
package main import ( "context" "fmt" "github.com/AstraeaDB/AstraeaDB-Official" ) func main() { ctx := context.Background() client := astraeadb.NewClient( astraeadb.WithAddress("127.0.0.1", 7687), ) client.Connect(ctx) defer client.Close() // Find Alice's node ID alice, _ := client.Query(ctx, `MATCH (n:Person {name: "Alice"}) RETURN id(n)`) aliceID := alice.Rows[0][0].(int64) // Run BFS from Alice, up to 3 hops deep result, _ := client.BFS(ctx, aliceID, 3) fmt.Println("BFS from Alice:") for _, entry := range result { fmt.Printf(" Depth %d: Node %d\n", entry.Depth, entry.NodeID) } // Count nodes at each depth counts := make(map[int]int) for _, entry := range result { counts[entry.Depth]++ } for depth := 0; depth <= 3; depth++ { fmt.Printf(" Depth %d: %d node(s)\n", depth, counts[depth]) } }
import com.astraeadb.unified.UnifiedClient; import com.astraeadb.unified.TraversalEntry; import java.util.List; import java.util.stream.Collectors; public class BFSExample { public static void main(String[] args) { try (var client = UnifiedClient.builder() .host("127.0.0.1").build()) { client.connect(); // Find Alice's node ID var alice = client.query( "MATCH (n:Person {name: \"Alice\"}) RETURN id(n)"); long aliceId = ((Number) alice.rows() .get(0).get(0)).longValue(); // Run BFS from Alice, up to 3 hops deep List<TraversalEntry> result = client.bfs(aliceId, 3); System.out.println("BFS from Alice:"); for (var entry : result) { System.out.printf( " Depth %d: Node %d%n", entry.depth(), entry.nodeId()); } // Count nodes at each depth var counts = result.stream() .collect(Collectors.groupingBy( TraversalEntry::depth, Collectors.counting())); counts.forEach((depth, count) -> System.out.printf( " Depth %d: %d node(s)%n", depth, count)); } } }
6.2 Depth-First Search (DFS)
Depth-First Search takes the opposite approach from BFS: instead of exploring level by level, it follows a single path as deep as possible before backtracking to explore alternative branches.
BFS vs. DFS: Visual Comparison
The DFS API
result = client.dfs(start_node_id, max_depth=3)
Returns the same structure as BFS: a list of {node_id, depth} entries, but in DFS visit order.
When to Use DFS
- Cycle detection: DFS naturally detects back-edges, which indicate cycles in directed graphs.
- Topological sorting: Post-order DFS produces a topological ordering of a DAG.
- Exhaustive path enumeration: DFS explores every possible path, making it ideal for finding all routes between two nodes.
- Memory efficiency: DFS uses O(depth) memory, while BFS uses O(branching_factor^depth). For very wide graphs, DFS is more memory-efficient.
Choosing Between BFS and DFS
| Criterion | BFS | DFS |
|---|---|---|
| Finds shortest path? | Yes (unweighted) | No |
| Memory usage | O(branching^depth) | O(depth) |
| Explores nearest first? | Yes | No |
| Detects cycles? | Possible | Natural fit |
| Topological sort? | No | Yes (post-order) |
| Exhaustive path search? | Expensive | Natural fit |
6.3 Shortest Path (Unweighted)
The unweighted shortest path finds the path between two nodes that uses the fewest hops. Under the hood, this is implemented as a BFS from the source node that terminates as soon as the target is reached.
The Shortest Path API
result = client.shortest_path(from_node, to_node)
The return value contains:
| Field | Type | Description |
|---|---|---|
path | List of integers | Ordered list of node IDs from source to target |
length | Integer | Number of edges in the path |
Example: Degrees of Separation
A classic use case: finding the shortest connection between two actors in a movie database. How many "hops" separate Keanu Reeves from Tom Hanks?
Shortest Path Code Examples
with AstraeaClient() as client: # Look up the node IDs for our two target people keanu = client.query( 'MATCH (n:Person {name: "Keanu"}) RETURN id(n)') tom = client.query( 'MATCH (n:Person {name: "Tom Hanks"}) RETURN id(n)') keanu_id = keanu["rows"][0][0] tom_id = tom["rows"][0][0] # Find the shortest path between them result = client.shortest_path(keanu_id, tom_id) print(f"Path length: {result['length']} hops") print(f"Path: {result['path']}") # Resolve the node IDs to names for display for node_id in result["path"]: info = client.query( f"MATCH (n) WHERE id(n) = {node_id} RETURN labels(n), n.name") labels, name = info["rows"][0] print(f" {node_id}: {labels} {name}") # Output: # Path length: 4 hops # Path: [10, 20, 15, 25, 12] # 10: ["Person"] Keanu # 20: ["Movie"] The Matrix # 15: ["Person"] Hugo Weaving # 25: ["Movie"] Cloud Atlas # 12: ["Person"] Tom Hanks
client <- astraea_client() client$connect() # Look up node IDs keanu <- client$query('MATCH (n:Person {name: "Keanu"}) RETURN id(n)') tom <- client$query('MATCH (n:Person {name: "Tom Hanks"}) RETURN id(n)') keanu_id <- keanu$rows[1, 1] tom_id <- tom$rows[1, 1] # Find the shortest path result <- client$shortest_path(keanu_id, tom_id) cat(sprintf("Path length: %d hops\n", result$length)) cat("Path:", result$path, "\n") # Resolve node IDs to names for (nid in result$path) { info <- client$query(sprintf( "MATCH (n) WHERE id(n) = %d RETURN labels(n), n.name", nid)) cat(sprintf(" %d: %s %s\n", nid, info$rows[1, 1], info$rows[1, 2])) } client$close()
ctx := context.Background() client := astraeadb.NewClient( astraeadb.WithAddress("127.0.0.1", 7687), ) client.Connect(ctx) defer client.Close() // Look up node IDs keanu, _ := client.Query(ctx, `MATCH (n:Person {name: "Keanu"}) RETURN id(n)`) tom, _ := client.Query(ctx, `MATCH (n:Person {name: "Tom Hanks"}) RETURN id(n)`) keanuID := keanu.Rows[0][0].(int64) tomID := tom.Rows[0][0].(int64) // Find the shortest path result, _ := client.ShortestPath(ctx, keanuID, tomID) fmt.Printf("Path length: %d hops\n", result.Length) fmt.Printf("Path: %v\n", result.Path) // Resolve node IDs to names for _, nodeID := range result.Path { info, _ := client.Query(ctx, fmt.Sprintf( "MATCH (n) WHERE id(n) = %d RETURN labels(n), n.name", nodeID)) fmt.Printf(" %d: %v %v\n", nodeID, info.Rows[0][0], info.Rows[0][1]) }
try (var client = UnifiedClient.builder() .host("127.0.0.1").build()) { client.connect(); // Look up node IDs var keanu = client.query( "MATCH (n:Person {name: \"Keanu\"}) RETURN id(n)"); var tom = client.query( "MATCH (n:Person {name: \"Tom Hanks\"}) RETURN id(n)"); long keanuId = ((Number) keanu.rows() .get(0).get(0)).longValue(); long tomId = ((Number) tom.rows() .get(0).get(0)).longValue(); // Find the shortest path var result = client.shortestPath(keanuId, tomId); System.out.printf("Path length: %d hops%n", result.length()); System.out.println("Path: " + result.path()); // Resolve node IDs to names for (long nodeId : result.path()) { var info = client.query(String.format( "MATCH (n) WHERE id(n) = %d RETURN labels(n), n.name", nodeId)); System.out.printf(" %d: %s %s%n", nodeId, info.rows().get(0).get(0), info.rows().get(0).get(1)); } }
6.4 Weighted Shortest Path (Dijkstra)
When edges carry weights (costs, distances, latencies, or strengths), the shortest path is no longer the one with the fewest hops. Instead, it is the path whose total weight is minimized. AstraeaDB uses Dijkstra's algorithm for this purpose.
How Dijkstra's Algorithm Works
In accessible terms: imagine you are at an intersection and you can see the travel time to each neighboring intersection. Dijkstra's algorithm always expands the node with the smallest total cumulative cost so far. It keeps a priority queue of candidates and relaxes edges greedily, guaranteeing the optimal solution when all weights are non-negative.
The Weighted Shortest Path API
result = client.shortest_path(from_node, to_node, weighted=True)
The return value contains:
| Field | Type | Description |
|---|---|---|
path | List of integers | Ordered list of node IDs from source to target |
cost | Float | Total accumulated weight along the path |
weight property from each edge. Make sure your edges
have a numeric weight property set when creating them. Edges without a weight
property default to a weight of 1.0.
Dijkstra Code Examples
with AstraeaClient() as client: # Build a small routing graph with distance weights cities = {} for name in ["London", "Paris", "Berlin", "Rome", "Madrid"]: cities[name] = client.create_node( ["City"], {"name": name}) # Add routes with distance weights (km) routes = [ ("London", "Paris", 340), ("Paris", "Berlin", 878), ("Paris", "Rome", 1105), ("Berlin", "Rome", 1181), ("London", "Madrid", 1264), ("Madrid", "Rome", 1365), ("Madrid", "Paris", 1054), ] for src, dst, km in routes: client.create_edge( cities[src], cities[dst], "ROUTE", {"weight": km}) # Weighted shortest path: London to Rome result = client.shortest_path( cities["London"], cities["Rome"], weighted=True) print(f"Total distance: {result['cost']} km") print(f"Path: {result['path']}") # Output: # Total distance: 1445 km # Path: [london_id, paris_id, rome_id] # (London → Paris: 340km, Paris → Rome: 1105km) # Compare with unweighted (fewest hops) unweighted = client.shortest_path( cities["London"], cities["Rome"]) print(f"Unweighted hops: {unweighted['length']}")
client <- astraea_client() client$connect() # Build a routing graph with distance weights cities <- list() for (name in c("London", "Paris", "Berlin", "Rome", "Madrid")) { cities[[name]] <- client$create_node( list("City"), list(name = name)) } # Add routes with distance weights (km) client$create_edge(cities[["London"]], cities[["Paris"]], "ROUTE", list(weight = 340)) client$create_edge(cities[["Paris"]], cities[["Berlin"]], "ROUTE", list(weight = 878)) client$create_edge(cities[["Paris"]], cities[["Rome"]], "ROUTE", list(weight = 1105)) client$create_edge(cities[["Berlin"]], cities[["Rome"]], "ROUTE", list(weight = 1181)) client$create_edge(cities[["London"]], cities[["Madrid"]], "ROUTE", list(weight = 1264)) client$create_edge(cities[["Madrid"]], cities[["Rome"]], "ROUTE", list(weight = 1365)) # Weighted shortest path: London to Rome result <- client$shortest_path( cities[["London"]], cities[["Rome"]], weighted = TRUE) cat(sprintf("Total distance: %g km\n", result$cost)) cat("Path:", result$path, "\n") client$close()
ctx := context.Background() client := astraeadb.NewClient( astraeadb.WithAddress("127.0.0.1", 7687), ) client.Connect(ctx) defer client.Close() // Create city nodes london, _ := client.CreateNode(ctx, []string{"City"}, map[string]any{"name": "London"}, nil) paris, _ := client.CreateNode(ctx, []string{"City"}, map[string]any{"name": "Paris"}, nil) rome, _ := client.CreateNode(ctx, []string{"City"}, map[string]any{"name": "Rome"}, nil) // Add weighted routes client.CreateEdge(ctx, london, paris, "ROUTE", map[string]any{"weight": 340.0}) client.CreateEdge(ctx, paris, rome, "ROUTE", map[string]any{"weight": 1105.0}) // Weighted shortest path: London to Rome result, _ := client.ShortestPath(ctx, london, rome, astraeadb.WithWeighted(true)) fmt.Printf("Total distance: %.0f km\n", result.Cost) fmt.Printf("Path: %v\n", result.Path)
try (var client = UnifiedClient.builder() .host("127.0.0.1").build()) { client.connect(); // Create city nodes long london = client.createNode( List.of("City"), Map.of("name", "London"), null); long paris = client.createNode( List.of("City"), Map.of("name", "Paris"), null); long rome = client.createNode( List.of("City"), Map.of("name", "Rome"), null); // Add weighted routes client.createEdge(london, paris, "ROUTE", Map.of("weight", 340.0)); client.createEdge(paris, rome, "ROUTE", Map.of("weight", 1105.0)); // Weighted shortest path: London to Rome var result = client.shortestPath( london, rome, true); // weighted=true System.out.printf( "Total distance: %.0f km%n", result.cost()); System.out.println( "Path: " + result.path()); }
6.5 Neighbor Queries
Sometimes you do not need a full traversal—you just want to know what is immediately connected to a node. The neighbor query API gives you direct access to a node's local neighborhood with optional filtering by direction and edge type.
The Neighbors API
result = client.neighbors( node_id, direction="outgoing", # "outgoing", "incoming", or "both" edge_type="KNOWS" # optional: filter by relationship type )
Direction Parameter
| Direction | Meaning | Diagram |
|---|---|---|
"outgoing" |
Nodes that this node points to | (node) --> (neighbor) |
"incoming" |
Nodes that point to this node | (neighbor) --> (node) |
"both" |
All connected nodes regardless of direction | (neighbor) <--> (node) |
Filtering by Edge Type
The optional edge_type parameter restricts results to neighbors connected by a
specific relationship type. This is especially useful when nodes have many different kinds
of connections.
Neighbor Query Code Examples
with AstraeaClient() as client: # Find Alice's node ID alice = client.query( 'MATCH (n:Person {name: "Alice"}) RETURN id(n)') alice_id = alice["rows"][0][0] # All outgoing neighbors all_out = client.neighbors(alice_id, direction="outgoing") print("All outgoing neighbors:") for n in all_out: print(f" Node {n['node_id']} via {n['edge_type']}") # Only KNOWS relationships friends = client.neighbors( alice_id, direction="outgoing", edge_type="KNOWS") print(f"\nAlice knows {len(friends)} people:") for f in friends: # Resolve the neighbor's name info = client.query( f"MATCH (n) WHERE id(n) = {f['node_id']} RETURN n.name") print(f" {info['rows'][0][0]}") # Incoming neighbors (who knows Alice?) admirers = client.neighbors( alice_id, direction="incoming", edge_type="KNOWS") print(f"\n{len(admirers)} people know Alice") # Both directions (full neighborhood) all_connected = client.neighbors( alice_id, direction="both") print(f"\nTotal connections: {len(all_connected)}") # Example output: # All outgoing neighbors: # Node 1 via KNOWS # Node 2 via KNOWS # Node 20 via ACTED_IN # # Alice knows 2 people: # Bob # Carol # # 1 people know Alice # # Total connections: 4
client <- astraea_client() client$connect() # Find Alice's node ID alice <- client$query('MATCH (n:Person {name: "Alice"}) RETURN id(n)') alice_id <- alice$rows[1, 1] # All outgoing neighbors all_out <- client$neighbors(alice_id, direction = "outgoing") cat("All outgoing neighbors:\n") for (i in seq_along(all_out)) { cat(sprintf(" Node %d via %s\n", all_out[[i]]$node_id, all_out[[i]]$edge_type)) } # Only KNOWS relationships friends <- client$neighbors( alice_id, direction = "outgoing", edge_type = "KNOWS") cat(sprintf("\nAlice knows %d people:\n", length(friends))) for (f in friends) { info <- client$query(sprintf( "MATCH (n) WHERE id(n) = %d RETURN n.name", f$node_id)) cat(" ", info$rows[1, 1], "\n") } # Incoming neighbors admirers <- client$neighbors( alice_id, direction = "incoming", edge_type = "KNOWS") cat(sprintf("\n%d people know Alice\n", length(admirers))) # Both directions all_connected <- client$neighbors( alice_id, direction = "both") cat(sprintf("Total connections: %d\n", length(all_connected))) client$close()
ctx := context.Background() client := astraeadb.NewClient( astraeadb.WithAddress("127.0.0.1", 7687), ) client.Connect(ctx) defer client.Close() // Find Alice's node ID alice, _ := client.Query(ctx, `MATCH (n:Person {name: "Alice"}) RETURN id(n)`) aliceID := alice.Rows[0][0].(int64) // All outgoing neighbors allOut, _ := client.Neighbors(ctx, aliceID, astraeadb.DirectionOutgoing) fmt.Println("All outgoing neighbors:") for _, n := range allOut { fmt.Printf(" Node %d via %s\n", n.NodeID, n.EdgeType) } // Only KNOWS relationships friends, _ := client.Neighbors(ctx, aliceID, astraeadb.DirectionOutgoing, astraeadb.WithEdgeType("KNOWS")) fmt.Printf("\nAlice knows %d people:\n", len(friends)) for _, f := range friends { info, _ := client.Query(ctx, fmt.Sprintf( "MATCH (n) WHERE id(n) = %d RETURN n.name", f.NodeID)) fmt.Printf(" %v\n", info.Rows[0][0]) } // Both directions allConnected, _ := client.Neighbors(ctx, aliceID, astraeadb.DirectionBoth) fmt.Printf("\nTotal connections: %d\n", len(allConnected))
try (var client = UnifiedClient.builder() .host("127.0.0.1").build()) { client.connect(); // Find Alice's node ID var alice = client.query( "MATCH (n:Person {name: \"Alice\"}) RETURN id(n)"); long aliceId = ((Number) alice.rows() .get(0).get(0)).longValue(); // All outgoing neighbors var allOut = client.neighbors( aliceId, "outgoing", null); System.out.println("All outgoing neighbors:"); for (var n : allOut) { System.out.printf(" Node %d via %s%n", n.nodeId(), n.edgeType()); } // Only KNOWS relationships var friends = client.neighbors( aliceId, "outgoing", "KNOWS"); System.out.printf( "%nAlice knows %d people:%n", friends.size()); for (var f : friends) { var info = client.query(String.format( "MATCH (n) WHERE id(n) = %d RETURN n.name", f.nodeId())); System.out.println( " " + info.rows().get(0).get(0)); } // Both directions var allConnected = client.neighbors( aliceId, "both", null); System.out.printf( "%nTotal connections: %d%n", allConnected.size()); }
Chapter Summary
- BFS explores level by level. Use it for shortest paths (unweighted), bounded exploration, and discovering connected components.
- DFS dives deep before backtracking. Use it for cycle detection, topological sorting, and exhaustive path enumeration.
- Shortest Path (unweighted) finds the fewest-hop path between two nodes. It is a specialized BFS that stops at the target.
- Dijkstra (weighted) finds the lowest-cost path when edges have weights. Always reads the
weightproperty from edges. - Neighbor Queries give direct access to a node's local neighborhood, filterable by direction (
outgoing,incoming,both) and edge type.
These traversal primitives are the foundation for everything that follows. In the next chapter, we explore the transport protocols (JSON-TCP, gRPC, Arrow Flight) that carry these operations between your application and the AstraeaDB server.