Chapter 14: Performance and Scaling
AstraeaDB is engineered for performance at every layer—from the storage engine that eliminates garbage collection pauses to the GPU acceleration path for analytical workloads. This chapter explains the architecture that makes it fast, the knobs you can turn to make it faster, and the monitoring tools that tell you when something is wrong.
14.1 The Three-Tier Storage Architecture
The fundamental challenge of a cloud-native graph database is that graph traversals require random access at nanosecond speeds, while cloud storage (S3, GCS) offers sequential throughput at millisecond latencies. AstraeaDB solves this with a three-tier storage architecture that automatically moves data between cost-optimized cold storage and speed-optimized hot memory.
Tier 1: Cold Storage (Object Storage)
Data at rest lives in Apache Parquet files on object storage (Amazon S3, Google Cloud Storage, Azure Blob Storage) or a local filesystem. Parquet is a columnar format that compresses well and is readable by virtually every data tool in the ecosystem (Spark, DuckDB, Polars, Pandas). This means your graph data is never locked into a proprietary format.
Tier 2: Warm Cache (NVMe SSD)
When data is first accessed, it is loaded from cold storage into the LRU (Least Recently Used) buffer pool on the local NVMe SSD. The buffer pool manages data in 8 KiB pages with reference counting (pin/unpin) to prevent eviction of pages that are actively being read.
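The pin/unpin mechanics can be illustrated with a minimal sketch. This is not AstraeaDB's actual implementation; all names (`BufferPool`, `pin`, `unpin`) are assumptions chosen for illustration. The key invariant is that a page with a nonzero pin count is never an eviction candidate:

```rust
use std::collections::{HashMap, VecDeque};

const PAGE_SIZE: usize = 8 * 1024; // 8 KiB pages, as in the warm tier

/// Toy LRU buffer pool with pin counting (illustrative names only).
struct BufferPool {
    capacity: usize,
    pages: HashMap<u64, (Vec<u8>, u32)>, // page_id -> (data, pin_count)
    lru: VecDeque<u64>,                  // front = least recently used
}

impl BufferPool {
    fn new(capacity: usize) -> Self {
        Self { capacity, pages: HashMap::new(), lru: VecDeque::new() }
    }

    /// Fetch a page (here: zero-filled on a miss) and pin it.
    fn pin(&mut self, page_id: u64) -> &[u8] {
        if !self.pages.contains_key(&page_id) {
            self.evict_if_full();
            self.pages.insert(page_id, (vec![0u8; PAGE_SIZE], 0));
        }
        // Move to the back of the LRU list (most recently used).
        self.lru.retain(|&id| id != page_id);
        self.lru.push_back(page_id);
        let entry = self.pages.get_mut(&page_id).unwrap();
        entry.1 += 1; // pinned pages cannot be evicted
        &entry.0
    }

    fn unpin(&mut self, page_id: u64) {
        if let Some(entry) = self.pages.get_mut(&page_id) {
            entry.1 = entry.1.saturating_sub(1);
        }
    }

    fn contains(&self, page_id: u64) -> bool {
        self.pages.contains_key(&page_id)
    }

    /// Evict the least recently used *unpinned* page.
    fn evict_if_full(&mut self) {
        if self.pages.len() < self.capacity {
            return;
        }
        if let Some(pos) = self.lru.iter().position(|id| self.pages[id].1 == 0) {
            let victim = self.lru.remove(pos).unwrap();
            self.pages.remove(&victim);
        }
    }
}
```

Note the eviction scan skips pinned pages entirely: a reader holding a pin is guaranteed its page stays resident until it calls `unpin`.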
AstraeaDB supports two I/O backends for the warm tier:
| Backend | Platform | Characteristics |
|---|---|---|
| `memmap2` | All platforms (Linux, macOS, WSL2) | Memory-mapped files. The OS kernel manages page caching. Simple, reliable, good default. |
| `io_uring` | Linux (kernel 5.10+) | Asynchronous I/O with zero system call overhead for batched operations. Higher throughput under heavy concurrent load. |
Tier 3: Hot Memory (Pointer Swizzling)
The most frequently accessed subgraphs are promoted to the hot tier, where AstraeaDB performs pointer swizzling: 64-bit disk page IDs are replaced with direct memory pointers. This eliminates all indirection—traversing from one node to its neighbor is a single pointer dereference, achieving nanosecond-level traversal for active working sets.
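The idea can be sketched with a reference type that is either an on-disk address or an in-memory location. This is a conceptual sketch, not AstraeaDB's data layout; for memory safety the "direct pointer" is modeled as an index into an in-memory node table:

```rust
/// A neighbor reference is either an on-disk address (requires a buffer-pool
/// lookup) or, once the subgraph is promoted to the hot tier, a direct
/// in-memory location. Names here are illustrative, not AstraeaDB's.
enum NodeRef {
    Disk { page_id: u64, slot: u16 }, // cold/warm: indirection through the pool
    Hot(usize),                       // hot: one dereference, no lookup
}

/// "Swizzle" a reference in place: replace the disk address with the
/// in-memory location returned by the resolver.
fn swizzle(r: &mut NodeRef, resolve: impl Fn(u64, u16) -> usize) {
    if let NodeRef::Disk { page_id, slot } = *r {
        *r = NodeRef::Hot(resolve(page_id, slot));
    }
}
```

After swizzling, following an edge in the hot tier skips the page-table lookup entirely, which is where the nanosecond-level traversal comes from. The reverse operation (unswizzling on eviction) must restore the disk address before the page leaves memory.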
MVCC and the Write-Ahead Log
AstraeaDB uses MVCC (Multi-Version Concurrency Control) to allow concurrent readers and writers without locking. Each transaction sees a consistent snapshot of the graph, even while other transactions are writing.
Durability is guaranteed by the Write-Ahead Log (WAL). Every mutation is written to the WAL before being applied to the data pages. If the server crashes, it replays the WAL on startup to recover any committed transactions that had not yet been flushed to the data files.
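The log-before-apply rule and the recovery replay can be sketched as follows. This is illustrative only (an in-memory `Vec` stands in for the fsync'd log file, a `HashMap` for the data pages); the record format is an assumption, not AstraeaDB's WAL format:

```rust
use std::collections::{HashMap, HashSet};

/// Toy WAL records: mutations plus an explicit commit marker.
enum LogRecord {
    Put { txn: u64, key: u64, value: i64 },
    Commit { txn: u64 },
}

/// Crash recovery: rebuild the data pages by replaying the log, applying
/// only mutations that belong to a *committed* transaction.
fn replay(wal: &[LogRecord]) -> HashMap<u64, i64> {
    // Pass 1: collect committed transaction IDs.
    let committed: HashSet<u64> = wal
        .iter()
        .filter_map(|r| match r {
            LogRecord::Commit { txn } => Some(*txn),
            _ => None,
        })
        .collect();
    // Pass 2: apply only their mutations, in log order.
    let mut data = HashMap::new();
    for rec in wal {
        if let LogRecord::Put { txn, key, value } = rec {
            if committed.contains(txn) {
                data.insert(*key, *value);
            }
        }
    }
    data
}
```

During normal operation, every `Put` is appended (and made durable) before the corresponding data page is touched; on restart, `replay` reconstructs exactly the committed state, discarding any transaction whose commit record never reached the log.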
```toml
# astraeadb.toml - Storage Configuration
[storage]
data_dir = "/var/lib/astraeadb/data"
wal_dir  = "/var/lib/astraeadb/wal"

[storage.tiering]
hot_cache_mb  = 4096       # RAM budget for pointer-swizzled subgraphs
warm_cache_mb = 16384      # NVMe buffer pool size
cold_backend  = "s3"       # "local", "s3", or "gcs"
io_backend    = "io_uring" # "memmap2" or "io_uring" (Linux only)

[storage.s3]
bucket = "my-astraeadb-data"
region = "us-east-1"
prefix = "production/v1"
```
14.2 Connection Management
Under heavy load, connection management becomes critical. AstraeaDB provides several mechanisms to prevent resource exhaustion and maintain responsiveness.
Configuration Parameters
| Parameter | Default | Description |
|---|---|---|
| `max_connections` | 1024 | Maximum number of concurrent TCP connections. New connections beyond this limit are rejected with a clear error message. |
| `max_concurrent_requests` | 256 | Maximum number of requests being processed simultaneously. Provides backpressure to prevent the server from being overwhelmed by a burst of queries. |
| `idle_timeout_secs` | 300 (5 min) | Connections that send no requests for this duration are automatically closed, freeing resources for active clients. |
| `request_timeout_secs` | 30 | Maximum time a single request can run before being cancelled. Prevents runaway queries from consuming resources indefinitely. |
```toml
# astraeadb.toml - Connection Management
[server]
max_connections         = 2048
max_concurrent_requests = 512
idle_timeout_secs       = 300
request_timeout_secs    = 60
```
ConnectionGuard (RAII Pattern)
Internally, AstraeaDB uses a Rust RAII (Resource Acquisition Is Initialization) pattern called ConnectionGuard. When a connection is accepted, a guard object is created that increments the active connection counter. When the guard is dropped (whether due to normal disconnection, timeout, or error), the counter is automatically decremented and all associated resources are cleaned up. This makes resource leaks structurally impossible—a guarantee provided by Rust's ownership model.
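A minimal sketch of the pattern is shown below. The names and the atomic-counter scheme are illustrative, not AstraeaDB's actual internals; the point is that the decrement lives in `Drop`, so every exit path (normal close, timeout, error, even a panic unwind) releases the connection slot:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Toy RAII connection guard: increments the active counter on accept,
/// decrements it in Drop. Names are illustrative only.
struct ConnectionGuard {
    active: Arc<AtomicUsize>,
}

impl ConnectionGuard {
    fn accept(active: Arc<AtomicUsize>, max: usize) -> Option<Self> {
        // Optimistic increment; back out if the limit is exceeded.
        if active.fetch_add(1, Ordering::SeqCst) >= max {
            active.fetch_sub(1, Ordering::SeqCst);
            return None; // connection rejected
        }
        Some(Self { active })
    }
}

impl Drop for ConnectionGuard {
    fn drop(&mut self) {
        // Runs on every exit path, so the slot can never leak.
        self.active.fetch_sub(1, Ordering::SeqCst);
    }
}
```

Because the compiler inserts the `drop` call automatically wherever the guard goes out of scope, forgetting to release a connection is not merely a bug that tests must catch; it is code that cannot be written.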
For applications with many short-lived connections, increase `max_connections` and decrease `idle_timeout_secs`. For applications with fewer long-lived connections running complex queries (e.g., analytics dashboards), increase `request_timeout_secs` and `max_concurrent_requests`.
14.3 Monitoring with Prometheus
AstraeaDB exposes a Prometheus-compatible metrics endpoint that allows you to monitor the health and performance of your database in real time.
Key Metrics
| Metric | Type | Description |
|---|---|---|
| `astraea_requests_total` | Counter | Total number of requests processed, labeled by operation type (`create_node`, `query`, `bfs`, etc.) |
| `astraea_errors_total` | Counter | Total number of errors, labeled by error type (`auth_failed`, `timeout`, `internal`) |
| `astraea_request_duration_seconds` | Histogram | Request latency distribution, used to compute percentiles (p50, p90, p99) with `histogram_quantile`. Essential for SLA monitoring. |
| `astraea_connections_active` | Gauge | Current number of active connections. Alert if this approaches `max_connections`. |
| `astraea_buffer_pool_hit_ratio` | Gauge | Percentage of page reads served from the warm buffer pool (vs. cold storage). Should stay above 95% for good performance. |
| `astraea_hot_tier_nodes` | Gauge | Number of nodes currently in the hot tier (pointer-swizzled in RAM). |
Enabling the Metrics Endpoint
```toml
# astraeadb.toml - Prometheus Metrics
[monitoring]
prometheus_enabled = true
metrics_port       = 9090      # separate port for metrics scraping
health_check_path  = "/health" # returns 200 OK if the server is healthy
```
Scraping with Prometheus
Add AstraeaDB as a target in your Prometheus configuration:
```yaml
# prometheus.yml
scrape_configs:
  - job_name: "astraeadb"
    static_configs:
      - targets: ["astraeadb-host:9090"]
    scrape_interval: 15s
```
Visualizing in Grafana
Once Prometheus is scraping AstraeaDB, connect Grafana to your Prometheus data source and build dashboards. Recommended panels include:
- Request rate — `rate(astraea_requests_total[5m])` by operation type
- Error rate — `rate(astraea_errors_total[5m])` as a percentage of total requests
- Latency percentiles — `histogram_quantile(0.99, rate(astraea_request_duration_seconds_bucket[5m]))`
- Connection utilization — `astraea_connections_active` relative to the configured `max_connections`
- Buffer pool hit ratio — `astraea_buffer_pool_hit_ratio` (alert if below 90%)
The `/health` endpoint returns a 200 OK response when the server is accepting connections and the storage engine is operational. Use this as the health check target for load balancers (ALB, Nginx, HAProxy) to automatically remove unhealthy instances from the pool.
14.4 GPU Acceleration
Many graph algorithms—PageRank, BFS, single-source shortest path—are fundamentally matrix operations disguised as traversals. The adjacency matrix of a graph can be represented as a sparse matrix, and operations like "visit all neighbors" become sparse matrix-vector multiplications (SpMV). This makes them natural candidates for GPU acceleration.
CSR Matrix Representation
AstraeaDB represents the graph's adjacency structure in CSR (Compressed Sparse Row) format for algorithmic workloads. CSR stores only the non-zero entries of the adjacency matrix, using three arrays:
- `row_ptr` — for each node, the starting index into the `col_idx` array
- `col_idx` — the neighbor node IDs, grouped by source node
- `values` — edge weights (optional, for weighted algorithms)
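The construction and the core SpMV primitive can be sketched in a few lines. This is a simplified, unweighted sketch (no `values` array) with illustrative names; it builds CSR from an edge list via a counting pass and a prefix sum, then performs one "push each node's value to its neighbors" step:

```rust
/// Toy CSR adjacency structure (unweighted; names are illustrative).
struct Csr {
    row_ptr: Vec<usize>, // for each node, start index into col_idx
    col_idx: Vec<u32>,   // neighbor IDs, grouped by source node
}

fn build_csr(num_nodes: usize, edges: &[(u32, u32)]) -> Csr {
    // Pass 1: count each node's out-degree.
    let mut row_ptr = vec![0usize; num_nodes + 1];
    for &(src, _) in edges {
        row_ptr[src as usize + 1] += 1;
    }
    // Prefix sum turns counts into row offsets.
    for i in 0..num_nodes {
        row_ptr[i + 1] += row_ptr[i];
    }
    // Pass 2: scatter neighbor IDs into their rows.
    let mut col_idx = vec![0u32; edges.len()];
    let mut next = row_ptr.clone();
    for &(src, dst) in edges {
        col_idx[next[src as usize]] = dst;
        next[src as usize] += 1;
    }
    Csr { row_ptr, col_idx }
}

/// One SpMV step over the 0/1 adjacency matrix: each node pushes its
/// value to all of its out-neighbors. "Visit all neighbors" is exactly this.
fn spmv(csr: &Csr, x: &[f64]) -> Vec<f64> {
    let mut y = vec![0.0; x.len()];
    for src in 0..x.len() {
        for &dst in &csr.col_idx[csr.row_ptr[src]..csr.row_ptr[src + 1]] {
            y[dst as usize] += x[src];
        }
    }
    y
}
```

Iterating `spmv` over a frontier vector yields level-synchronous BFS; iterating it over a rank vector (with damping and normalization) yields PageRank.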
Compute Backends
| Backend | Status | Description |
|---|---|---|
| `CpuBackend` | Production-ready | SIMD-optimized sparse matrix-vector multiplication. Uses CPU vector instructions (SSE4.2, AVX2) for parallel computation. The default backend on all platforms. |
| `GpuBackend` | Framework ready | Architecture in place for CUDA/cuGraph integration. Transfers the CSR matrix to GPU memory and executes kernels for massive parallelism. Targeted at analytical workloads of a million nodes or more. |
Supported Algorithms
| Algorithm | Description | Complexity |
|---|---|---|
| PageRank | Iterative importance ranking. Converges via repeated SpMV until scores stabilize (tolerance threshold). | O(V + E) per iteration |
| BFS | Breadth-first search from a source node. Level-synchronous: each "frontier" is a SpMV operation. | O(V + E) |
| SSSP (Bellman-Ford) | Single-source shortest path supporting negative edge weights. Iterates V-1 times over all edges. | O(V * E) |
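To make the "repeated SpMV until scores stabilize" description concrete, here is a self-contained PageRank sketch over adjacency lists. It is illustrative, not AstraeaDB's implementation; the damping factor 0.85 is the conventional choice, and dangling nodes redistribute their rank uniformly:

```rust
/// Toy PageRank: iterate rank propagation (one SpMV per iteration) until
/// the total change falls below the tolerance threshold.
fn pagerank(neighbors: &[Vec<usize>], damping: f64, tol: f64) -> Vec<f64> {
    let n = neighbors.len();
    let mut rank = vec![1.0 / n as f64; n];
    loop {
        // Teleport term: (1 - d) / N for every node.
        let mut next = vec![(1.0 - damping) / n as f64; n];
        for (src, outs) in neighbors.iter().enumerate() {
            if outs.is_empty() {
                // Dangling node: spread its rank uniformly.
                for v in next.iter_mut() {
                    *v += damping * rank[src] / n as f64;
                }
            } else {
                // One SpMV step: push rank shares along out-edges.
                let share = damping * rank[src] / outs.len() as f64;
                for &dst in outs {
                    next[dst] += share;
                }
            }
        }
        // Converged when scores stabilize (L1 change below tolerance).
        let delta: f64 = rank.iter().zip(&next).map(|(a, b)| (a - b).abs()).sum();
        rank = next;
        if delta < tol {
            return rank;
        }
    }
}
```

Each iteration is O(V + E), matching the complexity row in the table above; the inner loops are exactly the sparse matrix-vector product that the CPU and GPU backends accelerate.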
14.5 Sharding and Distributed Processing
When your graph outgrows a single machine—either in storage capacity or query throughput—AstraeaDB supports partitioning the graph across multiple shards.
Partitioning Strategies
| Strategy | How It Works | Best For |
|---|---|---|
| Hash Partitioning | Applies a consistent hash function to node IDs to distribute them evenly across shards. Adding or removing a shard only requires moving a fraction of the data (consistent hashing minimizes reshuffling). | Uniform access patterns, general-purpose workloads |
| Range Partitioning | Assigns contiguous ranges of node IDs to specific shards. Nodes with adjacent IDs (often created together) are co-located. | Temporal data ingestion, locality-sensitive workloads |
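The "only a fraction of the data moves" property of consistent hashing can be demonstrated with a toy hash ring. The hash function and replica count here are illustrative assumptions, not AstraeaDB's actual scheme; each shard owns several virtual points on a ring, and a node ID is routed to the first shard point at or after its own hash:

```rust
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Toy consistent-hash ring (illustrative, not AstraeaDB's hash scheme).
struct Ring {
    points: BTreeMap<u64, u32>, // ring position -> shard ID
}

fn hash_of(v: impl Hash) -> u64 {
    let mut h = std::collections::hash_map::DefaultHasher::new();
    v.hash(&mut h);
    h.finish()
}

impl Ring {
    fn new(shards: &[u32], replicas: u32) -> Self {
        let mut points = BTreeMap::new();
        for &s in shards {
            for r in 0..replicas {
                // Virtual nodes smooth out the distribution across shards.
                points.insert(hash_of((s, r)), s);
            }
        }
        Ring { points }
    }

    fn shard_for(&self, node_id: u64) -> u32 {
        let h = hash_of(node_id);
        // First point clockwise from h, wrapping around to the ring's start.
        self.points
            .range(h..)
            .next()
            .or_else(|| self.points.iter().next())
            .map(|(_, &s)| s)
            .unwrap()
    }
}
```

Removing a shard deletes only that shard's points from the ring, so every key that was routed to a surviving shard keeps its assignment; only the removed shard's keys are redistributed. With naive `hash % shard_count` routing, nearly every key would move instead.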
Architecture Components
ShardMap
The ShardMap is a routing table that maps node ID ranges to shard locations. When a query arrives, the query engine consults the ShardMap to determine which shard(s) hold the relevant data, then routes the request accordingly.
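For range partitioning, the lookup amounts to "find the range whose start is the largest one not exceeding the node ID." A minimal sketch (field and type names are illustrative, not AstraeaDB's internals) using an ordered map:

```rust
use std::collections::BTreeMap;

/// Toy ShardMap-style routing table for range partitioning.
struct ShardMap {
    // Start of each node-ID range -> (shard host, port).
    ranges: BTreeMap<u64, (String, u16)>,
}

impl ShardMap {
    /// Route a node ID to the shard owning the range containing it:
    /// the entry with the greatest range start <= node_id.
    fn route(&self, node_id: u64) -> Option<&(String, u16)> {
        self.ranges.range(..=node_id).next_back().map(|(_, loc)| loc)
    }
}
```

A fan-out query (e.g., a BFS crossing range boundaries) consults this table once per frontier node and batches the requests per shard before sending them.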
ClusterCoordinator
The ClusterCoordinator manages shard assignment, health monitoring, and rebalancing. It tracks which shards are online, handles failover when a shard becomes unavailable, and coordinates data migration when shards are added or removed.
```toml
# astraeadb.toml - Sharding Configuration
[cluster]
enabled            = true
partition_strategy = "hash" # "hash" or "range"
shard_count        = 4

# Shard assignments (for range partitioning)
[cluster.shards]
0 = { host = "shard-0.astraea.internal", port = 7687 }
1 = { host = "shard-1.astraea.internal", port = 7687 }
2 = { host = "shard-2.astraea.internal", port = 7687 }
3 = { host = "shard-3.astraea.internal", port = 7687 }
```
The current release provides the ClusterCoordinator and ShardMap for single-machine multi-shard configurations. Distributed coordination across multiple physical machines (using a consensus protocol such as Raft) is on the roadmap for a future release. For production deployments requiring horizontal scaling today, use the local coordinator with separate AstraeaDB instances behind a routing proxy.
Performance Summary
The following table summarizes AstraeaDB's performance characteristics across different deployment configurations:
| Configuration | Graph Size | Traversal Latency | Analytical Throughput |
|---|---|---|---|
| Single node, hot tier | Up to ~100M nodes | Nanoseconds (pointer swizzling) | CPU-bound SpMV |
| Single node, warm tier | Up to ~1B nodes | Microseconds (NVMe page reads) | I/O-bound, benefits from io_uring |
| Single node + GPU | Up to ~1B nodes | Microseconds (traversal), GPU-accelerated analytics | 10-100x SpMV speedup for large matrices |
| Multi-shard cluster | Multi-billion nodes | Milliseconds (network hop between shards) | Linear scale-out with shard count |