
Performance Tuning & Traffic Shaping

Hive Router is built for performance right out of the box, but every production setup is different. Tuning the router's traffic management settings to match your specific workload and subgraph capabilities can unlock significantly better throughput and reliability.

This guide covers the traffic_shaping configuration in detail, explains the trade-offs of each setting, and gives you a practical approach to benchmarking and optimizing your deployment.

For a quick reference of the configuration syntax, see the traffic_shaping configuration reference.

Understanding Connection Limits

The most important setting for both performance and stability is max_connections_per_host. This controls how many concurrent HTTP connections the router will open to each subgraph host (like products.api.example.com).

  • Default Value: 100
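
A minimal sketch of raising the limit, assuming max_connections_per_host lives under the global traffic_shaping.all scope used by the circuit breaker examples later in this guide (check the traffic_shaping configuration reference for the exact placement):

router.config.yaml
traffic_shaping:
  all:
    max_connections_per_host: 200 # default is 100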

Finding the Sweet Spot

Getting this right is about balancing maximum throughput with protecting your subgraphs from overload.

Too low = bottleneck:

  • Even if your subgraphs have plenty of capacity, a low connection limit will queue requests inside the router, adding latency
  • Your subgraph services might sit idle while the router artificially throttles traffic
  • You're leaving performance on the table

Too high = overload risk:

  • During traffic spikes, the router might flood subgraphs with more connections than they can handle
  • This can overwhelm connection pools, CPU, or memory on your subgraphs
  • Can trigger cascading failures or "thundering herd" problems where sudden traffic surges crash downstream services
  • More open connections may lead to ephemeral port exhaustion

How to Tune It

Start with the default and adjust based on your observations:

  1. Monitor subgraph performance under normal and peak load
  2. Watch for connection pool exhaustion in your subgraph logs
  3. Look for queuing in router metrics - if requests are waiting for connections, you might need to increase the limit
  4. Load test gradually - increase the limit incrementally and measure the impact (see the sketch below)
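
A hedged example of an incremental adjustment, assuming max_connections_per_host can also be overridden per subgraph under traffic_shaping.subgraphs.<name>, following the same merge pattern documented for the circuit breaker below (the subgraph name and values are illustrative):

router.config.yaml
traffic_shaping:
  all:
    max_connections_per_host: 150 # raised from the default 100 after load testing
  subgraphs:
    products:
      max_connections_per_host: 300 # higher-capacity subgraph, verified under peak load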

Managing Idle Connections

The pool_idle_timeout setting controls how long unused connections stay open in the router's connection pool before being closed.

  • Default Value: 50s

It takes a duration string (like 30s for 30 seconds, or 1m for 1 minute). This setting affects how aggressively the router reuses existing connections versus closing them to free up resources.
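
For example, assuming the same traffic_shaping.all placement as the other per-host settings in this guide:

router.config.yaml
traffic_shaping:
  all:
    pool_idle_timeout: 30s # close connections idle longer than 30 seconds (default: 50s)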

The Connection Reuse Trade-off

Too short = latency overhead:

  • Connections get closed quickly, so new requests have to establish fresh TCP/TLS connections
  • Each new connection adds handshake latency (especially noticeable with TLS)
  • Your router and subgraphs spend more CPU on connection setup

Too long = resource waste:

  • Idle connections consume memory and file descriptors on both the router and subgraph servers
  • Network devices (load balancers, firewalls) might have shorter timeouts and silently drop connections, leading to "zombie" connections that fail when used

Tuning Guidelines

  • High-traffic APIs: Use longer timeouts (60-300 seconds) since connections are likely to be reused quickly
  • Low-traffic APIs: Use shorter timeouts (10-30 seconds) to free up resources
  • Check your infrastructure: Make sure this timeout is shorter than any load balancer or firewall timeouts in your stack (see the sketch after this list)
  • Monitor connection errors: If you see connection failures, your timeout might be longer than network device timeouts
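
For instance, if a load balancer between the router and a subgraph closes idle connections after 60 seconds (a common default), keeping the router's timeout below that avoids zombie connections. A sketch, with the 60s figure assumed for illustration:

router.config.yaml
traffic_shaping:
  all:
    pool_idle_timeout: 45s # kept below the load balancer's 60s idle timeout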

Request Deduplication

The router supports two complementary levels of in-flight request deduplication that can be enabled independently: inbound and outbound.

Inbound Deduplication

Inbound deduplication (traffic_shaping.router.dedupe) operates at the entry point of the router. When multiple clients send identical GraphQL query operations simultaneously, the router executes the operation only once and shares the result with all waiting clients — subgraphs receive just a single request regardless of how many clients are waiting.

  • Default: false (opt-in)

router.config.yaml
traffic_shaping:
  router:
    dedupe:
      enabled: true

Deduplication key

Two requests are considered identical when all of the following match:

  • HTTP method and path
  • Normalized operation text (whitespace/comment differences are ignored)
  • GraphQL variables
  • GraphQL extensions
  • Schema checksum (prevents sharing across schema reload transitions)
  • Selected request headers (controlled by the headers policy below)

Header policy

By default, all headers are included in the fingerprint, so requests with different Authorization or Cookie headers are not deduplicated with each other. You can narrow this down:

router.config.yaml
traffic_shaping:
  router:
    dedupe:
      enabled: true
      headers: all # default — include every header

router.config.yaml
traffic_shaping:
  router:
    dedupe:
      enabled: true
      headers: none # ignore all headers (requests from any user may be deduplicated)

router.config.yaml
traffic_shaping:
  router:
    dedupe:
      enabled: true
      headers:
        include: # include only these headers in the fingerprint
          - authorization
          - cookie

With the include list above, responses are shared only between clients that send identical authorization and cookie values, which keeps per-user responses isolated while still deduplicating within each session.

When to enable it:

  • Many clients frequently issue the same popular queries (dashboards, landing pages, product listings)
  • You want to reduce overall query execution pressure on your subgraphs under concurrent load

When you might leave it disabled:

  • All queries are highly personalized and rarely identical
  • You're debugging and want every request to execute independently

Outbound Deduplication

Outbound deduplication (dedupe_enabled) deduplicates the requests the router makes to individual subgraphs. When the router would send multiple identical requests to the same subgraph simultaneously, it sends only one and fans the response back to all waiting parallel fetches.

  • Default Value: true

This is almost always beneficial to keep enabled. It dramatically reduces load on subgraphs when multiple clients request the same data at once (think of popular content or dashboard queries that many users run simultaneously).
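
A sketch of the relevant key, assuming dedupe_enabled follows the same traffic_shaping.all / traffic_shaping.subgraphs.<name> layout as the circuit breaker (the recommendations subgraph is hypothetical):

router.config.yaml
traffic_shaping:
  all:
    dedupe_enabled: true # default: deduplicate identical in-flight subgraph requests
  subgraphs:
    recommendations:
      dedupe_enabled: false # hypothetical subgraph whose responses are always unique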

When you might disable it:

  • Your queries are always unique (heavily personalized)
  • You're debugging and want to see every request
  • You have very low traffic where deduplication doesn't help

Circuit Breaker

The circuit breaker pattern prevents the router from continuously sending requests to a subgraph that is failing or unresponsive. When the subgraph's error rate rises above a configurable threshold, the circuit "opens" and subsequent requests are immediately rejected with a SUBGRAPH_CIRCUIT_BREAKER_REJECTED error — instead of waiting for the subgraph to time out. This frees up router resources and gives the subgraph time to recover.

  • Default: circuit_breaker: null (disabled). Any per-field defaults listed below apply only when a circuit_breaker object is provided.

How It Works

The circuit breaker has three states:

State      Behavior
Closed     Normal operation — all requests pass through.
Open       Requests are immediately rejected. No traffic reaches the subgraph.
Half-open  After reset_timeout, one probe request is allowed. Success closes the circuit; failure reopens it.

The circuit transitions from closed → open once both conditions are met:

  1. At least volume_threshold requests have been observed.
  2. The fraction of those requests that errored is ≥ error_threshold.
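
For example, with the defaults below (volume_threshold: 5, error_threshold: 50%), the circuit stays closed for the first 4 requests regardless of outcome; once 5 requests have been observed, 3 failures out of 5 (a 60% error rate, which is ≥ 50%) open the circuit.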

Configuration

router.config.yaml
traffic_shaping:
  all:
    circuit_breaker:
      enabled: true
      error_threshold: 50% # open when ≥ 50% of requests fail
      volume_threshold: 5 # evaluate after at least 5 requests
      reset_timeout: 30s # retry after 30s

Option            Type     Default  Description
enabled           boolean  false    Enable or disable the circuit breaker.
error_threshold   string   50%      Error rate percentage that triggers the breaker (e.g. 50%).
volume_threshold  integer  5        Minimum request count before the error rate is evaluated.
reset_timeout     string   30s      Duration the circuit stays open before allowing a probe request.

Global vs Per-Subgraph Configuration

Circuit breaker settings can be applied globally to all subgraphs under traffic_shaping.all, and selectively overridden for individual subgraphs under traffic_shaping.subgraphs.<name>. The per-subgraph configuration is merged with the global configuration — you only need to specify the fields you want to override.

Global configuration applied to every subgraph:

router.config.yaml
traffic_shaping:
  all:
    circuit_breaker:
      enabled: true
      error_threshold: 50%
      volume_threshold: 5
      reset_timeout: 30s

Per-subgraph configuration with the global breaker disabled:

router.config.yaml
traffic_shaping:
  all:
    circuit_breaker:
      enabled: false # disabled globally
  subgraphs:
    accounts:
      circuit_breaker:
        enabled: true
        error_threshold: 60%
        volume_threshold: 3
        reset_timeout: 10s
    products:
      circuit_breaker:
        enabled: true
        error_threshold: 70%
        volume_threshold: 4
        reset_timeout: 15s

Partial override (fields not specified per subgraph inherit from the global configuration):

router.config.yaml
traffic_shaping:
  all:
    circuit_breaker:
      enabled: true
      error_threshold: 50%
      volume_threshold: 10
      reset_timeout: 30s
  subgraphs:
    accounts:
      circuit_breaker:
        enabled: true
        volume_threshold: 3 # override only volume_threshold; other settings inherit from global
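
In the last example, merging yields the following effective settings for the accounts subgraph:

circuit_breaker:
  enabled: true
  error_threshold: 50% # inherited from all
  volume_threshold: 3 # overridden per subgraph
  reset_timeout: 30s # inherited from all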

Metrics

When the circuit breaker rejects a request, the router increments the hive.router.circuit_breaker.rejected_requests_total counter. The counter carries a subgraph.name label so you can track rejection rates per subgraph in your metrics backend.

Tuning Guidelines

  • error_threshold — Lower values make the breaker more sensitive. Start at the default (50%) and tighten it only if you need to protect fragile subgraphs more aggressively.
  • volume_threshold — Keep this high enough to avoid false positives during low-traffic periods. A value of 5-10 is reasonable for most deployments.
  • reset_timeout — Should be long enough to give the subgraph time to recover, but short enough that you notice quickly when it does. 30s-60s is a sensible starting point.

When to enable the circuit breaker:

  • You have subgraphs that are occasionally slow or unavailable, and you want to fail fast rather than accumulate timeout latency
  • You want to give a struggling subgraph breathing room to recover instead of being overwhelmed by retried requests

When you might leave it disabled:

  • All subgraphs are highly reliable and well within their resource limits
  • You prefer to surface subgraph errors directly to clients rather than short-circuiting them