Version: 0.0.3

Prometheus Metrics

aerospike-py collects operation-level metrics in Rust and exposes them in Prometheus text format. Metric names follow OpenTelemetry DB Client Semantic Conventions.

Quick Start

import aerospike_py

# Get metrics as string
text: str = aerospike_py.get_metrics()

# Or start a built-in HTTP server
aerospike_py.start_metrics_server(port=9464)
# Prometheus scrapes http://localhost:9464/metrics

# Stop when done
aerospike_py.stop_metrics_server()
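
Because `get_metrics()` returns plain Prometheus text, the result can be inspected without a running Prometheus server, for example in a test. The following is a minimal sketch, assuming sample lines carry no trailing timestamp. The `SAMPLE` text is hand-written for illustration (it is not captured output), and `operation_counts` is a hypothetical helper, not part of aerospike-py.

```python
# Hand-written sample in Prometheus text format (illustrative only).
SAMPLE = """\
# TYPE db_client_operation_duration_seconds histogram
db_client_operation_duration_seconds_sum{db_operation_name="get",error_type=""} 0.006
db_client_operation_duration_seconds_count{db_operation_name="get",error_type=""} 3
"""


def operation_counts(
    exposition: str,
    metric: str = "db_client_operation_duration_seconds",
) -> dict[str, float]:
    """Map each `<metric>_count` series (name plus labels) to its value."""
    prefix = metric + "_count"
    counts: dict[str, float] = {}
    for line in exposition.splitlines():
        line = line.strip()
        # Skip blank lines, HELP/TYPE comments, and other series.
        if not line or line.startswith("#") or not line.startswith(prefix):
            continue
        # Assumption: no timestamp, so the value follows the last space.
        series, _, value = line.rpartition(" ")
        counts[series] = float(value)
    return counts
```

In an application, the string returned by `aerospike_py.get_metrics()` would be passed in place of `SAMPLE`.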

`db_client_operation_duration_seconds`

A histogram tracking the duration of every data operation.

Labels:

Label	Examples
`db_system_name`	`aerospike`
`db_namespace`	`test`, `production`
`db_collection_name`	`users`, `sessions`
`db_operation_name`	`get`, `put`, `delete`, `query`
`error_type`	`""` (success), `Timeout`, `KeyNotFoundError`

Buckets: 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1.0, 5.0, 10.0 seconds
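
Prometheus histogram buckets are cumulative: each bucket counts every observation less than or equal to its upper bound (`le`), and an implicit `+Inf` bucket counts all observations. The sketch below illustrates that bookkeeping in plain Python with the bucket bounds listed above. It is not the Rust implementation inside aerospike-py, and `cumulative_bucket_counts` is a hypothetical name.

```python
import math

# Bucket upper bounds in seconds, as listed above.
BUCKETS = [0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1.0, 5.0, 10.0]


def cumulative_bucket_counts(durations: list[float]) -> dict[float, int]:
    """Return {upper_bound: count of durations <= upper_bound}, plus +Inf."""
    counts = {bound: sum(d <= bound for d in durations) for bound in BUCKETS}
    counts[math.inf] = len(durations)  # the +Inf bucket holds every observation
    return counts


# Five operations: one sub-millisecond, two at 3 ms, one at 200 ms, one at 12 s.
example = cumulative_bucket_counts([0.0004, 0.003, 0.003, 0.2, 12.0])
```

With these bounds, an operation slower than 10 seconds is only visible in the `+Inf` bucket, so quantiles in that range cannot be resolved more precisely.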

Instrumented operations: put, get, select, exists, remove, touch, append, prepend, increment, operate, batch_read, batch_operate, batch_remove, query

Tip

For exists(), a KeyNotFoundError is recorded as a success, since "not found" is a normal outcome of an existence check.

Framework Integration

FastAPI

from fastapi import FastAPI, Response
from prometheus_client import generate_latest, REGISTRY
import aerospike_py

app = FastAPI()

@app.get("/metrics")
def metrics():
    # Serve the Python process metrics and the aerospike-py metrics from one endpoint.
    python_metrics = generate_latest(REGISTRY).decode("utf-8")
    aerospike_metrics = aerospike_py.get_metrics()
    return Response(
        python_metrics + "\n" + aerospike_metrics,
        media_type="text/plain; version=0.0.4",
    )
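
When two expositions are concatenated as above, the combined body should still end with a newline, and empty parts are best skipped. A small sketch of a helper that normalizes this; `join_expositions` is a hypothetical name, and the helper assumes the two sources do not define the same metric name.

```python
def join_expositions(*parts: str) -> str:
    """Concatenate Prometheus text expositions, one trailing newline each."""
    # Drop empty parts and strip surrounding newlines before re-adding one.
    cleaned = [part.strip("\n") for part in parts if part.strip()]
    return "".join(part + "\n" for part in cleaned)
```

In the endpoint above, the response body could then be built as `join_expositions(python_metrics, aerospike_metrics)`.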

Django

# myproject/apps.py
from django.apps import AppConfig
import aerospike_py

class MyAppConfig(AppConfig):
    name = "myapp"

    def ready(self):
        aerospike_py.start_metrics_server(port=9464)

Prometheus Config

scrape_configs:
  - job_name: "aerospike-py"
    scrape_interval: 15s
    static_configs:
      - targets: ["localhost:9464"]

PromQL Examples

# Average latency (5m)
rate(db_client_operation_duration_seconds_sum[5m])
/ rate(db_client_operation_duration_seconds_count[5m])

# P99 latency (one result per label combination)
histogram_quantile(0.99, rate(db_client_operation_duration_seconds_bucket[5m]))

# Error rate by type
sum by (error_type) (rate(db_client_operation_duration_seconds_count{error_type!=""}[5m]))

# Ops/sec by namespace and operation
sum by (db_namespace, db_operation_name) (rate(db_client_operation_duration_seconds_count[1m]))
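
To make the P99 query easier to interpret, here is a simplified Python sketch of how a quantile is estimated from cumulative buckets: find the bucket that contains the target rank, then interpolate linearly inside it. It mirrors the general idea behind `histogram_quantile` for classic histograms but omits many of Prometheus' edge cases. The function name and the `EXAMPLE` counts are illustrative, not measured data.

```python
import math


def histogram_quantile(q: float, buckets: list[tuple[float, float]]) -> float:
    """Estimate the q-quantile from (upper_bound, cumulative_count) pairs.

    `buckets` must be sorted by upper bound and end with the +Inf bucket.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # Rank falls in +Inf: report the highest finite bound.
                return buckets[-2][0]
            # Linear interpolation inside the bucket that contains the rank.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    raise ValueError("buckets must end with the +Inf bucket")


# Illustrative cumulative counts for 100 operations.
EXAMPLE = [(0.001, 10), (0.005, 60), (0.01, 90), (0.05, 99), (0.1, 100), (math.inf, 100)]
median = histogram_quantile(0.5, EXAMPLE)  # falls in the (0.001, 0.005] bucket
p99 = histogram_quantile(0.99, EXAMPLE)    # falls in the (0.01, 0.05] bucket
```

Because of the interpolation, a reported quantile is only as precise as the bucket it falls into: here the median comes out at roughly 4.2 ms, but the true value could lie anywhere between 1 ms and 5 ms.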

Grafana Dashboard

Panel	PromQL	Type
Ops/sec	`sum(rate(..._count[1m])) by (db_operation_name)`	Time series
P50/P95/P99	`histogram_quantile(0.5|0.95|0.99, rate(..._bucket[5m]))`	Time series
Error Rate	`sum(rate(..._count{error_type!=""}[1m])) by (error_type)`	Time series
By Namespace	`sum(rate(..._count[1m])) by (db_namespace)`	Pie chart

Performance

Scenario	Overhead
Per-operation recording	~30-80 ns (atomic increment)
Relative to network round-trip	0.001-0.01%
`get_metrics()` encoding	~50-200 µs

Metrics collection is always enabled with negligible overhead.
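
The relative figures in the table can be checked with simple arithmetic. The sketch below assumes a 1 ms network round trip (an illustrative value, not a measurement) and reuses the 15 s scrape interval from the Prometheus config above. Both function names are hypothetical.

```python
def overhead_percent(record_ns: float, round_trip_ms: float) -> float:
    """Per-operation recording cost as a percentage of one round trip."""
    round_trip_ns = round_trip_ms * 1_000_000
    return record_ns / round_trip_ns * 100


def scrape_cost_percent(encode_us: float, scrape_interval_s: float) -> float:
    """Share of wall-clock time spent encoding metrics for each scrape."""
    return (encode_us / 1_000_000) / scrape_interval_s * 100


# Assumed 1 ms round trip: 30-80 ns of recording is about 0.003-0.008 %.
low = overhead_percent(30, 1.0)
high = overhead_percent(80, 1.0)

# Worst-case 200 µs encoding once every 15 s is roughly 0.0013 % of the time.
scrape = scrape_cost_percent(200, 15)
```

Under these assumptions both costs stay well below a hundredth of a percent, which is consistent with the table.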
