For the DBtune optimization engine to operate successfully, we require various metrics related to the database engine, the hardware, and also the software running on your infrastructure.
Some metrics are required, and the optimization cannot start without them; others are optional but increase the observability of the system.
To operate with peace of mind and avoid unit conversion rabbit holes, we require specific unit types.
We accept the following unit types:

| Type | Pseudo Type |
|---|---|
| int | integer |
| float | float |
| string | string |
| bytes | integer |
| boolean | bool |
| time | integer (unix timestamp) |
| percentage | float |

As an example, we could have the following metric: `hardware_total_memory: 8589934592`, which is 8 GiB expressed in bytes.
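
The conversion in the example above can be sketched in a few lines of Python. This is a minimal illustration, not part of any DBtune client: the helper names `to_bytes` and `to_unix_timestamp` are hypothetical, and the sketch assumes that sizes use binary (1024-based) units and that timestamps are whole seconds.

```python
from datetime import datetime, timezone

# Binary (IEC) multipliers: 1 KiB = 1024 bytes, 1 MiB = 1024 KiB, and so on.
_UNIT_FACTORS = {
    "B": 1,
    "KiB": 1024,
    "MiB": 1024**2,
    "GiB": 1024**3,
    "TiB": 1024**4,
}


def to_bytes(value: float, unit: str) -> int:
    """Convert a size in a binary unit to a whole number of bytes (hypothetical helper)."""
    return int(value * _UNIT_FACTORS[unit])


def to_unix_timestamp(moment: datetime) -> int:
    """Convert a timezone-aware datetime to integer Unix seconds (assumed resolution)."""
    return int(moment.timestamp())


print(to_bytes(8, "GiB"))  # prints 8589934592
print(to_unix_timestamp(datetime(1970, 1, 1, 0, 1, tzinfo=timezone.utc)))  # prints 60
```

Converting once at the point of collection keeps every `bytes` and `time` datapoint a plain integer, which matches the pseudo types in the table.
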
System information describes your hardware, infrastructure, and core system characteristics that rarely change. If a datapoint does change, we recommend starting a new optimization session.
This information is sent to the `/system-info` endpoint, which is documented in our OpenAPI specification.

| Datapoint Name | Description | Value type | Required |
|---|---|---|---|
| `node_cpu_count` | The number of CPUs (or vCPUs) | float | ✅ |
| `node_memory_total` | Total memory of the node | bytes | ✅ |
| `node_os_info` | `linux` or `windows`. The default agent uses this to detect OS info | string | |
| `node_disk_device_type` | The type of the disk: `HDD`, `SSD`, `NVME` or `UNKNOWN` | string | |
| `node_storage_type` | `network` or `physical`. The type of storage. | string | |
| `node_disk_size` | Total size of the disk where the database is stored. | bytes | |
| `pg_max_connections` | PostgreSQL max connections allowed | int | ✅ |
| `pg_version` | Version of the database server | string | ✅ |

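
As an illustration of the table above, the following Python sketch builds a system information record and checks that the required datapoints are present with plausible types. The flat key/value layout, the function name `validate_system_info`, and all example values are assumptions made for this sketch; the actual request body for `/system-info` is defined by the OpenAPI specification.

```python
import json

# Required datapoints and the Python types assumed to match their value types.
REQUIRED_SYSTEM_INFO = {
    "node_cpu_count": (int, float),  # float
    "node_memory_total": int,        # bytes
    "pg_max_connections": int,       # int
    "pg_version": str,               # string
}


def validate_system_info(payload: dict) -> list:
    """Return a list of problems; an empty list means the required fields look valid."""
    problems = []
    for name, expected in REQUIRED_SYSTEM_INFO.items():
        if name not in payload:
            problems.append(f"missing required datapoint: {name}")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(payload[name], bool) or not isinstance(payload[name], expected):
            problems.append(f"wrong type for {name}")
    return problems


# Illustrative values only.
example = {
    "node_cpu_count": 4.0,
    "node_memory_total": 8 * 1024**3,
    "node_os_info": "linux",
    "node_disk_device_type": "SSD",
    "node_storage_type": "network",
    "pg_max_connections": 100,
    "pg_version": "16.2",
}

print(validate_system_info(example))  # prints []
print(json.dumps(example, indent=2))
```

Running a check like this before sending anything makes a missing required datapoint visible early, instead of discovering it when the optimization fails to start.
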
Metrics add a layer of observability around your DBMS. They help us understand how your PostgreSQL server is operating, allowing us to optimize it and implement the necessary server-side guardrails while you are tuning.
Below are the metrics we currently support in the dashboard, with more to come. If you have any recommendations, feel free to send us a message.

| Datapoint Name | Description | Value type | Required |
|---|---|---|---|
| `node_cpu_usage` | The CPU usage percentage | percentage | ✅ |
| `node_memory_used` | | bytes | ✅ |
| `node_memory_available` | Available memory | bytes | |
| `node_disk_io_ops_total` | The total IOps for all the disks | int | |
| `node_disk_io_ops_read` | The total reads of all the disks | int | |
| `node_disk_io_ops_write` | The total writes of all the disks | int | |
| `perf_average_query_runtime` | The average query runtime calculation. | float | ✅ |
| `perf_transactions_per_second` | The transactions per second calculation. | float | ✅ |
| `pg_autovacuum_count` | Auto-vacuum events happening at the time of querying the database. | int | |
| `pg_cache_hit_ratio` | | float | |
| `pg_instance_size` | Size of the database cluster. | bytes | |
| `pg_active_connections` | Current active connections to the PG server. | int | |
| `server_uptime` | | float | |
| `pg_wait_events_<metric>` | Breakdown of the wait events currently in the PG server. Metric values are: `Activity`, `BufferPin`, `Client`, `Extension`, `IO`, `IPC`, `Lock`, `LWLock`, `Timeout`. There is also an extra one called `TOTAL`, which is the sum of all the wait events. | int | |

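
The `pg_wait_events_<metric>` family can be sketched as follows: one integer datapoint per wait event type, plus `TOTAL` as their sum. The function name `wait_event_datapoints` is hypothetical, and the exact casing of the generated datapoint names (for example `pg_wait_events_IO`) is an assumption; check the OpenAPI specification for the names the service expects.

```python
# Wait event types listed in the metrics table.
WAIT_EVENT_TYPES = [
    "Activity", "BufferPin", "Client", "Extension",
    "IO", "IPC", "Lock", "LWLock", "Timeout",
]


def wait_event_datapoints(counts: dict) -> dict:
    """Build pg_wait_events_<metric> datapoints from per-type counts.

    Types missing from `counts` are reported as 0, and types that are not in
    WAIT_EVENT_TYPES are ignored. TOTAL is the sum of the listed types.
    """
    datapoints = {
        f"pg_wait_events_{event_type}": int(counts.get(event_type, 0))
        for event_type in WAIT_EVENT_TYPES
    }
    # The right-hand side is evaluated before TOTAL is added to the dict.
    datapoints["pg_wait_events_TOTAL"] = sum(datapoints.values())
    return datapoints


# Illustrative counts only.
sample = wait_event_datapoints({"IO": 3, "Lock": 2})
print(sample["pg_wait_events_IO"])     # prints 3
print(sample["pg_wait_events_TOTAL"])  # prints 5
```

Reporting every type explicitly, including zeros, keeps the set of datapoints stable between collection intervals.
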