agent.yaml. If you only need to onboard the host object, keep the default configuration first. After the host appears in the console, add MySQL, Redis, PostgreSQL, and other objects as needed.
Basic configuration example
The following configuration is suitable for first-time onboarding:
locator_mappings
locator_mappings controls the object address displayed in the console. It is commonly used for non-host objects such as MySQL.
For example, the Agent connects to MySQL through a local address:
- If MySQL, Redis, PostgreSQL, MongoDB, or similar services are configured with localhost or 127.0.0.1, also configure locator_mappings.
- The mapped address should be a stable IP, DNS name, or host:port.
- Do not map an address to localhost or 127.0.0.1.
- Kafka and Elasticsearch are cluster-level objects and do not use locator_mappings. Kafka uses cluster_name as the identifier. Elasticsearch automatically obtains cluster_name from the cluster.
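
The loopback rules above can be checked before the configuration is deployed. The following is a minimal sketch in Python, not part of the Agent: the function names are hypothetical, and it assumes addresses are written as host or host:port (IPv6 literals are out of scope).

```python
# Hypothetical pre-flight helpers; these names do not come from the Agent.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1"}


def host_of(address: str) -> str:
    """Return the host part of 'host' or 'host:port'."""
    return address.rsplit(":", 1)[0] if ":" in address else address


def needs_locator_mapping(target: str) -> bool:
    """True when a target uses a loopback address and therefore needs a mapping."""
    return host_of(target).lower() in LOOPBACK_HOSTS


def is_valid_mapped_address(mapped: str) -> bool:
    """True when the mapped address is non-empty and not a loopback address."""
    return bool(mapped) and host_of(mapped).lower() not in LOOPBACK_HOSTS
```

For example, needs_locator_mapping("localhost:3306") is true, and is_valid_mapped_address("localhost:3306") is false because a mapping must not point back to a loopback address.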
host
host controls the collection behavior for host diagnostics. It usually does not need to be changed for first-time onboarding.
| Config | Recommended value | Description |
|---|---|---|
| sample_interval | 2s or 3s | Sampling interval for CPU, disk I/O, network I/O, and similar metrics. |
| disk.statfs_timeout | 1s | Prevents abnormal mount points from slowing down diagnostics. |
| disk.top_n | 20 | Controls the number of file systems returned. |
| disk_io.top_n | 5 | Controls the number of disk I/O devices returned. |
| network_io.top_n | 5 | Controls the number of network interfaces returned. |
| top_processes.default_top_n | 10 | Default number of processes returned. |
| top_processes.include_cmdline | false | Does not return the full command line by default, reducing the risk of exposing passwords, tokens, or connection strings. |
shell_exec
shell_exec controls whether the Agent allows controlled host diagnostic commands.
- Keep enabled: true when AI-SRE needs live host diagnostics. Only controlled shell commands can be executed.
- If some shell commands are blocked by built-in guardrails, add them manually to user_allow_list only after confirming that they are safe, read-only, and do not expose sensitive information.
tool_policy.disabled_tools:
MySQL
To diagnose MySQL, add instance configuration under mysql:. Use a read-only MySQL account, and preferably store the password in a separate credential file:
| Config | Recommendation |
|---|---|
| targets | Explicitly use host:port. If localhost:3306 is used, configure locator_mappings. |
| connection.timeout | Recommended: 3s. |
| overview.sample_interval | Recommended: 2s or 3s. |
| query.enabled | Default: false. Enable only after confirming the account is read-only. |
| query.default_max_rows | Recommended: 200. |
| query.statement_timeout | Recommended: 6s. |
| credential.source | Recommended: env_file. |
When mysql.query is enabled, always use a read-only account. This tool executes controlled read-only SQL and should not use a privileged account.
Redis
To diagnose Redis, add instance configuration under redis:.
| Tool | Function | Default status |
|---|---|---|
| redis.overview | Collects INFO ALL twice, calculates the diff, and returns key metrics such as memory, hit rate, connections, and QPS. | Enabled |
| redis.slowlog | Reads SLOWLOG GET and returns recent slow query records. | Enabled |
| redis.command | Executes controlled read-only Redis commands with an allowlist policy. | Disabled by default |

| Config | Recommendation |
|---|---|
| targets | Explicitly use host:port. If localhost:6379 is used, configure locator_mappings. |
| connection.database | Default: 0. |
| connection.timeout | Recommended: 3s. |
| overview.sample_interval | Recommended: 2s or 3s, range [1s, 5s]. |
| command.enabled | Default: false. Enable only when controlled read-only commands are needed. |
| credential | Redis versions earlier than 6 only use password authentication, so username can be omitted. Redis 6+ ACL mode can configure both username_key and password_key. |
When redis.command is enabled, only allowlisted read-only commands such as CONFIG GET, CLIENT LIST, MEMORY USAGE, and LATENCY HISTORY are allowed. Write commands are rejected.
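
To make the "collect twice and diff" idea behind redis.overview concrete, here is a minimal sketch of how rates can be derived from two INFO samples. It is an illustration only, not the Agent's implementation: the function name is hypothetical, and it assumes each sample has already been parsed into a dict of numeric values keyed by standard Redis INFO field names.

```python
def diff_overview(first: dict, second: dict, interval_seconds: float) -> dict:
    """Derive rate metrics from two parsed INFO samples taken interval_seconds apart."""
    # Counters in INFO are cumulative, so the rate is the delta over the interval.
    commands = second["total_commands_processed"] - first["total_commands_processed"]
    hits = second["keyspace_hits"] - first["keyspace_hits"]
    misses = second["keyspace_misses"] - first["keyspace_misses"]
    lookups = hits + misses
    return {
        "qps": commands / interval_seconds,
        # No lookups during the interval means the hit rate is undefined.
        "hit_rate": hits / lookups if lookups else None,
        # Gauges are reported as-is from the most recent sample.
        "used_memory": second["used_memory"],
        "connected_clients": second["connected_clients"],
    }
```

A longer overview.sample_interval smooths short bursts, while a shorter one returns sooner. That trade-off is one plausible reason for the recommended 2s or 3s.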
Redis Sentinel
To diagnose a Redis Sentinel high availability cluster, add Sentinel process configuration under redis_sentinel:. redis_sentinel and redis are different object types. They point to Sentinel processes and Redis data nodes respectively.
| Tool | Function | Default status |
|---|---|---|
| redis_sentinel.overview | Gets Sentinel INFO, including the monitored master list and status. | Enabled |
| redis_sentinel.topology | Gets topology information for all monitored masters, including master, replica, and sentinel node lists. | Enabled |

| Config | Recommendation |
|---|---|
| targets | Sentinel default port is 26379. Explicitly use host:port. |
| connection.timeout | Recommended: 3s. |
| credential | Sentinel usually uses password-only authentication without username. If Sentinel does not enable requirepass, credential can be omitted. |
PostgreSQL
To diagnose PostgreSQL, add instance configuration under postgres:.
| Tool | Function | Default status |
|---|---|---|
| postgres.overview | Collects key statistics views twice, calculates the diff, and returns connection count, transaction throughput, cache hit rate, replication lag, and other key metrics. | Enabled |
| postgres.activity | Queries pg_stat_activity and returns current active and long-running queries. | Enabled |
| postgres.query | Executes controlled read-only SQL queries such as SELECT, WITH, and EXPLAIN. | Disabled by default |

| Config | Recommendation |
|---|---|
| targets | Explicitly use host:port. The default port is 5432. If localhost:5432 is used, configure locator_mappings. |
| connection.database | Required. PostgreSQL has no implicit default database. A common value is postgres. |
| connection.sslmode | Default: prefer. Valid values: disable, allow, prefer, require, verify-ca, verify-full. |
| connection.timeout | Recommended: 3s. |
| overview.sample_interval | Recommended: 2s or 3s, range [1s, 5s]. |
| activity.min_query_age | Returns only queries running longer than this threshold. Default: 1s. |
| activity.top_n | Default: 5, maximum 20. |
| query.enabled | Default: false. Enable only after confirming the account is read-only. |
| query.default_max_rows | Recommended: 200, maximum 10000. |
| query.statement_timeout | Recommended: 6s, range [1s, 7s]. |
| credential | Required. PostgreSQL wire protocol does not support anonymous connections. Grant the pg_monitor role to get full pg_stat_activity visibility. |
When postgres.query is enabled, always use a read-only account.
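
The limits in the table above can be checked mechanically before a reload. The sketch below is one way to do that in Python under stated assumptions: the dotted config names are assumed to map onto nested dictionary keys, durations are assumed to be written in seconds such as 3s, and check_postgres and seconds are hypothetical helper names rather than anything provided by the Agent.

```python
import re

VALID_SSLMODES = {"disable", "allow", "prefer", "require", "verify-ca", "verify-full"}
_DURATION = re.compile(r"^(\d+(?:\.\d+)?)s$")


def seconds(value: str) -> float:
    """Parse a duration written in seconds, such as '3s' or '1.5s'."""
    match = _DURATION.match(value)
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    return float(match.group(1))


def check_postgres(cfg: dict) -> list:
    """Return the documented limits that one postgres entry (assumed layout) violates."""
    problems = []
    conn = cfg.get("connection", {})
    if not conn.get("database"):
        problems.append("connection.database is required")
    if conn.get("sslmode", "prefer") not in VALID_SSLMODES:
        problems.append("connection.sslmode is not a valid value")
    overview = cfg.get("overview", {})
    if "sample_interval" in overview and not 1 <= seconds(overview["sample_interval"]) <= 5:
        problems.append("overview.sample_interval must be within [1s, 5s]")
    if cfg.get("activity", {}).get("top_n", 5) > 20:
        problems.append("activity.top_n must not exceed 20")
    query = cfg.get("query", {})
    if "statement_timeout" in query and not 1 <= seconds(query["statement_timeout"]) <= 7:
        problems.append("query.statement_timeout must be within [1s, 7s]")
    if query.get("default_max_rows", 0) > 10000:
        problems.append("query.default_max_rows must not exceed 10000")
    if not cfg.get("credential"):
        problems.append("credential is required")
    return problems
```

An empty list means the entry stays within the documented ranges. It does not confirm that the account is actually read-only, which still has to be verified on the database side.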
MongoDB
To diagnose MongoDB (mongod or replica set members), add instance configuration under mongodb:. Only the host:port format is accepted. mongodb+srv:// URIs are not supported.
| Tool | Function | Default status |
|---|---|---|
| mongodb.overview | Collects serverStatus twice, calculates the diff, and returns connection count, operation throughput, memory, replication lag, and other key metrics. | Enabled |
| mongodb.current_ops | Queries currentOp and returns currently running operations. | Enabled |
| mongodb.command | Executes controlled read-only management commands with an allowlist policy. | Disabled by default |

| Config | Recommendation |
|---|---|
| targets | Explicitly use host:port. Each target corresponds to an independent mongod instance. |
| connection.database | SCRAM authentication authSource database. Default: admin. |
| connection.timeout | Recommended: 3s. |
| connection.tls | Optional. Set enabled: true when TLS is enabled. Specify ca_file when using a self-signed CA. |
| overview.sample_interval | Recommended: 3s, range [1s, 5s]. |
| command.enabled | Default: false. Enable only when controlled read-only management commands are needed. |
| credential | If MongoDB authentication is disabled in development or test environments, credential can be omitted. Configure a read-only account in production. |
When mongodb.command is enabled, only allowlisted read-only management commands such as dbStats, collStats, serverStatus, and replSetGetStatus are allowed. Write commands and dangerous commands are rejected.
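
Because MongoDB targets must be plain host:port values and not URIs, a small check can catch a pasted connection string early. This is a minimal sketch with a hypothetical function name. It is not the Agent's own parser, and it does not handle IPv6 literals.

```python
def parse_target(target: str) -> tuple:
    """Split 'host:port' into (host, port), rejecting URIs such as mongodb+srv://."""
    if "://" in target:
        raise ValueError(f"URIs are not supported, use host:port: {target!r}")
    host, sep, port = target.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"expected host:port, got {target!r}")
    port_number = int(port)
    if not 1 <= port_number <= 65535:
        raise ValueError(f"port out of range in {target!r}")
    return host, port_number
```

With this helper, parse_target("mongo-1.internal:27017") returns the host and the numeric port, while a mongodb+srv:// string or a bare hostname raises ValueError.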
MongoDB Mongos
To diagnose MongoDB sharded cluster routing processes (mongos), add configuration under mongodb_mongos:. mongodb_mongos and mongodb are different object types. They point to mongos routing processes and mongod data nodes respectively.
| Tool | Function | Default status |
|---|---|---|
| mongodb_mongos.overview | Collects mongos serverStatus and returns connection count, operation throughput, and other key metrics. | Enabled |
| mongodb_mongos.shard_distribution | Gets sharded cluster topology and data distribution information. | Enabled |

| Config | Recommendation |
|---|---|
| targets | mongos routing process address. Explicitly use host:port. |
| connection.database | SCRAM authentication authSource database. Default: admin. |
| connection.timeout | Recommended: 3s. |
| overview.sample_interval | Recommended: 3s, range [1s, 5s]. |
| credential | Usually shares the same credential file as mongodb. |
Kafka
To diagnose a Kafka cluster, add configuration under kafka:. Kafka is a cluster-level object. One kafka configuration block represents one logical cluster, and bootstrap_brokers are connection entry points rather than independent targets.
| Tool | Function | Default status |
|---|---|---|
| kafka.overview | Gets the broker list, controller information, and topic overview. | Enabled |
| kafka.consumer_lag | Gets consumer group lag. | Enabled |
| kafka.topic_detail | Gets partition details for a specified topic, including replica distribution, ISR, and leader. | Enabled |
| kafka.group_detail | Gets details for a specified consumer group, including member assignment and offsets. | Enabled |

| Config | Recommendation |
|---|---|
| cluster_name | Required. Used as the object identifier in the console. Only lowercase letters, digits, ., -, and _ are allowed. Length: 2-128. |
| bootstrap_brokers | At least one Broker address in host:port format. Configure multiple addresses for better availability. |
| connection.timeout | Recommended: 5s. |
| connection.sasl_mechanism | Default: none. Supported values: none, plain, scram-sha-256, scram-sha-512. |
| connection.tls | Optional. Set enabled: true when TLS is enabled. mTLS requires both cert_file and key_file. |
| consumer_lag.default_top_n | Default: 10, range [1, 50]. |
| credential | Required only when sasl_mechanism is not none. |
Kafka does not use locator_mappings. cluster_name is used directly as the object address in the console.
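
The cluster_name rule (lowercase letters, digits, ., -, and _, with a length of 2 to 128) translates directly into a regular expression. The following sketch shows one way to check a name before using it. The function name is hypothetical, and the Agent may apply further rules that are not documented here.

```python
import re

# Lowercase letters, digits, '.', '_' and '-', between 2 and 128 characters.
_CLUSTER_NAME = re.compile(r"[a-z0-9._-]{2,128}")


def is_valid_cluster_name(name: str) -> bool:
    """True when the whole name matches the documented character set and length."""
    return _CLUSTER_NAME.fullmatch(name) is not None
```

For example, orders-prod.v2 passes, while Orders-Prod fails because of the uppercase letters and a single-character name fails the length rule.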
Elasticsearch
To diagnose an Elasticsearch cluster, add configuration under elasticsearch:. Elasticsearch is a cluster-level object. cluster_name does not need to be declared in the configuration. The Agent automatically obtains it through GET _cluster/health during startup or reload. If the cluster is unreachable, the target is skipped until the next reload.
| Tool | Function | Default status |
|---|---|---|
| elasticsearch.overview | Gets cluster health, node count, index count, shard allocation, and other global cluster information. | Enabled |
| elasticsearch.node_stats | Gets detailed node metrics such as JVM, OS, thread pool, and transport. | Enabled |
| elasticsearch.index_stats | Gets index-level statistics such as document count, storage size, and read/write throughput. | Enabled |
| elasticsearch.shard_allocation | Gets cluster shard allocation details to diagnose uneven shard distribution or unassigned shards. | Enabled |
| elasticsearch.cat | Executes controlled _cat API queries with an allowlist policy. | Disabled by default |

| Config | Recommendation |
|---|---|
| targets | Full URL format, including protocol and port, such as https://es-node:9200. http:// and https:// are supported. Configure multiple nodes for better availability. |
| connection.timeout | Recommended: 5s. |
| connection.tls.ca_cert | Specify the CA certificate path when using a self-signed CA. It must be an absolute path. |
| connection.tls.skip_verify | Default: false. Do not enable it in production. |
| cat.enabled | Default: false. When enabled, allowlisted _cat API queries can be executed. |
| credential | If Elasticsearch security authentication is disabled, credential can be omitted. Configure a read-only account in production. |
Elasticsearch does not use locator_mappings. The Agent automatically obtains cluster_name from the cluster and uses it as the object address in the console.
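
Unlike the host:port targets used elsewhere, Elasticsearch targets are full URLs that must include the protocol and the port. The sketch below checks that shape with Python's standard urllib.parse module. The function name is hypothetical, and the check only covers the format described in the table, not reachability.

```python
from urllib.parse import urlsplit


def is_valid_es_target(target: str) -> bool:
    """True when target is an http(s) URL with an explicit host and port."""
    try:
        parts = urlsplit(target)
        if parts.scheme not in ("http", "https") or not parts.hostname:
            return False
        # Accessing .port raises ValueError when the port is not a valid number.
        return parts.port is not None
    except ValueError:
        return False
```

For example, https://es-node:9200 passes, while es-node:9200 (no protocol) and https://es-node (no port) are rejected.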
script_tool
script_tool is used to add custom script tools. Most users can keep it disabled: