Flashduty Docs

Aliyun SLS

This document provides detailed instructions on configuring alert rules for Alibaba Cloud Log Service (SLS) data sources in Monitors. Monitors retrieves data through the SLS SQL query interface (GetLogsV3) and triggers alerts based on query results.

Core concepts

Query language: Monitors uses SLS SQL syntax.
Required parameters: Each query must specify the sls.project and sls.logstore parameters.
Time range: The query time range is set through API parameters (configured via sls.timespan), so you do not need to write WHERE __time__ > ... in the SQL statement.
Field handling: By default, the __source__ and __time__ fields are ignored unless they are explicitly specified as value fields.

1. Threshold mode

This mode is suitable for scenarios requiring threshold comparisons on aggregated values.

Configuration

1. Query: Write an SLS SQL aggregation query.
   Example: Count error logs per host in the last 15 minutes.
2. Query parameters:
   sls.project: (Required) Project name.
   sls.logstore: (Required) Logstore name.
   sls.timespan.value: (Optional) Time span value. Defaults to 15.
   sls.timespan.unit: (Optional) Time span unit. Supports s (seconds), m (minutes), h (hours), and d (days). Defaults to m.
3. Field mapping:
   Label fields: Fields used to distinguish different alert objects. In the example above, this is host. This setting can be left empty, in which case Monitors treats all fields except value fields as label fields.
   Value fields: Numeric fields used for threshold evaluation. In the example above, this is error_cnt.
4. Threshold conditions:
   Use $A.field_name to reference values.
   Example: Critical: $A.error_cnt > 50, Warning: $A.error_cnt > 10.
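
The field mapping described above can be illustrated with a minimal Python sketch. It is not the Monitors implementation: the function name split_row, the sample row, and the assumption that SLS returns every value as a string are all hypothetical and serve only to show how one result row divides into label fields and value fields.

```python
# Fields that are dropped unless explicitly listed as value fields.
IGNORED_BY_DEFAULT = {"__source__", "__time__"}


def split_row(row, value_fields, label_fields=None):
    """Split one query result row into (labels, values).

    Hypothetical helper: when label_fields is empty, every field that is
    not a value field (and not ignored by default) becomes a label.
    """
    values = {name: float(row[name]) for name in value_fields}
    if label_fields:
        labels = {name: row[name] for name in label_fields}
    else:
        labels = {
            name: val
            for name, val in row.items()
            if name not in value_fields and name not in IGNORED_BY_DEFAULT
        }
    return labels, values


# Illustrative row; real rows depend on your SQL statement.
row = {
    "host": "web-01",
    "error_cnt": "57",
    "__time__": "1735600000",
    "__source__": "10.0.0.1",
}
labels, values = split_row(row, value_fields=["error_cnt"])
print(labels, values)
```

With label fields left empty, host becomes the only label and error_cnt the only value, matching the behavior described for the example rule.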

How it works

The engine calls the SLS API with a specified time range (e.g., last 15 minutes) and executes the SQL query. After retrieving results, it groups by label fields and compares value fields against thresholds.
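
As a rough illustration of the grouping and comparison step, the sketch below evaluates a list of result rows against the Critical and Warning conditions from the example. The names evaluate and THRESHOLDS are hypothetical, and the rule that the most severe matching level wins is an assumption rather than documented engine behavior.

```python
import operator

OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}

# Hypothetical rule definition, ordered from most to least severe.
THRESHOLDS = [
    ("Critical", "error_cnt", ">", 50),
    ("Warning", "error_cnt", ">", 10),
]


def evaluate(rows, label_fields, thresholds=THRESHOLDS):
    """Return {label key: severity} for every row that breaches a threshold."""
    alerts = {}
    for row in rows:
        # The label values identify the alert object (for example, one host).
        key = tuple((name, row[name]) for name in label_fields)
        for severity, field, op, limit in thresholds:
            if OPS[op](float(row[field]), limit):
                alerts[key] = severity
                break  # assumption: keep only the most severe match
    return alerts


rows = [
    {"host": "web-01", "error_cnt": "57"},
    {"host": "web-02", "error_cnt": "12"},
    {"host": "web-03", "error_cnt": "3"},
]
print(evaluate(rows, ["host"]))
```

Here web-01 breaches the Critical condition, web-02 only the Warning condition, and web-03 produces no alert.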

Recovery logic

Auto recovery: The alert recovers automatically when the latest query result no longer meets any alert threshold.
Specific recovery conditions: Configure an additional recovery expression (e.g., $A.error_cnt < 5).
Recovery query: Configure an independent SQL statement for recovery evaluation. The statement supports ${label_name} variable substitution.
Example: The alert SQL finds that the network card with network_host="a", interface="b" is down. The recovery SQL can reference ${network_host} and ${interface}; the engine replaces them with the actual values before executing the query. If data is found, recovery is confirmed.
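
The substitution step can be sketched in Python with string.Template, which happens to use the same ${name} placeholder syntax. The recovery SQL below is hypothetical (the status column and the quoting around the placeholders are assumptions), and the engine's own substitution may differ in detail.

```python
from string import Template

# Hypothetical recovery SQL; column names and quoting are illustrative only.
RECOVERY_SQL = Template(
    "SELECT network_host, interface FROM log "
    "WHERE network_host = '${network_host}' AND interface = '${interface}' "
    "AND status = 'up'"
)


def render_recovery_sql(labels):
    """Fill the placeholders with the label values of the firing alert."""
    # substitute() raises KeyError if a referenced label is missing.
    return RECOVERY_SQL.substitute(labels)


def is_recovered(rows):
    """Recovery is confirmed when the recovery query returns data."""
    return len(rows) > 0


print(render_recovery_sql({"network_host": "a", "interface": "b"}))
```

Because each firing alert carries its own label values, the same recovery statement yields a separate query per alert object.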

2. Data exists mode

This mode is suitable for scenarios where filtering logic is written directly in SQL.

Configuration

1. Query: Use a HAVING clause to filter anomalous data.
   Example: Query hosts with more than 50 errors.
2. Query parameters: Same as in threshold mode; sls.project and sls.logstore are required.
3. Evaluation rule: An alert is triggered as soon as the query returns data.

Pros and cons

Pros: Leverages SLS server-side computing power, reducing data transmission.
Cons: Cannot distinguish between multiple severity levels.

Recovery logic

Recovery on empty result: Recovery is confirmed when the query result is empty.
Recovery query: An additional query statement can be configured for recovery evaluation.
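
The trigger and default recovery behavior of this mode can be pictured as a set difference between two evaluation cycles. The sketch below is only an illustration: the function name diff_cycle is hypothetical, and tracking state separately for each label combination is an assumption, since the text above only states that data triggers an alert and an empty result confirms recovery.

```python
def diff_cycle(previous_firing, rows, label_fields):
    """Compare the current query result with the previously firing set.

    Returns (firing, triggered, recovered), each a set of label tuples.
    """
    current = {tuple(row[name] for name in label_fields) for row in rows}
    triggered = current - previous_firing   # newly returned by the query
    recovered = previous_firing - current   # no longer returned by the query
    return current, triggered, recovered


# Cycle 1: two hosts exceed the HAVING filter.
firing, triggered, recovered = diff_cycle(
    set(), [{"host": "web-01"}, {"host": "web-02"}], ["host"]
)
# Cycle 2: only web-02 is still returned, so web-01 recovers.
firing, triggered, recovered = diff_cycle(firing, [{"host": "web-02"}], ["host"])
print(firing, triggered, recovered)
```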

3. No data mode

This mode monitors scenarios where data is expected but actually missing.

Configuration

1. Query: Write a query that should continuously return data.
   Example: Query log reporting heartbeats from all hosts.
2. Evaluation rule: If a host appeared in previous cycles but cannot be found in the current and N consecutive cycles, a "no data" alert is triggered.
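
One way to picture the evaluation rule is a counter of consecutive missed cycles per previously seen host. The class below is a hypothetical sketch, not the engine's code; in particular, it assumes the alert fires once, at the moment the counter reaches N, which is one possible reading of the rule.

```python
class NoDataDetector:
    """Track consecutive cycles in which a known label key is missing."""

    def __init__(self, missing_cycles):
        self.missing_cycles = missing_cycles  # N in the evaluation rule
        self.missed = {}  # label key -> consecutive cycles without data

    def observe(self, keys_seen):
        """Feed the keys returned this cycle; return keys that just hit N."""
        seen = set(keys_seen)
        for key in seen:
            self.missed[key] = 0  # seen again, so reset (or start tracking)
        alerts = []
        for key, count in self.missed.items():
            if key in seen:
                continue
            self.missed[key] = count + 1
            if self.missed[key] == self.missing_cycles:
                alerts.append(key)
        return sorted(alerts)


detector = NoDataDetector(missing_cycles=2)
print(detector.observe(["web-01", "web-02"]))  # both hosts report
print(detector.observe(["web-01"]))            # web-02 missing once
print(detector.observe(["web-01"]))            # web-02 missing twice
```

A host that has never appeared is not tracked, which mirrors the condition that the host must have appeared in previous cycles.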

4. Advanced configuration and best practices

Power SQL

If you need to use SLS enhanced SQL syntax, add the following to query parameters:
sls.powersql: true

Time range control

By default, data from the last 15 minutes is queried. Adjust using parameters:
sls.timespan.value: 60
sls.timespan.unit: m
Note: Do not use __time__ for filtering in SQL unless you have special requirements. The engine automatically sets the API request's from and to timestamps based on the above parameters.
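
The relationship between the timespan parameters and the request's from and to timestamps can be sketched as follows. The function name query_window is hypothetical, and the assumption that the window ends at the evaluation time is an illustration of the described behavior, not a statement about the engine's internals.

```python
import time

UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def query_window(params, now=None):
    """Return (from_ts, to_ts) in seconds for the given query parameters."""
    value = int(params.get("sls.timespan.value", 15))  # documented default
    unit = params.get("sls.timespan.unit", "m")        # documented default
    if unit not in UNIT_SECONDS:
        raise ValueError(f"unsupported sls.timespan.unit: {unit!r}")
    to_ts = int(now if now is not None else time.time())
    return to_ts - value * UNIT_SECONDS[unit], to_ts


# Default window: the last 15 minutes (900 seconds).
print(query_window({}, now=1_000_000))
# Explicit window: the last 60 minutes (3600 seconds).
print(query_window({"sls.timespan.value": 60, "sls.timespan.unit": "m"}, now=1_000_000))
```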

Debug parameters

To debug data for a specific time period, use the following parameters. They are intended for debugging only; do not configure them in production rules.
sls.from: Start timestamp (seconds).
sls.to: End timestamp (seconds).

Last modified: 2025-12-31 06:16:06