Incidents, Alerts, and Events
When Flashduty On-call receives an alert event (such as a Zabbix notification), the system automatically triggers an alert, which in turn triggers an incident. Multiple similar active alerts may be grouped into the same incident for unified assignment, notification, and handling. Simply put, an incident is a group of similar alerts: without noise reduction, each incident corresponds to a single alert; with noise reduction, an incident can contain multiple associated alerts. For more about alert noise reduction, read Understanding Noise Reduction.
Incident Severity, Status, and Progress
Severity
| Level | Description |
|---|---|
| Info | Minor. The service is still running normally; this is only a status reminder and needs no immediate action |
| Warning | Warning. The service may have errors, or problems are imminent; intervene early to prevent escalation |
| Critical | Critical. Widespread service errors or outages are affecting users; take immediate action |
- Event Severity: Alert events from different integration sources (like Zabbix and Nightingale) have different severity enumerations. Flashduty On-call maps them to these three standard severities according to specific rules. For mapping details, refer to the specific integration documentation. To customize severity, see Alert Processing.
- Alert Severity: Equals the highest severity among associated events.
- Incident Severity: Equals the highest severity among associated alerts.
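
The roll-up rule above (an alert takes the highest severity of its events, an incident the highest severity of its alerts) can be illustrated with a minimal sketch. The `Severity` enum and both helper functions are hypothetical names chosen for this example; they are not part of any Flashduty API.

```python
from enum import IntEnum


class Severity(IntEnum):
    """The three standard severities, ordered so that max() picks the highest."""
    INFO = 1
    WARNING = 2
    CRITICAL = 3


def alert_severity(event_severities):
    """An alert's severity is the highest severity among its events."""
    return max(event_severities)


def incident_severity(alerts):
    """An incident's severity is the highest severity among its alerts.

    `alerts` is a list of alerts, each given as a list of event severities.
    """
    return max(alert_severity(events) for events in alerts)


# Two alerts grouped into one incident (illustrative data only).
alerts = [
    [Severity.INFO, Severity.WARNING],
    [Severity.INFO, Severity.CRITICAL, Severity.WARNING],
]
print(incident_severity(alerts).name)
```

Using an ordered enum keeps the rule to a single `max()` call at each layer.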
Processing Progress
| Status | Description |
|---|---|
| Triggered | After an incident triggers, progress defaults to “Triggered”. The system starts automatic assignment, sets responders, and sends notifications |
| Processing | When anyone clicks Acknowledge, progress immediately changes to “Processing”. In this state, individual responders may be Acknowledged or Unacknowledged, but at least one is “Acknowledged”. If all responders unacknowledge, progress reverts to “Triggered” |
| Closed | When anyone clicks Close or the incident auto-recovers, progress immediately changes to “Closed” |
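
The progress transitions in the table above can be modeled as a small state machine. This is only a sketch under stated assumptions: the `Incident` class and its methods are hypothetical, and it assumes a closed incident stays closed, which the table does not state.

```python
class Incident:
    """Hypothetical model of incident processing progress."""

    def __init__(self, responders):
        # Track, per responder, whether they have acknowledged.
        self.acknowledged = {name: False for name in responders}
        self.closed = False

    def acknowledge(self, name):
        self.acknowledged[name] = True

    def unacknowledge(self, name):
        self.acknowledged[name] = False

    def close(self):
        # Covers both a manual Close and auto-recovery.
        self.closed = True

    @property
    def progress(self):
        if self.closed:
            return "Closed"
        # At least one acknowledged responder keeps the incident in Processing.
        if any(self.acknowledged.values()):
            return "Processing"
        # No acknowledgments (or all withdrawn): back to Triggered.
        return "Triggered"


incident = Incident(["alice", "bob"])
incident.acknowledge("alice")
print(incident.progress)
```

Deriving `progress` from the acknowledgment flags, rather than storing it, makes the "reverts to Triggered when all responders unacknowledge" rule fall out automatically.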
Incident Status
Incident status reflects the state of the incident’s alerts in the original monitoring system, i.e., “Recovered” or “Not Recovered”. It is determined entirely by the incident’s associated alerts.
| Status | Description |
|---|---|
| Recovered | All alerts associated with the incident have recovered, incident auto-recovers |
| Not Recovered | At least one alert associated with the incident hasn’t recovered, incident remains unrecovered |
When an incident auto-recovers, its processing progress is automatically set to Closed. Manually closing an incident, however, has no effect on incident status.
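
The one-way relationship between incident status and processing progress can be sketched as follows. Both function names are hypothetical, and the sketch assumes the incident has at least one associated alert.

```python
def incident_status(alerts_recovered):
    """Status is 'Recovered' only when every associated alert has recovered.

    `alerts_recovered` is a non-empty list of booleans, one per alert.
    """
    return "Recovered" if all(alerts_recovered) else "Not Recovered"


def progress_after_status_update(status, current_progress):
    """Auto-recovery closes the incident; otherwise progress is unchanged."""
    return "Closed" if status == "Recovered" else current_progress


# One alert is still firing, so the incident stays unrecovered and open.
status = incident_status([True, False])
print(status, progress_after_status_update(status, "Processing"))

# A manual close changes progress only; status is still derived from the alerts.
manual_progress = "Closed"
print(incident_status([True, False]), manual_progress)
```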
Incident Labels
Labels are a fundamental concept in Flashduty On-call. Labels describe alert and incident information across various dimensions and are used extensively for filtering, searching, and grouping.
Label Generation Rules
Alert labels are extracted from the event messages reported by the original monitoring system. Extraction methods differ by source, but in general Flashduty On-call follows the principle of capturing everything relevant. For example, for Prometheus-sourced alert events, Flashduty On-call extracts the Labels and Annotations information from the payload. Flashduty On-call also provides label enhancement for automatic label generation. Go to Configure Label Enhancement to learn more.
Incident Lifecycle
Trigger New Incident
Incidents can be triggered in the following ways:
- Auto-trigger: Flashduty On-call receives an alert event from an integration (like a Zabbix notification); the event automatically triggers an alert, and the alert automatically triggers an incident
- Manual trigger: Click the Create Incident button in the Flashduty On-call console and fill in the title, description, severity, etc. to trigger a new incident
Assignment and Notification
After a new incident triggers, Flashduty On-call matches the escalation rules under the channel in sequence. Once an escalation rule matches, the system assigns the incident to individuals, team members, or on-call personnel and sends notifications.
You can set different escalation rules for different time periods or incident types to achieve flexible assignment. An escalation rule can contain multiple levels. If the responders at the current level don’t acknowledge and resolve the incident within the specified time, the system automatically escalates to the next level.
You can flexibly arrange notification methods in escalation rules. Flashduty On-call supports many group chat and direct message notification channels. Direct messages are one-to-one push channels (like voice, SMS, email), while group chats push messages to messaging groups (like Feishu/Lark, Dingtalk, Slack) and additionally mention the assignees.
If you assign an incident to a schedule with no one on-call (empty schedule), the system won’t send notifications to individuals, but if you’ve configured group chat channels, messages will still be pushed to those groups.
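
The multi-level escalation described above can be sketched as a simple timing calculation. This is a hypothetical illustration, not Flashduty's implementation: the function name, the per-level timeout list, and the assumption that the last level keeps the incident indefinitely are all made up for the example.

```python
def active_level(level_timeouts_min, minutes_without_response):
    """Return the index of the escalation level currently holding the incident.

    `level_timeouts_min[i]` is how long level i may hold the incident
    without a response before it escalates to level i + 1.
    """
    elapsed = 0
    for index, timeout in enumerate(level_timeouts_min):
        elapsed += timeout
        if minutes_without_response < elapsed:
            return index
    # Assumption: once the final level is reached, the incident stays there.
    return len(level_timeouts_min) - 1


# Three levels: escalate after 15 minutes, then after a further 30 minutes.
timeouts = [15, 30, 60]
for minutes in (5, 20, 50):
    print(minutes, "min ->", "level", active_level(timeouts, minutes))
```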
Acknowledge and Resolve
On-call personnel can acknowledge immediately upon receiving a notification. You can acknowledge incidents via voice calls or instant messages. After acknowledgment, incident progress changes to Processing.
Closing an incident changes progress to Closed. If the alerts associated with the incident auto-recover, the incident also auto-closes. Conversely, if you manually close an incident, all associated alerts are automatically closed. This means these alerts will no longer merge new events.
Incident Timeline
Every incident has a timeline for tracing changes and actions at different historical moments. For example, it records when and through which channel each person was notified, and the result of each notification.
Triggering Incidents
Trigger via Integration
Flashduty On-call supports most common monitoring systems, including Prometheus, Zabbix, Nightingale, and cloud monitoring. Go to Alert Integration for specific steps.
Trigger via API
Flashduty On-call provides a custom event standard that lets you report alerts via a standard protocol, suitable for any non-integrated monitoring system. For details, read Custom Alert Events.
Trigger via Email
Flashduty On-call provides email integration that lets you report alerts by sending emails, suitable for all monitoring systems that support email alerts. For details, read Email Integration Guide.
Trigger via Console
Click the Create button in the console to create an incident.
| Field | Required | Description |
|---|---|---|
| Incident Title | Yes | One sentence describing what happened |
| Severity | Yes | Choose Critical, Warning, or Info |
| Channel | Yes | The channel the incident belongs to; not needed when creating the incident from within a channel |
| Assignment Method | Yes | By Policy: select a channel policy for assignment; Direct: select individuals or schedules for assignment |
| Incident Description | No | Detailed description, supports Markdown |
Related Topics
- Understanding Noise Reduction: Understand the relationship between events, alerts, and incidents, and the principles of noise reduction
- Search and View Incidents: Master the incident list and details page
- Handle and Update Incidents: Learn how to acknowledge, snooze, and close incidents
- Escalate and Reassign Incidents: Learn how to reassign and escalate