What is an Incident

An incident represents an ongoing issue or a matter that needs attention. Incidents are typically triggered by alerts and are often associated with a series of similar alerts.

Incidents, Alerts, and Events

When Flashduty receives an alert event (such as a Zabbix notification), the system automatically triggers an alert, which in turn triggers an incident. Multiple similar active alerts may be grouped into a single incident for unified assignment, notification, and handling.

Simply put, an incident is a group of similar alerts. Without noise reduction, each incident contains exactly one alert. With noise reduction enabled, an incident can contain multiple associated alerts. To learn more about alert noise reduction models, read Understanding Noise Reduction.

Incident Severity, Status, and Progress

Severity

Info: Minor issues where services remain operational; serves as a status reminder with no immediate action required.

Warning: Issues that may indicate potential problems or impending issues; early intervention recommended to prevent escalation.

Critical: Severe issues causing widespread service disruption or outages affecting users; immediate intervention required.

Incidents, alerts, and events all use these three severity levels. Severity values are written with an initial capital (Info, Warning, Critical), which matters when you use the API. Severity is determined as follows:

Event Severity: Different integrations (such as Zabbix and Nightingale) have their own severity enumerations, which Flashduty maps to these three standard levels. For the specific mappings, refer to the integration documentation, or to Alert Processing Pipeline to customize severity levels.

Alert Severity: Equals the highest severity level among associated events.

Incident Severity: Equals the highest severity level among associated alerts.
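The roll-up rules above can be sketched in a few lines of Python. This is only an illustration of the "highest severity wins" rule. The `Severity` enum, the function names, and the nested-list data shape are assumptions made for the example, not Flashduty's API.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Hypothetical ordering so that max() returns the most severe level.
    INFO = 1
    WARNING = 2
    CRITICAL = 3


def alert_severity(event_severities):
    """An alert takes the highest severity among its events."""
    return max(event_severities)


def incident_severity(alerts):
    """An incident takes the highest severity among its alerts.

    `alerts` is a list of alerts, each given as a list of event severities.
    """
    return max(alert_severity(events) for events in alerts)


# Two alerts: one escalated from Info to Warning, one stayed at Info.
example_alerts = [
    [Severity.INFO, Severity.WARNING],
    [Severity.INFO],
]
example_result = incident_severity(example_alerts)
```

Here `example_result` is `Severity.WARNING`, because the most severe event across both alerts is a Warning.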

Progress

Pending: Default status when an incident is triggered; system initiates automatic assignment, sets responders, and sends notifications.

In Progress: Set when any responder acknowledges the incident. Individual responders may be acknowledged or unacknowledged, but at least one must have acknowledged. The incident returns to Pending if all responders un-acknowledge.

Closed: Set when any responder closes the incident or when the incident auto-resolves.
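As a rough sketch, the three progress values can be derived from two inputs: whether the incident has been closed, and which responders have acknowledged it. The function name and the dictionary shape below are hypothetical and only illustrate the rules described above.

```python
def incident_progress(acknowledged_by, closed):
    """Derive progress from responder acknowledgments.

    `acknowledged_by` maps a responder name to True (acknowledged)
    or False (unacknowledged). `closed` is True once the incident
    has been closed manually or has auto-resolved.
    """
    if closed:
        return "Closed"
    if any(acknowledged_by.values()):
        # At least one responder has acknowledged.
        return "In Progress"
    # Nobody has acknowledged, or everyone un-acknowledged.
    return "Pending"
```

For example, `incident_progress({"alice": True, "bob": False}, closed=False)` yields "In Progress", and it falls back to "Pending" once alice un-acknowledges.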

Status

Status reflects the state of the underlying alerts in the original monitoring system: "resolved" or "unresolved". An incident's status is determined entirely by its associated alerts.

Resolved: All associated alerts are resolved; incident automatically resolves.

Unresolved: At least one associated alert remains unresolved.

💡

Automatic incident resolution leads to automatic closure (progress status), but manually closing an incident doesn't affect its resolution status.
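The relationship between status and progress can be modeled as below. This is a minimal sketch, assuming an incident with at least one associated alert. The `Incident` class and its method names are invented for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Incident:
    alerts_resolved: List[bool]  # one flag per associated alert
    closed: bool = False         # the progress dimension

    @property
    def status(self) -> str:
        # Resolved only when every associated alert is resolved.
        if self.alerts_resolved and all(self.alerts_resolved):
            return "Resolved"
        return "Unresolved"

    def sync(self) -> None:
        # Automatic resolution leads to automatic closure.
        if self.status == "Resolved":
            self.closed = True

    def close_manually(self) -> None:
        # Manual closure changes progress only; status is untouched.
        self.closed = True
```

Calling `close_manually()` on an incident with an unresolved alert leaves `status` as "Unresolved", while `sync()` closes an incident whose alerts have all resolved.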

Incident Labels

Labels are a fundamental concept in Flashduty, describing alert and incident information across different dimensions, used extensively for filtering, searching, and grouping.

How are labels generated?

Alert labels are extracted from event message bodies reported by the original alert system. Different sources have different extraction methods, following a maximum extraction principle. For example, for Prometheus alerts, Flashduty extracts Labels and Annotations information from the Payload.

Labels can only be obtained through event reporting, not manual modification or addition. An automatically triggered incident's labels always equal those of its first associated alert. A manually triggered incident always has empty labels.
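The inheritance rule for incident labels can be expressed as a small function. The alert dictionaries and the `labels` key used here are assumptions for illustration, not the actual Flashduty data model.

```python
def incident_labels(alerts, manually_triggered=False):
    """Return the labels an incident would carry.

    `alerts` is the list of associated alerts in arrival order,
    each a dict with a hypothetical "labels" key.
    """
    if manually_triggered or not alerts:
        # Manually triggered incidents always have empty labels.
        return {}
    # Automatically triggered incidents copy the first alert's labels.
    return dict(alerts[0]["labels"])


example_alerts = [
    {"labels": {"host": "db-1", "check": "cpu"}},
    {"labels": {"host": "db-2", "check": "cpu"}},
]
```

With `example_alerts`, the incident keeps `host` set to `db-1` even though a later alert reports `db-2`.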

Flashduty provides label enrichment options for automatic label generation. Learn more at Configure Label Enrichment.

Incident Lifecycle

1. Triggering New Incidents

Automatic Trigger: When Flashduty receives an alert event from an integration (such as a Zabbix notification), it automatically triggers an alert, which in turn triggers an incident.

Manual Trigger: Manually create an incident through the Flashduty console by clicking Create Incident and filling in title, description, severity, etc.

2. Assignment and Notification

After triggering, Flashduty matches escalation rules within the channel. Upon matching, the system assigns the incident to individuals, team members, or on-call personnel and sends notifications. Without matching escalation rules, the incident won't be assigned or generate notifications.

You can set different escalation rules for different time periods or incident types for flexible assignment. Multiple escalation levels can be set within one rule. If current level responders don't confirm and handle the incident within the specified time, the system automatically escalates to the next level.
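To make the timed escalation concrete, the sketch below computes which level would be active for an incident that nobody has acknowledged. The per-level timeout list and the choice to stay on the last level once every timeout has elapsed are assumptions of this example. The source does not describe what happens after the final level.

```python
def active_level(timeouts_minutes, elapsed_minutes):
    """Return the index of the escalation level currently active.

    `timeouts_minutes[i]` is how long level i waits for an
    acknowledgment before escalating to level i + 1.
    """
    deadline = 0
    for index, timeout in enumerate(timeouts_minutes):
        deadline += timeout
        if elapsed_minutes < deadline:
            return index
    # Assumption: remain on the last level after all timeouts pass.
    return len(timeouts_minutes) - 1
```

With timeouts of 15, 30, and 60 minutes, an unacknowledged incident is at level 0 for the first 15 minutes, at level 1 until minute 45, and at level 2 afterwards.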

You can flexibly arrange notification methods in escalation rules. Flashduty supports numerous group and individual notification channels. Individual channels are one-on-one (voice, SMS, email), while group channels push messages to groups (Feishu/Lark, Dingtalk, Slack) with additional responder notifications.

💡

Note: Incidents only generate notifications after assignment. No assignment means no notifications.

If you assign an incident to a schedule with no one On-Call (empty shift), no individual notifications will be sent, but group chat messages will still be delivered if configured.
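The two notes above amount to a simple decision rule, sketched here with hypothetical names. The tuple format and the function are illustrative only.

```python
def plan_notifications(assigned, on_call_people, group_chats):
    """Decide who would be notified for an incident.

    `assigned` is False when no escalation rule matched.
    `on_call_people` may be empty if the schedule has an empty shift.
    `group_chats` lists the group channels configured in the rule.
    """
    if not assigned:
        # No assignment means no notifications at all.
        return []
    plan = [("individual", person) for person in on_call_people]
    # Group messages are still delivered during an empty shift.
    plan += [("group", chat) for chat in group_chats]
    return plan
```

An empty shift with one configured group chat therefore produces a single group notification and no individual ones.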

3. Acknowledgment and Resolution

On-Call personnel can acknowledge incidents immediately upon notification through voice calls or instant messages. After acknowledgment, incident progress changes to In Progress.

💡

Currently, Flashduty doesn't restrict incident acknowledgment to "assigned responders" only. Anyone who can see the incident can acknowledge it.

Closing the incident changes its progress to Closed. If all associated alerts auto-resolve, the incident auto-closes. Conversely, manually closing an incident auto-closes all associated alerts, preventing them from merging with new events.

Incident Timeline

Each incident has a timeline tracking historical changes and operations. It shows when and how notifications were sent, to whom, and their results.

Triggering Incidents

Via Integration

Flashduty supports most common monitoring systems including Prometheus, Zabbix, Nightingale, and cloud monitoring. Visit Standard Alert Integration for specific steps.

💡

Flashduty supports dedicated and shared integration modes. Alerts delivered to channel-specific integrations trigger incidents in that channel.
Alternatively, deliver alerts to shared integrations in the integration center, then configure routing to different channels based on rules.

Via API

Flashduty provides a custom event standard for reporting alerts over a standard protocol, suitable for any monitoring system without a dedicated integration. Read Custom Events for details.

💡

For system stability, Flashduty currently limits API reporting to 200 QPS (queries per second). Requests beyond this limit are rejected.
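One way to stay under such a limit is to pace requests on the client side. The sketch below spaces calls evenly. The `Throttle` class is a generic illustration, not part of any Flashduty SDK, and the injectable `clock` and `sleep` parameters exist only to make it testable.

```python
import time


class Throttle:
    """Space out calls so that at most `max_per_second` happen per second."""

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        self.interval = 1.0 / max_per_second
        self.clock = clock
        self.sleep = sleep
        self.next_slot = clock()

    def wait(self):
        """Block until the next send slot and return the delay applied."""
        now = self.clock()
        delay = max(0.0, self.next_slot - now)
        if delay > 0:
            self.sleep(delay)
        # Reserve the following slot one interval after this one.
        self.next_slot = max(now, self.next_slot) + self.interval
        return delay
```

A sender would create `Throttle(200)` once and call `wait()` before each report. Choosing a rate somewhat below 200 leaves headroom for other clients that share the same account.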

💡

Ensure you actively close alerts or set automatic incident timeout closure in your channel. Too many incidents severely impact console search performance. The system may close historical incidents without notification in such cases.

Via Email

Flashduty provides an email integration for reporting alerts by email, suitable for any monitoring system that supports email notifications. Read Email Integration for details.

💡

You can set specific email prefixes for each integration. Contact us to set up a memorable custom domain for your account, like order-service@tesla.flashcat.cloud.

Via Console

Click Create in the console to initiate incident creation.

Field	Required	Description
Title	Yes	One-line description of what happened
Severity	Yes	Choose from Critical, Warning, Info
Channel	Yes	Incident ownership; not required if creating within a channel
Assignment	Yes	Rule-based: Select channel escalation rules. Direct: Select individuals or schedules
Description	No	Detailed incident description, supports Markdown
