Introduction

What is the Alert Engine (Monitors)?

The Alert Engine (Monitors) connects to a variety of metric and log data sources. It queries them periodically, evaluates the results against the thresholds in your configured alert rules, generates alert events, and pushes those events to Flashduty On-call for aggregation and delivery.

Flashduty Monitors can replace the alerting capabilities of products such as Nightingale, vmalert, and elastalert. The engine is designed to be flexible and is deeply integrated with Flashduty On-call, so it can meet complex alerting requirements.

Alert Engine (Monitors) Architecture Design

Flashduty is a SaaS service, and the SaaS side cannot reach data sources inside users' private networks. The Alert Engine (Monitors) therefore consists of two parts:

SaaS Server: Manages alert rules and permissions.

monitedge: Deployed inside the user's private network. It synchronizes alert rules from the SaaS side, periodically queries the data sources and evaluates thresholds, then generates alert events and pushes them to the SaaS side.
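The query, evaluate, and generate steps that monitedge performs can be pictured with a minimal sketch. Everything here is a hypothetical illustration rather than the actual monitedge implementation: the AlertRule and AlertEvent shapes, the evaluate_rule function, and the in-memory fake_datasource are all assumptions, and a real rule supports far more than a single "greater than" comparison.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class AlertRule:
    # Hypothetical rule shape; real Monitors rules carry more fields.
    name: str
    query: str
    threshold: float


@dataclass
class AlertEvent:
    # Hypothetical event shape, for illustration only.
    rule_name: str
    value: float
    status: str  # "triggered" or "ok"


def evaluate_rule(
    rule: AlertRule, run_query: Callable[[str], Optional[float]]
) -> Optional[AlertEvent]:
    """Run one evaluation cycle: query the data source, compare, build an event."""
    value = run_query(rule.query)
    if value is None:
        # No data came back; this sketch simply emits nothing in that case.
        return None
    status = "triggered" if value > rule.threshold else "ok"
    return AlertEvent(rule_name=rule.name, value=value, status=status)


# Stand-in for a data source that lives inside the private network.
fake_datasource = {"cpu_usage": 93.0, "disk_usage": 41.5}

rule = AlertRule(name="high-cpu", query="cpu_usage", threshold=90.0)
event = evaluate_rule(rule, fake_datasource.get)
print(event)
```

In a real deployment this cycle would repeat on a schedule for every synchronized rule, and the resulting events would be pushed to the SaaS side instead of printed.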

The architecture diagram is shown below:

The diagram assumes that the customer has two data centers, East US and South China. Each data center runs its own monitedge instance, which evaluates alerts for the data sources in that data center and pushes the resulting alert events to the SaaS side.

If you have only one data center, or if the network quality between your data centers is good, a single monitedge instance can handle alert evaluation for all data sources.

If a single monitedge instance is an unacceptable single point of failure, you can deploy multiple monitedge instances to form a cluster. Instances that are started with the same cluster name belong to the same cluster. For example, deploy 2 monitedge instances in the East US data center and start both with the --alerter.clusterName meidong parameter, then deploy 2 monitedge instances in the South China data center and start both with the --alerter.clusterName huanan parameter to form a second cluster.

The instances in an alert engine cluster automatically shard the alert rules among themselves. For example, if a cluster of 2 monitedge instances needs to process 100 alert rules, the system balances the load so that each instance processes about 50 rules. If one instance fails, the other takes over all 100 alert rules. Because each rule is evaluated by only one instance at a time, this provides high availability without delivering duplicate alert events.
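The sharding and failover behavior described above can be illustrated with a small sketch. The source does not say which algorithm monitedge uses, so the round-robin assignment below is an assumption chosen for clarity, and the shard_rules function and the instance names edge-a and edge-b are hypothetical.

```python
from typing import Dict, List


def shard_rules(rule_ids: List[int], live_instances: List[str]) -> Dict[str, List[int]]:
    """Assign every rule to exactly one live instance, round-robin."""
    if not live_instances:
        raise ValueError("at least one live instance is required")
    # Sorting gives every instance the same view of the ordering.
    instances = sorted(live_instances)
    assignment: Dict[str, List[int]] = {name: [] for name in instances}
    for position, rule_id in enumerate(sorted(rule_ids)):
        owner = instances[position % len(instances)]
        assignment[owner].append(rule_id)
    return assignment


rules = list(range(1, 101))  # 100 alert rules

# Both instances healthy: the rules are split evenly.
both_up = shard_rules(rules, ["edge-a", "edge-b"])
# edge-a has failed: the survivor owns every rule.
one_down = shard_rules(rules, ["edge-b"])

print({name: len(ids) for name, ids in both_up.items()})
print({name: len(ids) for name, ids in one_down.items()})
```

Because each rule has exactly one owner in every assignment, no rule is evaluated twice, which is the property that prevents duplicate alert events.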
