Managing Alerts
In vuSmartMaps, users can create any number of alert rules. The system evaluates all configured alert rules at regular intervals and tracks the alarm state of each.
Saved alert rules can be viewed or modified by users who have the appropriate access permissions for each rule.
- Listing Alert Rules Configured: You can view the name, description, status, and creation and modification details of each alert rule, and edit or delete rules directly from this list.
- Permissions: In this section, you can assign three types of permissions to roles: "None" (no permissions), "View" (read-only access), and "Modify" (full editing access).
- Enabling and Disabling Alert Rules: You can enable or disable alert rules with a simple toggle.
- Alarm Mode: Continuously monitors conditions, sending an active notification when they become true and a clear notification when they become false.
- Disabling Alarm Mode: Non-alarm mode is ideal for one-time events.
- Changing from one mode to another: You can change an alert rule from alarm mode to non-alarm mode, or vice versa, by modifying the configuration and saving it.
Listing Alert Rules Configured
To view existing alert rules, click the Alerts button under the Explore section on the right side of the Home page. The list shows the following details for each rule:
- Name - The name of an Alert Rule.
- Description - The description of the Alert Rule.
- Status - Describes whether the alert is Enabled or Disabled.
- Created By - The user who originally created the Alert Rule.
- Created At - The date and time when the Alert Rule was originally created.
- Modified By - The user who most recently modified the Alert Rule.
- Modified At - The date and time when the Alert Rule was most recently modified.
Edit: Click on the Edit Button under the Actions column to view or edit the respective Alert Rule.
Delete: Click on the Delete Button under the Actions column to delete the respective Alert Rule.
Multi Delete: Select one or more Alert Rules and click the Delete button at the top right.
Permissions
Click Permissions to manage object-level permissions for the Alert Rule.
For every role, you can assign one of three permission types:
- None: The role has no permissions on the Alert Rule.
- View: The role can only view the Alert Rule.
- Modify: The role can view and modify the Alert Rule.
Edit: Click on this to edit the Alert Rule.
Clone: Click on this to replicate the current Alert Rule. This is the easiest and quickest way to duplicate an existing Alert Rule and make the relevant changes to match your requirement.
Enter a unique name for the clone.
Enabling and Disabling Alert Rules
An alert rule can be disabled using the Alert Enable switch in the top right corner of the alert rule. When an alert rule is disabled, vuSmartMaps stops evaluating the rule's alert conditions and does not generate any notifications for it.
Enable: Check one or more Alert Rules and click Enable.
Disable: Check one or more Alert Rules and click Disable.
Alarm Mode
By default, alert rules in vuSmartMaps work as alarms. In alarm mode, the system continuously monitors the alert conditions. It triggers an active alert notification when the conditions become true and a clear notification when they become false.
For example, let's say you have an alert rule to monitor host CPU usage. When the CPU usage exceeds the set threshold, vuSmartMaps sends a notification. As long as the CPU usage remains above the threshold, no additional notifications are sent. Once the CPU usage falls below the threshold, vuSmartMaps sends a clear notification.
Figure: The points at which Alarm Active Notification and Clear Notifications are sent
The clear notification for alarms will include details about how long the alarm was active and provide metrics and information for the entire duration of the alarm.
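The alarm-mode behaviour described above can be pictured as a small state tracker. The following is a minimal sketch, not the vuSmartMaps implementation: the `AlarmTracker` class, its method, and the notification strings are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlarmTracker:
    """Tracks one rule's alarm state across evaluations (hypothetical helper)."""

    # Evaluation time (seconds) at which the alarm became active, or None if inactive.
    active_since: Optional[float] = None

    def evaluate(self, condition_met: bool, now: float) -> Optional[str]:
        """Return the notification to send for this evaluation, if any."""
        if condition_met and self.active_since is None:
            self.active_since = now
            return "ACTIVE"
        if not condition_met and self.active_since is not None:
            duration = now - self.active_since
            self.active_since = None
            return f"CLEAR after {duration:.0f}s"
        return None  # state unchanged: no new notification


tracker = AlarmTracker()
# CPU above the threshold at t=0 and t=300, back below it at t=600 and t=900.
samples = [(True, 0), (True, 300), (False, 600), (False, 900)]
events = [tracker.evaluate(met, t) for met, t in samples]
print(events)  # ['ACTIVE', None, 'CLEAR after 600s', None]
```

Only the two state transitions produce a notification, and the clear notification carries the duration for which the alarm was active.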
Disabling Alarm Mode
You can configure an alert rule to work in non-alarm mode if you need general information notifications. When alarm mode is disabled, vuSmartMaps will generate notifications whenever the alert rule conditions are met. It will also continue to send alert notifications at regular intervals as long as the conditions are met.
However, in this mode, vuSmartMaps won't track the alert state, and it won't send clear notifications when the conditions are no longer met.
Figure: The points at which Notifications are sent in non-alarm mode
Non-alarm mode is particularly useful for one-time events. For instance, if you monitor Syslog messages and only want to be notified when a specific event such as 'System rebooted' is detected, you can configure a rule in non-alarm mode to check for this condition. vuSmartMaps then sends a notification each time the event occurs, without tracking it as an ongoing alarm.
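As a rough illustration of non-alarm mode, the sketch below emits one notification for every evaluation in which the condition holds and keeps no state between evaluations. The function name and message format are assumptions made for this example only.

```python
from typing import List, Tuple


def non_alarm_notifications(evaluations: List[Tuple[int, bool]]) -> List[str]:
    """Emit one notification per evaluation where the condition is met.

    No state is carried between evaluations, so there is nothing to clear.
    """
    return [f"NOTIFY at t={t}" for t, met in evaluations if met]


# 'System rebooted' detected at t=0 and t=600, not detected at t=300.
print(non_alarm_notifications([(0, True), (300, False), (600, True)]))
# ['NOTIFY at t=0', 'NOTIFY at t=600']
```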
Changing from one mode to another
You can switch an alert rule between alarm mode and non-alarm mode at any time by adjusting its configuration and saving it.
It's important to remember that when you switch from alarm mode to non-alarm mode, the system stops tracking the alarm state for the rule. This means that no clear notification will be sent for alarm conditions that were active at the time of the configuration change. The same principle applies when an alert rule in alarm mode is disabled in the configuration. In this case as well, the system won't send an explicit clear notification.
Behind the Scenes - vuSmartMaps Alert Rule Engine
Internally, the vuSmartMaps Alert Rule Engine regularly checks all the alert configurations. It looks for conditions that are true and creates notifications for those conditions, delivering them through the configured channels.
Frequency of Rule Evaluation
You can choose how often vuSmartMaps evaluates the alert configurations. In the user interface, go to Alert controls and select the frequency for alert execution.
The time frame over which a metric is evaluated within an alert rule is separate from how often the alert rule itself is evaluated. The diagram below shows this.
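To make the distinction concrete, the sketch below lists the metric time window examined by each run of a rule. The 5-minute frequency and 15-minute window are example values only, and `evaluation_windows` is a hypothetical helper, not part of vuSmartMaps.

```python
from datetime import datetime, timedelta
from typing import Iterator, Tuple


def evaluation_windows(
    start: datetime, frequency: timedelta, window: timedelta, runs: int
) -> Iterator[Tuple[datetime, datetime]]:
    """Yield the (from, to) metric window examined by each rule evaluation."""
    for i in range(runs):
        run_at = start + i * frequency  # when the rule is evaluated
        yield (run_at - window, run_at)  # the data that run looks at


first_run = datetime(2024, 1, 1, 12, 0)
for begin, end in evaluation_windows(
    first_run, timedelta(minutes=5), timedelta(minutes=15), runs=3
):
    print(f"run at {end:%H:%M} evaluates {begin:%H:%M} to {end:%H:%M}")
```

With these example values, consecutive runs are 5 minutes apart but each examines 15 minutes of data, so successive windows overlap by 10 minutes.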
Alarm Update Notifications
Update notifications can be sent for the same alarm in two situations:
- When the severity of the active alarm changes.
- When the alarm state of a specific metric within an alert rule changes.
Alarm State and Channels
When an alert condition changes from false to true, notifications are sent through all configured channels.
At regular intervals (based on the alert rule's scheduling frequency), if the alert condition remains true, the system provides ongoing updates through Alert Dashboards. These updates include real-time information about the associated metrics and the total duration of the active alarm.
However, after the initial active notification, no further notifications will be sent via email, SMS, or WhatsApp channels as long as the alarm state remains the same.
Updates will be sent to ticketing systems to update previously opened tickets for the alert condition (ticketing support is coming in the next release).
When the alarm state changes from active to inactive, clear notifications are sent through all configured channels except for the "Report" channel. The vuSmartMaps datastore will also record a new document capturing the clear notification.
The table below summarizes the behavior of different channels for various alarm notifications.
Channel | New Alarm | Alarm Periodic Update | Alarm Change in severity and affected metrics | Alarm Clear |
---|---|---|---|---|
Data Store | ✅ | ✅ | ✅ | ✅ |
Email | ✅ | | ✅ | ✅ |
Report | ✅ | | | |
ITSM | ✅ | ✅ | ✅ | |
Runbook | ✅ | | | |
WhatsApp | ✅ | | ✅ | ✅ |
SMS | ✅ | | ✅ | ✅ |
Non-Alarm Mode and Channels
When you disable alarm mode for an alert configuration, the system will keep generating notifications at regular intervals, usually every 5 minutes, as long as the alert conditions remain true. No tracking or notification is done when the alert condition is cleared.
When enabled, throttling suppresses notifications during the throttling period. It is supported in both alarm mode and non-alarm mode.
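One way to picture throttling is as a gate that blocks notifications for a fixed period after one has been sent. The sketch below assumes the throttling period starts when a notification is delivered; that detail, along with the `Throttle` class itself, is an assumption for illustration and is not taken from the product.

```python
from typing import Optional


class Throttle:
    """Allows at most one notification per throttling period (hypothetical)."""

    def __init__(self, period_seconds: float) -> None:
        self.period = period_seconds
        self.last_sent: Optional[float] = None  # time of the last allowed notification

    def allow(self, now: float) -> bool:
        """Return True if a notification may be sent at time `now`."""
        if self.last_sent is not None and now - self.last_sent < self.period:
            return False  # still inside the throttling period
        self.last_sent = now
        return True


throttle = Throttle(period_seconds=900)
# Condition is true at every 5-minute evaluation; only some notifications pass.
decisions = [throttle.allow(t) for t in (0, 300, 600, 900, 1200)]
print(decisions)  # [True, False, False, True, False]
```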
Enabling Debugs for Alert Module
Turning on debug logging for the vuAlert module generates extensive logs, especially when many alert rules are configured. To manage this, you can selectively enable debug logging for specific alert rules.
This feature can be configured in the auxiliary configuration file configs/alert_rule.yml, and the debug logs in this case are logged at the WARNING level.
# Add the names of the alert rules whose debug-level logs should be
# written to the vuSmartMaps logs.
"alert-debug-list": ["test123"]
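The effect of this setting can be sketched with Python's standard `logging` module: debug messages for listed rules are emitted at WARNING level so they pass a typical production log filter, while other rules stay at DEBUG. The logger name and helper functions below are assumptions for illustration, not the vuAlert code.

```python
import logging

logger = logging.getLogger("vuAlert")  # logger name is an assumption

# Mirrors the "alert-debug-list" setting from configs/alert_rule.yml.
ALERT_DEBUG_LIST = ["test123"]


def debug_level_for(rule_name: str) -> int:
    """Listed rules have their debug output raised to WARNING level."""
    return logging.WARNING if rule_name in ALERT_DEBUG_LIST else logging.DEBUG


def log_debug(rule_name: str, message: str) -> None:
    """Log a debug message for a rule at the level chosen by the debug list."""
    logger.log(debug_level_for(rule_name), "[%s] %s", rule_name, message)


log_debug("test123", "evaluating conditions")     # logged at WARNING
log_debug("other-rule", "evaluating conditions")  # logged at DEBUG
```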
Controlling various alert email sections
By default, an alert email has the following sections:
- Additional field information in the form of hashtags
- Historical data related to past events
- Contextual metrics table
- Information metrics table
If you do not want all of this information in alert emails, you can control it at the system level using the auxiliary configuration file configs/alert_rule.yml. The configuration below can be used as a reference.
# All the controls related to alert emails -
# Control for adding additional field tags in the email
"email-tags": True
# Control for adding historical data in email
"history-in-emails": True
# Control for adding contextual metrics table in the email
"contextual-metrics": False
# Control for adding information table in the email
"information-table": True
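As a sketch of how such flags could drive the email layout, the function below returns the sections that remain enabled for a given configuration. The key names come from the reference configuration above. The section titles, the function, and the choice to treat a missing key as enabled are assumptions.

```python
from typing import Dict, List

# Configuration key -> email section it controls (titles are illustrative).
SECTION_FLAGS: Dict[str, str] = {
    "email-tags": "Additional field tags",
    "history-in-emails": "Historical data",
    "contextual-metrics": "Contextual metrics table",
    "information-table": "Information metrics table",
}


def enabled_sections(config: Dict[str, bool]) -> List[str]:
    """Return the email sections to include; missing keys default to enabled."""
    return [title for key, title in SECTION_FLAGS.items() if config.get(key, True)]


reference_config = {
    "email-tags": True,
    "history-in-emails": True,
    "contextual-metrics": False,
    "information-table": True,
}
print(enabled_sections(reference_config))
# ['Additional field tags', 'Historical data', 'Information metrics table']
```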
Managing multiple vuAlert nodes
In a dockerized environment, you can run multiple vuAlert nodes, and all alert rules will be evenly distributed among these nodes for load balancing.
The assignment of alert rules to specific vuAlert nodes depends on the name of the alert rule. Each alert rule consistently runs on the same node.
Users also have the option to manually select a specific vuAlert node to run a particular alert rule if needed.
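The document does not describe how rule names are mapped to nodes. One common way to get a stable, name-based assignment is to hash the rule name and take the result modulo the number of nodes, as in the sketch below. The hashing scheme, the `assign_node` function, and the node names are all assumptions, shown only to illustrate why the same rule keeps landing on the same node.

```python
import hashlib
from typing import Dict, List, Optional


def assign_node(
    rule_name: str, nodes: List[str], pinned: Optional[Dict[str, str]] = None
) -> str:
    """Pick a node for a rule: manual pin first, otherwise a hash of the name."""
    if pinned and rule_name in pinned:
        return pinned[rule_name]  # manually selected node
    digest = hashlib.sha256(rule_name.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


nodes = ["vualert-1", "vualert-2", "vualert-3"]  # hypothetical node names
print(assign_node("cpu-high", nodes))
print(assign_node("cpu-high", nodes, pinned={"cpu-high": "vualert-3"}))  # vualert-3
```

Because the hash depends only on the rule name, repeated calls with the same node list return the same node. Note that a plain modulo scheme reshuffles many rules when the node count changes.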
Adding Additional Contents in Notification
Standard alert notifications include details like a summary, description, event duration, metric values, and historical data. However, users often need additional context about the affected components and metrics.
Adding Additional Fields to the Metric Information
For instance, when reporting CPU usage on a server, users might want the notification to include the server's location, the server's operating system, and the name of the application using it.
Information Rules
These rules provide contextual information when alarm notifications are generated. For example, if a high CPU usage notification occurs, users can receive a list of the top processes consuming CPU. Likewise, when a notification reports a low success rate for an e-commerce application, users can receive the top reasons for failures to pinpoint the problem area.
In all such cases, Information Rules can be applied to the alert rule.
For example, suppose your primary rule monitors system CPU usage, using hostname as the grouping level. You can then add an information rule that identifies the top CPU-consuming processes, grouped by both hostname and process ID.
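Conceptually, the information rule's results are matched to the alerting entity through the shared grouping field, here the hostname. The sketch below shows that matching on made-up sample data. The field names, values, and the `attach_information` function are hypothetical.

```python
from typing import Any, Dict, List


def attach_information(
    alerts: List[Dict[str, Any]],
    info_rows: List[Dict[str, Any]],
    top_n: int = 2,
) -> List[Dict[str, Any]]:
    """Attach the top CPU-consuming processes to each alert, matched on hostname."""
    enriched = []
    for alert in alerts:
        rows = [r for r in info_rows if r["hostname"] == alert["hostname"]]
        rows.sort(key=lambda r: r["cpu_percent"], reverse=True)
        enriched.append({**alert, "top_processes": rows[:top_n]})
    return enriched


# Primary rule output: one row per hostname (sample values are made up).
alerts = [{"hostname": "web-01", "cpu_percent": 93.0}]
# Information rule output: one row per (hostname, process ID).
info_rows = [
    {"hostname": "web-01", "pid": 411, "cpu_percent": 12.5},
    {"hostname": "web-01", "pid": 873, "cpu_percent": 61.0},
    {"hostname": "db-02", "pid": 90, "cpu_percent": 88.0},
    {"hostname": "web-01", "pid": 1290, "cpu_percent": 17.2},
]
result = attach_information(alerts, info_rows)
print([p["pid"] for p in result[0]["top_processes"]])  # [873, 1290]
```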
Using Multiple Information Rules
Feel free to add as many information rules as needed to enhance your alert notifications. Just remember to include at least one primary rule that specifies how to detect the alert condition.
Using Thresholds in Information Rule: You can set thresholds in your information rules. When you do, your alert notifications include color codes and insights based on these thresholds. These threshold conditions do not affect the alarm state; they only provide additional context in your notifications.
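As a minimal sketch of this idea, the function below maps an information-rule value to a display color and does nothing else, so the alarm state is never touched. The threshold values and color names are made up for the example.

```python
from typing import List, Tuple

# (lower bound, color), checked from highest to lowest; values are illustrative.
THRESHOLDS: List[Tuple[float, str]] = [(80.0, "red"), (50.0, "yellow"), (0.0, "green")]


def color_for(value: float) -> str:
    """Return the display color for an information-rule value.

    This only affects how the value is rendered in the notification.
    """
    for lower_bound, color in THRESHOLDS:
        if value >= lower_bound:
            return color
    return "green"  # fallback for values below every bound


print([color_for(v) for v in (91.0, 60.0, 10.0)])  # ['red', 'yellow', 'green']
```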