Threshold and availability for network devices

Add a network monitor and keep track of all the performance metrics for critical network devices to help network teams visualize, monitor, optimize, and manage network devices and interfaces.

Threshold and availability profiles help the alarm engine decide whether a specific network device or resource should be declared Down or in a Trouble state. Thresholds can also be configured for child attributes such as network interfaces and performance counters.

Add a threshold and availability profile 

The monitor's status changes to Trouble or Critical when the condition applied to any of the threshold strategies described in How it works is met.

  1. Click Admin > Configuration Profiles > Threshold and Availability.
  2. Click Add Threshold Profile in the Threshold and Availability screen, and in the drop-down, select Threshold Profile.
  3. Specify the following details for adding threshold and availability for a network device:
    • Monitor Type: Select Network Device from the drop-down list.
    • Display Name: Provide a label for identification purposes.
  4. Alert if the device is not responding to SNMP queries: Toggle to Yes to receive an alert when the device is not responding to SNMP queries.
  5. Alert when the interface is Down/Trouble: Toggle to Yes to receive an alert when any one of the interfaces in a device is Down or in a Trouble state.
  6. Alert when an individual switch in a switch stack is Down: Toggle to Yes to receive an alert when an individual switch in a switch stack is Down.
  7. Alert when the Temperature Sensor Status is Down: Toggle to Yes to receive an alert when the temperature sensor status is Down.
  8. Alert if the Temperature Sensor is malfunctioning: Toggle to Yes to receive an alert when the temperature sensor is malfunctioning.
  9. Alert when the Fan Sensor Status is Down: Toggle to Yes to receive an alert when the fan sensor status is Down.
  10. Alert if the Fan Sensor is malfunctioning: Toggle to Yes to receive an alert when the fan sensor is malfunctioning.
  11. Alert when the Power Sensor Status is Down: Toggle to Yes to receive an alert when the power sensor status is Down.
  12. Alert if the Power Sensor is malfunctioning: Toggle to Yes to receive an alert when the power sensor is malfunctioning.
  13. Alert when the Voltage Sensor Status is Down: Toggle to Yes to receive an alert when the voltage sensor status is Down.
  14. Alert if the Voltage Sensor is malfunctioning: Toggle to Yes to receive an alert when the voltage sensor is malfunctioning.
  15. Alert when a Peer is Down: Toggle to Yes to receive an alert when a peer is Down.
  16. Alert if trap processing is suspended due to device limit: Toggle to Yes to receive an alert when trap processing is suspended due to device limits.
  17. Threshold Type: Select Static Threshold to set thresholds manually, or Zia based Threshold to track abnormal spikes using anomaly detection and set a dynamic threshold. From the drop-down menu, choose the metrics for which thresholds need to be configured. For each metric, enter a value in that metric's unit in the Threshold field, set the threshold criteria (<, <=, =, >, >=, or !=) in the Condition field, select an appropriate Poll Strategy, enter the Poll Value, and choose the monitor state (Critical or Trouble). You'll receive alerts when these threshold conditions are breached. Thresholds can be set for the following metrics:
    • Device-level attributes: Response Time, Packet Loss, CPU Utilization, Memory Utilization, and System Uptime.
    • Interface-level attributes: In Traffic, Out Traffic, Total Traffic, Rx Utilized (%), Tx Utilized (%), Errors (%), Discards (%), Rx Volume, Tx Volume, Total Volume, Rx Unicast Packets, Tx Unicast Packets, Rx Broadcast Packets, Tx Broadcast Packets, Rx Multicast Packets, Tx Multicast Packets, Rx Non-Unicast Packets, and Tx Non-Unicast Packets.
    • Hardware Sensor Value: Temperature Sensor Value and Voltage Sensor Value.
    • BGP Peer Metrics: Total Flaps, In Updates, Out Updates, Messages Sent, and Messages Received.
    • OSPF Metrics: Total Flaps.
    • Tunnel: In Traffic and Out Traffic.
  18. Additional settings: For each of these threshold configurations, you can also select an Automation step and an Event reason template.
    Note

    You can select configlets as an Automation step. However, the step will be executed only if the network device is also added as an NCM monitor. 

  19. Advanced Threshold: Use advanced threshold settings to define complex alert conditions that combine multiple attributes with logical operators, so that anomalies are detected more accurately. You can build the condition from the drop-down menus in the Condition section, or supply custom logic through scripts in the Custom Function section.
  20. Click Save.
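
As a rough illustration of steps 17 and 19, the sketch below evaluates threshold rows against a single poll sample and then combines two conditions with a logical AND, in the spirit of an advanced threshold. It is plain Python rather than anything the product exposes: the `rules` layout, the function names, and the sample values are all hypothetical.

```python
import operator

# The comparison operators offered in the Condition field.
CONDITIONS = {
    "<": operator.lt,
    "<=": operator.le,
    "=": operator.eq,
    ">": operator.gt,
    ">=": operator.ge,
    "!=": operator.ne,
}


def breaches(value, condition, threshold):
    """Return True when `value <condition> threshold` holds."""
    return CONDITIONS[condition](value, threshold)


def evaluate(sample, rules):
    """Return (metric, state) for every rule whose condition the sample meets."""
    return [
        (metric, state)
        for metric, condition, threshold, state in rules
        if metric in sample and breaches(sample[metric], condition, threshold)
    ]


# Hypothetical threshold rows: (metric, condition, threshold, monitor state).
rules = [
    ("CPU Utilization", ">", 90, "Critical"),
    ("Packet Loss", ">=", 5, "Trouble"),
]

# Hypothetical values from one poll.
sample = {"CPU Utilization": 95, "Packet Loss": 1, "Memory Utilization": 88}

print(evaluate(sample, rules))

# An advanced-threshold style check: both conditions must hold at once.
cpu_and_memory_high = (
    breaches(sample["CPU Utilization"], ">", 90)
    and breaches(sample["Memory Utilization"], ">", 85)
)
print(cpu_and_memory_high)
```

This sketch looks at one poll only. Whether a breach actually changes the monitor's status also depends on the poll strategy chosen for the metric.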

How it works
Poll Count is the default strategy for validating threshold breaches. You can validate a threshold breach by applying any of the conditions (>, <, =, >=, <=, or !=) to your specified threshold strategy. The monitor's status changes to Trouble or Critical when the condition applied to any of the following threshold strategies is met:

  • Poll Count: The monitor’s status changes to Trouble or Critical when the condition applied to the threshold value is continuously validated for the specified poll count.
  • Poll Average: The monitor’s status changes to Trouble or Critical when the average of the attribute values for the number of polls configured continuously breaches the condition applied to the threshold value.
  • Time duration (in minutes): When the specified condition applied to the threshold value is continuously validated for all the polls during the configured time duration, the monitor’s status changes to Trouble or Critical.
  • Average time (in minutes): The monitor's status changes to Trouble or Critical when the average of the attribute values over the configured average time continuously satisfies the condition applied to the threshold value.
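
The four strategies above can be sketched in a few lines of Python. This is only an interpretation of the descriptions, not the product's implementation: the function names, the list-based inputs, and the half-open time window (`now - minutes < t <= now`) are assumptions made for illustration.

```python
import operator
from statistics import mean

# The comparison operators listed for threshold conditions.
OPS = {
    ">": operator.gt,
    "<": operator.lt,
    "=": operator.eq,
    ">=": operator.ge,
    "<=": operator.le,
    "!=": operator.ne,
}


def poll_count(values, op, threshold, count):
    """Poll Count: each of the last `count` polls meets the condition."""
    recent = values[-count:]
    return len(recent) == count and all(OPS[op](v, threshold) for v in recent)


def poll_average(values, op, threshold, count):
    """Poll Average: the mean of the last `count` polls meets the condition."""
    recent = values[-count:]
    return len(recent) == count and OPS[op](mean(recent), threshold)


def _window(samples, minutes, now):
    """Values of (minute, value) samples with now - minutes < minute <= now."""
    return [v for t, v in samples if now - minutes < t <= now]


def time_duration(samples, op, threshold, minutes, now):
    """Time duration: every poll inside the window meets the condition."""
    window = _window(samples, minutes, now)
    return bool(window) and all(OPS[op](v, threshold) for v in window)


def average_time(samples, op, threshold, minutes, now):
    """Average time: the mean of the polls inside the window meets the condition."""
    window = _window(samples, minutes, now)
    return bool(window) and OPS[op](mean(window), threshold)
```

For example, with polls of 80, 95, and 99 and a condition of `> 90`, `poll_count(..., count=3)` does not report a breach because the first poll is below the threshold, while `poll_average(..., count=3)` does, because the mean is above 90. An averaging strategy tolerates a single low reading, and a count-based strategy does not.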

The multiple-poll check strategy described under Poll Average is not applied by default. When no strategy can be applied, the threshold breach is validated against a single poll only.

Note
For the Time duration (in minutes) and Average time (in minutes) strategies to detect threshold breaches as intended, specify a time duration that is at least twice the check frequency applied to that monitor.
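
One way to read the note above: if polls are evenly spaced, a window shorter than twice the check frequency is only guaranteed to contain a single poll, so there is nothing to check "continuously" or to average. The sketch below encodes that arithmetic. The function names are hypothetical, and the reasoning is an interpretation rather than something this document states.

```python
def min_polls_in_window(duration_minutes, check_frequency_minutes):
    """Fewest evenly spaced polls guaranteed to fall inside the window."""
    return duration_minutes // check_frequency_minutes


def is_duration_sufficient(duration_minutes, check_frequency_minutes):
    """Apply the rule from the note: duration >= 2 x check frequency."""
    return duration_minutes >= 2 * check_frequency_minutes
```

With a check frequency of 5 minutes, a 10-minute duration passes the check and covers at least two polls, while a 5-minute duration does not pass.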
Info
Threshold profiles can also be configured for individual interfaces. Learn more about configuring thresholds for network interfaces and performance counters.

Edit a Threshold and Availability Profile for a network device

  1. Click the profile that you want to edit. Alternatively, navigate to Network > Network Devices and click a device. Click the hamburger icon, then click Edit. Next, click the pencil icon beside the Threshold and Availability field in the Configuration Profiles section.
  2. Edit the parameters that need to be changed in the Edit Threshold and Availability window.
  3. Click Save.

Delete a Threshold and Availability Profile for a network device

  1. In the Threshold and Availability screen, click the profile that you want to delete.
  2. The Edit Threshold and Availability window opens.
  3. Click Delete.
