Performance Optimization Guide

From Allegro Network Multimeter Manual

== About ==


This guide is about performance optimization of the '''Allegro Network Multimeter''' for specific use cases. By default, the device runs in a configuration that fits for the majority of users and you do not need to change any parameter of the configuration. Depending on the actual network traffic and measurement setup, it can be beneficial to adjust some performance-related parameters to achieve better overall performance.


== High level Allegro system layout ==


The '''Allegro Network Multimeter''' has various components that process traffic. These components are:


* I/O threads: responsible for all I/O operations between the network interface cards and the CPUs.
* Analyzer threads: responsible for decoding network traffic and most of the database operations for the statistical values.
* DB threads: optional threads which offload memory intensive database operations, see [[DB mode]].


The '''Allegro Network Multimeter''' uses queues to buffer packets and messages between the hardware components (interface chips, central processing unit, storage) and threads. All threads measure their load individually, which can be monitored at '''Info''' → '''System info''' → '''Load'''.


The utilization of the following queues can be monitored to see if and where changes to queue settings are helpful:


== Interface hardware queues ==


The interface hardware queue is between the network interfaces and the I/O threads of the central processing unit. Whenever the I/O threads are too slow to consume all packets from the built-in or extension network interfaces, the '''hardware miss''' counter of the [[Interface statistics]] will increase over time.  


The load of the I/O threads can be checked at '''Info''' → '''System info''' → '''Load'''.


TODO: add overloaded Interface graph.


If the load is near 100%, packet loss can occur and the following countermeasures can be attempted:
 
# The Bridge mode requires approximately 10% - 30% more load on the I/O threads than the Sink mode, because the I/O threads have to send the incoming traffic to the corresponding outgoing network interface for forwarding. If packet forwarding is not necessary (for example when the device is deployed at a mirror port), switching to '''Sink mode''' will improve the performance of the device. For configuration, please see [[Global_settings#Packet_processing_mode|Global settings]].
# The number of I/O queues can be increased at the cost of analyzer threads. Each queue uses a dedicated CPU thread, so more queues mean fewer CPU threads available for other components. Due to the required number of internal CPU cores, Allegro Packets recommends increasing the number of queues only on the Allegro 3000 and above, and only if necessary.
:: This setting can be changed at '''Settings''' → '''Global settings''' → '''Expert settings'''.
:: [[File:Rx sockets io thread.png|600px]]
:: Allegro Packets recommends testing with Hyper-Threading (HT) enabled and 2 or 4 queues for I/O. If you see a high load on the analyzers, you can also test with 4 queues (I/O threads) without HT for maximum performance.


== Analyzer queues ==


The '''Allegro Network Multimeter''' has a packet queue between the I/O threads and the analyzer threads, which is used to distribute the incoming packets to the actual processing threads. There are two statistics describing the load of the queues and the analyzer threads. If the analyzer threads cannot process incoming packets quickly enough, the corresponding queue will eventually overflow and packets must be skipped for processing.


<ol>
<li>Graphs for skipped packets: Check the '''Interface stats''' per interface to see whether all I/O threads were able to push all packets to the analyzer queues. The corresponding counter is '''Not processed due to overload'''.<br />
[[File:Interface analyzer packet drop.png|400px]]<br />
Note that high counters for a few seconds at the initial startup of the device are normal when it is started under high network load.
</li>
<li>Utilization of individual analyzer threads: The load of the analyzer threads can be checked at '''Info''' → '''System info''' → '''Load'''. Depending on the Allegro model, there can be two (Allegro 200, Allegro 500/510) or up to 120 (Allegro 5300 or 5500) analyzer threads.
The load graph gives an indication of the overall utilization of each thread, but the important counter is '''Not processed due to overload''', since it counts the events where one or more packets ultimately could not be processed due to overload.
</li>
</ol>
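The skip-on-overflow behavior described above can be illustrated with a minimal sketch. The class and attribute names below are hypothetical (the actual firmware internals are not public); the point is only that a bounded queue counts packets as skipped when its consumer is too slow, which is what the '''Not processed due to overload''' counter reflects.

```python
from collections import deque

class AnalyzerQueue:
    """Toy model of a bounded packet queue between an I/O thread
    and an analyzer thread. Names are illustrative only."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = deque()
        self.not_processed_due_to_overload = 0  # counter shown in the UI

    def push(self, packet):
        if len(self.queue) >= self.capacity:
            # Queue full: the packet is skipped for analysis.
            self.not_processed_due_to_overload += 1
            return False
        self.queue.append(packet)
        return True

    def pop(self):
        return self.queue.popleft() if self.queue else None

# A burst of 10 packets into a queue of capacity 8 while the
# analyzer is stalled skips 2 packets.
q = AnalyzerQueue(capacity=8)
for i in range(10):
    q.push(i)
print(q.not_processed_due_to_overload)  # 2
```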


There are three scenarios in which a queue overload can occur; they are described in the following sections:


=== Skipped packets at high analyzer load ===


When the load of one or multiple analyzers reaches 100%, the '''Allegro Network Multimeter''' has reached its processing limit for the current traffic.


There are some options to reduce the analyzer load, but they come with the penalty of no longer seeing the entire measurement data: you can '''disable''' some features or add an '''Ingress (NIC) filter''' to process only part of the traffic.


<ol>
<li>You can reduce the level of analysis at '''Settings''' → '''Global settings'''.<br />
[[File:Detail of traffic analysis.png|1000px]]<br />
Every level reduction decreases the amount of analyzed data and saves database operations; see [[Global_settings#Detail_of_traffic_analysis]] for more details on this option. It is possible to adjust the setting so that live traffic is stored as fast as possible to the ring buffer without further analysis, and to re-analyze parts of the recorded traffic with full analysis by using [[Parallel packet processing]].</li>
<li>The Ingress (NIC) filter can be used to reduce the amount of monitored traffic. It excludes traffic from the analyzers at the cost of not seeing all traffic on the link. See the [[Filter|interface filter]] for more details.</li>
</ol>


If none of these options are applicable, you need to upgrade the '''Allegro Network Multimeter''' to a larger model with more performance (Allegro 1000 to 3000 or 3000 to 5000).


=== Skipped packets at low analyzer load ===


The '''Allegro Network Multimeter''' conserves energy in very low traffic situations. A large packet burst can lead to a sudden high traffic situation, so the analyzer threads cannot keep up with the incoming packets during the short period in which the power state is adjusted. After that period, the analyzer threads are again fast enough to process the traffic.


This situation can be identified when packets are not processed even though the system load is not very high.


The '''Ingress packet memory''' provides a buffer for received packets which helps to avoid skipping packets in this situation. The historic usage of this memory can be viewed on the '''Info''' → '''System info''' page.


If for some reason the amount of '''Ingress packet memory''' does not suffice, it can be increased under '''Settings''' → '''Global settings''' → '''Expert settings'''.


=== Skipped packets due to analyzer load imbalance ===


By default, the Allegro load balances the traffic between the analyzers based on the IP addresses of the client and server. This provides good balancing in most situations.  


Network packets cannot be distributed equally among all analyzer threads if there are many connections between only a few IP addresses. An example is two SIP trunks carrying many RTP connections.


In this case, the load statistics will show some analyzer threads with a constantly high load and others with a significantly lower load.


The load balancing behavior of the '''Allegro Network Multimeter''' can be changed to flow-based load balancing mode at '''Settings''' → '''Global settings''' → '''Expert settings'''.


[[File:Flow load balancing.png|1000px]]


This mode improves the performance only for imbalanced traffic. Use the option only if required since it has a negative performance impact on balanced traffic.
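The difference between the two balancing modes can be sketched with a simplified model. The actual hash functions used by the device are internal and not documented; Python's built-in `hash()` stands in here purely for illustration. IP-pair-based balancing sends all traffic between two hosts to a single analyzer, while flow-based balancing also includes the ports, spreading many RTP streams between the same two SIP trunks across all analyzers.

```python
# Simplified model of the two load-balancing modes (illustrative only;
# the device's real hash functions are internal).
NUM_ANALYZERS = 8

def analyzer_by_ip_pair(src_ip, dst_ip):
    # Default mode: all traffic between two hosts maps to one analyzer.
    return hash((min(src_ip, dst_ip), max(src_ip, dst_ip))) % NUM_ANALYZERS

def analyzer_by_flow(src_ip, src_port, dst_ip, dst_port):
    # Flow-based mode: ports are included, so many connections between
    # the same two hosts spread across the analyzers.
    endpoints = frozenset([(src_ip, src_port), (dst_ip, dst_port)])
    return hash(endpoints) % NUM_ANALYZERS

# Two SIP trunks with 1000 RTP streams between them:
flows = [("10.0.0.1", 20000 + i, "10.0.0.2", 30000 + i) for i in range(1000)]

used_ip = {analyzer_by_ip_pair(s, d) for s, _, d, _ in flows}
used_flow = {analyzer_by_flow(*f) for f in flows}
print(len(used_ip))    # 1 -> all load lands on a single analyzer
print(len(used_flow))  # with 1000 distinct flows, all 8 analyzers are used
```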


== Database queues ==


The database mode is an extension for large Allegro models with multiple CPU sockets and it is disabled by default. This mode is only recommended for Allegro 3500 rev1 and Allegro 5500 rev1. The database mode helps to improve the performance for very high database loads. This could happen for millions of open and new connections in combination with NUMA bottlenecks. See [[DB mode]] for more details.


If enabled, you can check in the load statistics whether there are message drops between the analyzer threads and the DB threads.
The ratio of DB threads to analyzer threads can be adjusted so that ideally all threads have a similar load.
The advantage of the DB mode is that additional message queues are used which can buffer much more information and therefore reduce the load on the analyzer threads. This reduces the likelihood of skipped packets.


== Disk I/O queues ==


The analyzer threads have to use additional queues for capturing packets to each disk or disk cluster. Storage devices like HDDs and SSDs do not offer a constant write rate and can have sudden write slowdowns. Please read the performance guide for the ring buffer, [[Ring_Buffer_Configuration_Guide#Performance]], on how to adjust the options for high capturing performance.
The two generic solutions are to increase the buffer and to use filter rules. Filter rules reduce the number of bytes that are written to the disk, while a larger buffer absorbs temporary write slowdowns.
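Why a larger buffer helps can be shown with a toy simulation. All numbers below are illustrative, not device specifications: capture data arrives at a constant rate while the disk stalls for a while, and a buffer sized to cover the stall avoids dropped bytes.

```python
def simulate_capture(buffer_capacity, incoming_per_tick=100, ticks=60):
    """Toy simulation of capturing to a disk with a sudden write
    slowdown. Illustrative numbers only, not device specifications."""
    buffered = dropped = 0
    for tick in range(ticks):
        # The disk stalls completely for 10 ticks in the middle of the run;
        # otherwise it writes faster than data arrives.
        write_rate = 0 if 20 <= tick < 30 else 150
        buffered += incoming_per_tick
        buffered -= min(buffered, write_rate)
        if buffered > buffer_capacity:
            # Buffer overflow: excess capture data is lost.
            dropped += buffered - buffer_capacity
            buffered = buffer_capacity
    return dropped

print(simulate_capture(buffer_capacity=500))   # 500: bytes lost during the stall
print(simulate_capture(buffer_capacity=1500))  # 0: the buffer covers the stall
```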
