Generic modules (Part 3)
Packet ring buffer
The ring buffer feature creates a fixed-size buffer on an external storage device to which all processed packets are recorded. When the buffer is full, the oldest packets are replaced with new packets in a round-robin fashion. If the feature is not enabled, a button titled ‘Create ring buffer’ is visible. Clicking it opens a dialog in which the size of the ring buffer can be specified. Make sure that enough space is available on the external storage device. As soon as the ring buffer has been created, statistics about the ring buffer are displayed instead of the button:
- Timestamp of oldest packet: The timestamp of the oldest packet in the ring buffer.
- Total size: The total size of the ring buffer on the external storage device. If the cluster packet ring buffer feature is active and the Write redundancy level is set to a value other than no replication, an adjusted value is displayed to reflect the redundant copies of packet data. The raw on-disk value is displayed next to it in parentheses.
- Used size: The amount of storage currently used in the ring buffer. If the cluster packet ring buffer feature is active and the Write redundancy level is set to a value other than no replication, an adjusted value is displayed to reflect the redundant copies of packet data. The raw on-disk value is displayed next to it in parentheses.
- Overall bytes captured since start: The number of bytes captured since system start. This may be smaller than the used size if the system has been restarted, and it may be larger than the used size if the ring buffer is full. The history graph shows the captured traffic of the last minute or of the selected interval (if set).
- Bytes dropped since start: The traffic which was processed but could not be written to the ring buffer since the start of processing. This is usually an indicator that writes to the external storage device were not fast enough. The history graph shows the drops over time.
- Bytes discarded due to snapshot length rules since start: The traffic which matched the snapshot length rules criteria and was not written to the ring buffer. The history graph shows discarding over time.
- Data in flight: The amount of data which is currently stored in the queue that holds processed packets before they are written to the packet ring buffer. If larger bursts of traffic need to be stored in this queue the size can be modified in the capture module settings.
When the ring buffer is full and old packets are deleted, the graphs show the time range with no data on a dark grey background. The time range before the start of the ring buffer is visualized in the same way.
When the ring buffer is running, the behavior of the PCAP capture buttons throughout the system changes. If the user interface is in live mode and a capture is started, a dialog asks how far back in time the capture should start. This makes it possible, for example, to capture the traffic of an IP address starting from an hour ago. The capture then continues with live traffic. If the user interface is in “back-in-time” mode (a timespan from the past is selected), starting a capture opens a dialog asking to confirm that the capture will cover exactly the selected timespan. The capture stops automatically after the selected timespan has been processed.
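The round-robin replacement and the relationship between the statistics can be illustrated with a small model. This is only a sketch, not the device's implementation: the class name `PacketRingBuffer` and its attributes are hypothetical, and sizes are counted in payload bytes without any on-disk overhead.

```python
from collections import deque


class PacketRingBuffer:
    """Toy model of a fixed-size packet ring buffer (sizes in bytes)."""

    def __init__(self, total_size):
        self.total_size = total_size
        self.used_size = 0
        self.bytes_captured = 0  # models "Overall bytes captured since start"
        self.packets = deque()   # (timestamp, payload), oldest first

    def write(self, timestamp, payload):
        if len(payload) > self.total_size:
            raise ValueError("packet is larger than the whole ring buffer")
        # Replace the oldest packets until the new one fits.
        while self.used_size + len(payload) > self.total_size:
            _, old = self.packets.popleft()
            self.used_size -= len(old)
        self.packets.append((timestamp, payload))
        self.used_size += len(payload)
        self.bytes_captured += len(payload)

    def oldest_timestamp(self):
        """Models "Timestamp of oldest packet"; None if the buffer is empty."""
        return self.packets[0][0] if self.packets else None


buf = PacketRingBuffer(total_size=10)
for ts in range(5):
    buf.write(ts, b"abcd")  # five packets of 4 bytes each
```

After the loop, only the two newest packets remain in the buffer, so the captured byte count is larger than the used size, as described for a full ring buffer.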
Cluster ring buffer
The cluster ring buffer feature uses multiple whole disks in parallel for a single packet ring buffer. It can optionally write redundant copies of packets to multiple disks to provide fault tolerance in case of a disk failure. Clicking the ‘Create cluster ring buffer’ button creates an empty cluster ring buffer, and the ‘Cluster configuration’ tab becomes available on the now visible packet ring buffer statistics page. At the very top of the ‘Cluster configuration’ tab, the ‘Write redundancy level’ can be configured. This level controls how many redundant copies of each packet are written. no replication means that only a single copy of each packet is written, which provides no redundancy. This level gives the highest write bandwidth for a given number of disks. single replication means that one additional copy of each packet is written to another disk, which reduces the total write performance for a given number of disks to half that of no replication. double replication and triple replication write two and three additional copies of each packet, respectively. Note that each level requires at least the number of replications + 1 disks in the cluster.
Below the ‘Write redundancy level’ setting is the list of all disks available for use in the cluster. The following columns are displayed in the list:
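The arithmetic behind the redundancy levels can be sketched as follows. The minimum disk count follows directly from the rule above. The `adjusted_size` formula is an assumption: the text only states that an adjusted value reflects the redundant copies, so dividing the raw on-disk value by the number of copies per packet is one plausible reading. All function names are hypothetical.

```python
# Additional copies written per packet for each level named in the text.
EXTRA_COPIES = {
    "no replication": 0,
    "single replication": 1,
    "double replication": 2,
    "triple replication": 3,
}


def min_disks(level):
    """Number of replications + 1, as stated in the text."""
    return EXTRA_COPIES[level] + 1


def adjusted_size(raw_bytes, level):
    """Assumed adjustment: raw on-disk bytes divided by copies per packet."""
    return raw_bytes // (EXTRA_COPIES[level] + 1)


def check_cluster(level, disk_count):
    """Raise ValueError if the cluster has too few disks for the level."""
    if disk_count < min_disks(level):
        raise ValueError(f"{level} needs at least {min_disks(level)} disks")
```

For example, `check_cluster("double replication", 2)` raises an error because double replication needs at least three disks.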
- Disk: A description of the disk and its capacity.
- Enclosure: If the disk is part of a multi-disk enclosure this column will show the enclosure number along with the slot number.
- Status: If the disk has been added to the cluster this column will display the current status as ‘ok’ or ‘failed’.
- Locator: For disks in a multi-disk enclosure, the button displayed in this column turns the slot locator LED on and off.
In the last, unlabeled column, three buttons are displayed with the following functionality:
- Add to cluster: Add a fresh disk to the cluster. The disk will be formatted and added as empty storage to the cluster. All previous data on the disk is lost.
- Resume in cluster: If the disk was previously part of a cluster, it can be resumed. The data on that disk then becomes part of the packet ring buffer again.
- Remove from cluster: Remove the disk from the ring buffer. The data stored on that disk is not part of the packet ring buffer anymore but the data is not removed from the disk. It can be resumed in the cluster at a later time.
If a disk is missing, for example because it was removed from the enclosure, it is displayed in a separate list with much of the same information as in the list described above, but with only one button, which removes it from the cluster packet ring buffer.
Packet ring buffer snapshot length filter
Rules can be configured that control the snapshot length of each packet stored in the packet ring buffer. These rules can also be used to prevent certain packets from being stored in the packet ring buffer at all. This makes it possible to fine-tune how much packet data is written to the packet ring buffer. The original length of a packet is still available in captures, except when the packet was not written to the packet ring buffer at all (e.g. due to a ‘discard’ rule).
These rules can be created, edited, deleted, moved up and moved down in the rules list by using the respective buttons.
The rules are evaluated in the order in which they are displayed in the rules list, from top to bottom. The first rule that matches a given packet is applied, and no further rules are evaluated for that packet. This means that the most generic rule should be at the bottom of the list (e.g. ‘all packets will be discarded’) and more specific rules should be higher up in the list (e.g. ‘packets with an IP matching 192.168.1.0/24 will be fully captured’).
When creating a snapshot length filter rule, a dialog is displayed that offers the following options:
- Rule condition: Match all packets or a certain MAC or IP address, TCP/UDP port, a layer 7 protocol, a VLAN tag or an interface. The input field below allows entering the corresponding value.
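First-match evaluation with the two example rules can be sketched as below. The rule representation and the function `action_for` are hypothetical, and two details are assumptions: a rule matches if either the source or the destination address lies in the network, and `None` is returned when no rule matches, since the text does not specify that case.

```python
import ipaddress

# Hypothetical rule list, ordered as in the rules list (top to bottom).
RULES = [
    (ipaddress.ip_network("192.168.1.0/24"), "full"),
    (None, "discard"),  # None stands for "match all packets"
]


def action_for(src_ip, dst_ip, rules=RULES):
    """Return the action of the first matching rule, or None if none match."""
    addrs = (ipaddress.ip_address(src_ip), ipaddress.ip_address(dst_ip))
    for network, action in rules:
        if network is None or any(addr in network for addr in addrs):
            return action  # first match wins; later rules are skipped
    return None
```

If the generic rule were placed at the top instead, it would match every packet first and the more specific rule would never be reached, which is why the ordering matters.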
- Negate: Controls comparison of the rule condition to the value. If this is off, the value must match. If this is on, the value must not match.
- Action: What shall be done with the matching packets.
– Snapshot length: The packet is captured with the maximum length specified in the input field below. If the packet is larger, the remaining bytes are discarded.
– Discard: Discard the whole packet.
– Full: The whole packet is captured.
– Header + data: Capture just certain parts of the packet. When selecting “L3 header”, layer 2 and layer 3 headers are stored. When selecting “L3 + L4 header”, layer 2, 3 and 4 headers are stored. When selecting “L3 + L4 + L7 data”, an input field is shown where the length of layer 7 data can be configured. In this case layer 2, 3 and 4 are stored together with the specified amount of layer 7 data.
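The effect of each action on the stored size of a packet can be sketched as follows. The function `stored_length` is hypothetical. Its header sizes are inputs with assumed defaults (14-byte Ethernet header, 20-byte IPv4 header without options, 20-byte TCP header without options); a real implementation would read the actual header lengths from each packet.

```python
def stored_length(packet_len, action, *, snaplen=0, l2=14, l3=20, l4=20, l7_bytes=0):
    """Bytes of a packet kept for a given action (header sizes are assumed inputs)."""
    if action == "discard":
        return 0
    if action == "full":
        return packet_len
    if action == "snapshot length":
        return min(packet_len, snaplen)
    if action == "L3 header":
        return min(packet_len, l2 + l3)
    if action == "L3 + L4 header":
        return min(packet_len, l2 + l3 + l4)
    if action == "L3 + L4 + L7 data":
        return min(packet_len, l2 + l3 + l4 + l7_bytes)
    raise ValueError(f"unknown action: {action}")
```

With these defaults, a 1514-byte TCP packet stored with “L3 + L4 header” keeps 54 bytes, and with 100 bytes of layer 7 data it keeps 154 bytes.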
Analyzing the packet ring buffer
When the packet ring buffer is activated, it is possible to restart the packet processing core and analyze all packets contained in the packet ring buffer. Pressing the Analyze packet ring buffer button opens a dialog in which the time range of the packet ring buffer to be replayed can be chosen. After this dialog is confirmed, the Network Multimeter resets all statistics and starts analyzing the contents of the packet ring buffer. Progress, statistics and the option to resume normal operation appear on the Packet ring buffer page.
Extracting the packet ring buffer
When the packet ring buffer is active, its complete contents can be extracted by capturing the complete timespan it contains. For convenience, a button labeled Extract packet ring buffer is available that opens the capture dialog with the start time and end time set to the appropriate values.
Pcap analysis module
The pcap analysis module allows analyzing pcap files by sending them to the device. After the pcap has been analyzed, the web interface shows all the metadata as if the packets were live traffic at the time of the pcap recording.
Starting a pcap analysis stops the network ports, so normal packet processing and forwarding are disabled. The network connections of the devices connected to the Multimeter will stop working.
Start new Upload
To select a file to analyze, drag a file from your file manager to the drop zone. Alternatively, click into the drop zone to open a file selection dialog. After a file has been selected, the name and the size of the pcap are displayed in the drop zone box.
To proceed, press the “Upload and analyze pcap” button. A modal dialog will open.
- A warning will be shown if the device is in bridge mode, since no more packets will be forwarded once pcap analysis mode starts.
- If a packet ring buffer is configured, it is possible to write packets to it. This allows simple extraction of packets as in live packet processing.
The pcap file itself will not be stored on the storage of the Multimeter (except in the packet ring buffer, if activated in the upload modal dialog).
PCAP analysis statistics
After the upload has started, a progress section is displayed. It includes a progress bar and the time of the last processed packet. When the progress bar is viewed in a different tab or in a different browser, it will not show the correct value.
Viewing the pcap metadata
During and after the upload of the file, all modules will show the metadata produced by analyzing the packets in the pcap file.
Resuming normal operation
After finishing the analysis, the processing can be set back to live mode by clicking the “Resume normal operation” button at the bottom of the page.
Incidents module
The Incidents module allows for notifications to be created when certain network incidents are detected. These notifications can be viewed in the web GUI and may also be delivered by email. Repeating incidents are counted as such and the time of the first and last occurrence of an incident is remembered. What makes an incident unique depends on the type of incident. Incidents can be configured with three levels of severity: low, medium and high. The first occurrence of a medium or high severity incident will trigger a status notification which is visible at the top right of the web GUI. Up to 1000 incidents will be remembered by the system and if this limit is exceeded the oldest incidents will be discarded.
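The bookkeeping described above (counting repeats, remembering first and last occurrence, notifying on the first medium or high severity occurrence, and discarding the oldest incidents beyond the limit) can be sketched like this. The class `IncidentStore` and the `key` argument are hypothetical. The sketch assumes that "oldest" means the incident that was first reported earliest, which the text does not specify.

```python
from collections import OrderedDict

MAX_INCIDENTS = 1000  # limit stated in the text


class IncidentStore:
    def __init__(self, limit=MAX_INCIDENTS):
        self.limit = limit
        self.incidents = OrderedDict()  # key -> record, oldest first

    def report(self, key, severity, now):
        """Record an occurrence; return True if a status notification is due."""
        record = self.incidents.get(key)
        if record is not None:
            record["count"] += 1
            record["last_seen"] = now
            return False  # repeats are counted, not notified again
        self.incidents[key] = {"severity": severity, "count": 1,
                               "first_seen": now, "last_seen": now}
        if len(self.incidents) > self.limit:
            self.incidents.popitem(last=False)  # discard the oldest incident
        return severity in ("medium", "high")
```

The `key` stands for whatever makes an incident unique for its type, for example a tuple such as `("new MAC", "aa:bb:cc:dd:ee:ff")`.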
Types of incidents
The following list shows which types of incidents can currently be detected and how they are triggered.
- new MAC: report an incident when a unicast Ethernet MAC address is seen for the first time.
- new DPI protocol for MAC: report an incident when a layer 7 protocol is first detected for a unicast Ethernet MAC address.
- broadcast packet rate exceeded threshold: report an incident when the number of broadcast packets within one second exceeds the configurable threshold.
- ARP responses with different MACs for the same IP within 60 seconds: report an incident when within the duration of 60 seconds two different unicast Ethernet MAC addresses respond as having the same IP address through ARP (address resolution protocol) messages. This may point to a configuration issue as two devices try to use the same IP address.
- new local IP address: report an incident when an IPv4 address belonging to a private network address range is seen for the first time.
- new DPI protocol for local IP: report an incident when a layer 7 protocol is first detected for an IPv4 address belonging to a private network address range.
- local IP address on multiple Ethernet MACs: report an incident when an IPv4 address belonging to a private network address range is seen with multiple Ethernet MAC addresses. This may point to a configuration issue as two devices try to use the same IP address.
- TCP handshake time exceeded threshold: report an incident when the time needed for the completion of a TCP handshake by a server exceeds the configurable threshold. If the TCP handshake time suddenly rises this may point e.g. to an overload of the server.
- TCP zero window packet: report an incident when a TCP zero window packet is seen. This means that the receive buffer for the connection at the IP sending the TCP zero window packet is full.
- DNS server stopped responding: report an incident if more than 3 requests to the DNS server went unanswered for a period of more than 5 seconds.
- interface link status changed: Report an incident if the Ethernet link status of the network interface changed. This incident is always reported as a new incident even for the same network interface.
- Interface link speed changed: Report an incident if the link speed of the network interface changed.
- Interface pair link speed and duplex mismatch: Report an incident if the link speed or the duplex mode of two corresponding mutual interfaces in bridge mode are different.
- Bandwidth below lower threshold: Report an incident when the measured bandwidth falls below a certain threshold. The threshold is configured in Mbit/s. The incident is active until the bandwidth is above the threshold again.
- Bandwidth above upper threshold: Report an incident when the measured bandwidth exceeds a certain threshold. The threshold is configured in Mbit/s. The incident is active until the bandwidth falls below the threshold again.
- Packet rate below lower threshold: Report an incident when the measured packet rate falls below a certain threshold. The threshold is configured in packets/s. The incident is active until the packet rate is above the threshold again.
- Packet rate above upper threshold: Report an incident when the measured packet rate exceeds a certain threshold. The threshold is configured in packets/s. The incident is active until the packet rate falls below the threshold again.
- Timeout for finishing an active bandwidth incident: Defines how long the bandwidth or packet rate has to stay above the lower (or below the upper) threshold again before the incident ends. With this setting, a traffic burst that keeps moving around the threshold within the configured time can be reduced to just one incident.
Interface throughput incidents are generated by the throughput measurement module as soon as a configurable threshold is exceeded. The incident contains a graph of the traffic for that interface, with some data points before and after the threshold was exceeded, depending on the measurement interval. A PCAP link for capturing from the packet ring buffer is shown. For further investigation of the incident, the button Use as global time range sets the global time range to the start and end of the incident graph (at least 5 seconds) so that all modules of the Allegro Network Multimeter show that time span. The incident generation can be configured as follows:
- throughput threshold exceeded: report an incident if the throughput of any network interface exceeded the threshold.
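How the end timeout merges a fluctuating burst into one incident can be sketched for the "Bandwidth above upper threshold" case. The class `UpperThresholdIncident` is hypothetical and only models the behavior as described in the list above; times are in seconds and values in Mbit/s.

```python
class UpperThresholdIncident:
    """Tracks 'bandwidth above upper threshold' incidents with an end timeout."""

    def __init__(self, threshold_mbit, end_timeout_s):
        self.threshold = threshold_mbit
        self.end_timeout = end_timeout_s
        self.active = False
        self.below_since = None  # time the value last dropped below the threshold
        self.incident_count = 0

    def update(self, now, mbit_per_s):
        """Feed one measurement; return whether an incident is active afterwards."""
        if mbit_per_s > self.threshold:
            self.below_since = None
            if not self.active:
                self.active = True
                self.incident_count += 1
        elif self.active:
            if self.below_since is None:
                self.below_since = now
            # End the incident only after staying below for the whole timeout.
            if now - self.below_since >= self.end_timeout:
                self.active = False
                self.below_since = None
        return self.active
```

With a timeout of 5 seconds, a series that alternates around the threshold every second stays a single incident until the bandwidth has remained below the threshold for 5 seconds. With a timeout of 0, each excursion above the threshold would count as a separate incident.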
- Throughput threshold (Mbit/s): The threshold is configured in Mbit/s.
- How long throughput must be above threshold to generate incident (in milliseconds): The throughput must exceed the threshold for this duration in order to generate the incident. If set to zero (default), the incident is generated immediately after the threshold has been exceeded.
- Throughput cool-down period between two incidents in milliseconds: Defines the time after an incident during which no new incident is generated, even if the threshold is exceeded. Once this period has passed, throughput incidents can be generated again.
Incidents are configured in the Incidents tab on the settings page located under Settings → Incident settings. Detection of an incident is enabled by setting its reporting severity to a value other than disabled. Some incidents have further configuration options, e.g. a threshold value. These options are located below the incident’s reporting severity setting. When the configuration is finished, the Save Settings button at the bottom of the Incidents settings page commits the changes.
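The interaction of the three throughput settings can be sketched as below. The class `ThroughputIncidentDetector` is hypothetical. It assumes that one continuous run above the threshold produces at most one incident, which the text does not state explicitly. Times are in milliseconds.

```python
class ThroughputIncidentDetector:
    def __init__(self, threshold_mbit, min_duration_ms=0, cooldown_ms=0):
        self.threshold = threshold_mbit
        self.min_duration = min_duration_ms
        self.cooldown = cooldown_ms
        self.above_since = None    # start of the current run above the threshold
        self.reported = False      # current run already produced an incident
        self.last_incident = None  # time of the most recent incident

    def update(self, now_ms, mbit_per_s):
        """Feed one measurement; return True if an incident is generated now."""
        if mbit_per_s <= self.threshold:
            self.above_since = None
            self.reported = False
            return False
        if self.above_since is None:
            self.above_since = now_ms
        if self.reported or now_ms - self.above_since < self.min_duration:
            return False
        if self.last_incident is not None and now_ms - self.last_incident < self.cooldown:
            return False  # still inside the cool-down period
        self.reported = True
        self.last_incident = now_ms
        return True
```

With the default duration of zero, the first measurement above the threshold generates the incident immediately.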
Incident email notifications
Email notifications for incidents can be configured in the Notification settings tab on the settings page located under Settings → Incident settings:
- Enable email incident notifications: turns the feature on or off.
- Severity threshold: defines the minimum severity an incident must have to trigger an email notification.
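The severity threshold check amounts to an ordered comparison of the three severity levels. The following is a minimal sketch; the function `should_send_email` and the mapping `SEVERITY_ORDER` are hypothetical names.

```python
SEVERITY_ORDER = {"low": 0, "medium": 1, "high": 2}


def should_send_email(enabled, incident_severity, severity_threshold):
    """True if notifications are on and the incident meets the minimum severity."""
    if not enabled:
        return False
    return SEVERITY_ORDER[incident_severity] >= SEVERITY_ORDER[severity_threshold]
```

For example, with a threshold of medium, incidents of medium and high severity trigger an email while low severity incidents do not.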
Viewing and deleting incidents in the web GUI
Incidents that have occurred can be viewed and deleted on the page located under Generic → Incidents. Here it is possible to filter incidents by their severity using the colored buttons on the top of the page. In the incidents table the incidents can also be filtered by searching for a text in their subject message as well as sorted by their severity or time of last occurrence. The Delete button at the end of each line will discard a single incident and the Delete all incidents button at the top of the page will discard all incidents. If an incident was discarded and is reported again it will be treated as a new incident and e.g. an email notification will be generated again. When the subject of an incident is clicked the detail page for the incident will be displayed. This page contains some more detailed information about the incident as well as links to statistics that may be relevant for investigating the incident.