Investigating network problems on remote sites
Often multiple locations are connected by dedicated lines. Sometimes, network problems are reported which cannot be easily tracked down. These can be caused by issues in the local network, the remote network, or the dedicated line itself.
This section describes how to use two Allegro Network Multimeters to measure packet loss and network latency between a local installation and a remote site.
The image shows an example setup for measuring packet loss and latency between two networks which are connected via the Internet.
- You need two Allegro Network Multimeters (any model).
- One Multimeter (the master device) will receive traffic information from the remote Multimeter via a standard SSL connection; the master device must be able to connect to the remote device. If a firewall is in place between the two devices, an additional rule might be necessary to allow connections to the remote device on port 443.
- Install the master device in your network where it can see the traffic sent to and received from the remote site. The Multimeter can be installed on a Mirror Port on a Switch which sees all the traffic, or inline between the local network and the uplink to the remote site.
- Install the device on the remote site at a network position where it sees all traffic from and to the local network. Again, this can be a Mirror Port on a Switch on the remote site, or inline between the remote network and the link to the local network.
- Remote device: There are no special configuration settings necessary on the remote device. The remote device just needs to be added as a multi-device.
- Master device:
- Access the web page of the Multimeter and go to "Generic -> Path measurement".
- Switch to the configuration tab.
- Click on the toggle button to enable the feature.
- Enter a descriptive name for the master device to make it easier to read the statistics.
- For the remote device, enter the information described above. You can also give this device a descriptive name.
- Enter a maximum packet delay. This parameter defines how long the master device waits for data from the remote device before it decides whether or not a packet has been lost. Larger values require more memory. Typical values are between 2 and 5 seconds.
- Finally save the configuration settings.
- In most cases, a note will appear at the bottom of the page stating that a restart of the packet processing is necessary. Follow the link to the administration page and click on "Restart processing". Be aware that this will interrupt the network connection for a few seconds (if in Bridge mode) and that you will lose all previously measured data.
- Return to the "Generic -> Path measurement" page.
- The "Measurement" tab should indicate that the measurement status is either warming up or running.
- The "Remote client status" should say "connected". If not, a button appears which can be clicked to reconnect to the remote device. Any error will be displayed in an information box.
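If the remote client status does not reach "connected", it can help to check whether the remote device accepts TCP connections on port 443 at all, for example from a host in the same network as the master device. The following is a minimal sketch using only the Python standard library. The function name `can_reach` and the address in the usage note are placeholders and not part of the Multimeter software.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    This only tests TCP reachability (e.g. that no firewall blocks the port).
    It does not perform the SSL handshake or any login.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Calling `can_reach("192.0.2.10")` with the remote device's real address in place of the placeholder returns `False` if the connection is refused or times out, which would point to a firewall rule or routing issue rather than to the path measurement configuration.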
The "Measurement" tab shows the results of the analysis of packet data between the master and remote device(s).
At the bottom, the fourth graph shows the packet rate of all traffic that is used for measurement. This includes the traffic seen on both devices, but excludes the traffic that is only seen on one device.
Checking packet loss
To identify packet loss during a time interval, first select a zoom level that shows the entire time range. You can also select the time range by clicking in the graph.
The second and third graphs show the number of packets lost separately for each direction. If packet loss occurred, you will see a non-zero value in the graph.
Keep in mind that the analysis waits for the configured maximum packet delay before deciding whether a packet is lost. The actual packet loss therefore happened before the data point shown in the graph, by up to the configured maximum packet delay.
Example: In the graph above, a packet loss is indicated at 17:45:34. With a configured maximum packet delay of 5 seconds, the lost packet was sent up to 5 seconds earlier, that is, between 17:45:29 and 17:45:34.
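The same arithmetic can be written as a short sketch that turns a data point from the packet loss graph into the time window in which the lost packet was actually sent. The function name `loss_window` is hypothetical, and the calendar date is arbitrary since only the time of day matters here.

```python
from datetime import datetime, timedelta
from typing import Tuple


def loss_window(data_point: datetime, max_packet_delay_s: float) -> Tuple[datetime, datetime]:
    """Return (earliest, latest) send time of a packet reported lost at data_point."""
    earliest = data_point - timedelta(seconds=max_packet_delay_s)
    return earliest, data_point


# Values from the example: loss shown at 17:45:34, maximum packet delay 5 seconds.
start, end = loss_window(datetime(2024, 1, 1, 17, 45, 34), 5)
print(start.time(), "-", end.time())  # 17:45:29 - 17:45:34
```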
The "Two-way latency" graph shows the minimum, maximum and average delay of both aggregated directions. Select the zoom level for the required time period and check if there are any unusual events in the graph.
A high maximum but low average value means that there have been a few points in time where the delay was high but the majority of the traffic experienced a lower delay.
A high maximum and high average indicates a general latency problem.
A high latency does not necessarily mean low bandwidth since network buffers can handle latency and still provide high bandwidth. But real-time applications such as audio calls or video chats will exhibit poorer quality due to high latency.
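The two latency patterns described above can be expressed as a simple rule of thumb. The sketch below is only an illustration: the function name `classify_latency` and the default threshold of 200 ms are assumptions, and a suitable threshold depends on the link and the applications in use.

```python
def classify_latency(avg_ms: float, max_ms: float, high_ms: float = 200.0) -> str:
    """Interpret average and maximum two-way latency for a time window.

    high_ms is a placeholder threshold for what counts as "high".
    """
    if max_ms < high_ms:
        return "no high latency"
    if avg_ms < high_ms:
        # High maximum, low average: only a few moments with high delay.
        return "sporadic spikes"
    # High maximum and high average: most traffic was delayed.
    return "general latency problem"
```

For example, an average of 30 ms with a maximum of 900 ms is classified as sporadic spikes, while an average of 450 ms with the same maximum is classified as a general latency problem.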
How to identify the traffic affected by packet loss or high latency
First, select the relevant time window when the packet loss or high latency occurred.
Once the time window has been selected, switch to "IP -> IP statistics" to see which IP addresses had traffic within the time window.
Packet loss on TCP connections always generates retransmission packets. Toggle the display of "TCP counters" in the top bar and sort the table by "TCP retransmissions" to see the IP addresses with the most retransmissions in that time period.
Select an IP address which shows a high retransmission rate and check its peers or connections to identify the traffic during the packet loss or high latency period.
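The web interface does this sorting for you. If the same counters are available outside the interface, for example copied into a script, the ranking can be reproduced with a few lines. This is a sketch under assumptions: the input layout (IP address mapped to TCP packets and TCP retransmissions), the function name `rank_by_retransmissions`, and the sample numbers are all made up for illustration and do not reflect an actual Multimeter export format.

```python
from typing import Dict, List, Tuple


def rank_by_retransmissions(
    counters: Dict[str, Tuple[int, int]], limit: int = 5
) -> List[Tuple[str, int, float]]:
    """Rank IP addresses by TCP retransmission count, highest first.

    counters maps an IP address to (tcp_packets, tcp_retransmissions).
    Returns (ip, retransmissions, retransmission_rate) tuples.
    """
    rows = []
    for ip, (packets, retrans) in counters.items():
        rate = retrans / packets if packets else 0.0
        rows.append((ip, retrans, rate))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]


# Made-up sample values using documentation address ranges (RFC 5737).
sample = {
    "192.0.2.10": (12000, 360),
    "192.0.2.20": (800, 2),
    "198.51.100.7": (5000, 45),
}
for ip, retrans, rate in rank_by_retransmissions(sample):
    print(f"{ip}: {retrans} retransmissions ({rate:.1%})")
```

The rate is included alongside the count because a host with little traffic can have few retransmissions in absolute terms and still be strongly affected.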
Read on about the Path measurement module for additional information about configuration, usage, and limitations.