LANline Presents Skype Debugging Methods

Debugging Skype with the Allegro Network Multimeter - Technical Article in LANline 2020

The Challenges of Skype

Everything is ready for the Skype conference. But then technology bites back. What to do? While defects in the camera, cables and microphone can usually be fixed quickly, a deeper protocol analysis is required if Skype is not performing as expected.

Skype as a phone and video solution

Skype was developed in 2003 and acquired by Microsoft in 2011. Initially launched as a peer-to-peer application, it now relies on dedicated Microsoft servers. Microsoft has integrated the service as an important building block in its range of products, which are primarily aimed at organizations. Skype is integrated in Office 365, and all new corporate subscriptions include Teams (formerly Skype for Business). The application enables individual and group calls, video telephony and online conferences. Many organizations increasingly use it as an alternative to VoIP, and its growing functionality has made it widely recognized as a useful multimedia business solution. For many organizations, the second quarter of 2020 was marked by social constraints. Corporate communications had to be relocated, at least in part, to the digital space. A large share of business activities now depends on conference calls and virtual meetings.

Analyzing Skype traffic using protocols

What distinguishes these connections? What is the quality of the connection for the client? Is the correct Skype server used? Is the server overloaded or is the network the problem? Can the switch prioritize the data? Is the audio and video quality excellent or poor? These questions mainly concern the quality of connection, service, and configuration. Microsoft servers are expected to be reliable and efficient. Are there any losses? What does the data throughput look like in your own network? Do all packets arrive on time, and do they pass through the firewall? The routes that Skype packets take can be long, and packets are occasionally delayed by congestion.

So, if an employee complains of a "bad connection," you need detailed network analysis tools to examine the Skype connection data and track down the error. Analysis is complicated by the fact that the control traffic is SSL encrypted and the RTP media content is encrypted as well. In addition, Skype does not use a standard protocol such as SIP to establish a connection within SSL; it uses a proprietary protocol that could change at any time. To investigate an issue, we therefore need to look at the protocols used and their parameters. Since SSL runs over TCP, the Layer 4 protocol, all TCP connection quality statistics can be examined for debugging. In addition, we can include some SSL and RTP connection data in the review.

Skype Debugging in real-time or via a pcap

Debugging is the structured examination of errors with professional tools and experience. Finding an error does not by itself fix it. How do we obtain the relevant data? A network analyzer provides traffic visibility, i.e. all relevant information about the connections and protocols. The best method to analyze a live connection is to capture a pcap of the Skype session and examine the results offline. If we have generated a pcap but do not know when and where the Skype quality issue occurred for other users, the analyzer's search function can help. Some analyzers incorporate a free text search to identify all Skype users. With access to the Skype connection protocol statistics, a network administrator can begin the search and also check other layers on this connection to gain a better understanding of the problem.

IP and TCP provide information

Once Skype users and their IP addresses are identified, we can see which Skype servers they were connected to. On Layer 4 we see TCP traffic, which accounts for only a small portion of the total Skype data stream. These connections carry the control traffic, including session set-up and termination. What were the TCP handshake times? How long did it take a Microsoft server to send the acknowledgement? Which servers had good response times, and which had poor ones?

DNS names and IP address ranges can differ for Skype because the Microsoft cloud control servers use load balancing that can direct data to different servers. An appropriate network analysis appliance can display how the connection was set up. A TCP module can measure the time it takes to get the first response to a TCP connection setup packet. This value is affected by the packet transit time and the load on the server. A quality rating is then calculated by a scoring algorithm and displayed.
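The measurement described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular appliance implements it: it assumes the packets have already been decoded into simple records, and the `Pkt` type, the function name, and the sample addresses are hypothetical. The handshake time is taken as the delay between the client's first SYN and the server's SYN-ACK.

```python
from typing import NamedTuple

SYN, ACK = 0x02, 0x10  # TCP flag bits


class Pkt(NamedTuple):
    ts: float    # capture timestamp in seconds
    src: str
    sport: int
    dst: str
    dport: int
    flags: int   # raw TCP flags byte


def handshake_times(packets):
    """Return {(client, cport, server, sport): seconds from SYN to SYN-ACK}."""
    pending = {}  # connection key -> timestamp of the first SYN
    result = {}
    for p in packets:
        if p.flags & SYN and not p.flags & ACK:
            key = (p.src, p.sport, p.dst, p.dport)
            pending.setdefault(key, p.ts)  # keep the first SYN, ignore retransmitted ones
        elif p.flags & SYN and p.flags & ACK:
            key = (p.dst, p.dport, p.src, p.sport)  # reverse direction
            if key in pending and key not in result:
                result[key] = p.ts - pending[key]
    return result


# Made-up sample data: one connection to a server on port 443.
packets = [
    Pkt(0.000, "10.0.0.5", 50000, "203.0.113.10", 443, SYN),
    Pkt(0.042, "203.0.113.10", 443, "10.0.0.5", 50000, SYN | ACK),
    Pkt(0.043, "10.0.0.5", 50000, "203.0.113.10", 443, ACK),
]
for conn, seconds in handshake_times(packets).items():
    print(conn, f"{seconds * 1000:.1f} ms")
```

Because the timer starts at the first SYN, a retransmitted SYN shows up as a long handshake time rather than being hidden, which is usually what you want when hunting for a poor connection.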

Figure 1: Example of TCP statistics showing TCP handshake times

 

After a successful TCP handshake, relatively little control data is sent over the session. However, the response times of these control connections are important. Are the response times constant? If not, this indicates a fluctuating network load. If there are traffic bursts between the test appliance and the server or client, these may be the cause of the poor connection. If TCP response times are extremely high, it may be because the Microsoft server is on another continent, or because network congestion is increasing latency. Other effects can be triggered by TCP retransmissions. If a Skype client always takes a long time to log in or frequently loses the connection, this may be because the server fails to accept the data packets. Again, the appliance can indicate how stable the control traffic is. If there is no data throughput, it may be due to TCP Zero Windows. This usually means that the TCP receive buffer of the system that sends the Zero Window packet is full and the system cannot receive any more data for this connection. A chart view helps you view TCP Zero Window packets over measured time periods.
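As a rough illustration of such a chart, the following Python sketch counts Zero Window packets per time interval. It assumes the capture has already been reduced to tuples of timestamp, advertised window size, and TCP flags; the function name and the sample values are hypothetical. RST segments are skipped because they commonly carry a window of 0 without indicating a full receive buffer.

```python
from collections import Counter

RST = 0x04  # TCP reset flag bit


def zero_window_histogram(packets, interval=1.0):
    """Count TCP Zero Window packets per time bucket.

    packets: iterable of (timestamp_seconds, window_size, tcp_flags).
    Returns {bucket_start_seconds: count}.
    """
    buckets = Counter()
    for ts, window, flags in packets:
        # A window of 0 on a non-RST segment means the sender's receive buffer is full.
        if window == 0 and not flags & RST:
            buckets[int(ts // interval) * interval] += 1
    return dict(buckets)


# Made-up sample: two stalls in second 2, one in second 5, and one RST that is ignored.
sample = [
    (0.2, 65535, 0x10),
    (2.1, 0, 0x10),
    (2.6, 0, 0x10),
    (3.4, 0, 0x14),
    (5.0, 0, 0x10),
]
print(zero_window_histogram(sample))
```

Plotting the returned counts over time shows whether stalls are isolated events or cluster around the moments users reported problems.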

SSL and RTP

On the upper layers of the OSI stack, Skype uses SSL to encrypt control traffic, login meta information, and all messages that pass through Microsoft servers. Although the content is encrypted, the SSL handshake response time and the first response time for encrypted SSL data remain visible. For example, handshake times in the two-digit millisecond range are normal, but times in the three-digit range can become problematic. A good network analyzer provides at least the SSL handshake time, the SSL data response times, the number of SSL requests and responses, the minimum and maximum response time in milliseconds, and a quality rating for the SSL server.
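The rule of thumb above can be applied to a list of measured handshake times with a short Python sketch. The function name, the field names, and the 100 ms threshold (the start of the three-digit range) are assumptions made for illustration; they are not the scoring used by any particular analyzer.

```python
def summarize_handshakes(times_ms, slow_ms=100.0):
    """Summarize SSL handshake times given in milliseconds."""
    if not times_ms:
        raise ValueError("no measurements")
    # Count handshakes that reach the three-digit millisecond range.
    slow = sum(1 for t in times_ms if t >= slow_ms)
    return {
        "count": len(times_ms),
        "min_ms": min(times_ms),
        "max_ms": max(times_ms),
        "slow": slow,
    }


# Made-up measurements: four normal handshakes and one slow one.
measurements = [18.0, 25.5, 31.2, 140.0, 22.1]
print(summarize_handshakes(measurements))
```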

Figure 2: SSL statistics provide connection-specific information

 

Audio and video traffic is sent in RTP packets with encrypted content and unencrypted headers. Skype uses its own voice and media codecs and transmits the content fully encrypted. Since RTP encryption is applied only to the content and not to the RTP header, we can still examine RTP traffic. Were packets lost, for which participants, how many, and how high is the jitter? RTP is transported over UDP, a ‘best effort’ delivery mechanism: delivery is not guaranteed, lost data is not retransmitted, and there is no feedback in the event of packet loss. However, we notice lost data as incomplete words and jerky video. In the analyzer, we can include packet loss as a source of error in our investigation. Ideally, all RTP packets arrive, and at regular intervals; irregular arrival leads to fluctuating quality. We should expect a data stream with constant, preferably low latency. A time-critical application such as Skype requires low jitter, i.e. little variation in packet transit time. A typical jitter of 2-3 ms can be acceptable. If the jitter rises to 50 ms and we detect lost packets, there is probably a connection issue. A network analysis tool shows which IP addresses used RTP. It gives an overview of the received and sent data, the RTP packet loss based on the RTP sequence numbers, and the jitter.
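Because the RTP header stays readable, loss and jitter can be estimated without decrypting anything. The Python sketch below reads the sequence number and timestamp from the fixed 12-byte RTP header, derives loss from gaps in the sequence numbers, and uses the interarrival jitter estimator from RFC 3550. The function names are hypothetical, and the clock rate is an assumption that has to match the codec in use: the example uses 8000 Hz purely for illustration, and Skype's own codecs may use a different rate.

```python
import struct


def parse_rtp_header(payload):
    """Return (sequence, timestamp, ssrc) from the fixed 12-byte RTP header."""
    if len(payload) < 12:
        raise ValueError("too short for an RTP header")
    first, _marker_pt, seq, timestamp, ssrc = struct.unpack("!BBHII", payload[:12])
    if first >> 6 != 2:
        raise ValueError("not RTP version 2")
    return seq, timestamp, ssrc


def rtp_stats(samples, clock_rate):
    """Estimate loss and jitter for one RTP stream (one SSRC, one direction).

    samples: non-empty list of (arrival_seconds, sequence, rtp_timestamp)
    in arrival order. Returns (lost_packets, jitter_ms). Duplicates and
    32-bit timestamp wrap-around are not handled in this sketch.
    """
    first_seq = max_seq = samples[0][1]
    cycles = 0           # number of 16-bit sequence number wrap-arounds
    jitter = 0.0         # running estimate in RTP timestamp units
    prev_transit = None
    for arrival, seq, timestamp in samples:
        ahead = (seq - max_seq) & 0xFFFF
        if 0 < ahead < 0x8000:       # packet is newer than the highest seen so far
            if seq < max_seq:
                cycles += 1          # sequence number wrapped from 65535 to 0
            max_seq = seq
        # Relative transit time: arrival time in clock units minus the RTP timestamp.
        transit = arrival * clock_rate - timestamp
        if prev_transit is not None:
            # RFC 3550: J += (|D| - J) / 16
            jitter += (abs(transit - prev_transit) - jitter) / 16.0
        prev_transit = transit
    expected = cycles * 65536 + max_seq - first_seq + 1
    lost = expected - len(samples)
    return lost, jitter / clock_rate * 1000.0


# Made-up stream: 20 ms packets, sequence wraps at 65535, packet 1 is missing.
stream = [
    (0.000, 65534, 0),
    (0.020, 65535, 160),
    (0.040, 0, 320),
    (0.080, 2, 640),
]
lost, jitter_ms = rtp_stats(stream, clock_rate=8000)
print(f"lost={lost} jitter={jitter_ms:.3f} ms")
```

The jitter value is a smoothed running estimate, so a single late packet raises it only slightly, while sustained irregular arrival pushes it towards the problematic range.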

Figure 3: A graphical display of RTP statistics helps to analyze packet loss and jitter

 

Incorrect Routes Affect Skype Quality

By examining TCP, SSL, and RTP connections, you can identify starting points that provide information about the cause of Skype problems. Two practical examples illustrate the relevance of this approach to Skype debugging. In the first, a user was incorrectly located in Asia, and all connections were routed to an Asian concentrator. This resulted in high packet round-trip times, and the audio quality suffered significantly. Without examining TCP statistics and the jitter, the source of the error would not have been detected. In the second case, data could be sent, but no voice or video data could be received. The application had insufficient rights, which prevented its traffic from passing the firewall. Visibility of the TCP/IP connection and a subsequent adjustment of the opened ports solved the problem.

Skype is a global, mission-critical service. If a subscriber complains of a poor Skype connection, the issue needs to be investigated and resolved. An analysis across TCP, SSL and RTP helps to find the remedy.

Author
Klaus Degner, Managing Director of Allegro Packets GmbH
