Characterization and Classification of Internet Backbone Traffic
Abstract: With this thesis, we contribute to a better understanding of Internet traffic characteristics through measurement and analysis of real-world Internet backbone data. We start with an overview of important considerations for passive Internet traffic collection on large-scale network links. The lessons learned from a successful measurement project on academic Internet backbone links can serve as guidelines for others setting up and performing similar measurements. The collected data are the basis for the analyses performed in this thesis. As a first result, we present a detailed characterization of packet headers, which reveals protocol-specific features and provides a systematic survey of packet header anomalies. The packet-level analysis is followed by a flow-level characterization. We propose a method and accompanying metrics to assess routing symmetry at the flow level based on passive measurements from a specific link. This method will help to improve traffic analysis techniques. We applied the method to our data, and the results suggest that routing symmetry is uncommon on non-edge Internet links. We then confirm the predominance of TCP as the transport protocol in backbone traffic. However, we observe an increase in UDP traffic during the last few years, which we attribute to P2P signaling traffic. We also analyze further flow characteristics such as connection establishment and termination behavior, revealing differences among traffic from various classes of applications. These results show the need for a more detailed analysis, i.e., classification of traffic according to network application. To accomplish this, we review state-of-the-art traffic classification approaches and subsequently propose two new methods. The first method provides a payload-independent classification of aggregated traffic based on connection patterns. This yields a rough traffic decomposition in a privacy-insensitive way.
We then present a classification method for fine-grained protocol identification that uses statistical packet and flow features. Preliminary results indicate that this method is capable of accurate classification in a simple and efficient way. We conclude the thesis by discussing limitations of current Internet measurement research. Considering the role of the Internet as a critical infrastructure of global importance, a detailed understanding of Internet traffic is essential. This thesis therefore presents methods and results that contribute additional perspectives on global Internet characteristics at different levels of granularity.