NFV Service Chains at the Speed of the Underlying Commodity Hardware

Abstract: Link speeds in networks will in the near future reach and exceed 100 Gbps. While available specialized hardware can accommodate these speeds, modern networks have adopted a new networking paradigm, known as Network Functions Virtualization (NFV), that replaces expensive specialized hardware with open-source software running on commodity hardware. However, achieving high performance using commodity hardware is a hard problem, mainly because of the processor-memory gap. This gap suggests that only the fastest memories of today’s commodity servers can achieve the access latencies that high-speed networks require. Existing NFV systems realize chained network functions (also known as service chains) mostly using slower memories; this implies a need for multiple additional CPU cores or even multiple servers to achieve high-speed packet processing. In contrast, this thesis combines four contributions to realize NFV service chains with dramatically higher performance and better efficiency than the state of the art.

The first contribution is a framework that profiles NFV service chains to uncover reasons for performance degradation, while the second contribution leverages the profiler’s data to accelerate these service chains by combining multiplexing of system calls with scheduling strategies. The third contribution synthesizes input/output and processing service chain operations to increase the spatial locality of network traffic with respect to a system’s caches. The fourth contribution combines the profiler’s insights from the first contribution and the synthesis approach of the third contribution to realize NFV service chains at the speed of the underlying commodity hardware. To do so, stateless traffic classification operations are offloaded into available hardware (i.e., programmable switches and/or network cards) and a tag is associated with each traffic class. At the server side, the hardware classifies input traffic classes based upon the values of these tags, which indicate the CPU core that should undertake their stateful processing, while ensuring zero inter-core communication.

With commodity hardware, this thesis realizes Internet Service Provider-level service chains and deep packet inspection at a line rate of 40 Gbps, and stateful service chains at the speed of a 100 GbE network card on a single 16-core server. This results in up to (i) 4.7x lower latency, (ii) 8.5x higher throughput, and (iii) 6.5x better efficiency than the state of the art. The techniques described in this thesis are crucial for realizing future high-speed NFV deployments.
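The tag-based dispatching summarized in the fourth contribution can be illustrated with a minimal software sketch. It is not the thesis's implementation: the rule table, the tag values, the modulo mapping from tag to core, and all names (`RULES`, `TAGS`, `Core`, `dispatch`) are assumptions made for illustration, and the steps that real deployments would run in a switch or network card are simulated here as plain Python functions.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    src_ip: str
    dst_ip: str
    proto: str
    dst_port: int


NUM_CORES = 4  # hypothetical core count

# Hypothetical stateless rules: (protocol, destination port) -> traffic class.
RULES = {("tcp", 80): "web", ("tcp", 443): "web", ("udp", 53): "dns"}

# One tag per traffic class (assumed encoding).
TAGS = {"web": 0, "dns": 1, "other": 2}


def tag_packet(pkt: Packet) -> int:
    """Stateless classification; stands in for the hardware offload."""
    traffic_class = RULES.get((pkt.proto, pkt.dst_port), "other")
    return TAGS[traffic_class]


def core_for_tag(tag: int) -> int:
    """Map a tag to the core that owns that traffic class (assumed mapping)."""
    return tag % NUM_CORES


class Core:
    """A core with private per-flow state; nothing is shared between cores."""

    def __init__(self, core_id: int) -> None:
        self.core_id = core_id
        self.flow_packets = defaultdict(int)

    def process(self, pkt: Packet) -> None:
        # Stateful processing: count packets per flow in core-local state.
        self.flow_packets[pkt] += 1


def dispatch(packets, cores) -> None:
    """Deliver each packet to the core indicated by its tag."""
    for pkt in packets:
        cores[core_for_tag(tag_packet(pkt))].process(pkt)


cores = [Core(i) for i in range(NUM_CORES)]
packets = [
    Packet("10.0.0.1", "10.0.0.9", "tcp", 80),
    Packet("10.0.0.1", "10.0.0.9", "tcp", 80),
    Packet("10.0.0.2", "10.0.0.9", "udp", 53),
    Packet("10.0.0.3", "10.0.0.9", "tcp", 22),
]
dispatch(packets, cores)
```

Because the tag is a deterministic function of stateless header fields, every packet of a given flow reaches the same core, so the per-flow state in this sketch never has to be shared or locked.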
