From fca59bea770346cf1c1f9b0e00cb48a61b44a8f3 Mon Sep 17 00:00:00 2001
From: Harald Welte
Date: Sun, 25 Oct 2015 21:00:20 +0100
Subject: import of old now defunct presentation slides svn repo

---
 .../network_performance.mgp | 236 +++++++++++++++++++++
 1 file changed, 236 insertions(+)
 create mode 100644 2006/hardware_kerneltuning_netperf-slac/network_performance.mgp

diff --git a/2006/hardware_kerneltuning_netperf-slac/network_performance.mgp b/2006/hardware_kerneltuning_netperf-slac/network_performance.mgp
new file mode 100644
index 0000000..303f527
--- /dev/null
+++ b/2006/hardware_kerneltuning_netperf-slac/network_performance.mgp
@@ -0,0 +1,236 @@
+%include "default.mgp"
+%default 1 bgrad
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+%nodefault
+%back "blue"
+
+%center
+%size 7
+Hardware Selection
+and Kernel Tuning
+for High Performance Networking
+
+Dec 07, 2006
+SLAC, Berlin
+
+%center
+%size 4
+by
+
+Harald Welte
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Network Performance & Tuning
+About the Speaker
+
+Who is speaking to you?
+	an independent Free Software developer
+	Linux kernel related consulting + development for 10 years
+	one of the authors of the Linux kernel packet filter
+	busy with enforcing the GPL at gpl-violations.org
+	working on Free Software for smartphones (openezx.org)
+		...and Free Software for RFID (librfid)
+		...and Free Software for ePassports (libmrtd)
+		...and Free Hardware for RFID (openpcd.org, openbeacon.org)
+		...and the world's first Open GSM Phone (openmoko.com)
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Network Performance & Tuning
+Hardware selection is important
+
+Hardware selection is important
+	Linux runs on just about anything from a cellphone to a mainframe
+	good system performance depends on optimum selection of components
+	sysadmins and managers have to understand the importance of hardware choice
+	determine hardware needs before purchasing!
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Network Performance & Tuning
+Network usage patterns
+
+Network usage patterns
+
+	TCP server workload (web server, ftp server, samba, nfs-tcp)
+		high-bandwidth TCP end-host performance
+	UDP server workload (nfs udp)
+		don't use it at gigabit speeds, data integrity problems!
+	Router (Packet filter / IPsec / ... ) workload
+		packet forwarding has fundamentally different requirements
+		none of the offloading tricks work in this case
+		important limit: pps, not bandwidth!
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Network Performance & Tuning
+Contemporary PC hardware
+
+Contemporary PC hardware
+
+	CPU often is extremely fast
+		2GHz CPU: 0.5ns clock cycle
+		L1/L2 cache access (four bytes): 2..3 clock cycles
+	everything that is not in L1 or L2 cache is like a disk access
+		40..180 clock cycles on Opteron (DDR-333)
+		250..460 clock cycles on Xeon (DDR-333)
+	I/O read
+		easily up to 3600 clock cycles for a register read on NIC
+		this happens synchronously, no other work can be executed!
+	disk access
+		don't talk about it. Like getting a coke from the moon.
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Network Performance & Tuning
+Hardware selection
+
+Hardware selection
+	CPU
+		cache
+			as much cache as possible
+			shared cache (in multi-core setup) is great
+		SMP or not
+			problem: increased code complexity
+			problem: cache line ping-pong (on real SMP)
+			depends on workload
+			depends on number of interfaces!
+			Pro: IPsec, tc, complex routing
+			Con: NAT-only box
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Network Performance & Tuning
+Hardware selection
+
+Hardware selection
+	RAM
+		as fast as possible
+		use chipsets with highest possible speed
+		amd64 (Opteron, ..)
+			has per-cpu memory controller
+			doesn't waste system bus bandwidth for RAM access
+		Intel
+			has a traditional 'shared system bus' architecture
+			RAM is system-wide and not per-CPU
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Network Performance & Tuning
+Hardware selection
+
+Hardware selection
+	Bus architecture
+		as few bridges as possible
+			host bridge, PCI-X / PCIe bridge + NIC chipset is enough!
+		check bus speeds
+		real interrupts (PCI, PCI-X) have lower latency than message-signalled interrupts (MSI)
+		some boards use PCIe chipset and then additional PCIe-to-PCI-X bridge :(
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Network Performance & Tuning
+Hardware selection
+
+Hardware selection
+	NIC selection
+		NIC hardware
+			avoid additional bridges (four-port cards)
+			PCI-X: 64bit, highest clock rate, if possible (133MHz)
+		NIC driver support
+			many optional features
+				checksum offload
+				scatter gather DMA
+				segmentation offload (TSO/GSO)
+				interrupt flood behaviour (NAPI)
+			is the vendor supportive of the developers?
+				Intel: e100/e1000 docs public!
+			is the vendor merging its patches into mainline?
+				Syskonnect (bad) vs. Intel (good)
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Network Performance & Tuning
+Hardware selection
+
+Hardware selection
+	hard disk
+		kernel network stack always is 100% resident in RAM
+		therefore, disk performance is not important for network stack
+		however, one hint:
+			for SMTP servers, use battery buffered RAM disks (Gigabyte)
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Network Performance & Tuning
+Network Stack Tuning
+
+Network Stack Tuning
+	hardware related
+		prevent multiple NICs from sharing one irq line
+			can be checked in /proc/interrupts
+			highly dependent on specific mainboard/chipset
+		configure irq affinity
+			in an SMP system, interrupts can be bound to one CPU
+			irq affinity should be set to ensure all packets from one interface are handled on the same CPU (cache locality)
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Network Performance & Tuning
+Network Stack Tuning
+
+Network Stack Tuning
+	32bit or 64bit kernel?
+		most contemporary x86 systems support x86_64
+		biggest advantage: larger address space for kernel memory
+		however, problem: all pointers now 8 bytes instead of 4
+			thus, larger in-kernel data structures
+			thus, decreased cache efficiency
+		in packet forwarding applications, ca. 10% less performance
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Network Performance & Tuning
+Network Stack Tuning
+
+Network Stack Tuning
+	firewall specific
+		organize ruleset in tree shape rather than linear list
+		conntrack: hashsize / ip_conntrack_max
+		log: don't use syslog, rather ulogd-1.x or 2.x
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Network Performance & Tuning
+Network Stack Tuning
+
+Network Stack Tuning
+	local sockets
+		SO_SNDBUF / SO_RCVBUF should be used by apps
+			in recent 2.6.x kernels, they can override /proc/sys/net/ipv4/tcp_[rw]mem
+		on long fat pipes, increase /proc/sys/net/ipv4/tcp_adv_win_scale
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Network Performance & Tuning
+Network Stack Tuning
+
+Network Stack Tuning
+	core network stack
+		disable rp_filter, it adds lots of per-packet routing lookups
+		check linux-x.y.z/Documentation/networking/ip-sysctl.txt for more information
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Network Performance & Tuning
+Links
+
+Links
+	The Linux Advanced Routing and Traffic Control HOWTO
+		http://www.lartc.org/
+	The netdev mailing list
+		netdev@vger.kernel.org
+
--
cgit v1.2.3
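
Note on the "hardware related" tuning slide in the patch above: the slide says shared IRQ lines can be checked in /proc/interrupts and that interrupts can be bound to one CPU. The sketch below is one possible way to do that check, not something taken from the slides. The function names `shared_irqs` and `affinity_mask` and the `SAMPLE` text are hypothetical and made up for illustration; the only kernel interfaces assumed are /proc/interrupts and the hex bitmask format of /proc/irq/<n>/smp_affinity.

```python
# Minimal sketch: find IRQ lines shared by several devices and compute
# the hex CPU mask that /proc/irq/<n>/smp_affinity expects.
# SAMPLE is invented illustration data, not output from a real machine.

SAMPLE = """\
           CPU0       CPU1
  0:        126          0   IO-APIC-edge      timer
 16:      12345          0   IO-APIC-level     eth0, eth1
 17:        999        888   IO-APIC-level     eth2
NMI:          0          0
"""


def shared_irqs(interrupts_text):
    """Return {irq_number: [device, ...]} for IRQ lines with more than one device."""
    lines = interrupts_text.strip().splitlines()
    ncpus = len(lines[0].split())          # header row: CPU0 CPU1 ...
    shared = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        label = label.strip()
        if not label.isdigit():
            continue                       # skip summary rows such as NMI, LOC, ERR
        tokens = rest.split()[ncpus:]      # drop the per-CPU counters
        if not tokens:
            continue
        # Devices are the comma-separated names at the end of the row.
        devices = [tokens[-1]]
        i = len(tokens) - 2
        while i >= 0 and tokens[i].endswith(","):
            devices.insert(0, tokens[i].rstrip(","))
            i -= 1
        if len(devices) > 1:
            shared[int(label)] = devices
    return shared


def affinity_mask(cpu):
    """Hex bitmask selecting a single CPU, in the form smp_affinity accepts."""
    return format(1 << cpu, "x")


if __name__ == "__main__":
    for irq, devices in shared_irqs(SAMPLE).items():
        print("IRQ %d is shared by: %s" % (irq, ", ".join(devices)))
    # Binding an IRQ needs root; this only prints what one would write.
    print("echo %s > /proc/irq/17/smp_affinity" % affinity_mask(1))
```

On a live system one would pass the contents of /proc/interrupts instead of `SAMPLE`. The sketch only prints the affinity command rather than writing to /proc, since that requires root and the right CPU choice depends on the board and workload.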