Diffstat (limited to '2006/hardware_kerneltuning_netperf-slac/gliederung.txt')
-rw-r--r-- | 2006/hardware_kerneltuning_netperf-slac/gliederung.txt | 84 |
1 files changed, 84 insertions, 0 deletions
diff --git a/2006/hardware_kerneltuning_netperf-slac/gliederung.txt b/2006/hardware_kerneltuning_netperf-slac/gliederung.txt
new file mode 100644
index 0000000..ec51802
--- /dev/null
+++ b/2006/hardware_kerneltuning_netperf-slac/gliederung.txt
@@ -0,0 +1,84 @@
+
+- hardware selection is important
+  - linux runs on just about anything from a cellphone to a mainframe
+  - good system performance depends on optimum selection of components
+  - sysadmins and managers have to understand the importance of hardware choice
+  - determine hardware needs before purchasing!
+
+- network usage patterns
+  - TCP server workload (web server, ftp server, samba, nfs-tcp)
+    - high-bandwidth TCP end-host performance
+  - UDP server workload (nfs udp)
+    - don't use it at gigabit speeds, data integrity problems!
+  - Router (Packet filter / IPsec / ... ) workload
+    - packet forwarding has fundamentally different requirements
+    - none of the offloading tricks work in this case
+    - important limit: pps, not bandwidth!
+
+- today's PC hardware
+  - CPU often is extremely fast
+      2GHz CPU: 0.5 ns clock cycle
+      L1/L2 cache access (four bytes): 2..3 clock cycles
+  - everything that is not in L1 or L2 cache is like a disk access
+      40..180 clock cycles on Opteron (DDR-333)
+      250..460 clock cycles on Xeon (DDR-333)
+  - I/O read
+      easily up to 3600 clock cycles for a register read on NIC
+      this happens synchronously, no other work can be executed!
+  - disk access
+      don't talk about them ;)
+- hardware for high performance networking
+  - CPU
+    - cache
+      - as much cache as possible
+      - shared cache (in multi-core setup) is great
+    - SMP or not
+      - problem: increased code complexity
+      - problem: cache line ping-pong (on real SMP)
+      - depends on workload
+      - depends on number of interfaces!
+      - Pro: IPsec, tc, complex routing
+      - Con: NAT-only box
+  - RAM
+    - as fast as possible
+  - Bus architecture
+    - as few bridges as possible
+      - host bridge, PCI-X / PXE bridge + NIC chipset enough!
+    - check bus speeds
+    - real interrupts (PCI, PCI-X) have lower latency than message-signalled interrupts (MSI)
+  - NIC selection
+    - NIC hardware
+        avoid additional bridges (four-port cards)
+        PCI-X: 64bit, highest clock rate, if possible (133MHz)
+    - NIC driver support
+      - many optional features
+          checksum offload
+          scatter gather DMA
+          segmentation offload (TSO/GSO)
+          interrupt flood behaviour (NAPI)
+      - is the vendor supportive of the developers?
+        - Intel: e100/e1000 docs!
+      - is the vendor merging their patches mainline?
+        - syskonnect vs. Intel
+  - hard disk
+    - kernel network stack always is 100% resident in RAM
+    - therefore, disk performance is not important for the network stack
+    - however, one hint:
+      - for SMTP servers, use battery-buffered RAM disks (Gigabyte)
+
+- tuning
+  - hardware related
+    - irq affinity
+
+  - firewall specific
+    - organize ruleset in tree shape rather than linear list
+    - conntrack: hashsize / ip_conntrack_max
+    - log: don't use syslog, rather ulogd-1.x or 2.x
+  - local sockets
+    - SO_SNDBUF / SO_RCVBUF should be used by apps
+      - in recent 2.6.x kernels, they can override /proc/sys/net/ipv4/tcp_[rw]mem
+    - on long fat pipes, increase /proc/sys/net/ipv4/tcp_adv_win_scale
+  - core network stack
+    - disable rp_filter, it adds lots of per-packet routing lookups
+
+  - check linux-x.y.z/Documentation/networking/ip-sysctl.txt for more information
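
The cycle counts and the "pps, not bandwidth" limit in the outline can be turned into concrete numbers. The sketch below assumes the 2GHz clock from the outline. The 84-byte on-wire size of a minimum Ethernet frame (64-byte frame plus 8-byte preamble and 12-byte inter-frame gap) is an assumption that does not appear in the original, and all function names are hypothetical.

```python
CLOCK_HZ = 2_000_000_000  # 2GHz CPU, as in the outline


def cycles_to_ns(cycles, clock_hz=CLOCK_HZ):
    """Convert a number of CPU clock cycles to nanoseconds."""
    return cycles * 1e9 / clock_hz


def max_pps(link_bps, wire_bytes):
    """Upper bound on packets per second for a link, given the
    number of bytes each packet occupies on the wire."""
    return link_bps // (wire_bytes * 8)


def cycles_per_packet(pps, clock_hz=CLOCK_HZ):
    """CPU cycles available per packet at a given packet rate."""
    return clock_hz / pps


if __name__ == "__main__":
    # One clock cycle at 2GHz: 0.5 ns
    print(cycles_to_ns(1))
    # A 3600-cycle NIC register read stalls the CPU for 1800 ns
    print(cycles_to_ns(3600))

    # Assumption: minimum-size frames, 84 bytes each on the wire
    pps = max_pps(1_000_000_000, 84)
    print(pps)  # 1488095 packets per second on gigabit Ethernet

    # Roughly 1344 cycles per packet: less than one slow register read
    print(cycles_per_packet(pps))
```

Under these assumptions a router forwarding minimum-size packets at gigabit line rate has fewer cycles per packet than a single slow NIC register read costs, which is one way to read the outline's point that packet rate, not bandwidth, is the limit.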
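
The "local sockets" item says applications should set SO_SNDBUF / SO_RCVBUF themselves. A minimal sketch of what that could look like from an application, using Python's standard socket module; the helper names and the example sizes are hypothetical, and the bandwidth-delay product is offered only as one common way to pick a size, not something the outline prescribes.

```python
import socket


def bdp_bytes(bandwidth_bps, rtt_ms):
    """Bandwidth-delay product in bytes: the amount of data in
    flight on a path with the given bandwidth and round-trip time."""
    return bandwidth_bps * rtt_ms // 8000


def make_tcp_socket(sndbuf_bytes, rcvbuf_bytes):
    """Create a TCP socket with explicit buffer sizes.

    The options are set before connect()/listen(), since the
    receive buffer influences the window scale negotiated at
    connection setup.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, sndbuf_bytes)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf_bytes)
    return s


def effective_buffers(s):
    """Return (send, receive) buffer sizes as reported by the kernel.

    The kernel may adjust the requested values (Linux doubles them
    and caps them at net.core.wmem_max / net.core.rmem_max).
    """
    return (
        s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF),
        s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF),
    )


if __name__ == "__main__":
    # Example path (assumed values): 1 Gbit/s with 100 ms round-trip time
    size = bdp_bytes(1_000_000_000, 100)
    sock = make_tcp_socket(size, size)
    print("requested:", size, "effective:", effective_buffers(sock))
    sock.close()
```

Comparing the requested and the effective values shows whether the system-wide maximums are clamping what the application asked for.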