- hardware selection is important
- linux runs on almost anything, from a cellphone to a mainframe
- good system performance depends on optimum selection of components
- sysadmins and managers have to understand the importance of hardware choice
- determine hardware needs before purchasing!
- network usage patterns
- TCP server workload (web server, ftp server, samba, nfs-tcp)
- high-bandwidth TCP end-host performance
- UDP server workload (nfs udp)
- don't use it at gigabit speeds: data integrity problems!
- Router (Packet filter / IPsec / ... ) workload
- packet forwarding has fundamentally different requirements
- none of the offloading tricks works in this case
- important limit: pps, not bandwidth!
- today's PC hardware
- CPU often is extremely fast
2GHz CPU: 0.5ns clock cycle
L1/L2 cache access (four bytes): 2..3 clock cycles
- everything that is not in L1 or L2 cache is like a disk access
40..180 clock cycles on Opteron (DDR-333)
250..460 clock cycles on Xeon (DDR-333)
- I/O read
easily up to 3600 clock cycles for a register read on NIC
this happens synchronously, no other work can be executed!
- disk access
don't talk about them ;)
- hardware for high performance networking
- CPU
- cache
- as much cache as possible
- shared cache (in multi-core setup) is great
- SMP or not
- problem: increased code complexity
- problem: cache line ping-pong (on real SMP)
- depends on workload
- depends on number of interfaces!
- Pro: IPsec, tc, complex routing
- Con: NAT-only box
- RAM
- as fast as possible
- Bus architecture
- as few bridges as possible
- host bridge, PCI-X / PCIe bridge + NIC chipset is enough!
- check bus speeds
- real interrupts (PCI, PCI-X) have lower latency than message-signalled interrupts (MSI)
- NIC selection
- NIC hardware
avoid additional PCI bridges (as found on four-port cards)
PCI-X: 64bit, highest clock rate, if possible (133MHz)
- NIC driver support
- many optional features
checksum offload
scatter gather DMA
segmentation offload (TSO/GSO)
interrupt flood behaviour (NAPI)
- is the vendor supportive of the developers?
- Intel: e100/e1000 docs!
- is the vendor merging its patches mainline?
- syskonnect vs. Intel
- hard disk
- the kernel network stack is always 100% resident in RAM
- therefore, disk performance is not important for the network stack
- however, one hint:
- for SMTP servers, use battery-buffered RAM disks (Gigabyte)
- tuning
- hardware related
- irq affinity
- firewall specific
- organize the ruleset as a tree rather than a linear list
- conntrack: hashsize / ip_conntrack_max
- log: don't use syslog, rather ulogd-1.x or 2.x
- local sockets
- SO_SNDBUF / SO_RCVBUF should be used by apps
- in recent 2.6.x kernels, they can override /proc/sys/net/ipv4/tcp_[rw]mem
- on long fat pipes, increase /proc/sys/net/ipv4/tcp_adv_win_scale
- core network stack
- disable rp_filter, it adds an extra per-packet routing lookup
- check linux-x.y.z/Documentation/networking/ip-sysctl.txt for more information