%include "default.mgp"
%default 1 bgrad
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%nodefault
%back "blue"

%center
%size 7
Hardware Selection
and Kernel Tuning
for High Performance Networking

Dec 07, 2006
SLAC, Berlin

%center
%size 4
by

Harald Welte <laforge@gnumonks.org>

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Network Performance & Tuning
About the Speaker

Who is speaking to you?
		an independent Free Software developer
		Linux kernel related consulting + development for 10 years
		one of the authors of the Linux kernel packet filter
		busy with enforcing the GPL at gpl-violations.org
		working on Free Software for smartphones (openezx.org)
		...and Free Software for RFID (librfid)
		...and Free Software for ePassports (libmrtd)
		...and Free Hardware for RFID (openpcd.org, openbeacon.org)
		...and the world's first Open GSM Phone (openmoko.com)


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Network Performance & Tuning
Hardware selection is important

Hardware selection is important
	Linux runs on just about anything, from a cellphone to a mainframe
	good system performance depends on optimum selection of components
	sysadmins and managers have to understand the importance of hardware choice
	determine hardware needs before purchasing!

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Network Performance & Tuning
Network usage patterns

Network usage patterns

	TCP server workload (web server, ftp server, samba, nfs-tcp)
		high-bandwidth TCP end-host performance
	UDP server workload (nfs udp)
		don't use it at gigabit speeds, data integrity problems!
	Router (Packet filter / IPsec / ... ) workload
		packet forwarding has fundamentally different requirements
		none of the offloading tricks works in this case
		important limit: pps, not bandwidth!
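
Why pps is the limit can be illustrated with a little arithmetic. The sketch below is not from the talk; it assumes standard Ethernet framing (8 bytes preamble/SFD plus a 12-byte inter-frame gap per frame) and computes how many frames fit on a gigabit link.

```python
# Worst-case packet rate on Ethernet: minimum-size frames back to back.
# Assumption: each frame costs 8 bytes of preamble/SFD and a 12-byte
# inter-frame gap on the wire, in addition to the frame itself.
WIRE_OVERHEAD_BYTES = 8 + 12

def max_pps(link_bits_per_s, frame_bytes):
    """Frames per second that fit on the link at the given frame size."""
    bits_per_frame = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_bits_per_s // bits_per_frame

GIGABIT = 1_000_000_000
small = max_pps(GIGABIT, 64)     # minimum Ethernet frame (incl. FCS)
large = max_pps(GIGABIT, 1518)   # maximum untagged frame (incl. FCS)
print(small, large)              # 1488095 81274
```

The same link carries roughly 18 times more packets when they are small, and a router does its work per packet, not per byte.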
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Network Performance & Tuning
Contemporary PC hardware

Contemporary PC hardware

	CPU is often extremely fast
		2GHz CPU: 0.5ns clock cycle
		L1/L2 cache access (four bytes): 2..3 clock cycles
	everything that is not in L1 or L2 cache is like a disk access 
		40..180 clock cycles on Opteron (DDR-333)
		250..460 clock cycles on Xeon (DDR-333)
	I/O read
		easily up to 3600 clock cycles for a register read on NIC
		this happens synchronously, no other work can be executed!
	disk access
		don't talk about it. Like getting a coke from the moon.
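
To put the cycle counts above into perspective, the following sketch converts them to wall-clock time on the 2GHz CPU and compares them with the time available per packet. The packet rate of 1,488,095 pps (gigabit Ethernet, minimum-size frames) is an assumption added for illustration, not a figure from this slide.

```python
CPU_HZ = 2_000_000_000  # the 2GHz CPU from the slide

def cycles_to_ns(cycles, cpu_hz=CPU_HZ):
    """Convert a number of clock cycles to nanoseconds."""
    return cycles * 1e9 / cpu_hz

def cycle_budget_per_packet(pps, cpu_hz=CPU_HZ):
    """Clock cycles available per packet at a given packet rate."""
    return cpu_hz / pps

# Upper bounds of the ranges quoted on the slide.
latencies = {
    "L1/L2 cache access": 3,
    "RAM access (Opteron)": 180,
    "RAM access (Xeon)": 460,
    "NIC register read": 3600,
}
for name, cycles in latencies.items():
    print(f"{name:22s} {cycles:5d} cycles = {cycles_to_ns(cycles):7.1f} ns")

# Assumed worst case: gigabit Ethernet with minimum-size frames.
budget = cycle_budget_per_packet(1_488_095)
print(f"budget per packet: about {budget:.0f} cycles")
```

Under these assumptions, a single 3600-cycle register read costs more than twice the whole per-packet budget.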

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Network Performance & Tuning
Hardware selection

Hardware selection
	CPU
		cache
			as much cache as possible
			shared cache (in multi-core setup) is great
		SMP or not
			problem: increased code complexity
			problem: cache line ping-pong (on real SMP)
			depends on workload
			depends on number of interfaces!
			Pro: IPsec, tc, complex routing
			Con: NAT-only box

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Network Performance & Tuning
Hardware selection

Hardware selection
	RAM
		as fast as possible
		use chipsets with highest possible speed
		amd64 (Opteron, ..)
			has per-cpu memory controller
			doesn't waste system bus bandwidth for RAM access
		Intel
			has a traditional 'shared system bus' architecture
			RAM is system-wide and not per-CPU


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Network Performance & Tuning
Hardware selection

Hardware selection
	Bus architecture
		as few bridges as possible
			host bridge, PCI-X / PCIe bridge + NIC chipset enough!
		check bus speeds
		real interrupts (PCI, PCI-X) have lower latency than message-signalled interrupts (MSI)
		some boards use PCIe chipset and then additional PCIe-to-PCI-X bridge :(

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Network Performance & Tuning
Hardware selection

Hardware selection
	NIC selection
		NIC hardware
			avoid additional bridges (four-port cards)
			PCI-X: 64bit, highest clock rate, if possible (133MHz)
	NIC driver support
		many optional features
			checksum offload
			scatter gather DMA
			segmentation offload (TSO/GSO)
			interrupt flood behaviour (NAPI)
		is the vendor supportive of the developers?
			Intel: e100/e1000 docs public!
		is the vendor merging its patches into mainline?
			SysKonnect (bad) vs. Intel (good)

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Network Performance & Tuning
Hardware selection

Hardware selection
	hard disk
		kernel network stack is always 100% resident in RAM
		therefore, disk performance not important for network stack
		however, one hint:
			for SMTP servers, use battery-backed RAM disks (e.g. from Gigabyte)

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Network Performance & Tuning
Network Stack Tuning

Network Stack Tuning
	hardware related
		prevent multiple NICs from sharing one irq line
			can be checked in /proc/interrupts
			highly dependent on specific mainboard/chipset
		configure irq affinity
			in an SMP system, interrupts can be bound to one CPU
			irq affinity should be set to ensure all packets from one interface are handled on the same CPU (cache locality)
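
Both checks can be scripted. The sketch below is a rough illustration: it assumes the 2.6-era /proc/interrupts layout (one counter column per CPU, then one controller column, then a comma-separated device list), and the function names are made up for this example. Writing a hex CPU mask to /proc/irq/<n>/smp_affinity needs root.

```python
def shared_irqs(text):
    """Return {irq: [devices]} for every irq line used by several devices."""
    lines = text.splitlines()
    ncpus = len(lines[0].split())            # header line: CPU0 CPU1 ...
    shared = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        label = label.strip()
        if not label.isdigit():
            continue                          # skip NMI, LOC, ERR, ...
        fields = rest.split()
        # Assumption: after the per-CPU counters comes exactly one
        # controller column, everything after that is the device list.
        names = " ".join(fields[ncpus + 1:])
        devices = [d.strip() for d in names.split(",") if d.strip()]
        if len(devices) > 1:
            shared[int(label)] = devices
    return shared

def cpu_mask(cpu):
    """Hex bitmask selecting a single CPU, as smp_affinity expects."""
    return format(1 << cpu, "x")

def set_irq_affinity(irq, cpu, proc_root="/proc"):
    """Bind one irq to one CPU (requires root on a real system)."""
    with open(f"{proc_root}/irq/{irq}/smp_affinity", "w") as f:
        f.write(cpu_mask(cpu))

SAMPLE = """\
           CPU0       CPU1
  0:     123456          0   IO-APIC-edge      timer
 16:     998877          0   IO-APIC-fasteoi   eth0, eth1
 17:     554433          0   IO-APIC-fasteoi   eth2
NMI:          0          0
"""
print(shared_irqs(SAMPLE))   # {16: ['eth0', 'eth1']}
```

On a live system, pass open("/proc/interrupts").read() instead of SAMPLE. Newer kernels may print the controller in more than one column, so the parsing would need adjusting there.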

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Network Performance & Tuning
Network Stack Tuning

Network Stack Tuning
	32bit or 64bit kernel?
		most contemporary x86 systems support x86_64
		biggest advantage: larger address space for kernel memory
		however, problem: all pointers are now 8 bytes instead of 4
		thus, increase of in-kernel data structures
		thus, decreased cache efficiency
		in packet forwarding applications, about 10% lower performance
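
The effect of pointer size on cache efficiency can be sketched with a made-up data structure. The node below (three pointers plus two 32-bit counters) is hypothetical and not a real kernel structure, and the 64-byte cache line is an assumption.

```python
import struct

CACHE_LINE_BYTES = 64   # assumed cache line size

def node_size(pointer_bytes):
    """Size of a hypothetical node: three pointers and two 32-bit counters."""
    ptr = {4: "I", 8: "Q"}[pointer_bytes]     # 4-byte or 8-byte pointer field
    return struct.calcsize("<" + ptr * 3 + "II")

size32 = node_size(4)
size64 = node_size(8)
print(size32, size64)                                          # 20 32
print(CACHE_LINE_BYTES // size32, CACHE_LINE_BYTES // size64)  # 3 2

# Pointer size of the interpreter running this sketch:
print(struct.calcsize("P"))
```

In this example, one third fewer nodes fit into each cache line on 64bit, so the same working set causes more cache misses.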

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Network Performance & Tuning
Network Stack Tuning

Network Stack Tuning
	firewall specific
		organize ruleset in tree shape rather than linear list
		conntrack: hashsize / ip_conntrack_max
		log: don't use syslog, rather ulogd-1.x or 2.x

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Network Performance & Tuning
Network Stack Tuning

Network Stack Tuning
	local sockets
		SO_SNDBUF / SO_RCVBUF should be used by apps
		in recent 2.6.x kernels, they can override /proc/sys/net/ipv4/tcp_[rw]mem
		on long fat pipes, increase /proc/sys/net/ipv4/tcp_adv_win_scale 
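
A minimal sketch of how an application might size its socket buffers, assuming it knows the link speed and round-trip time. The helper names are made up for this example. Note that the kernel may cap the request (net.core.rmem_max / wmem_max), and Linux reports back a larger value than requested because it adds room for bookkeeping.

```python
import socket

def bdp_bytes(bits_per_s, rtt_ms):
    """Bandwidth-delay product: bytes in flight needed to fill the pipe."""
    return bits_per_s * rtt_ms // 8000

def make_socket(buf_bytes):
    """TCP socket with explicit send/receive buffer sizes.

    The options are set before connect() or listen(), so that they can
    influence the window scaling negotiated at connection setup.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, buf_bytes)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, buf_bytes)
    return s

want = bdp_bytes(1_000_000_000, 100)    # 1 Gbit/s, 100 ms RTT
print(want)                             # 12500000

sock = make_socket(want)
got = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
print("requested", want, "kernel granted", got)
sock.close()
```

If the granted value is far below the requested one, the system-wide limits are the bottleneck, not the application.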
 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Network Performance & Tuning
Network Stack Tuning

Network Stack Tuning
	core network stack
		disable rp_filter, it adds lots of per-packet routing lookups
		check linux-x.y.z/Documentation/networking/ip-sysctl.txt for more information
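
rp_filter is a per-interface sysctl under /proc/sys/net/ipv4/conf/, with additional "all" and "default" entries. The sketch below reads and clears it; the function names are made up for this example, the directory is a parameter so the code can be tried on a copy, and writing the real files needs root.

```python
from pathlib import Path

CONF_DIR = "/proc/sys/net/ipv4/conf"

def read_rp_filter(conf_dir=CONF_DIR):
    """Return {interface: rp_filter value} for every entry found."""
    result = {}
    for entry in sorted(Path(conf_dir).glob("*/rp_filter")):
        result[entry.parent.name] = int(entry.read_text().strip())
    return result

def disable_rp_filter(conf_dir=CONF_DIR):
    """Write 0 to every rp_filter entry (requires root on a real system)."""
    for entry in Path(conf_dir).glob("*/rp_filter"):
        entry.write_text("0\n")

if __name__ == "__main__":
    # Only print the current state; changing it is left to the admin.
    for name, value in read_rp_filter().items():
        print(f"{name}: rp_filter={value}")
```

Only disable reverse path filtering where spoofed source addresses are dealt with by other means, for example in the packet filter ruleset.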

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Network Performance & Tuning
Links 

Links
	The Linux Advanced Routing and Traffic Control HOWTO
		http://www.lartc.org/
	The netdev mailing list
		netdev@vger.kernel.org
