%include "default.mgp"
%default 1 bgrad
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%nodefault
%back "blue"

%center
%size 7


Advanced Linux Networking


%center
%size 4
by

Harald Welte <laforge@gnumonks.org>


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
Contents

	Introduction

	Advanced Routing with iproute2

	Bandwidth Management using tc

	Advanced netfilter concepts

	References / Further Reading

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
Introduction

Changes in the Linux IP stack

	Alexey Kuznetsov introduced new routing in 2.2

	IPv6 support required generalization

	tc subsystem (traffic control)

	Hooks in the Network stack (netfilter)

	Netlink Sockets


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page 
Advanced Linux Networking
Introduction

What can Linux do for me?

	Sophisticated routing (not only destination based)

	Control how the bandwidth is divided

	Prevent DoS attacks (various kinds of flooding)

	Advanced packet filtering (see my other talk)

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
PART I - Advanced Routing

	Traditional IP routing

		router is connected to more than one network segment

		router knows which hosts are directly attached to these segments

		router knows where to send packets if destination not link-local

		router makes a routing decision for each packet, based on its destination

	Why is this insufficient?

		Real-world network scenarios are getting more complex

		People want to have different routing for different services

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
PART I - Advanced Routing

Policy routing with iproute2

	Multiple routing tables

	Rules describing which routing table to use

	Configurable using the command line tool 'ip' (part of iproute2)

	Each rule consists of
		priority		(Determining order of rules)
		match		(Which packets match this rule)
			packet source address
			packet destination address
			TOS value
			incoming interface
			fwmark (set by ipchains / iptables)
		action		(Which routing table to use)
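
The rule structure above can be sketched as a toy model. This is only an illustration, not kernel code: the names Rule, tables_to_consult and RULES are hypothetical, TOS matching is left out, and the example rules mirror what 'ip rule show' might print. The function returns the tables of all matching rules in priority order, which is the order in which they would be consulted until one of them yields a route.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    priority: int                 # lower value is evaluated first
    table: str                    # action: routing table to consult
    src: str = "0.0.0.0/0"        # match: packet source prefix
    dst: str = "0.0.0.0/0"        # match: packet destination prefix
    iif: Optional[str] = None     # match: incoming interface (None = any)
    fwmark: Optional[int] = None  # match: firewall mark (None = any)

    def matches(self, pkt: dict) -> bool:
        # strict=False tolerates prefixes with host bits set, e.g. 1.2.3.4/16
        src_net = ipaddress.ip_network(self.src, strict=False)
        dst_net = ipaddress.ip_network(self.dst, strict=False)
        if ipaddress.ip_address(pkt["src"]) not in src_net:
            return False
        if ipaddress.ip_address(pkt["dst"]) not in dst_net:
            return False
        if self.iif is not None and pkt.get("iif") != self.iif:
            return False
        if self.fwmark is not None and pkt.get("fwmark") != self.fwmark:
            return False
        return True


def tables_to_consult(rules, pkt):
    """Return the tables of all matching rules, in priority order."""
    ordered = sorted(rules, key=lambda r: r.priority)
    return [r.table for r in ordered if r.matches(pkt)]


# Hypothetical rule set: three catch-all rules plus one policy rule.
RULES = [
    Rule(priority=32766, table="main"),
    Rule(priority=0, table="local"),
    Rule(priority=32765, table="10",
         src="1.2.3.4/16", dst="5.6.7.8/24", iif="eth0"),
    Rule(priority=32767, table="default"),
]

if __name__ == "__main__":
    pkt = {"src": "1.2.9.9", "dst": "5.6.7.1", "iif": "eth0"}
    print(tables_to_consult(RULES, pkt))
```

Only packets that arrive on eth0 with a matching source and destination see table 10; everything else falls through to the catch-all rules.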


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
PART I - Advanced Routing

The 'ip' command
		used for 
			interface configuration
			neighbour/arp tables
			policy routing
			routing tables
			tunnels
			multicast routing
		communication with kernel through netlink sockets 

		Important commands for policy routing
			ip rule show	Show all rules in policy database
			ip rule add	Add new rule to policy database
			ip rule delete	Delete rule from policy database

Examples:
%font "typewriter"
%size 3
> ip rule add from 1.2.3.4/16 to 5.6.7.8/24 dev eth0 table 10

> ip rule show

0:      from all lookup local 
32765:  from 1.2.3.4/16 to 5.6.7.8/24 iif eth0 lookup 10 
32766:  from all lookup main 
32767:  from all lookup 253 
%font "standard"
%size 5

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page 
Advanced Linux Networking
PART I - Advanced Routing

The 'ip' command

		Important commands for routing tables
			ip route add	Add routing table entry
			ip route del	Delete routing table entry
			ip route list	List routing table
			ip route flush	Flush matching routes ('flush cache' flushes the routing cache)

		In reality far more sophisticated
%font "typewriter"
%size 2
Usage: ip route { list | flush } SELECTOR
       ip route get ADDRESS [ from ADDRESS iif STRING ]
                            [ oif STRING ]  [ tos TOS ]
       ip route { add | del | change | append | replace | monitor } ROUTE
SELECTOR := [ root PREFIX ] [ match PREFIX ] [ exact PREFIX ]
            [ table TABLE_ID ] [ proto RTPROTO ]
            [ type TYPE ] [ scope SCOPE ]
ROUTE := NODE_SPEC [ INFO_SPEC ]
NODE_SPEC := [ TYPE ] PREFIX [ tos TOS ]
             [ table TABLE_ID ] [ proto RTPROTO ]
             [ scope SCOPE ] [ metric METRIC ]
INFO_SPEC := NH OPTIONS FLAGS [ nexthop NH ]...
NH := [ via ADDRESS ] [ dev STRING ] [ weight NUMBER ] NHFLAGS
OPTIONS := FLAGS [ mtu NUMBER ] [ advmss NUMBER ]
           [ rtt NUMBER ] [ rttvar NUMBER ]
           [ window NUMBER] [ cwnd NUMBER ] [ ssthresh REALM ]
           [ realms REALM ]
TYPE := [ unicast | local | broadcast | multicast | throw |
          unreachable | prohibit | blackhole | nat ]
TABLE_ID := [ local | main | default | all | NUMBER ]
SCOPE := [ host | link | global | NUMBER ]
FLAGS := [ equalize ]
NHFLAGS := [ onlink | pervasive ]
RTPROTO := [ kernel | boot | static | NUMBER ]
%font "standard"
%size 5

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
PART II - Bandwidth Management

	What do I need Bandwidth Management for?

		Decide how, and between whom, the available bandwidth is divided

		Limit available bandwidth for certain users / applications

		Guarantee bandwidth for certain users / applications

		Divide bandwidth more equally between users / applications

		QoS, DiffServ, IntServ

	Linux 2.2 / 2.4 provide an elaborate framework

		Called 'packet scheduling' or 'traffic control'
		Another major achievement of Alexey Kuznetsov

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
PART II - packet filtering

Basic iptables commands

To build a complete iptables command, we must specify
	which table to work with
	which chain in this table to use
	an operation (insert, append, delete, replace)
	a match
	a target

The syntax is
%font "typewriter"
%size 3
iptables -t table -Operation chain -j target match(es)
%font "standard"
%size 5

Example:
%font "typewriter"
%size 3
iptables -t filter -A INPUT -j ACCEPT -p tcp --dport smtp
%font "standard"
%size 5
 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
PART II - packet filtering

Targets

	Builtin Targets to be used in filter table
		ACCEPT	accept the packet
		DROP	silently drop the packet 
		QUEUE	enqueue packet to userspace
		RETURN	return to previous (calling) chain
		foobar	user defined chain

	Targets implemented as loadable modules
		REJECT	drop the packet but inform sender
		MIRROR	swap source and destination IP and resend
		LOG  	log via syslog
		ULOG	log via userspace

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
PART II - packet filtering

Matches

	Basic matches
		-p			protocol (tcp/udp/icmp/...)
		-s			source address (ip/mask)
		-d			destination address (ip/mask)
		-i			incoming interface
		-o			outgoing interface

	Match extensions
		--dport		destination port
		--sport	 	source port
		--mac-source 	source MAC address
		--mark 		nfmark
		--tos		TOS field of IP header
		--ttl		TTL field of IP header
		--limit	 	rate limiting (n packets per timeframe)
		--uid-owner	 	owner uid of the socket sending the packet
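
The rate limiting match can be pictured as a token bucket. The sketch below assumes that model and is not taken from the iptables sources; the class name ToyLimit and its parameters rate_per_sec and burst are hypothetical, and time is passed in explicitly so the behaviour is reproducible.

```python
class ToyLimit:
    """Token bucket: a packet matches while tokens are available."""

    def __init__(self, rate_per_sec: float, burst: int = 5):
        self.rate = rate_per_sec    # tokens refilled per second
        self.burst = burst          # bucket capacity
        self.tokens = float(burst)  # bucket starts full
        self.last = 0.0             # time of the previous packet

    def matches(self, now: float) -> bool:
        # Refill according to elapsed time, capped at the bucket size.
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0      # spend one token for this packet
            return True
        return False                # over the limit: no match


if __name__ == "__main__":
    lim = ToyLimit(rate_per_sec=1, burst=3)
    # Five packets arriving at the same instant.
    print([lim.matches(0.0) for _ in range(5)])
```

A rule using such a match only applies its target to packets within the limit; packets over the limit fall through to the following rules.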

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
PART III - NAT

Overview

		Previous Linux kernels only implemented one special case of NAT: Masquerading

		Netfilter enables Linux to do any kind of NAT.

		All matches from packet filtering are available for the nat tables, too

		We divide NAT into 'source NAT' and 'destination NAT'

			SNAT changes the packet's source while passing NF_IP_POST_ROUTING

			DNAT changes the packet's destination while passing NF_IP_PRE_ROUTING

			MASQUERADE is a special case of SNAT

			REDIRECT is a special case of DNAT
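
The difference between the two directions can be sketched with a toy model. It is a simplification under stated assumptions: packets are plain dicts, ports are ignored, and the class ToyNat with its methods snat, dnat and reply is hypothetical. The reverse table stands in for the per-connection state that lets replies be translated back.

```python
class ToyNat:
    def __init__(self):
        # (reply source, reply destination) -> (field to restore, value)
        self.reverse = {}

    def snat(self, pkt: dict, to_source: str) -> dict:
        """Rewrite the source address of an outgoing packet."""
        out = dict(pkt, src=to_source)
        # The reply will be addressed to to_source; restore the real host.
        self.reverse[(out["dst"], out["src"])] = ("dst", pkt["src"])
        return out

    def dnat(self, pkt: dict, to_destination: str) -> dict:
        """Rewrite the destination address of an incoming packet."""
        out = dict(pkt, dst=to_destination)
        # The reply will come from to_destination; restore the public address.
        self.reverse[(out["dst"], out["src"])] = ("src", pkt["dst"])
        return out

    def reply(self, pkt: dict) -> dict:
        """Undo the translation for a packet travelling in reply direction."""
        entry = self.reverse.get((pkt["src"], pkt["dst"]))
        if entry is None:
            return dict(pkt)        # unknown connection: leave untouched
        field, original = entry
        return dict(pkt, **{field: original})


if __name__ == "__main__":
    nat = ToyNat()
    print(nat.snat({"src": "10.0.0.5", "dst": "8.8.8.8"}, "1.2.3.4"))
    print(nat.reply({"src": "8.8.8.8", "dst": "1.2.3.4"}))
```

SNAT touches only the source of the original direction, DNAT only the destination; in both cases the reply is rewritten in the opposite field.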


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
PART III - NAT

Source NAT

	SNAT Example:
%font "typewriter"
%size 3

iptables -t nat -A POSTROUTING -j SNAT --to-source 1.2.3.4 -s 10.0.0.0/8
%font "standard"
%size 4

Masquerading does almost the same as SNAT, but if the outgoing interface's address changes (for example on a dialup link with a dynamic IP), the new address is used.

	MASQUERADE Example:
%font "typewriter"
%size 3

iptables -t nat -A POSTROUTING -j MASQUERADE -o ppp0
%font "standard"
%size 5

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
PART III - NAT

Destination NAT

	DNAT example:
%font "typewriter"
%size 3

iptables -t nat -A PREROUTING -j DNAT --to-destination 1.2.3.4:8080 -p tcp --dport 80 -i eth1
%font "standard"
%size 4

REDIRECT is a special case of DNAT, which alters the destination to the address of the incoming interface.

	REDIRECT example:
%font "typewriter"
%size 3

iptables -t nat -A PREROUTING -j REDIRECT --to-port 3128 -i eth1 -p tcp --dport 80
%font "standard"
%size 5

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
PART IV - Packet mangling

	Change certain parts of a packet based on rules in IP tables

	Again, all the matches described in the packet filtering section are available.

	Currently, the supported packet mangling targets are:
		TOS	manipulate the TOS bits 
		TTL	set / increase / decrease TTL field
		MARK	change the nfmark field of the skb

Simple example:
%font "typewriter"
%size 3

iptables -t mangle -A PREROUTING -j MARK --set-mark 10 -p tcp --dport 80



%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
Advanced Netfilter concepts

	Connection tracking

		Implemented separately from NAT

		Enables stateful filtering 

		Implementation
			hooks into NF_IP_PRE_ROUTING to track packets
			hooks into NF_IP_POST_ROUTING and NF_IP_LOCAL_IN to drop information about connections which got filtered out
			protocol modules (currently TCP/UDP/ICMP)
			application helpers (currently FTP and IRC-DCC)

		Conntrack divides packets into the following four categories
			NEW - would establish new connection
			ESTABLISHED - part of already established connection
			RELATED - is related to established connection
			INVALID - cannot be tracked (multicast, errors...)
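
The four categories can be illustrated with a toy tracker. This is a sketch under stated assumptions, not the kernel implementation: flows are (protocol, source, source port, destination, destination port) tuples, and the names ToyConntrack, reverse, expect and classify are hypothetical. The expect() call stands in for an application helper announcing a related connection, such as an FTP data channel.

```python
import ipaddress


def reverse(flow):
    """Return the same flow as seen in reply direction."""
    proto, src, sport, dst, dport = flow
    return (proto, dst, dport, src, sport)


class ToyConntrack:
    PROTOCOLS = ("tcp", "udp", "icmp")  # protocol modules named on the slide

    def __init__(self):
        self.unreplied = {}        # original-direction flow -> NEW / RELATED
        self.established = set()   # flows that have seen a reply
        self.expected = set()      # flows announced by a helper

    def expect(self, flow):
        self.expected.add(flow)

    def classify(self, flow):
        proto, _src, _sport, dst, _dport = flow
        if proto not in self.PROTOCOLS:
            return "INVALID"
        if ipaddress.ip_address(dst).is_multicast:
            return "INVALID"
        rev = reverse(flow)
        if flow in self.established or rev in self.established:
            return "ESTABLISHED"
        if rev in self.unreplied:
            # First reply seen: the connection is now established.
            del self.unreplied[rev]
            self.established.add(rev)
            return "ESTABLISHED"
        if flow in self.unreplied:
            return self.unreplied[flow]
        state = "RELATED" if flow in self.expected else "NEW"
        self.unreplied[flow] = state
        return state


if __name__ == "__main__":
    ct = ToyConntrack()
    control = ("tcp", "10.0.0.5", 40000, "1.2.3.4", 21)
    print(ct.classify(control))           # first packet of a connection
    print(ct.classify(reverse(control)))  # reply to it
```

A stateful filter would then typically accept ESTABLISHED and RELATED packets and apply its policy only to NEW ones.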


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
Advanced Netfilter concepts

%size 4
	Userspace logging
		flexible replacement for old syslog-based logging
		packets to userspace via multicast netlink sockets
		easy-to-use library (libipulog)
		plugin-extensible userspace logging daemon already available

	Queuing
		reliable asynchronous packet handling 
		packets to userspace via unicast netlink socket
		easy-to-use library (libipq)
		experimental queue multiplex daemon (ipqmpd)


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
Current Development and Future

Netfilter, although it has proved very stable, is still a work in progress.

Areas of current development
	infrastructure for conntrack/nat helpers in userspace
	full TCP sequence number tracking
	multicast support for connection tracking
	more flexible matches (MAXCONN, ...)
	more conntrack and NAT modules (RPC, SNMP, SMB, ...)
	better IPv6 support (conntrack, more matches / targets)

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
Availability of slides / Links

The slides and an accompanying paper for this presentation are available at
	http://www.gnumonks.org

The netfilter homepage is mirrored at:
	http://netfilter.samba.org
	http://netfilter.kernelnotes.org
	http://netfilter.filewatcher.org

More documents / netfilter extensions (ulogd, ipqmpd, ...)
	http://www.gnumonks.org/projects