%include "default.mgp"
%default 1 bgrad
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%nodefault
%back "blue"

%center
%size 7
netfilter/iptables training

Nov 05/06/07, 2007
Day 2

%center
%size 4
by Harald Welte

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables tutorial
Contents

	Day 2

	Practical Exercises
	Logging with ulogd
	Choice of Hardware
	Network Stack Tuning
	Ruleset Optimization

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables tutorial
Practical Exercises

	Practical Exercises
		As discussed within the course

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables tutorial
Logging with ulogd

	Why?
		because LOG is extremely inefficient
		because LOG is unreliable, too
		LOG on full-speed DoS: 1100 logs/sec
		ULOG/LOGEMU on full-speed DoS: 96000 logs/sec

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables tutorial
Logging with ulogd

	Configuration of ruleset: -j ULOG
		--ulog-nlgroup: which netlink group (up to 32)
		--ulog-cprange: how many bytes of each packet?
		--ulog-qthreshold: how many packets to queue
		--ulog-prefix: like "--log-prefix"

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables tutorial
Logging with ulogd

	Configuration of ulogd:
		Please refer to "doc/ulogd.html" documentation
		If logging remotely, make sure you don't ever log log-packets (!)
		Debian woody ships with a broken ulogd (and refuses to fix it)

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables tutorial
Choice of hardware

	Choice of hardware is important for high scalability
	Packet forwarding is one of the most demanding tasks
	Important issues
		Optimization of NIC driver
		RAM latency
		Cache size
		Interrupt latency
		I/O bandwidth

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables tutorial
Choice of hardware

	Past benchmarking has shown
		AMD Opteron/Athlon64 has much better RAM latency than Intel
		PCI-X is the preferred bus technology
		Intel e1000 card + driver combo has good performance
		Never use four-port cards, since they have additional bridges

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables tutorial
Choice of hardware

	SMP or not SMP?
		The improvement of SMP is arguable for packet forwarding
		Especially connection tracking suffers from excessive cache ping-pong
		In case of two interfaces, there can be no improvement
			all packets will affect DMA with both interfaces
			putting one device on each IRQ causes more cache misses than anything else
		In case of four or eight interfaces, IRQ affinity can be used to distribute the load
			put a pair of interfaces on each CPU
			forwarding between those two interfaces will be fast
			forwarding between interfaces on different CPUs will be slower

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables tutorial
Network Stack tuning

	Tuning areas
		IRQ affinity
		neighbour cache
		kernel compile-time config

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables tutorial
Optimization of Ruleset

	Optimization of ruleset is important
		iptables itself does no optimization
		all rules are traversed linearly
		all matches are processed linearly
		therefore, order _does_ matter for performance reasons
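%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables tutorial
Optimization of Ruleset

To make the effect of rule order concrete, here is a minimal Python sketch of linear traversal. It is only a model, not iptables code: the function name average_rules_traversed and the packet counts are hypothetical. In practice, per-rule counters such as those printed by "iptables -L -v -n" could supply the numbers.

```python
def average_rules_traversed(hit_counts):
    """Average number of rules compared per packet in one linearly traversed chain.

    hit_counts[i] is the number of packets whose first matching rule is rule i
    (0-based). Such a packet has been compared against i + 1 rules.
    """
    total_packets = sum(hit_counts)
    total_checks = sum(
        (position + 1) * count for position, count in enumerate(hit_counts)
    )
    return total_checks / total_packets


# Hypothetical traffic mix: the busiest rule sits at the end of the chain.
hits_rare_first = [10, 40, 50, 900]
# Same rules, reordered so the busiest rule comes first.
hits_busy_first = sorted(hits_rare_first, reverse=True)

print(average_rules_traversed(hits_rare_first))  # 3.84
print(average_rules_traversed(hits_busy_first))  # 1.16
```

Note that reordering is only safe between rules whose matches do not overlap. Otherwise the first matching rule, and with it the verdict, may change.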
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables tutorial
Optimization of Ruleset

	Good ideas for optimization
		build a tree-like structure out of user-defined chains
		avoid long lists
		keep in mind the average number of traversed rules per packet
		don't repeat excessive matching in each rule, use new chains
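%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables tutorial
Optimization of Ruleset

The tree idea can be illustrated with a small Python model that counts traversed rules for a flat list versus a two-level tree of user-defined chains. Everything in it is an assumption made for illustration: the 64 numbered hosts, the chain names group0 to group7, and the helpers host_rule, group_rule and traverse. It imitates first-match traversal with jumps and is not how the kernel implements it.

```python
def traverse(chains, chain_name, packet):
    """Walk one chain in order and return (verdict, rules_checked).

    verdict is None when the chain ends without a terminal match, so the
    calling chain continues with its next rule, like a return from a
    user-defined chain.
    """
    checked = 0
    for matches, target in chains[chain_name]:
        checked += 1
        if not matches(packet):
            continue
        if target in chains:  # jump into a user-defined chain
            verdict, sub_checked = traverse(chains, target, packet)
            checked += sub_checked
            if verdict is not None:
                return verdict, checked
        else:  # terminal target such as ACCEPT or DROP
            return target, checked
    return None, checked


def host_rule(host):
    # Hypothetical rule: accept packets for exactly one destination host.
    return (lambda pkt: pkt["dst"] == host, "ACCEPT")


def group_rule(group):
    # Hypothetical dispatch rule: jump to the chain for a block of 8 hosts.
    return (lambda pkt: pkt["dst"] // 8 == group, f"group{group}")


HOSTS = range(64)

# Flat ruleset: one long list of 64 host rules.
flat = {"FORWARD": [host_rule(h) for h in HOSTS]}

# Tree ruleset: 8 dispatch rules, each jumping to a chain of 8 host rules.
tree = {"FORWARD": [group_rule(g) for g in range(8)]}
for g in range(8):
    tree[f"group{g}"] = [host_rule(h) for h in range(g * 8, g * 8 + 8)]


def average_checked(chains):
    # Average over one packet per host, i.e. uniformly distributed traffic.
    total = sum(traverse(chains, "FORWARD", {"dst": h})[1] for h in HOSTS)
    return total / len(HOSTS)


print(average_checked(flat))  # 32.5
print(average_checked(tree))  # 9.0
```

In this model the dispatch rule evaluates the shared part of the match once, so the per-host chains do not repeat it. The average drops from 32.5 to 9.0 rules per packet for uniformly distributed traffic.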