%include "default.mgp"
%default 1 bgrad
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%nodefault
%back "blue"

%center
%size 7
Advanced Linux Networking

%center
%size 4
by Harald Welte

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
Contents

	Introduction
	Advanced Routing with iproute2
	Bandwidth Management using tc
	Advanced netfilter concepts
	References / Further Reading

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
Introduction

	Changes in the Linux IP stack
		Alexey Kuznetsov introduced new routing in 2.2
		IPv6 support required generalization
		tc subsystem (traffic control)
		Hooks in the Network stack (netfilter)
		Netlink Sockets

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
Introduction

	What can Linux do for me?
		Sophisticated routing (not only destination based)
		Control how the bandwidth is divided
		Prevent DoS attacks (various kinds of flooding)
		Advanced packet filtering (see my other talk)

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
PART I - Advanced Routing

	Traditional IP routing
		router is connected to more than one network segment
		router knows which hosts are directly attached to these segments
		router knows where to send packets if destination is not link-local
		router makes a decision for each packet, based on its destination
	Why is this insufficient?
		Real-world network scenarios are getting more complex
		People want to have different routing for different services

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
PART I - Advanced Routing

	Policy routing with iproute2
		Multiple routing tables
		Rules describing which routing table to use
		Configurable using the 'ip' command-line tool (iproute2 package)
	Each rule consists of
		priority (Determining order of rules)
		match (Which packets match this rule)
			packet source address
			packet destination address
			TOS value
			incoming interface
			fwmark (set by ipchains / iptables)
		action
			Which routing table

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
PART I - Advanced Routing

	The 'ip' command
		used for
			interface configuration
			neighbour/arp tables
			policy routing
			routing tables
			tunnels
			multicast routing
		communication with kernel through netlink sockets
	Important commands for policy routing
		ip rule show      Show all rules in policy database
		ip rule add       Add new rule to policy database
		ip rule delete    Delete rule from policy database
	Examples:
%font "typewriter"
%size 3
> ip rule add from 1.2.3.4/16 to 5.6.7.8/24 dev eth0 table 10
> ip rule show
0:      from all lookup local
32765:  from 1.2.3.4/16 to 5.6.7.8/24 iif eth0 lookup 10
32766:  from all lookup main
32767:  from all lookup 253
%font "standard"
%size 5

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
PART I - Advanced Routing

	The 'ip' command
	Important commands for routing tables
		ip route add      Add routing table entry
		ip route del      Delete routing table entry
		ip route list     List routing table
		ip route flush    Flush routes ('ip route flush cache': routing cache)
	In reality far more sophisticated
%font "typewriter"
%size 2
Usage: ip route { list | flush } SELECTOR
       ip route get ADDRESS [ from ADDRESS iif STRING ]
                            [ oif STRING ] [ tos TOS ]
       ip route { add | del | change | append | replace | monitor } ROUTE
SELECTOR := [ root PREFIX ] [ match PREFIX ]
            [ exact PREFIX ] [ table TABLE_ID ] [ proto RTPROTO ]
            [ type TYPE ] [ scope SCOPE ]
ROUTE := NODE_SPEC [ INFO_SPEC ]
NODE_SPEC := [ TYPE ] PREFIX [ tos TOS ]
             [ table TABLE_ID ] [ proto RTPROTO ]
             [ scope SCOPE ] [ metric METRIC ]
INFO_SPEC := NH OPTIONS FLAGS [ nexthop NH ]...
NH := [ via ADDRESS ] [ dev STRING ] [ weight NUMBER ] NHFLAGS
OPTIONS := FLAGS [ mtu NUMBER ] [ advmss NUMBER ]
           [ rtt NUMBER ] [ rttvar NUMBER ]
           [ window NUMBER] [ cwnd NUMBER ] [ ssthresh REALM ]
           [ realms REALM ]
TYPE := [ unicast | local | broadcast | multicast | throw |
          unreachable | prohibit | blackhole | nat ]
TABLE_ID := [ local | main | default | all | NUMBER ]
SCOPE := [ host | link | global | NUMBER ]
FLAGS := [ equalize ]
NHFLAGS := [ onlink | pervasive ]
RTPROTO := [ kernel | boot | static | NUMBER ]
%font "standard"
%size 5

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
PART II - Bandwidth Management

	What do I need Bandwidth Management for?
		Decide how, and between whom, available bandwidth is divided
		Limit available bandwidth for certain users / applications
		Guarantee bandwidth for certain users / applications
		Divide bandwidth more equally between users / applications
		QoS, DiffServ, IntServ
	Linux 2.2 / 2.4 provide an elaborate framework
		Called 'packet scheduling' or 'traffic control'
		Another major achievement of Alexey Kuznetsov

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
PART II - Bandwidth Management

	Basic iptables commands
	To build a complete iptables command, we must specify
		which table to work with
		which chain in this table to use
		an operation (insert, add, delete, modify)
		a match
		a target
	The syntax is
%font "typewriter"
%size 3
iptables -t table -Operation chain -j target match(es)
%font "standard"
%size 5
	Example:
%font "typewriter"
%size 3
iptables -t filter -A INPUT -j ACCEPT -p tcp --dport smtp
%font "standard"
%size 5
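%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
Example: assembling an iptables command (sketch)

	The five parts listed for a complete iptables command can be put together mechanically.
	The following is only a dry-run sketch: it prints the command instead of running it.
	build_rule is a hypothetical helper name, not part of iptables.
%font "typewriter"
%size 3
```shell
#!/bin/sh
# Dry-run sketch: assemble an iptables command line from its parts and print it.
# build_rule is a hypothetical helper for illustration, not part of iptables.
# Usage: build_rule TABLE OPERATION CHAIN TARGET [MATCH...]
build_rule() {
    table=$1 operation=$2 chain=$3 target=$4
    shift 4
    # Same order as the syntax line: table, operation, chain, target.
    printf 'iptables -t %s -%s %s -j %s' "$table" "$operation" "$chain" "$target"
    # The remaining arguments are the matches; append them one by one.
    for match in "$@"; do
        printf ' %s' "$match"
    done
    printf '\n'
}

# Rebuilds the example rule: accept incoming TCP packets to the smtp port.
build_rule filter A INPUT ACCEPT -p tcp --dport smtp
```
%font "standard"
%size 5
	Running the printed line as root would install the rule; printing it first makes it easy to review.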
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
PART II - packet filtering

	Targets
		Builtin Targets to be used in filter table
			ACCEPT    accept the packet
			DROP      silently drop the packet
			QUEUE     enqueue packet to userspace
			RETURN    return to previous (calling) chain
			foobar    user defined chain
		Targets implemented as loadable modules
			REJECT    drop the packet but inform sender
			MIRROR    change source/destination IP and resend
			LOG       log via syslog
			ULOG      log via userspace

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
PART II - packet filtering

	Matches
		Basic matches
			-p    protocol (tcp/udp/icmp/...)
			-s    source address (ip/mask)
			-d    destination address (ip/mask)
			-i    incoming interface
			-o    outgoing interface
		Match extensions
			--dport         destination port
			--sport         source port
			--mac-source    source MAC address
			--mark          nfmark
			--tos           TOS field of IP header
			--ttl           TTL field of IP header
			--limit         rate limiting (n packets per timeframe)
			--owner         owner uid of the socket sending the packet

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
PART III - NAT

	Overview
		Previous Linux Kernels only implemented one special case of NAT: Masquerading
		Netfilter enables Linux to do any kind of NAT.
		All matches from packet filtering are available for the nat table, too
		We divide NAT into 'source NAT' and 'destination NAT'
		SNAT changes the packet's source while passing NF_IP_POST_ROUTING
		DNAT changes the packet's destination while passing NF_IP_PRE_ROUTING
		MASQUERADE is a special case of SNAT
		REDIRECT is a special case of DNAT

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
PART III - NAT

	Source NAT
		SNAT Example:
%font "typewriter"
%size 3
iptables -t nat -A POSTROUTING -j SNAT --to-source 1.2.3.4 -s 10.0.0.0/8
%font "standard"
%size 4
		Masquerading does almost the same as SNAT, but if the outgoing interface's address changes (for example on a dialup link with a dynamic IP), the new address is used.
		MASQUERADE Example:
%font "typewriter"
%size 3
iptables -t nat -A POSTROUTING -j MASQUERADE -o ppp0
%font "standard"
%size 5

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
PART III - NAT

	Destination NAT
		DNAT example:
%font "typewriter"
%size 3
iptables -t nat -A PREROUTING -j DNAT --to-destination 1.2.3.4:8080 -p tcp --dport 80 -i eth1
%font "standard"
%size 4
		REDIRECT is a special case of DNAT, which alters the destination to the address of the incoming interface.
		REDIRECT example:
%font "typewriter"
%size 3
iptables -t nat -A PREROUTING -j REDIRECT --to-port 3128 -i eth1 -p tcp --dport 80
%font "standard"
%size 5

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
PART IV - Packet mangling

	Change certain parts of a packet based on rules in IP tables
	Again, all the matches described in the packet filtering section are available.
	Currently, the supported packet mangling targets are:
		TOS     manipulate the TOS bits
		TTL     set / increase / decrease TTL field
		MARK    change the nfmark field of the skb
	Simple example:
%font "typewriter"
%size 3
iptables -t mangle -A PREROUTING -j MARK --set-mark 10 -p tcp --dport 80

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
Advanced Netfilter concepts

	Connection tracking
		Implemented separately from NAT
		Enables stateful filtering
		Implementation
			hooks into NF_IP_PRE_ROUTING to track packets
			hooks into NF_IP_POST_ROUTING and NF_IP_LOCAL_IN to drop information about connections which got filtered out
			protocol modules (currently TCP/UDP/ICMP)
			application helpers (currently FTP and IRC-DCC)
		Conntrack divides packets into the following four categories
			NEW - would establish new connection
			ESTABLISHED - part of already established connection
			RELATED - is related to established connection
			INVALID - (multicast, errors...)

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
Advanced Netfilter concepts
%size 4

	Userspace logging
		flexible replacement for old syslog-based logging
		packets to userspace via multicast netlink sockets
		easy-to-use library (libipulog)
		plugin-extensible userspace logging daemon already available
	Queuing
		reliable asynchronous packet handling
		packets to userspace via unicast netlink socket
		easy-to-use library (libipq)
		experimental queue multiplex daemon (ipqmpd)

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
Current Development and Future

	Netfilter (although it has proved very stable) is still work in progress.
	Areas of current development
		infrastructure for conntrack/nat helpers in userspace
		full TCP sequence number tracking
		multicast support for connection tracking
		more flexible matches (MAXCONN, ...)
		more conntrack and NAT modules (RPC, SNMP, SMB, ...)
		better IPv6 support (conntrack, more matches / targets)

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
Availability of slides / Links

	The slides and an accompanying paper for this presentation are available at
		http://www.gnumonks.org
	The netfilter homepage is mirrored at:
		http://netfilter.samba.org
		http://netfilter.kernelnotes.org
		http://netfilter.filewatcher.org
	More documents / netfilter extensions (ulogd, ipqmpd, ...)
		http://www.gnumonks.org/projects
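
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Advanced Linux Networking
Example: policy routing by fwmark (sketch)

	The fwmark match of the policy routing rules and the MARK target of the mangle table can be combined to route one service differently.
	The following is only a dry-run sketch that prints the commands; nothing is sent to the kernel.
	The table number, gateway address and interface name are assumptions chosen for illustration.
%font "typewriter"
%size 3
```shell
#!/bin/sh
# Dry-run sketch: print the commands that would route marked web traffic
# through a separate routing table. Nothing is executed against the kernel.
# All four values below are illustrative assumptions.
MARK=10             # mark value, as in the MARK example of the mangle table
TABLE=10            # hypothetical routing table number
GATEWAY=192.0.2.1   # hypothetical next hop (documentation address range)
DEV=eth1            # hypothetical outgoing interface

policy_route_commands() {
    # 1. Mark the packets of interest in the mangle table.
    echo "iptables -t mangle -A PREROUTING -j MARK --set-mark $MARK -p tcp --dport 80"
    # 2. Add a rule that sends marked packets to the extra routing table.
    echo "ip rule add fwmark $MARK table $TABLE"
    # 3. Give that table its own default route.
    echo "ip route add default via $GATEWAY dev $DEV table $TABLE"
    # 4. Flush the routing cache so that new lookups see the rule.
    echo "ip route flush cache"
}

policy_route_commands
```
%font "standard"
%size 5
	Marking happens in PREROUTING, so the mark is already set when the routing decision is made.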