Diffstat (limited to '2002/netfilter-internals-lt2002')
-rw-r--r-- | 2002/netfilter-internals-lt2002/abstract | 49
-rw-r--r-- | 2002/netfilter-internals-lt2002/biography | 22
-rw-r--r-- | 2002/netfilter-internals-lt2002/netfilter-internals-lt2002.mgp | 466
-rw-r--r-- | 2002/netfilter-internals-lt2002/netfilter-internals-lt2002.tex | 537
4 files changed, 1074 insertions, 0 deletions
diff --git a/2002/netfilter-internals-lt2002/abstract b/2002/netfilter-internals-lt2002/abstract new file mode 100644 index 0000000..1cc18b0 --- /dev/null +++ b/2002/netfilter-internals-lt2002/abstract @@ -0,0 +1,49 @@ +Linux 2.4.x netfilter/iptables firewalling internals (lt-690870524) + + The Linux 2.4.x kernel series has introduced a totally new kernel firewalling subsystem. It is much more than a plain successor of ipfwadm or ipchains. + + The netfilter/iptables project has a very modular design and its +sub-projects can be split into several parts: netfilter, iptables, connection +tracking, NAT and packet mangling. + + While most users will already have learned how to use the basic functions +of netfilter/iptables in order to convert their old ipchains firewalls to +iptables, there's more advanced but less used functionality in +netfilter/iptables. + + The presentation covers the design principles behind the netfilter/iptables +implementation. This knowledge enables us to understand how the individual +parts of netfilter/iptables fit together, and for which potential applications +this is useful. + +Topics covered: + +- overview of the internal netfilter/iptables architecture + - the netfilter hooks inside the network protocol stacks + - packet selection with IP tables + - how are connection tracking and NAT integrated into the framework +- the connection tracking system + - how well does it track the TCP state? + - how does it track ICMP and UDP state at all? + - layer 4 protocol helpers (GRE, ...) + - application helpers (ftp, irc, h323, ...) + - restrictions/limitations +- the NAT system + - how does it interact with connection tracking? + - layer 4 protocol helpers + - application helpers (ftp, irc, ...) +- misc + - how far is IPv6 firewalling with ip6tables? 
+ - advances in failover/HA of stateful firewalls + - invisible firewalls with iptables on a bridge + - userspace packet queueing with QUEUE + - userspace packet logging with ULOG + +Requirements: +- knowledge about the TCP/IP protocol family +- knowledge about general firewalling and packet filtering concepts +- prior experience with linux packet filters + +Audience: +- firewall administrators +- network developers diff --git a/2002/netfilter-internals-lt2002/biography b/2002/netfilter-internals-lt2002/biography new file mode 100644 index 0000000..27b77bd --- /dev/null +++ b/2002/netfilter-internals-lt2002/biography @@ -0,0 +1,22 @@ + <a href="http://www.gnumonks.org/users/laforge/">Harald Welte</a> is one +of the five <a href="http://www.netfilter.org/">netfilter/iptables</a> core +team members, and the current Linux 2.4.x firewalling maintainer. + + His main interest in computing has always been networking. In the little time +left besides netfilter/iptables related work, he's writing obscure documents +like the <a href="http://www.gnumonks.org/ftp/pub/doc/uucp-over-ssl.html"> UUCP +over SSL HOWTO</a>. Other kernel-related projects he has been contributing to are +user mode linux and the international (crypto) kernel patch. + + In the past he has been working as an independent IT Consultant working on +closed-source projects for various companies ranging from banks to +manufacturers of networking gear. During the year 2001 he was living in +Curitiba (Brazil), where he got sponsored for his Linux related work by +<a href="http://www.conectiva.com/">Conectiva Inc.</a>. + + Starting with February 2002, Harald has been contracted part-time by +<a href="http://www.astaro.com/">Astaro AG</a>, who are sponsoring him for his +current netfilter/iptables work. + + Harald is living in Erlangen, Germany. 
+ diff --git a/2002/netfilter-internals-lt2002/netfilter-internals-lt2002.mgp b/2002/netfilter-internals-lt2002/netfilter-internals-lt2002.mgp new file mode 100644 index 0000000..9487ff4 --- /dev/null +++ b/2002/netfilter-internals-lt2002/netfilter-internals-lt2002.mgp @@ -0,0 +1,466 @@ +%include "default.mgp" +%default 1 bgrad +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +%nodefault +%back "blue" + +%center +%size 7 + + +Linux 2.4.x netfilter/iptables +firewalling internals + + +%center +%size 4 +by + +Harald Welte <laforge@gnumonks.org> + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +netfilter/iptables in Linux 2.4 +Contents + + + Introduction + Netfilter hooks in protocol stacks + Packet selection based on IP Tables + The Connection Tracking Subsystem + The NAT Subsystem based on netfilter + iptables + Packet filtering using the 'filter' table + Packet mangling using the 'mangle' table + Advanced netfilter concepts + Current development and Future + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +netfilter/iptables in Linux 2.4 +Introduction + +Why did we need netfilter/iptables? +Because ipchains... 
+ + has no infrastructure for passing packets to userspace + makes transparent proxying extremely difficult + has interface address dependent Packet filter rules + has Masquerading implemented as part of packet filtering + code is too complex and intermixed with core ipv4 stack + is neither modular nor extensible + only barely supports one special case of NAT (masquerading) + has only stateless packet filtering + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +netfilter/iptables in Linux 2.4 +Introduction + +Who's behind netfilter/iptables + Paul 'Rusty' Russell + co-author of ipchains in Linux 2.2 + was paid by Watchguard for about one year of development + James Morris + userspace queuing (kernel, library and tools) + REJECT target + Marc Boucher + NAT and packet filtering controlled by one command + Mangle table + Harald Welte + Conntrack+NAT helper infrastructure (newnat) + Userspace packet logging (ULOG) + PPTP and IRC conntrack/NAT helpers + Jozsef Kadlecsik + TCP window tracking + H.323 conntrack + NAT helper + Continued newnat development + Non-core team contributors + http://www.netfilter.org/scoreboard/ +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +netfilter/iptables in Linux 2.4 +Netfilter Hooks + +What is netfilter? + + System of callback functions within network stack + Callback function to be called for every packet traversing certain point (hook) within network stack + Protocol independent framework + Hooks in layer 3 stacks (IPv4, IPv6, DECnet, ARP) + Multiple kernel modules can register with each of the hooks + Asynchronous packet handling in userspace (ip_queue) + +Traditional packet filtering, NAT, ... is implemented on top of this framework + +Can be used for other stuff interfacing with the core network stack, like DECnet routing daemon. 
+ +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +netfilter/iptables in Linux 2.4 +Netfilter Hooks + +Netfilter architecture in IPv4 +%font "courier" + + --->[1]--->[ROUTE]--->[3]--->[4]---> + | ^ + | | + | [ROUTE] + v | + [2] [5] + | ^ + | | + v | + +%font "standard" +1=NF_IP_PRE_ROUTING +2=NF_IP_LOCAL_IN +3=NF_IP_FORWARD +4=NF_IP_POST_ROUTING +5=NF_IP_LOCAL_OUT +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +netfilter/iptables in Linux 2.4 +Netfilter Hooks + +Netfilter Hooks + + Any kernel module may register a callback function at any of the hooks + + The module has to return one of the following constants + + NF_ACCEPT continue traversal as normal + NF_DROP drop the packet, do not continue + NF_STOLEN I've taken over the packet, do not continue + NF_QUEUE enqueue packet to userspace + NF_REPEAT call this hook again + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +netfilter/iptables in Linux 2.4 +IP tables + +Packet selection using IP tables + + The kernel provides generic IP tables support + + Each kernel module may create its own IP table + + The three major parts of 2.4 firewalling subsystem are implemented using IP tables + Packet filtering table 'filter' + NAT table 'nat' + Packet mangling table 'mangle' + + Can potentially be used for other stuff, e.g. 
IPsec SPDB + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +netfilter/iptables in Linux 2.4 +IP Tables + +Managing chains and tables + + An IP table consists out of multiple chains + A chain consists out of a list of rules + Every single rule in a chain consists out of + match[es] (rule executed if all matches true) + target (what to do if the rule is matched) + +%size 4 +matches and targets can either be builtin or implemented as kernel modules + +%size 6 + The userspace tool iptables is used to control IP tables + handles all different kinds of IP tables + supports a plugin/shlib interface for target/match specific options + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +netfilter/iptables in Linux 2.4 +IP Tables + +Basic iptables commands + + To build a complete iptables command, we must specify + which table to work with + which chain in this table to use + an operation (insert, add, delete, modify) + one or more matches (optional) + a target + +The syntax is +%font "typewriter" +%size 3 +iptables -t table -Operation chain -j target match(es) +%font "standard" +%size 5 + +Example: +%font "typewriter" +%size 3 +iptables -t filter -A INPUT -j ACCEPT -p tcp --dport smtp +%font "standard" +%size 5 + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +netfilter/iptables in Linux 2.4 +IP Tables + +Matches + Basic matches + -p protocol (tcp/udp/icmp/...) 
+ -s source address (ip/mask) + -d destination address (ip/mask) + -i incoming interface + -o outgoing interface + + Match extensions (examples) + tcp/udp TCP/udp source/destination port + icmp ICMP code/type + ah/esp AH/ESP SPID match + mac source MAC address + mark nfmark + length match on length of packet + limit rate limiting (n packets per timeframe) + owner owner uid of the socket sending the packet + tos TOS field of IP header + ttl TTL field of IP header + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +netfilter/iptables in Linux 2.4 +IP Tables + +Targets + very dependent on the particular table. + + Table specific targets will be discussed later + + Generic Targets, always available + ACCEPT accept packet within chain + DROP silently drop packet + QUEUE enqueue packet to userspace + LOG log packet via syslog + ULOG log packet via ulogd + RETURN return to previous (calling) chain + foobar jump to user defined chain + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +netfilter/iptables in Linux 2.4 +Packet Filtering + +Overview + + Implemented as 'filter' table + Registers with three netfilter hooks + + NF_IP_LOCAL_IN (packets destined for the local host) + NF_IP_FORWARD (packets forwarded by local host) + NF_IP_LOCAL_OUT (packets from the local host) + +Each of the three hooks has attached one chain (INPUT, FORWARD, OUTPUT) + +Every packet passes exactly one of the three chains. Note that this is very different compared to the old 2.2.x ipchains behaviour. 
+ + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +netfilter/iptables in Linux 2.4 +Packet Filtering + +Targets available within 'filter' table + + Builtin Targets to be used in filter table + ACCEPT accept the packet + DROP silently drop the packet + QUEUE enqueue packet to userspace + RETURN return to previous (calling) chain + foobar user defined chain + + Targets implemented as loadable modules + REJECT drop the packet but inform sender + MIRROR change source/destination IP and resend + LOG log via syslog + ULOG log via userspace + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +netfilter/iptables in Linux 2.4 +Connection Tracking Subsystem + + Connection tracking... + + implemented separately from NAT + enables stateful filtering + implementation + hooks into NF_IP_PRE_ROUTING to track packets + hooks into NF_IP_POST_ROUTING and NF_IP_LOCAL_IN to see if packet passed filtering rules + protocol modules (currently TCP/UDP/ICMP) + application helpers (currently FTP, IRC, H.323, talk, SNMP) + divides packets into the following four categories + NEW - would establish new connection + ESTABLISHED - part of already established connection + RELATED - is related to established connection + INVALID - (multicast, errors...) + does _NOT_ filter packets itself + can be utilized by iptables using the 'state' match + is used by NAT Subsystem + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +netfilter/iptables in Linux 2.4 +Network Address Translation + +Overview + + Previous Linux Kernels only implemented one special case of NAT: Masquerading + Linux 2.4.x can do any kind of NAT. 
+ NAT subsystem implemented on top of netfilter, iptables and conntrack + NAT subsystem registers with all five netfilter hooks + 'nat' Table registers chains PREROUTING, POSTROUTING and OUTPUT + Following targets available within 'nat' Table + SNAT changes the packet's source while passing NF_IP_POST_ROUTING + DNAT changes the packet's destination while passing NF_IP_PRE_ROUTING + MASQUERADE is a special case of SNAT + REDIRECT is a special case of DNAT + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +netfilter/iptables in Linux 2.4 +Network Address Translation + + Source NAT + SNAT Example: +%font "typewriter" +%size 3 +iptables -t nat -A POSTROUTING -j SNAT --to-source 1.2.3.4 -s 10.0.0.0/8 +%font "standard" +%size 4 + + MASQUERADE Example: +%font "typewriter" +%size 3 +iptables -t nat -A POSTROUTING -j MASQUERADE -o ppp0 +%font "standard" +%size 5 + + Destination NAT + DNAT example +%font "typewriter" +%size 3 +iptables -t nat -A PREROUTING -j DNAT --to-destination 1.2.3.4:8080 -p tcp --dport 80 -i eth1 +%font "standard" +%size 4 + + REDIRECT example +%font "typewriter" +%size 3 +iptables -t nat -A PREROUTING -j REDIRECT --to-port 3128 -i eth1 -p tcp --dport 80 +%font "standard" +%size 5 + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +netfilter/iptables in Linux 2.4 +Packet Mangling + + Purpose of mangle table + packet manipulation except address manipulation + + Integration with netfilter + 'mangle' table hooks in all five netfilter hooks + priority: after conntrack + + Targets specific to the 'mangle' table: + DSCP - manipulate DSCP field + IPV4OPTSSTRIP - strip IPv4 options + MARK - change the nfmark field of the skb + TCPMSS - set TCP MSS option + TOS - manipulate the TOS bits + TTL - set / increase / decrease TTL field + +Simple example: +%font "typewriter" +%size 3 +iptables -t mangle -A PREROUTING -j MARK --set-mark 10 -p tcp --dport 80 + + 
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +netfilter/iptables in Linux 2.4 +Advanced Netfilter concepts + +%size 4 + Userspace logging + flexible replacement for old syslog-based logging + packets to userspace via multicast netlink sockets + easy-to-use library (libipulog) + plugin-extensible userspace logging daemon (ulogd) + Can even be used to directly log into MySQL + + Queuing + reliable asynchronous packet handling + packets to userspace via unicast netlink socket + easy-to-use library (libipq) + provides Perl bindings + experimental queue multiplex daemon (ipqmpd) + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +netfilter/iptables in Linux 2.4 +Current Development and Future + +Netfilter (although it proved very stable) is still work in progress. + + Areas of current development + infrastructure for conntrack manipulation from userspace + failover of stateful firewalls + making iptables layer3 independent (pkttables) + new userspace library (libiptables) to hide plugins from apps + more matches and targets for advanced functions (pool, hashslot) + more conntrack and NAT modules (RPC, SNMP, SMB, ...) + better IPv6 support (conntrack, more matches / targets) + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +netfilter/iptables in Linux 2.4 +Thanks + + Thanks to + the BBS people, Z-Netz, FIDO, ... + for heavily increasing my computer usage in 1992 + + KNF + for bringing me in touch with the internet as early as 1994 + for providing a playground for technical people + for telling me about the existence of Linux! + + Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen + for implementing (one of?) 
the world's best TCP/IP stacks + + Paul 'Rusty' Russell + for starting the netfilter/iptables project + for trusting me to maintain it today + + Astaro AG + for sponsoring parts of my netfilter work + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +netfilter/iptables in Linux 2.4 +Availability of slides / Links + +The slides and the accompanying paper for this presentation are available at + http://www.gnumonks.org/ + +The netfilter homepage + http://www.netfilter.org/ + diff --git a/2002/netfilter-internals-lt2002/netfilter-internals-lt2002.tex b/2002/netfilter-internals-lt2002/netfilter-internals-lt2002.tex new file mode 100644 index 0000000..c3a28ea --- /dev/null +++ b/2002/netfilter-internals-lt2002/netfilter-internals-lt2002.tex @@ -0,0 +1,537 @@ +\documentclass{article} +\usepackage{german} +\usepackage{fancyheadings} +\usepackage{a4} + +\setlength{\oddsidemargin}{0in} +\setlength{\evensidemargin}{0in} +\setlength{\topmargin}{0.0in} +\setlength{\headheight}{0in} +\setlength{\headsep}{0in} +\setlength{\textwidth}{6.5in} +\setlength{\textheight}{9.5in} +\setlength{\parindent}{0in} +\setlength{\parskip}{0.05in} + + +\begin{document} +\title{Linux 2.4.x netfilter/iptables firewalling internals} + +\author{Harald Welte\\ + laforge@gnumonks.org\\ + \copyright{}2002 H. Welte} + +\date{25. April 2002} + +\maketitle + +\setcounter{section}{0} +\setcounter{subsection}{0} +\setcounter{subsubsection}{0} + +\section{Introduction} +The Linux 2.4.x kernel series has introduced a totally new kernel firewalling +subsystem. It is much more than a plain successor of ipfwadm or ipchains. + +The netfilter/iptables project has a very modular design and its +sub-projects can be split into several parts: netfilter, iptables, connection +tracking, NAT and packet mangling. 
+ +While most users will already have learned how to use the basic functions +of netfilter/iptables in order to convert their old ipchains firewalls to +iptables, there's more advanced but less used functionality in +netfilter/iptables. + +The presentation covers the design principles behind the netfilter/iptables +implementation. This knowledge enables us to understand how the individual +parts of netfilter/iptables fit together, and for which potential applications +this is useful. + +\section{Internal netfilter/iptables architecture} + +\subsection{Netfilter hooks in protocol stacks} + +One of the major motivations behind the redesign of the linux packet +filtering and NAT system during the 2.3.x kernel series was the widespread +firewall specific code parts within the core IPv4 stack. Ideally the core +IPv4 stack (as used by regular hosts and routers) shouldn't contain any +firewalling specific code, resulting in no unwanted interaction and less +code complexity. This desire led to the invention of {\it netfilter}. + +\subsubsection{Architecture of netfilter} + +Netfilter is basically a system of callback functions within the network +stack. It provides a non-portable API towards in-kernel networking +extensions. + +What we call a {\it netfilter hook} is a well-defined call-out point within a +layer three protocol stack, such as IPv4, IPv6 or DECnet. Any layer three +network stack can define an arbitrary number of hooks, usually placed at +strategic points within the packet flow. + +Any other kernel code can now subsequently register callback functions for +any of these hooks. As most systems will have more than one callback +function registered for a particular hook, a {\it priority} is specified upon +registration of the callback function. This priority defines the order in +which the individual callback functions at a particular hook are called. 
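The registration and priority mechanism described above can be illustrated with a small userspace model. This is only a sketch in Python, not kernel code: the `Hook` class, its `register` and `traverse` methods and the example callbacks are hypothetical names made up for illustration. It assumes that callbacks with a numerically lower priority run first and that any verdict other than NF_ACCEPT ends traversal of the hook; NF_QUEUE and NF_REPEAT are not modelled beyond that.

```python
# Verdict constants, named after the netfilter return values.
NF_DROP, NF_ACCEPT, NF_STOLEN, NF_QUEUE, NF_REPEAT = range(5)


class Hook:
    """Toy model of one netfilter hook (hypothetical, userspace only)."""

    def __init__(self, name):
        self.name = name
        self._callbacks = []  # entries are (priority, registration order, function)

    def register(self, fn, priority):
        # Keep the list sorted so that lower priority values are called first.
        self._callbacks.append((priority, len(self._callbacks), fn))
        self._callbacks.sort(key=lambda entry: (entry[0], entry[1]))

    def traverse(self, packet):
        # Call every registered callback in priority order; stop at the
        # first verdict that is not NF_ACCEPT.
        for _, _, fn in self._callbacks:
            verdict = fn(packet)
            if verdict != NF_ACCEPT:
                return verdict
        return NF_ACCEPT


calls = []  # records which callbacks actually ran


def conntrack(packet):
    calls.append("conntrack")
    return NF_ACCEPT


def packet_filter(packet):
    calls.append("filter")
    return NF_DROP if packet.get("dport") == 23 else NF_ACCEPT


def logger(packet):
    calls.append("log")
    return NF_ACCEPT


pre_routing = Hook("NF_IP_PRE_ROUTING")
pre_routing.register(logger, 100)
pre_routing.register(packet_filter, 0)
pre_routing.register(conntrack, -200)

verdict = pre_routing.traverse({"dport": 23})
```

In this model the telnet packet is seen by `conntrack` and `packet_filter`, but never reaches `logger`, because the drop verdict ends traversal.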
+ +The return value of any registered callback function can be: +\begin{itemize} +\item +{\bf NF\_ACCEPT}: continue traversal as usual +\item +{\bf NF\_DROP}: drop the packet; do not continue traversal +\item +{\bf NF\_STOLEN}: callback function has taken over the packet; do not continue +\item +{\bf NF\_QUEUE}: enqueue the packet to userspace +\item +{\bf NF\_REPEAT}: call this hook again +\end{itemize} + +\subsubsection{Netfilter hooks within IPv4} + +The IPv4 stack provides five netfilter hooks, which are placed at the +following particular places within the code: + +\begin{verbatim} + --->[1]--->[ROUTE]--->[3]--->[4]---> + | ^ + | | + | [ROUTE] + v | + [2] [5] + | ^ + | | + v | + + local processes +\end{verbatim} + +Packets received on any network interface arrive at the left side of the +diagram. After the verification of the IP header checksum, the +NF\_IP\_PRE\_ROUTING [1] hook is traversed. + +If the packet ``survives'' (i.e. NF\_ACCEPT is returned), it enters the +routing code. Where we continue from here depends on the destination of the +packet. + +Packets with a local destination (i.e. packets where the destination address is +one of the own IP addresses of the host) traverse the NF\_IP\_LOCAL\_IN [2] +hook. If all callback functions return NF\_ACCEPT, the packet is finally passed +to the socket code, which eventually passes the packet to a local process. + +Packets with a remote destination (i.e. packets which are forwarded by the +local machine) traverse the NF\_IP\_FORWARD [3] hook. If they ``survive'', +they finally pass the NF\_IP\_POST\_ROUTING [4] hook and are sent off the +outgoing network interface. + +Locally generated packets first traverse the NF\_IP\_LOCAL\_OUT [5] hook, then +enter the routing code, and finally go through the NF\_IP\_POST\_ROUTING [4] +hook before being sent off the outgoing network interface. 
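The three packet paths described above can be summarised in a few lines. The following Python sketch is purely illustrative: `ipv4_hook_path` is a hypothetical helper, not a kernel function, and it assumes that every callback at every hook returns NF\_ACCEPT, so the packet is never dropped on the way.

```python
from typing import List


def ipv4_hook_path(locally_generated: bool, destination_is_local: bool) -> List[str]:
    """Return the IPv4 netfilter hooks a packet traverses, in order.

    Assumption: all hooks return NF_ACCEPT, so the packet survives.
    """
    if locally_generated:
        # Local process -> [5] -> routing -> [4] -> outgoing interface
        return ["NF_IP_LOCAL_OUT", "NF_IP_POST_ROUTING"]
    if destination_is_local:
        # Incoming interface -> [1] -> routing -> [2] -> local process
        return ["NF_IP_PRE_ROUTING", "NF_IP_LOCAL_IN"]
    # Incoming interface -> [1] -> routing -> [3] -> [4] -> outgoing interface
    return ["NF_IP_PRE_ROUTING", "NF_IP_FORWARD", "NF_IP_POST_ROUTING"]


forwarded = ipv4_hook_path(locally_generated=False, destination_is_local=False)
delivered = ipv4_hook_path(locally_generated=False, destination_is_local=True)
sent = ipv4_hook_path(locally_generated=True, destination_is_local=False)
```

Only forwarded packets see both NF\_IP\_PRE\_ROUTING and NF\_IP\_POST\_ROUTING, which matches the numbering [1], [3], [4] in the diagram.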
+ +\subsubsection{Netfilter hooks within IPv6} + +As the IPv4 and IPv6 protocols are very similar, the netfilter hooks within the +IPv6 stack are placed at exactly the same locations as in the IPv4 stack. The +only difference is the hook names: NF\_IP6\_PRE\_ROUTING, NF\_IP6\_LOCAL\_IN, +NF\_IP6\_FORWARD, NF\_IP6\_POST\_ROUTING, NF\_IP6\_LOCAL\_OUT. + +\subsubsection{Netfilter hooks within DECnet} + +There are seven DECnet hooks. The first five hooks (NF\_DN\_PRE\_ROUTING, +NF\_DN\_LOCAL\_IN, NF\_DN\_FORWARD, NF\_DN\_LOCAL\_OUT, NF\_DN\_POST\_ROUTING) +are pretty much the same as in IPv4. The last two hooks (NF\_DN\_HELLO, +NF\_DN\_ROUTE) are used in conjunction with DECnet Hello and Routing packets. + +\subsubsection{Netfilter hooks within ARP} + +Recent kernels\footnote{IIRC, starting with 2.4.19-pre3} have added support for netfilter hooks within the ARP code. +There are two hooks: NF\_ARP\_IN and NF\_ARP\_OUT, for incoming and outgoing +ARP packets respectively. + +\subsubsection{Netfilter hooks within IPX} + +There have been experimental patches to add netfilter hooks to the IPX code, +but they never got integrated into the kernel source. + +\subsection{Packet selection using IP Tables} + +The IP tables core (ip\_tables.o) provides a generic layer for evaluation +of rulesets. + +An IP table consists of an arbitrary number of {\it chains}, which in turn +consist of a linear list of {\it rules}, which again consist of any +number of {\it matches} and one {\it target}. + +{\it Chains} can further be divided into two classes: Either {\it builtin +chains} or {\it user-defined chains}. Builtin chains are always present, they +are created upon table registration. They are also the entry points for table +iteration. User defined chains are created at runtime upon user interaction. + +{\it Matches} specify the matching criteria; there can be zero or more matches +per rule. + +{\it Targets} specify the action which is to be executed in case {\bf all} +matches match. 
There can only be a single target per rule. + +Matches and targets can either be {\it builtin} or {\it linux kernel modules}. + +There are two special targets: +\begin{itemize} +\item +By using a chain name as target, it is possible to jump to the respective chain +in case the matches match. +\item +By using the RETURN target, it is possible to return to the previous (calling) +chain +\end{itemize} + +The IP tables core handles the following functions +\begin{itemize} +\item +Registering and unregistering tables +\item +Registering and unregistering matches and targets (can be implemented as linux kernel modules) +\item +Kernel / userspace interface for manipulation of IP tables +\item +Traversal of IP tables +\end{itemize} + +\subsubsection{Packet filtering using the ``filter'' table} + +Traditional packet filtering (i.e. the successor to ipfwadm/ipchains) takes +place in the ``filter'' table. Packet filtering works like a sieve: A packet +is (in the end) either dropped or accepted - but never modified. + +The ``filter'' table is implemented in the {\it iptable\_filter.o} module +and contains three builtin chains: + +\begin{itemize} +\item +{\bf INPUT} attaches to NF\_IP\_LOCAL\_IN +\item +{\bf FORWARD} attaches to NF\_IP\_FORWARD +\item +{\bf OUTPUT} attaches to NF\_IP\_LOCAL\_OUT +\end{itemize} + +The placement of the chains / hooks is done in such a way that every conceivable +packet always traverses only one of the built-in chains. Packets destined for +the local host traverse only INPUT, packets forwarded only FORWARD and +locally-originated packets only OUTPUT. + +\subsubsection{Packet mangling using the ``mangle'' table} + +As stated above, operations which would modify a packet do not belong in the +``filter'' table. The ``mangle'' table is available for all kinds of packet +manipulation - but not manipulation of addresses (which is NAT). 
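The way the IP tables core walks chains, rules, matches and targets (including jumps to user-defined chains and the RETURN target) applies to every table, including ``filter'' and ``mangle''. The following is a minimal userspace sketch in Python under several assumptions: a table is modelled as a dict of chain names to rule lists, matches are simple equality tests on packet fields, and `rule`, `run_chain` and `filter_packet` are hypothetical names. The `policy` argument models the default verdict of a builtin chain (as set with `iptables -P`), which the text above does not discuss.

```python
ACCEPT, DROP, RETURN = "ACCEPT", "DROP", "RETURN"


def rule(target, **matches):
    """A rule is zero or more matches plus exactly one target."""
    return {"matches": matches, "target": target}


def run_chain(table, chain_name, packet):
    """Walk one chain; return a verdict, or None to return to the caller."""
    for r in table[chain_name]:
        # The target only fires if ALL matches of the rule match.
        if not all(packet.get(k) == v for k, v in r["matches"].items()):
            continue
        target = r["target"]
        if target == RETURN:
            return None  # go back to the calling chain
        if target in table:  # a chain name as target means: jump
            verdict = run_chain(table, target, packet)
            if verdict is not None:
                return verdict
            continue  # the called chain returned; go on with the next rule
        return target  # terminal verdict such as ACCEPT or DROP
    return None  # end of chain behaves like an implicit RETURN


def filter_packet(table, builtin_chain, packet, policy=ACCEPT):
    """Start at a builtin chain; fall back to its policy if nothing decided."""
    verdict = run_chain(table, builtin_chain, packet)
    return policy if verdict is None else verdict


# Hypothetical ruleset: one builtin chain and one user-defined chain.
table = {
    "INPUT": [
        rule("tcp_checks", proto="tcp"),  # jump to the user-defined chain
        rule(DROP, proto="udp"),
    ],
    "tcp_checks": [
        rule(ACCEPT, dport=25),
        rule(RETURN, dport=80),
        rule(DROP),  # no matches: applies to every packet reaching it
    ],
}

smtp = filter_packet(table, "INPUT", {"proto": "tcp", "dport": 25})
http = filter_packet(table, "INPUT", {"proto": "tcp", "dport": 80})
ssh = filter_packet(table, "INPUT", {"proto": "tcp", "dport": 22})
```

Here the HTTP packet hits RETURN in `tcp_checks`, continues in INPUT, matches nothing else and ends up with the chain policy.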
+ +The mangle table attaches to all five netfilter hooks and provides the +respective builtin chains (PREROUTING, INPUT, FORWARD, OUTPUT, POSTROUTING) +\footnote{This has changed through recent 2.4.x kernel series, old kernels may +only support three (PREROUTING, POSTROUTING, OUTPUT) chains.}. + +\subsection{Connection Tracking Subsystem} + +Traditional packet filters can only match on matching criteria within the +currently processed packet, like source/destination IP address, port numbers, +TCP flags, etc. As most applications have a notion of connections or at least +a request/response style protocol, there is a lot of information which cannot +be derived from looking at a single packet. + +Thus, modern (stateful) packet filters attempt to track connections (flows) +and their respective protocol states for all traffic through the packet +filter. + +Connection tracking within linux is implemented as a netfilter module, called +ip\_conntrack.o. + +Before describing the connection tracking subsystem, we need to introduce a couple of definitions and primitives used throughout the conntrack code. + +A connection is represented within the conntrack subsystem using {\it struct +ip\_conntrack}, also called {\it connection tracking entry}. + +Connection tracking utilizes {\it conntrack tuples}, which are tuples +consisting of (srcip, srcport, dstip, dstport, l4prot). A connection is +uniquely identified by two tuples: The tuple in the original direction +(IP\_CT\_DIR\_ORIGINAL) and the tuple for the reply direction +(IP\_CT\_DIR\_REPLY). + +Connection tracking itself does not drop packets\footnote{well, in some rare +cases in combination with NAT it needs to drop. But don't tell anyone, this is +secret.} or impose any policy. It just associates every packet with a +connection tracking entry, which in turn has a particular state. 
All other +kernel code can use this state information\footnote{state information is +internally represented via the {\it struct sk\_buff.nfct} structure member of a +packet.}. + +\subsubsection{Integration of conntrack with netfilter} + +If the ip\_conntrack.o module is registered with netfilter, it attaches to the +NF\_IP\_PRE\_ROUTING, NF\_IP\_POST\_ROUTING, NF\_IP\_LOCAL\_IN and +NF\_IP\_LOCAL\_OUT hooks. + +Because forwarded packets are the most common case on firewalls, I will only +describe how connection tracking works for forwarded packets. The two relevant +hooks for forwarded packets are NF\_IP\_PRE\_ROUTING and NF\_IP\_POST\_ROUTING. + +Every time a packet arrives at the NF\_IP\_PRE\_ROUTING hook, connection +tracking creates a conntrack tuple from the packet. It then compares this +tuple to the original and reply tuples of all already-seen connections +\footnote{Of course this is not implemented as a linear search over all existing connections.} to find out if this just-arrived packet belongs to any existing +connection. If there is no match, a new conntrack table entry (struct +ip\_conntrack) is created. + +Let's assume the case where we have no existing connections and are +starting from scratch. + +The first packet comes in, we derive the tuple from the packet headers, look up +the conntrack hash table, and don't find any matching entry. As a result, we +create a new struct ip\_conntrack. This struct ip\_conntrack is filled with +all necessary data, like the original and reply tuple of the connection. +How do we know the reply tuple? By inverting the source and destination +parts of the original tuple.\footnote{So why do we need two tuples, if they can +be derived from each other? Wait until we discuss NAT.} +Please note that this new struct ip\_conntrack is {\bf not} yet placed +into the conntrack hash table. + +The packet is now passed on to other callback functions which have registered +with a lower priority at NF\_IP\_PRE\_ROUTING. 
It then continues traversal of +the network stack as usual, including all respective netfilter hooks. + +If the packet survives (i.e. is not dropped by the routing code, network stack, +firewall ruleset, ...), it re-appears at NF\_IP\_POST\_ROUTING. In this case, +we can now safely assume that this packet will be sent off on the outgoing +interface, and thus put the connection tracking entry which we created at +NF\_IP\_PRE\_ROUTING into the conntrack hash table. This process is called +{\it confirming the conntrack}. + +The connection tracking code itself is not monolithic, but consists of a +couple of separate modules\footnote{They don't actually have to be separate +kernel modules; e.g. TCP, UDP and ICMP tracking modules are all part of +the linux kernel module ip\_conntrack.o}. Besides the conntrack core, there +are two important kinds of modules: Protocol helpers and application helpers. + +Protocol helpers implement the layer-4-protocol specific parts. They currently +exist for TCP, UDP and ICMP (an experimental helper for GRE exists). + +\subsubsection{TCP connection tracking} + +As TCP is a connection oriented protocol, it is not very difficult to imagine +how connection tracking for this protocol could work. There are well-defined +state transitions possible, and conntrack can decide which state transitions +are valid within the TCP specification. In reality it's not all that easy, +since we cannot assume that all packets that pass the packet filter actually +arrive at the receiving end, ... + +It is noteworthy that the standard connection tracking code does {\bf not} +do TCP sequence number and window tracking. A well-maintained patch to add +this feature has existed almost as long as connection tracking itself. It will +be integrated with the 2.5.x kernel. The problem with window tracking is +its bad interaction with connection pickup. The TCP conntrack code is able to +pick up already existing connections, e.g. in case your firewall was rebooted. 
+However, connection pickup conflicts with TCP window tracking: The TCP +window scaling option is only transferred at connection setup time, and we +don't know about it in case of pickup... + +\subsubsection{ICMP tracking} + +ICMP is not really a connection oriented protocol. So how is it possible to +do connection tracking for ICMP? + +The ICMP protocol can be split into two groups of messages: + +\begin{itemize} +\item +ICMP error messages, which sort-of belong to a different connection. +ICMP error messages are marked as {\it RELATED} to that connection. +(ICMP\_DEST\_UNREACH, ICMP\_SOURCE\_QUENCH, ICMP\_TIME\_EXCEEDED, +ICMP\_PARAMETERPROB, ICMP\_REDIRECT). +\item +ICMP queries, which have a request/reply character. What the conntrack +code does is give the request a state of {\it NEW}, and the reply +{\it ESTABLISHED}. The reply closes the connection immediately. +(ICMP\_ECHO, ICMP\_TIMESTAMP, ICMP\_INFO\_REQUEST, ICMP\_ADDRESS) +\end{itemize} + +\subsubsection{UDP connection tracking} + +UDP is designed as a connectionless datagram protocol. But most common +protocols using UDP as layer 4 protocol have bi-directional UDP communication. +Imagine a DNS query, where the client sends a UDP datagram to port 53 of the +nameserver, and the nameserver sends back a DNS reply packet from its UDP +port 53 to the client. + +Netfilter treats this as a connection. The first packet (the DNS request) is +assigned a state of {\it NEW}, because the packet is expected to create a new +'connection'. The DNS server's reply packet is marked as {\it ESTABLISHED}. + +\subsubsection{conntrack application helpers} + +More complex application protocols involving multiple connections need special +support by a so-called ``conntrack application helper module''. The stock +kernel comes with modules for FTP and IRC (DCC). Netfilter CVS currently contains +patches for PPTP, H.323, Eggdrop botnet, tftp and talk. We're still lacking +a lot of protocols (e.g. 
SIP, SMB/CIFS) - but they are unlikely to appear +until somebody really needs them and either develops them on his own or +funds development. + +\subsubsection{Integration of connection tracking with iptables} + +As stated earlier, conntrack doesn't impose any policy on packets. It just +determines the relation of a packet to already existing connections. To base +packet filtering decisions on this state information, the iptables {\it state} +match can be used. Every packet is within one of the following categories: + +\begin{itemize} +\item +{\bf NEW}: packet would create a new connection, if it survives +\item +{\bf ESTABLISHED}: packet is part of an already established connection +(either direction) +\item +{\bf RELATED}: packet is in some way related to an already established connection, e.g. ICMP errors or FTP data sessions +\item +{\bf INVALID}: conntrack is unable to derive conntrack information from this packet. Please note that all multicast or broadcast packets fall in this category. +\end{itemize} + +\subsection{NAT Subsystem} + +The NAT (Network Address Translation) subsystem is probably the worst +documented subsystem within the whole framework. This has two reasons: NAT is +nasty and complicated, and the Linux 2.4.x NAT implementation is easy to use, so +nobody needs to know the nasty details. + +Nonetheless, as I have traditionally concentrated mostly on the conntrack and NAT +systems, I will give a short overview. + +NAT uses almost all of the previously described subsystems: +\begin{itemize} +\item +IP tables to specify which packets to NAT in which particular way. NAT +registers a ``nat'' table with PREROUTING, POSTROUTING and OUTPUT chains. +\item +Connection tracking to associate NAT state with the connection. +\item +Netfilter to do the actual packet manipulation transparently to the rest of the +kernel. NAT registers with NF\_IP\_PRE\_ROUTING, NF\_IP\_POST\_ROUTING, +NF\_IP\_LOCAL\_IN and NF\_IP\_LOCAL\_OUT. 
+\end{itemize} + +The NAT implementation supports all kinds of different NAT: source NAT, +destination NAT, NAT to address/port ranges, 1:1 NAT, ... + +This fundamental design principle is still frequently misunderstood:\\ +The information about which NAT mappings apply to a certain connection +is only gathered once - with the first packet of every connection. + +So let's start to look at the life of a poor to-be-nat'ed packet. +For ease of understanding, I have chosen to describe the most frequently +used NAT scenario: Source NAT of a forwarded packet. Let's assume the +packet has an original source address of 1.1.1.1, an original destination +address of 2.2.2.2, and is going to be SNAT'ed to 9.9.9.9. Let's further +ignore the fact that there are port numbers. + +Once upon a time, our poor packet arrives at NF\_IP\_PRE\_ROUTING, where +conntrack has registered with highest priority. This means that a conntrack +entry with the following two tuples is created: +\begin{verbatim} +IP_CT_DIR_ORIGINAL: 1.1.1.1 -> 2.2.2.2 +IP_CT_DIR_REPLY: 2.2.2.2 -> 1.1.1.1 +\end{verbatim} +After conntrack, the packet traverses the PREROUTING chain of the ``nat'' +IP table. Since only destination NAT happens at PREROUTING, no action +occurs. After its lengthy way through the rest of the network stack, +the packet arrives at the NF\_IP\_POST\_ROUTING hook, where it traverses +the POSTROUTING chain of the ``nat'' table. Here it hits a SNAT rule, +causing the following actions: +\begin{itemize} +\item +Fill in a {\it struct ip\_nat\_manip}, indicating the new source address +and the type of NAT (source NAT at POSTROUTING). This struct is part of the +conntrack entry. +\item +Automatically derive the inverse NAT transformation for the reply packets: +Destination NAT at PREROUTING. Fill in another {\it struct ip\_nat\_manip}. 
+\item +Alter the REPLY tuple of the conntrack entry to +\begin{verbatim} +IP_CT_DIR_REPLY: 2.2.2.2 -> 9.9.9.9 +\end{verbatim} +\item +Apply the SNAT transformation to the packet +\end{itemize} + +Every other packet within this connection, independent of its direction, +will only execute the last step. Since all NAT information is connected +with the conntrack entry, there is no need to do anything but to apply +the same transformations to all packets within the same connection. + +\subsection{IPv6 Firewalling with ip6tables} + +Yes, Linux 2.4.x comes with a usable, though incomplete, system to secure +your IPv6 network. + +The parts ported to IPv6 are +\begin{itemize} +\item +IP tables (called IP6 tables) +\item +The ``filter'' table +\item +The ``mangle'' table +\item +The userspace library (libip6tc) +\item +The command line tool (ip6tables) +\end{itemize} + +Due to the lack of conntrack and NAT\footnote{thank god we don't have NAT +with IPv6}, only traditional, stateless packet filtering is possible. Apart +from the obvious matches/targets, ip6tables can match on +\begin{itemize} +\item +{\it EUI64 checker}; verifies if the MAC address of the sender is the same as in the EUI64 64 least significant bits of the source IPv6 address +\item +{\it frag6 match}, matches on IPv6 fragmentation header +\item +{\it route6 match}, matches on IPv6 routing header +\item +{\it ahesp6 match}, matches on SPIs within AH or ESP over IPv6 packets +\end{itemize} + +However, the ip6tables code doesn't seem to be used very widely (yet?). +So please expect some potential remaining issues, since it is not tested +as heavily as iptables. + +\subsection{Recent Development} + +Please refer to the spoken word at the presentation. Development at the +time this paper was written can be quite different from development at the +time the presentation is held. 
+ +\section{Thanks} + +I'd like to thank +\begin{itemize} +\item +{\it Linus Torvalds} for starting this interesting UNIX-like kernel +\item +{\it Alan Cox, David Miller, Alexey Kuznetsov, Andi Kleen} for building +(one of?) the world's best TCP/IP stacks. +\item +{\it Paul ``Rusty'' Russell} for starting the netfilter/iptables project +\item +{\it The Netfilter Core Team} for continuing the netfilter/iptables effort +\item +{\it Astaro AG} for partially funding my current netfilter/iptables work +\item +{\it Conectiva Inc.} for partially funding parts of my past netfilter/iptables +work and for inviting me to live in Brazil +\item +{\it samba.org and Kommunikationsnetz Franken e.V.} for hosting the netfilter +homepage, CVS, mailing lists, ... +\end{itemize} + +\end{document}