author Harald Welte <laforge@gnumonks.org> 2015-10-25 21:00:20 +0100
committer Harald Welte <laforge@gnumonks.org> 2015-10-25 21:00:20 +0100
commit fca59bea770346cf1c1f9b0e00cb48a61b44a8f3
tree a2011270df48d3501892ac1a56015c8be57e8a7d /2002/netfilter-internals-lsm2002
import of old now defunct presentation slides svn repo
Diffstat (limited to '2002/netfilter-internals-lsm2002')
-rw-r--r-- 2002/netfilter-internals-lsm2002/abstract 49
-rw-r--r-- 2002/netfilter-internals-lsm2002/netfilter-internals-lsm2002.mgp 520
-rw-r--r-- 2002/netfilter-internals-lsm2002/netfilter-internals-lsm2002.tex 537
3 files changed, 1106 insertions, 0 deletions
diff --git a/2002/netfilter-internals-lsm2002/abstract b/2002/netfilter-internals-lsm2002/abstract
new file mode 100644
index 0000000..1cc18b0
--- /dev/null
+++ b/2002/netfilter-internals-lsm2002/abstract
@@ -0,0 +1,49 @@
+Linux 2.4.x netfilter/iptables firewalling internals (lt-690870524)
+
+ The Linux 2.4.x kernel series has introduced a totally new kernel firewalling subsystem. It is much more than a plain successor of ipfwadm or ipchains.
+
+ The netfilter/iptables project has a very modular design and its
+sub-projects can be split in several parts: netfilter, iptables, connection
+tracking, NAT and packet mangling.
+
+ While most users will already have learned how to use the basic functions
+of netfilter/iptables in order to convert their old ipchains firewalls to
+iptables, there's more advanced but less used functionality in
+netfilter/iptables.
+
+ The presentation covers the design principles behind the netfilter/iptables
+implementation. This knowledge enables us to understand how the individual
+parts of netfilter/iptables fit together, and for which potential applications
+this is useful.
+
+Topics covered:
+
+- overview of the internal netfilter/iptables architecture
+ - the netfilter hooks inside the network protocol stacks
+ - packet selection with IP tables
+ - how are connection tracking and NAT integrated into the framework
+- the connection tracking system
+ - how well does it track the TCP state?
+ - how does it track ICMP and UDP state at all?
+ - layer 4 protocol helpers (GRE, ...)
+ - application helpers (ftp, irc, h323, ...)
+ - restrictions/limitations
+- the NAT system
+ - how does it interact with connection tracking?
+ - layer 4 protocol helpers
+ - application helpers (ftp, irc, ...)
+- misc
+ - how far is IPv6 firewalling with ip6tables?
+ - advances in failover/HA of stateful firewalls
+ - invisible firewalls with iptables on a bridge
+ - userspace packet queueing with QUEUE
+ - userspace packet logging with ULOG
+
+Requirements:
+- knowledge about the TCP/IP protocol family
+- knowledge about general firewalling and packet filtering concepts
+- prior experience with linux packet filters
+
+Audience:
+- firewall administrators
+- network developers
diff --git a/2002/netfilter-internals-lsm2002/netfilter-internals-lsm2002.mgp b/2002/netfilter-internals-lsm2002/netfilter-internals-lsm2002.mgp
new file mode 100644
index 0000000..fb8b444
--- /dev/null
+++ b/2002/netfilter-internals-lsm2002/netfilter-internals-lsm2002.mgp
@@ -0,0 +1,520 @@
+%include "default.mgp"
+%default 1 bgrad
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+%nodefault
+%back "blue"
+
+%center
+%size 7
+
+
+Linux 2.4.x netfilter/iptables
+firewalling internals
+
+
+%center
+%size 4
+by
+
+Harald Welte <laforge@gnumonks.org>
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+netfilter/iptables in Linux 2.4
+Contents
+
+
+ Introduction
+ Netfilter hooks in protocol stacks
+ Packet selection based on IP Tables
+ The Connection Tracking Subsystem
+ The NAT Subsystem based on netfilter + iptables
+ Packet filtering using the 'filter' table
+ Packet mangling using the 'mangle' table
+ Advanced netfilter concepts
+ Current development and Future
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+netfilter/iptables in Linux 2.4
+Introduction
+
+Why did we need netfilter/iptables?
+Because ipchains...
+
+ has no infrastructure for passing packets to userspace
+ makes transparent proxying extremely difficult
+ has interface address dependent Packet filter rules
+ has Masquerading implemented as part of packet filtering
+ code is too complex and intermixed with core ipv4 stack
+ is neither modular nor extensible
+ only barely supports one special case of NAT (masquerading)
+ has only stateless packet filtering
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+netfilter/iptables in Linux 2.4
+Introduction
+
+Who's behind netfilter/iptables
+ Paul 'Rusty' Russell
+ co-author of ipchains in Linux 2.2
+ was paid by Watchguard for about one year of development
+ James Morris
+ userspace queuing (kernel, library and tools)
+ REJECT target
+ Marc Boucher
+ NAT and packet filtering controlled by one command
+ Mangle table
+ Harald Welte
+ Conntrack+NAT helper infrastructure (newnat)
+ Userspace packet logging (ULOG)
+ PPTP and IRC conntrack/NAT helpers
+ Jozsef Kadlecsik
+ TCP window tracking
+ H.323 conntrack + NAT helper
+ Continued newnat development
+ Non-core team contributors
+ http://www.netfilter.org/scoreboard/
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+netfilter/iptables in Linux 2.4
+Netfilter Hooks
+
+What is netfilter?
+
+ System of callback functions within network stack
+ Callback function to be called for every packet traversing certain point (hook) within network stack
+ Protocol independent framework
+ Hooks in layer 3 stacks (IPv4, IPv6, DECnet, ARP)
+ Multiple kernel modules can register with each of the hooks
+ Asynchronous packet handling in userspace (ip_queue)
+
+Traditional packet filtering, NAT, ... is implemented on top of this framework
+
+Can be used for other stuff interfacing with the core network stack, like DECnet routing daemon.
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+netfilter/iptables in Linux 2.4
+Netfilter Hooks
+
+Netfilter architecture in IPv4
+%font "courier"
+
+ --->[1]--->[ROUTE]--->[3]--->[4]--->
+ | ^
+ | |
+ | [ROUTE]
+ v |
+ [2] [5]
+ | ^
+ | |
+ v |
+
+%font "standard"
+1=NF_IP_PRE_ROUTING
+2=NF_IP_LOCAL_IN
+3=NF_IP_FORWARD
+4=NF_IP_POST_ROUTING
+5=NF_IP_LOCAL_OUT
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+netfilter/iptables in Linux 2.4
+Netfilter Hooks
+
+Netfilter Hooks
+
+ Any kernel module may register a callback function at any of the hooks
+
+ The module has to return one of the following constants
+
+ NF_ACCEPT continue traversal as normal
+ NF_DROP drop the packet, do not continue
+ NF_STOLEN I've taken over the packet, do not continue
+ NF_QUEUE enqueue packet to userspace
+ NF_REPEAT call this hook again
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+netfilter/iptables in Linux 2.4
+IP tables
+
+Packet selection using IP tables
+
+ The kernel provides generic IP tables support
+
+ Each kernel module may create its own IP table
+
+ The three major parts of 2.4 firewalling subsystem are implemented using IP tables
+ Packet filtering table 'filter'
+ NAT table 'nat'
+ Packet mangling table 'mangle'
+
+ Can potentially be used for other stuff, e.g. IPsec SPDB
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+netfilter/iptables in Linux 2.4
+IP Tables
+
+Managing chains and tables
+
+ An IP table consists of multiple chains
+ A chain consists of a list of rules
+ Every single rule in a chain consists of
+ match[es] (rule executed if all matches true)
+ target (what to do if the rule is matched)
+
+%size 4
+matches and targets can either be builtin or implemented as kernel modules
+
+%size 6
+ The userspace tool iptables is used to control IP tables
+ handles all different kinds of IP tables
+ supports a plugin/shlib interface for target/match specific options
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+netfilter/iptables in Linux 2.4
+IP Tables
+
+Basic iptables commands
+
+ To build a complete iptables command, we must specify
+ which table to work with
+ which chain in this table to use
+ an operation (insert, add, delete, modify)
+ one or more matches (optional)
+ a target
+
+The syntax is
+%font "typewriter"
+%size 3
+iptables -t table -Operation chain -j target match(es)
+%font "standard"
+%size 5
+
+Example:
+%font "typewriter"
+%size 3
+iptables -t filter -A INPUT -j ACCEPT -p tcp --dport smtp
+%font "standard"
+%size 5
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+netfilter/iptables in Linux 2.4
+IP Tables
+
+Matches
+ Basic matches
+ -p protocol (tcp/udp/icmp/...)
+ -s source address (ip/mask)
+ -d destination address (ip/mask)
+ -i incoming interface
+ -o outgoing interface
+
+ Match extensions (examples)
+ tcp/udp TCP/UDP source/destination port
+ icmp ICMP code/type
+ ah/esp AH/ESP SPI match
+ mac source MAC address
+ mark nfmark
+ length match on length of packet
+ limit rate limiting (n packets per timeframe)
+ owner owner uid of the socket sending the packet
+ tos TOS field of IP header
+ ttl TTL field of IP header
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+netfilter/iptables in Linux 2.4
+IP Tables
+
+Targets
+ very dependent on the particular table.
+
+ Table specific targets will be discussed later
+
+ Generic Targets, always available
+ ACCEPT accept packet within chain
+ DROP silently drop packet
+ QUEUE enqueue packet to userspace
+ LOG log packet via syslog
+ ULOG log packet via ulogd
+ RETURN return to previous (calling) chain
+ foobar jump to user defined chain
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+netfilter/iptables in Linux 2.4
+Packet Filtering
+
+Overview
+
+ Implemented as 'filter' table
+ Registers with three netfilter hooks
+
+ NF_IP_LOCAL_IN (packets destined for the local host)
+ NF_IP_FORWARD (packets forwarded by local host)
+ NF_IP_LOCAL_OUT (packets from the local host)
+
+Each of the three hooks has attached one chain (INPUT, FORWARD, OUTPUT)
+
+Every packet passes exactly one of the three chains. Note that this is very different compared to the old 2.2.x ipchains behaviour.
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+netfilter/iptables in Linux 2.4
+Packet Filtering
+
+Targets available within 'filter' table
+
+ Builtin Targets to be used in filter table
+ ACCEPT accept the packet
+ DROP silently drop the packet
+ QUEUE enqueue packet to userspace
+ RETURN return to previous (calling) chain
+ foobar user defined chain
+
+ Targets implemented as loadable modules
+ REJECT drop the packet but inform sender
+ MIRROR change source/destination IP and resend
+ LOG log via syslog
+ ULOG log via userspace
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+netfilter/iptables in Linux 2.4
+Connection Tracking Subsystem
+
+ Connection tracking...
+
+ implemented separately from NAT
+ enables stateful filtering
+ implementation
+ hooks into NF_IP_PRE_ROUTING to track packets
+ hooks into NF_IP_POST_ROUTING and NF_IP_LOCAL_IN to see if packet passed filtering rules
+ protocol modules (currently TCP/UDP/ICMP)
+ application helpers (currently FTP, IRC, H.323, talk, SNMP)
+ divides packets in the following four categories
+ NEW - would establish new connection
+ ESTABLISHED - part of already established connection
+ RELATED - is related to established connection
+ INVALID - (multicast, errors...)
+ does _NOT_ filter packets itself
+ can be utilized by iptables using the 'state' match
+ is used by NAT Subsystem
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+HA for netfilter/iptables
+Connection Tracking Subsystem
+
+Common structures
+ struct ip_conntrack_tuple, representing unidirectional flow
+ layer 3 src + dst
+ layer 4 protocol
+ layer 4 src + dst
+
+
+ connections represented as struct ip_conntrack
+ original tuple
+ reply tuple
+ timeout
+ l4 state private data
+ app helper
+ app helper private data
+ expected connections
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+HA for netfilter/iptables
+Connection Tracking Subsystem
+
+Flow of events for new packet
+ packet enters NF_IP_PRE_ROUTING
+ tuple is derived from packet
+ lookup conntrack hash table with hash(tuple) -> fails
+ new ip_conntrack is allocated
+ fill in original and reply == inverted(original) tuple
+ initialize timer
+ assign app helper if applicable
+ see if we've been expected -> fails
+ call layer 4 helper 'new' function
+
+ ...
+
+ packet enters NF_IP_POST_ROUTING
+ do hashtable lookup for packet -> fails
+ place struct ip_conntrack in hashtable
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+HA for netfilter/iptables
+Connection Tracking Subsystem
+
+Flow of events for packet part of existing connection
+ packet enters NF_IP_PRE_ROUTING
+ tuple is derived from packet
+ lookup conntrack hash table with hash(tuple)
+ associate conntrack entry with skb->nfct
+ call l4 protocol helper 'packet' function
+ do l4 state tracking
+ update timeouts as needed [e.g. TCP TIME_WAIT,...]
+
+ ...
+
+ packet enters NF_IP_POST_ROUTING
+ do hashtable lookup for packet -> succeeds
+ do nothing else
+
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+netfilter/iptables in Linux 2.4
+Network Address Translation
+
+Overview
+
+ Previous Linux Kernels only implemented one special case of NAT: Masquerading
+ Linux 2.4.x can do any kind of NAT.
+ NAT subsystem implemented on top of netfilter, iptables and conntrack
+ NAT subsystem registers with all five netfilter hooks
+ 'nat' Table registers chains PREROUTING, POSTROUTING and OUTPUT
+ Following targets available within 'nat' Table
+ SNAT changes the packet's source while passing NF_IP_POST_ROUTING
+ DNAT changes the packet's destination while passing NF_IP_PRE_ROUTING
+ MASQUERADE is a special case of SNAT
+ REDIRECT is a special case of DNAT
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+netfilter/iptables in Linux 2.4
+Network Address Translation
+
+ Source NAT
+ SNAT Example:
+%font "typewriter"
+%size 3
+iptables -t nat -A POSTROUTING -j SNAT --to-source 1.2.3.4 -s 10.0.0.0/8
+%font "standard"
+%size 4
+
+ MASQUERADE Example:
+%font "typewriter"
+%size 3
+iptables -t nat -A POSTROUTING -j MASQUERADE -o ppp0
+%font "standard"
+%size 5
+
+ Destination NAT
+ DNAT example
+%font "typewriter"
+%size 3
+iptables -t nat -A PREROUTING -j DNAT --to-destination 1.2.3.4:8080 -p tcp --dport 80 -i eth1
+%font "standard"
+%size 4
+
+ REDIRECT example
+%font "typewriter"
+%size 3
+iptables -t nat -A PREROUTING -j REDIRECT --to-port 3128 -i eth1 -p tcp --dport 80
+%font "standard"
+%size 5
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+netfilter/iptables in Linux 2.4
+Packet Mangling
+
+ Purpose of mangle table
+ packet manipulation except address manipulation
+
+ Integration with netfilter
+ 'mangle' table hooks in all five netfilter hooks
+ priority: after conntrack
+
+ Targets specific to the 'mangle' table:
+ DSCP - manipulate DSCP field
+ IPV4OPTSSTRIP - strip IPv4 options
+ MARK - change the nfmark field of the skb
+ TCPMSS - set TCP MSS option
+ TOS - manipulate the TOS bits
+ TTL - set / increase / decrease TTL field
+
+Simple example:
+%font "typewriter"
+%size 3
+iptables -t mangle -A PREROUTING -j MARK --set-mark 10 -p tcp --dport 80
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+netfilter/iptables in Linux 2.4
+Advanced Netfilter concepts
+
+%size 4
+ Userspace logging
+ flexible replacement for old syslog-based logging
+ packets to userspace via multicast netlink sockets
+ easy-to-use library (libipulog)
+ plugin-extensible userspace logging daemon (ulogd)
+ Can even be used to directly log into MySQL
+
+ Queuing
+ reliable asynchronous packet handling
+ packets to userspace via unicast netlink socket
+ easy-to-use library (libipq)
+ provides Perl bindings
+ experimental queue multiplex daemon (ipqmpd)
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+netfilter/iptables in Linux 2.4
+Current Development and Future
+
+Netfilter (although it proved very stable) is still work in progress.
+
+ Areas of current development
+ infrastructure for conntrack manipulation from userspace
+ failover of stateful firewalls
+ making iptables layer3 independent (pkttables)
+ new userspace library (libiptables) to hide plugins from apps
+ more matches and targets for advanced functions (pool, hashslot)
+ more conntrack and NAT modules (RPC, SNMP, SMB, ...)
+ better IPv6 support (conntrack, more matches / targets)
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Future of Linux packet filtering
+Thanks
+ The slides and the accompanying paper of this presentation are available at http://www.gnumonks.org/
+
+ The netfilter homepage http://www.netfilter.org/
+
+ Thanks to
+ the BBS people, Z-Netz, FIDO, ...
+ for heavily increasing my computer usage in 1992
+ KNF
+ for bringing me in touch with the internet as early as 1994
+ for providing a playground for technical people
+ for telling me about the existence of Linux!
+ Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
+ for implementing (one of?) the world's best TCP/IP stacks
+ Paul 'Rusty' Russell
+ for starting the netfilter/iptables project
+ for trusting me to maintain it today
+ Astaro AG
+ for sponsoring parts of my netfilter work
+
diff --git a/2002/netfilter-internals-lsm2002/netfilter-internals-lsm2002.tex b/2002/netfilter-internals-lsm2002/netfilter-internals-lsm2002.tex
new file mode 100644
index 0000000..c3a28ea
--- /dev/null
+++ b/2002/netfilter-internals-lsm2002/netfilter-internals-lsm2002.tex
@@ -0,0 +1,537 @@
+\documentclass{article}
+\usepackage{german}
+\usepackage{fancyheadings}
+\usepackage{a4}
+
+\setlength{\oddsidemargin}{0in}
+\setlength{\evensidemargin}{0in}
+\setlength{\topmargin}{0.0in}
+\setlength{\headheight}{0in}
+\setlength{\headsep}{0in}
+\setlength{\textwidth}{6.5in}
+\setlength{\textheight}{9.5in}
+\setlength{\parindent}{0in}
+\setlength{\parskip}{0.05in}
+
+
+\begin{document}
+\title{Linux 2.4.x netfilter/iptables firewalling internals}
+
+\author{Harald Welte\\
+ laforge@gnumonks.org\\
+ \copyright{}2002 H. Welte}
+
+\date{25. April 2002}
+
+\maketitle
+
+\setcounter{section}{0}
+\setcounter{subsection}{0}
+\setcounter{subsubsection}{0}
+
+\section{Introduction}
+The Linux 2.4.x kernel series has introduced a totally new kernel firewalling
+subsystem. It is much more than a plain successor of ipfwadm or ipchains.
+
+The netfilter/iptables project has a very modular design and its
+sub-projects can be split in several parts: netfilter, iptables, connection
+tracking, NAT and packet mangling.
+
+While most users will already have learned how to use the basic functions
+of netfilter/iptables in order to convert their old ipchains firewalls to
+iptables, there's more advanced but less used functionality in
+netfilter/iptables.
+
+The presentation covers the design principles behind the netfilter/iptables
+implementation. This knowledge enables us to understand how the individual
+parts of netfilter/iptables fit together, and for which potential applications
+this is useful.
+
+\section{Internal netfilter/iptables architecture}
+
+\subsection{Netfilter hooks in protocol stacks}
+
+One of the major motivations behind the redesign of the linux packet
+filtering and NAT system during the 2.3.x kernel series was the widespread
+firewall specific code parts within the core IPv4 stack. Ideally the core
+IPv4 stack (as used by regular hosts and routers) shouldn't contain any
+firewalling specific code, resulting in no unwanted interaction and less
+code complexity. This desire led to the invention of {\it netfilter}.
+
+\subsubsection{Architecture of netfilter}
+
+Netfilter is basically a system of callback functions within the network
+stack. It provides a non-portable API towards in-kernel networking
+extensions.
+
+What we call {\it netfilter hook} is a well-defined call-out point within a
+layer three protocol stack, such as IPv4, IPv6 or DECnet. Any layer three
+network stack can define an arbitrary number of hooks, usually placed at
+strategic points within the packet flow.
+
+Any other kernel code can subsequently register callback functions for
+any of these hooks. As most systems will have more than one callback
+function registered for a particular hook, a {\it priority} is specified upon
+registration of the callback function. This priority defines the order in
+which the individual callback functions at a particular hook are called.
+
+The return value of any registered callback function can be:
+\begin{itemize}
+\item
+{\bf NF\_ACCEPT}: continue traversal as usual
+\item
+{\bf NF\_DROP}: drop the packet; do not continue traversal
+\item
+{\bf NF\_STOLEN}: callback function has taken over the packet; do not continue
+\item
+{\bf NF\_QUEUE}: enqueue the packet to userspace
+\item
+{\bf NF\_REPEAT}: call this hook again
+\end{itemize}
+
+\subsubsection{Netfilter hooks within IPv4}
+
+The IPv4 stack provides five netfilter hooks, which are placed at the
+following specific places within the code:
+
+\begin{verbatim}
+ --->[1]--->[ROUTE]--->[3]--->[4]--->
+ | ^
+ | |
+ | [ROUTE]
+ v |
+ [2] [5]
+ | ^
+ | |
+ v |
+
+ local processes
+\end{verbatim}
+
+Packets received on any network interface arrive at the left side of the
+diagram. After the verification of the IP header checksum, the
+NF\_IP\_PRE\_ROUTING [1] hook is traversed.
+
+If the packet ``survives'' (i.e. NF\_ACCEPT is returned), it enters the
+routing code. Where we continue from here depends on the destination of the
+packet.
+
+Packets with a local destination (i.e. packets where the destination address is
+one of the own IP addresses of the host) traverse the NF\_IP\_LOCAL\_IN [2]
+hook. If all callback functions return NF\_ACCEPT, the packet is finally passed
+to the socket code, which eventually passes the packet to a local process.
+
+Packets with a remote destination (i.e. packets which are forwarded by the
+local machine) traverse the NF\_IP\_FORWARD [3] hook. If they ``survive'',
+they finally pass the NF\_IP\_POST\_ROUTING [4] hook and are sent off the
+outgoing network interface.
+
+Locally generated packets first traverse the NF\_IP\_LOCAL\_OUT [5] hook, then
+enter the routing code, and finally go through the NF\_IP\_POST\_ROUTING [4]
+hook before being sent off the outgoing network interface.
+
+\subsubsection{Netfilter hooks within IPv6}
+
+As the IPv4 and IPv6 protocols are very similar, the netfilter hooks within the
+IPv6 stack are placed at exactly the same locations as in the IPv4 stack. The
+only difference is the hook names: NF\_IP6\_PRE\_ROUTING, NF\_IP6\_LOCAL\_IN,
+NF\_IP6\_FORWARD, NF\_IP6\_POST\_ROUTING, NF\_IP6\_LOCAL\_OUT.
+
+\subsubsection{Netfilter hooks within DECnet}
+
+There are seven DECnet hooks. The first five hooks (NF\_DN\_PRE\_ROUTING,
+NF\_DN\_LOCAL\_IN, NF\_DN\_FORWARD, NF\_DN\_LOCAL\_OUT, NF\_DN\_POST\_ROUTING)
+are pretty much the same as in IPv4. The last two hooks (NF\_DN\_HELLO,
+NF\_DN\_ROUTE) are used in conjunction with DECnet Hello and Routing packets.
+
+\subsubsection{Netfilter hooks within ARP}
+
+Recent kernels\footnote{IIRC, starting with 2.4.19-pre3} have added support for netfilter hooks within the ARP code.
+There are two hooks: NF\_ARP\_IN and NF\_ARP\_OUT, for incoming and outgoing
+ARP packets respectively.
+
+\subsubsection{Netfilter hooks within IPX}
+
+There have been experimental patches to add netfilter hooks to the IPX code,
+but they never got integrated into the kernel source.
+
+\subsection{Packet selection using IP Tables}
+
+The IP tables core (ip\_tables.o) provides a generic layer for evaluation
+of rulesets.
+
+An IP table consists of an arbitrary number of {\it chains}, which in turn
+consist of a linear list of {\it rules}, which again consist of any
+number of {\it matches} and one {\it target}.
+
+{\it Chains} can further be divided into two classes: either {\it builtin
+chains} or {\it user-defined chains}. Builtin chains are always present, they
+are created upon table registration. They are also the entry points for table
+iteration. User defined chains are created at runtime upon user interaction.
+
+{\it Matches} specify the matching criteria. There can be zero or more matches per rule.
+
+{\it Targets} specify the action which is to be executed in case {\bf all}
+matches match. There can only be a single target per rule.
+
+Matches and targets can either be {\it builtin} or {\it linux kernel modules}.
+
+There are two special targets:
+\begin{itemize}
+\item
+By using a chain name as target, it is possible to jump to the respective chain
+in case the matches match.
+\item
+By using the RETURN target, it is possible to return to the previous (calling)
+chain
+\end{itemize}
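+
+As a minimal sketch of both special targets, the following commands create a
+user-defined chain, jump to it from the builtin INPUT chain, and return from it
+early. The chain name {\it tcpcheck} and the address range are placeholders
+chosen for illustration only.
+
+\begin{verbatim}
+# create a user-defined chain in the filter table
+iptables -t filter -N tcpcheck
+# packets from this source range return to the calling chain immediately
+iptables -t filter -A tcpcheck -s 10.0.0.0/8 -j RETURN
+# everything else that reaches this rule is logged via syslog
+iptables -t filter -A tcpcheck -j LOG
+# jump from the builtin INPUT chain into the user-defined chain for TCP
+iptables -t filter -A INPUT -p tcp -j tcpcheck
+\end{verbatim}
+
+A packet that reaches the end of a user-defined chain without hitting a
+terminating target continues in the calling chain, just as if RETURN had been
+used.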
+
+The IP tables core handles the following functions
+\begin{itemize}
+\item
+Registering and unregistering tables
+\item
+Registering and unregistering matches and targets (can be implemented as linux kernel modules)
+\item
+Kernel / userspace interface for manipulation of IP tables
+\item
+Traversal of IP tables
+\end{itemize}
+
+\subsubsection{Packet filtering using the ``filter'' table}
+
+Traditional packet filtering (i.e. the successor to ipfwadm/ipchains) takes
+place in the ``filter'' table. Packet filtering works like a sieve: A packet
+is (in the end) either dropped or accepted - but never modified.
+
+The ``filter'' table is implemented in the {\it iptable\_filter.o} module
+and contains three builtin chains:
+
+\begin{itemize}
+\item
+{\bf INPUT} attaches to NF\_IP\_LOCAL\_IN
+\item
+{\bf FORWARD} attaches to NF\_IP\_FORWARD
+\item
+{\bf OUTPUT} attaches to NF\_IP\_LOCAL\_OUT
+\end{itemize}
+
+The placement of the chains / hooks is done in such a way that every conceivable
+packet always traverses only one of the built-in chains. Packets destined for
+the local host traverse only INPUT, packets forwarded only FORWARD and
+locally-originated packets only OUTPUT.
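+
+To illustrate, here is a sketch with one rule per builtin chain. The interface
+names, the port and the address are assumptions made for this example and do
+not refer to any particular setup.
+
+\begin{verbatim}
+# traffic destined for the local host: seen by INPUT only
+iptables -t filter -A INPUT -i eth0 -p tcp --dport 22 -j ACCEPT
+# traffic routed through the host: seen by FORWARD only
+iptables -t filter -A FORWARD -i eth1 -o eth0 -j ACCEPT
+# traffic generated by local processes: seen by OUTPUT only
+iptables -t filter -A OUTPUT -d 192.0.2.1 -j DROP
+\end{verbatim}
+
+One consequence is that a rule in INPUT never affects forwarded traffic; rules
+for routed packets belong in FORWARD.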
+
+\subsubsection{Packet mangling using the ``mangle'' table}
+
+As stated above, operations which would modify a packet do not belong in the
+``filter'' table. The ``mangle'' table is available for all kinds of packet
+manipulation - but not manipulation of addresses (which is NAT).
+
+The mangle table attaches to all five netfilter hooks and provides the
+respective builtin chains (PREROUTING, INPUT, FORWARD, OUTPUT, POSTROUTING)
+\footnote{This has changed through recent 2.4.x kernel series, old kernels may
+only support three (PREROUTING, POSTROUTING, OUTPUT) chains.}.
+
+\subsection{Connection Tracking Subsystem}
+
+Traditional packet filters can only match on matching criteria within the
+currently processed packet, like source/destination IP address, port numbers,
+TCP flags, etc. As most applications have a notion of connections or at least
+a request/response style protocol, there is a lot of information which can not
+be derived from looking at a single packet.
+
+Thus, modern (stateful) packet filters attempt to track connections (flows)
+and their respective protocol states for all traffic through the packet
+filter.
+
+Connection tracking within linux is implemented as a netfilter module, called
+ip\_conntrack.o.
+
+Before describing the connection tracking subsystem, we need to describe a couple of definitions and primitives used throughout the conntrack code.
+
+A connection is represented within the conntrack subsystem using {\it struct
+ip\_conntrack}, also called {\it connection tracking entry}.
+
+Connection tracking utilizes {\it conntrack tuples}, which are tuples
+consisting of (srcip, srcport, dstip, dstport, l4prot). A connection is
+uniquely identified by two tuples: The tuple in the original direction
+(IP\_CT\_DIR\_ORIGINAL) and the tuple for the reply direction
+(IP\_CT\_DIR\_REPLY).
+
+Connection tracking itself does not drop packets\footnote{well, in some rare
+cases in combination with NAT it needs to drop. But don't tell anyone, this is
+secret.} or impose any policy. It just associates every packet with a
+connection tracking entry, which in turn has a particular state. All other
+kernel code can use this state information\footnote{state information is
+internally represented via the {\it struct sk\_buff.nfct} structure member of a
+packet.}.
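+
+One consumer of this state information is the {\it state} match of iptables.
+The following is a hedged sketch of a stateful forwarding policy; the
+interface name eth1 (assumed to be the internal side) is a placeholder.
+
+\begin{verbatim}
+# default: drop anything in FORWARD that no rule accepts
+iptables -t filter -P FORWARD DROP
+# packets that conntrack could not classify are dropped explicitly
+iptables -t filter -A FORWARD -m state --state INVALID -j DROP
+# packets belonging to, or related to, known connections pass
+iptables -t filter -A FORWARD -m state --state ESTABLISHED,RELATED -j ACCEPT
+# new connections may only be initiated from the internal interface
+iptables -t filter -A FORWARD -i eth1 -m state --state NEW -j ACCEPT
+\end{verbatim}
+
+Note that the filtering decision is still made by the ``filter'' table;
+connection tracking only supplies the state the rules match on.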
+
+\subsubsection{Integration of conntrack with netfilter}
+
+If the ip\_conntrack.o module is registered with netfilter, it attaches to the
+NF\_IP\_PRE\_ROUTING, NF\_IP\_POST\_ROUTING, NF\_IP\_LOCAL\_IN and
+NF\_IP\_LOCAL\_OUT hooks.
+
+Because forwarded packets are the most common case on firewalls, I will only
+describe how connection tracking works for forwarded packets. The two relevant
+hooks for forwarded packets are NF\_IP\_PRE\_ROUTING and NF\_IP\_POST\_ROUTING.
+
+Every time a packet arrives at the NF\_IP\_PRE\_ROUTING hook, connection
+tracking creates a conntrack tuple from the packet. It then compares this
+tuple to the original and reply tuples of all already-seen connections
+\footnote{Of course this is not implemented as a linear search over all existing connections.} to find out if this just-arrived packet belongs to any existing
+connection. If there is no match, a new conntrack table entry (struct
+ip\_conntrack) is created.
+
+Let's assume the case where we do not have any existing connections yet and
+are starting from scratch.
+
+The first packet comes in, we derive the tuple from the packet headers, look up
+the conntrack hash table, and don't find any matching entry. As a result, we
+create a new struct ip\_conntrack. This struct ip\_conntrack is filled with
+all necessary data, like the original and reply tuple of the connection.
+How do we know the reply tuple? By inverting the source and destination
+parts of the original tuple.\footnote{So why do we need two tuples, if they can
+be derived from each other? Wait until we discuss NAT.}
+Please note that this new struct ip\_conntrack is {\bf not} yet placed
+into the conntrack hash table.
+
+The packet is now passed on to other callback functions which have registered
+with a lower priority at NF\_IP\_PRE\_ROUTING. It then continues traversal of
+the network stack as usual, including all respective netfilter hooks.
+
+If the packet survives (i.e. is not dropped by the routing code, network stack,
+firewall ruleset, ...), it re-appears at NF\_IP\_POST\_ROUTING. In this case,
+we can now safely assume that this packet will be sent off on the outgoing
+interface, and thus put the connection tracking entry which we created at
+NF\_IP\_PRE\_ROUTING into the conntrack hash table. This process is called
+{\it confirming the conntrack}.
+
+The connection tracking code itself is not monolithic, but consists of a
+couple of separate modules\footnote{They don't actually have to be separate
+kernel modules; e.g. TCP, UDP and ICMP tracking modules are all part of
+the linux kernel module ip\_conntrack.o}. Besides the conntrack core, there
+are two important kinds of modules: Protocol helpers and application helpers.
+
+Protocol helpers implement the layer-4-protocol specific parts. They currently
+exist for TCP, UDP and ICMP (an experimental helper for GRE exists).
+
+\subsubsection{TCP connection tracking}
+
+As TCP is a connection oriented protocol, it is not very difficult to imagine
+how connection tracking for this protocol could work. There are well-defined
+state transitions possible, and conntrack can decide which state transitions
+are valid within the TCP specification. In reality it's not all that easy,
+since we cannot assume that all packets that pass the packet filter actually
+arrive at the receiving end, ...
+
+It is noteworthy that the standard connection tracking code does {\bf not}
+do TCP sequence number and window tracking. A well-maintained patch to add
+this feature exists almost as long as connection tracking itself. It will
+be integrated with the 2.5.x kernel. The problem with window tracking is
+its bad interaction with connection pickup. The TCP conntrack code is able to
+pick up already existing connections, e.g. in case your firewall was rebooted.
+However, connection pickup conflicts with TCP window tracking: The TCP
+window scaling option is only transferred at connection setup time, and we
+don't know about it in case of pickup...
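+
+A small worked example may show why the missing option hurts. The window
+scale option (RFC 1323) carries a shift count of at most 14, and the 16-bit
+window field of every later segment has to be shifted left by it. The Python
+function {\tt effective\_window} below is a hypothetical helper written for
+this illustration, not part of any conntrack code.
+
```python
def effective_window(raw_window: int, shift: int) -> int:
    """Receive window in bytes: the 16-bit header field shifted by the scale."""
    if not 0 <= raw_window <= 0xFFFF:
        raise ValueError("the TCP window field is 16 bits wide")
    if not 0 <= shift <= 14:
        raise ValueError("the window scale shift count is limited to 14")
    return raw_window << shift


raw = 0xFFFF
negotiated = effective_window(raw, 7)  # shift count seen in the SYN segments
assumed = effective_window(raw, 0)     # picked-up connection: shift unknown
```
+
+With a shift count of 7 the real window is 128 times larger than what a
+tracker would assume after pickup, so perfectly valid segments could look
+as if they were out of window.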
+
+\subsubsection{ICMP tracking}
+
+ICMP is not really a connection oriented protocol. So how is it possible to
+do connection tracking for ICMP?
+
+The ICMP protocol can be split into two groups of messages:
+
+\begin{itemize}
+\item
+ICMP error messages, which sort of belong to a different connection.
+They are therefore classified as {\it RELATED} to that connection
+(ICMP\_DEST\_UNREACH, ICMP\_SOURCE\_QUENCH, ICMP\_TIME\_EXCEEDED,
+ICMP\_PARAMETERPROB, ICMP\_REDIRECT).
+\item
+ICMP queries, which have a request/reply character. The conntrack code
+assigns the request a state of {\it NEW}, and the reply a state of
+{\it ESTABLISHED}. The reply closes the connection immediately
+(ICMP\_ECHO, ICMP\_TIMESTAMP, ICMP\_INFO\_REQUEST, ICMP\_ADDRESS).
+\end{itemize}
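+
+The split into the two groups can be written down as a small lookup in
+Python. The numeric values are the standard ICMP type numbers; the function
+{\tt classify} and the returned strings are assumptions made for this sketch
+and do not mirror the structure of the real ICMP protocol helper.
+
```python
# Standard ICMP type numbers (as also found in <linux/icmp.h>).
ICMP_ECHOREPLY = 0
ICMP_DEST_UNREACH = 3
ICMP_SOURCE_QUENCH = 4
ICMP_REDIRECT = 5
ICMP_ECHO = 8
ICMP_TIME_EXCEEDED = 11
ICMP_PARAMETERPROB = 12
ICMP_TIMESTAMP = 13
ICMP_TIMESTAMPREPLY = 14
ICMP_INFO_REQUEST = 15
ICMP_INFO_REPLY = 16
ICMP_ADDRESS = 17
ICMP_ADDRESSREPLY = 18

# Error messages: candidates for RELATED to some other connection.
ERROR_TYPES = {
    ICMP_DEST_UNREACH,
    ICMP_SOURCE_QUENCH,
    ICMP_TIME_EXCEEDED,
    ICMP_PARAMETERPROB,
    ICMP_REDIRECT,
}

# Queries and the reply type that answers each of them.
QUERY_REPLY = {
    ICMP_ECHO: ICMP_ECHOREPLY,
    ICMP_TIMESTAMP: ICMP_TIMESTAMPREPLY,
    ICMP_INFO_REQUEST: ICMP_INFO_REPLY,
    ICMP_ADDRESS: ICMP_ADDRESSREPLY,
}


def classify(icmp_type: int) -> str:
    """Return 'error', 'request', 'reply' or 'unknown' for an ICMP type."""
    if icmp_type in ERROR_TYPES:
        return "error"
    if icmp_type in QUERY_REPLY:
        return "request"
    if icmp_type in QUERY_REPLY.values():
        return "reply"
    return "unknown"
```
+
+A request would then be treated as {\it NEW} and the matching reply as
+{\it ESTABLISHED}, while an error is only {\it RELATED} if the connection it
+refers to is actually known.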
+
+\subsubsection{UDP connection tracking}
+
+UDP is designed as a connectionless datagram protocol. But most common
+protocols using UDP as layer 4 protocol have bi-directional UDP communication.
+Imagine a DNS query, where the client sends a UDP packet to port 53 of the
+nameserver, and the nameserver sends back a DNS reply packet from its UDP
+port 53 to the client.
+
+Netfilter treats this as a connection. The first packet (the DNS request) is
+assigned a state of {\it NEW}, because the packet is expected to create a new
+`connection'. The DNS server's reply packet is marked as {\it ESTABLISHED}.
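+
+The DNS example can be played through with a toy tracker in Python. The
+class {\tt UdpTracker} is invented for this sketch. It assumes that a flow
+only becomes {\it ESTABLISHED} once a packet in the reply direction has been
+seen, and it leaves out timeouts and the confirmation step completely.
+
```python
from typing import Dict, Tuple

Flow = Tuple[str, int, str, int]  # (src addr, src port, dst addr, dst port)


class UdpTracker:
    """Toy model: remembers flows and whether a reply has been seen."""

    def __init__(self) -> None:
        # Keyed by the flow in its original direction.
        self._seen_reply: Dict[Flow, bool] = {}

    def state(self, src: str, sport: int, dst: str, dport: int) -> str:
        flow = (src, sport, dst, dport)
        reverse = (dst, dport, src, sport)
        if reverse in self._seen_reply:
            # Packet travels in the reply direction of a known flow.
            self._seen_reply[reverse] = True
            return "ESTABLISHED"
        if flow in self._seen_reply:
            # Original direction again: established only after a reply.
            return "ESTABLISHED" if self._seen_reply[flow] else "NEW"
        self._seen_reply[flow] = False
        return "NEW"


tracker = UdpTracker()
query = tracker.state("10.0.0.1", 1025, "10.0.0.53", 53)
answer = tracker.state("10.0.0.53", 53, "10.0.0.1", 1025)
followup = tracker.state("10.0.0.1", 1025, "10.0.0.53", 53)
```
+
+The addresses and the client port 1025 are arbitrary example values.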
+
+\subsubsection{conntrack application helpers}
+
+More complex application protocols involving multiple connections need special
+support by a so-called ``conntrack application helper module''. Modules in
+the stock kernel exist for FTP and IRC (DCC). Netfilter CVS currently contains
+patches for PPTP, H.323, Eggdrop botnet, TFTP and talk. We're still lacking
+a lot of protocols (e.g. SIP, SMB/CIFS) - but they are unlikely to appear
+until somebody really needs them and either develops them on his own or
+funds development.
+
+\subsubsection{Integration of connection tracking with iptables}
+
+As stated earlier, conntrack doesn't impose any policy on packets. It just
+determines the relation of a packet to already existing connections. To base
+packet filtering decisions on this state information, the iptables {\it state}
+match can be used. Every packet is within one of the following categories:
+
+\begin{itemize}
+\item
+{\bf NEW}: packet would create a new connection, if it survives
+\item
+{\bf ESTABLISHED}: packet is part of an already established connection
+(either direction)
+\item
+{\bf RELATED}: packet is in some way related to an already established connection, e.g. ICMP errors or FTP data sessions
+\item
+{\bf INVALID}: conntrack is unable to derive conntrack information from this packet. Please note that all multicast or broadcast packets fall in this category.
+\end{itemize}
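+
+How a ruleset might use these four categories can be sketched as a small
+decision function in Python. The policy itself (accept everything that
+belongs to a known connection, allow new connections only from the inside,
+drop the rest) is just a common example and an assumption of this sketch;
+the function {\tt verdict} and its {\tt from\_inside} parameter are
+hypothetical.
+
```python
def verdict(state: str, from_inside: bool) -> str:
    """Example policy on top of the conntrack state of a packet."""
    if state in ("ESTABLISHED", "RELATED"):
        # Part of, or related to, a connection we already allowed.
        return "ACCEPT"
    if state == "NEW" and from_inside:
        # Only the internal network may open new connections.
        return "ACCEPT"
    # NEW from outside, INVALID, or anything unexpected.
    return "DROP"
```
+
+Note that the state itself is determined by conntrack alone; only the
+verdict is a matter of policy.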
+
+\subsection{NAT Subsystem}
+
+The NAT (Network Address Translation) subsystem is probably the worst
+documented subsystem within the whole framework. This has two reasons: First,
+NAT is nasty and complicated. Second, the Linux 2.4.x NAT implementation is
+easy to use, so nobody needs to know the nasty details.
+
+Nonetheless, as I have traditionally concentrated mostly on the conntrack and NAT
+systems, I will give a short overview.
+
+NAT uses almost all of the previously described subsystems:
+\begin{itemize}
+\item
+IP tables to specify which packets to NAT in which particular way. NAT
+registers a ``nat'' table with PREROUTING, POSTROUTING and OUTPUT chains.
+\item
+Connection tracking to associate NAT state with the connection.
+\item
+Netfilter to do the actual packet manipulation transparently to the rest of the
+kernel. NAT registers with NF\_IP\_PRE\_ROUTING, NF\_IP\_POST\_ROUTING,
+NF\_IP\_LOCAL\_IN and NF\_IP\_LOCAL\_OUT.
+\end{itemize}
+
+The NAT implementation supports all kinds of different NAT: source NAT,
+destination NAT, NAT to address/port ranges, 1:1 NAT, ...
+
+This fundamental design principle is still frequently misunderstood:\\
+The information about which NAT mappings apply to a certain connection
+is only gathered once - with the first packet of every connection.
+
+So let's start to look at the life of a poor to-be-nat'ed packet.
+For ease of understanding, I have chosen to describe the most frequently
+used NAT scenario: Source NAT of a forwarded packet. Let's assume the
+packet has an original source address of 1.1.1.1, an original destination
+address of 2.2.2.2, and is going to be SNAT'ed to 9.9.9.9. Let's further
+ignore the fact that there are port numbers.
+
+Once upon a time, our poor packet arrives at NF\_IP\_PRE\_ROUTING, where
+conntrack has registered with highest priority. This means that a conntrack
+entry with the following two tuples is created:
+\begin{verbatim}
+IP_CT_DIR_ORIGINAL: 1.1.1.1 -> 2.2.2.2
+IP_CT_DIR_REPLY: 2.2.2.2 -> 1.1.1.1
+\end{verbatim}
+After conntrack, the packet traverses the PREROUTING chain of the ``nat''
+IP table. Since only destination NAT happens at PREROUTING, no action
+occurs. After its lengthy way through the rest of the network stack,
+the packet arrives at the NF\_IP\_POST\_ROUTING hook, where it traverses
+the POSTROUTING chain of the ``nat'' table. Here it hits a SNAT rule,
+causing the following actions:
+\begin{itemize}
+\item
+Fill in a {\it struct ip\_nat\_manip}, indicating the new source address
+and the type of NAT (source NAT at POSTROUTING). This struct is part of the
+conntrack entry.
+\item
+Automatically derive the inverse NAT transformation for the reply packets:
+Destination NAT at PREROUTING. Fill in another {\it struct ip\_nat\_manip}.
+\item
+Alter the REPLY tuple of the conntrack entry to
+\begin{verbatim}
+IP_CT_DIR_REPLY: 2.2.2.2 -> 9.9.9.9
+\end{verbatim}
+\item
+Apply the SNAT transformation to the packet
+\end{itemize}
+
+Every other packet within this connection, independent of its direction,
+will only execute the last step. Since all NAT information is connected
+with the conntrack entry, there is no need to do anything but to apply
+the same transformations to all packets within the same connection.
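+
+The walk-through above, with the same example addresses, can be replayed in
+a few lines of Python. This is a toy model under strong assumptions: the
+names {\tt Packet}, {\tt Conn}, {\tt setup\_snat} and {\tt apply\_nat} are
+invented, a single {\tt snat\_to} field stands in for the two
+{\it struct ip\_nat\_manip} entries, and ports are ignored as before.
+
```python
from dataclasses import dataclass
from typing import Optional, Tuple

AddrPair = Tuple[str, str]  # (source address, destination address)


@dataclass
class Packet:
    src: str
    dst: str


@dataclass
class Conn:
    original: AddrPair             # IP_CT_DIR_ORIGINAL
    reply: AddrPair                # IP_CT_DIR_REPLY
    snat_to: Optional[str] = None  # stands in for the stored NAT manips


def setup_snat(conn: Conn, new_src: str) -> None:
    """First packet only: record the mapping and alter the reply tuple."""
    conn.snat_to = new_src
    conn.reply = (conn.reply[0], new_src)


def apply_nat(conn: Conn, pkt: Packet) -> Packet:
    """Every packet: apply the stored mapping, no rule lookup involved."""
    if conn.snat_to is None:
        return pkt
    if (pkt.src, pkt.dst) == conn.original:
        pkt.src = conn.snat_to      # source NAT at POSTROUTING
    elif (pkt.src, pkt.dst) == conn.reply:
        pkt.dst = conn.original[0]  # inverse: destination NAT at PREROUTING
    return pkt


conn = Conn(original=("1.1.1.1", "2.2.2.2"), reply=("2.2.2.2", "1.1.1.1"))
setup_snat(conn, "9.9.9.9")
outbound = apply_nat(conn, Packet("1.1.1.1", "2.2.2.2"))
inbound = apply_nat(conn, Packet("2.2.2.2", "9.9.9.9"))
```
+
+In this model {\tt setup\_snat} runs exactly once per connection, while
+{\tt apply\_nat} runs for every packet in either direction.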
+
+\subsection{IPv6 Firewalling with ip6tables}
+
+Yes, Linux 2.4.x comes with a usable, though incomplete system to secure
+your IPv6 network.
+
+The parts ported to IPv6 are
+\begin{itemize}
+\item
+IP tables (called IP6 tables)
+\item
+The ``filter'' table
+\item
+The ``mangle'' table
+\item
+The userspace library (libip6tc)
+\item
+The command line tool (ip6tables)
+\end{itemize}
+
+Due to the lack of conntrack and NAT\footnote{for god's sake we don't have NAT
+with IPv6}, only traditional, stateless packet filtering is possible. Apart
+from the obvious matches/targets, ip6tables can match on
+\begin{itemize}
+\item
+{\it EUI64 checker}; verifies if the MAC address of the sender is the same as in the EUI64 64 least significant bits of the source IPv6 address
+\item
+{\it frag6 match}, matches on IPv6 fragmentation header
+\item
+{\it route6 match}, matches on IPv6 routing header
+\item
+{\it ahesp6 match}, matches on SPIs within AH or ESP over IPv6 packets
+\end{itemize}
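+
+The idea behind the EUI64 check can be illustrated in Python. The sketch
+assumes the usual modified EUI-64 construction (insert {\tt ff:fe} in the
+middle of the MAC address and flip the universal/local bit); the function
+names {\tt mac\_to\_eui64} and {\tt eui64\_matches} are made up, and the
+kernel match may differ in details.
+
```python
import ipaddress


def mac_to_eui64(mac: str) -> bytes:
    """Derive the 64-bit interface identifier from a 48-bit MAC address."""
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    # Flip the universal/local bit and insert ff:fe in the middle.
    return bytes([octets[0] ^ 0x02]) + octets[1:3] + b"\xff\xfe" + octets[3:]


def eui64_matches(mac: str, ipv6_src: str) -> bool:
    """Compare the 64 least significant address bits with the MAC-derived ID."""
    interface_id = ipaddress.IPv6Address(ipv6_src).packed[8:]
    return interface_id == mac_to_eui64(mac)
```
+
+For the example MAC address 00:11:22:33:44:55 the check would accept the
+source address fe80::211:22ff:fe33:4455 and reject fe80::1.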
+
+However, the ip6tables code doesn't seem to be used very widely (yet?).
+So please expect some potential remaining issues, since it is not tested
+as heavily as iptables.
+
+\subsection{Recent Development}
+
+Please refer to the spoken word at the presentation. Development at the
+time this paper was written can be quite different from development at the
+time the presentation is held.
+
+\section{Thanks}
+
+I'd like to thank
+\begin{itemize}
+\item
+{\it Linus Torvalds} for starting this interesting UNIX-like kernel
+\item
+{\it Alan Cox, David Miller, Alexey Kuznetsov, Andi Kleen} for building
+(one of?) the world's best TCP/IP stacks.
+\item
+{\it Paul ``Rusty'' Russell} for starting the netfilter/iptables project
+\item
+{\it The Netfilter Core Team} for continuing the netfilter/iptables effort
+\item
+{\it Astaro AG} for partially funding my current netfilter/iptables work
+\item
+{\it Conectiva Inc.} for partially funding parts of my past netfilter/iptables
+work and for inviting me to live in Brazil
+\item
+{\it samba.org and Kommunikationsnetz Franken e.V.} for hosting the netfilter
+homepage, CVS, mailing lists, ...
+\end{itemize}
+
+\end{document}