author | Harald Welte <laforge@gnumonks.org> | 2015-10-25 21:00:20 +0100 |
---|---|---|
committer | Harald Welte <laforge@gnumonks.org> | 2015-10-25 21:00:20 +0100 |
commit | fca59bea770346cf1c1f9b0e00cb48a61b44a8f3 (patch) | |
tree | a2011270df48d3501892ac1a56015c8be57e8a7d /2005/netfilter_nextgen-lk2005/netfilter_nextgen-lk2005.xml |
import of old now defunct presentation slides svn repo
Diffstat (limited to '2005/netfilter_nextgen-lk2005/netfilter_nextgen-lk2005.xml')
-rw-r--r-- | 2005/netfilter_nextgen-lk2005/netfilter_nextgen-lk2005.xml | 341 |
1 files changed, 341 insertions, 0 deletions
diff --git a/2005/netfilter_nextgen-lk2005/netfilter_nextgen-lk2005.xml b/2005/netfilter_nextgen-lk2005/netfilter_nextgen-lk2005.xml new file mode 100644 index 0000000..a992555 --- /dev/null +++ b/2005/netfilter_nextgen-lk2005/netfilter_nextgen-lk2005.xml @@ -0,0 +1,341 @@ +<?xml version='1.0' encoding='ISO-8859-1'?> +<!DOCTYPE article PUBLIC '-//OASIS//DTD DocBook XML V4.3//EN' 'http://www.docbook.org/xml/4.3/docbookx.dtd'> + +<article id="rfid_introduction-ds"> + +<articleinfo> + <title>First steps towards the next generation netfilter subsystem</title> + <authorgroup> + <author> + <personname> + <firstname>Harald</firstname> + <surname>Welte</surname> + </personname> + <!-- + <personblurb>Harald Welte</personblurb> + <affiliation> + <orgname>netfilter core team</orgname> + <address> + <email>laforge@netfilter.org</email> + </address> + </affiliation> + + --> + <email>laforge@netfilter.org</email> + </author> + </authorgroup> + <copyright> + <year>2005</year> + <holder>Harald Welte <laforge@netfilter.org> </holder> + </copyright> + <date>Sep 21, 2005</date> + <edition>1</edition> + <!-- <orgname>netfilter core team</orgname> --> + <releaseinfo> + 1.0 + </releaseinfo> + + <abstract> + +<para> +Until 2.6, every new kernel version came with its own incarnation of a packet +filter: ipfw, ipfwadm, ipchains, iptables. 2.6.x still had iptables. What was +wrong? Or was iptables good enough to last even two generations? +</para> +<para> +In reality the netfilter project is working on gradually transforming the +existing framework into something new. Some of those changes are transparent to +the user, so they slip into a kernel release almost unnoticed. However, for +expert users and developers those changes are noteworthy anyway. +</para> +<para> +Some other changes just extend the existing framework, so most users again +won't even notice them - they just don't take advantage of those new features. 
+</para> +<para> +The 2.6.14 kernel release will mark a milestone, since it is scheduled to +contain nfnetlink, ctnetlink, nfnetlink_queue and nfnetlink_log - basically a +totally new netlink-based kernel/userspace interface for most parts of the +netfilter subsystem. +</para> +<para> +nf_conntrack, a generic layer-3 independent connection tracking subsystem, +initially supporting IPv4 and IPv6, is also in the queue of pending patches. +Chances are high that it will be included in the mainline kernel at the time +this paper is presented at Linux Kongress. +</para> +<para> +Another new subsystem within the framework is the "ipset" filter, basically an +alternative to using iptables in certain areas. +</para> +<para> +The presentation (but not this paper) will also summarize the results of the +annual netfilter development workshop, which is scheduled just the week before +Linux Kongress. +</para> + </abstract> + +</articleinfo> + +<section> +<title>nfnetlink</title> +<para> +In the current (pre-2.6.14) Linux kernel, there is no unified communications +infrastructure used by all parts of the netfilter/iptables subsystem. Some +parameters can be read from /proc, some can be set via sysctl, and some are set +as module load time parameters. The iptables configuration happens via +get/setsockopt, and the userspace queueing and logging use two separate +(scarce) netlink family numbers. +</para> +<para> +Most of the network stack is controlled via netlink. Examples are routing +tables, routing policy, interface configuration, traffic control and IPsec. +</para> +<para> +nfnetlink is the answer for all netfilter-related kernel/userspace interaction. +It provides a thin layer on top of netlink. The nfnetlink code in the kernel +has its userspace counterpart called "libnfnetlink". 
+</para> +</section> + +<section> +<title>conntrack event API</title> +<para> +For some applications (such as state replication or flow-based accounting) it +is interesting to learn about conntrack state changes. +</para> +<para> +The new conntrack event API provides in-kernel notification of conntrack state changes via a standard <structname>notifier_chain</structname>. +</para> +</section> + +<section> +<title>nfnetlink_conntrack (aka ctnetlink)</title> +<para> +nfnetlink_conntrack is an nfnetlink-based interface for reading, dumping and +manipulating connection tracking state from userspace. +</para> +<para> +The most straightforward application is to obtain a list of currently tracked +connections. In pre-2.6.14 kernels, this can only be done via the ugly +<filename>/proc/net/ip_conntrack</filename> virtual file. The file-based +access is slow, unreliable, suboptimal and doesn't allow for efficient +searching. +</para> +<para> +However, certain monitoring applications or e.g. a NAT-aware identd +implementation need efficient fine-grained access. +</para> +<para> +Also, the administrator might want to selectively delete connection tracking +entries, or even flush the whole table. In pre-2.6.14, there is no interface for +that apart from the "rmmod ip_conntrack; modprobe ip_conntrack" kludge. +</para> +<para> +Additional (future) users of ctnetlink are connection tracking helpers in +userspace. Imagine something like a hybrid between transparent proxying and +the current in-kernel helpers. Get the features of running insensitive +userspace code that cannot crash your kernel, and still retain the benefits of +e.g. not having to do userspace processing on FTP data (but only control) +packets. +</para> +</section> + +<section> +<title>libnfnetlink_conntrack</title> +<para> +libnfnetlink_conntrack is the userspace counterpart to nfnetlink_conntrack +inside the kernel. 
It constructs and parses nfnetlink packets and thus +provides a "function and struct" style C API. +</para> +</section> + +<section> +<title>The "conntrack" program</title> +<para> +The <command>conntrack</command> command is a userspace program linked against +libnfnetlink_conntrack. It allows commandline-level access to the connection +tracking table. +</para> +<para> +<command>conntrack</command> supports listing, deleting, updating, flushing and +even creating connection tracking entries. It also allows listing, deleting +and updating of conntrack expectations. +</para> +</section> + +<section> +<title>nf_queue</title> +<para> +nf_queue is not really something new, but still very few people have known +about it until now. The 2.4.x netfilter subsystem first introduced a generic +packet queueing mechanism for asynchronously sending packets to userspace (and +reinjecting them or returning a verdict). This mechanism is mostly known as +ip_queue, or the QUEUE target. +</para> +<para> +In reality, ip_queue sits on top of a small layer called nf_queue. nf_queue +allows for one netfilter queue handler per network protocol family. All +netfilter hooks within this protocol family that return the NF_QUEUE verdict +will send the packet to this nf_queue handler. +</para> +<para> +In the existing 2.4.x and pre-2.6.14 code, the mainline kernel only had one +queue handler: ip_queue. This basically means that only IP packets could be +queued for a userspace process. +</para> +<para> +Outside of the official kernel tree, a "copy+paste" port of ip_queue was made +to IPv6. The netfilter/iptables project has had enough copy+paste style +"ports" due to architectural limitations. Therefore the code was not accepted +into the mainline kernel. Rather, work on a generic replacement was continued. +</para> +<para> +Which queue handler is to be used for what protocol family can now be configured +via nfnetlink_queue (see below). 
The current status can also be read from +<filename>/proc/net/netfilter/nf_queue</filename>. +</para> +</section> + +<section> +<title>nfnetlink_queue</title> +<para> +nfnetlink_queue is an nfnetlink-based and layer 3 protocol independent +replacement for ip_queue. +</para> +<para> +It provides all features of ip_queue for packets independent of their protocol. +</para> +<para> +In addition to mere replication of ip_queue functionality, it fixes the most +fundamental problem with the old ip_queue code: there was only one global +queue, and there could only be one userspace process attached to it. +</para> +<para> +nfnetlink_queue supports up to 65535 different dynamically-created queues. +Packets can be put into a specific queue by using the NFQUEUE target. For +backwards compatibility, packets coming from the iptables QUEUE target will be +placed in queue number 0. +</para> +<para> +Userspace processes can now also receive additional packet metadata such as the +PHYSINDEV/PHYSOUTDEV devices in case of bridging. +</para> +</section> + +<section> +<title>libnfnetlink_queue</title> +<para> +The library libnfnetlink_queue is the userspace counterpart to nfnetlink_queue +inside the kernel. It provides an easy-to-use C language interface to userspace +packet queueing. +</para> +<para> +For legacy applications using <filename>libipq</filename>, an API-compatible +(but not ABI-compatible) libipq replacement is available together with +libnfnetlink_queue. +</para> +</section> + +<section> +<title>nf_log</title> +<para> +Traditionally, netfilter itself doesn't provide any packet logging +infrastructure. Only iptables provides the LOG target (for klogd/syslogd +logging). In 2001, the ULOG target was added to support more efficient logging +via a dedicated netlink socket. +</para> +<para> +When the TCP window tracking code was introduced, the requirement for +logging packets (such as TCP out of window packets) from non-iptables code +became immediate. 
+</para> +<para> +Instead of a more generic solution, it was decided to have module load time +parameters (nf_log) decide whether ipt_LOG or ipt_ULOG register as "internal +logging backend" that can be used by conntrack. +</para> +<para> +In 2.6.14, nf_log became a first-class citizen. This means that the iptables +LOG target doesn't do any direct logging. Instead it registers as an nf_log +backend with the core, and calls the nf_log frontend when it wishes to log a +packet. +</para> +<para> +The nf_log core can then decide whether to log the packet using the ipt_LOG +provided syslog backend, or via old style ipt_ULOG netlink logging, or the +newly-introduced nfnetlink_log mechanism (see below). +</para> +<para> +Which log handler is to be used for what protocol family can be configured +via nfnetlink (see below). The current status can also be read from +<filename>/proc/net/netfilter/nf_log</filename>. +</para> +</section> + +<section> +<title>nfnetlink_log</title> +<para> +nfnetlink_log is to logging what nfnetlink_queue is to queueing. It takes +the ideas of the ipt_ULOG target and reimplements them in a layer 3 protocol +independent fashion, and moves the transport on top of nfnetlink. +</para> +<para> +ipt_ULOG already allowed for up to 32 logging groups, which seemed to be enough +in all practical cases. To be more consistent with nfnetlink_queue, +nfnetlink_log now also supports 65535 logging groups, each of which can be +terminated by a different logging process. +</para> +</section> + +<section> +<title>libnfnetlink_log</title> +<para> +Analogous to libnfnetlink_queue, libnfnetlink_log is the userspace counterpart +to nfnetlink_log in the kernel. +</para> +<para> +libnfnetlink_log also provides a libipulog backwards compatibility API. 
+</para> +</section> + +<section> +<title>Flow based accounting</title> +<para> +The fundamental idea of flow-based (or more correctly: connection-based) +accounting is to keep per-connection byte and packet counters within the connection tracking table. +</para> +<para> +On firewall systems that already use ip_conntrack, keeping those per-connection +counters only adds very little overhead to the existing connection tracking, +and is thus almost free. +</para> +<para> +Internally, flow-based accounting uses both the conntrack event API and +nfnetlink_conntrack. +</para> +<para> +For a more detailed description of flow based accounting and the motivations +behind it, please refer to my paper on flow based accounting published in the +proceedings of Linuxtag 2005. +</para> +</section> + +<section> +<title>nf_conntrack</title> +<para> +nf_conntrack is a generalized version of ip_conntrack. This generalization is +required to provide connection tracking for non-IPv4 protocols. Currently only +IPv4 and IPv6 are supported in nf_conntrack. +</para> +<para> +The architecture of nf_conntrack is almost exactly the same as that of ip_conntrack, +only +</para> +<para> +nf_conntrack is not in the 2.6.14 kernel series but will very likely be merged +during the early 2.6.15 development process. The latest nf_conntrack version can be obtained from the netfilter-2.6 git tree. +</para> +</section> + +</article> |