From fca59bea770346cf1c1f9b0e00cb48a61b44a8f3 Mon Sep 17 00:00:00 2001 From: Harald Welte Date: Sun, 25 Oct 2015 21:00:20 +0100 Subject: import of old now defunct presentation slides svn repo --- .../netfilter_nextgen-lk2005.xml | 341 +++++++++++++++++++++ 1 file changed, 341 insertions(+) create mode 100644 2005/netfilter_nextgen-lk2005/netfilter_nextgen-lk2005.xml (limited to '2005/netfilter_nextgen-lk2005/netfilter_nextgen-lk2005.xml') diff --git a/2005/netfilter_nextgen-lk2005/netfilter_nextgen-lk2005.xml b/2005/netfilter_nextgen-lk2005/netfilter_nextgen-lk2005.xml new file mode 100644 index 0000000..a992555 --- /dev/null +++ b/2005/netfilter_nextgen-lk2005/netfilter_nextgen-lk2005.xml @@ -0,0 +1,341 @@ + + + +
+ + + First steps towards the next generation netfilter subsystem + + + + Harald + Welte + + + laforge@netfilter.org + + + + 2005 + Harald Welte <laforge@netfilter.org> + + Sep 21, 2005 + 1 + + + 1.0 + + + + + +Until 2.6, every new kernel version came with its own incarnation of a packet +filter: ipfw, ipfwadm, ipchains, iptables. 2.6.x still had iptables. What was +wrong? Or was iptables good enough to last even two generations? + + +In reality the netfilter project is working on gradually transforming the +existing framework into something new. Some of those changes are transparent to +the user, so they slip into a kernel release almost unnoticed. However, for +expert users and developers those changes are noteworthy anyway. + + +Some other changes just extend the existing framework, so most users again +won't even notice them - they just don't take advantage of those new features. + + +The 2.6.14 kernel release will mark a milestone, since it is scheduled to +contain nfnetlink, ctnetlink, nfnetlink_queue and nfnetlink_log - basically a +totally new netlink-based kernel/userspace interface for most parts of the +netfilter subsystem. + + +nf_conntrack, a generic layer-3 independent connection tracking subsystem, +initially supporting IPv4 and IPv6, is also in the queue of pending patches. +Chances are high that it will be included in the mainline kernel at the time +this paper is presented at Linux Kongress. + + +Another new subsystem within the framework is the "ipset" filter, basically an +alternative to using iptables in certain areas. + + +The presentation (but not this paper) will also summarize the results of the +annual netfilter development workshop, which is scheduled just the week before +Linux Kongress. + + + + + +
+nfnetlink + +In the current (pre-2.6.14) Linux kernel, there is no unified communications +infrastructure used by all parts of the netfilter/iptables subsystem. Some +parameters can be read from /proc, some can be set via sysctl, some as module +load time parameters. The iptables configuration happens via get/setsockopt, +and the userspace queueing and logging use two separate (scarce) netlink family +numbers. + + +Most of the network stack is controlled via netlink. Examples are routing +tables, routing policy, interface configuration, traffic control and IPsec. + + +nfnetlink is the answer for all netfilter-related kernel/userspace interaction. +It provides a thin layer on top of netlink. The nfnetlink code in the kernel +has its userspace counterpart called "libnfnetlink". + +
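To make the "thin layer on top of netlink" more concrete, the sketch below packs a standard netlink header followed by a small nfnetlink header. The layout is an assumption based on the nfnetlink header in later mainline kernels, not something this paper specifies: one byte each for address family and version, a 16-bit resource id in network byte order, and a message type with the subsystem id in the high byte. The subsystem number, function names and field names are illustrative only, and attribute padding is ignored.

```python
import struct

# struct nlmsghdr: length, type, flags, sequence number, port id (host byte order)
NLMSG_HDR = struct.Struct("=IHHII")

# Illustrative subsystem number; an assumption, not taken from the paper.
EXAMPLE_SUBSYS_ID = 3


def nfnl_msg_type(subsys_id, msg_type):
    """Combine the subsystem id (high byte) and the message type (low byte)."""
    return ((subsys_id & 0xFF) << 8) | (msg_type & 0xFF)


def build_message(subsys_id, msg_type, family, res_id, seq=0, payload=b""):
    """Pack netlink header + 4-byte nfnetlink header + raw payload."""
    # family and version are single bytes; res_id is 16 bits in network order
    nfgen = struct.pack("=BB", family, 0) + struct.pack("!H", res_id)
    total_len = NLMSG_HDR.size + len(nfgen) + len(payload)
    header = NLMSG_HDR.pack(total_len, nfnl_msg_type(subsys_id, msg_type), 0, seq, 0)
    return header + nfgen + payload


def parse_message(data):
    """Split a packed message back into its header fields."""
    length, mtype, _flags, _seq, _pid = NLMSG_HDR.unpack_from(data, 0)
    family, _version = struct.unpack_from("=BB", data, NLMSG_HDR.size)
    (res_id,) = struct.unpack_from("!H", data, NLMSG_HDR.size + 2)
    return {
        "length": length,
        "subsys_id": mtype >> 8,
        "msg_type": mtype & 0xFF,
        "family": family,
        "res_id": res_id,
        "payload": data[NLMSG_HDR.size + 4:length],
    }
```

Because the subsystem id travels inside the message type, one netlink family can serve conntrack, queueing and logging side by side, which is what removes the need for separate netlink family numbers.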
+ +
+conntrack event API + +For some applications (such as state replication or flow-based accounting) it +is useful to learn about conntrack state changes. + + +The new conntrack event API provides in-kernel notification of conntrack state changes via a standard notifier_chain. + +
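The notifier-chain pattern mentioned above can be sketched in a few lines. This is only a conceptual model in Python, not the kernel API: the class name, the method names and the event names used in the usage note are hypothetical.

```python
class NotifierChain:
    """A list of callbacks that are all invoked for every event."""

    def __init__(self):
        self._callbacks = []

    def register(self, callback):
        # Subscribers are called in registration order
        self._callbacks.append(callback)

    def unregister(self, callback):
        self._callbacks.remove(callback)

    def call_chain(self, event, conntrack):
        # Iterate over a copy so a callback may unregister itself safely
        for callback in list(self._callbacks):
            callback(event, conntrack)
```

A state replication daemon and an accounting module could each register their own callback and would both see, for example, a hypothetical "new" or "destroy" event for the same connection.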
+ +
+nfnetlink_conntrack (aka ctnetlink) + +nfnetlink_conntrack is an nfnetlink-based interface for reading, dumping and +manipulating connection tracking state from userspace. + + +The most straightforward application is to obtain a list of currently tracked +connections. In pre-2.6.14 kernels, this can only be done via the ugly +/proc/net/ip_conntrack virtual file. The file-based +access is slow, unreliable, suboptimal and doesn't allow for efficient +searching. + + +However, certain monitoring applications or e.g. a NAT-aware identd +implementation need efficient fine-grained access. + + +Also, the administrator might want to selectively delete connection tracking +entries, or even flush the whole table. In pre-2.6.14, there is no interface for +that apart from the "rmmod ip_conntrack; modprobe ip_conntrack" kludge. + + +Additional (future) users of ctnetlink are connection tracking helpers in +userspace. Imagine something like a hybrid between transparent proxying and +the current in-kernel helpers. Get the features of running insensitive +userspace code that cannot crash your kernel, and still retain the benefits of +e.g. not having to do userspace processing on FTP data (but only control) +packets. + +
+ +
+libnfnetlink_conntrack + +libnfnetlink_conntrack is the userspace counterpart to nfnetlink_conntrack +inside the kernel. It constructs and parses nfnetlink packets and thus +provides a "function and struct" style C API. + +
+ +
+The "conntrack" program + +The conntrack command is a userspace program linked against +libnfnetlink_conntrack. It allows command-line access to the connection +tracking table. + + +conntrack supports listing, deleting, updating, flushing and +even creating connection tracking entries. It also allows listing, deleting +and updating of conntrack expectations. + +
+ +
+nf_queue + +nf_queue is not really something new, but still very few people have known +about it until now. The 2.4.x netfilter subsystem first introduced a generic +packet queueing mechanism for asynchronously sending packets to userspace (and +reinjecting them or a verdict). This mechanism is mostly known as ip_queue, or +the QUEUE target. + + +In reality, ip_queue sits on top of a small layer called nf_queue. nf_queue +allows for one netfilter queue handler per network protocol family. All +netfilter hooks within this protocol family that return the NF_QUEUE verdict +will send the packet to this nf_queue handler. + + +In the existing 2.4.x and pre-2.6.14 code, the mainline kernel only had one +queue handler: ip_queue. This basically means that only IP packets could be +queued for a userspace process. + + +Outside of the official kernel tree, a "copy+paste" port of ip_queue was made +to IPv6. The netfilter/iptables project has had enough copy+paste style +"ports" due to architectural limitations. Therefore the code was not accepted +into the mainline kernel. Rather, work on a generic replacement was continued. + + +Which queue handler is to be used for what protocol family can now be configured +via nfnetlink_queue (see below). The current status can also be read from +/proc/net/netfilter/nf_queue. + +
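The "one queue handler per protocol family" rule can be modelled as a small registry. The sketch below is a simplified Python model, not kernel code: the class and method names are hypothetical, the handler returns its verdict synchronously (the real mechanism is asynchronous), and dropping a queued packet when no handler is registered is an assumption of this sketch.

```python
NF_ACCEPT, NF_DROP, NF_QUEUE = "NF_ACCEPT", "NF_DROP", "NF_QUEUE"


class NfQueueCore:
    """Registry holding at most one queue handler per protocol family."""

    def __init__(self):
        self._handlers = {}

    def register_queue_handler(self, pf, handler):
        # A second handler for the same protocol family is refused
        if pf in self._handlers:
            raise ValueError("queue handler already registered for " + pf)
        self._handlers[pf] = handler

    def unregister_queue_handler(self, pf):
        self._handlers.pop(pf, None)

    def run_hook(self, pf, hook, packet):
        """Run one hook; hand the packet to the family's handler on NF_QUEUE."""
        verdict = hook(packet)
        if verdict != NF_QUEUE:
            return verdict
        handler = self._handlers.get(pf)
        if handler is None:
            return NF_DROP  # assumption: nobody is listening, so drop
        return handler(packet)
```

With only an IPv4 handler registered, an IPv6 hook returning NF_QUEUE has nowhere to send its packet, which mirrors the limitation described above.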
+ +
+nfnetlink_queue + +nfnetlink_queue is an nfnetlink-based and layer 3 protocol independent +replacement of ip_queue. + + +It provides all features of ip_queue for packets independent of their protocol. + + +In addition to mere replication of ip_queue functionality, it fixes the most +fundamental problem with the old ip_queue code: that there was only one global +queue, and there could only be one userspace process attached to it. + + +nfnetlink_queue supports up to 65535 different dynamically-created queues. +Packets can be put into a specific queue by using the NFQUEUE target. For +backwards compatibility, packets coming from the iptables QUEUE target will be +placed in queue number 0. + + +Userspace processes can now also receive additional packet metadata such as the +PHYSINDEV/PHYSOUTDEV devices in case of bridging. + +
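The queue selection rule described above (NFQUEUE picks a numbered queue, the legacy QUEUE target always lands in queue 0) can be sketched as follows. This is a conceptual Python model with hypothetical names. Treating the queue number as a 16-bit value and rejecting packets sent to a queue that no process has created are assumptions of this sketch.

```python
from collections import deque

MAX_QUEUE_NUM = 65535  # assumes a 16-bit queue number


class QueueTable:
    """Dynamically created packet queues, indexed by queue number."""

    def __init__(self):
        self._queues = {}

    def bind(self, queue_num):
        """Create the queue on demand, as an attaching process would."""
        if not 0 <= queue_num <= MAX_QUEUE_NUM:
            raise ValueError("queue number out of range")
        return self._queues.setdefault(queue_num, deque())

    def enqueue(self, packet, target="NFQUEUE", queue_num=0):
        """Place a packet in a queue; return False if the queue does not exist."""
        if target == "QUEUE":
            queue_num = 0  # backwards compatibility with the old target
        queue = self._queues.get(queue_num)
        if queue is None:
            return False
        queue.append(packet)
        return True
```

Since each queue is a separate object, two independent userspace programs can bind to different queue numbers without competing for the same packets.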
+ +
+libnfnetlink_queue + +The library libnfnetlink_queue is the userspace counterpart to nfnetlink_queue +inside the kernel. It provides an easy-to-use C language interface to +userspace packet queueing. + + +For legacy applications using libipq, an API-compatible +(but not ABI-compatible) libipq replacement is available together with +libnfnetlink_queue. + +
+ +
+nf_log + +Traditionally, netfilter itself doesn't provide any packet logging +infrastructure. Only iptables provides the LOG target (for klogd/syslogd +logging). In 2001, the ULOG target was added to support more efficient logging +via a dedicated netlink socket. + + +When the TCP window tracking code was introduced, the requirement for +logging packets (such as out-of-window TCP packets) from non-iptables code +became pressing. + + +Instead of a more generic solution, it was decided to have module load time +parameters (nf_log) decide whether ipt_LOG or ipt_ULOG register as "internal +logging backend" that can be used by conntrack. + + +In 2.6.14, nf_log became a first-class citizen. This means that the iptables +LOG target doesn't do any direct logging. Instead it registers as an nf_log +backend with the core, and calls the nf_log frontend when it wishes to log a +packet. + + +The nf_log core can then decide whether to log the packet using the syslog +backend provided by ipt_LOG, or via old style ipt_ULOG netlink logging, or the +newly-introduced nfnetlink_log mechanism (see below). + + +Which log handler is to be used for what protocol family can be configured +via nfnetlink (see below). The current status can also be read from +/proc/net/netfilter/nf_log. + +
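The frontend/backend split described above might be modelled like this. It is a conceptual Python sketch only: the class, its methods and the backend callables are hypothetical and do not reflect the kernel's actual function signatures.

```python
class NfLogCore:
    """Logging frontend that forwards to one bound backend per protocol family."""

    def __init__(self):
        self._backends = {}  # backend name -> callable(prefix, packet)
        self._bound = {}     # protocol family -> backend name

    def register_backend(self, name, log_fn):
        # e.g. a syslog-style backend or a netlink-style backend
        self._backends[name] = log_fn

    def bind(self, pf, backend_name):
        """Select which registered backend handles a protocol family."""
        if backend_name not in self._backends:
            raise KeyError(backend_name)
        self._bound[pf] = backend_name

    def log_packet(self, pf, prefix, packet):
        """Frontend entry point used by any caller that wants to log a packet."""
        name = self._bound.get(pf)
        if name is None:
            return None  # no backend bound for this family: nothing is logged
        return self._backends[name](prefix, packet)
```

The caller (an iptables target or the connection tracking code) only ever talks to the frontend, so switching a protocol family from one backend to another needs no change on the caller's side.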
+ +
+nfnetlink_log + +nfnetlink_log is to logging what nfnetlink_queue is to queueing. It takes +the ideas of the ipt_ULOG target and reimplements them in a layer 3 protocol +independent fashion, and moves the transport layer on top of nfnetlink. + + +ipt_ULOG already allowed for up to 32 logging groups, which seemed to be enough +in all practical cases. To be more orthogonal to nfnetlink_queue, +nfnetlink_log now also supports 65535 logging groups, each of which can be +terminated by a different logging process. + +
+ +
+libnfnetlink_log + +Analogous to libnfnetlink_queue, libnfnetlink_log is the userspace counterpart +to nfnetlink_log in the kernel. + + +libnfnetlink_log also provides a libipulog backwards compatibility API. + +
+ +
+Flow based accounting + +The fundamental idea of flow-based (or more correctly: connection-based) +accounting is to keep per-connection byte and packet counters within the connection tracking table. + + +On firewall systems that already use ip_conntrack, keeping those per-connection +counters only adds very little overhead to the existing connection tracking, +and is thus almost free. + + +Internally, flow-based accounting uses both the conntrack event API and +nfnetlink_conntrack. + + +For a more detailed description of flow-based accounting and the motivations +behind it, please refer to my paper on flow-based accounting published in the +proceedings of LinuxTag 2005. + +
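As a rough illustration of per-connection counters, the sketch below keeps packet and byte counts for both directions of each tracked connection. It is a toy Python model under stated assumptions: the tuple layout, the class and the method names are hypothetical, and returning the final counters when an entry is destroyed only stands in for the event-based reporting mentioned above.

```python
def invert(conn_tuple):
    """Return the reply-direction tuple for an original-direction tuple."""
    proto, src, sport, dst, dport = conn_tuple
    return (proto, dst, dport, src, sport)


class ConntrackTable:
    """Toy connection table with per-direction packet and byte counters."""

    def __init__(self):
        self._entries = {}  # original-direction tuple -> counters

    def account(self, conn_tuple, length):
        """Count one packet of the given length against its connection."""
        if conn_tuple in self._entries:
            key, direction = conn_tuple, "original"
        elif invert(conn_tuple) in self._entries:
            key, direction = invert(conn_tuple), "reply"
        else:
            # First packet of a new connection defines the original direction
            key, direction = conn_tuple, "original"
            self._entries[key] = {
                "original": {"packets": 0, "bytes": 0},
                "reply": {"packets": 0, "bytes": 0},
            }
        counters = self._entries[key][direction]
        counters["packets"] += 1
        counters["bytes"] += length

    def destroy(self, conn_tuple):
        """Remove an entry and return its final counters for reporting."""
        return self._entries.pop(conn_tuple)
```

The lookup that finds the entry is the one connection tracking performs anyway, so the only extra work per packet is two additions, which is why the paper calls the accounting almost free.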
+ +
+nf_conntrack + +nf_conntrack is a generalized version of ip_conntrack. This generalization is +required to provide connection tracking for non-IPv4 protocols. Currently only +IPv4 and IPv6 are supported in nf_conntrack. + + +The architecture of nf_conntrack is almost exactly the same as that of ip_conntrack, +only + + +nf_conntrack is not in the 2.6.14 kernel series but will very likely be merged +during the early 2.6.15 development process. The latest nf_conntrack version can be obtained from the netfilter-2.6 git tree. + +
+ +