summaryrefslogtreecommitdiff
path: root/2003/netfilter-curdevel-lt2003
diff options
context:
space:
mode:
authorHarald Welte <laforge@gnumonks.org>2015-10-25 21:00:20 +0100
committerHarald Welte <laforge@gnumonks.org>2015-10-25 21:00:20 +0100
commitfca59bea770346cf1c1f9b0e00cb48a61b44a8f3 (patch)
treea2011270df48d3501892ac1a56015c8be57e8a7d /2003/netfilter-curdevel-lt2003
import of old now defunct presentation slides svn repo
Diffstat (limited to '2003/netfilter-curdevel-lt2003')
-rw-r--r--2003/netfilter-curdevel-lt2003/abstract12
-rw-r--r--2003/netfilter-curdevel-lt2003/biography22
-rw-r--r--2003/netfilter-curdevel-lt2003/curdevel19
-rw-r--r--2003/netfilter-curdevel-lt2003/netfilter-curdevel-lt2003.mgp299
4 files changed, 352 insertions, 0 deletions
diff --git a/2003/netfilter-curdevel-lt2003/abstract b/2003/netfilter-curdevel-lt2003/abstract
new file mode 100644
index 0000000..d6ee41c
--- /dev/null
+++ b/2003/netfilter-curdevel-lt2003/abstract
@@ -0,0 +1,12 @@
+The netfilter/iptables system is about three years old. With Linux kernel 2.4.x being deployed widely during the last two years, lots of systems worldwide are using netfilter/iptables as their packet filtering subsystem.
+
+netfilter/iptables is no doubt a big improvement over the old ipchains system in the 2.2.x kernels. Hoewever, as with any project - after wide deployment for some time, we start to discover aspects that can be implemented more cleanly, more efficently.
+
+The constant innovation and development of new applications and protocols (like SIP) on the internet also raise new requirements towards the linux packet filter.
+
+So the question is: Is it time for yet another generation of the linux packet filtering subsystem? Will the tradition of change (ipfwadm->ipchains->iptables->?) be continued? Or can we integrate all necessarry changes within the current framework?
+
+The presentation will cover a summary of the problems with the current netfilter/iptables implementation and describe the proposed solutions.
+
+Intended Audience: System and Network Administrators
+Prerequsites: Knowledge about Packet Filters. Usage of iptables.
diff --git a/2003/netfilter-curdevel-lt2003/biography b/2003/netfilter-curdevel-lt2003/biography
new file mode 100644
index 0000000..0280091
--- /dev/null
+++ b/2003/netfilter-curdevel-lt2003/biography
@@ -0,0 +1,22 @@
+ <a href="http://www.gnumonks.org/users/laforge/">Harald Welte</a> is one
+of the five <a href="http://www.netfilter.org/">netfilter/iptables</a> core
+team members, and the current Linux 2.4.x firewalling maintainer.
+
+ His main interest in computing has always been networking. In the few time
+left besides netfilter/iptables related work, he's writing obscure documents
+like the <a href="http://www.gnumonks.org/ftp/pub/doc/uucp-over-ssl.html"> UUCP
+over SSL HOWTO</a>. Other kernel-related projects he has been contributing are
+user mode linux and the international (crypto) kernel patch.
+
+ In the past he has been working as an independent IT Consultant working on
+closed-source projecst for various companies ranging from banks to
+manufacturers of networking gear. During the year 2001 he was living in
+Curitiba (Brazil), where he got sponsored for his Linux related work by
+<a href="http://www.conectiva.com/">Conectiva Inc.</a>.
+
+ Starting with February 2002, Harald has been contracted part-time by
+<a href="http://www.astaro.com/">Astaro AG</a>, who are sponsoring him for his
+current netfilter/iptables work.
+
+ Harald is living in Berlin, Germany.
+
diff --git a/2003/netfilter-curdevel-lt2003/curdevel b/2003/netfilter-curdevel-lt2003/curdevel
new file mode 100644
index 0000000..07f11d7
--- /dev/null
+++ b/2003/netfilter-curdevel-lt2003/curdevel
@@ -0,0 +1,19 @@
+- pkttables
+ - linked lists instead of blob
+ - explain current situation
+ - dynamic rulesets are slow with iptables
+ - independent of layer 3 protocol
+ - current code duplication between [ip|ip6|arp]tables
+ - some matches (mac, interface, ...) are independent anyway
+- nfnetlink
+ - idea
+ - ctnetlink
+ - iptnetlink / pkttnetlink
+ - ulog/queue port to it
+ - libnfnetlink, libctnetlink, libpkttnetlink
+- libiptables / libpkttnetlink
+ - high-level API for rule-manipulation
+ - covering all the plugins which are currently part of iptables
+
+- failover / load balancing for stateful firewalls
+ - slides from OLS
diff --git a/2003/netfilter-curdevel-lt2003/netfilter-curdevel-lt2003.mgp b/2003/netfilter-curdevel-lt2003/netfilter-curdevel-lt2003.mgp
new file mode 100644
index 0000000..0f13d07
--- /dev/null
+++ b/2003/netfilter-curdevel-lt2003/netfilter-curdevel-lt2003.mgp
@@ -0,0 +1,299 @@
+%include "default.mgp"
+%default 1 bgrad
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+%nodefault
+%back "blue"
+
+%center
+%size 7
+
+
+The future of Linux packet filtering
+
+
+
+%center
+%size 4
+by
+
+Harald Welte <laforge@netfilter.org>
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Future of Linux packet filtering
+Contents
+
+
+ Problems with current 2.4/2.5 netfilter/iptables
+ Solution to code replication
+ Solution for dynamic rulesets
+ Solution for API to GUI's and other management programs
+
+ Other current work
+ Optimizing Rule load time of large rulesets
+ Making netfilter/iptables compatible with zerocopy tcp
+
+ HA for stateful firewalling
+ What's special about firewalling HA
+ Poor man's failover
+ Real state replication
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Future of Linux packet filtering
+Problems with 2.4.x netfilter/iptables
+
+ code replication between iptables/ip6tables/arptables
+ iptables not meant for other protocols, but people did copy+paste 'ports'
+ replication of
+ core kernel code
+ layer 3 independent matches (mac, interface, ...)
+ userspace library (libiptc)
+ userspace tool (iptables)
+ userspace plugins (libipt_xxx.so)
+ doesn't suit the needs for dynamically changing rulesets
+ dynamic rulesets becomming more common due (service selection, IDS)
+ a whole table is created in userspace and sent as blob to kernel
+ for every ruleset the table needs to be copied to userspace and back
+ inside kernel consistency checks on whole table, loop detection
+ too extensible for writing any forward-compatible GUI
+ new extensions showing up all the time
+ a frontend would need to know about the options and use of a new extension
+ thus frontends are always incomplete and out-of-date
+ no high-level API other than piping to iptables-restore
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Future of Linux packet filtering
+Reducing code replication
+
+ code replication is a real problem: unclean, bugfixes missed
+ we need layer 3 independent layer for
+ submitting rules to the kernel
+ traversing packet-rulesets supporting match/target modules
+ registering matches/targets
+ layer 3 specific (like matching ipv4 address)
+ layer 3 independent (like matching MAC address)
+ solution
+ pkt_tables inside kernel
+ pkt_tables_ipv4 registers layer 3 handler with pkt_tables
+ pkt_tables_ipv6 registers layer 3 handler with pkt_tables
+ everybody registering a pkt_table (like iptable_filter) needs to specify the l3 protocol
+ libraries in userspace (see later)
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Future of Linux packet filtering
+Supporting dynamic rulesets
+
+ atomic table-replacement turned out to be bad idea
+ need new interface for sending individual rules to kernel
+ policy routing has the same problem and good solution: rtnetlink
+ solution: nfnetlink
+ multicast-netlink based packet-orinented socket between kernel and userspace
+ has extra benefit that other userspace processes get notified of rule changes [just like routing daemons]
+ nfnetlink will be low-layer below all kernel/userspace communication
+ pkttnetlink [aka iptnetlink]
+ ctnetlink
+ ulog
+ ip_queue
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Future of Linux packet filtering
+Communication with other programs
+
+ libnfnetlink for low-layer communication
+ libpkttnetlink for rule modifications
+ will handle all plugins [which are currently part of iptables]
+ query functions about avaliable matches/targets
+ query functions about parameters
+ query functions for help messages about specific match/parameter of a match
+ generic structure from which rules can be built
+ conversion functions to parse generic structure into in-kernel structure
+ conversion functions to perse kernel structure into generic structure
+ functions to convert generic structure in plain text
+ libipq will stay API-compatible to current version
+ libipulog will stay API-compatible to current version
+ libiptc will go away [compatibility layer extremely difficult]
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Future of Linux packet filtering
+Optimizing rule load time
+
+ Current situation
+ loading 10,000 rules in 1,000 chains takes about 4 minutes on a PIII 733Mhz
+ this is caused by two bottlenecks
+ loop detection algorithm on kernel side inefficient
+ a couple of O^2 complexity functions in libiptc
+
+ Solution
+ efficient loop detection and mark_source_chains() algorithm (graph coloring)
+ current CVS libiptc with only one O^2 function: 2minutes37
+ whole reimplementation of libiptc needed for removing the last O^2 function
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+HA for netfillter/iptables
+Optimizing the connection tracking code
+
+ Conntrack hash function optimization
+ old hash function not good for even hash bucket count
+ hash function evaluation tool [cttest] avaliable
+ other hash functions in development (already in 2.4.21)
+ introduce per-system randomness to prevent hash attack
+ code optimization (locking/timers/...)
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Future of Linux packet filtering
+netfilter and zerocopy TCP
+
+ Current situation (2.4.x)
+ skb_linearize() at each netfilter hook effectively prevents zerocopy TCP to work if netfilter/iptables is enabled
+ this is a big performance loss on stand-alone servers which filter packets locally
+
+ Solution
+ remove skb_linearize() from conntrack, nat and ip_tables core
+ all iptables extensions and conntrack/nat helpers have to use skb_copy_bits() if they want to access data beyond layer 4 header
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+HA for netfillter/iptables
+Introduction
+
+What is special about firewall failover?
+
+ Nothing, in case of the stateless packet filter
+ Common IP takeover solutions can be used
+ VRRP
+ Hartbeat
+
+ Distribution of packet filtering ruleset no problem
+ can be done manually
+ or implemented with simple userspace process
+
+ Problems arise with stateful packet filters
+ Connection state only on active node
+ NAT mappings only on active node
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+HA for netfillter/iptables
+Poor man's failover
+
+ principle
+ let every node do it's own tracking rather than replicating state
+ two possible implementations
+ connect every node to shared media (i.e. real ethernet)
+ forwarding only turned on on active node
+ slave nodes use promiscuous mode to sniff packets
+ copy all traffic to slave nodes
+ active master needs to copy all traffic to other nodes
+ disadvantage: high load, sync traffic == payload traffic
+ IMHO stupid way of solving the problem
+ advantages
+ very easy implementation
+ only addition of sniffing mode to conntrack needed
+ existing means of address takeover can be used
+ same load on active master and slave nodes
+ no additional load on active master
+ disadvantages
+ can only be used with real shared media (no switches, ...)
+ can not be used with NAT
+ remaining problem
+ no initial state sync after reboot of slave node!
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+HA for netfillter/iptables
+Real state replication
+
+Parts needed
+ state replication protocol
+ multicast based
+ sequence numbers for detection of packet loss
+ NACK-based retransmission
+ no security, since private ethernet segment to be used
+ event interface on active node
+ calling out to callback function at all state changes
+ exported interface to manipulate conntrack hash table
+ kernel thread for sending conntrack state protocol messages
+ registers with event interface
+ creates and accumulates state replication packets
+ sends them via in-kernel sockets api
+ kernel thread for receiving conntrack state replication messages
+ receives state replication packets via in-kernel sockets
+ uses conntrack hashtable manipulation interface
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+HA for netfillter/iptables
+Real state replication
+
+ Flow of events in chronological order:
+ on active node, inside the network RX softirq
+ connection tracking code is analyzing a forwarded packet
+ connection tracking gathers some new state information
+ connection tracking updates local connection tracking database
+ connection tracking sends event message to event API
+ on active node, inside the conntrack-sync kernel thread
+ conntrack sync daemon receives event through event API
+ conntrack sync daemon aggregates multiple event messages into a state replication protocol message, removing possible redundancy
+ conntrack sync daemon generates state replication protocol message
+ conntrack sync daemon sends state replication protocol message
+ on slave node(s), inside network RX softirq
+ connection tracking code ignores packets coming from the interface attached to the private conntrac sync network
+ state replication protocol messages is appended to socket receive queue of conntrack-sync kernel thread
+ on slave node(s), inside conntrack-sync kernel thread
+ conntrack sync daemon receives state replication message
+ conntrack sync daemon creates/updates conntrack entry
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+HA for netfillter/iptables
+Neccessary changes to kernel
+
+Neccessary changes to current conntrack core
+
+ event generation (callback functions) for all state changes
+
+ conntrack hashtable manipulation API
+ is needed (and already implemented) for 'ctnetlink' API
+
+ conntrack exemptions
+ needed to _not_ track conntrack state replication packets
+ is needed for other cases as well
+ currently being developed by Jozsef Kadlecsik
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Future of Linux packet filtering
+Thanks
+ The slides of this presentation are available at http://www.gnumonks.org/
+
+ Visit the netfilter homepage http://www.netfilter.org/
+
+ Thanks to
+ the BBS people, Z-Netz, FIDO, ...
+ for heavily increasing my computer usage in 1992
+ KNF
+ for bringing me in touch with the internet as early as 1994
+ for providing a playground for technical people
+ for telling me about the existance of Linux!
+ Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
+ for implementing (one of?) the world's best TCP/IP stacks
+ Paul 'Rusty' Russell
+ for starting the netfilter/iptables project
+ for trusting me to maintain it today
+ Astaro AG
+ for sponsoring most of my current netfilter work
+
personal git repositories of Harald Welte. Your mileage may vary