From fca59bea770346cf1c1f9b0e00cb48a61b44a8f3 Mon Sep 17 00:00:00 2001 From: Harald Welte Date: Sun, 25 Oct 2015 21:00:20 +0100 Subject: import of old now defunct presentation slides svn repo --- 2003/netfilter-curdevel-lt2003/abstract | 12 + 2003/netfilter-curdevel-lt2003/biography | 22 ++ 2003/netfilter-curdevel-lt2003/curdevel | 19 ++ .../netfilter-curdevel-lt2003.mgp | 299 +++++++++++++++++++++ 4 files changed, 352 insertions(+) create mode 100644 2003/netfilter-curdevel-lt2003/abstract create mode 100644 2003/netfilter-curdevel-lt2003/biography create mode 100644 2003/netfilter-curdevel-lt2003/curdevel create mode 100644 2003/netfilter-curdevel-lt2003/netfilter-curdevel-lt2003.mgp (limited to '2003/netfilter-curdevel-lt2003') diff --git a/2003/netfilter-curdevel-lt2003/abstract b/2003/netfilter-curdevel-lt2003/abstract new file mode 100644 index 0000000..d6ee41c --- /dev/null +++ b/2003/netfilter-curdevel-lt2003/abstract @@ -0,0 +1,12 @@ +The netfilter/iptables system is about three years old. With Linux kernel 2.4.x being deployed widely during the last two years, lots of systems worldwide are using netfilter/iptables as their packet filtering subsystem. + +netfilter/iptables is no doubt a big improvement over the old ipchains system in the 2.2.x kernels. Hoewever, as with any project - after wide deployment for some time, we start to discover aspects that can be implemented more cleanly, more efficently. + +The constant innovation and development of new applications and protocols (like SIP) on the internet also raise new requirements towards the linux packet filter. + +So the question is: Is it time for yet another generation of the linux packet filtering subsystem? Will the tradition of change (ipfwadm->ipchains->iptables->?) be continued? Or can we integrate all necessarry changes within the current framework? + +The presentation will cover a summary of the problems with the current netfilter/iptables implementation and describe the proposed solutions. + +Intended Audience: System and Network Administrators +Prerequsites: Knowledge about Packet Filters. Usage of iptables. diff --git a/2003/netfilter-curdevel-lt2003/biography b/2003/netfilter-curdevel-lt2003/biography new file mode 100644 index 0000000..0280091 --- /dev/null +++ b/2003/netfilter-curdevel-lt2003/biography @@ -0,0 +1,22 @@ + Harald Welte is one +of the five netfilter/iptables core +team members, and the current Linux 2.4.x firewalling maintainer. + + His main interest in computing has always been networking. In the few time +left besides netfilter/iptables related work, he's writing obscure documents +like the UUCP +over SSL HOWTO. Other kernel-related projects he has been contributing are +user mode linux and the international (crypto) kernel patch. + + In the past he has been working as an independent IT Consultant working on +closed-source projecst for various companies ranging from banks to +manufacturers of networking gear. During the year 2001 he was living in +Curitiba (Brazil), where he got sponsored for his Linux related work by +Conectiva Inc.. + + Starting with February 2002, Harald has been contracted part-time by +Astaro AG, who are sponsoring him for his +current netfilter/iptables work. + + Harald is living in Berlin, Germany. + diff --git a/2003/netfilter-curdevel-lt2003/curdevel b/2003/netfilter-curdevel-lt2003/curdevel new file mode 100644 index 0000000..07f11d7 --- /dev/null +++ b/2003/netfilter-curdevel-lt2003/curdevel @@ -0,0 +1,19 @@ +- pkttables + - linked lists instead of blob + - explain current situation + - dynamic rulesets are slow with iptables + - independent of layer 3 protocol + - current code duplication between [ip|ip6|arp]tables + - some matches (mac, interface, ...) are independent anyway +- nfnetlink + - idea + - ctnetlink + - iptnetlink / pkttnetlink + - ulog/queue port to it + - libnfnetlink, libctnetlink, libpkttnetlink +- libiptables / libpkttnetlink + - high-level API for rule-manipulation + - covering all the plugins which are currently part of iptables + +- failover / load balancing for stateful firewalls + - slides from OLS diff --git a/2003/netfilter-curdevel-lt2003/netfilter-curdevel-lt2003.mgp b/2003/netfilter-curdevel-lt2003/netfilter-curdevel-lt2003.mgp new file mode 100644 index 0000000..0f13d07 --- /dev/null +++ b/2003/netfilter-curdevel-lt2003/netfilter-curdevel-lt2003.mgp @@ -0,0 +1,299 @@ +%include "default.mgp" +%default 1 bgrad +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +%nodefault +%back "blue" + +%center +%size 7 + + +The future of Linux packet filtering + + + +%center +%size 4 +by + +Harald Welte + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +Future of Linux packet filtering +Contents + + + Problems with current 2.4/2.5 netfilter/iptables + Solution to code replication + Solution for dynamic rulesets + Solution for API to GUI's and other management programs + + Other current work + Optimizing Rule load time of large rulesets + Making netfilter/iptables compatible with zerocopy tcp + + HA for stateful firewalling + What's special about firewalling HA + Poor man's failover + Real state replication + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +Future of Linux packet filtering +Problems with 2.4.x netfilter/iptables + + code replication between iptables/ip6tables/arptables + iptables not meant for other protocols, but people did copy+paste 'ports' + replication of + core kernel code + layer 3 independent matches (mac, interface, ...) + userspace library (libiptc) + userspace tool (iptables) + userspace plugins (libipt_xxx.so) + doesn't suit the needs for dynamically changing rulesets + dynamic rulesets becomming more common due (service selection, IDS) + a whole table is created in userspace and sent as blob to kernel + for every ruleset the table needs to be copied to userspace and back + inside kernel consistency checks on whole table, loop detection + too extensible for writing any forward-compatible GUI + new extensions showing up all the time + a frontend would need to know about the options and use of a new extension + thus frontends are always incomplete and out-of-date + no high-level API other than piping to iptables-restore + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +Future of Linux packet filtering +Reducing code replication + + code replication is a real problem: unclean, bugfixes missed + we need layer 3 independent layer for + submitting rules to the kernel + traversing packet-rulesets supporting match/target modules + registering matches/targets + layer 3 specific (like matching ipv4 address) + layer 3 independent (like matching MAC address) + solution + pkt_tables inside kernel + pkt_tables_ipv4 registers layer 3 handler with pkt_tables + pkt_tables_ipv6 registers layer 3 handler with pkt_tables + everybody registering a pkt_table (like iptable_filter) needs to specify the l3 protocol + libraries in userspace (see later) + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +Future of Linux packet filtering +Supporting dynamic rulesets + + atomic table-replacement turned out to be bad idea + need new interface for sending individual rules to kernel + policy routing has the same problem and good solution: rtnetlink + solution: nfnetlink + multicast-netlink based packet-orinented socket between kernel and userspace + has extra benefit that other userspace processes get notified of rule changes [just like routing daemons] + nfnetlink will be low-layer below all kernel/userspace communication + pkttnetlink [aka iptnetlink] + ctnetlink + ulog + ip_queue + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +Future of Linux packet filtering +Communication with other programs + + libnfnetlink for low-layer communication + libpkttnetlink for rule modifications + will handle all plugins [which are currently part of iptables] + query functions about avaliable matches/targets + query functions about parameters + query functions for help messages about specific match/parameter of a match + generic structure from which rules can be built + conversion functions to parse generic structure into in-kernel structure + conversion functions to perse kernel structure into generic structure + functions to convert generic structure in plain text + libipq will stay API-compatible to current version + libipulog will stay API-compatible to current version + libiptc will go away [compatibility layer extremely difficult] + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +Future of Linux packet filtering +Optimizing rule load time + + Current situation + loading 10,000 rules in 1,000 chains takes about 4 minutes on a PIII 733Mhz + this is caused by two bottlenecks + loop detection algorithm on kernel side inefficient + a couple of O^2 complexity functions in libiptc + + Solution + efficient loop detection and mark_source_chains() algorithm (graph coloring) + current CVS libiptc with only one O^2 function: 2minutes37 + whole reimplementation of libiptc needed for removing the last O^2 function + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +HA for netfillter/iptables +Optimizing the connection tracking code + + Conntrack hash function optimization + old hash function not good for even hash bucket count + hash function evaluation tool [cttest] avaliable + other hash functions in development (already in 2.4.21) + introduce per-system randomness to prevent hash attack + code optimization (locking/timers/...) + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +Future of Linux packet filtering +netfilter and zerocopy TCP + + Current situation (2.4.x) + skb_linearize() at each netfilter hook effectively prevents zerocopy TCP to work if netfilter/iptables is enabled + this is a big performance loss on stand-alone servers which filter packets locally + + Solution + remove skb_linearize() from conntrack, nat and ip_tables core + all iptables extensions and conntrack/nat helpers have to use skb_copy_bits() if they want to access data beyond layer 4 header + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +HA for netfillter/iptables +Introduction + +What is special about firewall failover? + + Nothing, in case of the stateless packet filter + Common IP takeover solutions can be used + VRRP + Hartbeat + + Distribution of packet filtering ruleset no problem + can be done manually + or implemented with simple userspace process + + Problems arise with stateful packet filters + Connection state only on active node + NAT mappings only on active node + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +HA for netfillter/iptables +Poor man's failover + + principle + let every node do it's own tracking rather than replicating state + two possible implementations + connect every node to shared media (i.e. real ethernet) + forwarding only turned on on active node + slave nodes use promiscuous mode to sniff packets + copy all traffic to slave nodes + active master needs to copy all traffic to other nodes + disadvantage: high load, sync traffic == payload traffic + IMHO stupid way of solving the problem + advantages + very easy implementation + only addition of sniffing mode to conntrack needed + existing means of address takeover can be used + same load on active master and slave nodes + no additional load on active master + disadvantages + can only be used with real shared media (no switches, ...) + can not be used with NAT + remaining problem + no initial state sync after reboot of slave node! + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +HA for netfillter/iptables +Real state replication + +Parts needed + state replication protocol + multicast based + sequence numbers for detection of packet loss + NACK-based retransmission + no security, since private ethernet segment to be used + event interface on active node + calling out to callback function at all state changes + exported interface to manipulate conntrack hash table + kernel thread for sending conntrack state protocol messages + registers with event interface + creates and accumulates state replication packets + sends them via in-kernel sockets api + kernel thread for receiving conntrack state replication messages + receives state replication packets via in-kernel sockets + uses conntrack hashtable manipulation interface + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +HA for netfillter/iptables +Real state replication + + Flow of events in chronological order: + on active node, inside the network RX softirq + connection tracking code is analyzing a forwarded packet + connection tracking gathers some new state information + connection tracking updates local connection tracking database + connection tracking sends event message to event API + on active node, inside the conntrack-sync kernel thread + conntrack sync daemon receives event through event API + conntrack sync daemon aggregates multiple event messages into a state replication protocol message, removing possible redundancy + conntrack sync daemon generates state replication protocol message + conntrack sync daemon sends state replication protocol message + on slave node(s), inside network RX softirq + connection tracking code ignores packets coming from the interface attached to the private conntrac sync network + state replication protocol messages is appended to socket receive queue of conntrack-sync kernel thread + on slave node(s), inside conntrack-sync kernel thread + conntrack sync daemon receives state replication message + conntrack sync daemon creates/updates conntrack entry + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +HA for netfillter/iptables +Neccessary changes to kernel + +Neccessary changes to current conntrack core + + event generation (callback functions) for all state changes + + conntrack hashtable manipulation API + is needed (and already implemented) for 'ctnetlink' API + + conntrack exemptions + needed to _not_ track conntrack state replication packets + is needed for other cases as well + currently being developed by Jozsef Kadlecsik + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +Future of Linux packet filtering +Thanks + The slides of this presentation are available at http://www.gnumonks.org/ + + Visit the netfilter homepage http://www.netfilter.org/ + + Thanks to + the BBS people, Z-Netz, FIDO, ... + for heavily increasing my computer usage in 1992 + KNF + for bringing me in touch with the internet as early as 1994 + for providing a playground for technical people + for telling me about the existance of Linux! + Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen + for implementing (one of?) the world's best TCP/IP stacks + Paul 'Rusty' Russell + for starting the netfilter/iptables project + for trusting me to maintain it today + Astaro AG + for sponsoring most of my current netfilter work + -- cgit v1.2.3