From fca59bea770346cf1c1f9b0e00cb48a61b44a8f3 Mon Sep 17 00:00:00 2001
From: Harald Welte
Date: Sun, 25 Oct 2015 21:00:20 +0100
Subject: import of old now defunct presentation slides svn repo

---
 .../netfilter-bof-ols2004.mgp | 272 +++++++++++++++++++++
 1 file changed, 272 insertions(+)
 create mode 100644 2004/netfilter-bof-ols2004/netfilter-bof-ols2004.mgp

(limited to '2004/netfilter-bof-ols2004')

diff --git a/2004/netfilter-bof-ols2004/netfilter-bof-ols2004.mgp b/2004/netfilter-bof-ols2004/netfilter-bof-ols2004.mgp
new file mode 100644
index 0000000..ccf8ba4
--- /dev/null
+++ b/2004/netfilter-bof-ols2004/netfilter-bof-ols2004.mgp
@@ -0,0 +1,272 @@
+%include "default.mgp"
+%default 1 bgrad
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+%nodefault
+%back "blue"
+
+%center
+%size 7
+
+
+Netfilter BOF
+
+
+
+%center
+%size 4
+by
+
+Harald Welte
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Netfilter BOF
+Contents
+
+
+	Problems with current 2.4/2.6 netfilter/iptables
+		Solution to code replication
+		Solution for dynamic rulesets
+		Solution for API to GUIs and other management programs
+
+	Other current work
+		nf_conntrack - l3 independent connection tracking
+		ulogd2 - conntrack based flow accounting (ipfix)
+		qsearch - efficient in-kernel pattern matching
+		ctstat - runtime conntrack statistics
+		ipset - replacement for ippool
+		benchmarking at gigabit wirespeed
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Netfilter BOF
+Problem with 2.4/2.6 netfilter/iptables
+
+	code replication between iptables/ip6tables/arptables/ebtables
+		iptables was never meant for other protocols, but people did copy+paste 'ports'
+		replication of
+			core kernel code
+			layer 3 independent matches (mac, interface, ...)
+			userspace library (libiptc)
+			userspace tool (iptables)
+			userspace plugins (libipt_xxx.so)
+
+	doesn't suit the needs for dynamically changing rulesets
+		dynamic rulesets are becoming more common (service selection, IDS)
+		a whole table is created in userspace and sent as blob to kernel
+		for every ruleset change the table needs to be copied to userspace and back
+		inside kernel consistency checks on whole table, loop detection
+
+%page
+Netfilter BOF
+Problem with 2.4/2.6 netfilter/iptables
+
+	too extensible for writing any forward-compatible GUI
+		new extensions showing up all the time
+		a frontend would need to know about the options and use of a new extension
+		thus frontends are always incomplete and out-of-date
+		no high-level API other than piping to iptables-restore
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Netfilter BOF
+Reducing code replication
+
+	code replication is a real problem: unclean, bugfixes missed
+	we need a layer 3 independent layer for
+		submitting rules to the kernel
+		traversing packet-rulesets supporting match/target modules
+		registering matches/targets
+			layer 3 specific (like matching ipv4 address)
+			layer 3 independent (like matching MAC address)
+
+	solution
+		pkt_tables inside kernel
+		pkt_tables_ipv4 registers layer 3 handler with pkt_tables
+		pkt_tables_ipv6 registers layer 3 handler with pkt_tables
+		everybody registering a pkt_table (like iptable_filter) needs to specify the l3 protocol
+		libraries in userspace (see later)
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Netfilter BOF
+Supporting dynamic rulesets
+
+	atomic table-replacement turned out to be a bad idea
+	need new interface for sending individual rules to kernel
+	policy routing has the same problem and a good solution: rtnetlink
+	solution: nfnetlink
+		multicast-netlink based packet-oriented socket between kernel and userspace
+		has extra benefit that other userspace processes get notified of rule changes [just like routing daemons]
+	nfnetlink will be low-layer below all kernel/userspace communication
+		pkttnetlink [aka iptnetlink]
+		ctnetlink
+		ulog
+		ip_queue
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Netfilter BOF
+Communication with other programs
+
+whole set of libraries
+	libnfnetlink for low-layer communication
+	libpkttnetlink for rule modifications
+		will handle all plugins [which are currently part of iptables]
+		query functions about available matches/targets
+		query functions about parameters
+		query functions for help messages about specific match/parameter of a match
+		generic structure from which rules can be built
+		conversion functions to parse generic structure into in-kernel structure
+		conversion functions to parse kernel structure into generic structure
+		functions to convert generic structure into plain text
+	libipq will stay API-compatible to current version
+	libipulog will stay API-compatible to current version
+	libiptc will go away [compatibility layer extremely difficult]
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Netfilter BOF
+Optimizing rule load time
+
+	Current situation
+		loading 10,000 rules in 1,000 chains takes about 4 minutes on a PIII 733MHz
+		this is caused by two bottlenecks
+			loop detection algorithm on kernel side inefficient
+			a couple of O(n^2) complexity functions in libiptc
+
+	Solution
+		efficient loop detection and mark_source_chains() algorithm (graph coloring)
+		current CVS libiptc with only one O(n^2) function: 2 minutes 37 seconds
+		whole reimplementation of libiptc needed for removing the last O(n^2) function
+
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Netfilter BOF
+nf_conntrack
+
+	USAGI did a port of ip_conntrack to ip6_conntrack
+		same code replication we're fighting with ip[6]tables :(
+	netfilter core team had ideas about layer 3 independent conntrack
+	Yasuyuki Kozakai implemented nf_conntrack based on those ideas
+		Implementation is now clean, available from CVS
+		Needs re-sync with all the ip_conntrack changes of the last months
+		Needs support for ipv4 and ipv4<->ipv6 transition NAT
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Netfilter BOF
+ulogd2
+
+	Linux doesn't currently offer any sane accounting system
+		nacctd - needs all packets via PF_PACKET in userspace
+		ulogd - uses efficient netlink socket, but still packet based
+	Solution: add per-direction packet and byte counters to ip_conntrack
+		combination with ctnetlink delete events
+		needs userspace daemon for further processing
+		is related to what IETF ipfix working group does
+	Redesign of ulogd to ulogd2:
+		no difference between input and output plugins
+		stack of plugins like: ctnetlink->ipfix
+		other possible stack: ULOG->interpreter->flow_aggregator->mysql
+		implementation is underway, author highly motivated ;)
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Netfilter BOF
+qsearch
+
+	Conntrack helpers (FTP, IRC, ...) often have to do pattern-matching
+	Some people like to employ ipt_string matching
+	This all became more complex through nonlinear/fragmented skbs
+	Solution:
+		Implement a single pattern-matching API to be used from all places
+		Starting point: Rusty's skb_iter() and libqsearch
+		Turns out that libqsearch API needs more work
+		Many similarities to cryptoAPI
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Netfilter BOF
+ctstat
+
+	Martin Josefsson wrote ctstat
+	similar to rtstat of Robert Olsson
+	runtime per-cpu statistics of
+		number of conntracks
+		how many lookups
+		how many found
+		how many new
+		how many invalid packets
+		how many ignored packets
+		how many deleted conntracks
+		how many inserted conntracks
+		how many icmp errors
+		how many new expects
+		how many deleted expects
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Netfilter BOF
+ipset
+
+	Implemented by Jozsef Kadlecsik
+	Efficient way to handle a whole set of addresses in a single rule
+	also provides target to add addresses into set
+	currently implemented: ipmap, macipmap, portmap and iphash
+		ipmap uses bitmask where each bit represents one ip address
+		macipmap uses memory range with 8 bytes per IP/mac
+		portmap uses memory range where each bit represents one port
+		iphash uses fixed size hash (for random addresses)
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Netfilter BOF
+benchmarking at gigabit wirespeed
+
+	Harald did lots of benchmarking
+		Dual Opteron machines
+		e1000 Gigabit adapters with irq-affinity
+		2.4.x / 2.6.x kernel, both 32bit and 64bit
+	Results to be published soon
+	Performance problems mostly ip_tables related, not ip_conntrack
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Netfilter BOF
+Thanks
+
+	Thanks to
+		the BBS scene, Z-Netz, FIDO, ...
+			for heavily increasing my computer usage in 1992
+		KNF
+			for bringing me in touch with the internet as early as 1994
+			for providing a playground for technical people
+			for introducing me to the existence of Linux!
+		Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
+			for implementing (one of?) the world's best TCP/IP stacks
+		Paul 'Rusty' Russell
+			for starting the netfilter/iptables project
+			for trusting me to maintain it today
+		Astaro AG
+			for sponsoring my netfilter failover work
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+HA for netfilter/iptables
+Availability of slides / Links
+
+The slides
+	http://www.gnumonks.org/
+
+The netfilter homepage
+	http://www.netfilter.org/
+
+My Sponsor, Astaro AG
+	http://www.astaro.com/
-- 
cgit v1.2.3