From fca59bea770346cf1c1f9b0e00cb48a61b44a8f3 Mon Sep 17 00:00:00 2001
From: Harald Welte
Date: Sun, 25 Oct 2015 21:00:20 +0100
Subject: import of old now defunct presentation slides svn repo

---
 .../netfilter_status-netconf2005.tpp | 240 +++++++++++++++++++++
 1 file changed, 240 insertions(+)
 create mode 100644 2005/netfilter_status-netconf2005/netfilter_status-netconf2005.tpp

diff --git a/2005/netfilter_status-netconf2005/netfilter_status-netconf2005.tpp b/2005/netfilter_status-netconf2005/netfilter_status-netconf2005.tpp
new file mode 100644
index 0000000..5d4b715
--- /dev/null
+++ b/2005/netfilter_status-netconf2005/netfilter_status-netconf2005.tpp
@@ -0,0 +1,240 @@
+--author Harald Welte
+--title What's been happening in the netfilter world
+--date 16 Jul 2005
+This is an overview of what has been going on in the netfilter world recently. The main purpose is to keep the rest of the linux kernel networking crowd informed.
+--footer This presentation is made with tpp http://synflood.at/tpp.html
+
+--newpage
+--footer netconf'05 - netfilter update
+--header Overview
+rustynat
+nfnetlink
+ctnetlink
+flow-based accounting
+conntrack tool
+helpers (pptp, h.323, sip)
+pkttables
+ipset
+ct_sync
+transparent proxies
+misc
+
+--newpage
+--footer netconf'05 - netfilter update
+--header rustynat
+Three years ago, the "newnat" design was adopted as architecture and API for conntrack/nat helpers. This is what most people are using, and what's in kernel 2.4.x and 2.6.x (for x < 11).
+
+In 2.6.11, a new scheme (which I call "rustynat") was integrated.
+
+Fundamental changes:
+    struct ip_conntrack no longer has sibling_list
+    struct ip_conntrack_expect is killed when expected conntrack comes in
+    NAT helpers are now called by callback functions from conntrack helpers
+    cleanup of NAT manip data structures to reduce size of ip_conntrack
+
+Problems:
+    All existing helpers need to be ported (non-trivial port)
+    Some fallout related to sequence number updates in NAT helper case
+
+--newpage
+--footer netconf'05 - netfilter update
+--header nfnetlink
+Fundamental idea is to have a generic layer for all netfilter related netlink messages. It basically adds another layer of abstraction/multiplexing on top of netlink. Is it really needed?
+
+Looking at the real users, they are extremely different:
+
+ctnetlink
+    dump/read/flush/update connection tracking table
+    dump/read/flush/update connection tracking expectation table
+ulog-ng
+    log arbitrary (even non-ip) packets to userspace
+nf_queue
+    queue arbitrary (even non-ip) packets to userspace
+pkttnetlink
+    ruleset management
+
+--newpage
+--footer netconf'05 - netfilter update
+--header ctnetlink
+Purpose of ctnetlink is to have a userspace interface to the conntrack table
+
+message types
+    IPCTNL_MSG_CT_NEW          - create a new conntrack
+    IPCTNL_MSG_CT_DELETE       - delete a conntrack, flush table
+    IPCTNL_MSG_CT_GET          - read one or more conntracks
+    IPCTNL_MSG_CT_GET_CTRZERO  - read conntrack and zero counters
+
+    IPCTNL_MSG_EXP_NEW         - create a new expect
+    IPCTNL_MSG_EXP_DELETE      - delete an expect
+    IPCTNL_MSG_EXP_GET         - read one or more expects
+
+    IPCTNL_MSG_CONFIG          - configuration of masks (see later)
+
+--newpage
+--footer netconf'05 - netfilter update
+--header conntrack event cache
+ctnetlink also wants to have events, i.e. inform userspace about updates
+
+ip_conntrack was extended to build an 'event cache', i.e.
a list of events that have happened while one specific packet passes through the stack:
+
+    IPCT_DESTROY
+    IPCT_NEW
+    IPCT_RELATED
+    IPCT_STATUS
+    IPCT_PROTOINFO
+    IPCT_HELPER
+    IPCT_HELPINFO
+    IPCT_NATINFO
+
+When packet traversal finishes, a notifier is called with the bitmask of accumulated events for this packet (skb->nfcache)
+Event API is used by ct_sync and ctnetlink
+
+--newpage
+--footer netconf'05 - netfilter update
+--header ctnetlink
+ctnetlink registers with the event API and sends ctnetlink multicast msgs
+
+ctnetlink event messages are either NEW, NEW with F_UPDATE or DELETE
+
+Problem:
+    There can be lots of events.
+    We can easily see 200,000 NEW conntracks per second
+
+Interim Solution:
+    Have userspace app specify the bitmask of interesting events via
+    IPCTNL_MSG_CONFIG. This defeats use by multiple uncooperative apps.
+
+--newpage
+--footer netconf'05 - netfilter update
+--header ctnetlink
+Proposed Real Solution:
+    Have generic netlink event message filters.
+    - Every socket can set its local bitmask of events using setsockopt()
+    - netlink core maintains ORed event mask that is used by ctnetlink
+    - Whenever a socket disappears (or changes its mask), we recalculate
+      the global mask
+
+This scheme should really be generic, since other subsystems with potentially many messages can profit from it.
+
+--newpage
+--footer netconf'05 - netfilter update
+--header conntrack tool
+
+To test and use ctnetlink, Pablo Neira wrote the "conntrack" tool
+Basically "iproute2" for conntrack:
+
+    -L [table] [-z]       List conntrack or expect table
+    -G [table] params     Show conntrack or expect
+    -D [table] params     Delete conntrack or expect
+    -I [table] params     Create conntrack or expect
+    -E [table] [options]  Show events (equals "ip route monitor")
+
+--newpage
+--footer netconf'05 - netfilter update
+--header flow-based accounting
+Linux lacks a good accounting solution.
+Lots of people use inefficient net-acct/nacctd, ip-acct, ulog-acct, ...
+Specialized solutions exist (ipt_ACCOUNT, ...) but are limited in scope
+Most people want to have flow-based instead of packet-based logs
+NETFLOW (or now IPFIX) format can be used by standard tools for analysis
+
+Idea: We already have a flow cache in the kernel
+Problem: It's read-only per packet
+But: ip_conntrack already has per-packet write access
+So: We can put counters in same already-written-to ip_conntrack cache line
+
+Userspace interface is ctnetlink (either polling or event-based)
+Simplistic implementation can use "conntrack" tool and pipe to perl script
+Fully-featured logging daemon (ulogd2) is in the final implementation stage
+See my OLS 2005 paper for more details
+
+--newpage
+--footer netconf'05 - netfilter update
+--header helpers
+PPTP
+    helper is now finally ported to rustynat
+    will be merged soon since I'm tired of syncing it with core changes
+
+H.323
+    now has a simplified ASN.1 parser instead of brute-force replace
+    needs more testing but could probably be merged soon, too
+
+SIP
+    first development version showed up
+    extremely complex protocol, helper can only cover common cases
+    some features (like host names in SDP) cannot be solved in-kernel
+
+
+--newpage
+--footer netconf'05 - netfilter update
+--header pkttables
+
+Sorry, no real progress since last year. Too much other work :(
+
+We'll have to wait a bit longer until we see the next linux packet filter..
+
+--newpage
+--footer netconf'05 - netfilter update
+--header nf_conntrack
+
+nf_conntrack is the layer3-independent connection tracking code (ipv4+ipv6)
+- Code is still kept in-sync with ip_conntrack changes
+- We still don't have IPv4-NAT on top of it
+- Should already have been submitted a long time ago
+- Problem: you can only have ip_conntrack or nf_conntrack loaded at once
+- All the existing users ('state' and 'conntrack' iptables match, ..)
+  can't deal with it transparently.
+- Should get fixed up, but like many ipv6 issues it has low prio :(
+
+--newpage
+--footer netconf'05 - netfilter update
+--header ipset
+http://ipset.netfilter.org/
+- Supersedes old ippool code
+- Idea is to have certain groups of addresses (called "sets")
+- Instead of having 100 iptables rules to match on 100 addresses, you have
+  1 iptables rule and an ipset with 100 addresses
+- It's more optimal since it has efficient data types (such as a 256bit
+  long bitmask for any N addresses out of a /24)
+- Should IMHO get merged soon, too.
+
+--newpage
+--footer netconf'05 - netfilter update
+--header ct_sync
+
+- Development of 2.6.x port seems to have stabilized now
+- We haven't seen any oopses for quite some time
+- Still doesn't support working failover for 'helped' connections
+- 2.6.x branch allows one node to participate in multiple virtual clusters
+- Currently working on real active-active failover
+- Current code based on 2.6.10, so no "rustynat" port yet
+
+--newpage
+--footer netconf'05 - netfilter update
+--header transparent proxying
+In 2.2.x we had the kludgy bind-to-foreign-address code
+In 2.4.x it was removed because netfilter had to clean up core networking code
+Now we have huge bloaty TPROXY patches out-of-tree instead:
+    - they do DNAT of incoming connection
+    - SNAT on outgoing connection
+    - use SO_GETORIGDST on incoming connection to retrieve un-nat'ed addr
+While the code is working fine, I think it's just not worth the effort:
+    - NATing _twice_ just to route packets to local sockets, plus
+    - kludgy socket options and other nasty stuff....
+All we need is
+    - route certain packets to local sockets (based on destip/destport)
+    - bind local processes to foreign addresses (already works)
+    - send packets from sockets bound to foreign addresses
+Transparent proxies with ctnetlink-issued expectations is what you want to enable conntrack helpers in userspace!
+
+--newpage
+--footer netconf'05 - netfilter update
+--header misc
+
+- new sourcecode directory structure: /net/netfilter/* for core stuff
+- ipsec interaction -> Patrick
+- conntrack reference issue (rmmod ip_conntrack vs. nf_reset() vs.
+  local nat vs. GETORIGDST)
+
+not netfilter-related
+- would somebody mind 'alias' devices that had their own mac address?
--
cgit v1.2.3