--author Harald Welte <laforge@netfilter.org>
--title What's been happening in the netfilter world
--date 16 Jul 2005
This is an overview of what has been going on in the netfilter world recently.  The main purpose is to keep the rest of the Linux kernel networking crowd informed.
--footer This presentation is made with tpp http://synflood.at/tpp.html

--newpage
--footer netconf'05 - netfilter update
--header Overview
rustynat
nfnetlink
ctnetlink
flow-based accounting
conntrack tool
helpers (pptp, h.323, sip)
pkttables
ipset
ct_sync
transparent proxies
misc

--newpage
--footer netconf'05 - netfilter update
--header rustynat
Three years ago, the "newnat" design was adopted as architecture and API for conntrack/nat helpers.  This is what most people are using, and what's in kernel 2.4.x and 2.6.x (for x < 11).

In 2.6.11, a new scheme (which I call "rustynat") was integrated. 

Fundamental changes:
	struct ip_conntrack no longer has sibling_list
	struct ip_conntrack_expect is killed when expected conntrack comes in
	NAT helpers are now called by callback functions from conntrack helpers
	cleanup of NAT manip data structures to reduce size of ip_conntrack

Problems:
	All existing helpers need to be ported (non-trivial port)
	Some fallout related to sequence number updates in NAT helper case

--newpage
--footer netconf'05 - netfilter update
--header nfnetlink
Fundamental idea is to have a generic layer for all netfilter related netlink messages.  It basically adds another layer of abstraction/multiplexing on top of netlink.  Is it really needed?

Looking at the real users, they are extremely different:

ctnetlink
	dump/read/flush/update connection tracking table
	dump/read/flush/update connection tracking expectation table
ulog-ng
	log arbitrary (even non-ip) packets to userspace
nf_queue
	queue arbitrary (even non-ip) packets to userspace
pkttnetlink
	ruleset management
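
The multiplexing idea can be sketched in a few lines. This is only an illustration, assuming the subsystem ID travels in the high byte of the 16-bit netlink message type and the per-subsystem message number in the low byte. The numeric IDs, `pack_type`, `unpack_type` and `Mux` are made-up names for this sketch, not kernel API.

```python
# Assumption: 16-bit message type = (subsystem ID << 8) | per-subsystem message.
# The numeric subsystem IDs below are placeholders for illustration only.
SUBSYS_CTNETLINK = 1
SUBSYS_ULOG = 2
SUBSYS_QUEUE = 3


def pack_type(subsys, msg):
    """Combine subsystem ID and message number into one 16-bit type."""
    if not (0 <= subsys <= 0xFF and 0 <= msg <= 0xFF):
        raise ValueError("subsystem and message must each fit in one byte")
    return (subsys << 8) | msg


def unpack_type(nlmsg_type):
    """Split a 16-bit type back into (subsystem ID, message number)."""
    return (nlmsg_type >> 8) & 0xFF, nlmsg_type & 0xFF


class Mux:
    """Hypothetical dispatcher: one handler per registered subsystem."""

    def __init__(self):
        self.handlers = {}

    def register(self, subsys, handler):
        self.handlers[subsys] = handler

    def dispatch(self, nlmsg_type, payload):
        subsys, msg = unpack_type(nlmsg_type)
        handler = self.handlers.get(subsys)
        if handler is None:
            raise LookupError("no handler for subsystem %d" % subsys)
        return handler(msg, payload)


mux = Mux()
mux.register(SUBSYS_CTNETLINK, lambda msg, payload: ("ctnetlink", msg, payload))
result = mux.dispatch(pack_type(SUBSYS_CTNETLINK, 2), b"x")
```

Each user (ctnetlink, ulog-ng, nf_queue, ...) would only see its own message numbers; the shared layer does nothing but route.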

--newpage
--footer netconf'05 - netfilter update
--header ctnetlink
Purpose of ctnetlink is to have a userspace interface to the conntrack table

message types
	IPCTNL_MSG_CT_NEW		- create a new conntrack
	IPCTNL_MSG_CT_DELETE		- delete a conntrack, flush table
	IPCTNL_MSG_CT_GET		- read one or more conntracks
	IPCTNL_MSG_CT_GET_CTRZERO	- read conntrack and zero counters

	IPCTNL_MSG_EXP_NEW		- create a new expect
	IPCTNL_MSG_EXP_DELETE		- delete an expect
	IPCTNL_MSG_EXP_GET		- read one or more expects
	
	IPCTNL_MSG_CONFIG		- configuration of masks (see later)

--newpage
--footer netconf'05 - netfilter update
--header conntrack event cache
ctnetlink also wants to have events, i.e. inform userspace about updates

ip_conntrack was extended to build an 'event cache', i.e. a list of events that have happened while one specific packet passes through the stack:

	IPCT_DESTROY
	IPCT_NEW
	IPCT_RELATED
	IPCT_STATUS
	IPCT_PROTOINFO
	IPCT_HELPER
	IPCT_HELPINFO
	IPCT_NATINFO

When packet traversal finishes, a notifier is called with the bitmask of accumulated events for this packet (skb->nfcache)
Event API is used by ct_sync and ctnetlink
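
A minimal sketch of the accumulate-then-notify pattern described above. Only the event names come from the list on this slide. The bit positions, the `EventCache` class and its methods are assumptions made for illustration, not the kernel code.

```python
from enum import IntFlag


class Event(IntFlag):
    # Names from the slide; the bit positions are hypothetical.
    DESTROY = 1 << 0
    NEW = 1 << 1
    RELATED = 1 << 2
    STATUS = 1 << 3
    PROTOINFO = 1 << 4
    HELPER = 1 << 5
    HELPINFO = 1 << 6
    NATINFO = 1 << 7


class EventCache:
    """Collects events for one packet and reports them in a single call."""

    def __init__(self):
        self.pending = Event(0)
        self.notifiers = []

    def register(self, callback):
        self.notifiers.append(callback)

    def record(self, event):
        # Called any number of times while the packet traverses the stack.
        self.pending |= event

    def deliver(self, conntrack):
        # Called once when traversal finishes; resets the cache afterwards.
        events, self.pending = self.pending, Event(0)
        if events:
            for callback in self.notifiers:
                callback(events, conntrack)


seen = []
cache = EventCache()
cache.register(lambda events, ct: seen.append((events, ct)))
cache.record(Event.NEW)
cache.record(Event.STATUS)
cache.record(Event.NATINFO)
cache.deliver("flow-1")
cache.deliver("flow-1")  # nothing pending, so no second notification
```

The point of the cache is that a listener gets one call per packet with the combined bitmask, rather than one call per individual change.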

--newpage
--footer netconf'05 - netfilter update
--header ctnetlink
ctnetlink registers with the event API and sends ctnetlink multicast msgs

ctnetlink event messages are either NEW, NEW with F_UPDATE or DELETE

Problem: 
	There can be lots of events.  
	We can easily see 200,000 NEW conntracks per second 

Interim Solution: 
	Have userspace app specify the bitmask of interesting events via
	IPCTNL_MSG_CONFIG.  This defeats use by multiple uncooperative apps.

--newpage
--footer netconf'05 - netfilter update
--header ctnetlink
Proposed Real Solution:
	Have generic netlink event message filters.
	- Every socket can set its local bitmask of events using setsockopt()
	- netlink core maintains ORed event mask that is used by ctnetlink
	- Whenever a socket disappears (or changes its mask), we recalculate
	  the global mask

This scheme should really be generic, since other subsystems with potentially many messages can profit from it.
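
The proposed bookkeeping could look roughly like this. It is a sketch under assumptions: `EventMaskRegistry` and its methods are hypothetical names, sockets are plain dictionary keys, and the event bit values are invented.

```python
class EventMaskRegistry:
    """Per-socket event masks plus a cached OR of all of them."""

    def __init__(self):
        self._masks = {}       # socket id -> local event bitmask
        self.global_mask = 0   # OR of all local masks

    def _recalculate(self):
        mask = 0
        for local in self._masks.values():
            mask |= local
        self.global_mask = mask

    def set_mask(self, sock_id, mask):
        # Stand-in for a socket setting its mask via setsockopt().
        self._masks[sock_id] = mask
        self._recalculate()

    def close(self, sock_id):
        # A socket disappearing also triggers a recalculation.
        self._masks.pop(sock_id, None)
        self._recalculate()

    def wanted(self, event_bits):
        # Cheap check for the producer: does anybody care about this event?
        return bool(self.global_mask & event_bits)

    def listeners(self, event_bits):
        return sorted(s for s, m in self._masks.items() if m & event_bits)


# Hypothetical event bits for the example.
NEW, DESTROY, STATUS = 1, 2, 4

reg = EventMaskRegistry()
reg.set_mask("a", NEW | DESTROY)
reg.set_mask("b", STATUS)
mask_with_both = reg.global_mask
reg.close("b")
mask_after_close = reg.global_mask
```

The producer only has to test one word (`wanted`) before building a message, which is what makes the scheme attractive when there are lots of events.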

--newpage
--footer netconf'05 - netfilter update
--header conntrack tool

To test and use ctnetlink, Pablo Neira wrote the "conntrack" tool
Basically "iproute2" for conntrack:

	-L [table] [-z]		List conntrack or expect table
	-G [table] params	Show conntrack or expect
	-D [table] params	Delete conntrack or expect
	-I [table] params	Create conntrack or expect
	-E [table] [options]	Show events (equals "ip route monitor")

--newpage
--footer netconf'05 - netfilter update
--header flow-based accounting
Linux lacks a good accounting solution.
Lots of people use inefficient net-acct/nacctd, ip-acct, ulog-acct, ...
Specialized solutions exist (ipt_ACCOUNT, ...) but are limited in scope
Most people want to have flow-based instead of packet-based logs
NETFLOW (or now IPFIX) format can be used by standard tools for analysis

Idea: We already have a flow cache in the kernel
Problem: It's read-only per packet
But: ip_conntrack already has per-packet write access
So: We can put counters in same already-written-to ip_conntrack cache line

Userspace interface is ctnetlink (either polling or event-based)
Simplistic implementation can use "conntrack" tool and pipe to perl script
Fully-featured logging daemon (ulogd2) is in the final implementation stage
See my OLS 2005 paper for more details
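
As a rough illustration of per-flow counters with a read-and-zero operation (similar in spirit to IPCTNL_MSG_CT_GET_CTRZERO from the ctnetlink slide): `FlowTable`, `FlowCounters` and the flow tuple layout are assumptions for this sketch and do not mirror the real ip_conntrack structures, which count per direction.

```python
from dataclasses import dataclass


@dataclass
class FlowCounters:
    packets: int = 0
    bytes: int = 0


class FlowTable:
    """Hypothetical flow table: one counter pair per flow key."""

    def __init__(self):
        self.flows = {}

    def account(self, flow, length):
        # Per-packet update: find (or create) the flow entry and bump it.
        counters = self.flows.setdefault(flow, FlowCounters())
        counters.packets += 1
        counters.bytes += length

    def read_and_zero(self, flow):
        # Polling interface: return the current totals and reset them.
        counters = self.flows[flow]
        snapshot = (counters.packets, counters.bytes)
        counters.packets = 0
        counters.bytes = 0
        return snapshot


# Assumed key layout: (protocol, src addr, src port, dst addr, dst port).
flow = ("tcp", "10.0.0.1", 12345, "10.0.0.2", 80)

table = FlowTable()
table.account(flow, 60)
table.account(flow, 1500)
first = table.read_and_zero(flow)
second = table.read_and_zero(flow)
```

A polling daemon would call something like `read_and_zero` periodically and emit one flow record per interval; an event-based one would instead read the totals when the flow is destroyed.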

--newpage
--footer netconf'05 - netfilter update
--header helpers
PPTP 
	helper is now finally ported to rustynat
	will be merged soon since I'm tired of syncing it with core changes

H.323
	now has a simplified ASN.1 parser instead of brute-force replace
	needs more testing but could probably be merged soon, too

SIP
	first development version showed up
	extremely complex protocol, helper can only cover common cases
	some features (like host names in SDP) cannot be solved in-kernel


--newpage
--footer netconf'05 - netfilter update
--header pkttables

Sorry, no real progress since last year.  Too much other work :(

We'll have to wait a bit longer until we see the next linux packet filter..

--newpage
--footer netconf'05 - netfilter update
--header nf_conntrack

nf_conntrack is the layer3-independent connection tracking code (ipv4+ipv6)
- Code is still kept in-sync with ip_conntrack changes
- We still don't have IPv4-NAT on top of it
- Should already have been submitted a long time ago
- Problem: you can only have ip_conntrack or nf_conntrack loaded at once
- All the existing users ('state' and 'conntrack' iptables match, ..)
  can't deal with it transparently.
- Should get fixed up, but like many ipv6 issues it has low prio :(

--newpage
--footer netconf'05 - netfilter update
--header ipset
http://ipset.netfilter.org/
- Supersedes old ippool code
- Idea is to have certain groups of addresses (called "sets")
- Instead of having 100 iptables rules to match on 100 addresses, you have
  1 iptables rule and an ipset with 100 addresses
- It's more efficient since it uses compact data types (such as a 256-bit
  bitmask for any N addresses out of a /24)
- Should IMHO get merged soon, too.
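
The /24 bitmask idea can be sketched as follows. This is not ipset's implementation; `Slash24Set` is a hypothetical class that only shows why membership becomes a single bit test.

```python
import ipaddress


class Slash24Set:
    """Set of addresses inside one IPv4 /24, stored as a 256-bit bitmask."""

    def __init__(self, network):
        self.network = ipaddress.ip_network(network)
        if self.network.version != 4 or self.network.prefixlen != 24:
            raise ValueError("this sketch only handles an IPv4 /24")
        self.bits = 0  # bit i set means network address + i is a member

    def _index(self, address):
        addr = ipaddress.ip_address(address)
        if addr not in self.network:
            return None
        return int(addr) - int(self.network.network_address)

    def add(self, address):
        idx = self._index(address)
        if idx is None:
            raise ValueError("%s is outside %s" % (address, self.network))
        self.bits |= 1 << idx

    def __contains__(self, address):
        idx = self._index(address)
        return idx is not None and bool((self.bits >> idx) & 1)


s = Slash24Set("192.168.1.0/24")
s.add("192.168.1.1")
s.add("192.168.1.200")
```

However many addresses are added, the set stays 256 bits wide and a lookup costs the same, which is the contrast with walking 100 separate iptables rules.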

--newpage
--footer netconf'05 - netfilter update
--header ct_sync

- Development of 2.6.x port seems to have stabilized now
- We haven't seen any oopses for quite some time
- Still doesn't support working failover for 'helped' connections
- 2.6.x branch allows one node to participate in multiple virtual clusters
- Currently working on real active-active failover
- Current code based on 2.6.10, so no "rustynat" port yet

--newpage
--footer netconf'05 - netfilter update
--header transparent proxying
In 2.2.x we had the kludgy bind-to-foreign-address code
In 2.4.x it was removed because netfilter had to clean up core networking code
Now we have huge bloaty TPROXY patches out-of-tree instead:
	- they do DNAT of incoming connection
	- SNAT on outgoing connection
	- use SO_GETORIGDST on incoming connection to retrieve un-nat'ed addr
While the code is working fine, I think it's just not worth the effort:
	- NATing _twice_ just to route packets to local sockets, plus
	- kludgy socket options and other nasty stuff....
All we need is
	- route certain packets to local sockets (based on destip/destport)
	- bind local processes to foreign addresses (already works)
	- send packets from sockets bound to foreign addresses
Transparent proxies with ctnetlink-issued expectations are what you want to enable conntrack helpers in userspace!

--newpage
--footer netconf'05 - netfilter update
--header misc

- new sourcecode directory structure: /net/netfilter/* for core stuff
- ipsec interaction -> Patrick
- conntrack reference issue (rmmod ip_conntrack vs. nf_reset() vs.
  local nat vs. GETORIGDST)

not netfilter-related
- would somebody mind 'alias' devices that had their own mac address?