%include "default.mgp"
%default 1 bgrad
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
#%deffont "typewriter" tfont "MONOTYPE.TTF"
%page
%nodefault
%back "blue"
%center
%size 7
Developing netfilter/iptables extensions
%center
%size 4
by Harald Welte
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Contents
	Introduction
	The netfilter/iptables architecture
		Netfilter hooks in protocol stacks
		Packet selection based on IP Tables
		The Connection Tracking Subsystem
		The NAT Subsystem based on netfilter + iptables
		Packet filtering using the 'filter' table
		Packet mangling using the 'mangle' table
	Advanced netfilter concepts
	Current development and Future
	Developing a netfilter module
	Developing a new iptables match
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Netfilter Hooks
	What is netfilter?
		System of callback functions within network stack
		Callback function to be called for every packet traversing certain point (hook) within network stack
		Protocol independent framework
		Hooks in layer 3 stacks (IPv4, IPv6, DECnet, ARP)
		Multiple kernel modules can register with each of the hooks
		Asynchronous packet handling in userspace (ip_queue)
	Traditional packet filtering, NAT, ... is implemented on top of this framework
	Can be used for other stuff interfacing with the core network stack, like DECnet routing daemon.
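The callback idea on this slide can be pictured with a small userspace model. This is only a sketch under stated assumptions: every name in it (toy_packet, toy_hook, toy_register, toy_traverse, toy_module_a, toy_module_b) is hypothetical and none of it is the kernel API. It merely shows several independent "modules" registering a callback at one hook, and each callback being invoked for every packet that passes that point.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_CALLBACKS 8

/* Hypothetical stand-in for a packet; the kernel uses struct sk_buff. */
struct toy_packet {
	unsigned int len;   /* payload length in bytes */
	unsigned int seen;  /* number of callbacks that saw this packet */
};

typedef void (*toy_hookfn)(struct toy_packet *pkt);

/* One hook point with a fixed-size list of registered callbacks. */
struct toy_hook {
	toy_hookfn fns[TOY_MAX_CALLBACKS];
	size_t count;
};

/* Register a callback; returns 0 on success, -1 if the hook is full. */
static int toy_register(struct toy_hook *hook, toy_hookfn fn)
{
	assert(hook != NULL && fn != NULL);
	if (hook->count >= TOY_MAX_CALLBACKS)
		return -1;
	hook->fns[hook->count++] = fn;
	return 0;
}

/* Called by the toy "stack" whenever a packet passes the hook point:
 * every registered callback gets to look at the packet. */
static void toy_traverse(const struct toy_hook *hook, struct toy_packet *pkt)
{
	size_t i;

	for (i = 0; i < hook->count; i++)
		hook->fns[i](pkt);
}

/* Two independent "modules" attached to the same hook. */
static void toy_module_a(struct toy_packet *pkt) { pkt->seen++; }
static void toy_module_b(struct toy_packet *pkt) { pkt->seen++; }
```

Registering toy_module_a and toy_module_b on one toy_hook and then calling toy_traverse() for a packet leaves that packet's seen counter at 2, one increment per registered callback.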
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Netfilter Hooks
	Netfilter architecture in IPv4
%font "typewriter"
%size 3
	--->[1]--->[ROUTE]--->[3]--->[4]--->
	             |            ^
	             |            |
	             |         [ROUTE]
	             v            |
	            [2]          [5]
	             |            ^
	             |            |
	             v            |
%font "standard"
	1=NF_IP_PRE_ROUTING
	2=NF_IP_LOCAL_IN
	3=NF_IP_FORWARD
	4=NF_IP_POST_ROUTING
	5=NF_IP_LOCAL_OUT
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Netfilter Hooks
	Netfilter Hooks
		Any kernel module may register a callback function at any of the hooks
		The module has to return one of the following constants
			NF_ACCEPT  continue traversal as normal
			NF_DROP    drop the packet, do not continue
			NF_STOLEN  I've taken over the packet, do not continue
			NF_QUEUE   enqueue packet to userspace
			NF_REPEAT  call this hook again
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Developing netfilter/iptables extensions
Developing a netfilter module
	Netfilter modules are very low-layer
	Get called for every packet passing the hook in this l3prot
	Examples of netfilter modules are: ip_tables, ip_conntrack, iptable_nat
%font "typewriter"
%size 3
	#include
%size 3
	nf_register_hook(struct nf_hook_ops *reg)
%size 3
	nf_unregister_hook(struct nf_hook_ops *reg)
%size 3
	struct nf_hook_ops:
%size 3
		struct list_head list;  /* list header */
%size 3
		nf_hookfn *hook;        /* the callback function */
%size 3
		int pf;                 /* protocol family */
%size 3
		int hooknum;            /* hook to register with */
%size 3
		int priority;           /* priority (ordering) */
%font "standard"
	Example code see "nf_workshop.c"
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
IP tables
	Packet selection using IP tables
		The kernel provides generic IP tables support
		Each kernel module may create its own IP table
		The three major parts of the 2.4 firewalling subsystem are implemented using IP tables
			Packet filtering table 'filter'
			NAT table 'nat'
			Packet mangling table 'mangle'
		Could potentially be used for other stuff, e.g. IPsec SPDB
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
IP Tables
	Managing chains and tables
		An IP table consists of multiple chains
		A chain consists of a list of rules
		Every single rule in a chain consists of
			match[es] (rule executed if all matches true)
			target (what to do if the rule is matched)
%size 4
		matches and targets can either be builtin or implemented as kernel modules
%size 5
		The userspace tool iptables is used to control IP tables
			handles all different kinds of IP tables
			supports a plugin/shlib interface for target/match specific options
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
IP Tables
	Basic iptables commands
		To build a complete iptables command, we must specify
			which table to work with
			which chain in this table to use
			an operation (insert, add, delete, modify)
			one or more matches (optional)
			a target
	The syntax is
%font "typewriter"
%size 3
	iptables -t table -Operation chain -j target match(es)
%font "standard"
%size 5
	Example:
%font "typewriter"
%size 3
	iptables -t filter -A INPUT -j ACCEPT -p tcp --dport smtp
%font "standard"
%size 5
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
IP Tables
	Matches
		Basic matches
			-p  protocol (tcp/udp/icmp/...)
			-s  source address (ip/mask)
			-d  destination address (ip/mask)
			-i  incoming interface
			-o  outgoing interface
		Match extensions (examples)
			tcp/udp  TCP/UDP source/destination port
			icmp     ICMP code/type
			ah/esp   AH/ESP SPI match
			mac      source MAC address
			mark     nfmark
			length   match on length of packet
			limit    rate limiting (n packets per timeframe)
			owner    owner uid of the socket sending the packet
			tos      TOS field of IP header
			ttl      TTL field of IP header
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
IP Tables
	Targets
		very dependent on the particular table.
		Table specific targets will be discussed later
		Generic Targets, always available
			ACCEPT  accept packet within chain
			DROP    silently drop packet
			QUEUE   enqueue packet to userspace
			LOG     log packet via syslog
			ULOG    log packet via ulogd
			RETURN  return to previous (calling) chain
			foobar  jump to user defined chain
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Packet Filtering
	Overview
		Implemented as 'filter' table
		Registers with three netfilter hooks
			NF_IP_LOCAL_IN (packets destined for the local host)
			NF_IP_FORWARD (packets forwarded by local host)
			NF_IP_LOCAL_OUT (packets from the local host)
	Each of the three hooks has one chain attached (INPUT, FORWARD, OUTPUT)
	Every packet passes exactly one of the three chains. Note that this is very different from the old 2.2.x ipchains behaviour.
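As a rough illustration of the "exactly one chain" rule, the sketch below picks the 'filter' chain for a packet. It is a userspace model, not kernel code: the names toy_chain, toy_chain_name and toy_filter_chain are hypothetical, and it assumes ordinary non-loopback traffic, so a packet is either locally generated, destined for the local host, or forwarded.

```c
#include <assert.h>

/* The three chains of the 'filter' table. */
enum toy_chain { TOY_INPUT, TOY_FORWARD, TOY_OUTPUT };

/* Name of a chain as it appears in the 'filter' table. */
static const char *toy_chain_name(enum toy_chain chain)
{
	switch (chain) {
	case TOY_INPUT:   return "INPUT";
	case TOY_FORWARD: return "FORWARD";
	case TOY_OUTPUT:  return "OUTPUT";
	}
	assert(!"unknown chain");
	return "?";
}

/*
 * Pick the single filter chain a packet traverses.
 *   locally_generated: non-zero if a local process sent the packet
 *   for_local_host:    non-zero if routing decided the packet is for us
 * Assumption: local-to-local (loopback) traffic is not modelled here.
 */
static enum toy_chain toy_filter_chain(int locally_generated, int for_local_host)
{
	if (locally_generated)
		return TOY_OUTPUT;   /* hook NF_IP_LOCAL_OUT */
	if (for_local_host)
		return TOY_INPUT;    /* hook NF_IP_LOCAL_IN */
	return TOY_FORWARD;          /* hook NF_IP_FORWARD */
}
```

In this model a packet arriving from the network for another host yields TOY_FORWARD only, so rules meant for routed traffic belong in FORWARD rather than INPUT or OUTPUT.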
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Packet Filtering
	Targets available within 'filter' table
		Builtin Targets to be used in filter table
			ACCEPT  accept the packet
			DROP    silently drop the packet
			QUEUE   enqueue packet to userspace
			RETURN  return to previous (calling) chain
			foobar  user defined chain
		Targets implemented as loadable modules
			REJECT  drop the packet but inform sender
			MIRROR  change source/destination IP and resend
			LOG     log via syslog
			ULOG    log via userspace
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Developing netfilter/iptables extensions
Developing an ip_tables match module
	ip_tables modules are at a high layer
	Get called for every packet traversing a rule with this match
	Examples of iptables modules are: ipt_ttl, ipt_tos, ipt_tcpmss
%font "typewriter"
%size 3
	#include
%size 3
	ipt_register_match(struct ipt_match *match)
%size 3
	ipt_unregister_match(struct ipt_match *match)
%size 3
	struct ipt_match:
%size 3
		struct list_head list;  /* list header {NULL,NULL} */
%size 3
		const char name[];      /* name of the match */
%size 3
		int (*match);           /* called when pkt is matched */
%size 3
		int (*checkentry);      /* called when entry inserted */
%size 3
		void (*destroy);        /* called when entry deleted */
%size 3
		struct module *me;      /* set to THIS_MODULE */
%font "standard"
	Example code see "ipt_workshop.c"
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Developing netfilter/iptables extensions
Developing an iptables match module
	Something has to parse the commandline options for ipt_workshop.c
	Solution: libipt_workshop.c as iptables plugin
%font "typewriter"
%size 3
	#include :
%size 3
	register_match(struct iptables_match)
%size 3
	struct iptables_match:
%size 3
		struct iptables_match *next;  /* next one */
%size 3
		ipt_chainlabel name;          /* name */
%size 3
		const char *version;          /* version */
%size 3
		size_t size;                  /* size of match data */
%size 3
		size_t userspacesize;         /* size for userspace */
%size 3
		void (*help);                 /* print help message */
%size 3
		void (*init);                 /* init the matchinfo */
%size 3
		int (*parse);                 /* parse getopt chars */
%size 3
		void (*final_check);          /* consistency check */
%size 3
		void (*print);                /* print (iptables -L) */
%size 3
		void (*save);                 /* iptables-save */
%size 3
		struct option extra_opts;     /* getopt-style opts */
%font "standard"
	Example code see "libipt_workshop.c"
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Connection Tracking Subsystem
	Connection tracking...
		implemented separately from NAT
		enables stateful filtering
		implementation
			hooks into NF_IP_PRE_ROUTING to track packets
			hooks into NF_IP_POST_ROUTING and NF_IP_LOCAL_IN to see if packet passed filtering rules
			protocol modules (currently TCP/UDP/ICMP)
			application helpers (currently FTP, IRC, H.323, talk, SNMP)
		divides packets into the following four categories
			NEW - would establish new connection
			ESTABLISHED - part of already established connection
			RELATED - is related to established connection
			INVALID - (multicast, errors...)
		does _NOT_ filter packets itself
		can be utilized by iptables using the 'state' match
		is used by NAT Subsystem
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Connection Tracking Subsystem
	Common structures
		struct ip_conntrack_tuple, representing unidirectional flow
			layer 3 src + dst
			layer 4 protocol
			layer 4 src + dst
		connections represented as struct ip_conntrack
			original tuple
			reply tuple
			timeout
			l4 state private data
			app helper
			app helper private data
			expected connections
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Connection Tracking Subsystem
	Flow of events for new packet
		packet enters NF_IP_PRE_ROUTING
			tuple is derived from packet
			lookup conntrack hash table with hash(tuple) -> fails
			new ip_conntrack is allocated
			fill in original and reply == inverted(original) tuple
			initialize timer
			assign app helper if applicable
			see if we've been expected -> fails
			call layer 4 helper 'new' function
		...
		packet enters NF_IP_POST_ROUTING
			do hashtable lookup for packet -> fails
			place struct ip_conntrack in hashtable
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Connection Tracking Subsystem
	Flow of events for packet part of existing connection
		packet enters NF_IP_PRE_ROUTING
			tuple is derived from packet
			lookup conntrack hash table with hash(tuple)
			associate conntrack entry with skb->nfct
			call l4 protocol helper 'packet' function
				do l4 state tracking
				update timeouts as needed [e.g. TCP TIME_WAIT,...]
		...
		packet enters NF_IP_POST_ROUTING
			do hashtable lookup for packet -> succeeds
			do nothing else
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Writing extensions for the conntrack subsystem
	new l4 protocol modules are very rare
	more common: application helpers for ftp,irc,h.323,quake,mms,...
	API for conntrack helper modules:
%font "typewriter"
%size 3
	#include
%size 3
	struct ip_conntrack_helper
%size 3
		struct list_head *list;
%size 3
		const char *name;
%size 3
		unsigned char flags;
%size 3
		struct module *me;
%size 3
		unsigned int max_expected;
%size 3
		unsigned int timeout;
%size 3
		struct ip_conntrack_tuple tuple;
%size 3
		struct ip_conntrack_mask mask;
%size 3
		int (*help)(const struct iphdr *iph, size_t, struct ip_conntrack, enum ip_conntrack_info);
%size 3
	int ip_conntrack_helper_register(struct ip_conntrack_helper);
%size 3
	void ip_conntrack_helper_unregister(struct ip_conntrack_helper);
%size 3
	int ip_conntrack_expect_related(struct ip_conntrack, struct ip_conntrack_expect);
%size 3
	int ip_conntrack_change_expect(struct ip_conntrack_expect, struct ip_conntrack_tuple);
%size 3
	void ip_conntrack_unexpect_related(struct ip_conntrack_expect);
%font "standard"
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Network Address Translation
	Overview
		Previous Linux Kernels only implemented one special case of NAT: Masquerading
		Linux 2.4.x can do any kind of NAT.
		NAT subsystem implemented on top of netfilter, iptables and conntrack
		NAT subsystem registers with all five netfilter hooks
		'nat' Table registers chains PREROUTING, POSTROUTING and OUTPUT
		Following targets available within 'nat' Table
			SNAT changes the packet's source while passing NF_IP_POST_ROUTING
			DNAT changes the packet's destination while passing NF_IP_PRE_ROUTING
			MASQUERADE is a special case of SNAT
			REDIRECT is a special case of DNAT
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Network Address Translation
	flow of events for NEW packet:
		packet enters NF_IP_PRE_ROUTING after conntrack
			resolve conntrack entry for packet
			if (expectfn of helper) call it
			else iterate over rules in PREROUTING chain of nat table
			save respective NAT mappings in conntrack
			apply the NAT mappings to the packet
			call NAT helper function, if there is one for this proto
		...
		packet enters NF_IP_POST_ROUTING
			resolve conntrack entry for packet
			iterate over rules in POSTROUTING chain of nat table
			save respective NAT mappings in conntrack
			apply the NAT mappings to the packet
			call NAT helper function, if there is one for this proto
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Network Address Translation
	flow of events for ESTABLISHED packets:
		packet enters NF_IP_PRE_ROUTING after conntrack
			resolve conntrack entry for packet
			apply the NAT mappings (read from conntrack entry) to the packet
			call NAT helper function, if there is one for this proto
		...
		packet enters NF_IP_POST_ROUTING
			resolve conntrack entry for packet
			apply the NAT mappings (read from conntrack entry) to the packet
			call NAT helper function, if there is one for this proto
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Developing a NAT helper module
Network Address Translation
%font "typewriter"
%size 3
	#include
%size 3
	struct ip_nat_helper
%size 3
		struct list_head list;
%size 3
		const char *name;
%size 3
		unsigned char *flags;
%size 3
		struct module *me;
%size 3
		struct ip_conntrack_tuple tuple;
%size 3
		struct ip_conntrack_tuple mask;
%size 3
		unsigned int (*help)(struct ip_conntrack *, struct ip_conntrack_expect *, struct ip_nat_info *, enum ip_conntrack_info, unsigned int hooknum, struct sk_buff **)
%size 3
		unsigned int (*expect)(struct sk_buff **, unsigned int hooknum, struct ip_conntrack, struct ip_nat_info *)
%size 3
	int ip_nat_helper_register(struct ip_nat_helper *);
%size 3
	void ip_nat_helper_unregister(struct ip_nat_helper *);
%size 3
	int ip_nat_mangle_tcp_packet();
%size 3
	int ip_nat_mangle_udp_packet();
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Advanced Netfilter concepts
%size 4
	Userspace logging
		flexible replacement for old syslog-based logging
		packets to userspace via multicast netlink sockets
		easy-to-use library (libipulog)
		plugin-extensible userspace logging daemon (ulogd)
		Can even be used to directly log into MySQL
	Queuing
		reliable asynchronous packet handling
		packets to userspace via unicast netlink socket
		easy-to-use library (libipq)
		provides Perl bindings
		experimental queue multiplex daemon (ipqmpd)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Developing netfilter/iptables extensions
Thanks
	The slides and an accompanying paper for this presentation are available at http://www.gnumonks.org/
	The netfilter homepage: http://www.netfilter.org/
	Thanks to
		the BBS people, Z-Netz, FIDO, ...
			for heavily increasing my computer usage in 1992
		KNF
			for bringing me in touch with the internet as early as 1994
			for providing a playground for technical people
			for telling me about the existence of Linux!
		Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
			for implementing (one of?) the world's best TCP/IP stacks
		Paul 'Rusty' Russell
			for starting the netfilter/iptables project
			for trusting me to maintain it today
		Astaro AG (http://www.astaro.com/)
			for sponsoring parts of my netfilter work
			for sponsoring my travel costs to OLS