author | Harald Welte <laforge@gnumonks.org> | 2015-10-25 21:00:20 +0100 |
---|---|---|
committer | Harald Welte <laforge@gnumonks.org> | 2015-10-25 21:00:20 +0100 |
commit | fca59bea770346cf1c1f9b0e00cb48a61b44a8f3 (patch) | |
tree | a2011270df48d3501892ac1a56015c8be57e8a7d /2004/netfilter-failover-lk2004 |
import of old now defunct presentation slides svn repo
Diffstat (limited to '2004/netfilter-failover-lk2004')
-rw-r--r-- | 2004/netfilter-failover-lk2004/netfilter-failover-lk2004.mgp | 369 | ||||
-rw-r--r-- | 2004/netfilter-failover-lk2004/netfilter-failover-lk2004.tex | 656 | ||||
-rw-r--r-- | 2004/netfilter-failover-lk2004/zrl.sty | 432 |
3 files changed, 1457 insertions, 0 deletions
diff --git a/2004/netfilter-failover-lk2004/netfilter-failover-lk2004.mgp b/2004/netfilter-failover-lk2004/netfilter-failover-lk2004.mgp new file mode 100644 index 0000000..76a9206 --- /dev/null +++ b/2004/netfilter-failover-lk2004/netfilter-failover-lk2004.mgp @@ -0,0 +1,369 @@ +%include "default.mgp" +%default 1 bgrad +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +%nodefault +%back "blue" + +%center +%size 7 + + +How to replicate the fire +HA for netfilter-based firewalls + + +%center +%size 4 +by + +Harald Welte <laforge@netfilter.org> + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +HA for netfilter/iptables +Contents + + + Introduction + Connection Tracking Subsystem + Packet selection based on IP Tables + The Connection Tracking Subsystem + The NAT Subsystem + Poor man's failover + Real state replication + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +HA for netfilter/iptables +Introduction + +What is special about firewall failover? + + Nothing, in case of the stateless packet filter + Common IP takeover solutions can be used + VRRP + Heartbeat + Distribution of packet filtering ruleset no problem + can be done manually + or implemented with simple userspace process + Problems arise with stateful packet filters + Connection state only on active node + NAT mappings only on active node + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +HA for netfilter/iptables +Connection Tracking Subsystem + +Connection tracking... + enables stateful filtering + implementation + hooks into netfilter to track packets + protocol modules (currently TCP/UDP/ICMP) + application helpers currently (FTP,IRC,H.323,talk,SNMP) + divides packets in the following four categories + NEW - would establish new connection + ESTABLISHED - part of already established connection + RELATED - is related to established connection + INVALID - (multicast, errors...) 
+ does _NOT_ filter packets itself + can be utilized by iptables using the 'state' match + is used by NAT Subsystem + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +HA for netfilter/iptables +Connection Tracking Subsystem + +Common structures + struct ip_conntrack_tuple, representing unidirectional flow + layer 3 src + dst + layer 4 protocol + layer 4 src + dst + + connections represented as struct ip_conntrack + original tuple + reply tuple + timeout + l4 state private data + app helper + app helper private data + expected connections + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +HA for netfilter/iptables +Connection Tracking Subsystem + +Flow of events for new packet + packet enters NF_IP_PRE_ROUTING + tuple is derived from packet + lookup conntrack hash table with hash(tuple) -> fails + new ip_conntrack is allocated + fill in original and reply == inverted(original) tuple + initialize timer + assign app helper if applicable + see if we've been expected -> fails + call layer 4 helper 'new' function + ... + packet enters NF_IP_POST_ROUTING + do hashtable lookup for packet -> fails + place struct ip_conntrack in hashtable + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +HA for netfilter/iptables +Connection Tracking Subsystem + +Flow of events for packet part of existing connection + packet enters NF_IP_PRE_ROUTING + tuple is derived from packet + lookup conntrack hash table with hash(tuple) + associate conntrack entry with skb->nfct + call l4 protocol helper 'packet' function + do l4 state tracking + update timeouts as needed [i.e. TCP TIME_WAIT,...] + ... 
+ packet enters NF_IP_POST_ROUTING + do hashtable lookup for packet -> succeeds + do nothing else + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +HA for netfilter/iptables +Network Address Translation + +Overview + Previous Linux Kernels only implemented one special case of NAT: Masquerading + Linux 2.4.x can do any kind of NAT. + NAT subsystem implemented on top of netfilter, iptables and conntrack + NAT subsystem registers with all five netfilter hooks + 'nat' Table registers chains PREROUTING, POSTROUTING and OUTPUT + Following targets available within 'nat' Table + SNAT changes the packet's source while passing NF_IP_POST_ROUTING + DNAT changes the packet's destination while passing NF_IP_PRE_ROUTING + MASQUERADE is a special case of SNAT + REDIRECT is a special case of DNAT + NAT bindings determined only for NEW packet and saved in ip_conntrack + Further packets within connection NATed according to NAT bindings + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +HA for netfilter/iptables +Poor man's failover + +Poor man's failover + principle + let every node do its own tracking rather than replicating state + two possible implementations + connect every node to shared media (i.e. 
real ethernet) + forwarding only turned on on active node + slave nodes use promiscuous mode to sniff packets + copy all traffic to slave nodes + active master needs to copy all traffic to other nodes + disadvantage: high load, sync traffic == payload traffic + IMHO stupid way of solving the problem + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +HA for netfilter/iptables +Poor man's failover + +Poor man's failover + advantages + very easy implementation + only addition of sniffing mode to conntrack needed + existing means of address takeover can be used + same load on active master and slave nodes + no additional load on active master + disadvantages + can only be used with real shared media (no switches, ...) + cannot be used with NAT + remaining problem + no initial state sync after reboot of slave node! + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +HA for netfilter/iptables +Real state replication (ct_sync) + +Real state replication (ct_sync) + characteristics + replicates state changes from active master to slave(s) + separate shared ethernet segment for sync + advantages + can be used with any network media + works with NAT + initial sync after new slave is introduced + problems + complex implementation + current limitations + no replication of connection relations (ftp/h.323/...) 
+ current problems + bugs, bugs, bugs + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +HA for netfilter/iptables +Real state replication (ct_sync) + +Required parts + state replication protocol + multicast based + sequence numbers for detection of packet loss + NACK-based retransmission + no security, since private ethernet segment to be used + event interface on active node + calling out to callback function at all state changes + exported interface to manipulate conntrack hash table + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +HA for netfilter/iptables +Real state replication (ct_sync) + +Required parts + kernel thread for sending conntrack state protocol messages + registers with event interface + creates and accumulates state replication packets + sends them via in-kernel sockets api + kernel thread for receiving conntrack state replication messages + receives state replication packets via in-kernel sockets + uses conntrack hashtable manipulation interface + kernel thread for initial or full re-sync + sends full conntrack table with fixed speed + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +HA for netfilter/iptables +Real state replication + +Flow of events in chronological order: + on active node, inside the network RX softirq + connection tracking code is analyzing a forwarded packet + connection tracking gathers some new state information + connection tracking updates local connection tracking database + connection tracking sends event message to event API + function registered at event API enqueues message to send ring + on active node, inside the conntrack-sync kernel thread + conntrack sync daemon aggregates multiple event messages into a state replication protocol message, removing possible redundancy + conntrack sync daemon dequeues packets from ring + conntrack sync daemon sends state replication protocol packet via in-kernel sockets + 
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +HA for netfilter/iptables +Real state replication + +Flow of events in chronological order: + on slave node(s), inside network RX softirq + connection tracking code ignores packets coming from the interface attached to the private conntrack sync network + state replication protocol message is appended to socket receive queue of conntrack-sync kernel thread + on slave node(s), inside conntrack-sync kernel thread + conntrack sync daemon receives state replication message + conntrack sync daemon creates/updates conntrack entry + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +HA for netfilter/iptables +Real state replication + +Necessary changes to conntrack core + event generation (callback functions) for all state changes + is needed (and already implemented) for 'ctnetlink' API + conntrack hashtable manipulation API + is needed (and already implemented) for 'ctnetlink' API + conntrack exemptions + needed to _not_ track conntrack state replication packets + is needed for other cases as well (raw table / NOTRACK target) + works by + layer two packet drop (l2netfilter hooks) + disables any incoming or outgoing packets on other than the sync device on slave nodes + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +HA for netfilter/iptables +Usage + +To set up a conntrack cluster you need + + hardware + two firewalls with identical iptables rulesets + all ethernet interfaces (internal, dmz, external) connected to both nodes + separate network segment for conntrack sync device + software + configure any working ip address range/subnet to sync device + assign every node a unique node id (0..255) + decide which of the nodes is master, which slave + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +HA for netfilter/iptables +Usage + +To set up a conntrack cluster you need + + configuration on master 
+ first: modprobe ct_sync syncdev=ethX state=1 id=1 l2drop=1 + second: configure your 'real' devices (internal, external) + configuration on slave + first: modprobe ct_sync syncdev=ethX state=0 id=2 l2drop=1 + second: configure your 'real' devices (internal, external) + + after loading ct_sync with l2drop=1, a slave node will be invisible on the 'real' networks. ssh access is only possible via sync device + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +HA for netfilter/iptables +Usage + + Cluster manager + set up a cluster manager with some heartbeat mechanism + configure it to run the following command on a slave that is to be promoted to master: + echo "1" > /proc/net/ct_sync + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +HA for netfilter/iptables +Thanks + + Thanks to + the BBS scene, Z-Netz, FIDO, ... + for heavily increasing my computer usage in 1992 + KNF + for bringing me in touch with the internet as early as 1994 + for providing a playground for technical people + for introducing me to the existence of Linux! + Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen + for implementing (one of?) 
the world's best TCP/IP stacks + Paul 'Rusty' Russell + for starting the netfilter/iptables project + for trusting me to maintain it today + Astaro AG + for sponsoring my netfilter failover work + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%page +HA for netfilter/iptables +Availability of slides / Links + +The code + http://cvs.netfilter.org/netfilter-ha/ct_sync + +The slides + http://www.gnumonks.org/ + +The netfilter homepage + http://www.netfilter.org/ + +Astaro AG + http://www.astaro.com/ diff --git a/2004/netfilter-failover-lk2004/netfilter-failover-lk2004.tex b/2004/netfilter-failover-lk2004/netfilter-failover-lk2004.tex new file mode 100644 index 0000000..d327bac --- /dev/null +++ b/2004/netfilter-failover-lk2004/netfilter-failover-lk2004.tex @@ -0,0 +1,656 @@ +\documentclass[twocolumn,12pt]{article} + +\usepackage{alltt} + +\usepackage[T1]{fontenc} +\usepackage[latin1]{inputenc} +\usepackage{isolatin1} +\usepackage{latexsym} +\usepackage{textcomp} +\usepackage{times} +\usepackage{url} +\usepackage[T1,obeyspaces]{zrl} + +% "verbatim" with line breaks, obeying spaces +\providecommand\code{\begingroup \xrlstyle{tt}\Xrl} +% as above, but okay to break lines at spaces +\providecommand\brcode{\begingroup \zrlstyle{tt}\Zrl} + +% Same as the pair above, but 'l' for long == small type +\providecommand\lcode{\begingroup \small\xrlstyle{tt}\Xrl} +\providecommand\lbrcode{\begingroup \small\zrlstyle{tt}\Zrl} + +% For identifiers - "verbatim" with line breaks at punctuation +\providecommand\ident{\begingroup \urlstyle{tt}\Url} +\providecommand\lident{\begingroup \small\urlstyle{tt}\Url} + + + + +\begin{document} + +% Required: do not print the date. 
+\date{} + +\title{\texttt{ct\_sync}: state replication of \texttt{ip\_conntrack}\\ +% {\normalsize Subtitle goes here} +} + +\author{ +Harald Welte \\ +{\em netfilter core team / Astaro AG / hmw-consulting.de}\\ +{\tt\normalsize laforge@gnumonks.org}\\ +% \and +% Second Author\\ +% {\em Second Institution}\\ +% {\tt\normalsize another@address.for.email.com}\\ +} % end author section + +\maketitle + +% Required: do not use page numbers on title page. +\thispagestyle{empty} + +\section*{Abstract} + +With traditional, stateless firewalling (such as ipfwadm, ipchains) +there is no need for special HA support in the firewalling +subsystem. As long as all packet filtering rules and routing table +entries are configured in exactly the same way, one can use any +available tool for IP-Address takeover to accomplish the goal of +failing over from one node to the other. + +With Linux 2.4/2.6 netfilter/iptables, the Linux firewalling code +moves beyond traditional packet filtering. Netfilter provides a +modular connection tracking subsystem which can be employed for +stateful firewalling. The connection tracking subsystem gathers +information about the state of all current network flows +(connections). Packet filtering decisions and NAT information are +associated with this state information. + +In a high availability scenario, this connection tracking state needs +to be replicated from the currently active firewall node to all +standby slave firewall nodes. Only when all connection tracking state +is replicated will the slave node have all necessary state +information at the time a failover event occurs. + +Due to funding by Astaro AG, the netfilter/iptables project now offers +a \ident{ct_sync} kernel module for replicating connection tracking state +across multiple nodes. The presentation will cover the architectural +design and implementation of the connection tracking failover system. 
+ + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%% BODY OF PAPER GOES HERE %%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + +\section{Failover of stateless firewalls} + +There are no special precautions when installing a highly available +stateless packet filter. Since there is no state kept, all information +needed for filtering is the ruleset and the individual, separate packets. + +Building a set of highly available stateless packet filters can thus be +achieved by using any traditional means of IP-address takeover, such +as Heartbeat or VRRPd. + +The only remaining issue is to make sure the firewalling ruleset is +exactly the same on both machines. This should be ensured by the firewall +administrator every time he updates the ruleset and can be optionally managed +by some scripts utilizing scp or rsync. + +If this is not applicable, because a very dynamic ruleset is employed, one can +build a very easy solution using iptables-supplied tools iptables-save and +iptables-restore. The output of iptables-save can be piped over ssh to +iptables-restore on a different host. + +Limitations +\begin{itemize} +\item +no state tracking +\item +not possible in combination with iptables stateful NAT +\item +no counter consistency of per-rule packet/byte counters +\end{itemize} + +\section{Failover of stateful firewalls} + +Modern firewalls implement state tracking (a.k.a.\ connection tracking) in order +to keep some state about the currently active sessions. The amount of +per-connection state kept at the firewall depends on the particular +configuration and networking protocols used. + +As soon as \texttt{any} state is kept at the packet filter, this state +information needs to be replicated to the slave/backup nodes within the +failover setup. + +Since Linux 2.4.x, all relevant state is kept within the \textit{connection +tracking subsystem}. 
In order to understand how this state could possibly be +replicated, we need to understand the architecture of this conntrack subsystem. + +\subsection{Architecture of the Linux Connection Tracking Subsystem} + +Connection tracking within Linux is implemented as a netfilter module, called +\ident{ip_conntrack.o} (\ident{ip_conntrack.ko} in 2.6.x kernels). + +Before describing the connection tracking subsystem, we need to describe a +couple of definitions and primitives used throughout the conntrack code. + +A connection is represented within the conntrack subsystem using +\brcode{struct ip_conntrack}, also called \textit{connection tracking entry}. + +Connection tracking is utilizing \textit{conntrack tuples}, which are tuples +consisting of +\begin{itemize} +\item + source IP address +\item + source port (or icmp type/code, gre key, ...) +\item + destination IP address +\item + destination port +\item + layer 4 protocol number +\end{itemize} + +A connection is uniquely identified by two tuples: The tuple in the original +direction (\lident{IP_CT_DIR_ORIGINAL}) and the tuple for the reply direction +(\lident{IP_CT_DIR_REPLY}). + +Connection tracking itself does not drop packets\footnote{well, in some rare +cases in combination with NAT it needs to drop. But don't tell anyone, this is +secret.} or impose any policy. It just associates every packet with a +connection tracking entry, which in turn has a particular state. All other +kernel code can use this state information\footnote{State information is +referenced via the \brcode{struct sk_buff.nfct} structure member of a +packet.}. + +\subsubsection{Integration of conntrack with netfilter} + +If the \ident{ip_conntrack.[k]o} module is registered with netfilter, it +attaches to the \lident{NF_IP_PRE_ROUTING}, \lident{NF_IP_POST_ROUTING}, \lident{NF_IP_LOCAL_IN}, +and \lident{NF_IP_LOCAL_OUT} hooks. 
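As a rough illustration of the conntrack tuple and the two directions (original and reply) introduced above, the following Python sketch models a connection that can be found by either of its tuples. It is not the kernel's data structure: the names ConnTuple, ConntrackTable, confirm and lookup are hypothetical, and a plain dictionary stands in for the conntrack hash table. The reply tuple is derived by swapping the source and destination parts, as the slides put it: reply == inverted(original).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnTuple:
    """Hypothetical stand-in for one direction of a connection."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    proto: int  # layer 4 protocol number, e.g. 6 = TCP, 17 = UDP

    def inverted(self) -> "ConnTuple":
        # Swap the source and destination parts to get the other direction.
        return ConnTuple(self.dst_ip, self.dst_port,
                         self.src_ip, self.src_port, self.proto)


class ConntrackTable:
    """Toy table: a dict keyed by tuple replaces the real hash table."""

    def __init__(self):
        self._by_tuple = {}

    def confirm(self, original: ConnTuple) -> dict:
        # One entry per connection, reachable through both directions.
        entry = {"original": original, "reply": original.inverted()}
        self._by_tuple[entry["original"]] = entry
        self._by_tuple[entry["reply"]] = entry
        return entry

    def lookup(self, tup: ConnTuple):
        # Returns the entry, or None if the packet matches no connection.
        return self._by_tuple.get(tup)
```

With this model, a DNS request from a client and the server's answer resolve to the same entry, because the answer's tuple equals the stored reply tuple.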
+ +Because forwarded packets are the most common case on firewalls, I will only +describe how connection tracking works for forwarded packets. The two relevant +hooks for forwarded packets are \lident{NF_IP_PRE_ROUTING} and \lident{NF_IP_POST_ROUTING}. + +Every time a packet arrives at the \lident{NF_IP_PRE_ROUTING} hook, connection +tracking creates a conntrack tuple from the packet. It then compares this +tuple to the original and reply tuples of all already-seen +connections +\footnote{Of course this is not implemented as a linear +search over all existing connections.} to find out if this +just-arrived packet belongs to any existing +connection. If there is no match, a new conntrack table entry +(\brcode{struct ip_conntrack}) is created. + +Let's assume the case where we have no existing connections and are +starting from scratch. + +The first packet comes in, we derive the tuple from the packet headers, look it up +in the conntrack hash table, and don't find any matching entry. As a result, we +create a new \brcode{struct ip_conntrack}. This \brcode{struct ip_conntrack} is filled with +all necessary data, like the original and reply tuple of the connection. +How do we know the reply tuple? By inverting the source and destination +parts of the original tuple.\footnote{So why do we need two tuples, if they can +be derived from each other? Wait until we discuss NAT.} +Please note that this new \brcode{struct ip_conntrack} is \textbf{not} yet placed +into the conntrack hash table. + +The packet is now passed on to other callback functions which have registered +with a lower priority at \lident{NF_IP_PRE_ROUTING}. It then continues traversal of +the network stack as usual, including all respective netfilter hooks. + +If the packet survives (i.e., is not dropped by the routing code, network stack, +firewall ruleset, \ldots), it re-appears at \lident{NF_IP_POST_ROUTING}. 
In this case, +we can now safely assume that this packet will be sent off on the outgoing +interface, and thus put the connection tracking entry which we created at +\lident{NF_IP_PRE_ROUTING} into the conntrack hash table. This process is called +\textit{confirming the conntrack}. + +The connection tracking code itself is not monolithic, but consists of a +couple of separate modules\footnote{They don't actually have to be separate +kernel modules; e.g.\ TCP, UDP, and ICMP tracking modules are all part of +the linux kernel module \ident{ip_conntrack.o}.}. Besides the conntrack core, +there are two important kinds of modules: Protocol helpers and application +helpers. + +Protocol helpers implement the layer-4-protocol specific parts. They currently +exist for TCP, UDP, and ICMP (an experimental helper for GRE exists). + +\subsubsection{TCP connection tracking} + +As TCP is a connection-oriented protocol, it is not very difficult to imagine +how connection tracking for this protocol could work. There are well-defined +state transitions possible, and conntrack can decide which state transitions +are valid within the TCP specification. In reality it's not all that easy, +since we cannot assume that all packets that pass the packet filter actually +arrive at the receiving end\ldots + +It is noteworthy that the standard connection tracking code does \textbf{not} +do TCP sequence number and window tracking. A well-maintained patch to add +this feature has existed for almost as long as connection tracking itself. It +will be integrated with the 2.5.x kernel. The problem with window tracking is +its bad interaction with connection pickup. The TCP conntrack code is able to +pick up already existing connections, e.g.\ in case your firewall was rebooted. 
+However, connection pickup conflicts with TCP window tracking: The TCP +window scaling option is only transferred at connection setup time, and we +don't know about it in case of pickup\ldots + +\subsubsection{ICMP tracking} + +ICMP is not really a connection-oriented protocol. So how is it possible to +do connection tracking for ICMP? + +The ICMP protocol can be split in two groups of messages: + +\begin{itemize} +\item +ICMP error messages, which sort-of belong to a different connection. +ICMP error messages are marked as \textit{RELATED} to that other connection +(\lident{ICMP_DEST_UNREACH}, \lident{ICMP_SOURCE_QUENCH}, +\lident{ICMP_TIME_EXCEEDED}, +\lident{ICMP_PARAMETERPROB}, \lident{ICMP_REDIRECT}). +\item +ICMP queries, which have a \ident{request-reply} character. So what +the conntrack +code does is let the request have a state of \textit{NEW}, and the reply +\textit{ESTABLISHED}. The reply closes the connection immediately. +(\lident{ICMP_ECHO}, \lident{ICMP_TIMESTAMP}, \lident{ICMP_INFO_REQUEST}, \lident{ICMP_ADDRESS}) +\end{itemize} + +\subsubsection{UDP connection tracking} + +UDP is designed as a connectionless datagram protocol. But most common +protocols using UDP as layer 4 protocol have bi-directional UDP communication. +Imagine a DNS query, where the client sends a UDP frame to port 53 of the +nameserver, and the nameserver sends back a DNS reply packet from its UDP +port 53 to the client. + +Netfilter treats this as a connection. The first packet (the DNS request) is +assigned a state of \textit{NEW}, because the packet is expected to create a new +`connection.' The DNS server's reply packet is marked as \textit{ESTABLISHED}. + +\subsubsection{conntrack application helpers} + +More complex application protocols involving multiple connections need special +support by a so-called ``conntrack application helper module.'' Modules in +the stock kernel come for FTP, IRC (DCC), TFTP and Amanda. 
Netfilter CVS currently contains +%%% orig: ``tftp ald talk'' -- um, 'tftp and talk'? Yes, that's correct. It refers +%%% to the talk protocol. +patches for PPTP, H.323, Eggdrop botnet, mms, DirectX, RTSP and talk/ntalk. We're still lacking +a lot of protocols (e.g.\ SIP, SMB/CIFS)---but they are unlikely to appear +until somebody really needs them and either develops them on his own or +funds development. + +\subsubsection{Integration of connection tracking with iptables} + +As stated earlier, conntrack doesn't impose any policy on packets. It just +determines the relation of a packet to already existing connections. +To base +packet filtering decision on this state information, the iptables \textit{state} +match can be used. Every packet is within one of the following categories: + +\begin{itemize} +\item +\textbf{NEW}: packet would create a new connection, if it survives +\item +\textbf{ESTABLISHED}: packet is part of an already established connection +(either direction) +\item +\textbf{RELATED}: packet is in some way related to an already established +connection, e.g.\ ICMP errors or FTP data sessions +\item +\textbf{INVALID}: conntrack is unable to derive conntrack information +from this packet. Please note that all multicast or broadcast packets +fall in this category. +\end{itemize} + + +\subsection{Poor man's conntrack failover} + +When thinking about failover of stateful firewalls, one usually thinks about +replication of state. This presumes that the state is gathered at one +firewalling node (the currently active node), and replicated to several other +passive standby nodes. There is, however, a very different approach to +replication: concurrent state tracking on all firewalling nodes. + +While this scheme has not been implemented within \ident{ct_sync}, the author +still thinks it is worth an explanation in this paper. + +The basic assumption of this approach is: In a setup where all firewalling +%%% deduct or deduce? 
I'd guess the latter, but I don't know, so I'm +%%% leaving it... +nodes receive exactly the same traffic, all nodes will deduce the same state +information. + +The implementability of this approach is totally dependent on fulfillment of +this assumption. + +\begin{itemize} +\item +\textit{All packets need to be seen by all nodes}. This is not always true, but +can be achieved by using shared media like traditional ethernet (no switches!!) +and promiscuous mode on all ethernet interfaces. +\item +\textit{All nodes need to be able to process all packets}. This cannot be +universally guaranteed. Even if the hardware (CPU, RAM, Chipset, NICs) and +software (Linux kernel) are exactly the same, they might behave differently, +especially under high load. To avoid those effects, the hardware should be +able to deal with way more traffic than seen during operation. Also, there +should be no userspace processes (like proxies, etc.) running on the firewalling +nodes at all. WARNING: Nobody guarantees this behaviour. However, the poor +man is usually not interested in scientific proof but in usability in his +particular practical setup. +\end{itemize} + +However, even if those conditions are fulfilled, there are remaining issues: +\begin{itemize} +\item +\textit{No resynchronization after reboot}. If a node is rebooted (because of +a hardware fault, software bug, software update, etc.) it will lose all state +information gathered up to the reboot. This means the state information +of this node after reboot will not contain any old state, gathered before the +reboot. The effects depend on the traffic. Generally, it is only assured that +state information about all connections initiated after the reboot will be +present. If there are short-lived connections (like http), the state +information on the just rebooted node will approximate the state information of +an older node. 
Only after all sessions active at the time of reboot have +terminated is state information guaranteed to be resynchronized. +\item +\textit{Only possible with shared medium}. The practical implication is that no +switched ethernet (and thus no full duplex) can be used. +\end{itemize} + +The major advantage of the poor man's approach is implementation simplicity. +No state transfer mechanism needs to be developed. Only very few changes +to the existing conntrack code would be needed in order to be able to +do tracking based on packets received from promiscuous interfaces. The active +node would have packet forwarding turned on, the passive nodes, off. + +I'm not proposing this as a real solution to the failover problem. It's +hackish, buggy, and likely to break very easily. But considering it can be +implemented in very little programming time, it could be an option for very +small installations with low reliability requirements. + +\subsection{Conntrack state replication} + +The preferred solution to the failover problem is, without any doubt, +replication of the connection tracking state. 
+ +The proposed conntrack state replication solution consists of several +parts: +\begin{itemize} +\item +A connection tracking state replication protocol +\item +An event interface generating event messages as soon as state information +changes on the active node +\item +An interface for explicit generation of connection tracking table entries on +the standby slaves +\item +Some code (preferably a kernel thread) running on the active node, receiving +state updates by the event interface and generating conntrack state replication +protocol messages +\item +Some code (preferably a kernel thread) running on the slave node(s), receiving +conntrack state replication protocol messages and updating the local conntrack +table accordingly +\end{itemize} + +Flow of events in chronological order: +\begin{itemize} +\item +\textit{on active node, inside the network RX softirq} +\begin{itemize} +\item + \ident{ip_conntrack} analyzes a forwarded packet +\item + \ident{ip_conntrack} gathers some new state information +\item + \ident{ip_conntrack} updates conntrack hash table +\item + \ident{ip_conntrack} calls event API +\item + function registered to event API builds and enqueues message to send ring +\end{itemize} +\item +\textit{on active node, inside the conntrack-sync sender kernel thread} + \begin{itemize} + \item + \ident{ct_sync_send} aggregates multiple messages into one packet + \item + \ident{ct_sync_send} dequeues packet from ring + \item + \ident{ct_sync_send} sends packet via in-kernel sockets API + \end{itemize} +\item +\textit{on slave node(s), inside network RX softirq} + \begin{itemize} + \item + \ident{ip_conntrack} ignores packets coming from the \ident{ct_sync} interface via NOTRACK mechanism + \item + UDP stack appends packet to socket receive queue of \ident{ct_sync_recv} kernel thread + \end{itemize} +\item +\textit{on slave node(s), inside conntrack-sync receive kernel thread} + \begin{itemize} + \item + \ident{ct_sync_recv} thread receives state 
replication packet + \item + \ident{ct_sync_recv} thread parses packet into individual messages + \item + \ident{ct_sync_recv} thread creates/updates local \ident{ip_conntrack} entry + \end{itemize} +\end{itemize} + + +\subsubsection{Connection tracking state replication protocol} + + + In order to be able to replicate the state between two or more firewalls, a +state replication protocol is needed. This protocol is used over a private +network segment shared by all nodes for state replication. It is designed to +work over IP unicast and IP multicast transport. IP unicast will be used for +direct point-to-point communication between one active firewall and one +standby firewall. IP multicast will be used when the state needs to be +replicated to more than one standby firewall. + + + The principal design criteria of this protocol are: +\begin{itemize} +\item + \textbf{reliable against data loss}, as the underlying UDP layer only + provides checksumming against data corruption, but doesn't employ any + means against data loss +\item + \textbf{lightweight}, since generating the state update messages is + already a very expensive process for the sender, eating additional CPU, + memory, and IO bandwidth. +\item + \textbf{easy to parse}, to minimize overhead at the receiver(s) +\end{itemize} + +The protocol does not employ any security mechanism like encryption, +authentication, or protection against spoofing attacks. It is +assumed that the private conntrack sync network is a secure communications +channel, not accessible to any malicious third party. + +To achieve reliability against data loss, a simple sequence numbering +scheme is used. All protocol messages are prefixed by a sequence number, +determined by the sender. If the slave detects packet loss by discontinuous +sequence numbers, it can request the retransmission of the missing packets +by stating the missing sequence number(s). 
Since there is no acknowledgement +for successfully received packets, the sender has to keep a +reasonably-sized\footnote{\textit{reasonable size} must be large enough for the +round-trip time between master and slowest slave.} backlog of recently-sent +packets in order to be able to fulfill retransmission +requests. + +The different state replication protocol packet types are: +\begin{itemize} +\item +\textbf{\ident{CT_SYNC_PKT_MASTER_ANNOUNCE}}: A new master announces itself. +Any still existing master will downgrade itself to slave upon +reception of this packet. +\item +\textbf{\ident{CT_SYNC_PKT_SLAVE_INITSYNC}}: A slave requests initial +synchronization from the master (after reboot or loss of sync). +\item +\textbf{\ident{CT_SYNC_PKT_SYNC}}: A packet containing synchronization data +from master to slaves +\item +\textbf{\ident{CT_SYNC_PKT_NACK}}: A slave indicates packet loss of a +particular sequence number +\end{itemize} + +The messages within a \lident{CT_SYNC_PKT_SYNC} packet always refer to a particular +\textit{resource} (currently \lident{CT_SYNC_RES_CONNTRACK} and \lident{CT_SYNC_RES_EXPECT}, +although support for the latter has not been fully implemented yet). + +For every resource, there are several message types. So far, only +\lident{CT_SYNC_MSG_UPDATE} and \lident{CT_SYNC_MSG_DELETE} have been implemented. This +means a new connection as well as state changes to an existing connection will +always be encapsulated in a \lident{CT_SYNC_MSG_UPDATE} message and therefore contain +the full conntrack entry. + +To uniquely identify (and later reference) a conntrack entry, the only unique +criterion is used: \ident{ip_conntrack_tuple}. + +\subsubsection{\texttt{ct\_sync} sender thread} + +Maximum care needs to be taken in the implementation of the ctsyncd sender. + +The normal workload of the active firewall node is likely to be already very +high, so generating and sending the conntrack state replication messages needs +to be highly efficient. 
+ +It was therefore decided to use a pre-allocated ring buffer for outbound +\ident{ct_sync} packets. New messages are appended to individual buffers in this +ring, and pointers into this ring are passed to the in-kernel sockets API to +ensure a minimum number of copies and memory allocations. + +\subsubsection{\texttt{ct\_sync} initsync sender thread} + +In order to facilitate ongoing state synchronization at the same time as +responding to initial sync requests of an individual slave, the sender has a +separate kernel thread for initial state synchronization (\ident{ct_sync_initsync}). + +At the moment it iterates over the state table and transmits packets at a +fixed rate of about 1000 packets per second, resulting in about 4000 +connections per second, which averages about 1.5 Mbps of bandwidth consumed. + +The speed of this initial sync should be configurable by the system +administrator, especially since there is no flow control mechanism, and the +slave node(s) will have to deal with the packets or otherwise lose sync again. + +This is certainly an area of future improvement and development---but first we +want to see practical problems with this primitive scheme. + +\subsubsection{\texttt{ct\_sync} receiver thread} + +Implementation of the receiver is very straightforward. + +For performance reasons, and to facilitate code reuse, the receiver uses the +same pre-allocated ring buffer structure as the sender. Incoming packets are +written into ring members and then successively parsed into their individual +messages. + +Apart from dealing with lost packets, it just needs to call the +respective conntrack add/modify/delete functions. + +\subsubsection{Necessary changes within netfilter conntrack core} + +To be able to achieve the described conntrack state replication mechanism, +the following changes to the conntrack core were implemented: +\begin{itemize} +\item + Ability to exclude certain packets from being tracked. 
This was a + long-wanted feature on the TODO list of the netfilter project and is + implemented by having a ``raw'' table in combination with a + ``NOTRACK'' target. +\item + Ability to register callback functions to be called every time a new + conntrack entry is created or an existing entry modified. This is + part of the nfnetlink-ctnetlink patch, since the ctnetlink event + interface also uses this API. +\item + Export an API to externally add, modify, and remove conntrack entries. +\end{itemize} + +Since the number of changes is very low, their inclusion into the mainline +kernel is not a problem and can happen during the 2.6.x stable kernel series. + + +\subsubsection{Layer 2 dropping and \texttt{ct\_sync}} + +In most cases, netfilter/iptables-based firewalls will not only function as +packet filters but also run local processes such as proxies, DNS relays, SMTP +relays, etc. + +In order to minimize failover time, it is helpful if the full startup and +configuration of all network interfaces and all of those userspace processes +can happen at system bootup time rather than at the time of a failover. + +l2drop provides a convenient way to achieve this goal: it hooks into layer 2 +netfilter hooks (immediately attached to \ident{netif_rx()} and +\ident{dev_queue_xmit()}) and blocks all incoming and outgoing network packets at this +very low layer. Even kernel-generated messages such as ARP replies, IPv6 +neighbour discovery, IGMP, \dots are blocked this way. + +Of course there has to be an exemption for the state synchronization messages +themselves. In order to still facilitate remote administration via SSH and +other communication between the cluster nodes, the whole network +interface used for synchronization is subject to this exemption from +l2drop. + +As soon as a node is promoted to master state, l2drop is disabled and the +system becomes visible to the network. + + +\subsubsection{Configuration} + +All configuration happens via module parameters. 
+ +\begin{itemize} +\item + \texttt{syncdev}: Name of the multicast-capable network device + used for state synchronization among the nodes +\item + \texttt{state}: Initial state of the node (0=slave, 1=master) +\item + \texttt{id}: Unique Node ID (0..255) +\item + \texttt{l2drop}: Enable (1) or disable (0) the l2drop functionality +\end{itemize} + +\subsubsection{Interfacing with the cluster manager} + +As indicated in the beginning of this paper, \ident{ct_sync} itself does not provide +any mechanism to determine outage of the master node within a cluster. This +job is left to a cluster manager software running in userspace. + +Once an outage of the master is detected, the cluster manager needs to elect +one of the remaining (slave) nodes to become new master. On this elected node, +the cluster manager will write the ascii character \texttt{1} into the +\ident{/proc/net/ct_sync} file. Reading from this file will return the current state +of the local node. + +\section{Acknowledgements} + +The author would like to thank his fellow netfilter developers for their +help. Particularly important to \ident{ct_sync} is Krisztian KOVACS +\ident{<hidden@balabit.hu>}, who did a proof-of-concept implementation based on my +first paper on \ident{ct_sync} at OLS2002. + +Without the financial support of Astaro AG, I would not have been able to spend any +time on \ident{ct_sync} at all. + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\end{document} + diff --git a/2004/netfilter-failover-lk2004/zrl.sty b/2004/netfilter-failover-lk2004/zrl.sty new file mode 100644 index 0000000..fb97b03 --- /dev/null +++ b/2004/netfilter-failover-lk2004/zrl.sty @@ -0,0 +1,432 @@ + +%%%%% This file is a kludge until such time as I learn to do it elegantly. Sorry. +%% url - external. Intended for items which do not contain spaces, and +%% containing global options for obeying & breaking at spaces. 
But +%% we need to do change those things on the fly, so we're making a copy +%% of url.sty and defining two extra groups, zrl and xrl, that +%% permit handling these options on the fly. + +%% Thus you can mix url without obeyspaces and/or spaces with the following: +%% zrl - url with obeyspaces,spaces turned on +%% xrl - url with obeyspaces turned on + +% zrl.sty ver 1.4 02-Mar-1999 Donald Arseneau asnd@triumf.ca +% Copyright 1996-1999 Donald Arseneau, Vancouver, Canada. +% This program can be used, distributed, and modified under the terms +% of the LaTeX Project Public License. +% +% A form of \verb that allows linebreaks at certain characters or +% combinations of characters, accepts reconfiguration, and can usually +% be used in the argument to another command. It is intended for email +% addresses, hypertext links, directories/paths, etc., which normally +% have no spaces. The font may be selected using the \zrlstyle command, +% and new zrl-like commands can be defined using \zrldef. +% +% Usage: Conditions: +% \zrl{ } If the argument contains any "%", "#", or "^^", or ends with +% "\", it can't be used in the argument to another command. +% The argument must not contain unbalanced braces. +% \zrl| | ...where "|" is any character not used in the argument and not +% "{" or a space. The same restrictions as above except that the +% argument may contain unbalanced braces. +% \xyz for "\xyz" a defined-zrl; this can be used anywhere, no matter +% what characters it contains. 
+% +% See further instructions after "\endinput" +% +\def\Zrl@ttdo{% style assignments for tt fonts or T1 encoding +\def\ZrlBreaks{\do\.\do\@\do\\\do\/\do\!\do\_\do\|\do\%\do\;\do\>\do\]% + \do\)\do\,\do\?\do\'\do\+\do\=}% +\def\ZrlBigBreaks{\do\:\do@zrl@hyp}% +\def\ZrlNoBreaks{\do\(\do\[\do\{\do\<}% (unnecessary) +\def\ZrlSpecials{\do\ {\ }}% +\def\ZrlOrds{\do\*\do\-\do\~}% any ordinary characters that aren't usually +} + +\def\Xrl@ttdo{% style assignments for tt fonts or T1 encoding +\def\XrlBreaks{\do\.\do\@\do\\\do\/\do\!\do\_\do\|\do\%\do\;\do\>\do\]% + \do\)\do\,\do\?\do\'\do\+\do\=}% +\def\XrlBigBreaks{\do\:\do@xrl@hyp}% +\def\XrlNoBreaks{\do\(\do\[\do\{\do\<}% (unnecessary) +\def\XrlSpecials{\do\ {\ }}% +\def\XrlOrds{\do\*\do\-\do\~}% any ordinary characters that aren't usually +} + +\def\Zrl@do{% style assignments for OT1 fonts except tt +\def\ZrlBreaks{\do\.\do\@\do\/\do\!\do\%\do\;\do\]\do\)\do\,\do\?\do\+\do\=}% +\def\ZrlBigBreaks{\do\:\do@zrl@hyp}% +\def\ZrlNoBreaks{\do\(\do\[\do\{}% prevents breaks after *next* character +\def\ZrlSpecials{\do\<{\langle}\do\>{\mathbin{\rangle}}\do\_{\_% + \penalty\@m}\do\|{\mid}\do\{{\lbrace}\do\}{\mathbin{\rbrace}}\do + \\{\mathbin{\backslash}}\do\~{\raise.6ex\hbox{\m@th$\scriptstyle\sim$}}\do + \ {\ }}% +\def\ZrlOrds{\do\'\do\"\do\-}% +} +\def\Xrl@do{% style assignments for OT1 fonts except tt +\def\XrlBreaks{\do\.\do\@\do\/\do\!\do\%\do\;\do\]\do\)\do\,\do\?\do\+\do\=}% +\def\XrlBigBreaks{\do\:\do@xrl@hyp}% +\def\XrlNoBreaks{\do\(\do\[\do\{}% prevents breaks after *next* character +\def\XrlSpecials{\do\<{\langle}\do\>{\mathbin{\rangle}}\do\_{\_% + \penalty\@m}\do\|{\mid}\do\{{\lbrace}\do\}{\mathbin{\rbrace}}\do + \\{\mathbin{\backslash}}\do\~{\raise.6ex\hbox{\m@th$\scriptstyle\sim$}}\do + \ {\ }}% +\def\XrlOrds{\do\'\do\"\do\-}% +} + + +\def\zrl@ttstyle{% +\@ifundefined{selectfont}{\def\ZrlFont{\tt}}{\def\ZrlFont{\ttfamily}}\Zrl@ttdo +} +\def\xrl@ttstyle{% 
+\@ifundefined{selectfont}{\def\XrlFont{\tt}}{\def\XrlFont{\ttfamily}}\Xrl@ttdo +} + + +\def\zrl@rmstyle{% +\@ifundefined{selectfont}{\def\ZrlFont{\rm}}{\def\ZrlFont{\rmfamily}}\Zrl@do +} +\def\xrl@rmstyle{% +\@ifundefined{selectfont}{\def\XrlFont{\rm}}{\def\XrlFont{\rmfamily}}\Xrl@do +} + + +\def\zrl@sfstyle{% +\@ifundefined{selectfont}{\def\ZrlFont{\sf}}{\def\ZrlFont{\sffamily}}\Zrl@do +} +\def\xrl@sfstyle{% +\@ifundefined{selectfont}{\def\XrlFont{\sf}}{\def\XrlFont{\sffamily}}\Xrl@do +} + + +\def\zrl@samestyle{\ifdim\fontdimen\thr@@\font=\z@ \zrl@ttstyle \else + \zrl@rmstyle \fi \def\ZrlFont{}} +\def\xrl@samestyle{\ifdim\fontdimen\thr@@\font=\z@ \xrl@ttstyle \else + \xrl@rmstyle \fi \def\XrlFont{}} + +\@ifundefined{strip@prefix}{\def\strip@prefix#1>{}}{} +\@ifundefined{verbatim@nolig@list}{\def\verbatim@nolig@list{\do\`}}{} + +\def\Zrl{% + \begingroup \let\zrl@moving\relax\relax \endgroup + \ifmmode\@nomatherr$\fi + \ZrlFont $\fam\z@ \textfont\z@\font + \let\do\@makeother \dospecials % verbatim catcodes + \catcode`{\@ne \catcode`}\tw@ \catcode`\ 10 % except braces and spaces + \medmuskip0mu \thickmuskip\medmuskip \thinmuskip\medmuskip + \@tempcnta\fam\multiply\@tempcnta\@cclvi + \let\do\set@mathcode \ZrlOrds % ordinary characters that were special + \advance\@tempcnta 8192 \ZrlBreaks % bin + \advance\@tempcnta 4096 \ZrlBigBreaks % rel + \advance\@tempcnta 4096 \ZrlNoBreaks % open + \let\do\set@mathact \ZrlSpecials % active + \let\do\set@mathnolig \verbatim@nolig@list % prevent ligatures + \@ifnextchar\bgroup\Zrl@z\Zrl@y} + +\def\Zrl@y#1{\catcode`{11 \catcode`}11 + \def\@tempa##1#1{\Zrl@z{##1}}\@tempa} +\def\Zrl@z#1{\def\@tempa{#1}\expandafter\expandafter\expandafter\Zrl@Hook + \expandafter\strip@prefix\meaning\@tempa\ZrlRight\m@th$\endgroup} +\def\Zrl@Hook{\ZrlLeft} +\let\ZrlRight\@empty +\let\ZrlLeft\@empty + +\def\Xrl{% + \begingroup \let\xrl@moving\relax\relax \endgroup + \ifmmode\@nomatherr$\fi + \XrlFont $\fam\z@ \textfont\z@\font + \let\do\@makeother 
\dospecials % verbatim catcodes + \catcode`{\@ne \catcode`}\tw@ \catcode`\ 10 % except braces and spaces + \medmuskip0mu \thickmuskip\medmuskip \thinmuskip\medmuskip + \@tempcnta\fam\multiply\@tempcnta\@cclvi + \let\do\set@mathcode \XrlOrds % ordinary characters that were special + \advance\@tempcnta 8192 \XrlBreaks % bin + \advance\@tempcnta 4096 \XrlBigBreaks % rel + \advance\@tempcnta 4096 \XrlNoBreaks % open + \let\do\set@mathact \XrlSpecials % active + \let\do\set@mathnolig \verbatim@nolig@list % prevent ligatures + \@ifnextchar\bgroup\Xrl@z\Xrl@y} + +\def\Xrl@y#1{\catcode`{11 \catcode`}11 + \def\@tempa##1#1{\Xrl@z{##1}}\@tempa} +\def\Xrl@z#1{\def\@tempa{#1}\expandafter\expandafter\expandafter\Xrl@Hook + \expandafter\strip@prefix\meaning\@tempa\XrlRight\m@th$\endgroup} +\def\Xrl@Hook{\XrlLeft} +\let\XrlRight\@empty +\let\XrlLeft\@empty + + +\def\set@mathcode#1{\count@`#1\advance\count@\@tempcnta\mathcode`#1\count@} +\def\set@mathact#1#2{\mathcode`#132768 \lccode`\~`#1\lowercase{\def~{#2}}} +\def\set@mathnolig#1{\ifnum\mathcode`#1<32768 + \lccode`\~`#1\lowercase{\edef~{\mathchar\number\mathcode`#1_{\/}}}% + \mathcode`#132768 \fi} + +\def\zrldef#1#2{\begingroup \setbox\z@\hbox\bgroup + \def\Zrl@z{\Zrl@def{#1}{#2}}#2} +\expandafter\ifx\csname DeclareRobustCommand\endcsname\relax + \def\Zrl@def#1#2#3{\m@th$\endgroup\egroup\endgroup + \def#1{#2{#3}}} +\else + \def\Zrl@def#1#2#3{\m@th$\endgroup\egroup\endgroup + \DeclareRobustCommand{#1}{#2{#3}}} +\fi + +\def\xrldef#1#2{\begingroup \setbox\z@\hbox\bgroup + \def\Xrl@z{\Xrl@def{#1}{#2}}#2} +\expandafter\ifx\csname DeclareRobustCommand\endcsname\relax + \def\Xrl@def#1#2#3{\m@th$\endgroup\egroup\endgroup + \def#1{#2{#3}}} +\else + \def\Xrl@def#1#2#3{\m@th$\endgroup\egroup\endgroup + \DeclareRobustCommand{#1}{#2{#3}}} +\fi + +\def\zrlstyle#1{\csname zrl@#1style\endcsname} +\def\xrlstyle#1{\csname xrl@#1style\endcsname} + +% Sample (and default) configuration: +% +\newcommand\zrl{\begingroup \Zrl} 
+\newcommand\xrl{\begingroup \Xrl} +% +% picTeX defines \path, so declare it optionally: +\@ifundefined{path}{\newcommand\path{\begingroup \zrlstyle{tt}\Zrl}}{} +\@ifundefined{path}{\newcommand\path{\begingroup \xrlstyle{tt}\Xrl}}{} +% +% too many styles define \email like \address, so I will not define it. +% \newcommand\email{\begingroup \zrlstyle{rm}\Zrl} + +% Process LaTeX \package options +% +\zrlstyle{tt} +%\let\Zrl@sppen\@M +\def\do@zrl@hyp{}% by default, no breaks after hyphens +%%%%% +\let\Zrl@sppen\relpenalty +\let\Zrl@Hook\relax +\xrlstyle{tt} +\let\Xrl@sppen\@M +\def\do@xrl@hyp{}% by default, no breaks after hyphens +\let\Xrl@Hook\relax +%%%%% +\@ifundefined{ProvidesPackage}{}{ + \ProvidesPackage{zrl}[1999/03/02 \space ver 1.4 \space + Verb mode for zrls, email addresses, and file names] + \DeclareOption{hyphens}{\def\do@zrl@hyp{\do\-}\def\do@xrl@hyp{\do\-}}% allow breaks after hyphens + \DeclareOption{obeyspaces}{\let\Zrl@Hook\relax\let\Xrl@Hook\relax}% a flag for later + \DeclareOption{spaces}{\let\Zrl@sppen\relpenalty} + \DeclareOption{T1}{\let\Zrl@do\Zrl@ttdo\let\Xrl@do\Xrl@ttdo} + \ProcessOptions +\ifx\Zrl@Hook\relax % [obeyspaces] was declared + \def\Zrl@Hook#1\ZrlRight\m@th{\edef\@tempa{\noexpand\ZrlLeft + \Zrl@retain#1\Zrl@nosp\, }\@tempa\ZrlRight\m@th} + \def\Zrl@retain#1 {#1\penalty\Zrl@sppen\ \Zrl@retain} + \def\Zrl@nosp\,#1\Zrl@retain{} +\fi +\ifx\Xrl@Hook\relax % [obeyspaces] was declared + \def\Xrl@Hook#1\XrlRight\m@th{\edef\@tempa{\noexpand\XrlLeft + \Xrl@retain#1\Xrl@nosp\, }\@tempa\XrlRight\m@th} + \def\Xrl@retain#1 {#1\penalty\Xrl@sppen\ \Xrl@retain} + \def\Xrl@nosp\,#1\Xrl@retain{} +\fi +} + +\edef\zrl@moving{\csname Zrl Error\endcsname} +\expandafter\edef\zrl@moving + {\csname zrl used in a moving argument.\endcsname} +\expandafter\expandafter\expandafter \let \zrl@moving\undefined + +\edef\xrl@moving{\csname Xrl Error\endcsname} +\expandafter\edef\xrl@moving + {\csname xrl used in a moving argument.\endcsname} 
+\expandafter\expandafter\expandafter \let \xrl@moving\undefined + +\endinput +% +% zrl.sty ver 1.4 02-Mar-1999 Donald Arseneau asnd@reg.triumf.ca +% +% This package defines "\zrl", a form of "\verb" that allows linebreaks, +% and can often be used in the argument to another command. It can be +% configured to print in different formats, and is particularly useful for +% hypertext links, email addresses, directories/paths, etc. The font may +% be selected using the "\zrlstyle" command and pre-defined text can be +% stored with the "\zrldef" command. New zrl-like commands can be defined, +% and a "\path" command is provided this way. +% +% Usage: Conditions: +% \zrl{ } If the argument contains any "%", "#", or "^^", or ends with +% "\", it can't be used in the argument to another command. +% The argument must not contain unbalanced braces. +% \zrl| | ...where "|" is any character not used in the argument and not +% "{" or a space. The same restrictions as above except that the +% argument may contain unbalanced braces. +% \xyz for "\xyz" a defined-zrl; this can be used anywhere, no matter +% what characters it contains. +% +% The "\zrl" command is fragile, and its argument is likely to be very +% fragile, but a defined-zrl is robust. +% +% Package Option: obeyspaces +% Ordinarily, all spaces are ignored in the zrl-text. The "[obeyspaces]" +% option allows spaces, but may introduce spurious spaces when a zrl +% containing "\" characters is given in the argument to another command. +% So if you need to obey spaces you can say "\usepackage[obeyspaces]{zrl}", +% and if you need both spaces and backslashes, use a `defined-zrl' for +% anything with "\". +% +% Package Option: hyphens +% Ordinarily, breaks are not allowed after "-" characters because this +% leads to confusion. (Is the "-" part of the address or just a hyphen?) +% The package option "[hyphens]" allows breaks after explicit hyphen +% characters. The "\zrl" command will *never ever* hyphenate words. 
+% +% Package Option: spaces +% Likewise, breaks are not usually allowed after spaces under the +% "[obeyspaces]" option, but giving the options "[obeyspaces,spaces]" +% will allow breaks at those spaces. +% +% Package Option: T1 +% This signifies that you will be using T1-encoded fonts which contain +% some characters missing from most older (OT1) encoded TeX fonts. This +% changes the default definition for "\zrlstyle{rm}". +% +% Defining a defined-zrl: +% Take for example the email address "myself%node@gateway.net" which could +% not be given (using "\zrl" or "\verb") in a caption or parbox due to the +% percent sign. This address can be predefined with +% \zrldef{\myself}\zrl{myself%node@gateway.net} or +% \zrldef{\myself}\zrl|myself%node@gateway.net| +% and then you may use "\myself" instead of "\zrl{myself%node@gateway.net}" +% in an argument, and even in a moving argument like a caption because a +% defined-zrl is robust. +% +% Style: +% You can switch the style of printing using "\zrlstyle{tt}", where "tt" +% can be any defined style. The pre-defined styles are "tt", "rm", "sf", +% and "same" which all allow the same linebreaks but different fonts -- +% the first three select a specific font and the "same" style uses the +% current text font. You can define your own styles with different fonts +% and/or line-breaking by following the explanations below. The "\zrl" +% command follows whatever the currently-set style dictates. +% +% Alternate commands: +% It may be desireable to have different things treated differently, each +% in a predefined style; e.g., if you want directory paths to always be +% in tt and email addresses to be rm, then you would define new zrl-like +% commands as follows: +% +% \newcommand\email{\begingroup \zrlstyle{rm}\Zrl} +% \newcommand\directory{\begingroup \zrlstyle{tt}\Zrl} +% +% You must follow this format closely, and NOTE that the final command is +% "\Zrl", not "\zrl". 
In fact, the "\directory" example is exactly the +% "\path" definition which is pre-defined in the package. If you look +% above, you will see that "\zrl" is defined with +% \newcommand\zrl{\begingroup \Zrl} +% I.e., using whatever zrl-style has been selected. +% +% You can make a defined-zrl for these other styles, using the usual +% "\zrldef" command as in this example: +% +% \zrldef{\myself}{\email}{myself%node.domain@gateway.net} +% +% which makes "\myself" act like "\email{myself%node.domain@gateway.net}", +% if the "\email" command is defined as above. The "\myself" command +% would then be robust. +% +% Defining styles: +% Before describing how to customize the printing style, it is best to +% mention something about the unusual implementation of "\zrl". Although +% the material is textual in nature, and the font specification required +% is a text-font command, the text is actually typeset in *math* mode. +% This allows the context-sensitive linebreaking, but also accounts for +% the default behavior of ignoring spaces. Now on to defining styles. +% +% To change the font or the list of characters that allow linebreaks, you +% could redefine the commands "\ZrlFont", "\ZrlBreaks", "\ZrlSpecials" etc. +% directly in the document, but it is better to define a new `zrl-style' +% (following the example of "\zrl@ttstyle" and "\zrl@rmstyle") which defines +% all of "\ZrlBigbreaks", "\ZrlNoBreaks", "\ZrlBreaks", "\ZrlSpecials", and +% "\ZrlFont". +% +% Changing font: +% The "\ZrlFont" command selects the font. The definition of "\ZrlFont" +% done by the pre-defined styles varies to cope with a variety of LaTeX +% font selection schemes, but it could be as simple as "\def\ZrlFont{\tt}". +% Depending on the font selected, some characters may need to be defined +% in the "\ZrlSpecials" list because many fonts don't contain all the +% standard input characters. 
+% +% Changing linebreaks: +% The list of characters that allow line-breaks is given by "\ZrlBreaks" +% and "\ZrlBigBreaks", which have the format "\do\c" for character "c". +% The differences are that `BigBreaks' have a lower penalty and have +% different breakpoints when in sequence (as in "http://"): `BigBreaks' +% are treated as mathrels while `Breaks' are mathbins (see The TeXbook, +% p.170). In particular, a series of `BigBreak' characters will break at +% the end and only at the end; a series of `Break' characters will break +% after the first and after every following *pair*; there will be no +% break after a `Break' character if a `BigBreak' follows. In the case +% of "http://" it doesn't matter whether ":" is a `Break' or `BigBreak' -- +% the breaks are the same in either case; but for DECnet nodes with "::" +% it is important to prevent breaks *between* the colons, and that is why +% colons are `BigBreaks'. +% +% It is possible for characters to prevent breaks after the next following +% character (I use this for parentheses). Specify these in "\ZrlNoBreaks". +% +% You can do arbitrarily complex things with characters by making them +% active in math mode (mathcode hex-8000) and specifying the definition(s) +% in "\ZrlSpecials". This is used in the rm and sf styles for OT1 font +% encoding to handle several characters that are not present in those +% computer-modern style fonts. See the definition of "\Zrl@do", which +% is used by both "\zrl@rmstyle" and "\zrl@sfstyle"; it handles missing +% characters via "\ZrlSpecials". The nominal format for setting each +% special character "c" is: "\do\c{<definition>}", but you can include +% other definitions too. +% +% +% If all this sounds confusing ... well, it is! But I hope you won't need +% to redefine breakpoints -- the default assignments seem to work well for +% a wide variety of applications. If you do need to make changes, you can +% test for breakpoints using regular math mode and the characters "+=(a". 
+% +% Yet more flexibility: +% You can also customize the verbatim text by defining "\ZrlRight" and/or +% "\ZrlLeft", e.g., for ISO formatting of zrls surrounded by "< >", define +% +% \renewcommand\zrl{\begingroup \def\ZrlLeft{<zrl: }\def\ZrlRight{>}% +% \zrlstyle{tt}\Zrl} +% +% The meanings of "\ZrlLeft" and "\ZrlRight" are *not* reproduced verbatim. +% This lets you use formatting commands there, but you must be careful not +% to use TeX's special characters ("\^_%~#$&{}" etc.) improperly. +% You can also define "\ZrlLeft" to reprocess the verbatim text, but the +% format of the definition is special: +% +% \def\ZrlLeft#1\ZrlRight{ ... do things with #1 ... } +% +% Yes, that is "#1" followed by "\ZrlRight" then the definition. For +% example, to put a hyperTeX hypertext link in the DVI file: +% +% \def\ZrlLeft#1\ZrlRight{\special{html:<a href="#1">}#1\special{html:</a>}} +% +% Using this technique, zrl.sty can provide a convenient interface for +% performing various operations on verbatim text. You don't even need +% to print out the argument! For greatest efficiency in such obscure +% applications, you can define a null zrl-style where all the lists like +% "\ZrlBreaks" are empty. +% +% Revision History: +% ver 1.1 6-Feb-1996: +% Fix hyphens that wouldn't break and ligatures that weren't suppressed. +% ver 1.2 19-Oct-1996: +% Package option for T1 encoding; Hooks: "\ZrlLeft" and "\ZrlRight". +% ver 1.3 21-Jul-1997: +% Prohibit spaces as delimiter characters; change ascii tilde in OT1. +% ver 1.4 02-Mar-1999 +% LaTeX license; moving-argument-error +% The End + +Test file integrity: ASCII 32-57, 58-126: !"#$%&'()*+,-./0123456789 +:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~ |