author: Harald Welte <laforge@gnumonks.org> 2015-10-25 21:00:20 +0100
committer: Harald Welte <laforge@gnumonks.org> 2015-10-25 21:00:20 +0100
commit: fca59bea770346cf1c1f9b0e00cb48a61b44a8f3
tree: a2011270df48d3501892ac1a56015c8be57e8a7d /2004/netfilter-failover-lk2004
import of old now defunct presentation slides svn repo
Diffstat (limited to '2004/netfilter-failover-lk2004')
-rw-r--r-- 2004/netfilter-failover-lk2004/netfilter-failover-lk2004.mgp 369
-rw-r--r-- 2004/netfilter-failover-lk2004/netfilter-failover-lk2004.tex 656
-rw-r--r-- 2004/netfilter-failover-lk2004/zrl.sty 432
3 files changed, 1457 insertions, 0 deletions
diff --git a/2004/netfilter-failover-lk2004/netfilter-failover-lk2004.mgp b/2004/netfilter-failover-lk2004/netfilter-failover-lk2004.mgp
new file mode 100644
index 0000000..76a9206
--- /dev/null
+++ b/2004/netfilter-failover-lk2004/netfilter-failover-lk2004.mgp
@@ -0,0 +1,369 @@
+%include "default.mgp"
+%default 1 bgrad
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+%nodefault
+%back "blue"
+
+%center
+%size 7
+
+
+How to replicate the fire
+HA for netfilter-based firewalls
+
+
+%center
+%size 4
+by
+
+Harald Welte <laforge@netfilter.org>
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+HA for netfilter/iptables
+Contents
+
+
+ Introduction
+ Connection Tracking Subsystem
+ Packet selection based on IP Tables
+ The Connection Tracking Subsystem
+ The NAT Subsystem
+ Poor man's failover
+ Real state replication
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+HA for netfilter/iptables
+Introduction
+
+What is special about firewall failover?
+
+ Nothing, in case of the stateless packet filter
+ Common IP takeover solutions can be used
+ VRRP
+ Heartbeat
+ Distribution of packet filtering ruleset no problem
+ can be done manually
+ or implemented with simple userspace process
+ Problems arise with stateful packet filters
+ Connection state only on active node
+ NAT mappings only on active node
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+HA for netfilter/iptables
+Connection Tracking Subsystem
+
+Connection tracking...
+ enables stateful filtering
+ implementation
+ hooks into netfilter to track packets
+ protocol modules (currently TCP/UDP/ICMP)
+ application helpers (currently FTP, IRC, H.323, talk, SNMP)
+ divides packets into the following four categories
+ NEW - would establish new connection
+ ESTABLISHED - part of already established connection
+ RELATED - is related to established connection
+ INVALID - (multicast, errors...)
+ does _NOT_ filter packets itself
+ can be utilized by iptables using the 'state' match
+ is used by NAT Subsystem
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+HA for netfilter/iptables
+Connection Tracking Subsystem
+
+Common structures
+ struct ip_conntrack_tuple, representing unidirectional flow
+ layer 3 src + dst
+ layer 4 protocol
+ layer 4 src + dst
+
+ connections represented as struct ip_conntrack
+ original tuple
+ reply tuple
+ timeout
+ l4 state private data
+ app helper
+ app helper private data
+ expected connections
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+HA for netfilter/iptables
+Connection Tracking Subsystem
+
+Flow of events for new packet
+ packet enters NF_IP_PRE_ROUTING
+ tuple is derived from packet
+ lookup conntrack hash table with hash(tuple) -> fails
+ new ip_conntrack is allocated
+ fill in original and reply == inverted(original) tuple
+ initialize timer
+ assign app helper if applicable
+ see if we've been expected -> fails
+ call layer 4 helper 'new' function
+ ...
+ packet enters NF_IP_POST_ROUTING
+ do hashtable lookup for packet -> fails
+ place struct ip_conntrack in hashtable
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+HA for netfilter/iptables
+Connection Tracking Subsystem
+
+Flow of events for packet part of existing connection
+ packet enters NF_IP_PRE_ROUTING
+ tuple is derived from packet
+ lookup conntrack hash table with hash(tuple)
+ associate conntrack entry with skb->nfct
+ call l4 protocol helper 'packet' function
+ do l4 state tracking
+ update timeouts as needed [e.g. TCP TIME_WAIT,...]
+ ...
+ packet enters NF_IP_POST_ROUTING
+ do hashtable lookup for packet -> succeeds
+ do nothing else
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+HA for netfilter/iptables
+Network Address Translation
+
+Overview
+ Previous Linux Kernels only implemented one special case of NAT: Masquerading
+ Linux 2.4.x can do any kind of NAT.
+ NAT subsystem implemented on top of netfilter, iptables and conntrack
+ NAT subsystem registers with all five netfilter hooks
+ 'nat' Table registers chains PREROUTING, POSTROUTING and OUTPUT
+ Following targets available within 'nat' Table
+ SNAT changes the packet's source while passing NF_IP_POST_ROUTING
+ DNAT changes the packet's destination while passing NF_IP_PRE_ROUTING
+ MASQUERADE is a special case of SNAT
+ REDIRECT is a special case of DNAT
+ NAT bindings determined only for NEW packet and saved in ip_conntrack
+ Further packets within connection NATed according to NAT bindings
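The binding rule in the last two bullets can be illustrated with a small Python sketch. It is only a model under stated assumptions: `nat_bindings`, `snat` and the port counter are hypothetical names, the addresses are documentation examples, and the real NAT code keeps the binding in `ip_conntrack` rather than in a dictionary.

```python
import itertools

# Hypothetical model: bindings keyed by the original-direction tuple.
nat_bindings = {}
_next_port = itertools.count(50000)  # assumed port allocator, for illustration only


def snat(src, sport, dst, dport, public_ip="203.0.113.1"):
    """Rewrite the source of a packet, choosing the binding only once per connection."""
    key = (src, sport, dst, dport)
    if key not in nat_bindings:
        # NEW packet: determine the binding and remember it.
        nat_bindings[key] = (public_ip, next(_next_port))
    new_src, new_sport = nat_bindings[key]
    # Every later packet of the same connection reuses the saved binding.
    return (new_src, new_sport, dst, dport)


first = snat("10.0.0.5", 40000, "198.51.100.5", 80)
second = snat("10.0.0.5", 40000, "198.51.100.5", 80)
other = snat("10.0.0.6", 40000, "198.51.100.5", 80)
```

Because the binding exists only on the node that saw the first packet, a standby node needs it replicated before it can take over a NATed connection.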
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+HA for netfilter/iptables
+Poor man's failover
+
+Poor man's failover
+ principle
+ let every node do its own tracking rather than replicating state
+ two possible implementations
+ connect every node to shared media (i.e. real ethernet)
+ forwarding only turned on on active node
+ slave nodes use promiscuous mode to sniff packets
+ copy all traffic to slave nodes
+ active master needs to copy all traffic to other nodes
+ disadvantage: high load, sync traffic == payload traffic
+ IMHO stupid way of solving the problem
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+HA for netfilter/iptables
+Poor man's failover
+
+Poor man's failover
+ advantages
+ very easy implementation
+ only addition of sniffing mode to conntrack needed
+ existing means of address takeover can be used
+ same load on active master and slave nodes
+ no additional load on active master
+ disadvantages
+ can only be used with real shared media (no switches, ...)
+ can not be used with NAT
+ remaining problem
+ no initial state sync after reboot of slave node!
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+HA for netfilter/iptables
+Real state replication (ct_sync)
+
+Real state replication (ct_sync)
+ characteristics
+ replicates state changes from active master to slave(s)
+ separate shared ethernet segment for sync
+ advantages
+ can be used with any network media
+ works with NAT
+ initial sync after new slave is introduced
+ problems
+ complex implementation
+ current limitations
+ no replication of connection relations (ftp/h.323/...)
+ current problems
+ bugs, bugs, bugs
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+HA for netfilter/iptables
+Real state replication (ct_sync)
+
+Required parts
+ state replication protocol
+ multicast based
+ sequence numbers for detection of packet loss
+ NACK-based retransmission
+ no security, since private ethernet segment to be used
+ event interface on active node
+ calling out to callback function at all state changes
+ exported interface to manipulate conntrack hash table
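The "sequence numbers" and "NACK-based retransmission" bullets can be sketched from the receiver's point of view. This is a minimal Python model, not the ct_sync wire protocol: `SyncReceiver` and its fields are hypothetical names, sequence numbers are assumed to start at 0, and an out-of-order packet is simply discarded until the missing ones have been retransmitted.

```python
class SyncReceiver:
    """Toy receiver that detects gaps in the sequence numbers of sync packets."""

    def __init__(self):
        self.expected = 0      # next sequence number we are waiting for
        self.delivered = []    # payloads accepted in order

    def receive(self, seq, payload):
        """Accept one packet; return the sequence numbers to NACK (empty if none)."""
        if seq == self.expected:
            self.delivered.append(payload)
            self.expected += 1
            return []
        if seq > self.expected:
            # Gap detected: ask the sender for everything we missed.
            return list(range(self.expected, seq))
        # Duplicate or already delivered packet: ignore silently.
        return []


rx = SyncReceiver()
nack_in_order = rx.receive(0, "a")
nack_gap = rx.receive(2, "c")      # packet 1 was lost
nack_retrans = rx.receive(1, "b")  # sender retransmits 1 ...
nack_again = rx.receive(2, "c")    # ... and then 2
```

With a NACK scheme the receiver stays silent as long as nothing is lost, which keeps the additional load on the active master low.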
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+HA for netfilter/iptables
+Real state replication (ct_sync)
+
+Required parts
+ kernel thread for sending conntrack state protocol messages
+ registers with event interface
+ creates and accumulates state replication packets
+ sends them via in-kernel sockets api
+ kernel thread for receiving conntrack state replication messages
+ receives state replication packets via in-kernel sockets
+ uses conntrack hashtable manipulation interface
+ kernel thread for initial or full re-sync
+ sends full conntrack table with fixed speed
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+HA for netfilter/iptables
+Real state replication
+
+Flow of events in chronological order:
+ on active node, inside the network RX softirq
+ connection tracking code is analyzing a forwarded packet
+ connection tracking gathers some new state information
+ connection tracking updates local connection tracking database
+ connection tracking sends event message to event API
+ function registered at event API enqueues message to send ring
+ on active node, inside the conntrack-sync kernel thread
+ conntrack sync daemon aggregates multiple event messages into a state replication protocol message, removing possible redundancy
+ conntrack sync daemon dequeues packets from ring
+ conntrack sync daemon sends state replication protocol packet via in-kernel sockets
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+HA for netfilter/iptables
+Real state replication
+
+Flow of events in chronological order:
+ on slave node(s), inside network RX softirq
+ connection tracking code ignores packets coming from the interface attached to the private conntrack sync network
+ state replication protocol messages are appended to socket receive queue of conntrack-sync kernel thread
+ on slave node(s), inside conntrack-sync kernel thread
+ conntrack sync daemon receives state replication message
+ conntrack sync daemon creates/updates conntrack entry
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+HA for netfilter/iptables
+Real state replication
+
+Necessary changes to conntrack core
+ event generation (callback functions) for all state changes
+ is needed (and already implemented) for 'ctnetlink' API
+ conntrack hashtable manipulation API
+ is needed (and already implemented) for 'ctnetlink' API
+ conntrack exemptions
+ needed to _not_ track conntrack state replication packets
+ is needed for other cases as well (raw table / NOTRACK target)
+ works by
+ layer two packet drop (l2netfilter hooks)
+ disables any incoming or outgoing packets on other than the sync device on slave nodes
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+HA for netfilter/iptables
+Usage
+
+To set up a conntrack cluster you need
+
+ hardware
+ two firewalls with identical iptables rulesets
+ all ethernet interfaces (internal, dmz, external) connected to both nodes
+ separate network segment for conntrack sync device
+ software
+ configure any working ip address range/subnet to sync device
+ assign every node a unique node id (0..255)
+ decide which of the nodes is master, which slave
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+HA for netfilter/iptables
+Usage
+
+To set up a conntrack cluster you need
+
+ configuration on master
+ first: modprobe ct_sync syncdev=ethX state=1 id=1 l2drop=1
+ second: configure your 'real' devices (internal, external)
+ configuration on slave
+ first: modprobe ct_sync syncdev=ethX state=0 id=2 l2drop=1
+ second: configure your 'real' devices (internal, external)
+
+ after loading ct_sync with l2drop=1, a slave node will be invisible on the 'real' networks. ssh access is only possible via sync device
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+HA for netfilter/iptables
+Usage
+
+ Cluster manager
+ set up a cluster manager with some heartbeat mechanism
+ configure it to run the following command on a slave that is to be promoted to master:
+ echo "1" > /proc/net/ct_sync
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+HA for netfilter/iptables
+Thanks
+
+ Thanks to
+ the BBS scene, Z-Netz, FIDO, ...
+ for heavily increasing my computer usage in 1992
+ KNF
+ for bringing me in touch with the internet as early as 1994
+ for providing a playground for technical people
+ for introducing me to the existence of Linux!
+ Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
+ for implementing (one of?) the world's best TCP/IP stacks
+ Paul 'Rusty' Russell
+ for starting the netfilter/iptables project
+ for trusting me to maintain it today
+ Astaro AG
+ for sponsoring my netfilter failover work
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+HA for netfilter/iptables
+Availability of slides / Links
+
+The code
+ http://cvs.netfilter.org/netfilter-ha/ct_sync
+
+The slides
+ http://www.gnumonks.org/
+
+The netfilter homepage
+ http://www.netfilter.org/
+
+Astaro AG
+ http://www.astaro.com/
diff --git a/2004/netfilter-failover-lk2004/netfilter-failover-lk2004.tex b/2004/netfilter-failover-lk2004/netfilter-failover-lk2004.tex
new file mode 100644
index 0000000..d327bac
--- /dev/null
+++ b/2004/netfilter-failover-lk2004/netfilter-failover-lk2004.tex
@@ -0,0 +1,656 @@
+\documentclass[twocolumn,12pt]{article}
+
+\usepackage{alltt}
+
+\usepackage[T1]{fontenc}
+\usepackage[latin1]{inputenc}
+\usepackage{isolatin1}
+\usepackage{latexsym}
+\usepackage{textcomp}
+\usepackage{times}
+\usepackage{url}
+\usepackage[T1,obeyspaces]{zrl}
+
+% "verbatim" with line breaks, obeying spaces
+\providecommand\code{\begingroup \xrlstyle{tt}\Xrl}
+% as above, but okay to break lines at spaces
+\providecommand\brcode{\begingroup \zrlstyle{tt}\Zrl}
+
+% Same as the pair above, but 'l' for long == small type
+\providecommand\lcode{\begingroup \small\xrlstyle{tt}\Xrl}
+\providecommand\lbrcode{\begingroup \small\zrlstyle{tt}\Zrl}
+
+% For identifiers - "verbatim" with line breaks at punctuation
+\providecommand\ident{\begingroup \urlstyle{tt}\Url}
+\providecommand\lident{\begingroup \small\urlstyle{tt}\Url}
+
+
+
+
+\begin{document}
+
+% Required: do not print the date.
+\date{}
+
+\title{\texttt{ct\_sync}: state replication of \texttt{ip\_conntrack}\\
+% {\normalsize Subtitle goes here}
+}
+
+\author{
+Harald Welte \\
+{\em netfilter core team / Astaro AG / hmw-consulting.de}\\
+{\tt\normalsize laforge@gnumonks.org}\\
+% \and
+% Second Author\\
+% {\em Second Institution}\\
+% {\tt\normalsize another@address.for.email.com}\\
+} % end author section
+
+\maketitle
+
+% Required: do not use page numbers on title page.
+\thispagestyle{empty}
+
+\section*{Abstract}
+
+With traditional, stateless firewalling (such as ipfwadm, ipchains)
+there is no need for special HA support in the firewalling
+subsystem. As long as all packet filtering rules and routing table
+entries are configured in exactly the same way, one can use any
+available tool for IP-Address takeover to accomplish the goal of
+failing over from one node to the other.
+
+With Linux 2.4/2.6 netfilter/iptables, the Linux firewalling code
+moves beyond traditional packet filtering. Netfilter provides a
+modular connection tracking subsystem which can be employed for
+stateful firewalling. The connection tracking subsystem gathers
+information about the state of all current network flows
+(connections). Packet filtering decisions and NAT information are
+associated with this state information.
+
+In a high availability scenario, this connection tracking state needs
+to be replicated from the currently active firewall node to all
+standby slave firewall nodes. Only when all connection tracking state
+is replicated will the slave node have all necessary state
+information at the time a failover event occurs.
+
+Due to funding by Astaro AG, the netfilter/iptables project now offers
+a \ident{ct_sync} kernel module for replicating connection tracking state
+across multiple nodes. The presentation will cover the architectural
+design and implementation of the connection tracking failover system.
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%%% BODY OF PAPER GOES HERE %%%
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+
+\section{Failover of stateless firewalls}
+
+No special precautions are needed when installing a highly available
+stateless packet filter. Since there is no state kept, all information
+needed for filtering is the ruleset and the individual, separate packets.
+
+Building a set of highly available stateless packet filters can thus be
+achieved by using any traditional means of IP-address takeover, such
+as Heartbeat or VRRPd.
+
+The only remaining issue is to make sure the firewalling ruleset is
+exactly the same on both machines. This should be ensured by the firewall
+administrator every time he updates the ruleset and can be optionally managed
+by some scripts utilizing scp or rsync.
+
+If this is not applicable, because a very dynamic ruleset is employed, one can
+build a very simple solution using the iptables-supplied tools iptables-save and
+iptables-restore. The output of iptables-save can be piped over ssh to
+iptables-restore on a different host.
+
+Limitations
+\begin{itemize}
+\item
+no state tracking
+\item
+not possible in combination with iptables stateful NAT
+\item
+no counter consistency of per-rule packet/byte counters
+\end{itemize}
+
+\section{Failover of stateful firewalls}
+
+Modern firewalls implement state tracking (a.k.a.\ connection tracking) in order
+to keep some state about the currently active sessions. The amount of
+per-connection state kept at the firewall depends on the particular
+configuration and networking protocols used.
+
+As soon as \textit{any} state is kept at the packet filter, this state
+information needs to be replicated to the slave/backup nodes within the
+failover setup.
+
+Since Linux 2.4.x, all relevant state is kept within the \textit{connection
+tracking subsystem}. In order to understand how this state could possibly be
+replicated, we need to understand the architecture of this conntrack subsystem.
+
+\subsection{Architecture of the Linux Connection Tracking Subsystem}
+
+Connection tracking within Linux is implemented as a netfilter module, called
+\ident{ip_conntrack.o} (\ident{ip_conntrack.ko} in 2.6.x kernels).
+
+Before describing the connection tracking subsystem, we need to describe a
+couple of definitions and primitives used throughout the conntrack code.
+
+A connection is represented within the conntrack subsystem using
+\brcode{struct ip_conntrack}, also called \textit{connection tracking entry}.
+
+Connection tracking utilizes \textit{conntrack tuples}, which are tuples
+consisting of
+\begin{itemize}
+\item
+ source IP address
+\item
+ source port (or icmp type/code, gre key, ...)
+\item
+ destination IP address
+\item
+ destination port
+\item
+ layer 4 protocol number
+\end{itemize}
+
+A connection is uniquely identified by two tuples: The tuple in the original
+direction (\lident{IP_CT_DIR_ORIGINAL}) and the tuple for the reply direction
+(\lident{IP_CT_DIR_REPLY}).
+
+Connection tracking itself does not drop packets\footnote{well, in some rare
+cases in combination with NAT it needs to drop. But don't tell anyone, this is
+secret.} or impose any policy. It just associates every packet with a
+connection tracking entry, which in turn has a particular state. All other
+kernel code can use this state information\footnote{State information is
+referenced via the \brcode{struct sk_buff.nfct} structure member of a
+packet.}.
+
+\subsubsection{Integration of conntrack with netfilter}
+
+If the \ident{ip_conntrack.[k]o} module is registered with netfilter, it
+attaches to the \lident{NF_IP_PRE_ROUTING}, \lident{NF_IP_POST_ROUTING}, \lident{NF_IP_LOCAL_IN},
+and \lident{NF_IP_LOCAL_OUT} hooks.
+
+Because forwarded packets are the most common case on firewalls, I will only
+describe how connection tracking works for forwarded packets. The two relevant
+hooks for forwarded packets are \lident{NF_IP_PRE_ROUTING} and \lident{NF_IP_POST_ROUTING}.
+
+Every time a packet arrives at the \lident{NF_IP_PRE_ROUTING} hook, connection
+tracking creates a conntrack tuple from the packet. It then compares this
+tuple to the original and reply tuples of all already-seen
+connections
+\footnote{Of course this is not implemented as a linear
+search over all existing connections.} to find out if this
+just-arrived packet belongs to any existing
+connection. If there is no match, a new conntrack table entry
+(\brcode{struct ip_conntrack}) is created.
+
+Let's assume the case where we don't have any existing connections yet and
+are starting from scratch.
+
+The first packet comes in, we derive the tuple from the packet headers, look up
+the conntrack hash table, don't find any matching entry. As a result, we
+create a new \brcode{struct ip_conntrack}. This \brcode{struct ip_conntrack} is filled with
+all necessary data, like the original and reply tuple of the connection.
+How do we know the reply tuple? By inverting the source and destination
+parts of the original tuple.\footnote{So why do we need two tuples, if they can
+be derived from each other? Wait until we discuss NAT.}
+Please note that this new \brcode{struct ip_conntrack} is \textbf{not} yet placed
+into the conntrack hash table.
+
+The packet is now passed on to other callback functions which have registered
+with a lower priority at \lident{NF_IP_PRE_ROUTING}. It then continues traversal of
+the network stack as usual, including all respective netfilter hooks.
+
+If the packet survives (i.e., is not dropped by the routing code, network stack,
+firewall ruleset, \ldots), it re-appears at \lident{NF_IP_POST_ROUTING}. In this case,
+we can now safely assume that this packet will be sent off on the outgoing
+interface, and thus put the connection tracking entry which we created at
+\lident{NF_IP_PRE_ROUTING} into the conntrack hash table. This process is called
+\textit{confirming the conntrack}.
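The lookup, create and confirm steps described above can be condensed into a short Python sketch. It is a simplified model and not the kernel code: `Tuple5`, `Conntrack`, `ConntrackTable` and the method names are hypothetical, a dictionary stands in for the conntrack hash table, and NAT (where the reply tuple is not simply the inverted original) is ignored.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Tuple5:
    """One direction of a flow: addresses, ports and layer 4 protocol number."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    l4proto: int


def invert(t: Tuple5) -> Tuple5:
    """Swap source and destination parts to obtain the expected reply tuple."""
    return Tuple5(t.dst_ip, t.dst_port, t.src_ip, t.src_port, t.l4proto)


@dataclass
class Conntrack:
    original: Tuple5
    reply: Tuple5
    confirmed: bool = False


class ConntrackTable:
    def __init__(self) -> None:
        self.by_tuple: Dict[Tuple5, Conntrack] = {}

    def pre_routing(self, pkt: Tuple5) -> Conntrack:
        """Find the entry for a packet, or create a new, unconfirmed one."""
        ct = self.by_tuple.get(pkt)
        if ct is None:
            ct = Conntrack(original=pkt, reply=invert(pkt))  # not hashed yet
        return ct

    def post_routing(self, ct: Conntrack) -> None:
        """Confirm the entry: the packet survived, so make it findable."""
        if not ct.confirmed:
            self.by_tuple[ct.original] = ct
            self.by_tuple[ct.reply] = ct
            ct.confirmed = True


table = ConntrackTable()
syn = Tuple5("192.0.2.10", 40000, "198.51.100.5", 80, 6)
ct = table.pre_routing(syn)
hashed_before_confirm = syn in table.by_tuple
table.post_routing(ct)
```

Deferring the insertion until the second hook means a packet dropped by the ruleset never leaves an entry behind in the table.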
+
+The connection tracking code itself is not monolithic, but consists of a
+couple of separate modules\footnote{They don't actually have to be separate
+kernel modules; e.g.\ TCP, UDP, and ICMP tracking modules are all part of
+the linux kernel module \ident{ip_conntrack.o}.}. Besides the conntrack core,
+there are two important kinds of modules: Protocol helpers and application
+helpers.
+
+Protocol helpers implement the layer-4-protocol specific parts. They currently
+exist for TCP, UDP, and ICMP (an experimental helper for GRE exists).
+
+\subsubsection{TCP connection tracking}
+
+As TCP is a connection oriented protocol, it is not very difficult to imagine
+how connection tracking for this protocol could work. There are well-defined
+state transitions possible, and conntrack can decide which state transitions
+are valid within the TCP specification. In reality it's not all that easy,
+since we cannot assume that all packets that pass the packet filter actually
+arrive at the receiving end\ldots
+
+It is noteworthy that the standard connection tracking code does \textbf{not}
+do TCP sequence number and window tracking. A well-maintained patch to add
+this feature has existed for almost as long as connection tracking itself. It
+will be integrated with the 2.5.x kernel. The problem with window tracking is
+its bad interaction with connection pickup. The TCP conntrack code is able to
+pick up already existing connections, e.g.\ in case your firewall was rebooted.
+However, connection pickup conflicts with TCP window tracking: The TCP
+window scaling option is only transferred at connection setup time, and we
+don't know about it in case of pickup\ldots
+
+\subsubsection{ICMP tracking}
+
+ICMP is not really a connection oriented protocol. So how is it possible to
+do connection tracking for ICMP?
+
+The ICMP protocol can be split into two groups of messages:
+
+\begin{itemize}
+\item
+ICMP error messages, which sort-of belong to a different connection.
+ICMP error messages are marked as \textit{RELATED} to that connection.
+(\lident{ICMP_DEST_UNREACH}, \lident{ICMP_SOURCE_QUENCH},
+\lident{ICMP_TIME_EXCEEDED},
+\lident{ICMP_PARAMETERPROB}, \lident{ICMP_REDIRECT}).
+\item
+ICMP queries, which have a \ident{request-reply} character. Here the
+conntrack code lets the request have a state of \textit{NEW}, and the reply
+\textit{ESTABLISHED}. The reply closes the connection immediately.
+(\lident{ICMP_ECHO}, \lident{ICMP_TIMESTAMP}, \lident{ICMP_INFO_REQUEST}, \lident{ICMP_ADDRESS})
+\end{itemize}
+
+\subsubsection{UDP connection tracking}
+
+UDP is designed as a connectionless datagram protocol. But most common
+protocols using UDP as layer 4 protocol have bi-directional UDP communication.
+Imagine a DNS query, where the client sends a UDP datagram to port 53 of the
+nameserver, and the nameserver sends back a DNS reply packet from its UDP
+port 53 to the client.
+
+Netfilter treats this as a connection. The first packet (the DNS request) is
+assigned a state of \textit{NEW}, because the packet is expected to create a new
+`connection.' The DNS server's reply packet is marked as \textit{ESTABLISHED}.
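The DNS example can be traced with a few lines of Python. This is a sketch under assumptions, not the conntrack implementation: `track_udp` and the `flows` dictionary are hypothetical, timeouts are left out, and the model assumes that a flow only counts as established once a packet in the reply direction has been seen.

```python
def track_udp(flows, src, sport, dst, dport):
    """Classify one UDP packet as NEW or ESTABLISHED and update `flows`.

    `flows` maps the original-direction tuple to a "reply seen" flag.
    """
    key = (src, sport, dst, dport)
    reverse = (dst, dport, src, sport)
    if reverse in flows:
        # Packet travels in the reply direction of a known flow.
        flows[reverse] = True
        return "ESTABLISHED"
    if flows.get(key):
        # Original direction, and a reply has already been seen.
        return "ESTABLISHED"
    # First packet, or no reply seen so far.
    flows.setdefault(key, False)
    return "NEW"


flows = {}
client, server = "192.0.2.10", "198.51.100.53"
query = track_udp(flows, client, 40000, server, 53)
answer = track_udp(flows, server, 53, client, 40000)
followup = track_udp(flows, client, 40000, server, 53)
```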
+
+\subsubsection{conntrack application helpers}
+
+More complex application protocols involving multiple connections need special
+support by a so-called ``conntrack application helper module.'' Modules in
+the stock kernel come for FTP, IRC (DCC), TFTP and Amanda. Netfilter CVS currently contains
+%%% orig: ``tftp ald talk'' -- um, 'tftp and talk'? Yes, that's correct. It refers
+%%% to the talk protocol.
+patches for PPTP, H.323, Eggdrop botnet, mms, DirectX, RTSP and talk/ntalk. We're still lacking
+a lot of protocols (e.g.\ SIP, SMB/CIFS)---but they are unlikely to appear
+until somebody really needs them and either develops them on his own or
+funds development.
+
+\subsubsection{Integration of connection tracking with iptables}
+
+As stated earlier, conntrack doesn't impose any policy on packets. It just
+determines the relation of a packet to already existing connections.
+To base
+packet filtering decisions on this state information, the iptables \textit{state}
+match can be used. Every packet is within one of the following categories:
+
+\begin{itemize}
+\item
+\textbf{NEW}: packet would create a new connection, if it survives
+\item
+\textbf{ESTABLISHED}: packet is part of an already established connection
+(either direction)
+\item
+\textbf{RELATED}: packet is in some way related to an already established
+connection, e.g.\ ICMP errors or FTP data sessions
+\item
+\textbf{INVALID}: conntrack is unable to derive conntrack information
+from this packet. Please note that all multicast or broadcast packets
+fall in this category.
+\end{itemize}
+
+
+\subsection{Poor man's conntrack failover}
+
+When thinking about failover of stateful firewalls, one usually thinks about
+replication of state. This presumes that the state is gathered at one
+firewalling node (the currently active node), and replicated to several other
+passive standby nodes. There is, however, a very different approach to
+replication: concurrent state tracking on all firewalling nodes.
+
+While this scheme has not been implemented within \ident{ct_sync}, the author
+still thinks it is worth an explanation in this paper.
+
+The basic assumption of this approach is: In a setup where all firewalling
+nodes receive exactly the same traffic, all nodes will deduce the same state
+information.
+
+The implementability of this approach is totally dependent on fulfillment of
+this assumption.
+
+\begin{itemize}
+\item
+\textit{All packets need to be seen by all nodes}. This is not always true, but
+can be achieved by using shared media like traditional ethernet (no switches!!)
+and promiscuous mode on all ethernet interfaces.
+\item
+\textit{All nodes need to be able to process all packets}. This cannot be
+universally guaranteed. Even if the hardware (CPU, RAM, Chipset, NICs) and
+software (Linux kernel) are exactly the same, they might behave differently,
+especially under high load. To avoid those effects, the hardware should be
+able to deal with way more traffic than seen during operation. Also, there
+should be no userspace processes (like proxies, etc.) running on the firewalling
+nodes at all. WARNING: Nobody guarantees this behaviour. However, the poor
+man is usually not interested in scientific proof but in usability in his
+particular practical setup.
+\end{itemize}
+
+However, even if those conditions are fulfilled, there are remaining issues:
+\begin{itemize}
+\item
+\textit{No resynchronization after reboot}. If a node is rebooted (because of
+a hardware fault, software bug, software update, etc.) it loses all state
+information gathered before the reboot. The effects depend on the traffic.
+Generally, it is only assured that
+state information about all connections initiated after the reboot will be
+present. If connections are short-lived (like HTTP), the state
+information on the just-rebooted node will soon approximate the state information of
+an older node. Only after all sessions active at the time of reboot have
+terminated is state information guaranteed to be resynchronized.
+\item
+\textit{Only possible with shared medium}. The practical implication is that no
+switched ethernet (and thus no full duplex) can be used.
+\end{itemize}
+
+The major advantage of the poor man's approach is implementation simplicity.
+No state transfer mechanism needs to be developed. Only very few changes
+to the existing conntrack code would be needed in order to be able to
+do tracking based on packets received from promiscuous interfaces. The active
+node would have packet forwarding turned on, the passive nodes, off.
+
+I'm not proposing this as a real solution to the failover problem. It's
+hackish, buggy, and likely to break very easily. But considering it can be
+implemented in very little programming time, it could be an option for very
+small installations with low reliability criteria.
+
+\subsection{Conntrack state replication}
+
+The preferred solution to the failover problem is, without any doubt,
+replication of the connection tracking state.
+
+The proposed conntrack state replication solution consists of several
+parts:
+\begin{itemize}
+\item
+A connection tracking state replication protocol
+\item
+An event interface generating event messages as soon as state information
+changes on the active node
+\item
+An interface for explicit generation of connection tracking table entries on
+the standby slaves
+\item
+Some code (preferably a kernel thread) running on the active node, receiving
+state updates from the event interface and generating conntrack state replication
+protocol messages
+\item
+Some code (preferably a kernel thread) running on the slave node(s), receiving
+conntrack state replication protocol messages and updating the local conntrack
+table accordingly
+\end{itemize}
+
+Flow of events in chronological order:
+\begin{itemize}
+\item
+\textit{on active node, inside the network RX softirq}
+\begin{itemize}
+\item
+ \ident{ip_conntrack} analyzes a forwarded packet
+\item
+ \ident{ip_conntrack} gathers some new state information
+\item
+ \ident{ip_conntrack} updates conntrack hash table
+\item
+ \ident{ip_conntrack} calls event API
+\item
+ function registered to event API builds and enqueues message to send ring
+\end{itemize}
+\item
+\textit{on active node, inside the conntrack-sync sender kernel thread}
+ \begin{itemize}
+ \item
+ \ident{ct_sync_send} aggregates multiple messages into one packet
+ \item
+ \ident{ct_sync_send} dequeues packet from ring
+ \item
+ \ident{ct_sync_send} sends packet via in-kernel sockets API
+ \end{itemize}
+\item
+\textit{on slave node(s), inside network RX softirq}
+ \begin{itemize}
+ \item
+ \ident{ip_conntrack} ignores packets coming from the \ident{ct_sync} interface via NOTRACK mechanism
+ \item
+ UDP stack appends packet to socket receive queue of \ident{ct_sync_recv} kernel thread
+ \end{itemize}
+\item
+\textit{on slave node(s), inside conntrack-sync receive kernel thread}
+ \begin{itemize}
+ \item
+ \ident{ct_sync_recv} thread receives state replication packet
+ \item
+ \ident{ct_sync_recv} thread parses packet into individual messages
+ \item
+ \ident{ct_sync_recv} thread creates/updates local \ident{ip_conntrack} entry
+ \end{itemize}
+\end{itemize}
+
+
+\subsubsection{Connection tracking state replication protocol}
+
+
+ In order to be able to replicate the state between two or more firewalls, a
+state replication protocol is needed. This protocol is used over a private
+network segment shared by all nodes for state replication. It is designed to
+work over IP unicast and IP multicast transport. IP unicast will be used for
+direct point-to-point communication between one active firewall and one
+standby firewall. IP multicast will be used when the state needs to be
+replicated to more than one standby firewall.
+
+
+ The principal design criteria of this protocol are:
+\begin{itemize}
+\item
+ \textbf{reliable against data loss}, as the underlying UDP layer only
+ provides checksumming against data corruption, but doesn't employ any
+ means against data loss
+\item
+ \textbf{lightweight}, since generating the state update messages is
+ already a very expensive process for the sender, eating additional CPU,
+ memory, and IO bandwidth.
+\item
+ \textbf{easy to parse}, to minimize overhead at the receiver(s)
+\end{itemize}
+
+The protocol does not employ any security mechanism like encryption,
+authentication, or reliability against spoofing attacks. It is
+assumed that the private conntrack sync network is a secure communications
+channel, not accessible to any malicious third party.
+
+To achieve the reliability against data loss, an easy sequence numbering
+scheme is used. All protocol messages are prefixed by a sequence number,
+determined by the sender. If the slave detects packet loss by discontinuous
+sequence numbers, it can request the retransmission of the missing packets
+by stating the missing sequence number(s). Since there is no acknowledgement
+for successfully received packets, the sender has to keep a
+reasonably-sized\footnote{\textit{reasonable size} must be large enough for the
+round-trip time between master and slowest slave.} backlog of recently-sent
+packets in order to be able to fulfill retransmission
+requests.
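+
+The sequence numbering scheme can be illustrated with a small userspace
+sketch. It is not the \ident{ct_sync} code: the 32-bit sequence number, the
+backlog size, and the names \ident{backlog_store}, \ident{backlog_find}, and
+\ident{seq_gap} are assumptions made for illustration only.
+

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sizes, chosen for illustration only. */
#define BACKLOG_SLOTS 64u    /* must be a power of two */
#define PKT_MAX       1500u

struct backlog_slot {
    uint32_t seq;            /* sequence number of the stored packet */
    uint16_t len;            /* payload length in bytes */
    int      used;           /* 0 until the slot has been written once */
    uint8_t  data[PKT_MAX];
};

struct backlog {
    struct backlog_slot slot[BACKLOG_SLOTS];
};

/* Sender: remember a packet that was just sent. The slot is selected by
 * the low bits of the sequence number, so the oldest packet is overwritten. */
static void backlog_store(struct backlog *b, uint32_t seq,
                          const void *data, uint16_t len)
{
    struct backlog_slot *s = &b->slot[seq & (BACKLOG_SLOTS - 1)];

    assert(len <= PKT_MAX);
    s->seq = seq;
    s->len = len;
    s->used = 1;
    memcpy(s->data, data, len);
}

/* Sender: look up the packet named in a NACK. Returns NULL if the packet
 * has already been overwritten, i.e. the backlog was too small. */
static const struct backlog_slot *backlog_find(const struct backlog *b,
                                               uint32_t seq)
{
    const struct backlog_slot *s = &b->slot[seq & (BACKLOG_SLOTS - 1)];

    return (s->used && s->seq == seq) ? s : NULL;
}

/* Receiver: distance between the received and the expected sequence number.
 * 0 = in order, > 0 = that many packets are missing, < 0 = old packet.
 * The unsigned subtraction handles wraparound; the cast to a signed type
 * assumes the usual two's complement behaviour. */
static int32_t seq_gap(uint32_t expected, uint32_t received)
{
    return (int32_t)(received - expected);
}
```

+
+A slave would request retransmission whenever \ident{seq_gap} returns a
+positive value; a \ident{NULL} result from \ident{backlog_find} on the master
+means the backlog was too small for the observed round-trip time.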
+
+The different state replication protocol packet types are:
+\begin{itemize}
+\item
+\textbf{\ident{CT_SYNC_PKT_MASTER_ANNOUNCE}}: A new master announces itself.
+Any still existing master will downgrade itself to slave upon
+reception of this packet.
+\item
+\textbf{\ident{CT_SYNC_PKT_SLAVE_INITSYNC}}: A slave requests initial
+synchronization from the master (after reboot or loss of sync).
+\item
+\textbf{\ident{CT_SYNC_PKT_SYNC}}: A packet containing synchronization data
+from master to slaves
+\item
+\textbf{\ident{CT_SYNC_PKT_NACK}}: A slave indicates packet loss of a
+particular sequence number
+\end{itemize}
+
+The messages within a \lident{CT_SYNC_PKT_SYNC} packet always refer to a particular
+\textit{resource} (currently \lident{CT_SYNC_RES_CONNTRACK} and \lident{CT_SYNC_RES_EXPECT},
+although support for the latter has not been fully implemented yet).
+
+For every resource, there are several message types. So far, only
+\lident{CT_SYNC_MSG_UPDATE} and \lident{CT_SYNC_MSG_DELETE} have been implemented. This
+means a new connection as well as state changes to an existing connection will
+always be encapsulated in a \lident{CT_SYNC_MSG_UPDATE} message and therefore contain
+the full conntrack entry.
+
+To uniquely identify (and later reference) a conntrack entry, the only unique
+criterion is used: \ident{ip_conntrack_tuple}.
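+
+Because an update message carries the full entry and the tuple is the key,
+creating and modifying an entry collapse into one operation on the receiving
+side. The following userspace sketch shows the idea. The structure
+\ident{ct_tuple} is a simplified stand-in for \ident{ip_conntrack_tuple}, and
+the table layout and all function names are assumptions for illustration.
+

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 8  /* tiny on purpose; illustration only */

/* Simplified stand-in for ip_conntrack_tuple (assumption: the kernel
 * structure uses per-protocol unions instead of plain port fields). */
struct ct_tuple {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint8_t  protonum;
};

struct ct_entry {
    struct ct_tuple tuple;
    uint8_t state;   /* opaque state value carried by the message */
    int in_use;
};

static int ct_tuple_equal(const struct ct_tuple *a, const struct ct_tuple *b)
{
    return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
           a->src_port == b->src_port && a->dst_port == b->dst_port &&
           a->protonum == b->protonum;
}

/* Linear search by tuple; returns NULL if no entry matches. */
static struct ct_entry *ct_find(struct ct_entry *tbl, const struct ct_tuple *t)
{
    assert(tbl != NULL && t != NULL);
    for (size_t i = 0; i < TABLE_SIZE; i++)
        if (tbl[i].in_use && ct_tuple_equal(&tbl[i].tuple, t))
            return &tbl[i];
    return NULL;
}

/* UPDATE message: create the entry if it is unknown, then overwrite its
 * state. Returns 0 on success, -1 if the table is full. */
static int ct_apply_update(struct ct_entry *tbl, const struct ct_tuple *t,
                           uint8_t state)
{
    struct ct_entry *e = ct_find(tbl, t);

    if (e == NULL) {
        for (size_t i = 0; i < TABLE_SIZE && e == NULL; i++)
            if (!tbl[i].in_use)
                e = &tbl[i];
        if (e == NULL)
            return -1;
        e->tuple = *t;
        e->in_use = 1;
    }
    e->state = state;
    return 0;
}

/* DELETE message: returns 0 on success, -1 if the tuple is unknown. */
static int ct_apply_delete(struct ct_entry *tbl, const struct ct_tuple *t)
{
    struct ct_entry *e = ct_find(tbl, t);

    if (e == NULL)
        return -1;
    e->in_use = 0;
    return 0;
}
```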
+
+\subsubsection{\texttt{ct\_sync} sender thread}
+
+Maximum care needs to be taken for the implementation of the \ident{ct_sync} sender.
+
+The normal workload of the active firewall node is likely to be already very
+high, so generating and sending the conntrack state replication messages needs
+to be highly efficient.
+
+It was therefore decided to use a pre-allocated ringbuffer for outbound
+\ident{ct_sync} packets. New messages are appended to individual buffers in this
+ring, and pointers into this ring are passed to the in-kernel sockets API to
+ensure a minimum number of copies and memory allocations.
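+
+A minimal userspace sketch of such a pre-allocated ring is given below. It
+only models the bookkeeping: messages are aggregated into the current slot,
+and a full slot is handed out by pointer rather than copied. Slot count, slot
+size, and the function names are assumptions, and the locking that a kernel
+implementation would need between softirq and sender thread is left out.
+

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RING_SLOTS 4u    /* hypothetical; allocated once, never resized */
#define SLOT_BYTES 64u   /* hypothetical; a real slot would be near the MTU */

struct ring_slot {
    uint16_t used;               /* bytes of message data in this slot */
    uint8_t  data[SLOT_BYTES];
};

struct ring {
    struct ring_slot slot[RING_SLOTS];
    unsigned head;               /* slot currently being filled */
    unsigned tail;               /* next slot to be sent */
};

/* Producer: append one message to the current slot, moving on to the next
 * slot when it does not fit. Returns 0 on success, -1 if the message is
 * larger than a slot or the ring is full. */
static int ring_append(struct ring *r, const void *msg, uint16_t len)
{
    struct ring_slot *s;

    assert(r != NULL && msg != NULL);
    if (len > SLOT_BYTES)
        return -1;
    s = &r->slot[r->head];
    if (s->used + len > SLOT_BYTES) {
        unsigned next = (r->head + 1) % RING_SLOTS;

        if (next == r->tail)
            return -1;           /* sender has not caught up yet */
        r->head = next;
        s = &r->slot[next];
        s->used = 0;
    }
    memcpy(s->data + s->used, msg, len);
    s->used += len;
    return 0;
}

/* Consumer: pointer to the oldest non-empty slot, or NULL if there is
 * nothing to send. The data is not copied. */
static const struct ring_slot *ring_peek(const struct ring *r)
{
    const struct ring_slot *s = &r->slot[r->tail];

    return s->used ? s : NULL;
}

/* Consumer: mark the oldest slot as sent and free it for reuse. */
static void ring_release(struct ring *r)
{
    r->slot[r->tail].used = 0;
    if (r->tail != r->head)
        r->tail = (r->tail + 1) % RING_SLOTS;
}
```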
+
+\subsubsection{\texttt{ct\_sync} initsync sender thread}
+
+In order to facilitate ongoing state synchronization at the same time as
+responding to initial sync requests of an individual slave, the sender has a
+separate kernel thread for initial state synchronization (\ident{ct_sync_initsync}).
+
+At the moment it iterates over the state table and transmits packets with a
+fixed rate of about 1000 packets per second, resulting in about 4000
+connections per second, averaging to about 1.5 Mbps of bandwidth consumed.
+
+The speed of this initial sync should be configurable by the system
+administrator, especially since there is no flow control mechanism, and the
+slave node(s) will have to deal with the packets or otherwise lose sync again.
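+
+To give a feeling for what a configurable rate would mean in practice, the
+following helper estimates the duration of one initial sync. It is plain
+arithmetic on the figures quoted above; the function name and its parameters
+are assumptions and not part of \ident{ct_sync}.
+

```c
#include <assert.h>

/* Seconds needed to transfer `entries` conntrack entries during initial
 * sync, rounded up. Both rates are parameters because they would be the
 * knobs an administrator tunes. */
static unsigned long initsync_seconds(unsigned long entries,
                                      unsigned long pkts_per_sec,
                                      unsigned long conns_per_pkt)
{
    unsigned long conns_per_sec;

    assert(pkts_per_sec > 0 && conns_per_pkt > 0);
    conns_per_sec = pkts_per_sec * conns_per_pkt;
    return (entries + conns_per_sec - 1) / conns_per_sec;
}
```

+
+With the quoted rate of 1000 packets per second and four connections per
+packet, a table of 100,000 entries takes about 25 seconds to transfer.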
+
+This is certainly an area of future improvement and development---but first we
+want to see practical problems with this primitive scheme.
+
+\subsubsection{\texttt{ct\_sync} receiver thread}
+
+Implementation of the receiver is very straightforward.
+
+For performance reasons, and to facilitate code-reuse, the receiver uses the
+same pre-allocated ring buffer structure as the sender. Incoming packets are
+written into ring members and then successively parsed into their individual
+messages.
+
+Apart from dealing with lost packets, it just needs to call the
+respective conntrack add/modify/delete functions.
+
+\subsubsection{Necessary changes within netfilter conntrack core}
+
+To be able to achieve the described conntrack state replication mechanism,
+the following changes to the conntrack core were implemented:
+\begin{itemize}
+\item
+ Ability to exclude certain packets from being tracked. This was a
+ long-wanted feature on the TODO list of the netfilter project and is
+ implemented by having a ``raw'' table in combination with a
+ ``NOTRACK'' target.
+\item
+ Ability to register callback functions to be called every time a new
+ conntrack entry is created or an existing entry modified. This is
+ part of the nfnetlink-ctnetlink patch, since the ctnetlink event
+ interface also uses this API.
+\item
+ Export an API to externally add, modify, and remove conntrack entries.
+\end{itemize}
+
+Since the number of changes is very low, their inclusion into the mainline
+kernel is not a problem and can happen during the 2.6.x stable kernel series.
+
+
+\subsubsection{Layer 2 dropping and \texttt{ct\_sync}}
+
+In most cases, netfilter/iptables-based firewalls will not only function as
+packet filter but also run local processes such as proxies, dns relays, smtp
+relays, etc.
+
+In order to minimize failover time, it is helpful if the full startup and
+configuration of all network interfaces and all of those userspace processes
+can happen at system bootup time rather than at the instant of a failover.
+
+l2drop provides a convenient way to achieve this goal: It hooks into layer 2
+netfilter hooks (immediately attached to \ident{netif_rx()} and
+\ident{dev_queue_xmit}) and blocks all incoming and outgoing network packets at this
+very low layer. Even kernel-generated messages such as ARP replies, IPv6
+neighbour discovery, IGMP, \dots are blocked this way.
+
+Of course there has to be an exemption for the state synchronization messages
+themselves. In order to still facilitate remote administration via SSH and
+other communication between the cluster nodes, the whole network
+interface used for synchronization is subject to this exemption from
+l2drop.
+
+As soon as a node is promoted to master state, l2drop is disabled and the
+system becomes visible to the network.
+
+
+\subsubsection{Configuration}
+
+All configuration happens via module parameters.
+
+\begin{itemize}
+\item
+ \texttt{syncdev}: Name of the multicast-capable network device
+ used for state synchronization among the nodes
+\item
+ \texttt{state}: Initial state of the node (0=slave, 1=master)
+\item
+ \texttt{id}: Unique Node ID (0..255)
+\item
+ \texttt{l2drop}: Enable (1) or disable (0) the l2drop functionality
+\end{itemize}
+
+\subsubsection{Interfacing with the cluster manager}
+
+As indicated in the beginning of this paper, \ident{ct_sync} itself does not provide
+any mechanism to determine outage of the master node within a cluster. This
+job is left to a cluster manager software running in userspace.
+
+Once an outage of the master is detected, the cluster manager needs to elect
+one of the remaining (slave) nodes to become new master. On this elected node,
+the cluster manager will write the ASCII character \texttt{1} into the
+\ident{/proc/net/ct_sync} file. Reading from this file will return the current state
+of the local node.
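+
+From the point of view of a cluster manager, promotion is therefore a single
+file write. The sketch below shows one way to do it in C. The path is passed
+in as a parameter so that the code can be tried against an ordinary file. The
+text does not specify the format of what is read back, so returning the first
+byte is an assumption, as are the function names.
+

```c
#include <assert.h>
#include <stdio.h>

/* Write the ASCII character '1' to the state file (on a real node this
 * would be /proc/net/ct_sync). Returns 0 on success, -1 on error. */
static int ct_sync_promote(const char *path)
{
    FILE *f;

    assert(path != NULL);
    f = fopen(path, "w");
    if (f == NULL)
        return -1;
    if (fputc('1', f) == EOF) {
        fclose(f);
        return -1;
    }
    return fclose(f) == 0 ? 0 : -1;
}

/* Read the first byte of the state file. Returns the byte, or -1 if the
 * file cannot be opened or is empty. */
static int ct_sync_read_state(const char *path)
{
    FILE *f;
    int c;

    assert(path != NULL);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    c = fgetc(f);
    fclose(f);
    return c == EOF ? -1 : c;
}
```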
+
+\section{Acknowledgements}
+
+The author would like to thank his fellow netfilter developers for their
+help. Particularly important to \ident{ct_sync} is Krisztian KOVACS
+\ident{<hidden@balabit.hu>}, who did a proof-of-concept implementation based on my
+first paper on \ident{ct_sync} at OLS2002.
+
+Without the financial support of Astaro AG, I would not have been able to spend any
+time on \ident{ct_sync} at all.
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\end{document}
+
diff --git a/2004/netfilter-failover-lk2004/zrl.sty b/2004/netfilter-failover-lk2004/zrl.sty
new file mode 100644
index 0000000..fb97b03
--- /dev/null
+++ b/2004/netfilter-failover-lk2004/zrl.sty
@@ -0,0 +1,432 @@
+
+%%%%% This file is a kludge until such time as I learn to do it elegantly. Sorry.
+%% url - external. Intended for items which do not contain spaces, and
+%% containing global options for obeying & breaking at spaces. But
+%%       we need to change those things on the fly, so we're making a copy
+%% of url.sty and defining two extra groups, zrl and xrl, that
+%% permit handling these options on the fly.
+
+%% Thus you can mix url without obeyspaces and/or spaces with the following:
+%% zrl - url with obeyspaces,spaces turned on
+%% xrl - url with obeyspaces turned on
+
+% zrl.sty ver 1.4 02-Mar-1999 Donald Arseneau asnd@triumf.ca
+% Copyright 1996-1999 Donald Arseneau, Vancouver, Canada.
+% This program can be used, distributed, and modified under the terms
+% of the LaTeX Project Public License.
+%
+% A form of \verb that allows linebreaks at certain characters or
+% combinations of characters, accepts reconfiguration, and can usually
+% be used in the argument to another command. It is intended for email
+% addresses, hypertext links, directories/paths, etc., which normally
+% have no spaces. The font may be selected using the \zrlstyle command,
+% and new zrl-like commands can be defined using \zrldef.
+%
+% Usage: Conditions:
+% \zrl{ } If the argument contains any "%", "#", or "^^", or ends with
+% "\", it can't be used in the argument to another command.
+% The argument must not contain unbalanced braces.
+% \zrl| | ...where "|" is any character not used in the argument and not
+% "{" or a space. The same restrictions as above except that the
+% argument may contain unbalanced braces.
+% \xyz for "\xyz" a defined-zrl; this can be used anywhere, no matter
+% what characters it contains.
+%
+% See further instructions after "\endinput"
+%
+\def\Zrl@ttdo{% style assignments for tt fonts or T1 encoding
+\def\ZrlBreaks{\do\.\do\@\do\\\do\/\do\!\do\_\do\|\do\%\do\;\do\>\do\]%
+ \do\)\do\,\do\?\do\'\do\+\do\=}%
+\def\ZrlBigBreaks{\do\:\do@zrl@hyp}%
+\def\ZrlNoBreaks{\do\(\do\[\do\{\do\<}% (unnecessary)
+\def\ZrlSpecials{\do\ {\ }}%
+\def\ZrlOrds{\do\*\do\-\do\~}% any ordinary characters that aren't usually
+}
+
+\def\Xrl@ttdo{% style assignments for tt fonts or T1 encoding
+\def\XrlBreaks{\do\.\do\@\do\\\do\/\do\!\do\_\do\|\do\%\do\;\do\>\do\]%
+ \do\)\do\,\do\?\do\'\do\+\do\=}%
+\def\XrlBigBreaks{\do\:\do@xrl@hyp}%
+\def\XrlNoBreaks{\do\(\do\[\do\{\do\<}% (unnecessary)
+\def\XrlSpecials{\do\ {\ }}%
+\def\XrlOrds{\do\*\do\-\do\~}% any ordinary characters that aren't usually
+}
+
+\def\Zrl@do{% style assignments for OT1 fonts except tt
+\def\ZrlBreaks{\do\.\do\@\do\/\do\!\do\%\do\;\do\]\do\)\do\,\do\?\do\+\do\=}%
+\def\ZrlBigBreaks{\do\:\do@zrl@hyp}%
+\def\ZrlNoBreaks{\do\(\do\[\do\{}% prevents breaks after *next* character
+\def\ZrlSpecials{\do\<{\langle}\do\>{\mathbin{\rangle}}\do\_{\_%
+ \penalty\@m}\do\|{\mid}\do\{{\lbrace}\do\}{\mathbin{\rbrace}}\do
+ \\{\mathbin{\backslash}}\do\~{\raise.6ex\hbox{\m@th$\scriptstyle\sim$}}\do
+ \ {\ }}%
+\def\ZrlOrds{\do\'\do\"\do\-}%
+}
+\def\Xrl@do{% style assignments for OT1 fonts except tt
+\def\XrlBreaks{\do\.\do\@\do\/\do\!\do\%\do\;\do\]\do\)\do\,\do\?\do\+\do\=}%
+\def\XrlBigBreaks{\do\:\do@xrl@hyp}%
+\def\XrlNoBreaks{\do\(\do\[\do\{}% prevents breaks after *next* character
+\def\XrlSpecials{\do\<{\langle}\do\>{\mathbin{\rangle}}\do\_{\_%
+ \penalty\@m}\do\|{\mid}\do\{{\lbrace}\do\}{\mathbin{\rbrace}}\do
+ \\{\mathbin{\backslash}}\do\~{\raise.6ex\hbox{\m@th$\scriptstyle\sim$}}\do
+ \ {\ }}%
+\def\XrlOrds{\do\'\do\"\do\-}%
+}
+
+
+\def\zrl@ttstyle{%
+\@ifundefined{selectfont}{\def\ZrlFont{\tt}}{\def\ZrlFont{\ttfamily}}\Zrl@ttdo
+}
+\def\xrl@ttstyle{%
+\@ifundefined{selectfont}{\def\XrlFont{\tt}}{\def\XrlFont{\ttfamily}}\Xrl@ttdo
+}
+
+
+\def\zrl@rmstyle{%
+\@ifundefined{selectfont}{\def\ZrlFont{\rm}}{\def\ZrlFont{\rmfamily}}\Zrl@do
+}
+\def\xrl@rmstyle{%
+\@ifundefined{selectfont}{\def\XrlFont{\rm}}{\def\XrlFont{\rmfamily}}\Xrl@do
+}
+
+
+\def\zrl@sfstyle{%
+\@ifundefined{selectfont}{\def\ZrlFont{\sf}}{\def\ZrlFont{\sffamily}}\Zrl@do
+}
+\def\xrl@sfstyle{%
+\@ifundefined{selectfont}{\def\XrlFont{\sf}}{\def\XrlFont{\sffamily}}\Xrl@do
+}
+
+
+\def\zrl@samestyle{\ifdim\fontdimen\thr@@\font=\z@ \zrl@ttstyle \else
+ \zrl@rmstyle \fi \def\ZrlFont{}}
+\def\xrl@samestyle{\ifdim\fontdimen\thr@@\font=\z@ \xrl@ttstyle \else
+ \xrl@rmstyle \fi \def\XrlFont{}}
+
+\@ifundefined{strip@prefix}{\def\strip@prefix#1>{}}{}
+\@ifundefined{verbatim@nolig@list}{\def\verbatim@nolig@list{\do\`}}{}
+
+\def\Zrl{%
+ \begingroup \let\zrl@moving\relax\relax \endgroup
+ \ifmmode\@nomatherr$\fi
+ \ZrlFont $\fam\z@ \textfont\z@\font
+ \let\do\@makeother \dospecials % verbatim catcodes
+ \catcode`{\@ne \catcode`}\tw@ \catcode`\ 10 % except braces and spaces
+ \medmuskip0mu \thickmuskip\medmuskip \thinmuskip\medmuskip
+ \@tempcnta\fam\multiply\@tempcnta\@cclvi
+ \let\do\set@mathcode \ZrlOrds % ordinary characters that were special
+ \advance\@tempcnta 8192 \ZrlBreaks % bin
+ \advance\@tempcnta 4096 \ZrlBigBreaks % rel
+ \advance\@tempcnta 4096 \ZrlNoBreaks % open
+ \let\do\set@mathact \ZrlSpecials % active
+ \let\do\set@mathnolig \verbatim@nolig@list % prevent ligatures
+ \@ifnextchar\bgroup\Zrl@z\Zrl@y}
+
+\def\Zrl@y#1{\catcode`{11 \catcode`}11
+ \def\@tempa##1#1{\Zrl@z{##1}}\@tempa}
+\def\Zrl@z#1{\def\@tempa{#1}\expandafter\expandafter\expandafter\Zrl@Hook
+ \expandafter\strip@prefix\meaning\@tempa\ZrlRight\m@th$\endgroup}
+\def\Zrl@Hook{\ZrlLeft}
+\let\ZrlRight\@empty
+\let\ZrlLeft\@empty
+
+\def\Xrl{%
+ \begingroup \let\xrl@moving\relax\relax \endgroup
+ \ifmmode\@nomatherr$\fi
+ \XrlFont $\fam\z@ \textfont\z@\font
+ \let\do\@makeother \dospecials % verbatim catcodes
+ \catcode`{\@ne \catcode`}\tw@ \catcode`\ 10 % except braces and spaces
+ \medmuskip0mu \thickmuskip\medmuskip \thinmuskip\medmuskip
+ \@tempcnta\fam\multiply\@tempcnta\@cclvi
+ \let\do\set@mathcode \XrlOrds % ordinary characters that were special
+ \advance\@tempcnta 8192 \XrlBreaks % bin
+ \advance\@tempcnta 4096 \XrlBigBreaks % rel
+ \advance\@tempcnta 4096 \XrlNoBreaks % open
+ \let\do\set@mathact \XrlSpecials % active
+ \let\do\set@mathnolig \verbatim@nolig@list % prevent ligatures
+ \@ifnextchar\bgroup\Xrl@z\Xrl@y}
+
+\def\Xrl@y#1{\catcode`{11 \catcode`}11
+ \def\@tempa##1#1{\Xrl@z{##1}}\@tempa}
+\def\Xrl@z#1{\def\@tempa{#1}\expandafter\expandafter\expandafter\Xrl@Hook
+ \expandafter\strip@prefix\meaning\@tempa\XrlRight\m@th$\endgroup}
+\def\Xrl@Hook{\XrlLeft}
+\let\XrlRight\@empty
+\let\XrlLeft\@empty
+
+
+\def\set@mathcode#1{\count@`#1\advance\count@\@tempcnta\mathcode`#1\count@}
+\def\set@mathact#1#2{\mathcode`#132768 \lccode`\~`#1\lowercase{\def~{#2}}}
+\def\set@mathnolig#1{\ifnum\mathcode`#1<32768
+ \lccode`\~`#1\lowercase{\edef~{\mathchar\number\mathcode`#1_{\/}}}%
+ \mathcode`#132768 \fi}
+
+\def\zrldef#1#2{\begingroup \setbox\z@\hbox\bgroup
+ \def\Zrl@z{\Zrl@def{#1}{#2}}#2}
+\expandafter\ifx\csname DeclareRobustCommand\endcsname\relax
+ \def\Zrl@def#1#2#3{\m@th$\endgroup\egroup\endgroup
+ \def#1{#2{#3}}}
+\else
+ \def\Zrl@def#1#2#3{\m@th$\endgroup\egroup\endgroup
+ \DeclareRobustCommand{#1}{#2{#3}}}
+\fi
+
+\def\xrldef#1#2{\begingroup \setbox\z@\hbox\bgroup
+ \def\Xrl@z{\Xrl@def{#1}{#2}}#2}
+\expandafter\ifx\csname DeclareRobustCommand\endcsname\relax
+ \def\Xrl@def#1#2#3{\m@th$\endgroup\egroup\endgroup
+ \def#1{#2{#3}}}
+\else
+ \def\Xrl@def#1#2#3{\m@th$\endgroup\egroup\endgroup
+ \DeclareRobustCommand{#1}{#2{#3}}}
+\fi
+
+\def\zrlstyle#1{\csname zrl@#1style\endcsname}
+\def\xrlstyle#1{\csname xrl@#1style\endcsname}
+
+% Sample (and default) configuration:
+%
+\newcommand\zrl{\begingroup \Zrl}
+\newcommand\xrl{\begingroup \Xrl}
+%
+% picTeX defines \path, so declare it optionally:
+\@ifundefined{path}{\newcommand\path{\begingroup \zrlstyle{tt}\Zrl}}{}
+\@ifundefined{path}{\newcommand\path{\begingroup \xrlstyle{tt}\Xrl}}{}
+%
+% too many styles define \email like \address, so I will not define it.
+% \newcommand\email{\begingroup \zrlstyle{rm}\Zrl}
+
+% Process LaTeX \package options
+%
+\zrlstyle{tt}
+%\let\Zrl@sppen\@M
+\def\do@zrl@hyp{}% by default, no breaks after hyphens
+%%%%%
+\let\Zrl@sppen\relpenalty
+\let\Zrl@Hook\relax
+\xrlstyle{tt}
+\let\Xrl@sppen\@M
+\def\do@xrl@hyp{}% by default, no breaks after hyphens
+\let\Xrl@Hook\relax
+%%%%%
+\@ifundefined{ProvidesPackage}{}{
+ \ProvidesPackage{zrl}[1999/03/02 \space ver 1.4 \space
+ Verb mode for zrls, email addresses, and file names]
+ \DeclareOption{hyphens}{\def\do@zrl@hyp{\do\-}\def\do@xrl@hyp{\do\-}}% allow breaks after hyphens
+ \DeclareOption{obeyspaces}{\let\Zrl@Hook\relax\let\Xrl@Hook\relax}% a flag for later
+ \DeclareOption{spaces}{\let\Zrl@sppen\relpenalty}
+ \DeclareOption{T1}{\let\Zrl@do\Zrl@ttdo\let\Xrl@do\Xrl@ttdo}
+ \ProcessOptions
+\ifx\Zrl@Hook\relax % [obeyspaces] was declared
+ \def\Zrl@Hook#1\ZrlRight\m@th{\edef\@tempa{\noexpand\ZrlLeft
+ \Zrl@retain#1\Zrl@nosp\, }\@tempa\ZrlRight\m@th}
+ \def\Zrl@retain#1 {#1\penalty\Zrl@sppen\ \Zrl@retain}
+ \def\Zrl@nosp\,#1\Zrl@retain{}
+\fi
+\ifx\Xrl@Hook\relax % [obeyspaces] was declared
+ \def\Xrl@Hook#1\XrlRight\m@th{\edef\@tempa{\noexpand\XrlLeft
+ \Xrl@retain#1\Xrl@nosp\, }\@tempa\XrlRight\m@th}
+ \def\Xrl@retain#1 {#1\penalty\Xrl@sppen\ \Xrl@retain}
+ \def\Xrl@nosp\,#1\Xrl@retain{}
+\fi
+}
+
+\edef\zrl@moving{\csname Zrl Error\endcsname}
+\expandafter\edef\zrl@moving
+ {\csname zrl used in a moving argument.\endcsname}
+\expandafter\expandafter\expandafter \let \zrl@moving\undefined
+
+\edef\xrl@moving{\csname Xrl Error\endcsname}
+\expandafter\edef\xrl@moving
+ {\csname xrl used in a moving argument.\endcsname}
+\expandafter\expandafter\expandafter \let \xrl@moving\undefined
+
+\endinput
+%
+% zrl.sty ver 1.4 02-Mar-1999 Donald Arseneau asnd@reg.triumf.ca
+%
+% This package defines "\zrl", a form of "\verb" that allows linebreaks,
+% and can often be used in the argument to another command. It can be
+% configured to print in different formats, and is particularly useful for
+% hypertext links, email addresses, directories/paths, etc. The font may
+% be selected using the "\zrlstyle" command and pre-defined text can be
+% stored with the "\zrldef" command. New zrl-like commands can be defined,
+% and a "\path" command is provided this way.
+%
+% Usage: Conditions:
+% \zrl{ } If the argument contains any "%", "#", or "^^", or ends with
+% "\", it can't be used in the argument to another command.
+% The argument must not contain unbalanced braces.
+% \zrl| | ...where "|" is any character not used in the argument and not
+% "{" or a space. The same restrictions as above except that the
+% argument may contain unbalanced braces.
+% \xyz for "\xyz" a defined-zrl; this can be used anywhere, no matter
+% what characters it contains.
+%
+% The "\zrl" command is fragile, and its argument is likely to be very
+% fragile, but a defined-zrl is robust.
+%
+% Package Option: obeyspaces
+% Ordinarily, all spaces are ignored in the zrl-text. The "[obeyspaces]"
+% option allows spaces, but may introduce spurious spaces when a zrl
+% containing "\" characters is given in the argument to another command.
+% So if you need to obey spaces you can say "\usepackage[obeyspaces]{zrl}",
+% and if you need both spaces and backslashes, use a `defined-zrl' for
+% anything with "\".
+%
+% Package Option: hyphens
+% Ordinarily, breaks are not allowed after "-" characters because this
+% leads to confusion. (Is the "-" part of the address or just a hyphen?)
+% The package option "[hyphens]" allows breaks after explicit hyphen
+% characters. The "\zrl" command will *never ever* hyphenate words.
+%
+% Package Option: spaces
+% Likewise, breaks are not usually allowed after spaces under the
+% "[obeyspaces]" option, but giving the options "[obeyspaces,spaces]"
+% will allow breaks at those spaces.
+%
+% Package Option: T1
+% This signifies that you will be using T1-encoded fonts which contain
+% some characters missing from most older (OT1) encoded TeX fonts. This
+% changes the default definition for "\zrlstyle{rm}".
+%
+% Defining a defined-zrl:
+% Take for example the email address "myself%node@gateway.net" which could
+% not be given (using "\zrl" or "\verb") in a caption or parbox due to the
+% percent sign. This address can be predefined with
+% \zrldef{\myself}\zrl{myself%node@gateway.net} or
+% \zrldef{\myself}\zrl|myself%node@gateway.net|
+% and then you may use "\myself" instead of "\zrl{myself%node@gateway.net}"
+% in an argument, and even in a moving argument like a caption because a
+% defined-zrl is robust.
+%
+% Style:
+% You can switch the style of printing using "\zrlstyle{tt}", where "tt"
+% can be any defined style. The pre-defined styles are "tt", "rm", "sf",
+% and "same" which all allow the same linebreaks but different fonts --
+% the first three select a specific font and the "same" style uses the
+% current text font. You can define your own styles with different fonts
+% and/or line-breaking by following the explanations below. The "\zrl"
+% command follows whatever the currently-set style dictates.
+%
+% Alternate commands:
+% It may be desirable to have different things treated differently, each
+% in a predefined style; e.g., if you want directory paths to always be
+% in tt and email addresses to be rm, then you would define new zrl-like
+% commands as follows:
+%
+% \newcommand\email{\begingroup \zrlstyle{rm}\Zrl}
+% \newcommand\directory{\begingroup \zrlstyle{tt}\Zrl}
+%
+% You must follow this format closely, and NOTE that the final command is
+% "\Zrl", not "\zrl". In fact, the "\directory" example is exactly the
+% "\path" definition which is pre-defined in the package. If you look
+% above, you will see that "\zrl" is defined with
+% \newcommand\zrl{\begingroup \Zrl}
+% I.e., using whatever zrl-style has been selected.
+%
+% You can make a defined-zrl for these other styles, using the usual
+% "\zrldef" command as in this example:
+%
+% \zrldef{\myself}{\email}{myself%node.domain@gateway.net}
+%
+% which makes "\myself" act like "\email{myself%node.domain@gateway.net}",
+% if the "\email" command is defined as above. The "\myself" command
+% would then be robust.
+%
+% Defining styles:
+% Before describing how to customize the printing style, it is best to
+% mention something about the unusual implementation of "\zrl". Although
+% the material is textual in nature, and the font specification required
+% is a text-font command, the text is actually typeset in *math* mode.
+% This allows the context-sensitive linebreaking, but also accounts for
+% the default behavior of ignoring spaces. Now on to defining styles.
+%
+% To change the font or the list of characters that allow linebreaks, you
+% could redefine the commands "\ZrlFont", "\ZrlBreaks", "\ZrlSpecials" etc.
+% directly in the document, but it is better to define a new `zrl-style'
+% (following the example of "\zrl@ttstyle" and "\zrl@rmstyle") which defines
+% all of "\ZrlBigBreaks", "\ZrlNoBreaks", "\ZrlBreaks", "\ZrlSpecials", and
+% "\ZrlFont".
+%
+% Changing font:
+% The "\ZrlFont" command selects the font. The definition of "\ZrlFont"
+% done by the pre-defined styles varies to cope with a variety of LaTeX
+% font selection schemes, but it could be as simple as "\def\ZrlFont{\tt}".
+% Depending on the font selected, some characters may need to be defined
+% in the "\ZrlSpecials" list because many fonts don't contain all the
+% standard input characters.
+%
+% Changing linebreaks:
+% The list of characters that allow line-breaks is given by "\ZrlBreaks"
+% and "\ZrlBigBreaks", which have the format "\do\c" for character "c".
+% The differences are that `BigBreaks' have a lower penalty and have
+% different breakpoints when in sequence (as in "http://"): `BigBreaks'
+% are treated as mathrels while `Breaks' are mathbins (see The TeXbook,
+% p.170). In particular, a series of `BigBreak' characters will break at
+% the end and only at the end; a series of `Break' characters will break
+% after the first and after every following *pair*; there will be no
+% break after a `Break' character if a `BigBreak' follows. In the case
+% of "http://" it doesn't matter whether ":" is a `Break' or `BigBreak' --
+% the breaks are the same in either case; but for DECnet nodes with "::"
+% it is important to prevent breaks *between* the colons, and that is why
+% colons are `BigBreaks'.
+%
+% It is possible for characters to prevent breaks after the next following
+% character (I use this for parentheses). Specify these in "\ZrlNoBreaks".
+%
+% You can do arbitrarily complex things with characters by making them
+% active in math mode (mathcode hex-8000) and specifying the definition(s)
+% in "\ZrlSpecials". This is used in the rm and sf styles for OT1 font
+% encoding to handle several characters that are not present in those
+% computer-modern style fonts. See the definition of "\Zrl@do", which
+% is used by both "\zrl@rmstyle" and "\zrl@sfstyle"; it handles missing
+% characters via "\ZrlSpecials". The nominal format for setting each
+% special character "c" is: "\do\c{<definition>}", but you can include
+% other definitions too.
+%
+%
+% If all this sounds confusing ... well, it is! But I hope you won't need
+% to redefine breakpoints -- the default assignments seem to work well for
+% a wide variety of applications. If you do need to make changes, you can
+% test for breakpoints using regular math mode and the characters "+=(a".
+%
+% Yet more flexibility:
+% You can also customize the verbatim text by defining "\ZrlRight" and/or
+% "\ZrlLeft", e.g., for ISO formatting of zrls surrounded by "< >", define
+%
+% \renewcommand\zrl{\begingroup \def\ZrlLeft{<zrl: }\def\ZrlRight{>}%
+% \zrlstyle{tt}\Zrl}
+%
+% The meanings of "\ZrlLeft" and "\ZrlRight" are *not* reproduced verbatim.
+% This lets you use formatting commands there, but you must be careful not
+% to use TeX's special characters ("\^_%~#$&{}" etc.) improperly.
+% You can also define "\ZrlLeft" to reprocess the verbatim text, but the
+% format of the definition is special:
+%
+% \def\ZrlLeft#1\ZrlRight{ ... do things with #1 ... }
+%
+% Yes, that is "#1" followed by "\ZrlRight" then the definition. For
+% example, to put a hyperTeX hypertext link in the DVI file:
+%
+% \def\ZrlLeft#1\ZrlRight{\special{html:<a href="#1">}#1\special{html:</a>}}
+%
+% Using this technique, zrl.sty can provide a convenient interface for
+% performing various operations on verbatim text. You don't even need
+% to print out the argument! For greatest efficiency in such obscure
+% applications, you can define a null zrl-style where all the lists like
+% "\ZrlBreaks" are empty.
+%
+% Revision History:
+% ver 1.1 6-Feb-1996:
+% Fix hyphens that wouldn't break and ligatures that weren't suppressed.
+% ver 1.2 19-Oct-1996:
+% Package option for T1 encoding; Hooks: "\ZrlLeft" and "\ZrlRight".
+% ver 1.3 21-Jul-1997:
+% Prohibit spaces as delimiter characters; change ascii tilde in OT1.
+% ver 1.4 02-Mar-1999
+% LaTeX license; moving-argument-error
+% The End
+
+Test file integrity: ASCII 32-57, 58-126: !"#$%&'()*+,-./0123456789
+:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~