Diffstat (limited to '2004/linux2.6-networktour-lb2004')
-rw-r--r-- 2004/linux2.6-networktour-lb2004/abstract 4
-rw-r--r-- 2004/linux2.6-networktour-lb2004/linux2.6-networktour-lb2004.mgp 236
2 files changed, 240 insertions, 0 deletions
diff --git a/2004/linux2.6-networktour-lb2004/abstract b/2004/linux2.6-networktour-lb2004/abstract
new file mode 100644
index 0000000..ae466a0
--- /dev/null
+++ b/2004/linux2.6-networktour-lb2004/abstract
@@ -0,0 +1,4 @@
+Linux-based systems are known for performance and reliability in the area of networking. This presentation will give a tour through the Linux 2.6 kernel network stack, its structure and implementation. Some of the topics covered
+are: Network hardware drivers, core network functions, IPv4 protocol stack,
+destination cache, neighbour cache, sockets implementation, zero-copy TCP.
+
diff --git a/2004/linux2.6-networktour-lb2004/linux2.6-networktour-lb2004.mgp b/2004/linux2.6-networktour-lb2004/linux2.6-networktour-lb2004.mgp
new file mode 100644
index 0000000..7c52001
--- /dev/null
+++ b/2004/linux2.6-networktour-lb2004/linux2.6-networktour-lb2004.mgp
@@ -0,0 +1,236 @@
+%include "default.mgp"
+%default 1 bgrad
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+%nodefault
+%back "blue"
+
+
+
+%center
+%size 7
+A tour of the
+Linux 2.6 network stack
+
+
+%center
+%size 4
+by
+
+Harald Welte <laforge@hmw-consulting.de>
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Linux 2.6 Network Tour
+Contents
+
+
+ Introduction
+ Hardirq Context
+ Hard Interrupt Handler
+ Softirq Context
+ Network RX Softirq
+ IPv4 Packet Handler
+ IPv4 Packet Forwarding
+ IPv4 Packet Output
+ Driver TX routine
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Linux 2.6 Network Tour
+Introduction
+
+
+Who is speaking to you?
+ an independent Free Software developer
+ who has earned his living from Free Software since 1997
+ who is one of the authors of the Linux kernel firewall system called netfilter/iptables
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Linux 2.6 Network Tour
+Interrupt context
+
+ Also called 'hardirq'
+ Triggered by an external interrupt to the CPU
+ Is not reentrant, because the irq is disabled before the handler is called
+ Should do only a minimum of work and return as quickly as possible
+
+ hardirq handler registered via request_irq()
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Linux 2.6 Network Tour
+Receive Interrupt
+
+ NIC receives packet for local MAC address
+ NIC issues interrupt
+ Interrupt is routed to one CPU
+ Kernel enters hardirq context and disables this irq on the local CPU
+ Driver's interrupt handler
+ allocates skb (struct sk_buff)
+ calls net/core/dev.c:netif_rx()
+ returns irqreturn_t
+ Kernel leaves hardirq context and reenables this irq
+
+ 2.6.x introduces NAPI for polling at high irq rates: netif_rx_schedule()
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Linux 2.6 Network Tour
+Softirq context
+
+ Softirq is the real workhorse of interrupts
+ Continues work where hardirq has finished
+ Can be interrupted by hardirq context
+ Can run in parallel on any number of CPUs
+
+ softirq handler registered via kernel/softirq.c:open_softirq()
+
+ softirqs need to be 'raised' by raise_softirq() from hardirq context
+ softirqs are scheduled
+ after hardirq context exits
+ from ksoftirqd in case there's too much work
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Linux 2.6 Network Tour
+Network RX Softirq
+
+
+ kernel/softirq.c:do_softirq()
+ generic softirq code
+ net/core/dev.c:net_rx_action()
+ function that is registered at open_softirq() time
+ net/core/dev.c:process_backlog()
+ dequeue skb from local CPU's backlog queue
+ uses a weighting scheme between different devices
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Linux 2.6 Network Tour
+netif_receive_skb()
+
+
+ net/core/dev.c:netif_receive_skb()
+ main network rx softirq workhorse
+ check if there are any netpoll users, if yes netpoll_rx()
+ if somebody requested skb rx timestamp, net_timestamp()
+ if interface is part of a bonding group, skb_bond()
+ tc ingress filtering: ing_filter()
+ packet diverter: handle_diverter()
+ bridging handler: net/core/dev.c:handle_bridge()
+ deliver to l3 protocol handler: net/core/dev.c:deliver_skb()
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Linux 2.6 Network Tour
+IPv4 packet handler
+
+
+ net/ipv4/ip_input.c:ip_rcv()
+ checksum check
+ size check
+ NF_IP_PRE_ROUTING netfilter hook
+ net/ipv4/ip_input.c:ip_rcv_finish()
+ net/ipv4/route.c:ip_route_input()
+ route/dst cache lookup
+ if lookup fails, ip_route_input_slow()
+ fib lookup
+ allocation of new dst_entry / rtable
+ include/net/dst.h:dst_input()
+ iterate over destination stack
+ call destination function of the respective stack items
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Linux 2.6 Network Tour
+IPv4 packet forwarding
+
+
+ net/ipv4/ip_forward.c:ip_forward()
+ xfrm4_policy_check()
+ router alert handling (ip_call_ra_chain)
+ ttl decrement
+ if route is redirect route, ip_rt_send_redirect()
+ call NF_IP_FORWARD netfilter hook
+ net/ipv4/ip_forward.c:ip_forward_finish()
+ increase statistics for snmp mib
+ include/net/dst.h:dst_output()
+ iterate over output functions of dst stack
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Linux 2.6 Network Tour
+IPv4 packet output
+
+
+ net/ipv4/ip_output.c:ip_output()
+ fragment packet via ip_fragment() if needed
+ net/ipv4/ip_output.c:ip_finish_output()
+ call netfilter NF_IP_POST_ROUTING hook
+ net/ipv4/ip_output.c:ip_finish_output2()
+ attach hardware header
+ call header cache output fn (if neighbour in cache)
+ net/core/dev.c:dev_queue_xmit()
+ or neighbour output function (if neighbour unknown)
+ net/core/neighbour.c:neigh_resolve_output()
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Linux 2.6 Network Tour
+dev_queue_xmit()
+
+
+ skb->dev->qdisc->enqueue()
+ enqueue into device's output queue
+ default: net/sched/sch_generic.c:pfifo_fast_enqueue()
+ net/sched/sch_generic.c:qdisc_restart():
+ dev->qdisc->dequeue()
+ dequeue skb from queue
+ dev->hard_start_xmit()
+ transmit skb via driver
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Linux 2.6 Network Tour
+Driver TX Routine
+
+ drivers/net/e1000/e1000_main.c:e1000_xmit_frame()
+ tons of workarounds for chip bugs
+ set up TX DMA descriptor
+ queue TX DMA descriptor to device hardware
+ return NETDEV_TX_OK
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Linux 2.6 Network Tour
+Thanks
+
+ Thanks to
+ Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
+ for implementing (one of?) the world's best TCP/IP stacks
+ Paul 'Rusty' Russell
+ for starting the netfilter/iptables project
+ for trusting me to maintain it today
+ Astaro AG
+ for sponsoring parts of my netfilter work
+ Free Software Foundation
+ for the GNU Project
+ for the GNU General Public License
+%size 3
+ The slides of this presentation are available at http://www.gnumonks.org/
+
+ Further Reading
+%size 3
+ The netfilter homepage http://www.netfilter.org/
+%size 3
+ The http://www.gpl-violations.org/ project
+
+
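
Note on the receive path described in the slides: the split between the hardirq handler (which only queues the skb via netif_rx()) and the RX softirq (which later drains the per-CPU backlog in process_backlog()) can be sketched in user-space C. This is a toy model under stated assumptions, not kernel code. Every toy_* name, the queue limit of 4 and the budget of 3 are invented for illustration and do not appear in the kernel or in the slides.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_BACKLOG_MAX 4   /* assumed queue limit, chosen only for this demo */

/* Hypothetical stand-in for struct sk_buff: just an id and a list link. */
struct toy_skb {
    int id;
    struct toy_skb *next;
};

/* Hypothetical stand-in for one CPU's backlog queue. */
struct toy_backlog {
    struct toy_skb *head, *tail;
    int len;
    int softirq_pending;    /* models "the RX softirq has been raised" */
};

/* "hardirq" side: do the minimum, i.e. queue the packet and raise the softirq. */
static int toy_netif_rx(struct toy_backlog *q, struct toy_skb *skb)
{
    if (q->len >= TOY_BACKLOG_MAX)
        return -1;          /* backlog full: packet is dropped */
    skb->next = NULL;
    if (q->tail)
        q->tail->next = skb;
    else
        q->head = skb;
    q->tail = skb;
    q->len++;
    q->softirq_pending = 1;
    return 0;
}

static int toy_delivered;   /* counts packets handed to the "protocol handler" */

static void toy_deliver(struct toy_skb *skb)
{
    (void)skb;              /* a real handler would parse the packet here */
    toy_delivered++;
}

/* "softirq" side: drain at most `budget` packets, then yield. */
static int toy_process_backlog(struct toy_backlog *q, int budget)
{
    int done = 0;

    while (q->head && done < budget) {
        struct toy_skb *skb = q->head;

        q->head = skb->next;
        if (!q->head)
            q->tail = NULL;
        q->len--;
        toy_deliver(skb);
        done++;
    }
    if (!q->head)
        q->softirq_pending = 0;   /* nothing left, no need to run again */
    return done;
}

/* Six packets arrive back to back; two overflow, four are delivered in two passes. */
int toy_demo(void)
{
    struct toy_backlog q = { NULL, NULL, 0, 0 };
    struct toy_skb pkts[6];
    int i, dropped = 0;

    toy_delivered = 0;
    for (i = 0; i < 6; i++) {
        pkts[i].id = i;
        if (toy_netif_rx(&q, &pkts[i]) < 0)
            dropped++;
    }
    assert(dropped == 2);
    assert(toy_process_backlog(&q, 3) == 3);   /* budget exhausted */
    assert(q.softirq_pending == 1);            /* work remains */
    assert(toy_process_backlog(&q, 3) == 1);   /* queue drained */
    assert(q.softirq_pending == 0);
    return toy_delivered;
}
```

Calling toy_demo() from a main() returns the number of packets delivered. The budget parameter is only a loose analogy to the weighting scheme the slides mention: it bounds how much work one softirq pass does before yielding.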