Diffstat (limited to '2004/linux2.6-networktour-lb2004')
-rw-r--r-- | 2004/linux2.6-networktour-lb2004/abstract | 4
-rw-r--r-- | 2004/linux2.6-networktour-lb2004/linux2.6-networktour-lb2004.mgp | 236
2 files changed, 240 insertions, 0 deletions
diff --git a/2004/linux2.6-networktour-lb2004/abstract b/2004/linux2.6-networktour-lb2004/abstract
new file mode 100644
index 0000000..ae466a0
--- /dev/null
+++ b/2004/linux2.6-networktour-lb2004/abstract
@@ -0,0 +1,4 @@
+Linux-based systems are known for performance and reliability in the area of networking. This presentation will give a tour through the Linux 2.6 kernel network stack, its structure and implementation. Some of the topics covered
+are: Network hardware drivers, core network functions, IPv4 protocol stack,
+destination cache, neighbour cache, sockets implementation, zero-copy TCP.
+
diff --git a/2004/linux2.6-networktour-lb2004/linux2.6-networktour-lb2004.mgp b/2004/linux2.6-networktour-lb2004/linux2.6-networktour-lb2004.mgp
new file mode 100644
index 0000000..7c52001
--- /dev/null
+++ b/2004/linux2.6-networktour-lb2004/linux2.6-networktour-lb2004.mgp
@@ -0,0 +1,236 @@
+%include "default.mgp"
+%default 1 bgrad
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+%nodefault
+%back "blue"
+
+
+
+%center
+%size 7
+A tour of the
+Linux 2.6 network stack
+
+
+%center
+%size 4
+by
+
+Harald Welte <laforge@hmw-consulting.de>
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Linux 2.6 Network Tour
+Contents
+
+
+	Introduction
+	Hardirq Context
+	Hard Interrupt Handler
+	Softirq Context
+	Network RX Softirq
+	IPv4 Packet Handler
+	IPv4 Packet Forwarding
+	IPv4 Packet Output
+	Driver TX routine
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Linux 2.6 Network Tour
+Introduction
+
+
+Who is speaking to you?
+	an independent Free Software developer
+	who has earned his living from Free Software since 1997
+	who is one of the authors of the Linux kernel firewall system called netfilter/iptables
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Linux 2.6 Network Tour
+Interrupt context
+
+	Also called 'hardirq'
+	Triggered by external interrupt to the cpu
+	Is not reentrant, because the irq is disabled before handler is called
+	Should only do a minimum of work and leave as fast as possible
+
+	hardirq handler registered via request_irq()
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Linux 2.6 Network Tour
+Receive Interrupt
+
+	NIC receives packet for local mac address
+	NIC issues interrupt
+	Interrupt is routed to one CPU
+	Kernel enters hardirq context and disables this irq on local cpu
+	Driver's interrupt handler
+		allocates skb (struct sk_buff)
+		calls net/core/dev.c:netif_rx()
+		returns irqreturn_t
+	Kernel leaves hardirq context and reenables this irq
+
+	2.6.x introduces NAPI for polling at high irq rates: netif_rx_schedule()
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Linux 2.6 Network Tour
+Softirq context
+
+	Softirq is the real workhorse of interrupts
+	Continues work where hardirq has finished
+	Can be interrupted by hardirq context
+	Can run in parallel on any number of CPUs
+
+	softirq handler registered via kernel/softirq.c:open_softirq()
+
+	softirqs need to be 'raised' by raise_softirq() from hardirq
+	softirqs are scheduled
+		after hardirq context exits
+		from ksoftirqd in case there's too much work
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Linux 2.6 Network Tour
+Network RX Softirq
+
+
+	kernel/softirq.c:do_softirq()
+		generic softirq code
+	net/core/dev.c:net_rx_action()
+		function that is registered at open_softirq() time
+	net/core/dev.c:process_backlog()
+		dequeue skb from local CPU's backlog queue
+		uses a weighting scheme between different devices
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Linux 2.6 Network Tour
+netif_receive_skb()
+
+
+	net/core/dev.c:netif_receive_skb()
+		main network rx softirq workhorse
+		check if there are any netpoll users, if yes netpoll_rx()
+		if somebody requested skb rx timestamp, net_timestamp()
+		if interface is part of bonding group, skb_bond()
+		tc ingress filtering: ing_filter()
+		packet diverter: handle_diverter()
+		bridging handler: net/core/dev.c:handle_bridge()
+		deliver to l3 protocol handler: net/core/dev.c:deliver_skb()
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Linux 2.6 Network Tour
+IPv4 packet handler
+
+
+	net/ipv4/ip_input.c:ip_rcv()
+		checksum check
+		size check
+		NF_IP_PRE_ROUTING netfilter hook
+	net/ipv4/ip_input.c:ip_rcv_finish()
+		net/ipv4/route.c:ip_route_input()
+			route/dst cache lookup
+			if lookup fails, ip_route_input_slow()
+				fib lookup
+				allocation of new dst_entry / rtable
+		include/net/dst.h:dst_input()
+			iterate over destination stack
+			call destination function of the respective stack items
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Linux 2.6 Network Tour
+IPv4 packet forwarding
+
+
+	net/ipv4/ip_forward.c:ip_forward()
+		xfrm4_policy_check()
+		router alert handling (ip_call_ra_chain)
+		ttl decrement
+		if route is redirect route, ip_rt_send_redirect()
+		call NF_IP_FORWARD netfilter hook
+	net/ipv4/ip_forward.c:ip_forward_finish()
+		increase statistics for snmp mib
+	include/net/dst.h:dst_output()
+		iterate over output functions of dst stack
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Linux 2.6 Network Tour
+IPv4 packet output
+
+
+	net/ipv4/ip_output.c:ip_output()
+		fragment packet via ip_fragment() if needed
+	net/ipv4/ip_output.c:ip_finish_output()
+		call netfilter NF_IP_POST_ROUTING hook
+	net/ipv4/ip_output.c:ip_finish_output2()
+		attach hardware header
+		call header cache output fn (if neighbour in cache)
+			net/core/dev.c:dev_queue_xmit()
+		or neighbour output function (if neighbour unknown)
+			net/core/neighbour.c:neigh_resolve_output()
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Linux 2.6 Network Tour
+dev_queue_xmit()
+
+
+	skb->dev->qdisc->enqueue()
+		enqueue into device's output queue
+		default: net/sched/sch_generic.c:pfifo_fast_enqueue()
+	net/sched/sch_generic.c:qdisc_restart():
+		dev->qdisc->dequeue()
+			dequeue skb from queue
+		dev->hard_start_xmit()
+			transmit skb via driver
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Linux 2.6 Network Tour
+Driver TX Routine
+
+	drivers/net/e1000/e1000_main.c:e1000_xmit_frame()
+		tons of workarounds for chip bugs
+		set up TX DMA descriptor
+		queue TX DMA descriptor to device hardware
+		return NETDEV_TX_OK
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%page
+Linux 2.6 Network Tour
+Thanks
+
+	Thanks to
+		Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
+			for implementing (one of?) the world's best TCP/IP stacks
+		Paul 'Rusty' Russell
+			for starting the netfilter/iptables project
+			for trusting me to maintain it today
+		Astaro AG
+			for sponsoring parts of my netfilter work
+		Free Software Foundation
+			for the GNU Project
+			for the GNU General Public License
+%size 3
+	The slides of this presentation are available at http://www.gnumonks.org/
+
+	Further Reading
+%size 3
+		The netfilter homepage http://www.netfilter.org/
+%size 3
+		The http://www.gpl-violations.org/ project
+
+
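
Reviewer note: the hardirq/softirq split described on the "Receive Interrupt", "Softirq context" and "Network RX Softirq" slides can be illustrated with a small user-space model. The sketch below is not kernel code and is not taken from the commit. `ToyCpu`, `make_net_rx_action`, the `budget` parameter and the `backlog_limit` default are hypothetical names and values chosen for illustration; only the function names `netif_rx`, `open_softirq`, `raise_softirq`, `do_softirq` and `net_rx_action` are borrowed from the slides, and their behaviour here is heavily simplified.

```python
from collections import deque

# Label only: a hypothetical stand-in for the kernel's softirq number.
NET_RX_SOFTIRQ = "NET_RX"


class ToyCpu:
    """Toy model of one CPU: a backlog queue plus a set of pending softirqs."""

    def __init__(self, backlog_limit=8):
        self.backlog = deque()              # per-CPU backlog of received packets
        self.backlog_limit = backlog_limit  # arbitrary limit for this model
        self.pending = set()                # softirqs raised but not yet run
        self.handlers = {}                  # softirq label -> handler function
        self.dropped = 0

    def open_softirq(self, nr, handler):
        """Register the handler for one softirq."""
        self.handlers[nr] = handler

    def raise_softirq(self, nr):
        """Mark a softirq as pending; it runs on the next do_softirq()."""
        self.pending.add(nr)

    def netif_rx(self, skb):
        """Hardirq side: queue the packet and defer the real work."""
        if len(self.backlog) >= self.backlog_limit:
            self.dropped += 1
            return False
        self.backlog.append(skb)
        self.raise_softirq(NET_RX_SOFTIRQ)
        return True

    def do_softirq(self):
        """Softirq side: run each handler that was pending on entry, once."""
        pending, self.pending = self.pending, set()
        for nr in sorted(pending):
            self.handlers[nr](self)


def make_net_rx_action(deliver, budget):
    """Build an RX handler that processes at most `budget` packets per run."""

    def net_rx_action(cpu):
        processed = 0
        while cpu.backlog and processed < budget:
            deliver(cpu.backlog.popleft())
            processed += 1
        if cpu.backlog:
            # Work remains: ask to be run again instead of hogging the CPU.
            cpu.raise_softirq(NET_RX_SOFTIRQ)

    return net_rx_action


if __name__ == "__main__":
    delivered = []
    cpu = ToyCpu(backlog_limit=4)
    cpu.open_softirq(NET_RX_SOFTIRQ, make_net_rx_action(delivered.append, budget=2))
    for n in range(5):
        cpu.netif_rx(f"pkt{n}")   # "hardirq": only enqueue and raise
    while cpu.pending:
        cpu.do_softirq()          # "softirq": drain the backlog in batches
    print(delivered, "dropped:", cpu.dropped)
```

The point of the model is the division of labour: the interrupt-side function only enqueues and raises a flag, while the deferred handler does the per-packet work in bounded batches and re-raises itself if the backlog is not yet empty.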