Advanced routing mini-HOWTO
Timur A. Bolokhov, timur@tepkom.ru
This document describes the new routing features of the 2.1.X development and upcoming 2.2.X stable Linux kernels. Among them are source-based routing and Network Address Translation (NAT).
Introduction
Somewhere in the middle of the 2.1 development kernel series the routing code was rewritten by Alexey Kuznetsov (kuznet@ms2.inr.ac.ru), and many new features such as policy (source-based) routing, Network Address Translation and scheduling were added. Networking is now managed with the ip, tc and rtmon utilities from the iproute2 package. I hope this document will help novices get into the new concepts.
Regrets
This document is written by a USER, so even some basic notions may be incorrect. The ip utility is very powerful, as you can see from its syntax in the appendix, and only a small part of its capabilities is described here. I hope you can guess the rest. Nothing is said about cooperation with tc or about tc itself. No pictures yet. Bad language, punctuation, general mistakes.
Preliminary reading
I assume that you already have some experience with Linux routing, or have at least studied the NET-3 HOWTO and the IP-Alias, IP-Subnetworking, IP-Masquerading and Proxy-Arp mini-HOWTOs. The Kernel-HOWTO will help you compile a kernel with the new features.
Where to find them
- The iproute2 package is available at ftp://ftp.inr.ac.ru/ip-routing/ There are mirrors, but I could not even resolve them in DNS. Maybe the situation will change?
- HOWTOs are, as usual, in /usr/doc/ or on the nearest mirror of sunsite.unc.edu.
- The ipchains utility is homed at http://www.adelaide.net.au/rustcorp/ipfwchains.
- This document: I hope the current version will be somewhere under ftp://post.tepkom.ru/pub/Linux/
Convention
A value in square brackets [ ] is optional.
Software
The author of this document is using a 2.1.121 kernel with glibc-2.0.7, iproute2-ss980827 and gated-3.5.9. The iproute2-glibc2-patch?? was also applied. This combination has only seen a week of uptime, as I could not test it longer.
How it was before
I'll briefly recall the routing concept of the 2.0.X series kernels. When an IP packet hits a router's interface, the kernel first applies the rules of the input firewall chain to it. If the packet survives and forwarding is enabled (/proc/sys/net/ipv4/ip_forward is nonzero), it is passed to another interface according to the routing table and the forward firewall chain. Or it simply ends its journey if its destination is one of the router's interfaces. Normally the routing table contains a description of the paths to all possible IP destinations. The latter are gathered in groups called networks, each uniquely described by a network address (the first address in the group) and a netmask (mask length), which determines the number of addresses in the group (2^(32-masklength) is the right number). The routing table has two main columns:
DESTINATION: HOWTO_REACH_IT
Indeed, look at the example:
router># route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref Use Iface
192.168.1.32    0.0.0.0         255.255.255.224 U     0      0   12  eth1:1
192.168.0.0     0.0.0.0         255.255.255.0   U     0      0   34  eth0
192.168.2.0     0.0.0.0         255.255.255.0   U     0      0   3   eth1
192.168.3.0     192.168.0.3     255.255.255.0   UG    1      0   8   eth0
127.0.0.0       0.0.0.0         255.0.0.0       U     0      0   1   lo
0.0.0.0         192.168.0.4     0.0.0.0         UG    1      0   3   eth0
We have two network devices and three interfaces (not counting loopback): eth0, eth1 and an alias eth1:1. Three networks are connected directly, so they have 0.0.0.0 as gateway. One network is connected behind the gateway 192.168.0.3, and a wise router 192.168.0.4 knows how to forward packets to the rest of the world. The kernel scans the routing table from top to bottom. When the destination is found within some network (or there is a special "host" entry for it), the packet is forwarded to the specified gateway via the corresponding interface.
Note that networks are sorted strictly in order of decreasing netmask (mask length), so that if a smaller network inside a bigger one has its own gateway, it appears higher in the table and gets its chance to be routed correctly.
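The top-to-bottom scan described above can be sketched in a few lines of Python with the standard ipaddress module. This is only a model of the lookup order, assuming the table is kept sorted by decreasing mask length. The function name lookup and the tuple layout are made up for this illustration and are not the kernel's data structures.

```python
import ipaddress

# (destination network, gateway, interface), copied from the route -n output above
TABLE = [
    ("192.168.1.32/27", "0.0.0.0", "eth1:1"),
    ("192.168.0.0/24", "0.0.0.0", "eth0"),
    ("192.168.2.0/24", "0.0.0.0", "eth1"),
    ("192.168.3.0/24", "192.168.0.3", "eth0"),
    ("127.0.0.0/8", "0.0.0.0", "lo"),
    ("0.0.0.0/0", "192.168.0.4", "eth0"),
]


def lookup(table, destination):
    """Return (gateway, interface) of the first entry that contains destination."""
    addr = ipaddress.ip_address(destination)
    # Sorting by decreasing prefix length puts the most specific network first.
    ordered = sorted(
        table,
        key=lambda entry: ipaddress.ip_network(entry[0]).prefixlen,
        reverse=True,
    )
    for net, gateway, iface in ordered:
        if addr in ipaddress.ip_network(net):
            return gateway, iface
    return None


for dest in ("192.168.1.40", "192.168.3.7", "10.1.2.3"):
    print(dest, "->", lookup(TABLE, dest))
```

An address inside 192.168.1.32/27 is caught by the eth1:1 entry before any wider network is tried, and anything that matches no network falls through to the 0.0.0.0 entry at the bottom.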
Now I want to remind you how to make such a table. Here is some basic syntax:
ifconfig DEVICE [ADDRESS] [netmask MASK] [broadcast ADDR] [up,down]
route {add,del,flush} [-net,-host] [NETWORK] [netmask MASK] \
>[gw GATEWAY] [dev DEVICE]
and the real commands:
router># ifconfig lo 127.0.0.1 netmask 255.0.0.0 broadcast 127.255.255.255 up
router># ifconfig eth0 192.168.0.1 netmask 255.255.255.0 up
router># ifconfig eth1 192.168.2.1 up
router># ifconfig eth1:1 192.168.1.35 netmask 255.255.255.224 \
> broadcast 192.168.1.63 up
router># route add -net 127.0.0.0 dev lo
router># route add -net 192.168.0.0 netmask 255.255.255.0 dev eth0
router># route add -net 192.168.2.0 dev eth1
router># route add -net 192.168.3.0 netmask 255.255.255.0 gw 192.168.0.3
router># route add -net 192.168.1.32 netmask 255.255.255.224 dev eth1:1
router># route add default gw 192.168.0.4
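The netmask and broadcast values typed for the eth1:1 alias above can be double-checked with a short Python sketch. It uses the standard ipaddress module. The helper name describe is invented for this example.

```python
import ipaddress


def describe(address_with_masklen):
    """Derive network, netmask, broadcast and size from ADDRESS/MASKLEN."""
    iface = ipaddress.ip_interface(address_with_masklen)
    net = iface.network
    return {
        "network": str(net.network_address),
        "netmask": str(net.netmask),
        "broadcast": str(net.broadcast_address),
        # Equals 2 ** (32 - masklength) for IPv4.
        "addresses": net.num_addresses,
    }


info = describe("192.168.1.35/27")
for key, value in info.items():
    print(key, value)
```

For 192.168.1.35 with mask length 27 this yields the network 192.168.1.32, the netmask 255.255.255.224 and the broadcast address 192.168.1.63 used in the commands above.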
What it is now
A short description of the new routing mechanisms can be found in linux/Documentation/Policy-routing.txt. Below I'll try to give it in more detail.
Now we have not just one table of correspondences
DESTINATION: HOWTO_REACH_IT
but a set of such tables (called classes in the document referenced above), each applied to the packets satisfying certain conditions. These conditions are set with the ip rule syntax of the ip utility, while the routing tables are filled with ip route. There are three built-in tables (classes): local, main and default. Here we can see how they are bound by the rules:
router># ip rule
0: from all lookup local
32766: from all lookup main
32767: from all lookup default
Rules are scanned by the kernel in order of their preference (the number before the colon), so in this initial setup the path to the destination of any arriving packet is looked up first in table local and, if it is not found there, in tables main and then default.
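The way the three initial rules chain the tables together can be modelled as follows. This is a rough Python sketch, not the kernel algorithm. The table contents are hypothetical samples for the example router, and the names match_in_table and resolve are invented here.

```python
import ipaddress

# (preference, table name); all three initial rules match "from all".
RULES = [(0, "local"), (32766, "main"), (32767, "default")]

# Hypothetical sample contents: prefix -> what to do with the packet.
TABLES = {
    "local": {"192.168.0.1/32": "local delivery"},
    "main": {"192.168.0.0/24": "dev eth0", "192.168.3.0/24": "via 192.168.0.3"},
    "default": {},
}


def match_in_table(routes, destination):
    """Most specific matching route inside one table, or None."""
    addr = ipaddress.ip_address(destination)
    best = None
    for prefix, action in routes.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, action)
    return None if best is None else best[1]


def resolve(destination):
    """Walk the rules by increasing preference until some table has a route."""
    for pref, table in sorted(RULES):
        action = match_in_table(TABLES[table], destination)
        if action is not None:
            return pref, table, action
    return None  # no table knows the destination


for dest in ("192.168.0.1", "192.168.3.9", "10.0.0.1"):
    print(dest, "->", resolve(dest))
```

The router's own address is answered by table local, a host in 192.168.3.0/24 by table main, and an address that no table covers yields no route at all in this sample setup.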
When an interface has been configured with ifconfig (or ip link and ip addr), host entries for its IP and broadcast addresses appear in table local. The route to its attached network appears in table main. All this is done automatically, so you do not need to type any command. To check what is in table N, just type ip route list table N.
The ifconfig and route utilities from net-tools are still available under 2.1.X, so the setup from the previous section can readily be done as above (but without dealing with the attached networks). Another variant is to use ip:
router># ip link set eth0 up
router># ip addr add 192.168.0.1/24 broadcast 192.168.0.255 \
> label eth0 dev eth0
router># ip link set eth1 up
router># ip addr add 192.168.2.1/24 broadcast 192.168.2.255 \
> label eth1 dev eth1
router># ip addr add 192.168.1.35/27 broadcast 192.168.1.63 \
> label eth1:1 dev eth1
router># ip route add 192.168.3.0/24 via 192.168.0.3 table main
router># ip route add 0/0 via 192.168.0.4 table main
Static and default routes from this example could also have been put into any other table that is looked up after table main (with preference greater than 32766). For example:
router># ip route add 192.168.3.0/24 via 192.168.0.3 table 1
router># ip route add 0/0 via 192.168.0.4 table 2
router># ip rule add [from 0/0] table 1 pref 32800
router># ip rule add [from 0/0] table 2 pref 32810
so that ip rule gives:
router># ip rule
0: from all lookup local
32766: from all lookup main
32767: from all lookup default
32800: from all lookup 1
32810: from all lookup 2
But we won't consider this variant below.
So how does the new routing scheme differ from the previous one? The main difference is that IP packets can now be sorted by their source address and TOS field, and maybe in the future by special marks put on them by an external classifier (like ipchains). Suppose that in our example we want packets [with TOS 0x10 (minimum delay)] coming from 192.168.1.32/27 to be routed through the default gateway 192.168.0.5. Then we type (after our interfaces are up):
router># ip route add 192.168.3.0/24 via 192.168.0.3 table main
router># ip route add 0/0 via 192.168.0.5 table 3
router># ip route add 0/0 via 192.168.0.4 table 4
router># ip rule add from 192.168.1.32/27 [tos 0x10] table 3 pref 32900
router># ip rule add from 0/0 table 4 pref 32910
The rules now look like this:
router># ip rule
0: from all lookup local
32766: from all lookup main
32767: from all lookup default
32900: from 192.168.1.32/27 [tos 0x10] lookup 3
32910: from all lookup 4
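To see which default gateway a packet ends up with under the rules listed above, here is a small Python model of the selector matching. It assumes that the optional tos 0x10 selector was actually given and that the destination is an outside address not found in tables local, main or default. The names rule_matches and gateway_for are invented for this sketch.

```python
import ipaddress

# The rule list from above; "tos": None means the rule does not check TOS.
RULES = [
    {"pref": 0, "src": "0.0.0.0/0", "tos": None, "table": "local"},
    {"pref": 32766, "src": "0.0.0.0/0", "tos": None, "table": "main"},
    {"pref": 32767, "src": "0.0.0.0/0", "tos": None, "table": "default"},
    {"pref": 32900, "src": "192.168.1.32/27", "tos": 0x10, "table": "3"},
    {"pref": 32910, "src": "0.0.0.0/0", "tos": None, "table": "4"},
]

# Only tables 3 and 4 hold a default route in this setup.
DEFAULT_ROUTES = {"3": "192.168.0.5", "4": "192.168.0.4"}


def rule_matches(rule, source, tos):
    """A rule applies when the source is inside its prefix and the TOS agrees."""
    in_prefix = ipaddress.ip_address(source) in ipaddress.ip_network(rule["src"])
    tos_ok = rule["tos"] is None or rule["tos"] == tos
    return in_prefix and tos_ok


def gateway_for(source, tos=0):
    """First matching rule whose table has a default route decides the gateway."""
    for rule in sorted(RULES, key=lambda r: r["pref"]):
        if rule_matches(rule, source, tos) and rule["table"] in DEFAULT_ROUTES:
            return rule["table"], DEFAULT_ROUTES[rule["table"]]
    return None


print(gateway_for("192.168.1.40", tos=0x10))
print(gateway_for("192.168.1.40", tos=0x00))
print(gateway_for("192.168.2.7", tos=0x10))
```

Only packets that come from 192.168.1.32/27 and carry TOS 0x10 leave through 192.168.0.5. Everything else falls through to rule 32910 and table 4.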
A similar setup may be useful for organizations connected to the net through two or more ISPs via one Linux gateway (of course, we shouldn't check the TOS field here, just route packets from the network assigned by the second ISP to its gateway or ppp interface). It is even possible to make a script notice problems on one link and redirect (in combination with NAT) critical outgoing connections to another ISP's link. This won't work for incoming connections unless you change your DNS entries accordingly or have multihomed servers.
Here is the ipchains syntax to set the TOS field:
ipchains -A input -p PROTO -s SOURCE [port] -d DEST [port] -t 0x01 0x10
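As far as I understand the ipchains manual, the two numbers after -t are an AND mask and an XOR mask that are applied to the TOS byte in that order. Under that assumption, the effect of 0x01 0x10 can be sketched in Python as below. The function name apply_tos_masks is made up for this example.

```python
def apply_tos_masks(tos, and_mask, xor_mask):
    """AND the TOS byte with and_mask, then XOR the result with xor_mask."""
    return ((tos & and_mask) ^ xor_mask) & 0xFF


# Try a few incoming TOS values with the masks from the command above.
for old in (0x00, 0x08, 0x10, 0x1F):
    new = apply_tos_masks(old, 0x01, 0x10)
    print(hex(old), "->", hex(new))
```

The AND mask 0x01 clears the old TOS bits and keeps only the lowest bit, and the XOR mask 0x10 then switches on the minimum delay bit.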
NATs
You should be extremely careful playing with NAT, especially in a network with complex topology, routed by routing protocols or simply connected to another network through more than one router.
Translation of a packet's destination address is always done in routing table local. The syntax is the following:
ip route add nat WHAT/MASKLEN via WHERE table local
So to translate all packets arriving for 192.168.1.50 into packets destined for 192.168.2.25 you type:
router># ip route add nat 192.168.1.50 via 192.168.2.25 table local
And to translate the whole subnet 192.168.1.40/29 into 192.168.2.48/29 the command is
router># ip route add nat 192.168.1.40/29 via 192.168.2.48 table local
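The address arithmetic behind such a subnet translation can be illustrated in Python. This sketch assumes that the host part of the address is kept and only the network prefix is replaced, which is how I read the subnet example above. The function name translate is invented here.

```python
import ipaddress


def translate(address, what, where):
    """Map address from network `what` onto the block that starts at `where`.

    Assumption: the offset inside the block (the host bits) is preserved.
    """
    net = ipaddress.ip_network(what)
    addr = ipaddress.ip_address(address)
    if addr not in net:
        return str(addr)  # not covered by the nat route, left untouched
    offset = int(addr) - int(net.network_address)
    return str(ipaddress.ip_address(where) + offset)


print(translate("192.168.1.50", "192.168.1.50/32", "192.168.2.25"))
print(translate("192.168.1.43", "192.168.1.40/29", "192.168.2.48"))
print(translate("192.168.1.60", "192.168.1.40/29", "192.168.2.48"))
```

With the /29 route, the fourth address of the block 192.168.1.40/29 maps to the fourth address of 192.168.2.48/29, while an address outside the block is not touched.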
Translation of source addresses should be set by means of rules:
ip rule add from REAL_SOURCE/MASKLEN nat PSEUDO_SOURCE table TABLEID
According to the routing concept, IP packets coming from REAL_SOURCE will have their source addresses translated to PSEUDO_SOURCE and will be routed according to table TABLEID. The translation is valid only for packets whose destination is in this table.
Let's illustrate it. Suppose that in our example 192.168.2.0/24 is an address space from the ISP with gateway 192.168.0.4 and 192.168.1.32/27 is from the ISP with gateway 192.168.0.5. We suddenly want to relink the hosts in subnetwork 192.168.2.48/29 to the other ISP. We have wisely reserved a spare subnet 192.168.1.40/29 for this. But we want no translation when 192.168.2.48/29 talks to local nets, especially to 192.168.1.0. The following commands do what we need:
router># ip route add nat 192.168.1.40/29 via 192.168.2.48 table local
router># ip rule add from 192.168.2.48/29 nat 192.168.1.40 table 3 pref 32820
(Recall that table 3 contains the default gateway 192.168.0.5.) Our setup now is:
router># ip rule
0: from all lookup local
32766: from all lookup main
32767: from all lookup default
32820: from 192.168.2.48/29 nat 192.168.1.40 lookup 3
32900: from 192.168.1.32/27 lookup 3
32910: from all lookup 4
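Why local destinations escape the translation in this setup can be traced with a simplified Python model. It leaves out tables local and default, lists only the prefixes of each table, and assumes that the host offset within the /29 is preserved by the translation. The name route_packet and the data layout are invented for this sketch.

```python
import ipaddress

# Prefixes known to each table (local and default are omitted for brevity).
TABLES = {
    "main": ["192.168.0.0/24", "192.168.1.32/27", "192.168.2.0/24", "192.168.3.0/24"],
    "3": ["0.0.0.0/0"],
    "4": ["0.0.0.0/0"],
}

# (preference, source selector, nat base address or None, table)
RULES = [
    (32766, "0.0.0.0/0", None, "main"),
    (32820, "192.168.2.48/29", "192.168.1.40", "3"),
    (32900, "192.168.1.32/27", None, "3"),
    (32910, "0.0.0.0/0", None, "4"),
]


def route_packet(source, destination):
    """Return (table used, source address the packet leaves with)."""
    src = ipaddress.ip_address(source)
    dst = ipaddress.ip_address(destination)
    for _pref, selector, nat_base, table in sorted(RULES, key=lambda r: r[0]):
        sel = ipaddress.ip_network(selector)
        if src not in sel:
            continue  # rule does not apply to this source
        if not any(dst in ipaddress.ip_network(p) for p in TABLES[table]):
            continue  # destination not in this table, try the next rule
        if nat_base is None:
            return table, str(src)
        # NAT applies only because the destination was found in this table.
        offset = int(src) - int(sel.network_address)
        return table, str(ipaddress.ip_address(nat_base) + offset)
    return None


print(route_packet("192.168.2.50", "192.168.1.33"))
print(route_packet("192.168.2.50", "203.0.113.9"))
print(route_packet("192.168.2.7", "203.0.113.9"))
```

A packet from 192.168.2.50 to a local network is answered by table main before the nat rule is ever reached, so it keeps its source address. The same host talking to an outside address (203.0.113.9 is just a sample) is matched by rule 32820 and leaves with a source from 192.168.1.40/29.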
Want the same translation when going to 192.168.1.0 too? OK, just type
router># ip rule add from 192.168.2.48/29 nat 192.168.1.40 table 5
router># ip route add 192.168.1.0/24 via 192.168.0.3 table 5
Then you'll get
router># ip rule
0: from all lookup local
32765: from 192.168.2.48/29 nat 192.168.1.40 lookup 5
32766: from all lookup main
32767: from all lookup default
32820: from 192.168.2.48/29 nat 192.168.1.40 lookup 3
32900: from 192.168.1.32/27 lookup 3
32910: from all lookup 4
Note that you should always think about where your rule appears in the list, i.e. control its preference. Otherwise the result may be very confusing. Guess why we could not just put the route to 192.168.1.0/24 into table 3 with
router># ip route add 192.168.1.0/24 via 192.168.0.3 table 3
instead of the last two ip rule add ... and ip route add ... commands? I hope these imaginary examples will help you organize your real system.
Appendix
The full syntax of the ip utility is gathered here.
ip
Usage: ip [ OPTIONS ] OBJECT { COMMAND | help }
where OBJECT := { link | addr | route | rule | neigh | tunnel }
OPTIONS := { -s[tatistics] | -f[amily] { inet | inet6 }}
ip link
Usage: ip link set DEVICE { up | down | arp { on | off } |
multicast { on | off } | txqueuelen PACKETS |
name NEWNAME }
ip link show [ DEVICE ]
ip addr
Usage: ip addr [ add | del ] IFADDR dev STRING
ip addr show [ dev STRING ] [ ipv4 | ipv6 | link | all ] [txqueuelen]
IFADDR := PREFIX [ local ADDR ]
[ broadcast ADDR ] [ anycast ADDR ]
[ label STRING ] [ scope SCOPE ]
SCOPE := [ host | link | global | NUMBER ]
ip route
Usage: ip route list SELECTOR
ip route { change | del | add | append | replace | monitor } ROUTE
SELECTOR := [ root PREFIX ] [ match PREFIX ] [ exact PREFIX ]
[ table TABLE_ID ] [ proto RTPROTO ]
[ type TYPE ] [ scope SCOPE ]
ROUTE := NODE_SPEC [ INFO_SPEC ]
NODE_SPEC := [ TYPE ] PREFIX [ tos TOS ]
[ table TABLE_ID ] [ proto RTPROTO ]
[ type TYPE ] [ scope SCOPE ]
INFO_SPEC := NH OPTIONS FLAGS [ nexthop NH ]...
NH := [ via ADDRESS ] [ dev STRING ] [ weight NUMBER ] NHFLAGS
OPTIONS := FLAGS [ mtu NUMBER ] [ rtt NUMBER ] [ window NUMBER ]
[ flowid CLASSID ]
TYPE := [ unicast | local | broadcast | multicast | throw |
unreachable | prohibit | blackhole | nat ]
TABLE_ID := [ local | main | default | all | NUMBER ]
SCOPE := [ host | link | global | NUMBER ]
NHFLAGS := [ onlink | pervasive ]
RTPROTO := [ kernel | boot | static | NUMBER ]
ip rule
Usage: ip rule [ list | add | del ] SELECTOR ACTION
SELECTOR := [ from PREFIX ] [ to PREFIX ] [ tos TOS ]
[ dev STRING ] [ pref NUMBER ]
ACTION := [ table TABLE_ID ] [ nat ADDRESS ]
[ prohibit | reject | unreachable ]
[ flowid CLASSID ]
TABLE_ID := [ local | main | default | new | NUMBER ]
ip neigh
Usage: ip neigh { add | del } { ADDR [ lladdr LLADDR ]
[ nud { permanent | noarp | stale | reachable } ]
| proxy ADDR } [ dev DEVICE ]
ip neigh show [ ipv4 | ipv6 | all ]
ip tunnel
Usage: ip tunnel { add | change | del | show } [ NAME ]
[ mode { ipip | gre | sit } ] [ remote ADDR ] [ local ADDR ]
[ [i|o]seq ] [ [i|o]key KEY ] [ [i|o]csum ]
[ ttl TTL ] [ tos TOS ] [ nopmtudisc ] [ dev PHYS_DEV ]
Where: NAME := STRING
ADDR := { IP_ADDRESS | any }
TOS := { NUMBER | inherit }
TTL := { 1..255 | inherit }
KEY := { DOTTED_QUAD | NUMBER }