Ping with TCL and multihomed router for quality measurement.
Introduction
In this article I describe how to test the quality of links in a multihomed router environment. The source of the program is attached at the end of the article.
The article assumes you are using a Debian system. On any other distribution things should work the same way, but minor changes may be needed.
Imagine you have a Linux router which is connected to more than one Internet provider. Such a connection is often referred to as multihomed. In a multihomed network scenario you may want to track the quality of both connections and choose the best one.
An interesting case of a multihomed network is a multihomed router with VPN connections.
As you can see from the example drawing, if every office has 2 connections to the Internet, then we need 4 VPN tunnels to cover each possible Internet route between the two VPN routers.
In this scenario we have 4 routes to test, that is, 4 directions to 'ping'. Real-world cases may be much more complicated than this example (and have more directions to 'ping').
Unfortunately, the networking support built into Tcl is limited: the core language has no command for sending ICMP messages. So, for 'ping', we must either write a new Tcl module or use an existing external utility. We will use the second approach, which is usually much easier to implement.
For our purposes, of all the 'ping' utilities available on Linux, I prefer “oping”.
apt-cache show oping
gives us a link to a homepage:
Homepage: http://verplant.org/liboping/
From the manual page we can see how to use the command:
oping [-4 | -6] [-c count] [-i interval] host [host [host ...]]
The beautiful part of the 'oping' syntax is that you can specify more than one host to ping. For example:
oping 10.0.3.1 10.0.4.1
will send ICMP messages to both
destinations. In reply we will get:
PING 10.0.3.1 (10.0.3.1) 56 bytes of data.
PING 10.0.4.1 (10.0.4.1) 56 bytes of data.
56 bytes from 10.0.3.1 (10.0.3.1): icmp_seq=1 ttl=63 time=1.77 ms
56 bytes from 10.0.4.1 (10.0.4.1): icmp_seq=1 ttl=63 time=1.20 ms
56 bytes from 10.0.3.1 (10.0.3.1): icmp_seq=2 ttl=63 time=0.97 ms
56 bytes from 10.0.4.1 (10.0.4.1): icmp_seq=2 ttl=63 time=1.33 ms
56 bytes from 10.0.3.1 (10.0.3.1): icmp_seq=3 ttl=63 time=1.27 ms
56 bytes from 10.0.4.1 (10.0.4.1): icmp_seq=3 ttl=63 time=1.17 ms
Please note that the ICMP sequence number (icmp_seq) is the same for both hosts on each iteration. This makes it very handy for writing test scripts. You can 'ping' several targets, and if some of them are down, you still know that your Internet connection is up as long as at least one of your targets is answering.
Multihomed node and linux routing
To run ping tests, and for many other tasks, we want some traffic to be routed as if we were connected only to the first provider, and some as if we were connected only to the second provider. For example, when the main Internet provider has problems, we want to change the default routes for the LAN networks, but leave the routing of the testing 'oping' commands and/or the VPN tunnel daemons untouched. Another example makes the concept more obvious: you have two LAN subnetworks and for some reason want to route one through the main provider and the other through the redundant one. In this case you want your router to behave as if there were a separate router for each subnetwork.
Such features are united under the term 'policy based routing'.
On Linux, policy based routing means that you have criteria according to which a routing table is chosen for a data flow, instead of always using the default routing table. In practice, we use the utilities from the iproute2 package to work with 'policy based routing'.
'policy based routing' for our needs
Let's imagine we have two Internet uplinks (eth0, eth1). We want to 'ping' some hosts over every uplink we have. Please note that we may want to run oping 10.0.3.1 10.0.4.1 on both of our Internet links. This means we can't just run ip route add 10.0.3.1/32 via X.X.X.X dev eth0 on our router: that would direct all 'pings' via only one route.
To achieve our goal we need to use 'policy based routing'. I suggest having a separate additional routing table for each Internet link.
So, what we need is:
1. create additional routing tables
2. find a way to force oping to use the routing table we want.
To create an additional routing table, open /etc/iproute2/rt_tables. On Debian you will see something like this:
cat /etc/iproute2/rt_tables
#
# reserved values
#
255     local
254     main
253     default
0       unspec
#
# local
#
This file defines the routing tables. Local, main and default are predefined tables. Each table is identified by a table number and a table name.
Add two lines to this file:
252     tinc_I1
251     tinc_I2
Now we have two new tables in our system. I use the name tinc_* because I usually use uplink-dedicated routing tables to run VPN daemons, and for VPNs I use tinc http://tinc-vpn.org/.
Now we can fill the tables with routes. Example:
ip route add default via 10.0.1.254 table tinc_I1 dev eth0
ip route flush cache
Please note that the system keeps track of the route records in the tables. So, if your interface goes down and then up (for example, you reconnect the Ethernet cable), your route will disappear from all tables. To overcome this you can put the following 'up' line in /etc/network/interfaces:
iface eth0 inet static
    address 10.0.1.22
    netmask 255.255.255.0
    up ip route replace default via 10.0.1.254 table tinc_I1 dev eth0
Then the command 'ip route replace default via 10.0.1.254 table tinc_I1 dev eth0' will be executed automatically each time eth0 goes up.
Now that we have a table, we should think about how to force oping to use the routing table we want. In general, you specify which table to use with the ip rule command. For packets our system forwards as a router, there is no problem in adding a rule like this:
ip rule add from 10.10.10.0/24 table tinc_I1
Everything coming from the 10.10.10.0/24 network will be routed using the tinc_I1 table. In this example we have applied the 'from' selector. Please read the man pages to see all possible selectors.
For local processes the situation is more complicated. Many of the approaches applicable to forwarded traffic do not work correctly with locally generated traffic. This is because the Linux kernel makes routing decisions for local and non-local traffic in different ways. When the kernel does not have enough criteria to select a routing table, it always uses the table called main (as long as the default ip rule settings have not been changed).
'policy based routing' and locally generated traffic
After many experiments trying to find a method which works for local daemons, I found one approach which works. Two rules should be followed:
1. It must be possible to bind the process generating network traffic to an INTERFACE. Binding to an IP address does not help.
2. We need an ip rule criterion that distinguishes the required IP flow from the others.
Rule number one means that not every local program can direct its traffic using a non-'main' routing table. Please note that the interface we bind to must be recorded in the routing table we want to use. It may not be obvious, but adding a default route to a table also names the interface, which creates the record we need.
Rule number two. It is easy to create a proper 'ip rule' when you know the source and destination addresses: ip rule add from 10.0.1.254 to 10.0.3.1 table tinc_I1. This is OK for VPNs with fixed IPs, but not for 'ping', where our destination IPs can change, or where we may want to 'ping' the same IPs from different interfaces. In the case of 'ping' we can follow rule number two in three steps:
- run each 'ping' process under a certain user group.
- create Iptables rule which will add FWMARK to packets for processes run by certain user group.
- create IP rule for FWMARKs.
FWMARK is an internal kernel mark that a Linux box can put on network packets. It exists only while the packet is inside the box. You can set such marks with Iptables, and you can use them in iptables and ip rule commands.
One of the criteria by which Iptables can set a mark is the user group which runs a local process. You can look up the syntax and options by running:
iptables -m owner --help
You will get a long help listing, and at the end of it:
owner match options:
[!] --uid-owner userid[-userid]      Match local UID
[!] --gid-owner groupid[-groupid]    Match local GID
[!] --socket-exists                  Match if socket exists
Those are possible options we can use
in Iptables.
Now we can assemble the whole construction.
a) Add new user groups. I prefer to use the same group numbers as we have for the routing tables.
groupadd -g 252 tinc_I1
groupadd -g 251 tinc_I2
b) Add iptables rules to assign a FWMARK to packets of processes run by the new user groups.
iptables -t mangle -A OUTPUT -m owner --gid-owner 252 -j MARK --set-mark 252
iptables -t mangle -A OUTPUT -m owner --gid-owner 251 -j MARK --set-mark 251
To make things easier, I suggest using the same mark numbers as the group numbers we added.
c) Add an 'ip rule' for each group to direct its traffic to the proper routing table. This is done by matching the FWMARKs assigned by Iptables.
ip rule add fwmark 252 prio $r_prio table tinc_I1
ip rule add fwmark 251 prio $r_prio table tinc_I2
d) Run the 'ping' commands under different user groups.
For this we use the sudo command. To install it under Debian, run: apt-get install sudo. Now, to make it easier to work with sudo, modify the /etc/sudoers file. Comment out the line containing: root ALL=(ALL) ALL. Put in its place: root ALL=(ALL:ALL) ALL. So you should get:
# root ALL=(ALL) ALL
root ALL=(ALL:ALL) ALL
Now we can run our 'ping' commands:
sudo -g#252 /usr/bin/oping -D eth0 10.0.3.1 10.0.3.2
sudo -g#251 /usr/bin/oping -D eth1 10.0.3.1
or the same in another way:
sudo -gtinc_I1 /usr/bin/oping -D eth0 10.0.3.1 10.0.3.2
sudo -gtinc_I2 /usr/bin/oping -D eth1 10.0.3.1
The -D option tells 'oping' to bind to a specific interface.
'policy based routing' for local 'ping' command summary by example
1. Run the 'oping' command under the user group which corresponds to the desired routing table:
sudo -g#252 /usr/bin/oping -D eth0 10.0.3.1 10.0.3.2
2. 'oping' binds to the interface eth0. The interface is added to the proper routing table by ip route replace default via 10.0.1.254 table tinc_I1 dev eth0. After this command you may need to run: ip route flush cache. Please note that nothing prevents an interface from being recorded in more than one routing table.
3. Iptables sees that the command is run by the user group with number 252 and name tinc_I1. It assigns the FWMARK with number 252 to all packets generated by this command.
4. The Linux kernel looks through the IP rules and finds the matching one. Running ip rule list|grep tinc_I1 gives on my system:
2763: from all fwmark 0xfc lookup tinc_I1
0xfc - hexadecimal representation of 252.
2763 - rule priority. Each 'ip rule' has its priority. You can specify the priority with the 'prio' modifier. Rules are scanned in order of increasing priority number.
This rule 'sends' packets to the proper routing table.
TCL and 'external' utilities
'oping' is not part of the Tcl language. To run an external utility which produces continuous output, we should use the Tcl command open. From its manual page:
If the first character of fileName is “|” then the remaining characters of fileName are treated as a list of arguments that describe a command pipeline to invoke, in the same style as the arguments for exec. In this case, the channel identifier returned by open may be used to write to the command's input pipe or read from its output pipe
As the output of 'oping' is line oriented (you see the output line by line, not character by character), we should also tell the Tcl channel driver that we want to work with the channel line by line. This is done in a few lines of code:
set cmd {sudo -g#252 /usr/bin/oping -D eth0 10.0.3.1 10.0.3.2}
set pipe [open "|$cmd"]
fconfigure $pipe -buffering line
Next, we want to read the 'oping' output and process each new line which appears there. For this we use the Tcl event-driven facilities. We tell Tcl that, as soon as a new line appears in our pipe from the 'oping' side, it should call a special function to process this line:
fileevent $pipe readable [list Pinger $pipe]
In this example, as soon as a new line of 'oping' output appears in our pipe channel, the function called Pinger is called and $pipe is passed to it as a parameter.
As already mentioned, we are going to use the Tcl event-driven facilities. The event processing mechanism does not start operating until we enter a special mode, the event loop. To enter it we can use the vwait 1 command. vwait processes events until the named variable (here, a variable called 1) is written; nothing ever writes it, so the event loop runs for as long as the program does. As a starting point for more information regarding events, you can read this article: http://www.tcl.tk/man/tcl8.5/tutorial/Tcl40.html.
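To make the callback idea more concrete, here is a minimal sketch of what such a handler could look like. It is not the code from the attached program: the names parse_oping_line and on_oping_line are hypothetical, and the regular expression assumes reply lines shaped exactly like the sample 'oping' output shown earlier. Every other line (headers, timeouts, statistics) is simply ignored.

```tcl
# Parse one line of oping output (format assumed from the sample output above).
# Returns a dict with keys host, seq and time for a reply line,
# or an empty dict for any other line.
proc parse_oping_line {line} {
    set re {bytes from (\S+) .*icmp_seq=(\d+) .*time=([0-9.]+) ms}
    if {[regexp $re $line -> host seq time]} {
        return [dict create host $host seq $seq time $time]
    }
    return [dict create]
}

# Hypothetical readable-event handler, a stand-in for the article's Pinger.
proc on_oping_line {pipe} {
    if {[gets $pipe line] < 0} {
        if {[eof $pipe]} {
            # oping has exited; close the channel so the event stops firing.
            # catch is used because close reports a non-zero exit status
            # of the child process as an error.
            catch {close $pipe}
        }
        return
    }
    set reply [parse_oping_line $line]
    if {[dict size $reply] > 0} {
        puts "[dict get $reply host] seq=[dict get $reply seq] [dict get $reply time] ms"
    }
}

# Usage (needs oping installed, so it is left commented out):
# set pipe [open "|/usr/bin/oping 10.0.3.1 10.0.4.1"]
# fconfigure $pipe -buffering line
# fileevent $pipe readable [list on_oping_line $pipe]
# vwait forever
```

Keeping the parsing in its own procedure means it can be tried on sample lines without starting 'oping' at all.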
Design of the program
There is an interesting book on the Internet, “How to Design Programs” - www.htdp.org. This book contains a programming example, http://www.htdp.org/2003-09-26/Book/curriculum-Z-H-5.html#node_sec_2.2, which is later discussed in chapter 3 - http://www.htdp.org/2003-09-26/Book/curriculum-Z-H-6.html#node_chap_3. This example illustrates how to compose a program from functions and auxiliary (helping) functions. I'll show those ideas in a short version, converted to Tcl.
Imagine we want to calculate the area of a ring. We know R and r, the radii of the outer and inner discs (R - outer, r - inner). A person would calculate it like this: So = pi*R^2, Si = pi*r^2 → Sring = So - Si = pi*(R^2 - r^2). But in the world of real and complicated tasks we should go a different way.
The approach proposed in the book is to start from the target and then divide it into smaller tasks by applying helping functions. In this scenario, we start our programming from: Sring = So - Si. This first statement gives us the final result, but contains two functions that are not defined yet, which we are to express as helping functions.
In Tcl, we could express these functions like this:
# Area of a ring: outer disc minus inner disc.
proc area-of-ring {R r} {
    return [expr {[area-of-disc $R] - [area-of-disc $r]}]
}
# Helping function: area of a disc of radius r.
proc area-of-disc r {
    return [expr {3.14*$r*$r}]
}
Now we can run something like:
area-of-ring 6 4
…
Please catch the main idea: we start solving a problem from the final result we want to get.
- This makes the program code more readable. At first glance you see the biggest steps towards the result, and if you wish, you look into the helping functions, which may in turn consist of their own helping functions.
- This leads you to your target in a finite number of steps. You just divide your task into smaller pieces, then each of them into still smaller ones, down to elementary ones. This is a handy way to solve complicated tasks.
- This makes it easy to change parts of the program if needed.
How the program works
1. We have a config file called conf.tcl. To make things easier, this file is included in the program by source [file join $Path conf.tcl]. It consists of a Tcl dict which identifies each direction we are going to ping by name: dir1, dir2 .. dirN. Please stick to this naming, as the program uses pattern matching on dir*. Each dir* has a key named cmd, whose value is the 'oping' command with all the options necessary to ping the direction. For example:
dict set directions dir1 cmd {sudo -g#252 /usr/bin/oping -D eth0 10.0.3.1 10.0.3.2}
2. The same dict has a special record:
dict set directions current_dir dir1
This identifies the direction used by default. Later, if and when the main direction has to be replaced by an alternative, the name of the alternative which has become active is stored under this key.
3. The program uses the same dict to store some helping data structures:
- Each dir* has 2 buffers storing values for the last 100 'pings'. One of the buffers, called 'metrics', stores 'oping' results. The other, called 'drops', stores ones where we got a timeout instead of a result and zeroes where we got a reply. If more than one destination is 'pinged', then the best result is selected to be stored.
- Each dir* has a counter key, which helps us to go through the buffer; it runs from 0 to 99 and then becomes 0 once again.
- Each dir* has an icmp_seq key, which stores the value of the last icmp_seq shown by 'oping'. If we read a line and its icmp_seq is the same as the one recorded in the dict, we know that 'oping' has more than one target to ping, and we have to choose the best result.
- Each dir* has a tmp_list key, which holds all results from the same iteration of 'oping'. The program keeps writing to this list until a new icmp_seq arrives from 'oping'. After that, the best result is chosen and put into 'metrics', and a record of 0 or 1 is made in 'drops'.
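The bookkeeping described in these four points can be sketched as follows. This is an illustration, not the code of the attached program: the procedure names init_dir, flush_iteration and add_reply are hypothetical, an empty time is my own convention for "this target timed out", and leaving the old 'metrics' value in place on a drop is an assumption, since the text does not say what is stored there.

```tcl
set BufSize 100   ;# length of the 'metrics' and 'drops' buffers

# Create the helper keys for one direction (hypothetical initializer).
proc init_dir {dictVar dir} {
    upvar 1 $dictVar d
    global BufSize
    dict set d $dir metrics  [lrepeat $BufSize 0.0]
    dict set d $dir drops    [lrepeat $BufSize 0]
    dict set d $dir counter  0
    dict set d $dir icmp_seq -1
    dict set d $dir tmp_list {}
}

# Store the outcome of one finished iteration: the best (smallest) time
# from tmp_list, or a drop if no target replied.
proc flush_iteration {dictVar dir} {
    upvar 1 $dictVar d
    global BufSize
    set i       [dict get $d $dir counter]
    set results [dict get $d $dir tmp_list]
    set metrics [dict get $d $dir metrics]
    set drops   [dict get $d $dir drops]
    if {[llength $results] > 0} {
        lset metrics $i [tcl::mathfunc::min {*}$results]
        lset drops   $i 0
    } else {
        # Assumption: on a drop the old metrics value is left untouched.
        lset drops   $i 1
    }
    dict set d $dir metrics  $metrics
    dict set d $dir drops    $drops
    dict set d $dir counter  [expr {($i + 1) % $BufSize}]
    dict set d $dir tmp_list {}
}

# Feed one result; a new icmp_seq closes the previous iteration.
# An empty time means "this target timed out".
proc add_reply {dictVar dir seq time} {
    upvar 1 $dictVar d
    set last [dict get $d $dir icmp_seq]
    if {$last != -1 && $seq != $last} {
        flush_iteration d $dir
    }
    dict set d $dir icmp_seq $seq
    if {$time ne ""} {
        set tmp [dict get $d $dir tmp_list]
        lappend tmp $time
        dict set d $dir tmp_list $tmp
    }
}
```

For example, after init_dir directions dir1, calling add_reply directions dir1 1 1.77 and then add_reply directions dir1 1 1.20 collects both results of iteration 1; the next call with icmp_seq 2 stores the smaller of them in slot 0 of 'metrics'.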
4. The program is started by the start_Monitor.tcl executable file. It contains the main big steps to achieve our result. It:
- loads the helping functions with the source directive.
- loads conf.tcl.
- initializes the helping dict structures.
- starts 'oping' by calling start_Pinger $dir for each direction.
- runs after 10000 CompareMetrics 5000. This calls the subroutine which compares the results and decides whether or not to switch to a new direction. CompareMetrics is first started 10 seconds after the start of the program and is then repeated every 5 seconds.
- runs vwait 1, which enters the event loop.
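The "start after a delay, then repeat" pattern from the list above is usually written in Tcl as a procedure that reschedules itself with after. The sketch below shows only that pattern: compare_metrics_sketch is a hypothetical name, the runs counter exists purely for demonstration, and how the real CompareMetrics reschedules itself is an assumption on my part.

```tcl
set runs 0   ;# number of invocations so far, for demonstration only

# Do the periodic work, then schedule the next run.
proc compare_metrics_sketch {interval} {
    global runs
    incr runs
    # ... compare the buffers and decide about switching here ...
    after $interval [list compare_metrics_sketch $interval]
}

# With the article's timings the first run would be scheduled like this:
# after 10000 compare_metrics_sketch 5000
# vwait 1
```

Nothing runs until the event loop is entered, which is why the vwait call comes last in the start script.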
5. From doc.txt:
# =============================================
# structure of directions dict
# =============================================
directions
 |
 +-dir1
 |   |
 |   +-cmd {sudo -g#252 /bin/ping -I eth0tinc 10.0.3.1}
 |   |
 |   +-metrics "list of 100 elements" - filled with Ping results
 |   |
 |   +-drops "list of 100 elements" - filled with 1-th where was drops
 |   |
 |   +-counter N (0-99) - Number of element in a list
 |   |
 |   +-icmp_seq N - last icmp_seq
 |   |
 |   +-tmp_list - list of results for the same $icmp_seq numbers
 +-dir2
 ....
 +-current_dir - current dir
6. If there are no timeouts, the route with the better results over the last 100 iterations wins. If there were drops, the route with fewer drops wins. In conf.tcl you can specify:
- set Drops 5; - number of unanswered packets after which to jump to another channel
- set Diff 0.5; - we calculate the metrics of the currently active dir and of a dir that is better than the current one, and take the ratio Better/Active. If Better/Active < Diff → we switch to the new dir.
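One possible reading of this decision rule is sketched below. Several points are my assumptions rather than facts about the attached program: the "metric" of a direction is taken to be the mean of its 'metrics' buffer, drops are counted over the whole 'drops' buffer, and when there are losses a switch happens only if the active direction has reached Drops losses and the candidate has fewer. The names mean and should_switch are hypothetical.

```tcl
# Arithmetic mean of a non-empty list of numbers.
proc mean {values} {
    expr {[tcl::mathop::+ {*}$values] / double([llength $values])}
}

# Return 1 if 'candidate' should replace 'active', else 0.
# dirs is the directions dict; Drops and Diff are the conf.tcl thresholds.
proc should_switch {dirs active candidate Drops Diff} {
    set aDrops [tcl::mathop::+ {*}[dict get $dirs $active drops]]
    set cDrops [tcl::mathop::+ {*}[dict get $dirs $candidate drops]]
    if {$aDrops > 0 || $cDrops > 0} {
        # There were losses: decide on drops only (assumed use of Drops).
        return [expr {$aDrops >= $Drops && $cDrops < $aDrops}]
    }
    set aMean [mean [dict get $dirs $active metrics]]
    set cMean [mean [dict get $dirs $candidate metrics]]
    if {$aMean == 0} {
        return 0   ;# no data yet for the active direction
    }
    # No losses: switch only if the candidate is better by the Diff factor.
    return [expr {$cMean / $aMean < $Diff}]
}
```

With Diff set to 0.5, a candidate whose mean round-trip time is 0.5 ms replaces an active direction at 2.0 ms (ratio 0.25), while a candidate at 1.5 ms (ratio 0.75) does not. Requiring a clear margin like this keeps the router from flapping between two links of similar quality.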
7. If we switch to a new direction, then the file change_dir in the scripts directory is executed. The name of the new active dir* is passed to it as a parameter. Later, this name is recorded under the directions → current_dir key. The change_dir file should contain a script which takes its 1st parameter and changes the system routing based on it.
Finally, after the program is tested and the right scripts are composed for switching the routing, you may wish to start your monitoring program automatically, right after the system has booted. This is very easy to do under Debian Linux. Open /etc/rc.local. This file is executed at the very end of the Debian Linux boot process. Put the path to your script before the line exit 0. In my case the last two lines of the file look like this:
/opt/scripts/monitor4ik/start_Monitor.tcl
exit 0
I've put the monitoring program into an archive, monitor.tar.gz. It can be downloaded from: http://www.mediafire.com/?lvei23ggw9525zu
After downloading, you can run tar xzvf ./monitor.tar.gz at your preferred location. The program should run from any location. Before running the program, Tcl must be installed on the Linux platform. Start the program by running start_Monitor.tcl.