dynamic PBR, actions, docs and getting it all straight

Brian S Julin

2007-11-28 17:27:51 UTC

Hi,

Fair warning: this may be a bit rambling, and is
definitely a bit long.

I am trying to prototype a system for doing dynamic
policy-based routing (source-address dependent, based
on reverse routes from BGP or other dynamic routing
protocols). We need to do this due to a cacophony
of factors I won't get into.

To do so, the general plan is to store dynamic routes in
their own table, classify based on source realm, and use
the tc "mirred" action to redirect packets whose source
addresses are routed back to by that table onto a
different egress interface.
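
A rough sketch of that plan, under stated assumptions, might
look like the script below. It only prints the commands so
they can be reviewed before anything is run as root. The
interface names eth0 and eth1, table 100, realm 10, the
prefix 192.0.2.0/24 and the gateway 198.51.100.1 are all
made-up placeholders, and attaching the mirred action
directly to the "route" filter is an assumption that may not
hold for every tc/kernel version (see question 4 below).
Whether the reverse lookup actually consults the separate
table depends on the ip rule setup, which is not shown.

```shell
#!/bin/sh
# Dry-run sketch of the plan: prints commands, runs nothing.
# Every value here is a hypothetical placeholder.
OUT_DEV=eth0          # interface the packets would normally leave by
ALT_DEV=eth1          # interface to redirect them onto
TABLE=100             # table holding the dynamically learned routes
REALM=10              # realm tag carried by routes in that table
PREFIX=192.0.2.0/24   # example prefix learned from BGP
GW=198.51.100.1       # example next hop for that prefix

plan() {
    # 1. Dynamic route in its own table, tagged with a realm.
    echo "ip route add $PREFIX via $GW dev $ALT_DEV table $TABLE realm $REALM"
    # 2. A classful root qdisc to hang the filter on.
    echo "tc qdisc add dev $OUT_DEV root handle 1: prio"
    # 3. Match on source realm and redirect to the other interface.
    #    Assumption: this tc/kernel accepts an action on the route filter.
    echo "tc filter add dev $OUT_DEV parent 1: protocol ip prio 10 route from $REALM flowid 1:1 action mirred egress redirect dev $ALT_DEV"
}

plan
```

If the action is rejected on the "route" filter, the same
match could instead only classify (flowid), with the action
attached elsewhere, which is what question 6 is about.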

It seems obvious that this can be done, that the old
"iptables -j ROUTE" method is falling into disfavor
and is poorly maintained, and that the tc "mirred"
action is stepping up to take its place.

However, this has raised numerous questions, most of which
arise just because this is my first wade into the LARTC pool.
Also, though, I am having trouble finding any docs
that factor in actions, since they are relatively new --
but not so new that this should really be the case.

(And speaking of docs, one wonders whether the
"Traffic Control HOWTO" posted at linux-ip.net
bearing version 1.0.2 is intended to split/supersede
the LARTC HOWTO or is completely rogue. It appears
to be a very well done doc, but also does not
factor in actions.)

Anyway, the questions:

1) When a packet is "mirred egress redirect"ed, how does
the system determine the destination MAC address to place
on the outgoing interface, assuming it is ethernet? If I
have things straight, this packet will never see the routing
stack again and so a gateway cannot be designated? (The
older iptables -j ROUTE allowed designation of a gateway.)
If this: http://www.shorewall.net/NetfilterOverview.html
...is right, there is no swat at mangling/rewriting post-qdisc?
I'm guessing "that's a job for IMQ"?

2) If I have things straight again, it is not necessary
to involve iptables to do this. The few examples on the
net of doing this use fwmark. However, with the tc
"route" filter it should not be necessary to do that
anymore. Am I right there?

3) Per 2) which is the better method to use?

4) Is there an authoritative list of which actions are
supported at which points in the syntax tree? The
"route" filter seems to only support classifying and
gact, for example, and if I am interpreting the
not-so-lucid error messages from yesterday's wrestle
with tc correctly, the inability to execute certain
actions extends into any policer appended to the
filter. What's supported where and what will be
eventually supported where?

5) Is there any way to turn on more error messages
from the kernel so I can tell what the heck tc doesn't
like about commands? Even if I have to read them from
syslog and the userspace handles aren't meaningful, it
still might be nice to have.

6) If I have this right, it's possible to define
a class using the "route" filter, then a subclass
using a do-nothing filter (u32 match u32 0 0) which
then in turn invokes the "mirred" action. I am not
quite clear, however, precisely when a packet is
counted against a qdisc and when precisely actions
"happen." I am worried about the activation of the
"route" rule counting as link use even though the
packet is redirected (stolen). Mainly because
in order to use a filter just to execute an action,
it's mandatory to have a class to attach it to, and
then a second class for packets that did not match
(the normal traffic) -- each class having bandwidth
limits or whatnot depending on the qdisc. If I
have it right a stolen or dropped packet, though,
will not show up because it won't actually be there
in the qdisc when the kernel comes collecting (?).

7) Will classless qdiscs eventually regain support
for attaching filters, given that filters do not
necessarily have to assign a class, they can instead
execute an action or a police with nothing but actions?
Or will it always be necessary to create classes to
contain the filters, and thus use a classy qdisc?
I say regain because I seem to recall seeing a doc that
showed attaching a filter to a classless qdisc, though
I can't find it now and perhaps that was an error.

8) As a curiosity, why "handle XX fw" rather than
"fw handle XX"?

9) Is there any motion to bring the distributed manpage
up to date?

10) I haven't even looked into it yet -- how does
(or does?) one integrate L2+L3 criteria/actions with
qdiscs... any docs on other-than "protocol ip"? I am
assuming it is not possible to trick things into
performing a direct route table comparison against
a packet that is not routed, but bridged, other
than to build a netfilter ipset from the route
table with bubblegum and spit and just use ebtables
on it. But I'd be bummed if I assumed so wrongly
and passed up an elegant solution.
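
To make questions 2, 6 and 8 concrete, the two classification
variants can be written side by side, along with the
statistics commands that should show where packets end up
being counted. As with the earlier plan, this is only a
dry-run sketch that prints commands. The device eth0, qdisc
handle 1:, class 1:1, the mark/realm value 10 and the prefix
192.0.2.0/24 are made-up placeholders, and the realm variant
assumes a route carrying realm 10 already exists.

```shell
#!/bin/sh
# Dry-run sketch: prints commands, runs nothing.
# All values are hypothetical placeholders.
DEV=eth0
SRC=192.0.2.0/24
ID=10

# Variant A (question 2): iptables sets a mark, the tc "fw" filter
# matches it. Note the "handle XX fw" word order from question 8.
# PREROUTING covers forwarded traffic only, not locally generated.
fwmark_variant() {
    echo "iptables -t mangle -A PREROUTING -s $SRC -j MARK --set-mark $ID"
    echo "tc filter add dev $DEV parent 1: protocol ip prio 10 handle $ID fw flowid 1:1"
}

# Variant B (question 2): no iptables; the "route" filter matches
# the source realm taken from the routing table.
route_variant() {
    echo "tc filter add dev $DEV parent 1: protocol ip prio 10 route from $ID flowid 1:1"
}

# Question 6: per-qdisc, per-class and per-filter counters. Comparing
# them before and after sending test traffic should show whether
# redirected packets are charged to a class.
show_counters() {
    echo "tc -s qdisc show dev $DEV"
    echo "tc -s class show dev $DEV"
    echo "tc -s filter show dev $DEV parent 1:"
}

fwmark_variant
route_variant
show_counters
```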

Thanks for any help wrapping my head around this.