[6bone] [NOTIFY] XS26 service/peering outage

Petr Baudis pasky@xs26.net
Wed, 18 Dec 2002 17:07:30 +0100


Dear diary, on Tue, Dec 17, 2002 at 08:04:41PM CET, I got a letter,
where John Fraizer <tvo@EnterZone.Net> told me, that...
> On Tue, 17 Dec 2002, Jan Oravec wrote:
> > On Mon, Dec 16, 2002 at 09:26:00PM +0100, Stephane Bortzmeyer wrote:
> > > On Monday 16 December 2002, at 18 h 3, 
> > > Petr Baudis <pasky@xs26.net> wrote:
> > > 
> > > > and bgpd. Apart random crashes in various time periods (from few minutes to
> > > > weeks) 
> > > 
> > > A funny thing about distributed systems is the difference in testimonies :-) 
> > > We never had a Zebra crash.
> > 
> > You were probably never running zebra on router with 2048 interfaces, having
> > 2k static routes redistributed into BGP, 10k internal BGP routes, about 200
> > prefixes in IGP and about 300 external BGP routes.
> 
> Find me, outside of 6bone, *ANY* quasi-production router, I'm talking
> about on the entire planet, that has 2048 interfaces.
> 
> This sounds like more a problem of you should be splitting that interface
> load between many routers than one of there being a problem with Zebra.

Please let me describe what XS26 is:

We're a so-called distributed tunnel broker. That is, we provide a unified web
interface where a user can create a set of tunnels through different PoPs (a
few in Europe, one in the US), then register a /48 zone and route it through a
tunnel (usually only one of them, but she can also have it routed through
several tunnels on different PoPs).

Inevitably, far more people want a tunnel than donate a server to act as a
PoP. Thus, each PoP ends up with a large number of tunnel interfaces; the
better-connected PoPs obviously also get more tunnel interfaces allocated than
those with worse connectivity.

It is rather difficult to split that interface load between many routers, as we
do not have many routers. We don't make any money, we don't eat any money; all
the PoPs are volunteered by various companies and/or people (usually network
admins of those companies) interested in helping with IPv6 deployment.
Obviously, you are welcome to help us with this splitting of the load... ;-)

..snip..
> > Zebra is not ready for production networks.
> > 
> 
> I beg to differ.  Your "network" from what you've described, is
> under-engineered.  What was the purpose again of terminating 2000+
> endpoints on a single router again?  You can't seriously think that any
> true production (BTW: most of us consider production to be equal to
> billable) network architect would put that many eggs in one basket can
> you?

See above for an explanation of why we do it like this.

I agree that the "production networks" statement was maybe too radical.

..snip..
> > > > Basically, zebra looks not to be prepared for the networks which change very
> > > > dynamically (our iBGP table changes very frequently as user prefixes appear and
> > > > disappear; it's also relatively big (in the 6bone world, at least ;) 
> > > 
> > > We use Zebra for default-free routers on the IPv4 Internet. The 6 bone is a 
> > > very small experiment when you compare it to the always-changing 100k routes 
> > > of the IPv4 Internet.
> > 
> > We have 10k always-changing routes in the IPv6. BGP implementation is
> > relatively good if you don't dynamically add/remove interfaces.
> > 
> 
> Again, that sounds like an implementation issue in your network.
..snip..
> If you are not assigning each router a "pool" from which you assign tunnel
> space, NLA assignments, etc from, you are making your network topology
> much more complicated than it needs to be.

The idea behind our system is to make addressing independent of the PoP. This
gives us:

* easy migration between PoPs; this actually appears to be very important, as
  from our observations a lot of people make use of it

* as a consequence, "immortality" of the zone assignments, even when a PoP
  is down (a service outage or the PoP leaving XS26, which inevitably happens
  sometimes, given our organization)

* the possibility to set up tunnels to several PoPs; currently, this gives you
  some load balancing of the incoming traffic; once we implement the planned
  BGP peering with users, this will be even more important
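
To make the points above concrete, here is a minimal sketch in Python of
PoP-independent zone assignment. It is only an illustration, not the actual
XS26 code: the Broker and Zone classes, the PoP names and the 2001:db8::/32
block (the documentation prefix) are all made up for the example. The point is
that a /48 is handed out from one shared block and only the set of PoPs
carrying it changes.

```python
import ipaddress
from dataclasses import dataclass, field

# Hypothetical shared block; 2001:db8::/32 is the IPv6 documentation prefix.
BROKER_BLOCK = ipaddress.ip_network("2001:db8::/32")


@dataclass
class Zone:
    prefix: ipaddress.IPv6Network
    owner: str
    pops: set = field(default_factory=set)  # PoPs whose tunnels carry this zone


class Broker:
    """Toy registry: zones come from one block, not from per-PoP pools."""

    def __init__(self, block):
        self._free = block.subnets(new_prefix=48)  # lazy generator of /48s
        self.zones = {}

    def register(self, owner, pop):
        zone = Zone(next(self._free), owner, {pop})
        self.zones[zone.prefix] = zone
        return zone

    def add_pop(self, zone, pop):
        # Route the same zone through a tunnel on one more PoP.
        zone.pops.add(pop)

    def migrate(self, zone, old_pop, new_pop):
        # The prefix stays the same; only the PoP carrying it changes.
        zone.pops.discard(old_pop)
        zone.pops.add(new_pop)

    def routes(self):
        # One (prefix, pop) pair per tunnel carrying a zone.
        return sorted(
            (str(z.prefix), pop)
            for z in self.zones.values()
            for pop in sorted(z.pops)
        )


broker = Broker(BROKER_BLOCK)
zone = broker.register("alice", "pop-a")
broker.migrate(zone, "pop-a", "pop-b")   # user keeps her /48
broker.add_pop(zone, "pop-c")            # and can use a second PoP too
print(broker.routes())
```

The flip side, in this toy model at least, is that every zone shows up as its
own (prefix, PoP) entry instead of being covered by one aggregate per PoP, so
the number of routes grows with the number of users.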

..snip..
> I would like to stress that I don't know of any routing suite that is
> going to be happy in the environment I'm picturing based on your
> description of your network topology.  Perhaps you might look into that a
> bit.

Yes, that's why we are making our own :-).

-- 
 
				Petr "Pasky" Baudis
.
> I don't know why people still want ACL's. There were noises about them for
> samba, but I've not heard anything since. Are vendors using this?
Because People Are Stupid(tm).  Because it's cheaper to put "ACL support: yes"
in the feature list under "Security" than to make sure that userland can cope
with anything more complex than  "Me Og.  Og see directory.  Directory Og's.
Nobody change it".  C.f. snake oil, P.T.Barnum and esp. LSM users
        -- Al Viro
.
Crap: http://pasky.ji.cz/