6bone Database cleansing?

David Kessens david@IPRG.nokia.com
Thu, 25 Apr 2002 17:34:02 -0700


Matteo,

On Tue, Apr 23, 2002 at 11:39:34AM +0200, Matteo Tescione wrote:
> The same results here, tried to ping the entire 6bone database "application
> ping" but get only around 10%, 20% of hosts...
> The question is: does anybody think to clean up the 6bone database?

While humans have a natural instinct that everything needs to be
kept tidy and clean, they tend to apply such tidiness rules most
strictly to other people, while cutting themselves a bit more
slack.

For example, I am the maintainer of the registry, but I am also a user
of the registry, and I usually update my objects once every couple of
months instead of doing it after every tiny change that I make to my
setup here. This saves me a lot of time while still keeping
the most important information available: my contact information is
there and people can reach me if there is a problem.

A public database is by nature very hard to police, and it is hard to
stop people from putting incorrect data in it. In fact, if we tightened
the rules to make it harder to register garbage data, we would also
make it harder for legitimate people to register their data. That, in
turn, usually causes the legitimate users to put less effort into
keeping their data up to date, which in the end causes all data to
become stale. I would much rather err a bit on the side of making it
easy to register things than make it too hard. This lets some garbage
get through the system, but I have a big hard disk and there is really
no problem with having extra, unused data/sites in the database. It
really doesn't hurt me as the maintainer at all.

Keeping data up to date is really one's own responsibility. If I don't
update my contact information, people cannot reach me when *I* have a
problem, so I will get burned: the other party will most likely
filter me out, and I won't immediately know about it because my contact
information was out of date.

Having said this, I won't oppose at all any efforts to help the active
sites keep their data up to date and clean. I can think of an
automatic program that, for example, checks whether the reported
applications really exist and, if they don't, sends an email to
the maintainer of that object to let them know that they either have a
problem or might want to consider updating their object. If
somebody has spare time to write such a program, I would fully support
him/her in doing so. The only thing we have to make sure of is that
there are not going to be ten such automatic programs and that they
don't send out their emails every hour or so, so that users will not get
annoyed by all this mail.
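
A rough sketch of the "don't mail too often" part of such a checker,
in Python. Everything here is an assumption made for illustration: the
function name, the (maintainer, object) input format, and the 30-day
interval are all hypothetical. How the checker finds out that an
application does not exist is left out; the sketch only decides who
gets a reminder.

```python
from datetime import datetime, timedelta

# Assumed interval; the only requirement stated is "not every hour or so".
MIN_INTERVAL = timedelta(days=30)


def maintainers_to_notify(failed_checks, last_notified, now,
                          min_interval=MIN_INTERVAL):
    """Group failed objects per maintainer, skipping recently mailed ones.

    failed_checks: iterable of (maintainer_email, object_name) pairs
    last_notified: dict mapping maintainer_email to the datetime of the
                   last reminder sent to that address
    """
    notices = {}
    for maintainer, obj in failed_checks:
        previous = last_notified.get(maintainer)
        if previous is not None and now - previous < min_interval:
            # Mailed too recently; stay quiet until the interval has passed.
            continue
        # One combined mail per maintainer instead of one per object.
        notices.setdefault(maintainer, []).append(obj)
    return notices
```

Collecting all failing objects into a single mail per maintainer, and
keeping one shared record of when each maintainer was last mailed, is
one way to keep the volume of mail down.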

From the other direction, if people write programs to use the registry
data, it makes a lot of sense to think about filtering out
data that is obviously stale or incorrect. For example, one approach
for a ping program could be to ping all the hosts, but not to report
hosts that could not be reached for 7 days or longer.
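
The 7-day rule could look roughly like this in Python. This is only a
sketch: the function name and the dictionary of "last time the host
answered" timestamps are assumptions, and the pinging itself is not
shown.

```python
from datetime import datetime, timedelta

STALE_AFTER = timedelta(days=7)


def reportable_hosts(last_reachable, now):
    """Return the hosts that answered a ping within the last 7 days.

    last_reachable: dict mapping host name to the datetime of its last
                    successful ping, or None if it has never answered
    """
    return sorted(
        host
        for host, seen in last_reachable.items()
        # Hosts unreachable for 7 days or longer are left out of the report.
        if seen is not None and now - seen < STALE_AFTER
    )
```

All hosts still get pinged on every run; only the report is filtered,
so a host that comes back shows up again right away.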

Of course, if people find totally obvious 'crap' in the database, they can
certainly drop me a mail and I will take a look at it to address the
problem.

David K.
---