Tags: google, servers
I briefly corresponded with a user who was asking for access to CVS for pilot-link, to try to solve a problem he was having with photos on his Palm.
I mentioned that CVS was not public, and he responded that he googled around and found a message from me on a mailing list I run, that helped him out.
“Wait, how did google spider a list that I know I restrict them from being able to index…”
So I started googling, and found this little site. It is a site in .ph (the Phillapines).
The problem with this, isn’t really that they provide an offsite archive of lists, but that they remove all email obfuscation from the posts. This means anyone posting to my lists, under the knowledge that their email address will be protected (by my site configuration and Mailman itself), will no longer have that address protected when it gets indexed by this site in .ph.
I also noticed a few moderated lists there, which I know have member-only viewable archives. This means you can’t google around and find posts made in those archives… except that google spiders THIS site, and picks them up, including the user’s email addresses.
I sent the webmaster a VERY harsh email about the situation, giving him a deadline of 5 days to remove any and all references to our lists from his/their servers. I also blocked their entire netblock on port 25 and 80, so he can’t even fetch the mbox version of the archives, and I unsubscribed the user “lurker” from all of the lists I run here.
We’ll see what happens. Probably nothing, but at least I can stop rogue users from subscribing to the list, purely for the purpose of putting list archives somewhere else on the Interweb.