I have over 14 years of email stored in Gmail now, all sorted, organized and tagged. It’s a huge archive going back through several jobs and plenty of experiences, personal and professional. Some day it might make good data-mining material for an autobiography, but I digress.
One thing about Gmail has been bugging me for years! Google “folds” mail away from your Inbox, out of view, hidden, but not in any folder (Google calls them “Labels”, though over IMAP each one shows up as its own folder).
There doesn’t seem to be any rhyme or reason to this: no obvious rule for which email gets taken out of your Inbox and hidden away from you. A bug? A feature? Who knows, but it’s annoying.
Here’s an example of what this looks like:
Notice that my search for “email@example.com” (Pinterest’s mail robot) returns several hits, but only one of them is in my “Inbox”. The others appear nowhere. They’re not in any folder anywhere in my entire IMAP or Gmail hierarchy.
They’re completely hidden, invisible, and only show up when you do a specific search for those terms. In other words, you can’t clean that junk out, delete it, unless you search for it first. Chicken-and-Egg problem, because you can’t search for what you don’t know exists.
There have been hundreds and hundreds of posts trying to come up with solutions to this problem, including using the “-label:” syntax to exclude labels from the search, leaving only “unlabeled” email.
That works great if you have a handful of labels, maybe a dozen or two, but I have hundreds of IMAP folders (erm, “labels”), and they’re nested pretty deep in some cases. Appending all of my labels into one big long search string does not work, because the search field has a length limit. Fail.
So then I tried the somewhat magical “-label:*” search, but for some odd reason it returns mail that has labels too. Another fail.
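To make the “-label:” approach concrete, here is a minimal Python sketch of how such a search string could be built from a list of label names. The label names are made up for illustration, and the conversion of “/” and spaces to hyphens is my assumption about how Gmail search expects nested or multi-word labels to be written. Printing the length shows how quickly the string grows as labels are added.

```python
def label_to_search_term(label: str) -> str:
    """Turn a label name into the form used after 'label:' in a search.

    Assumption: nested labels ('Parent/Child') and labels containing
    spaces are written with hyphens in Gmail search.
    """
    return label.strip().replace("/", "-").replace(" ", "-")


def build_exclusion_query(labels) -> str:
    """Join one '-label:<name>' term per label into a single search string."""
    return " ".join(f"-label:{label_to_search_term(name)}" for name in labels)


if __name__ == "__main__":
    # Hypothetical label names, for illustration only.
    my_labels = ["Work", "Work/Project X", "Lists/palm-dev"]
    query = build_exclusion_query(my_labels)
    print(query)
    print(f"{len(query)} characters for {len(my_labels)} labels")
```

With a few hundred deeply nested labels, the resulting string runs to many thousands of characters, which is where the search field gives up.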
Once you install the extension, you’ll find a new “Unlabelled” link on the left side, in your labels group. Clicking that will reveal email with no labels, the “hidden” email that Gmail squirrels away from you, away from your searches, away from your folders.
Find it, label it, or as in my case, kill it off. I have 10,543 emails in my “All Mail” folder, and I’m sure a few hundred to a few thousand are going to fall into the Unlabeled category.
Now I can begin the process of pruning that out and cleaning out my mail even further. I hope this helps others who may be facing the same problems.
I’ve been slowly loading all of my mail into Gmail to see whether the system is a better way to manage my email, “folder-free”.
Gmail tags emails with “labels” and “archives” messages instead of using the classic mail folder hierarchy. Productivity experts far more expert than me continue to praise the system as being better, so I decided to give it a try… on 10 years of my email; over 300,000 messages.
But today I noticed that some of my larger mail folders had duplicate emails in them. LOTS of duplicate emails (one folder had over 15,000 duplicates!). Removing that many dupes from hundreds of local IMAP folders was not going to be a fun task…
I looked around to find some good tools to do it, and came up with several shell scripts, Python tools and other home-grown things, but nothing I wanted to really try on my large email archive.
Then I found the Remove Duplicate Messages add-on for the Mozilla Thunderbird mail client. I don’t use Thunderbird, and prefer to use Evolution or Outlook 2007 for managing my PIM data now (yes, I really do use Outlook 2007, because frankly, nothing in the Linux space even comes close in functionality).
But I decided to give it a try. I configured my local IMAP account in Thunderbird, let it query my folder list and then installed the add-on. Here is the process to delete those duplicate messages:
- When your IMAP account is configured in Thunderbird, expand the folder you wish to check for dupes.
- Right-click the folder and select “Remove Duplicate Messages” (highlighted in red in the screenshot below):
- A window will pop up after it scans for dupes, offering the following:
- Click on “Delete Selected” to remove the duplicate messages it found.
That’s it. It’ll move those messages to the Trash folder, and you can go in there later, right-click the Trash folder and select “Empty Trash” to permanently delete them.
Pretty simple and easy. Obviously make sure you back up your mail folders FIRST before you try any of this, just in case.
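If you’d like to see how many duplicates you have before touching anything, a read-only check is easy to sketch in Python. This is not how the Thunderbird add-on works internally (I don’t know what it compares); it is a minimal sketch that assumes your mail lives in a Maildir such as ~/Maildir and that two messages with the same Message-ID header are duplicates. The function name is mine, it only looks at one folder, and it never deletes anything.

```python
import mailbox
from collections import defaultdict
from pathlib import Path


def find_duplicates(maildir_path):
    """Return {message_id: [keys]} for Message-IDs seen more than once.

    Read-only: messages are only inspected, never modified or removed.
    Messages without a Message-ID header are skipped.
    """
    box = mailbox.Maildir(str(maildir_path), create=False)
    seen = defaultdict(list)
    for key, message in box.iteritems():
        message_id = message.get("Message-ID")
        if message_id:
            seen[message_id.strip()].append(key)
    return {mid: keys for mid, keys in seen.items() if len(keys) > 1}


if __name__ == "__main__":
    # Assumed location; adjust to wherever your Maildir lives.
    path = Path.home() / "Maildir"
    if (path / "cur").is_dir():
        duplicates = find_duplicates(path)
        extra_copies = sum(len(keys) - 1 for keys in duplicates.values())
        print(f"{len(duplicates)} duplicated Message-IDs, {extra_copies} extra copies")
```

Matching on Message-ID alone is a deliberate simplification. It is cheap and catches the common case of the same message being imported twice, but it won’t notice copies whose headers were rewritten along the way.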
Update: After I ran this through all of my folders and deleted a lot of “legacy” mail folders (old 3Com palm-dev Palm mailing lists going back to 1999), I now have 144,962 messages in my local mail archive (a 52% reduction in number of messages).
Much better, and easier to manage, search and back up to the FreeBSD backup array. It also freed up 800 MB of space in ~/Maildir in the process.