Tuesday, 18 November 2008

Just Delete It Claus, Just Delete It

Claus Mikkelsen has woken up recently and started posting after a long break. Perhaps he's preparing for all those impending Christmas deliveries. Anyway, the crux of his post is to explain how he's moved from 2TB to 4TB of home storage rather than take the time to sort out the mess of his home data. He then goes on to detail lots of clever technology which allows more data to be stored with less.

As I've posted many times before, we're just storing ourselves up a heap of trouble by not addressing the underlying issue here - delete the unwanted data.

We're creating storage landfills which will still need to be sorted out in the future. Like toxic waste in a rubbish dump, losing that critical file will eventually cost dearly.

Think of Claus' problem. Moving from 2TB to 4TB doubles the amount of data that needs to be backed up (how do you back up 2TB of storage at home?), means any restores take longer, and means spending more time searching for that file you know you had once but can't remember what you called it. And if you use an online service for backup, it means you are paying unnecessarily every month.

Take my advice: spend the time developing (a) a good naming standard for your home files and (b) a good standard for directories for storing your home files, then (c) delete the stuff you don't need. Immediately. Period.
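
For point (c), duplicate copies are usually the easiest candidates to spot. As a minimal sketch, assuming Python 3 is available and that byte-identical files are what you want to find, something like the following lists groups of identical files under a directory. The function names (`file_digest`, `find_duplicates`) are made up for illustration, and the script only reports what it finds; deciding what to delete stays with you.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Group files under root that have identical content.

    Returns a list of lists; each inner list holds the sorted paths
    of files whose contents hash to the same SHA-256 digest.
    """
    # First pass: bucket by size, since files of different sizes cannot match.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Second pass: hash only the files that share a size with another file.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

Calling `find_duplicates` on a folder of your choice returns the groups to review by hand. Comparing sizes before hashing is just a shortcut to avoid reading every file in full.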


the storage anarchist said...

Hear! Hear!

the storage anarchist said...

Hear! Hear!

Tintop said...

Looks to me like you could do with some de-dupe on your blog comments ;)

Martin said...

I have sympathy for both the delete it and the keep it. I hoard digital information like a dragon hoards gold.

I was seriously looking at small arrays on Ebay yesterday before I had a reality moment!

But you are talking to the people who actually moved house once to accommodate the books.

Matt Povey said...

It's hard not to agree. I don't though. It's all well and good to tell people to apply the same standards we (usually unsuccessfully) attempt to apply to business data to their personal data. The fact is though that they won't and in most cases can't. These problems won't be solved by applying basic data management principles any more than emailed exhortations to delete mail stop data growth in businesses.

The base problem is that file systems are not appropriate places to be storing the data that we generate through our lives. The different sorts of data really need to be stored using mechanisms and semantics that are appropriate to them. I don't want my mother to have to think of her photos as files (and frankly - she can't) - I want her to be able to treat them as 'photos' - analogous to the physical objects she spent her life making.

That means for Mum - photos in Picasa, email in gmail and music in iTunes. The problem is the stuff that doesn't yet have a sensible home outside of the file system.

The thing that bugs me is that I still have to configure her computer with a back-up service to make sure everything is safe. In other words, although these apps are doing a great job of abstracting away the file system for me, there is still too much 'blue sky between the clouds' for me to really leave my mum to her own devices. There's still just too much sitting in the file system.

It looks like there are going to be a couple of services launching in the not too distant future that will attempt to bring a stronger resolution to this problem. It'll be interesting to see how successful they are as, frankly, I'm sick to death of managing my own and my family's data.

Chris M Evans said...


Interesting points, however I would posit that the issue of whether objects are stored as files is not the whole problem. Using your photo analogy, everyone will have lots of old photos on paper stored in shoeboxes, albums and so on which are also inappropriately labelled or indexed.

It comes down to having a strategy for indexing your information - whether physical or electronic. In the absence of another method, having a logical file structure and index will certainly help.

Data management, offline or online takes effort, which most people (including myself) are not focused on doing.

Thanks for the comments; it gives me food for thought on future posts!

Pete said...

We will all most likely continue to fill the available space with stuff. It's human nature. Who has any storage space in their home that is empty?

I love the idea of deleting but I don't think it's realistic. Google Search gives me hope that software will make it all findable for me someday. I'm going to keep keeping.

Chris M Evans said...

Pete, I don't disagree - people do hoard - I guess it's human nature that comes from a long time back in history. However, I think that search engines are going to have to get a whole lot more intelligent to solve the problem we have today. Google is good, but I find non-web search works maybe one time in 10 to find what I was actually looking for.

Matt Povey said...

I agree that indexing\tagging\metadata is the issue. Ultimately, as individuals, we face the same problems enterprises and other organisations face in microcosm.

The point here is that data management, like many things in IT, doesn't work very well when you have to have humans doing the heavy lifting. My mum has a number of beautiful albums that have been organised and laid out. She also (as you correctly identify) has shoe box after shoe box full of photos. She will never sort through them all, best intentions aside, but in the digital world a computer can (or could).

In the media industry, automated metadata generation has been talked about for years now. In serious archives, the technologies are really still too weak to do anything but suggest metadata (i.e. their output is untrustworthy).

In the home though, that doesn't matter so much - metadata doesn't have to be perfect. With that in mind, how long will it be before you tell Picasa what each of your family members looks like and then let it run off and figure out which of your 2000 photos they are in? This is Google we're talking about, so expect it to recognise landmarks, figure out that you were in Rome or New York, and tag appropriately.

Of course, photos are 'easy' in comparison to other formats and objects. Videos, letters, invoices, tax records, receipts - we need systems that can catalogue and weed through our data to find, store and protect the stuff that really matters.

The bottom line though, is that there's going to be money in them thar hills o' data.

I've got to start writing shorter comments :).