Expect to Archive Everything

Today, everyone worries about what to archive. For example, should you keep drafts of documents? Personal communications?

I think we'll end up archiving everything, except egregious garbage like spam:

  • It's too hard to get users to conform to policy.
  • Human-understandable policies, for example "tax records," are too hard to capture in automatic filters. The filters are too inaccurate.
  • It's impractical to get users to classify everything, and automatic classification is too crude.
  • You never know what you might want later. Stuff you think you won't want now may end up being very useful.
  • The cost of storage is trivial when looked at on a per-user basis.

It won't happen overnight. But I guess, by 2011 or so, an archiving-everything policy will be the norm.

Note this doesn't assume everything will be kept forever. It leaves open the question of retention policies.

... David Ferris

7 Comments

  1. Mike Dodson
    Posted April 3, 2008 at 7:33 AM | Permalink

    I agree. Selective archival applies unnecessary pressure to make decisions in a (near) real-time fashion. The focus should be on making retention decisions, and how soon you can remove something from your archive, rather than whether to put it in your archive. Some messages may not seem important to archive until there is an “incident” and the seemingly uninteresting email is found to be on the trail leading up to the incident.

  2. Patrick
    Posted April 8, 2008 at 3:16 PM | Permalink

    Archive everything but don’t assume everything will be kept forever?

    Uh huh. And when will you get around to deciding that you need to purge stuff? When the disks fill up? When the cost of storage really amounts to something? When you can’t possibly maintain proper backup systems? When the cost of discovery (i.e. finding the needles in that haystack) exceeds the cost of settling every lawsuit presented to the company?

    Oh I know, let’s just pick an arbitrary number of years and toss anything older than that. Never mind intellectual property or pensions or even history. Hmmm… maybe we do need retention periods then. Oh wait, maybe we need to classify records when they are created so that we don’t throw them out at the wrong time.

    Or should we just worry about that issue after we’ve retired?

    Once upon a time, in the old world of paper, people made decisions in real time about what to do with the records that they created and received. It was called filing. Now granted, the volumes were orders of magnitude different. People understood the cost of creating a “record” (i.e. having a secretary type a letter) and limited themselves (or were limited by the lack of unlimited resources). Today, the flood of information (let’s face it: we’re really talking about email here) is such that people completely mis-use the medium and create things that they never would have created in the past. They copy a million people where one would be enough. And all those recipients stuff that email away somewhere, “just in case”.

    The bottom line is that we have to fundamentally change behaviors in the office. In ye olden days, when a kid went to work for a company, they had to play by the rules. The fact that they never filed anything at home was not an excuse for not filing anything in the office. The same holds true for the office today. Just because the YouTube generation has grown up without any controls doesn’t mean that business needs to put itself at risk and incur unnecessary expenses to pamper their bad habits.

  3. Posted April 9, 2008 at 1:40 AM | Permalink

    Thanks for the input Patrick. I suspect we won’t purge stuff, or we’ll leave it to be done automatically according to some fairly crude set of filters. And that we’ll fall back upon better and better search technology to dig out material from the archive. But as you nicely point out, I may be proved mistaken.–David

  4. Posted April 9, 2008 at 9:39 AM | Permalink


    Fascinating response to Patrick’s comment. I couldn’t disagree more with your approach. It will leave your organization, or any organization for which you consult, vulnerable to far broader discovery orders than would be the case had the electronic records been classified and disposed of when no longer required for regulatory or business needs. You may not have sufficient experience in the field of litigation yet, but I’m sure that you will. When that happens, remember: if you have it (or if your consulting client has it), it is fair game during discovery. If it happens to be a “smoking gun email” that was really just meant as a joke, you or your client firm will still have to produce it, and it may well damage that firm’s case and that of their attorneys. By the way, if that firm does have any Records Management program at all, your approach will ensure that the program and any retention schedules and disposition efforts are dismissed as not credible, because they have not been enforced. Should that happen to one of your clients, I would anticipate the possibility that you will be included in that litigation, and that you will have some significant legal exposure as well.

    Best wishes in your approach. For those who would prefer to plan now and to avoid future disaster, I recommend embracing Patrick Cunningham’s recommended approach. It seems more difficult now, a bit like cleaning one’s room seems a difficult, time-consuming and unrewarding endeavor, but in today’s litigious society it’s the best approach in the long run. The “gosh, this real work is just too difficult” argument is not likely to meet with great sympathy in the courts, nor is it apt to serve anyone’s long-term needs very well.

  5. Peterk
    Posted April 9, 2008 at 5:31 PM | Permalink

    First of all, let’s be more accurate. You’re not archiving anything. You are STORING it. Archiving refers more particularly to protecting items of long-term historical value to an organization. The rule of thumb is that no more than 5% of an organization’s information is considered to be of archival value.

    So please don’t use the term “archive”; use “store” or “storage”, because that is what you are really doing. As others have pointed out, organizations need to get organized. And the best way is to establish a records management program that will work hand in hand with IT and Legal to determine what information needs to be kept, for how long and in what form. IT helps to determine the technology to support the decisions made by this governance group.

    One thing would be to check out the resources available from ARMA International https://www.arma.org

  6. Posted April 11, 2008 at 4:42 AM | Permalink

    I appreciate the argument that by keeping everything, one increases one’s litigation risk. This is a widely accepted argument today, and a good one. Nevertheless, I’m not sold on it:
    * First, deleting material doesn’t mean it’s no longer there. Electronic traces may well remain, quite out of your control. Other parties may have a copy, and may then use it against you. In this case, you have also lost much of the surrounding contextual information, and defence is thus harder.
    * True, you can be hoist by your own petard. But most of us aren’t all bad. We do some bad things sometimes. And we act in good faith much of the time. Having archives means that one can better defend oneself against lawsuits, and can also better assess the strengths or otherwise of one’s case and settle earlier.

    Litigation exposure is only one of the issues to consider when archiving material. Over the long term, I don’t think it will determine archive policy. And over the long term, with respect to litigation threats, I think having a long memory will be seen as something that can cause one pain, as well as make life much better than it would be otherwise. It’ll be seen as a mixed blessing, instead of just a nasty exposure.

    BTW, I really appreciate the comments and lively discussion–many thanks–David

  7. Tony Whitby
    Posted April 15, 2008 at 7:56 AM | Permalink

    We are really struggling with these issues and any advice would be very welcome:
    Do we keep everything for ever? Keep everything for a period of time (10 years) and then delete? Or implement a retention policy, categorise all mail by type and assign a retention period for each type? And then how is that enforced: by training, or by analysis of the content of mail?

    All very difficult. Added to that, we are an international company, and there seems to be a marked difference of opinion between the USA, who would like to retain as little as possible, and the UK, who would like to keep much more. The policy must be universal, as otherwise documents destroyed in the US may be retained in the UK, and in a discovery situation that could nullify what the US wants.

    I’m only a poor techy trying to square these circles but it makes finding the right technical solution very hard when the requirements have not been fixed.

    I think Patrick’s comments above neatly sum up the problems.

One Trackback

  1. […] place into words.  A top document-oriented archiving analyst (and my good friend), David Ferris, quite agrees. As David puts it: I think we’ll end up archiving everything, except egregious garbage […]
