Interpreting People Connections in Electronic Archives, and NodeXL

People have been communicating digitally, and leaving electronic traces, since the advent of email. Those traces can yield valuable information about who communicates with whom and about what, in ways that are only just beginning to be understood.

Today, we usually sift through electronic archives by doing word searches. But more is possible. In particular, an archive can be used to build networks of who communicates with whom, and these reveal relationships that word searches miss. Early commercial products that identify such networks come from Clearwell Systems and Attenex/FTI Consulting. They are used for e-discovery, to investigate who has been talking to whom about a given subject.
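To make the idea of a "who talks to whom about what" network concrete, here is a minimal sketch in Python using only the standard library. It is not how Clearwell or Attenex/FTI work internally. The function name `build_edges`, the sample messages, and the simple keyword filter are all assumptions made for illustration.

```python
from collections import Counter
from email import message_from_string
from email.utils import getaddresses


def build_edges(raw_messages, keyword=None):
    """Count sender -> recipient pairs across raw RFC 822 style messages.

    If `keyword` is given, only messages whose subject or body mentions it
    (case-insensitively) are counted. Hypothetical helper, for illustration.
    """
    edges = Counter()
    for raw in raw_messages:
        msg = message_from_string(raw)
        if keyword is not None:
            text = (str(msg.get("Subject", "")) + "\n" + str(msg.get_payload())).lower()
            if keyword.lower() not in text:
                continue
        senders = [addr.lower() for _, addr in getaddresses(msg.get_all("From", []))]
        recipients = [
            addr.lower()
            for _, addr in getaddresses(msg.get_all("To", []) + msg.get_all("Cc", []))
        ]
        for sender in senders:
            for recipient in recipients:
                if sender and recipient and sender != recipient:
                    edges[(sender, recipient)] += 1
    return edges


# Made-up sample archive: three short plain-text messages.
SAMPLE = [
    "From: Ann <ann@example.com>\nTo: bob@example.com, Cy <cy@example.com>\n"
    "Subject: Merger timeline\n\nDraft attached.\n",
    "From: bob@example.com\nTo: ann@example.com\n"
    "Subject: Re: Merger timeline\n\nLooks fine.\n",
    "From: cy@example.com\nTo: bob@example.com\nSubject: Lunch\n\nNoon?\n",
]

edges = build_edges(SAMPLE, keyword="merger")
for (sender, recipient), count in sorted(edges.items()):
    print(f"{sender} -> {recipient}: {count}")
```

A real archive would need more care (multipart bodies, aliases for the same person, mailing lists), but the resulting edge counts are the raw material that a network tool then visualizes and analyzes.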

Digital communication other than email is also flourishing: computer-based voice, social networks, instant messaging, Twitter, SMS, and SharePoint. In each case, there's great potential for technology that lets us see who communicates with whom, about what, and how they do so.

Academic Research

Exciting academic work has been done in this field over the last 25 years, mainly by mathematicians, computer scientists, and sociologists. A rich and specialized body of graph theory has developed, known as social network analysis. It comes with a set of concepts that are useful, if unfamiliar to most people. The work is far more sophisticated than commercial offerings such as those of Clearwell and Attenex/FTI would suggest. For a summary of what's happening, see here.
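Two of the standard concepts from social network analysis are degree centrality (the share of other people someone is directly connected to) and betweenness centrality (how often someone sits on the shortest path between two others). The sketch below computes both in plain Python, using Brandes' algorithm for betweenness on an unweighted, undirected network. The people and their connections are invented, and the function names are assumptions for illustration, not part of any particular product.

```python
from collections import deque


def undirected_graph(pairs):
    """Build an adjacency-set dictionary from (person, person) pairs."""
    graph = {}
    for a, b in pairs:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree_centrality(graph):
    """Fraction of the other people each person is directly connected to."""
    n = len(graph)
    return {v: len(neighbours) / (n - 1) for v, neighbours in graph.items()}


def betweenness_centrality(graph):
    """Unnormalized betweenness via Brandes' algorithm (unweighted, undirected)."""
    score = {v: 0.0 for v in graph}
    for source in graph:
        order = []                        # nodes in order of increasing distance
        preds = {v: [] for v in graph}    # predecessors on shortest paths
        paths = {v: 0 for v in graph}     # number of shortest paths from source
        paths[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        while order:
            w = order.pop()               # farthest nodes first
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each unordered pair was counted once from each end.
    return {v: s / 2 for v, s in score.items()}


# Invented example: two tight groups of three, joined only through "dee".
graph = undirected_graph([
    ("ann", "bob"), ("ann", "cy"), ("bob", "cy"),
    ("cy", "dee"), ("dee", "eve"),
    ("eve", "fay"), ("eve", "gus"), ("fay", "gus"),
])
degree = degree_centrality(graph)
between = betweenness_centrality(graph)
for person in sorted(graph):
    print(f"{person}: degree={degree[person]:.2f} betweenness={between[person]:.1f}")
```

In this made-up network, "dee" has only two contacts yet the highest betweenness, because every path between the two groups runs through her. A simple count of contacts would miss that kind of bridging role.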

Over the next 25 years, I think this field will yield rich commercial benefits. Conversely, engagement with the commercial world will stimulate and lend focus to academic research. I suspect there will be many high-value practical applications, most of them hard to see right now. Applications that can be identified now include:

  • Legal discovery
  • Monitoring for criminal activities, or activities that are not in compliance with regulations
  • Identifying subject matter experts
  • Exploring market needs hypotheses
  • Customer relationship management (CRM)
  • Identifying key influencers
  • Generating a corporate map based on organizational connections, location, or job function


NodeXL is an interesting example of technology for interpreting human networks. It's open source and closely tied to academia, but can be used commercially. It's built on Excel, and can import data from sources such as Twitter, Flickr, YouTube, and Outlook. A nice video tutorial is here.

For the moment, most organizations will want help using NodeXL. Marc Smith, one of its developers, offers professional services for it. See Connected Action Consulting.

... David Ferris, with thanks to Jeff Ubois and his Personal Archiving Conference, and Marc Smith

