Thursday, September 27, 2007

Storage is cheap, why select?

Tomorrow I will be participating in the ASIST student chapter's panel discussion in Wilson Library. The topic is, "Storage is cheap. Why select?" Below are the general comments I plan to make in my 5-minute time slot.

Saying that because storage is cheap we should therefore save everything suggests that, at its heart, the issue of selection is a technical question driven primarily by resource constraints. It suggests that we select because we know that we can’t save everything and we therefore need to privilege some records above others in order to create coherent collections. It also suggests that once it is technically and economically possible to “save everything,” it is good to do so because it allows us to avoid arbitrary selection decisions, and thereby create a world where information is free and people can pick and choose knowledge objects as they will. This is an attractive belief, but it is false. First, the issue of selection is as much an ethical issue as it is a technical and economic issue; and second, selection is inevitable. So the important question related to selection is not how we can avoid it -- we can’t. The important question is two-fold: who will do the selection, and how transparent will the selection rules be to the people who use the information objects?

In his now-classic “The Documentation Strategy and Archival Appraisal Principles: A Different Perspective,” Richard Cox explicitly highlights twelve primary principles of an archival appraisal theory. Principle 1 is this: “All recorded information has some continuing value to the records creators and to society.” He then notes that this is an assumption held widely by archivists, probably because of the frequency with which they come from the humanities, in particular history. In a perfect world, we wouldn’t have to get rid of any information because its very existence implies that it holds some value. In our less than perfect world we have time and resource constraints and must engage in selection and appraisal to determine what will provide the most value.

Archivists, because of their training and expertise in making selection decisions, naturally believe that they are best qualified to make such decisions. Nonetheless, archivists in the digital age are facing selection decisions that they did not face prior to the 20th Century. The other day on O’Reilly Radar, Peter Brantley, Executive Director of the Digital Library Federation, discussed a workshop he participated in at UC Berkeley that sought a policy for how to preserve and publicly host incidental war footage from Iraq and other sites of armed conflict. The Internet Archive took part in these conversations, and Brantley noted:

“of around 250 videos being posted daily at the Internet Archive, approximately 30-50 could potentially be called into a process of review. These include images of hate speech or obvious propaganda, guns, victims, or long distance violence (snipers, car bombs, etc.) Some of the videos are excruciatingly violent (Trust me: extremely graphic and intimate portrayals of war and harm). In some of these videos people are identifiable through the explicit use of names, passport photos, or through questioning that reveals personally-identifiable information.”

A number of questions arise with a situation like this, such as “Can someone get killed using information in this video?”, or alternatively, “Could someone get killed because they are seen in this video?”, “Am I helping terrorists recruit or communicate?”, “Am I helping the public understand?”, and “What is the archive’s or curator’s personal responsibility” for the consequences of this video being publicly viewable?

It seems that with potential information objects such as this, three primary possibilities exist with regard to the “save everything” approach. The first is to save everything and allow it all to be publicly searchable and viewable. I have problems with that – in a society, for example, where a woman’s life could potentially be ruined if not snuffed out because it becomes public knowledge that she was raped, I would hesitate to allow that kind of information to be either publicly searchable or viewable. The second possibility is that everything is saved, but access is restricted. It seems to me that this approach really doesn’t do anything more than push the selection decision to a different rung on the ladder. In other words, a repository could keep the information, but it would still require active data management to ensure that highly sensitive material doesn’t reach the wrong hands. This implies that the economic constraint of selection still exists; it has simply been pushed to a different level. It also muddies the waters with potential censorship issues and concerns about what will happen if some other organization ends up owning or controlling a given repository. Can we still trust that this unknown “third party” will “do the right thing”? The third possibility is that we wait for some governmental restrictions to be put in place and then we can just pass the moral buck on to the government – a really reassuring thought.

At their heart, selection decisions are decisions that reflect the ethical and culturally ingrained assumptions and values of the people in control of the knowledge objects. To pretend that just allowing a free-for-all will allow us to avoid these types of decisions is somewhat akin to believing that if we just line up all those in power and shoot them, we will have inevitably changed the moral structures of our society. If not during this revolution, then maybe during the one next decade.
