Readers of the lost archive

IN Washington, statisticians are having difficulty working with the raw digital data for the 1961 census of the United States. In Ireland, a research project at Trinity College, Dublin, to update a major sociology study from the 1960s is being frustrated because the original computerised baseline data cannot be recomputed.

In both cases the information was preserved on old-fashioned punched cards, the forerunners of today's magnetic disks. The last machine capable of interpreting them may have been scrapped years ago.

These are but two examples of a major crisis already facing modern sociologists, economists and medical researchers whose work can span a lifetime - both their own and those of the subjects of their study. As each new generation of data-processing technology leapfrogs its predecessor, what happens to the priceless, hard-won data left behind on obsolete cards, floppies, tapes and hard drives?

An even greater problem faces historians: what will 21st-century historians make of this year's civil service e-mail? Will they be able to read it - and will it survive at all?

"There is a real danger there will be a gap covering a generation between the era of all paper records and the resolution of this problem," says Ken Hannigan, senior archivist at the National Archives. "I'm gloomy about this generation of records."

"I don't think the historical record of this generation will be as rich as the past," warns Louis Cullen, Professor of Modern History at TCD.

The National Archives has statutory responsibility for the preservation of Irish state records. It was in the headlines last week after the discovery of adoption files on some 2,000 Irish children sent to the United States between 1948 and 1962, often under false names.

Under the National Archives Act, 30-year-old Government records are handed over to the National Archives and opened to historians. The irony is that by the time today's civil service e-mail is ready for the archives (in 2026), the machines capable of reading it might be long obsolete. Even if sufficient old PCs survive to read ancient floppies, will the right programs survive into the next century? And will age have rendered the disks useless, like so much old video film?

Paradoxically, archivists are facing this crisis in an era when there has never been such a proliferation of information or of the means to store, interrogate and distribute it. Anybody who has explored the Internet, or major online services such as CompuServe, can testify to this explosion of instantly accessible knowledge. The National Archives has already reached a point where 10 times more people log onto the organisation's Web home page than visit its Bishop Street building in person. But unless a means can be found to preserve such electronic information, it will become as ephemeral as a phone call.

At Trinity College, Dublin, an update of a major sociological study of the Skibbereen area is stalled because Professor John Jackson cannot find a card reader. The equipment is required to interpret the punched cards which store the original research he conducted in 1963 to track the impact of emigration and other factors on the local community. "We find we have a common problem with the archivists," he says.

One proposed solution would be to reduce the data to so-called "flat" formats such as ASCII files but, as in the case of Professor Jackson's research and the US Census figures, that means the data cannot be conveniently reworked to produce new interpretations.

Now efforts are under way to try to find a solution. One bright hope is Mark Conrad, an electronic records specialist from the National Archives in Washington. He is teaching in UCD's archive department on a Fulbright Scholarship, and he addressed a recent seminar on the preservation of electronic archives organised by Ken Hannigan and other worried archivists.

"Papyrus records have lasted thousands of years; a modern floppy text file can be obsolete in 10 years - they are much less stable than other media," he says. "My favourite example of how serious this problem is are the old eight-track stereo tapes of the early 1970s. Who has both tapes and players any more?"

But despite his extensive US experience, he admits to having no ready answers. In Washington, electronic archives are now stored on wide magnetic tape, checked every year and rerecorded every 10 years. This process, known as migration, at least ensures the material survives in some format. But most archivists agree it is only an interim solution.
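For readers curious what migration looks like in practice, here is a minimal sketch in Python. It is an illustration only, not a description of the system used in Washington: the function names (sha256_of, migrate, still_intact) are hypothetical, and the use of a SHA-256 checksum to stand in for the yearly "check" is an assumption. The idea is simply to copy a record to fresh storage, confirm the copy is bit-for-bit identical, and keep a fingerprint that later checks can be compared against.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate(source: Path, destination: Path) -> str:
    """Copy a record to new storage and confirm the copy matches the original.

    Returns the digest so it can be kept alongside the record.
    """
    before = sha256_of(source)
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source, destination)
    after = sha256_of(destination)
    if before != after:
        raise IOError(f"copy of {source} does not match the original")
    return after


def still_intact(path: Path, recorded_digest: str) -> bool:
    """Periodic check: does the stored file still match its recorded digest?"""
    return sha256_of(path) == recorded_digest
```

A sketch like this only guarantees that the bits survive the move. It does nothing about the harder problem the archivists describe, which is whether any program will still be able to interpret those bits.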

The National Library has already received some computer disks as part of the legacies of writers who willed their papers to the Kildare Street archive.

"Those disks won't be readable in a few years' time - all we can do is keep copying them onto the next generation of storage media," says Brian McKenna, Keeper of Systems at the Library.

Historians delving into government archives are familiar with original letters, postcards, carbon copies of replies and briefing position papers, all collected neatly into an appropriate cardboard file. A pencilled scribble in the margin can lead to a major reinterpretation of history. Computerisation and e-mail have changed all that.

"There is no central filing system any more; everybody keeps their own files on floppies, or on the hard disks of their PCs," Ken Hannigan says. The lack of central file registries will also pose difficulties for people exercising their rights under the forthcoming Freedom of Information Act, he believes.

Professor Jackson thinks trying to preserve the huge volume of official e-mail for the historical record is "crazy", while Professor Cullen thinks it will only survive if it is printed. In Washington, the National Archives recently lost a court battle which now compels it to preserve the e-mail of the Bush and Reagan eras.

"Who read what and when is recorded in e-mail systems, but capturing that and moving it through time is an issue. There is no electronic equivalent of marginalia," Mark Conrad says.

Hannigan is less pessimistic about the computer archives of the future: "Computers and computer programs are becoming more standard and uniform, and newer technologies may mean preservation becomes less of a problem. As time goes by we should become more competent in the management of electronic records."

One possible solution, he says, would be for records no longer to be handed over to the National Archives under the 30-year rule but to be maintained by the appropriate government agency and accessed online, using the National Archives as a gateway.

"We will become a virtual archive, if you like," he says. "We would regulate access for people who log on through us." The alternative, he suggests, is daunting - with the National Archives becoming a museum clogged with thousands of different computers and programs.

In the meantime, the soul-searching continues. "We are a worried community," Hannigan admits. "We will have failed as archivists if we're unable to preserve the records of our own generation."

The ultimate irony is that the proceedings of last December's conference will also be published on the Internet.