redbird: closeup of me drinking tea, in a friend's kitchen (Default)
([personal profile] redbird Jun. 11th, 2001 02:37 pm)
It sounded plausible, last week: we have a bibliographic database that we're putting on the Web, and my bosses want me to look over the author names for coding problems, both actual errors (some doozies there, each of which has to be dealt with individually) and codes that aren't recognized if I run IE with "language" set to English only.

So I started scanning at A, with a separate (Netscape) window open to a table that offers numeric codes, like &#268; for Č, to replace such things as š and ł.

Three hours later, not counting my lunch break, I was up to "Ademovič" and starting to realize the enormity of the problem.

A database of more than 300,000 items, most of them with multiple authors, each of which has to be at least glanced at. It isn't much comfort that, for example, there are four entries under "Ademovič", not when they sometimes have different first names, which also need to be examined.

That Bernie sat me down and explained this, and I asked questions like "what language should I set it to accept?" rather than saying "We couldn't do this in three weeks even if I did nothing else" shows that neither of us really grasped the problem as he was defining it.

I've left him voice mail--which he will retrieve on Wednesday--explaining the difficulty, and am wondering if there's any good way to automate this. I can't just scan the source for tags, because in most cases the tags aren't actually wrong; it's just that there are lots of different character sets out there, many of them overlapping, and with close to 20 years of data on material from most of the world, we're using characters in too many of them to just specify one and be done with it.
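
One possible way to shrink the job, sketched in Python: tally each distinct character code across all author names, so that every code is checked against the table once rather than once per record. This assumes the author names can be exported as plain strings; the function name `tally_codes` and the sample names are made up for illustration and are not taken from the actual database.

```python
import re
from collections import Counter

# Matches named (&scaron;), decimal (&#268;) and hex (&#x10C;) character references.
ENTITY_RE = re.compile(r"&(#[0-9]+|#[xX][0-9A-Fa-f]+|[A-Za-z][A-Za-z0-9]*);")

def tally_codes(author_names):
    """Count each distinct character reference and raw non-ASCII character."""
    counts = Counter()
    for name in author_names:
        for match in ENTITY_RE.finditer(name):
            counts[match.group(0)] += 1
        # Remove the references so only literal characters are examined next.
        remainder = ENTITY_RE.sub("", name)
        for ch in remainder:
            if ord(ch) > 127:
                counts[ch] += 1
    return counts

# Hypothetical sample; real input would come from an export of the database.
sample = [
    "Ademovi&ccaron;, A.",
    "Ademovi&ccaron;, B.",
    "Wa&lstrok;&#281;sa, L.",
    "Müller, K.",
]

for code, n in tally_codes(sample).most_common():
    print(code, n)
```

A tally like this would not catch the outright errors, which still have to be dealt with individually, but it would show which codes are in use and how often, without anyone having to read every record.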

(Further technical details suppressed, in the Lewis Carroll sense of the term; I'll be under my desk if you need me.)

From: [identity profile] eleanor.livejournal.com

desk sets


Well, if you're under your desk, you won't get caught in the anticipated rain, and we can have food sent in to you.
