Finding All Those Pages

We’ve used indexes (also called indices) for as long as we’ve had writing. Ever since we made separate rooms for different grains, or put fences between fields, we’ve kept lists of which was which. Chapters in books, page numbering, wooden pigeon-hole sorting cabinets, street addresses: they’re all types of indexes. Even the numbers on sports uniforms are used to link players’ names with their positions on the field, their lockers in the clubhouse, their paychecks and so on. Indexes.

When you start scanning pages and saving them on a computer, the software “names” each page or batch of pages (called a document) by date, time and job number, or with a number that “increments,” going one higher than the number given to the previous page scanned that day or stored in that folder / directory. By itself, that string of information is not particularly helpful for retrieving the image of the page or pages you need. So, we create two simple computer indexes: document naming conventions, and logical folder / directory names.

If you are looking for Smith, Inc. invoices, you would go to the Smith, Inc. folder in your computer filing system. If there are hundreds of pages filed in that folder, you would look for those you named “Invoice,” together with a date and invoice number you may have added. Those are the pieces of information you would know when you went looking, aren’t they?
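To make those two indexes concrete, here is a minimal sketch in Python of one possible naming convention. The pattern (document type, then date, then invoice number), the helper names, and the sample date and invoice number are all made up for illustration; your scanning software may well use a different scheme.

```python
import re
from datetime import date
from pathlib import PurePosixPath


def folder_name(client: str) -> str:
    """Turn a client name such as 'Smith, Inc.' into a folder-safe name."""
    # Replace every run of characters other than letters and digits
    # with a single underscore, then trim underscores from the ends.
    return re.sub(r"[^A-Za-z0-9]+", "_", client).strip("_")


def document_path(client: str, doc_type: str, doc_date: date, number: str) -> PurePosixPath:
    """Build 'folder/file' from the things you would know when you go looking."""
    # Hypothetical convention: Type_YYYY-MM-DD_Number.pdf
    file_name = f"{doc_type}_{doc_date.isoformat()}_{number}.pdf"
    return PurePosixPath(folder_name(client)) / file_name


# Invoice number 10482 and the 14th of February are invented sample values.
path = document_path("Smith, Inc.", "Invoice", date(2012, 2, 14), "10482")
print(path)  # Smith_Inc/Invoice_2012-02-14_10482.pdf
```

Putting the date in year-month-day order is a deliberate choice in this sketch: an ordinary alphabetical sort of the file names then also sorts them by date.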

Think of your paper-filled file cabinets that you hope to eliminate with scanned documents (“electronic files”). If you were going to find a Smith, Inc. invoice from February of 2012, you would know which cabinet to look in, which drawer to roll out and then, by those little plastic tabs, which Pendaflex to look inside of. That’s because everything was labeled or named as you filed it for just this purpose. Those bits of knowledge and labeling are all “indices.”

But part of the value of electronic files is saving time. When you go looking in your computer for only the February 2012 invoices to display on screen, you want it to happen quickly. You don’t want to, figuratively, walk to the file cabinet, check the labeling of the cabinet and drawers, pull one open, paw through the hanging folders and finally open the manila folder that has invoices in it. No, what you really want (and can obtain from electronic filing) is for Smith, Inc.’s “Feb, 2012, Inv” PAGES to appear on screen, or in a simple list on screen from which you can choose, when you enter the quoted string above or click on a couple of identifiers in a drop-down list.

Or, even better, maybe you could look at only the ONE invoice you care about at the moment, by typing in the invoice number, which would be unique to that page among the thousands of scanned pages. That would be cool.
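Both kinds of lookup can be sketched in a few lines of Python. This is only an illustration under assumptions: the record layout, the function name, the second client, and the sample invoice numbers and file names are all hypothetical, and real document-management software will store its index differently.

```python
from typing import NamedTuple


class ScannedPage(NamedTuple):
    client: str
    doc_type: str
    year: int
    month: int
    invoice_number: str
    file_name: str  # the unhelpful name the scanner assigned


# A tiny made-up index; a real one would hold thousands of entries.
PAGES = [
    ScannedPage("Smith, Inc.", "Invoice", 2012, 2, "10482", "scan_0001.pdf"),
    ScannedPage("Smith, Inc.", "Invoice", 2012, 2, "10497", "scan_0002.pdf"),
    ScannedPage("Smith, Inc.", "Invoice", 2012, 3, "10533", "scan_0003.pdf"),
    ScannedPage("Jones LLC", "Invoice", 2012, 2, "88120", "scan_0004.pdf"),
]


def find_pages(pages, client, doc_type, year, month):
    """Return every page matching the identifiers picked from the drop-down lists."""
    wanted = (client, doc_type, year, month)
    return [p for p in pages if (p.client, p.doc_type, p.year, p.month) == wanted]


# Because each invoice number is unique, it can serve as a direct lookup key.
BY_INVOICE_NUMBER = {p.invoice_number: p for p in PAGES}

february = find_pages(PAGES, "Smith, Inc.", "Invoice", 2012, 2)
print([p.file_name for p in february])  # ['scan_0001.pdf', 'scan_0002.pdf']

one_invoice = BY_INVOICE_NUMBER.get("10533")
print(one_invoice.file_name)  # scan_0003.pdf
```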

And, it’s possible without your naming every single scanned page separately. It’s all thanks to “OCR.”

Read more about OCR.