Optical Character Recognition (OCR)
Optical Character Recognition is the process by which a programme examines an image and converts the text it contains into readable, searchable text. The main factors determining the success of the conversion are the quality of the image and the fonts and font sizes used in it.
A low-resolution image will often be pixelated in the areas of text, which hampers recognition. Unusual fonts that are difficult to read with the naked eye (Old English, fancy script or handwriting fonts, for example) will also be difficult for OCR software to recognise.
Where the software cannot recognise such text, it will simply ignore it, leaving those areas as an image rather than readable text. Where these areas are important, they would have to be either scanned at a higher resolution or entered from the keyboard, with a resulting increase in time and cost.
OCR requires a compromise between speed and quality. An optimal scanning resolution for general archiving would be 200 dpi. At 100 dpi, scanning is faster and the resulting files are smaller, but less of the text is converted. At 300 dpi, scanning takes longer and the resulting files are larger, but a wider range of text is recognised.
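To give a feel for how file size grows with resolution, here is a rough Python sketch. It rests on assumptions that are not part of the text above: an A4 page, 8-bit greyscale, and no compression. The function name scan_estimate is purely illustrative, and real scanned files are normally compressed, so actual sizes will be much smaller.

```python
# Rough estimate of the uncompressed size of one scanned page.
# Assumptions (illustrative only): A4 paper, 8-bit greyscale, no compression.

A4_WIDTH_IN = 8.27    # A4 width in inches
A4_HEIGHT_IN = 11.69  # A4 height in inches


def scan_estimate(dpi, bits_per_pixel=8):
    """Return (width_px, height_px, size_bytes) for one uncompressed A4 page."""
    width_px = round(A4_WIDTH_IN * dpi)
    height_px = round(A4_HEIGHT_IN * dpi)
    size_bytes = width_px * height_px * bits_per_pixel // 8
    return width_px, height_px, size_bytes


for dpi in (100, 200, 300):
    w, h, size = scan_estimate(dpi)
    print(f"{dpi} dpi: {w} x {h} px, about {size / 1_000_000:.1f} MB uncompressed")
```

The pixel count grows with the square of the resolution, so under these assumptions a 200 dpi scan holds four times as much data as a 100 dpi scan, and a 300 dpi scan nine times as much.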
OCR software is not essential for all archiving. For example, if you store your existing archives in folders identified by, say, a job number, and most jobs consist of about 10 pages or fewer, OCR may not be of great benefit. With your physical system, you simply find the folder for the job number to get the information you require. The same can easily be achieved by storing each job as a separate PDF file named with its job number, which can then be found in the same way, but much more quickly.
OCR will be of greater benefit where your archives consist of many pages and you need to search them for a specific group of words. An example would be where you have hundreds of delivery notes in a single file and want to find the one for a specific product.
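As an illustration of this kind of search, the Python sketch below assumes the OCR text has already been extracted from the file as one string per page. The function find_pages and the sample delivery notes are hypothetical and stand in for whatever your archiving software provides; in practice a PDF viewer's search box does the same job.

```python
def find_pages(pages, phrase):
    """Return the 1-based page numbers whose text contains every word of the phrase."""
    words = phrase.lower().split()
    hits = []
    for number, text in enumerate(pages, start=1):
        lowered = text.lower()
        # A page matches only if all the search words appear somewhere on it.
        if all(word in lowered for word in words):
            hits.append(number)
    return hits


# Hypothetical OCR output for a three-page file of delivery notes.
delivery_notes = [
    "Delivery Note 1001  Product: Steel brackets  Qty: 40",
    "Delivery Note 1002  Product: Copper pipe 15mm  Qty: 12",
    "Delivery Note 1003  Product: Steel bolts M8  Qty: 200",
]

print(find_pages(delivery_notes, "copper pipe"))
```

The match is case-insensitive and looks for each word separately, so it only works on pages where OCR has actually converted the text; areas left as an image cannot be found this way.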

