Document Image Scanning in Organizational Content Management
Document imaging is part of an information technology system. It involves scanning paper documents and converting them to digital images, which are then stored on CD, DVD, or other storage media. It is a form of organizational content management built on document image scanning.
Since the 1990s, the term ‘document imaging’ has been used to denote software-based computer systems that capture, store, and reprint images. Document imaging can involve microfilm, facsimile devices, photocopiers, multifunction printers, document scanners, and computer output microfilm (COM). In essence, it brings together two categories of documents, paper and online, in a way that both helps preserve documents and improves overall office efficiency. The objective of document imaging technology is to move toward a ‘paperless office’. Even in offices that use electronic documents, some documents still exist only on paper.
Document imaging software converts these paper documents into digital documents, after which the originals can be shredded and recycled. The origin of the concept can be traced to the late 1980s, when a new document management technology, known as Electronic Document Management Technology, was developed. The principle underlying this technology is the need to manage and safeguard the ever-increasing volume of electronic documents, such as spreadsheets, emails, and PDF documents, generated in a business enterprise.
The document imaging software first generates an image of the scanned document in the desired format, such as TIFF (Tagged Image File Format), JPEG (Joint Photographic Experts Group), PNG (Portable Network Graphics), or BMP (bitmap image file). The text in the image is human-readable but not machine-readable: the computer recognizes it only as a picture, not as words. Optical Character Recognition (OCR) software recognizes the text characters in the image and converts them into a machine-readable text document. Once OCR is complete, information about the document, known as metadata, can be added. Typical metadata includes the date of creation, the author, and one or more identifiers.
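The metadata step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular imaging product works: the DocumentRecord class, its field names, and the build_record helper are hypothetical, and the OCR text is passed in as a plain string rather than produced by a real OCR engine. The format check relies on the well-known leading bytes of PNG, JPEG, TIFF, and BMP files.

```python
import json
from dataclasses import dataclass, field, asdict
from datetime import date

# Leading bytes ("magic numbers") of the image formats named above.
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "PNG",
    b"\xff\xd8\xff": "JPEG",
    b"II*\x00": "TIFF",  # little-endian TIFF
    b"MM\x00*": "TIFF",  # big-endian TIFF
    b"BM": "BMP",
}


def detect_format(data: bytes) -> str:
    """Return the image format implied by the first bytes, or 'unknown'."""
    for signature, name in SIGNATURES.items():
        if data.startswith(signature):
            return name
    return "unknown"


@dataclass
class DocumentRecord:
    """Hypothetical record: OCR output plus the metadata attached afterwards."""
    image_format: str
    ocr_text: str
    author: str
    created: str  # ISO 8601 date
    identifiers: list = field(default_factory=list)


def build_record(image_bytes, ocr_text, author, created, identifiers):
    """Combine a scanned image, its OCR text, and descriptive metadata."""
    return DocumentRecord(
        image_format=detect_format(image_bytes),
        ocr_text=ocr_text,
        author=author,
        created=created.isoformat(),
        identifiers=list(identifiers),
    )


# Example with made-up values: the bytes are just a PNG signature plus padding.
record = build_record(
    image_bytes=b"\x89PNG\r\n\x1a\n" + b"\x00" * 8,
    ocr_text="Invoice 1042 ...",
    author="A. Clerk",
    created=date(2020, 1, 15),
    identifiers=["INV-1042"],
)
print(json.dumps(asdict(record), indent=2))
```

Storing the record as JSON next to the image is one simple design choice; a document management system would more likely keep the same fields in a database.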
Before scanning, documents can be combined with a barcode page containing user-specific metadata, and a barcode indexer will then index them immediately using the metadata in the barcode. Document imaging and document image scanning are vital constituents of electronic document management systems. Large organizations with a constant flow of paper documents at several locations can convert all of these documents to electronic form by scanning and indexing them at the same time. Document imaging eliminates the need to store paper documents, streamlines workflow, increases business efficiency, and contributes to significant cost reductions. The electronic documents can be backed up systematically, and the backups can be stored in separate locations, such as online. This helps in recovering the documents after disasters such as fire or flood.
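The barcode-based indexing mentioned above can be illustrated with a small Python sketch. It assumes the barcode has already been decoded into a text payload of the form "key=value;key=value". That payload format, the BarcodeIndex class, and the document IDs are all hypothetical; real barcode indexers define their own encodings.

```python
from collections import defaultdict


def parse_barcode_payload(payload: str) -> dict:
    """Split a decoded payload such as 'dept=HR;type=invoice' into a dict."""
    metadata = {}
    for pair in payload.split(";"):
        if not pair.strip():
            continue  # tolerate a trailing semicolon
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"malformed field: {pair!r}")
        metadata[key.strip()] = value.strip()
    return metadata


class BarcodeIndex:
    """Maps each (key, value) metadata pair to the documents that carry it."""

    def __init__(self):
        self._index = defaultdict(set)

    def add(self, document_id: str, payload: str) -> None:
        """Index one scanned document under every field in its barcode."""
        for key, value in parse_barcode_payload(payload).items():
            self._index[(key, value)].add(document_id)

    def find(self, key: str, value: str) -> list:
        """Return the sorted IDs of documents whose barcode had key=value."""
        return sorted(self._index.get((key, value), set()))


# Example with made-up scans and payloads.
index = BarcodeIndex()
index.add("scan-001", "dept=HR;type=contract")
index.add("scan-002", "dept=HR;type=invoice;")
index.add("scan-003", "dept=Finance;type=invoice")

print(index.find("dept", "HR"))
print(index.find("type", "invoice"))
```

Because every field is indexed as the document is scanned, a later lookup by department or document type needs no manual filing step.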


