Question: We'd like to add OCR (and full-text) to our OnBase solution. Can existing documents be OCR'd? What is the best way to OCR a large number of already scanned image documents (millions of documents) to enable them for full-text search?
Answer: For a large number of documents, the best way is to set the status of each batch back to "Awaiting Full Page OCR". This is typically accomplished with the assistance of Technical Support. The queries look something like this, where xxx is the batch number:
Scan: UPDATE hsi.archivedqueue SET status = 14 WHERE batchnum = xxx
DIP: UPDATE hsi.parsedqueue SET status = 14 WHERE batchnum = xxx
Once this has been completed, all documents that are part of those batches can be OCR'd (either manually or on a scheduled basis). After OCR, the batches are committed again and the documents are added to Autonomy IDOL for full-text search.
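When many batches are involved, the status reset could be applied in one transaction instead of one statement at a time. The following is only a sketch under stated assumptions: it runs against an in-memory SQLite stand-in, not a real OnBase database, and the helper name reset_batches_for_ocr, the sample batch numbers, and the starting status value 3 are all hypothetical. Only the table names, the batchnum and status columns, and the value 14 come from the queries above. Any change to a live system should still be coordinated with Technical Support.

```python
import sqlite3

# Status value for "Awaiting Full Page OCR", taken from the queries above.
AWAITING_FULL_PAGE_OCR = 14

# Only the two queue tables named in the answer are accepted.
ALLOWED_TABLES = ("hsi.archivedqueue", "hsi.parsedqueue")


def reset_batches_for_ocr(conn, table, batch_numbers):
    """Hypothetical helper: set status = 14 for each listed batch.

    All updates run in a single transaction, so either every batch is
    reset or none is. Returns the number of rows changed.
    """
    if table not in ALLOWED_TABLES:
        raise ValueError(f"unexpected table: {table}")
    with conn:  # commits on success, rolls back on any exception
        cur = conn.executemany(
            f"UPDATE {table} SET status = ? WHERE batchnum = ?",
            [(AWAITING_FULL_PAGE_OCR, b) for b in batch_numbers],
        )
    return cur.rowcount


# Stand-in database: attach an in-memory schema named "hsi" so the
# table name matches the one used in the answer.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS hsi")
conn.execute(
    "CREATE TABLE hsi.archivedqueue (batchnum INTEGER PRIMARY KEY, status INTEGER)"
)
# Sample rows; the batch numbers and the starting status 3 are placeholders.
conn.executemany(
    "INSERT INTO hsi.archivedqueue VALUES (?, ?)",
    [(101, 3), (102, 3), (103, 3)],
)
conn.commit()

changed = reset_batches_for_ocr(conn, "hsi.archivedqueue", [101, 102])
print(changed, "batches reset")
```

Checking the returned row count against the number of batch numbers supplied is a cheap way to catch a mistyped batch number before OCR is scheduled.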
For a small number of documents, a user can right-click the documents and select Perform Document OCR, or process the documents through Workflow to Queue Document for OCR.