Hi,
We have built a small application on Alfresco for abstracting legal documents. To capture the abstracted data (on a page that contains text boxes, text areas, etc.), we have written our own content model and services for the business logic. The model and the services are incorporated into the Alfresco core. No workflow has been created; instead, the logic is built around changing the properties of documents. All properties are defined in the content model.
Now we want to integrate this with an OCR tool. We want Alfresco to pick up the OCRed documents and batch them based on certain input criteria (similar to a query), and also to auto-populate some of the content from the OCRed documents (which are unstructured) into the pages created using the content model.
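To make the requirement concrete, here is a rough, language-neutral sketch in Python of what we mean by batching and auto-population. It does not use any Alfresco API. The field names (`agreement_date`, `party_name`), the `doc_type` criterion, the regular expressions, and the sample documents are all hypothetical and only illustrate the idea; in our application the extracted values would be written to properties of our content model.

```python
import re
from collections import defaultdict

# Hypothetical field patterns; real legal documents would need
# patterns tuned per document type.
FIELD_PATTERNS = {
    "agreement_date": re.compile(r"dated\s+(\d{1,2}\s+\w+\s+\d{4})", re.IGNORECASE),
    "party_name": re.compile(r"between\s+(.+?)\s+and\b", re.IGNORECASE),
}


def extract_fields(ocr_text):
    """Return the first match for each known field, or None when not found."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(ocr_text)
        fields[name] = match.group(1).strip() if match else None
    return fields


def batch_documents(documents, key):
    """Group document dicts by the value of one metadata key."""
    batches = defaultdict(list)
    for doc in documents:
        batches[doc.get(key)].append(doc)
    return dict(batches)


# Made-up sample data standing in for OCR output.
documents = [
    {"name": "lease1.pdf", "doc_type": "lease",
     "text": "This Lease dated 12 March 2010 between Acme Ltd and John Smith."},
    {"name": "lease2.tiff", "doc_type": "lease",
     "text": "Illegible scan"},
    {"name": "nda1.pdf", "doc_type": "nda",
     "text": "Agreement dated 1 June 2011 between Foo Inc and Bar LLC."},
]

batches = batch_documents(documents, "doc_type")
for doc_type, docs in batches.items():
    for doc in docs:
        print(doc_type, doc["name"], extract_fields(doc["text"]))
```

Fields that cannot be found come back as None, which is the case we are most concerned about with poor-quality scans.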
I would like to understand whether this (batching and auto-population) is possible in Alfresco. If someone has achieved this, please share your experience of how accurate the auto-populated data was and how successful the implementation has been, especially when reading unstructured documents (OCRed PDF, TIFF).
Thanks
Chaitanya