Hyland Connect

even4 · ‎05-01-2009

I have installed Alfresco 3.1 and have it running smoothly on Debian Lenny using Apache/Tomcat.

I'm now looking at OCR and have installed Ocropus and Tesseract. Both of these are running perfectly, but I have not been able to get an OCR transformation XML file to work.

Has anyone completed a successful integration of Ocropus/Tesseract with Alfresco? Could you post your XML and any other specific modifications you needed to make?

I understand Tesseract can't convert PDF, but for now TIFF to text is OK. I'm hoping Tesseract comes along in leaps and bounds now that it is a Google-funded project, as there seems to be a big gap in quality between the Windows and Linux OCR options.

Any help is appreciated, and I will share whatever I can work out about getting OCR working well with Alfresco on Linux.
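
For anyone following along, here is a minimal sketch of the TIFF-to-text step described above, i.e. the kind of command an Alfresco transformer would end up wrapping. It is not a working Alfresco integration. It assumes the `tesseract` binary is on the PATH and relies on Tesseract appending `.txt` to the output base name it is given; the function names `build_tesseract_command` and `tiff_to_text` are hypothetical and only there for illustration.

```python
import subprocess
from pathlib import Path


def build_tesseract_command(source, target):
    """Return the argv list for converting a TIFF image to a text file.

    Tesseract takes an output *base* name and appends ".txt" itself,
    so a target of "out.txt" has to be passed to it as "out".
    """
    target = Path(target)
    if target.suffix != ".txt":
        raise ValueError("target must end in .txt")
    output_base = target.with_suffix("")  # strip the trailing ".txt"
    return ["tesseract", str(source), str(output_base)]


def tiff_to_text(source, target):
    """Run tesseract and return the recognised text.

    Assumption: the tesseract binary is installed and on the PATH.
    """
    subprocess.run(build_tesseract_command(source, target), check=True)
    return Path(target).read_text(encoding="utf-8")
```

The base-name handling is the part that tends to matter for a transformer: Alfresco hands over a full target file name, while Tesseract wants the name without the extension, so a small wrapper like this (or an equivalent shell script) may be needed between the two.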

even4 · ‎05-05-2009

So no one at all has tried to implement OCR on Linux with Alfresco?

wabson · ‎05-08-2009

Hi,

Please could you post a copy of all the configuration that you have implemented so far in Alfresco?

You might also be interested in the information on the wiki on Tiger OCR, which was contributed by one of our partners in France. http://wiki.alfresco.com/wiki/Tiger_OCR_integration

Thanks,
Will.

alexander · ‎05-10-2009

Have a look here; maybe it will help:

http://forums.alfresco.com/en/viewtopic.php?f=14&t=4917&start=0&st=0&sk=t&sd=a&hilit=ocropus

wmay · ‎08-01-2012

Hi,

We have implemented an OCR server integrated with Alfresco, which can be used as a transformer or via JavaScript and Java. It runs on a separate OCR server and supports Abbyy and Google OCR. For more information, see https://forums.alfresco.com/en/viewtopic.php?f=33&t=44739