Linux OCR

even4
Champ in-the-making
I have installed Alfresco 3.1 and have it running smoothly on Debian Lenny using Apache/Tomcat.

I'm now looking at OCR and have installed Ocropus and Tesseract. Both run perfectly from the command line. I have tried to set up an OCR transformation XML file in Alfresco, without any luck.

Has anyone completed a successful integration of Ocropus/Tesseract with Alfresco? Can you post your XML and any other specific modifications you needed to make?

I understand Tesseract can't convert PDF, but for now TIFF to text is OK. I'm hoping Tesseract comes along in leaps and bounds now that it is a Google-funded project, as there seems to be a big discrepancy between the quality of the Windows and Linux OCR options.

Any help is appreciated, and I will share whatever I work out about getting OCR working well with Alfresco on Linux.
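
For the TIFF-to-text step described above, here is a minimal sketch of a wrapper around the Tesseract command line. It only covers the "image in, text file out" part. It relies on the standard `tesseract <image> <outputbase> -l <lang>` invocation, where Tesseract appends `.txt` to the output base by itself. The script name `ocr_wrapper.py` and the function names are hypothetical, and wiring the script into Alfresco as a transformer is not shown here.

```python
import shutil
import subprocess
import sys
from pathlib import Path


def build_tesseract_command(source, output_base, lang="eng"):
    """Return the argv list for: tesseract <image> <output_base> -l <lang>."""
    return ["tesseract", str(source), str(output_base), "-l", lang]


def ocr_to_text(source, target, lang="eng"):
    """OCR `source` (e.g. a TIFF) and place the recognized text at `target`."""
    target = Path(target)
    # Tesseract appends ".txt" to the output base on its own, so use a
    # temporary base name next to the target and move the result afterwards.
    output_base = target.with_name(target.name + ".ocr")
    subprocess.run(build_tesseract_command(source, output_base, lang), check=True)
    shutil.move(str(output_base) + ".txt", str(target))


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical script name): python ocr_wrapper.py input.tif output.txt
    ocr_to_text(sys.argv[1], sys.argv[2])
```

A wrapper like this is useful because a caller that expects an explicit source file and an explicit target file cannot use Tesseract's output-base convention directly.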
4 REPLIES

even4
Champ in-the-making
So no one at all has tried to implement OCR on Linux with Alfresco?

wabson
Star Contributor
Hi,

Please could you post a copy of all the configuration that you have implemented so far in Alfresco?

You might also be interested in the information on the wiki on Tiger OCR, which was contributed by one of our partners in France. http://wiki.alfresco.com/wiki/Tiger_OCR_integration

Thanks,
Will.

alexander
Champ in-the-making

wmay
Champ in-the-making
Hi,

We have implemented an OCR integration for Alfresco that can be used as a transformer or called from JavaScript and Java. It runs on a separate OCR server and supports Abbyy and Google OCR. For more information, see https://forums.alfresco.com/en/viewtopic.php?f=33&t=44739