Hi! I'm evaluating Alfresco 5, and I haven't found any tutorial about Tesseract integration on the 5.0.x versions. How did you do the integration? Thanks in advance.
The seedim.com.au tutorial should explain how it works. Basically, you configure a transformation from each image mimetype (e.g. PNG, TIFF) to text (I assume using the Tesseract transform you have already configured). When an image is uploaded, Solr will then try to call that image-to-text transform to get the word list. The word list is automatically added to the Solr index and points to the image content, so searching will find the image based on the text in the image.
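To make the image-to-text step concrete, here is a rough sketch of what such a transform boils down to: a call to the `tesseract` command line with a source image and a target text file. This is not the Alfresco transformer configuration itself, and the function names `build_tesseract_command` and `image_to_text` are made up for illustration. It assumes the `tesseract` binary is on the PATH and that English (`eng`) language data is installed.

```python
import os
import subprocess


def build_tesseract_command(source_path, target_path, language="eng"):
    """Build the argument list for one image-to-text run (hypothetical helper).

    Tesseract appends ".txt" to the output base on its own, so a ".txt"
    extension on target_path is stripped before it is passed along.
    """
    output_base, ext = os.path.splitext(target_path)
    if ext.lower() != ".txt":
        # No ".txt" to strip: use the target name unchanged as the output base.
        output_base = target_path
    return ["tesseract", source_path, output_base, "-l", language]


def image_to_text(source_path, target_path, language="eng"):
    """Run Tesseract and return the recognised text.

    Assumes the tesseract binary is installed and on the PATH.
    """
    command = build_tesseract_command(source_path, target_path, language)
    subprocess.run(command, check=True)
    # Tesseract writes its result to "<output base>.txt".
    produced_file = command[2] + ".txt"
    with open(produced_file, encoding="utf-8") as handle:
        return handle.read()
```

The detail that usually trips people up is the output name: if the transform passes a target such as `scan.txt` straight through, Tesseract writes `scan.txt.txt`, and the caller never finds the text it expects. Stripping the extension first, as above, is one way around that.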