Thanks for replying.
I already had Luke before posting, but I'm not very familiar with it. Can you tell me how Luke can help me debug this? I can see the filename of the indexed document in the alf_data directory, but not its content.
I verified that the transformer executes without error:
2013-03-04 10:32:48,448 DEBUG [util.exec.RuntimeExec] [http-8080-2] Execution result:
os: Linux
command: pdftotext -enc UTF-8 /opt/apache-tomcat-6.0.36/temp/Alfresco/RuntimeExecutableContentTransformerWorker_source_922569492484223827.pdf
succeeded: true
exit code: 0
out:
err:
I also opened the result of the transformation in tomcat/temp/Alfresco. There is a txt file named Failover transformer intermediate tikaauto content transformer. I opened the file and it is empty.
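To rule out Alfresco itself, the same extraction can be reproduced by hand. Below is a minimal sketch in Python, assuming pdftotext is on the PATH. The helper names extract_text and is_effectively_empty are my own and not part of Alfresco, and the PDF path is whatever file you pass on the command line.

```python
import subprocess
import sys


def is_effectively_empty(text: str) -> bool:
    """Return True when the extracted text has no visible characters.

    pdftotext can emit only form feeds and newlines for pages without
    a text layer, so whitespace-only output is treated as empty.
    """
    return text.strip() == ""


def extract_text(pdf_path: str) -> str:
    """Run pdftotext with the same -enc flag as in the log above.

    The trailing "-" tells pdftotext to write the text to stdout.
    """
    result = subprocess.run(
        ["pdftotext", "-enc", "UTF-8", pdf_path, "-"],
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Hypothetical usage: python check_pdf.py some_file.pdf
    text = extract_text(sys.argv[1])
    if is_effectively_empty(text):
        print("pdftotext produced no text for this PDF")
    else:
        print(f"pdftotext extracted {len(text)} characters")
```

If this also reports no text for the OCRed file, the empty result would come from pdftotext and the PDF itself rather than from the indexing step.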
I tried searching for a word from the content, but Alfresco Explorer Search doesn't return any results. Note that this problem only happens with PDFs that have been OCRed by ABBYY FineReader 11. If I use a searchable PDF file (not OCRed), the content is indexed correctly and the search returns results.
Cheers,