I would like to store scanned images in Alfresco, including some metadata (either recognized by OCR or entered through manual indexing). We usually do this by creating a PDF that contains the metadata in custom fields (e.g. "Customer" => "ACME Inc.").
Problem: Alfresco doesn't seem to index the PDF custom fields. I thought Lucene was to blame, but it seems (?) that the PDF is not handled by Lucene itself but by another converter such as XPDF.
Hence my questions:
1- How could Alfresco also index the metadata? By using another converter?
2- Does anybody see another workaround?
I know that I could store the data and metadata in two separate files (TIFF + XML), but it would really be much better to have them in a single file.
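For reference, the two-file fallback I have in mind would look roughly like the sketch below. It is only an illustration: the element names (`document`, `field`), the helper names and the file name `scan-0001.tif` are made up for this example, and nothing here is specific to Alfresco.

```python
import xml.etree.ElementTree as ET


def build_sidecar(image_name, fields):
    """Return an XML string describing one scanned image and its index fields.

    The layout (a <document> root with <field name="..."> children) is a
    hypothetical one chosen for this sketch.
    """
    root = ET.Element("document", {"image": image_name})
    for name, value in fields.items():
        field = ET.SubElement(root, "field", {"name": name})
        field.text = value
    return ET.tostring(root, encoding="unicode")


def read_sidecar(xml_text):
    """Parse the sidecar XML back into a {field name: value} dictionary."""
    root = ET.fromstring(xml_text)
    return {field.get("name"): field.text for field in root.findall("field")}


# Metadata for one scanned page, as it would come from OCR or manual indexing.
xml_text = build_sidecar("scan-0001.tif", {"Customer": "ACME Inc."})
metadata = read_sidecar(xml_text)
```

The drawback is the one mentioned above: the image and its metadata live in two files that have to be kept together.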
Pascal