Indexing of PDF files and EMail messages

marian
Champ in-the-making
Hi,

I have installed Alfresco Community 2.1 and have been playing with it for
a few days.

Apparently, in my installation, indexing the bodies of PDF files and of email messages
(exported from MS Outlook as .msg files) does not work.

I can upload a msg file and the author is filled from the sender of the
message and description is filled from the subject of the message.
The message can then be found using searches for words from these
fields, but not from the body.

The mail shows up in the 'nitf' special search, so apparently the
transformation failed. There is no indication of such a failure in the log (on
the console, as I started Alfresco with the batch file).

Is indexing of the mail body possible at all? Do I need to configure
something to make it work? What can I configure to debug the failure
reported?

The same is true for PDF files: I can upload PDFs, and title and author are
prepopulated, but the document is not returned for searches on its content. It is
also not returned by any of the nitf, nicm, or nint searches.

My document is very simple and small, contains only plain text, and was
created from MS Word through a Ghostscript-based PDF printer. The
Word document itself is indexed correctly.

Any ideas on how to debug this issue or pointers to further reading on the
system are very much appreciated.

Ciao, MM
12 REPLIES

fthamura
Champ in-the-making
I just want to know: does PDF indexing use PDFBox?

F

kevinr
Star Contributor
Yes.

fthamura
Champ in-the-making
Where is the setting that lets us test whether a PDF is indexed?

I want to be sure that all the content inside a PDF (one that is not password-protected) can be searched.

I have tried several times but cannot get it to work.

I think we have to set it up manually.

Can anyone help?