Hi,
I have a question about how the full-text search works.
As far as I have read, when a file is uploaded to the repository, Lucene needs to read the document as plain text (text/plain) in order to index it. So, regardless of the document's mimetype, a content transformation is applied first.
If so, how can I see what Lucene actually indexes? How can I see the output of the content transformation? (I have tried Luke, but I cannot see the transformed content there.)
Does this transformation affect the document content that is stored on the filesystem?
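To make the question concrete, here is a minimal sketch of how I currently picture the pipeline. It is only my assumption, not the repository's actual code: the function names (transform_to_plain_text, tokenize) are hypothetical, and the HTML stripping is a toy stand-in for a real content transformer. The point is that the transformer produces a separate plain-text copy, and only that copy is tokenized for the index.

```python
import html
import re

def transform_to_plain_text(content, mimetype):
    """Hypothetical stand-in for a content transformer.

    Returns a new plain-text string; the input bytes are not modified.
    """
    text = content.decode("utf-8")
    if mimetype == "text/html":
        text = re.sub(r"<[^>]+>", " ", text)  # replace each tag with a space
        text = html.unescape(text)            # decode entities such as &amp;
    return " ".join(text.split())             # collapse runs of whitespace

def tokenize(text):
    """Toy analyzer: lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

# The "stored" document, as it would sit on the filesystem (assumed example).
stored = b"<html><body><h1>Hello</h1><p>Full text &amp; search</p></body></html>"

plain = transform_to_plain_text(stored, "text/html")  # what I would like to inspect
tokens = tokenize(plain)                              # what I assume ends up in the index

print(plain)
print(tokens)
```

If this picture is right, the stored file stays untouched and only the tokens reach the index, which might also explain why the transformed text itself is not visible in Luke.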
Thank you in advance for your answer!