
Searching the content of xml files

user01
Champ in-the-making
Hi,

I have been reading the documentation and forums for some time, but I still do not have a clear idea about this.

I would like to search the content of some XML files (I have already read similar questions about this). How can I achieve this?

I would like to avoid SOLR because I lack the time to set up the corresponding environment.

As far as I know, I have to index some fields and set them as metadata. To achieve this, I must implement a custom metadata extractor.

Is this a good idea? Do I have to take the Alfresco FTS functionality into account?

Thank you!
2 REPLIES

mrogers
Star Contributor
It depends, as always, upon your requirements. Alfresco is not an XML database, although at one point we were considering the merits of adding an XML database.

If you have a simple requirement, then you can extract the content of a few XML fields as Alfresco properties. There are various XML metadata extractors, or you can write your own. Or you can use a text search of the XML document. It may not be the best option, but it could work for some requirements. Or you could somehow add an aspect with your metadata.

If you need to model some of XML's complex structure, then you can, but I suggest you think carefully about how much is needed.
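
To make the "extract a few XML fields" option more concrete, here is a minimal standalone sketch in Python. It does not use any Alfresco API; it only shows the idea of mapping a handful of XML paths to named values that could then be stored as properties. The property names, the paths in `FIELD_PATHS`, and the sample document are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping: property name -> ElementTree path (a limited XPath subset).
FIELD_PATHS = {
    "title": "./header/title",
    "author": "./header/author",
}

def extract_fields(xml_text, field_paths=FIELD_PATHS):
    """Return a dict of the configured fields that are present in the XML."""
    root = ET.fromstring(xml_text)
    result = {}
    for name, path in field_paths.items():
        # findtext returns None when no element matches the path.
        value = root.findtext(path)
        if value is not None:
            result[name] = value.strip()
    return result

# Hypothetical sample document.
SAMPLE = (
    "<doc><header><title> Invoice 42 </title>"
    "<author>Jane</author></header><body>details</body></doc>"
)

if __name__ == "__main__":
    print(extract_fields(SAMPLE))
```

Only the fields listed in the mapping are extracted, so the amount of stored metadata stays under your control. In Alfresco itself the equivalent logic would live in a metadata extractor rather than in a script like this.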

    

user01
Champ in-the-making
Thank you so much for your answer!

I have seen the metadata extraction solution before, but it is not feasible for me. I want to avoid storing more metadata than necessary.

The full-text search works on plain-text documents. This means that I have to apply a content transformer to turn the XML into plain text that can be searched.
If I index the content of that XML file, how can I see what has been indexed? I have tried to view the content using Luke, but I cannot.
Are there more elegant ways to achieve this?
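
For reference, the XML-to-plain-text step described above can be sketched outside Alfresco in a few lines of Python. This is only an illustration of what such a transformation might produce, not Alfresco's content transformer API, and the function name `xml_to_plain_text` is hypothetical. Printing its output shows the text that would be handed to the indexer under this assumption; it is not a view of the actual index.

```python
import xml.etree.ElementTree as ET

def xml_to_plain_text(xml_text):
    """Drop all markup and return the element text joined by single spaces."""
    root = ET.fromstring(xml_text)
    # itertext() walks the tree in document order and yields the inner text.
    chunks = (chunk.strip() for chunk in root.itertext())
    # Skip whitespace-only chunks that come from indentation between tags.
    return " ".join(chunk for chunk in chunks if chunk)

if __name__ == "__main__":
    sample = "<doc><title>Hello</title><p>big <b>wide</b> world</p></doc>"
    print(xml_to_plain_text(sample))
```

Note that attribute values are not included in the output; if they need to be searchable, the function would have to collect them explicitly.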


I have read this post, among others: http://forums.alfresco.com/forum/developer-discussions/technical-architecture-discussion/building-xm...