Searching the content of xml files
07-01-2013 12:10 PM
Hi,
I have been reading the documentation and forums for some time, but I still do not have a clear idea about this.
I would like to search the content of some XML files (I have already read similar questions about it). How can I achieve this?
I would like to avoid Solr because I do not have the time to set up the corresponding environment.
As far as I know, I have to index some fields and set them as metadata, and to do that I must implement a custom metadata extractor.
Is this a good idea? Do I have to take Alfresco FTS functionality into account?
Thank you!
Labels: Archive
2 REPLIES
07-03-2013 05:11 AM
It depends, as always, upon your requirement. Alfresco is not an XML database, although at one point we were considering the merits of adding an XML database.
If you have a simple requirement, you can extract the content of a few XML fields as Alfresco properties. There are various XML metadata extractors, or you can write your own. Alternatively, you can rely on a text search of the XML document; that may not be the best option, but it could work for some requirements. Or you could add an aspect that holds your metadata.
If you need to model some of XML's complex structure then you can, but I suggest you think carefully about how much is needed.
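To make the "extract a few XML fields as properties" option concrete, here is a minimal sketch in plain Java. It does not use any Alfresco API: the class name `XmlFieldExtractor`, the property names, and the XPath expressions are all hypothetical, and a real extractor would have to be wired into the repository's metadata extraction mechanism. It only shows the core step of mapping XPath expressions to property values.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Alfresco: maps property names to XPath results.
final class XmlFieldExtractor {
    private XmlFieldExtractor() {}

    /**
     * Parses the XML once and evaluates one XPath expression per property.
     * Returns property name -> trimmed string value; empty results are skipped.
     */
    static Map<String, String> extract(String xml, Map<String, String> xpathByProperty)
            throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Limit entity expansion and similar parser features for untrusted input.
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        Map<String, String> values = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : xpathByProperty.entrySet()) {
            // evaluate(String, Object) returns the string value of the first match,
            // or an empty string when nothing matches.
            String value = xpath.evaluate(entry.getValue(), doc).trim();
            if (!value.isEmpty()) {
                values.put(entry.getKey(), value);
            }
        }
        return values;
    }
}
```

For example, mapping a made-up property such as `my:invoiceNumber` to `/invoice/number` yields only the values you choose to expose, which keeps the amount of stored metadata under your control.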
07-03-2013 01:02 PM
Thank you so much for your answer!
I have seen the metadata extraction solution before, but it is not feasible for me: I want to avoid storing more metadata than necessary.
The full text search works on plain-text documents, so I would have to apply a content transformer that turns the XML into plain text that can be searched.
If I index the content of that XML file, how can I see what has been indexed? I have tried to inspect the index using Luke, but I could not.
Are there more elegant ways to achieve this?
I have read this post, among others: http://forums.alfresco.com/forum/developer-discussions/technical-architecture-discussion/building-xm... .
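As an illustration of the XML-to-plain-text step described above, here is a rough sketch using only the standard Java StAX API. It is not an Alfresco content transformer: the class name `XmlToPlainText` is hypothetical, and registering such logic as a transformer in the repository is a separate step not shown here. It only demonstrates collecting the text nodes of a document so that the result could be handed to a full text indexer.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of Alfresco: flattens XML to searchable text.
final class XmlToPlainText {
    private XmlToPlainText() {}

    /** Returns the text content of all elements, separated by single spaces. */
    static String toPlainText(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Merge adjacent character data so entities like &amp; do not split a run of text.
        factory.setProperty(XMLInputFactory.IS_COALESCING, true);
        // Do not process DTDs or external entities in untrusted documents.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);

        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        StringBuilder text = new StringBuilder();
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.CHARACTERS
                        || event == XMLStreamConstants.CDATA) {
                    String chunk = reader.getText().trim();
                    if (!chunk.isEmpty()) {
                        if (text.length() > 0) {
                            text.append(' ');
                        }
                        text.append(chunk);
                    }
                }
            }
        } finally {
            reader.close();
        }
        return text.toString();
    }
}
```

Note that this sketch ignores attribute values and element names, so anything you need to search on that lives in attributes would have to be appended explicitly.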
