<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Searching the content of xml files in Alfresco Archive</title>
    <link>https://connect.hyland.com/t5/alfresco-archive/searching-the-content-of-xml-files/m-p/294045#M247175</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;It depends, as always, on your requirements. Alfresco is not an XML database, although at one point we were considering the merits of adding one.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;If you have a simple requirement, you can extract the content of a few XML fields as Alfresco properties. There are various XML metadata extractors, or you can write your own. Alternatively, you can use a text search of the XML document; this may not be the best option, but it could work for some requirements. You could also add an aspect that holds your metadata.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;If you need to model some of XML's complex structure, you can, but I suggest you think carefully about how much is needed.&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Wed, 03 Jul 2013 09:11:00 GMT</pubDate>
    <dc:creator>mrogers</dc:creator>
    <dc:date>2013-07-03T09:11:00Z</dc:date>
    <item>
      <title>Searching the content of xml files</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/searching-the-content-of-xml-files/m-p/294044#M247174</link>
      <description>Hi, I have been reading the documentation and forums for some time, but I still do not have a clear idea about this. I would like to search the content of some XML files (I have already read similar questions about it). How can I achieve this? I would like to avoid SOLR due to lack of time to set up the corresponding en</description>
      <pubDate>Mon, 01 Jul 2013 16:10:05 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/searching-the-content-of-xml-files/m-p/294044#M247174</guid>
      <dc:creator>user01</dc:creator>
      <dc:date>2013-07-01T16:10:05Z</dc:date>
    </item>
    <item>
      <title>Re: Searching the content of xml files</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/searching-the-content-of-xml-files/m-p/294045#M247175</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;It depends, as always, on your requirements. Alfresco is not an XML database, although at one point we were considering the merits of adding one.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;If you have a simple requirement, you can extract the content of a few XML fields as Alfresco properties. There are various XML metadata extractors, or you can write your own. Alternatively, you can use a text search of the XML document; this may not be the best option, but it could work for some requirements. You could also add an aspect that holds your metadata.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;If you need to model some of XML's complex structure, you can, but I suggest you think carefully about how much is needed.&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 03 Jul 2013 09:11:00 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/searching-the-content-of-xml-files/m-p/294045#M247175</guid>
      <dc:creator>mrogers</dc:creator>
      <dc:date>2013-07-03T09:11:00Z</dc:date>
    </item>
    <item>
      <title>Re: Searching the content of xml files</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/searching-the-content-of-xml-files/m-p/294046#M247176</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Thank you so much for your answer!&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;I have seen the metadata extraction solution before, but it is not feasible for me. I want to avoid storing more metadata than necessary.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Full-text search works on plain-text documents, so to achieve this I would have to apply a content transformer that converts the XML into plain text that can be searched.&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;If I index the content of that XML file, how can I see what has been indexed? I have tried to view the content using Luke, but I could not.&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;Are there more elegant ways to achieve this?&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;I have read this post, among others: &lt;/SPAN&gt;&lt;A href="http://forums.alfresco.com/forum/developer-discussions/technical-architecture-discussion/building-xml-repository-10142005-1944" rel="nofollow noopener noreferrer"&gt;http://forums.alfresco.com/forum/developer-discussions/technical-architecture-discussion/building-xml-repository-10142005-1944&lt;/A&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 03 Jul 2013 17:02:00 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/searching-the-content-of-xml-files/m-p/294046#M247176</guid>
      <dc:creator>user01</dc:creator>
      <dc:date>2013-07-03T17:02:00Z</dc:date>
    </item>
  </channel>
</rss>

