<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: The meaning of XML extractor selector in Alfresco Archive</title>
    <link>https://connect.hyland.com/t5/alfresco-archive/the-meaning-of-xml-extractor-selector/m-p/236967#M190097</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Thanks, Derek. Your explanation of the intent of the XML selector is very good. Does the selector process also validate the document (i.e. if a DOCTYPE is present)?&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;I also understand why there is a two-step mapping from extracted values to content properties (extracted value -&amp;gt; local variable -&amp;gt; content property). It provides an opportunity to transform an extracted value before assigning it to a content property.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Thanks.&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Tue, 02 Feb 2010 16:19:25 GMT</pubDate>
    <dc:creator>kilo</dc:creator>
    <dc:date>2010-02-02T16:19:25Z</dc:date>
    <item>
      <title>The meaning of XML extractor selector</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/the-meaning-of-xml-extractor-selector/m-p/236965#M190095</link>
      <description>Hello Gurus, I'm trying to understand Alfresco's built-in XML metadata extraction, which I understand requires 3 configurations: 1. Configure the selector class (where I set the "worker" property) 2. Map a local variable to a content type property, where the extracted value will go 3. Map the local</description>
      <pubDate>Fri, 29 Jan 2010 23:26:56 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/the-meaning-of-xml-extractor-selector/m-p/236965#M190095</guid>
      <dc:creator>kilo</dc:creator>
      <dc:date>2010-01-29T23:26:56Z</dc:date>
    </item>
    <item>
      <title>Re: The meaning of XML extractor selector</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/the-meaning-of-xml-extractor-selector/m-p/236966#M190096</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Hi,&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;The mimetype "XML" is really an infinitely variable document format; we can't rely on it to be anything except well-formed.&amp;nbsp; The simplest way for the extractor to know what 'type' of XML it is dealing with is to "peek" into the document.&amp;nbsp; The selector runs XPath statements until it gets a hit; it then passes the document to the corresponding XPathMetadataExtracter, which runs multiple XPath statements to extract values from the document; the extracted values are then passed through the normal mapping phase, which pushes the values into a form that will be sent for persistence.&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;The &lt;/SPAN&gt;&lt;STRONG&gt;XmlMetadataExtracterTest&lt;/STRONG&gt;&lt;SPAN&gt; extracts values from different types of XML: an Alfresco content model and an Eclipse project definition.&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;Regards&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;Derek&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;PS. Recent context 'subsystem' work added some extra complexity to the code.&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 01 Feb 2010 16:26:42 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/the-meaning-of-xml-extractor-selector/m-p/236966#M190096</guid>
      <dc:creator>derek</dc:creator>
      <dc:date>2010-02-01T16:26:42Z</dc:date>
    </item>
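    <!-- Editor's sketch of the "peek" step described in the reply above. This is not Alfresco's
    implementation: the class name SelectorSketch, the SELECTORS table and the extractor labels are
    hypothetical, and the sample documents are simplified stand-ins without namespaces. It only
    illustrates the idea of trying XPath probes in order until one matches, then running further
    XPath statements against the same document to pull out values.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class SelectorSketch {
    // Hypothetical selector table: XPath probe mapped to a label for the extractor to use.
    // LinkedHashMap keeps insertion order, so probes run in the order they were registered.
    static final Map<String, String> SELECTORS = new LinkedHashMap<>();
    static {
        SELECTORS.put("/model", "alfresco-model");
        SELECTORS.put("/projectDescription", "eclipse-project");
    }

    // Parse a string into a DOM document; only well-formedness is required here.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // Run each probe until one gets a hit; return the matching label, or null if none match.
    static String select(String xml) throws Exception {
        Document doc = parse(xml);
        XPath xpath = XPathFactory.newInstance().newXPath();
        for (Map.Entry<String, String> entry : SELECTORS.entrySet()) {
            Boolean hit = (Boolean) xpath.evaluate(entry.getKey(), doc, XPathConstants.BOOLEAN);
            if (hit) {
                return entry.getValue();
            }
        }
        return null;
    }

    // Extraction step: evaluate one XPath statement and return its string value.
    static String extract(String xml, String expression) throws Exception {
        return XPathFactory.newInstance().newXPath().evaluate(expression, parse(xml));
    }

    public static void main(String[] args) throws Exception {
        String project = "<projectDescription><name>demo</name></projectDescription>";
        System.out.println(select(project));
        System.out.println(extract(project, "/projectDescription/name"));
    }
}
```

    In a real configuration the selected extractor would carry its own set of XPath mappings; here
    a single expression stands in for that set. -->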
    <item>
      <title>Re: The meaning of XML extractor selector</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/the-meaning-of-xml-extractor-selector/m-p/236967#M190097</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Thanks, Derek. Your explanation of the intent of the XML selector is very good. Does the selector process also validate the document (i.e. if a DOCTYPE is present)?&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;I also understand why there is a two-step mapping from extracted values to content properties (extracted value -&amp;gt; local variable -&amp;gt; content property). It provides an opportunity to transform an extracted value before assigning it to a content property.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Thanks.&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 02 Feb 2010 16:19:25 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/the-meaning-of-xml-extractor-selector/m-p/236967#M190097</guid>
      <dc:creator>kilo</dc:creator>
      <dc:date>2010-02-02T16:19:25Z</dc:date>
    </item>
    <item>
      <title>Re: The meaning of XML extractor selector</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/the-meaning-of-xml-extractor-selector/m-p/236968#M190098</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Hi,&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;How strict the document builder is depends on the parser that Java chooses at runtime: &lt;/SPAN&gt;&lt;PRE class="language-none line-numbers"&gt;&lt;CODE&gt;documentBuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();&lt;/CODE&gt;&lt;/PRE&gt;&lt;SPAN&gt;We have xercesImpl-2.8.0.jar on our classpath by default.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Regards&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 03 Feb 2010 11:41:59 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/the-meaning-of-xml-extractor-selector/m-p/236968#M190098</guid>
      <dc:creator>derek</dc:creator>
      <dc:date>2010-02-03T11:41:59Z</dc:date>
    </item>
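    <!-- Editor's sketch related to the validation question above. The thread does not say how
    Alfresco configures its factory beyond the one-line snippet, so this is only an illustration
    under one assumption: the factory is used as returned by newInstance(), for which the JAXP
    documentation gives validating = false. The class name ValidationSketch and the sample
    document are hypothetical. The sample is well-formed but violates its own internal DTD, so a
    non-validating builder should accept it silently while a validating one reports errors.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class ValidationSketch {
    // Well-formed, but invalid: the DTD requires a "to" child and declares no "from" element.
    static final String INVALID_BUT_WELL_FORMED =
            "<?xml version=\"1.0\"?>"
            + "<!DOCTYPE note [<!ELEMENT note (to)><!ELEMENT to (#PCDATA)>]>"
            + "<note><from>kilo</from></note>";

    // Parse the document and count recoverable errors, which is how DTD violations are reported.
    static int countValidityErrors(String xml, boolean validating) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(validating);
        DocumentBuilder builder = factory.newDocumentBuilder();
        final int[] errors = new int[1];
        builder.setErrorHandler(new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                errors[0]++;
            }
        });
        builder.parse(new InputSource(new StringReader(xml)));
        return errors[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println("default factory validating: "
                + DocumentBuilderFactory.newInstance().isValidating());
        System.out.println("errors without validation: "
                + countValidityErrors(INVALID_BUT_WELL_FORMED, false));
        System.out.println("errors with validation: "
                + countValidityErrors(INVALID_BUT_WELL_FORMED, true));
    }
}
```

    Which parser implementation is picked up at runtime can still change other behavior, such as
    how external DTDs are resolved, so results on a given classpath should be checked directly. -->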
  </channel>
</rss>

