<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic xml meta data extractor doesn't work with DOCTYPE in Alfresco Archive</title>
    <link>https://connect.hyland.com/t5/alfresco-archive/xml-meta-data-extractor-doesn-t-work-with-doctype/m-p/236880#M190010</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Hello,&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;I noticed that my XML metadata extractor doesn't work when the content (XML, of course) has a &amp;lt;!DOCTYPE &amp;gt; declaration. I'm using the built-in XML extractor, just like the sample. I get this log message:&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;PRE class="language-none line-numbers"&gt;&lt;CODE&gt;&lt;BR /&gt;No working metadata extractor could be found: &lt;BR /&gt;&amp;nbsp;&amp;nbsp; Document: ContentAccessor[ contentUrl=store://2010/1/29/17/54/eed0a6a1-49b3-4863-b3aa-4edfa0fef55d.bin, mimetype=text/xml, size=4371, encoding=UTF-8, locale=en_US]&lt;BR /&gt;17:54:10,870 INFO&amp;nbsp; [STDOUT] 17:54:10,870 User:admin DEBUG [metadata.xml.XPathMetadataExtracter] &lt;BR /&gt;XML metadata extractor redirected: &lt;BR /&gt;&amp;nbsp;&amp;nbsp; Reader:&amp;nbsp;&amp;nbsp;&amp;nbsp; ContentAccessor[ contentUrl=store://2010/1/29/17/54/eed0a6a1-49b3-4863-b3aa-4edfa0fef55d.bin, mimetype=text/xml, size=4371, encoding=UTF-8, locale=en_US]&lt;BR /&gt;&amp;nbsp;&amp;nbsp; Extracter: null&lt;BR /&gt;&lt;/CODE&gt;&lt;/PRE&gt;&lt;BR /&gt;&lt;SPAN&gt;This seems strange, since everything works as expected when the XML document has no &amp;lt;!DOCTYPE &amp;gt; declaration. Is this because the built-in extractor is trying to validate the XML document?&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Is there any option to disable &amp;lt;!DOCTYPE &amp;gt; interpretation in the built-in extractor?&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Thank you. I would appreciate any suggestions.&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Fri, 29 Jan 2010 23:06:56 GMT</pubDate>
    <dc:creator>kilo</dc:creator>
    <dc:date>2010-01-29T23:06:56Z</dc:date>
    <item>
      <title>xml meta data extractor doesn't work with DOCTYPE</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/xml-meta-data-extractor-doesn-t-work-with-doctype/m-p/236880#M190010</link>
      <description>Hello, I noticed that my XML metadata extractor doesn't work when the content (XML, of course) has a &amp;lt;!DOCTYPE &amp;gt; declaration. I'm using the built-in XML extractor, just like the sample. I get this log message: No working metadata extractor could be found: &amp;nbsp;&amp;nbsp; Document: ContentAccessor[ contentUrl=store</description>
      <pubDate>Fri, 29 Jan 2010 23:06:56 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/xml-meta-data-extractor-doesn-t-work-with-doctype/m-p/236880#M190010</guid>
      <dc:creator>kilo</dc:creator>
      <dc:date>2010-01-29T23:06:56Z</dc:date>
    </item>
    <item>
      <title>Re: xml meta data extractor doesn't work with DOCTYPE</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/xml-meta-data-extractor-doesn-t-work-with-doctype/m-p/236881#M190011</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Hi Kilo, I'm trying to extract metadata (through wcm-xml-metadata-extracter-context.xml) from an XML file that looks like this:&lt;/SPAN&gt;&lt;BR /&gt;&lt;PRE class="language-none line-numbers"&gt;&lt;CODE&gt;&amp;lt;documento&amp;gt;&lt;BR /&gt;&amp;nbsp; &amp;lt;destinatario&amp;gt;Pippo&amp;lt;/destinatario&amp;gt;&lt;BR /&gt;&amp;nbsp; &amp;lt;tipo_documento&amp;gt;FATTURA&amp;lt;/tipo_documento&amp;gt;&lt;BR /&gt;&amp;nbsp; &amp;lt;codice_articolo&amp;gt;Art1234&amp;lt;/codice_articolo&amp;gt;&lt;BR /&gt;&amp;nbsp; &amp;lt;numero_fattura&amp;gt;31&amp;lt;/numero_fattura&amp;gt;&lt;BR /&gt;&amp;lt;/documento&amp;gt;&lt;/CODE&gt;&lt;/PRE&gt;&lt;SPAN&gt;but my extractor doesn't work!&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;Could you please post the code of your extractor?&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 24 Feb 2010 14:00:56 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/xml-meta-data-extractor-doesn-t-work-with-doctype/m-p/236881#M190011</guid>
      <dc:creator>_valerio_</dc:creator>
      <dc:date>2010-02-24T14:00:56Z</dc:date>
    </item>
    <item>
      <title>Re: xml meta data extractor doesn't work with DOCTYPE</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/xml-meta-data-extractor-doesn-t-work-with-doctype/m-p/236882#M190012</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Hi,&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;I'm having a similar issue extracting metadata from DITA XML topic and map files. The metadata is extracted fine when the DOCTYPE declaration is not present. Have you had any luck with this?&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 09 Jul 2010 14:15:15 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/xml-meta-data-extractor-doesn-t-work-with-doctype/m-p/236882#M190012</guid>
      <dc:creator>syspro</dc:creator>
      <dc:date>2010-07-09T14:15:15Z</dc:date>
    </item>
  </channel>
</rss>

