Hyland Connect

kilo · 01-29-2010

Hello,

I noticed that my XML metadata extractor doesn't work when the content (XML, of course) has a <!DOCTYPE> declaration. I'm using the built-in XML extractor, just like the sample. I get this log message:


No working metadata extractor could be found: 
   Document: ContentAccessor[ contentUrl=store://2010/1/29/17/54/eed0a6a1-49b3-4863-b3aa-4edfa0fef55d.bin, mimetype=text/xml, size=4371, encoding=UTF-8, locale=en_US]
17:54:10,870 INFO  [STDOUT] 17:54:10,870 User:admin DEBUG [metadata.xml.XPathMetadataExtracter] 
XML metadata extractor redirected: 
   Reader:    ContentAccessor[ contentUrl=store://2010/1/29/17/54/eed0a6a1-49b3-4863-b3aa-4edfa0fef55d.bin, mimetype=text/xml, size=4371, encoding=UTF-8, locale=en_US]
   Extracter: null

This seems strange, since everything works as expected when the XML document has no <!DOCTYPE> declaration. Is this because the built-in extractor is trying to validate the XML document?

Is there any option to disable <!DOCTYPE> interpretation in the built-in extractor?

Thank you. I would appreciate any suggestions.

_valerio_ · 02-24-2010

Hi Kilo, I'm trying to extract metadata (through wcm-xml-metadata-extracter-context.xml) from an XML file that looks like this:

<documento>
  <destinatario>Pippo</destinatario>
  <tipo_documento>FATTURA</tipo_documento>
  <codice_articolo>Art1234</codice_articolo>
  <numero_fattura>31</numero_fattura>
</documento>

but my extractor doesn't work!
Could you please post the code of your extractor?
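
One way to narrow this down is to rule out the XPath expressions themselves by running them outside Alfresco. The following is only a minimal sketch using the standard JAXP XPath API, not the Alfresco extractor: the class name `XPathCheck` and the expressions are assumptions made up for illustration, based on the sample document above.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Alfresco: evaluates one XPath
// expression against an XML string and returns the result as text.
public class XPathCheck {

    public static String evaluate(String xml, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // The String-returning overload yields the string value of the first match.
        return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        String xml =
              "<documento>"
            + "<destinatario>Pippo</destinatario>"
            + "<tipo_documento>FATTURA</tipo_documento>"
            + "<codice_articolo>Art1234</codice_articolo>"
            + "<numero_fattura>31</numero_fattura>"
            + "</documento>";

        // Assumed expressions; adjust them to match the extractor's XPath mapping.
        System.out.println(evaluate(xml, "/documento/destinatario/text()"));  // prints "Pippo"
        System.out.println(evaluate(xml, "/documento/numero_fattura/text()")); // prints "31"
    }
}
```

If the expressions return the expected values here, the problem is more likely in the extractor or selector configuration than in the XPath itself.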

syspro · ‎07-09-2010

Hi,

I'm having a similar issue with regards to extracting Meta data from DITA XML topic and Map files. The information is extracted fine when the Doctype declaration is not present. Have you had any luck with this?
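
One plausible cause, not confirmed anywhere in this thread, is that the XML parser tries to load the external DTD named in the DOCTYPE and fails when it cannot be reached, even though validation is off. The sketch below shows the general JAXP technique for parsing such a document without fetching the DTD. It is a standalone illustration under that assumption, not the built-in extractor's code or a documented Alfresco option; the class name `DoctypeTolerantParser` and the DTD URL are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Alfresco: parses XML that carries a
// DOCTYPE without ever fetching the external DTD it points to.
public class DoctypeTolerantParser {

    public static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // A non-validating parser may still try to read the external DTD.
        factory.setValidating(false);
        DocumentBuilder builder = factory.newDocumentBuilder();
        // Answer every external entity request with an empty stream,
        // so no network or file lookup happens for the DTD.
        builder.setEntityResolver(
            (publicId, systemId) -> new InputSource(new StringReader("")));
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // The DTD URL is a placeholder that is never actually fetched.
        String xml =
              "<?xml version=\"1.0\"?>\n"
            + "<!DOCTYPE documento SYSTEM \"http://example.invalid/documento.dtd\">\n"
            + "<documento><destinatario>Pippo</destinatario></documento>";
        Document doc = parse(xml);
        System.out.println(doc.getDocumentElement().getNodeName()); // prints "documento"
    }
}
```

The trade-off is that entities and default attribute values declared only in the DTD are not available, which matters for DITA content that relies on DTD-supplied defaults such as the class attribute.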

xml meta data extractor doesn't work with DOCTYPE