<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: General metadata extraction from MS Office (primarily MS Word) in Nuxeo Forum</title>
    <link>https://connect.hyland.com/t5/nuxeo-forum/general-metadata-extraction-from-ms-office-primarily-ms-word/m-p/318973#M5974</link>
    <description>&lt;P&gt;Studio alone won't be enough, but it can help you define new metadata fields in the Nuxeo document types to hold your custom metadata, along with the matching layout to display them (or edit them manually from the web browser).&lt;/P&gt;
&lt;P&gt;If you are a Java developer, you could write an &lt;A href="http://doc.nuxeo.com/display/NXDOC/Content+Automation"&gt;Automation Operation&lt;/A&gt; in Java using the &lt;A href="http://blogs.nuxeo.com/marketing/2011/12/nuxeo-studio-and-nuxeo-ide-new-ways-to-configure-content-centric-applications.html"&gt;Nuxeo IDE&lt;/A&gt; that embeds the &lt;A href="https://tika.apache.org/"&gt;Apache Tika&lt;/A&gt; library for the extraction itself, and then plug it into a user action or an event listener to trigger the extraction whenever a document is modified.&lt;/P&gt;</description>
    <pubDate>Thu, 29 Dec 2011 18:13:19 GMT</pubDate>
    <dc:creator>Olivier_Grisel</dc:creator>
    <dc:date>2011-12-29T18:13:19Z</dc:date>
    <item>
      <title>General metadata extraction from MS Office (primarily MS Word)</title>
      <link>https://connect.hyland.com/t5/nuxeo-forum/general-metadata-extraction-from-ms-office-primarily-ms-word/m-p/318972#M5973</link>
      <description>&lt;P&gt;What are the steps a developer needs to go through to display custom metadata from an MS Word document (the standard Nuxeo extraction is rather poor)?&lt;/P&gt;
&lt;P&gt;Let me give an example. Suppose most documents in our company have a custom MS Word property called "Document Description". When I navigate to such a document in my workspace, I would like to see the Document Description field in the "Metadata" section of the document's "Summary" tab.&lt;/P&gt;
&lt;P&gt;I assume several steps are needed to achieve this behaviour.&lt;/P&gt;
&lt;P&gt;Would Studio help with this (automatic metadata extraction) or not?&lt;/P&gt;</description>
      <pubDate>Thu, 29 Dec 2011 14:49:33 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/nuxeo-forum/general-metadata-extraction-from-ms-office-primarily-ms-word/m-p/318972#M5973</guid>
      <dc:creator>Benedikt_Naesse</dc:creator>
      <dc:date>2011-12-29T14:49:33Z</dc:date>
    </item>
    <item>
      <title>Re: General metadata extraction from MS Office (primarily MS Word)</title>
      <link>https://connect.hyland.com/t5/nuxeo-forum/general-metadata-extraction-from-ms-office-primarily-ms-word/m-p/318973#M5974</link>
      <description>&lt;P&gt;Studio alone won't be enough, but it can help you define new metadata fields in the Nuxeo document types to hold your custom metadata, along with the matching layout to display them (or edit them manually from the web browser).&lt;/P&gt;
&lt;P&gt;If you are a Java developer, you could write an &lt;A href="http://doc.nuxeo.com/display/NXDOC/Content+Automation"&gt;Automation Operation&lt;/A&gt; in Java using the &lt;A href="http://blogs.nuxeo.com/marketing/2011/12/nuxeo-studio-and-nuxeo-ide-new-ways-to-configure-content-centric-applications.html"&gt;Nuxeo IDE&lt;/A&gt; that embeds the &lt;A href="https://tika.apache.org/"&gt;Apache Tika&lt;/A&gt; library for the extraction itself, and then plug it into a user action or an event listener to trigger the extraction whenever a document is modified.&lt;/P&gt;</description>
      <pubDate>Thu, 29 Dec 2011 18:13:19 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/nuxeo-forum/general-metadata-extraction-from-ms-office-primarily-ms-word/m-p/318973#M5974</guid>
      <dc:creator>Olivier_Grisel</dc:creator>
      <dc:date>2011-12-29T18:13:19Z</dc:date>
    </item>
  </channel>
</rss>

