<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Outlook msg extraction fail on Tika date format in Alfresco Archive</title>
    <link>https://connect.hyland.com/t5/alfresco-archive/outlook-msg-extraction-fail-on-tika-date-format/m-p/269520#M222650</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;I'm trying to get Outlook msg metadata extraction to work. It fails with:&lt;/SPAN&gt;&lt;BR /&gt;&lt;PRE class="language-none line-numbers"&gt;&lt;CODE&gt;Caused by: org.alfresco.service.cmr.repository.datatype.TypeConversionException: Unable to convert string to date: Thu, 19 Feb 2009 11:17:09 +0100 (CET)&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;at org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter.makeDate(AbstractMappingMetadataExtracter.java:899)&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;at org.alfresco.repo.content.metadata.TikaPoweredMetadataExtracter.makeDate(TikaPoweredMetadataExtracter.java:166)&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;at org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter.convertSystemPropertyValues(AbstractMappingMetadataExtracter.java:798)&lt;/CODE&gt;&lt;/PRE&gt;&lt;SPAN&gt;The mail extractor is able to extract all the metadata; it is just that the date isn't recognized.&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;Judging from the error, this date format isn't supported. The TikaPoweredMetadataExtracter.java class already lists a number of additional supported date formats, but none seems to match the format I've encountered.&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;I've tried setting -Duser.country=US -Duser.language=en in JAVA_OPTS, but that didn't change anything.&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;So is it Outlook that set the date format in the msg file? The msg file in question comes from an Outlook client in an all-Swedish environment.&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;If so, then no config change in Alfresco will be able to support this.
Could we make the TikaPoweredMetadataExtracter class configurable, so that those who happen to live in some obscure part of the world like Sweden can extend it with extra date formats?&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Wed, 15 Dec 2010 13:50:16 GMT</pubDate>
    <dc:creator>loftux</dc:creator>
    <dc:date>2010-12-15T13:50:16Z</dc:date>
    <item>
      <title>Outlook msg extraction fail on Tika date format</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/outlook-msg-extraction-fail-on-tika-date-format/m-p/269520#M222650</link>
      <description>I'm trying to get Outlook msg metadata extraction to work. It fails with: Caused by: org.alfresco.service.cmr.repository.datatype.TypeConversionException: Unable to convert string to date: Thu, 19 Feb 2009 11:17:09 +0100 (CET) at org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter.ma</description>
      <pubDate>Wed, 15 Dec 2010 13:50:16 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/outlook-msg-extraction-fail-on-tika-date-format/m-p/269520#M222650</guid>
      <dc:creator>loftux</dc:creator>
      <dc:date>2010-12-15T13:50:16Z</dc:date>
    </item>
    <item>
      <title>Re: Outlook msg extraction fail on Tika date format</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/outlook-msg-extraction-fail-on-tika-date-format/m-p/269521#M222651</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;It turns out it was already configurable. I redefined the extracter.Mail bean to look like extracter.RFC822&lt;/SPAN&gt;&lt;BR /&gt;&lt;PRE class="language-none line-numbers"&gt;&lt;CODE&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;lt;bean id="extracter.Mail" class="org.alfresco.repo.content.metadata.MailMetadataExtracter" parent="baseMetadataExtracter" &amp;gt;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;property name="supportedDateFormats"&amp;gt;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;list&amp;gt;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;value&amp;gt;EEE, d MMM yyyy HH:mm:ss Z&amp;lt;/value&amp;gt;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;value&amp;gt;EEE, d MMM yy HH:mm:ss Z&amp;lt;/value&amp;gt;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;/list&amp;gt;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;/property&amp;gt;&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;/bean&amp;gt;&lt;/CODE&gt;&lt;/PRE&gt;&lt;SPAN&gt;and it just worked. The issue &lt;/SPAN&gt;&lt;A href="http://issues.alfresco.com/jira/browse/ALF-2716" rel="nofollow noopener noreferrer"&gt;http://issues.alfresco.com/jira/browse/ALF-2716&lt;/A&gt;&lt;SPAN&gt;, which was logged and resolved, helped me here.&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 15 Dec 2010 16:31:19 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/outlook-msg-extraction-fail-on-tika-date-format/m-p/269521#M222651</guid>
      <dc:creator>loftux</dc:creator>
      <dc:date>2010-12-15T16:31:19Z</dc:date>
    </item>
    <item>
      <title>Re: Outlook msg extraction fail on Tika date format</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/outlook-msg-extraction-fail-on-tika-date-format/m-p/269522#M222652</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;When you say Outlook msg, are you referring to the message/rfc822 mimetype (which normally has the extension .eml) or to Outlook msg, which has the mimetype "application/vnd.ms-outlook" and normally has the extension .msg?&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sat, 18 Dec 2010 21:25:11 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/outlook-msg-extraction-fail-on-tika-date-format/m-p/269522#M222652</guid>
      <dc:creator>mrogers</dc:creator>
      <dc:date>2010-12-18T21:25:11Z</dc:date>
    </item>
    <item>
      <title>Re: Outlook msg extraction fail on Tika date format</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/outlook-msg-extraction-fail-on-tika-date-format/m-p/269523#M222653</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;I meant the application/vnd.ms-outlook (.msg) files.&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sun, 19 Dec 2010 06:12:28 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/outlook-msg-extraction-fail-on-tika-date-format/m-p/269523#M222653</guid>
      <dc:creator>loftux</dc:creator>
      <dc:date>2010-12-19T06:12:28Z</dc:date>
    </item>
  </channel>
</rss>

