I've come across a situation where data loss occurs when adding multipart/mixed email messages to Alfresco IMAP using Outlook 2010. What happens is that the text/plain part of the message gets truncated. Here is an example:
<blockquote>
… Removed some rows for readability ….
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="----=_NextPart_000_002C_01CE50BB.47FC5A30"
X-Mailer: Example IT Department
… Removed some rows for readability ….
Mitt namn är Firstname Lastname, jag kommer att vara er kontaktperson för flyttningen till Sverige. Jag har sökt dig per telefon idag för att diskutera datum. Vi har möjlighet enligt följande:
Packning / Lastning: 2013-01-01 och /ell
------=_NextPart_000_002C_01CE50BB.47FC5A30
Content-Type: application/pdf; name="Movingchecklist.pdf"
Content-Transfer-Encoding: base64
Content-ID: <9153B8BF1957DB4AB952F7D81A0C28B9@example.int>
Content-Disposition: attachment; filename="Movingchecklist.pdf"
… Removed base64 for readability ….
</blockquote>
The text/plain part sometimes, as in this example, but not always, gets truncated to around 245 characters. The actual message was around 1000 characters.
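To find affected messages without opening each one by hand, the length of the text/plain part can be measured from the raw message. The following is only a rough sketch using Python's standard `email` package; the function name `text_plain_lengths` and the `SAMPLE` message are made up for illustration and are not taken from the real mail.

```python
from email import message_from_bytes
from email.policy import default


def text_plain_lengths(raw):
    """Return the decoded character count of each non-attachment text/plain part."""
    msg = message_from_bytes(raw, policy=default)
    lengths = []
    for part in msg.walk():
        # Skip the multipart container and any attachments (e.g. the PDF).
        if part.get_content_type() == "text/plain" and not part.is_attachment():
            lengths.append(len(part.get_content()))
    return lengths


# Hypothetical sample message, built here only to exercise the function.
SAMPLE = (
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/mixed; boundary="BOUNDARY"\r\n'
    b"\r\n"
    b"--BOUNDARY\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"\r\n"
    b"Hello\r\n"
    b"--BOUNDARY\r\n"
    b'Content-Type: application/pdf; name="a.pdf"\r\n'
    b"Content-Transfer-Encoding: base64\r\n"
    b'Content-Disposition: attachment; filename="a.pdf"\r\n'
    b"\r\n"
    b"JVBERi0=\r\n"
    b"--BOUNDARY--\r\n"
)
```

Running `text_plain_lengths` over the raw source of each archived message and flagging suspiciously short bodies (for example, ones that end mid-word at around 245 characters) would give a first list of candidates to check against the original mailbox.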
What I know so far:
- It seems to happen only when there is a PDF attachment in the mail.
- Outlook 2010 seemingly at random decides to use multipart/mixed with its own Microsoft RTF format (and thus winmail.dat as the attachment instead). When this happens, the text does not get truncated.
- It can happen regardless of the method used to transfer the email from Outlook to Alfresco: ctrl+c/ctrl+v, drag and drop, or copy via the menu.
- Other mail clients tested do not have this problem.
- Attachments are NOT extracted in Alfresco, so they are saved as part of the message.
- Attachments never get corrupted; it is just the text/plain part.
- The mail server is Exchange 2013.
<strong>Question</strong> Does Alfresco (or the Greenmail component used for IMAP) actually take the message apart and then reassemble it? I need to know this to find out whether Outlook or Alfresco is to blame. If Alfresco does nothing with the content, then it might just be an Outlook bug.
Since Alfresco is used for email archival, this had been going on for some time before users noticed. Usually they just drop in the mails without checking them. This is a very serious data loss situation.
The Alfresco IMAP component and the Greenmail integration for the IMAP protocol don't take MIME messages apart and, as far as I can see in the code, always write them out in full, exactly as read from the Java socket input stream.
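One way to check this on a concrete message is to compare the raw bytes of the same mail before and after it lands in Alfresco. This is only a sketch: `first_difference` is a made-up helper name, and how the two raw copies are obtained (for example, the message source from Exchange and the content stored in Alfresco) is left open here.

```python
from typing import Optional


def first_difference(before: bytes, after: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if both copies match.

    Line endings are normalised to LF first, so a pure CRLF/LF difference
    is not reported as a change.
    """
    a = before.replace(b"\r\n", b"\n")
    b = after.replace(b"\r\n", b"\n")
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    if len(a) != len(b):
        # One copy is a strict prefix of the other, i.e. it was cut short.
        return min(len(a), len(b))
    return None
```

If the two copies match, the truncation already happened before the message reached Alfresco. If they differ, the offset shows whether the change sits inside the text/plain part or elsewhere.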
Thanks, Alex, for looking into this. My conclusion is also that the message is not taken apart unless you extract attachments (then it has to be). The next step is to find a Microsoft expert to look at what Outlook is doing.