CJK characters garbled in MSG files (MailContentTransformer)

irvingpop
Champ in-the-making
Server: Alfresco 3.4.d on Ubuntu 10.04, MySQL (UTF-8) DB

I'm getting reports from users that text extraction is broken for double-byte characters (Chinese in my case) in Outlook MSG files.

The rest of the email text is extracted correctly, but the Chinese characters look like this:  @z/NHa

I enabled previews for MSG files using these instructions:  http://issues.alfresco.com/jira/browse/ALF-6200

Also, you cannot search for Chinese terms that are contained in the emails.

Has anyone else tried this?
4 REPLIES

irvingpop
Champ in-the-making
Can anybody please try this test MSG file in their Alfresco installation? It's written in a mix of Latin and Traditional Chinese characters:
http://www.cloudest.com/alfresco/chinese-traditional.msg

mrogers
Star Contributor
Can you please raise the issue in Jira? Make sure you attach your sample message to the Jira ticket so we can check your exact test case.

irvingpop
Champ in-the-making
Thanks, I have created the issue in Jira:  http://issues.alfresco.com/jira/browse/ALF-7959

mrogers
Star Contributor
:)