Pass Chinese characters through the CMLAddAspect class

liao1108
Champ in-the-making
I am trying to upload an Excel file to Alfresco 2.1 through the remote web service; the file name and description are passed as parameters of the CMLAddAspect class. Unfortunately, some characters (such as 0x6) conflict with XML and cause the SAXParser to stop. Does anyone know how to avoid or escape these characters so that they can be saved to the Alfresco server safely? Thanks a lot.
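For context, XML 1.0 forbids most control characters (0x6 among them), so a parser rejects them even when they are written as character references. A minimal sketch of one workaround is to strip such characters from the file name and description on the client before building the CML statement. The class and method names below are hypothetical and not part of the Alfresco SDK, and whether dropping the characters is acceptable depends on your data.

```java
// Hypothetical helper (not part of the Alfresco SDK): removes characters
// that the XML 1.0 "Char" production does not allow.
final class XmlSafeText {
    private XmlSafeText() {}

    /** True if the code point is allowed by the XML 1.0 Char production. */
    static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns the input with every code point that XML 1.0 forbids removed. */
    static String stripInvalidXmlChars(String input) {
        if (input == null) {
            return null;
        }
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            if (isXml10Char(cp)) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Sample value containing the 0x6 control character.
        String description = "Quarterly\u0006 report";
        System.out.println(stripInvalidXmlChars(description));
    }
}
```

Chinese or Japanese text passes through this filter unchanged, since those characters are legal in XML; only the control characters are dropped.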

Leo
8 REPLIES

rwetherall
Confirmed Champ
Hi,

I'm afraid this isn't something I've encountered.

Perhaps you could package your example into a Unit test that we can execute here.  That will help us determine whether this can be resolved with some configuration or whether we have a bug.

Cheers,
Roy

gblomqui
Champ in-the-making
We reproduced the same problem with Alfresco 2.0 running on JBoss.

To reproduce, change the
org.alfresco.sample.webservice.SamplesBase
class in the Alfresco Web Services SDK Samples to include
Utils.createNamedValue(Constants.PROP_TITLE, "パートナープロファイル

gblomqui
Champ in-the-making
A bit of clarification,

The text I used as the title combines Katakana, Hiragana, Kanji, the numeral "3", and spaces.  We tried weeding out various parts of the string by script type, and found the following:

    1. The Katakana characters, the numeral "3", and the spaces all transmit over the Web Service interface without issues
    2. The Hiragana characters transmit, but they get mangled
    3. The Kanji characters generate the exception from my previous post
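The weeding-out step can be sketched with the JDK's `Character.UnicodeBlock`, which reports the script block of each character. This is an illustration only: the class name, the method names, and the sample string in `main` are hypothetical, and the sample is not the title used in the test above.

```java
// Hypothetical helper for isolating one script at a time from a test string.
final class ScriptSplitter {
    private ScriptSplitter() {}

    /** Maps a code point to a coarse script label based on its Unicode block. */
    static String scriptOf(int cp) {
        Character.UnicodeBlock block = Character.UnicodeBlock.of(cp);
        if (block == Character.UnicodeBlock.KATAKANA) {
            return "Katakana";
        }
        if (block == Character.UnicodeBlock.HIRAGANA) {
            return "Hiragana";
        }
        if (block == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS) {
            return "Kanji";
        }
        return "Other";
    }

    /** Keeps only the characters of the given script label. */
    static String keepOnly(String text, String script) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (scriptOf(cp).equals(script)) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Made-up sample: one Katakana, one Hiragana, one Kanji, a space and "3".
        String sample = "\u30D1\u306E\u60C5 3";
        for (String script : new String[] {"Katakana", "Hiragana", "Kanji", "Other"}) {
            System.out.println(script + ": [" + keepOnly(sample, script) + "]");
        }
    }
}
```

Sending each filtered string as the title in turn shows which script triggers the mangling or the exception.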

gblomqui
Champ in-the-making
Using TCP Monitor we've been able to find that the SOAP Envelope is getting decoded on the client before it gets sent across the wire.  At least, TCP Monitor is showing the SOAP Envelope with the decoded (and munged) characters.

When debugging and stepping through the client code, I can see that the multibyte characters are encoded as XML numeric character references (for instance "&#x30D1;" for "パ" and "&#x60C5;" for "情").
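To make the entity form concrete: a numeric character reference is the character's Unicode code point written as `&#x…;`. The sketch below is a hypothetical helper, not Axis code, and it shows only the encoding idea; it does not escape `&`, `<`, or `>`, so it is not a complete XML escaper.

```java
// Hypothetical illustration of numeric character references; not Axis code.
final class NumericCharRefs {
    private NumericCharRefs() {}

    /** Replaces every non-ASCII code point with a hexadecimal character reference. */
    static String encodeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (cp < 0x80) {
                out.appendCodePoint(cp); // ASCII is copied through unchanged
            } else {
                out.append("&#x")
                   .append(Integer.toHexString(cp).toUpperCase())
                   .append(';');
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // U+30D1 (Katakana PA) and U+60C5 (a Kanji), followed by " 3".
        System.out.println(encodeNonAscii("\u30D1\u60C5 3")); // prints &#x30D1;&#x60C5; 3
    }
}
```

A conforming XML parser turns these references back into the original characters, so seeing them in the debugger is not in itself a sign of corruption.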

However, since TCP Monitor is showing the decoded (and munged) characters, I suspect that somewhere along the line, Axis is realizing that the character set encoding of the SOAP Envelope is UTF-8 and decodes the Envelope before transmission.  But, we're still tracking that down.

gblomqui
Champ in-the-making
We believe the culprit is xmlsec 1.4.0 after reading:

http://issues.apache.org/bugzilla/show_bug.cgi?id=41462

The Canonicalization engine in xmlsec was revamped between versions 1.4.0 and 1.4.1. 

I just downloaded the Alfresco 2.1 SDK and saw that it still includes xmlsec 1.4.0.  I strongly suggest moving to xmlsec 1.4.1 to fix this issue.
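Before and after swapping the jar, it can help to confirm which xmlsec jar the client actually loads, since an older copy elsewhere on the classpath may shadow the new one. The following is a minimal sketch; the class `JarLocator` is hypothetical, and using `org.apache.xml.security.Init` as the probe class is an assumption about which class to look up.

```java
import java.security.CodeSource;

// Hypothetical diagnostic: reports where a class was loaded from.
final class JarLocator {
    private JarLocator() {}

    /** Returns the code source location of the class, or a note if none is recorded. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null) {
            return "(no code source recorded)";
        }
        return String.valueOf(source.getLocation());
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Assumed probe class from xmlsec; pass another class name as the first argument.
        String name = args.length > 0 ? args[0] : "org.apache.xml.security.Init";
        System.out.println(name + " loaded from " + locationOf(Class.forName(name)));
    }
}
```

Run it with the same classpath as the web service client; the printed path should end in the xmlsec jar you expect.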

rwetherall
Confirmed Champ
Hi,

Thanks for investigating this thus far.

If this updated jar fixes this issue please let us know and we'll look at upgrading the jar before the next release.

Cheers,
Roy

gblomqui
Champ in-the-making
No problem.  Glad to help.

We replaced the xmlsec-1.4.0 library with xmlsec-1.4.1 and it solved our problem.  We're now able to submit "パートナープロファイル

fiki
Champ on-the-rise
I had a similar problem… In my custom application I tried to perform a search against the Alfresco repository using web services. When the search string included Slovenian characters (Ä