Axis and SAXParseException

dvelasov
Champ in-the-making
Hi all.
I have a resource library based on the Alfresco CMS (plus Liferay).
Almost all of the books are written in Russian.
I need to provide a search service for this library.
After some changes to core-services-context.xml and dictionaryModel.xml,
I got Russian Lucene queries working in the Node Browser.
But I ran into another problem. I developed my own HResourceLibraryPortlet for ordinary users. Among other things, it contains a single text field for Lucene search.
When I send a search request containing Russian letters,
I get the following exception:


AxisFault
faultCode: {http://schemas.xmlsoap.org/soap/envelope/}Server.userException
faultSubcode:
faultString: org.xml.sax.SAXParseException: An invalid XML character (Unicode: 0x1f) was found in the element content of the document.
faultActor:
faultNode:
faultDetail:
   {http://xml.apache.org/axis/}stackTrace:org.xml.sax.SAXParseException: An invalid XML character (Unicode: 0x1f) was found in the element content of the document.
   at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
   at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
   at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
   at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
   at org.apache.xerces.impl.XMLScanner.reportFatalError(Unknown Source)
   at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
   at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
   at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
   at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
   at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
   at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
   at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
   at org.apache.xerces.jaxp.SAXParserImpl.parse(Unknown Source)
   at org.apache.axis.encoding.DeserializationContext.parse(DeserializationContext.java:227)
   at org.apache.axis.SOAPPart.getAsSOAPEnvelope(SOAPPart.java:696)
   at org.apache.axis.Message.getSOAPEnvelope(Message.java:435)
   at org.apache.axis.server.AxisServer.initSOAPConstants(AxisServer.java:345)
   at org.apache.axis.server.AxisServer.invoke(AxisServer.java:279)
   at org.apache.axis.transport.http.AxisServlet.doPost(AxisServlet.java:699)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:710)
   at org.apache.axis.transport.http.AxisServletBase.service(AxisServletBase.java:327)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
   at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
   at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
   at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
   at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
   at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
   at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
   at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
   at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:263)
   at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
   at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:584)
   at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
   at java.lang.Thread.run(Thread.java:619)

It would be great if someone could give me some advice.
(I have no problems with English letters in my portlet.)
5 REPLIES

dvelasov
Champ in-the-making
Is this a problem in the Alfresco or Axis code?

dvelasov
Champ in-the-making
I tried performing the search with a hard-coded String object encoded as UTF-8.
It leads to the same exception.

I modified the Axis code to print the whole SOAP message as a string and to report the encoding used.
The encoding is "utf-8", but the Cyrillic letters had been turned into a nonsense sequence of characters that are not valid in XML.
Of course, any attempt to parse them throws the exception above.
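
For anyone debugging a similar fault, one way to confirm that a string contains characters the parser will reject is to check it against the XML 1.0 Char production before it goes into the SOAP request. The following is only a minimal sketch: the class and method names (XmlCharCheck, isValidXmlChar, firstInvalidIndex) are hypothetical and are not part of Axis or Alfresco, and the sample query string is made up for illustration.

```java
public final class XmlCharCheck {
    private XmlCharCheck() {}

    /** True if the code point is allowed by the XML 1.0 Char production. */
    public static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns the index of the first invalid character, or -1 if none. */
    public static int firstInvalidIndex(String s) {
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!isValidXmlChar(cp)) {
                return i;
            }
            i += Character.charCount(cp);
        }
        return -1;
    }

    public static void main(String[] args) {
        // Hypothetical Lucene query; the escapes spell the Russian word for "book".
        String query = "TEXT:\"\u043A\u043D\u0438\u0433\u0430\"";
        System.out.println(firstInvalidIndex(query));          // -1: Cyrillic is valid XML
        System.out.println(firstInvalidIndex("abc\u001Fdef")); // 3: U+001F is rejected
    }
}
```

Properly encoded Cyrillic passes this check, so a hit on something like 0x1F suggests the text was damaged before or during serialization rather than by the choice of UTF-8 itself.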

The native Node Browser uses org.alfresco.service.cmr…..
Hmm…. Maybe I should use this API and add a servlet to alfresco.war to save time.

But it would certainly be great to find the place in the code that is responsible for this behaviour.

rwetherall
Confirmed Champ
Hi,

This certainly sounds like a problem with character encoding. I don't know whether the issue lies with Axis or Alfresco, but it seems you have a reproducible case.

The best course of action is to create a JIRA issue (http://issues.alfresco.com) describing the problem. A unit test or example code that we can use to reproduce the issue will help. It will be particularly helpful if you specify the environment you are working in and the character sets you are using.

Cheers,
Roy

dvelasov
Champ in-the-making
Here is the requested JIRA issue:
http://issues.alfresco.com/browse/AR-2156

Thanks for your attention.
D. Vlasov

dvelasov
Champ in-the-making
The problem turned out to be simple.
I think you can remove the JIRA issue.

I forgot that, among other libraries on the client side, I was still using a set of jar files (alfresco-web-service-client.jar and some others) from the previous version of the Alfresco CMS.
Huh….
(^_^)
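
A stale client jar of this kind can often be spotted by asking the JVM where a class was actually loaded from. The sketch below relies on the standard Class.getProtectionDomain().getCodeSource() call; the JarLocator class name is hypothetical, and which client class to inspect is left as an assumption for the reader to fill in on the command line.

```java
import java.security.CodeSource;

public final class JarLocator {
    private JarLocator() {}

    /** Returns the jar or directory a class was loaded from, if known. */
    public static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            // Typical for classes loaded by the bootstrap loader, such as java.lang.String.
            return "(no code source)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass fully qualified class names as arguments.
        for (String name : args) {
            System.out.println(name + " -> " + locationOf(Class.forName(name)));
        }
    }
}
```

Running it inside the portlet's classpath with a class from the web service client shows the path of the jar that really supplies it, which makes a leftover jar from an older release easy to notice.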