Strange Lucene search exception

jcustovic
Champ in-the-making
When I use a Lucene search to fetch nodes, I get a strange exception.
I tried two ways of running the search, and both produce the same exception.

First code:

String searchString = "+EXACTTYPE:\"{mtbsVC.model}Deal\" +@mtbsVC\\:DealStatus:\"Active\"";
Query query = new Query(Constants.QUERY_LANG_LUCENE, searchString);
Node[] nodes = getRepositoryService().get(new Predicate(null, store, query));

Second code:

String searchString = "+EXACTTYPE:\"{mtbsVC.model}Deal\" +@mtbsVC\\:DealStatus:\"Active\"";
Query query = new Query(Constants.QUERY_LANG_LUCENE, searchString);
QueryResult result = getRepositoryService().query(store, query, true);

Both snippets produce this exception:
Exception in thread "main" AxisFault
faultCode: {http://schemas.xmlsoap.org/soap/envelope/}Server.userException
faultSubcode:
faultString: org.xml.sax.SAXParseException: XML document structures must start and end within the same entity.
faultActor:
faultNode:
faultDetail:
   {http://xml.apache.org/axis/}stackTrace:org.xml.sax.SAXParseException: XML document structures must start and end within the same entity.
   at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:195)
   at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:174)
   at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:388)
   at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1414)
   at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.endEntity(XMLDocumentFragmentScannerImpl.java:905)
   at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.endEntity(XMLDocumentScannerImpl.java:605)
   at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.endEntity(XMLEntityManager.java:1393)
   at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1763)
   at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.scanLiteral(XMLEntityScanner.java:1064)
   at com.sun.org.apache.xerces.internal.impl.XMLScanner.scanAttributeValue(XMLScanner.java:974)
   at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanAttribute(XMLNSDocumentScannerImpl.java:460)
   at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(XMLNSDocumentScannerImpl.java:277)
   at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2755)
   at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
   at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:140)
   at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:511)
   at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808)
   at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
   at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119)
   at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1205)
   at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522)
   at javax.xml.parsers.SAXParser.parse(SAXParser.java:395)
   at org.apache.axis.encoding.DeserializationContext.parse(DeserializationContext.java:227)
   at org.apache.axis.SOAPPart.getAsSOAPEnvelope(SOAPPart.java:696)
   at org.apache.axis.Message.getSOAPEnvelope(Message.java:435)
   at org.apache.axis.handlers.soap.MustUnderstandChecker.invoke(MustUnderstandChecker.java:62)
   at org.apache.axis.client.AxisClient.invoke(AxisClient.java:206)
   at org.apache.axis.client.Call.invokeEngine(Call.java:2784)
   at org.apache.axis.client.Call.invoke(Call.java:2767)
   at org.apache.axis.client.Call.invoke(Call.java:2443)
   at org.apache.axis.client.Call.invoke(Call.java:2366)
   at org.apache.axis.client.Call.invoke(Call.java:1812)
   at org.alfresco.webservice.repository.RepositoryServiceSoapBindingStub.get(RepositoryServiceSoapBindingStub.java:1078)
   at hr.mtbs.alfresco.webservices.AlfrescoSoapWebApiInterfaceImpl.getNodes(AlfrescoSoapWebApiInterfaceImpl.java:716)
   at hr.mtbs.alfresco.webservices.AlfrescoSoapWebApiInterfaceImpl.getNodes(AlfrescoSoapWebApiInterfaceImpl.java:711)
   at hr.chus.test.SearchDeals.main(SearchDeals.java:33)

   {http://xml.apache.org/axis/}hostname:chus-desktop

org.xml.sax.SAXParseException: XML document structures must start and end within the same entity.
   at org.apache.axis.AxisFault.makeFault(AxisFault.java:101)
   at org.apache.axis.SOAPPart.getAsSOAPEnvelope(SOAPPart.java:701)
   at org.apache.axis.Message.getSOAPEnvelope(Message.java:435)
   at org.apache.axis.handlers.soap.MustUnderstandChecker.invoke(MustUnderstandChecker.java:62)
   at org.apache.axis.client.AxisClient.invoke(AxisClient.java:206)
   at org.apache.axis.client.Call.invokeEngine(Call.java:2784)
   at org.apache.axis.client.Call.invoke(Call.java:2767)
   at org.apache.axis.client.Call.invoke(Call.java:2443)
   at org.apache.axis.client.Call.invoke(Call.java:2366)
   at org.apache.axis.client.Call.invoke(Call.java:1812)
   at org.alfresco.webservice.repository.RepositoryServiceSoapBindingStub.get(RepositoryServiceSoapBindingStub.java:1078)
   at hr.mtbs.alfresco.webservices.AlfrescoSoapWebApiInterfaceImpl.getNodes(AlfrescoSoapWebApiInterfaceImpl.java:716)
   at hr.mtbs.alfresco.webservices.AlfrescoSoapWebApiInterfaceImpl.getNodes(AlfrescoSoapWebApiInterfaceImpl.java:711)
   at hr.chus.test.SearchDeals.main(SearchDeals.java:33)
Caused by: org.xml.sax.SAXParseException: XML document structures must start and end within the same entity.
   at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:195)
   at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:174)
   at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:388)
   at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1414)
   at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.endEntity(XMLDocumentFragmentScannerImpl.java:905)
   at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.endEntity(XMLDocumentScannerImpl.java:605)
   at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.endEntity(XMLEntityManager.java:1393)
   at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1763)
   at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.scanLiteral(XMLEntityScanner.java:1064)
   at com.sun.org.apache.xerces.internal.impl.XMLScanner.scanAttributeValue(XMLScanner.java:974)
   at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanAttribute(XMLNSDocumentScannerImpl.java:460)
   at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(XMLNSDocumentScannerImpl.java:277)
   at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2755)
   at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
   at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:140)
   at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:511)
   at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808)
   at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
   at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119)
   at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1205)
   at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522)
   at javax.xml.parsers.SAXParser.parse(SAXParser.java:395)
   at org.apache.axis.encoding.DeserializationContext.parse(DeserializationContext.java:227)
   at org.apache.axis.SOAPPart.getAsSOAPEnvelope(SOAPPart.java:696)
   … 12 more

The query +EXACTTYPE:"{mtbsVC.model}Deal" +@mtbsVC\:DealStatus:"Active" works in the Alfresco Node Browser but not via the web service API.
A search such as +EXACTTYPE:"{http://www.alfresco.org/model/content/1.0}folder" works via the web service API.

The most interesting thing is that we have several Alfresco systems running, and only one of them produces this exception for the same search.

Could some data be corrupted?
3 REPLIES

openpj
Elite Collaborator
The most interesting thing is that we have several Alfresco systems running, and only one of them produces this exception for the same search.
It could be a problem with the Lucene indexes. Have you tried regenerating them?

jcustovic
Champ in-the-making
I was planning to do this, but since it is a production environment, I will ask our sysadmins to rebuild the indexes this weekend and report back on Monday.

jcustovic
Champ in-the-making
We replicated the system and I did some research. I found out that a single node was causing the problem and that it had nothing to do with rebuilding the indexes. The error in catalina.out:
java.io.IOException: java.io.IOException: java.lang.IllegalArgumentException: The char '0xb' after 'Company building the ultimate place for music in the Czech Republic & Slovakia and ' is not a valid XML character.
   at org.apache.axis.AxisFault.makeFault(AxisFault.java:101)
   at org.apache.axis.SOAPPart.writeTo(SOAPPart.java:317)
   at org.apache.axis.SOAPPart.writeTo(SOAPPart.java:269)
   at org.apache.axis.SOAPPart.saveChanges(SOAPPart.java:530)
   at org.apache.axis.attachments.AttachmentsImpl.getAttachmentCount(AttachmentsImpl.java:554)
   at org.apache.axis.Message.writeTo(Message.java:535)
   at org.apache.axis.transport.http.AxisServlet.sendResponse(AxisServlet.java:902)
   at org.apache.axis.transport.http.AxisServlet.doPost(AxisServlet.java:777)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:637)
   at org.apache.axis.transport.http.AxisServletBase.service(AxisServletBase.java:327)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
   at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
   at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
   at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
   at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
   at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
   at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
   at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
   at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
   at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:849)
   at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
   at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:454)
   at java.lang.Thread.run(Thread.java:619)

And the original text that caused the problem is:
Company building the ultimate place for music in the Czech Republic & Slovakia and  beyond by offering high quality music streaming service and related offerings. Content is delivered in cooperation with major music labels. The model based on successful examples  (e.g. Spotify).

You can try copying and pasting this text into Notepad or another text editor, and you will notice a strange character in front of "beyond" and "(e.g. Spotify)".

This is what caused the problem: the node containing this text couldn't be accessed through the web service at all (not by getNode, not by a Lucene search, not in any other way) because the server couldn't produce valid XML. After manually removing those characters from the text, everything worked fine.
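
For anyone hitting the same thing, here is a minimal sketch of how such characters could be stripped from a text value before it is stored, instead of cleaning the node by hand. The char 0xb from the log is a vertical tab (U+000B), which XML 1.0 does not allow. The class and method names below (XmlCharSanitizer, isValidXmlChar, stripInvalidXmlChars) are made up for illustration and are not part of Alfresco or Axis; the check simply follows the XML 1.0 definition of a valid character.

```java
final class XmlCharSanitizer {

    // True if the code point is allowed by the XML 1.0 "Char" production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Returns a copy of the input with every code point that XML 1.0 forbids removed.
    static String stripInvalidXmlChars(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (isValidXmlChar(cp)) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp); // step over surrogate pairs as one unit
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample: the vertical tab (0x0B) sits right before "beyond".
        String raw = "Slovakia and " + (char) 0x0B + "beyond";
        System.out.println(stripInvalidXmlChars(raw));
    }
}
```

Calling something like stripInvalidXmlChars on a property value before sending it to the repository is one possible place for such a check. It only prevents new bad data; nodes that already contain these characters would still need to be cleaned.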

I'm not sure whether this is an Alfresco problem. Can it be improved?