Strange Lucene search exception

02-03-2011 10:23 AM
When using a Lucene search to get nodes, I get a strange exception.
The query +EXACTTYPE:"{mtbsVC.model}Deal" +@mtbsVC\:DealStatus:"Active" works from the Alfresco Node Browser but not via the web service API.
A search like +EXACTTYPE:"{http://www.alfresco.org/model/content/1.0}folder" works via the web service API.
The most interesting thing is that we have several Alfresco systems running, and only one of them produces this exception for the same search.
Could some data be corrupted?
I tried two methods of searching, but both produce this exception.
First code:
…
String searchString = "+EXACTTYPE:\"{mtbsVC.model}Deal\" +@mtbsVC\\:DealStatus:\"Active\"";
Query query = new Query(Constants.QUERY_LANG_LUCENE, searchString);
Node[] nodes = getRepositoryService().get(new Predicate(null, store, query));
…
Second code:
…
String searchString = "+EXACTTYPE:\"{mtbsVC.model}Deal\" +@mtbsVC\\:DealStatus:\"Active\"";
Query query = new Query(Constants.QUERY_LANG_LUCENE, searchString);
QueryResult result = getRepositoryService().query(store, query, true);
…
Both snippets produce this exception:
Exception in thread "main" AxisFault
faultCode: {http://schemas.xmlsoap.org/soap/envelope/}Server.userException
faultSubcode:
faultString: org.xml.sax.SAXParseException: XML document structures must start and end within the same entity.
faultActor:
faultNode:
faultDetail:
{http://xml.apache.org/axis/}stackTrace:org.xml.sax.SAXParseException: XML document structures must start and end within the same entity.
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:195)
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:174)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:388)
at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1414)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.endEntity(XMLDocumentFragmentScannerImpl.java:905)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.endEntity(XMLDocumentScannerImpl.java:605)
at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.endEntity(XMLEntityManager.java:1393)
at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1763)
at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.scanLiteral(XMLEntityScanner.java:1064)
at com.sun.org.apache.xerces.internal.impl.XMLScanner.scanAttributeValue(XMLScanner.java:974)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanAttribute(XMLNSDocumentScannerImpl.java:460)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(XMLNSDocumentScannerImpl.java:277)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2755)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:140)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:511)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1205)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522)
at javax.xml.parsers.SAXParser.parse(SAXParser.java:395)
at org.apache.axis.encoding.DeserializationContext.parse(DeserializationContext.java:227)
at org.apache.axis.SOAPPart.getAsSOAPEnvelope(SOAPPart.java:696)
at org.apache.axis.Message.getSOAPEnvelope(Message.java:435)
at org.apache.axis.handlers.soap.MustUnderstandChecker.invoke(MustUnderstandChecker.java:62)
at org.apache.axis.client.AxisClient.invoke(AxisClient.java:206)
at org.apache.axis.client.Call.invokeEngine(Call.java:2784)
at org.apache.axis.client.Call.invoke(Call.java:2767)
at org.apache.axis.client.Call.invoke(Call.java:2443)
at org.apache.axis.client.Call.invoke(Call.java:2366)
at org.apache.axis.client.Call.invoke(Call.java:1812)
at org.alfresco.webservice.repository.RepositoryServiceSoapBindingStub.get(RepositoryServiceSoapBindingStub.java:1078)
at hr.mtbs.alfresco.webservices.AlfrescoSoapWebApiInterfaceImpl.getNodes(AlfrescoSoapWebApiInterfaceImpl.java:716)
at hr.mtbs.alfresco.webservices.AlfrescoSoapWebApiInterfaceImpl.getNodes(AlfrescoSoapWebApiInterfaceImpl.java:711)
at hr.chus.test.SearchDeals.main(SearchDeals.java:33)
{http://xml.apache.org/axis/}hostname:chus-desktop
org.xml.sax.SAXParseException: XML document structures must start and end within the same entity.
at org.apache.axis.AxisFault.makeFault(AxisFault.java:101)
at org.apache.axis.SOAPPart.getAsSOAPEnvelope(SOAPPart.java:701)
at org.apache.axis.Message.getSOAPEnvelope(Message.java:435)
at org.apache.axis.handlers.soap.MustUnderstandChecker.invoke(MustUnderstandChecker.java:62)
at org.apache.axis.client.AxisClient.invoke(AxisClient.java:206)
at org.apache.axis.client.Call.invokeEngine(Call.java:2784)
at org.apache.axis.client.Call.invoke(Call.java:2767)
at org.apache.axis.client.Call.invoke(Call.java:2443)
at org.apache.axis.client.Call.invoke(Call.java:2366)
at org.apache.axis.client.Call.invoke(Call.java:1812)
at org.alfresco.webservice.repository.RepositoryServiceSoapBindingStub.get(RepositoryServiceSoapBindingStub.java:1078)
at hr.mtbs.alfresco.webservices.AlfrescoSoapWebApiInterfaceImpl.getNodes(AlfrescoSoapWebApiInterfaceImpl.java:716)
at hr.mtbs.alfresco.webservices.AlfrescoSoapWebApiInterfaceImpl.getNodes(AlfrescoSoapWebApiInterfaceImpl.java:711)
at hr.chus.test.SearchDeals.main(SearchDeals.java:33)
Caused by: org.xml.sax.SAXParseException: XML document structures must start and end within the same entity.
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:195)
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:174)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:388)
at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1414)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.endEntity(XMLDocumentFragmentScannerImpl.java:905)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.endEntity(XMLDocumentScannerImpl.java:605)
at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.endEntity(XMLEntityManager.java:1393)
at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1763)
at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.scanLiteral(XMLEntityScanner.java:1064)
at com.sun.org.apache.xerces.internal.impl.XMLScanner.scanAttributeValue(XMLScanner.java:974)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanAttribute(XMLNSDocumentScannerImpl.java:460)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(XMLNSDocumentScannerImpl.java:277)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2755)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:140)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:511)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1205)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522)
at javax.xml.parsers.SAXParser.parse(SAXParser.java:395)
at org.apache.axis.encoding.DeserializationContext.parse(DeserializationContext.java:227)
at org.apache.axis.SOAPPart.getAsSOAPEnvelope(SOAPPart.java:696)
… 12 more
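The client-side trace above fails in scanAttributeValue when the parser reaches the end of its input, which is the kind of error a parser raises when a response stops partway through an attribute. Below is a minimal sketch that reproduces that symptom with the JDK's built-in SAX parser. The class name TruncatedXmlDemo, the method parseError and the sample XML strings are made up for illustration; none of them come from Alfresco or Axis.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of Alfresco or Axis.
final class TruncatedXmlDemo {

    /** Parses the text as XML; returns the parser's error message, or null if it is well-formed. */
    static String parseError(String xml) {
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                    new DefaultHandler());
            return null;
        } catch (SAXParseException e) {
            // Raised for any well-formedness error, including input that ends too early.
            return e.getMessage();
        } catch (Exception e) {
            // Parser configuration or I/O failure; not expected for in-memory input.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Complete document: no error is reported.
        System.out.println(parseError("<result><node text=\"complete\"/></result>"));
        // Same document cut off inside an attribute value: the parser reports an error.
        System.out.println(parseError("<result><node text=\"Company building"));
    }
}
```

If the second call reports the same message as the AxisFault, that suggests the server stopped writing the SOAP response partway through, so the server log is worth checking.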
3 REPLIES
02-03-2011 10:49 AM
"Most interesting thing is that we have several alfresco systems up and only one of them is producing this kind of exception for the same search."
It could be a problem with the Lucene indexes. Have you tried regenerating them?

02-04-2011 03:36 AM
I was planning to do this, but since it is a production environment I will ask our sysadmins to rebuild the indexes this weekend, and I'll report back on Monday.

02-04-2011 06:18 AM
We replicated the system and I did some research. I found out that one node was causing the problem, and that it had nothing to do with rebuilding the indexes. The error in catalina.out:
java.io.IOException: java.io.IOException: java.lang.IllegalArgumentException: The char '0xb' after 'Company building the ultimate place for music in the Czech Republic & Slovakia and ' is not a valid XML character.
at org.apache.axis.AxisFault.makeFault(AxisFault.java:101)
at org.apache.axis.SOAPPart.writeTo(SOAPPart.java:317)
at org.apache.axis.SOAPPart.writeTo(SOAPPart.java:269)
at org.apache.axis.SOAPPart.saveChanges(SOAPPart.java:530)
at org.apache.axis.attachments.AttachmentsImpl.getAttachmentCount(AttachmentsImpl.java:554)
at org.apache.axis.Message.writeTo(Message.java:535)
at org.apache.axis.transport.http.AxisServlet.sendResponse(AxisServlet.java:902)
at org.apache.axis.transport.http.AxisServlet.doPost(AxisServlet.java:777)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:637)
at org.apache.axis.transport.http.AxisServletBase.service(AxisServletBase.java:327)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:849)
at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:454)
at java.lang.Thread.run(Thread.java:619)
And the original text that caused the problem is:
Company building the ultimate place for music in the Czech Republic & Slovakia and beyond by offering high quality music streaming service and related offerings. Content is delivered in cooperation with major music labels. The model based on successful examples (e.g. Spotify).
If you copy and paste this text into Notepad or another text editor, you will notice a strange character in front of "beyond" and "(e.g. Spotify)".
This is what causes the problem: the node containing this text couldn't be accessed via the web service (not by getNode, Lucene search, or any other way) because the server couldn't produce valid XML. After manually removing those characters from the text, everything worked fine.
I'm not sure whether this is an Alfresco problem. Can it be improved?
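The character named in the server error, 0xb, is a vertical tab, which the XML 1.0 specification does not allow in a document. One possible workaround is to strip such characters from text before it is stored as a property value. Below is a minimal sketch of that idea, assuming the client controls the text it writes. The class XmlCharFilter and its methods are hypothetical names for illustration, not an Alfresco or Axis API.

```java
// Hypothetical helper, not part of Alfresco or Axis.
final class XmlCharFilter {

    /** True if the code point matches the XML 1.0 "Char" production. */
    static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns the text with every code point that XML 1.0 forbids removed. */
    static String stripInvalidXmlChars(String text) {
        if (text == null) {
            return null;
        }
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (isValidXmlChar(cp)) {
                out.appendCodePoint(cp);
            }
            // Advance by one or two chars depending on the code point.
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // A vertical tab (0x0B) sits in front of "beyond", as in the failing node.
        String raw = "Czech Republic & Slovakia and " + (char) 0x0B + "beyond";
        System.out.println(stripInvalidXmlChars(raw));
        // prints "Czech Republic & Slovakia and beyond"
    }
}
```

Removing the character silently is only one option. Replacing it with a space, or rejecting the input with a clear error, may be a better fit depending on where the text comes from.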
