Solr index failing (corrupt data?)
Options
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
06-26-2012 08:04 AM
Solr Indexing on my live environment has stopped working for a couple of weeks now. Any content I have I added since the 8th of June is not tracked by Solr anymore and I cannot figure out why.
This is an exception that repeats itself in the log every few minutes:
Furthermore this is the output of the solr summary from solr admin panel :
I've tried to replicate the problem in my staging environment by copying accross content and database, then rebuilding the solr index from scratch and see how far it would get, but I could as for some reason indexing get stuck on an earlier date (a date at which huge imports of data occured)
and the following exception starts to get thrown as soon as indexing reaches this point:
Would really love some points if anyone has encoutnered any similar problem.
Thanks
This is an exception that repeats itself in the log every few minutes:
]] DB row: NodeEntity[ ID=30163, version=5, store=workspace://SpacesStore, uuid=c541fe9f-de0b-45e8-944d-0fdfe3e5b0fd, typeQNameId=138, localeId=2, aclId=null, deleted=true, transaction=TransactionEntity[ ID=16850, server=null, changeTxnId=00c02d54-8cfe-4669-ab77-4eef6bd6aa20, commitTimeMs=null], auditProps=AuditablePropertiesEntity[ auditCreator=mats.siffels, auditCreated=2012-06-08T16:19:04.958+01:00, auditModifier=mats.siffels, auditModified=2012-06-08T16:28:58.274+01:00]] org.springframework.extensions.webscripts.WebScriptException: 0526549084 Wrapped Exception (with status template): No live node exists: ID: 30163 Cache row: NodeEntity[ ID=30163, version=5, store=workspace://SpacesStore, uuid=c541fe9f-de0b-45e8-944d-0fdfe3e5b0fd, typeQNameId=138, localeId=2, aclId=null, deleted=true, transaction=TransactionEntity[ ID=16850, server=null, changeTxnId=00c02d54-8cfe-4669-ab77-4eef6bd6aa20, commitTimeMs=null], auditProps=AuditablePropertiesEntity[ auditCreator=mats.siffels, auditCreated=2012-06-08T16:19:04.958+01:00, auditModifier=mats.siffels, auditModified=2012-06-08T16:28:58.274+01:00]] DB row: NodeEntity[ ID=30163, version=5, store=workspace://SpacesStore, uuid=c541fe9f-de0b-45e8-944d-0fdfe3e5b0fd, typeQNameId=138, localeId=2, aclId=null, deleted=true, transaction=TransactionEntity[ ID=16850, server=null, changeTxnId=00c02d54-8cfe-4669-ab77-4eef6bd6aa20, commitTimeMs=null], auditProps=AuditablePropertiesEntity[ auditCreator=mats.siffels, auditCreated=2012-06-08T16:19:04.958+01:00, auditModifier=mats.siffels, auditModified=2012-06-08T16:28:58.274+01:00]] at org.springframework.extensions.webscripts.AbstractWebScript.createStatusException(AbstractWebScript.java:970) at org.springframework.extensions.webscripts.DeclarativeWebScript.execute(DeclarativeWebScript.java:171) at org.alfresco.repo.web.scripts.RepositoryContainer$2.execute(RepositoryContainer.java:393) at org.alfresco.repo.transaction.RetryingTransactionHelper.doInTransaction(RetryingTransactionHelper.java:388) at org.alfresco.repo.web.scripts.RepositoryContainer.transactionedExecute(RepositoryContainer.java:462) at org.alfresco.repo.web.scripts.RepositoryContainer.transactionedExecuteAs(RepositoryContainer.java:500) at org.alfresco.repo.web.scripts.RepositoryContainer.executeScript(RepositoryContainer.java:275) at org.springframework.extensions.webscripts.AbstractRuntime.executeScript(AbstractRuntime.java:372) at org.springframework.extensions.webscripts.AbstractRuntime.executeScript(AbstractRuntime.java:209) at org.springframework.extensions.webscripts.servlet.WebScriptServlet.service(WebScriptServlet.java:118) at javax.servlet.http.HttpServlet.service(HttpServlet.java:717) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.alfresco.web.app.servlet.GlobalLocalizationFilter.doFilter(GlobalLocalizationFilter.java:58) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191) at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:563) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:857) at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588) at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489) at java.lang.Thread.run(Thread.java:662)Caused by: org.springframework.dao.ConcurrencyFailureException: No live node exists: ID: 30163 Cache row: NodeEntity[ ID=30163, version=5, store=workspace://SpacesStore, uuid=c541fe9f-de0b-45e8-944d-0fdfe3e5b0fd, typeQNameId=138, localeId=2, aclId=null, deleted=true, transaction=TransactionEntity[ ID=16850, server=null, changeTxnId=00c02d54-8cfe-4669-ab77-4eef6bd6aa20, commitTimeMs=null], auditProps=AuditablePropertiesEntity[ auditCreator=mats.siffels, auditCreated=2012-06-08T16:19:04.958+01:00, auditModifier=mats.siffels, auditModified=2012-06-08T16:28:58.274+01:00]] DB row: NodeEntity[ ID=30163, version=5, store=workspace://SpacesStore, uuid=c541fe9f-de0b-45e8-944d-0fdfe3e5b0fd, typeQNameId=138, localeId=2, aclId=null, deleted=true, transaction=TransactionEntity[ ID=16850, server=null, changeTxnId=00c02d54-8cfe-4669-ab77-4eef6bd6aa20, commitTimeMs=null], auditProps=AuditablePropertiesEntity[ auditCreator=mats.siffels, auditCreated=2012-06-08T16:19:04.958+01:00, auditModifier=mats.siffels, auditModified=2012-06-08T16:28:58.274+01:00]] at org.alfresco.repo.domain.node.AbstractNodeDAOImpl.getNodeNotNull(AbstractNodeDAOImpl.java:981) at org.alfresco.repo.domain.node.AbstractNodeDAOImpl.getParentAssocsCached(AbstractNodeDAOImpl.java:3586) at org.alfresco.repo.domain.node.AbstractNodeDAOImpl.prependPaths(AbstractNodeDAOImpl.java:3459) at org.alfresco.repo.domain.node.AbstractNodeDAOImpl.prependPaths(AbstractNodeDAOImpl.java:3575) at org.alfresco.repo.domain.node.AbstractNodeDAOImpl.getPaths(AbstractNodeDAOImpl.java:3389) at org.alfresco.repo.solr.SOLRTrackingComponentImpl.getNodesMetadata(SOLRTrackingComponentImpl.java:585) at sun.reflect.GeneratedMethodAccessor520.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.alfresco.repo.management.subsystems.SubsystemProxyFactory$1.invoke(SubsystemProxyFactory.java:65) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172) at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:202) at $Proxy142.getNodesMetadata(Unknown Source) at org.alfresco.repo.web.scripts.solr.NodesMetaDataGet.executeImpl(NodesMetaDataGet.java:181) at org.springframework.extensions.webscripts.DeclarativeWebScript.executeImpl(DeclarativeWebScript.java:235) at org.springframework.extensions.webscripts.DeclarativeWebScript.execute(DeclarativeWebScript.java:64) … 25 moreJun 26, 2012 12:40:08 PM org.alfresco.solr.tracker.CoreTracker trackRepositorySEVERE: Tracking failedorg.alfresco.error.AlfrescoRuntimeException: 052613727 GetNodeMetaData return status is 500 at org.alfresco.solr.client.SOLRAPIClient.getNodesMetaData(SOLRAPIClient.java:721) at org.alfresco.solr.tracker.CoreTracker.indexNode(CoreTracker.java:1879) at org.alfresco.solr.tracker.CoreTracker.trackRepository(CoreTracker.java:1410) at org.alfresco.solr.tracker.CoreTracker.updateIndex(CoreTracker.java:491) at org.alfresco.solr.tracker.CoreTrackerJob.execute(CoreTrackerJob.java:45) at org.quartz.core.JobRunShell.run(JobRunShell.java:216) at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:563)
Furthermore this is the output of the solr summary from solr admin panel :
<response> <lst name="responseHeader"> <int name="status">0</int> <int name="QTime">0</int> </lst> <lst name="Summary"> <lst name="alfresco"> <long name="Last Index Commit Time">1339169184488</long> <date name="Last Index Commit Date">2012-06-08T15:26:24.488Z</date> <str name="Lag">1540249 s</str> <str name="Duration">P0Y17DT19H50M49.248S</str> <bool name="Active">true</bool> <long name="Timestamp for last TX on server">1340709433736</long> <date name="Date for last TX on server">2012-06-26T11:17:13.736Z</date> </lst> <lst name="archive"> <long name="Last Index Commit Time">1340709433736</long> <date name="Last Index Commit Date">2012-06-26T11:17:13.736Z</date> <str name="Lag">0 s</str><str name="Duration">-P0Y</str> <bool name="Active">false</bool> <long name="Timestamp for last TX on server">1340709433736</long> <date name="Date for last TX on server">2012-06-26T11:17:13.736Z</date> </lst> </lst></response>
I've tried to replicate the problem in my staging environment by copying accross content and database, then rebuilding the solr index from scratch and see how far it would get, but I could as for some reason indexing get stuck on an earlier date (a date at which huge imports of data occured)
and the following exception starts to get thrown as soon as indexing reaches this point:
2012-06-26 13:02:32,441 ERROR [extensions.webscripts.AbstractRuntime] [http-8443-3] Exception from executeScript - redirecting to status template error: 05262105 Wrapped Exception (with status template): null org.springframework.extensions.webscripts.WebScriptException: 05262105 Wrapped Exception (with status template): null at org.springframework.extensions.webscripts.AbstractWebScript.createStatusException(AbstractWebScript.java:970) at org.springframework.extensions.webscripts.DeclarativeWebScript.execute(DeclarativeWebScript.java:171) at org.alfresco.repo.web.scripts.RepositoryContainer$2.execute(RepositoryContainer.java:393) at org.alfresco.repo.transaction.RetryingTransactionHelper.doInTransaction(RetryingTransactionHelper.java:388) at org.alfresco.repo.web.scripts.RepositoryContainer.transactionedExecute(RepositoryContainer.java:462) at org.alfresco.repo.web.scripts.RepositoryContainer.transactionedExecuteAs(RepositoryContainer.java:500) at org.alfresco.repo.web.scripts.RepositoryContainer.executeScript(RepositoryContainer.java:275) at org.springframework.extensions.webscripts.AbstractRuntime.executeScript(AbstractRuntime.java:372) at org.springframework.extensions.webscripts.AbstractRuntime.executeScript(AbstractRuntime.java:209) at org.springframework.extensions.webscripts.servlet.WebScriptServlet.service(WebScriptServlet.java:118) at javax.servlet.http.HttpServlet.service(HttpServlet.java:717) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.alfresco.web.app.servlet.GlobalLocalizationFilter.doFilter(GlobalLocalizationFilter.java:5😎 at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191) at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:563) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:857) at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:58😎 at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489) at java.lang.Thread.run(Thread.java:662)Caused by: java.lang.NullPointerExceptionJun 26, 2012 1:02:32 PM org.alfresco.solr.tracker.CoreTracker trackRepositorySEVERE: Tracking failedorg.alfresco.error.AlfrescoRuntimeException: 05262098 GetNodeMetaData return status is 500 at org.alfresco.solr.client.SOLRAPIClient.getNodesMetaData(SOLRAPIClient.java:721) at org.alfresco.solr.tracker.CoreTracker.indexNode(CoreTracker.java:1879) at org.alfresco.solr.tracker.CoreTracker.trackRepository(CoreTracker.java:1410) at org.alfresco.solr.tracker.CoreTracker.updateIndex(CoreTracker.java:491) at org.alfresco.solr.tracker.CoreTrackerJob.execute(CoreTrackerJob.java:45) at org.quartz.core.JobRunShell.run(JobRunShell.java:216) at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:563)
Would really love some points if anyone has encoutnered any similar problem.
Thanks

Labels:
- Labels:
-
Archive
7 REPLIES 7

Options
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
06-26-2012 12:45 PM
Hi
SOLR can skip this - the normal tracking is unforgiving
See the FIX action on the wiki page - it will inedx up to date(ish) and skip any problem transactions.
You can retry by re-runnning FIX.
Is the cache issue resolved by restarting alfresco?
Are you running in a cluster? Have you got the cache configured correctly?
It may be worth trying this first.
Andy
SOLR can skip this - the normal tracking is unforgiving
See the FIX action on the wiki page - it will inedx up to date(ish) and skip any problem transactions.
You can retry by re-runnning FIX.
Is the cache issue resolved by restarting alfresco?
Are you running in a cluster? Have you got the cache configured correctly?
It may be worth trying this first.
Andy
Options
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
06-28-2012 08:45 AM
Hi Andy, thanks for the reply.
The FIX action I have tried and does absolutely nothing. I have also tried the PURGE actions and they dont seem to work either….
I am not running in a cluster and restarttng alfresco doesnt change anything.
It seems that solr tracker has fallen into some sort of a loop. It tries to read a node and retrieve information about it. It fails to do that, then after a few seconds it retries for the same node, etc.
I have identified a call made from SOLR to Alfresco to retrieve metadata information. I simulated this call for all nodes in the database and I identified that 4 nodes fail and return 500 status code. The first of them is the one that causes SOLR index to loop.
The webscript that is called from within SOLR is /api/solr/metadata
I am trying to identify what these nodes may have that is faulty and fix it. Any ideas ?
thanks again
The FIX action I have tried and does absolutely nothing. I have also tried the PURGE actions and they dont seem to work either….
I am not running in a cluster and restarttng alfresco doesnt change anything.
It seems that solr tracker has fallen into some sort of a loop. It tries to read a node and retrieve information about it. It fails to do that, then after a few seconds it retries for the same node, etc.
I have identified a call made from SOLR to Alfresco to retrieve metadata information. I simulated this call for all nodes in the database and I identified that 4 nodes fail and return 500 status code. The first of them is the one that causes SOLR index to loop.
The webscript that is called from within SOLR is /api/solr/metadata
I am trying to identify what these nodes may have that is faulty and fix it. Any ideas ?
thanks again


Options
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
06-28-2012 10:06 AM
Hi
Your error suggests that a deleted node is ending up in the Node cache and then it is not live.
Have you upgraded? What version of Alfresco did you start with?
There is a job to remove deleted nodes.
Set
index.tracking.minRecordPurgeAgeDays=0
to remove all deleted nodes
I have not checked how often this is scheduled to run.
(The default is to keep deleted nodes for 30 days to support index tracking)
Andy
Your error suggests that a deleted node is ending up in the Node cache and then it is not live.
Have you upgraded? What version of Alfresco did you start with?
There is a job to remove deleted nodes.
Set
index.tracking.minRecordPurgeAgeDays=0
to remove all deleted nodes
I have not checked how often this is scheduled to run.
(The default is to keep deleted nodes for 30 days to support index tracking)
Andy
Options
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
06-28-2012 11:01 AM
I started with 4.0.b and a month ago upgraded to 4.0.d
3 of those nodes were thumbnails that were refering to node 30163 as their parent. I marked them as deleted in the database and problem gone.
My questions:
- Any idea how this could have happaned in the first place ?
- The job that you told me about, is this supposed to do the same thing that I manally did in the database ? And how can I actually set it to run on demand ?
Thanks again
3 of those nodes were thumbnails that were refering to node 30163 as their parent. I marked them as deleted in the database and problem gone.
My questions:
- Any idea how this could have happaned in the first place ?
- The job that you told me about, is this supposed to do the same thing that I manally did in the database ? And how can I actually set it to run on demand ?
Thanks again


Options
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
06-29-2012 02:44 PM
Hi
The job removes deleted nodes.
The issue you had was live thumbnails pointing to a deleted doc and the tracker failing to get its path as a result.
at org.alfresco.repo.domain.node.AbstractNodeDAOImpl.getParentAssocsCached(AbstractNodeDAOImpl.java:3586)
This is a known (and fixed) issue.
The casue is fixed.
SOLR will also skip any issue like this and log an exception in the index.
You will need an Enterprise build for the latest fixes, pick it up when they are merged into the nightly build - along with lots of other stuff,
or wait for the next official community release.
Andy
The job removes deleted nodes.
The issue you had was live thumbnails pointing to a deleted doc and the tracker failing to get its path as a result.
at org.alfresco.repo.domain.node.AbstractNodeDAOImpl.getParentAssocsCached(AbstractNodeDAOImpl.java:3586)
This is a known (and fixed) issue.
The casue is fixed.
SOLR will also skip any issue like this and log an exception in the index.
You will need an Enterprise build for the latest fixes, pick it up when they are merged into the nightly build - along with lots of other stuff,
or wait for the next official community release.
Andy
Options
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
07-03-2012 05:24 AM
thanks a lot for your replies Andy

Options
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
12-17-2014 03:37 AM
Hi,
I was trying to re-index my alfresco solr, but the index doesn't build up and everytime it shows only one file inside the index folder "lucene-e79c4944314431e4e034db76c030ec86-lucene-2vfwnv-test.lock"
Please help!
its very urgent!
I was trying to re-index my alfresco solr, but the index doesn't build up and everytime it shows only one file inside the index folder "lucene-e79c4944314431e4e034db76c030ec86-lucene-2vfwnv-test.lock"
Please help!
its very urgent!
