cancel
Showing results for 
Search instead for 
Did you mean: 

Solr index failing (corrupt data?)

kriton
Champ in-the-making
Champ in-the-making
Solr Indexing on my live environment has stopped working for a couple of weeks now. Any content I have I added since the 8th of June is not tracked by Solr anymore and I cannot figure out why.

This is an exception that repeats itself in the log every few minutes:

]]
   DB row:    NodeEntity[ ID=30163, version=5, store=workspace://SpacesStore, uuid=c541fe9f-de0b-45e8-944d-0fdfe3e5b0fd,
typeQNameId=138, localeId=2, aclId=null, deleted=true, transaction=TransactionEntity[ ID=16850, server=null, changeTxnI
d=00c02d54-8cfe-4669-ab77-4eef6bd6aa20, commitTimeMs=null], auditProps=AuditablePropertiesEntity[ auditCreator=mats.siff
els, auditCreated=2012-06-08T16:19:04.958+01:00, auditModifier=mats.siffels, auditModified=2012-06-08T16:28:58.274+01:00
]]
org.springframework.extensions.webscripts.WebScriptException: 0526549084 Wrapped Exception (with status template): No l
ive node exists:
   ID:        30163
   Cache row: NodeEntity[ ID=30163, version=5, store=workspace://SpacesStore, uuid=c541fe9f-de0b-45e8-944d-0fdfe3e5b0fd,
typeQNameId=138, localeId=2, aclId=null, deleted=true, transaction=TransactionEntity[ ID=16850, server=null, changeTxnI
d=00c02d54-8cfe-4669-ab77-4eef6bd6aa20, commitTimeMs=null], auditProps=AuditablePropertiesEntity[ auditCreator=mats.siff
els, auditCreated=2012-06-08T16:19:04.958+01:00, auditModifier=mats.siffels, auditModified=2012-06-08T16:28:58.274+01:00
]]
   DB row:    NodeEntity[ ID=30163, version=5, store=workspace://SpacesStore, uuid=c541fe9f-de0b-45e8-944d-0fdfe3e5b0fd,
typeQNameId=138, localeId=2, aclId=null, deleted=true, transaction=TransactionEntity[ ID=16850, server=null, changeTxnI
d=00c02d54-8cfe-4669-ab77-4eef6bd6aa20, commitTimeMs=null], auditProps=AuditablePropertiesEntity[ auditCreator=mats.siff
els, auditCreated=2012-06-08T16:19:04.958+01:00, auditModifier=mats.siffels, auditModified=2012-06-08T16:28:58.274+01:00
]]
        at org.springframework.extensions.webscripts.AbstractWebScript.createStatusException(AbstractWebScript.java:970)
        at org.springframework.extensions.webscripts.DeclarativeWebScript.execute(DeclarativeWebScript.java:171)
        at org.alfresco.repo.web.scripts.RepositoryContainer$2.execute(RepositoryContainer.java:393)
        at org.alfresco.repo.transaction.RetryingTransactionHelper.doInTransaction(RetryingTransactionHelper.java:388)
        at org.alfresco.repo.web.scripts.RepositoryContainer.transactionedExecute(RepositoryContainer.java:462)
        at org.alfresco.repo.web.scripts.RepositoryContainer.transactionedExecuteAs(RepositoryContainer.java:500)
        at org.alfresco.repo.web.scripts.RepositoryContainer.executeScript(RepositoryContainer.java:275)
        at org.springframework.extensions.webscripts.AbstractRuntime.executeScript(AbstractRuntime.java:372)
        at org.springframework.extensions.webscripts.AbstractRuntime.executeScript(AbstractRuntime.java:209)
        at org.springframework.extensions.webscripts.servlet.WebScriptServlet.service(WebScriptServlet.java:118)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.alfresco.web.app.servlet.GlobalLocalizationFilter.doFilter(GlobalLocalizationFilter.java:58)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:563)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:857)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
        at java.lang.Thread.run(Thread.java:662)
Caused by: org.springframework.dao.ConcurrencyFailureException: No live node exists:
   ID:        30163
   Cache row: NodeEntity[ ID=30163, version=5, store=workspace://SpacesStore, uuid=c541fe9f-de0b-45e8-944d-0fdfe3e5b0fd,
typeQNameId=138, localeId=2, aclId=null, deleted=true, transaction=TransactionEntity[ ID=16850, server=null, changeTxnI
d=00c02d54-8cfe-4669-ab77-4eef6bd6aa20, commitTimeMs=null], auditProps=AuditablePropertiesEntity[ auditCreator=mats.siff
els, auditCreated=2012-06-08T16:19:04.958+01:00, auditModifier=mats.siffels, auditModified=2012-06-08T16:28:58.274+01:00
]]
   DB row:    NodeEntity[ ID=30163, version=5, store=workspace://SpacesStore, uuid=c541fe9f-de0b-45e8-944d-0fdfe3e5b0fd,
typeQNameId=138, localeId=2, aclId=null, deleted=true, transaction=TransactionEntity[ ID=16850, server=null, changeTxnI
d=00c02d54-8cfe-4669-ab77-4eef6bd6aa20, commitTimeMs=null], auditProps=AuditablePropertiesEntity[ auditCreator=mats.siff
els, auditCreated=2012-06-08T16:19:04.958+01:00, auditModifier=mats.siffels, auditModified=2012-06-08T16:28:58.274+01:00
]]
        at org.alfresco.repo.domain.node.AbstractNodeDAOImpl.getNodeNotNull(AbstractNodeDAOImpl.java:981)
        at org.alfresco.repo.domain.node.AbstractNodeDAOImpl.getParentAssocsCached(AbstractNodeDAOImpl.java:3586)
        at org.alfresco.repo.domain.node.AbstractNodeDAOImpl.prependPaths(AbstractNodeDAOImpl.java:3459)
        at org.alfresco.repo.domain.node.AbstractNodeDAOImpl.prependPaths(AbstractNodeDAOImpl.java:3575)
        at org.alfresco.repo.domain.node.AbstractNodeDAOImpl.getPaths(AbstractNodeDAOImpl.java:3389)
        at org.alfresco.repo.solr.SOLRTrackingComponentImpl.getNodesMetadata(SOLRTrackingComponentImpl.java:585)
        at sun.reflect.GeneratedMethodAccessor520.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.alfresco.repo.management.subsystems.SubsystemProxyFactory$1.invoke(SubsystemProxyFactory.java:65)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:202)
        at $Proxy142.getNodesMetadata(Unknown Source)
        at org.alfresco.repo.web.scripts.solr.NodesMetaDataGet.executeImpl(NodesMetaDataGet.java:181)
        at org.springframework.extensions.webscripts.DeclarativeWebScript.executeImpl(DeclarativeWebScript.java:235)
        at org.springframework.extensions.webscripts.DeclarativeWebScript.execute(DeclarativeWebScript.java:64)
        … 25 more
Jun 26, 2012 12:40:08 PM org.alfresco.solr.tracker.CoreTracker trackRepository
SEVERE: Tracking failed
org.alfresco.error.AlfrescoRuntimeException: 052613727 GetNodeMetaData return status is 500
        at org.alfresco.solr.client.SOLRAPIClient.getNodesMetaData(SOLRAPIClient.java:721)
        at org.alfresco.solr.tracker.CoreTracker.indexNode(CoreTracker.java:1879)
        at org.alfresco.solr.tracker.CoreTracker.trackRepository(CoreTracker.java:1410)
        at org.alfresco.solr.tracker.CoreTracker.updateIndex(CoreTracker.java:491)
        at org.alfresco.solr.tracker.CoreTrackerJob.execute(CoreTrackerJob.java:45)
        at org.quartz.core.JobRunShell.run(JobRunShell.java:216)
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:563)


Furthermore this is the output of the solr summary from solr admin panel :



<response>
   <lst name="responseHeader">
      <int name="status">0</int>
      <int name="QTime">0</int>
   </lst>
   <lst name="Summary">
      <lst name="alfresco">
         <long name="Last Index Commit Time">1339169184488</long>
         <date name="Last Index Commit Date">2012-06-08T15:26:24.488Z</date>
         <str name="Lag">1540249 s</str>
         <str name="Duration">P0Y17DT19H50M49.248S</str>
         <bool name="Active">true</bool>
         <long name="Timestamp for last TX on server">1340709433736</long>
         <date name="Date for last TX on server">2012-06-26T11:17:13.736Z</date>
      </lst>
      <lst name="archive">
         <long name="Last Index Commit Time">1340709433736</long>
         <date name="Last Index Commit Date">2012-06-26T11:17:13.736Z</date>
         <str name="Lag">0 s</str><str name="Duration">-P0Y</str>
         <bool name="Active">false</bool>
         <long name="Timestamp for last TX on server">1340709433736</long>
         <date name="Date for last TX on server">2012-06-26T11:17:13.736Z</date>
      </lst>
   </lst>
</response>


I've tried to replicate the problem in my staging environment by copying accross content and database, then rebuilding the solr index from scratch and see how far it would get, but I could as for some reason indexing get stuck on an earlier date (a date at which huge imports of data occured)

and the following exception starts to get thrown as soon as indexing reaches this point:


2012-06-26 13:02:32,441  ERROR [extensions.webscripts.AbstractRuntime] [http-8443-3] Exception from execu
teScript - redirecting to status template error: 05262105 Wrapped Exception (with status template): null
org.springframework.extensions.webscripts.WebScriptException: 05262105 Wrapped Exception (with status te
mplate): null
        at org.springframework.extensions.webscripts.AbstractWebScript.createStatusException(AbstractWebS
cript.java:970)
        at org.springframework.extensions.webscripts.DeclarativeWebScript.execute(DeclarativeWebScript.ja
va:171)
        at org.alfresco.repo.web.scripts.RepositoryContainer$2.execute(RepositoryContainer.java:393)
        at org.alfresco.repo.transaction.RetryingTransactionHelper.doInTransaction(RetryingTransactionHel
per.java:388)
        at org.alfresco.repo.web.scripts.RepositoryContainer.transactionedExecute(RepositoryContainer.jav
a:462)
        at org.alfresco.repo.web.scripts.RepositoryContainer.transactionedExecuteAs(RepositoryContainer.j
ava:500)
        at org.alfresco.repo.web.scripts.RepositoryContainer.executeScript(RepositoryContainer.java:275)
        at org.springframework.extensions.webscripts.AbstractRuntime.executeScript(AbstractRuntime.java:3
72)
        at org.springframework.extensions.webscripts.AbstractRuntime.executeScript(AbstractRuntime.java:2
09)
        at org.springframework.extensions.webscripts.servlet.WebScriptServlet.service(WebScriptServlet.ja
va:118)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:2
90)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.alfresco.web.app.servlet.GlobalLocalizationFilter.doFilter(GlobalLocalizationFilter.java:5
😎
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:2
35)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:563)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:857)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:58
😎
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.NullPointerException
Jun 26, 2012 1:02:32 PM org.alfresco.solr.tracker.CoreTracker trackRepository
SEVERE: Tracking failed
org.alfresco.error.AlfrescoRuntimeException: 05262098 GetNodeMetaData return status is 500
        at org.alfresco.solr.client.SOLRAPIClient.getNodesMetaData(SOLRAPIClient.java:721)
        at org.alfresco.solr.tracker.CoreTracker.indexNode(CoreTracker.java:1879)
        at org.alfresco.solr.tracker.CoreTracker.trackRepository(CoreTracker.java:1410)
        at org.alfresco.solr.tracker.CoreTracker.updateIndex(CoreTracker.java:491)
        at org.alfresco.solr.tracker.CoreTrackerJob.execute(CoreTrackerJob.java:45)
        at org.quartz.core.JobRunShell.run(JobRunShell.java:216)
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:563)


Would really love some points if anyone has encoutnered any similar problem.

Thanks Smiley Happy
7 REPLIES 7

andy
Champ on-the-rise
Champ on-the-rise
Hi

SOLR can skip this - the normal tracking is unforgiving

See the FIX action on the wiki page - it will inedx up to date(ish) and skip any problem transactions.
You can retry by re-runnning FIX.

Is the cache issue resolved by restarting alfresco?
Are you running in a cluster? Have you got the cache configured correctly?
It may be worth trying this first.


Andy

kriton
Champ in-the-making
Champ in-the-making
Hi Andy, thanks for the reply.

The FIX action I have tried and does absolutely nothing. I have also tried the PURGE actions and they dont seem to work either….

I am not running in a cluster and restarttng alfresco doesnt change anything.

It seems that solr tracker has fallen into some sort of a loop. It tries to read a node and retrieve information about it. It fails to do that, then after a few seconds it retries for the same node, etc.

I have identified a call made from SOLR to Alfresco to retrieve metadata information. I simulated this call for all nodes in the database and I identified that 4 nodes fail and return 500 status code. The first of them is the one that causes SOLR index to loop.

The webscript that is called from within SOLR is /api/solr/metadata

I am trying to identify what these nodes may have that is faulty and fix it. Any ideas ?

thanks again Smiley Happy

andy
Champ on-the-rise
Champ on-the-rise
Hi

Your error suggests that a deleted node is ending up in the Node cache and then it is not live.
Have you upgraded? What version of Alfresco did you start with?


There  is a job to remove deleted nodes.

Set
index.tracking.minRecordPurgeAgeDays=0
to remove all deleted nodes

I have not checked how often this is scheduled to run.

(The default is to keep deleted nodes for 30 days to support index tracking)

Andy

kriton
Champ in-the-making
Champ in-the-making
I started with 4.0.b and a month ago upgraded to 4.0.d

3 of those nodes were thumbnails that were refering to node 30163 as their parent. I marked them as deleted in the database and problem gone.

My questions:
- Any idea how this could have happaned in the first place ?

- The job that you told me about, is this supposed to do the same thing that I manally did in the database ? And how can I actually set it to run on demand ?

Thanks again Smiley Happy

andy
Champ on-the-rise
Champ on-the-rise
Hi

The job removes deleted nodes.

The issue you had was live thumbnails pointing to a deleted doc and the tracker failing to get its path as a result.

    at org.alfresco.repo.domain.node.AbstractNodeDAOImpl.getParentAssocsCached(AbstractNodeDAOImpl.java:3586)
This is a known (and fixed) issue.

The casue is fixed.
SOLR will also skip any issue like this and log an exception in the index.

You will need an Enterprise build for the latest fixes, pick it up when they are merged into the nightly build - along with lots of other stuff,
or wait for the next official community release.

Andy

kriton
Champ in-the-making
Champ in-the-making
thanks a lot for your replies Andy Smiley Happy

bhramari
Champ in-the-making
Champ in-the-making
Hi,

I was trying to re-index my alfresco solr, but the index doesn't build up and everytime it shows only one file inside the index folder "lucene-e79c4944314431e4e034db76c030ec86-lucene-2vfwnv-test.lock"

Please help!
its very urgent!