Solr - Missing transactions indexing slow

loftux
Star Contributor
I'm seeing this in the report at https://servername:8443/solr/admin/cores?action=REPORT&wt=xml:

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">454</int>
</lst>
<lst name="report">
<lst name="alfresco">
<long name="DB transaction count">3840</long>
<long name="DB acl transaction count">56</long>
<long name="Count of duplicated transactions in the index">0</long>
<long name="Count of duplicated acl transactions in the index">0</long>
<long name="Count of transactions in the index but not the DB">0</long>
<long name="Count of acl transactions in the index but not the DB">0</long>
<long name="Count of missing transactions from the Index">2472</long>
<long name="First transaction missing from the Index">15688</long>
<long name="Count of missing acl transactions from the Index">0</long>
<long name="Index transaction count">1368</long>
<long name="Index acl transaction count">56</long>
<long name="Index unique transaction count">1368</long>
<long name="Index unique acl transaction count">56</long>
<long name="Index leaf count">2031</long>
<long name="Count of duplicate leaves in the index">0</long>
<long name="Last index commit time">1318320543887</long>
<str name="Last Index commit date">2011-10-11T10:09:03</str>
<long name="Last TX id before holes">15687</long>
</lst>
<lst name="archive">
<long name="DB transaction count">3840</long>
<long name="DB acl transaction count">56</long>
<long name="Count of duplicated transactions in the index">0</long>
<long name="Count of duplicated acl transactions in the index">0</long>
<long name="Count of transactions in the index but not the DB">0</long>
<long name="Count of acl transactions in the index but not the DB">0</long>
<long name="Count of missing transactions from the Index">0</long>
<long name="Count of missing acl transactions from the Index">0</long>
<long name="Index transaction count">3840</long>
<long name="Index acl transaction count">56</long>
<long name="Index unique transaction count">3840</long>
<long name="Index unique acl transaction count">56</long>
<long name="Index leaf count">1766</long>
<long name="Count of duplicate leaves in the index">0</long>
<long name="Last index commit time">1321515768375</long>
<str name="Last Index commit date">2011-11-17T08:42:48</str>
<long name="Last TX id before holes">21961</long>
</lst>
</lst>
</response>
The issue here is the line <long name="Count of missing transactions from the Index">2472</long>.
The number has changed only slightly, and it has been like this for almost 24 hours.

What is going on (or actually not going on)?
I constantly see around 50% CPU usage. I turned off Solr and switched to Lucene, and it indexed everything in a very short time, but Solr is unusable.
What kind of debugging can I turn on to research this issue further?
The machine runs RHEL 6.1, with MySQL 5.1 and Tomcat 6.0.32 on the same machine.
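For anyone comparing the two cores in a report like the one above, here is a minimal Python sketch that pulls out the relevant counters per core. It assumes the response has already been saved as text; the function name `summarize_report` and the trimmed `SAMPLE_REPORT` are illustrative only, with the values copied from the report in this post.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the report above; only the fields used below are kept.
SAMPLE_REPORT = """
<response>
<lst name="report">
<lst name="alfresco">
<long name="DB transaction count">3840</long>
<long name="Count of missing transactions from the Index">2472</long>
<long name="Index transaction count">1368</long>
<long name="Last TX id before holes">15687</long>
</lst>
<lst name="archive">
<long name="DB transaction count">3840</long>
<long name="Count of missing transactions from the Index">0</long>
<long name="Index transaction count">3840</long>
<long name="Last TX id before holes">21961</long>
</lst>
</lst>
</response>
"""


def summarize_report(xml_text):
    """Return {core_name: {short_key: int}} for a few counters in the report."""
    root = ET.fromstring(xml_text)
    report = root.find("./lst[@name='report']")
    summary = {}
    for core in report.findall("lst"):
        # Map each element's name attribute to its text content.
        values = {el.get("name"): el.text for el in core}
        summary[core.get("name")] = {
            "db_tx": int(values["DB transaction count"]),
            "index_tx": int(values["Index transaction count"]),
            "missing_tx": int(values["Count of missing transactions from the Index"]),
            "last_tx_before_holes": int(values["Last TX id before holes"]),
        }
    return summary


summary = summarize_report(SAMPLE_REPORT)
# Cores that still report missing transactions.
lagging = [name for name, s in summary.items() if s["missing_tx"] > 0]

for name, s in summary.items():
    print(name, s)
print("Cores with missing transactions:", lagging)
```

With the numbers from this report, only the alfresco core is behind, and its missing count equals the DB transaction count minus the index transaction count (3840 - 1368 = 2472).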
4 REPLIES

loftux
Star Contributor
Andy just updated the wiki page http://wiki.alfresco.com/wiki/Alfresco_And_SOLR with a logging how-to.

Following that, I get
Nov 17, 2011 5:39:30 PM org.alfresco.solr.tracker.CoreTracker trackRepository
INFO: …. from Transaction [id=21961, commitTimeMs=1321515768375, updates=1, deletes=1]
Nov 17, 2011 5:39:30 PM org.alfresco.solr.tracker.CoreTracker trackRepository
INFO: …. to Transaction [id=21961, commitTimeMs=1321515768375, updates=1, deletes=1]
and
mysql> select max(id) from alf_transaction;
+---------+
| max(id) |
+---------+
|   21961 |
+---------+
According to the documentation, that means the index is up to date, since the transaction id is the latest. So why am I still seeing missing transactions from the index on the report page?
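As a rough way to line these numbers up, the Python sketch below parses one of the tracker log lines quoted above and compares the DB maximum against the "Last TX id before holes" values from the report in the first post. The helper `parse_tracker_line`, the regular expression, and the idea of treating "last TX id before holes >= max(id)" as "caught up" are my own assumptions for illustration, not something taken from the Alfresco documentation. Note that the quoted log lines do not say which core they belong to.

```python
import re
from datetime import datetime, timezone

# Matches the "Transaction [...]" part of the tracker log lines quoted above.
TX_PATTERN = re.compile(
    r"Transaction \[id=(\d+), commitTimeMs=(\d+), updates=(\d+), deletes=(\d+)\]"
)


def parse_tracker_line(line):
    """Extract the transaction fields from a tracker log line, or return None."""
    match = TX_PATTERN.search(line)
    if match is None:
        return None
    tx_id, commit_ms, updates, deletes = (int(g) for g in match.groups())
    return {
        "id": tx_id,
        # commitTimeMs is milliseconds since the Unix epoch.
        "commit_time_utc": datetime.fromtimestamp(commit_ms / 1000, tz=timezone.utc),
        "updates": updates,
        "deletes": deletes,
    }


# Values copied from this thread.
LOG_LINE = "INFO: .... to Transaction [id=21961, commitTimeMs=1321515768375, updates=1, deletes=1]"
DB_MAX_TX_ID = 21961  # select max(id) from alf_transaction
LAST_TX_BEFORE_HOLES = {"alfresco": 15687, "archive": 21961}  # from the report

tracked = parse_tracker_line(LOG_LINE)
# Assumption: a core counts as caught up only if it has no hole before the DB maximum.
caught_up = {
    core: last_tx >= DB_MAX_TX_ID for core, last_tx in LAST_TX_BEFORE_HOLES.items()
}

print("Tracked transaction:", tracked)
print("Caught up per core:", caught_up)
```

Read this way, the logged id 21961 matches the archive core, while the alfresco core still has its first hole right after 15687, which would be consistent with the report showing missing transactions there.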

loftux
Star Contributor
I found the cause of the issue by looking in tomcat/logs/catalina.out, which is where Solr logs its errors.
The error I found was "Node without parents does not have root aspect". This post helped me resolve the issue; the fix involves making changes directly in the DB. There were 112 nodes in the DB that had no root aspect, all created on the day of the upgrade. Since I didn't help out with the upgrade, I'm not sure whether they are the result of patches failing during the upgrade.

Once past that error, the initial full indexing still takes a long time with Solr compared to Lucene. I think that may be due to the problem described in https://forums.alfresco.com/en/viewtopic.php?f=14&t=41573 ("alfresco 4.0.b overuse database"), which that user filed as issue https://issues.alfresco.com/jira/browse/ALF-11546.

fstnboy
Champ on-the-rise
Hi Loftux,

Quick question. Which version of Alfresco are you using? Community or Enterprise? I'm struggling to get Solr working with Alfresco (I'm using Enterprise 4 Beta…)

Adei

loftux
Star Contributor
I'm using Community, built from HEAD r31944.