
Indexing with SolR takes a very long time

mlagneaux
Champ on-the-rise
Hello,

Following Alfresco's answer in JIRA ticket https://issues.alfresco.com/jira/browse/ALF-15546, I installed SolR. For the moment, Alfresco is still using Lucene while SolR indexes my store in the background (the store is about 600 GB).

I've enabled SolR logging and get messages like these:
17-Nov-2011 11:22:45 org.alfresco.solr.tracker.CoreTracker trackRepository
INFO: …. from Transaction [id = 374, commitTimeMs = 1321528572425, updates = 1, deletes = 0]
17-Nov-2011 11:22:45 org.alfresco.solr.tracker.CoreTracker trackRepository
INFO: …. to Transaction [id = 374, commitTimeMs = 1321528572425, updates = 1, deletes = 0]
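
To relate a commitTimeMs value to wall-clock time, it can be converted to a date. This is a minimal sketch that assumes commitTimeMs is a count of milliseconds since the Unix epoch (the name and the log timestamps suggest this, but it is an assumption); the helper name `commit_time_to_utc` is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def commit_time_to_utc(commit_time_ms: int) -> datetime:
    """Convert a commitTimeMs value (assumed: ms since the Unix epoch) to a UTC datetime."""
    # Integer millisecond arithmetic avoids float rounding on the fractional part.
    return EPOCH + timedelta(milliseconds=commit_time_ms)

# Value taken from the log lines above.
ts = commit_time_to_utc(1321528572425)
print(ts.isoformat())
```

The result is in UTC, so it may be offset from the local times printed at the start of each log line.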

Indexing started Thursday night and it is still running.

To monitor indexing progress, I first got the total number of records in the alf_transaction table: 1,351,872.
Then I used the "commitTimeMs" values provided in the logs.
I took the commitTimeMs of a transaction processed on Friday night and counted how many transactions in alf_transaction have a lower timestamp: 621,952 transactions.
I did the same with a commitTimeMs from a log entry generated today: 627,189 transactions.

Based on these numbers, only 5,237 transactions were indexed between Friday night and today, so my conclusion is that indexing is really slow.
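
The arithmetic behind this estimate can be sketched as follows. The three transaction counts are the ones quoted above; the 60-hour gap between the two samples is only an assumed placeholder, since the exact sampling times are not given, and the function name `indexing_progress` is hypothetical.

```python
from datetime import timedelta

def indexing_progress(total_txns, indexed_earlier, indexed_later, elapsed):
    """Estimate progress from two samples of 'transactions already indexed'.

    Returns (fraction_done, txns_per_hour, estimated_time_remaining).
    """
    fraction_done = indexed_later / total_txns
    hours = elapsed.total_seconds() / 3600
    rate = (indexed_later - indexed_earlier) / hours
    remaining = total_txns - indexed_later
    # No meaningful estimate if nothing was indexed between the samples.
    eta = timedelta(hours=remaining / rate) if rate > 0 else None
    return fraction_done, rate, eta

# Counts from the post; elapsed time between samples is an ASSUMPTION.
fraction, rate, eta = indexing_progress(
    total_txns=1_351_872,
    indexed_earlier=621_952,
    indexed_later=627_189,
    elapsed=timedelta(hours=60),
)
print(f"{fraction:.1%} of transactions indexed")
print(f"{rate:.0f} transactions per hour")
print(f"roughly {eta.days} days remaining at this rate")
```

This treats every transaction as the same amount of work, which is probably not true (a transaction can touch one node or many), so the figure is only a rough indicator.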

Is my way of evaluating progress correct? If not, how should I evaluate indexing progress?

Also, has anyone seen this problem before?
Could restarting Tomcat fix it?

In my log, I also have a lot of messages like these:
 October 4, 2012 3:10:29 p.m. org.alfresco.solr.tracker.CoreTracker indexNode
INFO: .. updating
October 4, 2012 3:10:30 p.m. org.alfresco.solr.tracker.CoreTracker UpdateIndex
INFO: … update is already running for alfresco
October 4, 2012 3:10:30 p.m. org.alfresco.solr.tracker.CoreTracker UpdateIndex
INFO: … update for archive is already running
October 4, 2012 3:10:30 p.m. org.alfresco.solr.tracker.CoreTracker indexNode
INFO: .. updating

What do they mean? Could they explain my problem?

Thank you in advance for your help. Do not hesitate to ask if you need more information.
1 REPLY

andy
Champ on-the-rise
Hi

Where is the load on your system?
What DB are you using?
Can you post or link to a larger chunk of the log?

Most of the indexing time usually goes into content transformation, so I would look there first. If that is not the bottleneck, then your case is unusual.
Have you enabled multi-threaded SOLR tracking so that content can be transformed in parallel? (It should be enabled by default.)

You do not need to worry about those messages. They come from the job that schedules indexing when it is not already running. If indexing is already running, that is fine: it will carry on until it is done.

Andy