
Lucene indexer. Need help

volodymyr
Champ in-the-making

Hi. I have a problem with search using Lucene. I have already searched the forum and Google, but almost everything I found leads to either reindexing the whole repository or updating the documents that were not indexed.

My case:

I use Alfresco Community v3.4.0 (d 3370) schema 4 113

I create 1500 nodes of a custom model type using NodeService. They all have the same parent, and I can find them through the same NodeService by the parent nodeRef. I can also find each node individually by its nodeRef. But even a few days after creating these nodes, I can't find some of them using Lucene. The search query matches on a custom field such as an id, which is no more than 20 characters long. Example: +TYPE:"omo:office_document" AND +(@omo\:document_number:"UUID123")

So I suspect that for some reason they were not indexed and never will be. The index recovery mode is set to AUTO. Even after an application restart the nodes were not indexed.

My questions:

It would be nice to know why those nodes were not indexed. How can I find the nodes that were not indexed? Can I see which nodes are waiting to be indexed, and whether their number is decreasing?

Any suggestions...

Thank you for your help.

2 REPLIES

openpj
Elite Collaborator

When you load a lot of content into Alfresco with the Lucene subsystem, Alfresco probably tries to index each item in-transaction. This behavior fragments the index heavily and can sometimes cause problems during the indexing process.

Have you tried running a full reindex?

That way Alfresco rebuilds the index in batch mode, without fragmentation. This is a typical procedure for keeping your index healthy.

Try setting the following property to reindex all your content from scratch:

index.recovery.mode=FULL

Then restart Alfresco and watch alfresco.log to follow the progress of the reindexing process.

Once Alfresco has started correctly, if you don't want a full reindex to run the next time you restart Alfresco, set the property back to its previous value:

index.recovery.mode=AUTO

Hope this helps.

volodymyr
Champ in-the-making

Thank you for the suggestion, but a full reindex is not a solution for us, if only because we have a very big database and a reindex would take about 3 days.

Also, I want to guarantee that a node can be found no later than 24 hours after its creation.

For now I have found a workaround; maybe someone will find it useful. I wrote a scheduled job that looks up the most recent transactions and checks whether they were indexed; if one was not, the job reindexes it. I hope it will work.

I use

nodeDAO.getTxnsByCommitTimeDescending(fromTimeInclusive, toTimeExclusive, count, null, false);

to find the transactions, and

abstractReindexComponent.isTxnPresentInIndex(Transaction txn);

to check whether each one was indexed.
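In case the shape of the job is easier to follow as code, here is a minimal standalone Java sketch of one pass of the check. It is only an illustration under assumptions, not the actual Alfresco code: TxnSource, IndexChecker and Reindexer are hypothetical interfaces standing in for nodeDAO.getTxnsByCommitTimeDescending(...), isTxnPresentInIndex(...) and whatever call performs the reindex, and transactions are reduced to plain ids.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of one pass of the "check recent transactions" job. */
final class UnindexedTxnSweep {

    /** Hypothetical stand-in for nodeDAO.getTxnsByCommitTimeDescending(...). */
    interface TxnSource {
        List<Long> recentTxnIds(long fromTimeInclusive, long toTimeExclusive, int count);
    }

    /** Hypothetical stand-in for isTxnPresentInIndex(...). */
    interface IndexChecker {
        boolean isPresentInIndex(long txnId);
    }

    /** Hypothetical stand-in for the call that reindexes one transaction. */
    interface Reindexer {
        void reindex(long txnId);
    }

    /**
     * Fetches up to 'count' transactions committed in the given time window,
     * reindexes those missing from the index, and returns their ids.
     */
    static List<Long> sweep(TxnSource source, IndexChecker checker, Reindexer reindexer,
                            long fromTimeInclusive, long toTimeExclusive, int count) {
        List<Long> reindexed = new ArrayList<Long>();
        for (long txnId : source.recentTxnIds(fromTimeInclusive, toTimeExclusive, count)) {
            if (!checker.isPresentInIndex(txnId)) {
                reindexer.reindex(txnId);   // only touch transactions the index is missing
                reindexed.add(txnId);
            }
        }
        return reindexed;
    }
}
```

The returned list is there so the job can log how many transactions it had to repair on each run, which also shows whether the backlog is shrinking.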

I'm always open to a better solution.