Hyland Connect

josh_barrett · ‎01-08-2018

Could this be SOLR trying to index image files?

I have an Alfresco 5.1.3 environment with SOLR4 as the index server. I am seeing a lot of these errors in the logs:

org.alfresco.error.AlfrescoRuntimeException: 00080207 GetTextContentResponse return status is 500
at org.alfresco.solr.client.SOLRAPIClient.getTextContent(SOLRAPIClient.java:1101)
at org.alfresco.solr.SolrInformationServer.addContentPropertyToDocUsingAlfrescoRepository(SolrInformationServer.java:2712)
at org.alfresco.solr.SolrInformationServer.addContentToDoc(SolrInformationServer.java:2699)
at org.alfresco.solr.SolrInformationServer.updateContentToIndexAndCache(SolrInformationServer.java:2633)
at org.alfresco.solr.tracker.ContentTracker$ContentIndexWorkerRunnable.doWork(ContentTracker.java:140)
at org.alfresco.solr.tracker.AbstractWorkerRunnable.run(AbstractWorkerRunnable.java:47)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

I'm not sure this is actually an issue, but I want to make sure we don't have a problem.

andy1 · ‎01-11-2018

Hi

A 500 error is an issue. Something has gone wrong on the alfresco side handling the request to get the text content.

I suspect that the error you are seeing is not related to indexing, unless it is the script that is being indexed. The repository-side stack trace should have solr in there somewhere...

It could be a node that was created and had its metadata indexed in SOLR, and was then deleted before the content was indexed. SOLR can try to fetch content from nodes that have just gone. They will get tidied up when the delete is tracked.

The error you are seeing suggests you may have another issue.

If content fails to index, it gets recorded in the index as an error. SOLR does not retry indexing the text, as most content failures are down to transformation problems and are not generally recoverable.

See Reindex documents by query | Alfresco Documentation

Andy
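For illustration, here is a minimal sketch of how the admin URLs for retrying failed content or reindexing by query might be built. The base URL, the `solr_admin_url` helper and the example query are hypothetical, and the `RETRY` and `REINDEX` action names are my assumption of what the Solr 4 admin handler accepts; check them against the documentation page linked above before using them.

```python
from urllib.parse import urlencode


def solr_admin_url(base, action, **params):
    """Build an Alfresco Solr admin URL (hypothetical helper, not an Alfresco API).

    `base` is the Solr web app root, e.g. http://localhost:8080/solr4 (assumed).
    `action` and `params` are passed through as query-string parameters.
    """
    query = {"action": action, **params}
    return f"{base.rstrip('/')}/admin/cores?{urlencode(query)}"


# Assumed action: ask Solr to retry documents it recorded as failed.
retry_url = solr_admin_url("http://localhost:8080/solr4", "RETRY")

# Assumed action: reindex the documents matched by a query (example query only).
reindex_url = solr_admin_url(
    "http://localhost:8080/solr4", "REINDEX", query='cm:name:"report.pdf"'
)
```

The sketch only builds the URLs; opening them in a browser or with a command-line HTTP client is left to you, since the authentication and SSL setup differ between installations.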


afaust · ‎01-09-2018

This is not normal and indicates that specific documents could not be full text indexed. You should check the Repository logs for any conversion errors from any mimetype to text/plain.

josh_barrett · ‎01-09-2018

I am getting this on one environment only. I see messages like the following in the Alfresco log that Solr is tracking against.

2018-01-09 10:01:21,177 ERROR [extensions.webscripts.AbstractRuntime] [http-apr-8080-exec-539] Exception from executeScript: store://2017/7/1/2/43/608741d6-74a7-4783-96ab-aa554710010a.bin no longer exists
org.springframework.dao.ConcurrencyFailureException: store://2017/7/1/2/43/608741d6-74a7-4783-96ab-aa554710010a.bin no longer exists

cat /opt/tomcat/logs/catalina.out| grep '.bin no longer exists' | wc -l
800

Not sure if 800 is a magic number but after that last log entry on Alfresco, Solr seems to have stopped indexing.

/solr4/#/alfresco shows Last Modified:about an hour ago

This is a test environment. Is there a way I can safely remove those entries from the DB where the content no longer exists?

afaust · ‎01-09-2018

If you can identify which nodes reference this piece of content (and similar ones that may be failing), and those nodes are not of any special sort (regular cm:content, not part of the Data Dictionary or surf-config folder structures), then yes, it should be safe to remove these nodes from the system. I would advise going through the Script API or UI instead of deleting them manually from the DB.
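As a first step towards identifying the affected content, a small script along these lines could pull the distinct content URLs out of the log. This is a sketch that assumes the log lines look like the two quoted earlier in the thread; the function name and the regular expression are my own. In that excerpt both the ERROR line and the exception line contain ".bin no longer exists", so a plain line count from grep may be higher than the number of distinct content URLs.

```python
import re
from collections import Counter

# Matches content URLs in messages like the ones quoted in this thread:
#   "... store://2017/7/1/2/43/<uuid>.bin no longer exists"
CONTENT_URL = re.compile(r"(store://\S+\.bin) no longer exists")


def missing_content_urls(lines):
    """Return a Counter mapping each missing content URL to its number of log lines."""
    counts = Counter()
    for line in lines:
        match = CONTENT_URL.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Example usage (path taken from the command above; adjust as needed):
#
#   with open("/opt/tomcat/logs/catalina.out", errors="replace") as log:
#       for url, count in missing_content_urls(log).most_common():
#           print(count, url)
```

The resulting list of distinct URLs is what you would then map back to nodes before removing anything through the Script API or UI.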

andy1 · ‎01-11-2018

Hi