Potential tracking inefficiency by the Alfresco-Solr4

j_fintora
Champ in-the-making

Hi,

we use alfresco-solr4-5.2.e with Alfresco 5.2.e and have found a potential inefficiency.

Solr tracks changes in the associated Alfresco system by periodically requesting information about committed transactions. Tracking is triggered every 15 seconds (property alfresco.cron=0/15 * * * * ? *), and each time it requests transactions going back to the time of the last committed transaction (actually one hour before that, because alfresco.hole.retention=3600000).

A single tracking run sends many requests: one for each hour between the time of the last committed transaction and now.

We found that all hourly intervals since the last committed transaction are queried over and over again, every 15 seconds. For example, when I upload a file to Alfresco and then wait a few hours, the number of tracking requests grows by one for each hour since the upload. The same requests are then fired every 15 seconds until I upload another file.
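To make the growth concrete, here is a minimal Python sketch of the behaviour described above. It is only a model of what shows up in the access log, not Alfresco's actual tracker code: the function name and the assumption of fixed one-hour query windows are illustrative, and only the hole retention value comes from the configuration quoted in this post.

```python
HOUR_MS = 3_600_000  # assumed size of one query window (one hour), as observed in the logs


def windows_per_cycle(now_ms, last_commit_ms, hole_retention_ms=3_600_000):
    """Model of how many hourly transaction queries one tracking run sends.

    The run starts one hole-retention period before the last commit and
    walks forward in one-hour windows until it reaches "now".
    """
    start_ms = last_commit_ms - hole_retention_ms
    span_ms = now_ms - start_ms
    # Integer ceiling division; at least one query is always sent.
    return max(1, (span_ms + HOUR_MS - 1) // HOUR_MS)


if __name__ == "__main__":
    last_commit = 0
    for idle_hours in (0, 1, 6, 24):
        now = last_commit + idle_hours * HOUR_MS
        print(idle_hours, "h idle:", windows_per_cycle(now, last_commit), "requests per run")
```

In this model the number of requests per run grows linearly with the idle time, and every run repeats all earlier windows.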

So we wonder: why does it need to query the same interval more than once, even when the interval lies entirely in the past? Is it an inefficiency, or is there some reason behind it?

The problem is partially mentioned here, but there is nothing about the repeated querying of the same time interval.

Some insight would be appreciated. Thank you.

1 ACCEPTED ANSWER

afaust
Legendary Innovator

There is no state management of "when" SOLR has last queried for changes. SOLR only checks based on the last transaction it has found in the index and uses that transaction's commit time as the basis for the interval. So in those cases where nothing has been done in the system, that information is simply lacking.
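The effect of this missing state can be illustrated with a small Python simulation. It is a hypothetical sketch, not the Alfresco tracker: `checked_until_ms` is an invented "high-water mark" that a stateful tracker might keep, and the one-hour window size is an assumption taken from the behaviour described in the question. Whether such a change would preserve index consistency is a separate question that this sketch does not address.

```python
HOUR_MS = 3_600_000
CYCLE_MS = 15_000               # alfresco.cron=0/15 * * * * ? *
HOLE_RETENTION_MS = 3_600_000   # alfresco.hole.retention=3600000


def total_requests(idle_ms, remember_progress):
    """Count hourly queries sent during an idle period after a commit at t=0.

    remember_progress=False: every run starts from the last commit time
    (the stateless behaviour described above).
    remember_progress=True: every run starts from the time of the previous
    run (hypothetical), still re-checking the hole retention period.
    """
    last_commit_ms = 0
    checked_until_ms = last_commit_ms  # hypothetical state, not in Alfresco
    total = 0
    for now_ms in range(CYCLE_MS, idle_ms + 1, CYCLE_MS):
        base_ms = checked_until_ms if remember_progress else last_commit_ms
        span_ms = now_ms - (base_ms - HOLE_RETENTION_MS)
        total += (span_ms + HOUR_MS - 1) // HOUR_MS  # integer ceiling
        if remember_progress:
            checked_until_ms = now_ms
    return total


if __name__ == "__main__":
    day_ms = 24 * HOUR_MS
    print("stateless:", total_requests(day_ms, remember_progress=False))
    print("stateful: ", total_requests(day_ms, remember_progress=True))
```

In this model the stateful variant sends a constant number of queries per run, while the stateless one sends more with every idle hour.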

Is it inefficient? Yes. Has something changed or is something going to be changed? No, and it's not very likely. Alfresco has never been designed or optimised to be an idle system without any user load for long periods of time, and you would need an idle system for this to even manifest itself. On the other hand - apart from spamming the access logs - these additional requests should be negligible in effective cost to the system. The DB query simply yields no result and the request is done in a handful of milliseconds.

Feel free to file an issue in the Alfresco JIRA to log this as a bug. Any discussion here in this platform does not automatically lead to such topics being tracked as something to be fixed...

11 REPLIES

j_fintora
Champ in-the-making

Thank you for your answer.

Well, thank you for your explanation, although I still don't fully get your point (see below). So before I file a bug, I would like to ask you (or anyone else) here, in case I have overlooked something... This topic has clearly been going around for many years (since 2012 at least, see "SOLR causes high CPU usage on idle repo"), but no one is actually doing anything about it. Personally, I don't consider answers like "disable your access log" or "just upload something to your Alfresco at least once every day or two" to be real solutions.

So, my question is whether the Solr implementation can really be considered a sane one, given the following observations:

  1. If one doesn't touch Alfresco, the size of a daily access log grows by approx. 140M every day.
  2. There is an evident shortcoming seen in the querying mechanism, which causes the "over-querying", as described above. Maybe it should be reformulated like this: When Alfresco and Solr both know that the last transaction happened at 4pm yesterday, why on earth is Solr querying Alfresco for transactions also after that moment until now? What would be the benefit of such a behavior?

afaust
Legendary Innovator

"No one is actually doing anything about it" - For a long time the contribution process was so cumbersome / ineffective that only Alfresco engineers could have done anything about it, and for them it did not end up being a top priority. In most production environments this has not been a relevant issue / topic, so customers apparently did not report it often enough for it to become a priority. 140 M of highly compressible log file can be dealt with easily with logrotate. And if you really wanted to, you could separate SOLR tracking requests from the others before rolling over and compressing the logs.
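As a rough illustration of separating the tracking requests from the rest, here is a minimal Python sketch. It assumes that tracking requests can be recognised by a substring of the request path; the default marker "/api/solr/" and the sample log lines are assumptions for illustration, so check your own access log for the actual pattern.

```python
def split_tracking_lines(lines, marker="/api/solr/"):
    """Split access log lines into (tracking, other) by a path substring.

    The marker is an assumption about how tracking requests look in the
    log; adjust it to whatever your access log actually contains.
    """
    tracking, other = [], []
    for line in lines:
        (tracking if marker in line else other).append(line)
    return tracking, other


if __name__ == "__main__":
    # Hypothetical sample lines, not taken from a real log.
    sample = [
        '127.0.0.1 - - "GET /alfresco/service/api/solr/transactions?fromCommitTime=0 HTTP/1.1" 200',
        '10.0.0.5 - - "GET /share/page/user/admin/dashboard HTTP/1.1" 200',
    ]
    tracking, other = split_tracking_lines(sample)
    print(len(tracking), "tracking line(s),", len(other), "other line(s)")
```

The two lists could then be written to separate files before rotation, so that the tracking noise is compressed or discarded independently of the user traffic.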

Maybe ‌ or ‌ could comment on this (Andy also participated in that old forum thread you linked back on the old forum platform).

Hi Axel, as I wrote above, I don't consider answers like "disable your access log" [yes, I know logrotate and other possibilities; but the problem also concerns server load...] or "just upload something to your Alfresco at least once every day or two" [there is often something like a testing environment, you know...] to be real solutions. This is something I wanted to avoid discussing here, as it has already been said many times in the past.

So if anyone has anything to say about the problem itself, I would be very glad: confirming it as a real bug, for example, or even suggesting how to fix it. At first sight it seems easy to do (= just don't ignore the time of the last transaction in the algorithm). I also wonder whether someone has noticed the same problem in other systems that use Solr, such as Liferay...

afaust
Legendary Innovator

I never said they are "solutions" - just that they are why this was never a priority to complain about to Alfresco. I don't want to avoid any discussion here - after all, by tagging Harray and Andy here, I want to help you have that conversation.

andy1
Star Collaborator

Hi

Clearly we could do better in this case and it would look like a simple fix. We are, on the other hand, very careful to make sure the read only view we hold in SOLR is fully consistent with the repository. So any "simple" fix needs significant testing.

The impact of this bug (MNT-18100) has not been high enough to justify fixing it, nor has it been a significant issue in production environments or for many customers. There is also a simple workaround.

Would you complain that a system under continuous load generated the same logs? You would need a plan to deal with these logs anyway. This is the position where most people find themselves.

You can use properties to suspend indexing on the SOLR side or make it less frequent.
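To get a feeling for what a less frequent schedule would buy, here is a small Python sketch of the arithmetic. It is only a back-of-the-envelope model: the function name is illustrative, the 60 and 300 second intervals are hypothetical alternatives to the 15 seconds from alfresco.cron quoted in the question, and the figure of 25 hourly queries per run assumes a repository that has been idle for about a day.

```python
SECONDS_PER_DAY = 86_400


def requests_per_day(interval_s, queries_per_run):
    """Tracking requests per day for a given run interval (in seconds)."""
    runs_per_day = SECONDS_PER_DAY // interval_s
    return runs_per_day * queries_per_run


if __name__ == "__main__":
    # 15 s is the interval from the thread; 60 s and 300 s are hypothetical.
    for interval_s in (15, 60, 300):
        print(interval_s, "s interval:", requests_per_day(interval_s, queries_per_run=25))
```

The request volume scales inversely with the interval, at the cost of new content becoming searchable later.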

If you wish to contribute a fix that would be great.

Regards

Andy

p_bodnar
Confirmed Champ

Hi,

thanks a lot for providing a link to a related issue (MNT-18100). Would it be possible to make it publicly visible, provided it contains some useful information on this topic?

To the rest of your response I have nothing to add (I would just repeat what has already been said), except that this issue seems to be technical debt like any other: it decreases the overall quality of the software and confuses not only admins but surely also developers, maybe quite unnecessarily.

p_bodnar
Confirmed Champ

Bug status update: I did some tests with the latest Alfresco 6.0.7 GA. When running together with the latest 5.2 branch version, i.e. alfresco-solr4-5.2.g, the "crazy overquerying" still appears. But when running together with alfresco-search-services-1.1.1, i.e. Solr6, it seems to go away: Solr6 constantly makes just a few requests. So it looks like common sense won in the end :) It's a pity that the Alfresco team has not communicated any information on this topic (keeping MNT-18100 private...), or has it?

I guess there are no plans to merge the fix into Alfresco Solr4. So my solution for now is simply to start using Solr6, say goodbye to the unlucky Alfresco Solr4 bundle, and not investigate a possible backport of the fix.

Some other thoughts, anybody?