Hyland Connect

aweber1nj · ‎07-17-2012

This is related to my ongoing trials with storing a counter and retrieving it reliably in a behaviour class…

It would appear that the attributeService is attempting to cache values it retrieves, but doing so "incorrectly"? Note the following log snippet (with threads identified, all starting with "http-8080") :

17 Jul 2012 11:21:13,483 DEBUG http-8080-7 [copyEFRAttrs] Entering getDocSequence for: 56401908-e10e-460d-ac04-588eb9fc9b69
17 Jul 2012 11:21:13,498 INFO  http-8080-23 [copyEFRAttrs] substObjName: ENTERING…
17 Jul 2012 11:21:13,498 DEBUG http-8080-23 [copyEFRAttrs] Entering getDocSequence for: 56401908-e10e-460d-ac04-588eb9fc9b69
17 Jul 2012 11:21:13,498 DEBUG http-8080-5 [copyEFRAttrs] Saved attribute successfully for parent 56401908-e10e-460d-ac04-588eb9fc9b69 = 10
17 Jul 2012 11:21:13,498 DEBUG http-8080-5 [copyEFRAttrs] Leaving getDocSequence
17 Jul 2012 11:21:13,498 DEBUG http-8080-5 [copyEFRAttrs] Sequence set to: 10
17 Jul 2012 11:21:13,498 DEBUG http-8080-7 [copyEFRAttrs] Got value from attribute: 9

If you look at the code in my previous thread (about behaviour threading…will try to link it) it will help explain the logged statements.
Note, specifically:

I don't know what else to do. It appears that attributeService is returning stale values, and since that is an Alfresco service, I don't have much control over how it operates.

Is there anything I can do to force attributeService to re-fetch from the DB or double-check for stale values?

Thanks again,
AJ

afaust · ‎07-17-2012

Hello,

you have several things to consider in this case: database transaction isolation, cache transaction isolation and locking. If you have several threads running concurrently that read and update the same attribute, you are bound to run into consistency problems as changes / transactions are isolated from each other until they are completed. This is the same for content nodes - if you perform the same kind of interaction on nodes concurrently, you may also end up with values that appear to be inconsistent.

If you need to implement an attribute and a counter to generate a globally unique value for each invocation, you need to take care of concurrent executions yourself. Alfresco could do it for you, but any approach the devs would have to use would impact the overall system in ALL use cases, while the number of relevant use cases is really quite low (I have only dealt with this once in 3 years of fulltime Alfresco project participation).
You need to make sure that each time you access and increment the relevant attribute, you do so in a transaction that has access to the most up-to-date data available, and that only one such transaction runs at any given point in time. This can be done via a combination of nested transactions and usage of the JobLockService to ensure only one process executes in that nested transaction. It is one of the more complex customizations, especially if you want to avoid gaps (e.g. in case of transaction rollbacks).

I seem to remember reading a blog post by a German colleague a while ago, but I am not able to find it now to give you another pointer…

Regards
Axel

mrogers · ‎07-18-2012

No - you just need to fail and retry properly.

afaust · ‎07-18-2012

Hmm, maybe this changed since the last time I implemented something like this, but then (3.2 / 3.4) there was no way to detect stale data and AFAIK no fool-proof optimistic locking by Alfresco - you had no basis for fail & retry and Alfresco wouldn't fail & retry for you. The only option I found to work (reliably and efficiently in a highly concurrent scenario) used nested transactions & locking.
I'll have to revisit this one of these days on 4.0… I would imagine that the cache handling changes I've seen in NodeDAO also apply to the attributes and would help simplify matters in the direction of mrogers' post.

aweber1nj · ‎07-18-2012

I agree with Axel as he points out that when a getAttribute() call returns a value, there's no real way to verify that it is the correct value – nor should this be the onus of the developer. If the method is there, and documented to return an attribute's value (or null if the attribute does not exist), why would the developer have to write a bunch of checks (whether they be within transactions or not) to double-check that Alfresco is returning what it promised to return? If the call is always going back to the DB, then it is almost-certainly always to be correct. If Alfresco is implementing a cache in front of the DB and this is causing stale data to be returned, then this is not a developer-problem, I'm wondering if it's a bug.

I purposely started a separate thread, because although the code in my forum posts/questions is the same, I'm trying to illustrate two different problems. The fact that the same integer was being returned twice in a row, to two different threads immediately caused my particular use-case to throw an error related to the issue – because I used the stale number to rename a node and this collided with the previous thread's rename of another node, causing a duplicate name exception. I don't see how wrapping a retry around a getAttribute/setAttribute pair of statements would help any.

As for transactions, I understand Axel's workaround he has apparently performed in the past, but this too seems unnecessary. Maybe clarification is needed for the following statement on the Java_Foundation_API wiki page:

By default, each invocation of a Service method is wrapped in its own transaction.

With the accompanying example, I took this to mean that a "Service method" was a call such as attributeService.setAttribute(). If that is so, then the attribute should be committed immediately, and any future call to .getAttribute() for that particular attribute should return the updated value, regardless of thread or client.

afaust · ‎07-18-2012

Hello,

If the call is always going back to the DB, then it is almost-certainly always to be correct.

This is not correct. Going to the DB does not guarantee you the up-to-date value. This is where transaction isolation of the database comes into play, e.g. the READ COMMITTED isolation level will only show you what has already been committed. If you have two active transactions at the same time, they will both read the same value. If they change the value, then it is a matter of lock ordering which transaction writes first. In the absence of an application-side verification (e.g. via a "version" column), the second write overrides the changes of the first.
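To make the lost update concrete, here is a minimal sketch in plain Java, with no Alfresco or database involved. All names (`VersionedCounter`, `blindWrite`, `writeIfVersion`, `LostUpdateDemo`) are hypothetical, and the `version` field only stands in for the "version" column mentioned above. The two "transactions" are interleaved by hand so the outcome is deterministic.

```java
// Hypothetical in-memory stand-in for one committed row with a "version" column.
class VersionedCounter {
    private int value;
    private int version;

    VersionedCounter(int initial) { this.value = initial; }

    // Returns {value, version} as seen at read time.
    synchronized int[] read() { return new int[] { value, version }; }

    // Write without verification: the last writer wins.
    synchronized void blindWrite(int newValue) {
        value = newValue;
        version++;
    }

    // Write only if nobody has written since we read; false means "stale, re-read and retry".
    synchronized boolean writeIfVersion(int newValue, int expectedVersion) {
        if (version != expectedVersion) return false;
        value = newValue;
        version++;
        return true;
    }

    // Read, increment and write, retrying until the version check passes.
    int incrementWithRetry() {
        while (true) {
            int[] snapshot = read();
            int next = snapshot[0] + 1;
            if (writeIfVersion(next, snapshot[1])) return next;
        }
    }
}

class LostUpdateDemo {
    // Both transactions read 9 before either writes: one increment is lost.
    static int blindResult() {
        VersionedCounter c = new VersionedCounter(9);
        int[] a = c.read();
        int[] b = c.read();
        c.blindWrite(a[0] + 1);
        c.blindWrite(b[0] + 1); // overwrites the first write with the same value
        return c.read()[0];
    }

    // Same interleaving, but the second writer is rejected and retries on fresh data.
    static int checkedResult() {
        VersionedCounter c = new VersionedCounter(9);
        int[] a = c.read();
        int[] b = c.read();
        c.writeIfVersion(a[0] + 1, a[1]);
        if (!c.writeIfVersion(b[0] + 1, b[1])) {
            c.incrementWithRetry();
        }
        return c.read()[0];
    }

    public static void main(String[] args) {
        System.out.println("blind writes:  " + blindResult());
        System.out.println("version check: " + checkedResult());
    }
}
```

With blind writes the counter ends at 10 although two increments ran; with the version check the second writer notices the stale read and the counter ends at 11. Whether Alfresco offers such a check for attributes is exactly the open question in this thread, so treat this only as an illustration of the principle.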

If Alfresco is implementing a cache in front of the DB and this is causing stale data to be returned, then this is not a developer-problem, I'm wondering if it's a bug.

This is only partially correct. If Alfresco returns a value from cache although a different value has already been committed to the database, this would be a bug. But for concurrent executions you have to consider uncommitted transactions. It would be a violation of transaction isolation to return any updated data from in-progress transactions, since any changes are still subject to rollback if the transaction fails.

By default, each invocation of a Service method is wrapped in its own transaction.

This phrase may be misleading. Each invocation of a Foundation Service operation from code that is ITSELF NOT wrapped in a transaction will cause the called method to be executed in its own transaction. If the code calling a Foundation Service operation is already wrapped in a transaction, that transaction is reused. Anything else would introduce major problems, again mostly relating to transaction isolation. Changes of any operation should only be committed (and visible) when the entire operation succeeds, or not at all. If each invocation were wrapped in its own transaction, every minor method call would immediately update the database although there is still a chance a subsequent operation fails and all changes would have to be rolled back.
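The "join if present, otherwise start your own" rule can be sketched in a few lines of plain Java. This is not Alfresco's implementation; `TxInterceptorDemo` and `invoke` are hypothetical names, and the "transaction" is just a per-thread flag plus a log of what would happen.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for the component wrapped around a service call.
class TxInterceptorDemo {
    // Marks whether the current thread is already inside a transaction.
    private static final ThreadLocal<Boolean> ACTIVE = ThreadLocal.withInitial(() -> false);
    static final List<String> LOG = new ArrayList<>();

    // Join the caller's transaction if there is one; otherwise begin and finish our own.
    static <T> T invoke(String name, Supplier<T> work) {
        if (ACTIVE.get()) {
            LOG.add(name + ": joined existing transaction");
            return work.get(); // no commit here, the outer caller decides
        }
        ACTIVE.set(true);
        LOG.add(name + ": began new transaction");
        try {
            T result = work.get();
            LOG.add(name + ": committed");
            return result;
        } catch (RuntimeException e) {
            LOG.add(name + ": rolled back");
            throw e;
        } finally {
            ACTIVE.set(false);
        }
    }

    static List<String> run() {
        LOG.clear();
        // Called from code without a transaction: the call gets its own.
        invoke("setAttribute (standalone)", () -> 10);
        // Called from inside an outer operation: both inner calls share its transaction.
        invoke("action", () -> {
            invoke("getAttribute", () -> 9);
            return invoke("setAttribute", () -> 10);
        });
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

In the second case nothing is committed until "action" itself finishes, which is why other threads keep seeing the old value in the meantime.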

So, if you have a long running action - which is already executed in a transaction by the ActionService - and update an attribute, that attribute change will only be committed if the entire action succeeds (depending on the length, this could mean a delay from a few milliseconds to minutes). As long as it is not committed, all parallel threads / actions / clients still see the old value, which is absolutely valid and correct.

The more I know of your use case the more it sounds exactly (to the detail) like the one I was faced with at the time of my solution / workaround.

Regards
Axel

aweber1nj · ‎07-18-2012

OK, I am very familiar with DB ACID concepts…that part I have been doing a long time (I am relatively new to Alfresco).

My wording should have been such that committed, DB values would always be consistent – due precisely to ACID requirements.

Since I am not starting any explicit transactions in my behaviour class, I took that quote from the wiki "at face value" – and I am not sure how a Foundation Service could do anything else, as it doesn't really know the state of the caller. So since I am not starting and wrapping my method(s) in an explicit UserTransaction (or using the RetryingTransactionHelper), I have to assume that as soon as my setAttribute returns, that is the "correct value" and should be returned by any future getAttribute, wherever and whenever thereafter.

I have the block that does the get/update (set) synchronized (with a Lock), so I know that other threads are not able to call getAttribute until the setAttribute has completed.

I'm not totally surprised you have run into a similar problem in the past, and truly appreciate your feedback.

afaust · ‎07-18-2012

Hello,

the Foundation Service itself does not handle the transaction aspect - this is done by a component that transparently sits between the caller and the service. That component has access to a transactional context, can check the current transaction state (of the caller) and react appropriately.

Behaviours, as one of the lowest-level code blocks there are in Alfresco, share the transactional scope of the Foundation Service that invokes them, and the same goes for any code they call themselves. If that weren't the case, you would not be able to rely on the atomicity of a Foundation Service call and its (implied) changes.

Unfortunately, Java-based locks (synchronized / java.util.concurrent) do not help in this matter as they are unaware of any transactional scope and only help in ordering low-level Java code execution.
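A small simulation may help to show why the lock alone is not enough. It is plain Java with hypothetical names (`Store`, `Tx`, `nextSequence`); the store only exposes committed values to other transactions, and the two transactions are interleaved by hand rather than with real threads so the result is deterministic. The starting value 9 mirrors the log at the top of the thread.

```java
import java.util.concurrent.locks.ReentrantLock;

class LockVsCommitDemo {
    // Hypothetical store: other transactions only ever see the committed value.
    static class Store { int committed = 9; }

    // Hypothetical transaction: buffers its write until commit().
    static class Tx {
        private final Store store;
        private Integer pending;
        Tx(Store store) { this.store = store; }
        int read() { return pending != null ? pending : store.committed; }
        void write(int v) { pending = v; }
        void commit() { if (pending != null) store.committed = pending; }
    }

    static final ReentrantLock LOCK = new ReentrantLock();

    // Read-increment-write guarded by a Java lock, inside an already running transaction.
    static int nextSequence(Tx tx) {
        LOCK.lock();
        try {
            int next = tx.read() + 1;
            tx.write(next);
            return next;
        } finally {
            LOCK.unlock(); // released here, but tx has not committed yet
        }
    }

    // A takes a number, then B takes a number, and only afterwards both commit.
    static int[] lockReleasedBeforeCommit() {
        Store store = new Store();
        Tx a = new Tx(store), b = new Tx(store);
        int seqA = nextSequence(a);
        int seqB = nextSequence(b); // A's write is still uncommitted, so B reads 9 again
        a.commit();
        b.commit();
        return new int[] { seqA, seqB };
    }

    // Same calls, but each transaction commits before the next one starts reading.
    static int[] commitBeforeNextRead() {
        Store store = new Store();
        Tx a = new Tx(store);
        int seqA = nextSequence(a);
        a.commit();
        Tx b = new Tx(store);
        int seqB = nextSequence(b);
        b.commit();
        return new int[] { seqA, seqB };
    }

    public static void main(String[] args) {
        int[] dup = lockReleasedBeforeCommit();
        int[] ok = commitBeforeNextRead();
        System.out.println("lock released before commit: " + dup[0] + ", " + dup[1]);
        System.out.println("commit before next read:     " + ok[0] + ", " + ok[1]);
    }
}
```

In the first interleaving both callers get 10 even though the lock was never held by two callers at once; in the second they get 10 and 11. The lock only helps if it is held until the change has actually been committed.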

Regards
Axel

aweber1nj · ‎07-18-2012

That is excellent info I was unaware of. Thank you.

(I understand that concurrency and transactions are two different things. But that's good to note for anyone who might follow the thread some day.)

I know I tried to get an explicit UserTransaction (code snippet on that same wiki page) and tested, and it didn't help. But it could've been that I was still testing the Aspect idea and not the attribute.

If I inserted an explicit transaction in a small subset of my behaviour, how would the remainder of the code (before and after the explicit transaction) still be conducted??? Would Alfresco have something of a "parent transaction" around the whole behaviour invocation? That's interesting…

afaust · ‎07-18-2012

A nested / inserted transaction would have no impact on the remainder of your code. BUT, you can pass data from a nested transaction and use that later on in the remainder of the behaviour. That is why the retrying transaction callback requires you to specify a type via Generics for the result value.

In simplified code, this would look something like this:


// your custom behaviour code

// nested new write transaction (readOnly = false, requiresNew = true)
String uuid = transactionService.getRetryingTransactionHelper().doInTransaction(new RetryingTransactionCallback<String>() {
   public String execute() throws Throwable {
      // do something, e.g. get/set attribute
      String value = attributeService.get….;
      return value;
   }
}, false, true);

// continue your custom code, using uuid as an up-to-date value

Bear in mind that using the nested transaction violates the atomicity of the overall operation. Based on the use case, this is a conscious choice in this instance. In case the surrounding transaction is rolled back, you are left with holes / voids if you are generating uuids in a sequence - the nested transaction is not rolled back and other transactions may have already updated other data based on the intermediate value.
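The gap effect is easy to reproduce in a toy model. The sketch below is plain Java with hypothetical names (`SequenceGapDemo`, `nextCommitted`); the static counter stands in for the committed attribute value, and a boolean per document says whether the surrounding transaction succeeds.

```java
import java.util.ArrayList;
import java.util.List;

class SequenceGapDemo {
    private static int committedCounter = 9; // stands in for the committed attribute value

    // Hypothetical "nested transaction": the increment is committed immediately.
    static int nextCommitted() { return ++committedCounter; }

    // Returns the sequence numbers that end up on documents after all outer transactions finish.
    static List<Integer> run(boolean[] outerSucceeds) {
        committedCounter = 9;
        List<Integer> used = new ArrayList<>();
        for (boolean ok : outerSucceeds) {
            int seq = nextCommitted(); // committed regardless of what follows
            if (ok) {
                used.add(seq);         // outer transaction commits and keeps the number
            }
            // otherwise the outer transaction rolls back, but the counter stays incremented
        }
        return used;
    }

    public static void main(String[] args) {
        System.out.println(run(new boolean[] { true, false, true }));
    }
}
```

With the second outer transaction failing, the numbers in use are 10 and 12; 11 was handed out and is gone for good.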

This is about half of my previous solution / workaround. As stated before, I'd have to check the changes in 4.0 to see whether things can be handled differently now, as indicated by mrogers' objection.

Regards
Axel

attributeService caching incorrectly?