cancel
Showing results for 
Search instead for 
Did you mean: 

Keycloak and duplicate users

akring
Confirmed Champ
Confirmed Champ

Hi all,

I have a Docker setup consisting of an Alfresco Community 6.2 backend, an ADF frontend and Keycloak. Alfresco and the ADF frontend is set up to use Keycloak for authentication and everything is working fine, i.e. the following happens:

  1. The user enters the URL of the frontend application in the browser
  2. The user gets redirected to Keycloak and logs in
  3. Keycloak redirects the user back to the frontend application
  4. The user can browse the content served by the Alfresco backend

So everything works fine, but when a (Keycloak) user logs in, Alfresco creates two user nodes with the same username in Alfresco, i.e. inspection of the sys: people container in the Nodebrowser yields

cm:bruceworkspace://SpacesStore/0bc47079-93ab-4220-a186-859c503c03d9truesys:children4Slet | Gennemtving sletning | Tag ejerskab | Gendan tilladelser
cm:bruceworkspace://SpacesStore/40cfeb6c-4df8-4ef8-80e6-171b0fc40b13truesys:children5Slet | Gennemtving sletning | Tag ejerskab | Gendan tilladelser

and the Alfresco log shows this error:

alfresco_1  | 2021-04-20 12:16:16,055  INFO  [security.person.PersonServiceImpl] [http-nio-8080-exec-4] Splitting 1 duplicate person objects.
alfresco_1  | 2021-04-20 12:16:16,062  ERROR [util.transaction.TransactionSupportUtil] [http-nio-8080-exec-4] After completion (committed) exception
alfresco_1  | org.alfresco.repo.security.permissions.AccessDeniedException: 03200099 Only users with ROLE_ADMINISTRATOR are allowed to manage users.
alfresco_1  | 	at org.alfresco.repo.security.authority.AuthorityTypeBehaviour.onUpdateProperties(AuthorityTypeBehaviour.java:89)
alfresco_1  | 	at jdk.internal.reflect.GeneratedMethodAccessor637.invoke(Unknown Source)
alfresco_1  | 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
alfresco_1  | 	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
alfresco_1  | 	at org.alfresco.repo.policy.JavaBehaviour$JavaMethodInvocationHandler.invoke(JavaBehaviour.java:181)
alfresco_1  | 	at com.sun.proxy.$Proxy73.onUpdateProperties(Unknown Source)

I have tried to delete the two users (according to this thread https://hub.alfresco.com/t5/alfresco-content-services-forum/alfresco-acs-6-1-2ga-sso-authentificatio... ), but that does not solve the problem.

Any ideas why this is happening and how to fix it?

Regards,

Andreas

2 ACCEPTED ANSWERS

afaust
Legendary Innovator
Legendary Innovator

I see no substantial changes / fixes in the Repository-tier Identity Service auth subsystem for ACS 7.0. I would chalk the difference up to lucky change in timings, or maybe ADF behaviours differently for some reason.

Did you do a network trace via browser or other tools, or check the access logs to see which concurrent requests happend on the affected ACS version?

Edit: Ah, I see you already answered the second part while I was writing my reply....
Since the difference is 3 seconds (not 0.03) I definitely not suspect the last two calls to be at fault. Since the first two do not use the SSO enabled web script endpoint, that only leaves the ticket call. Can you redo your manual cURL test with that endpoint and check that it is not somehow responsible on its own?

View answer in original post

akring
Confirmed Champ
Confirmed Champ

Just to follow up on the discussion above. It turns out that your ( @afaust  ) assumption is correct, i.e. that the problem is actually due to concurrent calls from the frontend. I found out that even though the Tomcat localhost access log states that the call to the ticket endpoint and the call to the discovery endpoint are 3 seconds apart

172.21.0.7 - - [21/Apr/2021:12:56:08 +0000] "GET /alfresco/api/-default-/public/authentication/versions/1/tickets/-me- HTTP/1.0" 200 66
172.21.0.7 - - [21/Apr/2021:12:56:11 +0000] "GET /alfresco/api/discovery HTTP/1.0" 200 2212

the browser actually sends the requests amost simultaneously (confirmed by inspection the network tab in the browser inspector), i.e. 0.02 seconds apart (I double-checked - it is actually 0.02 seconds this time 🙂 ).

To verify that the concurrent calls to the two endpoints causes the issue described in the first post in this thread, I made a small Python script to simulate the browsers behavior (included below if anyone should be interested). Running this script results in the exact same problem as described above.

So one could argue that this a a core Alfresco issue, i.e. that the Alfresco backend should be able to handle concurrent requests without creating two users with the same username and without throwing errors in the log. Alternatively, one could also argue that the problem is with the ADF library, that makes the two simultaneous calls to the backend.

Thanks for your input @afaust . I will mark your first reply above as the solution.

Python script for anyone interested:

from datetime import datetime
from concurrent import futures
import json
import requests

REALM = 'alfresco'
TOKEN_URL = f'https://some.domain/auth/realms/{REALM}' \
            f'/protocol/openid-connect/token'
ALF_BASEURL = 'https://some.domain/alfresco/api'

# Get token payload = { 'client_id': 'alfresco', 'username': 'bruce', 'password': 'bruce', 'grant_type': 'password' } r = requests.post(TOKEN_URL, data=payload, verify=False) print(r.status_code, r.url) token = r.json()['access_token'] headers = { 'Authorization': f'bearer {token}' } def call_alf(url): print(datetime.now(), 'Calling ' + url) r = requests.get(url, headers=headers, verify=False) print(datetime.now(), 'Response from ' + url) return r.status_code, r.url
# Make concurrent requests
with futures.ThreadPoolExecutor(2) as executor: resp_gen = executor.map(call_alf, [ ALF_BASEURL + '/-default-/public/authentication/versions/1/tickets/-me-', ALF_BASEURL + '/discovery' ]) print(list(resp_gen))

View answer in original post

11 REPLIES 11

EddieMay
World-Class Innovator
World-Class Innovator

Hi @akring 

Sorry for my ignorance, but can you outline what you mean & the process when you say, 'but when a (Keycloak) user logs in,'?

Thanks,

Digital Community Manager, Alfresco Software.
Problem solved? Click Accept as Solution!

Hi @EddieMay 

Yes - I should have been more precise.

By a Keycloak user I mean a user created in the Alfresco realm inside Keycloak, i.e. the user does not exist in the Alfresco DB beforehand. So the steps are:

1) Make sure the user (bruce in this case) does not exits in Alfresco, e.g. by doing a

curl -u admin:admin http://localhost:8080/alfresco/s/api/people/bruce

which should return a 404 (assuming the Alfresco Docker container exposes port 8080). Assuming that some proxy Docker container is handling SSL termination (to avoid CORS issues with Keycloak it is easier to use HTTPS for everything) and reverse proxying, so that

  1. The frontend can be accessed at https://some.domain/frontend
  2. Keycloak can be accessed at https://some.domain/auth
  3. Alfresco (the repo) can be accessed at https://some.domain/alfresco

2) Create the user "bruce" in the Alfresco realm inside Keycloak

3) Open a browser and navigate to e.g. https://some.domain/frontend which will redirect to https://some.domain/auth/... where the user can log in.

4) Log in with "bruce" and the corresponding password.

5) The user is now redirected back to https://some.domain/frontend by Keycloak

6) The frontend passes the Keycloak token to the Alfresco backend, and Alfresco creates the user "bruce" two times for some reason?

7) (The user is authenticated and can access content served be the backend)

So the strange thing is that when Alfresco creates the "bruce" user from the Keycloak token, it happens twice.

I hope the above clearifies the problem.

akring
Confirmed Champ
Confirmed Champ

... and the previously posted error is shown in the log.

afaust
Legendary Innovator
Legendary Innovator

Can you perform a manual / cURL request using Keycloak OpenID token to Alfresco's ReST API without ADF. Just so that you can check / eliminate the possibility that this might be an issue due to concurrent requests being sent from ADF to the backend. The sys:children association allows duplicates to exist, and if the timing is "just right", I can see how two concurrent requests to Alfresco with the same token could theoretically create two users, after first having both checked the DB + caches for an existing user, and having found none because neither was yet done / had committed its creation.

akring
Confirmed Champ
Confirmed Champ

Yes - that works fine. Obtaining a token from Keycloak like this

$ curl -s -k -X POST -d 'grant_type=password&client_id=alfresco&username=bruce&password=bruce' https://some.domain/auth/realms/alfresco/protocol/openid-connect/token

{
  "access_token": "eyJhbGci...mng",
  "expires_in": 300,
  "refresh_expires_in": 1800,
  "refresh_token": "eyJhbGciOiJIUzI1...pNwgCZ4jT_ag1f_OK8",
  "token_type": "bearer",
  "not-before-policy": 0,
  "session_state": "fc55eca9-6ce6-4c2c-97de-aebac67e5158",
  "scope": "profile email"
}

and then sending the token to an endpoint that requires authentication gives:

$ curl -k -i -H "Authorization: bearer eyJ..." https://some.domain/alfresco/s/api/people/admin
HTTP/1.1 200 
Server: nginx/1.17.10
...
{
	"url": "\/alfresco\/s\/api\/people\/admin",
	"userName": "admin",
	"enabled": true,
	"firstName": "Administrator",
        ...
}

so that works fine. Inspecting the users in the Nodebrowser now yields

cm:adminworkspace://SpacesStore/1d935a26-177c-4cc1-aa20-dd3dc3b62b58truesys:children0Slet | Gennemtving sletning | Tag ejerskab | Gendan tilladelser
cm:guestworkspace://SpacesStore/aac47022-5f62-4cad-bd36-6bfb8baea5d2truesys:children1Slet | Gennemtving sletning | Tag ejerskab | Gendan tilladelser
cm:abeecherworkspace://SpacesStore/dc103838-645f-43c1-8a2a-bc187e13c343truesys:children2Slet | Gennemtving sletning | Tag ejerskab | Gendan tilladelser
cm:mjacksonworkspace://SpacesStore/b6d80d49-21cc-4f04-9c92-e7063037543ftruesys:children3Slet | Gennemtving sletning | Tag ejerskab | Gendan tilladelser
cm:bruceworkspace://SpacesStore/4b3974cd-bba6-46db-b521-a9875a98aab7truesys:children4Slet | Gennemtving sletning | Tag ejerskab | Gendan tilladelser

i.e. the user "bruce" is only to be found one time in the list. So your assumption could be right, i.e. that the problem is due to concurrent requests from the ADF frontend.

Logging in via the ADF frontend following the above curl requests does not result in an error in the log, and the number of "bruce" users in the Nodebrowser remains one.

Thanks for your input @afaust and @EddieMay 

akring
Confirmed Champ
Confirmed Champ

... just a note - a quick test showed that the problem originally described in the first post of this thread does not seem to occur if the Alfresco backend is replaced with version 7.0.0

akring
Confirmed Champ
Confirmed Champ

Hi @afaust 

Just to follow up regarding the possible concurrent requests from the ADF frontend. The Tomcat access log shows this during the login flow:

172.21.0.7 - - [21/Apr/2021:12:56:01 +0000] "GET /alfresco/s/something-custom/activation HTTP/1.0" 200 20
172.21.0.7 - - [21/Apr/2021:12:56:08 +0000] "GET /alfresco/s/something-custom/activation HTTP/1.0" 200 20
172.21.0.7 - - [21/Apr/2021:12:56:08 +0000] "GET /alfresco/api/-default-/public/authentication/versions/1/tickets/-me- HTTP/1.0" 200 66
172.21.0.7 - - [21/Apr/2021:12:56:11 +0000] "GET /alfresco/api/discovery HTTP/1.0" 200 2212

The /alfresco/s/something-custom/activation endpoint do not require authentication, but the last two requests in the list do. As can be seen these requests arrive within 0.03 sec, but I do not know if that is close enough to trigger the error?

afaust
Legendary Innovator
Legendary Innovator

I see no substantial changes / fixes in the Repository-tier Identity Service auth subsystem for ACS 7.0. I would chalk the difference up to lucky change in timings, or maybe ADF behaviours differently for some reason.

Did you do a network trace via browser or other tools, or check the access logs to see which concurrent requests happend on the affected ACS version?

Edit: Ah, I see you already answered the second part while I was writing my reply....
Since the difference is 3 seconds (not 0.03) I definitely not suspect the last two calls to be at fault. Since the first two do not use the SSO enabled web script endpoint, that only leaves the ticket call. Can you redo your manual cURL test with that endpoint and check that it is not somehow responsible on its own?

akring
Confirmed Champ
Confirmed Champ

Right... my bad. 3 seconds and not 0.03 seconds 🙂

Deleting all "bruce" users and making the curl call gives:

$ curl -k -i -H "Authorization: bearer eyJh..." https://some.domain/alfresco/api/-default-/public/authentication/versions/1/tickets/-me-
HTTP/1.1 200 
Server: nginx/1.17.10
...
{"entry":{"id":"TICKET_0fedc70fbdc9d00ff669d7cc2732afb03a531f1c"}}

i.e. it succeeds, but only a single "bruce" user is created. So it seems to be something triggered by the browser that triggers the error.