Nuxeo cluster shared nuxeo.tmp.dir causing problems due to nuxeo-launcher jar naming contention

rg1_
Star Contributor

According to this answer, it is a best practice for nodes in a Nuxeo cluster to share their nuxeo.tmp.dir. When doing so, must each node in the cluster have its own tmpdir on the binary store filesystem? I am encountering nuxeo-launcher jar file naming collisions that cause NFS stale file handle errors when multiple servers in a cluster share their tmpdir and I simultaneously invoke nuxeoctl operations (using Ansible) on all nodes in the cluster.

8 REPLIES

Florent_Guillau
World-Class Innovator

In cluster mode it's not recommended at all to share nuxeo.tmp.dir, because there are many libraries we don't control that could have a problem with it. This in turn means that you can't leverage the NXP-9361 no-copy optimizations...

On the other hand, if the only problems you have are due to nuxeo-launcher jar file naming, then we could fix this on our end and allow tmp sharing. Please open a JIRA ticket.

Edit: the simplest and surest way is probably to have a shared filesystem but make each node point its nuxeo.tmp.dir to a different subdirectory in it.

Please clarify.

If you want the benefits of NXP-9361 in cluster mode then the name collision you see has to be fixed. Given the code in nuxeoctl, the launcher in the tmp dir (which is there to allow the launcher to update itself) should be named nuxeo-launcher-$RANDOM.jar where $RANDOM is randomly generated by bash and should be collision-free (although mktemp would be better). Is that not the case for you? Please open a ticket if you have enough info for us to track this down.
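To illustrate the difference between the two naming approaches mentioned above, here is a minimal sketch. It is not the actual nuxeoctl code; the function names and the directory layout are hypothetical. Bash's $RANDOM yields an integer between 0 and 32767 and nothing checks whether the resulting name is already taken, whereas mktemp creates the entry itself and picks another name if one already exists.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; NOT the actual nuxeoctl source.

# Name built from $RANDOM: at most 32768 distinct values, and no check
# that another process (or another node) is already using the same name.
random_launcher_path() {
  tmpdir="$1"
  echo "$tmpdir/nuxeo-launcher-$RANDOM.jar"
}

# Path reserved with mktemp: a uniquely named directory is created first,
# so two callers never get the same path. The jar keeps a fixed name
# inside that directory. The template ends in X's for portability.
mktemp_launcher_path() {
  tmpdir="$1"
  dir=$(mktemp -d "$tmpdir/nuxeo-launcher-XXXXXXXX") || return 1
  echo "$dir/nuxeo-launcher.jar"
}
```

Calling `mktemp_launcher_path "$NUXEO_TMP"` twice (where `NUXEO_TMP` is an assumed variable holding the tmp directory) returns two different paths, while two calls to `random_launcher_path` can return the same one.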

And yes, once that bug is fixed, using a shared nuxeo.tmp.dir should be OK.

As mentioned above, I'm using Ansible with an ssh connection to remotely manage the multiple Nuxeo nodes in my cluster. We regularly see nuxeo-launcher jar collisions when we remotely execute nuxeoctl commands simultaneously across all nodes in the cluster. For now, we have updated each node's nuxeo.conf to set nuxeo.tmp.dir to a unique, node-specific directory within the shared binary file system to work around this issue.

Yes, having nuxeo.tmp.dir point to different parts of a shared filesystem depending on the node is a good way to solve the issue.
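As a rough sketch of that workaround, each node could derive its own subdirectory from its hostname and write the matching nuxeo.tmp.dir line into its nuxeo.conf. The helper names, the `tmp/<node>` layout, and the example mount point are assumptions for illustration only; nuxeo.tmp.dir is the only name taken from the thread.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the directory layout is an assumption.

# Build a node-specific tmp path under a shared filesystem root.
# The node id defaults to this machine's hostname.
node_tmp_dir() {
  shared_root="$1"
  node_id="${2:-$(hostname)}"
  echo "$shared_root/tmp/$node_id"
}

# Print the line to place in this node's nuxeo.conf.
nuxeo_tmp_conf_line() {
  printf 'nuxeo.tmp.dir=%s\n' "$(node_tmp_dir "$1" "$2")"
}
```

For example, `nuxeo_tmp_conf_line /mnt/shared node1` prints `nuxeo.tmp.dir=/mnt/shared/tmp/node1`; the directory itself still has to be created on the shared filesystem (for instance with `mkdir -p`) before the node starts.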

Thanks. Given our discussion, you might consider updating your original answer since I found it a bit confusing (I would like to be able to mark it as the accepted answer). Also, if pointing nuxeo.tmp.dir to different parts of a shared filesystem depending on the node is a best practice, would it make sense to update the cluster documentation accordingly?

Answer updated and doc (http