
How are duplicate files stored when using Amazon S3?

wjv16_
Champ in-the-making

I have been told that Nuxeo does NOT store multiple copies of the same document, and just uses links. I understand that Nuxeo VCS has a duplication checker.

We are using Nuxeo as the DM, with a PostgreSQL database running on the Amazon cloud and Amazon S3 for storage.

Under this configuration, does Nuxeo still store just one copy of the document, or does it store multiple copies?

What about a multi-tenant environment using the same DM and Amazon instance? If two users upload the same document, does each user get a complete copy, or is there just one copy shared by multiple users?

We have been told various versions of how this works, and would like to find out the real answer!

Thanks!

Bill

1 REPLY

Florent_Guillau
World-Class Innovator

Yes, Nuxeo uses deduplication for any content storage backend. It's true for the standard filesystem-based storage, the Amazon S3 storage, or the RDBMS-based storage.

You can even plug in your own storage backend if needed, and the (simple) BinaryManager APIs it needs to implement will make it deduplicate content automatically.

Deduplication is global to a given repository. As the standard multi-tenant configuration uses a single repository, if several users upload the same document, space for only a single copy will be used in S3.
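
To illustrate the idea, here is a minimal sketch of content-addressed deduplication in Python. It is not Nuxeo code and does not use the BinaryManager API: the `DedupStore` class, its method names, and the choice of SHA-256 as the digest are all assumptions made for the example. The point is only that when a blob is keyed by a digest of its content, two uploads of the same bytes end up as one stored object with two references to it.

```python
import hashlib


class DedupStore:
    """Hypothetical content-addressed store: one blob per distinct digest."""

    def __init__(self):
        self.blobs = {}      # digest -> bytes (stands in for the S3 bucket)
        self.documents = {}  # (user, filename) -> digest (stands in for the DB)

    def upload(self, user, filename, data):
        # The storage key is derived from the content alone (SHA-256 is an
        # assumption here), so identical bytes always map to the same key.
        digest = hashlib.sha256(data).hexdigest()
        # Store the bytes only if this digest has not been seen before.
        self.blobs.setdefault(digest, data)
        # Each document keeps its own reference to the shared blob.
        self.documents[(user, filename)] = digest
        return digest

    def read(self, user, filename):
        return self.blobs[self.documents[(user, filename)]]


store = DedupStore()
content = b"quarterly report"
key_a = store.upload("alice", "report.pdf", content)
key_b = store.upload("bob", "copy-of-report.pdf", content)
```

In this sketch both uploads return the same key and `store.blobs` holds a single entry, while `store.documents` holds one entry per user. Because the store is shared by all users of one repository, the deduplication applies across tenants as well.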
