Hyland Connect

bboeri · 12-29-2011

Some Federal agencies require MD4, MD5 or other checksum calculations to assure that a file received is the same as the file stored in the repository. Moreover, if there is a duplicate hash in the system, they require that the user be notified. Is any such mechanism available in Alfresco?

jpotts · 12-30-2011

Hi Bob,

This is not available out-of-the-box, but it is a relatively straightforward customization. In fact, here is a blog post that shows one way to do it.

I did something similar for a former client. In addition to computing the checksum, I added a custom component to the Share form that called a web script and displayed some text and a link if a duplicate existed in the repo. The client also wanted to be told about duplicates as early as the file upload, so the user could choose not to create one, but that requirement never ranked high enough on the requirements list to be implemented.

Hope that helps,

Jeff

Jeff Potts
https://www.metaversant.com | https://ecmarchitect.com

cpaul · 12-30-2011

I'm the author of that blog post Jeff linked, and it looks like the aspect approach could work for what you want to do.

If you implement the Hashable aspect as demonstrated in the post and configure your content model correctly, you can apply the aspect to any content as it is uploaded, so the hash is generated automatically on upload. A behavior can then be set up to fire when the file reaches the repo, which could include calling a web script to check for duplicates based on the hash. If duplicate content is detected, you could give the user the option to modify or delete the content, as Jeff mentioned.
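To make the hash-then-check flow concrete, here is a minimal sketch in Python. It is not Alfresco code: `compute_digest`, `HashIndex`, and the node IDs are hypothetical names, and the in-memory dictionary only stands in for whatever query the repository would run against the stored hash property. SHA-256 is the default here as an assumption; pass `"md5"` if that is what your agency specifies.

```python
import hashlib
import io
from collections import defaultdict


def compute_digest(stream, algorithm="sha256", chunk_size=65536):
    """Hash a binary stream in chunks so large files are never fully in memory."""
    hasher = hashlib.new(algorithm)
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        hasher.update(chunk)
    return hasher.hexdigest()


class HashIndex:
    """Hypothetical stand-in for a repository lookup on a stored hash property."""

    def __init__(self):
        self._by_digest = defaultdict(list)

    def register(self, node_id, stream):
        """Store the hash for node_id; return it with the IDs of earlier matches."""
        digest = compute_digest(stream)
        duplicates = list(self._by_digest[digest])
        self._by_digest[digest].append(node_id)
        return digest, duplicates


index = HashIndex()
_, first = index.register("node-1", io.BytesIO(b"quarterly report"))
_, second = index.register("node-2", io.BytesIO(b"quarterly report"))
print(first)   # earlier nodes with the same content as node-1
print(second)  # earlier nodes with the same content as node-2
```

The same stored digest also covers the integrity requirement: recompute the hash of the file you receive and compare it with the value recorded at upload.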

The downside to this approach is that the duplicate content is already uploaded to the repository. A different approach might be required if you need to detect duplicates before upload takes place.
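One possible shape for a check before upload is to hash the file on the client side and ask whether that hash is already known before sending any content. The sketch below is Python and only illustrates the idea: `upload_if_new`, the `known_digests` set, and the `upload` callback are hypothetical placeholders, not an Alfresco API, and in practice the lookup would be a call to the server.

```python
import hashlib


def file_digest(data, algorithm="sha256"):
    """Return the hex digest of a bytes object."""
    return hashlib.new(algorithm, data).hexdigest()


def upload_if_new(data, known_digests, upload):
    """Call upload(data) only if its digest is not already known.

    Returns (uploaded, digest) so the caller can tell the user about a duplicate.
    """
    digest = file_digest(data)
    if digest in known_digests:
        return False, digest
    upload(data)
    known_digests.add(digest)
    return True, digest


stored = []    # stands in for the repository
known = set()  # stands in for a server-side hash lookup
first_try, _ = upload_if_new(b"contract v1", known, stored.append)
second_try, _ = upload_if_new(b"contract v1", known, stored.append)
print(first_try, second_try, len(stored))
```

The trade-off is that the client has to read the whole file to hash it, but a duplicate is rejected without its content ever reaching the repository.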

everbehere · 02-25-2014

I am relatively new to Alfresco and need something similar. What would the approach be if I don't want duplicate documents to be uploaded at all? Alfresco already rejects documents on the basis of the file name, but is it possible to reject documents based on any other field?
Thanks in advance!

MD4 or other hashing to guarantee no document changes