12-24-2020 02:17 AM
Is there any option to identify file corruption?
12-25-2020 10:47 AM
For both points there is no out-of-the-box functionality, but something like this could be implemented using Alfresco behaviours and background processes (i.e. cron jobs based on the Quartz library). For the file upload, the client would have to submit a verifiable checksum as a metadata property. That checksum could then be verified in a custom OnContentUpdatePolicy behaviour; throwing an exception when the verification fails rolls back the current transaction and results in the uploaded content being deleted (unless WORM storage is used). The same checksum could then be stored as part of the document metadata (this requires a custom content model) and used by a regularly running Quartz job to re-verify the file contents.
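To illustrate the verification step, here is a minimal sketch in plain Java of the check that such a behaviour or Quartz job could run. It assumes SHA-256 and a hex-encoded checksum; the class and method names (ChecksumVerifier, sha256Hex, verify) are hypothetical and not part of Alfresco. In a real behaviour the stream would presumably come from the node's ContentReader and the expected value from a property of your custom content model.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not an Alfresco API: computes and verifies a SHA-256 checksum.
class ChecksumVerifier {

    // Streams the content through SHA-256 and returns the lowercase hex digest.
    static String sha256Hex(InputStream in) {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
        byte[] buf = new byte[8192];
        try {
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Throws if the computed checksum differs from the one the client submitted.
    // Inside a behaviour, an exception like this would roll back the transaction.
    static void verify(InputStream content, String expectedHex) {
        String actual = sha256Hex(content);
        if (expectedHex == null || !actual.equalsIgnoreCase(expectedHex.trim())) {
            throw new IllegalStateException(
                "Checksum mismatch: expected " + expectedHex + " but computed " + actual);
        }
    }

    public static void main(String[] args) {
        byte[] data = "hello".getBytes(StandardCharsets.UTF_8);
        String checksum = sha256Hex(new ByteArrayInputStream(data));
        verify(new ByteArrayInputStream(data), checksum); // passes silently
        System.out.println("verified " + checksum);
    }
}
```

The same verify call could be reused by the periodic job, reading the stored checksum from the node's metadata instead of from the upload request.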
Typically though, I would expect this kind of long-term file corruption detection to be done in the storage system itself and kept out of Alfresco. Chances are that if content is important enough for file corruption to be a concern, a professional storage solution would be employed that already includes checksum and even correction capabilities.
That then only leaves the upload scenario. For this one, you may not want the validation to happen inside Alfresco, but rather on the client side. I.e. the client uploads the content, keeps the pre-calculated checksum for itself, and asks Alfresco to provide a checksum of the file after upload for verification. As there is no standard for this yet (e.g. https://datatracker.ietf.org/doc/draft-ietf-httpbis-digest-headers/ is currently only a draft), there is obviously no support for it in Alfresco yet, but that does not mean one could not implement a custom upload web script / API that supports the Digest and Want-Digest HTTP headers.
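As a rough sketch of the client-side check, the following Java compares a Digest header value returned by such a custom endpoint against the locally held content. It assumes the RFC 3230 style value format (sha-256=<base64>, possibly one of several comma-separated entries); the class and method names are hypothetical, and the header is simulated here since no such Alfresco endpoint exists out of the box.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical client-side helper, not an Alfresco API.
class DigestHeaderCheck {

    // Builds a Digest header value in the assumed form "sha-256=<base64 of hash>".
    static String sha256DigestValue(byte[] content) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256").digest(content);
            return "sha-256=" + Base64.getEncoder().encodeToString(hash);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    // Returns true if the header contains a sha-256 entry matching the local content.
    static boolean matches(String digestHeader, byte[] localContent) {
        if (digestHeader == null) {
            return false;
        }
        String expected = sha256DigestValue(localContent).substring("sha-256=".length());
        // Base64 never contains commas, so splitting the entry list on "," is safe.
        for (String part : digestHeader.split(",")) {
            String entry = part.trim();
            int eq = entry.indexOf('='); // first '=' separates algorithm from value
            if (eq < 0) {
                continue;
            }
            String algorithm = entry.substring(0, eq);
            String value = entry.substring(eq + 1);
            if (algorithm.equalsIgnoreCase("sha-256") && value.equals(expected)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        byte[] uploaded = "example content".getBytes(StandardCharsets.UTF_8);
        // Simulated server response header; a real client would read it from the HTTP response.
        String header = sha256DigestValue(uploaded);
        System.out.println("upload intact: " + matches(header, uploaded));
    }
}
```

Keeping the comparison on the client means the repository only has to report what it stored, and the client decides whether to retry the upload.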