04-06-2017 02:31 PM
Can anyone point me in the direction of any docs describing how the fingerprint / MinHash works with Alfresco 5.2, either through Share or the REST API? I am assuming we can use this for finding similar docs ('find more like this'), and that we would have to provide the hash of the content to be compared against, plus some threshold value for what 'close' means, but it's not clear to me how this is set up or used.
04-07-2017 01:17 AM
Using the search REST API (see the Alfresco Content Services REST API Explorer), you can search for or include the "FINGERPRINT" field.
You can add a similarity percentage value to the desired fingerprint with a "_".
So: get a document's fingerprint value through the search query, append for instance _50 to the value, and search for this... (see the demo near the end of Alfresco Tech Talk Live 103)
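As a rough sketch of the steps above, this is how the search request body could be built in Python. The `{"query": {"query": ...}}` body shape and the `fingerprint_query` helper are my assumptions for illustration, and `abc123` is a made-up placeholder, not a real fingerprint value; check the REST API Explorer for the exact request format on your version.

```python
import json
from typing import Optional


def fingerprint_query(fingerprint_value: str,
                      similarity_percent: Optional[int] = None) -> dict:
    """Build a search request body for a FINGERPRINT query.

    Hypothetical helper: the body shape is an assumption about the
    search REST API, not something confirmed in this thread.
    """
    term = fingerprint_value
    if similarity_percent is not None:
        if not 0 <= similarity_percent <= 100:
            raise ValueError("similarity_percent must be between 0 and 100")
        # Append the similarity percentage with an underscore, e.g. "_50"
        term = f"{fingerprint_value}_{similarity_percent}"
    return {"query": {"query": f"FINGERPRINT:{term}"}}


if __name__ == "__main__":
    # "abc123" is a placeholder for the value returned by a first search
    print(json.dumps(fingerprint_query("abc123", 50)))
```

You would then POST this body to the search endpoint with whatever HTTP client you normally use.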
To determine the similarity between documents you have already found, you could calculate a string distance, such as the Levenshtein distance, between their fingerprint values. A lower distance means greater similarity...
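For the string-distance idea, here is a minimal Levenshtein implementation in Python. It is a generic textbook version, not anything Alfresco-specific, and it assumes you have the fingerprint values available as plain strings.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, or substitutions needed to turn a into b."""
    # Keep the shorter string in the inner loop to use less memory
    if len(a) < len(b):
        a, b = b, a
    # previous[j] is the distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion from a
                current[j - 1] + 1,      # insertion into a
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))  # 3
```

Identical strings give 0, and the distance grows as the two values diverge.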
05-12-2017 11:30 AM
For more information, we have added https://community.alfresco.com/people/andy1/blog/2017/05/12/document-fingerprints, which covers the topic in depth.
Hope you find this helpful.