I'm the author of that blog post Jeff linked, and it looks like the aspect approach could work for what you want to do.
If you implement the Hashable aspect as demonstrated in the post and configure your content model correctly, you can apply the aspect to any content as it is uploaded, so the hash is generated automatically on upload. You can then set up a behavior that fires when the file reaches the repository, which could include running a web script to check for duplicates based on the hash. If duplicate content is detected, you could give the user the option to modify or delete it, as Jeff mentioned.
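To make the duplicate check concrete, here is a rough sketch in plain Java of the logic such a behavior or web script might run. It is not the code from the blog post and it uses no Alfresco APIs: the `DuplicateIndex` class, its method names, and the node IDs are hypothetical, SHA-256 is an assumed choice of algorithm, and the in-memory map stands in for whatever repository search you would run against the hash property.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper for illustration only; not part of Alfresco or the blog post.
final class DuplicateIndex {
    // Maps a content hash to the ID of the first node registered with that hash.
    // In a real repository this lookup would be a search on the hash property.
    private final Map<String, String> firstNodeByHash = new HashMap<>();

    // Streams the content through SHA-256 (an assumed algorithm) and returns lowercase hex.
    static String sha256Hex(InputStream content) throws IOException {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
        byte[] buffer = new byte[8192];
        int read;
        while ((read = content.read(buffer)) != -1) {
            digest.update(buffer, 0, read);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Convenience overload for content that is already in memory.
    static String sha256Hex(byte[] content) {
        try {
            return sha256Hex(new ByteArrayInputStream(content));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Records the node under its hash. Returns the ID of an earlier node with the
    // same hash if there is one, or empty if this content has not been seen before.
    Optional<String> register(String nodeId, String hash) {
        String existing = firstNodeByHash.putIfAbsent(hash, nodeId);
        if (existing == null || existing.equals(nodeId)) {
            return Optional.empty();
        }
        return Optional.of(existing);
    }

    public static void main(String[] args) {
        DuplicateIndex index = new DuplicateIndex();
        byte[] report = "quarterly report".getBytes(StandardCharsets.UTF_8);
        // First upload: nothing to compare against yet.
        System.out.println(index.register("node-1", sha256Hex(report)));
        // Second upload of identical bytes: reports the earlier node.
        System.out.println(index.register("node-2", sha256Hex(report)));
    }
}
```

In a real deployment the hash would come from the property the aspect sets on the node, and a non-empty result from the lookup is the point at which you would prompt the user to modify or delete the new content.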
The downside to this approach is that the duplicate content has already been uploaded to the repository by the time it is detected. A different approach might be required if you need to detect duplicates before the upload takes place.