
Duplicate File Prevention

Former Member
I am finding a lot of threads and posts about people wanting to prevent duplicate files from being imported into M-Files, but many of them are over a year old and everyone has a different semi-solution.

What is the best modern solution to prevent employees from submitting duplicate files?

At this point, even preventing them from uploading files with the same file name would be good enough to prevent some of the mistakes.

Thank You,
Royce
  • Former Member

    Okay, thank you for the information bright-ideas.dk and Joonas Linkola.

    To further my question based on knowing the Unique Object Enforcement (UOE) module is the best way to approach the problem, can I have clarification on a specific scenario?

    We are transferring over 150,000 files (374 GB) from an old document management system. We are not sure which files in that package are duplicates, and after the transfer we want to prevent duplicates from being submitted as well. You say that the UOE module prevents duplicate object metadata, but I am uncertain whether it checks all metadata or can be set to check only specific properties.

    Do we choose which properties are looked at using this module?
    If we want it to look at only the name of the file, would that work?
    If we want it to look at only the exact size of a file, would that work?
    What happens when a duplicate is detected based on the properties we chose? (An overwrite option, or an error?)

    We want it to be accurate, but not ruin our import by removing non-duplicates either. Example: if we have an invoice package of 4 files that all have very similar properties because they are related to each other, we do not want to see one or two of them getting blocked, since it would be hell trying to fix that problem.

    Any guidance that you guys can provide would be appreciated!

    Sincerely,
    Royce


    This sounds like something that could be handled before anything is imported. You could write a script that calculates a hash of each file and removes the duplicates.
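
    As a rough sketch of that pre-import pass (the directory path, the use of SHA-256, and the keep-the-first-copy policy are all assumptions, not part of any M-Files tooling):

    ```python
    import hashlib
    from collections import defaultdict
    from pathlib import Path

    def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
        """Hash a file in chunks so large files never load fully into memory."""
        h = hashlib.sha256()
        with path.open("rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                h.update(chunk)
        return h.hexdigest()

    def find_duplicates(root: str) -> dict[str, list[Path]]:
        """Group every file under `root` by content hash.

        Returns only the groups with more than one file, i.e. the
        byte-for-byte duplicates.
        """
        groups: dict[str, list[Path]] = defaultdict(list)
        for path in Path(root).rglob("*"):
            if path.is_file():
                groups[file_sha256(path)].append(path)
        return {digest: paths for digest, paths in groups.items() if len(paths) > 1}

    if __name__ == "__main__":
        # Report duplicate groups; you would keep the first path in each
        # group and review or delete the rest before the import.
        for digest, paths in find_duplicates("export_folder").items():
            print(digest[:12], "->", ", ".join(str(p) for p in paths))
    ```

    Because this compares file contents rather than names or sizes, related invoice files with similar metadata are never flagged as duplicates of each other — only byte-identical copies are.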