Mass Delete Files

I'm trying to delete about 20,000 files. The M-Files viewer only shows 500 files at a time, so I can only delete 500 at a time.

Is there a better way to delete these files? They don't need to be moved or kept anywhere; we just need them destroyed.

Thanks

  • Two options for you:

    1) If all 20,000 files are in the same state of the same workflow, build a state transition and let the M-Files server do the work. Only pursue this option if all the files to be deleted share the same metadata and are in the same workflow. You can then set a trigger that moves them to the next state, and that state deletes them once reached.

    2) In M-Files Admin, increase the list view limit. I don't recommend raising it to 20,000. We have ours set to 1,000, which is still better than 500. You know the limits of your PC's performance. If you do raise the limit further, remember to revert it afterwards so users don't run into performance issues.

  • I found a tool for this purpose written in C#, but you would have to compile it yourself.

    https://github.com/8/MFilesDeleter

  • I started on a command-line tool for this a while ago and ended up converting it to a VAF application that can be triggered manually or on a schedule. The primary reason I converted it to VAF was the "Search Condition Editor", which is standard functionality in the VAF API and Admin Tool. In our use cases, we wanted to destroy objects based on very specific metadata, which is harder to do from the command line. Let me know if you want some code snippets from what we built.

  • We have had some similar cases (in ours, we needed to update certain properties). We developed a small background task that periodically searches for documents and updates their metadata. Something similar would work for destroying. It is VAF-based and can stop itself when no more matching documents exist. It is pretty straightforward. If you go down that path, let me know and I can support you.

  • Of course! Here's the main portion of it. I took out a bunch of logging and my-company-specific lines, so there's a chance I missed a semicolon or something; don't be surprised if a couple of errors are raised immediately. The code is wrapped in a Task (or queue, I forget what it's called).

    using DCPMidstream.Common.Configuration;
    using DCPMidstream.Common.Logging;
    using DCPMidstream.Common.TaskProcessing;
    using MFiles.VAF.AppTasks;
    using MFiles.VAF.Common;
    using MFiles.VAF.Extensions.Dashboards;
    using MFilesAPI;
    using System;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Linq;
    
    namespace DCPMidstream.DestroyObjects
    {
        public class Module : ModuleWithTaskProcessors
        {
            [TaskProcessor(RootConfig.TaskQueueIdConcurrent, DestroyObjectsConfig.TaskType, TransactionMode = TransactionMode.Unsafe)]
            [ShowOnDashboard(ModuleName, ShowRunCommand = true)]
            public void Process(ITaskProcessingJob<TaskDirective> job)
            {
    
                Vault vault = job.Vault;
    
                // Shorthand for this module's configuration
                DestroyObjectsConfig config = RootConfig.DestroyObjects;
    
                // Process each enabled rule
                for (int i = 0; i < config.Rules.Count; i++)
                {
                    DestroyRule rule = config.Rules[i];
                    if (!rule.Enabled) continue;
    
                    MFSearchBuilder mFSearchBuilder = new MFSearchBuilder(vault, rule.Filter.ToApiObject(vault));
    
                    // IDs from the previous short page, used to detect objects that could not be destroyed
                    IEnumerable<int> objIDsPrevious = new List<int>();
    
                    do
                    {
                        int pageSize = RootConfig.DestroyObjects.PageSize;
    
                        ObjectSearchResults results = mFSearchBuilder.Find(
                          maxResults: pageSize,
                          searchTimeoutInSeconds: 0);
    
                        if (results.Count == 0)
                        {
                            break;
                        }
    
                        ObjIDs objIDs = results.GetAsObjectVersions().GetAsObjVers().GetAllDistinctObjIDs();
    
                        // Check if result count is smaller than page size (getting towards the end)
                        if (results.Count < pageSize)
                        {
                            // If every Obj ID was also in the last short result set, break - these are undeletables
                            if (!objIDs.Cast<ObjID>().Select(x => x.ID).Except(objIDsPrevious).Any())
                            {
                                break;
                            }
                            // Otherwise this is the first short page, or it contains new IDs - remember them (materialized with ToList) for the next comparison
                            else
                            {
                                objIDsPrevious = objIDs.Cast<ObjID>().Select(x => x.ID).ToList();
                            }
                        }
    
                        // Kill switch - if the application or rule is disabled at any time, it will break the loop and stop processing this rule
                        if (!RootConfig.DestroyObjects.Enabled || !RootConfig.DestroyObjects.Rules[i].Enabled)
                        {
                            break;
                        }
    
                        try
                        {
                            vault.ObjectOperations.DestroyObjects(objIDs);
                        }
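                        catch (Exception)
                        {
                            // Add logging here. The exception is swallowed so the loop keeps going;
                            // note that a full page that fails every time is retried until the rule is disabled.
                        }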
                    } while (true);
                }
            }
        }
    }

    Here's the config that the code uses:

    using MFiles.VAF.Configuration;
    using MFiles.VAF.Configuration.JsonAdaptor;
    using MFiles.VAF.Extensions;
    using System.Collections.Generic;
    using System.Runtime.Serialization;
    
    namespace DCPMidstream.Common.Configuration
    {
        [DataContract]
        public class DestroyObjectsConfig
        {
            public const string TaskType = "DestroyObjects";
    
            [DataMember]
            [RecurringOperationConfiguration(RootConfig.TaskQueueIdConcurrent, TaskType,
                IsRequired = true,
                HelpText = "Determines how often the destroy process runs.")]
            public Frequency Schedule { get; set; }
    
            [DataMember]
            [JsonConfIntegerEditor(
                Label = "Page Size",
                HelpText = "Determines how many objects to search for at once.  Searching will continue in batches until 0 results are returned.\n\nM-Files API currently has a limitation where you cannot destroy more than 500 objects at one time.",
                Min = 1,
                Max = 500,
                DefaultValue = 100)]
            public int PageSize { get; set; } = 100;
    
            [DataMember]
            [JsonConfEditor(
                IsRequired = true,
                HelpText = "The rules that determine which objects are destroyed.",
                ChildName = "Rule")]
            public List<DestroyRule> Rules { get; set; }
        }
    
        [DataContract]
        public class DestroyRule
        {
            [DataMember]
            [JsonConfEditor(
                IsRequired = true,
                HelpText = "A user-friendly description to help people find it later.")]
            public string Name { get; set; }
    
            [DataMember]
            [JsonConfEditor(
                DefaultValue = false)]
            public bool Enabled { get; set; }
    
            [DataMember]
            [JsonConfEditor(
                IsRequired = true,
                HelpText = "This rule will apply to any objects that satisfy these search conditions.")]
            public SearchConditionsJA Filter { get; set; }
        }
    }

    The best part of the code (IMHO) is the property in the config called Filter, of type SearchConditionsJA. That's where all the magic happens. It presents a brilliant UI in the admin tool for easy search configuration, and then, if you look in the main code, there's one line of code:

    MFSearchBuilder mFSearchBuilder = new MFSearchBuilder(vault, rule.Filter.ToApiObject(vault));

    and it returns everything that matches what you configured.

    The other nice part is the property called Schedule, of type Frequency. It lets the admin decide how often the Task should run. Or, since I specified "[ShowOnDashboard(ModuleName, ShowRunCommand = true)]" on the method that runs as a task (first block of code), the process can also be triggered manually at any time from the admin tool dashboard.

    You may not be able to run this in your vault as is, but hopefully it gives you some ideas on how you can do something like it in your vault.  Good luck! Let me know if I can help.
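
    The stop conditions of the paging loop can be sketched in isolation. This is a minimal, self-contained sketch and not the poster's code: BatchDestroyer, DestroyAll, findPage, destroyBatch and isEnabled are hypothetical names, and plain delegates stand in for the M-Files search and destroy calls. Unlike the loop above, it compares every page with the previous one, not only short pages, so it also stops when undeletable objects fill a whole page.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Hypothetical helper for illustration; not part of the M-Files API or VAF.
    public static class BatchDestroyer
    {
        // findPage(pageSize): returns up to pageSize IDs that still match the rule.
        // destroyBatch(ids):  tries to destroy them; some may survive (undeletable).
        // isEnabled():        kill switch, checked before every page.
        // Returns the number of destroy calls that were made.
        public static int DestroyAll(
            Func<int, IReadOnlyCollection<int>> findPage,
            Action<IReadOnlyCollection<int>> destroyBatch,
            int pageSize,
            Func<bool> isEnabled)
        {
            var previous = new HashSet<int>();
            int batches = 0;

            while (isEnabled())
            {
                IReadOnlyCollection<int> page = findPage(pageSize);

                // Nothing matches any more: done.
                if (page.Count == 0) break;

                // Every ID was already in the previous page, so the last
                // destroy call freed up nothing new: treat the rest as undeletable.
                if (page.All(previous.Contains)) break;

                previous = new HashSet<int>(page);
                destroyBatch(page);
                batches++;
            }

            return batches;
        }
    }

    public static class Demo
    {
        public static void Main()
        {
            // In-memory stand-in for a vault: 12 objects, two of which cannot be destroyed.
            var remaining = new SortedSet<int>(Enumerable.Range(1, 12));
            var locked = new HashSet<int> { 3, 7 };

            int calls = BatchDestroyer.DestroyAll(
                n => remaining.Take(n).ToList(),
                ids => remaining.ExceptWith(ids.Where(id => !locked.Contains(id))),
                pageSize: 5,
                isEnabled: () => true);

            // Prints "3 destroy calls, left over: 3, 7"
            Console.WriteLine($"{calls} destroy calls, left over: {string.Join(", ", remaining)}");
        }
    }
    ```

    The trade-off of checking every page is that the loop gives up as soon as one page consists entirely of objects it has already tried, even if deletable objects exist beyond that page. It assumes the search returns results in a stable order.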

  • Another idea is to use the importer tool. Export your objects, then have the importer tool move them to a state that deletes them.

  • I use the One Time Export feature in M-Files Admin (specify a search filter for those 20,000 files) and then tick the "Destroy exported objects after export" box.

  • Thank you all for the options. I will give them a try, and will probably start with the export in the admin tool.

  • I tried this and it seemed to work. Since it's an export, do the files go anywhere? If so, where do they go?