The Parallels Between Managing Little Bits of Data and Large Stores of Enterprise Documents

I was reading an excerpt from a whitepaper called Total Data Management, published by a customer data platform company called Treasure Data. It was an interesting, albeit very techie, read explaining — at a high level — the changing face of data management and the need for continuous data integration.

In the whitepaper, the author describes how customer data from different silos should be continuously integrated into the data lake to drive better decisions: marketing decisions, customer experience decisions, the whole works. He goes on to explain the value of continuous data integration.

I couldn’t help but notice the glaring parallels between his data integration model and intelligent information management (IIM) systems like M-Files. In fact, the introduction to the whitepaper asks three questions that we find ourselves asking our prospects and customers:

  • Are you able to access customer data from any device and put it into action quickly?
  • How many internal and external systems and sources of customer data do you need to consolidate for a single view of your customer?
  • Is your customer data accessible to the right people at the right time?

Treasure Data comes at this from a slightly different angle than IIM does, though. They explain the concepts in terms of feeding a centralized data lake with loads of customer data (everything from purchases and webpages visited to IoT data from smart devices), all for the purpose of lightning-fast customer analysis, which in turn allows for equally fast decision-making. The concepts remain the same, however, when speaking of enterprise information management: documents, files, project data and so on.

Here are a few of the most glaring conceptual commonalities between the two models.

Centralizing Data Scattered in Silos that Don’t Talk to Each Other

In the whitepaper, the author says:

“[A] more efficient means of storing and processing data is only one part of the ability to generate insight from it. Enterprises also need to get the data from applications into the warehouse or lake in the first place. While the sources of data have traditionally been limited to a select number of enterprise applications, predominantly run on-premises, as an increasing volume of enterprise data is generated by SaaS applications, new approaches are required to gather that data for analysis.”

Again, while they explain the concept from the angle of little bits of data feeding a larger storehouse, the same applies to information management. Intelligent information management systems like M-Files let users draw documents and files from external applications (CRM, ERP or SharePoint, to name a few) and on-premises stores (network folders), apply metadata for context and then present them in a single user interface.

No more looking in that folder and then in Salesforce, not finding what you need, and then having to write an email to see if a colleague can track down the document you need.

No more wondering whether the contract in front of you is the most current version. You can track version history, and that contract now has a “single point of truth” within M-Files.

With the Intelligent Metadata Layer, M-Files uses a repository-neutral approach to intelligent information management that unifies information across different sources based on context, not on the system or folder in which the information is stored.
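
To make the idea of finding information by context rather than by location a little more concrete, here is a minimal sketch in Python. It is not M-Files code and does not use any M-Files API; the `Document` class, the `find_by_context` helper, the repository labels and the sample files are all hypothetical, made up purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """A file plus the metadata that describes what it is about (hypothetical model)."""
    title: str
    repository: str  # where the file physically lives, e.g. "crm" or "network_folder"
    metadata: dict = field(default_factory=dict)


def find_by_context(documents, **criteria):
    """Return documents whose metadata matches every criterion.

    The repository is deliberately ignored: the search is driven by
    what a document is, not by where it is stored.
    """
    return [
        doc for doc in documents
        if all(doc.metadata.get(key) == value for key, value in criteria.items())
    ]


# Sample documents scattered across four made-up repositories.
documents = [
    Document("Master services agreement.docx", "network_folder",
             {"customer": "Acme", "type": "contract"}),
    Document("Acme renewal quote.pdf", "crm",
             {"customer": "Acme", "type": "quote"}),
    Document("Acme invoice 1001.pdf", "erp",
             {"customer": "Acme", "type": "invoice"}),
    Document("Globex NDA.docx", "sharepoint",
             {"customer": "Globex", "type": "contract"}),
]

# One query, one result list, three different source systems.
acme_docs = find_by_context(documents, customer="Acme")
for doc in acme_docs:
    print(f"{doc.title} (stored in {doc.repository})")
```

The point of the sketch is simply that the query never mentions a folder or a system. A real IIM platform layers indexing, connectors and security on top of this, but the lookup-by-metadata idea is the same.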

Data Lineage — Who Did What and When

Central to the author’s concept of data analysis is the ability to track the source and lineage of the data:

“Additionally, data lineage enables analysts and data stewards to understand where the data came from and what transformations may have been made to it already and is also vital to compliance projects. As such, the combination of data cataloging capabilities and data governance are fundamental enablers of both self-service data preparation and functioning data lakes.”

The same applies to documents and files as it does to those little bits of data. Document lifecycle is central to governance, risk management and compliance (GRC). Given the high priority that organizations in heavily regulated industries must place on quality and compliance management, and the amount of documentation they must produce to support and fulfill GRC initiatives, it is no longer practical to manage content separately from quality and compliance.

As a result, the ability of IIM to track changes to documents, training materials and standard operating procedures must play a central, integrated role in effective quality and compliance management.

Who was the last person to make changes to that quality control document? What did they change? Do we need to roll it back to a previous version? All these things are easy to ascertain with IIM systems like M-Files.
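
Those three questions (who changed it, what changed, can we roll it back) map neatly onto an append-only version history. The Python sketch below is one simple way to model that, assuming a hypothetical `VersionedDocument` class; it is an illustration of the concept, not how M-Files implements versioning.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Version:
    """One immutable entry in a document's history."""
    number: int
    author: str
    content: str
    comment: str
    saved_at: datetime


class VersionedDocument:
    """A document whose history is append-only (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.versions = []

    def save(self, author, content, comment=""):
        """Record who saved what, and when, as a new version."""
        version = Version(
            number=len(self.versions) + 1,
            author=author,
            content=content,
            comment=comment,
            saved_at=datetime.now(timezone.utc),
        )
        self.versions.append(version)
        return version

    @property
    def current(self):
        """The most recently saved version."""
        return self.versions[-1]

    def roll_back(self, number, author):
        """Restore an earlier version by saving its content as a new version.

        Nothing is deleted, so the rollback itself is part of the audit trail.
        """
        earlier = self.versions[number - 1]
        return self.save(author, earlier.content,
                         comment=f"Rolled back to version {number}")


sop = VersionedDocument("Quality control checklist")
sop.save("alice", "Inspect every 10th unit.", "Initial release")
sop.save("bob", "Inspect every 50th unit.", "Reduced sampling")
sop.roll_back(1, author="carol")

for v in sop.versions:
    print(v.number, v.author, v.comment)
```

Treating a rollback as a new version rather than erasing history is a design choice made here for the sake of auditability: the record still shows that bob changed the sampling rate and that carol reverted it.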

Making Sure the Right People Have Access to the Data They Need

From the whitepaper:

“[T]here is yet another piece to the puzzle. The ability of business analysts and data scientists to discover data for preparation and analysis is dependent on the enterprise having an inventory of the data sets that reside within the lake, as well as control over data access, security, privacy and governance requirements. The right users must have access to the right data in the right format.”

A key feature of M-Files is the flexibility and power of user access control management. With M-Files, you can assign access permissions for individual documents and objects, and even for separate versions of the same document or object – including assigning roles that give different levels of access to different users or user groups.

M-Files guarantees that information is available to the people who need it, and inaccessible and invisible to those who don’t need it or aren’t authorized to access it.
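
As a rough illustration of "the right users, the right data," here is a small Python sketch of per-document permissions granted through user groups. Everything in it (the `User` and `ProtectedDocument` classes, the "read" and "edit" levels, the group names) is a hypothetical simplification and not the M-Files permission model; per-version permissions, which the product also supports, are left out to keep the example short.

```python
from dataclasses import dataclass, field

# Access levels in increasing order of privilege (assumed for this sketch).
LEVELS = {"read": 1, "edit": 2}


@dataclass
class User:
    name: str
    groups: set


@dataclass
class ProtectedDocument:
    title: str
    # Maps a group name to the access level granted to that group.
    permissions: dict = field(default_factory=dict)


def access_level(user, doc):
    """Return the highest level the user holds via any group, or None."""
    granted = [doc.permissions[g] for g in user.groups if g in doc.permissions]
    return max(granted, key=LEVELS.get, default=None)


def visible_documents(user, docs):
    """Documents the user has no rights to are not listed at all."""
    return [doc for doc in docs if access_level(user, doc) is not None]


docs = [
    ProtectedDocument("Employee handbook", {"staff": "read", "hr": "edit"}),
    ProtectedDocument("Salary review", {"hr": "edit"}),
]
dana = User("dana", {"staff"})
hari = User("hari", {"staff", "hr"})

print([d.title for d in visible_documents(dana, docs)])
print([d.title for d in visible_documents(hari, docs)])
```

In this sketch dana, who is only in the staff group, can read the handbook and never sees that the salary review exists, while hari, who is also in HR, can see and edit both.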

Managing large volumes of information, whether it’s lots of little bits of customer data or lots of company files and documents, follows a similar framework in many ways. Ultimately, both approaches drive toward the same goals: productivity, efficiency and better decision-making.