During my first implementations of decision-making support applications in the late 1990s, the continuous task of integrating data was already a known challenge. Not only did the projects take time to execute, but they also required the engagement of different people from across the organization, both technical and business.
Indeed, it had to be this way, since the data originated in many transactional systems that were built to solve operational problems in the organization. Those systems were not created with the goal of providing data in a form that is easy to analyze and to join with other sources of information. Integrating data was mandatory.
Since then, data management has intensified and matured, bringing new practices and organizing those requirements under a data governance perspective, which ensures that such practices are not just followed but also documented, with the results fed back into the practices themselves.
The world has changed, and new, more agile ways to implement analytics have arisen. Lower costs for massive data processing, new software paradigms available through open source, and a new perspective on software as a service abstracted by the cloud allow us to do more and to easily implement applications previously restricted to specific use cases, such as artificial intelligence, creating a new normal. This brave new world couldn't be greater.
But modern software architecture standards have also created a false sense that data by itself can work the miracle of new business outcomes just by being copied and stored in large hardware clusters, or fed in real time, or moved to the cloud, or handed to new professionals who use new programming languages and possess advanced statistical knowledge. These new kids on the block were not considering data management practices such as enterprise data integration.
The truth is we want and need all these new processes and capabilities from this new technological paradigm. But data will continue to grow from traditional transactional applications, and it is exploding from new sources and in different formats that don’t necessarily speak to each other. That will still be the challenge that permeates data management: getting maximum business outcomes with minimal invested resources.
That’s why integrating data is still required in data management. It would be unacceptable, in the information age, for several people in an organization to spend 70% to 80% of their time and effort obtaining, cleansing, joining and transforming data before they finally get any benefit from it.
And Data & Analytics teams will not integrate data as a matter of course; it takes deliberate action:
- Identify and align clear business initiatives;
- Identify initiatives that reuse integrated data, or that share the same data domains;
- Plan and scale your projects so that data is integrated at the right time for the analytical applications that require it;
- Ensure that the data is in the necessary condition to support data initiatives;
- Define an ecosystem architecture that considers mature technologies with modern capabilities to provide hybrid cloud deployment and open source integration;
- Take advantage of data industrialization tools and practices that facilitate and accelerate data flow and consumption.
And do not fear losing flexibility: ecosystem architectures can provide experimentation zones, self-service capabilities and the elasticity offered by new consumption models in the cloud and hybrid cloud, so users will enjoy all the freedom they need. Better yet, these architectures will provide data in an organized and responsible way, so that data security, confidentiality, quality and reliability enable better use of resources and faster time to value for the organization.