How to organize a data team to get the most value out of data
Mar 29, 2024
Wannes Rosiers
To state the obvious: a data team is there to bring value to the company. But is it really that obvious? Haven’t companies too often created a data team without a plan?
Monkey business solution: we often miss out on the obvious. Copyright Daniel J. Simons
Too often, data initiatives have started because competitors were doing the same, or because doing data sounds sexy. And when the data team is indeed in pursuit of value, how do you measure it? Even more, if you succeed in measuring value, how do you organize yourself to optimize the outcome? All valid questions, which we want to answer today.
Understanding the Data Team’s Raison d’être
Let’s start with your data team’s reason for existence. Just a few years ago, we often heard companies say: “we need to do Big Data”. But when asked what doing Big Data meant, no one could reply. It’s all about the V’s, they said: the Variety, the Velocity and the Volume of data. But actually, it was all about the hype, and what we then considered Big Data is now rather small. That wasn’t the first time data was hyped and it won’t be the last either.
For a long time, the actual reason for a data team’s existence has been to create operational reports, building up to KPI trees which allow you to steer your entire business. This remains a valid and valuable use of data, but it’s only the tip of the iceberg. There is more data than ever before, and algorithms and models are popping up everywhere, which increases the number of possible use-cases every day.
The exponential growth of possible data use-cases — Picture by Ian Usher
More and more companies identify these possibilities and succeed in implementing them. Data team tasks shift from purely analytical reporting to embedding solutions in operational workflows and steering automated decision making. It’s a shift in time: from using data after the fact to using it before events happen. This requires a whole other approach to organizing your team. Suddenly, things like 24/7 on-call support become relevant.
The needs of a modern data team
Next to the challenging requirements of availability (data that steers operational processes has to be there when needed), the omnipresence of data also impacts the company’s needs with regard to building a data organization. Simply put: beyond a certain size, one cannot expect a single person to know all the data available in the company, let alone understand all business goals.
In short, to obtain value from data, you need to:
Understand the data: know the exact meaning of every data field and the business process or input channel that created it.
Recognize the business goals: know what you want to accomplish, the why of your data work next to the what.
Have sufficient technical data knowledge: master data engineering or data science technology, as well as methodology such as machine learning models or data modelling.
Keep everything running, all the time: again master technology and methodology, with an emphasis on keeping infrastructure and tools up and running and preventing models from drifting.
Why central data teams fail
A central, single-disciplinary team won’t succeed. You cannot expect someone to be the technical expert for both infrastructure and engineering, as well as the business expert, and that still leaves data model expertise out of scope.
If it’s too much, you’ll fail — Photo by Valery Fedotov on Unsplash
This is not a data-specific challenge, and the good thing is: you can learn from others. Within software engineering, you see the rise of cross-functional teams, or at least the rise of agile frameworks, to increase the connection between technical and business experts.
Elaborating on the similarity between data needs and the rise of cross-functional teams in software engineering, you rapidly end up with a number of different roles. Some of these roles might of course be taken up by the same person, depending on your context. So what are these roles?
Product owner: someone who protects the why
Functional analyst or business process expert: someone who can translate the why into the what
Data expert: someone who understands the input data
Technical data expert: someone who can transform the input data into valuable output. Note that depending on the use-case this can be a data engineer, a data visualization expert or both.
DevOps expert: someone who can provide and maintain the tools and build the CI/CD pipeline. Again, these might be distinct people.
Central data teams group these technical data experts, possibly in combination with the DevOps experts. This means you group the people building the data platform with those using it, rather than grouping the data workers with those who can guide the value creation from data.
This is exactly why central data teams too often kept tackling tech challenges rather than business challenges.
And what to do about it
As a data team, we can proudly steal the solution from software engineering teams. It’s all about bringing the why and the what of data together, whether in cross-functional teams or using agile frameworks. Up until a certain company size or number of activities, bringing the business skills closer to a central data team will make sense; beyond that size or number of activities, federating the data skills will fit better.
Product thinking and product design, recently emerging solutions for data — Photo by Edho Pratama on Unsplash
The recent introduction of product thinking to data is both a result of this need to bring why and what together and a trigger to create cross-functional teams. Product thinking has also led to the introduction of platform thinking, which essentially means that you consider your data platform a product to be used by your data workers, built and maintained by a separate cross-functional team.
Again, up until a certain size (now of the data team), combining the responsibility to manage a data platform and build use-cases is logical, just as in start-ups, where everyone steps in to perform whichever task is needed. Yet beyond a certain size, focus becomes increasingly relevant and you will need to split the team into a platform team and data value units federated across your organization.
Focus through product and platform separation — Image by author
If you want to know more, we invite you to join us at an event where we will deep dive into the Fast lane to data value: embracing platform and product thinking. On Thursday April 18th from 15:00 to 18:00 in Leuven, we will host this event together with our product brand Conveyor and one of our dear clients, Luminus.
Together we explore the evolution of data governance from central initiatives, through the federated data mesh approach, to the rise of data contracts. Discover how these initiatives, all rooted in the concept of ‘product thinking’ and acknowledging the distributed nature of data, aim to effectively govern data across your entire organization.