Can data analytics provide insights that could help government organisations reduce the impact of an outsourcer failing? That was the question that my colleague David Springate (a senior data scientist here at Teradata) and I set out to answer after the collapse of the well-known outsourcer Carillion. To do this, we applied analytic techniques to a sample of open government spend and contract data.
We decided this was a topic worth investigating because the use of outsourcers by government organisations is likely to be a reality of service delivery for some time to come. This is despite the fact that Carillion’s collapse has led some to question the viability of the operating models of ‘multiservice’ outsourcers – large outsourcers that provide a wide-ranging array of services at scale.
The truth is organisations (including government) have valid reasons for using outsourcers. Sometimes it’s because they lack the expertise to easily deliver a particular service in-house, or they can’t do it as well as a specialist outsourcer. In other cases, it’s because outsourcers are able to deliver these services more efficiently and cheaply.
But outsourcing also introduces additional complexity. For example, tracking and managing the performance of external contractors requires robust supply chain management processes. Organisations also need to protect themselves against the risk of an outsourcer’s business failing. Carillion’s collapse and warnings from some other large outsourcers brought this latter point into sharp relief. It also raised questions about government’s ability to manage the risk of such failures.
This is exactly the sort of challenge that data analytics should be able to help with. Further, government’s commitment to transparency about its spending and contract data should mean there’s plenty of data on which to perform analytics. So that’s what we decided to do.
Over three blog posts, I describe what we did, how we did it, the challenges we encountered and the conclusions we reached. Spoiler alert: we think insights derived from data analytics could help government organisations manage and minimise the impact of outsourcer failure.
Understanding the business problem: articulating the business question that data analytics could help to answer
It’s always useful to invest a good chunk of time thinking about what the real business needs are and assessing which questions it makes sense to try to answer. Our colleague Stewart Robbins has written a blog post about the sort of business problems organisations should prioritise versus the ones they could (and often do) tackle when it comes to the use of data analytics. He points out that both kinds of problem have merit, but the former should take precedence, especially in the face of resource constraints.
Anyway, we settled on the question: what metrics could help government organisations easily assess their exposure in the event that one or more of their outsourcers fail? Separately, we also did some thinking about how such metrics could help them to systematically reduce and manage such risk.
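To make the idea of an exposure metric concrete, here is a minimal sketch of one possible measure: each supplier's share of an organisation's total spend, plus a simple concentration index (the sum of squared shares). This is an illustration only, not necessarily the metrics we ended up using. The organisation names, supplier names and amounts are hypothetical, as are the function names.

```python
from collections import defaultdict

# Hypothetical sample rows: (organisation, supplier, spend in GBP).
SAMPLE_SPEND = [
    ("Dept A", "Supplier X", 600_000.0),
    ("Dept A", "Supplier Y", 300_000.0),
    ("Dept A", "Supplier Z", 100_000.0),
    ("Dept B", "Supplier X", 50_000.0),
    ("Dept B", "Supplier Y", 150_000.0),
]


def supplier_shares(rows):
    """Return {organisation: {supplier: share of that organisation's total spend}}."""
    totals = defaultdict(float)
    by_supplier = defaultdict(lambda: defaultdict(float))
    for org, supplier, amount in rows:
        totals[org] += amount
        by_supplier[org][supplier] += amount
    return {
        org: {s: amt / totals[org] for s, amt in suppliers.items()}
        for org, suppliers in by_supplier.items()
        if totals[org] > 0  # skip organisations with no recorded spend
    }


def concentration_index(shares):
    """Sum of squared shares; 1.0 means all spend goes to a single supplier."""
    return sum(share ** 2 for share in shares.values())


shares = supplier_shares(SAMPLE_SPEND)
for org, org_shares in shares.items():
    top_supplier = max(org_shares, key=org_shares.get)
    print(org, top_supplier, round(org_shares[top_supplier], 2),
          round(concentration_index(org_shares), 3))
```

A high share for a single supplier, or an index close to 1.0, would suggest that an organisation is heavily exposed if that supplier fails.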
Getting started: Finding the necessary data
Once we were clear on the questions, we started looking around for data we needed to answer them. The dataset needed to be complete for the period we were interested in and of good enough quality. Successive governments’ commitment (see here and here) to openness around spend and contract data was a real boon - the data was out there, we just had to find it.
Unfortunately, the process of finding this data (aka data discovery) is where we hit our first road bump. Government’s spend, tender and contract data is published across multiple sources, sometimes with conflicting or not-easy-to-reconcile data. For example, contract data is published on Tenders Electronic Daily, Crown Commercial Service’s Contracts Finder tool and its Contracts Finder Archive. Spend data is also published via a number of platforms, including government organisations’ own websites as well as the centrally maintained GOV.UK and data.gov.uk websites. As a result, finding all the data we were interested in wasn’t straightforward.
Data quality was another road bump
Much of government’s published data contains unexplained gaps, and there are inconsistencies in the way different organisations structure their published data. Also, the fact that this data isn’t linked to any other government datasets makes the work of validation (and contextualisation, for that matter) a lot harder. These challenges aren’t unique to government datasets, but given the amount of effort that’s gone into the transparency initiative and government’s digital transformation agenda, failing to link datasets in this way seems like a missed opportunity.
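As an illustration of what checking for gaps can look like, here is a minimal sketch that lists the months in a reporting period for which an organisation has no published transactions at all. It assumes each record carries a transaction date; the dates and the function name are hypothetical, and real spend files would need parsing and cleaning first.

```python
from datetime import date


def missing_months(published, start, end):
    """Return (year, month) pairs from start to end (inclusive) with no records."""
    seen = {(d.year, d.month) for d in published}
    gaps = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        if (year, month) not in seen:
            gaps.append((year, month))
        # Step forward one calendar month, rolling over at year end.
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return gaps


# Hypothetical transaction dates for one organisation.
dates = [date(2017, 10, 3), date(2017, 11, 15), date(2018, 1, 9)]
gaps = missing_months(dates, date(2017, 10, 1), date(2018, 2, 28))
print(gaps)
```

A month with no records is not necessarily an error (an organisation may simply have published late or under a different file name), so a check like this only flags where to look more closely.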
None of these challenges will be unfamiliar to data scientists. A 2016 Harvard Business Review article cited a study which estimated that about 60% of most data scientists’ time is spent on data preparation (cleansing and organising). We seemed on track to prove the accuracy of this estimate when, thankfully, we heard about the Spend Network. This organisation has focused on the arduous task of gathering and standardising government’s (open) contract and tender data so that it can be aggregated and analysed. Spend Network also links this data to government organisations’ spend data and suppliers’ company records before classifying it using an open standard – the United Nations Standard Products and Services Code (UNSPSC). This service type classification provides additional information about the sort of service provided by an outsourcer. It proved especially useful at the analysis stage.
Given Spend Network’s subject matter expertise and comprehensive dataset (it has contract and spend data for 336 government organisations, including some historical ones), we decided to save ourselves a significant amount of work and simply reuse their well-curated data.
For the purposes of this project, Spend Network gave us the spend and contract data for 11 government organisations over the last 7 years. This allowed us to focus on working out the best way to extract insights. In other words, we focused on the analytics; this involved:
- Deciding what patterns were important given the business question we were trying to answer and the available data
- Assessing how well the analytics supported meaningful inferences about government organisations’ spending. We moved beyond the theory and had a go at doing this for 3 different government organisations
This is the focus of the next blog post in the series.