
Five building blocks of a data-driven culture

June 24, 2017

How can organizations leverage data as a strategic asset? Data comes at a high price. Businesses must pay for data collection and cleansing, hosting and maintenance, salaries of data engineers, data scientists and analysts, risk of breach and so on.
The line items add up. However, if done well, a thriving data-driven organization can reap huge rewards. Controlling for other factors, Erik Brynjolfsson et al. from MIT’s Sloan School of Management found that data-driven organizations have a 5-6 percent higher output and productivity than their less data-driven counterparts. They also had higher asset utilization, return on equity and market value. Other research shows that analytics pays back $13.01 for every dollar spent. Being data-driven pays!
Being data-driven requires an overarching data culture that couples a number of elements, including high-quality data, broad access, data literacy and appropriate data-driven decision-making processes. In this article, we discuss some of the key building blocks.
A single source of truth is a central, controlled and “blessed” source of data from which the whole company can draw. It is the master data. When you don’t have such data and staff can pull down seemingly the same metrics from different systems, inevitably those systems will produce different numbers. Then the arguments ensue. You get into a he-said-she-said scenario, each player drawing and defending their position with their version of the “truth.” Or (and more pernicious), some teams may unknowingly use stale, low-quality or otherwise incorrect data or metrics and make bad decisions, when they could have used a better source.
When you have a single source of truth, you provide superior value to the end users: the analysts and other decision makers. They’ll spend less time hunting for data across the organization and more time using it. Additionally, the data sources are more likely to be organized, documented and joined. Thus, with richer context about the entities of interest, users are better positioned to leverage the data and find actionable insights.
From the data administrator’s side, a single source of truth is preferable, as well. It is easier to document, prevent name collisions across tables, run data quality checks and ensure that the underlying IDs are consistent across the tables. It also is easier to provide flattened, easier-to-work-with views of the key relations and entities that, under the hood, may have come from different sources.
For instance, at WeWork, a global provider of co-working spaces, we provide our analytics users with a core table called the “activity stream,” a single narrow table that provides web page views, office reservations, tour bookings, payments, Zendesk tickets, key card swipes and more. The table is easy for users to work with, for example when slicing and dicing different segments of our members or locations, even though the underlying data comes from many heterogeneous systems. Moreover, having this centralized, relatively holistic view of the business means that we also can build more automated tools on top of those data to look for patterns in large numbers of different segments.
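To make the idea of a narrow, unified table concrete, here is a minimal Python sketch. It is not WeWork's actual schema: the `ActivityEvent` fields, the two adapter functions and the sample records are all hypothetical, chosen only to show how rows from different systems might be mapped into one shape and then sliced.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ActivityEvent:
    """One row of a narrow, unified activity table (hypothetical schema)."""
    member_id: str
    location_id: str
    event_type: str       # e.g. "tour_booking", "payment", "key_card_swipe"
    occurred_at: datetime
    source_system: str    # which upstream system the row came from


def from_tour_booking(row: dict) -> ActivityEvent:
    # Hypothetical tour-system record with keys: user, building, booked_at
    return ActivityEvent(row["user"], row["building"], "tour_booking",
                         datetime.fromisoformat(row["booked_at"]), "tours")


def from_payment(row: dict) -> ActivityEvent:
    # Hypothetical billing record with keys: member_id, location, paid_on
    return ActivityEvent(row["member_id"], row["location"], "payment",
                         datetime.fromisoformat(row["paid_on"]), "billing")


def events_per_location(events: list[ActivityEvent]) -> Counter:
    """Count events by (location, event type): one simple slice of the stream."""
    return Counter((e.location_id, e.event_type) for e in events)


# Made-up sample rows from two differently shaped source systems.
tours = [{"user": "m1", "building": "nyc-1", "booked_at": "2017-06-01T10:00:00"}]
payments = [
    {"member_id": "m1", "location": "nyc-1", "paid_on": "2017-06-05T09:30:00"},
    {"member_id": "m2", "location": "sf-2", "paid_on": "2017-06-06T14:00:00"},
]

# Every source is normalized into the same row shape, then ordered by time.
activity_stream = sorted(
    [from_tour_booking(r) for r in tours] + [from_payment(r) for r in payments],
    key=lambda e: e.occurred_at,
)
counts = events_per_location(activity_stream)
```

Because every row has the same few columns, a new source only needs one more adapter function, and any slicing logic written against the stream keeps working unchanged.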
In large organizations, there are often historical reasons why data are siloed. For example, large organizations are more likely to acquire data systems through company acquisitions, thereby resulting in additional independent systems. Thus, a single source of truth can represent a large and complex investment. However, in the interim, the central data team or office can still make a big difference by providing official guideposts: listing what’s available, where it is and, where there are multiple sources, the best place to get it. Everyone needs to know: “if you need customer orders, use system X or database table Y” and nowhere else.
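Such guideposts can start out as something very small. The sketch below is one possible shape for them, assuming a plain Python dictionary maintained by the data team; the dataset names, the "avoid" entries and the `where_to_get` helper are hypothetical placeholders, not a real catalog.

```python
# Hypothetical catalog of "official" sources. The dataset names and
# locations are placeholders, not real systems.
OFFICIAL_SOURCES = {
    "customer_orders": {
        "use": "database table Y",
        "avoid": ["legacy export from system Z"],
        "owner": "central data team",
    },
    "tour_bookings": {
        "use": "system X",
        "avoid": [],
        "owner": "central data team",
    },
}


def where_to_get(dataset: str) -> str:
    """Return the single blessed location for a dataset, or fail loudly."""
    try:
        return OFFICIAL_SOURCES[dataset]["use"]
    except KeyError:
        raise LookupError(
            f"No official source registered for {dataset!r}; ask the data team."
        ) from None
```

Failing loudly on an unregistered dataset is a deliberate choice in this sketch: it nudges people to ask the data team rather than guess at a source.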
Knowing where to get the data, and providing quality data, is only one ingredient. Users need to know what the data fields and metrics mean. You need a data dictionary. This is an aspect that trips up many organizations. When you don’t have a clear list of metrics and their definitions, people make assumptions, and those assumptions may differ from their colleagues’. Then the arguments ensue.
A business needs to generate a glossary with clear, unambiguous and agreed-upon definitions. This requires discussion with all the key stakeholders and business domain experts. First, you need buy-in to those official definitions; you don’t want teams going rogue with their secret version of a metric. Second, it is often not the core definition where people’s understandings differ but how to handle the edge cases. Thus, while everyone might have a common understanding of what an “orders placed” metric means, they may differ in how they want or expect to handle cancellations, split orders or fraud.
Those scenarios need to be laid out, discussed and resolved. A goal here is to collapse multiple similar metrics into a single common metric, or flesh out situations where you genuinely need to split one metric into two or more separate metrics to capture different perspectives.
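One way to keep edge-case decisions from drifting is to record them next to the metric definition itself. The following is a minimal sketch under stated assumptions: the `MetricDefinition` class, the two example metrics and the rules chosen for cancellations and fraud are hypothetical illustrations, not definitions taken from any real data dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricDefinition:
    """A data-dictionary entry whose edge-case rules are explicit."""
    name: str
    description: str
    include_cancelled: bool
    include_fraud: bool

    def count(self, orders: list[dict]) -> int:
        # Apply the agreed edge-case rules to each order record.
        total = 0
        for order in orders:
            if order["cancelled"] and not self.include_cancelled:
                continue
            if order["fraud"] and not self.include_fraud:
                continue
            total += 1
        return total


# Hypothetical outcome of the stakeholder discussion: two separate metrics.
ORDERS_PLACED = MetricDefinition(
    name="orders_placed",
    description="All orders submitted, including later cancellations; "
                "orders flagged as fraud are excluded.",
    include_cancelled=True,
    include_fraud=False,
)
NON_CANCELLED_ORDERS = MetricDefinition(
    name="non_cancelled_orders",
    description="Orders submitted and not cancelled; fraud excluded.",
    include_cancelled=False,
    include_fraud=False,
)

# Made-up sample data covering each edge case once.
orders = [
    {"id": 1, "cancelled": False, "fraud": False},
    {"id": 2, "cancelled": True, "fraud": False},
    {"id": 3, "cancelled": False, "fraud": True},
    {"id": 4, "cancelled": False, "fraud": False},
]

placed = ORDERS_PLACED.count(orders)
non_cancelled = NON_CANCELLED_ORDERS.count(orders)
```

On the sample data the two metrics give different counts for the same four orders, which is exactly the kind of disagreement a written definition is meant to settle in advance.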
For instance, at WeWork, prospective members check out our facilities by signing up for a tour. Importantly, some people may tour different locations, or come back for a second tour to show other members of the organization before signing off on their new space. While our various dashboards had a metric called “tours,” they didn’t align across teams. The process of creating a data dictionary fleshed out two different metrics:
Specificity in well-chosen names, and unambiguous definitions with examples, are key here. It is better to err toward longer but descriptive names, such as “non_cancelled_orders” or “Tours Created To Tours Completed Conversion %,” than shorter names that users think they understand.
Having clean, high-quality data, from a central source, and with clear metadata, is ineffective if staff can’t access it. Data-driven organizations tend to be very inclusive and provide access wherever the data can help. This doesn’t mean handing over the keys to all the data to all the staff; the CIO would never sign off on that! Instead, it means assessing the needs of individuals, not just the analysts and key decision makers, but across the whole organization, out to the front line of operations.
For instance, at Warby Parker, a retailer of prescription glasses and sunglasses, associates on the retail shop floor have access to a dashboard that provides details on their performance, as well as that of the store as a whole. At Sprig, a food-delivery company from San Francisco, even the chef has access to an analytics platform that they use to analyze the meals that have been ordered and understand which ingredients and flavors are popular or have not fared well, and so tailor the menu.
A large Fortune 100 financial conglomerate that hires data scientists from The Data Incubator’s fellowship is able to maintain a competitive edge in hiring compared to “sexy” Silicon Valley companies like Google, Facebook and Uber, partially through granting broad access to data for their data science team. And the access doesn’t just stop at data scientists: one of the products our alumni have worked on is building summary dashboards that automatically give customer service reps a visualization of the interaction history of the customer on the phone.
It is those front-line staff — the customer service agent dealing with an angry customer, or a warehouse worker facing a pallet of damaged product — who can leverage data immediately to determine best next steps. If suitably empowered, they are often also in the best position to resolve a situation, determine changes to workflow or handle a customer complaint.
Data-driven organizations need to foster a culture whereby individuals know what data are available (a good data dictionary, and generally seeing data being used in day-to-day decision making, both help) and, further, feel comfortable requesting access if they have a genuine use case. Red tape should be cut: there should still be an appropriate approval process and oversight, with systems in place so that access can easily be revoked if necessary, but staff should get access without too many hoops to jump through and without too many delays.
Finally, with broader access, and more users of analytical tools, the organization will need to commit to providing training and support. At WeWork, while our data team is available through Slack, email and service desk tickets, we also provide weekly office hours to help users with our business intelligence tools, SQL queries and any other aspects of the data.
In a data-driven organization with broad data access, staff will frequently encounter reports, dashboards and analyses, and they may have a chance to analyze data themselves. To do so effectively, they must be sufficiently data literate.
Data literacy is often a multi-pronged effort.