Friday, April 24, 2020

Data (Analytics) on COVID-19: Lessons for People Analytics

Data visualization, dashboards, and statistical modeling have been thrust into the spotlight because of the COVID-19 pandemic. I am not a biostatistician or an epidemiologist (not even an armchair one!) so I am not in a position to evaluate or criticize these visualizations and models. But I’m currently teaching a course on data and metrics for human resources, so there is an educational opportunity to consider lessons the spotlighted data (analytics) on COVID-19 might have for people analytics.

Let’s start with dashboards, which are common in people analytics. Here is a COVID-19 dashboard from Johns Hopkins:

It’s impressive in the amount of data it brings together and in the user’s ability to change views. You can certainly easily grasp the major metrics, numerically as well as graphically—which is the purpose of a dashboard, whether pertaining to COVID-19 or HR metrics such as employee headcounts. But as with all dashboards, there are at least three major questions. 
  1. Are these the right metrics for what you are trying to understand? It’s easy, for example, to find Twitter threads debating whether total deaths or deaths adjusted for country population is the better measure. But like many debates over metrics, rather than seeing this as a competition over which metric is better, it would be more productive to see various measures as complements that measure different aspects (e.g., total cases reflects the pace at which an outbreak is growing; per capita cases indicates strain on a health care system). 
  2. Are the data accurate or comparable, especially when collected from diverse sources? Do individuals within an organization have a self-interest to report data in certain ways? Or are there different capabilities that produce different measures? As Ryan Lamare reminds me, dashboards and data visualizations work best when there is a common baseline. Otherwise, users need to think carefully about what they're actually seeing and how they're interpreting it. In the COVID-19 case, for example, how should we interpret national comparisons of total tests when testing capacity differs? A similar example in HR might be a comparison of training numbers across units with different training capacities. 
  3. Beyond seeing the scope of a current situation, what actions can you take from metrics that are largely descriptive? This dashboard, for example, shows which areas have the most COVID-19 cases, but how do we act upon that information? An HR dashboard might reveal areas of an organization with low employee engagement, but is unlikely to reveal why.  
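A toy illustration of the first question, that metrics are complements rather than competitors (the numbers here are invented, not real case counts): the same raw data can rank two countries in opposite order depending on whether you look at totals or per-capita rates.

```python
# Hypothetical case counts and populations for two countries.
cases = {"A": 50_000, "B": 20_000}
population = {"A": 300_000_000, "B": 10_000_000}

# Cases per 100,000 residents.
per_capita = {c: cases[c] / population[c] * 100_000 for c in cases}

print("Most total cases:   ", max(cases, key=cases.get))            # A
print("Most cases per 100k:", max(per_capita, key=per_capita.get))  # B
```

Neither ranking is "right"—they answer different questions, which is exactly why a dashboard benefits from showing both.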

Next, here is a visualization from John Burn-Murdoch of the Financial Times that has also frequently been spotlighted:

This is a great visualization for seeing trends within countries, and across them, too, if you carefully remember what's being compared. In a people analytics context, this could be seen almost as a scorecard to see how your organization stacks up against others, or how areas within your organization compare to each other. But there are at least three things to be cautious about. First, there are the same concerns as with a dashboard—are these the right measures, the right comparisons, is there a common baseline, etc. (in fact, the source of the data for this visualization is the Johns Hopkins dashboard data, so the same concerns apply). Second, the nature of visualization tempts you to forecast into the future. But what’s the basis for that forecast?

For example, let’s go back to the March 15 version of the same visualization:
Based on this visualization, we might have projected that the U.S. would look more like South Korea, and that Spain was on the worst trajectory of all. Unfortunately, Spain has indeed been hard hit, but it’s been exceeded by the United States in terms of total cases. Moreover, I think our minds are tempted to draw single lines that project out from each trend line. Even if these lines capture the complicated curvature reflected in the trends to date (so you do a complicated rather than simplistic projection), there is still a major problem. Namely, this ignores forecast error—instead, we should also be trying to ascertain how much variability and uncertainty there is in any forecast, including HR-related projections. More broadly, in making any statistical inference, we should understand whether the sampling error is large or small, and thus the magnitude of the margin of error and the soundness of concluding that there is a meaningful relationship or result.
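To make the forecast-error point concrete, here is a minimal sketch (with made-up numbers, not real case data) of projecting an exponential trend forward with a rough interval around the single-line forecast:

```python
import numpy as np

# Toy data: two weeks of noisy, exponentially growing cumulative counts.
rng = np.random.default_rng(0)
days = np.arange(14)
cases = 100 * np.exp(0.25 * days) * np.exp(rng.normal(0, 0.1, 14))

# Fit a straight line to log(cases), i.e., an exponential trend.
slope, intercept = np.polyfit(days, np.log(cases), 1)
resid_sd = np.std(np.log(cases) - (intercept + slope * days), ddof=2)

# Project two weeks ahead: a point forecast plus a crude 95% band based
# only on residual noise. Real forecast uncertainty is even larger, since
# the fitted slope and intercept are themselves uncertain.
future = np.arange(14, 28)
point = np.exp(intercept + slope * future)
low = np.exp(intercept + slope * future - 1.96 * resid_sd)
high = np.exp(intercept + slope * future + 1.96 * resid_sd)

print(f"Day 21: {point[7]:,.0f} cases (plausible range {low[7]:,.0f} to {high[7]:,.0f})")
```

Even this understated band is a reminder that the single projected line is only the middle of a range of plausible futures.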

A third caution for people analytics that we can take away from this visualization is a reminder that this metrics-focused approach doesn’t inquire as to what factors influence the trends portrayed. Note that it doesn’t claim to, so this isn’t a criticism per se. Rather, it’s a reminder that if you want to act upon information by, for example, implementing new HR initiatives, you should always be asking what’s influencing the metrics. What levers can you nudge that will change the metrics in the desired ways? Even if you can’t estimate an actual regression, it can be helpful to approach problems with that mindset—what variables would you like to include in a regression to explain the metric? In the absence of a regression, is there other evidence to support the importance of these factors? What’s missing from your (mental) model? 

Thinking about factors that influence a trend or a metric represents a shift from a metrics approach to more of a predictive analytics approach. In the COVID-19 pandemic, this is reflected in the importance of statistical models for policy-making—for example, using predictions from models for implementing stay-at-home orders. Let's consider two broad approaches.

One approach to modeling the spread of COVID-19 essentially tries to figure out the shape of the curves in the above visualizations by fitting statistical parameters to the curves that are the most complete (e.g., China, Italy). If you then assume that the lagging countries (or other geographical units) are on an earlier part of that same curve, then you can predict where those countries are headed. This is the approach of the Institute for Health Metrics and Evaluation (IHME):

Importantly, note the shaded area which reflects a 95% confidence interval. And note that it’s quite large for the immediate future. This is a good reminder for people analytics that estimates are just estimates. There is always uncertainty, and it’s important to understand the magnitude of that uncertainty before making decisions.

But note that this curve-fitting approach is akin to a data mining exercise. There is no epidemiological model that underlies these forecasts. In HR, this would be like observing the retirement ages of previous workers, and predicting a particular worker’s retirement probability based solely on their age. There’s no accounting for that person’s particular characteristics or changes in the environment particular to that person.
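A toy version of the curve-fitting idea, with simulated rather than real counts: assume the cumulative curve is logistic, fit its parameters to the partially observed data, and read off the projected eventual total. (The fit looks reassuring here only because the data really were generated from the assumed shape—which is precisely the leap of faith this approach makes.)

```python
import numpy as np

def logistic(t, K, r, t0):
    """Cumulative logistic curve: eventual total K, growth rate r, midpoint t0."""
    return K / (1 + np.exp(-r * (t - t0)))

# Simulated outbreak: true eventual total of 80,000, observed with noise.
rng = np.random.default_rng(1)
t = np.arange(60.0)
observed = logistic(t, 80_000, 0.2, 30) * np.exp(rng.normal(0, 0.03, t.size))

# Fit by brute-force grid search, using only the first 40 days,
# as if the outbreak were still unfolding.
fit_t, fit_y = t[:40], observed[:40]
best, best_err = (None, None, None), float("inf")
for K in np.linspace(40_000, 160_000, 61):
    for r in np.linspace(0.05, 0.4, 36):
        for t0 in np.linspace(20, 40, 21):
            err = float(np.sum((logistic(fit_t, K, r, t0) - fit_y) ** 2))
            if err < best_err:
                best, best_err = (K, r, t0), err

K_hat, r_hat, t0_hat = best
print(f"Projected eventual total: {K_hat:,.0f}")
```

Note that nothing in the fit encodes how disease transmission actually works; it is pure shape-matching.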

As an alternative modeling strategy, a long-standing epidemiological approach is the susceptible (S)-exposed (E)-infected (I)-recovered (R) model (SEIR, for short) (or alternatively, a SIR model with three classes: susceptible, infected, and recovered individuals). A SEIR model starts with the number of susceptible, exposed, infected, and recovered individuals, and then sets up a formulaic relationship across the categories based on estimates of incubation periods, frequency of contact across individuals, the probability of being infected after exposure, and the like. The spread of COVID-19, hospitalization usage, and other outcomes can then be simulated by projecting out what happens as exposure and infection increases. And by changing key parameters, you can also forecast alternative scenarios, such as the impact of various social distancing measures. This type of model is being used to guide public policy in Minnesota. 
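As a sketch of the mechanics (with illustrative parameter values I've made up, not estimates from any actual COVID-19 model), a minimal discrete-time SEIR simulation looks something like this:

```python
def simulate_seir(beta, sigma=1/5, gamma=1/10, N=1_000_000, days=365):
    """Minimal discrete-time SEIR model; returns peak simultaneous infections.

    beta: daily transmission rate; sigma: 1/incubation period;
    gamma: 1/infectious period. All values here are illustrative only.
    """
    S, E, I, R = N - 10.0, 0.0, 10.0, 0.0
    peak = 0.0
    for _ in range(days):
        new_E = beta * S * I / N   # susceptible -> exposed
        new_I = sigma * E          # exposed -> infected
        new_R = gamma * I          # infected -> recovered
        S -= new_E
        E += new_E - new_I
        I += new_I - new_R
        R += new_R
        peak = max(peak, I)
    return peak

# Changing a key parameter yields an alternative scenario: halving the
# transmission rate (a crude stand-in for social distancing) flattens the peak.
print(f"Peak infections, baseline:   {simulate_seir(beta=0.4):,.0f}")
print(f"Peak infections, distancing: {simulate_seir(beta=0.2):,.0f}")
```

Real models are far more detailed (age structure, hospitalization flows, calibrated parameters), but the scenario logic—change an assumption, rerun, compare—is the same.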

An analogous people analytics example would be a workforce planning model where you start with the current number of employees and make assumptions about retention rates, mobility, hiring rates, and future needs. This creates forecasts into the future, and by changing different assumptions, you can model alternative scenarios, forecast shortfalls, and infer needed responses.
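A toy version of such a workforce planning model (all numbers invented for illustration):

```python
def project_headcount(start, annual_retention, hires_per_year, years=5):
    """Project year-end headcount under assumed retention and hiring rates."""
    headcount = float(start)
    path = []
    for _ in range(years):
        headcount = headcount * annual_retention + hires_per_year
        path.append(round(headcount))
    return path

demand = 1_100  # assumed headcount needed in five years

base = project_headcount(1_000, annual_retention=0.88, hires_per_year=120)
churn = project_headcount(1_000, annual_retention=0.80, hires_per_year=120)

print("Base scenario:          ", base, "-> shortfall:", demand - base[-1])
print("High-attrition scenario:", churn, "-> shortfall:", demand - churn[-1])
```

As with the SEIR model, the value comes less from any single projection than from comparing scenarios: a modest drop in assumed retention turns a manageable shortfall into a large one.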

Note that there is expert judgement or past empirical trends built into this model—it’s not just curve fitting. And a realistic recognition of the range of uncertainty around the underlying assumptions yields confidence intervals that help inform how strongly you should interpret the results. These confidence intervals, or estimates of uncertainty, can be seen here (in red) for the Minnesota modeling of COVID-19, and at the same time, note the modeling of different scenarios (rows) and the estimated impact on different metrics (columns):

But important questions can always be asked, such as where do the assumptions and parameters come from (especially when trying to model a new issue), how much do they vary by different groups (e.g., age groups in the COVID-19 case; occupations in a workforce planning model), how fully-specified are the relationships, and are there important things that are missing? It’s also important to consider the decision-making criteria. In social science research and people analytics, we might be looking for results that characterize a typical (i.e., average) situation; in a public health crisis, it’s likely more important to identify how to avoid worst case scenarios.

Unfortunately, the IHME's curve-fitting model and Minnesota's SEIR model give very different predictions of where we're headed. Both approaches contain significant unknowns: in the curve-fitting approach, how well (or not) states or countries fit the earlier experiences of China (which had much stricter social distancing) and Italy, given the many variables that presumably affect how an outbreak spreads; in the SEIR approach, whether key parameters are accurate, because COVID-19 is a new virus. This highlights the importance of understanding the nature and limitations of any kind of statistical model, and paying attention to the sensitivity of the results. The starkly-different projections of these particular models are also a reminder that actions based on statistical models will only be as good as the explanatory power or fit of those models. Ideally, imprecision in the degree of fit will translate into margins of error and confidence intervals, but if a model is being applied to a new situation, then purely statistical margins of error may be too conservative. The onus is always on the decision-maker to use their subject-matter expertise when interpreting and applying statistical results. But what to do when you have to make a decision? Explicitly recognize the decision rules and include the costs of making different types of inferential errors in any decision calculus.   

Putting all of this together, then, a good people analytics person is always skeptical—or at least probing…where did the data and assumptions come from, how do we know they are accurate, how sensitive are the results to particular assumptions, how much uncertainty is there, what are the decision-making criteria, what’s missing? And notice that this is as much about subject-matter expertise—whether that's infectious diseases or human resources—as it is about statistical sophistication. It's not just data mining.

It might also be useful to note that neither of these modeling strategies (curve-fitting or simulation based on parameterized flow models) match the dominant predictive approach in HR, especially in HR research (I don’t say this as a critique, just as another point of comparison). From a social sciences perspective, it’s much more common to predict outcomes in a regression framework where an outcome variable is modeled as a statistical function of a set of explanatory variables. For example, if employees’ level of engagement with their supervisor (inversely) predicts an intention to quit, then if an organization can increase engagement, we’d expect that quit probabilities would decrease, albeit imperfectly and with variation. This is a reminder that analytically, some issues are best modeled as societal phenomena, some at an organizational level, and some at an individual level. They each involve unique measures, and their own analytical challenges. A good people analytics person matches the methods and data to the problem—while still being probing as defined in the previous paragraph.    
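A simulated sketch of that regression mindset (the data and the effect size are invented, and in a real setting the estimated association need not be causal):

```python
import numpy as np

# Simulate employees: a standardized engagement score and a binary
# intention-to-quit outcome with a built-in negative relationship.
rng = np.random.default_rng(42)
n = 2_000
engagement = rng.normal(0, 1, n)
true_logit = 0.5 - 1.2 * engagement          # assumed "true" effect: -1.2
quit_intent = rng.random(n) < 1 / (1 + np.exp(-true_logit))

# Estimate a logistic regression (intercept + engagement slope)
# by plain gradient descent on the log-loss.
X = np.column_stack([np.ones(n), engagement])
w = np.zeros(2)
for _ in range(5_000):
    p = 1 / (1 + np.exp(-X @ w))
    w -= 0.1 * X.T @ (p - quit_intent) / n

print(f"Estimated engagement effect on quit intention: {w[1]:.2f}")  # should be near -1.2
```

Here the estimate recovers the built-in relationship because the simulation guarantees it; in practice, the probing questions above—what's missing from the model, is the relationship causal—still apply.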

Lastly, COVID-19 dashboards and modeling raise challenging ethical questions. What data are being collected and how are they being used? Are metrics and results being presented in sensationalized or inaccurate ways? What’s the role of modeling in determining public policy decisions? There are no easy answers to these and other ethical challenges, but they are a good reminder that people analytics also involves important ethical challenges. How is employee data being used? What kind of consent should be required? How transparent is the decision-making? Are implicit biases embedded in modeling decisions furthering rather than redressing historical inequalities? Throughout the people analytics process, it’s essential to remember that most data, and certainly most decisions, pertain to real people, not data points in a database or costs on an income statement. The science of people analytics is important, but so is the humanity. And in terms of presenting data in skewed ways, this has long been recognized as a danger with statistics, and perhaps the best defense is to be a wise consumer of statistics who doesn't naively take everything at face value (see "probing" above). 

In closing, it’s nice to see data visualization, dashboards, and statistical modeling getting such public attention, but it’s obviously unfortunate that this is because of a global pandemic that has harmed so many people and communities. While not losing sight of what’s most important, there are also lessons here for people analytics.
