Identifying complementarities across tasks using two-part contracts. An application to family doctors

With Marcos Vera-Hernández. This is derived from a chapter of my PhD dissertation, Essays on the Economics of Health.

We propose an empirical test for determining whether rewarded tasks are cost complements or substitutes in a pay-for-performance scheme with kinks in linear task-specific reward functions. The test is based on the insensitivity of the effort exerted on a particular task to variations in the price of competing tasks for agents who are bunched near the kinks. As a case study, we consider the Quality and Outcomes Framework (QOF). This system accounts for nearly a quarter of family doctors' income and is the largest pay-for-performance (P4P) program for primary care services in the world. We found that the changes introduced in the system in 2011 applied to tasks that were complements of many of the unmodified tasks. As a result, there is no evidence of effort diversion due to the changes.

The following is an example of our procedure. The graph below shows the first step of the test for the indicator ASTHMA06 (proportion of patients with asthma who have had a review in the last 15 months). The graph presents the density of the indicator as a function of achievement within 10 pp. of the upper limit of the indicator (70%), where the P4P tariff drops to zero. It also presents two different approximations: a local linear polynomial on the left with a 1 SD window, and a restricted spline on the right that excludes observations between the upper limit and 5 pp. above it. In the graph on the left, a test of continuity of the log-density is implemented (McCrary). On the right, bunching is estimated as the density mass in the excluded interval that the parametric fit cannot account for (Saez; Chetty et al.). Both tests indicate that there is evidence of bunching, which in our model means that those observations just above the upper limit should be insensitive to changes in other tasks' marginal reward. In the second step, we found no evidence of differential responses to the 2011/12 changes between practices above and below the 70% threshold. Hence, we cannot conclude that this indicator is either a complement or a substitute of those indicators that were modified.
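The excluded-interval idea behind the right-hand panel can be illustrated with a minimal sketch. It is not the code used in the paper: it swaps the restricted spline for a simple polynomial fitted to binned counts, omits any correction for the mass displaced from other bins, and runs on simulated data rather than QOF data. The function name `excess_mass`, its parameters, and the default bin width and polynomial degree are all assumptions made for illustration.

```python
import numpy as np


def excess_mass(achievement, threshold=70.0, window=10.0,
                excluded=5.0, bin_width=1.0, degree=3):
    """Illustrative bunching estimate around a kink (hypothetical helper).

    Fits a polynomial to binned counts outside the excluded interval
    [threshold, threshold + excluded) and returns the excess count inside
    that interval, scaled by the average counterfactual count per bin.
    """
    # Bin achievement within +/- `window` pp. of the threshold.
    edges = np.arange(threshold - window,
                      threshold + window + bin_width, bin_width)
    counts, edges = np.histogram(achievement, bins=edges)
    centers = (edges[:-1] + edges[1:]) / 2.0

    # Bins just above the threshold are left out of the fit.
    in_excluded = (centers >= threshold) & (centers < threshold + excluded)

    # Counterfactual density: polynomial fitted on the remaining bins only.
    x = centers - threshold
    coefs = np.polyfit(x[~in_excluded], counts[~in_excluded], degree)
    counterfactual = np.polyval(coefs, x)

    # Mass in the excluded interval that the fit cannot account for.
    excess = counts[in_excluded].sum() - counterfactual[in_excluded].sum()
    return excess / counterfactual[in_excluded].mean()


if __name__ == "__main__":
    # Simulated data (an assumption): a flat background plus extra mass
    # placed just above the 70% upper limit.
    rng = np.random.default_rng(0)
    background = rng.uniform(60.0, 80.0, size=20_000)
    bunchers = rng.uniform(70.0, 72.0, size=2_000)
    print(excess_mass(np.concatenate([background, bunchers])))
```

A clearly positive value indicates more mass just above the upper limit than the smooth fit predicts, while a value near zero is what a density without bunching would produce. In practice the result is sensitive to the bin width, the polynomial degree, and the width of the excluded interval.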

