Limn Deep Diagnostics

The women waited on the prickly grass, their babies hanging from nearby trees in brightly coloured string bags, too-quiet children on their laps. One by one they ascended the veranda steps to the blue Formica table, where the nurse asked them the questions they had heard many times before. “Skin hat?” “Kai kai?” “Pek pek wara?” “Kus?” Do they have a fever? Have they eaten? Do they have loose stools? Do they have a cough? The women sat rigid on the hard bench and whispered barely audible replies. A thermometer was placed delicately under an armpit. The nurse listened to a child’s breathing with a stethoscope. A clinic book, detailing a child’s previous visits to the clinic, was cursorily examined.

What were the options here? Pneumonia, malaria, diarrhea, hopefully not tuberculosis. The nurse was so familiar with the symptoms and the treatment possibilities that she rarely opened the small standard treatment book that sat on the neatly organized table next to her. Most of the children were given antimalarials (chloroquine with Fansidar), antibiotics (amoxicillin) and panadol. The mothers of the very sick ones (bikpela sik) were asked to come back if their child did not improve. They walked away in the blinding sun, carrying their children in their arms and their babies, parceled in their woven string bags, on their heads.

In 2004, when I visited Begasin Health Centre in Usino Bundi district, Papua New Guinea, diagnosis at a rural health clinic meant aligning a patient’s symptoms with available treatments. Some rudimentary diagnostic tools were available: a stethoscope, a thermometer, a sphygmomanometer. But most community health workers and nurses depended on a combination of clinical judgment and syndromic algorithms from standard treatment books to undertake what medical practitioners call “empirical diagnosis.” When the prescribed treatment did not work and patients returned to the health centre sicker than when they had left, the health workers would scour the standard treatment book for other possibilities: tuberculosis, meningitis, dengue. There was no laboratory here, no way to test for these diseases, and very sick patients were referred to the general hospital in the coastal capital, several hours’ walk and a long bus journey away.

There was a microscope at Begasin Health Centre (possibly a remnant of earlier attempts to extend microscopy services into rural areas, or perhaps a one-off donation from a development agency or NGO), but no one knew how long it had been there or how to use it, and no one had the key to the wooden cabinet in which it was kept. Inside the clinic, a surplus box of microscope slides propped the window open, providing welcome ventilation to the humid, tin-roofed room.

The routine medical protocols I observed on the veranda of Begasin Health Centre in 2004 were a far remove from laboratory-based gold standards for medical diagnosis, yet they did comply with the standards for rural primary health care in low- and middle-income countries. At the time, the WHO recommended that anyone presenting with fever in a malaria-endemic area with no access to microscopy services should be treated presumptively with antimalarials. Empirical diagnosis based on clinical judgment was considered the only way for curative medicine to proceed in places where a lack of technical and transportation infrastructure and expertise precluded the extension of laboratory services.

Yet even as I observed the routine dispensing of antibiotics and antimalarials at Begasin Health Centre, elsewhere the norms for basic care in resource-limited settings were changing. Growing antimicrobial resistance to first-line drugs, such as those for malaria and tuberculosis, and the heightened cost of new drugs were drawing attention to the human and economic cost of empirical diagnosis and the overtreatment it generates. Nonetheless, the technology and expertise necessary for more accurate laboratory diagnosis simply were not present in primary health care settings in many low- and middle-income countries, where the transportation, electrification, communication, and sanitation infrastructure that laboratories depend on did not reach.

A novel solution to this dilemma emerged in the late 1990s, with the development and release to market of a handful of malaria rapid diagnostic tests (MRDTs). These lateral-flow immunochromatographic tests used isolated antibodies to bind with malaria parasite antigens present in a blood sample. A positive test resulted in the appearance of a thin line in the test window where the antigen-antibody interaction occurred. MRDTs were not as accurate as laboratory-based microscopy and, as the number of tests available on the market proliferated, concerns about disparity in the quality of devices and a lack of regulation in many low- and middle-income countries also grew. Nonetheless, these small devices had a significant advantage over laboratory-based assays: they were mobile.

Malaria rapid diagnostic test kits were transportable to places with limited road access. They compressed the time between test and result and therefore reduced the risk of losing patients to follow-up. They were affordable (with prices at around $1-$2 per testing kit) and easy to use, meaning they did not require a laboratory technician to read them. MRDTs extended the reach of laboratory medicine in two directions. First, they revealed the presence of pathogens hidden deep in the recesses of the diseased body. Second, they were designed to penetrate the farthest edges of the health system. Global health had entered the age of deep diagnostics.

Public Needs, Private Goods

The excitement that surrounded point-of-care diagnostic devices following the arrival of the MRDT turned on their potential to make the physical extension of laboratory infrastructure unnecessary. But the shift from laboratory to test also brought a wholly different—and equally problematic—infrastructure into view: the market.

The development of MRDTs by biotechnology companies brought the absence of comparable point-of-care testing devices for other treatable infectious diseases in low-income countries into sharp relief and spurred demands for their development. In 2006, for example, Médecins Sans Frontières (MSF) marked World TB Day by pointing to the “urgent need for ‘a simple test which yields results almost instantly and can be used by any laboratory technician, nurse or health workers even when far away from a laboratory.’” Campaign groups and public health experts made similar calls for diagnostics for neglected tropical diseases, such as trypanosomiasis and visceral leishmaniasis. Diagnostic devices are commodities, and their nonexistence was explained through the frame of market failure. The WHO focused on disincentives for industry to invest in the technology, including prohibitive R&D costs, a lack of regulation, uncertainty about market size, and concern about the ability of governments to pay for tests (AMS 2009: 9; WHO 2006). It discussed the need “to stimulate and facilitate the diagnostics industry to adapt available technologies to develop new diagnostics” (WHO 1998: 2), and called for partnership and engagement between the public sector and industry. In 1997, in an innovative move, the WHO organized a joint convention with industry to identify feasible TB tests for development. The premise of the convention was that public health experts could identify the tests that were needed, while industry representatives could help identify those that were most feasible (WHO 1997).

Emphasis on partnership gained momentum in the early 2000s, when the Bill and Melinda Gates Foundation entered the fray, adding diagnostics to its focus on drugs and vaccines within its mission to find technical solutions to global health challenges. The Gates Foundation had already invested in the establishment of novel public-private partnership arrangements for the development of life-saving drugs (DNDi) and vaccines (Gavi). In 2003, it donated $30 million to establish FIND, a nonprofit organization based in Geneva and often referred to as a “product development partnership,” with a remit of helping promising diagnostic developers to overcome development, regulatory, and market challenges. It also gave significant sums to PATH, a Seattle-based nonprofit that develops new diagnostic tests, undertakes market research, and builds partnerships with industrial manufacturers.

By the middle of the decade, the global health community widely accepted that “strategic efforts to build laboratory capacity must be pursued urgently by partnerships between public (national and international), private and commercial sectors to address this health care crisis” (Petti et al. 2006: 380). With the articulation of a need for diagnosis segueing into the need for point-of-care diagnostics, work to improve the diagnosis of treatable diseases in resource-limited settings became concomitant with the work of “stimulating” and “shaping” markets for global health. These efforts to incentivize diagnostic development led to the creation of a whole array of market-making techniques, methods and devices, designed to align the necessary with the feasible, which are ancillary to the diagnostic device itself.

Market Devices

So the world needs diagnostics, but which diagnostics? Not only are there multiple candidate diseases for which diagnostics might be developed, there are also multiple possible ways to test for any single disease, from rapid antigen-based assays to molecular-level PCR. Where a test is embedded in a patient care pathway, its infrastructural requirements, what kind of sample is obtained and how (finger-prick, intravenous blood, saliva, vaginal swab, sputum, urine), and what the test seeks to detect (antigens, antibodies, biomarkers, pathogens) all determine what kind of information a test generates, how accurate that information is, and what can be done with it.

For example, a simple, affordable and easy-to-use test for tuberculosis with high sensitivity (ability to capture positive cases) and low specificity (ability to exclude negative cases) could be used at a peripheral health care setting to triage patients but not to make treatment decisions. Positive cases would need to be sent for confirmatory testing to ensure people are not treated with highly toxic drugs unnecessarily. A point-of-care non-sputum-based biomarker test with high sensitivity and specificity may enable positive diagnosis, but will not necessarily enlighten health workers about drug resistance or susceptibility.
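The triage logic described here can be made concrete with a little arithmetic. The Python sketch below computes what share of positive and negative results would be correct for a test with high sensitivity and low specificity. The prevalence, sensitivity, and specificity figures are illustrative assumptions chosen for the example, not values for any real tuberculosis test.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Share of positive results that are true cases (Bayes' rule)."""
    true_pos = prevalence * sensitivity
    false_pos = (1.0 - prevalence) * (1.0 - specificity)
    return true_pos / (true_pos + false_pos)


def negative_predictive_value(prevalence, sensitivity, specificity):
    """Share of negative results that are truly disease-free."""
    true_neg = (1.0 - prevalence) * specificity
    false_neg = prevalence * (1.0 - sensitivity)
    return true_neg / (true_neg + false_neg)


# Hypothetical triage test: 5% of patients tested have the disease,
# sensitivity is 95%, and specificity is only 70%.
triage_ppv = positive_predictive_value(0.05, 0.95, 0.70)
triage_npv = negative_predictive_value(0.05, 0.95, 0.70)

print(f"Positive results that are true cases: {triage_ppv:.1%}")
print(f"Negative results that are correct:    {triage_npv:.1%}")
```

Under these assumed figures only about one positive result in seven is a true case, which is why positives would need confirmatory testing before treatment, while a negative result is reliable enough to rule most patients out.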

For every disease, a multitude of tests with different performance characteristics are possible. How should diagnostic developers decide in which tests to invest their time and resources? Market logic demands that, if investors are going to invest in diagnostics, and developers are going to embark on lengthy R&D programs, they need to know there will be demand for the end product. Identifying which tests are “needed”—and therefore which tests future customers (bilateral agencies, philanthropic foundations such as the Clinton Foundation, and international organizations such as the Global Fund) are most likely to buy—has therefore become a crucial step in fostering markets for diagnostic devices.

A range of market-making techniques, methods, and devices has been developed or borrowed to help define diagnostic needs and align them with industry-led solutions. Here are three of them:

1. Forecasting

In 2004, in collaboration with the RAND Corporation, the Gates Foundation established the Global Diagnostics Forum (GDF), an interdisciplinary research group with the goal of identifying which diagnostic tests are likely to have the most health impact and stimulating interest in such tests among the global health community. As Deborah C. Hay Burgess explained in the forum’s subsequent special supplement of Nature, “An initial step in developing a rational strategy for creating diagnostic technologies for global health is to determine the need for, and the health impact of, potential new tests” (Hay Burgess et al. 2006: 2).

The forum used mathematical modeling techniques to predict the impact (measured in lives saved and disability-adjusted life years [DALYs]) of hypothetical tests in six disease areas (acute lower-respiratory infections, HIV/AIDS, diarrheal diseases, malaria, tuberculosis, and sexually transmitted infections). The GDF models quantified the difference between the status quo, in which empirical diagnosis is the norm in peripheral areas, and a future populated with rapid point-of-care tests.

The chief finding was that higher-accuracy tests, requiring more advanced infrastructure, would have a lower overall impact on disease burden than less-accurate tests that could be used in more peripheral facilities and therefore reach a greater number of people. For instance, a syphilis test requiring minimal laboratory infrastructure was calculated to prevent more than 138,000 congenital syphilis cases and more than 148,000 stillbirths annually. A test that could be performed with no laboratory infrastructure could prevent more than 201,000 congenital syphilis cases and 215,000 stillbirths annually (Urdea et al. 2006: 75; Keeler et al. 2006). Deeper penetration of the health system trumped the scientific penetration of biological matter. The impact of point-of-care diagnostic tests could be greater than that of gold-standard laboratory testing, so long as they were ambitiously distributed.
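The trade-off between accuracy and reach can be illustrated with a deliberately crude calculation. The Python sketch below is not the GDF model: the function name, the case count, and the access and sensitivity figures are all hypothetical, and the calculation ignores false positives, treatment availability, and every other step between a test result and a health outcome.

```python
def cases_detected(cases, access, sensitivity):
    """Cases found = true cases x share who can reach the test x sensitivity."""
    return cases * access * sensitivity


CASES = 100_000  # hypothetical number of true cases in a population

# Hypothetical laboratory test: very accurate, but reachable by few patients.
lab_test = cases_detected(CASES, access=0.30, sensitivity=0.98)

# Hypothetical point-of-care test: less accurate, but reachable by many more.
poc_test = cases_detected(CASES, access=0.80, sensitivity=0.85)

print(f"Laboratory test finds    {lab_test:,.0f} cases")
print(f"Point-of-care test finds {poc_test:,.0f} cases")
```

With these assumed inputs the less accurate test finds more than twice as many cases, simply because more patients can reach it. The result depends entirely on the access figures fed into the calculation.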

The scientific calculations that the GDF put forward made a forceful case for global health funders to invest in the development and procurement of rapid, portable, point-of-care diagnostic devices. Yet for all their apparent numerical objectivity, the GDF forecasts also depended on the construction of a compelling story about what global health “impact” looks like.

First, the GDF focused on the potential for point-of-care diagnostics to bring about some improvement, however minimal, for populations with inadequate access to diagnostic technologies: “We consider a new test to represent an improvement if it saves more adjusted lives than would be saved in the status quo” (Girosi et al. 2006: 6). This humanitarian calculus side-stepped tricky ethical questions about global health inequity, including whether it is acceptable for patients at peripheral facilities in low- and middle-income countries (LMICs) to receive a less-accurate diagnostic test than patients with access to laboratory services in wealthier countries or regions (see also Moran, this issue).

Second, the GDF forecasts implicitly abandoned older developmental visions of large-scale infrastructure development, accepting that the electrification and transportation infrastructures necessary for laboratories were unlikely to be extended uniformly across LMICs. In the GDF forecasts, the health centers where point-of-care tests were used would all remain disconnected from centralized electrification, transportation, sanitation, and communication infrastructures into the future. This was acknowledged in an aside made in one of the publications resulting from the project:

“Although it is outside the scope of this paper, another method for improving health outcomes that could be approached in parallel to improving diagnostic tests would be enhancing the infrastructure and staffing available at these health-care settings. This approach would, in turn, allow the facilities to adopt better tests that might be available today or in the future. For instance, improving infrastructure and staffing could allow nucleic-acid-based tests for STIs to be adopted in more health-care settings” (Girosi et al. 2006: 8).

The GDF forecasts included calculations about the likely availability and success of treatment at different levels of health facility in different countries, but tenuous links between diagnostic test and treatment were, for the most part, glossed over. For example, the forecasts made no mention of the complexities of rolling out smooth medical supply systems, health-worker training, and treatment protocols in health settings lacking basic infrastructure. As critical global health scholars have shown, whether a test is used, how it is interpreted, and how it is acted on each depend on local institutional histories, relationships and expectations (e.g. Beisel et al. 2016; Chandler et al. 2011). The conflation of test availability with treatment created the impression that diagnostic devices have a direct impact on disease itself, occluding the many contingent steps in the diagnostic process, and focusing attention on the device itself as a worthy investment for global health funders.

Last, the GDF forecasts generated a vision of universal access to point-of-care testing that was, in some respects, no less grand than older developmental schemes. This was a vision in which there are tests for everything and tests everywhere. These tests would not be as accurate as laboratory tests that require carefully calibrated machines, refrigerated reagents, and highly trained technicians, but through sheer ubiquity they would save more lives than the best laboratory tests. This was a vision for a health infrastructure that is modest in quality but ambitious in reach.

Ultimately, the “success” of the GDF forecasts depended less on their scientific accuracy in predicting the future than on their capacity to convince funders and developers that diagnostics have humanitarian, public health, and economic value. The objective was to “articulate the acute need for diagnostic tools” and “encourage technology developers in the public and private sectors to do more to accelerate the development and delivery of new diagnostic solutions” (Hay Burgess et al. 2006: 2).

2. Consensus making

The GDF harnessed mathematical modeling techniques to evidence the need for specific diagnostics and incentivize funders and industry. However, time and again, the accuracy of mathematical forecasting has been shown to vary wildly. In 1967, the RAND Corporation published an influential paper outlining a new forecasting method, based on the generation of consensus among a community of experts. Ultimately, the author stated, mathematical models are only as good as the experts who provide the input values, so why not make this dependence on experts explicit and refine the process? The solution outlined in that paper, called the Delphi method, was first developed to forecast the impact of technological change on warfare. It was underpinned by the idea that groups are better at predicting the future than individuals, and that anonymity will encourage flexibility and safeguard against status-based influence. A questionnaire was sent out to selected experts in the field. Their answers were anonymously summarized by a facilitator, who laid out common and conflicting viewpoints and reasoning and asked participants to revise their answers to the questionnaire in light of these responses. Over several rounds, the group was expected to move towards a consensus about what is most likely to happen.
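The round-by-round mechanics of the method can be sketched in a few lines of Python. This is a toy illustration under strong assumptions, not a description of how any real Delphi exercise was run: the facilitator's summary is reduced to a single group median, every expert is assumed to move the same fraction of the way (the hypothetical `pull` parameter) toward it, and the starting estimates are invented.

```python
from statistics import median


def delphi_round(estimates, pull=0.5):
    """One round: feed back the anonymous group median, then each expert
    revises their estimate part of the way toward it (assumed behavior)."""
    summary = median(estimates)
    return [e + pull * (summary - e) for e in estimates]


def run_delphi(estimates, rounds=3, pull=0.5):
    """Return the estimates after each round, starting with the originals."""
    history = [list(estimates)]
    for _ in range(rounds):
        estimates = delphi_round(estimates, pull)
        history.append(estimates)
    return history


def spread(estimates):
    """Gap between the highest and lowest estimate in the group."""
    return max(estimates) - min(estimates)


# Hypothetical first-round answers from five anonymous experts.
initial = [10.0, 20.0, 25.0, 40.0, 80.0]
history = run_delphi(initial)

for number, answers in enumerate(history):
    print(f"Round {number}: spread = {spread(answers):.2f}")
```

In this simplified version the gap between the most extreme answers halves with every round, while the group median never moves. Convergence of this kind shows that the experts have come to agree with one another, not that they are right.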

Since the mid-20th century, the Delphi method has metamorphosed into a facilitation tool for the management of multi-stakeholder projects and is especially popular in global health. In the context of global diagnostics, it is not used to reconcile the predictions of different stakeholders, but to establish which futures, in the form of specific tests, are most desirable. In 2014, for example, the Global TB Programme of the World Health Organization employed the Delphi method to identify priority diagnostic tests for tuberculosis (WHO 2014). The “experts” consulted in the Delphi process included 24 participants from technical agencies and researchers (all but one based in the northern hemisphere); seven participants from funding organisations; five participants from supranational TB reference laboratories; five implementers and clinicians (all but one from institutions in the northern hemisphere); and six representatives from countries with a high burden of TB. The process resulted in agreement on three diagnostic priorities: (i) a point-of-care, biomarker-based, non-sputum-based test to detect TB; (ii) a point-of-care test that could be used for triage; (iii) a point-of-care sputum-based test that could be used as a replacement for smear microscopy. These were taken forward to a subsequent meeting with industry, where product profiles for the tests were agreed on.

The use of the Delphi method in this context raises questions about who is included in and excluded from processes of defining global health needs. As one WHO representative put it to me, “The process works if you have the right experts.” But who are the “right” experts? Some lines of exclusion were explicit: for example, WHO rules designed to safeguard against the influence of commercial interests dictated that industry representatives were excluded from the process. Others were more opaque: the group was dominated by academics and public-health professionals from funders and global health organizations based in Europe and North America. In an indication of the extent to which the process of identifying needs was driven by market logic, these experts were also key individuals likely to influence their organization’s future procurement policies. Overall, out of 46 invited participants in the Delphi process, twelve were based at institutions in low- and middle-income countries with a high burden of tuberculosis.

3. Profiling

Needs must be met with solutions, and while it is sometimes deemed appropriate to exclude industry representatives from the definition of global health needs, their participation in the finding of solutions is presumed to be crucial if those solutions are going to be feasible. In 2014, following the use of the Delphi method to ascertain priority needs, the WHO hosted a meeting in Geneva where industry representatives were invited to help develop performance specifications (sensitivity, specificity, shelf life, infrastructure requirements, cost) for the priority tests. The final specifications were subsequently published in the form of four target product profiles (TPPs).

The TPP was a device originally designed by the FDA in the late 1990s to improve communication with the pharmaceutical industry during the drug-development process. Over the past decade, the TPP has found a new home among global health initiatives as a technique for reconciling needs with solutions, demand with supply. A TPP that has had input from funders, regulators, users, and industry not only describes a goal, in the form of a diagnostic test, but is intended to make its achievement more likely. TPPs, as one WHO representative explained to me, “are aspirational.” They are guidance for the manufacturer on what kind of tests agencies are willing to procure. At the same time, the role of industry in defining those characteristics means they are more likely to be met.

At the 2014 meeting, the writing of TPPs involved negotiations between advocacy groups and industry over the correct pricing of the tests, and between users and industry about the kind of temperature stability that would be required. The TPP convention of recording “minimum” and “optimum” specifications for each of these items meant that some degree of difference between stakeholders could be tolerated in the final profile. The TPP brought the desirable within touching distance of the possible.

Conclusion

The arrival of the malaria rapid diagnostic test fundamentally changed ideas about what kind of medicine was feasible and desirable at the periphery. It became possible to imagine that microbes, parasites, or viruses, which are imperceptible to the human eye, could be identified in bodily fluids by a health worker with basic training in a health facility with no running water, electricity, or laboratory equipment. In changing what was technically possible, the rapid diagnostic test kit also transformed expectations for everyday medicine in resource-limited settings. The human cost of misdiagnosis and unnecessary treatment, which previously had been accepted as the necessary cost of universal access to health care in places without a laboratory, now became an aberration demanding action.

The arrival of mobile point-of-care diagnostics also presaged a shift in the problematization of diagnosis in global health, from how to extend laboratory infrastructure to how to stimulate markets for mobile devices. Once the need for diagnosis in peripheral primary-care settings became commensurate with the need for diagnostic devices, the substantial problem-solving apparatus of global health institutions in Europe and the United States was focused on overcoming the challenge of market failure.

Ironically, given that the “need” for diagnostics in global health was framed as the motivation behind these activities, they also were aimed at specifying those needs. Once identified, needs also had to be articulated with feasible solutions—that is, small, portable, marketable diagnostic devices. The alignment of public needs with private solutions required a fine-tuned array of techniques, methods, and devices that would align the desirable with the feasible at the same time as they kept public and private interests distinct. Whether this has had any impact on the care provided at Begasin Health Centre in Papua New Guinea is another story.