Introduction
The quote from John Naisbitt, “we are drowning in information but starved for knowledge” (
Naisbitt 1982, p. 24), is more applicable today than ever before. Thanks to smartphones and similar devices, we have instant access to enormous amounts of data and information. At the same time, we seem to lack the capacity to transform available information into knowledge that would allow us to make important decisions in our daily lives on topics such as health care or economic investments (
Ungar 2008). Evidence suggests that the general knowledge of individuals has not increased in the way that overall information and data have increased—a phenomenon termed the knowledge–ignorance paradox (
Putnam 2000;
Ungar 2008;
Schulz et al. 2010;
Schulz 2012;
Millgram 2015).
Proctor (2016) has called the current era the “age of ignorance”; conspiracy theories and rumors thrive in the World Wide Web’s echo chambers (
Butter 2018), and today’s societies are increasingly seen as “post-truth societies” in which truth has partly lost its value and importance to people (
Higgins 2016;
Viner 2016).
There are different perspectives on, and definitions of, “knowledge” and related terms such as “reality” (
Boghossian 2007;
Rowley 2007;
Moon and Blackman 2014;
Nagel 2014). Assuming that an objective reality exists, our usage of the term knowledge follows the knowledge pyramid, where data are at the bottom, information is in the middle, and knowledge and understanding are at the top (cf.
Ackoff 1989;
Rowley 2007). Different versions of this pyramid exist; for example, “understanding” is sometimes left out or subsumed under knowledge. We use knowledge in the broad sense here, including understanding. We avoid a narrow definition of knowledge, as the concept of “knowledge in the dark” outlined below applies to various definitions of knowledge. However, we focus on the knowledge of individual people rather than collective knowledge. Of course, individual and collective knowledge are interrelated, and key points outlined below also apply to collective knowledge, yet a detailed comparison of individual versus collective knowledge is beyond the scope of the current article.
As illustrated in the knowledge pyramid, knowledge requires the reflection and interpretation of data and information, i.e., it is evidence-based. This evidence is not restricted to scientific evidence but includes data and information generated in other professions or domains, as well as the experience of Indigenous peoples and other local residents (cf.
Wynne 1992;
Funtowicz and Ravetz 1993;
Kleinman and Suryanarayanan 2012;
Yeh 2016). When reflecting on and interpreting data and information about a given topic, people can become knowledgeable about this topic. Such knowledge enables them to, for instance, better predict the consequences of important decisions related to this topic—and act accordingly, for example during elections. This is not the case for data and information per se. The latter are only truly useful if people can transform them into knowledge. Here, we focus on desirable knowledge, as humans do not want to know everything (e.g.,
Gigerenzer and Garcia-Retamero 2017).
The observation that we are living in a time in which data and information, and thus potential knowledge, keep accumulating, while the real knowledge of people does not keep up, is frustrating. Science’s primary goal is to advance knowledge; thus, we are currently falling short of our mission. At the same time, we face a sizable risk that science is losing trust, and with it its role as a counselor for evidence-based decision-making (
Pielke 2007) in societies across the globe (
Kitcher 2011). Indeed, we observe an increasing gap between evidence and people’s judgment (
Funk and Rainie 2015), partly for economic and ideological reasons.
We, the authors of this article, are natural scientists who have discussed this topic in depth with colleagues from various disciplines and put it into a broader context. Based on these discussions and reflections, we have developed the concept of knowledge in the dark, which we consider useful for stimulating discussions about the pivotal role science plays in our societies and which may help improve our ability to make effective decisions as individuals and societies. Here, we outline this concept and then apply it to the academic realm. Various aspects of this broader topic have already been dealt with extensively; see, for example, the existing body of literature on ignorance studies (also known as agnotology;
Gross 2007;
Proctor and Schiebinger 2008;
Kleinman and Suryanarayanan 2012;
Gross and McGoey 2015); the relation between knowledge and uncertainty (post-normal science, e.g.,
Funtowicz and Ravetz 1993; Mode 2 science, e.g.,
Nowotny et al. 2003); and public understanding of science, public communication of science and technology, and related fields (e.g.,
Bucchi and Trench 2008;
Nisbet and Scheufele 2009;
Groffman et al. 2010;
McNeil 2013 and references therein). These studies are highly relevant, but in the interest of brevity we do not provide a comprehensive review of them here. Instead, we highlight complementary ideas that have emerged during our discussions over the past years and that should be particularly interesting and accessible to natural scientists, the primary target readership of this article.
In the next section, we provide a conceptual overview of knowledge in the dark, with a focus on both laypeople and experts; there, we also clarify how this concept builds upon and extends existing terms, concepts, and frameworks. This section is followed by the reasons for knowledge in the dark in academia, while the final section suggests ways forward to cope with this phenomenon.
Knowledge in the dark
Let us go back to the above-described conundrum that we are living in a time when data and information, and thus potential knowledge, keep accumulating, while the real knowledge of people does not keep up. What we call knowledge in the dark—or, in short, dark knowledge—is the gap between real and potential knowledge (
Fig. 1). This gap can be seen as a lost opportunity and seems to have widened through time. It is a major challenge of our current era and is particularly pronounced for inter- and transdisciplinary topics, as knowledge is often trapped in disciplinary silos and professions (
Campbell 1969;
Ungar 2008;
Millgram 2015). At the same time, pivotal environmental, social, and economic challenges urgently need inter- and transdisciplinary solutions.
Our use of the term dark knowledge was inspired by “dark matter” in physics and “dark diversity” in biodiversity research. The former is probably well known to most readers; the latter describes the gap between potential and actual biodiversity in a given region (
Pärtel et al. 2011).
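The analogy can be made explicit with a simple identity. Writing K_pot(t) for the potential knowledge that could be realized from all data and information available at time t, and K_real(t) for the knowledge people actually hold (the two curves in Fig. 1), dark knowledge is the difference

K_dark(t) = K_pot(t) − K_real(t),

just as dark diversity is the difference between a region’s species pool and the species actually observed locally (Pärtel et al. 2011). These symbols are shorthand of our own choosing, not notation from the cited works.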
The terms knowledge in the dark or dark knowledge have not yet been applied in the emerging social science field agnotology (
Proctor and Schiebinger 2008); only ignorance (the lack of knowledge, cf.
Table 1) has been widely used, albeit with different meanings (see
Gross 2007 for standard terms used in this field). It seems useful to distinguish among the different dimensions of ignorance. Dark knowledge includes those dimensions of ignorance that can, in principle, be reduced. It does not include ignorance that cannot be reduced: we humans cannot know everything. In
Pinker’s (1997, p. 561) words: “We are organisms, not angels, and our minds are organs, not pipelines to truth. Our minds evolved by natural selection to solve problems that were life-and-death matters to our ancestors, not … to answer any question we are capable of asking.” Similarly, we humans do not want to know everything (e.g.,
Gigerenzer and Garcia-Retamero 2017); hence, the concept of dark knowledge focuses on desirable knowledge.
Dark knowledge is thus a particular part of ignorance for which a specific term (and definition) has been lacking so far. It is of high practical relevance, as it focuses on those dimensions of ignorance that humans both can (in principle) and want to reduce (
Table 1). In this way, the concept might be of interest to researchers in the field of agnotology. It should also be useful for the fields of public understanding of science, public communication of science and technology, and related areas. Relevant works here have shown that engaging with the public, which includes an open dialogue between scientists and other stakeholders, is much more effective than one-way communication from scientists to the public (e.g., engagement vs. deficit model;
Nisbet and Scheufele 2009;
Groffman et al. 2010;
McNeil 2013;
Smith et al. 2013). Such insights are essential for tackling dark knowledge. Additional measures beyond engaging with the public are outlined in the section on ways forward. Some mechanisms leading to dark knowledge are related to uncertainty, which is a key theme of post-normal science and Mode 2 science (
Funtowicz and Ravetz 1993;
Nowotny et al. 2003).
The concept of dark knowledge also benefits greatly from other points put forward by social scientists, for example the importance of considering research biases (see below for details) or the recognition that science has no monopoly on evidence, as data and information stemming from outside of science can be crucial as well (
Wynne 1992;
Funtowicz and Ravetz 1993;
Kleinman and Suryanarayanan 2012;
Yeh 2016); further examples are provided below.
As scientists, we need to be aware of roadblocks to our endeavor to advance knowledge and focus on those we really can and want to remove. The dark knowledge concept may be helpful in this regard, particularly when we consider the key mechanisms underlying dark knowledge—these are the roadblocks we should focus on.
We highlight four of these mechanisms here (
Fig. 1, right). They are aligned with consecutive steps making up the process of knowledge production: how data and information are (
i) produced or not, (
ii) made available or not, (
iii) comprehensible or not, and (
iv) remembered or forgotten. The mechanisms differ in their effects on different focal groups, from (a) researchers and other experts in the institution where specific data and information have been generated, to researchers and other experts outside of this institution but in the (b) same or a (c) similar discipline or profession, and finally to (d) nonexperts (
Fig. 2). In explaining the mechanisms, we draw from findings across various disciplines, e.g., social sciences (including agnotology) or economics.
Biased, erroneous, or fabricated data and information
First, dark knowledge can be caused by biased, erroneous, or fabricated data and information. For instance, the type of data and information produced can be influenced by financial or sociopolitical interests (
Kitcher 2011). When “high-stakes” metrics are applied, i.e., metrics that assess people’s performance and at the same time strongly influence their future careers, there are incentives to “cream” or fabricate the data used for calculating these metrics; creaming is a strategy to maximize a metric by “excluding cases where success is more difficult to achieve” (
Muller 2018, p. 24). For example, schools in Florida and Texas have been shown to reclassify weak students as disabled, thus excluding them from the calculation of average student achievement levels (a high-stakes metric for teachers and school principals;
Muller 2018, p. 93).
The production of biased, erroneous, or fabricated data and information can be combined with systematic disinformation that creates doubt and uncertainty. The tobacco industry, for example, successfully used this strategy to distort the public understanding of the health effects of tobacco (
Oreskes and Conway 2010). Similar strategies have been applied in the context of climate change (
Oreskes and Conway 2010), by the sugar industry (
Kearns et al. 2015), and by pharmaceutical companies that hide information about their products from the public (
Kreiß 2015;
Crouch 2016). False information can now be actively spread with so-called bots, i.e., software applications running automated tasks (
Howard and Kollanyi 2016;
Kollanyi et al. 2016).
Producing biased, erroneous, or fabricated data and information leads, of course, to an increase in the amount of data and information (
Fig. 2). Under ideal circumstances, such an increase would augment the amount of knowledge (see idealized line “potential knowledge” in
Fig. 1, and idealized scenario in
Fig. 2). When data and information are biased, erroneous, or fabricated, however, they reduce rather than increase knowledge (
Fig. 2). Only researchers or other experts from the institution that generated the data and information might be aware of critical errors; other people usually are not, so their understanding of the topic will be severely hampered (
Fig. 2).
Inaccessible data and information
The second reason for dark knowledge is the inaccessibility of data and information. For example, findings of secret services, the military, and industry are frequently inaccessible to the public and thus do not increase public knowledge (
Resnik 2006;
Proctor and Schiebinger 2008;
Bozeman and Youtie 2017). Looking at Organisation for Economic Co-operation and Development (OECD) countries (for which more comprehensive and comparable data are available than for other countries), expenditures on research and development by industry and the military combined are about three times higher than governmental expenditures on civil research (
OECD 2017). Industry investments are particularly high and have been increasing through time, whereas governmental expenditures are—relative to gross domestic product (GDP)—lower today than they were in the 1980s (
Fig. 3). This trend can be called the privatization of knowledge. In 2015, Volkswagen had the highest research and development budget of all companies worldwide, higher than the United Kingdom’s governmental expenditures for civil research (
Fig. S1). Samsung’s budget also exceeded the United Kingdom’s, and Intel’s and Microsoft’s exceeded Italy’s.
Of course, not all research results from industry, the military, or secret services remain hidden from the public. This is, for example, illustrated by the US Department of Defense’s Congressionally Directed Medical Research Programs (
http://cdmrp.army.mil), which originated in 1992 with a focus on breast cancer research and now includes other medical research areas that are not primarily of military interest, but benefit the general public (
Young-McCaughan et al. 2002). Nonetheless, a large fraction of the research results from industry, the military, or secret services remains hidden. Companies aim to become economic leaders in their specific domains, and the military supports geopolitical power and protects national interests (see also
Resnik 2006). Thus, the results that are made public are often biased or selected, for instance to boost sales (e.g., for pharmaceutical products), to avoid legal restrictions (e.g., for tobacco or sugar), or to shape geopolitical decisions (
Hartnett and Stengrim 2004;
Oreskes and Conway 2010;
Kearns et al. 2015;
Kreiß 2015;
Crouch 2016).
Incomprehensible data and information
The third reason for dark knowledge is that much information is incomprehensible. Even if information is accessible in principle, it can frequently be understood only by researchers and experts from the same discipline or profession, whereas most people find it incomprehensible, for instance because they do not understand the logic underlying the data or information, or the technical language in which these are presented (
Fig. 2;
Millgram 2015;
Plavén-Sigray et al. 2017).
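Readability metrics make this comprehensibility barrier measurable. As an illustration, the following minimal Python sketch computes the classic Flesch Reading Ease score, a standard readability measure of the kind used in studies of scientific text readability (e.g., Plavén-Sigray et al. 2017); the syllable heuristic, function names, and example sentences are our own simplifications, not part of any cited study:

```python
import re

def count_syllables(word: str) -> int:
    """Naive heuristic: count groups of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores = easier text; scores below
    ~30 are conventionally rated 'very difficult' (graduate level)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * len(words) / len(sentences)
            - 84.6 * syllables / len(words))

plain = "We fed the fish. They grew fast. We counted them every week."
jargon = ("Piscine biomass accrual trajectories were quantified via "
          "hebdomadal enumeration following standardized alimentation.")
print(flesch_reading_ease(plain))   # high score: easy to read
print(flesch_reading_ease(jargon))  # very low score: hard to read
```

Dense, jargon-heavy prose scores far lower than plain prose describing the same facts, quantifying how technical language locks information away from nonexperts.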
Loss of knowledge
Fourth and finally, previous knowledge can be lost. This is, for example, the case when professions or scientific disciplines shrink (e.g., if university positions in a discipline are cut) or completely disappear. Although the literature and other information produced by such disciplines still exist, there is (almost) no one left to make this information fully comprehensible and usable. This mechanism underlying dark knowledge is thus similar to the third; the difference is that (almost) no experts are left who could tap into the literature and information and teach nonexperts. Consequently, some of the knowledge that had been produced by these dying disciplines and professions is forever lost. If languages disappear, any related information is similarly lost; and data and information stored in disappearing technologies will also be lost if not transferred to modern technologies. For example, information stored on floppy disks is nowadays increasingly hard to access.
Having outlined general reasons for dark knowledge in this section, we will specify them for academia in the next section and then suggest ways to tackle them. The insights we offer may be transferable to other professions and knowledge domains. Since dark knowledge is a broader societal phenomenon and challenge, we encourage others to join us in advancing the concept in the future and applying it in various disciplines and professions.
Ways forward in academia
Dark knowledge is a challenge for democratic societies, as these need citizens who can make informed decisions. If people are ill-informed or no longer care about the truth, democracy is at risk and science will become largely irrelevant (
Kitcher 2011). To avoid such a pessimistic scenario, what are possible ways forward? We outline five approaches below (summarized in
Fig. 4).
Open science
Key components of open science are open access to scientific publications, open data, open source, and open methodology (
Kraker et al. 2011). One of its initiatives aims at FAIR—findable, accessible, interoperable, and reusable—data (
Wilkinson et al. 2016). Thus, open science directly tackles one of the key reasons underlying dark knowledge, the inaccessibility of data and information. Regarding the more specific challenges in academia outlined in the previous section, open science has great potential to improve research reproducibility (e.g., through open methodology) and to reduce biases in which data and information can be found, accessed, and reused for research synthesis (e.g., through the FAIR data principles). Open science is clearly an important step forward and helps to build trust in research.
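To give a concrete flavor of what the FAIR principles mean in practice, consider a minimal, hypothetical metadata record for a dataset; the field names below are illustrative (loosely inspired by common repository schemas such as DataCite), and the identifiers and URLs are placeholders rather than real resources:

```python
# A minimal, hypothetical metadata record illustrating the FAIR principles.
# Field names are illustrative (loosely DataCite-like), not a fixed schema.
dataset_record = {
    # Findable: a globally unique, persistent identifier plus rich metadata
    "identifier": "doi:10.xxxx/example-dataset",      # placeholder DOI
    "title": "Lake plankton counts 2010-2020",
    "keywords": ["plankton", "time series", "lake"],
    # Accessible: retrievable via a standard, open protocol
    "access_url": "https://repository.example.org/datasets/1234",
    "access_protocol": "HTTPS",
    # Interoperable: open community formats and shared vocabularies
    "format": "text/csv",
    "variable_vocabulary": "http://vocab.example.org/limnology",
    # Reusable: a clear license and provenance so others may build on it
    "license": "CC-BY-4.0",
    "provenance": "Weekly sampling at station A; see methods document",
}
```

Framed this way, each FAIR principle maps onto concrete, checkable properties of a record rather than a vague aspiration.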
However, there are also important challenges. First, the public availability of data such as health records, behavioral data, or genomic sequencing information exposes citizens to threats from private companies and (future) governments alike. There has been much research on the re-identification of anonymized data, and many examples of past misuse of such data sets exist (
Ohm 2009;
O’Doherty et al. 2016). In ecology and conservation biology, information about the location of individuals belonging to endangered or newly described rare species can be used by poachers to find them (
Lindenmayer and Scheele 2017). Another potential negative effect is that large numbers of nature lovers may try to find particular animals or plants, with possible negative consequences for the whole ecosystem: crowds of visitors may destroy the habitat and harm the species inhabiting it (
Lindenmayer and Scheele 2017).
Second, a thorough discussion is needed of how to deal with private companies that use the data sets of public research institutions. Open public databases are paid for by taxpayers and may be an important source of wealth for private companies, which themselves typically do not share their data with the public; when they do, these data are often biased (see above). In other words, open public databases essentially subsidize certain private companies (cf.
Mirowski 2018). It is clear that an open science approach alone will not solve the challenges underlying dark knowledge; thus, additional approaches are needed (see below).
Diverse evaluation systems
There is an increasing need to revise the performance metrics of researchers and institutions. As briefly outlined above, the application of a few quantitative metrics focusing on money, publications, and citations constrains academic freedom and favors mainstream rather than outside-of-the-box research, thus promoting research biases (e.g., the Matthew effect) and incentivizing authors to predominantly publish what is currently fashionable in science, while other research results may remain unpublished (i.e., author publication biases). Furthermore, it may impede inter- or transdisciplinary research, thus contributing to the challenge of the Scientific tower of Babel (cf.
Campbell 1969), and it may even threaten entire disciplines in which financial interests, overall publication numbers, and citation rates are low.
There is a clear need to diversify evaluation strategies. Researchers should not always be assessed using the same set of metrics; rather, different metrics should be applied depending on which type of researcher and which skills are needed at an institution (
Weingart 2005;
Arlinghaus 2014;
Hicks et al. 2015;
Jeschke et al. 2016). Otherwise, players (i.e., researchers and heads of institutions) focus on “gaming” metrics rather than on their research. Indeed, maximizing metrics has become an end in itself for many researchers, which is not surprising given that these metrics are continuously applied in their evaluation (
Lawrence 2007;
Hicks et al. 2015). For example, many researchers today primarily think about how they can acquire grant money and publish in high-impact journals. If different evaluation committees apply different metrics, researchers may be less worried about maximizing particular metrics, as they do not know which ones will be used in their case. They can then instead focus on actually creating knowledge.
An international court of arbitration for research
Another promising way forward would be to use existing codes of ethics and responsible conduct in science and research (e.g.,
www.esa.org/esa/about/governance/esa-code-of-ethics;
www.icmje.org/recommendations) and turn parts of them into binding rules (cf.
Kaushal and Jeschke 2013;
Alberts et al. 2015). Any violations of these rules could be dealt with by an international Court of Arbitration for Research (CARe). A similar system exists for sports, where disputes (e.g., doping cases) can be settled at the international Court of Arbitration for Sport (CAS), which has three courts (in Lausanne, New York, and Sydney). Perhaps it would be worth establishing at least one for research as well, either as a court or as a similar type of entity, such as an international agency for research integrity.
Such an international entity could serve three functions. First, it could assist in setting standards and stimulate a cross-disciplinary discussion of what constitutes scientific misconduct and what does not (cf.
Neuroskeptic 2012). Second, for those few countries that have a similar national-level entity (e.g., Austria or Sweden,
www.oeawi.at,
www.epn.se/en/start/expert-group-for-misconduct-in-research-at-the-central-ethical-review-boardstar), an international entity could handle appeals of cases that are not resolved nationally. Third, it could ensure independent investigation and judgment of possible cases of misconduct. Such independence is not guaranteed if cases are investigated by the research institutions where they occurred or by the journals where a study was published. Also, misconduct by scientists often spans institutions, countries, and journals. After a group of researchers investigated scientific misconduct on the part of the Japanese bone researcher Yoshihiro Sato over a period of several years, focusing on 33 of his more than 200 papers, they concluded that “investigations of this scale should not be handled by journals or institutions” (
Kupferschmidt 2018, p. 639).
Of the challenges outlined above, such a court would mainly tackle (i) the loss of academic freedom and (ii) the lack of reproducibility and the financial interests behind it. Standards and rules can be discussed and implemented to clarify what constitutes misconduct that delimits academic freedom, and potential cases can be handled by the court. Similarly, the court can handle cases of potential misconduct that changed the outcomes of studies, for example through data manipulation, thus making them irreproducible. As outlined above, such misconduct is sometimes driven by financial interests. Of course, the effectiveness of such a court in preventing future cases of misconduct will depend on many factors—a key aspect will be its real power to penalize misconduct.
Advances in research synthesis
The primary goal of research synthesis is to gather, process, and present complex data and information so that they become more accessible. Indeed, we argue that advances in research synthesis are critical for tackling dark knowledge. For example, systematic reviews and meta-analyses such as those performed by Cochrane (
www.cochrane.org) have proven important in synthesizing data and information. However, we need to take further steps (
Nakagawa et al. 2019). A promising path forward is an atlas or map of knowledge that will allow people to see where certain research is situated and which lines of research and concepts are (dis-)similar to each other (
Bollen et al. 2009;
Börner 2010,
2015;
Kitcher 2011;
Jeschke 2014). Such a map of knowledge will allow nonspecialists to better understand a given discipline and more quickly acquire its knowledge, thus tackling the challenge of the Scientific tower of Babel. Advanced synthesis tools also reveal how data and information delivered by various research fields are important for tackling ecological, social, and economic challenges; they clearly show the need to keep alive research fields whose existence might be threatened.
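The statistical core of such syntheses can be stated in a few lines. The following Python sketch implements a simple fixed-effect meta-analysis, pooling study effect sizes weighted by the inverse of their variances; the numbers are invented for illustration, and real syntheses (e.g., Cochrane reviews) add many further steps, such as risk-of-bias assessment and heterogeneity tests:

```python
import math

# Hypothetical effect sizes (e.g., mean differences) and their variances
# from five independent studies; all numbers are made up for illustration.
effects   = [0.30, 0.45, 0.10, 0.52, 0.25]
variances = [0.02, 0.05, 0.01, 0.08, 0.03]

# Fixed-effect model: weight each study by the inverse of its variance,
# so more precise studies count more toward the pooled estimate.
weights = [1.0 / v for v in variances]
pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
se = math.sqrt(1.0 / sum(weights))  # standard error of the pooled effect

print(f"pooled effect = {pooled:.3f} +/- {1.96 * se:.3f} (95% CI)")
```

Inverse-variance weighting is the standard design choice here because it minimizes the variance of the pooled estimate.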
Furthermore, knowledge maps and other synthesis tools can only be successfully developed if scientists from several disciplines and artists work together. For instance, information technologists and statisticians should work not only with experts on the focal research questions but also with artists or designers who make sure that the final product (e.g., an online portal) is aesthetically sound and user-friendly. Fortunately, such joint work on advanced research synthesis is increasing, for instance on visual analytics (
Keim et al. 2010), sonification, which turns data into sound (
Hermann et al. 2011), or the above-mentioned advances in creating knowledge maps (
https://hi-knowledge.org). Advanced tools for research synthesis can also help uncover and correct for topical, geographic, or author biases, e.g., by considering the potential interests of a study’s funders.
Training the next generation of researchers
Additional training is required in critically evaluating information and reducing questionable research practices. Specifically, courses could include analyses of different information sources and teach methods for distinguishing science from pseudoscience (
Boudry and Braeckman 2012). They should address questions such as: What constitutes or should constitute our evidence base? What is the role of evidence-based knowledge in society and political decision-making? For example, the course “Calling Bulls**t: Data Reasoning in a Digital World” by Bergstrom and West at the University of Washington, which started in 2017, is a valuable way forward. Its aim is to teach students “how to think critically about the data and models that constitute evidence in the social and natural sciences” (
http://callingbulls**t.org).
Training of future researchers should also build awareness that scientists are not immune to biases that influence their work. A profound understanding of what differentiates responsible research from questionable research practices is necessary (
Neuroskeptic 2012;
Sijtsma 2016). Questionable research practices do not necessarily imply intentional fraud but can include “p-hacking”, for example repeating an experiment until the desired statistical significance is reached or ignoring outliers in statistical analyses (
Neuroskeptic 2012;
Head et al. 2015). Such practices of “data cooking” are unfortunately widespread (
Fanelli 2009). Importantly, such targeted training needs to benefit future researchers across the globe.
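Such training can make the damage of questionable practices tangible through simulation. The toy Python example below (our own construction, not taken from the cited studies) shows how one form of p-hacking, testing repeatedly during data collection and stopping as soon as p < 0.05 (“optional stopping”), inflates the false-positive rate well beyond the nominal 5% even when no true effect exists:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

def one_study(peek: bool, max_batches: int = 10, batch_size: int = 10,
              alpha: float = 0.05) -> bool:
    """Simulate one study under the null hypothesis (true effect = 0).

    peek=True:  test after every batch and stop at the first p < alpha
                (optional stopping, a form of p-hacking).
    peek=False: collect all data first, then test once.
    Returns True if the study ends up 'significant' (a false positive).
    """
    data = np.empty(0)
    for _ in range(max_batches):
        data = np.concatenate([data, rng.normal(0.0, 1.0, batch_size)])
        if peek and stats.ttest_1samp(data, 0.0).pvalue < alpha:
            return True
    if not peek:
        return stats.ttest_1samp(data, 0.0).pvalue < alpha
    return False

n = 2000
fp_peek = sum(one_study(peek=True) for _ in range(n)) / n
fp_once = sum(one_study(peek=False) for _ in range(n)) / n
print(f"False-positive rate with optional stopping: {fp_peek:.3f}")  # well above 0.05
print(f"False-positive rate with a single test:     {fp_once:.3f}")  # close to 0.05
```

Running such simulations themselves gives students a direct intuition for why preregistered stopping rules and analysis plans matter.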
Acknowledgements
We appreciate stimulating discussions with other members of the Dark Knowledge Group in Berlin, particularly input and comments by Elisabeth Marquard, Gabriele Bammer, Martin Enders, Hans-Peter Grossart, Lara Hofner, Lydia Koglin, Simone Langhans, Johannes Müller, Florian Ruland, Ulrike Scharfenberger, and Max Wolf. We additionally appreciate contributions at the session “Open Science, Dark Knowledge: Science in an Age of Ignorance” of the Alpbach Technology Symposium, Austria, in August 2017 (organized by KT and JMJ). We also very much thank Karin Bugow, Fernando Galindo-Rueda, Nicole Klenk, Christoph Kueffer, Paolo Mazzetti, Elijah Millgram, and anonymous reviewers for helpful input. Financial support was received from the Cross-Cutting Research Domain Aquatic Biodiversity of the Leibniz-Institute of Freshwater Ecology and Inland Fisheries (IGB), the Deutsche Forschungsgemeinschaft (DFG; JE 288/9-1, JE 288/9-2), and the Austrian Federal Ministry of Education, Science and Research (BMBWF).