Introduction
The establishment of the internet and the advent of electronic publishing have introduced a dramatic shift in scholarly communication and publication. Although the switch from print-only publication to online publication has been accompanied by increased research impact (
Lawrence 2001), barriers to open scholarly communication still exist, and removing them could further increase research impact. “Open access” publishing has been suggested as a potential solution; although plausible, its efficacy is insufficiently documented, particularly within individual disciplines and sub-disciplines. As a result, while many benefits of open access have been highlighted, many researchers remain uncertain as to how openly sharing their work may affect them, and the widespread adoption of open access remains elusive (
McKiernan et al. 2016).
In a broad sense, “open access” refers to the free availability of online scholarly articles (to download, read, and use) to any user of the internet. To make scholarly works open access, two primary models have been implemented: (1) the “gold” model, whereby the author(s) of an article pay to publish it, rather than the reader paying to download and read the article; and (2) the “green” model, whereby the article is stored in an institutional or electronic repository either as a draft (pre-print version) or as a fully published article (
Harnad et al. 2004;
Craig et al. 2007). In addition, under the green open access model, authors are typically granted a permissive licence to freely share an article with colleagues, post on personal web pages, and (or) upload onto professional (e.g., ResearchGate, Academia.edu, and LinkedIn) and nonprofessional (e.g., Facebook and Twitter) social media. However, given that publishing open access can be financially costly (gold open access publishing typically costs > $1000 USD, and fees are usually associated with acquiring a permissive licence to share post-print versions of an article), it is important to understand the pros and cons of publishing open access.
Open access can greatly benefit both authors and readers of scholarly works. For example, evidence that open access can increase citation rate has been accumulating in recent years (
Hitchcock 2013). In a meta-analysis of published literature,
McKiernan et al. (2016) reported that open access articles had higher relative citation rates than non-open access articles in 19 disciplines, while also having numerous other benefits to researchers. Chinese scientific journals offering an open access option were reported to have higher citation indicators (e.g., impact factor) than equivalent non-open access journals (
Cheng and Ren 2008), while impact factors for other journals have also been found to benefit from the establishment of open access publishing (
Lin 2007,
2009). At the individual researcher level, physics papers posted to preprint open access archives (e.g., arXiv) have been reported to experience increased citations (
Metcalfe 2005,
2006). Using citations from ISI Web of Science,
Antelman (2004) also found that open access articles received more citations than non-open access articles across four academic disciplines (mathematics, electrical and electronic engineering, political science, and philosophy), although substantial variation was present in citation counts. In a longitudinal study of articles published in the journal
Proceedings of the National Academy of Sciences (PNAS) controlling for a number of potentially confounding variables,
Eysenbach (2006) reported that open access articles were twice as likely to be cited within 4–10 months following publication, and almost three times as likely to be cited within 10–16 months after being published. Furthermore,
Hajjem et al. (2005) reported that open access articles consistently received more citations than non-open access articles across 10 disciplines and 12 years, and
Gargouri et al. (2010) suggested that highly cited articles benefited from open access even more than less cited articles. In a literature review,
Craig et al. (2007) also suggested that a clear link between open access publishing and citation counts was evident. In contrast, however,
Davis et al. (2008) found no impact of open access on citations across 11 American Physiological Society journals, but reported that open access articles received more full text and PDF downloads than non-open access articles. Other studies also reported similar results and suggested that open access does not influence citations (e.g.,
Kurtz et al. 2005;
Lansingh and Carter 2009). While studies predominantly suggest that open access can increase citations, such studies are typically conducted across many journals and disciplines, introducing a number of potential issues and limiting their utility for speaking to discipline-specific trends. Furthermore, studies have largely neglected the effects of open access within hybrid journals (i.e., journals that publish both open access and non-open access articles) during the same time frame, and most generally ignore the potential influence of self-citations (i.e., authors citing their own work). Thus, studies filling these knowledge gaps can lead to a more comprehensive understanding of the impact of open access and can aid in the effort to establish widespread open access practices.
Although open access can increase citations, many other factors can also influence citations and the impact of a paper. For example, the overall journal impact factor can influence citations, and it has been suggested that the impact of open access on citations should be assessed in a single journal or in highly similar journals to reduce the influence of journal impact factor (
Harnad and Brody 2004). The relative age of articles (i.e., time elapsed since publication) can also influence citation counts, as articles are known to accumulate the most citations within the first 3–5 years post-publication (e.g.,
Okerson and O’Donnell 1995;
Davis 2013), with some exceptions such as highly cited articles that may continue to gather citations over longer periods of time. Likewise, articles with an “online first” publication option (i.e., an early version of an article published online before the final version) have been found to have higher citation counts than those without “online first” versions (
Moed 2007). Various other article attributes can also influence citation counts, such as the number of authors, article length, title length, article type (i.e., broad review papers or featured articles are likely to be cited more than nonfeatured research articles), the number of references, and field of study (e.g.,
Gargouri et al. 2010;
Letchford et al. 2015). Alongside article attributes, authorial attributes such as author nationality, author prestige, lifetime publication count, and grant funding can also theoretically influence citations, although their effects on citations are reported to be insignificant (
Boyack 2004;
Eysenbach 2006). It has also been highlighted that author self-selection to publish open access (i.e., authors choosing to publish only their best articles open access, thus inflating citations of open access articles) may impact citation counts for open access articles as well (
Hajjem et al. 2005;
Craig et al. 2007;
Gargouri et al. 2010). Finally, self-citations can also influence citations, theoretically, as authors may be more likely to cite their own open access published work. While such factors can contribute to substantial variability in citations, it has been suggested that many have little to no influence on citation counts when comparing open access and non-open access articles (e.g.,
Hajjem et al. 2005;
Gargouri et al. 2010). Nonetheless, it is important to consider other variables when assessing differences in citations between open access and non-open access articles. Ultimately, studies exploring the effect of open access publishing in hybrid journals within specific disciplines and including potential interactions between open access publishing and other factors that can influence citations are necessary.
Although studies have attempted to elucidate the impact of open access publishing on citations, various issues with these studies exist. Studies assessing open access effects on citations across many journals and disciplines are unable to capture differences between disciplines, and such studies often neglect other factors that could influence citations such as time, journal impact factor, the number of authors, article type (e.g., original research versus review), and self-citations. Furthermore, studies have yet to investigate the effect of open access and other variables on self-citations. Given that many marine biology/ecology journals now offer an open access option when publishing, I used three primary marine ecology journals with relatively equal impact factors (ICES Journal of Marine Science (ICES JMS), Marine Ecology Progress Series (MEPS), and Marine Biology (MB)) as a “microcosm” to test for within-discipline effects of open access publishing on citation counts in hybrid marine ecology journals, controlling for self-citations (defined here as any citation in which any author of the citing article was also an author of the cited article) and article type. In addition, I tested whether additional variables influenced citations, including time since publication, the year that the article was published, and the number of authors, and assessed the effects of these variables on self-citations as well.
Methods
Data collection
Open access and non-open access articles published from 2009 to 2012 were collected from three primary hybrid marine ecology journals with similar impact factors (to control for the potential effect of journal impact factor): ICES JMS (IF = 2.63), MEPS (IF = 2.40), and MB (IF = 2.38). Articles were collected from the ICES JMS website on 17 March 2016, from the MEPS website between 28 August and 1 September 2015, and from the MB website between 20 and 25 November 2015; all three websites report citation information obtained via CrossRef. MEPS is a primary journal in the field of marine ecology/biology that does not provide an “online first” option (i.e., an early version of a full-text article published online before the final version). It is published by a small publisher (InterResearch) at a high annual subscription cost of $7403 USD, potentially resulting in limited personal and institutional access. MEPS open access fees vary with article length: articles 1–8 pages in length cost $1140 USD, articles 9–14 pages cost $1465 USD, and articles >14 pages cost $1682 USD. In contrast, ICES JMS and MB provide an “online first” option and are published by relatively large publishing companies (ICES JMS: Oxford Journals; MB: Springer). ICES JMS and MB have annual subscription costs of $446.50 USD (nonmembers; member price: $142.50 USD) and $199 USD, respectively. Unlike MEPS, ICES JMS and MB charge flat open access fees of $2800 USD and $3000 USD, respectively, each of which is nearly double the highest open access fee for MEPS. While all three journals offer both “green” and “gold” open access options, I only considered gold open access articles for the purposes of this study (i.e., only articles that were open access on the journal websites and thus paid for by authors; I did not distinguish articles that were published under the green open access option, as these articles are not easily identifiable).
To control for article type, only primary research articles were selected to avoid higher citation counts associated with particular articles such as reviews, feature articles, editorials, short communications, etc. Articles appearing in “theme issues” for which all articles were available open access were also omitted. In MEPS, for each year (2009–2012) and starting with the first issue of that year, I selected the first 50 open access articles published and the non-open access article immediately following to ensure that the order in which articles appeared on the MEPS website did not influence citation counts. I chose only the first 50 articles to keep a balanced number of samples between open access and non-open access groups and to avoid drastically unbalanced samples between journals. In instances where a non-open access article did not immediately follow an open access article (e.g., when >1 open access article appeared in sequence), the non-open access article within the closest proximity (either before or after the open access article) was selected. The same criteria were used for article selection in ICES JMS and MB; however, I used all open access articles published for each year since the annual number of open access articles in those journals was <50. Additionally, for each journal, I counted the annual total number of open access articles and calculated the annual percentage of these articles for each year assessed in this study (2009–2012) to determine if open access publishing was more frequent in certain journals.
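The selection rule described above can be sketched in code. This is only an illustration under stated assumptions, not the procedure actually used: the function name `pair_articles` is hypothetical, each non-open access article is assumed to be used at most once, and ties in distance are assumed to go to the article that follows the open access article.

```python
from typing import List, Optional, Tuple


def pair_articles(is_open_access: List[bool], max_pairs: int = 50) -> List[Tuple[int, int]]:
    """Pair open access articles with nearby non-open access articles.

    is_open_access lists articles in the order they appear on the journal
    website (True = open access). Returns (open access index, non-open access
    index) tuples for at most max_pairs open access articles.
    """
    used = set()  # assumption: a non-open access article is matched only once
    pairs: List[Tuple[int, int]] = []
    for i, oa in enumerate(is_open_access):
        if not oa:
            continue
        if len(pairs) == max_pairs:
            break
        best: Optional[int] = None
        best_key: Optional[Tuple[int, int]] = None
        for j, other_oa in enumerate(is_open_access):
            if other_oa or j in used:
                continue
            # Closest article wins; assumption: on a tie, prefer the one after i.
            key = (abs(j - i), 0 if j > i else 1)
            if best_key is None or key < best_key:
                best, best_key = j, key
        if best is not None:
            used.add(best)
            pairs.append((i, best))
    return pairs


# Hypothetical issue order: two open access articles appear in sequence.
order = [True, False, True, True, False, False]
example_pairs = pair_articles(order)  # [(0, 1), (2, 4), (3, 5)]
```

In the hypothetical example, the article at index 2 is not immediately followed by a non-open access article, so it is matched to the closest unused one (index 4).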
For each individual article, I recorded the total number of citations and self-citations to control for inflated citation counts resulting from self-citations (citation metrics omitting self-citations are hereafter referred to as “peer-citations”). Only citations that fell within the year of publication and 3 subsequent years after publication were used, to minimize the influence of time on citation metrics. Citation counts were obtained from CrossRef options on each journal website. In addition to citation counts, additional variables were also collected, including publication age (number of days since an article was published from date of publication through that same year and 3 subsequent years after), and the number of authors on an article.
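As a minimal sketch of the citation metrics described above, the following counts total citations, self-citations, and peer-citations within the publication year and the 3 subsequent years. The `Citation` record and the function name are hypothetical, and a self-citation is assumed here to mean any overlap between the authors of the citing and cited articles; the citation data in the study itself came from CrossRef via the journal websites.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Tuple


@dataclass(frozen=True)
class Citation:
    """One citing article (hypothetical record layout)."""
    year: int
    authors: FrozenSet[str]


def count_citations(
    pub_year: int,
    article_authors: FrozenSet[str],
    citations: Iterable[Citation],
    window_years: int = 3,
) -> Tuple[int, int, int]:
    """Return (total, self-citations, peer-citations) within the window."""
    total = 0
    self_cites = 0
    for c in citations:
        # Keep only citations from the publication year through 3 years after.
        if not (pub_year <= c.year <= pub_year + window_years):
            continue
        total += 1
        # Assumption: any shared author makes the citation a self-citation.
        if article_authors & c.authors:
            self_cites += 1
    return total, self_cites, total - self_cites


# Hypothetical article published in 2009 by authors A and B.
cites = [
    Citation(2009, frozenset({"A", "C"})),  # self-citation
    Citation(2011, frozenset({"D"})),       # peer-citation
    Citation(2012, frozenset({"E"})),       # peer-citation
    Citation(2013, frozenset({"B"})),       # outside the window, ignored
]
example_counts = count_citations(2009, frozenset({"A", "B"}), cites)  # (3, 1, 2)
```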
Data analysis
All statistical analyses were conducted using R version 3.2.1 (
R Core Team 2013) with a significance threshold of
α = 0.05; the R script can be found in the
Supplementary Material. To test for independent and interactive effects of open access, journal, year, number of authors, and time since publication on citations, I used a nested ANOVA with year nested within journal (see
Table 1 for factor descriptions). Pairwise
t-tests were used to elucidate pairwise differences for significant factors (from the nested ANOVA) with >2 levels. Assumptions of normality and homoscedasticity were checked using Q–Q Plots and Levene’s test, respectively (see
Supplementary Material for results). The data initially violated the assumption of normality, which was rectified using a log(x + 1) transformation.
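To illustrate why a log(x + 1) transformation suits citation counts, the sketch below applies it to hypothetical right-skewed counts that include zeros and compares sample skewness before and after. It assumes a natural logarithm (the base is not stated above) and uses Python rather than the R script in the Supplementary Material; the helper names and the data are invented for illustration.

```python
import math
from typing import List, Sequence


def log1p_transform(counts: Sequence[float]) -> List[float]:
    """Apply log(x + 1) so that zero counts map to zero instead of failing."""
    return [math.log1p(x) for x in counts]


def skewness(values: Sequence[float]) -> float:
    """Population skewness: third central moment over variance**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical citation counts: mostly small values with one highly cited article.
raw = [0, 1, 1, 2, 3, 5, 8, 40]
transformed = log1p_transform(raw)

raw_skew = skewness(raw)
transformed_skew = skewness(transformed)
```

For these invented counts the transformed values are noticeably less skewed than the raw ones, which is the property that makes the transformed data closer to the normality assumption of ANOVA.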
Results
Open access publishing appeared to be more common in MEPS than in ICES JMS and MB from 2009 to 2012. Both the annual total number of open access articles and the percentage of open access articles per year were consistently higher in MEPS than ICES JMS and MB (
Fig. 1). On average, MEPS published a total of 80.3 ± 8.4 SEM open access articles per year (2009–2012), representing 17.8% ± 3.2% SEM of all articles published per year. In contrast, MB published an average of 15.5 ± 3.9 SEM open access articles per year (2009–2012), representing 6.8% ± 1.8% SEM of all articles published per year, and ICES JMS only published an average of 7.0 ± 3.4 SEM open access articles per year (2009–2012) representing a mere 3.2% ± 1.5% SEM of all articles published per year. While the percentage of open access articles published in MEPS tended to increase from 2009 to 2012, the percentage in MB and ICES JMS tended to decrease, although annual deviations in the percentage of open access articles were small (
Fig. 1).
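The "mean ± SEM" summaries above can be reproduced with a short calculation. The sketch assumes the conventional definition of the standard error of the mean (sample standard deviation divided by the square root of the sample size, here one value per year); the function name and the yearly counts are hypothetical and are not the study's data.

```python
import math
import statistics
from typing import Sequence, Tuple


def mean_sem(values: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, standard error of the mean) for at least two values."""
    mean = statistics.mean(values)
    # SEM = sample standard deviation (n - 1 denominator) / sqrt(n)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


# Hypothetical annual open access article counts for four years.
yearly_counts = [2, 4, 4, 6]
example_mean, example_sem = mean_sem(yearly_counts)
```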
Nested ANOVA revealed significant independent effects of open access, journal, and time (days) since publication on citations (
Table 2). On average, open access articles received more peer-citations than non-open access articles (
Figs. 2a;
S1). For ICES JMS, MEPS, and MB, respectively, open access articles had, on average, 56.7%, 37.5%, and 24.4% more citations than non-open access articles (
Fig. 2b). Across the journals, MEPS and ICES JMS appeared to garner more citations than MB, although pairwise
t-tests suggested that only MEPS had significantly more citations than MB (
Fig. 3). Citation counts increased with publication age (i.e., number of days since the article was published;
Fig. 4). There was an independent effect of author number on self-citations, with self-citations increasing with author number (
Fig. 5). There was also a complex “open access × journal × time since publication” interaction (
Table 3), although no apparent trends were discernible.
Discussion
The results of this study highlight that, during the same time period, articles published as open access received more citations in hybrid marine ecology journals than non-open access articles. Previous studies have reported similar results in top science journals. When various factors aside from open access were controlled for,
Eysenbach (2006) reported that open access articles in PNAS were approximately 2× more likely to be cited in the 4–10 months after publication, and nearly 3× more likely to be cited within the 10–16 months after being published. Similarly,
Norris et al. (2008) reported that open access articles published in ecology journals received more citations than articles behind a paywall, although the difference in citations between open access and non-open access articles in ecology was lower than in other subjects (applied mathematics, sociology, and economics). Likewise, while the present results suggest that open access articles receive more citations in primary hybrid marine ecology journals, the degree of citation increase appears lower than in other disciplines. The average citation difference between open access and non-open access articles can nonetheless be quite substantial (nearly 60% more citations, on average, for open access articles published in ICES JMS, although this journal had the smallest sample size of articles).
Although it seems that open access articles receive more citations than non-open access articles in primary hybrid marine ecology journals, various other factors can influence the number of citations an article receives. Indeed, substantial variability in citation counts was evident in this study (
Fig. S1), with some open access articles receiving as few as zero citations, and some non-open access articles receiving very high citation counts, making it difficult to predict exactly which papers will receive more citations. Such variability has also been highlighted in previous studies (e.g.,
Antelman 2004). There are numerous factors that could potentially impact citation counts and thus contribute to the variability observed in this and other studies. While the overall journal impact factor (
Harnad and Brody 2004), article type (
PLoS Medicine Editors 2006), and elevated self-citation rates can drive citation counts, these potentially confounding factors were controlled for in this study methodologically by choosing journals with highly similar impact factors, only using original research articles, and by excluding self-citations. Time since publication can also influence citation counts (
Moed 2007), which the results of this study confirm (although there was no interactive effect between time since publication and open access). It is also plausible that articles with many authors may be more impactful, as studies with many authors are typically large in scope and can be highly influential within their field (
Wuchty et al. 2007). However, author number had no effect on peer-citations in this study. Ultimately, while the number of citations was different across journals and time since publication, none of these factors interacted with open access, suggesting that open access articles are more highly cited than non-open access articles in marine ecology journals regardless of journal or the amount of time since the paper was published.
In addition to the aforementioned factors tested or controlled for in this study, authors may also choose to only publish their best articles (i.e., likely to receive many citations) open access. This self-selection bias could ultimately be the driver behind the trend of increased citations for open access articles. Given that the only true and objective way of determining the degree of self-selection is to assign treatment and control groups and measure citations experimentally (e.g.,
Davis et al. 2008;
Davis 2011), future studies should implement more experimental approaches to determine the degree to which author self-selection may have contributed to the trends observed here (although such studies must carefully adhere to ethical guidelines).
Various other factors may have impacted the citation trends observed in this study, including individual author attributes (citation count, nationality, etc.), funding agency, article length, the number of references, topical papers (i.e., topical subjects may get cited more than less topical ones) (
Eysenbach 2006), and title length (
Letchford et al. 2015). For example, authors that already have high citation counts may be better known in their field and the broader scientific community, and thus may receive more citations simply because they are well known and respected. Such sources of variation are thought to be negligible, however, and likely have little effect on the results of this study (
Hajjem et al. 2005;
Gargouri et al. 2010). For example,
Eysenbach (2006) found a stronger difference between open access and non-open access citations than those observed here, but reported that author prestige (as measured by funding) had no effect on citations. One factor that likely contributes to the high degree of variation in citation metrics, however, is green open access publishing and access to articles through additional outlets. In this study, I only took into account articles published under the gold open access option. However, all three journals offer green options for open access publishing. Moreover, many researchers publish their papers openly online without the consent of the publisher, whether it be the pre-publication manuscript or the final published article (e.g., ResearchGate, personal websites, e-mail exchanges, etc.), thus providing green open access for their work without doing so through the publisher. Even though green open access may have contributed to the substantial variation in citation metrics observed in this study, controlling for such variability would most likely enhance the measured citation advantage for open access articles. Likewise, some journals make all articles freely available after a certain period of time; MEPS makes all articles freely accessible after 5 years and ICES JMS after 1 year. Given that I counted citations over a 3–4 year period after publication and ICES JMS is freely available after 1 year, these results may actually be an underestimation of the true impact of open access publishing. In contrast, the results may actually support the idea that authors select their best work for open access publishing. A comprehensive understanding of the impacts of open access on citations would benefit from future studies determining whether or not a citation advantage exists in paying for open access vs. putting an article somewhere online.
This study only assessed three primary marine ecology journals. As such, applying these results to other marine ecology journals, most notably those with highly restricted access, should be approached with caution. Moreover, these results are likely not transferable to other disciplines, particularly given the immense variation in the citation impact of open access publishing observed across disciplines (e.g.,
Norris et al. 2008;
Gargouri et al. 2010). Thus, further research expanding this idea to a broader range of journals can aid in the generalization of these results. However, the consistent trends observed across three primary and high-impact marine ecology journals in this study suggest that open access articles in comparable hybrid journals in this field are likely to receive more citations than non-open access articles.
In addition to increased citations, open access practices can benefit researchers in a variety of other ways. Open access practices are associated with increased media attention and researcher exposure, increased collaborative opportunities, and more job and funding opportunities (
McKiernan et al. 2016). For example, open access articles in
Nature Communications received twice as many unique tweets and Mendeley reads as non-open access articles (
Adie 2014), and 2–4 times as many page views (
Wang et al. 2015). Thus, the benefits of open access publishing are not limited to citations; open access can also have broader positive impacts on researchers.
This study ultimately adds to the growing body of literature suggesting that open access publishing can increase research impact by elevating citations for individual researchers. As such, the benefits of open access are likely far-reaching and can positively impact researchers wishing to share their work, as well as researchers wishing to access scholarly works. The results of this study thus add further support for the widespread adoption of open access practices in scientific and scholarly publishing.