Open Peer Review Will Be A Thing

There is a huge resource of human energy to be tapped into that will strengthen the scholarly ecosystem. (3,455 words / April 26, 2024)

Published on Apr 26, 2024
Open peer-review is growing. Slowly, to be sure, but in the face of declines in more traditional forms of peer-review, any extra voluntary growth is notable. Which type of review will prevail? Both; the two supposedly binary forms are not at war. We need “both”. We need a system that allows the inclusion of far more ways of being reviewed, and of conducting review, before, during, and after publication, however we may define that activity.

To paraphrase Newton: if we hope to understand better, it will be through reading the comments of peers. More publicly-viewable forms of comment will be an important step for fortifying trust and integrity of the scholarly record. Making multiple opportunities to review a work by different sets of readers, who can then share those insights with other readers, is about creating a lifecycle with robust safeguard opportunities—not about creating new burdens.

As people, we love to review, comment, and moderate. Open peer-review, like other aspects of web-transitioned scholarly communication, has been an outlier. Lots of experimentation may be necessary for open review and comment to become routine. One of the biggest steps we should consider taking today is breaking out of the thinking that open peer-review should simply recreate closed peer-review, but in the open. Traditional peer-review already needs to be reformed. Why recreate it 1 for 1?

Some publishers are experimenting with structured review: this needs to be the norm. Make clear exactly which aspects of a paper your journal has coordinated review of. This doesn’t mean you have to reveal reviewer names or what their comments were, just that X, Y, and Z are aspects that have been assessed at least once. The rate at which fault is found with those aspects of your published papers will provide specific evidence about your journal’s reputation.

Structured peer-review can also speed traditional peer review. If you are clear to both reviewers and readers that X, Y, and Z are the only aspects your journal assesses, that makes for a much clearer assignment to your reviewer recruits, and your readers will be better equipped to read your papers, knowing which aspects may require more scrutiny. And should a reader find an issue with an aspect outside the journal’s purview, we need more efficient and open systems to report that.

Breaking peer-review down to reviewable facets is an important step for ‘traditional’ peer-review, but will also open enormous amounts of opportunity for open peer-review.

There are readers who specialize in assessing particular facets of research papers. The best way to tap into their potential—and to improve scholarly communication at this moment—will be allowing specialists to openly comment on those single aspects of papers, rather than asking them to assess the entire work.

And rather than rely on the opinion of 2-3 reviewers that an editor was able to wrangle in the handful of months before publication, allow any number of reviewers to self-select into making a public assessment on any paper, at any time, and in whatever level of granularity appropriate for the situation.

While breaking down peer-review to faceted assessment will strengthen ‘traditional’ peer-review, there’s no reason that ‘open’ peer-review can’t undertake this concept today.

Research Integrity & Responsibility

In a Scholarly Kitchen post, Angela Cochran cited some recent high-profile instances of published research misconduct. Although the “vast, vast majority of papers submitted to the vast majority of journals are written by ethical and responsible researchers,” integrity checks for research need to be put where they belong, Cochran asserted: not with the journals, but with researchers’ employing institutions.

This is not necessarily wrong, but a wider truth may be that we all have a responsibility to read, vet, and moderate the literature. The ultimate responsibility should not rest solely on any single group, whether it be a handful of peer reviewers or staff in university research offices, but rather on all of us who take an interest in the epistemological hygiene of the scholarly record.

Let me reframe that slightly: the ability should be afforded to any of us who are able and willing to contribute to strengthening truth claims and methodology.

It is recognized that scholarship now takes place in a digital era. Given this, the scholarly world needs to better harness the tendency of the web to allow wider participation. As one example of this wider participation, consider how Twitter/X not only has its own in-house moderation efforts, but also provides support for community-based efforts that exist alongside the baked-in ability of individuals to reply to posts.

Bird Watching

The Twitter Community Notes program (launched in 2021 as Birdwatch) is a “community-driven content moderation program, intended to provide helpful and informative context, based on a crowd-sourced system.” This program rests on top of X/Twitter’s own (imperfect) in-house moderation efforts, as well as users’ ability to reply to most tweets.

Renault, Amariles, and Troussel differentiated the Community Notes program from other warning labels because its notes are generated by crowds rather than independent fact-checkers and offer “a contextual piece of information” rather than a simple warning. In their study of the program’s database, these researchers found that adding context to tweets with false or misleading claims reduced retweets by nearly half and increased the likelihood that the tweet author would delete their tweet by 80%.

The effect of this style of moderation on the spread of information is remarkable in itself, but it is also worth pointing out that, even as Twitter/X undergoes a shift in its user base, there are users who remain dedicated to helping moderate content without apparent individual reward.

At the time of study, there were 285,000 notes, making for a modest annual average of 95,000 notes. Given that hundreds of millions of tweets are made daily, the output of the Notes program may not seem so impressive. This relatively small number may be fine. Most tweets (we hope) would not require a community note.

Follow Community Notes updates at
Follow open source code updates at
Follow better and worse examples of community notes at

Slow Feedback

When research flaws or misconduct are uncovered in formally published articles, readers conversant in this genre may expect to see a publisher-issued retraction, correction, or notice of concern. Other readers may learn of troubled articles through a tweet or blog post, only to click through to discover that there is no such notice on the version of record. A third category of readers may read a problematic published article and never detect that there is an issue.

It’s not the simple case that one system of review or moderation is better than the other (although there are cases that could certainly serve as supporting evidence). Spotting issues and reporting them in a literature as large as our current one simply takes more than one set of eyes and at more than one stage.

Analogously, the Federal government has inspectors in place to make sure that meat and produce handling clears minimum standards. Restaurants and grocery stores are also held to similar account. But even with these two quality checks in place, consumers should still trust their gut and avoid consumption when signs of spoilage are present.

If merited, consumers can report to an appropriate agency or take to social media. FDA alerts will help consumers who pay attention to recall notices. A social media post will help those friended with the right person at the right time. Otherwise, some consumers, regardless of these measures, might still end up eating a bag of bad salad (munch around) and learn the hard way (find out).

A troubled scholarly article is similar to a contaminated head of lettuce: despite having inspection schedules and reporting mechanisms, the object and its warnings are too often separate from each other.

With food, you can put a serial number, barcode, or QR code somewhere on it, but this supposes that consumers have a smart device and will use it. Unlike icky lettuce, most scholarly articles are born-digital and—like tweets—are capable of displaying real-time contextual notes produced by a community.

The content of research is digital and text-based. So are real-time, community-moderated notes of caution. There is no technical hindrance to displaying the two alongside each other.

Lack of Open Peer Review is Weird

Imagine the services and products you’ve searched for online: how many of them didn’t have some sort of rating beside them? Think of the last piece of popular media you consumed. If you searched, how many reviews would you find? Consider the entirety of Wikipedia: the number of pages, editors per page, and steps of moderation per edit. At a base level, people love to rate, review, comment, and moderate. That’s been very clear since Web 2.0 gave us the tools.

While there are a lot of fake reviews, poorly conceived criticisms, and star ratings added in a rush because a website forced a user to add one, there is also a lot of well-meant, well-written, and well-informed opinion to be found. On a longer timeline, these are still new forms. It takes time for norms to coalesce, but cream does rise.

Considering the web’s tendency for moderation and review, it’s honestly a little weird we don’t already have established practices in place to more systematically enable, capture, and display open comment on research objects. Academic and scholarly content has been the outlier in many internet content dynamics (both for better and for worse) and this is just another of them.

Yes, a lot of groups have already attempted to make this a thing. One lesson I took from Taylor Lorenz’s Extremely Online is how fragile place, time, and user bases are when launching platforms. Well-designed social sites have failed at launch, while others have become successful even in spite of what founders had in mind. The point is, any unsuccessful stabs at open peer-review you may have observed heretofore do not necessarily imply that it is doomed to failure forever.

Timing Right

The stars have begun to align for open peer review but fall short of a full constellation. Overall numbers are still relatively low, but the trend of adoption is hard to deny. The fact there is sustained conversation about open peer review says something. Wide interest, or at least curiosity, is a bare minimum prerequisite, which has been achieved. And this conversation has led to an uptick in open peer-review which not so long ago was zero.

Talk has turned into dollars, with funders announcing support for preprints as well as infrastructure enabling some level of review. Whether or not what funders like the Gates Foundation have in mind will be successful is almost beside the point for this conversation. At the very least, their experiment will generate needed evidence about alternative forms of vetting.

Growth of preprinting is a boon to progress in open peer review. Preprints are a suitable candidate for early open peer review adoption. They can offer insight to readers used to having other signals, and they offer potential for reviewers to add value to work that has yet to receive more formal review. Perhaps crowdsourced review of preprints won’t entirely replace journal systems, but eventually we may recognize that these need not be competing activities.

Two Categories of Review

Once we recognize open peer-review is not structurally or inherently at war with closed peer-review, but is rather a complementary activity, we may also understand the two as existing on a spectrum of review-shaped activities. Closed peer-review is not one thing. And so neither should open peer-review recreate the imagined/conceptualized monolithic structure of closed peer-review. Different journals have different standards of review. Different article types require different checks. Recognition of the different ways reviewing takes place could encourage further expansion of reviewing modes.

Outside of scholarly communication, reviewing does not refer to a single type of activity. Let’s begin by imagining reviewing as falling into two very broad categories (from which even smaller categories could be theorized).

  • The first category of review is an assessment of an entire work from a particular perspective.

  • The second category of review assesses only a single (or narrow group of) aspect(s) about a work.

As an example of a first category review, consider professional film critic Dana Stevens’ 1,799-word review of Oppenheimer for Slate. In this 67-word excerpt, Stevens discusses the editing style’s effect on the storytelling:

“The film’s narrative chronology is so fragmented it seems to have taken its cue from the recurring image of a black field studded with swirling points of light, a seeming reference to both the starry night skies above Los Alamos, New Mexico, and the subatomic micro-events the Manhattan Project team is trying to observe and manipulate. The story zigzags freely and sometimes confusingly among two main timelines.”

This single aspect is part of a larger review that may be thought of as a compilation of such insights that add up to a more complete assessment of the film.

In contrast a review of the Oppenheimer DVD by “Brad S” on is an example of a Second Category Review:

“Arrived in new condition and played well.”

This is the entire review. It says nothing about the content or artistic achievements of the film, but nonetheless tells potential consumers something important to consider. Importantly, this review by Brad S sits alongside over two thousand other (presumably mostly real) reviews of the DVD that users may read to help form a contextual picture of the product’s reception.

Typical journal peer-reviewer notes are often expected to include several discrete Category 2 Reviews that build up into an overall Category 1 Review. A reviewer may discuss particular aspects of a paper (literature review, methodology, findings), but then offer a more holistic overall summary (this paper represents an important new finding in the nascent literature of XYZ), followed by a note on whether they believe the paper is appropriate to the journal organizing the review.

It is a mistake to think that open peer-review, with the novel twist of being made available for public consumption, must recreate the review style that takes place in traditional journals. There’s no reason that peer review must always require reviewers to assess every aspect of a paper. On the other hand, there is very good reason to be very clear about what assessments have taken place.

Recognize Second Category Reviewing

When we accept that review might cover some single aspects rather than the whole of a paper, then we begin to perceive peer-review as a dynamic raft instead of a static monolith. Whether a paper has undergone peer-review evolves from a question with a binary yes/no answer into an invitation to understand what aspects of a paper have been reviewed, exactly.

Different journals in the same field may have different standards for peer-review, in the sense that the aspects or facets of a paper that they expect to pass review may not be identical. These may not even be uniform across articles within the same journal.

CRediT (Contributor Roles Taxonomy) helped crystallize something we have always known about multiple-author papers: authors of those papers often contribute in very specific ways, rather than general ones. Similarly, peer review could match reviewers with specific skillsets to comment only on relevant areas of a paper, rather than the whole paper.

Encourage Second Category Reviewing

If the peer-review pool is thought to be a resource in decline, we should ask ourselves how we can most thoughtfully deploy that resource. Would you apply Elizabeth Bik’s talents toward reviewing a small number of papers in their entirety, or continue to let her focus her attention on image duplication across a maximal number of papers?

Dr. Bik represents a new sort of specialist that works in the open and focuses mainly on one narrow aspect. There are likely more specialists out there that could be recruited. We should make room for the second category that allows individuals to quickly jump in and speak about the specific aspects they wish to comment on.

We need foxes who review multiple aspects of papers, but recruiting more hedgehogs to apply a single specialty focus can fill out our corpus of open peer review comments, assuming that they don’t all have the same specialty. This could happen in closed peer-review systems as well as open peer-review systems, but it seems like it could happen faster in an open setting where individual users could self-identify and self-match.

How to encourage this? The movement for journals to compensate peer-reviewers could just as easily apply to review work that takes place outside of journals. A lot of social sites have jumpstarted activity through compensation. In the past, I’ve kidded that funders should offer bounties to sleuths who spot research misconduct in their funded work. But we really are seeing universities fund initiatives that offer compensation to reviewers who uncover errors (and compensate authors whose reviewed work proves error-free).

As another example, funders could hire librarians (who make valuable contributions in this space) to peer review methodologies of funded preprints. This could prime the pump for further action, with some of these librarians enjoying the practice and doing it as a hobby. New librarians may do it on spec to build a track record in the hopes of becoming paid reviewers. And eventually, a librarian’s or researcher’s level of engagement with peer review could become countable toward tenure and promotion, a way to improve the integrity of the record as suggested by a mathematical society publisher.

Uncover & Display Second Category Reviewing

First, recognize that peer-review can be broken down into a number of discrete facets.

Second, enable and encourage a new style of review that focuses on reviewing or assessing single facets.

Then third—or, perhaps tied for second—recognize and uncover areas where single facet reviewing is already occurring, or could easily take place, and display it more prominently.

When we think about how hard it has become to recruit a sufficient number of peer-reviewers for the number of manuscripts we have, what does that tell us? Is it the case that people don’t have time to read and assess papers, or that people don’t have time to write up a full report on a paper and its merits for a particular journal? If people are strapped for time with their other obligations, what are those other obligations?

Research is one of those obligations. In order to write research, it is necessary to read research. When authors write a paper, they will include a literature review of papers that they (we hope) have read and assessed as solid enough to build upon. Tools like have enabled a view of not only what articles a paper has cited, but which section of those articles specifically, and whether or not the sentiment of the citation seems to be one of confirmation or contradiction.

Can we enable a more deliberate process during literature reviews for researchers to leave some mark of assessment of papers that they read, if not in full, then at least of the facets that they considered for their own research? When I read a book on my Kindle, I have the option to highlight passages and make private notes. I also have the option to make these passages and notes public. Why not build similar tools in reference managers like Zotero?


A lot of review/moderation/comment experiments will come and go. The important thing is to archive the content of these modalities so that they can be gathered together and displayed into the future. Maybe widgets like those from Altmetrics or PlumX could be expanded out to accommodate. Maybe a site like Publons could evolve into something more like a Metacritic.

Not every paper or preprint will fill out, but that’s fine. Right now, we give esteem to papers for being published in journals with low acceptance rates. Having a history of public review could serve as a new signal for papers into the future. On the flip-side, in an open system, gaining renown for taste-making or early flaw detection could become a possibility for individual reviewers or organizations.

This may all sound naive and utopian. Maybe so. What I know is that people quite enjoy reviewing, commenting, and moderating content online. I also know that both journal content and preprints need far better signaling tools than they currently have. It’s entirely possible that building out ever more platforms and initiatives may not catch on, but I think that, on a long enough timeline, it will. At the very least, like I said at the start, the ability should be afforded to any of us willing to contribute to strengthening truth claims and methodology.


Jay Patel:

What are some good examples of open, pre-publication peer review? Preprint comments? Are there other places to capture the comments like conference talks (Q&A sessions)?

Jay Patel:

I believe Ronen Tamari is working on something like social media widgets on preprint servers to nudge users to peer review openly and in micro interactions.

Jay Patel:

What are the psychological barriers to using Community Notes as opposed to richer threads? Can we chain Community Notes into a special Community Thread? Maybe the Community Notes feature is too coarse for scholars right now.

Jay Patel:

Curious to know if you have examples of scholars using Community Notes for open peer review. A dataset of that would be cool to explore. In months of studying open peer review examples online, I haven’t seen this very much. Laypeople and practitioners seem to use it, though.

Jay Patel:

Though those structures are a bit loose sometimes. I’m curious which ones are the most structured. Grounding these checklists and sensemaking prompts in reporting guidelines and syntheses of quality appraisal tools seems sensible.

Jay Patel:

In the immediate absence of multiple sources of compensation for open peer review, I wonder whether having role models may be helpful, along with infusing open peer review into academic courses, journal clubs, and departmental talks.

I’m conducting a content analysis right now on open peer review on social media (though I call it informal peer review).