What does it mean to publish your scientific paper in 2020?
Benefits to authors of non-anonymous preprints.
One of the most popular reactions to our previous post was: “we generally agree, so let’s allow preprints before acceptance, but in anonymous form only.” The current post addresses this idea, and to do it properly, we need to go back to the essentials and try to answer these questions: “What is a scientific publication for?,” “What are the functions of peer review?” and “How is it related to science?” Prepare yourself for a long read.
- Functions of conferences.
- Peer review quality.
- Conference prefiltering.
- Role of double-blind peer review.
- Benefits to the community.
- Benefits to authors.
Part 1: Defining science and the function of conference publications.
The first point is that “science” as a process of mining and updating knowledge by using the scientific method does not equal modern “science” as a set of management and organizational routines (grants, conferences, publications, etc.). This latter form of science, which we designate science-as-implemented, is just a current form of how science is performed in modern society. Having that in mind, let’s revisit the purpose of scientific publications and scientific conferences in computer science and engineering in particular.
Functions of conferences.
For the author, conferences provide:
- Knowledge dissemination: “I want people to know about the new knowledge I discovered.” The conference promises a certain minimum level of attention for one’s work. In other words, if you don’t publish at top conferences, nobody will read your work.
- Feedback: I would like people to check my results through the review process and discussion.
- Formal goodies: checking the boxes required to defend a Ph.D., for a tenure package, a performance review, a grant report, etc.
- Certification: publishing at CVPR is hard and therefore valuable.
- Reputation-building: listing certain conferences on one’s C.V. as a way of building one’s name as a scientist.
- Networking: meeting with peers, potential employers, etc.
Out of the six functions, only 2.5 (dissemination, feedback, and part of certification) are related to science as a knowledge mining process, that is, the first definition of science. We will get back to this shortly.
For the audience:
- Prefiltering: time is limited, so we outsource the selection of what we are reading to the reviewers.
- Certification: time is limited, so we outsource the quality control and result check to the reviewers. We create a basic classifier: “If the paper is published at a top conference, it is true.”
- Special case of certification: people outside the field, who lack the qualifications to judge the work themselves, can select work that meets basic quality guarantees.
- Authors promise to answer our questions (symmetrical to “attention for the author from audience,” and audience gets the guarantee that questions about the work will be answered at the talk or poster session).
“Certification” partly serves science as a knowledge mining process by lowering the barrier to building on top of others’ work. The rest of the functions serve the science-as-implemented model of professional scientific work and help the community cope with the scarcity of resources (time, money, attention).
So far, so good. All of these functions are built on top of double-blind peer review as the certification engine for most cases. However, there are two problems with these conference functions currently: peer review quality and inherent subjectivity, caused by the prefiltering function. The first problem is, at least in an ideal world, solvable, while the other is not — by design.
Peer review quality.
Peer review achieves neither 100% precision (all accepted papers are good) nor 100% recall (all good submissions are accepted). Let’s forget about recall for now and focus on precision. There is quite a bit of evidence that papers that have passed peer review can contain critical flaws, including methodological and factual errors. For example,
- The recent benchmark “A Metric Learning Reality Check” by Kevin Musgrave et al. specifically lists tens of papers published at top conferences that contain unfair comparisons.
- “A Unifying Perspective on Neighbor Embeddings along the Attraction-Repulsion Spectrum” by Böhm et al. shows that the properties of the widely cited and used UMAP method are an artifact of the implementation rather than of what was published.
- The study “On the Convergence of Adam and Beyond” by Reddi et al. shows that the convergence proof of the popular Adam optimizer is wrong. This doesn’t make Adam a bad optimizer (practice shows that it is good); the point is that such a flaw went unnoticed by reviewers.
- The study “Cracking BING and Beyond” by Zhao et al. shows that the great results of the popular BING object proposal (CVPR2014 oral, >1000 citations in Google Scholar) are not due to the proposed objectness detector, but are the result of clever implementation-related hacking of the 0.5 IoU metric.
- Moreover, keep in mind that all of the above examples are about the best of the best: top conferences. The quality of the average paper and of peer review at lower-tier venues is much, much worse.
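To make the precision and recall framing above concrete, here is a minimal sketch in Python. It treats peer review as a binary classifier over submissions. The function name `review_precision_recall` and the toy decisions are our own hypothetical illustration, not data from any real conference.

```python
def review_precision_recall(decisions):
    """Compute precision and recall of a review process.

    decisions: list of (accepted, good) boolean pairs, one per submission.
    "good" is a hypothetical ground-truth label that nobody has in practice.
    """
    accepted_good = sum(1 for accepted, good in decisions if accepted and good)
    accepted_total = sum(1 for accepted, _ in decisions if accepted)
    good_total = sum(1 for _, good in decisions if good)
    # Precision: share of accepted papers that are good.
    precision = accepted_good / accepted_total if accepted_total else 0.0
    # Recall: share of good submissions that were accepted.
    recall = accepted_good / good_total if good_total else 0.0
    return precision, recall


# Hypothetical toy data: (accepted, good) for six imaginary submissions.
toy_decisions = [
    (True, True),    # good paper, accepted
    (True, True),    # good paper, accepted
    (True, False),   # flawed paper, accepted (hurts precision)
    (False, True),   # good paper, rejected (hurts recall)
    (False, False),  # flawed paper, rejected
    (False, False),  # flawed paper, rejected
]

precision, recall = review_precision_recall(toy_decisions)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

The examples listed above are precision failures: accepted papers that turned out to have flaws. Recall failures are harder to observe, since rejected good work rarely gets a second look.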
Conferences are filters, lotteries, or some poor combination.
Let’s now recall the Prefiltering and Formal goodies functions of the conference. We argue that the main point of the current review process is not to answer the question “Are the results in the paper correct?” but rather “Is the paper worthy to be published at our precious conference?” And while the first question is related to the paper’s technical aspects and is mostly objective, the second one is subjective by definition. For example:
- A reviewer thinks that a dataset paper is not for the conference, while an AC thinks it is (see the discussion on OpenReview).
- Aaron Hertzmann on rebuttals, where instructions specify that, “the rebuttal is for addressing factual errors in the reviews and for answering specific questions posed by reviewers.” He writes,
“What’s the problem with this rule? It implies that bad reviews are bad solely because of factual errors. It implies that writing a technical paper is primarily about providing objective or factual information, and that the more subjective aspects of paper-writing aren’t important. The first implication is obviously false, and the second one is more subtly wrong.[…] In a sense, most computer science papers have two components. The first are technical components: theory, algorithms, and experimental results. The second, however, is an argument about why the technical components are significant.” (emphasis ours)
- The paper “Emerging trends: Reviewing the reviewers (again)” by Kenneth Ward Church discusses the problems of conferences in a way similar to ours. Specifically, are reviews good enough at certification? Should we outsource all of the prefiltering functions to the reviewers? For both questions, the answer he leans towards is “rather not.”
Role of double-blind peer review.
What is the place of double-blind peer review here? It (let’s assume this is true) increases the fairness of the answer to the question “Is the paper worthy to be published at our precious conference?” by removing bias towards big names, because otherwise a reviewer could think, “If X has done this, then it must be worthy.” Thus, double-blind review should also ensure fairer credit assignment, along with the rest of the attributes and benefits of science-as-implemented.
Note that the point above has nothing to do with science as knowledge mining. A fact discovered about the universe is a fact regardless of who found it and who is credited for it. The benefits of double-blind review lie not in the science area, but in the science-as-implemented area.
This doesn’t undermine the importance of fairness, credit assignment, or the benefits of double-blind review. Rather the opposite: we are trying to draw the reader’s attention to the bigger picture before jumping to conclusions.
Having established that, let’s review the problematic aspects of anonymous preprints for those who are supposed to benefit from double-blind peer review, i.e., early career researchers from unknown labs.
Part 2: Anonymous preprints are a special case of delaying release until acceptance.
In our previous post, we did not discuss anonymous preprints in depth. They are not currently implemented in venues where we submit papers. But we do have thoughts about them.
We consider anonymous preprints to be a special case of delaying release of a paper until acceptance. Requiring anonymity has harms that impact the research community, and particularly early career researchers (ECRs), as explained in “Hands off arXiv!”. Here, we explain some of these issues in more detail.
After our previous post, besides comments about anonymous preprints, there were other suggestions about how authors could “just wait n months”, “just submit application packets this other way,” “just allow the system to correct for x, y, z”, “just…” do more work or wait to be noticed. We see these suggestions as contortions to get around barriers imposed by one’s own research community. Furthermore, we see these contortions as unnecessary, because of the greater cost to those without stability or security. Not everyone in our research community can assume security in their employment, visa status, finances, health, etc. It is our hope here to outline more clearly what some of these costs are to ECRs with respect to the delay of preprints and the special case of anonymous preprints.
We will discuss costs to ECRs from the opposite direction, by discussing all of the opportunities that come from non-anonymous preprints. When non-anonymous preprints are not permitted, the costs are missed opportunities. Keep in mind that ECRs from small labs typically have fewer opportunities than ECRs from large labs and universities that are very well known in the field.
Benefits to the research community of non-anonymous preprints.
The most obvious benefit of non-anonymous preprints is the straightforward distribution of authors’ research to the community. This research and ideas can then be clarified and explained through interaction or blog posts out in the open. Additionally, public code and data releases could be done without the administrative overhead of keeping track of anonymity at all stages. Code contributions could then also be accepted. While the research could also be distributed in the anonymous preprint case, the need to maintain anonymity restricts some forms of feedback from the authors.
Benefits to the authors of non-anonymous preprints.
The benefit to authors of non-anonymous preprints, particularly ECRs, is that they can build professional identities around the work instead of waiting until acceptance. Activities include collaborations, building informal networks of support outside of their home institution, and organizing events. All scholarly work, including preprints, can be listed on their C.V. to show research output.
Increased visibility through preprints allows others to become aware of these individuals. Examples include recruiters, who, in our experience, contact candidates rather than wait for candidates to submit application packets. Other examples include fellow researchers who are looking for speakers at local meetups, non-academic conferences, workshops, departmental seminars, and so on.
What is signaled by top conference acceptances?
On the topic of the “wait 3-n months” argument, there has recently been some discussion about conference acceptances as a requirement for acceptance to Ph.D. programs and some jobs (see Andreas Madsen for one account). We see this as a troubling trend. Under the prior requirements that one must travel for a paper publication, we want to explore what a top conference paper acceptance signals on a candidate’s C.V. One, it signals good quality work (in the technical sense) whose arguments for value are accepted by the community. But what else does it signal?
- the ability to pay conference fees, either out-of-pocket or with grants, and/or in a country with a favorable exchange rate,
- the ability to write well, an attentive advisor, or the employment stability to wait a few review cycles,
- adequate computing or laboratory resources to perform the work,
- having the type of citizenship that would allow one to get a visa to the hosting country.
(This is a non-exhaustive list.)
Once you start to interrogate these signals, we hope you get the idea that requiring top conference acceptances favors those with the luck to fulfill all of these conditions. If you want to change the demographics of your departments and companies and you use top conference acceptance as a criterion, you should have a deep think about what that really means.
While virtual conferences have alleviated some of these issues, they do not alleviate them for the years up to 2019, and there is no guarantee that virtual presentations will be allowed if/when COVID-19 ends. Moreover, some conference attendance and printing fees remain expensive, e.g. £450 for virtual ECCV 2020 (point 1).
As one can see, double-blind review combined with banning named preprints does not come for free. While removing one bias, the one towards big names, it introduces other biases and limitations, which can be more harmful than the original.
Is it worthwhile to preprint? It depends on the specific situation, which is why we argue for allowing researchers to decide how to share their work.
Stay safe and sane.
Sincerely yours, Dmytro Mishkin and Amy Tabb.
P.S. This post is also available on Amy Tabb’s blog.