How young scientists can be sensitised to Open Science
Andreas Beyerlein talks about his teaching experience with Open and Reproducible Science
Three key learnings:
- Combining obligatory statistics courses with a module on Open and Reproducible Science can be a promising way of sensitising young scientists to Open Science.
- The benefit of an Open Science Competence Centre at a university is that skills can be bundled and researchers can get advice exactly when they need it.
- Early career researchers submitting an article for the first time can profit from subject-specific checklists which tell them exactly which information should be given where.
When you exchange ideas with statisticians, psychologists, economists, and other scientists at the Open Science Centre in Munich, how important are discipline-specific differences? Are there interdisciplinary conversations about Best Practices regarding Open Science?
AB: Yes, indeed. For instance, we discuss course content. Let me give an example: during my work at the Core Facility Statistical Consulting – a service unit at the Helmholtz Centre Munich – I was part of the team which established a course programme with three units: (1) introduction to the statistical software R, (2) introduction to statistics and (3) Open and Reproducible Science. These are obligatory for PhD candidates. For the third unit on Open and Reproducible Science, we had intensive exchanges with colleagues at the Open Science Centre of LMU Munich. Discipline-specific differences showed in the awareness of replication studies, which is much higher in psychology than in medicine, for instance.
How did participants respond?
AB: The courses in R and statistics always received very good evaluations. We also tried to lighten them up with so-called Statistical Stories, such as: where do we encounter statistics in our daily lives? How can a test be falsely positive or falsely negative? Most participants can apply the content of the R and statistics courses directly in their research, even if they have only just embarked on their PhD thesis. When it comes to Open and Reproducible Science, we noticed that participants recognised the relevance of the topic more quickly if they were at a more advanced stage of their scholarly work. For PhD candidates with no previous exposure to scholarly publishing, the topic of Open and Reproducible Science was rather abstract at first. It’s important, though, to sensitise early career researchers from the beginning to the importance of careful documentation and good scientific practice, and to give them practical guidelines for this. That’s the purpose of these courses.
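A classic Statistical Story of the "falsely positive test" kind can be worked through with a few lines of code. The numbers below (1% prevalence, 95% sensitivity and specificity) are illustrative assumptions for the sketch, not figures from the courses:

```python
# Illustrative "Statistical Story": a test with seemingly good accuracy
# can still yield mostly false positives for a rare condition.
# All numbers are made-up teaching values.

prevalence = 0.01   # 1% of the population actually has the condition
sensitivity = 0.95  # P(test positive | condition present)
specificity = 0.95  # P(test negative | condition absent)

# Probability that a randomly chosen person tests positive
p_positive = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)

# Bayes' theorem: probability the condition is present given a positive test
ppv = sensitivity * prevalence / p_positive

print(f"P(positive test) = {p_positive:.4f}")
print(f"P(condition | positive test) = {ppv:.3f}")
# Despite 95% sensitivity and specificity, only about 16% of
# positive results are true positives here.
```

The counterintuitive result comes from the low prevalence: the small error rate applied to the large healthy population produces more false positives than true positives.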
Would you also recommend that economic research institutes offer obligatory statistics and reproducibility courses? And if so, who would ideally teach these courses? Should they be offered as interactive workshops so participants can get actively involved?
AB: I believe that economics would also benefit from them. Half a day for Open and Reproducible Science isn’t much of a time investment, and you can cover a lot. I also think that the course should include discipline-specific examples, because otherwise it will come across as very theoretical. The workshop format didn’t seem useful to us for the Open and Reproducible Science module. Instead, we tried to start a discussion with the participants at certain stages. We also showed Best Practice examples and useful tools for creating, saving, and disseminating reproducible source code, but for practical reasons and lack of time we didn’t go into implementation. The two statistics modules, on the other hand, include many exercises and interactive work. In addition, there’s always the Core Facility, which offers individual coaching in statistics and Open Science outside the courses.
According to your experience so far, should there be a stringently coordinated Open Science course at research institutes? Or would you consider various individual offers from different units at a university more effective?
AB: I think such a course is very sensible for the Helmholtz Centre Munich with its focus on biomedical research. But for a more multidisciplinary institution such as LMU Munich, this approach seems less practicable to me. The LMU Open Science Centre therefore aims to bundle the resources and skills of different disciplines. So if an LMU scientist has questions about a certain aspect of Open Science, they ideally contact the Open Science Centre. Colleagues there can help directly and know who is an expert on which topic. For example, I am not familiar with the legal aspects of Open Access; for this, the university library would be the competent partner. The whole field is also evolving very quickly, and I think it is hardly possible to stay on top of the latest developments on all issues. So it’s good if you can talk to people who know exactly which platform or tool is best suited for which purpose – a division of labour, so to speak. The advantage of such a Competence Centre is also that it raises the university’s awareness of the whole issue and makes it visible.
You are an editor for PLOS ONE. Do you really insist on seeing all the data in your discipline? Does PLOS ONE have special requirements compared to other journals?
AB: For PLOS ONE it is important that the study is carried out scrupulously and that especially the methodology and the results are adequately described. Unlike at most other journals I know, the study’s actual findings may not be included in the evaluation by reviewers and editors. I think this approach is essential to avoid so-called publication bias, i.e. reporting only “positive” (statistically significant) results while dismissing “negative” (non-significant) ones – a principle which obviously sets the wrong incentives for the science system. That’s why I was happy to sign on as an editor. Even before, when I was a reviewer for PLOS ONE and other journals, it wasn’t decisive for me whether the result changes the way you look at things, but whether the study is well done and whether I believe that the result is coherent and adequately discussed. Are the limitations stated? Can I understand what has been done? I read the methodological section very closely and compare it to the results section to check that everything fits together and is understandable. But it has never happened to me that data and code were provided upfront.
So you must request them?
AB: Yes, exactly. Providing the code directly practically never happens, although it would actually be possible. Usually I can see very well from the text whether the data and the analysis appear to be valid. Sometimes I suggest additional sensitivity analyses, e.g. does the result change materially if you include a certain variable? Are there indications of systematically missing values, and how might they influence the findings? Both as an editor and as a reviewer, I always demand in the first review that the code be provided, and I always ask whether the data can be provided; if not, I demand a statement explaining why not.
How often are data made available in the end?
AB: Very rarely. Of course, providing individual-level data is a delicate affair in the life sciences for privacy reasons, unfortunately even in the case of anonymised data. So in most cases I accept that the data can’t be provided. With code, however, I often meet with resistance: in perhaps fifty percent of cases, the code is made available to me for a second review upon request. In the other fifty percent, I’m offered spurious arguments why it’s impossible – which I don’t accept.
When you say that PLOS ONE is a good place to publish your non-significant studies: how many studies with non-significant results are there in PLOS ONE? Does their share grow over the years?
AB: In general I think that awareness of the value of non-significant findings has grown over the last years, as has awareness of the duty to publish well-done studies regardless of their results. Given that PLOS ONE publishes more than 1,000 papers per month, I can’t make comparisons over time. Generally, my impression is that medical papers are often submitted first to journals with higher impact factors. But those journals often publish only papers with “positive” results, so the ones with “negative” or at least less spectacular results end up at PLOS ONE, which, because of its deliberately broad subject focus, cannot compete with the impact factors of the high-ranking medical journals. Thus, PLOS ONE represents a corrective of sorts, although, in view of the publication bias already mentioned, it would of course be desirable for other journals to evaluate studies primarily according to their quality and not according to their results.
What are your tips for young economists at the start of their careers for making their editor happy? What should they keep in mind?
AB: To me it’s important that the entire paper follows a stringent line of argument from the introduction to the discussion, and my focus is on the description and presentation of methods and findings. I always recommend the use of discipline-specific checklists which tell you exactly which information to give where. And if you publish the code and the data in advance – which is probably easier in economics than in the life sciences – you have already done a lot in the direction of Open and Reproducible Science. In addition, I recommend writing code clearly and reproducibly from the beginning. You will be grateful for this when, weeks or months later, someone asks for additional analyses – which, in my experience, nearly always happens. As a statistician, I always consider it important to address multiple testing, which, in addition to choosing an appropriate correction (e.g. Bonferroni), also includes clearly naming main and secondary analyses and placing them in the context of the hypothesis. Sometimes there are studies to which such concepts cannot be applied meaningfully. As a reviewer, I attach great importance to such studies not being discussed as “conclusive”, but as purely hypothesis-generating, i.e. all findings must be confirmed in independent studies. What I also want to mention is that I do not see my role as a reviewer or editor as purely evaluative: I also always try to make constructive suggestions for improvement, in order to add aspects to the submitted work that may not have been considered yet.
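The Bonferroni correction mentioned above is simple enough to show in a few lines. This is a minimal sketch with invented p-values; in practice the p-values would come from the study’s pre-specified main and secondary analyses, and other correction methods (e.g. Holm) are often preferred:

```python
# Minimal sketch of a Bonferroni correction for multiple testing.
# The p-values below are invented example values.

def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capping at 1."""
    m = len(p_values)
    return [min(p * m, 1.0) for p in p_values]

raw_p = [0.004, 0.03, 0.20]    # three hypothesis tests
adjusted = bonferroni(raw_p)   # approximately [0.012, 0.09, 0.60]

alpha = 0.05
significant = [p < alpha for p in adjusted]
print(significant)  # only the first test survives the correction
```

The point of the correction is that testing several hypotheses at the nominal 5% level inflates the chance of at least one false positive; scaling each p-value by the number of tests restores control of the family-wise error rate.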
Where do you generally see the future of Open Science?
AB: I think Open Science has a positive appeal. Unfortunately, there are still many established scientists in senior positions who think little of Open Data or reproducibility, or who haven’t looked at the issue closely enough and thus fall back on traditional ways – according to the motto “We’ve always done it like this”. PhD candidates and post-docs socialised in this fashion naturally find it difficult to implement Open Science. At the Helmholtz Centre Munich, it was very important to us to reach young scientists with our courses so that later, when they reach senior positions themselves, they can implement Open Science. At the same time, there must be suitable incentives, or the intrinsic motivation will be lost. There’s heavy competition for positions and third-party funding, and many scientists are on fixed-term contracts, so it’s understandable if researchers can’t think only altruistically. Open Science must be substantially reflected in funders’ guidelines – and happily, it already is. I am very optimistic that these two approaches – supporting from below and pulling from above – will move everything in the right direction.
The questions were asked by Dr Doreen Siegfried.
The interview was conducted on February 11, 2022.
About Dr Andreas Beyerlein
Dr Andreas Beyerlein is a statistician and epidemiologist at the State Office for Health and Food Safety as well as a senior lecturer at the Technical University of Munich. He is an Associate Editor of PLOS ONE and a member of the Open Science Centre at LMU Munich.