From the replication to the job interview
Andreas Peichl talks about his experience with Open Science
Three key learnings:
- Replications help initiate contacts. Discussion and exchange with the original authors can sometimes even lead to a job offer.
- Publishing research data and code increases your own credibility and improves your image, especially if your name is still unknown in the research community.
- Pre-registrations are useful for experiments, for example in the AEA RCT Registry of the American Economic Association.
Can you give us a Best Practice example in the context of Open Science from your field?
AP: In my department at the ifo Institute we produce the ifo Business Climate Index. All the data can be downloaded from the ifo homepage as an Excel file. Every month we survey 10,000 companies for this. The microdata with their responses are available through our research data centre, the EBDC, and researchers worldwide use this access. Here we act as data producers: we make the data available and also make the methodology transparent. We communicate the Business Climate Index through various channels, among them a monthly press conference. Another example is a new Open Source simulation model for evaluating policy reforms. Until now, every institute had its own model. For a few years now we’ve been doing this together with IZA and ZEW on the basis of a common code. And now there’s also a project together with the institutes and the University of Bonn in which we are developing our new model, GETTSIM, based on the old models. It is available as true Open Source on the internet, on GitHub for instance, so that everybody can use it.
Is the code already available?
AP: In a simplified form the code is available on GitHub, but the whole project is still under development. We have received project funding from the German Research Foundation to develop this model, to make it truly Open Source and to document it in such a way that everybody can understand and use it.
What is the role of replication studies for you?
AP: In my teaching I encourage students to do replications, especially master’s students. In Mannheim I did this with a lecture plus seminar. For the seminar I chose papers to be replicated. Right now I am holding a seminar in Munich where each participant is given a publication to replicate, to extend or to apply to another dataset. For instance, students are given a publication based on American data and are told to apply its approach to German data and then to analyse the differences. I think this is very important for students. On the one hand they can see how science is done, on the other hand they learn how to do it themselves. They can see whether they get the same results as the original authors and, if not, how the differences can be explained. The course is always fully booked and very popular among students.
Have you experienced positive effects yourself in the context of replications?
AP: One positive effect has been that I was offered a job after finishing my PhD. This was also a bit more at the interface between scientific work and policy simulation. At the time, “citizen’s money” (a basic income) was hotly discussed in Germany. One part of my thesis looked at the effects of this and other reform proposals. Our computations and those of other institutes produced very different results. This gave rise to lots of discussions, of course. And these discussions initiated contact with the IZA in Bonn, where I started working after I finished my thesis. I hadn’t even thought of them as a potential employer. Because of the exchange I was invited directly to a job interview. Apart from that, there’s always exchange and discussion. Replications help initiate contacts. Especially when you do methodological research it’s useful to make your software and code available in some form so that others can use them.
When you have various publications on your desk, would the author get bonus points for more credibility if data are available?
AP: Definitely, especially if it’s a paper in a journal. If the journal has a Data Availability Policy and the author applied for an exemption, I would try to understand why that happened. In that sense there’s definitely a bonus point for available data. However, data availability has no influence on whether or not I read the paper. But if I think of a meta-study, I would probably give a little less weight to those papers for which no data are available.
Would you advise early career researchers to publish their data if it is possible?
AP: Yes, definitely. We try to push Open Science so that data are available. What I find useful in any case is to make the data available to the journal within the framework of the publishing process. If the data are self-collected, it’s comparatively unproblematic. Problems can arise however with data from other sources. Nonetheless you need to keep scholarly competition in mind and choose a good moment for publishing your data.
Do you share your research data in the Economics & Business Data Centre of LMU and ifo?
AP: Yes, our research data are archived in the EBDC if we collect them in some form and are thus able to share and archive them. In other projects, archived data are stored by the data provider, e.g. data from the Federal Employment Agency at the IAB in Nuremberg. In other contexts it depends on the journals, which operate or use different repositories of their own; in that case the final data and code are archived there. Usually we save one version in our EBDC as well.
Who can access the EBDC? Do potential secondary users have to prove their credentials?
AP: Access is primarily for scientists. You can make a request for use to our research data centre, and then there are different data protection requirements depending on the data. There’s a formal process that varies with the dataset. In some cases, data are sent to you and you can do with them whatever you like. But there are also cases where you have to go to Munich and consult the data inside the EBDC from a secure workstation without internet access. When data are archived, the access options are defined.
When were you first made aware of Open Science, and why?
AP: I can’t really give a precise date and time. During my student years and my thesis I tried to replicate other studies. There were things I found interesting and I asked myself how the researchers had done them. I graduated in 2004, and in my final thesis I tried to replicate a study done by colleagues at the DIW. I got in contact with these colleagues and received their code so I could retrace things. Later at the IZA in Bonn I was a member of the Data Committee of the research data centre. There we also looked at how we could archive our data and make our results replicable. I have only become aware over the last five or ten years that these activities fall under the rubric of Open Science. In Munich we make our data available through our research data centre EBDC. This includes submitting the data and the code. This is important for science, so that things can be replicated.
So your own material is also stored in the research data centre?
AP: Exactly. In Munich right now we have a Collaborative Research Centre where it is obligatory to archive all data etc. in our research data centre (EBDC), which then offers access options. The problem with many things that I do is that the data don’t belong to me. For example, I work a lot with tax data from the Federal Statistical Office and I’m not allowed to publish these data. That’s often a problem when working with administrative data. You enter into a contract with the Federal Statistical Office for a limited time, and when the corresponding project is concluded you can no longer access this dataset. I am a member of the Scientific Advisory Board of the statistical offices’ research data centres, where we repeatedly discuss the problems of access to certain data and try to raise awareness of them, and that awareness is certainly there. Often it’s due to a lack of resources that some of these data can’t be archived. We have a project right now where we work with 60 million tax returns with 2,000 variables over a period of twenty years. And precisely for such huge datasets the statistical offices or the providers of administrative data often don’t have the infrastructure needed to separately archive the dataset I’ve worked with. Data are updated repeatedly, laws are changed, and if I take today’s data they won’t necessarily be the same as the data I used before. That’s another point that causes difficulties.
Do you use pre-registrations? And if so, where do you do it?
AP: Partially, yes. When it’s appropriate, e.g. for a survey experiment. I think pre-registration makes most sense for experiments. I usually do it at the American Economic Association, in the AEA RCT Registry.
Can you offer tips for young researchers starting with Open Science?
AP: In my opinion it’s definitely part of good scientific practice to make your material available to others as far as possible, so that it is replicable. In a sense this is ultimately a job requirement for scientists, and therefore you should do it as well as you can as a matter of course. There are a few problems and obstacles at some points that you need to be aware of. If you publish data etc. on your homepage you create a positive image. At least it gives me a good impression of a person acting transparently.
How do you see the future of the Open Science movement?
AP: I believe the Open Science movement can no longer be stopped, and that’s a good thing. Whether the steps are small or big depends on your perspective. As regards publications in academia: as long as we have publishers and journals with a certain business model, it will be difficult to reach a true Open Access, Open Science domain, simply because there will always be people who earn money from Restricted Access in its various forms. Something similar happens with patents: as soon as there is a commercial interest, it creates incentives for restriction. Conversely, of course, you wouldn’t have many investments and innovations without this commercial interest. What is not going to work is to say that everything must be Open Access, no matter whether it’s financed with private or with public money. If you had more access to data, or even to patents, it might give rise to more innovation, and that will always remain the problem: finding the right balance. But then, that’s what we economists study.
The questions were asked by Dr Doreen Siegfried.
The interview was conducted on March 10, 2022.
About Professor Andreas Peichl
Professor Andreas Peichl is director of the ifo Centre for Macroeconomics and Surveys. He is a professor of macroeconomics and public finance at the Faculty of Economics of LMU Munich. His research activities focus on public finance, tax and transfer systems, distribution and inequality, and the labour market. Professor Andreas Peichl is a member of the Scientific Advisory Board of the Federal Ministry of Finance. He is also a member of the Open Science Centre of LMU Munich.