Big team science enables structured, collaborative research

David Albrecht on his experiences with open science


The three key learnings:

  • Big-team science is characterised by coordinated collaboration between many research teams, each of which makes defined contributions throughout the research process. Organisation is based on clear roles, coordinated schedules and transparent selection criteria.
  • Formats such as many-analysts or many-designs studies show how different methodological decisions can lead to different research results. Big-team science thus creates the basis for making heterogeneity in research visible and evaluable.
  • For early career researchers, big team science projects offer an opportunity to contribute methodologically, learn from standardised processes and become part of larger scientific networks. Even without senior project responsibility, visible participation is possible – with potential added value for publications and professional contacts.

How did you first come into contact with open science?

DA: The topic of open science was basically already present when I started my PhD. I began my PhD in economics at Maastricht University in 2019. All doctoral students in economics were required to take certain courses that were not explicitly labelled as open science, but were clearly aimed at it in terms of content. Among other things, these courses covered research data management and practices such as pre-registration and the transparent documentation of methods and analyses. These topics were addressed early on and accompanied me throughout my work, both in internal seminars and presentations and in discussions with my supervisors. The question of how principles of open science could be incorporated into my own research was therefore present from the very beginning.

Another factor that had a significant impact on me personally was the supervision I received during my doctoral studies. I had three supervisors from different generations and subfields. Their experiences and perspectives on open science were correspondingly diverse. None of them were new to the topic, but their interpretations and priorities differed considerably. These differences led to constructive discussions, for example about which standards or best practices I should apply in my research. Looking back, it was very helpful to be confronted with different perspectives at an early stage, to actively form my own opinion on the topic and to integrate open science principles into my work from the outset. Of course, this is a learning process. What I practise today is certainly not yet perfect – much is still evolving. This applies equally to my own skills and knowledge, as well as to the latest best practices. Consciously engaging with open science is now clearly an integral part of my scientific practice.

Which open science practices are you currently using – and which of these are proving particularly helpful for your work?

DA: I would like to divide this into two perspectives: first, a look back at projects from my doctoral studies, i.e. before I joined Lab², and second, my current research. Looking back, I worked on several projects during my PhD in which I already applied open science principles, albeit not consistently and to the full extent. One of these projects was not pre-registered, but the underlying code and data were made openly available. This was relatively straightforward, as I come from the field of experimental economic research, where the publication of anonymised self-collected data sets is generally easy to implement.

I also pre-registered another project – my so-called job market paper – via the Open Science Framework. There, I created a complete pre-analysis plan, i.e. detailed advance documentation of the planned analyses. In this case, too, I made the code and data publicly available.

How did the pre-analysis plan and pre-registration help you in your work? What would you say are the main advantages?

DA: For me, the key aspect was not so much public accessibility in the sense of openness, but rather the structuring effect of the associated deadline. Pre-registration forced me to think through the planned analysis concepts in full before data collection began. I worked with simulated data to check in advance whether the planned analyses could be implemented as intended.

This created productive pressure to structure the research design clearly at an early stage – in other words, to think further ahead than I probably would have done without this formal requirement. Otherwise, there is a tendency to start an analysis in an exploratory manner without having developed a clear concept in advance. This was very helpful for my project. It not only prepared the subsequent analysis, but also influenced the design of the experiment – at a stage when adjustments were still possible. This allowed certain weaknesses and areas for improvement to be identified at an early stage and systematically taken into account.
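To make this concrete, here is a minimal sketch in Python of such a simulation-based check, assuming a simple two-group experiment; the sample size, effect size and variable names are illustrative assumptions, not values from the actual study:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulate data with the structure the experiment is expected to produce.
# Sample size, effect size and outcome scale are made-up assumptions.
rng = np.random.default_rng(seed=1)
n = 400
treated = rng.integers(0, 2, size=n)            # random assignment to treatment
outcome = 0.3 * treated + rng.normal(0, 1, n)   # assumed treatment effect of 0.3
df = pd.DataFrame({"treated": treated, "outcome": outcome})

# Run the estimation written down in the pre-analysis plan on the simulated
# data to confirm that the planned analysis runs and produces interpretable output.
model = smf.ols("outcome ~ treated", data=df).fit(cov_type="HC1")
print(model.summary())
```

Running the planned specification on simulated data in this way surfaces problems with the design or the analysis code while adjustments are still possible.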

Did pre-registration also give you a certain degree of security in the analysis process – for example, in terms of a structured workflow?

DA: Yes, it definitely created structure. I wouldn’t call it certainty in the strict sense. However, I also found that the plan was not sufficient in all respects. In some places, I deliberately deviated from it later on – with justification and transparent presentation in the paper. For me, this is an appropriate way of dealing with pre-registration: as a framework and benchmark, not as a rigid guideline.

I would like to turn to the topic of big team science. Could you briefly outline what this involves for our readers? What forms does it take and how do they differ?

DA: Sure. The main difference between traditional projects and big team science – also known as crowd science – is the participation structure. Many researchers work together on a project, usually with clearly defined roles throughout the research process. The timing of when participants get involved is key. This results in different formats. In a many-designs study, there is a common question, but no fixed study design. The teams involved develop their own methodological approaches, for example through experiments, surveys or observational methods. Interested teams are recruited through open calls, their designs are collected and later systematically analysed.

In many-analysts studies, on the other hand, the question and data set are already given. Here, the study examines how different the analysis approaches can be – with the aim of revealing the robustness or variance of results. The third form concerns data collection itself, for example in many-labs or many-surveys studies. Here, an identical experiment or a standardised questionnaire is conducted simultaneously in many locations in order to increase external validity through heterogeneous samples from different populations.

In all these cases, the question arises as to how to deal with the diversity of the submissions, including their quality. This can be clearly illustrated by the example of our current Many Analysts project at Lab². Even before the open call, we defined clear eligibility criteria. Participants had to demonstrate a completed PhD or relevant publication experience. In this way, we wanted to ensure a certain level of quality in the contributions. All 160 teams that met these criteria and submitted a relevant analysis proposal were considered. In addition, we set up an internal peer review process: each team evaluates ten analyses from other teams. This creates an internal quality ranking that is incorporated into the subsequent evaluation. For example, we analyse how the results change when only contributions from the upper rating range are taken into account. In this way, we combine formal criteria with a collective quality check by the participating community.
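A schematic illustration of such a rating-based robustness check; the team estimates, peer ratings, rating scale and cut-off below are invented for the example and are not project data:

```python
import pandas as pd

# Invented team-level results: one effect estimate and one average
# peer-review rating per team (illustrative values only).
results = pd.DataFrame({
    "team": [f"team_{i}" for i in range(1, 9)],
    "estimate": [0.12, 0.05, -0.02, 0.20, 0.08, 0.15, -0.05, 0.10],
    "peer_rating": [4.5, 3.2, 2.8, 4.8, 3.9, 4.1, 2.5, 3.6],  # e.g. a 1-5 scale
})

def summarise(df: pd.DataFrame) -> pd.Series:
    """Summarise the distribution of effect estimates across teams."""
    return df["estimate"].agg(["mean", "std", "min", "max"])

# Compare the full set of analyses with only those in the upper rating
# range (here: the top quartile of peer ratings).
threshold = results["peer_rating"].quantile(0.75)
print("All teams:\n", summarise(results), sep="")
print("Top-rated teams:\n", summarise(results[results["peer_rating"] >= threshold]), sep="")
```

The same comparison can be repeated for other thresholds to see how sensitive the overall conclusions are to the internal quality ranking.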

How does evaluation work in other formats, such as many-design studies?

DA: Basically similar to Many Analysts studies. In Many Designs studies, we also collect contributions from different research teams – in this case, alternative research designs for a common question. These proposals can also be evaluated in a peer review process, for example by the participating researchers themselves. There are two options here: either only research designs that meet certain quality criteria are considered, or all submitted designs are included and analysed afterwards to see whether there are differences between the results of higher and lower-rated proposals. In my view, it is important that this procedure is clearly defined in advance – ideally in a publicly available analysis plan. This avoids selective decisions being made retrospectively about which contributions are included and which are not, in order to achieve a desired result, either consciously or subconsciously. Transparency in these processes is crucial.

Am I correct in understanding that a Many Analysts study already provides results that are then evaluated as part of a review process?

DA: Not quite. I would narrow down the term “results” a bit here. What is evaluated in the review process is not the result in the sense of “hypothesis confirmed or rejected,” but rather the appropriateness of the proposed analysis. In this specific case, the question is whether a team’s methodological approach is suitable for testing the underlying hypothesis in a well-founded manner.

How likely is it that different analytical methods will lead to different results? In other words, how do you approach the question of scientific truth in this context?

DA: This probability exists – and that is precisely one of the central findings of Many Analysts studies. The few studies conducted in recent years have shown that different analytical decisions can lead to significantly different results. We are also addressing this very question in our current project at Lab². In terms of content, we are investigating whether “having daughters” has an effect on certain attitudes and behaviours – in other words, a social science question. We want to test several hypotheses here. At the same time, we are pursuing a meta-scientific perspective: we want to analyse how much the results at the content level vary between teams depending on the approach used to analyse the data. Since the data set, the question and the hypotheses are identical for all teams, we can isolate the influence of different methodological decisions in data analysis on the variability of the results. Measuring this variability is particularly relevant for us because it shows the influence that degrees of freedom (in our case, in data analysis) can have on scientific findings.

Many-Labs and Many-Surveys studies are therefore not the next step after a Many-Analysts study, but rather independent formats – for example, to test a design internationally, correct?

DA: In a way, yes and no. The three formats – Many Designs, Many Analysts and Many Labs – are currently mostly independent variations of Big Team Science. They can be combined thematically, but each follows its own logic in the research process. We are currently seeing that these formats are being carried out independently and that each of these forms can make an important contribution on its own, especially with regard to heterogeneity. Many Labs studies, for example, show how strongly results can differ between different populations – for example, between participants from Germany, Austria or other countries. Many Analysts studies reveal how the analytical decisions of individual teams influence the results. And many-design studies show that even the choice of study design, i.e. the way in which data is collected to answer a research question, can lead to different results.

Until now, these formats have generally been conducted separately. But that doesn’t have to remain the case. At Lab Square, we find the idea of developing studies that combine multiple levels – for example, varying the design, analysis and implementation levels – very exciting. This could be understood as a kind of “many everything” study – a comprehensive big team science project that relies on collaborative participation in all phases of the research workflow.

How does publication work in such big team science projects? Is there a central writing team, and does this result in one or more publications?

DA: That varies depending on the project. In our current case, a central publication is planned. The writing is done by us, the project coordination team. We designed and pre-registered a meta-analysis before the participating teams were involved. Once the team phase is complete, we will implement this analysis plan and use it as the basis for the paper.

However, there are other models. Felix Holzmeister took a different approach with his team in an earlier project – the Fincap project (Finance Crowd Analysis Project), which was later published under the title Non-Standard Errors. In this project, the participating analysts each wrote their own short paper of three to four pages. This resulted in around 160 individual contributions, supplemented by a central meta-paper that systematically summarises and classifies the results. This format is also conceivable. The publication strategy depends heavily on the project’s objective, the scope and the structure of the contributions.

You are closely involved with Lab² – what exactly is your role in the current project? Are you coordinating the whole thing?

DA: Yes, in the current Many Analysts project, I am responsible for coordinating the teams involved. I am the first point of contact for all 160 research teams – both in terms of organisation and content. This means that I provide information on the status of the project, remind people of deadlines and clearly formulate what we expect in each phase. At the same time, I support the teams in submitting their contributions on time and in the required format. My goal is to design the entire process in such a way that cooperation runs as smoothly as possible.

That almost sounds like managing a medium-sized company with 160 employees. Assuming someone wants to initiate such a project for the first time, are there established structures, templates or tools to support the management of such projects? Or is this still pioneering work?

DA: That is precisely one of our goals at Lab²: not only to implement such projects, but also to document them systematically. We don’t have any finished templates or handouts yet, but we are working on it. Our approach is to compile the experience and structures gained during the project into a kind of best practice guide. At the moment, we would say: anyone who is interested can contact us directly. We are also happy to offer exchange formats or short research stays to share our experience to date – currently on an informal basis, but with a view to offering a more structured format in the future.

From a coordinator’s perspective, what are the three most important things to consider when setting up a big team science project?

DA: In my view, it is crucial to have someone on the team who already has experience with such projects. This person does not have to be permanently involved, but access to practical experience alone is enormously valuable. In addition, I would recommend talking to people who have carried out similar projects at an early stage – even if they are not directly involved. Just write to these people, talk to them at conferences or similar events. I spoke to researchers in advance, for example from the University of Innsbruck, who had already implemented crowd science projects. Even just a few conversations gave me important insights. For example, they described how they structured their project management, developed technical tools or automated processes – all with the aim of keeping the organisational effort manageable. These insights were crucial for me to be able to fulfil my role as coordinator well.

What has been the response to big team science projects? Are there formats that are more in demand than others?

DA: My experiences so far have been very positive – especially with the current Many Analysts study. Before the official call, I spoke to many colleagues to present the idea. There was a lot of scepticism, especially outside the community that regularly deals with Big Team Science. The most common question was: What do you offer the participating teams? Why should they get involved if they’ll just be listed as one of a hundred co-authors at the end? These doubts also left their mark on me, and I was a bit nervous about how the response would actually turn out.

We had decided internally that we needed at least 80 teams for the project, otherwise we would not start it – for statistical reasons alone. At the same time, we did not want to allow more than 160 teams in order to ensure organisational feasibility. In the end, we received around 200 applications – from teams that all met our eligibility criteria, for example in terms of qualifications and methodological experience. So we even had to turn some people down. This shows that there is definitely interest in formats like this – even beyond the narrower Crowd Science community.

That means you didn’t have to actively “sell” the advantages of such a project – the response was immediate. Can you nevertheless assess what motivates the participating teams? Especially given the scepticism that they may end up being just one of many co-authors.

DA: I can only guess what motivated the individual teams. I haven’t had any direct conversations about this yet, partly because the research project is still ongoing and we don’t want to influence the teams’ work unnecessarily. But if you asked me what a meaningful introduction to open science might be, especially for doctoral students or junior researchers, I would recommend projects like this. Participation is well structured in terms of content, the requirements are clearly defined, and the workload is manageable. Over the entire project period, we are talking about perhaps two weeks of intensive work per team. For many, this is probably an attractive way to get involved in a larger scientific context without having to take on a major project of their own right away. And even if you take a cautious view of the value of the publication, the methodological insights gained from the collaboration are likely to be the decisive incentive for many.

Does professional networking also play a role? Are there opportunities within such projects to get to know colleagues from the community, the 159 co-authors, so to speak?

DA: I think this aspect is definitely relevant – even if it doesn’t come into play right from the start. At the moment, the teams don’t know anything about each other, and this is deliberately organised this way: we want to avoid the groups exchanging ideas and thereby unintentionally influencing the results. But at the end of the project, when the manuscript is finished and all those involved are named as co-authors, there will of course be an overview of who contributed. At that point, at the latest, you can become aware of colleagues you didn’t know before. For me personally, this means that if I meet someone at a conference who was also part of the project, I immediately have a point of contact. Such encounters can promote professional exchange and enable new contacts to be made – especially across national borders.

With such large projects, there is a good chance that you will meet some of the contributors later at conferences or in other contexts.

DA: Exactly. And this networking aspect doesn’t just apply to young researchers. I was pleasantly surprised by how many senior researchers participated in our project. This means that, in the end, you’re not just connected to other doctoral candidates or postdocs, but you also share a publication with renowned names in the economics community. This can be a good conversation starter at conferences – perhaps not as a direct door opener, but definitely as a point of contact. And that is an added value of such projects that should not be underestimated.

How do you assess the importance of big team science or crowd science overall? Is it more of a niche or will it become a growing part of scientific practice?

DA: I currently see a clear growth trend. Crowd science is becoming more visible and gaining in importance – even if it is certainly not a format that will fundamentally replace traditional research approaches. In an earlier job interview, I was once asked whether crowd science would dominate everything in the future. My assessment is: no. It is not a universal tool for every question. Rather, it is particularly suitable for research questions that have already been investigated many times but for which there are inconsistent results. In such cases, a crowd science project can help to reveal heterogeneity and explain it systematically – for example, through different analysis or design decisions.

It will certainly be exciting to see how the use of artificial intelligence affects this area in the coming years. Increasingly, the question arises: could AI take on individual roles in such projects – for example, in coordination, or even as “virtual analysts” who simulate different methodological perspectives? This could fundamentally change the idea of crowd science once again.

Or even in evaluation?

DA: Both the generation of different ideas, which we are now crowdsourcing, and their evaluation – in other words, the entire process – will probably change significantly as a result of the growing possibilities offered by artificial intelligence. At the moment it is difficult for me to say exactly in which direction this will go. But it is clear that a lot is happening here at the moment – both in science as a whole and in the field of crowd science.

If you understand crowd science or big team science as part of open science, how do you generally see the interaction between artificial intelligence and open science? Where are the possible interfaces?

DA: I would highlight two aspects in particular. First, the AI systems already available today can directly support open science – especially in terms of reproducibility. The day before yesterday, I attended an online symposium at the Meta Science Conference in London. There was a discussion about how AI can be used to systematically identify errors in published research papers. Can AI check something like computational reproducibility or perform other logical checks? I think such procedures can make a significant contribution to ensuring the quality of scientific results. In this sense, I clearly see AI as an enabler – especially in the context of reproducibility and transparency, which are central goals of open science.

Do you mean reproducibility in the sense that AI automatically checks whether published data and code lead to the same result?

DA: Exactly, that’s the idea. Today, we still rely on complex reproduction or replication studies – such as those conducted by Abel Brodeur’s Institute for Replication – which require a lot of human resources. AI could make such checks much more efficient in the future. And if that succeeds, it will also become more realistic to systematically integrate reproducibility into peer review processes.
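A very simple sketch of what such an automated check could look like, assuming a hypothetical replication package whose analysis script writes its key estimates to a JSON file; the file names, keys and published values below are made up for illustration:

```python
import json
import math
import subprocess

# Re-run the original analysis pipeline of the (hypothetical) replication package.
subprocess.run(["python", "analysis.py"], check=True)

# The script is assumed to write its key estimates to results.json,
# e.g. {"main_effect": 0.302, "se": 0.041}.
with open("results.json") as f:
    reproduced = json.load(f)

# Values as reported in the paper (made-up numbers for the example).
published = {"main_effect": 0.30, "se": 0.04}

# Flag any estimate that deviates from the published value beyond a tolerance.
for key, value in published.items():
    ok = math.isclose(reproduced[key], value, abs_tol=0.01)
    print(f"{key}: published={value}, reproduced={reproduced[key]}, match={ok}")
```

Which quantities count as key results and what tolerance is acceptable would have to be defined per study; the point is only that the comparison itself can be automated and, in principle, delegated to AI-assisted tooling.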

I see.

DA: If I, as a reviewer, am also supposed to check reproducibility today, I think twice about whether I want to put in the extra effort. If AI makes this easier in the future, reproducibility could be checked much more frequently. That would help to identify errors earlier – and possibly also prevent non-reproducible results from being published in the first place.

That was one point. The second is just as important to me: open science will become even more relevant as AI is used more extensively in research. Today, we may be talking about using ChatGPT to revise individual sections of research papers. But as AI becomes more powerful, the prospect is that entire research processes, from data collection to analysis, could be AI-based.

This raises the question: How reliable are such results? This is precisely where clear open science principles are needed – to ensure that AI-generated research remains transparent, verifiable and ultimately trustworthy.

So in future, I will not only publish data and code, but also my prompts?

DA: Yes, for example, that would be an important step towards transparency. But another point is that the more research projects are no longer primarily human-based but increasingly AI-based, the more important the question of the reproducibility of results becomes. In recent months, there has been a lot of talk about so-called “hallucinations” of AI, i.e. content that is generated without being based on verifiable facts. Of course, this must not happen in scientific work. But the greater the proportion of AI in a study, the greater the risk that such errors will go unnoticed. That is why I believe it is all the more important to consistently apply principles such as reproducibility and transparency – especially in a research landscape in which AI is playing an increasingly active role.

Thank you very much!

*The interview was conducted on 19 June 2025 by Dr Doreen Siegfried.
This text was translated on 14 July 2025 using DeepL Pro.

About David Albrecht, PhD:

David Albrecht is a behavioural economist and postdoctoral researcher at the Berlin Social Science Centre (WZB). His research focuses on data-driven approaches to analysing economic issues and decision-making processes. Among other things, he investigates economic preferences and group behaviour. At the WZB, Albrecht coordinates the research and laboratory activities of the project “Lab Square”, which is led by Anna Dreber and Levent Neyse. The aim of the project is to establish a central hub for replicability, meta-science and crowd science in the economic and social sciences.

Contact: https://da-lbrecht.github.io/

LinkedIn: https://www.linkedin.com/in/david-albrecht-2b4479120/


