Winners of the Dutch Data Prize 2022
The winners of the Dutch Data Prize 2022 have been announced. Out of 51 nominations, three datasets stood out most for their findability, accessibility, interoperability and reusability (FAIR). The award was presented for the seventh time, in the categories Life Sciences & Health, Natural & Engineering Sciences and Social Sciences & Humanities, during the FAIR Data Day on 29 November at the Jaarbeurs in Utrecht.
“Data sharing helps accelerate scientific progress. Data from one research project can be used to move another research project forward. But simply opening up your data is not enough. FAIR data means thinking about the needs of those who could benefit from the data. It encourages engagement and collaboration…” So begins the speech that Caroline Visser (NWO) gave as jury chair during the presentation of the Dutch Data Prizes.
“Today we are celebrating researchers who make contributions to their own fields but more importantly who take steps to ensure that the research data they produce can be widely reused by others.” Visser speaks on behalf of the entire jury when she says they enjoyed looking at and evaluating all the entries.
DNA barcodes and new fungal species
In the Life Sciences & Health category, the award goes to DNA barcodes for fungal identification. This dataset stood out for its impact, originality and interdisciplinary relevance. The dataset contains more than 24,000 DNA sequences of 7,300 accepted filamentous fungal species. Fungi and their identification are very important for biodiversity. It is estimated that fungi represent 40% of our national biodiversity. This dataset is therefore also relevant to disciplines and topics such as medicine, food security and materials science.
Duong Vu, researcher and one of the creators of the dataset: “It is a great honor for us to have won the Dutch Data Prize 2022 in the Life Sciences and Health category. Our dataset is the result of a 10-year DNA barcoding project at the Westerdijk Fungal Biodiversity Institute, involving many scientists working on different aspects of the project including preserving the fungal strains of our collection, generating DNA barcodes for the strains, developing a system to manage the large amount of barcode data in a FAIR manner, and validating the barcode data. Winning the prize gives us lots of motivation. To some extent, I feel that the hard work of many people pays off. Hopefully, more fungal DNA barcodes will be released in the near future. I would like to take this opportunity to thank all my colleagues involved in the project and the funding organizations that support our work.”

Cutting paintings into fragments and analysing them
As a dataset, Materials in Paintings (MIP): An interdisciplinary dataset for perception, art history, and computer vision is firmly rooted in computer science, and it is a rich resource for computer vision and machine learning. What makes the dataset special is its reusability across different domains and its applicability to the arts and humanities. MIP was therefore awarded the Dutch Data Prize in the Natural & Engineering Sciences category.
MIP is an annotated dataset of 19,000 paintings from the past 500 years. These paintings were cut into more than 200,000 fragments. The depicted materials in each of these fragments were classified using machine learning algorithms. Of course, the cutting was done on digital images and not on the real paintings! All images can be downloaded in open formats, both as a comprehensive dataset via 4TU.ResearchData and through an interactive portal that allows users to browse the individual paintings.
“I regret I couldn’t be there in person, as I’m working at Kyoto University, but I enjoyed the updates from my team members on the ground. I’m personally very grateful to the 4TU team at Delft for nominating our work! I’m glad to hear that the jury is as excited about our work as we are and I am honored to be the winner of the Dutch Data Prize. Creating the dataset has been a lot of work, and it’s wonderful to get such positive responses.
“Last, we intend to spend the prize money on hosting/sponsoring a conference to further spread FAIR data practices in our field,” said Mitchell van Zuylen, one of the creators of this dataset.

4,000 children followed, from gestation to adolescence
YOUth is an excellent example of how data can be shared in a FAIR way, in line with GDPR, through a transparent process. The management team of the YOUth cohort study is a strong team of researchers who support open science. That is why YOUth won the Dutch Data Prize in the Social Sciences & Humanities category.
YOUth is a large-scale, longitudinal cohort study following nearly 4,000 children (and their parents) in the Utrecht region, from gestation to adolescence. The YOUth data are available for GDPR-compliant research use through safe, managed access. The dataset is accompanied by highly detailed information, including visualisations and videos on how to request the data. The team behind this dataset truly encourages and facilitates extensive and appropriate use of the data, making it easy for other researchers to learn about the dataset and to request the data for research purposes.
“We are super proud of winning the award. It is a fantastic recognition of all the time and energy we invest as a team in creating and making the dataset FAIR. We did this together from Utrecht University and UMCU, with all our measurement assistants, policy staff, the front office, the data managers, communication advisors, technicians, team leaders, the university library, IT services, our researchers, the management team and our participants. If we succeed with our large amounts of complex, sensitive data, it should also be possible for many other studies to make data FAIR,” said Coosje Veldkamp, YOUth project manager.

Reusable research data pays off
Every two years, the Dutch Data Prize is awarded to an individual or a team that creates highly reusable research data and makes it available in a repository. The prize is a valuable recognition of researchers’ contributions to their own field and to the principle of FAIR data. Besides the award itself, there is prize money to make data more FAIR and to encourage data reuse. The winners can use this money to, for example, organise a symposium or make their data more accessible online.
The Dutch Data Prize has been awarded since 2010.