The iDPP@CLEF Challenge as a Way to Open Science

iDPP@CLEF [1] is an open evaluation challenge that assesses the performance of Artificial Intelligence (AI) algorithms in predicting the progression of Amyotrophic Lateral Sclerosis (ALS) and Multiple Sclerosis (MS). iDPP stands for Intelligent Disease Progression Prediction; it is a series of events, organised by the BRAINTEASER project, that has been co-located with the Conference and Labs of the Evaluation Forum (CLEF) [2] since 2022 [1; 3–6]. This year's edition [3] is co-located with CLEF 2024 [4], whose final event will be held in Grenoble, France, from 9 to 12 September 2024.

ALS and MS are two severe neurodegenerative diseases that affect the Central Nervous System (CNS). They are chronic diseases characterized by progressive or alternating impairment of neurological functions (motor, sensory, visual, cognitive). Patients alternate between periods in hospital and care at home, experiencing constant uncertainty about the timing of the acute phases of the disease and facing a considerable psychological and economic burden that also involves their caregivers. Clinicians, on the other hand, need tools that can support them in all phases of patient treatment, suggest personalized therapeutic decisions, and indicate urgently needed interventions.

Therefore, AI algorithms, trained on both retrospective and prospective patient data, can be of great help to both clinicians and patients in providing indications about the estimated progression of such diseases to support therapeutic decisions, to contribute to better caregiving, and to reduce psychological burden and uncertainty.

To be effective and accurate, such AI algorithms need both to be trained on real patient data and to be tested on previously unseen patient data, in order to evaluate and ensure their capacity to operate reliably in real conditions.

Data is clearly an extremely valuable asset in this context, as in many others, since it comes from real patients and takes years to collect in amounts sufficient for training and testing AI algorithms. For the iDPP challenges, we gathered ALS and MS patient data from medical institutions in Turin, Pavia, Lisbon, and Madrid, and we carefully curated it to ensure its quality, correctness, and coherence. This means not only pre-processing, cleaning, and validating the raw patient data but also semantically modelling it by means of an ontology, the BRAINTEASER Ontology [5], and representing it in a knowledge base. In other words, we applied the FAIR principles (Findability, Accessibility, Interoperability, and Reusability) [8] to the preparation and sharing of the datasets for the iDPP challenges in order to maximize their availability, interpretability, and impact, making them a first-class citizen in the exploitation strategy of the BRAINTEASER project.

However, the FAIR principles are just one of the building blocks of the much broader vision embraced by the Open Science paradigm [2], which, according to the UNESCO Recommendation on Open Science, “sets a new paradigm that integrates into the scientific enterprise practices for reproducibility, transparency, sharing and collaboration resulting from the increased opening of scientific contents, tools and processes” [7, p. 7], where “increased openness leads to increased transparency and trust in scientific information and reinforces the fundamental feature of science as a distinct form of knowledge based on evidence and tested against reality, logic and the scrutiny of scientific peers” [7, p. 18].

Open Science is clearly crucial for a domain as sensitive as AI for predicting the progression of ALS and MS, where transparency, trust, and reproducibility play a pivotal role and where sharing and collaboration are indispensable for computer scientists, medical doctors, and patients to cooperate and transfer knowledge.

In this respect, the iDPP@CLEF challenges are an effective way to embody the Open Science and FAIR visions. They create and curate datasets, which are distributed to the researchers participating in the challenges and remain available beyond the challenges themselves under open source licenses. They bring together researchers working on such AI prediction algorithms and let them directly compare their approaches on the same datasets in order to understand what works better and why. They steer the development of such AI algorithms by setting increasingly complex tasks, iteration after iteration. They accelerate knowledge transfer by organizing an annual event where participants discuss their approaches, by publishing the technical description and analysis of the participant approaches in open access outlets [7], and by sharing the results of participants’ approaches under open source licenses [8]. Moreover, iDPP@CLEF 2024 relies on prospective data from patients currently enrolled in clinical trials promoted by the BRAINTEASER project, which represents a form of citizen science, another pillar of the Open Science vision. In this context, the annual event in September 2024 will offer the opportunity to involve not only the computer scientists participating in the challenge but also patient associations and medical doctors, in order to give them feedback on how their data have been used and on the results and advances those data produced in predicting the progression of ALS and MS.

In summary, the iDPP@CLEF challenges offer the BRAINTEASER project the possibility of adding value to its datasets and AI prediction algorithms as well as maximising their impact and exploitation. Above all, the iDPP@CLEF challenges fully embrace the Open Science paradigm and contribute to building transparency, trust, reproducibility, and collaboration around such AI prediction algorithms, involving both the research community and society.

 

References

  1. Aidos, H., Bergamaschi, S., Cavalla, P., Chiò, A., Dagliati, A., Di Camillo, B., de Carvalho, M., Ferro, N., Fariselli, P., Garcia Dominguez, J. M., Madeira, S. C., and Tavazzi, E. (2024). iDPP@CLEF 2024: The Intelligent Disease Progression Prediction Challenge. In Goharian, N., Tonellotto, N., He, Y., Lipani, A., McDonald, G., Macdonald, C., and Ounis, I., editors, Advances in Information Retrieval. Proc. 46th European Conference on IR Research (ECIR 2024) – Part II. Lecture Notes in Computer Science (LNCS) 14609, Springer, Heidelberg, Germany.
  2. Chubin, D. E. (1985). Open Science and Closed Science: Tradeoffs in a Democracy. Science, Technology, & Human Values, 10(2):73–81. https://doi.org/10.1177/016224398501000211
  3. Faggioli, G., Guazzo, A., Marchesin, S., Menotti, L., Trescato, I., Aidos, H., Bergamaschi, R., Birolo, G., Cavalla, P., Chiò, A., Dagliati, A., de Carvalho, M., Di Nunzio, G. M., Fariselli, P., Garcia Dominguez, J. M., Gromicho, M., Longato, E., Madeira, S. C., Manera, U., Silvello, G., Tavazzi, E., Tavazzi, E., Vettoretti, M., Di Camillo, B., and Ferro, N. (2023). Intelligent Disease Progression Prediction: Overview of iDPP@CLEF 2023. In Arampatzis, A., Kanoulas, E., Tsikrika, T., Vrochidis, S., Giachanou, A., Li, D., Aliannejadi, M., Vlachos, M., Faggioli, G., and Ferro, N., editors, Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Fourteenth International Conference of the CLEF Association (CLEF 2023), pages 343–369. Lecture Notes in Computer Science (LNCS) 14163, Springer, Heidelberg, Germany. https://doi.org/10.1007/978-3-031-42448-9_24
  4. Faggioli, G., Guazzo, A., Marchesin, S., Menotti, L., Trescato, I., Aidos, H., Bergamaschi, R., Birolo, G., Cavalla, P., Chiò, A., Dagliati, A., de Carvalho, M., Di Nunzio, G. M., Fariselli, P., Garcia Dominguez, J. M., Gromicho, M., Longato, E., Madeira, S. C., Manera, U., Silvello, G., Tavazzi, E., Tavazzi, E., Vettoretti, M., Di Camillo, B., and Ferro, N. (2023). Overview of iDPP@CLEF 2023: The Intelligent Disease Progression Prediction Challenge. In Aliannejadi, M., Faggioli, G., Ferro, N., and Vlachos, M., editors, CLEF 2023 Working Notes, pages 1123–1164. CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073. https://ceur-ws.org/Vol-3497/paper-095.pdf
  5. Guazzo, A., Trescato, I., Longato, E., Hazizaj, E., Dosso, D., Faggioli, G., Di Nunzio, G. M., Silvello, G., Vettoretti, M., Tavazzi, E., Roversi, C., Fariselli, P., Madeira, S. C., de Carvalho, M., Gromicho, M., Chiò, A., Manera, U., Dagliati, A., Birolo, G., Aidos, H., Di Camillo, B., and Ferro, N. (2022). Intelligent Disease Progression Prediction: Overview of iDPP@CLEF 2022. In Barrón-Cedeño, A., Da San Martino, G., Degli Esposti, M., Sebastiani, F., Macdonald, C., Pasi, G., Hanbury, A., Potthast, M., Faggioli, G., and Ferro, N., editors, Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Thirteenth International Conference of the CLEF Association (CLEF 2022), pages 395–422. Lecture Notes in Computer Science (LNCS) 13390, Springer, Heidelberg, Germany. https://doi.org/10.1007/978-3-031-13643-6_25
  6. Guazzo, A., Trescato, I., Longato, E., Hazizaj, E., Dosso, D., Faggioli, G., Di Nunzio, G. M., Silvello, G., Vettoretti, M., Tavazzi, E., Roversi, C., Fariselli, P., Madeira, S. C., de Carvalho, M., Gromicho, M., Chiò, A., Manera, U., Dagliati, A., Birolo, G., Aidos, H., Di Camillo, B., and Ferro, N. (2022). Overview of iDPP@CLEF 2022: The Intelligent Disease Progression Prediction Challenge. In Faggioli, G., Ferro, N., Hanbury, A., and Potthast, M., editors, CLEF 2022 Working Notes, pages 1130–1210. CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073. https://ceur-ws.org/Vol-3180/paper-88.pdf
  7. UNESCO (2021). UNESCO Recommendation on Open Science. UNESCO, Paris, France, SC-PCB-SPP/2021/OS/UROS. https://doi.org/10.54677/MNMH8546
  8. Wilkinson, M. D., Dumontier, M., Aalbersberg, I. J., Appleton, G., Axton, M., Baak, A., Blomberg, N., Boiten, J.-W., da Silva Santos, L. B., Bourne, P. E., Bouwman, J., Brookes, A. J., Clark, T., Crosas, M., Dillo, I., Dumon, O., Edmunds, S., Evelo, C. T., Finkers, R., Gonzalez-Beltran, A., Gray, A. J., Groth, P., Goble, C., Grethe, J. S., Heringa, J., ’t Hoen, P., Hooft, R., Kuhn, T., Kok, R., Kok, J., Lusher, S. J., Martone, M. E., Mons, A., Packer, A. L., Persson, B., Rocca-Serra, P., Roos, M., van Schaik, R., Sansone, S.-A., Schultes, E., Sengstag, T., Slater, T., Strawn, G., Swertz, M. A., Thompson, M., van der Lei, J., van Mulligen, E., Velterop, J., Waagmeester, A., Wittenburg, P., Wolstencroft, K., Zhao, J., and Mons, B. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Nature Scientific Data, 3(160018). https://doi.org/10.1038/sdata.2016.18