
Workshop for IJCAI–ECAI 2022
Saturday July 23rd, Vienna

Scarce Data in Artificial Intelligence for Healthcare (SDAIH)



The goal of this workshop is to exchange lessons learned and ongoing efforts on solving the issue of data scarcity for the practical deployment of AI in healthcare. We aim to bring together, from both academia and industry, researchers and data scientists who are confronted with challenges related to limited data availability for machine learning in medicine, medical engineering, biotechnology, pharmaceuticals, and medical services.


Call for papers


AI has the potential to revolutionize healthcare by enabling accurate, fast and reliable analyses of data at an unprecedented scale, both in clinics and in industry. Leveraged properly, AI can thus help to better meet patient needs through new medical devices, drugs, and personalized treatments, while simultaneously freeing up time for clinical staff to nourish the profound human connection between caregivers and patients. Moreover, AI promises to democratize the healthcare system by spreading basic services to low-income or remote areas through telemedicine.

Notwithstanding the terrific progress achieved in the last two decades, many AI projects related to medicine struggle to make their way to deployment and sustainable productivity because of the limited availability of high-quality annotated data. The scarcity of useful information is often exacerbated in medicine, medical engineering, and healthcare in general because labelling requires highly specialized staff, patient privacy must be respected, and ethnic differences and rare diseases must be adequately represented. Despite the incredible advances of the last few years in facilitating data collection and annotation, learning representations, and detecting different types of bias, basic observations on implications for practitioners are often lacking, new ingenious ideas are flourishing, and recommendations for healthcare are far from established.

Topics of interest

Topics of interest include, but are not limited to:

  • Publication of datasets relevant to healthcare including text, images, audio and structured data.
  • Hardware and software tools for enabling data acquisition in low-resource or restricted environments, such as federated annotations and pseudonymization techniques.
  • Tools to produce or evaluate high-quality clinical annotations and consensus diagnoses.
  • Critical analysis of iterative procedures to clean up or refine annotations, as well as guidelines to assess the uncertainty on metric scores.
  • Anonymization methods for intra- and inter-institutional data exchange.
  • Technical solutions to work in the presence of legal concerns, for instance federated learning and i2b2.
  • Works on learning representations or transfer learning with a focus on improving model generalization across different patient cohorts, data acquisition conditions, medical expert evaluations, etc.
  • Studies which compare or combine learning from nature with learning from human experts.
  • Works on unsupervised, self-supervised, semi-supervised, or few-shot learning aimed, for example, at reducing the need for annotations by specialists.
  • Methods to deal with strongly imbalanced datasets such as those including rare diseases, or very small pathological features in medical image collections.
  • Strategies to handle scarcity of subsets in large datasets, i.e. “filling the gaps”.
  • Works on using public or artificially generated datasets to improve the performance of machine-learning models in healthcare or to mitigate (patient) privacy issues.
  • Case studies linked to the practical deployment of AI in a clinical setting or in medical devices with limited data, as well as to the construction of pipelines or databases for addressing data scarcity.
  • Insightful, original analyses of reasons for the failure of AI projects in healthcare, and work-in-progress reports of efforts related to the themes listed above.


We welcome the submission of original research reports within the topics of interest of the workshop. The maximum paper length is 6 pages, including references. We especially encourage the contribution of case studies, work in progress, position papers, and critical analyses of failed projects.

Accepted papers will be published as proceedings with SciTePress and submitted for indexing by dblp, Scopus, Semantic Scholar, Google Scholar and Microsoft Academic.

SciTePress templates: LaTeX, MS Word

Paper submission deadline: May 13, 2022
Paper submission deadline extended: May 20, 2022
Decision notification: June 3, 2022
Camera-ready submission: June 17, 2022

All deadlines correspond to the end of the indicated day Anywhere on Earth (AoE).

Access your contribution via CMT

Accepted papers

  • Ontology-driven self-supervision for Adverse Childhood Experiences identification using social media datasets - Jinge Wu, Rowena Smith and Honghan Wu
  • Towards reducing segmentation labeling costs for CMR imaging using explainable AI - Alessa Stria and Asan Agibetov
  • Evaluation of the Synthetic Electronic Health Records - Emily Muller, Xu Zheng and Jer Hayes
  • Data Augmentation for Reliability and Fairness in Counselling Quality Classification - Vivek Kumar, Simone Balloccu, Wu Zixiu, Ehud Reiter, Rim Helaoui, Diego Reforgiato Recupero and Daniele Riboni
  • PT-MESS: a Problem-Transformation approach for Multi-Event Survival analySis - Michela Venturini, Felipe Kenji Nakano and Celine Vens
  • How Much Data is Enough? Benchmarking Transfer Learning for Few Shot ECG Image Classification - Sathvik Bhaskarpandit
  • Towards Reducing the Need for Annotations in Digital Dermatology with Self-Supervised Learning - Fabian Gröger, Philippe Gottfrois, Ludovic Amruthalingam, Alvaro Gonzalez-Jimenez, Simone Lionetti, Alexander A. Navarini and Marc Pouly
  • Eye-Tracking Dataset to Support the Research on Autism Spectrum Disorder - Federica Cilia, Romuald Carette, Mahmoud Elbattah, Jean-Luc Guérin and Gilles Dequen
  • Segmenting Overlapping Red Blood Cells With Classical Image Processing and Deep Learning - Nils Brünggel, Pascal Vallotton and Patrick Conway
  • SANO: Score-based Anomaly Localization for Dermatology - Alvaro Gonzalez-Jimenez, Simone Lionetti, Ludovic Amruthalingam, Philippe Gottfrois, Marc Pouly and Alexander Navarini

Tentative schedule

IJCAI/ECAI and SDAIH are fully in-person events, and remote participation will only be considered in exceptional cases.

  • 09:00 - 10:30 = welcome & paper presentations
  • 10:30 - 11:00 = coffee break
  • 11:00 - 12:30 = paper presentations & discussion session
  • 12:30 - 14:00 = lunch break
  • 14:00 - 15:30 = paper presentations
  • 15:30 - 16:00 = coffee break
  • 16:00 - 17:00 = discussion session & conclusion

Discussion sessions are “round tables” organized around (2 × 5) speakers. They start from observations and comments about the speakers' talks to stimulate discussion of common issues and how to solve them, directions for future work, and needed recommendations and policies.


Organising committee

Program committee