Leaderboard is now active! Submit your results here

DISPLACE 2026 CHALLENGE

DISPLACE-M

DIarization and Speech Processing for LAnguage understanding in Conversational Environments

About the Challenge

Focusing on multilingual medical conversational speech between community health workers and patients

Download the Flyer

DISPLACE-M Challenge Overview

The DISPLACE-M challenge aims to advance diarization and speech understanding technologies in real-world healthcare conversations. The dataset includes multilingual, code-mixed interactions between Accredited Social Health Activists (ASHAs) and patients, representing authentic primary healthcare settings.

Important Dates

21 Oct 2025
Registration Opens
6 Jan 2026
Data Release (Development Set)
8 Jan 2026
Baseline System Release
8 Feb 2026
Leaderboard Active & Phase-I Data Release
20 Feb 2026
Registration Closes
26 Feb 2026
Phase-I Evaluation Closes
27 Feb 2026
System Report Submission
20 Mar 2026
Phase-II Evaluation Opens
20 May 2026
Phase-II Evaluation Closes

Challenge Tracks

Four core tracks for 2026

Track 1 — Speaker Diarization (SD)

Determine “who spoke when” in code-mixed multilingual medical dialogues with overlapping speech.

Track 2 — Automatic Speech Recognition (ASR)

Transcribe multilingual, code-mixed healthcare conversations across dialects and noisy environments.

Track 3 — Topic Identification

Identify conversation topics (e.g., maternal health, infection).

Track 4 — Summarization

Produce concise medical dialogue summaries.

Database

Multilingual medical dialogue corpus

Corpus Composition

~30 hours of real primary-healthcare interactions recorded between community health workers (ASHAs) and patients in local clinics, homes and health camps.

Languages & Code-Mixing

Hindi and English, naturally mixed across turns, reflecting real multilingual usage in medical conversations.

Recording Setup

Single-channel 16 kHz 16-bit WAV recordings, far-field microphones, real acoustic conditions, annotated for speaker segments, language, and overlap.
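Participants who resample or convert audio may want to confirm that their files still match the format stated above. The following is a minimal sketch using only the Python standard library; the function name `check_wav_format` and its parameters are illustrative and are not part of any official DISPLACE-M tooling.

```python
import wave


def check_wav_format(path, channels=1, sample_rate=16000, sample_width_bytes=2):
    """Compare a WAV header against the expected recording format.

    Defaults follow the format described on this page: single-channel,
    16 kHz, 16-bit (2 bytes per sample). Returns a list of human-readable
    mismatches; an empty list means the header matches.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != channels:
            problems.append(
                f"channels: expected {channels}, found {wav.getnchannels()}"
            )
        if wav.getframerate() != sample_rate:
            problems.append(
                f"sample rate: expected {sample_rate}, found {wav.getframerate()}"
            )
        if wav.getsampwidth() != sample_width_bytes:
            problems.append(
                f"sample width: expected {sample_width_bytes} bytes, "
                f"found {wav.getsampwidth()}"
            )
    return problems
```

Calling `check_wav_format("session.wav")` on a hypothetical file returns an empty list when the header matches. The check reads only the header, so it says nothing about microphone distance or acoustic conditions.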

Resources & Baselines

Dataset Access

Access the DISPLACE-M dataset on Hugging Face

View Dataset on Hugging Face

Baseline Systems

Explore baseline implementations on GitHub

View Baselines on GitHub

Report Template

Download the system report template

Download Report Template

Evaluation Plan and Leaderboard Guidelines

Submission Template to Leaderboard

View Guidelines

Go to Leaderboard

Additional links to evaluation metrics and baseline systems will appear here once released.

Registration Process

Step 1 — Fill the Registration Form

Provide team, affiliation, and track details.

Register Here

Step 2 — Sign Terms & Conditions

Download and email the signed Terms & Conditions to displace2026@gmail.com.

Step 3 — Access Dataset & Baselines

After verification, you’ll receive credentials for the development set and baselines.

Organizing Committee

Meet the team behind DISPLACE-M

Ankita Meena
M.Tech Student, IISc Bangalore
Ashwini Nagaraj Shenoy
Junior Research Fellow, NITK Surathkal
Chitralekha Bhat
Chief Engineering Manager, TANUH
Deepu Vijayasenan
Professor, NITK Surathkal
Dhanya E
Post Doctoral Researcher, IISc Bangalore
Kalluri Shareef Babu
Assistant Professor, UPES Dehradun
Manas Nanivadekar
Research Intern, IISc Bangalore
Noumida A
Post Doctoral Researcher, IISc Bangalore
Pratik Ranjan Roy Chowdhuri
Research Scholar, NITK Surathkal
Sriram Ganapathy
Associate Professor, IISc Bangalore
Dr. Srikanth Raj Chetupalli
Assistant Professor, IIT Bombay
Pratik Ranjan Roy Chowdhuri
Junior Research Fellow, NITK Surathkal
Victor Azad
M.Tech Student, IISc Bangalore

Frequently Asked Questions

How do I get the DISPLACE-M dataset?

Register for the challenge and send the signed agreement to displace2026@gmail.com to receive access credentials.

Can I share or redistribute the data?

No, redistribution is prohibited. You may use it for research with proper citation.

Are there any constraints on overall compute, the number of models we can use, or the latency of the overall system?

No, there are no hard constraints on overall compute, number of models, or latency.

Are we allowed to use proprietary/open source data for training our models?

Yes, you are allowed to use proprietary/open source data for training your models, but it must be properly cited.

Does a report need to be submitted?

Yes, in a prescribed format, which will be provided to registered participants.

How do I submit my findings obtained by participating in this challenge to Interspeech 2026?

That's great! You can submit through the Interspeech 2026 paper submission portal here. Remember to select “DISPLACE-M special session” when uploading your paper.