This initiative is led by Mosqlimate and Infodengue, in collaboration with the Harmonize and IDExtremes projects.
Challenge
The year 2024 saw an exceptional number of reported dengue fever cases globally. In Brazil, besides the high incidence, the disease spread to areas in the south or at altitudes where epidemics had not previously been recorded, and in other areas the incidence rate far exceeded that of previous years. At the beginning of 2025, the dengue season was not expected to be as intense as in 2024 for most of the country, but we continue to see strong transmission throughout the country.
The objective of this sprint is to promote, in a standardized way, the training of predictive models and to develop high-quality ensemble forecast models for dengue in Brazil.
The challenge involves three validation targets and one “true” forecast target. The period of interest spans from epidemiological week (EW) 41 of one year to EW 40 of the following year, aligning with the typical dengue season in Brazil.
Validation test 1. Predict the weekly number of dengue cases by state (UF) in the 2022-2023 season [EW 41 2022 - EW 40 2023], using data covering the period from EW 01 2010 to EW 25 2022.
Validation test 2. Predict the weekly number of dengue cases by state (UF) in the 2023-2024 season [EW 41 2023 - EW 40 2024], using data covering the period from EW 01 2010 to EW 25 2023.
Validation test 3. Predict the weekly number of dengue cases by state (UF) in the 2024-2025 season [EW 41 2024 - EW 40 2025], using data covering the period from EW 01 2010 to EW 25 2024.
Forecast. Predict the weekly number of dengue cases in Brazil, and by state (UF), in the 2025-2026 season [EW 41 2025 - EW 40 2026], using data covering the period from EW 01 2010 to EW 25 2025.
Afterward, the models that performed best in the three validation tests will be included in an ensemble forecast model for 2026, built by the organizers.
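The season windows listed above can be enumerated programmatically. The sketch below is illustrative only: it assumes the Sunday-to-Saturday convention in which EW 1 is the week containing January 4th (the convention used by CDC/MMWR-style calendars), and the helper names `ew1_start` and `season_weeks` are hypothetical. Check the organizers' data files for the authoritative week labels.

```python
from datetime import date, timedelta


def ew1_start(year: int) -> date:
    """Sunday that opens epidemiological week 1 of `year`.

    Assumption: weeks run Sunday to Saturday and EW 1 is the week
    that contains January 4th (i.e. has at least 4 days in `year`).
    """
    jan4 = date(year, 1, 4)
    # weekday(): Monday=0 ... Sunday=6, so this is "days since last Sunday".
    return jan4 - timedelta(days=(jan4.weekday() + 1) % 7)


def season_weeks(start_year: int) -> list[tuple[int, int]]:
    """(year, week) pairs from EW 41 of `start_year` to EW 40 of the next year."""
    # An epidemiological year has 52 or 53 weeks under this convention.
    n_weeks = (ew1_start(start_year + 1) - ew1_start(start_year)).days // 7
    first_part = [(start_year, w) for w in range(41, n_weeks + 1)]
    second_part = [(start_year + 1, w) for w in range(1, 41)]
    return first_part + second_part


if __name__ == "__main__":
    for season in (2022, 2023, 2024, 2025):
        weeks = season_weeks(season)
        print(season, weeks[0], weeks[-1], len(weeks))
```

Under this assumption a season usually has 52 weekly targets, but a season that starts in a 53-week year has 53, so it is safer to derive the target weeks than to hard-code their count.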
Outcome variables
Predictions should be made for all 27 Brazilian federative units. If, for some reason, a forecast cannot be generated for a specific state, the sprint organization will decide how to proceed and will communicate the appropriate guidelines to the participants.
The models should generate the following outcome:
- For the validation tests: curve of weekly dengue cases for the 2022-2023, 2023-2024, and 2024-2025 seasons, from EW 41 to EW 40 of the following year, including the median estimate and the 50%, 80%, 90%, and 95% predictive intervals.
- For the forecast: curve of weekly dengue cases for the 2025-2026 season, from EW 41 to EW 40 of the following year, including the median estimate and the 50%, 80%, 90%, and 95% predictive intervals.
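For models that produce samples from a predictive distribution, the requested median and central intervals can be read off as quantiles. The following is a minimal sketch, not the official submission format: the output keys (`median`, `lower_95`, and so on) and the function names are hypothetical, and the linear-interpolation quantile is one of several reasonable definitions. The required field names are documented in the template repository and the API documentation.

```python
LEVELS = (0.50, 0.80, 0.90, 0.95)  # nominal coverages requested by the sprint


def quantile(sorted_xs: list[float], q: float) -> float:
    """Quantile by linear interpolation between order statistics."""
    pos = q * (len(sorted_xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_xs) - 1)
    frac = pos - lo
    return sorted_xs[lo] * (1 - frac) + sorted_xs[hi] * frac


def summarize(samples: list[float]) -> dict[str, float]:
    """Median and central predictive intervals for one state and one week."""
    xs = sorted(samples)
    out = {"median": quantile(xs, 0.5)}
    for level in LEVELS:
        alpha = 1 - level
        pct = round(level * 100)
        # A central interval leaves alpha/2 probability in each tail.
        out[f"lower_{pct}"] = quantile(xs, alpha / 2)
        out[f"upper_{pct}"] = quantile(xs, 1 - alpha / 2)
    return out


if __name__ == "__main__":
    # Toy predictive samples for a single (state, week) pair.
    print(summarize([float(v) for v in range(101)]))
```

Calling `summarize` once per state and per epidemiological week yields one row of the forecast curve each time.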
Data
The training datasets will be available for download as tar.gz files on an FTP server created by the Mosqlimate platform (see footnote 1), along with a variable dictionary. Instructions to access this data are available here. Participants are welcome to use the Mosqlimate data API to retrieve specific subsets of data. Participants can use other data sources as long as they share them with other participants through the organizers. The data used must be open-access and updatable. It must also be available for all the Brazilian states.
List of available datasets (these can be updated at any time):
Type | Variables | Source |
---|---|---|
Epidemiological data | Number of probable dengue cases (see footnote 2) per week per municipality | SINAN (aggregated by Infodengue and available in the Mosqlimate API) |
Demographic data | Population per municipality (2001-2024) | SVS |
Climate data | Weekly temperature, humidity, and precipitation; El Niño/Southern Oscillation (ENSO), Pacific Decadal Oscillation (PDO), and Indian Ocean Dipole (IOD) | ERA5, aggregated by week and available in the Mosqlimate API |
Climate forecast data | Monthly temperature, humidity, and precipitation | Copernicus |
Environment data | Municipality main climate and biome | |
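Because the epidemiological data are provided per municipality while the targets are defined per state (UF), case counts typically need to be aggregated before modeling. The sketch below relies on the fact that the first two digits of a 7-digit IBGE municipality geocode identify the state; the row layout `(geocode, epiweek, cases)` and the `YYYYWW` week encoding are assumptions made for illustration, so consult the variable dictionary for the actual column names.

```python
from collections import defaultdict


def aggregate_to_uf(rows):
    """Sum municipal case counts into (uf_code, epiweek) totals.

    rows: iterable of (geocode, epiweek, cases) tuples, where `geocode`
    is a 7-digit IBGE municipality code (hypothetical layout).
    """
    totals = defaultdict(int)
    for geocode, epiweek, cases in rows:
        uf_code = int(str(geocode)[:2])  # e.g. 3304557 -> 33 (Rio de Janeiro state)
        totals[(uf_code, epiweek)] += cases
    return dict(totals)


if __name__ == "__main__":
    toy_rows = [
        (3304557, 202441, 10),  # Rio de Janeiro (RJ)
        (3303302, 202441, 5),   # Niterói (RJ)
        (3550308, 202441, 7),   # São Paulo (SP)
    ]
    print(aggregate_to_uf(toy_rows))
```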
Forecast Evaluation
For the evaluation, the logarithmic score, the continuous ranked probability score (CRPS), and the weighted interval score will be computed. The calculation of the interval predictions will be done on a per-week basis. See the references listed below for more information about the methodology applied in the sprint.
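To make the weighted interval score (WIS) concrete, here is a minimal sketch that follows the definition in Bracher et al. (2021) for a single week: each central interval at level 1 - alpha contributes its width plus a penalty of 2/alpha times the distance by which the observation falls outside it, the intervals are weighted by alpha/2, and the absolute error of the median is weighted by 1/2. The function names and the dictionary layout are assumptions for illustration; the organizers' scoring code may differ in detail.

```python
def interval_score(lower: float, upper: float, y: float, alpha: float) -> float:
    """Interval score of a central (1 - alpha) prediction interval."""
    score = upper - lower  # sharpness: wider intervals score worse
    if y < lower:
        score += (2 / alpha) * (lower - y)  # penalty for overprediction
    elif y > upper:
        score += (2 / alpha) * (y - upper)  # penalty for underprediction
    return score


def weighted_interval_score(median: float, intervals: dict, y: float) -> float:
    """WIS for one observation.

    intervals: maps nominal coverage (e.g. 0.9) to a (lower, upper) pair.
    """
    total = 0.5 * abs(y - median)
    for level, (lower, upper) in intervals.items():
        alpha = 1 - level
        total += (alpha / 2) * interval_score(lower, upper, y, alpha)
    return total / (len(intervals) + 0.5)


if __name__ == "__main__":
    forecast = {0.50: (90, 110), 0.80: (80, 125), 0.90: (75, 135), 0.95: (70, 145)}
    print(weighted_interval_score(100, forecast, y=120))
```

Lower values are better, and averaging the per-week scores over a season gives a single summary per model and state.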
Expected Outputs:
A technical note will be published by Infodengue, in Portuguese, with the results of the ensemble models. All participants of the sprint will be acknowledged. All groups that contributed models that meet the minimum quality criteria will be part of the publication.
In the second half of 2026, there will be an award for the model(s) that made the most useful forecast(s); the date will be announced later.
The forecasting exercise will be replicated in the following year if funding is available.
Important Dates:
- Release of call for participation: May 15th, 2025
- Forecast submission deadline (validation datasets): June 30th, 2025
- Webinar (results of the validation round): July 30th, 2025
- 2nd forecast submission deadline (2026 forecasts): September 23rd, 2025
- Presentation of the ensemble model: October 15th, 2025
How to participate in the Sprint
To participate, you must fill out the registration form between 2025/05/15 and 2025/06/15. Before registering, we recommend that you visit the event website, which has detailed instructions and guidelines.
We will maintain two communication channels between participants and the organizing team to allow quick communication about any aspect of the event:
- A discussion forum for participants on Discord.
- Virtual meetings, organized during the sprint to answer questions and explain the model/forecast submission processes; invitations to these meetings will be sent to the registered email address.
Sprint Rules:
1. Public GitHub Repository: All participating teams must keep their model code in a public GitHub repository created from this template. The repository must remain public permanently, even after the end of the event. The readme.md of the repository must contain an explanation of the methodology.
2. Forecast Submission Format: The forecast submission format is documented in the template repository available here and in the API documentation. The upload code will check the formatting before submission.
3. Data Sharing: Participants may use other data sources as long as they are anonymized and shared with other participants through the organizers. The data used must be open-access and updatable.
4. To be considered for the validation round and ensemble construction, each submitted model must provide predictions for all target variables.
Tutorials and support information can be found here.
References
1. Banholzer, N. et al. A comparison of short-term probabilistic forecasts for the incidence of COVID-19 using mechanistic and statistical time series models.
2. Gneiting, T. & Raftery, A. E. Strictly Proper Scoring Rules, Prediction, and Estimation. J. Am. Stat. Assoc. 102, 359–378 (2007).
3. Bracher, J., Ray, E. L., Gneiting, T. & Reich, N. G. Evaluating epidemic forecasts in an interval format. PLOS Comput. Biol. 17, e1008618 (2021).
Financial Support:
The Mosqlimate project coordinates the 2025 Sprint with financial support from Wellcome Trust grant number 226088/Z/22/Z.
Contact: mosqlimate@gmail.com
Footnote 1: http://mosqlimate.org/ is hosting this modeling initiative.
Footnote 2: Probable dengue cases = reported cases - discarded cases. They include both confirmed cases and cases under investigation.