
Multi-Destination Trips Dataset Challenge ACM WSDM 2021 WebTour

ACM SIGIR 2021 Resource Track 

ABOUT

Our mission is to make it easier for everyone to experience the world. By investing in the technology that helps take the friction out of travel, we seamlessly connect millions of travellers with memorable experiences, a range of transport options and incredible places to stay.

Many travellers go on trips that include more than one destination. For instance, a user from the US could fly to Amsterdam for 5 nights, then spend 2 nights in Brussels, 3 in Paris and 1 in Amsterdam again before heading back home. In this scenario, we suggest options for extending their trip immediately after they make their booking.

The goal of this challenge is to use a dataset based on millions of real anonymized accommodation reservations to come up with a strategy for recommending, in real time, the best next destination for a traveller.



The training dataset consists of over a million anonymized hotel reservations, based on real data, with the following features:
user_id - User ID
checkin - Reservation check-in date
checkout - Reservation check-out date
affiliate_id - An anonymized ID of affiliate channels where the booker came from (e.g. direct, some third party referrals, paid search engine, etc.)
device_class - desktop/mobile
booker_country - Country from which the reservation was made (anonymized)
hotel_country - Country of the hotel (anonymized)
city_id - city_id of the hotel’s city (anonymized)
utrip_id - Unique identification of user’s trip (a group of multi-destinations bookings within the same trip)

Each reservation is part of a customer’s trip (identified by utrip_id), which includes at least 4 consecutive reservations. There may be 0 or more days between the check-out date of one reservation and the check-in date of the next.
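As a rough illustration of this structure, the following Python sketch groups reservations by utrip_id, orders them by check-in date and computes the number of days between consecutive stays. The sample rows, the checkin/checkout key spellings, the ISO date format and the helper names are assumptions made for the example, not values taken from the dataset.

```python
from collections import defaultdict
from datetime import date

# Hypothetical rows that reuse the feature names listed above.
# City ids and dates are made up; ISO dates (YYYY-MM-DD) are assumed.
rows = [
    {"utrip_id": "trip_a", "checkin": "2021-03-09", "checkout": "2021-03-12", "city_id": "303"},
    {"utrip_id": "trip_a", "checkin": "2021-03-01", "checkout": "2021-03-06", "city_id": "101"},
    {"utrip_id": "trip_a", "checkin": "2021-03-12", "checkout": "2021-03-13", "city_id": "101"},
    {"utrip_id": "trip_a", "checkin": "2021-03-06", "checkout": "2021-03-08", "city_id": "202"},
]


def group_trips(rows):
    """Group reservations by utrip_id and sort each trip by check-in date."""
    trips = defaultdict(list)
    for row in rows:
        trips[row["utrip_id"]].append(row)
    for reservations in trips.values():
        # ISO-formatted date strings sort chronologically as plain strings.
        reservations.sort(key=lambda r: r["checkin"])
    return dict(trips)


def gap_days(reservations):
    """Days between each check-out and the following check-in in one trip."""
    gaps = []
    for prev, nxt in zip(reservations, reservations[1:]):
        delta = date.fromisoformat(nxt["checkin"]) - date.fromisoformat(prev["checkout"])
        gaps.append(delta.days)
    return gaps


trips = group_trips(rows)
print([r["city_id"] for r in trips["trip_a"]])  # ['101', '202', '303', '101']
print(gap_days(trips["trip_a"]))                # [0, 1, 0]
```

A gap of 0 means the traveller checked in on the day they checked out of the previous stay.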

The evaluation dataset is constructed similarly; however, the city_id of the final reservation of each trip is concealed and must be predicted.
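One way to mimic this setup locally is to hold out the final city of each training trip. The sketch below assumes each trip is already a chronologically ordered list of reservation dictionaries; the helper name hide_final_city and the sample values are hypothetical, and only city_id is masked because that is the only field the text says is concealed.

```python
# Hypothetical trip with made-up city ids, ordered by check-in date.
trips = {
    "trip_a": [
        {"checkin": "2021-03-01", "checkout": "2021-03-06", "city_id": "101"},
        {"checkin": "2021-03-06", "checkout": "2021-03-08", "city_id": "202"},
        {"checkin": "2021-03-09", "checkout": "2021-03-12", "city_id": "303"},
        {"checkin": "2021-03-12", "checkout": "2021-03-13", "city_id": "101"},
    ],
}


def hide_final_city(trips):
    """Return (visible trips, hidden targets) with each final city_id masked."""
    visible, targets = {}, {}
    for utrip_id, reservations in trips.items():
        *history, last = reservations
        # Copy the last reservation so the input data is left untouched.
        masked_last = dict(last, city_id=None)
        visible[utrip_id] = history + [masked_last]
        targets[utrip_id] = last["city_id"]
    return visible, targets


visible, targets = hide_final_city(trips)
print(targets)                            # {'trip_a': '101'}
print(visible["trip_a"][-1]["city_id"])   # None
```

The visible trips can then be fed to a candidate model, and its predictions compared against the hidden targets.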


Evaluation criteria

The goal of the challenge is to predict (and recommend) the final city (city_id) of each trip (utrip_id). We will evaluate the quality of the predictions based on the top four recommended cities for each trip, using the Precision@4 metric (the 4 corresponds to the four suggestion slots on the website). A prediction is considered correct when the true city is one of the top 4 suggestions, regardless of their order.
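The scoring rule described above can be sketched in a few lines of Python. This is an illustrative reading of the description, not the organizers' evaluation script: the function name precision_at_4, the dictionary layout and the sample ids are assumptions.

```python
def precision_at_4(predictions, truth):
    """Share of trips whose true final city is among the first four predictions.

    predictions: dict mapping utrip_id -> list of predicted city ids
    truth:       dict mapping utrip_id -> true final city id
    """
    if not truth:
        raise ValueError("truth must contain at least one trip")
    hits = 0
    for utrip_id, true_city in truth.items():
        # Only the first four suggestions count; their order does not matter.
        top4 = predictions.get(utrip_id, [])[:4]
        if true_city in top4:
            hits += 1
    return hits / len(truth)


# Made-up example: trips "a" and "b" are hits, "c" is a miss, and "d" is a
# miss because the true city only appears in fifth position.
truth = {"a": 4323, "b": 10, "c": 99, "d": 7}
predictions = {
    "a": [8652, 4323, 1, 2],
    "b": [10, 11, 12, 13],
    "c": [1, 2, 3, 4],
    "d": [1, 2, 3, 4, 7],
}
print(precision_at_4(predictions, truth))  # 0.5
```

In this sketch, trips with no prediction count as misses, which is one possible convention.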


Competition terms and conditions

The dataset is a property of and may not be reused for commercial purposes.

Employees of online travel platform companies or other booking services (including Booking Holdings employees) are not eligible to compete for prizes in the challenge.

Participants are allowed to participate only once, with no concurrent submissions or code sharing between teams.

The organizer is authorized to replace the prize with one of equivalent monetary value.


Submission guidelines

The test set will be released to registered e-mails on January 5th, 2021. The teams are expected to submit their top four city predictions for each trip in the test set by January 28th, 2021, in a csv file named submission.csv with the following columns (shown with example values):

utrip_id - 1000031_1
city_id_1 - 8652
city_id_2 - 8652
city_id_3 - 4323
city_id_4 - 4332

Here utrip_id identifies each unique trip in the test set, and the remaining columns hold the city_id values of the top 4 predicted cities.
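A file in this layout could be produced with Python's csv module, as in the minimal sketch below. The helper name write_submission is hypothetical, the presence of a header row is an assumption, and the sample row simply reuses the example values shown above.

```python
import csv
import io

COLUMNS = ["utrip_id", "city_id_1", "city_id_2", "city_id_3", "city_id_4"]


def write_submission(predictions, handle):
    """Write one row per trip: the utrip_id followed by exactly four city ids."""
    writer = csv.writer(handle)
    writer.writerow(COLUMNS)  # header row (assumed to be expected)
    for utrip_id, cities in predictions.items():
        if len(cities) != 4:
            raise ValueError(f"{utrip_id}: expected 4 city ids, got {len(cities)}")
        writer.writerow([utrip_id, *cities])


# In-memory demonstration; for the real file, pass
# open("submission.csv", "w", newline="") as the handle instead.
buffer = io.StringIO()
write_submission({"1000031_1": [8652, 8652, 4323, 4332]}, buffer)
print(buffer.getvalue())
```

Checking the row length before writing catches trips with too few or too many predictions early.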

Please cite the WSDM WebTour 21 challenge as follows:

Dmitri Goldenberg, Kostia Kofman, Pavel Levin, Sarai Mizrachi, Maayan Kafry, and Guy Nadav. 2021. WSDM WebTour 2021 Challenge. In ACM WSDM Workshop on Web Tourism (WSDM Webtour’21), March 12, 2021, Jerusalem, Israel.

Please cite the Multi-Destination Trips Dataset as follows:

Dmitri Goldenberg and Pavel Levin. 2021. Multi-Destination Trips Dataset. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’21), July 11–15, 2021, Virtual Event, Canada.

For any problems or questions please contact

Download the dataset

The challenge dataset is available for download for research purposes only.



Challenge Papers

A list of challenge papers that were accepted to the WSDM WebTour 2021 workshop:


Challenge Schedule

Challenge starts - December 6th, 2020
The training dataset is accessible after a short pre-registration.

Test set release - January 5th, 2021
The test set will be released to registered e-mails.

Intermediate leaderboard submission - January 14th, 2021
Single submission is closed. Intermediate leaderboard results are published here.

Release leaderboard data - January 20th, 2021

Challenge closes - January 28th, 2021
The teams are expected to submit their top four city predictions for each trip in the test set. The submission should be completed here.

Announcement of the winners - February 4th, 2021
The organizers will reveal the performance on the test set and announce the final leaderboard.

Paper submission deadline - February 18th, 2021
The top 10 teams will be invited to submit short papers.

Paper notification - February 25th, 2021

Camera ready submission - March 4th, 2021
Participants submit their papers to

Workshop day - March 12th, 2021
Virtual participation at the workshop is mandatory in order to be eligible for a prize.

Deadlines refer to 23:59 (11:59pm) in the AoE (Anywhere on Earth) time zone.
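Since AoE corresponds to UTC-12, a deadline can be converted to UTC with the standard library. The sketch below is only a convenience for checking local times; the helper name deadline_utc is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Anywhere on Earth is a fixed offset of 12 hours behind UTC.
AOE = timezone(timedelta(hours=-12), name="AoE")


def deadline_utc(year, month, day):
    """Return 23:59 AoE on the given date, expressed in UTC."""
    local = datetime(year, month, day, 23, 59, tzinfo=AOE)
    return local.astimezone(timezone.utc)


closing = deadline_utc(2021, 1, 28)
print(closing.isoformat())  # 2021-01-29T11:59:00+00:00
```

In other words, a deadline listed for a given day ends at 11:59 UTC on the following day.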


Useful resources

Mizrachi, Sarai, and Pavel Levin. "Combining Context Features in Sequence-Aware Recommender Systems." RecSys (Late-Breaking Results). 2019.

Bernardi, Lucas, Themistoklis Mavridis, and Pablo Estevez. "150 successful machine learning models: 6 lessons learned at Booking.com." Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2019.

Kiseleva, Julia, Melanie JI Mueller, Lucas Bernardi, Chad Davis, Ivan Kovacek, Mats Stafseng Einarsen, Jaap Kamps, Alexander Tuzhilin, and Djoerd Hiemstra. "Where to go on your next trip? Optimizing travel destinations based on user preferences." In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 1097-1100. 2015.

Levin, Pavel. "Modeling Multi-Destination Trips with Recurrent Neural Networks." DataConf ’18, November 25th, 2018, Jerusalem, Israel.

Goldenberg, Dmitri. "How to stay statistically significant in Agile environment?" Data Science Festival, March 6th, 2019, Manchester, United Kingdom.

Mizrachi, Sarai. "Modeling Multi-Destination Trips with RNN." PyData, August 29th, 2019, Tel Aviv, Israel.
