Multi-Destination Trips Dataset
Booking.com Challenge ACM WSDM 2021 WebTour
ACM SIGIR 2021 Resource Track
Booking.com’s mission is to make it easier for everyone to experience the world. By investing in the technology that helps take the friction out of travel, Booking.com seamlessly connects millions of travellers with memorable experiences, a range of transport options and incredible places to stay.
Many travellers go on trips that include more than one destination. For instance, a user from the US could fly to Amsterdam for 5 nights, then spend 2 nights in Brussels, 3 in Paris and 1 in Amsterdam again before heading back home. In this scenario, we suggest options for extending their trip immediately after they make their booking.
The goal of this challenge is to use a dataset based on millions of real anonymized accommodation reservations to come up with a strategy for recommending the best next destination to a traveller in real time.
The training dataset consists of over a million anonymized hotel reservations, based on real data, with the following features:
user_id - User ID
check-in - Reservation check-in date
checkout - Reservation check-out date
affiliate_id - An anonymized ID of affiliate channels where the booker came from (e.g. direct, some third party referrals, paid search engine, etc.)
device_class - desktop/mobile
booker_country - Country from which the reservation was made (anonymized)
hotel_country - Country of the hotel (anonymized)
city_id - city_id of the hotel’s city (anonymized)
utrip_id - Unique identifier of the user’s trip (a group of multi-destination bookings within the same trip)
Each reservation is a part of a customer’s trip (identified by utrip_id) which includes at least 4 consecutive reservations. There are 0 or more days between check-out and check-in dates of two consecutive reservations.
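The trip structure described above can be illustrated with a minimal sketch that groups reservations by utrip_id and computes the number of days between each check-out and the following check-in. The dictionary keys (in particular "checkin" and "checkout") and the toy rows are assumptions made for illustration; they are not taken from the real dataset files.

```python
from collections import defaultdict
from datetime import date

# Toy reservations for a single trip. Key names and values are illustrative
# assumptions, not rows from the real dataset.
reservations = [
    {"utrip_id": "1_1", "checkin": date(2016, 8, 13), "checkout": date(2016, 8, 14), "city_id": 100},
    {"utrip_id": "1_1", "checkin": date(2016, 8, 14), "checkout": date(2016, 8, 16), "city_id": 200},
    {"utrip_id": "1_1", "checkin": date(2016, 8, 18), "checkout": date(2016, 8, 21), "city_id": 300},
    {"utrip_id": "1_1", "checkin": date(2016, 8, 21), "checkout": date(2016, 8, 23), "city_id": 100},
]


def gaps_by_trip(rows):
    """Return {utrip_id: [days between each check-out and the next check-in]}."""
    trips = defaultdict(list)
    for row in rows:
        trips[row["utrip_id"]].append(row)

    gaps = {}
    for trip_id, legs in trips.items():
        # Order the legs of the trip chronologically by check-in date.
        legs = sorted(legs, key=lambda r: r["checkin"])
        gaps[trip_id] = [
            (nxt["checkin"] - cur["checkout"]).days
            for cur, nxt in zip(legs, legs[1:])
        ]
    return gaps


trip_gaps = gaps_by_trip(reservations)
```

A trip with four reservations yields three gaps, each of which should be 0 or more days according to the description above.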
The evaluation dataset is constructed similarly; however, the city_id of the final reservation of each trip is concealed and requires a prediction.
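One way to mimic this setup locally is to hide the last city of each training trip and keep it as a validation target. The sketch below is only an assumption about how such a split could be built, not the organizers' procedure; the function name hold_out_last_city and the input layout (a mapping from utrip_id to a chronologically ordered list of city_id values) are hypothetical.

```python
def hold_out_last_city(trips):
    """Split each trip into its visible prefix and its hidden final city.

    trips: {utrip_id: [city_id, ...]} with cities in chronological order
    (a hypothetical layout chosen for this sketch).
    Returns (visible, hidden), where visible maps each trip to all cities
    except the last one and hidden maps each trip to the concealed city.
    """
    visible, hidden = {}, {}
    for trip_id, cities in trips.items():
        if len(cities) < 2:
            # Nothing would remain to predict from, so skip the trip.
            continue
        visible[trip_id] = cities[:-1]
        hidden[trip_id] = cities[-1]
    return visible, hidden


# Illustrative trips with made-up city IDs.
toy_trips = {"1_1": [100, 200, 300, 100], "2_1": [400, 500, 600, 700]}
visible, hidden = hold_out_last_city(toy_trips)
```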
The goal of the challenge is to predict (and recommend) the final city (city_id) of each trip (utrip_id). We will evaluate the quality of the predictions based on the top four recommended cities for each trip, using the Precision@4 metric (4 representing the four suggestion slots on the Booking.com website). A prediction is considered correct when the true city is one of the top 4 suggestions, regardless of their order.
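The metric can be sketched as follows, assuming the reported score is the fraction of trips whose true final city appears among the four suggestions. The function name precision_at_4 and the dictionary-based inputs are illustrative choices, not part of any official evaluation script.

```python
def precision_at_4(predictions, truth):
    """Fraction of trips whose true city is among the top 4 suggestions.

    predictions: {utrip_id: [city_id, ...]} ranked suggestions per trip
    truth:       {utrip_id: city_id} true final city per trip
    Order within the top 4 does not matter; trips without a prediction
    count as misses.
    """
    if not truth:
        raise ValueError("truth must contain at least one trip")
    hits = sum(
        1
        for trip_id, city in truth.items()
        if city in predictions.get(trip_id, [])[:4]
    )
    return hits / len(truth)


# Made-up example: the first trip is a hit, the second is a miss.
toy_predictions = {"a": [1, 2, 3, 4], "b": [5, 6, 7, 8]}
toy_truth = {"a": 3, "b": 9}
score = precision_at_4(toy_predictions, toy_truth)
```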
Competition terms and conditions
The dataset is the property of Booking.com and may not be reused for commercial purposes.
Employees of online travel platform companies or other booking services (including Booking Holdings employees) are not eligible to compete for prizes in the challenge.
Participants are allowed to participate only once, with no concurrent submissions or code sharing between the teams.
The organizer is authorized to change the prize and award one of equivalent monetary value.
The test set will be released to registered e-mails on January 5th, 2021. The teams are expected to submit their top four city predictions for each trip in the test set by January 28th, 2021, in a csv file named submission.csv with the following columns:
utrip_id - 1000031_1
city_id_1 - 8652
city_id_2 - 8652
city_id_3 - 4323
city_id_4 - 4332
Here utrip_id identifies each unique trip in the test set, and the remaining columns hold the city_id values of the top 4 predicted cities.
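A file with these columns could be produced with Python's standard csv module, as in the minimal sketch below. The helper name write_submission and the city IDs in the example are made up for illustration; only the column names and the file name submission.csv come from the description above.

```python
import csv
import io

COLUMNS = ["utrip_id", "city_id_1", "city_id_2", "city_id_3", "city_id_4"]


def write_submission(predictions, out):
    """Write {utrip_id: [four city_ids]} to a file-like object as CSV."""
    writer = csv.writer(out)
    writer.writerow(COLUMNS)
    for trip_id, cities in predictions.items():
        if len(cities) != 4:
            raise ValueError(f"trip {trip_id} needs exactly 4 predictions")
        writer.writerow([trip_id, *cities])


# Demonstration with an in-memory buffer and made-up city IDs.
buffer = io.StringIO()
write_submission({"1000031_1": [8652, 4323, 4332, 17]}, buffer)
submission_text = buffer.getvalue()
```

To write an actual file, the same function can be called with a file opened as open("submission.csv", "w", newline=""), which is how the csv module documentation recommends opening files for writing.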
Please refer to the Booking.com WSDM WebTour 21 challenge in the following format:
Dmitri Goldenberg, Kostia Kofman, Pavel Levin, Sarai Mizrachi, Maayan Kafry, and Guy Nadav. 2021. Booking.com WSDM WebTour 2021 Challenge. https://www.bookingchallenge.com. In ACM WSDM Workshop on Web Tourism (WSDM Webtour’21), March 12, 2021, Jerusalem, Israel.
Please refer to the Booking.com Multi-Destination Trips Dataset in the following format:
Dmitri Goldenberg and Pavel Levin. 2021. Booking.com Multi-Destination Trips Dataset. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’21), July 11–15, 2021, Virtual Event, Canada. https://doi.org/10.1145/3404835.3463240
For any problems or questions please contact firstname.lastname@example.org
A list of challenge papers that were accepted to the WSDM WebTour 2021 workshop:
Dmitri Goldenberg, Kostia Kofman, Pavel Levin, Sarai Mizrachi, Maayan Kafry and Guy Nadav: Booking.com WSDM WebTour 2021 Challenge.
Benedikt Schifferer, Chris Deotte, Jean-Francois Puget, Gabriel de Souza Pereira Moreira, Gilberto Titericz, Jiwei Liu and Ronay Ak: Using Deep Learning to Win the Booking.com WSDM WebTour21 Challenge on Sequential Recommendations. (1st place)
Michał Daniluk, Barbara Rychalska, Konrad Gołuchowski and Jacek Dąbrowski: Modeling Multi-Destination Trips with Sketch-Based Model. (2nd place and best paper award)
Yuanzhe Zhou: Explore next destination prediction. (3rd place)
Martín Baigorria Alonso: Data Augmentation Using Many-To-Many RNNs for Session-Aware Recommender Systems.
Aleksandr Petrov and Yuriy Makarov: Attention-based neural re-ranking approach for next city in trip recommendations.
Shotaro Ishihara, Shuhei Goda and Yuya Matsumura: Weighted Averaging of Various LSTM Models for Next Destination Recommendation.
Yoshihiro Sakatani: Combining RNN with Transformer for Modeling Multi-Leg Trips.
Marlesson R. O. Santana and Anderson Soares: Hybrid Model with Time Modeling for Sequential Recommender Systems.
December 6th, 2020
The training dataset is accessible after a short pre-registration
Test set release
January 5th, 2021
The test set will be released to registered e-mails
Intermediate leaderboard submission
January 14th, 2021
Single submission is closed. Intermediate leaderboard results are published here.
Release leaderboard data
January 20th, 2021
January 28th, 2021
The teams are expected to submit their top four city predictions for each trip in the test set. The submission should be completed here.
Announcement on the winners
February 4th, 2021
The organizers will reveal the performance on the test set and will announce the final leaderboard
Paper submission deadline
February 18th, 2021
The top 10 teams will be invited to submit short papers
February 25th, 2021
Camera ready submission
March 4th, 2021
Participants submit their papers to email@example.com
March 12th, 2021
Virtual participation at the workshop is mandatory in order to be eligible for a prize
Deadlines refer to 23:59 (11:59pm) in the AoE (Anywhere on Earth) time zone
Mizrachi, Sarai, and Pavel Levin. "Combining Context Features in Sequence-Aware Recommender Systems." RecSys (Late-Breaking Results). 2019.
Bernardi, Lucas, Themistoklis Mavridis, and Pablo Estevez. "150 successful machine learning models: 6 lessons learned at Booking.com." Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2019.
Kiseleva, Julia, Melanie JI Mueller, Lucas Bernardi, Chad Davis, Ivan Kovacek, Mats Stafseng Einarsen, Jaap Kamps, Alexander Tuzhilin, and Djoerd Hiemstra. "Where to go on your next trip? Optimizing travel destinations based on user preferences." In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 1097-1100. 2015.
Levin, Pavel. “Modeling Multi-Destination Trips with Recurrent Neural Networks” https://youtu.be/pwfwUA4ZShI . DataConf ‘18, November 25th 2018, Jerusalem, Israel.
Goldenberg, Dmitri. “How to stay statistically significant in Agile environment?” https://youtu.be/6R2PKJ0RD_s . Data Science Festival, March 6th 2019, Manchester, United Kingdom.
Mizrachi, Sarai. “Modeling Multi-Destination Trips with RNN” https://youtu.be/NtvxDFGbdPQ . PyData, August 29th, 2019, Tel Aviv, Israel.