Mar 30, 2017

Travelling Salesman Challenge — Recap

The Travelling Salesman Challenge has ended which means it’s a great time to look back and summarise the event. You can see the pictures here and watch the video here.

Designing the challenge

There are multiple ways of setting up the technical environment for this kind of competitive programming event. The most important question is where to run the contestants’ code during the evaluation stage.

After experimenting with the Data challenge six months ago, we decided to keep the same model and provide our own infrastructure for both the development phase and the evaluation phase. This is a different approach from Kaggle-like contests, where contestants process the dataset on their own machines. Our approach ensures that each contestant’s program can use only the provided resources (CPU/RAM), meaning a fair environment for everybody.

The idea of this approach originated in PAN, a series of scientific events and shared tasks on digital text forensics in which I participated a few years ago. We also decided to replace virtual machines with Docker containers (thanks, Simone, for this idea), which allowed us to easily follow the steps below.

Running the challenge

Development phase

  • Create a Docker container for each team (with limited CPU/RAM)
  • Mount one shared volume of the dataset to each container
  • Map an SSH port of each container to the host’s port range
  • Allow teams to develop and test their programs (in parallel of course)

Evaluation phase

  • Block incoming traffic and stop the containers
  • Run the contestants’ programs within their containers sequentially with generously allocated resources
  • Collect and validate the results for each level/iteration

Below is the command we used to give each contestant’s container generous resources and run their program with a time limit. We ran the containers sequentially, one by one, so that each could be allocated more resources during the evaluation.

cat ${inp_data} | docker run \
    --name ${team_name} \
    --memory 30G -i --rm \
    -v /opt/${team_name}:/app ${team_name} \
    /usr/bin/timeout --signal=SIGTERM --kill-after=1s 30s \
    /app/run > ${out_file}.output 2> ${out_file}.error

The command also ensured that the contestant’s process received SIGTERM after 30 seconds and then SIGKILL one second later.

Results of the challenge

This colourful table (also available in Google Drive) shows the distribution of prices found by the teams’ algorithms. Each row represents a team, while the columns are the input datasets (“levels”). The red colour means that the output of the program was not valid (usually empty). Green/blue colouring represents the price-value distribution.

You can see that bigger datasets correlated with more failures of the contestants’ programs. Some of the programs failed to run or to produce valid outputs. I went through each of the errors and there was quite a variety: core dumps, missing libraries, empty results after timeout, etc. Fortunately, the majority of programs were able to process each level.

Processing small datasets was easy and the programs produced the same results. The most important factor in the finalists’ rankings was their results for the large datasets.

The winner

Congratulations to Petr Lavička. We hope you enjoy your Trip Around the World with your parents. You can find Petr’s code on GitHub.

The top two teams had very similar solutions, both using the Simulated Annealing algorithm. They started from a random permutation and then repeatedly swapped pairs of cities. A swap that lowered the price was always accepted; a swap that raised it could still be accepted, with a probability that shrank as the run progressed. This “cooling” is what makes Simulated Annealing stronger than the plain Hill Climbing algorithm: by occasionally accepting worse states early on, it can escape local optima and get closer to the global one.
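In outline, that approach looks something like the sketch below. This is an illustrative implementation, not the winners’ code; the exponential cooling schedule and the parameter values are my own assumptions:

```python
import math
import random

def tour_price(tour, price):
    """Total price of visiting the cities in `tour` order and returning home.
    `price[a][b]` is the cost of travelling from city a to city b."""
    n = len(tour)
    return sum(price[tour[i]][tour[(i + 1) % n]] for i in range(n))

def simulated_annealing(price, steps=100_000, t_start=100.0, t_end=0.1, seed=0):
    rng = random.Random(seed)
    n = len(price)
    tour = list(range(n))
    rng.shuffle(tour)                      # start from a random permutation
    best = list(tour)
    cur_cost = best_cost = tour_price(tour, price)
    for step in range(steps):
        # Exponential cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (step / steps)
        i, j = rng.sample(range(n), 2)     # pick two cities to swap
        tour[i], tour[j] = tour[j], tour[i]
        new_cost = tour_price(tour, price)
        # Always accept improvements; accept worse tours with a probability
        # that shrinks as the temperature drops.
        if new_cost < cur_cost or rng.random() < math.exp((cur_cost - new_cost) / t):
            cur_cost = new_cost
            if cur_cost < best_cost:
                best, best_cost = list(tour), cur_cost
        else:
            tour[i], tour[j] = tour[j], tour[i]   # undo the swap
    return best, best_cost
```

The only problem-specific ingredient is the price matrix; the acceptance rule and cooling schedule are generic.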

Hill Climbing Algorithm with Simulated Annealing. Source: Wikipedia

The difference between first and second place was a hot topic. Petr managed to get the lowest price because, at the 300-city level, he did 120 million swaps, double the speed of the team in second place. Petr’s solution included parallelisation, with a different randomisation seed on each of the server’s four processor cores.
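That parallelisation pattern, running independent randomised searches with distinct seeds and keeping the cheapest result, can be sketched like this. The `solve` worker here is a hypothetical stand-in for a full annealing run, not Petr’s code:

```python
import random
from multiprocessing import Pool

def solve(seed):
    """Hypothetical stand-in for one independent annealing run: each process
    seeds its own RNG, so the searches explore different swap sequences.
    A real worker would run the full optimisation loop and return its best tour."""
    rng = random.Random(seed)
    best_price = min(rng.randrange(10**9) for _ in range(1000))
    return best_price, seed

if __name__ == "__main__":
    # One process per core, each with a distinct seed; keep the cheapest result.
    with Pool(processes=4) as pool:
        results = pool.map(solve, [0, 1, 2, 3])
    best_price, best_seed = min(results)
    print(f"best price {best_price} came from seed {best_seed}")
```

Because the runs share nothing, this scales almost linearly with the number of cores during the fixed 30-second budget.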

Other teams

We sent a questionnaire to each team to find out which programming language and algorithm they used. The basic view of the language used, development time, and position in the leaderboard (lower means better) is charted in this graph.

Used language, development time in hours, and position in the leaderboard (lower means better)

As far as I can remember, the top three positions were taken by teams that used Simulated Annealing, but that is the only clear pattern I can see right now. Feel free to analyse the source data in the spreadsheet, where we have also added the leaderboard positions.
