Interview with Jan Plhák, Kiwi.com Principal Engineer
Modern technology gives us the power not only to create new “realities” but also to innovate in areas established long ago. People have travelled in different ways for centuries; technology, however, has improved search capabilities and, with them, the customer experience. We met with Jan Plhák, Kiwi.com Principal Engineer, to discuss how the synergy between science and modern technology made it possible to create Nomad — a product that brings a new perspective to the standard Multi City search.
You’ve been among the key people working on Nomad. It is a representation of the so-called Travelling Salesman Problem (TSP), which is important in theoretical computer science. How did you start exploring the possibility of its integration, and what was your motivation behind it?
I am a mathematician by training, and TSP is one of the best-known NP-hard problems, so I was naturally intrigued to work on it. On top of that, Nomad is a natural extension of the Multi City search we already offered at the time. For many of our Multi City customers, the exact order of the cities is probably not critical. By dropping the ordering constraint, we gain access to a wider optimization space in which we can push the price even lower than a regular Multi City search can, which means better value for the customer.
It is worth noting that while computing the optimal solution for TSP is difficult on its own, it is only half of the problem. The reason is that before you start computing anything, you need to gather all of the underlying data. That is a big task in itself, as it requires getting the optimal itineraries for all city pairs and all relevant days in a matter of milliseconds. We happened to already have a very advanced routing system, so the question was: “Why not use it to create Nomad?”
That is unexpected to hear. Usually, the main point of interest behind Nomad’s creation is the complexity of the algorithm, while the need to handle such massive amounts of data tends to be underestimated.
Imagine that for every possible departure date, between every two via cities, you need an optimal itinerary. The number of cells grows quadratically with the number of via cities and linearly with the number of departure dates. In addition to that, you actually need multiple optimal solutions per cell to provide variability. All of this results in a very big 4-dimensional data tensor. It is quite difficult to compute an optimal Virtual Interlining itinerary even for a single day, and it took us some time to do it in real time. Luckily for us, we could leverage our own custom database — Skystore — where we store hundreds of billions of pre-computed optimal itineraries. This database is a small miracle in itself. So Nomad is a synthesis of being able to solve a time-dependent Travelling Salesman Problem and being able to get the data.
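A back-of-the-envelope sketch makes the scaling concrete. The function below is purely illustrative (the names and the factor-of-k layout are assumptions, not Skystore’s actual design): one cell per (origin, destination, departure date), with the k cheapest itineraries kept per cell for variability.

```python
# Hypothetical sizing of the 4-dimensional itinerary tensor described
# above: (origin, destination, departure date, alternative index).
def tensor_cells(n_cities: int, n_dates: int, k_alternatives: int) -> int:
    ordered_pairs = n_cities * (n_cities - 1)        # quadratic in via cities
    return ordered_pairs * n_dates * k_alternatives  # linear in dates

# E.g. 9 via cities, a 30-day departure window, 5 alternatives per cell:
print(tensor_cells(9, 30, 5))  # → 10800
```

Even this modest example needs over ten thousand pre-computed optimal itineraries before the TSP solver can start, which is why fast access to pre-computed data matters so much.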
Considering the size of the data, did you have any issues with the processing time to get the response?
If we are not speaking about extreme scenarios (e.g. 9 cities combined with very wide time windows), the search query is solved very fast. In most cases, we are able to use a deterministic algorithm that always guarantees the correct solution. More complicated instances are solved by a probabilistic algorithm.
In fact, we started with the deterministic algorithm while building Nomad. It is based on a dynamic programming algorithm which can be found on Wikipedia, although it’s fair to say that we had to modify it heavily: in our case the problem is time-dependent and comes with many additional constraints (stay duration, total trip length, …) as well as variability requirements.
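The dynamic programming algorithm referred to here is, in its textbook form, the Held-Karp algorithm. Below is a minimal sketch of that classic version on a static cost matrix — without any of the time-dependence or extra constraints mentioned above, which are Kiwi.com’s own modifications:

```python
from itertools import combinations

def held_karp(dist):
    """Held-Karp dynamic program for TSP, O(n^2 * 2^n).

    dist[i][j] is the cost of travelling from city i to city j.
    Returns (min_cost, tour) for a round trip starting and ending at city 0.
    """
    n = len(dist)
    # best[(mask, j)] = (cost, predecessor): cheapest way to leave city 0,
    # visit exactly the cities in bitmask `mask`, and end at city j.
    best = {(1 << j, j): (dist[0][j], 0) for j in range(1, n)}
    for size in range(2, n):
        for subset in combinations(range(1, n), size):
            mask = sum(1 << c for c in subset)
            for j in subset:
                prev_mask = mask ^ (1 << j)
                best[(mask, j)] = min(
                    (best[(prev_mask, k)][0] + dist[k][j], k)
                    for k in subset if k != j
                )
    full = (1 << n) - 2  # every city except the start, city 0
    cost, j = min((best[(full, j)][0] + dist[j][0], j) for j in range(1, n))
    # Walk the stored predecessors backwards to recover the tour.
    path, mask = [], full
    while j != 0:
        path.append(j)
        _, k = best[(mask, j)]
        mask ^= 1 << j
        j = k
    return cost, [0] + path[::-1] + [0]

dist = [
    [0, 2, 9, 10],
    [1, 0, 6, 4],
    [15, 7, 0, 8],
    [6, 3, 12, 0],
]
print(held_karp(dist))  # → (21, [0, 2, 3, 1, 0])
```

The exponential term in the complexity explains the “extreme scenarios” caveat above: each extra via city doubles the number of subsets the solver must consider, and the time windows multiply the state space further.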
Was it planned from the beginning to have two algorithms? Or did you start with one and realize later that it would not be sufficient to solve all the requests?
Originally we built just the deterministic one. Later on, however, we added the probabilistic one, which is able to address more complicated queries, especially when the customer selects multiple destinations within one itinerary segment. For example, thanks to the second engine, we can serve customers who want to visit multiple countries but only one city in each country. For that, you need to ensure continuity of the itineraries, which makes the problem much harder.
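To make the “one city per country” variant concrete, here is a toy brute-force illustration (the city codes and costs are invented, and this is not Kiwi.com’s probabilistic engine, which must scale far beyond brute force): pick exactly one city from each country group, then find the cheapest order to visit the picks.

```python
from itertools import permutations, product

def cheapest_grouped_tour(groups, cost):
    """groups: list of city groups; cost[(a, b)]: one-way travel cost."""
    best = (float("inf"), None)
    for picks in product(*groups):          # one city per country
        for order in permutations(picks):   # every visiting order
            total = sum(cost[order[i], order[i + 1]]
                        for i in range(len(order) - 1))
            best = min(best, (total, order))
    return best

groups = [["PRG", "BRQ"], ["VIE", "GRZ"]]   # e.g. Czechia, Austria
cost = {("PRG", "VIE"): 60, ("VIE", "PRG"): 60,
        ("PRG", "GRZ"): 90, ("GRZ", "PRG"): 90,
        ("BRQ", "VIE"): 30, ("VIE", "BRQ"): 30,
        ("BRQ", "GRZ"): 70, ("GRZ", "BRQ"): 70}
print(cheapest_grouped_tour(groups, cost))  # → (30, ('BRQ', 'VIE'))
```

The city choice and the visiting order must be optimized jointly — the cheapest city in one country may force expensive connections elsewhere — which is the continuity problem that makes this variant so much harder.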
How long did it take to prepare the first production version? What improvements can we expect in the future?
I think the first prototype was built in 3–4 weeks. The biggest challenge after that was actually getting it through our whole system: integrating it into our public API and preparing the UX and front-end in general. The UX in particular was very challenging, as we had to make it understandable for everyone.
As Kiwi.com grows and enriches its content, we have been working on securing the versatility of the results as well as growing the number of results per search.
There are also a few “hidden gems” ready that were deprioritized in favour of Covid-related changes made to support our customers. For now, these new Nomad features are awaiting their time.
Searching for a new adventure in Engineering? Check our open positions.