Taking a Crack at the Traveling Salesman Problem

The human mind is a path-planning wizard. Think back to pre-lockdown days when we all ran multiple errands back to back across town. There was always a mental dance in the back of your head to make sense of how you planned the day. It might go something like “first to the bank, then to drop off the dry-cleaning. Since the post office is on the way to the grocery store, I’ll pop by and send that box that’s been sitting in the trunk for a week.”

This sort of mental gymnastics doesn’t come naturally to machines — it’s actually a famous problem in computer science known as the traveling salesman problem. While it is classified in the industry as an NP-hard problem in combinatorial optimization, a more succinct and understandable definition would be: given a list of destinations, what’s the best round-trip route that visits every location?

This summer brought news that the 44-year old record for solving the problem has been broken. Let’s take a look at why this is a hard problem, and how the research team from the University of Washington took a different approach to achieve the speed up.

The Science of Getting From Here to There

The traveling salesman problem (or TSP for short) has been one of the most studied problems in computer science. It was first considered all the way back in the 1800s by mathematicians W.R Hamilton and Thomas Kirkman. Despite what you might think, it actually has applications outside the GPS navigation system on your smartphone. What can be modeled as a TSP includes everything from the layout of ASICs, the school bus route plans, the order of drilling holes for PCBs, to DNA sequencing. The “distance” between points could mean physical distance or similarity between genes.

In the 1930s, Karl Menger an Austrian-born mathematician who was a visting professor at Harvard and Rice at about the same time, considered the brute force solution that simply considers every possible route. That’s approach becomes dauntingly difficult quickly as just 15 different locations yield 1,307,674,368,000 different routes to compute (that’s 15 factorial). As you can imagine, scaling to hundreds or thousands of different destinations quickly becomes uncomputable. Right around 60 different destinations yields around the same number of possible routes as the number of estimated particles in the universe. And so a classic computing challenge was born.

There are several different versions of the TSP such as metric, euclidean, asymmetric, and more domain-specific constraints such as adding a cost to traveling different routes in addition to the distances themselves. This leads to a distinction between the generic and non-generic TSP.

Our Unique Talents

Strangely enough, humans are actually quite good at solving the euclidean version of the problem. Humans can arrive at a near-perfect solution in close to linear time even in cases with more than 120 destinations. The ease at which we can accomplish this feat has fascinated several cognitive psychologists and prompted dozens of studies and papers on the subject. In fact, this same ability has been shown in pigeons and shape-shifting bacteria.

The current hypothesis is that humans have several heuristics or shortcuts that make them very good at satisficing — a term coined by Herbert A. Simon that simply means arriving at a decision that is “good enough”. Most of what computers do is calculating a specific and verifiable answer. It is correct or incorrect with no room in between. It is much easier to arrive at the perfection solution than it is to define what is good enough given the current context.

The Approximate Solutions

In this vein, computer scientists over the past half-century have been refining and applying new techniques and approximations that computers can use to sort of guess their way to a better solution. Techniques such as dynamic programming were able to get the number of routes to calculate down to n²2ⁿ or 7,372,800 possible routes for 15 destinations, a far cry less than one trillion. Branch and bound algorithms are able to get that even lower but so far, it hasn’t been proven that there exists a perfect algorithm that evaluates less than 2ⁿ routes.

Modeling ant colonies, which use pheromone trails to build consensus for shortest path, was proposed in this paper published in 1997.

This is still largely unreasonable for larger data sets and so approximate solutions needed to be solved for rather than perfect ones. Heuristics such as Nearest Neighbor, which simply goes to the nearest destination that hasn’t been visited, randomized searching using Markov chains, and even simulating ant colonies are all solutions that are used to produce approximate solutions. However, even the best of these algorithms are only able to guarantee that their solution will be at most 50% longer than the perfect solution.

A Razor Thin Improvement is Worth More than Face Value

While we’ve seen hardware being thrown at the TSP in both custom accelerators as well as the sheer number of cores working on the problem, there haven’t been any algorithms able to break that 50% number. Until the recent preprint paper published by Anna Karlin, Nathan Klein, and Shayan Gharan about a new technique that offers a slight improvement for the metric version of the TSP. The improvement just shaves 10^-34 off that 50% number. That may not seem very much and you can argue that it is so tiny that it doesn’t matter on all but the largest data sets.

However, the key is that it was mathematically demonstrated that improvement was and is possible. The record of the past 44 years has been broken and the authors of the paper hope that it will inspire others. A similar breakthrough happened in 2011 when the graphical case of the TSP saw a small improvement. That small refinement inspired other researchers who developed redesigned algorithms. What started as a simple augmentation eventually lead to breaking the barrier of 50% all the way down to 40% for the graphical case.

The takeaway here is that problems can be solved. Even problems that seem like you’ve hit a limit that can’t be solved. It’s nice to find that 44-year-old problems still get some attention, pulling the state of the art along with them as new eyes find unexpected approaches.