Traveling Salesman Problem Path Finder in SQL

Finding optimal travel paths between cities and traveling salesman problem working in SQL

I came across a situation to find the path for a salesman based on city locations and roads connecting these cities. The base version of the scenario was the same problem as Traveling Salesman Problem. I wanted to try solving the problem with SQL.

Below is the problem statement, I got on internet for Traveling Salesman Problem, try reading through the inputs and outputs required. And I also posted solutions I made in both T-SQL and PostgreSQL variants.

First, is the City table containing two columns, Id and Name. This time, we're traveling between cities in Germany and we've been given a second table called Road that has the columns CityFrom, CityTo, and Time which contain average trip durations on a given route from one city to another. Below is create statement and data to populate rows.

create schema trip
create table trip.city(
    id int identity(1,1), -- in postgresl use serial datatype instead
    name varchar(100)
);

insert into trip.city(name) values ('Berlin'), ('Hamburg'), ('Muinch'), ('Amsterdam');
select * from trip.city;

create table trip.road(
    city_from int,
    city_to int,
    time int
);
insert into trip.road(city_from, city_to, time) values 
(1, 2, 180), (1, 3, 365), (1, 4, 390), (2, 3, 490), (2, 4, 270), (3, 2, 420), (3, 4, 370), (4, 2, 245), (4, 3, 385);
select *  from trip.road;

The trip path should cover all cities, starting from Berlin. The main task is to return travel paths starting from Berlin and covering all cities in descending order by the total time taken.

The output should contain the following columns

  1. path – the city names, separated by N' -> ',

  2. last_city_id – the ID of the last city visited,

  3. total_time – the total time spent driving,

  4. places_count – the number of places visited; it should equal 4.

The below solution is T-SQL(SQL Server) based variant:

;with travel (path, last_city_id, total_time, places_count )
as
(
    select
        cast(name as nvarchar(max)),
        id,
        0,
        1
    from trip.city c
    where c.name = 'Berlin' -- since starting from berlin, then we initially return the anchor point

    union all 

    select
        t.path + N' -> ' + c.name, -- concate the city name with existing travel path name 
        c.id, 
        t.total_time + r.time,
        t.places_count + 1
    from travel t
    join trip.road r
        on t.last_city_id= r.city_from
    join trip.city c
        on c.id = r.city_to
    where charindex(c.name, t.path) = 0 -- where part was to prevent revisiting the same city twice in a path

)

select * from travel where places_count = 4 order by total_time desc;

And below is PostgreSQL varient

WITH RECURSIVE travel (path, last_city_id, total_time, places_count) AS (
  SELECT
    CAST(name AS text),
    id,
    0,
    1
  FROM trip.city c
  WHERE c.name = 'Berlin'

  UNION ALL

  SELECT
    t.path || ' -> ' || c.name,
    c.id,
    t.total_time + r.time,
    t.places_count + 1
  FROM travel t
  JOIN trip.road r
    ON t.last_city_id = r.city_from
  JOIN trip.city c
    ON c.id = r.city_to
  WHERE position(c.name IN t.path) = 0
)

SELECT *
FROM travel
WHERE places_count = 4
ORDER BY total_time DESC;

In the codes above, we used common table expression (CTE) to recursively build up a list of travel paths between cities. The travel CTE contains four columns: path (the travel path so far), last_city_id (the ID of the last city visited), total_time (the total travel time so far), and places_count (the number of places visited so far).

The first part of the CTE returns to the starting point in Berlin. Then, the union all clause is used to recursively build up the travel paths. In the second part of the CTE, you're joining the travel CTE with the road and city tables to find the next city to visit. The charindex function (in T-SQL) or position function (in PostgreSQL) is used to make sure you don't revisit any cities on the same path. You can also use the substring function in PostgreSQL as an alternative.

Finally, the SELECT statement is used to return all travel paths that visit all four cities and are ordered in descending order by total travel time.