Case Study

Great Rail Journeys

An intro to Great Rail Journeys

Great Rail Journeys is a leader in escorted group touring, with a differentiated rail proposition at its core. Established in 1973, the company has an extensive brand heritage and an established presence across the UK and US source markets. Headquartered in York and Chicago, it offers a global list of destinations, with routes across Western Europe among the most popular.

The Challenge

Great Rail Journeys’ customers book up to two years in advance of travel. This poses a pricing challenge, particularly for rail transport, where tickets are only released for sale much closer to the travel date. As a result, Great Rail Journeys must price the package for the customer long before it knows what the rail tickets will cost.

Previously, Great Rail Journeys relied on manual estimation, which is time-consuming and prone to overlooking sudden changes in factors such as inflation, interest rates, and fuel prices. Accurate estimation is key to providing a fair and competitive price to the customer, which Great Rail Journeys is keen to do.

The Solution

To improve estimation, Great Rail Journeys partnered with Enablis on a proof of concept (POC) to see whether artificial intelligence (AI) and machine learning (ML) could be used to improve their pricing model. If AI/ML could match, or better, manual performance, it would save time and reduce risk.

Implementation

The first step was to train a model using historic bookings data.

Over several iterations and tests, we found that a combination of CatBoost, random forest and XGBoost ML algorithms produced the highest scoring model.
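The case study does not say how the three algorithms were combined. One common option is to average the predictions of the individual models. The minimal sketch below assumes that approach and uses plain Python functions as hypothetical stand-ins for fitted CatBoost, random forest and XGBoost regressors. The function names, the `lead_time_days` feature and all numbers are illustrative only.

```python
from statistics import mean
from typing import Callable, Mapping, Sequence

# A "model" here is any callable that maps a booking's features to a price.
# In a real pipeline these would be fitted regressors.
Model = Callable[[Mapping[str, float]], float]


def ensemble_predict(models: Sequence[Model], features: Mapping[str, float]) -> float:
    """Average the predictions of several models (an assumed combination rule)."""
    if not models:
        raise ValueError("at least one model is required")
    return mean(model(features) for model in models)


# Hypothetical stand-ins for three fitted regressors.
def model_a(features: Mapping[str, float]) -> float:
    return 100.0 + 0.5 * features["lead_time_days"]


def model_b(features: Mapping[str, float]) -> float:
    return 110.0 + 0.4 * features["lead_time_days"]


def model_c(features: Mapping[str, float]) -> float:
    return 90.0 + 0.6 * features["lead_time_days"]


estimate = ensemble_predict([model_a, model_b, model_c], {"lead_time_days": 100})
```

Averaging is only one way to combine models. Weighted averages or a stacked meta-model are alternatives, and the approach actually used in the POC may differ.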

We wrapped this in a Python-based ML pipeline, which allowed the model to be run repeatedly to predict prices for different dates, routes and product ranges.

The next step was to run the model against test data. To do this, we had withheld the most recent six months of booking data from the training process. This meant we could treat it as future bookings, have the model predict prices, and then compare those predictions with what really happened.
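A time-based holdout of this kind can be sketched as follows. This is an illustration rather than the POC's actual code: the `booked_on` field, the helper names and the sample records are all assumptions made up for the example.

```python
from datetime import date


def months_before(d: date, months: int) -> date:
    """Return the date `months` calendar months before `d`.

    The day is clamped to 28 so the result is always a valid date.
    This is a simplification that is good enough for a cutoff.
    """
    total = d.year * 12 + (d.month - 1) - months
    year, month_index = divmod(total, 12)
    return date(year, month_index + 1, min(d.day, 28))


def split_by_recency(bookings, holdout_months=6):
    """Split bookings into (train, test), withholding the most recent months."""
    latest = max(b["booked_on"] for b in bookings)
    cutoff = months_before(latest, holdout_months)
    train = [b for b in bookings if b["booked_on"] <= cutoff]
    test = [b for b in bookings if b["booked_on"] > cutoff]
    return train, test


# Made-up records, for illustration only.
bookings = [
    {"booked_on": date(2023, 1, 15), "price": 120.0},
    {"booked_on": date(2023, 5, 2), "price": 135.0},
    {"booked_on": date(2023, 9, 20), "price": 150.0},
    {"booked_on": date(2023, 12, 1), "price": 160.0},
]
train, test = split_by_recency(bookings)
```

Splitting by date rather than at random keeps the model from seeing any information from the period it is later asked to predict.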

Challenges

Data volumes

  • The Challenge: By narrowing the scope to a single route, the volume of training data the model could use was reduced. Whilst we were aware of this, we hadn’t considered that customer bookings would be rolled up into batched purchases, further reducing the volume.
  • The Solution: We extended the baseline model to include all journeys by one operator. This increased the volume of training data whilst retaining consistency in the factors that drive pricing decisions.

Data evolution

  • The Challenge: The data within a complex system is occasionally altered by historical activities and maintenance, which can disrupt the model. We identified a few anomalies where the data did not align as expected.
  • The Solution: We either excluded or cleaned the data. Choosing the correct path required a high degree of business knowledge. Each anomaly we discovered and corrected or omitted led to a slightly more accurate model.

Product evolution

  • The Challenge: Some differences in historic data were a result of the way the product had evolved. For example, the product range data field was only recently introduced. Historic data was not classified in the same way and so the model couldn’t use it for training purposes.
  • The Solution: As this was an important indicator, we needed to back-populate for historic records. The Great Rail Journeys data team, who we worked closely with throughout, provided crucial support in the process.

Recent data profile

  • The Challenge: Whilst we held back the most recent six months of data to be used for testing, we hadn’t considered that much of it was yet to be priced. Our model was able to come up with an estimate, but there was nothing to compare it to and check the accuracy.
  • The Solution: We tested with a subset of the recent data that had been priced. Whilst this meant that it skewed towards shorter lead-time bookings, which inevitably have more volatile pricing, we could still compare the accuracy of our model with manual predictions to gauge its performance.

The Results

Metrics

We used three metrics to assess the accuracy of a set of price estimations:

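  • Mean Delta : The signed average of the difference between estimated and actual price across all estimates, so over- and under-estimates offset each other. Gives a view of the overall P&L position if the given predictions are used.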
  • Mean Absolute Error (MAE) : The average margin of error across all estimates in absolute terms, considering positive and negative errors the same.
  • Root Mean Squared Error (RMSE) : Gives exaggerated importance to large gaps between estimated and actual prices and so measures the tightness of fit. This assesses the ability to predict spikes in pricing, which would affect individual tours’ profitability.
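The three metrics can be written out as a short sketch. The sign convention (estimate minus actual) is an assumption, since the case study does not state it, and the prices are made-up illustrative values.

```python
from math import sqrt


def mean_delta(estimates, actuals):
    """Signed average of (estimate - actual); over- and under-estimates cancel."""
    return sum(e - a for e, a in zip(estimates, actuals)) / len(estimates)


def mean_absolute_error(estimates, actuals):
    """Average size of the error, ignoring its direction."""
    return sum(abs(e - a) for e, a in zip(estimates, actuals)) / len(estimates)


def root_mean_squared_error(estimates, actuals):
    """Square root of the average squared error; large misses dominate."""
    squared = sum((e - a) ** 2 for e, a in zip(estimates, actuals))
    return sqrt(squared / len(estimates))


# Illustrative values only; the errors are -10, +10, 0 and +30.
estimates = [100.0, 120.0, 90.0, 150.0]
actuals = [110.0, 110.0, 90.0, 120.0]

md = mean_delta(estimates, actuals)                 # 7.5
mae = mean_absolute_error(estimates, actuals)       # 12.5
rmse = root_mean_squared_error(estimates, actuals)  # about 16.58
```

The single error of 30 lifts RMSE well above MAE, which is why RMSE is the metric of choice for spotting missed price spikes.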

Man vs. Model

We compared the accuracy of the ML model against manual price estimations across tens of thousands of bookings. We evaluated performance on the three metrics above, both overall and broken down by product range.

Analysis

  • The model outperformed manual prediction in 15 of the 16 categories.
  • Manual estimation had a slightly better mean delta for the Discovery product range but still performed worse than the model on MAE. This means that the model was closer to the actual price on most estimates, but the wins and losses of manual estimation cancelled each other out to a greater extent.
  • The overall mean delta of the model was almost zero. This means that the wins and losses cancel each other out almost exactly across the full product range. This is largely due to the mean delta for the Classic product range being significantly better for the model (0.51) than for manual estimation (6.37).
  • The model’s MAE of 13.95 for the Classic product range is particularly impressive as it is notoriously hard to predict that product due to differences in the ticket class that the POC model is unaware of.
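The difference between a good mean delta and a good MAE can be seen in a small worked example. The numbers below are invented for illustration and are not taken from the POC results. One estimator makes large errors that cancel out, while the other makes small errors that all fall in the same direction.

```python
from statistics import mean

actual = [100.0, 100.0, 100.0, 100.0]
estimator_a = [80.0, 120.0, 70.0, 130.0]    # large errors that cancel out
estimator_b = [104.0, 105.0, 103.0, 104.0]  # small errors, all in one direction


def signed_errors(estimates, actuals):
    """Per-booking error, assuming the convention estimate - actual."""
    return [e - a for e, a in zip(estimates, actuals)]


a_errors = signed_errors(estimator_a, actual)
b_errors = signed_errors(estimator_b, actual)

a_mean_delta = mean(a_errors)           # 0.0
a_mae = mean(abs(x) for x in a_errors)  # 25.0
b_mean_delta = mean(b_errors)           # 4.0
b_mae = mean(abs(x) for x in b_errors)  # 4.0
```

Estimator A has the better mean delta, yet estimator B is closer to the actual price on every single booking.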

The Results

Fair Pricing

Competitive, accurate fares for customers.

Risk Mitigation

Reduced exposure to unforeseen price spikes.

Efficiency

Freed up teams to focus on higher-value activities.

The Impact

The POC demonstrated that an AI/ML approach can be used to not only match, but improve upon, manual estimation of fare prices. This will:

  • Improve the ability to offer fair and competitive customer pricing.
  • Reduce the risk of losses due to unforeseen price spikes.
  • More accurately forecast overall P&L on a tour in line with budgets.
  • Refocus manual efforts onto higher-value activities, such as new routes.
  • Safeguard the business against key skills becoming a single point of failure.

What the client said

Kerry Jenkin

Great Rail Journeys, CIO

“This is a great example of how AI can be used to drive business value that directly hits the bottom line. Working in collaboration with our domain SMEs, Enablis was able to understand our business and deliver an incredibly impressive outcome very quickly.”