Rolling out the 1250 Credit Score

How we rolled out a fundamental change to millions of customers

By Jon Short

A successful rollout involves a ton of prep work including projecting usage, anticipating failure and monitoring. This article gives a very high-level look at the rollout of the recent "1250" Credit Score which was introduced by Experian UK.

As a staff engineer within the "Score & Report" domain, I was heavily involved this year in the technical rollout of the new Experian Credit Score to the entire UK customer base (tens of millions).

I won't go into detail about the new score itself, but here's a BBC article about it if you're interested.

Setting (technical) boundaries

With large projects like this it's important to establish rollout capabilities early on - what options do you actually have? Generally you won't be given a concrete rollout plan at the start of a project, so it's vital to ensure that stakeholders are aware of what levers they can pull while defining the rollout strategy.

I'd recommend assuming that the rollout plan will change during the build, and making sure that stakeholders & decision-makers know what can be achieved technically, even when you're not in the room.

For the new score project we defined the following levers:

  • A percentage of customers (e.g. 10% of customers)
  • Characteristic - their "customer number" (a customer-facing ID)
  • Characteristic - when they created their account

Illustration of the levers with associated colours

After defining these, we knew that any potential strategies being discussed outside of the tech teams would remain technically viable.

Example - Rollout to new signups? 100% of customers who signed up after December 1st

Example - Rollout to specific customers? Customers matching ABC

Example - Rollout to customers who recently logged in? Not possible without changes!
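A minimal sketch of how levers like these might combine into a single rule. This is an illustration only, not the tooling described in this article: the names `Customer`, `RolloutRule`, `bucket` and `in_rollout` are hypothetical, and hashing the customer number into 100 buckets is an assumed way of getting a stable percentage split.

```python
import hashlib
from dataclasses import dataclass
from datetime import date
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Customer:
    customer_number: str  # customer-facing ID
    created_on: date      # account creation date


@dataclass(frozen=True)
class RolloutRule:
    # Hypothetical rule shape: optional filters plus a percentage
    # applied to whoever passes the filters.
    percentage: int = 0
    created_after: Optional[date] = None
    customer_numbers: Optional[FrozenSet[str]] = None


def bucket(customer_number: str) -> int:
    """Map a customer number to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(customer_number.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def in_rollout(customer: Customer, rule: RolloutRule) -> bool:
    """Return True if the customer passes every filter and the percentage."""
    if rule.customer_numbers is not None and customer.customer_number not in rule.customer_numbers:
        return False
    if rule.created_after is not None and customer.created_on <= rule.created_after:
        return False
    return bucket(customer.customer_number) < rule.percentage


# "100% of customers who signed up after December 1st"
# (the year is arbitrary, chosen only for the example)
new_signups = RolloutRule(percentage=100, created_after=date(2024, 12, 1))
```

A strategy that needs a characteristic the rule doesn't carry (such as "last login") has no field to express it, which is the "not possible without changes" case.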

Compatibility

Any rollout strategy which isn't "switch it 100% ON NOW!" means you'll need to run / maintain two systems in parallel for a period of time.

This often ends up being a combination of:

  • Backwards / Forwards compatibility
  • Isolation of new elements
  • Preparatory data migrations

As part of this project we modernised most of the backend services which handle customers' scores / reports, using a mixture of strategies to isolate old / new depending on what best suited the element.

Diagram showing backwards compatible elements bridging old / new systems

Event driven elements were particularly useful as we could subscribe to the same events being published throughout the system:

Diagram showing the same events being consumed by new / old systems
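To make the idea concrete, here is a toy in-memory sketch of two consumers subscribed to the same event. It is an assumption-laden illustration rather than the real system: `EventBus`, the `report.updated` topic and both consumers are hypothetical names, and a production setup would use a message broker instead of a Python dictionary.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class EventBus:
    """Tiny in-memory publish/subscribe bus (illustrative only)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._handlers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        # Every subscriber receives its own copy of the event, so the
        # old and new systems cannot interfere with each other.
        for handler in self._handlers[topic]:
            handler(dict(event))


old_system_seen: List[str] = []
new_system_seen: List[str] = []


def legacy_consumer(event: dict) -> None:
    old_system_seen.append(event["customer_number"])


def new_consumer(event: dict) -> None:
    new_system_seen.append(event["customer_number"])


bus = EventBus()
bus.subscribe("report.updated", legacy_consumer)
bus.subscribe("report.updated", new_consumer)
bus.publish("report.updated", {"customer_number": "C123"})
```

The publisher does not need to know that a second consumer exists, which is what makes this style convenient for running two systems side by side.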

One of the most useful tools to have at this stage is the ability to determine the rollout state of the customer from anywhere in the system.

While I wouldn't advise designing a system that has to do these checks in lots of places, it's worth having this ability so you don't get blocked later in the project when some part of the system needs to add branching logic.

For the new score rollout, we enhanced our in-house A/B tool to maintain an aggressive cache of the immutable characteristics (customer number, account creation date), so we had a low-cost way of getting the customer's rollout status from anywhere in the system.
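A rough sketch of the caching idea, assuming the immutable characteristics can be loaded once and reused. This is not the in-house A/B tool: `load_characteristics`, `uses_new_score` and the fake data store are hypothetical, and `functools.lru_cache` simply stands in for whatever cache a real service would use.

```python
from dataclasses import dataclass
from datetime import date
from functools import lru_cache


@dataclass(frozen=True)
class Characteristics:
    customer_number: str
    created_on: date


# Stand-in for a database or downstream service (hypothetical data).
_FAKE_STORE = {"C123": date(2020, 1, 15)}
LOOKUPS = {"count": 0}  # counts how often the slow store is hit


@lru_cache(maxsize=100_000)
def load_characteristics(customer_number: str) -> Characteristics:
    """Fetch immutable characteristics; safe to cache because they never change."""
    LOOKUPS["count"] += 1
    return Characteristics(customer_number, _FAKE_STORE[customer_number])


def uses_new_score(customer_number: str, created_after: date) -> bool:
    """Evaluate the current rule against cached characteristics.

    Only the characteristics are cached. The rule itself is passed in
    fresh on every call, so the rollout can be changed at any time.
    """
    return load_characteristics(customer_number).created_on > created_after
```

Caching the characteristics rather than the rollout decision matters here: the decision changes as the rollout progresses, while the inputs to it do not.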

Preparing to fail

How much effort you need to put into handling failures during a rollout varies, usually depending on the customer or technical impact of an issue.

For the new score project it was vital we managed the risk of failure, so we added the following controls:

  • Pause rollout
    • At any stage we could have stopped and kept traffic to the new system as-is.
  • Rollback
    • We could have safely returned customers to the old score - we built an in-product experience to support this scenario too.
  • Temporarily limit access to the website / app
    • To stop traffic spikes from taking down systems which were not yet tuned for high demand, we could have throttled surges in traffic - similar to those queues on gig ticket websites.
  • Temporarily exclude specific customers from rollout
    • Useful if the makeup of a legacy customer's data caused issues within the new system: we could temporarily exclude them while we resolved the issues.
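Three of the controls above (pause, rollback and exclusion) can be sketched as a small piece of state consulted on every request. This is a hedged illustration, not the real implementation: `RolloutControls` and its fields are hypothetical, the `bucket` argument is assumed to be a stable 0..99 value derived from the customer, and the traffic-limiting control is left out because it lives at the edge of the site rather than in rollout logic.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class RolloutControls:
    target_percentage: int = 0
    paused: bool = False        # freeze the rollout where it is
    rolled_back: bool = False   # send everyone back to the old score
    excluded: Set[str] = field(default_factory=set)  # customer numbers held back

    def request_percentage(self, new_percentage: int) -> int:
        """Change the target unless paused or rolled back; return the effective target."""
        if not self.paused and not self.rolled_back:
            self.target_percentage = new_percentage
        return self.target_percentage

    def serves_new_score(self, customer_number: str, bucket: int) -> bool:
        """Decide which score a customer sees, given their stable bucket (0..99)."""
        if self.rolled_back or customer_number in self.excluded:
            return False
        return bucket < self.target_percentage
```

Note that pausing leaves already-onboarded customers on the new system, whereas rollback overrides the percentage entirely.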

Thankfully we didn't actually need to use most of these controls, but having the peace of mind that we could manage failure was worth the extra implementation effort in this case.

Dogfooding

Using your own features & products is always useful, but it works especially well when validating personal data (e.g. a score or report). It's often easier to spot inaccuracies or inconsistencies if the data is your own.

Because of this it was important for us to get staff using and feeding back on the new score as early as possible, so we rolled out to internal staff early on in the project, and asked them to use the product as they normally would.

All staff were invited to participate, and we got loads of useful feedback which we fed back into features.

Going live

That point of going live to the first real customer is the most satisfying and nerve-wracking moment in any project.

For the new score project, in the run-up to launch we were building awareness amongst customers (in-product messaging, press releases, other CRM campaigns).

Martin Lewis explaining the credit score change

It's worth taking a pause before going live just to verify that everyone is aware of the technical and commercial metrics to monitor, and how they might affect the pace of rollout.

For example:

  • Service stability / expected error-rate
  • Performance tolerance
  • Customer sentiment

These need to be continually checked, with alarms and dashboards available to anyone who's interested (ideally not just the tech teams).
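One way to make "how metrics affect the pace of rollout" explicit is a simple gate checked before each new cohort. The sketch below is purely illustrative: the `Metrics` and `Thresholds` types, the sentiment score scale and every threshold value are assumptions made up for the example, not figures from this project.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Metrics:
    error_rate: float       # fraction of failed requests, 0.0 to 1.0
    p95_latency_ms: float   # 95th percentile response time
    sentiment_score: float  # hypothetical 0.0 to 1.0 customer sentiment measure


@dataclass(frozen=True)
class Thresholds:
    # All values are placeholders chosen for illustration.
    max_error_rate: float = 0.01
    max_p95_latency_ms: float = 800.0
    min_sentiment_score: float = 0.6


def breaches(metrics: Metrics, thresholds: Thresholds) -> List[str]:
    """List every metric currently outside its agreed tolerance."""
    problems: List[str] = []
    if metrics.error_rate > thresholds.max_error_rate:
        problems.append("error rate")
    if metrics.p95_latency_ms > thresholds.max_p95_latency_ms:
        problems.append("performance")
    if metrics.sentiment_score < thresholds.min_sentiment_score:
        problems.append("customer sentiment")
    return problems


def can_onboard_next_cohort(metrics: Metrics, thresholds: Thresholds) -> bool:
    """Only continue the rollout when nothing is in breach."""
    return not breaches(metrics, thresholds)
```

Returning the list of breached metrics, rather than just a boolean, gives non-technical stakeholders a readable reason for any pause.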

The new score launch roughly went:

  1. Customers who opted in to help us test (~20 customers)
  2. Small cohorts (each ~300k customers)
  3. Regular cohorts (each 1m+ customers)
  4. All new customers get the new score
  5. Large cohorts (each 4m+ customers)
  6. 100% of customers 🚀

We were quite aggressive, so a graph of the customers onboarded would look a bit like this:

A graph going upwards in a mostly linear fashion

Technical challenges

With larger projects it's key to start off by ensuring that the overall architecture is heading in the right direction. Often requirements change, so having a system which is able to accommodate this without breaking the original vision is important.

Using a phased rollout does mean that two systems need to be maintained for a longer period of time - it's worth keeping that extra test effort in mind when developing anything in parallel. As you get closer to 100% of traffic on the new system make sure you have strong regression test suites checking the scenarios that are not actively being worked on.

There's also a balance to strike in the time spent preparing to launch. For example, despite load testing, some elements of our new system did initially struggle when millions of customers interested in seeing their new score started to hit it, and small parts of inefficient code had a much larger impact than we anticipated.

We tried to avoid premature optimisation, but there were quite a few improvements we could more easily identify under real-life load. Thankfully our architecture scales horizontally, so we were able to instantly relieve pressure from load during this period by scaling while we optimised the problem areas.

Summary / Final thoughts

The new score was a huge project and it took an immense amount of effort and passion from many people across Experian to deliver it as quickly as we did.

I can't take credit for the amazing work done by everyone else; if you are one of those people - thank you - I genuinely appreciate it.

Thanks for reading,

Jon 😄
