Global automotive company uses Hazy to boost data provisioning.

The credit side of an automotive company is a major profit center. Small incremental improvements to the systems and decisioning processes behind it can have a big impact on revenue.

Challenge

Improving the systems that drive the credit decisioning process, and the algorithms that make those decisions, relies on developers and data scientists using highly sensitive data. This data is often aggregated with third-party data that carries its own contractual usage limitations. Together, these factors make it hard to drive rapid improvements and increase revenue.

Sensitive internal and third-party data

Data about a person's creditworthiness is highly sensitive and could cause significant damage to the business if leaked. Contractual restrictions mean that third-party data cannot be used to test and evaluate new processes.

Masked data loses fidelity

Heavily masked data is currently used by the testing team to improve software processes, but this data has no statistical fidelity and cannot be used by data scientists to improve credit decision algorithms.

Long data provisioning process

Due to the sensitivity of the data, a manual data masking process must be added to the data provisioning workflow, significantly lengthening the time it takes.

Solution

Hazy’s synthetic credit risk data set preserved the same risk profiles and outcomes as the real data. This allowed thorough, realistic testing of new data pipelines, workflows and processes. Making the whole credit risk rating system more efficient allows quicker credit risk checks, which makes the customer experience more seamless.

Critically, because the synthetic data maintains statistical fidelity, the team can now explore and build new credit decisioning models without accessing any third-party data. Once built and tested, these models can be run against real data in production.

Technical skill levels vary across the development and testing community. Some members of the team are comfortable interacting with databases and command lines directly; others prefer to use Excel and graphical interfaces to interact with the data.

To integrate synthetic data into production, the team built a Confluence front-end interface on top of the synthetic data set so that every type of user could easily access, subset and export the exact data they needed. This was a core part of the project: because the synthetic data was pre-certified, the interface allowed very quick data provisioning.

Results

Delivering a synthetic data solution for the credit risk rating team unlocked rapid testing and development. Running decisioning models against both synthetic and real data and getting the same results gave the team confidence that they can work with synthetic data in the future.
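As a rough illustration of what comparing model results on real and synthetic data can look like, the Python sketch below applies one decision rule to two small samples and checks that the approval rates agree within a tolerance. Everything in it is an assumption made for illustration: the `approve` rule, the field names, the tolerance and the toy records are hypothetical and are not taken from the project or from Hazy's product.

```python
from typing import Callable, Dict, List

Record = Dict[str, float]


def approve(record: Record) -> bool:
    """Hypothetical decision rule: approve high scores with low debt-to-income."""
    return record["credit_score"] >= 650 and record["debt_to_income"] <= 0.4


def approval_rate(records: List[Record], model: Callable[[Record], bool]) -> float:
    """Return the share of records the model approves."""
    if not records:
        raise ValueError("records must not be empty")
    return sum(model(r) for r in records) / len(records)


def rates_match(
    real: List[Record],
    synthetic: List[Record],
    model: Callable[[Record], bool],
    tolerance: float = 0.02,
) -> bool:
    """Check that approval rates on both data sets differ by at most `tolerance`."""
    gap = abs(approval_rate(real, model) - approval_rate(synthetic, model))
    return gap <= tolerance


# Toy, made-up records used only to exercise the functions above.
real_sample = [
    {"credit_score": 720, "debt_to_income": 0.25},
    {"credit_score": 610, "debt_to_income": 0.30},
    {"credit_score": 680, "debt_to_income": 0.55},
    {"credit_score": 700, "debt_to_income": 0.35},
]
synthetic_sample = [
    {"credit_score": 735, "debt_to_income": 0.20},
    {"credit_score": 590, "debt_to_income": 0.33},
    {"credit_score": 690, "debt_to_income": 0.60},
    {"credit_score": 665, "debt_to_income": 0.38},
]

if __name__ == "__main__":
    print("real approval rate:", approval_rate(real_sample, approve))
    print("synthetic approval rate:", approval_rate(synthetic_sample, approve))
    print("rates match:", rates_match(real_sample, synthetic_sample, approve))
```

A real comparison would likely look at more than a single aggregate rate, for example outcomes per risk band, but the same pattern applies: run the same model over both data sets and compare the outputs.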

=

Exactly the same results as real data

40×

Faster data provisioning

0

Requirement for long masking process
