NatWest Group - Leveraging Comet for ML Platform Standardization and Collaboration
NatWest Group is a leading bank in the UK, helping over 19 million customers achieve their goals. Over the past few years, NatWest has been transforming its business by adopting new technology and developing many new products and solutions driven by machine learning (ML). As the business started to scale its use of ML, the data science community recognized the need to streamline their ML operations processes and enhance collaboration among their teams. The goal was to accelerate how NatWest takes models from the data science world of experimentation into production using software engineering best practices. To standardize their ML platform and improve visibility into model management, NatWest adopted Comet as their experiment tracking and model management tool. This case study explores the challenges with their previous tech stack, the benefits of implementing Comet, and their vision for the future.
NatWest uses ML extensively across the organization to improve the customer experience. They have several models in production that are responsible for:
- Keeping Customers Safe: Fraud control, financial crime detection, and ensuring customer safety were some of the first areas where they invested heavily in ML.
- Keeping Customers Engaged: Understanding customer behavior, estimating customer lifetime value, and targeting marketing were crucial for customer engagement, and investing in ML in this area proved to be good for the business.
- Improving Customer Interactions: NatWest Group aimed to enhance the customer experience using chatbots and efficient call centers that used the latest NLP techniques to understand customer needs.
NatWest Group has a substantial ML team consisting of over 180 data scientists and ML engineers, and over 400 data engineers. These teams operate in a decentralized manner, spread across different locations. Currently, Comet serves as the central system of record for these teams and practitioners. Giving them full visibility into the model development process enables seamless collaboration across roles and responsibilities.
ML Frameworks and Model Deployment
NatWest’s use cases span both classical machine learning and deep learning. Practitioners within NatWest use a wide variety of frameworks and tools, such as scikit-learn, XGBoost, PyTorch, TensorFlow, Hugging Face, and SageMaker. ML best practices change constantly, and it is imperative that MLOps tools remain flexible enough to support the varied stacks and workflows that practitioners prefer.
For model inference, NatWest predominantly runs batch inference. However, fraud and financial crime detection models are starting to move towards an event-driven architecture for real-time inference. This shift aims to enable proactive identification and prevention of fraudulent activities, ultimately enhancing customer safety.
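The difference between the two inference styles can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Transaction` fields, the toy `fraud_score` function, and the alert threshold are all hypothetical and stand in for a real trained model and a real event stream; nothing here reflects NatWest's actual systems.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


@dataclass(frozen=True)
class Transaction:
    # Hypothetical fields, chosen only for illustration.
    tx_id: str
    amount: float
    country_mismatch: bool


def fraud_score(tx: Transaction) -> float:
    """Toy stand-in for a trained model: returns a score in [0, 1]."""
    score = min(tx.amount / 10_000.0, 1.0) * 0.7
    if tx.country_mismatch:
        score += 0.3
    return round(score, 3)


def score_batch(transactions: list[Transaction]) -> dict[str, float]:
    """Batch inference: score an accumulated set of records in one job."""
    return {tx.tx_id: fraud_score(tx) for tx in transactions}


def score_events(
    stream: Iterable[Transaction],
    on_alert: Callable[[Transaction, float], None],
    threshold: float = 0.8,
) -> Iterator[tuple[str, float]]:
    """Event-driven inference: score each record as it arrives and
    call on_alert immediately when the score crosses the threshold."""
    for tx in stream:
        score = fraud_score(tx)
        if score >= threshold:
            on_alert(tx, score)
        yield tx.tx_id, score
```

Both paths call the same scoring function. What changes is when the score becomes available: after the whole batch has been collected, or per transaction, early enough to act before the payment completes.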
Challenges with the Previous Tech Stack
NatWest encountered several challenges with its ML workflows across people, processes, data, and technology.
People: NatWest rapidly expanded its team of data scientists and ML practitioners, requiring them to be trained to use a consistent model development methodology and technology stack.
Process: Existing processes for the development of models and solutions were manual and unsuited for an agile way of working that aimed to accelerate time to value.
Data: Before the enterprise data lake was created, data was spread across multiple systems and databases, making access difficult and time-consuming for data scientists and engineers.
Technology: Their tech stack was heterogeneous, leading to inconsistencies and complexity. Experiment tracking was fragmented, ranging from Excel spreadsheets to individual instances of MLflow. Model monitoring was also bespoke for each use case, relying on tools like Tableau for visualizations.
The above challenges can be summarized as follows:
- Managing a large and complex data science pipeline. The team needs to track and manage the data, code, and models used in their machine learning projects.
- Ensuring reproducibility of their machine learning experiments. The team needs to be able to reproduce their results so that they can be confident in their findings.
- Communicating results of their machine learning projects to stakeholders and business leaders for review and auditing. The team needs to be able to communicate their findings clearly and concisely so that stakeholders can understand the impact of their work.
- Standardizing tools and practices. The lack of standardization resulted in varying quality, difficulties with auditing and governance, and limited reproducibility, all of which reduced the overall efficiency and effectiveness of ML initiatives.
Vision for a Hybrid Platform and Tooling Philosophy
By integrating SageMaker with third-party tools like Comet, NatWest created a comprehensive and robust ML infrastructure. This allowed them to adopt a best-of-breed approach, selecting the top-performing tools that align well with their ML processes. Their systems were also designed with interchangeability in mind, so that new tools can be integrated easily and the latest advancements in the ML field can be adopted. By promoting good software development practices like modular and maintainable code, version control, testing, documentation, and reporting, they ensured platform independence, efficient collaboration, and the ability to handle large-scale data. This made their ML stack scalable and adaptable to their evolving business needs and the ever-changing ML landscape.
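One common way to design for interchangeability is to have pipeline code depend on a small interface rather than on a specific vendor SDK. The sketch below assumes a hypothetical `ExperimentTracker` interface and a toy in-memory backend; the names are illustrative and are not taken from NatWest's platform or from Comet's API.

```python
from typing import Protocol


class ExperimentTracker(Protocol):
    """Hypothetical minimal interface that training code depends on."""

    def log_params(self, params: dict[str, object]) -> None: ...

    def log_metric(self, name: str, value: float) -> None: ...


class InMemoryTracker:
    """Toy backend, useful for tests or local runs."""

    def __init__(self) -> None:
        self.params: dict[str, object] = {}
        self.metrics: dict[str, list[float]] = {}

    def log_params(self, params: dict[str, object]) -> None:
        self.params.update(params)

    def log_metric(self, name: str, value: float) -> None:
        self.metrics.setdefault(name, []).append(value)


def train(tracker: ExperimentTracker, learning_rate: float, epochs: int) -> float:
    """Placeholder training loop: the loss simply decays each epoch.
    It only knows about the interface, not about any concrete tool."""
    tracker.log_params({"learning_rate": learning_rate, "epochs": epochs})
    loss = 1.0
    for _ in range(epochs):
        loss *= 1 - learning_rate
        tracker.log_metric("loss", loss)
    return loss
```

Swapping the tracking tool then means writing one new class that satisfies the interface, while `train` and the rest of the pipeline stay unchanged.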
Choosing Comet Over In-House and Open-Source Solutions
NatWest initially experimented with MLflow, but scalability and enterprise security concerns led them to seek a dedicated ML platform. Ultimately, they chose Comet due to its advanced functionality, enterprise-grade scalability, ease of use, and compatibility with their existing MLflow implementation. Because Comet’s APIs were similar to those the teams already knew, the transition was smooth, and the NatWest ML teams could quickly take advantage of advanced features and enterprise-grade capabilities.
The Adoption of Comet
Comet provided several features that helped NatWest’s data science team, including:
- Experiment Tracking: Comet tracks the data, code, and models used in machine learning experiments. This information can be used to reproduce experiments and to identify the best-performing models.
- Model Management: Comet helps data scientists manage their machine learning models. This includes tracking the status of models, monitoring their performance, and deploying them to production.
- Reporting: Comet provides several reports that help the team at NatWest communicate the results of their machine learning projects to stakeholders. These reports can be customized to meet the specific needs of the audience.
- Centralized Model Registry: NatWest Group aims to expand the usage of Comet’s Model Registry, making it the central source of truth for ML assets within the organization. This strategic move enhances visibility, fosters collaboration, and enables effective team communication. The Model Registry feature becomes instrumental in providing a comprehensive overview of the ML landscape, empowering stakeholders to make informed decisions.
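To make the tracking and registry ideas above concrete, here is a toy sketch using only the Python standard library. It is deliberately not Comet's SDK: `Run`, `ModelRegistry`, `fingerprint`, and `best_run` are hypothetical names invented for this illustration. It shows the kind of information a tracker records for each run (parameters, a hash of the data, metrics) and how the best run might be promoted to a versioned registry entry.

```python
import hashlib
import json
from dataclasses import dataclass, field


def fingerprint(obj) -> str:
    """Stable short hash of JSON-serializable content (data sample, params)."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


@dataclass
class Run:
    name: str
    params: dict
    data_hash: str
    metrics: dict = field(default_factory=dict)


@dataclass
class ModelRegistry:
    """Toy registry: model name -> ordered list of registered runs."""

    versions: dict = field(default_factory=dict)

    def register(self, model_name: str, run: Run) -> int:
        entries = self.versions.setdefault(model_name, [])
        entries.append(run)
        return len(entries)  # 1-based version number


def best_run(runs: list, metric: str, higher_is_better: bool = True) -> Run:
    """Pick the run with the best value for the given metric."""
    scored = [r for r in runs if metric in r.metrics]
    pick = max if higher_is_better else min
    return pick(scored, key=lambda r: r.metrics[metric])
```

Recording a data hash next to the parameters is what makes a run reproducible in practice: two runs with the same parameters and the same `data_hash` should yield comparable results, and a mismatch points to changed inputs.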
Extending Comet Usage to Large Language Model (LLM) Experimentation and Prompts
Recognizing the diverse applications of machine learning within their organization, NatWest has expanded their use of Comet beyond traditional machine learning models. One notable example is the use of Comet to track Large Language Model (LLM) experimentation and prompts. As the bank continues to invest in enhancing customer interactions, they are rapidly experimenting with LLMs to improve customer conversations and to automatically summarize and derive insights from these interactions.
Standardizing the tracking and management of LLM experimentation using Comet contributes to overall operational efficiency, leading to faster iterations and a more agile development process, driving increased innovation. As they continue to explore the capabilities of Comet, they are poised to achieve further advancements in LLM development, customer engagement, and business outcomes.
Comet’s Relevance Today and Future Vision
NatWest sees Comet as a pivotal component of their ML platform both now and in the future. They aim to continue building upon their standardization efforts, leveraging Comet’s capabilities for seamless collaboration and enhanced visibility. The Model Registry will become the go-to resource for ML assets, ensuring a unified view of the bank’s ML initiatives. Furthermore, NatWest plans to utilize Comet’s reporting feature to improve cross-team communications and foster a data-driven culture within the organization.
By embracing Comet, NatWest Group has achieved significant advancements in standardizing their ML platform, improving visibility, and fostering collaboration across teams.