Fraud Prevention using Machine Learning – Transaction Risk Monitoring
Qorus Reinvention Awards – APAC 2024-2025 – Winner

Country: India
Category: Core Offering Innovation
Keywords: AI & Generative AI, Prevention, Cybersecurity & Authentication, Automated Inspection, Risk management
Innovation presentation
Problem Statement: Conventional fraud prevention systems rely heavily on rules. Rules curated by subject matter experts tend to be reactive and time-consuming, and often miss the complex, multi-dimensional relationships that indicate fraud risk. With close to 6 million credit and debit card transactions processed per day, there is a clear need for a proactive rather than reactive approach.
High Level Solution: “Transaction Risk - ML Fraud Engine” is a state-of-the-art machine learning-powered fraud scoring engine that detects card transaction fraud by scoring each transaction in real time. Using historical data, the system analyses cardholder data, transaction patterns and channel attributes, and generates real-time "fraud likelihood scores". These scores can then be used in conjunction with existing systems to improve the fraud strike rate and reduce false positives, thereby mitigating risk and enhancing customer experience. The solution is applicable to POS, ATM and CNP (card-not-present) transactions for both credit and debit cards, and integrates with the existing fraud risk engine to enhance fraud detection.
Role of Host application: The host system supplies the cardholder data, viz. demographic details of the debit and credit card customers. Additionally, if a rule qualifies a condition whose outcome is to block the card in the host system, the Transaction Risk engine invokes a card block API based on the score received from the Transaction Risk ML engine.
Objective / Key Performance Indicators:
1. Increase the first-fraud identification strike rate
2. Increase the overall fraud identification strike rate
3. Reduce false positives
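The case study does not spell out how these KPIs are computed, but the strike rate and false-positive rate can be illustrated with a minimal sketch (the function names and the exact definitions below are assumptions, not the team's actual formulas):

```python
# Hypothetical KPI calculations: strike rate and false-positive rate,
# both expressed as fractions of the alerts raised by the engine.

def strike_rate(true_frauds_alerted: int, total_alerts: int) -> float:
    """Fraction of raised alerts that turn out to be genuine fraud."""
    return true_frauds_alerted / total_alerts if total_alerts else 0.0

def false_positive_rate(false_alerts: int, total_alerts: int) -> float:
    """Fraction of alerts raised on legitimate transactions."""
    return false_alerts / total_alerts if total_alerts else 0.0

# Example: 1,000 alerts raised, of which 150 are confirmed fraud
print(strike_rate(150, 1000))          # 0.15
print(false_positive_rate(850, 1000))  # 0.85
```

Under these definitions, improving the strike rate and reducing false positives are two views of the same ratio: every false positive removed from the alert queue raises the strike rate.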
The architecture is designed for high performance and scalability, handling 1,000 transactions per second (TPS) across both real-time (RT) and near-real-time (NRT) API requests with no performance degradation. It guarantees rapid fraud probability scores, delivering RT scores within 15 milliseconds and NRT scores within 40 milliseconds. The application is tightly integrated with the other critical applications in the transaction processing flow – the switch (Base 24), the fraud risk engine (PRM) and the host systems of credit and debit cards. Even with the introduction of a new hop (“Transaction Risk - ML Fraud Engine”), the architecture has been designed to keep the total response time intact. The system is expected to outperform the PRM model while maintaining accuracy in fraud detection and false positive rates. It features dynamic scalability for fluctuating transaction volumes and includes mechanisms for timely data ingestion and model retraining. Additionally, the architecture supports weekly model updates without service interruption, utilizing an in-memory Redis cluster for instant data access. The system is available 24x7 except during scheduled monthly OS patching.
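The RT and NRT latency budgets described above can be made concrete with a small sketch. The wrapper below is illustrative only (`score_with_budget` and the placeholder scoring function are assumptions, not the engine's actual API); it shows how a per-call check against the 15 ms / 40 ms budgets might look:

```python
import time

RT_BUDGET_MS = 15.0   # real-time SLA from the architecture spec
NRT_BUDGET_MS = 40.0  # near-real-time SLA

def score_with_budget(score_fn, txn, budget_ms):
    """Invoke a scoring function and report whether the latency budget
    was met. score_fn stands in for the model-inference call."""
    start = time.perf_counter()
    score = score_fn(txn)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return score, elapsed_ms, elapsed_ms <= budget_ms

# Usage with a trivial stand-in model
score, elapsed_ms, within_sla = score_with_budget(
    lambda t: 42.0, {"amount": 100.0}, RT_BUDGET_MS
)
```

In production such a check would feed the monitoring pipeline rather than gate the response, since a late score still has value for NRT rules.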
A key innovation in solving this problem is the balanced approach of serving inference from machine learning models in real time while separately managing and processing the data needed for model inference and feature recomputation. This is achieved by logically splitting the architecture into two blocks. The entire solution is an end-to-end on-premises implementation; details are provided below for reference.
Architecture components: The Architecture has 2 components, Block A and Block B.
Block A focuses on deploying the real-time (RT) and near-real-time (NRT) APIs, weekly model updates and aggregate updates, utilizing a hardware load balancer for efficient traffic distribution and redundancy. It includes services for processing transactions across the different channels (ATM, POS, CNP), supported by a Redis cluster for in-memory data storage and quick access.
Block B is centered around batch processing for data ingestion, model re-training, building and validation, and logging management, employing Hadoop for handling large datasets. It utilizes Apache Kafka for data streams, Spark and Python for model retraining and validation, and a Hive database for logging and reporting, ensuring robust data processing and storage capabilities.
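The split between the two blocks can be sketched in a few lines. This is a deliberately simplified, in-process illustration: a plain dict stands in for the Redis cluster, the toy scoring rule is invented for the example, and the batch function stands in for the Spark retraining and aggregation jobs:

```python
# Minimal sketch of the Block A / Block B split.
# Block A: online path reads precomputed features and scores inline.
# Block B: offline batch rebuilds aggregates and refreshes the store.

online_store = {}  # stands in for the Redis cluster (Block A storage)

def serve_score(card_id: str, txn_amount: float) -> float:
    """Block A: real-time scoring using precomputed card aggregates."""
    feats = online_store.get(card_id, {"avg_spend_30d": 0.0})
    # Toy rule: flag spends that deviate from the 30-day average
    deviation = txn_amount / (feats["avg_spend_30d"] + 1.0)
    return min(deviation * 10.0, 100.0)

def batch_recompute(history: dict) -> None:
    """Block B: batch job that rebuilds aggregates from history."""
    for card_id, amounts in history.items():
        online_store[card_id] = {"avg_spend_30d": sum(amounts) / len(amounts)}
```

The point of the split is that the online path (Block A) never touches the heavy batch machinery: it only reads what Block B has already written into memory.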
The Key Features and Innovations of the solution are mentioned below:
Machine Learning centric: Utilizes custom state-of-the-art machine learning algorithms to reliably capture fraud patterns and anomalies efficiently.
Real-time fraud scoring: Processes transactions in real-time, generating scores to identify fraud within milliseconds.
Scalable, Reliable and Secure Architecture: Designed to handle large volumes of data securely, ensuring reliability and compliance with regulatory standards.
Balancing Accuracy and Latency:
In our fraud detection system, we leverage over 150 features to accurately assess transactions across various channels. Around 60% are precomputed in the previous day's batch runs (e.g. a customer-to-merchant affinity score, or a customer's average spend over the last 3, 7, 30 and 90 days by merchant, category, time zone, geography, etc.), while the remaining 40% are computed in flight, based on the delta changes introduced by that particular transaction.

These features fall into three categories: transactional features, card aggregates and merchant aggregates. Transactional features encapsulate the details of each individual transaction and occupy approximately 1 kilobyte (KB). Card aggregates, which summarize historical data for a specific card, consume around 5 KB, while merchant aggregates, providing insights into merchant behaviour, occupy roughly 1 KB. Cumulatively, the feature vector for a single transaction amounts to approximately 7 KB.

To ensure optimal processing speed, we store these feature vectors in memory using Redis, an in-memory database, mitigating latency concerns during transaction processing. To maintain accuracy, we developed a separate API that dynamically updates the card and merchant aggregates based on the current transaction's features and the historical data stored in Redis. By calculating delta changes in the aggregates after each transaction, we keep the system finely tuned and adaptive. This proactive approach enhances the accuracy of our fraud detection while maintaining efficient processing speeds.

The application is designed to handle 1,000 TPS; the current observed peak is 270 TPS, and the application has been load-tested end-to-end at 4x the current TPS. To handle peak transaction volumes we leverage Kubernetes, with the APIs deployed as Kubernetes pods.
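The in-flight delta update of card aggregates can be sketched as follows. A plain dict stands in for the Redis hash used in production, and the running-average update is one plausible form of the "delta changes after each transaction" described above (the field names are illustrative):

```python
# Sketch of an in-flight delta update for card aggregates.
# feature_store stands in for the production Redis cluster.

feature_store = {}  # card_id -> {"txn_count": int, "avg_spend": float}

def update_card_aggregate(card_id: str, amount: float) -> dict:
    """Incrementally update a card's spend aggregate after a transaction,
    avoiding a full recompute over the card's history."""
    agg = feature_store.setdefault(card_id, {"txn_count": 0, "avg_spend": 0.0})
    n = agg["txn_count"] + 1
    # Incremental mean: new_avg = old_avg + (x - old_avg) / n
    agg["avg_spend"] += (amount - agg["avg_spend"]) / n
    agg["txn_count"] = n
    return agg
```

Because the update touches only the stored aggregate and the new transaction, it runs in constant time regardless of a card's history length, which is what keeps the in-flight 40% of features within the latency budget.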
Control logic is in place to automatically scale the pods up and down to match the incoming TPS without diminishing model performance. Application monitoring is in place, and the model has performed consistently well on both accuracy and latency, meeting the SLA: approximately 99.8% of transactions meet the SLA for real-time response within 15 ms.
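The 99.8% figure is a compliance ratio over observed latencies. A minimal sketch of how such a metric might be computed from monitoring samples (the function name is an assumption; production monitoring would aggregate this continuously rather than over a list):

```python
def sla_compliance(latencies_ms: list, budget_ms: float = 15.0) -> float:
    """Fraction of requests answered within the real-time latency budget."""
    if not latencies_ms:
        return 0.0
    within = sum(1 for t in latencies_ms if t <= budget_ms)
    return within / len(latencies_ms)

# Example: three of four sampled requests met the 15 ms budget
ratio = sla_compliance([10.0, 12.0, 20.0, 14.0])  # 0.75
```

Tracking this ratio alongside pod counts is what lets the autoscaling logic be validated: scale-ups should hold the ratio steady as TPS rises.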
Implementation Approach: To mitigate potential risks during implementation, we adopted a phased approach to deploying ML models into production. Initially, the focus was on deploying an ML model for one specific channel, namely ATM transactions, and on enabling transaction scoring in Near Real-Time (NRT) rather than immediately implementing Real-Time (RT) scoring. This allowed us to ensure system stability and smooth operation for one channel, minimizing the risk of timeouts or disruptions, before extending the deployment to the other channels under the same phased strategy.

Once the system demonstrated stability and satisfactory performance in the NRT environment, we proceeded to evaluate model performance for RT scoring. To effectively utilize the scoring system for identifying and declining fraudulent transactions, we initially implemented rules within the NRT framework and monitored them closely for one month to assess the impact of the ML model scores. Following a thorough evaluation of the model's performance in the NRT environment, we enabled RT rules utilizing the model scores. A separate rule was created in PRM that uses only the ML score to decline transactions in RT: a score threshold of >80 was used in Jan'24, and after assessing the strike rates, an optimum threshold of >70 has been used since Mar'24. The existing PRM rules also incorporate the ML scores to enhance their decisions and reduce false positives.