Whitepaper · Banking & Finance

Real-time Decisioning
in Regulated Banking

How tier-1 banks are replacing batch campaigns with sub-100ms AI decisions across digital channels — while meeting RBI, SAMA and MAS compliance requirements.

Pages: 18
Published: June 2025
Industry: Banking & Finance

The Case for Real-time Decisioning

Batch-mode marketing — the practice of segmenting customers overnight and pushing campaigns at fixed intervals — is failing in modern banking. Customers now interact with their bank dozens of times per day through mobile, web, ATM and branch channels. The window to present a relevant offer, warning or service message can close in seconds.

This whitepaper examines how leading tier-1 banks in Asia Pacific, the Middle East and Europe are deploying AI-driven decisioning engines that evaluate customer context in real time — below 100 milliseconds — and select the next best action across every touchpoint simultaneously.

3.8× conversion uplift vs. batch campaigns
62% reduction in irrelevant notifications
<100ms decision latency at scale

"The banks that win the next decade will not be those with the most products — they will be those that know exactly which product to offer, to whom, at the precise moment the customer is ready to act."

— Head of Digital Banking, Tier-1 Asian Bank

Why Batch Fails in the Moment-of-Intent Economy

Traditional CRM and campaign management tools were built for a world where customer interactions happened once or twice a week — a branch visit, a monthly statement. Today, the average retail banking customer generates over 40 digital events per day: logins, balance checks, card transactions, transfer attempts and support queries.

The Batch Paradox

When a bank segments customers at 2am and launches a home loan campaign at 9am, it is operating on 7-hour-old data. A customer who repaid a significant balance at 8:30am and is therefore primed for a refinancing offer will receive the generic segment message — or worse, be excluded because the overnight model didn't catch the signal.

Key Finding

Banks running real-time decisioning see an average 62% reduction in opt-out rates within 90 days of deployment, primarily because improved relevance reduces notification fatigue.

Regulatory Pressure Compounds the Problem

Regulators in India (RBI), Saudi Arabia (SAMA), Singapore (MAS) and the EU (GDPR/PSD2) are increasingly scrutinising mass outreach practices. Personalised, consent-driven, contextually relevant communication is no longer a best practice — it is an emerging regulatory expectation.

Regulator | Key Requirement | Real-time Decisioning Impact
RBI (India) | Consent-based outreach; opt-out honoured within 24 hrs | Instant suppression at decision layer
SAMA (Saudi Arabia) | Personalisation guardrails for vulnerable customers | Real-time risk-segment gating
MAS (Singapore) | Fair dealing; no misleading offers | Eligibility checks at decision time
GDPR (EU) | Lawful basis per communication type | Consent checked per event, not at batch load

Architecture of a Real-time Decisioning Engine

A production-grade decisioning engine for banking has four core layers: event ingestion, customer context assembly, model inference and action selection. Each layer must operate within strict latency budgets to meet the sub-100ms SLA required for inline channel injection.

Layer 1 — Event Stream Ingestion

Every customer touchpoint — mobile SDK event, card authorisation, IVR interaction, web session — is published to a high-throughput event stream (Kafka or Kinesis). The ingestion layer normalises event schemas and attaches customer identifiers within 5–10ms.
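A minimal Python sketch of the ingestion layer's normalisation step. The field names and the common schema are illustrative assumptions, and the publish function is a stand-in for a real Kafka or Kinesis producer call:

```python
import json
import time

def normalise_event(raw: dict) -> dict:
    """Map a channel-specific raw event onto a common schema and attach
    the customer identifier (field names here are illustrative)."""
    return {
        "customer_id": raw.get("cust_id") or raw.get("cif_number"),
        "event_type": raw["type"],
        "channel": raw.get("channel", "unknown"),
        "ts": raw.get("ts", time.time()),
    }

def publish(topic: str, event: dict) -> None:
    # Stand-in for a real producer, e.g. sending json.dumps(event).encode()
    # to a Kafka topic; here we just print the serialised event.
    print(topic, json.dumps(event, default=str))

evt = normalise_event({"cust_id": "C123", "type": "balance_check", "channel": "mobile"})
publish("customer-events", evt)
```

In production, identifier resolution (mapping device IDs and session tokens to a customer ID) is usually the step that consumes most of the 5–10ms budget.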

Layer 2 — Context Assembly

The context service retrieves the customer's current feature vector from a low-latency feature store (Redis, DynamoDB or Aerospike). This vector includes recency scores, product holdings, risk tier, consent flags and behavioural propensity scores — pre-computed by batch and micro-batch jobs and updated incrementally on each event.
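A sketch of the context-assembly hot path, with an in-memory dict standing in for the low-latency feature store; the feature names, customer key and the incremental recency rule are illustrative assumptions:

```python
# In-memory stand-in for the feature store (Redis, DynamoDB or Aerospike
# in production); keys, feature names and values are illustrative.
FEATURE_STORE = {
    "C123": {"risk_tier": "low", "recency_score": 0.2,
             "holds_home_loan": False, "consent_marketing": True},
}

def assemble_context(customer_id: str, event: dict) -> dict:
    """Fetch the customer's current feature vector and fold the
    triggering event into it incrementally."""
    features = dict(FEATURE_STORE.get(customer_id, {}))
    if event.get("event_type") == "login":
        # Cheap incremental update on the hot path; heavier features
        # (propensity scores, product holdings) arrive via micro-batch.
        features["recency_score"] = min(1.0, features.get("recency_score", 0.0) + 0.1)
    return features
```

The split matters for latency: only trivially cheap updates happen synchronously, while expensive aggregations are pre-computed and merely read here.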

Layer 3 — Model Inference

A served machine learning model — typically a gradient boosted tree or neural network — evaluates the assembled context against each candidate action. ONNX-format models served via TensorFlow Serving or Triton Inference Server achieve single-digit millisecond inference latencies at thousands of requests per second.
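To make the scoring step concrete, here is a toy scorer standing in for the served model (in production this would be an ONNX model behind TensorFlow Serving or Triton). The hand-set weight vector, feature names and candidate actions are all illustrative assumptions:

```python
import math

# Hand-set weights standing in for a trained model; names are illustrative.
WEIGHTS = {"recency_score": 1.5, "repaid_large_balance": 2.0, "holds_home_loan": -3.0}

def score(features: dict, action: dict) -> float:
    """Logistic score of one (customer context, candidate action) pair."""
    merged = {**features, **action.get("features", {})}
    z = sum(WEIGHTS.get(k, 0.0) * float(v)
            for k, v in merged.items() if isinstance(v, (int, float, bool)))
    return 1.0 / (1.0 + math.exp(-z))

def rank_actions(features: dict, candidates: list) -> list:
    """Evaluate every candidate action and return them best-first."""
    return sorted(candidates, key=lambda a: score(features, a), reverse=True)
```

The key architectural point survives the simplification: the model is called once per candidate action, so per-call inference latency must stay in single-digit milliseconds for the layer to fit its budget.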

Layer 4 — Action Selection and Delivery

The top-ranked action is selected subject to business rules: frequency caps, eligibility guards, channel availability and regulatory suppressions. The decision is written to a decision log, and the selected action is dispatched to the channel layer — push, in-app, email, SMS or in-branch — within the latency budget.
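The guard chain above can be sketched as a single selection function; the guard names, customer fields and cap structure are illustrative assumptions:

```python
def select_action(ranked: list, customer: dict, sent_counts: dict, caps: dict):
    """Pick the highest-ranked action that clears every business-rule guard;
    field names and the guard set are illustrative."""
    if customer.get("regulatory_suppressed"):
        return None  # regulatory suppression beats every candidate
    for action in ranked:
        if not action["eligible"](customer):
            continue  # eligibility guard
        if sent_counts.get(action["id"], 0) >= caps.get(action["id"], 1):
            continue  # frequency cap (default: once)
        if action["channel"] not in customer.get("available_channels", []):
            continue  # channel availability
        return action
    return None  # a no-op decision is a valid outcome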

Architecture Note

Appice's decisioning engine runs context assembly + inference + action selection in under 30ms at p99, leaving 70ms headroom for channel dispatch. The system handles 50,000+ decisions per second per node with horizontal auto-scaling.

Implementing Compliance Guardrails at Decision Time

Regulated banking environments require that compliance constraints are applied at the moment of decision, not as a post-processing filter. Post-hoc filtering creates race conditions: a suppressed customer may receive a message that was committed to the channel queue before the suppression event was processed.

Consent and Suppression Architecture

Best-practice architectures maintain a consent service that is queried synchronously as part of the action selection layer. The consent service holds per-customer, per-channel, per-topic consent states and applies regulatory time-window rules (e.g., no marketing communications between 9pm and 8am local time, per the relevant RBI circular).
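A minimal sketch of such a synchronous consent check, with a dict standing in for the consent service and quiet hours matching the RBI example; the key structure and default-deny behaviour are illustrative assumptions:

```python
from datetime import time as dtime

def consent_allows(consents: dict, customer_id: str, channel: str, topic: str,
                   local_time: dtime,
                   quiet_start: dtime = dtime(21, 0),
                   quiet_end: dtime = dtime(8, 0)) -> bool:
    """Per-customer/channel/topic consent flag plus a quiet-hours window
    (21:00-08:00 here); the dict stands in for the consent service."""
    if not consents.get((customer_id, channel, topic), False):
        return False  # default-deny: no recorded consent means no send
    in_quiet_hours = local_time >= quiet_start or local_time < quiet_end
    return not in_quiet_hours
```

Because the check runs inside action selection rather than after it, a suppression that landed milliseconds earlier is honoured on the very next decision, avoiding the race condition described above.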

Explainability Requirements

MAS and SAMA are beginning to require that AI-driven personalisation decisions are explainable — particularly for credit and insurance product offers. SHAP (SHapley Additive exPlanations) values computed at inference time provide per-decision attribution that can be logged, audited and, if required, presented to the customer.
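As a sketch of per-decision attribution logging, the snippet below uses simple weight-times-value contributions in place of true SHAP values (a real deployment would compute SHAP at inference time, e.g. with a tree explainer); the record layout and names are illustrative assumptions:

```python
def linear_attribution(features: dict, weights: dict) -> dict:
    """Per-feature contribution (weight x value), sorted by magnitude, as a
    simple linear stand-in for SHAP values."""
    contrib = {k: weights.get(k, 0.0) * float(v)
               for k, v in features.items() if isinstance(v, (int, float, bool))}
    return dict(sorted(contrib.items(), key=lambda kv: abs(kv[1]), reverse=True))

def log_decision(audit_log: list, customer_id: str, action_id: str,
                 features: dict, weights: dict) -> None:
    """Append one auditable decision record with its attribution."""
    audit_log.append({"customer_id": customer_id, "action": action_id,
                      "attribution": linear_attribution(features, weights)})
```

Logging attribution at decision time, rather than recomputing it on demand, is what makes the audit trail reproducible even after the model is retrained.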

100% of decisions logged with explainability metadata
<5ms consent service lookup latency
4-year decision audit trail retention

From Batch to Real-time: A Phased Approach

For most banks, the transition from batch to real-time decisioning is a 12–18 month programme, not an overnight switch. The recommended phased approach preserves existing channel investments while progressively introducing real-time capabilities.

Phase 1 (0–3 months): Deploy event stream ingestion alongside existing batch CRM. Begin building the feature store. Real-time suppression and consent management go live first — immediate compliance benefit with low risk.

Phase 2 (3–9 months): Deploy real-time decisioning for one high-value use case (e.g., in-app next best offer on login). A/B test against the existing batch campaign. Measure conversion, opt-out rate and customer satisfaction.

Phase 3 (9–18 months): Extend real-time decisioning across all digital channels. Retire batch segmentation for customer-facing outreach. Maintain batch jobs for analytical and reporting purposes.

About Appice

Appice is a real-time decisioning system built for regulated industries. Most platforms analyse. Appice coordinates decisions and execution — inside compliance. Signal in. Decision made. Action taken. Under 100ms. Appice processes millions of decisions daily across banking, telco and healthcare. Visit appice.ai or write to contact@appice.ai.