Data Mesh in SAP
Interview prep: understand domains, data products, ownership, contracts/SLAs, lineage, and the shift to event‑driven integration.
Core Idea
Traditional SAP integrations were heavy: JCo, IDocs, and batch jobs everywhere. It worked, but it was slow, tightly coupled, and hard to scale. Today, businesses need real‑time responsiveness. Event‑driven architecture (EDA) pushes changes as events — systems are notified instantly. Data Mesh adds domain ownership: each domain (Sales, HR, Finance) provides data as a product, discoverable and reusable without central bottlenecks. Together, EDA + Data Mesh enable real‑time, decoupled, scalable integration across SAP and non‑SAP.
From Monolith to Data Mesh
If we go back to the old monolith world, the database was always the source of truth. Everything was strongly consistent, and reads were easy to serve inside that system.
The problem started when other applications wanted the same data. There were only two strategies: either replicate the data into another service, or expose it through APIs. Both worked, but both also introduced complexity and coupling.
Later, we added data warehouses and OLAP systems. Data was extracted in batches, transformed, and loaded into a warehouse for reporting. That solved analytics, but it created new issues: long delays, rigid schemas, and no single ownership. Data teams became bottlenecks.
That’s exactly where data mesh comes in. Instead of centralizing everything, you publish important datasets as data products. Each domain owns its own data product, takes care of its quality and evolution, and others can discover and subscribe to it.
And event streams are the perfect foundation for these products. Unlike monolith databases, which are the only ‘truth,’ event streams become a new source of truth across domains. They are immutable, durable, and replayable. Consumers can rebuild state or aggregate data as they need.
But the trade‑off changes: in a monolith, data was always strongly consistent. In an event‑driven mesh, data is eventually consistent. You don’t get a single instant global truth — but you get scalability, independence, and real‑time distribution. That’s the deal we make when moving from monolith to event‑driven data mesh.
Monoliths optimized for local truth; cross‑app sharing created tight coupling and slow batches. Mesh moves ownership to domains and uses events as the backbone so products can be discovered, subscribed to, and evolved independently.
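To make the “replayable source of truth” idea concrete, here is a minimal sketch (plain Python, invented event shapes, no SAP libraries): a consumer folds over the stream and converges on the same state as every other consumer, just not at the same instant — exactly the eventual-consistency trade described above.

```python
# Minimal sketch: rebuilding local state by replaying an immutable event stream.
# The event shapes (entity id + full payload) are illustrative, not an SAP schema.

events = [
    {"type": "SalesOrder.Created", "id": "4711", "data": {"status": "NEW", "value": 100}},
    {"type": "SalesOrder.Changed", "id": "4711", "data": {"status": "APPROVED", "value": 100}},
    {"type": "SalesOrder.Created", "id": "4712", "data": {"status": "NEW", "value": 250}},
]

def rebuild_state(event_stream):
    """Fold over the stream; the last event per entity wins."""
    state = {}
    for event in event_stream:
        state[event["id"]] = event["data"]
    return state

# A consumer that replays from the beginning converges on the same state as
# every other consumer -- just not at the same instant (eventual consistency).
print(rebuild_state(events))
# {'4711': {'status': 'APPROVED', 'value': 100}, '4712': {'status': 'NEW', 'value': 250}}
```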
📝 Quick Reminder List
- Monolith → DB = source of truth, strongly consistent, local queries fine.
- Problem → need data outside → replicate or expose APIs (tight coupling).
- Data Warehouse → batch ETL, analytics ok, but slow + no ownership.
- Data Mesh → domains publish datasets as products, self‑owned.
- Event streams → new source of truth: immutable, durable, replayable.
- Key shift → Strong consistency → Eventual consistency.
Data Mesh Principles
Data mesh changes the mindset around data. Instead of a central BI or IT team owning all pipelines, which creates bottlenecks and delays, responsibility shifts to domains. Sales owns Sales data, HR owns HR data, Finance owns Finance data — each domain makes its data available. That’s domain ownership.
But it’s not enough to just “have data.” You must treat it like a product — with a clear schema, documentation, and access methods so others can consume it easily. That’s data as a product.
To avoid chaos, we add federated governance: common rules like GDPR, naming standards, and quality checks. Within those guardrails, domains have freedom to manage their data their way.
Finally, self‑service ensures consumers don’t beg central IT. They discover existing data products, subscribe, and start using them on their own — enabled by catalogs, schema registries, and event brokers.
In short: domains own and publish data as products, governance sets shared rules, and consumers access via self‑service. Much more scalable than a central warehouse model.
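One way to picture “data as a product” is as a small, machine-readable contract published next to the stream. The sketch below uses invented field names and values; real catalogs and schema registries differ.

```python
# Illustrative data-product contract: owner, interfaces, quality, and governance in one place.
# Field names and values are made up for the sketch; real catalog entries vary.
sales_order_product = {
    "name": "sales.sales-order.v1",
    "owner": "Sales domain (order-to-cash team)",
    "interfaces": {
        "events": "topic: sales/salesorder/v1/created",
        "api": "OData read model for on-demand lookups",
    },
    "schema": "SalesOrder_v1 (registered in the schema registry)",
    "slas": {"freshness": "< 60 s after commit", "availability": "99.9 %"},
    "governance": {"pii": False, "retention_days": 30, "classification": "internal"},
    "evolution": "additive changes only within v1; breaking changes publish v2 in parallel",
}
```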
📝 Quick Reminder List
- Domain ownership → each domain owns its data (Sales, HR, Finance).
- Data as a product → schema + docs + access (usable, not raw).
- Federated governance → rules for all (GDPR, quality), freedom within domains.
- Self‑service → discover & subscribe without central IT bottlenecks.
- Core answer → “Data Mesh = domains own and publish their data as products, with shared rules, discoverable in self‑service.”
SAP Data Mesh – Event Types
In SAP’s event-driven world we usually talk about three types of events: notification, data, and decision events.
A notification event is the smallest: it just signals that something changed (e.g., SalesOrder.Created). It does not carry full data; consumers decide if it’s relevant and can call an API for details. Light and secure, but often requires extra API calls.
A data event is the opposite: it carries the full payload (items, prices, partners). Consumers can process immediately, but events are larger, heavier for brokers, and risk exposing sensitive data.
A decision event is a middle ground: it carries more than a notification but less than full data. Consumers get enough info to decide whether to call back for details, reducing unnecessary API calls.
SAP events often follow the CloudEvents format, an open standard ensuring consistent metadata and structure across SAP and non-SAP platforms.
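To make the notification-vs-data distinction concrete, here is a sketch of both extremes as CloudEvents-style envelopes. The required CloudEvents attributes (specversion, type, source, id) are from the open standard; the SAP-looking type string and payload fields are illustrative, not an official schema.

```python
# Sketch: notification event vs. data event in CloudEvents-style envelopes.
# CloudEvents attributes are standard; the type string and payload are assumed for illustration.

notification_event = {
    "specversion": "1.0",
    "type": "sap.s4.beh.salesorder.v1.SalesOrder.Created.v1",  # assumed naming
    "source": "/default/sap.s4/S4H",
    "id": "evt-0001",
    "time": "2024-05-01T10:15:00Z",
    "data": {"SalesOrder": "4711"},  # only the key -- consumer calls an API for details
}

data_event = {
    "specversion": "1.0",
    "type": "sap.s4.beh.salesorder.v1.SalesOrder.Created.v1",
    "source": "/default/sap.s4/S4H",
    "id": "evt-0002",
    "time": "2024-05-01T10:15:00Z",
    "data": {                        # full payload -- heavier, may expose sensitive fields
        "SalesOrder": "4711",
        "SoldToParty": "1000001",
        "TotalNetAmount": "1250.00",
        "Items": [{"Item": "10", "Material": "M-01", "Quantity": "5"}],
    },
}
```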
📝 Quick Reminder List
- Notification event → minimal info, consumer decides relevance, light & secure, extra API calls likely.
- Data event → full payload, immediate processing, heavy & potential data exposure.
- Decision event → middle ground, enough info to reduce API calls.
- CloudEvents → SAP uses this open standard format for interoperability.
- Core answer → “Notification = light, Data = full, Decision = middle. CloudEvents standardize it.”
Channel vs Queue
A channel is short‑lived — like a loudspeaker. The producer broadcasts a message, and whoever listens in that moment hears it. If nobody listens, the message is gone. The broker doesn’t store it. Useful for quick, lightweight notifications, but risky if no consumers are online.
A queue is durable — like a to‑do list. The producer puts a task into the queue, and a consumer processes it. Only then is it removed. If the consumer is down, the task waits safely until picked up. Perfect for workloads like order processing or background jobs where message loss is unacceptable.
So: channels are ephemeral and best for broadcast notifications, while queues are durable and best for guaranteed work delivery.
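A toy in-process sketch of that difference (plain Python, no real broker): a channel delivers only to listeners present at publish time, while a queue holds the work until someone consumes it.

```python
# Toy sketch (in-process, no real broker): channel = fire-and-forget broadcast,
# queue = durable hand-off that survives until a consumer picks it up.
from collections import deque

class Channel:
    def __init__(self):
        self.listeners = []            # only whoever is subscribed *right now*
    def publish(self, msg):
        for listener in self.listeners:
            listener(msg)              # if the list is empty, the message is simply lost

class Queue:
    def __init__(self):
        self._items = deque()          # messages wait here until processed
    def publish(self, msg):
        self._items.append(msg)
    def consume(self):
        return self._items.popleft() if self._items else None

channel, queue = Channel(), Queue()
channel.publish("order 4711 created")           # lost: no listener yet
queue.publish("process order 4711")             # kept: waits for a consumer
channel.listeners.append(lambda m: print("heard:", m))
channel.publish("order 4712 created")           # delivered to the listener
print("picked up:", queue.consume())            # consumer processes later
```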
📝 Quick Reminder List
- Channel → ephemeral, no storage, if no listener = lost. (like a loudspeaker).
- Queue → durable, stored until processed, safe for work items. (like a to‑do list).
- Core answer → “Channel = fire‑and‑forget, Queue = guaranteed work delivery.”
The Three Data Product Alignment Types
When we say “data product,” it doesn’t always look the same. It depends on how it’s aligned. There are three types: source‑aligned, aggregate‑aligned, and consumer‑aligned.
A source‑aligned data product comes straight from the operational system. Example: sales facts directly from S/4HANA (items, prices, shipping, payment info). Closest to the raw business events, useful for both ops and analytics.
An aggregate‑aligned data product is business‑oriented. Instead of raw transactions, it groups data into meaningful KPIs (e.g., hourly sales totals per store). Great for trend or performance tracking.
A consumer‑aligned data product is tailored for a specific use case. Example: mixing sales aggregates, inventory levels, and customer profiles into exactly what one team needs. Powerful but narrow — built for one consumer.
The progression: source‑aligned (raw detail), aggregate‑aligned (summarized), consumer‑aligned (tailored). Choice depends on the consumer and their use case.
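A tiny sketch of the first two alignments with invented data: the same raw sales facts (source-aligned) rolled up into hourly totals per store (aggregate-aligned).

```python
# Sketch: source-aligned facts vs. an aggregate-aligned product (hourly totals per store).
# Records are invented for illustration.
from collections import defaultdict

source_aligned = [   # raw facts, close to the operational system
    {"store": "S01", "hour": "09:00", "amount": 120.0},
    {"store": "S01", "hour": "09:00", "amount": 80.0},
    {"store": "S02", "hour": "09:00", "amount": 45.5},
    {"store": "S01", "hour": "10:00", "amount": 60.0},
]

aggregate_aligned = defaultdict(float)
for sale in source_aligned:
    aggregate_aligned[(sale["store"], sale["hour"])] += sale["amount"]

print(dict(aggregate_aligned))
# {('S01', '09:00'): 200.0, ('S02', '09:00'): 45.5, ('S01', '10:00'): 60.0}
```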
📝 Quick Reminder List
- Source‑aligned → raw facts from operational system (e.g., Sales Order details).
- Aggregate‑aligned → grouped view (e.g., hourly sales per store).
- Consumer‑aligned → highly tailored mix (e.g., sales + inventory + customer profile for one use case).
- Core answer → “Source = raw, Aggregate = summarized, Consumer = customized.”
Event as Data Product
In a data mesh, event streams become a key type of data product. Instead of treating events as mere technical plumbing, domains publish important business events (like SalesOrder.Created) as discoverable, versioned products. These streams are the single source of truth for changes, available for any consumer to subscribe and build their own state or analytics.
There are two main event types: state events (sometimes called "full" or "snapshot" events) and delta events ("change" or "patch" events). State events contain the entire record as of a point in time—easy to consume and replay, but heavier. Delta events only describe what changed (e.g., just a field update)—lighter, but riskier if you miss one or if consumers fall behind.
Schema registries (like Confluent Schema Registry or SAP Event Mesh registry) ensure all producers and consumers agree on the event format. This shared contract is essential for interoperability and versioning.
Governance is key: domains must document the topic, schema, quality SLAs, and evolution policy. The event stream is now a product with an owner, lifecycle, and discoverable contract—not just a technical afterthought.
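A small sketch of the state-vs-delta trade-off with an invented record: a state event can simply replace the local copy, while a delta must be patched in and silently goes wrong if an earlier change was missed.

```python
# Sketch of the trade-off: a state event carries the whole record, a delta event only the change.
# Record and event shapes are illustrative, not an SAP schema.

current = {"id": "BP-100", "name": "ACME", "city": "Berlin", "blocked": False}

state_event = {"kind": "state",
               "data": {"id": "BP-100", "name": "ACME", "city": "Hamburg", "blocked": False}}
delta_event = {"kind": "delta", "data": {"id": "BP-100", "city": "Hamburg"}}

def apply(event, record):
    if event["kind"] == "state":
        return dict(event["data"])     # replace: self-contained, safe to replay
    updated = dict(record)
    updated.update(event["data"])      # patch: cheap, but wrong if an earlier delta was missed
    return updated

print(apply(state_event, current))
print(apply(delta_event, current))
```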
📝 Quick Reminder List
- Event stream = single source of truth
- State event → full record, reliable
- Delta event → only changes, lighter but risky
- Schema registry → ensures shared language
- Core answer → “Event streams are published as governed, versioned data products; state events carry the full record, delta events only the change.”
Consuming and Using Event-Driven Data Products
In an event-driven setup, the key idea is the state event: each event contains the full record of an entity at a point in time. If a customer changes, the event doesn’t just say “address updated” — it sends the whole customer snapshot again.
This is Event-Carried State Transfer (ECST). If you replay all events in order, you can rebuild the entire state from scratch. It’s reliable for mesh-style sharing because consumers don’t need to call the producer to fetch missing details.
On the consumer side we use materialization: keep a local, read-optimized copy of only the fields we need. Each incoming event merges/overwrites the local record. That lets every domain shape data to its model without coupling to the producer’s DB.
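A minimal consumer-side sketch of materialization under ECST, with an assumed event shape: keep only the fields this domain cares about and upsert on every incoming snapshot.

```python
# Sketch: consumer-side materialization of a local read model from full-snapshot (ECST) events.
# The event payload shape is assumed for illustration.

read_model = {}   # keyed by customer id; this consumer only cares about name and city

def on_customer_event(event):
    """Upsert the local view; replaying the same events in order always converges."""
    snapshot = event["data"]
    read_model[snapshot["id"]] = {
        "name": snapshot["name"],
        "city": snapshot["city"],      # other producer fields are simply ignored
    }

on_customer_event({"type": "Customer.Changed",
                   "data": {"id": "C-1", "name": "ACME", "city": "Berlin", "creditLimit": 5000}})
on_customer_event({"type": "Customer.Changed",
                   "data": {"id": "C-1", "name": "ACME", "city": "Hamburg", "creditLimit": 7000}})
print(read_model)   # {'C-1': {'name': 'ACME', 'city': 'Hamburg'}}
```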
📝 Quick Reminder List
- State event → full snapshot of the entity, not just the delta.
- ECST → replay all events to rebuild state.
- Materialization → consumer maintains its own read model, updated per event.
- Core answer → “Producers send full snapshots; consumers materialize their own local view.”
📝 Quick Reminder List
- Old: JCo, IDoc, batch → slow, dependent.
- New: EDA push → events flow automatically.
- Data Mesh = domain owns data → product mindset.
- Benefits: real‑time, scalable, decoupled.
- SAP tools: Event Mesh, Advanced Event Mesh.
What is Data Mesh
A socio‑technical approach where domains own data as a product. Each product has a contract (schema, quality SLO/SLAs, lineage, access policy), is discoverable, and is served via a self‑service platform. In SAP, this usually means S/4HANA domains publish events and productized views, while consumers (analytics, CRM, apps) use contracts instead of direct tables.
How it maps to SAP
| Mesh Concept | SAP Flavor | Platform Capability | Notes |
|---|---|---|---|
| Domain ownership | Line‑of‑business domains (O2C, P2P, Finance) own product definitions | Catalog, access control, CI/CD for data | Clear owners; not central IT only |
| Data product | e.g., SalesOrder product (read model + API + event) | Schema registry, SLAs, monitoring | Versioned contracts |
| Event backbone | Event Mesh / AEM / Kafka topics | Pub/sub, retention, replay | Near real‑time fan‑out |
| Serving interfaces | OData/REST APIs, CDS views, Calculation Views | Gateway, API mgmt, QoS | Contracts not tables |
| Governance | Data catalog, lineage, policy‑as‑code | Central standards; domain execution | Balance autonomy & control |
EDA + Data Mesh (mental model)
Events flow from S/4 domains; products expose stable read models. Consumers subscribe and query without tight coupling.
When to use what (interview quick compare)
| Decision | Use Mesh/EDA | Use Central DW/Lake |
|---|---|---|
| Many domains, rapid change | Yes — domain products + events | Risk of central backlog |
| Strict global model, finance close | Mesh feeds curated views | Good for consolidated reporting |
| Real‑time fan‑out | Events first‑class | Batch CDC later |
| Legacy integration | Wrap as product API | ETL to DW if needed |
Event-Driven Architecture Blueprints
Blueprint 1: Master Data Distribution
Main characteristics: real-time updates for master data, push/subscription-based approach.
Implementation Flow
1. Publish Master Data Event: Event Source (e.g., SAP S/4HANA, ECC, SuccessFactors) emits a Business Object change event via AMQP to SAP Integration Suite, Advanced Event Mesh.
2. Filter & Route: Advanced Event Mesh filters events and routes them to relevant queues/subscriptions.
3. Distribute to Consumers: Multiple consumers (SAP S/4HANA Cloud, third‑party, on‑premise S/4HANA) receive and process the event independently.
Blueprint 2: Transforming Distributor
Main characteristics: no polling; SAP Integration Suite used for enrichment and event → API transformation.
Implementation Flow
1. Publish Event: Event Source emits Business Object change events via AMQP.
2. Filter & Route: Event Mesh filters and sends events to subscribers.
3. Transform the Event: SAP Integration Suite enriches/transforms the event (e.g., add fields, adjust format).
4. Deliver to Consumer: Event sent via API to ECC, third‑party apps, or other systems.
When to use: You need pub/sub and content mediation — enrich, remap, and call APIs without polling.
Blueprint 3: Real-Time Notifications
Main characteristics: small event with minimal data; consumers fetch details via API with enforced auth.
Implementation Flow
1. Publish Notification Event: Source emits minimal payload via AMQP.
2. Route & Deliver: Event Mesh delivers notifications to subscribers.
3. Retrieve Additional Data: Consumers call back‑end API for full details with access control.
Interview Explanation: “Notification events are small (just ID/type). Consumers fetch details securely via API. Efficient: events stay light, APIs handle security and full payload.”
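A sketch of that notification-then-fetch flow; the endpoint URL, entity name, and auth handling below are placeholders for illustration, not a guaranteed SAP API.

```python
# Sketch of Blueprint 3: lightweight notification event, details fetched via an authorized API call.
# Endpoint, entity name, auth handling, and event fields are placeholders.
import requests

API_BASE = "https://s4.example.com/sap/opu/odata/sap/API_SALES_ORDER_SRV"  # placeholder

def on_notification(event, token):
    if event["type"] != "SalesOrder.Created":
        return None                               # consumer decides relevance first
    order_id = event["data"]["SalesOrder"]        # event carries only the key
    response = requests.get(                      # full payload stays behind access control
        f"{API_BASE}/A_SalesOrder('{order_id}')",
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
        timeout=10,
    )
    response.raise_for_status()
    return response.json()
```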
Blueprint 4: Across Vendor Mesh
Main characteristics: diverse event sources (Kafka, Azure, etc.); cross-vendor interoperability with SAP Integration Suite.
Implementation Flow
1. Event Sources Publish: Events flow from Kafka, Azure, etc. into Advanced Event Mesh.
2. Aggregate & Route: Event Mesh consolidates events from multiple brokers.
3. Event Mediation: SAP Integration Suite transforms/mediates events (SMF supported).
4. Distribute to Consumers: Consumers (SAP/non‑SAP) receive tailored payloads.
Blueprint 5: Shock Absorber
Main characteristics: buffering, decoupling, cross-vendor support, dynamic scalability (e.g., Black Friday).
Implementation Flow
1. Source Publishes Events: Source emits events via AMQP/REST into Advanced Event Mesh.
2. Queue Buffering: Events buffered in queues for controlled delivery.
3. Consumer Subscribes: Consumers process events at a sustainable rate, decoupled from spikes (a small sketch follows below).
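A toy sketch of the shock-absorber idea: the producer bursts, the buffer absorbs, and the consumer drains at its own bounded rate. An in-process Python queue stands in for a durable broker queue.

```python
# Toy sketch of the shock-absorber pattern: bursts land in a buffer,
# the consumer drains at a sustainable rate. An in-process queue stands in for the broker.
import queue
import threading
import time

buffer = queue.Queue()

def producer():
    for i in range(20):                  # e.g. a Black Friday burst
        buffer.put(f"order-{i}")

def consumer():
    while True:
        order = buffer.get()
        time.sleep(0.05)                 # simulate steady, bounded processing capacity
        print("processed", order)
        buffer.task_done()

threading.Thread(target=consumer, daemon=True).start()
producer()                               # burst arrives instantly...
buffer.join()                            # ...but is worked off at the consumer's own pace
```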
Event-Driven Architecture Blueprints — Comparison
| Blueprint | What it is | When to use | Key difference |
|---|---|---|---|
| 1. Master Data Distribution | Push master data changes (BP, material, customer) to multiple consumers | When you need real-time sync of master data | Simple pub/sub; focus on distribution |
| 2. Transforming Distributor | Events enriched/transformed via SAP Integration Suite before delivery | When consumers need different formats or extra fields | Adds mediation/enrichment layer |
| 3. Real-Time Notifications | Minimal event (ID, timestamp), consumer retrieves details via API | When you want lightweight events + secure access | Event + API combo |
| 4. Across Vendor Mesh | Aggregates events from different brokers (Kafka, Azure, etc.) | When you have a heterogeneous landscape | Cross-vendor interoperability |
| 5. Shock Absorber | Queues buffer spikes, decouple producer/consumer | When you expect load spikes | Buffering + decoupling |
Interview Tip
“Each blueprint solves a different integration problem. #1 distributes master data, #2 adds mediation, #3 balances lightweight events with secure APIs, #4 integrates across brokers, #5 absorbs peak load.”
Integration between SAP BTP and SAP S/4
SAP Intelligent Clinical Supply Management (ICSM) is a hybrid solution — it runs partly on SAP BTP (cloud) and partly on SAP S/4HANA (on‑prem). These two sides must continuously exchange data securely and efficiently.
Key Communication Channels
1. Cloud Connector
- Acts as a secure tunnel from SAP BTP → on‑prem S/4HANA.
- Critical for cloud → S/4HANA communication without exposing backend to the internet.
Interview phrase: “Think of the Cloud Connector as a VPN tunnel between SAP BTP and on‑prem S/4HANA — it secures traffic in hybrid scenarios.”
2. SAP Event Mesh
- Used to synchronize data in real time between planning (cloud) and operations (S/4HANA).
- Events can also trigger follow‑up actions (workflows, dataset refreshes).
Interview phrase: “Event Mesh keeps planning and operations in sync, and events can trigger further automation.”
3. OData APIs
- Support direct data exchange between BTP and S/4HANA.
- Example: Cloud calls an API to fetch study information stored in S/4HANA.
- Best when a structured request/response is needed (not only push).
Interview phrase: “APIs handle structured request/response exchanges — e.g., the cloud asking S/4HANA for study master data.”
Why it matters
By combining Cloud Connector (secure channel), Event Mesh (real‑time sync & triggers), and OData APIs (structured exchange), SAP delivers a hybrid architecture that balances security, real‑time integration, and flexibility.
Event Brokers
Definition: The “post office” — brokers route, filter, and manage events between producers and consumers.
Option A: SAP Event Mesh
- Purpose: Quick entry into EDA, low complexity.
- Model: Event broker as a service on SAP BTP.
- Use Cases: Integrating/extending SAP back‑ends and apps.
- Strengths: Pay‑per‑use; native S/4 add‑on; standards support with SAP‑optimized features.
- Limits: Scales well but has service constraints.
Option B: SAP Integration Suite, Advanced Event Mesh (AEM)
- Purpose: Enterprise‑grade, large‑scale EDA.
- Deployment: Across hyperscalers or private cloud; supports distributed meshes.
- Advanced Features: Event streaming (with replay), dynamic routing, fine‑grained filtering, high performance, security, governance, monitoring.
SAP Event Mesh vs Advanced Event Mesh — Key Differences
| Feature | SAP Event Mesh | Advanced Event Mesh |
|---|---|---|
| Infrastructure Model | Shared broker on SAP BTP | Dedicated broker(s) — scalable T‑shirt sizes |
| Deployment Options | SAP BTP only | On‑prem, private/public clouds (AWS, Azure, GCP) + edge |
| Message Size | Up to 1 MB | Up to 30 MB |
| Storage | 10 GB | 6 TB |
| Scale | Small → Medium | Small → Ultra‑Large (billions of events/day) |
| Advanced Features | Basic integration & extension | Event streaming, replay, transactions, dynamic routing, distributed tracing, lifecycle mgmt |
| Event Replay | ❌ | ✅ |
| Transactions | ❌ | ✅ |
| Filtering | Basic | Advanced, fine‑grained |
| Monitoring & Analysis | Basic | Advanced event management & lifecycle tools |
| Protocols | AMQP over WebSockets, REST, JMS, MQTT | AMQP, MQTT, SMF, SMF/WS, REST, JMS |
| Pricing | Usage‑based (message count) | Broker‑based (T‑shirt size model) |
Target Scenarios
SAP Event Mesh
- SAP‑to‑SAP and SAP‑to‑external extensions.
- Ideal for quick starts, small to medium use cases.
Advanced Event Mesh
- “SAP to Everything” and “Everything to Everything.”
- Event‑driven backbone for enterprise‑scale, mission‑critical integrations.
People also ask
Why can’t we just use REST APIs or services?
REST APIs work well for point‑to‑point, synchronous calls. But at scale, APIs create tight coupling, polling overhead, and complex dependency chains between systems. This slows down integration and innovation.
What does Data Mesh add on top of APIs?
Data Mesh decentralizes ownership. Each business domain manages its own data products and publishes them as events. Other domains subscribe without direct dependencies. This scales much better in complex, distributed landscapes.
How is event‑driven better than APIs?
APIs = pull (you have to ask for data). Events = push (changes broadcast in real time). This eliminates polling, reduces latency, and ensures immediate reaction to business events.
Do APIs disappear in Data Mesh?
No. APIs remain essential for transactional request‑response. Mesh complements APIs by providing real‑time, loosely coupled data distribution across domains.
What’s the real business value?
Faster decisions, real‑time visibility, and reduced integration complexity. Domains are autonomous, data is reusable, and new consumers can onboard without impacting existing producers.
Can you summarize in one line?
“APIs connect systems; Data Mesh connects domains with real‑time, event‑driven data products, enabling scale, agility, and loose coupling.”
What is Event‑Driven Architecture (EDA)?
EDA is an integration model where applications publish, capture, and respond to events in real time. Producers emit events when state changes, and consumers subscribe to process them.
What is Data Mesh?
An architectural approach where each domain owns and publishes its own data products, often distributed via events in real time.
Why are EDA and Data Mesh often used together?
EDA provides the technical backbone (event brokers, real‑time streaming), while Data Mesh provides the organizational model (domain ownership, data as a product). Together, they enable scalable, real‑time data sharing across domains.
What is a business event in SAP?
A significant state change, e.g., SalesOrder.Created or BusinessPartner.Changed.
Notification vs Data vs Decision events?
Notification = minimal info; consumer fetches details via API. Data = full payload included. Decision = middle ground with extra context to reduce unnecessary API calls.
What format does SAP use for events?
SAP uses the CloudEvents standard (CNCF); SuccessFactors currently uses JSON/SOAP (CloudEvents planned).
What are the two main EDA patterns?
Publish/Subscribe (no replay) and Event Streaming (retained & replayable; good for recovery and analytics).
When would you use pub/sub vs streaming?
Pub/Sub = instant triggers, low‑latency. Streaming = event history, recovery, onboarding, and analytics.
SAP Event Mesh vs Advanced Event Mesh?
Event Mesh = BTP‑native, lightweight, SAP‑to‑SAP extensions. AEM = enterprise‑grade, distributed mesh, multi‑cloud/hybrid with replay, transactions, routing, tracing.
What is the Event Portal?
Part of AEM for designing, cataloging, documenting, and governing events across the enterprise.
Which SAP systems act as event sources?
ECC (custom via Event Enablement), S/4HANA & S/4HANA Cloud (~600 RAP‑based standard + custom), SuccessFactors (Intelligent Services; JSON/SOAP).
Outbound vs Inbound events?
Outbound = the system publishes events (~90%). Inbound = the system consumes events (~10%; often replaced by API).
Why not just use APIs instead of events?
APIs are synchronous/pull/tightly coupled. Events are asynchronous/push/loosely coupled and scale in real‑time. They complement each other.
Key benefits of EDA?
Loose coupling, scalability, resilience, real‑time updates, incremental growth, faster innovation.
Business value of Data Mesh with EDA?
Faster decision‑making, domain autonomy, reduced integration complexity, improved agility, and cross‑vendor interoperability.
Examples of SAP EDA use cases?
- HR Onboarding: Employee.Created → subscribers process.
- Shock Absorber: queues buffer Black Friday spikes.
- Across Vendor Mesh: Kafka/Azure aggregated into S/4HANA.
- Master Data Distribution: MDG pushes changes to many consumers.
How do you monitor events in AEM?
Insights dashboards (via Datadog), KPIs, alerting, and distributed tracing for troubleshooting.
What is distributed tracing in AEM?
End‑to‑end trace across producer → broker mesh → consumer using OpenTelemetry to troubleshoot latency and bottlenecks.
How is Advanced Event Mesh licensed?
Two models: Consumption‑based (hourly brokers) and Subscription‑based (reserved capacity; lower TCO for predictable workloads). Regions bundled.
Which plan should customers choose?
Standard Plan (default plan deprecated). Standard includes tracing, replay, and spool add‑ons.
In SD, when use EDA vs APIs?
APIs for synchronous checks (ATP, pricing). EDA for distributing order, delivery, or GI events to many consumers in real time.
Which SD events are most useful?
SalesOrder.Created, SalesOrder.Changed, Delivery.Created, GoodsIssuePosted, BusinessPartner.Changed.
When should I stick to APIs instead of EDA?
- Single consumer only
- Synchronous validation required (pricing, ATP)
- Low‑volume, predictable processes
- Consumer always needs full dataset (better via API)
What is MQTT?
Lightweight publish/subscribe protocol for IoT/mobile; efficient over low‑bandwidth networks; QoS 0/1/2.
What is AMQP?
Enterprise messaging protocol for reliable, ordered, transactional communication with rich routing (queues/topics).
How do MQTT and AMQP fit into SAP EDA?
MQTT for edge → SAP BTP telemetry; AMQP as backbone in Event Mesh/AEM. They can be bridged.
What is Kafka?
Apache Kafka is a distributed streaming platform for high‑volume, low‑latency processing; great for analytics/ML pipelines and microservices.
When recommend SAP Data Mesh / AEM vs Kafka?
Use Kafka for raw high‑throughput streaming and analytics. Use AEM for SAP‑centric business events, lifecycle governance, and cross‑vendor mesh (Kafka/Azure/SAP together).
Can Kafka and SAP Data Mesh work together?
Yes. Hybrid is common: Kafka handles raw ingestion; AEM integrates streams into business processes (see Blueprint 4 — Across Vendor Mesh).