Upserting Data using Spark and Iceberg
May 25, 2023
•
Jonathan Merlevede
Use Spark and Iceberg’s MERGE INTO syntax to efficiently store daily, incremental snapshots of a mutable source table.
Latest
From Good AI to Good Data Engineering. Or how Responsible AI interplays with High Data Quality
Responsible AI depends on high-quality data engineering to ensure ethical, fair, and transparent AI systems.
A glimpse into the life of a data leader
Data leaders face pressure to balance AI hype with data landscape organization. Here’s how they stay focused, pragmatic, and strategic.
Data Stability with Python: How to Catch Even the Smallest Changes
As a data engineer, it is nearly always the safest option to run data pipelines every X minutes. This allows you to sleep well at night…