How to realize Data Versioning in Google BigQuery
Building Historical and Current Views in BigQuery
In this short tutorial, I want to show you how to implement data versioning in newer cloud-based, hybrid systems (a mix of NoSQL and classic database systems) and column-based Data Warehouses. One well-known solution, and one of my favorites, is Google’s BigQuery.
When building such a Data Warehouse, you will of course want a historical view of your data. One option is to build a Data Vault.
Newer cloud-based and SaaS technologies also load data from source systems into the target system through an ETL or ELT process, fed by change data capture (CDC) or messaging services. However, the target is no longer a classical relational or cube-based system.
Step 1: Build up the ELT/ETL Process
Here, NoSQL or hybrid, denormalized solutions are used, so you will often end up loading every updated record from the source system. It is important that you add metadata like a load…
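To make the idea concrete, here is a minimal sketch in plain Python rather than BigQuery itself. It assumes the load metadata is a timestamp, stored in a hypothetical `load_ts` field, and that records carry a hypothetical `id` key; the sample rows and dates are made up for illustration. Every load is appended to the history, and a current view is derived by keeping the most recently loaded version of each key.

```python
from datetime import datetime, timezone


def stamp(records, load_ts):
    """Return copies of the records with a load timestamp attached.

    'load_ts' is an assumed metadata field name, not one from the article.
    """
    return [{**record, "load_ts": load_ts} for record in records]


def current_view(history, key="id"):
    """Keep only the most recently loaded version of each key."""
    latest = {}
    for row in history:
        k = row[key]
        if k not in latest or row["load_ts"] > latest[k]["load_ts"]:
            latest[k] = row
    # Sort by key so the result is deterministic.
    return sorted(latest.values(), key=lambda r: r[key])


# Append-only history: nothing is updated in place, every load adds rows.
history = []

# First load (illustrative data).
history += stamp(
    [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}],
    datetime(2024, 1, 1, tzinfo=timezone.utc),
)

# Second load: only the changed record arrives from the source system.
history += stamp(
    [{"id": 1, "name": "Alice Smith"}],
    datetime(2024, 1, 2, tzinfo=timezone.utc),
)

current = current_view(history)
```

The `history` list plays the role of the historical table, since it keeps every version ever loaded, while `current` corresponds to a view that exposes only the newest version per key.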