Do you ever find yourself crafting JSON data to send into Pub/Sub? Wish you had some kind of GUI, web service, or CLI to make it easier?
How about a static webpage, hosted internally on an https endpoint, as a quick-and-easy approach?
This also relies on GCP’s native…
dbt is a great tool for orchestrating and managing SQL runs against data warehouses. When using BigQuery, it can be useful to profile the dbt runs, capturing the slot usage and the bytes processed to measure the cost.
GKE Usage Metering is a great feature that enables GKE profiling, capturing the usage and cost of CPU, memory, storage, and (optionally) network egress. Example queries and Data Studio dashboards are provided.
The goal of this blog post is to show you an alternative hour-by-hour analysis and a detailed…
Another day, another challenge: a customer wanted to parse and compare fixed-width text files, a format frequently used by mainframes and for data interchange. The goal is to validate their new BigQuery-based pipelines against the legacy pipelines.
These files are -
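To give a flavour of the parsing side, here is a minimal Python sketch of reading fixed-width records with plain string slicing. The field names and column offsets below are made up for illustration, not taken from the customer's actual layout.

```python
# Hypothetical fixed-width layout: (field name, start column, end column).
# Real mainframe copybooks would define the actual offsets.
FIELDS = [
    ("account_id", 0, 10),
    ("amount", 10, 18),
    ("currency", 18, 21),
]

def parse_record(line):
    """Slice one fixed-width line into a dict of stripped field values."""
    return {name: line[start:end].strip() for name, start, end in FIELDS}

record = parse_record("0000012345000199.9USD")
print(record)
```

Once each line is a dict, comparing legacy output against the BigQuery pipeline output becomes an ordinary record-by-record diff.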
We’re very happy to announce the availability of Liquibase Spanner extension beta version 1.0. This brings all of Liquibase’s CI/CD benefits to Spanner.
You can find the source and detailed information on GitHub here:
The following change types are supported by the extension: createTable, dropTable, addColumn, modifyDataType, addNotNullConstraint, dropColumn, createIndex…
I’m happy to announce a comprehensive sample SQL parser, translator, and formatter written in Python. It is available open source at https://github.com/google/sample-sql-translator.
It is intended to do the following -
Building on BigQuery: delta to latest. Rather than taking multiple changes and finding the latest value, this time we want to capture the changes over time.
First, we will create a set of test data called testing.sparsedata. This will…
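The "changes over time" idea can be sketched in a few lines of Python: given sparse updates where None means "no change", carry the last known value forward so every timestamp has a complete row. This is only an illustration of the concept, not the BigQuery SQL from the post.

```python
def fill_forward(rows):
    """rows: list of (date, value), where value None means 'unchanged'.
    Returns one row per date with the last known value carried forward."""
    filled, last = [], None
    for date, value in rows:
        if value is not None:
            last = value
        filled.append((date, last))
    return filled

# Sparse changes on days 1 and 3 become a complete value-per-day history.
print(fill_forward([(1, "a"), (2, None), (3, "b"), (4, None)]))
```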
A colleague presented me with a challenge:
I have a series of updates with only the changed fields set. NULLs indicate no change. I have the original records. How do I handle this in BigQuery?
Putting this together into a BigQuery script will be for another day, but for now…
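The core move is the same as COALESCE(update.col, original.col) in SQL: take the updated value when it is set, otherwise keep the original. A minimal Python sketch of that merge, with hypothetical record fields:

```python
def apply_update(original, update):
    """Apply a sparse update to a record: None (or missing) fields leave
    the original value alone, mirroring COALESCE(update.col, original.col)."""
    return {
        k: update[k] if update.get(k) is not None else v
        for k, v in original.items()
    }

original = {"id": 1, "name": "alice", "city": "NYC"}
update = {"id": 1, "name": None, "city": "SF"}  # only city changed
print(apply_update(original, update))
```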
Sometimes you need to compare data across two BigQuery tables. You may want to do this generically (matching the entire table row by row), or by comparing on a key. This is far easier than you may think!
NOTE: You need the transactions dataset for this SQL.
Create a left_table with a million rows. …
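The generic whole-row comparison amounts to treating each table as a multiset of rows and diffing the two multisets. A small Python sketch of that idea (the sample rows are invented for illustration):

```python
from collections import Counter

def diff_tables(left, right):
    """Compare two tables as multisets of rows.
    Returns (rows only in left, rows only in right), duplicates included."""
    lc, rc = Counter(left), Counter(right)
    return list((lc - rc).elements()), list((rc - lc).elements())

left = [("a", 1), ("b", 2), ("b", 2)]
right = [("b", 2), ("c", 3)]
print(diff_tables(left, right))
```

Using Counter rather than set means a row appearing twice on one side and once on the other is correctly flagged as a difference.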
Do you ever get duplicate rows in BigQuery?
I’m going to explore some techniques for deduplication in BigQuery, both for the whole table and by partition. It assumes you have the transactions dataset.
This will create a table containing two columns (date, v). There will be 21 days of data…
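The usual dedup recipe is "keep one row per key, choosing the latest by some ordering column". A minimal Python sketch of that rule, with hypothetical (key, date) rows rather than the post's actual schema:

```python
def dedupe_latest(rows, key=lambda r: r[0], order=lambda r: r[1]):
    """Keep one row per key: the one with the greatest order value."""
    best = {}
    for row in rows:
        k = key(row)
        if k not in best or order(row) > order(best[k]):
            best[k] = row
    return list(best.values())

# Two rows share key "k1"; only the later one (date 3) survives.
print(dedupe_latest([("k1", 1), ("k1", 3), ("k2", 2)]))
```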