I created an ETL pipeline in GCP that processes XML files from a Cloud Storage bucket and loads them into BigQuery.
Sometimes we find that some files were not processed, or that their data is missing from the BigQuery dataset.
I created a metric table that contains metadata about the processed files. However, I want to automate checks, for example verifying that every file in storage also appears in the metric table.
EDIT
In short, I want to compare the source and target environments, that is, the data before it enters the ETL and the data after it exits, so I can confirm that nothing was missed. I could write some scripts to do that, but I wonder whether a tool for this already exists.
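
To make the check concrete, here is a minimal sketch of the kind of script I have in mind. It is only an illustration: the helper names, the `file_name` column, and the idea that the metric table stores one row per processed object name are all assumptions, not my actual schema. The core of it is a set difference between the object names in the bucket and the names recorded in the metric table.

```python
from typing import Iterable, Set


def find_unprocessed(source_files: Iterable[str],
                     processed_files: Iterable[str]) -> Set[str]:
    """Return names present in the source but absent from the metric table."""
    return set(source_files) - set(processed_files)


def list_bucket_files(bucket_name: str, prefix: str = "") -> Set[str]:
    """List object names in a bucket (hypothetical helper).

    Needs the google-cloud-storage package; imported lazily so the
    comparison logic above can be used and tested without it.
    """
    from google.cloud import storage

    client = storage.Client()
    return {blob.name for blob in client.list_blobs(bucket_name, prefix=prefix)}


def list_processed_files(metric_table: str,
                         file_column: str = "file_name") -> Set[str]:
    """Read processed file names from the metric table (hypothetical helper).

    Assumes the table has a column holding the object name; `file_name`
    is a placeholder. Needs the google-cloud-bigquery package.
    """
    from google.cloud import bigquery

    client = bigquery.Client()
    query = f"SELECT DISTINCT {file_column} FROM `{metric_table}`"
    return {row[file_column] for row in client.query(query).result()}


if __name__ == "__main__":
    # Local example with made-up names, no GCP access required.
    in_bucket = ["a.xml", "b.xml", "c.xml"]
    in_metrics = ["a.xml", "c.xml"]
    print(sorted(find_unprocessed(in_bucket, in_metrics)))
```

Something like this could run on a schedule and alert when the result is non-empty, but it only covers file names, not the contents of each file.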