Build and Optimize Data Warehouses with BigQuery: Challenge Lab

In this article, we will go through the lab to Build and Optimize Data Warehouses with BigQuery: Challenge Lab. In the previous, lab you will get familiar with Creating a Data Warehouse Through Joins and Unions,
Creating Date-Partitioned Tables in BigQuery, Troubleshooting and Solving Data Join Pitfalls, Working with JSON, Arrays, and Structs in BigQuery, and Build and Execute MySQL, PostgreSQL, and SQLServer to Data Catalog Connectors.

The challenge contains 6 required tasks.

  1. Create a table partitioned by date.
  2. Correctly add appropriate columns to a table.
  3. Populate the population column.
  4. Populate the country_area column.
  5. Populate the mobility STRUCT.
  6. Query missing data in population & country_area columns.

Challenge scenario

1. Create a table partitioned by date

  • In the GCP Console go to Navigation Menu >BigQury.
  • Click on Create Database and Name it as ecommerce.
  • Then Click o Create Database.
  • We can check on our Project Id Database has been Created or Not.
  • Click on Compose New Query.
  • Copy and paste the following code to the Query editor and then click Run
CREATE OR REPLACE TABLE ecommerce.sample 
PARTITION BY date OPTIONS ( partition_expiration_days=90,description="COVID 19 data") AS
SELECT *
FROM bigquery-public-data.covid19_govt_response.oxford_policy_tracker WHERE alpha_3_code != 'USA' AND alpha_3_code != 'GBR'
  • The query should return the Table that has been created.

2.Add new columns to your table

  • Click on compose new query and then copy and paste the following query into the BigQuery Query editor.
ALTER TABLE ecommerce.sample
ADD COLUMN IF NOT EXISTS population INT64,
ADD COLUMN IF NOT EXISTS country_area FLOAT64,
ADD COLUMN IF NOT EXISTS mobility STRUCT<avg_retail FLOAT64,
avg_grocery FLOAT64,
avg_parks FLOAT64,
avg_transit FLOAT64,
avg_workplace FLOAT64,
avg_residential FLOAT64>
  • The query should return the new columns to your table.

3.Add country population data to the population column

  • Replace Project Id.
UPDATE `<PROJECT_ID>.ecommerce.sample` count
SET count.population = count1.pop_data_2019
FROM `bigquery-public-data.covid19_ecdc.covid_19_geographic_distribution_worldwide` count1
WHERE count.date = count1.date AND count.alpha_3_code=count1.country_territory_code

4.Add country area data to the country_area column

  • Replace Project Id.
UPDATE `<PROJECT_ID>.ecommerce.sample` count
SET count.country_area = count1.country_area
FROM `bigquery-public-data.census_bureau_international.country_names_area` count1
WHERE count.country_name = count1.country_name

5.Populate the mobility record data

  • Replace Project Id.
UPDATE `<PROJECT_ID>.ecommerce.sample` count
SET count.mobility = STRUCT<avg_retail FLOAT64, avg_grocery FLOAT64, avg_parks FLOAT64, avg_transit FLOAT64, avg_workplace FLOAT64, avg_residential FLOAT64>
(count1.avg_retail, count1.avg_grocery, count1.avg_parks, count1.avg_transit, count1.avg_workplace, count1.avg_residential)FROM ( SELECT country_region, date, AVG(retail_and_recreation_percent_change_from_baseline) as avg_retail, AVG(grocery_and_pharmacy_percent_change_from_baseline)as avg_grocery,
AVG(parks_percent_change_from_baseline) as avg_parks,
AVG(transit_stations_percent_change_from_baseline) as avg_transit, AVG( workplaces_percent_change_from_baseline ) as avg_workplace,
AVG( residential_percent_change_from_baseline) as avg_residential FROM `bigquery-public-data.covid19_google_mobility.mobility_report` GROUP BY country_region, date) AS count1
WHERE count.country_name = count1.country_region
AND count.date = count1.date

6.Query missing data in population & country_area columns

  • Replace Project Id.
SELECT DISTINCT country_name
FROM `Project Id.ecommerce.sample`
WHERE population is NULL
UNION ALL
SELECT DISTINCT country_name
FROM `Project Id.ecommerce.sample`
WHERE country_area IS NULL ORDER BY country_name ASC
  • The query should return the missing countries in the population and country_area data.

Congratulations! Done with the challenge lab.

If you Want to Connect with Me:

Linkedin: https://www.linkedin.com/in/mayank-chourasia-38421a134/

Twitter: https://twitter.com/ChourasiaMayank.

Youtube: https://youtu.be/Luc03Dgs_yU

Google Developers Google Cloud Qwiklabs ChallengePost

I am Google Cloud Certified Associate Cloud Engineer and MS Azure Certifed. I have been working on different fields including Cloud Computing , Sap ABAP.