BigQuery: totals by day, totals by date range in one table.

June 22, 2019


An article by Sergey Matrosov

Let’s say that you want to know the conversion amount in a specific date range for streaming in BigQuery: the total amount for each day and the total for all days as well. You could do this by making either two queries resulting in two tables, or one query resulting in one table. The first solution is not particularly user-friendly (you have to click from one tab to another etc.), but it is very quick to carry out. The second solution sounds pretty easy too, although it could involve a slight challenge; however, in terms of output, one table is produced.

Our conversions include ‘Registrations’, thus our streaming table looks like this:

Date Event
2019-01-01 Registration
2019-01-01 Activation
2019-01-01 Registration
... ...
2019-01-02 Registration
... ...

And so on.

Obtaining data only by day is really that easy:

SELECT
  date,
  COUNT(event) AS Conversions
FROM
  `your-project.your_dataset.table`
WHERE
  event = 'Registration'
  AND date BETWEEN ‘2019-01-01’
  AND ‘2019-01-05’
GROUP BY
  date
Date Conversions
2019-01-01 10
2019-01-02 5
2019-01-03 6
2019-01-04 1
2019-01-05 2

Then, for the grand total (10 + 5 + 6 + 1 + 2 = 24 in our case), you might think that it’s all about one line of query in SELECT, but it’s not. Using “Group By” will only count events by each day.

To achieve your goal, you need to use “CROSS JOIN” with this query:

SELECT
  COUNT(event) AS Total,
FROM
  `your-project.your_dataset.table`
WHERE
  event = 'Registration'
  AND date BETWEEN ‘2019-01-01’
  AND ‘2019-01-05’

So, the full query will be:

SELECT
  t1.date AS Date,
  t1.Conversions AS Conversions,
  t2.Total AS Grand_Total
FROM (
  SELECT
  date,
  COUNT(event) AS Conversions
FROM
  `your-project.your_dataset.table`
WHERE
  event = 'Registration'
  AND date BETWEEN ‘2019-01-01’
  AND ‘2019-01-05’) AS t1
CROSS JOIN (
SELECT
  COUNT(event) AS Total,
FROM
  `your-project.your_dataset.table`
WHERE
  event = 'Registration'
  AND date BETWEEN ‘2019-01-01’
  AND ‘2019-01-05’) AS t2
ORDER BY
  Date ASC

And the result will be as expected:

Date Conversions Grand_Total
2019-01-01 10 24
2019-01-02 5 24
2019-01-03 6 24
2019-01-04 1 24
2019-01-05 2 24

Indeed, as a result, you will get a duplicate of the total in each row. However, the result clearer and more convenient, especially if it is used only in BigQuery’s interface.