August 14, 2021
An article by Sergey Matrosov
If you have a database, including a table with raw data, where the first column is the user-id/date and the two others are the name and value of your URL parameters as follows:
and you wish to transform it into a table, such as this:
I created solutions for two databases: Clickhouse and PostgreSQL. This article is based on the former, but here you can also find a query relating to PostgreSQL.
Firstly, we need to generate a data table via a query, using the columns, “userid”, “name” and “value”, as we did when we wrote simple attribution models via BigQuery.
As a result, we obtain a table as in our first screenshot. Then we need to use the CASE...WHEN construction to mutate the “name” column into columns that have as its name, the values of this column. The value of these new columns will be the value of the column, “value”.
Here is our result:
The problem is obvious: not only does it have many void cells, but we wanted to have one row for each user. If you considered GROUP BY, you are correct, but we need to use a proper function for this. Firstly, I took an array function called groupArray:
Don’t forget to use the GROUP BY userid statement:
Here is our modified table:
We are almost there; the last function is arrayJoin, which will unnest this array into the usual rows.
And here it is:
Hope this helped you. Good luck!