July 28, 2021
An article by Sergey Matrosov
Parsing HTML with regular expressions is a really bad idea, but if you are using a tracking template for your ads, I’m sure your URL has well-defined, structured UTM parameters. Therefore, this method may actually be used in this case. Here is a quick tip on how to get the value of UTM parameters, using SQL and REGEXP:
The pattern is ‘utm_parameter=([^&]*)‘. It will take everything, starting from the equal sign ‘=’ following ‘parameter’, as far as the ampersand, ‘&’.
The main feature of this extraction is that the order of the UTM parameters is not at all important, therefore, if we rewrite the referrer as:
this_is_referrer/?&utm_medium=cpc&utm_campaign=TEST&utm_source=google
the result of our query will be the same.
This means that if someone changes the order of the UTMs, either intentionally or by accident, the query won’t need any fixes.
The full code is here.