Nate Smith avatar
Categories:

Remove query params from URL with BigQuery SQL

This is a continuation of my GA4 real-time analytics project using BigQuery, Google Analytics and Looker Studio.

Code to remove query params from URL

There is an article by Matt Mazur published in 2018 that includes the code needed to remove the query params.

His article includes the full BigQuery syntax to strip the params – so take a read. However, as I’m plugging the code into an existing query I only needed to take the Case statement, which looks something like this:

current_url,
  CASE
    WHEN STRPOS(current_url, "?") > 0 THEN SUBSTR(current_url, 0, STRPOS(current_url, "?") – 1)
    WHEN STRPOS(current_url, "#") > 0 THEN SUBSTR(current_url, 0, STRPOS(current_url, "#") – 1)
    ELSE current_url
  END AS url

Here’s an explanation of why you might need this code, and specifically how it’s going to help my dashboard project.

BigQuery tables contain URLs with parameters and fragments

One thing Google Analytics likes to do is to track URLs with their parameters and fragments included. Which can be useful for example, WordPress uses ?s= as it’s search parameter. And because GA will track the parameter it means you can see what words people are searching on your website and how many times.

However, this means GA will also track URLs with UTM data as a separate URL from the non-parameter. This isn’t good because it basically classes them as a separate page, when in reality we want the stats for both URLs to be classed as one page.

This is one reason to find a way to strip the Parameters from the URL in BigQuery and make them ‘clean’.

However, I also have a very specific reason why I want clean URLs

Using ‘clean’ URLs as a join key to merge two data sets

As part of the big analytics and dashboard project I’m working on, we’re using BigQuery to merge data from Google Analytics with “meta” data we’ve manually added to pages on our site and then outputted to a feed.

The meta data I’m talking about here is information such as content owners, page purpose, department etc. Which will allow the dashboards to segment URLs for the marketing teams to analyse.

To merge the data I’m using the URL as the join key, because it makes the most sense. However, the join key looks for a match, which means the URLs won’t merge if they have parameters or fragments added.

Last modified:

Comments

Leave a Reply

Web Mentions

To respond on your own website, enter the URL of your response which should contain a link to this post’s permalink URL. Your response will then appear (possibly after moderation) on this page. Want to update or remove your response? Update or delete your post and re-enter your post’s URL again.

Search the website



Popular Categories