How does referrer grouping work?

Preface

[Image: keytiles-traffic-sources.jpg]

On the main overview screen of Keytiles TileView we show which sources your visitors are coming from. This is what you see in the image above.

In this article we describe how Keytiles categorizes these sources, so you can better understand what is going on and how.

"Referrer URL" - the base of everything

The most important term in this mechanism is the so-called "referrer URL". It is the basis of the algorithm.

Describing it in detail is out of scope for this article; you can easily learn more by searching for the term, as plenty has been written about it. For a basic understanding: when a visitor arrives at your website, e.g. by clicking a link you published on your Facebook page, the visitor's web browser passes the URL of that Facebook page to your website as the "referrer".

Our Tracking Script, which is embedded in your website, picks up this referrer URL and sends it to our cloud servers along with the hit, so Keytiles knows where the visitor arrived from.

The algorithm

So the "referrer URL" arrives at Keytiles along with the hit data, if there was a referrer.

Keytiles then, in a nutshell, does the following:

  1. If the referrerUrl is empty (no referrer), the hit is categorized as a "Direct" visit.
    This means your visitor came directly to your site without a visible source, for example by typing in your web address, using a bookmark, or clicking a link you sent them in an e-mail.
  2. If the referrerUrl points to a domain name of your own tracked website (Container admins can manage the list of these in the Container settings area), the hit is categorized as "Internal" traffic.
    This happens when a visitor navigates to a page/article from another of your own pages/articles, using a link or the menu you placed on your page.
  3. If neither of the above applies, Keytiles uses a global configuration, which we manage centrally, to classify the source. The referrerUrl is matched against this config to get the result.
    Using this config, on a high level, the following happens:
    1. The first block of the config tests the referrerUrl against known "Search" referrers, such as Google, Bing, etc. If none of these match, then
    2. The second block of the config tests the referrerUrl against known "Social" referrers, such as LinkedIn, Facebook, etc. If none of these match, then
    3. The visit is categorized as coming from a "Link", and Keytiles stores the host name extracted from the URL as the source of this external link.
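
The steps above can be sketched roughly as follows. This is only an illustration and not the Keytiles implementation: the function name `classify`, the two pattern lists, and the exact host comparison used for "Internal" traffic are assumptions made for the example.

```python
import re
from urllib.parse import urlsplit

# Hypothetical stand-ins for the centrally managed config; the real patterns differ.
SEARCH_REFERRERS = [
    ("Google", re.compile(r"^[a-z0-9.]*google\..*")),
    ("Bing", re.compile(r"^[a-z0-9.]*bing\..*")),
]
SOCIAL_REFERRERS = [
    ("Facebook", re.compile(r"^[a-z0-9.]*facebook\..*")),
    ("LinkedIn", re.compile(r"^[a-z0-9.]*linkedin\..*")),
]


def classify(referrer_url, own_domains):
    """Return a (category, source) pair for one hit."""
    # Step 1: no referrer at all means a "Direct" visit.
    if not referrer_url:
        return ("Direct", None)

    host = (urlsplit(referrer_url).hostname or "").lower()

    # Step 2: referrer points to one of your own tracked domains.
    # Assumption: a plain host-name comparison is enough for this sketch.
    if host in own_domains:
        return ("Internal", host)

    # Step 3.1: known "Search" referrers are tested first.
    for name, pattern in SEARCH_REFERRERS:
        if pattern.match(host):
            return ("Search", name)

    # Step 3.2: then known "Social" referrers.
    for name, pattern in SOCIAL_REFERRERS:
        if pattern.match(host):
            return ("Social", name)

    # Step 3.3: everything else is a "Link", with the host name as source.
    return ("Link", host)
```

For example, `classify("https://blog.other.org/post", {"example.com"})` falls through every check and ends up as a "Link" with `blog.other.org` as its source.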

The referrer classifying configuration

This is the configuration we talk about in #3 above. It is an ordered list of so-called "matchers".

Keytiles goes over this list from top to bottom. As soon as one of the configured "matchers" matches the referrerUrl, Keytiles stops there and the visit picks up the category provided by that matcher. This means the order in this config matters!
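
To see why the order matters, here is a minimal sketch of a "first match wins" loop. The two matchers and the helper name `first_match` are made up for the illustration and are not taken from the real configuration.

```python
import re

# Hypothetical matchers: (category, name, host pattern). Not the real Keytiles config.
SPECIFIC_FIRST = [
    ("Search", "Google", re.compile(r"^[a-z0-9.]*google\..*")),
    ("Link", "Anything", re.compile(r".*")),
]
# The same two matchers in the opposite order.
GENERIC_FIRST = list(reversed(SPECIFIC_FIRST))


def first_match(host, matchers):
    """Walk the list top to bottom and stop at the first matcher that matches."""
    for category, name, pattern in matchers:
        if pattern.match(host):
            return (category, name)
    return None
```

With `SPECIFIC_FIRST`, the host `www.google.com` is picked up by the Google matcher. With `GENERIC_FIRST`, the catch-all matcher is reached first and the Google matcher is never consulted.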

Understanding this configuration requires technical knowledge of regex patterns!

An item of the configuration looks like this:

{
	"classifierClassName": "SearchReferrerClassifier",
	"domainRegex": "^[a-z0-9.]*google\\..*",
	"pathRegex": ".*search.*",
	"name": "Google",
	...
}

The matching process takes into account the domainRegex and the pathRegex.

The referrerUrl - as every URL - looks like this:
http(s)://<domain name>/<path>
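
As a small illustration of these two parts, the Python standard library can split a URL like this. The helper name `split_referrer` is hypothetical, and whether Keytiles treats the query string as part of the path is not stated here, so this sketch simply leaves it out.

```python
from urllib.parse import urlsplit


def split_referrer(referrer_url):
    """Return the (domain name, path) parts of a referrer URL."""
    parts = urlsplit(referrer_url)
    # hostname is None when the URL carries no host; fall back to an empty string.
    return (parts.hostname or "", parts.path)
```

Calling `split_referrer("https://www.google.com/search?q=keytiles")` yields the domain name `www.google.com` and the path `/search`.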

The domainRegex and pathRegex expressions are matched against the corresponding parts of the referrerUrl. If it is a match, the classifierClassName takes over as a "plugin". We have several plugins implemented for this, but on a high level all you need to know is the name of the plugin. If the name of the plugin

  • has "SearchReferrer" in it then that plugin classifies as "Search"
  • has "SocialReferrer" in it then that plugin classifies as "Social"
  • has "LinkReferrer" in it then that plugin classifies as "Link"
    Note, however, that the "Link" categorization is currently used anyway if no match is found in this config; see #3.3 above!

The plugin is responsible for providing the name of the source. As you can see from the name attribute above, in this case it would be "Google" within the category "Search".

Everything else you see in the config is detail that is not relevant to the classification process from your point of view, so we do not go deeper.
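
Putting the pieces of this section together, a single config item could be applied roughly like this. The sketch makes several assumptions that this article does not confirm: that both regexes must match, that each one has to match its whole part, and that the path excludes the query string. The helper names `category_for` and `apply_item` are hypothetical.

```python
import re
from urllib.parse import urlsplit

# The example item from this article; the elided fields ("...") are left out.
ITEM = {
    "classifierClassName": "SearchReferrerClassifier",
    "domainRegex": r"^[a-z0-9.]*google\..*",
    "pathRegex": r".*search.*",
    "name": "Google",
}


def category_for(class_name):
    """Derive the category from the plugin name, as described in the list above."""
    for marker, category in (
        ("SearchReferrer", "Search"),
        ("SocialReferrer", "Social"),
        ("LinkReferrer", "Link"),
    ):
        if marker in class_name:
            return category
    return None


def apply_item(item, referrer_url):
    """Return (category, source name) if the item matches, otherwise None."""
    parts = urlsplit(referrer_url)
    host = (parts.hostname or "").lower()
    # Assumption: both regexes must match their part of the URL in full.
    if re.fullmatch(item["domainRegex"], host) and re.fullmatch(item["pathRegex"], parts.path):
        return (category_for(item["classifierClassName"]), item["name"])
    return None
```

Under these assumptions, a referrer such as `https://www.google.com/search?q=keytiles` is classified as "Search" with the source "Google", while `https://www.google.com/maps` does not match this item because its path contains no "search".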

Check and see the current configuration

The configuration Keytiles uses is public. You can view it at any time by executing a GET request against the following URL:
https://api.keytiles.com/api/v1/management/config/referrerclassification
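
If you prefer to do this from a script, here is a minimal sketch using only the Python standard library. It assumes the endpoint answers with a JSON body and needs no authentication, neither of which is stated in this article; the function names are hypothetical.

```python
import json
import urllib.request

CONFIG_URL = "https://api.keytiles.com/api/v1/management/config/referrerclassification"


def build_config_request():
    """Prepare the GET request without sending it."""
    return urllib.request.Request(CONFIG_URL, method="GET")


def fetch_config(timeout=10.0):
    """Send the request and decode the body.

    Assumption: the response body is JSON encoded as UTF-8.
    """
    with urllib.request.urlopen(build_config_request(), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_config()` performs the actual network request; the structure of the returned data is not described here, so inspect it before relying on any particular field.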

Please note: from time to time we review and adjust (= change) this configuration, based on our own verification process or on suspicions reported to us by our customers. So it is not constant; it evolves over time.

"What should I do if something is categorized wrong?"

Since we manage this configuration centrally, you cannot change it yourself.

Please report your suspicion to us by sending a quick e-mail to support@keytiles.com!
Thank you in advance for helping us keep this classification config up to date!

Notes and remarks

  • Referrer classification relies entirely on the referrerUrl provided by your visitor's web browser. If your visitor uses plugins that hide the referrer for privacy protection, Keytiles will not receive it, so the visit will be categorized as "Direct".