Identifying your users in Google Analytics while complying with section 7 of the terms of service

October 17, 2013 - Posted in analytics, Featured, web

In Google Analytics, we’re not allowed to track information such as usernames that would be identifiable by a third party such as Google. This is because of section 7 of the terms of service:

You will not (and will not allow any third party to) use the Service to track, collect or upload any data that personally identifies an individual (such as a name, email address or billing information), or other data which can be reasonably linked to such information by Google.

While sending Google information that is personally identifiable is simply not permitted, you can instead send an identifier which is known only to you.

This is “confirmed” by “RockinFewl” in the Google Product Forum:

Earlier this year, I had an interesting talk with a Google representative on this very matter.

He confirmed that you are actually allowed to track individuals, but that you cannot store personally identifiable information on Google’s servers or a GA cookie. More specifically: you cannot store names or ip addresses in a custom var, but you can store ids that need your backend to resolve into a person identification. He said that whatever you’re doing in your backend is beyond the responsibility of Google.

Google Analytics Forum | Best way to track individual users

It is further confirmed by Justin Cutroni, Analytics Evangelist at Google:

To add Google Analytics data to a data warehouse you need to add some type of primary key to Google Analytics. In most of the work that I’ve done this key is a visitor ID. This anonymous identifier usually comes from some other system like a CRM. [...] I know what you’re thinking, “You can’t store personally identifiable information in Google Analytics!” But this isn’t personally identifiable information.

Merging Google Analytics with your Data Warehouse

This also means that if you’re trying to identify users within the Google Analytics UI, you’ll have to search a separate system to look up who the user is. This is probably not very usable in practice, but if you use the Google Analytics API to create an integration with your backend system, you would be able to perform the lookup and display the correct user details in your reports.

Real world examples

So what’s allowed and what isn’t allowed?

In my day job, I’m normally using systems such as Atlassian Confluence or IBM Connections to create new product features or new product integrations. How can we track users and uniquely identify them in these systems?

The simplest way to track a user is to add a custom variable that records, say, the username, email address or another identifier.

Example: Atlassian Confluence

In Atlassian Confluence, you could easily use the following:

 _gaq.push(['_setCustomVar', 2, 'username', AJS.params.remoteUser, 1 ]); // ***** DO NOT USE THIS *****

This would be a violation of section 7 of the terms of service because it sends personally identifiable information to Google. A username is not always easy to link to a user’s actual identity, but in most cases it can be.

Example: IBM Connections

So, we know that usernames and email addresses are not allowed, but what else can we use?

In IBM Connections, each user is internally identifiable by a universally unique identifier (UUID). If this is not externally available, it would be a perfect identifier to send to Google.

We could then use:

_gaq.push(['_setCustomVar', 2, 'uuid', uuid, 1 ]);

However, most IBM Connections systems can look up a user from this UUID using the Profiles REST API, e.g.

https://connections.example.ibm.com/profiles/atom/profile.do?userid={GUID}

If this REST API is protected by authentication, then we are good. It would comply with section 7 of the terms of service.

If this REST API is not protected by authentication, then this would be a violation of section 7 of the terms of service because it sends information to Google…

which can be reasonably linked to such information by Google

Workaround

To get around this problem, you should create what I’m going to term a “Google Analytics identifier” (GAID) which is mapped to the username or UUID and is only used to send tracking data to Google Analytics. You’ll likely need to store this against the user object/user table in your backend system.

That way, you can use this:

 _gaq.push(['_setCustomVar', 2, 'gaid', gaid, 1 ]);

Provided this GAID is not publicly accessible, we are good. It would comply with section 7 of the terms of service.
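
As a minimal sketch of the backend side, assuming a Node.js backend: the snippet below mints a random GAID the first time a user is seen and stores the mapping in both directions. The function names `getOrCreateGaid` and `renderTrackingSnippet` and the in-memory maps are hypothetical stand-ins for your own user table; they are not part of Confluence, Connections or Google Analytics.

```javascript
// Sketch only: in-memory stand-ins for the user table in your backend.
const crypto = require('crypto');

const gaidByUsername = new Map(); // username -> GAID
const usernameByGaid = new Map(); // GAID -> username (for report lookups)

// Return the user's existing GAID, or mint a random one on first use.
function getOrCreateGaid(username) {
  let gaid = gaidByUsername.get(username);
  if (!gaid) {
    // 16 random bytes, hex-encoded. The value is not derived from the
    // username, so it cannot be recomputed from public data.
    gaid = crypto.randomBytes(16).toString('hex');
    gaidByUsername.set(username, gaid);
    usernameByGaid.set(gaid, username);
  }
  return gaid;
}

// Build the tracking call to embed in the page for the logged-in user.
function renderTrackingSnippet(username) {
  const gaid = getOrCreateGaid(username);
  return "_gaq.push(['_setCustomVar', 2, 'gaid', " + JSON.stringify(gaid) + ", 1]);";
}
```

The GAID here is random rather than a hash of the username, because a hash of a guessable value could be recomputed by a third party and linked back to the user.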

You will be able to happily track users, but you now need to generate some reports in your backend system that decode these GAIDs into useful data. Hold tight, that’s another story.
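
To give a rough idea of what that decoding step might involve, here is a sketch that joins report rows against the GAID mapping. The row shape, the sample values and the `decodeReport` function are all assumptions for illustration; a real integration would fetch the rows through the Google Analytics API and read the mapping from your backend.

```javascript
// Hypothetical lookup table, as it might be loaded from the backend user table.
const usersByGaid = new Map([
  ['3f9a1c', { username: 'jsmith', name: 'Jane Smith' }],
  ['b07e42', { username: 'akhan', name: 'Ali Khan' }],
]);

// Hypothetical report rows, with the 'gaid' custom variable as a dimension.
const reportRows = [
  { gaid: '3f9a1c', pageviews: 42 },
  { gaid: 'b07e42', pageviews: 7 },
  { gaid: 'ffffff', pageviews: 1 }, // no matching user in the backend
];

// Replace each GAID with the user details known only to the backend.
function decodeReport(rows, lookup) {
  return rows.map((row) => {
    const user = lookup.get(row.gaid);
    return {
      name: user ? user.name : '(unknown)',
      username: user ? user.username : null,
      pageviews: row.pageviews,
    };
  });
}
```

Because the join happens entirely in your backend, the personally identifiable details never reach Google.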

David is a senior developer and solutions architect at AppFusions based in Nottingham, England.

AppFusions solves mixed-technology integration problems. We bring engineering and business workflows together, so you can work better, faster and smoother.

AppFusions is headquartered in San Francisco, California and works with enterprise vendors and partners such as IBM, Jive, Dropbox, Box and Atlassian.

14 Comments

Shehzad 4 months ago

Hi David,

thank you for this fantastic post to clear things up. We are using Moodle as our VLE and have already installed GA but would very much like to be able to track individual user activity in Moodle. Each student at our college has a username and logs in with that, and it’s usually a 6 digit number. Could we use this as our custom variable? I’m not sure what you mean when you say if the username is available publicly? Any help is very much appreciated.
Thank you
Shehzad

David 4 months ago

Hi Shehzad,

If the username is something that a third party could understand and then use to get the user’s details, then that is no good. Perhaps Moodle has a UUID or similar that you could use.

Laura 3 weeks ago

Hi David,

Great article, I’ve been researching this a lot and yours is the first post that has made things a lot clearer!
I want to be able to send an email to my database and then track individual level behaviour from the email to the website (i.e. what did those who clicked through from the email get up to on the website).
Is this possible using your method if I used a GAID and linked it back to the email address (UUID)?
Is there an “off the shelf” solution or do you have to be a web developer genius and create an integration API from scratch?

Thanks,

Laura

David Simpson 2 weeks ago

Laura

You should be able to add individual user tracking on all your emails if you use something like MailChimp to send to your mailing list. This would give stats on who opened what and when.

You could add the same tracking ID as the “GUID” for Google Analytics integration on the links in your email. This should be achievable with a little JavaScript. Creating the lookup to convert from GAID back to email address is the tricky bit.

Jarrod 2 weeks ago

Hi David,

Interesting article, it has cleared up a few misconceptions that I had. I am hoping you might be able to answer this question:- what about storing other bits of information? We are looking at an application that does a postcode lookup and returns search results based on the postcode. Is it okay to store the postcode in GA (because it is all client-side with no database or server facility) or, extending that concept, a geo-location? My feeling is ‘no’, but I am not so sure now.

Thanks,
Jarrod

David Simpson 2 weeks ago

I’d say that storing postcodes or geo-location is a bit shady. This would not uniquely identify a person, but it at least reduces the person to a small cohort from which a third party may be able to guess the user.

A safer approach would be for you to create a “GAID” from which you could look up the postcode or geo-location.

Jarrod 2 weeks ago

Thanks for the prompt reply. I will go back to the powers-that-be with that info.

Don’t think we can use the gaid approach as there is no capacity to store the postcodes when a search is made.

Thanks again.
