Solving the Potentially Violating Personal Data Error

One of the most common errors thrown by Facebook is the ominous “potentially violating personal data” error. These errors typically appear in the Events Manager and occasionally in your email, usually with the following text:

We detected potentially violating personal data in 1 event that may not comply with our terms and policies.

Compliant data protects your users and keeps their information usable and beneficial for your business. To make sure you send us compliant data in the future, review our personal data requirements.

To protect the privacy of your users, we haven’t stored this data in our ad systems and you won’t be able to view or use it. This may impact the performance of any ads optimizing on the affected events, including those in custom audiences or custom conversions.

These errors indicate that Facebook has decided to drop some or all of your events because they contain what it considers “personal” data. However, Facebook’s definition of personal data varies by business vertical and is highly idiosyncratic. Most data that we would consider personal is accepted by Facebook just fine — including names, emails, IP addresses, and so on. However, sometimes Facebook will decide that a specific piece of data is unacceptable in this specific case. Some examples from our clients include business_email, latitude/longitude, and search keywords. While this data can appear anywhere in the event, the most common location is the query string.

While it is impossible to predict in advance which pieces of data Facebook will consider “personal”, it is possible to adapt after receiving an error to prevent Deviate Tracking from sending that data to Facebook. This will reduce the data available to Facebook, but it’s better to have less data available than none at all — and if you send even a single piece of “personal” data, that’s what you’ll end up with: nothing. Facebook drops the entire event, treating it as if it had never happened, rather than excising the data it dislikes.

The Redaction Tag

The redaction tag is a special tag that you can use to redact query parameters from all your events, even those not sent through Deviate Tracking. With this tag, you can eliminate the “personal” data before Deviate Tracking sends data to Facebook, making it appear as if that data never existed and preventing Facebook from throwing an error.

You might be asking yourself, why a separate tag instead of a setting in Deviate Tracking? Simple: because with a tag, you only have to configure it once. Imagine what a hassle it would be to configure every Deviate Tracking tag to redact five query parameters — much easier to configure it once, and have it apply to all your tags.

Technical Explanation

This section explains how the redaction tag works — if you don’t care about the details, feel free to skip to the Setting Up the Redaction Tag section below.

Each Deviate Tracking tag actually sends two events: one for the browser, and one for the server. (These events are then merged together by Facebook, so it appears as if only one event was sent.) While the server event is easy to modify, the browser event is not — Deviate Tracking still relies on the traditional Facebook Pixel to send these events, and the Facebook Pixel does not offer a way to modify the URL sent to Facebook.
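To illustrate why the server event is the easy half, here is a rough sketch of redacting a URL on an event object before it is sent. It does not show how Deviate Tracking is actually implemented: the function name redactServerEvent and the shape of the event object are assumptions made for illustration. The only detail taken from Facebook is event_source_url, the field the Conversions API uses for the page URL.

```javascript
// Hypothetical sketch: strip blocked query parameters from a server event's URL.
// The event shape is an assumption, not Deviate Tracking's real payload.
function redactServerEvent(event, blockedKeys) {
  var url = new URL(event.event_source_url);

  blockedKeys.forEach(function (key) {
    url.searchParams.delete(key); // removes every occurrence of the key
  });

  // Return a copy so the original event object is left untouched.
  return Object.assign({}, event, { event_source_url: url.toString() });
}

var redacted = redactServerEvent(
  {
    event_name: "PageView",
    event_source_url: "https://example.com/search?q=shoes&email=a%40b.com"
  },
  ["email"]
);

console.log(redacted.event_source_url);
```

The browser event offers no comparable hook, because the Pixel reads the page URL on its own.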

However, there’s a clever workaround. Instead of instructing the Facebook Pixel which URL to send, what if we simply fooled it into thinking that it was on a different URL? In most cases, this would require us to redirect the user to the new URL. This is of course not acceptable — aside from the issues of duplicate events, such redirection would double load times for the user and potentially break any dynamically generated content on the page, since websites often use query parameters to determine what information to render on the page.

Luckily, the history.replaceState function allows us to change the URL of a page without causing a redirect or a reload. That is, it allows us to fool the Facebook Pixel into thinking that it’s on https://example.com when it’s actually on https://example.com?violating=data. The function is supported in all modern browsers, as well as Internet Explorer 10 and later.
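As a minimal sketch of the idea (not the redaction tag itself), the snippet below computes the query-free form of a URL and, when run in a browser, swaps it into the address bar with history.replaceState. The helper name stripQuery is hypothetical, and the sketch assumes the URL constructor is available, which holds in modern browsers but not in Internet Explorer. Note that it removes the entire query string and any hash, not just selected parameters.

```javascript
// Hypothetical helper: return the URL without its query string or hash.
function stripQuery(href) {
  var url = new URL(href);
  return url.origin + url.pathname;
}

// Only touch the address bar when running in a browser.
if (typeof window !== "undefined" && window.history && window.history.replaceState) {
  // The new URL must be same-origin; no navigation or reload is triggered.
  window.history.replaceState({}, "", stripQuery(window.location.href));
}
```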

With the history.replaceState function, redacting the parameters becomes a simple programming problem. And because the redaction is done through JavaScript, it can be added to your GTM container as a Custom HTML tag.

Setting Up the Redaction Tag

This tutorial requires you to make some minor code modifications, but does not require any programming knowledge. As long as you can follow the instructions carefully, you can complete this tutorial and start redacting query parameters on your site.

Step 1: Create the tag

  1. Go to your GTM container.
  2. Click the Tags tab on the left
  3. Click the blue New button near the top right of the tag list

Step 2: Configure the tag type

  1. In the tag name field at the top left, enter “Query Parameter Redactor”
  2. Click the Tag Configuration box
  3. Click the search bar and search for “Custom HTML”. Choose the Custom HTML Tag option
  4. In the resulting HTML box, paste the code below
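<script>
// Query parameters to remove from the URL. Add one line per parameter.
var badQueries = {
  "test": true
};

// Keep every non-empty "key=value" pair whose key is not listed above.
var newQueries = window.location.search.slice(1).split("&").filter(function (query) {
  var key = query.split("=")[0];

  return query !== "" && badQueries[key] !== true;
});

// Rebuild the URL without the redacted parameters.
var newUrl = window.location.origin + window.location.pathname;

if (newQueries.length > 0) {
  newUrl += "?" + newQueries.join("&");
}

// Update the address bar without reloading the page.
window.history.replaceState({}, "", newUrl);
</script>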

Step 3: Define your redactions

For this step, you will need to slightly modify the code you just pasted in. Find the badQueries block near the top of the code and replace its contents with the parameters you want to redact, one per line.

For example, if you want to redact the parameters “email”, “latitude”, and “longitude”, you would need to modify the code to look like this:

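<script>
// Query parameters to remove from the URL. Add one line per parameter.
var badQueries = {
  "email": true,
  "latitude": true,
  "longitude": true
};

// Keep every non-empty "key=value" pair whose key is not listed above.
var newQueries = window.location.search.slice(1).split("&").filter(function (query) {
  var key = query.split("=")[0];

  return query !== "" && badQueries[key] !== true;
});

// Rebuild the URL without the redacted parameters.
var newUrl = window.location.origin + window.location.pathname;

if (newQueries.length > 0) {
  newUrl += "?" + newQueries.join("&");
}

// Update the address bar without reloading the page.
window.history.replaceState({}, "", newUrl);
</script>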

If you wanted to redact only “email”, you would modify the code to look like this:

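<script>
// Query parameters to remove from the URL. Add one line per parameter.
var badQueries = {
  "email": true
};

// Keep every non-empty "key=value" pair whose key is not listed above.
var newQueries = window.location.search.slice(1).split("&").filter(function (query) {
  var key = query.split("=")[0];

  return query !== "" && badQueries[key] !== true;
});

// Rebuild the URL without the redacted parameters.
var newUrl = window.location.origin + window.location.pathname;

if (newQueries.length > 0) {
  newUrl += "?" + newQueries.join("&");
}

// Update the address bar without reloading the page.
window.history.replaceState({}, "", newUrl);
</script>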

Take careful note of the commas: every entry in badQueries except the last one must end with a comma.

Step 4: Define the tag’s timing

In this step, we configure the redaction tag to always fire before any other tags.

  1. Click the Advanced Settings accordion row, underneath the HTML box
  2. Set the Tag firing priority to 1. (Tags with higher priority numbers fire first and the default is 0, so this value must be higher than the priority of any other tag in the container)
  3. Set the Tag firing options to Once per page

Step 5: Configure triggers

In this step, we set the tag to fire on all pages.

  1. Click the Triggering box.
  2. Choose All Pages (should appear near the top)

Step 6: Save

  1. Click the blue Save button in the top right

Congratulations! That’s it! You now have a redaction tag in your container.

Troubleshooting the Redaction Tag

Because the redaction tag alters the query string, it’s possible that it may break the page if the page accesses the query string after the redaction tag fires. If you notice that part of the page stopped working, it’s worth disabling the redaction tag and seeing if the page starts working again.
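When tracking down which part of the page depends on a redacted parameter, it can help to preview the redaction without changing the address bar. The helper below is a hypothetical debugging aid, not part of the tag: the name previewRedaction and its return shape are assumptions. It applies the same key-based filtering to any URL string and reports which parameters would be removed. It assumes the URL constructor is available, so it is meant for a modern browser console.

```javascript
// Hypothetical debugging helper: preview a redaction without touching the page.
function previewRedaction(href, badQueries) {
  var url = new URL(href);
  var kept = [];
  var removed = [];

  url.search.slice(1).split("&").forEach(function (query) {
    if (query === "") {
      return; // no query string, or a stray "&"
    }
    var key = query.split("=")[0];
    if (badQueries[key] === true) {
      removed.push(query);
    } else {
      kept.push(query);
    }
  });

  // Like the tag, this rebuilds the URL from origin and path only.
  var result = url.origin + url.pathname;
  if (kept.length > 0) {
    result += "?" + kept.join("&");
  }

  return { url: result, removed: removed };
}

var preview = previewRedaction(
  "https://example.com/store?email=a%40b.com&page=2",
  { "email": true }
);

console.log(preview.url);
console.log(preview.removed);
```

In the browser console, calling previewRedaction(window.location.href, { "email": true }) with your own parameter list shows what the tag would strip from the current page.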