How to track traffic in Google Analytics without JS

May 7, 2017   

I’ve long wanted to hook up the RSS feeds on hnrss.org to Google Analytics so I could get a better sense of what kind of traffic the site got. But because the feeds are delivered as XML, I had to figure out a way to do it without JavaScript.

After some poking around yesterday, I finally found one.

Google Analytics has a Measurement Protocol API built for this sort of thing. You send an HTTP request (POST and GET both work) with the necessary parameters, and it’ll register the hit in your Analytics property.

Once I had tested it locally with curl and seen it all work, it was time to integrate it with the app.
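
For anyone who wants to poke at the request format before touching a server config, here’s a minimal Python sketch that only builds the hit URL (it doesn’t send anything). The parameter names (v, tid, cid, t, dp, uip, ua) are the ones used in this post; the helper name build_pageview_url and the sample IP and path are made up for illustration.

```python
from urllib.parse import urlencode

# Endpoint used in this post's NGINX config.
GA_COLLECT_URL = "https://www.google-analytics.com/collect"


def build_pageview_url(tid, cid, path, uip=None, ua=None):
    """Build a Measurement Protocol pageview URL (hypothetical helper)."""
    params = {
        "v": "1",         # protocol version
        "tid": tid,       # Analytics property ID, e.g. UA-XXXXXXXX-Y
        "cid": cid,       # client ID
        "t": "pageview",  # hit type
        "dp": path,       # document path
    }
    if uip is not None:
        params["uip"] = uip  # IP override
    if ua is not None:
        params["ua"] = ua    # user agent override
    # urlencode percent-encodes every value so it is safe in a query string.
    return GA_COLLECT_URL + "?" + urlencode(params)


# Sample values only; swap in your own property ID.
print(build_pageview_url("UA-XXXXXXXX-Y", "203.0.113.7", "/newest", uip="203.0.113.7"))
```

The printed URL can be pasted into curl to try a hit by hand. Because urlencode escapes each value, awkward characters in the path or user agent can’t break the query string.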

After searching around for a while, the eureka moment came when I discovered the post_action feature of NGINX. This enabled me to hit the Measurement Protocol API after serving the feed.

Here are the necessary bits that I had to add to my NGINX config to get it all working:

server {
  # the standard stuff (server_name, listen, etc.)

  location / {
    # no changes needed other than adding post_action
    post_action @GA;
  }

  # named locations have to live inside a server block
  # replace UA-XXXXXXXX-Y with your Analytics property ID
  location @GA {
    internal;
    # needed because the proxy_pass URL contains variables
    resolver 8.8.8.8 ipv6=off;
    proxy_pass https://www.google-analytics.com/collect?v=1&tid=UA-XXXXXXXX-Y&cid=$remote_addr&t=pageview&dp=$request_uri&uip=$remote_addr;
  }
}

Also, passing the User-Agent directly (e.g., ua=$http_user_agent) gave me all sorts of problems (I think it wasn’t being properly URL-encoded), so I sanitized it with a map:

# N.B. maps have to exist in the http context
map $http_user_agent $user_agent {
  ""                    empty;
  "~Android"            android;
  "~^Slackbot"          slackbot;
  "~^curl"              curl;
  "~^Feedbin"           feedbin;
  "~^Tiny Tiny"         ttrss;
  "~^NewsBlur"          newsblur;
  "~^Feedly"            feedly;
  "~^Go-http"           golang;
  "~^UniversalFeed"     feedparser;
  "~^Zapier"            zapier;
  "~^PHP"               php;
  "~^python-requests"   python-requests;
  "~^Mozilla"           mozilla;
  default               other;
}

Then I added &ua=$user_agent to the end of the proxy_pass URL. Not ideal, but good enough for me.
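
To see why the raw header is awkward and what the map buys you, here’s a rough Python stand-in for it. The patterns and labels are copied from the map above in the same order; the function name classify_user_agent and the sample User-Agent string are made up for illustration, and this only approximates how NGINX evaluates a map.

```python
import re
from urllib.parse import quote

# (pattern, label) pairs copied from the NGINX map, in the same order.
# Like NGINX's "~" prefix, these regexes are case-sensitive.
UA_RULES = [
    (r"Android", "android"),
    (r"^Slackbot", "slackbot"),
    (r"^curl", "curl"),
    (r"^Feedbin", "feedbin"),
    (r"^Tiny Tiny", "ttrss"),
    (r"^NewsBlur", "newsblur"),
    (r"^Feedly", "feedly"),
    (r"^Go-http", "golang"),
    (r"^UniversalFeed", "feedparser"),
    (r"^Zapier", "zapier"),
    (r"^PHP", "php"),
    (r"^python-requests", "python-requests"),
    (r"^Mozilla", "mozilla"),
]


def classify_user_agent(ua):
    """Approximate the map: exact "" first, then the first matching regex."""
    if ua == "":
        return "empty"
    for pattern, label in UA_RULES:
        if re.search(pattern, ua):
            return label
    return "other"


# Illustrative User-Agent string, not taken from real traffic.
raw = "Tiny Tiny RSS/17.4 (http://tt-rss.org/)"
print(quote(raw, safe=""))       # what the raw header would have to become
print(classify_user_agent(raw))  # the short label the map produces instead
```

The raw header is full of spaces, slashes, and parentheses that all need percent-encoding, while every label in the map is already safe to drop into a query string as-is. The trade-off is losing version details and anything not in the list.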