I like to have some insight into which pages get visibility on this blog. For most of its existence I used Google Analytics, but I don't like the prospect of participating in global tracking, so I wanted to get rid of it. I ended up building my own web analytics solution, and I use PowerBI to interpret the results.
This post is also an example of an Nginx Ingress configuration that serves two backend applications from the same subdomain.
Both the blog and the analytics solution are hosted on the same subdomain (www.feval.ca), so I use an Ingress configuration to route the traffic. Everything going to www.feval.ca/ana/ is directed to Minilytics, the rest is directed to the blog. An Azure Storage Table is used to store the events. Then I analyze the data with PowerBI, which offers a mobile application, allowing me to see all the dirty things you're doing to my blog from the comfort of my pocket-computer.
Minilytics is pure CRUD and doesn't have any business logic. I don't anticipate that it will get any updates, so I didn't bother automating the build or the tests.
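To give an idea of what "pure CRUD" means here, this is a minimal TypeScript sketch of turning an incoming event into a table row. It is not the actual Minilytics code: the type names, the `toEntity` helper and the key scheme (one partition per day, timestamp-based row key) are all assumptions for illustration. The only hard constraint is that Table Storage wants a partition key and a row key on every entity.

```typescript
// Incoming event, with the fields the client sends: page, user id, browser info.
interface IncomingEvent {
  page: string;
  userId: string;
  userAgent: string;
}

// Row as it would be written to the table. The key scheme is an assumption,
// not the one Minilytics actually uses.
interface EventEntity extends IncomingEvent {
  partitionKey: string;
  rowKey: string;
}

// Hypothetical helper: no business logic, just adds the two keys to the event.
function toEntity(
  event: IncomingEvent,
  receivedAt: Date,
  uniqueSuffix: string
): EventEntity {
  const iso = receivedAt.toISOString(); // e.g. 2019-03-05T10:20:30.000Z
  return {
    ...event,
    // One partition per UTC day, which makes filtering by date easy.
    partitionKey: iso.slice(0, 10),
    // Keys cannot contain "/", "\", "#" or "?", so the page path stays a
    // regular column and the row key is built from the timestamp instead.
    rowKey: `${iso}-${uniqueSuffix}`,
  };
}
```

The suffix only exists to keep two events received in the same millisecond from colliding. Any unique value would do.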
In my previous post I detailed how this blog is now deployed; this solution is relatively similar. It has a deployment and a service configured in a separate namespace, minilytics. I created a new Ingress configuration that gets deployed in the same namespace:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/rewrite-target: /
    nginx.ingress.kubernetes.io/configuration-snippet: proxy_set_header x-forwarded-ip "$remote_addr";
    certmanager.k8s.io/cluster-issuer: letsencrypt-prod
  name: minilytics-ingress
  namespace: minilytics
spec:
  rules:
  - host: www.feval.ca # Should be variabilized and use helm instead.
    http:
      paths:
      - backend:
          serviceName: minilytics-service
          servicePort: 80
        path: /ana
  tls:
  - hosts:
    - www.feval.ca
    secretName: tls-secret
The important thing here is the path, /ana: anything under that prefix goes to Minilytics.
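To make the routing concrete, here is a simplified TypeScript model of what the two annotations add up to: requests under /ana go to Minilytics with the prefix stripped, everything else goes to the blog untouched. This is a sketch, not what nginx actually runs. It assumes the older ingress-nginx behaviour (before version 0.22) where `rewrite-target: /` replaces the matched path with `/`, and it treats `/ana` as a whole path segment, which glosses over nginx's real matching rules. The `route` function and its types are made up for the example.

```typescript
type Backend = "minilytics" | "blog";

interface RoutedRequest {
  backend: Backend;
  upstreamPath: string; // the path as the backend sees it
}

// Prefix taken from `path: /ana` in the Ingress above.
const ANALYTICS_PREFIX = "/ana";

// Simplified model of the Ingress: prefix match plus rewrite to "/".
function route(requestPath: string): RoutedRequest {
  const matchesPrefix =
    requestPath === ANALYTICS_PREFIX ||
    requestPath.startsWith(ANALYTICS_PREFIX + "/");

  if (!matchesPrefix) {
    // Handled by the blog's own Ingress, path left as-is.
    return { backend: "blog", upstreamPath: requestPath };
  }

  // rewrite-target "/": drop the prefix, keep whatever follows it.
  const rest = requestPath.slice(ANALYTICS_PREFIX.length).replace(/^\/+/, "");
  return { backend: "minilytics", upstreamPath: "/" + rest };
}
```

So in this model a request for /ana/something reaches Minilytics as /something, which means the backend doesn't need to know it lives under a sub-path.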
There's not much to the client. It hooks into the onload of the page and executes in a timeout one second later, so as not to block the page loading. It's stealing the page you're looking at, your "userid" (a random number, I just want to see if you're ever coming back), and your browser information (I'm not sure why, I don't really care about it), and sends all of that to the backend.
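For illustration, here is roughly what such a client boils down to, as a TypeScript sketch rather than the actual script. The storage key, the payload shape and every name below are assumptions. The browser pieces (storage, timer, network call) are passed in as parameters so that the logic is easy to read and test on its own.

```typescript
// Minimal subset of the Web Storage API used to remember the user id.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// What gets sent to the backend: page, "userid" and browser information.
interface PageView {
  page: string;
  userId: string;
  userAgent: string;
}

interface TrackerEnv {
  page: string;
  userAgent: string;
  store: KeyValueStore;
  send: (view: PageView) => void;
  schedule: (fn: () => void, delayMs: number) => void;
}

const USER_ID_KEY = "minilytics-userid"; // hypothetical storage key

// Reuse the stored random number if there is one, otherwise draw a new one.
function getOrCreateUserId(
  store: KeyValueStore,
  random: () => number = Math.random
): string {
  const existing = store.getItem(USER_ID_KEY);
  if (existing !== null) {
    return existing;
  }
  const created = Math.floor(random() * 1e9).toString();
  store.setItem(USER_ID_KEY, created);
  return created;
}

// Meant to be called from the page's load handler: waits a bit, then sends.
function trackPageView(env: TrackerEnv, delayMs: number = 1000): void {
  env.schedule(() => {
    env.send({
      page: env.page,
      userId: getOrCreateUserId(env.store),
      userAgent: env.userAgent,
    });
  }, delayMs);
}
```

In the page, `trackPageView` would be called from the load handler with `location.pathname`, `navigator.userAgent`, `localStorage` and `setTimeout`, and `send` would POST the event as JSON to the backend under /ana/.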
The report is built with PowerBI Desktop. There's a native connector for Azure Table Storage, so it's pretty easy to use.
And there's a mobile application, which lets me check the report directly on my phone.
Now for the interesting part: how do the results compare to Google Analytics?
That's 20% more over the same period!?
SO AM I OVERESTIMATING??
I don't think so. Maybe there's some logic in Google Analytics to filter out some bad requests. But for the most part, I think the gap comes from more and more browsers killing Google Analytics: Firefox is blocking it by default. So a benefit of my own solution is better numbers.
And this blog is now free from Google’s dirty hands.