Is Service Fabric better than AKS “at scale”? Are they better than App Services? Under which conditions should you start considering an App Service Environment? What scales up and down the fastest? When do storage queues start to give up the ghost, and when should you start considering Event Hubs? Is table storage more scalable than Cosmos? Better than blob? Do you ever need more than two instances of App Service standard? Is dotnet core significantly faster than dotnet? What about node? Do you get more from the v3 series than the v2? Should you use large VMs or multiple small ones? Does performance vary with load balancers in the middle? Does it vary from one region to another? What is the cost of Docker? What is the cost of container orchestration over Docker? What is the most cost-effective way of deploying containers? Can you extrapolate the load that single-node clusters can handle to multi-node clusters? And most importantly, how much load can the free tier of App Services handle?

The answer is probably: it depends.

There are recommendations about which technology to use for which use case, but they are usually limited to SLAs, general guidance, and unfortunately hearsay and beliefs.

My plan is to compare different technologies under different scenarios, and observe which solution gives you the most chooch for your buck1. The goal is to build an empirical reference table that tells how two solutions compare, and when they start breaking.


After giving much thought to how to best generate load, I’ve decided to implement my own solution. Why not use an existing load-testing tool, you ask? Because I want it to be very flexible. I have multiple purposes: I want to generate load over HTTP. I want to generate it on queues. I want to generate it on the server itself. And I also want to understand what my input really was. What better way to achieve that than with something I crafted myself? And where is the fun in reusing stuff for my personal pet projects?

To begin with, I want to simulate load on a synchronous server. I will use a straightforward architecture built from the following components:

  • A client: my personal naming convention prefixes experiments with “exp”, and this one is about generating “load” → exp-load.

  • A server that needs to respond to the client, and possibly simulate some load and/or some calls to other services.

  • A result store where I can keep all the individual results. I was initially thinking of simply relying on various monitoring systems, but I prefer it to be linked to the client because that’s what really matters; plus not all solutions can be monitored by the same system, and I want repeatability.

  • A reporting system where I can crunch the results.

  • Monitoring, to explain what I see.

graph LR;

Client --> Server
Client --> Storage
Reporting --> Storage
Server --> Monitoring

A pretty revolutionary architecture, if you ask me.


Turns out I already built something like that. Bounce is a small CLI-configured HTTP server. It’s relatively lightweight, which means it should give a relatively stable base for comparing hosting solutions with each other, and it can be configured to simulate CPU load and proxy calls to other services.

It’s based on node. I might need to create an equivalent on dotnet to compare node, dotnet and dotnet core - but that’s for a later day.
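To give an idea of what such a server involves (the flag and variable names below are made up for illustration, not Bounce’s actual CLI), here is a minimal sketch in node-flavored TypeScript: respond to every request, optionally burning CPU for a configurable duration first.

```typescript
// Minimal sketch of a Bounce-like server. CPU_MS and PORT are my own
// hypothetical knobs, not Bounce's real configuration.
import * as http from "node:http";

// Busy-loop for roughly `ms` milliseconds to simulate CPU-bound work.
export function burnCpu(ms: number): number {
  const end = Date.now() + ms;
  let spins = 0;
  while (Date.now() < end) spins++;
  return spins;
}

const cpuMs = Number(process.env.CPU_MS ?? 0); // e.g. CPU_MS=50
const port = Number(process.env.PORT ?? 0);    // 0 = pick a free port

http
  .createServer((req, res) => {
    if (cpuMs > 0) burnCpu(cpuMs);
    res.writeHead(200, { "content-type": "application/json" });
    res.end(JSON.stringify({ path: req.url, burnedMs: cpuMs }));
  })
  .listen(port)
  .unref(); // don't keep the process alive just for the listener
```

The busy-loop is deliberately crude: the point is a predictable, tunable amount of CPU work per request, not realism.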


My client needs to be able to generate enough noise to give the servers a hard time. The solution I settled on generates load at three levels:

  • Linearly: repeat a task over and over again. Repetition frequency needs to be configurable.
  • Parallelly: repeat that task in multiple threads to maximize the use of my agent; said parallelism needs to be configurable for more control.
  • Scale-out-ly: repeat that task on multiple agents while storing the results in a common place. This needs to be finely controllable.

The task itself needs to be configurable: it could be an HTTP call, adding a message to a queue, or even just generating CPU load on the agent.

It can last for a given time, or a given number of iterations.
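Put together, the client’s core loop might look like the sketch below (all names are hypothetical, my own illustration): a task is repeated at a configurable interval, across a configurable number of concurrent workers, until an iteration budget or a time budget runs out. The scale-out level is then just running this on several agents that write to a shared store.

```typescript
// Sketch of the load-generation loop: linear (interval), parallel (workers),
// bounded by iterations or elapsed time. Names are illustrative only.
type Task = () => Promise<number>; // returns e.g. a measured latency

interface RunOptions {
  intervalMs: number;    // linear: delay between repetitions
  parallelism: number;   // parallel: number of concurrent workers
  maxIterations?: number;
  maxDurationMs?: number;
}

const sleep = (ms: number) => new Promise<void>((r) => setTimeout(r, ms));

export async function run(task: Task, opts: RunOptions): Promise<number[]> {
  const results: number[] = [];
  const deadline = Date.now() + (opts.maxDurationMs ?? Infinity);
  let remaining = opts.maxIterations ?? Infinity;

  // Each "worker" is an async loop; JS being single-threaded, the shared
  // `remaining` counter is decremented without races.
  const worker = async () => {
    while (remaining-- > 0 && Date.now() < deadline) {
      results.push(await task());
      if (opts.intervalMs > 0) await sleep(opts.intervalMs);
    }
  };

  await Promise.all(Array.from({ length: opts.parallelism }, worker));
  return results;
}
```

For example, `run(task, { intervalMs: 100, parallelism: 4, maxDurationMs: 60_000 })` would hammer the task from four workers, ten times a second each, for a minute.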

The results need to be stored in a place that can ultimately be changed.

And finally, it needs to be abstracted enough that I can change it later, and so that someone else can possibly modify and reuse it to test other platforms.
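That abstraction can be as small as an interface the client writes through. The `ResultStore` shape below is my own illustration of the idea: the loop never knows whether results land in memory, a file, or table storage.

```typescript
// Illustrative result-store abstraction: swap the backing store without
// touching the load-generation loop.
export interface StoredResult {
  timestamp: number;
  latencyMs: number;
  ok: boolean;
}

export interface ResultStore {
  append(result: StoredResult): Promise<void>;
  flush(): Promise<void>; // push any buffered results to the backing store
}

// Simplest implementation: keep everything in memory (handy for tests).
export class MemoryStore implements ResultStore {
  readonly results: StoredResult[] = [];
  async append(r: StoredResult): Promise<void> {
    this.results.push(r);
  }
  async flush(): Promise<void> {
    /* nothing buffered in this implementation */
  }
}
```

A real backend would buffer in `append` and batch-write in `flush`, since writing one row per request would itself become a bottleneck under load.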


My plan is to defer infrastructure-related decisions (where to store data and how to execute the server) for as long as possible, so the setup stays flexible and I can switch. Once the thing is built, I’ll run a single instance of it against five similarly sized machines, see how much load I can get a single server to generate, and make sure I can predict it correctly.

Then from there, start breaking stuff and drawing hasty conclusions.

That’s the intent.


  1. grown-ups call that “ROI”.