One of my customers is having trouble with disk IO on an Azure Kubernetes Service container. They are using Apache Lucene, with catalogs stored in an Azure Storage File Share.

Azure Storage File Share is not the first option that comes to mind for transactional storage, but I didn’t have data on its performance. This led me to produce some and compare it with other storage options: “local” storage on the agent with different VM SKUs, virtual nodes¹, Azure Storage File Share standard and premium, and Azure Disks standard and premium.

Let’s get this out of the way: unless there is a very good reason not to, solutions built for Kubernetes (and more particularly AKS) should target cloud-native storage services to maximize performance and lower cost: blob storage, Cosmos DB, Azure SQL, etc. This is not always an option though, particularly when integrating solutions like Lucene, or when lifting and shifting older applications.

The findings and conclusions in this post cover the latter situation: what performance to expect when the application has to go through a file API, depending on the storage solution chosen.

Conclusions

On average, if throughput is a concern, disk-based solutions provide better and more consistent results. If possible (mostly when the data is transient), use the agent’s disks. These are the results for 1Mb files:

Average files

If the application is write and read intensive (lots of small files), do not use Azure File Share as the storage solution: it seems to underperform on both read and write. Use disks instead. These are the results for 10kb files:

Small files

For applications that store larger files and are mostly write-heavy, file shares might work. These are the results for 100Mb files:

Large files

Findings

On consistency:

  • The size of the files being written has an impact on consistency. Large files tend to produce higher and more consistent bandwidth. Smaller files produce lower and less predictable bandwidth, especially on read. I’m guessing this is due to access time more than to bandwidth itself.
  • Write bandwidth is more predictable than read bandwidth. The same test can yield average read bandwidths that vary by up to a factor of two from one run to the next, especially on smaller files. I’m assuming this is partly due to caching.

Smaller files and larger files exhibit somewhat different behavior.

Results for reading on all options on various file sizes

Results for writing on all options on various file sizes

Larger files (>= 1Mb)

  • Write bandwidth is “relatively” similar for all the options I’ve tried: the fastest option (on agent, DS2 v2) is “only” 3 to 4 times faster than the slowest option (standard file share).

Three options for large files (10Mb and 100Mb files)

  • Read bandwidth for disks mounted as Persistent Volumes is similar to reading on the agent, surprisingly so for both standard and premium disks.
  • The file share is about 40 times slower at reading than the fastest option.
  • The v2 series seems to be faster at writing, the v3 series faster at reading.

V2 is faster at writing, v3 at reading large files (Avg on 10 & 100Mb files)

  • The non-S series (traditional hard drives) is faster on both read and write for large files.

Non S vs. S on large files (Avg on 10 & 100Mb files)

Smaller files

Results are much more varied for smaller files (10kb and 100kb):

Three options for small files (Avg on 10 and 100Kb files)

  • The file share is several orders of magnitude slower than the fastest options for reading small files (369 times slower for 100kb files, 1400 times slower for 10kb files).
  • Disks attached as Persistent Volumes are between 10% and 20% slower at reading than the agent’s disks, and the standard disk actually gave me write results much faster than writing on the agent. Given the levels of consistency I’ve observed, my conclusion is to assume no noticeable difference between disks on the agent and disks mounted as Persistent Volumes.
  • Virtual nodes are, surprisingly, the fastest option for reading, even faster than reading on the agent. I’m guessing there’s a lot of caching involved.
  • What really surprised me is that premium disks used as Persistent Volumes were consistently slower than standard ones in my tests. I double-checked everything; I don’t really understand this result.

Non S vs. S on small files (Avg on 10 & 100kb files)

  • The S VM is between 40% and 50% faster at reading than the non-S version (D2s v3 vs. D2 v3). For writing, this jumps to a 4 times difference:

V2 vs. v3 on small files (Avg on 10 & 100kb files)

  • The v2 series VM is between 10% and 15% faster than the v3 for both reading and writing.

Methodology

I have used two methods:

  1. using dd to write 1GB from /dev/zero to a file, then reading the file back into /dev/null. This is not necessarily very representative of a transactional load, but it already gives some idea of raw throughput. Since the results were consistent with method 2, I only used the second method, which worked better at larger scale (a sketch of the dd commands follows this list).

  2. using a small utility I built (sources on GitHub) that simulates:

    • writing a bunch of files a number of times
    • reading a bunch of files a number of times
    • writing, then reading a bunch of …
    • repeating with various file sizes and file counts (a rough shell sketch of this loop follows the screenshot below).
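
A minimal sketch of the dd method (the block size, path and flags here are illustrative, not the exact commands used):

```sh
# Write 1GB of zeroes to the volume under test and flush it to the device,
# so the reported bandwidth reflects the storage rather than the page cache.
dd if=/dev/zero of=/mnt/test/ddfile bs=1M count=1024 conv=fdatasync

# Read the same file back into /dev/null to measure read bandwidth.
dd if=/mnt/test/ddfile of=/dev/null bs=1M
```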

Screen cap of the script running
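
To give an idea of what the utility does, here is a rough shell equivalent of its loop (a sketch only, not the actual code from the repository; the target path, file counts and sizes are placeholders):

```sh
#!/bin/sh
# Rough sketch of the benchmark loop; not the actual utility.
TARGET=/mnt/test   # placeholder mount point for the volume under test

for size_kb in 10 100 1024; do   # 10kb, 100kb and 1Mb files
  count=10                       # files per pass (placeholder)

  # Write pass: create $count files of $size_kb kilobytes each and time it.
  start=$(date +%s)
  i=1
  while [ "$i" -le "$count" ]; do
    dd if=/dev/zero of="$TARGET/file_$i" bs=1024 count="$size_kb" 2>/dev/null
    i=$((i + 1))
  done
  echo "write ${size_kb}kb x ${count}: $(( $(date +%s) - start ))s"

  # Read pass: read every file back into /dev/null and time it.
  start=$(date +%s)
  i=1
  while [ "$i" -le "$count" ]; do
    dd if="$TARGET/file_$i" of=/dev/null bs=1024 2>/dev/null
    i=$((i + 1))
  done
  echo "read ${size_kb}kb x ${count}: $(( $(date +%s) - start ))s"
done
```

The real utility also runs the combined write-then-read pass and reports bandwidth rather than elapsed time, which is what appears in the raw results below.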

The numbers I provide shouldn’t be taken at face value; the point is to compare solutions and get an idea of the impact of moving from one to another.

The options I have tried are:

  • Storing on agent
    • D2 v3
    • D2s v3
    • DS2 v2
    • DS3 v2
  • Using Persistent Volumes (a sample volume claim is sketched after this list)
    • Standard disk
    • Premium disk
  • Using Azure File Share
    • Standard storage account
    • Premium file storage
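
For the Persistent Volume and file share options, the volumes were claimed through Kubernetes storage classes. As an illustration only (a sketch assuming the AKS built-in storage classes, not the exact manifests used for these tests; the premium file share option may need its own custom class), a claim looks like this:

```sh
# Sketch: claim a volume from one of the AKS built-in storage classes.
# Swap storageClassName between "default" (standard managed disk),
# "managed-premium" (premium managed disk) and "azurefile" (standard file share).
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: benchmark-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: managed-premium
  resources:
    requests:
      storage: 100Gi
EOF
```

The claim is then mounted into the test pod like any other volume; the agent and virtual-node runs simply used the container’s local file system instead.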

There are other solutions that could be leveraged; top of mind are:

  • Azure NetApp Files, which I’ve heard is fast.
  • Using a VM scale set as a file share, which might be faster than Azure File Share (or not, I haven’t tried).
  • Using Blob Fuse, which mounts blob storage as a file system. Since this is still pretty experimental, I was hesitant to include it.²
  • Using a more “custom” approach to sync files through blob storage or another solution; this is also out of scope for this test.

One last thing: this test doesn’t cover concurrency, which might change the results completely. There are tons of other parameters that might also affect the results, such as other containers on the same agent, CPU and memory usage, network usage, etc.

Raw results

The raw data is available in this spreadsheet or in CSV:

| Run name | File count | File size | Iterations | Write (Mb/s) | Read (Mb/s) | Write/Read (Mb/s) |
|---|---|---|---|---|---|---|
| aci | 100 | 100kb | 10 | 2.997868695 | 1097.954522 | 2.956693616 |
| aci | 5 | 100Mb | 1 | 30.03283057 | 1206.292407 | 30.09522424 |
| aci | 1000 | 10kb | 10 | 0.465760469 | 331.2872079 | 0.452941806 |
| aci | 5 | 10Mb | 2 | 38.06376625 | 1425.114508 | 30.13759227 |
| aci | 10 | 1Mb | 10 | 27.9757646 | 1312.883854 | 28.62050538 |
| agent-d2sv3 | 100 | 100kb | 10 | 7.02311993 | 984.2899211 | 6.626219992 |
| agent-d2sv3 | 5 | 100Mb | 1 | 21.41514132 | 1977.489838 | 20.98751602 |
| agent-d2sv3 | 1000 | 10kb | 10 | 1.036217939 | 222.2896475 | 1.062938355 |
| agent-d2sv3 | 5 | 10Mb | 2 | 25.0953164 | 1580.325579 | 24.6909648 |
| agent-d2sv3 | 10 | 1Mb | 10 | 33.75672936 | 1594.507241 | 33.12407749 |
| agent-d2v3 | 100 | 100kb | 10 | 3.381719841 | 686.6663552 | 3.073505972 |
| agent-d2v3 | 5 | 100Mb | 1 | 33.35587701 | 1995.764987 | 34.8093001 |
| agent-d2v3 | 1000 | 10kb | 10 | 0.396589076 | 148.4721678 | 0.368654442 |
| agent-d2v3 | 5 | 10Mb | 2 | 104.2598163 | 1775.347524 | 59.58370178 |
| agent-d2v3 | 10 | 1Mb | 10 | 28.33545977 | 1491.724657 | 30.64092167 |
| agent-ds2v2 | 100 | 100kb | 10 | 8.837920913 | 912.8002831 | 8.647334389 |
| agent-ds2v2 | 5 | 100Mb | 1 | 39.58529683 | 1873.804513 | 38.12266783 |
| agent-ds2v2 | 1000 | 10kb | 10 | 1.137920705 | 252.4788477 | 1.199798161 |
| agent-ds2v2 | 5 | 10Mb | 2 | 49.80793314 | 1401.63206 | 49.94369847 |
| agent-ds2v2 | 10 | 1Mb | 10 | 42.45767617 | 1119.040132 | 43.50022092 |
| agent-ds3v2 | 100 | 100kb | 10 | 4.025262264 | 878.5205352 | 5.879947761 |
| agent-ds3v2 | 5 | 100Mb | 1 | 54.2693314 | 1964.450518 | 52.4764299 |
| agent-ds3v2 | 1000 | 10kb | 10 | 0.990691604 | 235.7335627 | 0.96675677 |
| agent-ds3v2 | 5 | 10Mb | 2 | 66.45884984 | 1678.446564 | 66.4663062 |
| agent-ds3v2 | 10 | 1Mb | 10 | 39.65566353 | 1612.689935 | 44.45427575 |
| disk-premium | 100 | 100kb | 10 | 3.653394467 | 641.6782599 | 3.938886895 |
| disk-premium | 5 | 100Mb | 1 | 23.61841942 | 1433.710667 | 23.04382916 |
| disk-premium | 1000 | 10kb | 10 | 0.389595126 | 134.704748 | 0.393338471 |
| disk-premium | 5 | 10Mb | 2 | 20.14580608 | 1256.431358 | 20.1977958 |
| disk-premium | 10 | 1Mb | 10 | 15.41859655 | 1100.795435 | 20.32073186 |
| disk-standard | 100 | 100kb | 10 | 10.25766981 | 816.6653617 | 11.27205409 |
| disk-standard | 5 | 100Mb | 1 | 30.19629719 | 1410.130547 | 31.01925532 |
| disk-standard | 1000 | 10kb | 10 | 1.281182606 | 153.678022 | 1.292590568 |
| disk-standard | 5 | 10Mb | 2 | 26.7314816 | 1099.578971 | 31.46868078 |
| disk-standard | 10 | 1Mb | 10 | 21.43916896 | 1324.01404 | 31.10205627 |
| fileshare-premium | 100 | 100kb | 10 | 2.315095252 | 3.844586575 | 1.726979666 |
| fileshare-premium | 5 | 100Mb | 1 | 66.10871965 | 95.87076559 | 39.93052887 |
| fileshare-premium | 1000 | 10kb | 10 | 0.236020722 | 0.421238975 | 0.179359775 |
| fileshare-premium | 5 | 10Mb | 2 | 58.83980727 | 94.17889658 | 36.99116799 |
| fileshare-premium | 10 | 1Mb | 10 | 13.09658377 | 35.95597407 | 10.85203081 |
| fileshare-standard | 100 | 100kb | 10 | 1.48032832 | 2.968062754 | 1.220750176 |
| fileshare-standard | 5 | 100Mb | 1 | 80.21506879 | 68.30682837 | 28.62737415 |
| fileshare-standard | 1000 | 10kb | 10 | 0.159751519 | 0.237704611 | 0.112891072 |
| fileshare-standard | 5 | 10Mb | 2 | 34.15769865 | 42.32723072 | 27.37672398 |
| fileshare-standard | 10 | 1Mb | 10 | 12.32456338 | 21.75097762 | 6.945325536 |
| full-blobfuse | 5 | 100Mb | 1 | 20.93519553 | 1409.427775 | 20.88581294 |
| full-blobfuse | 5 | 10Mb | 2 | 24.85523133 | 1236.433236 | 24.31815008 |
| full-blobfuse | 1000 | 10kb | 10 | 1.174692873 | 139.6828853 | 1.135266581 |
| full-blobfuse | 10 | 1Mb | 10 | 28.86070769 | 1228.271871 | 34.2174495 |
| full-blobfuse | 100 | 100kb | 10 | 8.71356157 | 663.2914024 | 7.873257694 |

Notes

  1. Azure Container Instances, referred to as ACI in the rest of the document. 

  2. It gave me results that are for all intents and purposes identical to the agent it was running on, which tells me I might have done something wrong with the configuration, or that it doesn’t take into account the time to replicate to blob storage. So I didn’t show the results here.