They could be titles of James Bond movies, but they are not. They are bugs in Intel processors (Spectre affects processors from other vendors too), bugs that have been there for ages. A few weeks ago the world learned that anyone with a coding background could get access to our data. All this because of a serious security flaw in a feature meant to make processors execute faster, a convenience for the end user. As said many times before, convenience kills security, and this is the latest example of that. I won’t go into what Spectre and Meltdown are or what the possible breaches are; there are more sites covering that than you have time to read this year. My goal with this article is to collect data from various sites and sources about real-life performance. I’ve read enough about what the impact could be, but I’d like to know what it actually is.
Spectre and Meltdown
For anyone just emerging from the woods and wondering what Spectre and Meltdown are all about, I have included some links to resources that explain a lot. I should include a link to the news bulletin about the Intel manager selling his stock in November as well, but I guess that is what you do when you understand your company will take a hit (sarcasm).
Theoretical performance hit
Intel released some data on what the performance hit could be. Microsoft did as well, although they never released any real data and only showed some ballpark numbers. If we first look at what Microsoft shared, they say that Windows 10 will be impacted by around 1-3%. For Windows Server and Hyper-V they only say the impact is significant. Significant can be 10% but also 20-30%; it is a pity they don’t share more insight, as they clearly have it. Intel shared more data: they ran a benchmark on several processor types. Their data has been published in a PDF; I took a screenshot of the first page (there are two) and published it here. The original PDF is found here: Intel benchmark PDF.
If 1 CPU cycle is 1 second
An interesting way to look at the issue is to scale the time a computer takes. With the patches things will get slower, but what is slower? Going from 0.3 nanoseconds to 0.9 nanoseconds is still damn fast. What if 1 CPU cycle were 1 second? What if we map it to real life? This table can be found in the book Systems Performance: Enterprise and the Cloud by Brendan Gregg.
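The scaling idea can be sketched in a few lines of Python. This is only an illustration, assuming one CPU cycle takes 0.3 ns as in the example above; the function name `scale_to_human` is made up for this sketch, and the only latencies used are the two numbers mentioned in the text.

```python
CYCLE_NS = 0.3  # assumed duration of one CPU cycle, taken from the text above


def scale_to_human(latency_ns, cycle_ns=CYCLE_NS):
    """Map a latency in nanoseconds to seconds on a scale where one CPU cycle lasts 1 second."""
    return latency_ns / cycle_ns


# The two values from the text: 0.3 ns and 0.9 ns.
for label, latency_ns in [("before", 0.3), ("after", 0.9)]:
    scaled = scale_to_human(latency_ns)
    print(f"{label}: {latency_ns} ns -> {scaled:.0f} s on the human scale")
```

On this scale 0.3 ns becomes 1 second and 0.9 ns becomes 3 seconds, which makes it easier to judge whether a slowdown is something you would actually notice.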
Vendors have created advisory pages with details and links to patches. Below is an overview of all the vendor pages.
- Cisco https://tools.cisco.com/security/center/content/CiscoSecurityAdvisory/cisco-sa-20180104-cpusidechannel
- NetApp https://security.netapp.com/advisory/ntap-20180104-0001/
- IBM https://www.ibm.com/blogs/psirt/potential-impact-processors-power-family/
- VMware https://www.vmware.com/security/advisories/VMSA-2018-0002.html
- Microsoft https://support.microsoft.com/en-us/help/4073119/protect-against-speculative-execution-side-channel-vulnerabilities-in
- Red Hat https://access.redhat.com/security/vulnerabilities/speculativeexecution
- Citrix https://www.citrix.com/blogs/2018/01/04/industry-wide-computer-processor-vulnerability-what-to-know-what-to-do/
- Aruba Networks http://www.arubanetworks.com/support-services/security-bulletins/
The proof of the pudding is in the eating: real-life numbers
A theory is great, but a theory is just a theory. I haven’t had the chance to measure performance at customers myself, so I went looking for others who did real-life testing. I’ve been scavenging the Internet for data posted by others that shows the impact of the patches.
Gordon Mah Ung updated his Surface Book and published the results in a blog post. Below is the test, where the left image shows the results before the patch was applied and the right one shows the results after. The big impact is seen in the 4K read and write test with high queue depth, where the impact is around 40%.
The other test he ran was Principled Technologies WebXPRT 2015 on the Surface Book.
Find the whole blog here: PCWorld
Another test I found was at VirtualizationHowto; they looked at the Windows 10 side but also took the hypervisor into account.
The screenshots above are from an unpatched Windows 10 system and from a patched Windows 10 system (not the security patch). The performance hit is not that substantial, although here too we see the 4K read and write hit coming back a little. The write impact is still 20%, while the read impact is very small. So if we also patch the hypervisor, how will that impact performance?
After the patch has been applied, performance drops even more. We started at 346.6 and are down to 136 for 4K read performance, a drop of roughly 61%. After VMware ESXi was patched, the security patch for Windows can be deployed. If we look at the numbers from all the tests, we see a general drop in performance. It seems that the impact is actually there.
Find the whole blog: VirtualizationHowto.com blog
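To put before and after numbers like these on one footing, a relative drop is easier to compare than raw scores. Here is a minimal sketch, using the two 4K read figures quoted above; the helper name `percent_drop` is my own and not something from the benchmark tool.

```python
def percent_drop(before, after):
    """Return how much lower `after` is than `before`, as a percentage of `before`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


# 4K read figures quoted above (units as reported by the benchmark tool).
drop = percent_drop(346.6, 136)
print(f"4K read dropped by {drop:.1f}%")
```

The same helper works for any pair of scores where higher is better; a negative result would mean the patched system scored higher.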
Trentent Tye (twitter.com/trententTye) has been testing his Citrix XenApp environment and posted some details on Twitter. He reports a 20-35% impact on CPU. I’ve been checking his feed since but haven’t found any new data yet.
So this is what I found so far. It seems there is an issue, but it also seems that the impact differs between scenarios. The questions we are all facing right now: when will we patch our systems, what will the impact be on our production environment, and how will we cope with that? I’m still on the lookout for more information and more data. If you are planning to apply patches, please patch, test and measure on your non-production environment first. You need to understand the impact before it kills your production environment.
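If you have no benchmark tool at hand, even a crude timing script run before and after patching gives a first impression. The sketch below is an assumption-laden stand-in, not a replacement for a real storage benchmark: `small_writes` is a hypothetical workload of unbuffered 4K writes to a temporary file, and `time_workload` simply takes the median of a few runs. Replace the workload with something that resembles your own environment.

```python
import os
import statistics
import tempfile
import time


def time_workload(workload, repeats=5):
    """Return the median wall-clock time in seconds over several runs of workload()."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def small_writes(block_size=4096, count=2000):
    """Hypothetical stand-in workload: many unbuffered 4K writes to a temp file."""
    block = b"\0" * block_size
    # buffering=0 makes every write() go to the operating system directly.
    with tempfile.TemporaryFile(buffering=0) as handle:
        for _ in range(count):
            handle.write(block)
        os.fsync(handle.fileno())


if __name__ == "__main__":
    median_seconds = time_workload(small_writes)
    print(f"median run time: {median_seconds:.4f} s")
```

Run it on the same non-production machine before and after patching, note both medians, and compare them. Results will vary with hardware and background load, so treat a single pair of numbers as an indication only.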