-
Notifications
You must be signed in to change notification settings - Fork 5.8k
8283199: Linux os::cpu_microcode_revision() stalls cold startup #7825
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8283199: Linux os::cpu_microcode_revision() stalls cold startup #7825
Conversation
👋 Welcome back shade! A progress list of the required criteria for merging this PR into |
Webrevs
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What versions of Linux have /sys/devices/system/cpu/cpu0/microcode/version
? And is it present in virtualized environments? What cost is there for systems that always have to take the second path?
Thanks,
David
break; | ||
sscanf(data, "%x", &result); | ||
} | ||
fclose(fp); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The diff would be much simpler if you added a return here and so avoided the need for the if statement below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See new commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that change didn't really help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I don't see what you mean then, so I am open for clear suggestion how would you write it.
I looked through Linux kernel sources, and I believe it was added by torvalds/linux@9a4b9ef back in 2006, and landed in 2.6.19. So even a relatively ancient RHEL 6 already has it.
One of my x86_64 QEMU nodes does not have it, apparently. There is still
So, emulating the missing sysfs file... Cold start:
Hot start:
So I'd say we pay about 1 ms penalty if sysfs file does not exist. Instrumenting the sysfs read path shows the entire access to missing sysfs entry takes <100 us. I would chalk it up as "pretend this is the actual performance cost for JDK-8249672 work", push this follow-up, and actually look at moving all this mess out of the startup sequence with JDK-8283200. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice find! Most startup tests of ours are continuous so I can't say we detected this regression.
@shipilev This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 58 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
Interesting find. Do you know why access to /proc is slower than to /sys/devices? Is it all /proc accesses or just /proc/cpuinfo? My first thought was that since we use /proc for many things, we'd probably end up paying for it anyway. Just idle thoughts, I'm fine with the patch in general (had no close look yet). |
My "Oracle Linux Server release 7.9" doesn't have it. But I'm not sure if I'm virtualized or not ... I suspect yes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The block structure still makes the diff a lot more complex than it needs to be given you are just inserting a new block of code. But the functional change seems okay.
I suspect that's because we generate the entirety of
|
I further suspect this is where the 1 second "cache" comes from: getting frequency from APERF snapshot: https://github.com/torvalds/linux/blob/master/arch/x86/kernel/cpu/aperfmperf.c#L33 |
Makes sense. And it does it for every core. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
Thanks.
Thanks for reviews! /integrate |
Going to push as commit 1443f6b.
Your commit was automatically rebased without conflicts. |
Noticed this when staring at timeline profiles for JVM startup. If you run any small startup workload on Linux, there is a significant time gap where JVM does nothing. I pinpointed that to fopen of
/proc/cpuinfo
due to JDK-8249672.This does not reproduce if you run startup workloads continuously, as it looks as if
/proc/cpuinfo
is cached for about a second, and stalls reads after that, I suspect for CPU info updates, like frequency. This reproduces on at least two of my systems running Linux kernels 5.4 and 5.15.Observe:
It directly impacts JVM startup:
Without the sleep:
There is another way to do this: read
/sys/devices/system/cpu/cpu0/microcode/version
, this is what this patch does. Withsleep 1
:Which means it improves startup time from ~45ms to ~25ms, or about 1.8x!
os::cpu_microcode_revision()
is currently used to generate the VM features string. It raises a bigger question if VM features string should be generated on startup, but it seems to take negligible time otherwise, see JDK-8283200. Given JDK-8249672 had been backported to update releases, let's do a pointed fix for this regression first.Additional testing:
tier1
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/7825/head:pull/7825
$ git checkout pull/7825
Update a local copy of the PR:
$ git checkout pull/7825
$ git pull https://git.openjdk.java.net/jdk pull/7825/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 7825
View PR using the GUI difftool:
$ git pr show -t 7825
Using diff file
Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/7825.diff