8283726: x86_64 intrinsics for compareUnsigned method in Integer and Long #9068
/label hotspot-compiler
Please add a microbenchmark and show its results.
I have added a benchmark for the intrinsic. The result is as follows, thanks a lot:
Good. I have submitted testing.
You need a second review.
@merykitty This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 321 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@vnkozlov, @jatin-bhateja) but any other Committer may sponsor as well. ➡️ To flag this PR as ready for integration with the above commit message, type |
Tier1-4 testing passed with no new failures. I suggest pushing it into JDK 20 after the fork, once you have a second review.
@merykitty Could you please also add a microbenchmark where the compareUnsigned result is stored directly in an integer, and show its performance?
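For readers following the request, the two usage shapes being contrasted can be sketched roughly as below. This is only an illustration, not the benchmark from the PR: the class and method names are hypothetical, and a real measurement would need a harness such as JMH.

```java
import java.util.Arrays;

// Hypothetical illustration of the two usage shapes; not the PR's benchmark.
public class CompareUnsignedShapes {
    // Shape 1: the comparison result feeds a branch directly.
    static int countBelow(int[] xs, int[] ys) {
        int count = 0;
        for (int i = 0; i < xs.length; i++) {
            if (Integer.compareUnsigned(xs[i], ys[i]) < 0) {
                count++;
            }
        }
        return count;
    }

    // Shape 2: the three-way result is stored as an int value,
    // so it has to be materialized in a register.
    static void storeResults(int[] xs, int[] ys, int[] out) {
        for (int i = 0; i < xs.length; i++) {
            out[i] = Integer.compareUnsigned(xs[i], ys[i]);
        }
    }

    public static void main(String[] args) {
        int[] xs = {1, -1, 7, 0};
        int[] ys = {2, 1, 7, -1};
        int[] out = new int[xs.length];
        storeResults(xs, ys, out);
        System.out.println(countBelow(xs, ys));
        System.out.println(Arrays.toString(out));
    }
}
```

Only the sign of each stored result is guaranteed by the `Integer.compareUnsigned` specification, so code consuming shape 2 should test the sign rather than compare against a specific value.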
Thanks @sviswa7 for the suggestion, the results of getting the value of
__ cmpl($src1$$Register, $src2$$Register);
__ movl($dst$$Register, -1);
__ jccb(Assembler::below, done);
Placing the compare directly adjacent to the conditional jump lets the in-order front end trigger macro-fusion.
Kindly refer to section 3.4.2.2 of Intel's optimization manual.
I realised that by swapping the mov and the cmp instructions, the rule needs to have dst different from src1 and src2, which increases register pressure.
I do not follow your comment. Allocation decisions are based purely on LRG interferences and the data-flow attributes attached to operands, and are agnostic to the contents of the encoding block.
Your suggestion requires an additional TEMP dst for the match rule. Thanks.
Yes, macro-fusion is a fine microarchitectural optimization that can reduce load on the entire execution pipeline, and it is deterministic for specific pairs of cmp + jump instructions. You have aggregated the destination's defs and uses towards the tail, which avoids the TEMP attribute on the destination operand and may save a redundant spill, though only in blocks with high register pressure. I am OK with the existing handling.
Thanks for your explanations.
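The trade-off discussed in this thread can be modelled in plain Java. The sketch below is a hypothetical model, not HotSpot code: registers are slots in an `int[]`, and the tail that turns the "not equal" condition into 0 or 1 is an assumption, since the excerpt only shows the first three instructions. It illustrates why comparing first allows dst to share a register with src1, while writing -1 first would need dst to be distinct (the TEMP dst mentioned above).

```java
// Hypothetical register-file model; names and the 0/1 tail are assumptions.
public class CmpU3Model {
    // Order used in the patch excerpt: cmp, then mov dst, -1, then jb done.
    static void cmpFirst(int[] regs, int dst, int src1, int src2) {
        // cmp: both conditions are captured before dst is written.
        boolean below = Integer.compareUnsigned(regs[src1], regs[src2]) < 0;
        boolean notEqual = regs[src1] != regs[src2];
        regs[dst] = -1;                 // mov dst, -1 (flags unaffected)
        if (below) {
            return;                     // jb done
        }
        regs[dst] = notEqual ? 1 : 0;   // assumed tail: materialize 0 or 1
    }

    // Suggested order: mov dst, -1 first so that cmp sits next to the jump.
    static void movFirst(int[] regs, int dst, int src1, int src2) {
        regs[dst] = -1;                 // clobbers an input if dst aliases it
        boolean below = Integer.compareUnsigned(regs[src1], regs[src2]) < 0;
        boolean notEqual = regs[src1] != regs[src2];
        if (below) {
            return;
        }
        regs[dst] = notEqual ? 1 : 0;
    }

    public static void main(String[] args) {
        int[] a = {5, 9};
        cmpFirst(a, 0, 0, 1);           // dst aliases src1
        int[] b = {5, 9};
        movFirst(b, 0, 0, 1);           // same aliasing, inputs clobbered
        System.out.println(a[0] + " vs " + b[0]);
    }
}
```

With dst aliasing src1, the compare-first order still reports 5 as below 9, whereas the mov-first order compares the overwritten value and gets the wrong sign. When dst is a separate slot, both orders agree.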
__ cmpq($src1$$Register, $src2$$Register);
__ movl($dst$$Register, -1);
__ jccb(Assembler::below, done);
Same as above.
@@ -13022,6 +13022,32 @@ instruct testL_reg_mem2(rFlagsReg cr, rRegP src, memory mem, immL0 zero)
ins_pipe(ialu_cr_reg_mem);
%}

// Manifest a CmpU result in an integer register. Very painful.
// This is the test to avoid.
instruct cmpU3_reg_reg(rRegI dst, rRegI src1, rRegI src2, rFlagsReg flags) |
Do you plan to add 32-bit support?
The integer pattern can be moved to the common file x86.ad, and the 64-bit pattern can be handled in the 32/64-bit AD files.
Yes, I will add 32-bit support after this patch. Basic rules are usually put in the bit-specific ad files, so I think it is preferable to follow that convention here.
// Since it is not consumed by Bools, it is not really a Cmp.
init_class_id(Class_Sub);
}
virtual int Opcode() const; |
Inlining may connect the inputs to constants, so a Value routine may be useful here.
CmpU3 inherits the Value method from its superclass CmpU.
It's fine then.
init_class_id(Class_Sub);
}
virtual int Opcode() const;
virtual uint ideal_reg() const { return Op_RegI; } |
Value routine to handle constant folding.
@jatin-bhateja Thanks a lot for your reviews and suggestions. I have answered your comments.
Thank you very much for your reviews.
/sponsor
Going to push as commit 108cd69.
Your commit was automatically rebased without conflicts.
@jatin-bhateja @merykitty Pushed as commit 108cd69. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
Hi,
This patch implements intrinsics for Integer/Long::compareUnsigned using the same approach as the JVM does for long and floating-point comparisons. This allows efficient and reliable use of unsigned comparison in Java, which is a basic operation and is important for range checks such as those discussed in #8620.
Thank you very much.
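As background on the range-check use case, a single unsigned comparison can replace the usual two signed comparisons when the length is non-negative. The sketch below is illustrative only; the class and method names are hypothetical and are not taken from the patch or from #8620.

```java
// Hypothetical illustration of an unsigned range check.
public class UnsignedRangeCheck {
    // Conventional form: two signed comparisons.
    static boolean inRangeSigned(int index, int length) {
        return index >= 0 && index < length;
    }

    // Unsigned form: a negative index becomes a huge unsigned value,
    // so one comparison covers both bounds (assumes length >= 0).
    static boolean inRangeUnsigned(int index, int length) {
        return Integer.compareUnsigned(index, length) < 0;
    }

    public static void main(String[] args) {
        int[] indices = {-5, -1, 0, 3, 9, 10, 11, Integer.MAX_VALUE, Integer.MIN_VALUE};
        int length = 10;
        for (int index : indices) {
            System.out.println(index + ": signed=" + inRangeSigned(index, length)
                    + " unsigned=" + inRangeUnsigned(index, length));
        }
    }
}
```

The two forms agree only for non-negative lengths, which is the case for array and collection sizes.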
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/9068/head:pull/9068
$ git checkout pull/9068
Update a local copy of the PR:
$ git checkout pull/9068
$ git pull https://git.openjdk.org/jdk pull/9068/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 9068
View PR using the GUI difftool:
$ git pr show -t 9068
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/9068.diff