8270842: G1: Only young regions need to redirty outside references in remset. #4853

Closed

Conversation

Hamlin-Li

@Hamlin-Li Hamlin-Li commented Jul 21, 2021

For evacuation-failure objects in non-young (old) regions of the cset, the outside references for the remset are already recorded in the dirty queue, so we only need to redirty them for objects in young regions.
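
The idea can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyRegion`, `ToyObject` and `count_redirtied_refs` are hypothetical names, not HotSpot types, and each cross-region field is counted as one re-dirtied entry.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not HotSpot types.
struct ToyRegion {
  int index;
  bool is_young;
};

struct ToyObject {
  std::vector<int> field_target_regions;  // region index each field points into
};

// Models the rescan done while removing self-forwarding pointers.
// Returns how many cross-region references would be re-dirtied.
int count_redirtied_refs(const ToyRegion& home, const ToyObject& obj) {
  if (!home.is_young) {
    // Old regions: assumed to be recorded in the dirty queue already.
    return 0;
  }
  int redirtied = 0;
  for (int target : obj.field_target_regions) {
    if (target != home.index) {  // same-region references are skipped
      ++redirtied;
    }
  }
  return redirtied;
}
```

In this model an object with fields pointing into regions 0, 1 and 2 that lives in young region 0 yields two re-dirtied references, while the same object in an old region yields none.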


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8270842: G1: Only young regions need to redirty outside references in remset.

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/4853/head:pull/4853
$ git checkout pull/4853

Update a local copy of the PR:
$ git checkout pull/4853
$ git pull https://git.openjdk.java.net/jdk pull/4853/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 4853

View PR using the GUI difftool:
$ git pr show -t 4853

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/4853.diff

@bridgekeeper

bridgekeeper bot commented Jul 21, 2021

👋 Welcome back mli! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Jul 21, 2021
@openjdk

openjdk bot commented Jul 21, 2021

@Hamlin-Li The following label will be automatically applied to this pull request:

  • hotspot-gc

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-gc hotspot-gc-dev@openjdk.org label Jul 21, 2021
@mlbridge

mlbridge bot commented Jul 21, 2021

Webrevs

@tschatzl
Contributor

tschatzl commented Jul 22, 2021

Instead of trying to optimize this scanning a bit, did you consider avoiding this rescan for all types of regions?

It seems that the code for skipping the enqueuing for young region is this:

oop G1ParScanThreadState::handle_evacuation_failure_par(oop old, markWord m) {
  assert(_g1h->is_in_cset(old), "Object " PTR_FORMAT " should be in the CSet", p2i(old));

  oop forward_ptr = old->forward_to_atomic(old, m, memory_order_relaxed);
  if (forward_ptr == NULL) {
    // Forward-to-self succeeded. We are the "owner" of the object.
    HeapRegion* r = _g1h->heap_region_containing(old);

    if (_g1h->notify_region_failed_evacuation(r->hrm_index())) {
      _g1h->hr_printer()->evac_failure(r);
    }

    _g1h->preserve_mark_during_evac_failure(_worker_id, old, m);

    G1ScanInYoungSetter x(&_scanner, r->is_young());
    old->oop_iterate_backwards(&_scanner);

I.e. just as a thought, would it be worth a try to fake an "old" region by doing

    // Always make the scanning assume this is from an old region, causing it to collect
    // dirty cards for remembered sets as we will turn this region with a failed allocation
    // into old later.
    G1ScanInYoungSetter x(&_scanner, false);

I have not really checked this actually works, but it seems a nice hack to avoid the rescanning if it worked.

If it does, the naming of the scoped object G1ScanInYoungSetter and the related members and such should be changed to indicate that this is about tracking cross-references for remembered sets.
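
For readers unfamiliar with the scoped-object pattern mentioned here, a minimal sketch follows. `ToyScanner` and `ToyScanInYoungSetter` are hypothetical names, and restoring the previous value in the destructor is an assumption of this toy, not a statement about how `G1ScanInYoungSetter` behaves.

```cpp
#include <cassert>

// Hypothetical scanner with a single flag; not the real G1 closure.
struct ToyScanner {
  bool scanning_in_young = false;
};

// RAII helper: sets the flag for the lifetime of the object.
// Assumption: this toy restores the previous value on destruction.
class ToyScanInYoungSetter {
 public:
  ToyScanInYoungSetter(ToyScanner* scanner, bool new_value)
      : _scanner(scanner), _saved(scanner->scanning_in_young) {
    _scanner->scanning_in_young = new_value;
  }
  ~ToyScanInYoungSetter() { _scanner->scanning_in_young = _saved; }

  ToyScanInYoungSetter(const ToyScanInYoungSetter&) = delete;
  ToyScanInYoungSetter& operator=(const ToyScanInYoungSetter&) = delete;

 private:
  ToyScanner* _scanner;
  bool _saved;
};

// Returns the flag value observed while the setter is alive.
bool flag_while_scoped(ToyScanner& scanner, bool forced_value) {
  ToyScanInYoungSetter x(&scanner, forced_value);
  return scanner.scanning_in_young;
}
```

Passing `false` unconditionally, as in the suggestion, forces the "not young" behavior for the duration of the scope regardless of the region the object lives in.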

@Hamlin-Li
Author

Thanks for the suggestion, I think it should work, I will update the patch as you suggested.

In fact, I have had a question in mind for a while that is not related to evac-failed objects. If an obj A in a young region holds the ONLY reference to obj B in the optional set (but not in the current cset), where is this reference recorded after the young region is evacuated to survivor regions? I guessed it is recorded by some other "dead" old obj, as the remset holds no references from young to old.

@tschatzl
Contributor

tschatzl commented Jul 23, 2021

Please have a look at G1ParScanThreadState::remember_reference_into_optional_region - G1 collects those references in an extra data structure when scanning an object after it has been copied in G1ParScanThreadState as it encounters those, and uses them as appropriate during optional evacuation.
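
As a rough illustration of such an extra data structure, here is a toy sketch. `ToyOptionalRefs` and its members are hypothetical, and the per-region map layout is an assumption made for the example; it is not taken from the G1 sources.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical per-worker store of references into optional regions.
class ToyOptionalRefs {
 public:
  // Record a reference slot that points into the given optional region.
  void remember(int optional_region, void** slot) {
    _refs[optional_region].push_back(slot);
  }

  // Hand out (and forget) all slots recorded for one optional region,
  // e.g. when that region is picked for optional evacuation.
  std::vector<void**> take(int optional_region) {
    auto it = _refs.find(optional_region);
    if (it == _refs.end()) {
      return {};
    }
    std::vector<void**> slots = std::move(it->second);
    _refs.erase(it);
    return slots;
  }

  // Number of optional regions that still have recorded slots.
  std::size_t pending_regions() const { return _refs.size(); }

 private:
  std::unordered_map<int, std::vector<void**>> _refs;
};
```

The point of the sketch is only that references found while scanning copied objects are kept on the side, keyed by target region, and consumed later if that region is evacuated.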

@tschatzl
Contributor

Fwiw, I did not do any performance evaluation of that idea to completely avoid scanning during evacuation failure. I do not think it has significant impact though.

@Hamlin-Li
Author

Sorry for the delayed reply, I've been occupied by other things.
I'm afraid the solution (hacking G1ScanInYoungSetter) does not work as expected. With the following super simple patch, I got a crash when Concurrent Mark starts (after an evacuation failure in young regions, which I triggered with G1EvacuationFailureALot).
'''
diff --git a/src/hotspot/share/gc/g1/g1EvacFailure.cpp b/src/hotspot/share/gc/g1/g1EvacFailure.cpp
index 1f775badbac..1ac1a69fc0a 100644
--- a/src/hotspot/share/gc/g1/g1EvacFailure.cpp
+++ b/src/hotspot/share/gc/g1/g1EvacFailure.cpp
@@ -155,7 +155,7 @@ public:
       // remembered set entries missing given that we skipped cards on
       // the collection set. So, we'll recreate such entries now.
       if (_is_young) {
-        obj->oop_iterate(_log_buffer_cl);
+        //obj->oop_iterate(_log_buffer_cl);
       }

       HeapWord* obj_end = obj_addr + obj_size;
diff --git a/src/hotspot/share/gc/g1/g1ParScanThreadState.cpp b/src/hotspot/share/gc/g1/g1ParScanThreadState.cpp
index 9eebe411936..7e82f50b373 100644
--- a/src/hotspot/share/gc/g1/g1ParScanThreadState.cpp
+++ b/src/hotspot/share/gc/g1/g1ParScanThreadState.cpp
@@ -611,7 +611,7 @@ oop G1ParScanThreadState::handle_evacuation_failure_par(oop old, markWord m) {

     _g1h->preserve_mark_during_evac_failure(_worker_id, old, m);

-    G1ScanInYoungSetter x(&_scanner, r->is_young());
+    G1ScanInYoungSetter x(&_scanner, false);
     old->oop_iterate_backwards(&_scanner);

     return old;
diff --git a/src/hotspot/share/gc/g1/g1ParScanThreadState.hpp b/src/hotspot/share/gc/g1/g1ParScanThreadState.hpp
index 79f5b3a22cb..2a2060642cc 100644
--- a/src/hotspot/share/gc/g1/g1ParScanThreadState.hpp
+++ b/src/hotspot/share/gc/g1/g1ParScanThreadState.hpp
@@ -130,7 +130,7 @@ public:

   template <class T> void enqueue_card_if_tracked(G1HeapRegionAttr region_attr, T* p, oop o) {
     assert(!HeapRegion::is_in_same_region(p, o), "Should have filtered out cross-region references already.");
-    assert(!_g1h->heap_region_containing(p)->is_young(), "Should have filtered out from-young references already.");
+    // assert(!_g1h->heap_region_containing(p)->is_young(), "Should have filtered out from-young references already.");

 #ifdef ASSERT
     HeapRegion* const hr_obj = _g1h->heap_region_containing(o);
'''

After investigation, I think the crash comes from the difference between the initial patch (call it Patch I) and the patch hacking G1ScanInYoungSetter (call it Patch H):

  1. Patch I keeps all the logic in UpdateLogBuffersDeferred for evac-failure objects in young regions; in particular, it enqueues all fields of an evac-failure object no matter where they point, except into the object's own region (see UpdateLogBuffersDeferred::do_oop_work(T* p)).
  2. Patch H does not do this for fields of evac-failure objects in young regions when a field points to an object in the cset (see G1ScanEvacuatedObjClosure::do_oop_work(T* p) and G1ParScanThreadState::do_oop_evac(T* p)).

If I add the following patch (call it Patch A), the crash goes away, but I don't think this is what we want.
'''
diff --git a/src/hotspot/share/gc/g1/g1ParScanThreadState.cpp b/src/hotspot/share/gc/g1/g1ParScanThreadState.cpp
index 9eebe411936..91d9bc76528 100644
--- a/src/hotspot/share/gc/g1/g1ParScanThreadState.cpp
+++ b/src/hotspot/share/gc/g1/g1ParScanThreadState.cpp
@@ -208,9 +208,9 @@ void G1ParScanThreadState::do_oop_evac(T* p) {
return;
}
HeapRegion* from = _g1h->heap_region_containing(p);

  • if (!from->is_young()) {
  • enqueue_card_if_tracked(_g1h->region_attr(obj), p, obj);
  • }

'''

I still don't know exactly why Patch H fails; do you have any clue?
But it seems Patch I is the right way to go.

@tschatzl
Contributor

tschatzl commented Aug 2, 2021

The problem with patch H is that the code in G1ScanEvacuatedObjClosure::do_oop_work does not always enqueue cards, but only if the reference already points outside of the collection set.
If it points inside the collection set, we push it on the task queue, meaning that ultimately we call G1ParScanThreadState::do_oop_evac on it; the suggested patch A then unconditionally enqueues the card that we "missed" just before.
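
The two paths can be condensed into a small decision function. This is a simplified sketch based only on the description in this thread: `card_enqueued_during_scan` and its parameters are hypothetical, and it ignores whether the target region's remembered set is actually tracked.

```cpp
#include <cassert>

// Toy model of whether a card gets enqueued for one field of an
// evacuation-failed object while it is scanned (no later rescan).
//   from_is_young  : real state of the region containing the field
//   scan_in_young  : value handed to the scoped setter (Patch H passes false)
//   target_in_cset : whether the field points into the collection set
//   patch_a        : do_oop_evac enqueues unconditionally (Patch A)
bool card_enqueued_during_scan(bool from_is_young, bool scan_in_young,
                               bool target_in_cset, bool patch_a) {
  if (!target_in_cset) {
    // Handled directly while scanning: enqueue unless treated as young.
    return !scan_in_young;
  }
  // Pushed on the task queue and handled later in the do_oop_evac step,
  // which looks at the real region state, not at the scoped flag.
  if (patch_a) {
    return true;
  }
  return !from_is_young;
}
```

With Patch H alone, a young object's field that points into the collection set takes the second path and stays unrecorded, which matches the missing entries discussed above; Patch A closes that gap by always enqueuing.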

I experimented a bit with how to remove the iteration during self-forwarding pointer removal; one option (not 100% finished: one FIXME, some commented-out asserts, and the new class isn't as nice as it could be, but it seems to work) is here: https://github.com/tschatzl/jdk/tree/submit/evac-failure-no-scan-during-remove-self-forwards

Not sure if I like it, and we would need to test whether it is actually an improvement (i.e. faster overall, which is important for region-based pinning). So for now your original patch seems the best one to continue with. An initial look at it seems good, but let me push it through our internal testing and re-review it.

@Hamlin-Li
Author

Thanks Thomas, looking forward to hearing further results from you.

(I'm sorry for my previous comments. I tried to put the patch/diff content in the comment, but it seems to be a mess. I started and ended the patch content with ''', but it seems that does not work as expected.)

Contributor

@tschatzl tschatzl left a comment

To add: the reason why we need to re-scan only young gen objects is the use of from->is_young() in various places to avoid enqueuing too many cards. The exact condition would be something like from->is_survivor() (note: this means the survivor regions newly allocated during this gc; remember that at the start of this gc we relabel all previous survivor regions as eden, which will become part of the eden of the next gc).

I (force-)pushed a prototype that adds such a label to G1HeapRegionAttr to also avoid touching the HeapRegion table completely at https://github.com/tschatzl/jdk/tree/submit/evac-failure-no-scan-during-remove-self-forwards.
During this investigation a few additional (unrelated, preexisting) issues with the current handling of objects during evacuation failure became kind of obvious, I filed https://bugs.openjdk.java.net/browse/JDK-8271871 and https://bugs.openjdk.java.net/browse/JDK-8271870.

However, I think this change is good as is and can be replaced later with a more refined version of the suggested prototype.

Testing tier1-5 has also been good afaict (still rerunning as I had some apparently different issues on Windows), as well as running gc/g1 jtreg tests with globally injected -XX:+G1EvacuationFailureALot (via JAVA_OPTIONS_).

@openjdk

openjdk bot commented Aug 4, 2021

@Hamlin-Li This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8270842: G1: Only young regions need to redirty outside references in remset.

Reviewed-by: tschatzl

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 258 new commits pushed to the master branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Aug 4, 2021
Contributor

@tschatzl tschatzl left a comment

The comment in RemoveSelfForwardPtrObjClosure::do_object() before the change needs updating. It's completely incomprehensible to me (at this point) and mostly refers to very old behavior. Please change before pushing.

Something like:

      // During evacuation failure we do not record inter-region
      // references referencing regions that need a remembered set
      // update originating from young regions (including eden) that
      // failed evacuation. Make up for that omission now by rescanning
      // these failed objects.

@openjdk openjdk bot removed the ready Pull request is ready to be integrated label Aug 4, 2021
@Hamlin-Li
Author

Thanks Thomas, I've updated the comments.
Seems the pr is not ready for now, should I continue or do you have some other plan?

@Hamlin-Li
Author

Just checked your prototype code, I think it's a good way to go.

Contributor

@tschatzl tschatzl left a comment

Lgtm.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Aug 6, 2021
@Hamlin-Li
Author

Thanks for your review, Thomas.

@Hamlin-Li
Author

/integrate

@openjdk

openjdk bot commented Aug 6, 2021

Going to push as commit cc61520.
Since your change was applied there have been 262 commits pushed to the master branch:

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot closed this Aug 6, 2021
@openjdk openjdk bot added integrated Pull request has been integrated and removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Aug 6, 2021
@openjdk

openjdk bot commented Aug 6, 2021

@Hamlin-Li Pushed as commit cc61520.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@kimbarrett

@Hamlin-Li two reviews are required for hotspot changes.

@Hamlin-Li
Author

@kimbarrett Sorry, I will pay more attention. Thanks for the reminder.

Labels
hotspot-gc hotspot-gc-dev@openjdk.org integrated Pull request has been integrated

3 participants