Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: potential deadlock cycle caused by scavenge.lock [1.13 backport] #34150

Closed
gopherbot opened this issue Sep 6, 2019 · 3 comments
Closed
Labels
CherryPickApproved Used during the release process for point releases FrozenDueToAge
Milestone

Comments

@gopherbot
Copy link
Contributor

@mknyszek requested issue #34047 to be considered for backport to the next 1.13 minor release.

This has the potential for reduced stability in Go 1.13. While the chance of deadlock is extremely low, when it hits you it will tend to hit you consistently, because stack depth is consistent (for example #32105).

CC @aclements

@gopherbot Please open a backport issue for 1.13.

@gopherbot gopherbot added the CherryPickCandidate Used during the release process for point releases label Sep 6, 2019
@gopherbot gopherbot added this to the Go1.13.1 milestone Sep 6, 2019
@gallir
Copy link

gallir commented Sep 9, 2019

Just in case.

It seems we were affected by this in https://github.com/gallir/smart-relayer It was in production and we were not able to debug/trace it properly but we found that most of the connections were blocked (every one is managed by one or several goroutines) and the RSS reached the instances RAM (32 GB) and were killed by the OOM.

We went back to a version compiled with 1.12 and didn't see the issue again:

$ ps axl| grep smart-relayer
4 0 7584 1 20 0 3754992 947680 - Ssl ? 46:10 /usr/local/bin/smart-relayer -c /usr/local/etc/relayer.conf

As you can see, the runtime starts tens of threads:
$ ps -T -p 7584 | wc
103 515 4522

@bcmills bcmills added the CherryPickApproved Used during the release process for point releases label Sep 25, 2019
@gopherbot gopherbot removed the CherryPickCandidate Used during the release process for point releases label Sep 25, 2019
@bcmills bcmills modified the milestones: Go1.13.1, Go1.13.2 Sep 25, 2019
@mknyszek mknyszek self-assigned this Sep 26, 2019
@gopherbot
Copy link
Contributor Author

Change https://golang.org/cl/197501 mentions this issue: [release-branch.go1.13] runtime: fix lock acquire cycles related to scavenge.lock

@gopherbot
Copy link
Contributor Author

Closed by merging 8a6cd7a to release-branch.go1.13.

gopherbot pushed a commit that referenced this issue Oct 2, 2019
…cavenge.lock

There are currently two edges in the lock cycle graph caused by
scavenge.lock: with sched.lock and mheap_.lock. These edges appear
because of the call to ready() and stack growths respectively.
Furthermore, there's already an invariant in the code wherein
mheap_.lock must be acquired before scavenge.lock, hence the cycle.

The fix to this is to bring scavenge.lock higher in the lock cycle
graph, such that sched.lock and mheap_.lock are only acquired once
scavenge.lock is already held.

To faciliate this change, we move scavenger waking outside of
gcSetTriggerRatio such that it doesn't have to happen with the heap
locked. Furthermore, we check scavenge generation numbers with the heap
locked by using gopark instead of goparkunlock, and specify a function
which aborts the park should there be any skew in generation count.

Fixes #34150.

Change-Id: I3519119214bac66375e2b1262b36ce376c820d12
Reviewed-on: https://go-review.googlesource.com/c/go/+/191977
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
(cherry picked from commit 62e4156)
Reviewed-on: https://go-review.googlesource.com/c/go/+/197501
Reviewed-by: Austin Clements <austin@google.com>
@katiehockman katiehockman modified the milestones: Go1.13.2, Go1.13.3 Oct 17, 2019
@golang golang locked and limited conversation to collaborators Oct 16, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
CherryPickApproved Used during the release process for point releases FrozenDueToAge
5 participants