Revert "[PATCH 7/7] RISC-V: Disable by pieces for vector setmem length > UNITS_PER_WORD"
[official-gcc.git] / libgo / go / runtime / mgc.go
blob65cbcdb55224debf5ec356c552926b676316bc21
1 // Copyright 2009 The Go Authors. All rights reserved.
2 // Use of this source code is governed by a BSD-style
3 // license that can be found in the LICENSE file.
5 // Garbage collector (GC).
6 //
7 // The GC runs concurrently with mutator threads, is type accurate (aka precise), and allows multiple
8 // GC threads to run in parallel. It is a concurrent mark and sweep that uses a write barrier. It is
9 // non-generational and non-compacting. Allocation is done using size-segregated per-P allocation
10 // areas to minimize fragmentation while eliminating locks in the common case.
12 // The algorithm decomposes into several steps.
13 // This is a high-level description of the algorithm being used. For an overview of GC, a good
14 // place to start is Richard Jones' gchandbook.org.
16 // The algorithm's intellectual heritage includes Dijkstra's on-the-fly algorithm, see
17 // Edsger W. Dijkstra, Leslie Lamport, A. J. Martin, C. S. Scholten, and E. F. M. Steffens. 1978.
18 // On-the-fly garbage collection: an exercise in cooperation. Commun. ACM 21, 11 (November 1978),
19 // 966-975.
20 // For journal quality proofs that these steps are complete, correct, and terminate see
21 // Hudson, R., and Moss, J.E.B. Copying Garbage Collection without stopping the world.
22 // Concurrency and Computation: Practice and Experience 15(3-5), 2003.
24 // 1. GC performs sweep termination.
26 // a. Stop the world. This causes all Ps to reach a GC safe-point.
28 // b. Sweep any unswept spans. There will only be unswept spans if
29 // this GC cycle was forced before the expected time.
31 // 2. GC performs the mark phase.
33 // a. Prepare for the mark phase by setting gcphase to _GCmark
34 // (from _GCoff), enabling the write barrier, enabling mutator
35 // assists, and enqueueing root mark jobs. No objects may be
36 // scanned until all Ps have enabled the write barrier, which is
37 // accomplished using STW.
39 // b. Start the world. From this point, GC work is done by mark
40 // workers started by the scheduler and by assists performed as
41 // part of allocation. The write barrier shades both the
42 // overwritten pointer and the new pointer value for any pointer
43 // writes (see mbarrier.go for details). Newly allocated objects
44 // are immediately marked black.
46 // c. GC performs root marking jobs. This includes scanning all
47 // stacks, shading all globals, and shading any heap pointers in
48 // off-heap runtime data structures. Scanning a stack stops a
49 // goroutine, shades any pointers found on its stack, and then
50 // resumes the goroutine.
52 // d. GC drains the work queue of grey objects, scanning each grey
53 // object to black and shading all pointers found in the object
54 // (which in turn may add those pointers to the work queue).
56 // e. Because GC work is spread across local caches, GC uses a
57 // distributed termination algorithm to detect when there are no
58 // more root marking jobs or grey objects (see gcMarkDone). At this
59 // point, GC transitions to mark termination.
61 // 3. GC performs mark termination.
63 // a. Stop the world.
65 // b. Set gcphase to _GCmarktermination, and disable workers and
66 // assists.
68 // c. Perform housekeeping like flushing mcaches.
70 // 4. GC performs the sweep phase.
72 // a. Prepare for the sweep phase by setting gcphase to _GCoff,
73 // setting up sweep state and disabling the write barrier.
75 // b. Start the world. From this point on, newly allocated objects
76 // are white, and allocating sweeps spans before use if necessary.
78 // c. GC does concurrent sweeping in the background and in response
79 // to allocation. See description below.
81 // 5. When sufficient allocation has taken place, replay the sequence
82 // starting with 1 above. See discussion of GC rate below.
84 // Concurrent sweep.
86 // The sweep phase proceeds concurrently with normal program execution.
87 // The heap is swept span-by-span both lazily (when a goroutine needs another span)
88 // and concurrently in a background goroutine (this helps programs that are not CPU bound).
89 // At the end of STW mark termination all spans are marked as "needs sweeping".
91 // The background sweeper goroutine simply sweeps spans one-by-one.
93 // To avoid requesting more OS memory while there are unswept spans, when a
94 // goroutine needs another span, it first attempts to reclaim that much memory
95 // by sweeping. When a goroutine needs to allocate a new small-object span, it
96 // sweeps small-object spans for the same object size until it frees at least
97 // one object. When a goroutine needs to allocate a large-object span from the heap,
98 // it sweeps spans until it frees at least that many pages into the heap. There is
99 // one case where this may not suffice: if a goroutine sweeps and frees two
100 // nonadjacent one-page spans to the heap, it will allocate a new two-page
101 // span, but there can still be other one-page unswept spans which could be
102 // combined into a two-page span.
104 // It's critical to ensure that no operations proceed on unswept spans (that would corrupt
105 // mark bits in the GC bitmap). During GC all mcaches are flushed into the central cache,
106 // so they are empty. When a goroutine grabs a new span into mcache, it sweeps it.
107 // When a goroutine explicitly frees an object or sets a finalizer, it ensures that
108 // the span is swept (either by sweeping it, or by waiting for the concurrent sweep to finish).
109 // The finalizer goroutine is kicked off only when all spans are swept.
110 // When the next GC starts, it sweeps all not-yet-swept spans (if any).
112 // GC rate.
113 // The next GC is after we've allocated an extra amount of memory proportional to
114 // the amount already in use. The proportion is controlled by the GOGC environment variable
115 // (100 by default). If GOGC=100 and we're using 4M, we'll GC again when we get to 8M
116 // (this mark is tracked in the gcController.heapGoal variable). This keeps the GC cost in
117 // linear proportion to the allocation cost. Adjusting GOGC just changes the linear constant
118 // (and also the amount of extra memory used).
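// As a rough illustration (a sketch of the pacing arithmetic, not the exact
// gcController computation, which also accounts for stack and globals scan
// work and a minimum heap size), the goal is approximately
//
//	goal = liveHeap + liveHeap*GOGC/100
//
// so 4M of live heap with GOGC=100 gives a goal near 8M, and GOGC=50 would
// give a goal near 6M.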
120 // Oblets
122 // In order to prevent long pauses while scanning large objects and to
123 // improve parallelism, the garbage collector breaks up scan jobs for
124 // objects larger than maxObletBytes into "oblets" of at most
125 // maxObletBytes. When scanning encounters the beginning of a large
126 // object, it scans only the first oblet and enqueues the remaining
127 // oblets as new scan jobs.
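// For example (a sketch, assuming maxObletBytes is 128 KiB, its value in the
// upstream runtime): a 512 KiB object is scanned as four oblets. The first
// 128 KiB is scanned when the object is first encountered, and the remaining
// three oblets are pushed onto the work queue as separate scan jobs.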
129 package runtime
131 import (
132 "internal/cpu"
133 "runtime/internal/atomic"
134 "unsafe"
137 const (
138 _DebugGC = 0
139 _ConcurrentSweep = true
140 _FinBlockSize = 4 * 1024
142 // debugScanConservative enables debug logging for stack
143 // frames that are scanned conservatively.
144 debugScanConservative = false
146 // sweepMinHeapDistance is a lower bound on the heap distance
147 // (in bytes) reserved for concurrent sweeping between GC
148 // cycles.
149 sweepMinHeapDistance = 1024 * 1024
152 func gcinit() {
153 if unsafe.Sizeof(workbuf{}) != _WorkbufSize {
154 throw("size of Workbuf is suboptimal")
156 // No sweep on the first cycle.
157 sweep.active.state.Store(sweepDrainedMask)
159 // Initialize GC pacer state.
160 // Use the environment variable GOGC for the initial gcPercent value.
161 gcController.init(readGOGC())
163 work.startSema = 1
164 work.markDoneSema = 1
165 lockInit(&work.sweepWaiters.lock, lockRankSweepWaiters)
166 lockInit(&work.assistQueue.lock, lockRankAssistQueue)
167 lockInit(&work.wbufSpans.lock, lockRankWbufSpans)
170 // gcenable is called after the bulk of the runtime initialization,
171 // just before we're about to start letting user code run.
172 // It kicks off the background sweeper goroutine, the background
173 // scavenger goroutine, and enables GC.
174 func gcenable() {
175 // Kick off sweeping and scavenging.
176 c := make(chan int, 2)
177 expectSystemGoroutine()
178 go bgsweep(c)
179 expectSystemGoroutine()
180 go bgscavenge(c)
181 <-c
182 <-c
183 memstats.enablegc = true // now that runtime is initialized, GC is okay
186 // Garbage collector phase.
187 // It tells the write barrier and synchronization code which tasks to perform.
188 var gcphase uint32
190 // The compiler knows about this variable.
191 // If you change it, you must change gofrontend/wb.cc, too.
192 // If you change the first four bytes, you must also change the write
193 // barrier insertion code.
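// Conceptually, the compiler turns a pointer write *dst = src into something
// like the following pseudocode (a sketch, not the literal emitted sequence;
// see mbarrier.go for the real barrier):
//
//	if writeBarrier.enabled {
//		// invoke the write barrier routine for dst/src
//	} else {
//		*dst = src
//	}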
194 var writeBarrier struct {
195 enabled bool // compiler emits a check of this before calling write barrier
196 pad [3]byte // compiler uses 32-bit load for "enabled" field
197 needed bool // whether we need a write barrier for current GC phase
198 cgo bool // whether we need a write barrier for a cgo check
199 alignme uint64 // guarantee alignment so that compiler can use a 32 or 64-bit load
202 // gcBlackenEnabled is 1 if mutator assists and background mark
203 // workers are allowed to blacken objects. This must only be set when
204 // gcphase == _GCmark.
205 var gcBlackenEnabled uint32
207 const (
208 _GCoff = iota // GC not running; sweeping in background, write barrier disabled
209 _GCmark // GC marking roots and workbufs: allocate black, write barrier ENABLED
210 _GCmarktermination // GC mark termination: allocate black, P's help GC, write barrier ENABLED
213 //go:nosplit
214 func setGCPhase(x uint32) {
215 atomic.Store(&gcphase, x)
216 writeBarrier.needed = gcphase == _GCmark || gcphase == _GCmarktermination
217 writeBarrier.enabled = writeBarrier.needed || writeBarrier.cgo
220 // gcMarkWorkerMode represents the mode that a concurrent mark worker
221 // should operate in.
223 // Concurrent marking happens through four different mechanisms. One
224 // is mutator assists, which happen in response to allocations and are
225 // not scheduled. The other three are variations in the per-P mark
226 // workers and are distinguished by gcMarkWorkerMode.
227 type gcMarkWorkerMode int
229 const (
230 // gcMarkWorkerNotWorker indicates that the next scheduled G is not
231 // starting work and the mode should be ignored.
232 gcMarkWorkerNotWorker gcMarkWorkerMode = iota
234 // gcMarkWorkerDedicatedMode indicates that the P of a mark
235 // worker is dedicated to running that mark worker. The mark
236 // worker should run without preemption.
237 gcMarkWorkerDedicatedMode
239 // gcMarkWorkerFractionalMode indicates that a P is currently
240 // running the "fractional" mark worker. The fractional worker
241 // is necessary when GOMAXPROCS*gcBackgroundUtilization is not
242 // an integer and using only dedicated workers would result in
243 // utilization too far from the target of gcBackgroundUtilization.
244 // The fractional worker should run until it is preempted and
245 // will be scheduled to pick up the fractional part of
246 // GOMAXPROCS*gcBackgroundUtilization.
247 gcMarkWorkerFractionalMode
249 // gcMarkWorkerIdleMode indicates that a P is running the mark
250 // worker because it has nothing else to do. The idle worker
251 // should run until it is preempted and account its time
252 // against gcController.idleMarkTime.
253 gcMarkWorkerIdleMode
256 // gcMarkWorkerModeStrings are the string labels of gcMarkWorkerModes
257 // to use in execution traces.
258 var gcMarkWorkerModeStrings = [...]string{
259 "Not worker",
260 "GC (dedicated)",
261 "GC (fractional)",
262 "GC (idle)",
265 // pollFractionalWorkerExit reports whether a fractional mark worker
266 // should self-preempt. It assumes it is called from the fractional
267 // worker.
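// For example (illustrating the check below): with
// fractionalUtilizationGoal = 0.05, the worker self-preempts once its own
// mark time exceeds 6% (1.2 * 5%) of the wall time elapsed since the mark
// phase started.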
268 func pollFractionalWorkerExit() bool {
269 // This should be kept in sync with the fractional worker
270 // scheduler logic in findRunnableGCWorker.
271 now := nanotime()
272 delta := now - gcController.markStartTime
273 if delta <= 0 {
274 return true
276 p := getg().m.p.ptr()
277 selfTime := p.gcFractionalMarkTime + (now - p.gcMarkWorkerStartTime)
278 // Add some slack to the utilization goal so that the
279 // fractional worker isn't behind again the instant it exits.
280 return float64(selfTime)/float64(delta) > 1.2*gcController.fractionalUtilizationGoal
283 var work struct {
284 full lfstack // lock-free list of full blocks workbuf
285 empty lfstack // lock-free list of empty blocks workbuf
286 pad0 cpu.CacheLinePad // prevents false-sharing between full/empty and nproc/nwait
288 wbufSpans struct {
289 lock mutex
290 // free is a list of spans dedicated to workbufs, but
291 // that don't currently contain any workbufs.
292 free mSpanList
293 // busy is a list of all spans containing workbufs on
294 // one of the workbuf lists.
295 busy mSpanList
298 // Restore 64-bit alignment on 32-bit.
299 _ uint32
301 // bytesMarked is the number of bytes marked this cycle. This
302 // includes bytes blackened in scanned objects, noscan objects
303 // that go straight to black, and permagrey objects scanned by
304 // markroot during the concurrent scan phase. This is updated
305 // atomically during the cycle. Updates may be batched
306 // arbitrarily, since the value is only read at the end of the
307 // cycle.
309 // Because of benign races during marking, this number may not
310 // be the exact number of marked bytes, but it should be very
311 // close.
313 // Put this field here because it needs 64-bit atomic access
314 // (and thus 8-byte alignment even on 32-bit architectures).
315 bytesMarked uint64
317 markrootNext uint32 // next markroot job
318 markrootJobs uint32 // number of markroot jobs
320 nproc uint32
321 tstart int64
322 nwait uint32
324 // Number of roots of various root types. Set by gcMarkRootPrepare.
325 nDataRoots, nSpanRoots, nStackRoots int
327 // Base indexes of each root type. Set by gcMarkRootPrepare.
328 baseData, baseSpans, baseStacks, baseEnd uint32
330 // stackRoots is a snapshot of all of the Gs that existed
331 // before the beginning of concurrent marking. The backing
332 // store of this must not be modified because it might be
333 // shared with allgs.
334 stackRoots []*g
336 // Each type of GC state transition is protected by a lock.
337 // Since multiple threads can simultaneously detect the state
338 // transition condition, any thread that detects a transition
339 // condition must acquire the appropriate transition lock,
340 // re-check the transition condition and return if it no
341 // longer holds or perform the transition if it does.
342 // Likewise, any transition must invalidate the transition
343 // condition before releasing the lock. This ensures that each
344 // transition is performed by exactly one thread and threads
345 // that need the transition to happen block until it has
346 // happened.
348 // startSema protects the transition from "off" to mark or
349 // mark termination.
350 startSema uint32
351 // markDoneSema protects transitions from mark to mark termination.
352 markDoneSema uint32
354 bgMarkReady note // signal background mark worker has started
355 bgMarkDone uint32 // cas to 1 when at a background mark completion point
356 // Background mark completion signaling
358 // mode is the concurrency mode of the current GC cycle.
359 mode gcMode
361 // userForced indicates the current GC cycle was forced by an
362 // explicit user call.
363 userForced bool
365 // totaltime is the CPU nanoseconds spent in GC since the
366 // program started if debug.gctrace > 0.
367 totaltime int64
369 // initialHeapLive is the value of gcController.heapLive at the
370 // beginning of this GC cycle.
371 initialHeapLive uint64
373 // assistQueue is a queue of assists that are blocked because
374 // there was neither enough credit to steal nor enough work to
375 // do.
376 assistQueue struct {
377 lock mutex
378 q gQueue
381 // sweepWaiters is a list of blocked goroutines to wake when
382 // we transition from mark termination to sweep.
383 sweepWaiters struct {
384 lock mutex
385 list gList
388 // cycles is the number of completed GC cycles, where a GC
389 // cycle is sweep termination, mark, mark termination, and
390 // sweep. This differs from memstats.numgc, which is
391 // incremented at mark termination.
392 cycles uint32
394 // Timing/utilization stats for this cycle.
395 stwprocs, maxprocs int32
396 tSweepTerm, tMark, tMarkTerm, tEnd int64 // nanotime() of phase start
398 pauseNS int64 // total STW time this cycle
399 pauseStart int64 // nanotime() of last STW
401 // debug.gctrace heap sizes for this cycle.
402 heap0, heap1, heap2, heapGoal uint64
405 // GC runs a garbage collection and blocks the caller until the
406 // garbage collection is complete. It may also block the entire
407 // program.
408 func GC() {
409 // We consider a cycle to be: sweep termination, mark, mark
410 // termination, and sweep. This function shouldn't return
411 // until a full cycle has been completed, from beginning to
412 // end. Hence, we always want to finish up the current cycle
413 // and start a new one. That means:
415 // 1. In sweep termination, mark, or mark termination of cycle
416 // N, wait until mark termination N completes and transitions
417 // to sweep N.
419 // 2. In sweep N, help with sweep N.
421 // At this point we can begin a full cycle N+1.
423 // 3. Trigger cycle N+1 by starting sweep termination N+1.
425 // 4. Wait for mark termination N+1 to complete.
427 // 5. Help with sweep N+1 until it's done.
429 // This all has to be written to deal with the fact that the
430 // GC may move ahead on its own. For example, when we block
431 // until mark termination N, we may wake up in cycle N+2.
433 // Wait until the current sweep termination, mark, and mark
434 // termination complete.
435 n := atomic.Load(&work.cycles)
436 gcWaitOnMark(n)
438 // We're now in sweep N or later. Trigger GC cycle N+1, which
439 // will first finish sweep N if necessary and then enter sweep
440 // termination N+1.
441 gcStart(gcTrigger{kind: gcTriggerCycle, n: n + 1})
443 // Wait for mark termination N+1 to complete.
444 gcWaitOnMark(n + 1)
446 // Finish sweep N+1 before returning. We do this both to
447 // complete the cycle and because runtime.GC() is often used
448 // as part of tests and benchmarks to get the system into a
449 // relatively stable and isolated state.
450 for atomic.Load(&work.cycles) == n+1 && sweepone() != ^uintptr(0) {
451 sweep.nbgsweep++
452 Gosched()
455 // Callers may assume that the heap profile reflects the
456 // just-completed cycle when this returns (historically this
457 // happened because this was a STW GC), but right now the
458 // profile still reflects mark termination N, not N+1.
460 // As soon as all of the sweep frees from cycle N+1 are done,
461 // we can go ahead and publish the heap profile.
463 // First, wait for sweeping to finish. (We know there are no
464 // more spans on the sweep queue, but we may be concurrently
465 // sweeping spans, so we have to wait.)
466 for atomic.Load(&work.cycles) == n+1 && !isSweepDone() {
467 Gosched()
470 // Now we're really done with sweeping, so we can publish the
471 // stable heap profile. Only do this if we haven't already hit
472 // another mark termination.
473 mp := acquirem()
474 cycle := atomic.Load(&work.cycles)
475 if cycle == n+1 || (gcphase == _GCmark && cycle == n+2) {
476 mProf_PostSweep()
478 releasem(mp)
481 // gcWaitOnMark blocks until GC finishes the Nth mark phase. If GC has
482 // already completed this mark phase, it returns immediately.
483 func gcWaitOnMark(n uint32) {
484 for {
485 // Disable phase transitions.
486 lock(&work.sweepWaiters.lock)
487 nMarks := atomic.Load(&work.cycles)
488 if gcphase != _GCmark {
489 // We've already completed this cycle's mark.
490 nMarks++
492 if nMarks > n {
493 // We're done.
494 unlock(&work.sweepWaiters.lock)
495 return
498 // Wait until sweep termination, mark, and mark
499 // termination of cycle N complete.
500 work.sweepWaiters.list.push(getg())
501 goparkunlock(&work.sweepWaiters.lock, waitReasonWaitForGCCycle, traceEvGoBlock, 1)
505 // gcMode indicates how concurrent a GC cycle should be.
506 type gcMode int
508 const (
509 gcBackgroundMode gcMode = iota // concurrent GC and sweep
510 gcForceMode // stop-the-world GC now, concurrent sweep
511 gcForceBlockMode // stop-the-world GC now and STW sweep (forced by user)
514 // A gcTrigger is a predicate for starting a GC cycle. Specifically,
515 // it is an exit condition for the _GCoff phase.
516 type gcTrigger struct {
517 kind gcTriggerKind
518 now int64 // gcTriggerTime: current time
519 n uint32 // gcTriggerCycle: cycle number to start
522 type gcTriggerKind int
524 const (
525 // gcTriggerHeap indicates that a cycle should be started when
526 // the heap size reaches the trigger heap size computed by the
527 // controller.
528 gcTriggerHeap gcTriggerKind = iota
530 // gcTriggerTime indicates that a cycle should be started when
531 // it's been more than forcegcperiod nanoseconds since the
532 // previous GC cycle.
533 gcTriggerTime
535 // gcTriggerCycle indicates that a cycle should be started if
536 // we have not yet started cycle number gcTrigger.n (relative
537 // to work.cycles).
538 gcTriggerCycle
541 // test reports whether the trigger condition is satisfied, meaning
542 // that the exit condition for the _GCoff phase has been met. The exit
543 // condition should be tested when allocating.
544 func (t gcTrigger) test() bool {
545 if !memstats.enablegc || panicking != 0 || gcphase != _GCoff {
546 return false
548 switch t.kind {
549 case gcTriggerHeap:
550 // Non-atomic access to gcController.heapLive for performance. If
551 // we are going to trigger on this, this thread just
552 // atomically wrote gcController.heapLive anyway and we'll see our
553 // own write.
554 return gcController.heapLive >= gcController.trigger
555 case gcTriggerTime:
556 if gcController.gcPercent.Load() < 0 {
557 return false
559 lastgc := int64(atomic.Load64(&memstats.last_gc_nanotime))
560 return lastgc != 0 && t.now-lastgc > forcegcperiod
561 case gcTriggerCycle:
562 // t.n > work.cycles, but accounting for wraparound.
563 return int32(t.n-work.cycles) > 0
565 return true
568 // gcStart starts the GC. It transitions from _GCoff to _GCmark (if
569 // debug.gcstoptheworld == 0) or performs all of GC (if
570 // debug.gcstoptheworld != 0).
572 // This may return without performing this transition in some cases,
573 // such as when called on a system stack or with locks held.
574 func gcStart(trigger gcTrigger) {
575 // Since this is called from malloc and malloc is called in
576 // the guts of a number of libraries that might be holding
577 // locks, don't attempt to start GC in non-preemptible or
578 // potentially unstable situations.
579 mp := acquirem()
580 if gp := getg(); gp == mp.g0 || mp.locks > 1 || mp.preemptoff != "" {
581 releasem(mp)
582 return
584 releasem(mp)
585 mp = nil
587 // Pick up the remaining unswept/not being swept spans concurrently
589 // This shouldn't happen if we're being invoked in background
590 // mode since proportional sweep should have just finished
591 // sweeping everything, but rounding errors, etc., may leave a
592 // few spans unswept. In forced mode, this is necessary since
593 // GC can be forced at any point in the sweeping cycle.
595 // We check the transition condition continuously here in case
596 // this G gets delayed into the next GC cycle.
597 for trigger.test() && sweepone() != ^uintptr(0) {
598 sweep.nbgsweep++
601 // Perform GC initialization and the sweep termination
602 // transition.
603 semacquire(&work.startSema)
604 // Re-check transition condition under transition lock.
605 if !trigger.test() {
606 semrelease(&work.startSema)
607 return
610 // For stats, check if this GC was forced by the user.
611 work.userForced = trigger.kind == gcTriggerCycle
613 // In gcstoptheworld debug mode, upgrade the mode accordingly.
614 // We do this after re-checking the transition condition so
615 // that multiple goroutines that detect the heap trigger don't
616 // start multiple STW GCs.
617 mode := gcBackgroundMode
618 if debug.gcstoptheworld == 1 {
619 mode = gcForceMode
620 } else if debug.gcstoptheworld == 2 {
621 mode = gcForceBlockMode
624 // Ok, we're doing it! Stop everybody else
625 semacquire(&gcsema)
626 semacquire(&worldsema)
628 if trace.enabled {
629 traceGCStart()
632 // Check that all Ps have finished deferred mcache flushes.
633 for _, p := range allp {
634 if fg := atomic.Load(&p.mcache.flushGen); fg != mheap_.sweepgen {
635 println("runtime: p", p.id, "flushGen", fg, "!= sweepgen", mheap_.sweepgen)
636 throw("p mcache not flushed")
640 gcBgMarkStartWorkers()
642 systemstack(gcResetMarkState)
644 work.stwprocs, work.maxprocs = gomaxprocs, gomaxprocs
645 if work.stwprocs > ncpu {
646 // This is used to compute CPU time of the STW phases,
647 // so it can't be more than ncpu, even if GOMAXPROCS is.
648 work.stwprocs = ncpu
650 work.heap0 = atomic.Load64(&gcController.heapLive)
651 work.pauseNS = 0
652 work.mode = mode
654 now := nanotime()
655 work.tSweepTerm = now
656 work.pauseStart = now
657 if trace.enabled {
658 traceGCSTWStart(1)
660 systemstack(stopTheWorldWithSema)
661 // Finish sweep before we start concurrent scan.
662 systemstack(func() {
663 finishsweep_m()
666 // clearpools before we start the GC. If we wait, the memory will not be
667 // reclaimed until the next GC cycle.
668 clearpools()
670 work.cycles++
672 // Assists and workers can start the moment we start
673 // the world.
674 gcController.startCycle(now, int(gomaxprocs))
675 work.heapGoal = gcController.heapGoal
677 // In STW mode, disable scheduling of user Gs. This may also
678 // disable scheduling of this goroutine, so it may block as
679 // soon as we start the world again.
680 if mode != gcBackgroundMode {
681 schedEnableUser(false)
684 // Enter concurrent mark phase and enable
685 // write barriers.
687 // Because the world is stopped, all Ps will
688 // observe that write barriers are enabled by
689 // the time we start the world and begin
690 // scanning.
692 // Write barriers must be enabled before assists are
693 // enabled because they must be enabled before
694 // any non-leaf heap objects are marked. Since
695 // allocations are blocked until assists can
696 // happen, we want to enable assists as early as
697 // possible.
698 setGCPhase(_GCmark)
700 gcBgMarkPrepare() // Must happen before assist enable.
701 gcMarkRootPrepare()
703 // Mark all active tinyalloc blocks. Since we're
704 // allocating from these, they need to be black like
705 // other allocations. The alternative is to blacken
706 // the tiny block on every allocation from it, which
707 // would slow down the tiny allocator.
708 gcMarkTinyAllocs()
710 // At this point all Ps have enabled the write
711 // barrier, thus maintaining the no white to
712 // black invariant. Enable mutator assists to
713 // put back-pressure on fast allocating
714 // mutators.
715 atomic.Store(&gcBlackenEnabled, 1)
717 // In STW mode, we could block the instant systemstack
718 // returns, so make sure we're not preemptible.
719 mp = acquirem()
721 // Concurrent mark.
722 systemstack(func() {
723 now = startTheWorldWithSema(trace.enabled)
724 work.pauseNS += now - work.pauseStart
725 work.tMark = now
726 memstats.gcPauseDist.record(now - work.pauseStart)
729 // Release the world sema before Gosched() in STW mode
730 // because we will need to reacquire it later but before
731 // this goroutine becomes runnable again, and we could
732 // self-deadlock otherwise.
733 semrelease(&worldsema)
734 releasem(mp)
736 // Make sure we block instead of returning to user code
737 // in STW mode.
738 if mode != gcBackgroundMode {
739 Gosched()
742 semrelease(&work.startSema)
745 // gcMarkDoneFlushed counts the number of P's with flushed work.
747 // Ideally this would be a captured local in gcMarkDone, but forEachP
748 // escapes its callback closure, so it can't capture anything.
750 // This is protected by markDoneSema.
751 var gcMarkDoneFlushed uint32
753 // gcMarkDone transitions the GC from mark to mark termination if all
754 // reachable objects have been marked (that is, there are no grey
755 // objects and there can be no more in the future). Otherwise, it flushes
756 // all local work to the global queues where it can be discovered by
757 // other workers.
759 // This should be called when all local mark work has been drained and
760 // there are no remaining workers. Specifically, when
762 // work.nwait == work.nproc && !gcMarkWorkAvailable(p)
764 // The calling context must be preemptible.
766 // Flushing local work is important because idle Ps may have local
767 // work queued. This is the only way to make that work visible and
768 // drive GC to completion.
770 // It is explicitly okay to have write barriers in this function. If
771 // it does transition to mark termination, then all reachable objects
772 // have been marked, so the write barrier cannot shade any more
773 // objects.
774 func gcMarkDone() {
775 // Ensure only one thread is running the ragged barrier at a
776 // time.
777 semacquire(&work.markDoneSema)
779 top:
780 // Re-check transition condition under transition lock.
782 // It's critical that this checks the global work queues are
783 // empty before performing the ragged barrier. Otherwise,
784 // there could be global work that a P could take after the P
785 // has passed the ragged barrier.
786 if !(gcphase == _GCmark && work.nwait == work.nproc && !gcMarkWorkAvailable(nil)) {
787 semrelease(&work.markDoneSema)
788 return
791 // forEachP needs worldsema to execute, and we'll need it to
792 // stop the world later, so acquire worldsema now.
793 semacquire(&worldsema)
795 // Flush all local buffers and collect flushedWork flags.
796 gcMarkDoneFlushed = 0
797 systemstack(func() {
798 gp := getg().m.curg
799 // Mark the user stack as preemptible so that it may be scanned.
800 // Otherwise, our attempt to force all P's to a safepoint could
801 // result in a deadlock as we attempt to preempt a worker that's
802 // trying to preempt us (e.g. for a stack scan).
803 casgstatus(gp, _Grunning, _Gwaiting)
804 forEachP(func(_p_ *p) {
805 // Flush the write barrier buffer, since this may add
806 // work to the gcWork.
807 wbBufFlush1(_p_)
809 // Flush the gcWork, since this may create global work
810 // and set the flushedWork flag.
812 // TODO(austin): Break up these workbufs to
813 // better distribute work.
814 _p_.gcw.dispose()
815 // Collect the flushedWork flag.
816 if _p_.gcw.flushedWork {
817 atomic.Xadd(&gcMarkDoneFlushed, 1)
818 _p_.gcw.flushedWork = false
821 casgstatus(gp, _Gwaiting, _Grunning)
824 if gcMarkDoneFlushed != 0 {
825 // More grey objects were discovered since the
826 // previous termination check, so there may be more
827 // work to do. Keep going. It's possible the
828 // transition condition became true again during the
829 // ragged barrier, so re-check it.
830 semrelease(&worldsema)
831 goto top
834 // There was no global work, no local work, and no Ps
835 // communicated work since we took markDoneSema. Therefore
836 // there are no grey objects and no more objects can be
837 // shaded. Transition to mark termination.
838 now := nanotime()
839 work.tMarkTerm = now
840 work.pauseStart = now
841 getg().m.preemptoff = "gcing"
842 if trace.enabled {
843 traceGCSTWStart(0)
845 systemstack(stopTheWorldWithSema)
846 // The gcphase is _GCmark; it will transition to _GCmarktermination
847 // below. The important thing is that the write barrier remains active until
848 // all marking is complete. This includes writes made by the GC.
850 // There is sometimes work left over when we enter mark termination due
851 // to write barriers performed after the completion barrier above.
852 // Detect this and resume concurrent mark. This is obviously
853 // unfortunate.
855 // See issue #27993 for details.
857 // Switch to the system stack to call wbBufFlush1, though in this case
858 // it doesn't matter because we're non-preemptible anyway.
859 restart := false
860 systemstack(func() {
861 for _, p := range allp {
862 wbBufFlush1(p)
863 if !p.gcw.empty() {
864 restart = true
865 break
869 if restart {
870 getg().m.preemptoff = ""
871 systemstack(func() {
872 now := startTheWorldWithSema(true)
873 work.pauseNS += now - work.pauseStart
874 memstats.gcPauseDist.record(now - work.pauseStart)
876 semrelease(&worldsema)
877 goto top
880 // Disable assists and background workers. We must do
881 // this before waking blocked assists.
882 atomic.Store(&gcBlackenEnabled, 0)
884 // Wake all blocked assists. These will run when we
885 // start the world again.
886 gcWakeAllAssists()
888 // Likewise, release the transition lock. Blocked
889 // workers and assists will run when we start the
890 // world again.
891 semrelease(&work.markDoneSema)
893 // In STW mode, re-enable user goroutines. These will be
894 // queued to run after we start the world.
895 schedEnableUser(true)
897 // endCycle depends on all gcWork cache stats being flushed.
898 // The termination algorithm above ensured that they are flushed
899 // up to the allocations since the ragged barrier.
900 nextTriggerRatio := gcController.endCycle(now, int(gomaxprocs), work.userForced)
902 // Perform mark termination. This will restart the world.
903 gcMarkTermination(nextTriggerRatio)
906 // World must be stopped and mark assists and background workers must be
907 // disabled.
908 func gcMarkTermination(nextTriggerRatio float64) {
909 // Start marktermination (write barrier remains enabled for now).
910 setGCPhase(_GCmarktermination)
912 work.heap1 = gcController.heapLive
913 startTime := nanotime()
915 mp := acquirem()
916 mp.preemptoff = "gcing"
917 _g_ := getg()
918 _g_.m.traceback = 2
919 gp := _g_.m.curg
920 casgstatus(gp, _Grunning, _Gwaiting)
921 gp.waitreason = waitReasonGarbageCollection
923 // Run gc on the g0 stack. We do this so that the g stack
924 // we're currently running on will no longer change. Cuts
925 // the root set down a bit (g0 stacks are not scanned, and
926 // we don't need to scan gc's internal state). We also
927 // need to switch to g0 so we can shrink the stack.
928 systemstack(func() {
929 gcMark(startTime)
930 // Must return immediately.
931 // The outer function's stack may have moved
932 // during gcMark (it shrinks stacks, including the
933 // outer function's stack), so we must not refer
934 // to any of its variables. Return back to the
935 // non-system stack to pick up the new addresses
936 // before continuing.
939 systemstack(func() {
940 work.heap2 = work.bytesMarked
941 if debug.gccheckmark > 0 {
942 // Run a full non-parallel, stop-the-world
943 // mark using checkmark bits, to check that we
944 // didn't forget to mark anything during the
945 // concurrent mark process.
946 startCheckmarks()
947 gcResetMarkState()
948 gcw := &getg().m.p.ptr().gcw
949 gcDrain(gcw, 0)
950 wbBufFlush1(getg().m.p.ptr())
951 gcw.dispose()
952 endCheckmarks()
955 // marking is complete so we can turn the write barrier off
956 setGCPhase(_GCoff)
957 gcSweep(work.mode)
960 _g_.m.traceback = 0
961 casgstatus(gp, _Gwaiting, _Grunning)
963 if trace.enabled {
964 traceGCDone()
967 // all done
968 mp.preemptoff = ""
970 if gcphase != _GCoff {
971 throw("gc done but gcphase != _GCoff")
974 // Record heap_inuse for scavenger.
975 memstats.last_heap_inuse = memstats.heap_inuse
977 // Update GC trigger and pacing for the next cycle.
978 gcController.commit(nextTriggerRatio)
979 gcPaceSweeper(gcController.trigger)
980 gcPaceScavenger(gcController.heapGoal, gcController.lastHeapGoal)
982 // Update timing memstats
983 now := nanotime()
984 sec, nsec, _ := time_now()
985 unixNow := sec*1e9 + int64(nsec)
986 work.pauseNS += now - work.pauseStart
987 work.tEnd = now
988 memstats.gcPauseDist.record(now - work.pauseStart)
989 atomic.Store64(&memstats.last_gc_unix, uint64(unixNow)) // must be Unix time to make sense to user
990 atomic.Store64(&memstats.last_gc_nanotime, uint64(now)) // monotonic time for us
991 memstats.pause_ns[memstats.numgc%uint32(len(memstats.pause_ns))] = uint64(work.pauseNS)
992 memstats.pause_end[memstats.numgc%uint32(len(memstats.pause_end))] = uint64(unixNow)
993 memstats.pause_total_ns += uint64(work.pauseNS)
995 // Update work.totaltime.
996 sweepTermCpu := int64(work.stwprocs) * (work.tMark - work.tSweepTerm)
997 // We report idle marking time below, but omit it from the
998 // overall utilization here since it's "free".
999 markCpu := gcController.assistTime + gcController.dedicatedMarkTime + gcController.fractionalMarkTime
1000 markTermCpu := int64(work.stwprocs) * (work.tEnd - work.tMarkTerm)
1001 cycleCpu := sweepTermCpu + markCpu + markTermCpu
1002 work.totaltime += cycleCpu
1004 // Compute overall GC CPU utilization.
1005 totalCpu := sched.totaltime + (now-sched.procresizetime)*int64(gomaxprocs)
1006 memstats.gc_cpu_fraction = float64(work.totaltime) / float64(totalCpu)
1008 // Reset sweep state.
1009 sweep.nbgsweep = 0
1010 sweep.npausesweep = 0
1012 if work.userForced {
1013 memstats.numforcedgc++
1016 // Bump GC cycle count and wake goroutines waiting on sweep.
1017 lock(&work.sweepWaiters.lock)
1018 memstats.numgc++
1019 injectglist(&work.sweepWaiters.list)
1020 unlock(&work.sweepWaiters.lock)
1022 // Finish the current heap profiling cycle and start a new
1023 // heap profiling cycle. We do this before starting the world
1024 // so events don't leak into the wrong cycle.
1025 mProf_NextCycle()
1027 // There may be stale spans in mcaches that need to be swept.
1028 // Those aren't tracked in any sweep lists, so we need to
1029 // count them against sweep completion until we ensure all
1030 // those spans have been forced out.
1031 sl := sweep.active.begin()
1032 if !sl.valid {
1033 throw("failed to set sweep barrier")
1036 systemstack(func() { startTheWorldWithSema(true) })
1038 // Flush the heap profile so we can start a new cycle next GC.
1039 // This is relatively expensive, so we don't do it with the
1040 // world stopped.
1041 mProf_Flush()
1043 // Prepare workbufs for freeing by the sweeper. We do this
1044 // asynchronously because it can take non-trivial time.
1045 prepareFreeWorkbufs()
1047 // Ensure all mcaches are flushed. Each P will flush its own
1048 // mcache before allocating, but idle Ps may not. Since this
1049 // is necessary to sweep all spans, we need to ensure all
1050 // mcaches are flushed before we start the next GC cycle.
1051 systemstack(func() {
1052 forEachP(func(_p_ *p) {
1053 _p_.mcache.prepareForSweep()
1056 // Now that we've swept stale spans in mcaches, they don't
1057 // count against unswept spans.
1058 sweep.active.end(sl)
1060 // Print gctrace before dropping worldsema. As soon as we drop
1061 // worldsema another cycle could start and smash the stats
1062 // we're trying to print.
1063 if debug.gctrace > 0 {
1064 util := int(memstats.gc_cpu_fraction * 100)
1066 var sbuf [24]byte
1067 printlock()
1068 print("gc ", memstats.numgc,
1069 " @", string(itoaDiv(sbuf[:], uint64(work.tSweepTerm-runtimeInitTime)/1e6, 3)), "s ",
1070 util, "%: ")
1071 prev := work.tSweepTerm
1072 for i, ns := range []int64{work.tMark, work.tMarkTerm, work.tEnd} {
1073 if i != 0 {
1074 print("+")
1076 print(string(fmtNSAsMS(sbuf[:], uint64(ns-prev))))
1077 prev = ns
1079 print(" ms clock, ")
1080 for i, ns := range []int64{sweepTermCpu, gcController.assistTime, gcController.dedicatedMarkTime + gcController.fractionalMarkTime, gcController.idleMarkTime, markTermCpu} {
1081 if i == 2 || i == 3 {
1082 // Separate mark time components with /.
1083 print("/")
1084 } else if i != 0 {
1085 print("+")
1087 print(string(fmtNSAsMS(sbuf[:], uint64(ns))))
1089 print(" ms cpu, ",
1090 work.heap0>>20, "->", work.heap1>>20, "->", work.heap2>>20, " MB, ",
1091 work.heapGoal>>20, " MB goal, ",
1092 gcController.stackScan>>20, " MB stacks, ",
1093 gcController.globalsScan>>20, " MB globals, ",
1094 work.maxprocs, " P")
1095 if work.userForced {
1096 print(" (forced)")
1098 print("\n")
1099 printunlock()
1102 semrelease(&worldsema)
1103 semrelease(&gcsema)
1104 // Careful: another GC cycle may start now.
1106 releasem(mp)
1107 mp = nil
1109 // now that gc is done, kick off finalizer thread if needed
1110 if !concurrentSweep {
1111 // give the queued finalizers, if any, a chance to run
1112 Gosched()
1116 // gcBgMarkStartWorkers prepares background mark worker goroutines. These
1117 // goroutines will not run until the mark phase, but they must be started while
1118 // the world is not stopped and from a regular G stack. The caller must hold
1119 // worldsema.
1120 func gcBgMarkStartWorkers() {
1121 // Background marking is performed by per-P G's. Ensure that each P has
1122 // a background GC G.
1124 // Worker Gs don't exit if gomaxprocs is reduced. If it is raised
1125 // again, we can reuse the old workers; no need to create new workers.
1126 for gcBgMarkWorkerCount < gomaxprocs {
1127 expectSystemGoroutine()
1128 go gcBgMarkWorker()
1130 notetsleepg(&work.bgMarkReady, -1)
1131 noteclear(&work.bgMarkReady)
1132 // The worker is now guaranteed to be added to the pool before
1133 // its P's next findRunnableGCWorker.
1135 gcBgMarkWorkerCount++
1139 // gcBgMarkPrepare sets up state for background marking.
1140 // Mutator assists must not yet be enabled.
1141 func gcBgMarkPrepare() {
1142 // Background marking will stop when the work queues are empty
1143 // and there are no more workers (note that, since this is
1144 // concurrent, this may be a transient state, but mark
1145 // termination will clean it up). Between background workers
1146 // and assists, we don't really know how many workers there
1147 // will be, so we pretend to have an arbitrarily large number
1148 // of workers, almost all of which are "waiting". While a
1149 // worker is working it decrements nwait. If nproc == nwait,
1150 // there are no workers.
1151 work.nproc = ^uint32(0)
1152 work.nwait = ^uint32(0)
1155 // gcBgMarkWorker is an entry in the gcBgMarkWorkerPool. It points to a single
1156 // gcBgMarkWorker goroutine.
1157 type gcBgMarkWorkerNode struct {
1158 // Unused workers are managed in a lock-free stack. This field must be first.
1159 node lfnode
1161 // The g of this worker.
1162 gp guintptr
1164 // Release this m on park. This is used to communicate with the unlock
1165 // function, which cannot access the G's stack. It is unused outside of
1166 // gcBgMarkWorker().
1167 m muintptr
1170 func gcBgMarkWorker() {
1171 setSystemGoroutine()
1173 gp := getg()
1175 // We pass node to a gopark unlock function, so it can't be on
1176 // the stack (see gopark). Prevent deadlock from recursively
1177 // starting GC by disabling preemption.
1178 gp.m.preemptoff = "GC worker init"
1179 node := new(gcBgMarkWorkerNode)
1180 gp.m.preemptoff = ""
1182 node.gp.set(gp)
1184 node.m.set(acquirem())
1185 notewakeup(&work.bgMarkReady)
1186 // After this point, the background mark worker is generally scheduled
1187 // cooperatively by gcController.findRunnableGCWorker. While performing
1188 // work on the P, preemption is disabled because we are working on
1189 // P-local work buffers. When the preempt flag is set, this puts itself
1190 // into _Gwaiting to be woken up by gcController.findRunnableGCWorker
1191 // at the appropriate time.
1193 // When preemption is enabled (e.g., while in gcMarkDone), this worker
1194 // may be preempted and scheduled as a _Grunnable G from a runq. That is
1195 // fine; it will eventually gopark again for further scheduling via
1196 // findRunnableGCWorker.
1198 // Since we disable preemption before notifying bgMarkReady, we
1199 // guarantee that this G will be in the worker pool for the next
1200 // findRunnableGCWorker. This isn't strictly necessary, but it reduces
1201 // latency between _GCmark starting and the workers starting.
1203 for {
1204 // Go to sleep until woken by
1205 // gcController.findRunnableGCWorker.
1206 gopark(func(g *g, nodep unsafe.Pointer) bool {
1207 node := (*gcBgMarkWorkerNode)(nodep)
1209 if mp := node.m.ptr(); mp != nil {
1210 // The worker G is no longer running; release
1211 // the M.
1213 // N.B. it is _safe_ to release the M as soon
1214 // as we are no longer performing P-local mark
1215 // work.
1217 // However, since we cooperatively stop work
1218 // when gp.preempt is set, if we releasem in
1219 // the loop then the following call to gopark
1220 // would immediately preempt the G. This is
1221 // also safe, but inefficient: the G must
1222 // schedule again only to enter gopark and park
1223 // again. Thus, we defer the release until
1224 // after parking the G.
1225 releasem(mp)
1228 // Release this G to the pool.
1229 gcBgMarkWorkerPool.push(&node.node)
1230 // Note that at this point, the G may immediately be
1231 // rescheduled and may be running.
1232 return true
1233 }, unsafe.Pointer(node), waitReasonGCWorkerIdle, traceEvGoBlock, 0)
1235 // Preemption must not occur here, or another G might see
1236 // p.gcMarkWorkerMode.
1238 // Disable preemption so we can use the gcw. If the
1239 // scheduler wants to preempt us, we'll stop draining,
1240 // dispose the gcw, and then preempt.
1241 node.m.set(acquirem())
1242 pp := gp.m.p.ptr() // P can't change with preemption disabled.
1244 if gcBlackenEnabled == 0 {
1245 println("worker mode", pp.gcMarkWorkerMode)
1246 throw("gcBgMarkWorker: blackening not enabled")
1249 if pp.gcMarkWorkerMode == gcMarkWorkerNotWorker {
1250 throw("gcBgMarkWorker: mode not set")
1253 startTime := nanotime()
1254 pp.gcMarkWorkerStartTime = startTime
1256 decnwait := atomic.Xadd(&work.nwait, -1)
1257 if decnwait == work.nproc {
1258 println("runtime: work.nwait=", decnwait, "work.nproc=", work.nproc)
1259 throw("work.nwait was > work.nproc")
1262 systemstack(func() {
1263 // Mark our goroutine preemptible so its stack
1264 // can be scanned. This lets two mark workers
1265 // scan each other (otherwise, they would
1266 // deadlock). We must not modify anything on
1267 // the G stack. However, stack shrinking is
1268 // disabled for mark workers, so it is safe to
1269 // read from the G stack.
1270 casgstatus(gp, _Grunning, _Gwaiting)
1271 switch pp.gcMarkWorkerMode {
1272 default:
1273 throw("gcBgMarkWorker: unexpected gcMarkWorkerMode")
1274 case gcMarkWorkerDedicatedMode:
1275 gcDrain(&pp.gcw, gcDrainUntilPreempt|gcDrainFlushBgCredit)
1276 if gp.preempt {
1277 // We were preempted. This is
1278 // a useful signal to kick
1279 // everything out of the run
1280 // queue so it can run
1281 // somewhere else.
1282 if drainQ, n := runqdrain(pp); n > 0 {
1283 lock(&sched.lock)
1284 globrunqputbatch(&drainQ, int32(n))
1285 unlock(&sched.lock)
1288 // Go back to draining, this time
1289 // without preemption.
1290 gcDrain(&pp.gcw, gcDrainFlushBgCredit)
1291 case gcMarkWorkerFractionalMode:
1292 gcDrain(&pp.gcw, gcDrainFractional|gcDrainUntilPreempt|gcDrainFlushBgCredit)
1293 case gcMarkWorkerIdleMode:
1294 gcDrain(&pp.gcw, gcDrainIdle|gcDrainUntilPreempt|gcDrainFlushBgCredit)
1296 casgstatus(gp, _Gwaiting, _Grunning)
1299 // Account for time.
1300 duration := nanotime() - startTime
1301 gcController.logWorkTime(pp.gcMarkWorkerMode, duration)
1302 if pp.gcMarkWorkerMode == gcMarkWorkerFractionalMode {
1303 atomic.Xaddint64(&pp.gcFractionalMarkTime, duration)
1306 // Was this the last worker and did we run out
1307 // of work?
1308 incnwait := atomic.Xadd(&work.nwait, +1)
1309 if incnwait > work.nproc {
1310 println("runtime: p.gcMarkWorkerMode=", pp.gcMarkWorkerMode,
1311 "work.nwait=", incnwait, "work.nproc=", work.nproc)
1312 throw("work.nwait > work.nproc")
1315 // We'll releasem after this point and thus this P may run
1316 // something else. We must clear the worker mode to avoid
1317 // attributing the mode to a different (non-worker) G in
1318 // traceGoStart.
1319 pp.gcMarkWorkerMode = gcMarkWorkerNotWorker
1321 // If this worker reached a background mark completion
1322 // point, signal the main GC goroutine.
1323 if incnwait == work.nproc && !gcMarkWorkAvailable(nil) {
1324 // We don't need the P-local buffers here, allow
1325 // preemption because we may schedule like a regular
1326 // goroutine in gcMarkDone (block on locks, etc).
1327 releasem(node.m.ptr())
1328 node.m.set(nil)
1330 gcMarkDone()
1335 // gcMarkWorkAvailable reports whether executing a mark worker
1336 // on p is potentially useful. p may be nil, in which case it only
1337 // checks the global sources of work.
1338 func gcMarkWorkAvailable(p *p) bool {
1339 if p != nil && !p.gcw.empty() {
1340 return true
1342 if !work.full.empty() {
1343 return true // global work available
1345 if work.markrootNext < work.markrootJobs {
1346 return true // root scan work available
1348 return false
1351 // gcMark runs the mark (or, for concurrent GC, mark termination).
1352 // All gcWork caches must be empty.
1353 // STW is in effect at this point.
1354 func gcMark(startTime int64) {
1355 if debug.allocfreetrace > 0 {
1356 tracegc()
1359 if gcphase != _GCmarktermination {
1360 throw("in gcMark expecting to see gcphase as _GCmarktermination")
1362 work.tstart = startTime
1364 // Check that there's no marking work remaining.
1365 if work.full != 0 || work.markrootNext < work.markrootJobs {
1366 print("runtime: full=", hex(work.full), " next=", work.markrootNext, " jobs=", work.markrootJobs, " nDataRoots=", work.nDataRoots, " nSpanRoots=", work.nSpanRoots, " nStackRoots=", work.nStackRoots, "\n")
1367 panic("non-empty mark queue after concurrent mark")
1370 if debug.gccheckmark > 0 {
1371 // This is expensive when there's a large number of
1372 // Gs, so only do it if checkmark is also enabled.
1373 gcMarkRootCheck()
1375 if work.full != 0 {
1376 throw("work.full != 0")
1379 // Drop allg snapshot. allgs may have grown, in which case
1380 // this is the only reference to the old backing store and
1381 // there's no need to keep it around.
1382 work.stackRoots = nil
1384 // Clear out buffers and double-check that all gcWork caches
1385 // are empty. This should be ensured by gcMarkDone before we
1386 // enter mark termination.
1388 // TODO: We could clear out buffers just before mark if this
1389 // has a non-negligible impact on STW time.
1390 for _, p := range allp {
1391 // The write barrier may have buffered pointers since
1392 // the gcMarkDone barrier. However, since the barrier
1393 // ensured all reachable objects were marked, all of
1394 // these must be pointers to black objects. Hence we
1395 // can just discard the write barrier buffer.
1396 if debug.gccheckmark > 0 {
1397 // For debugging, flush the buffer and make
1398 // sure it really was all marked.
1399 wbBufFlush1(p)
1400 } else {
1401 p.wbBuf.reset()
1404 gcw := &p.gcw
1405 if !gcw.empty() {
1406 printlock()
1407 print("runtime: P ", p.id, " flushedWork ", gcw.flushedWork)
1408 if gcw.wbuf1 == nil {
1409 print(" wbuf1=<nil>")
1410 } else {
1411 print(" wbuf1.n=", gcw.wbuf1.nobj)
1413 if gcw.wbuf2 == nil {
1414 print(" wbuf2=<nil>")
1415 } else {
1416 print(" wbuf2.n=", gcw.wbuf2.nobj)
1418 print("\n")
1419 throw("P has cached GC work at end of mark termination")
1421 // There may still be cached empty buffers, which we
1422 // need to flush since we're going to free them. Also,
1423 // there may be non-zero stats because we allocated
1424 // black after the gcMarkDone barrier.
1425 gcw.dispose()
1428 // Flush scanAlloc from each mcache since we're about to modify
1429 // heapScan directly. If we were to flush this later, then scanAlloc
1430 // might have incorrect information.
1432 // Note that it's not important to retain this information; we know
1433 // exactly what heapScan is at this point via scanWork.
1434 for _, p := range allp {
1435 c := p.mcache
1436 if c == nil {
1437 continue
1439 c.scanAlloc = 0
1442 // Reset controller state.
1443 gcController.resetLive(work.bytesMarked)
1446 // gcSweep must be called on the system stack because it acquires the heap
1447 // lock. See mheap for details.
1449 // The world must be stopped.
1451 //go:systemstack
1452 func gcSweep(mode gcMode) {
1453 assertWorldStopped()
1455 if gcphase != _GCoff {
1456 throw("gcSweep being done but phase is not GCoff")
1459 lock(&mheap_.lock)
1460 mheap_.sweepgen += 2
1461 sweep.active.reset()
1462 mheap_.pagesSwept.Store(0)
1463 mheap_.sweepArenas = mheap_.allArenas
1464 mheap_.reclaimIndex.Store(0)
1465 mheap_.reclaimCredit.Store(0)
1466 unlock(&mheap_.lock)
1468 sweep.centralIndex.clear()
1470 if !_ConcurrentSweep || mode == gcForceBlockMode {
1471 // Special case synchronous sweep.
1472 // Record that no proportional sweeping has to happen.
1473 lock(&mheap_.lock)
1474 mheap_.sweepPagesPerByte = 0
1475 unlock(&mheap_.lock)
1476 // Sweep all spans eagerly.
1477 for sweepone() != ^uintptr(0) {
1478 sweep.npausesweep++
1480 // Free workbufs eagerly.
1481 prepareFreeWorkbufs()
1482 for freeSomeWbufs(false) {
1484 // All "free" events for this mark/sweep cycle have
1485 // now happened, so we can make this profile cycle
1486 // available immediately.
1487 mProf_NextCycle()
1488 mProf_Flush()
1489 return
1492 // Background sweep.
1493 lock(&sweep.lock)
1494 if sweep.parked {
1495 sweep.parked = false
1496 ready(sweep.g, 0, true)
1498 unlock(&sweep.lock)
1501 // gcResetMarkState resets global state prior to marking (concurrent
1502 // or STW) and resets the stack scan state of all Gs.
1504 // This is safe to do without the world stopped because any Gs created
1505 // during or after this will start out in the reset state.
1507 // gcResetMarkState must be called on the system stack because it acquires
1508 // the heap lock. See mheap for details.
1510 //go:systemstack
1511 func gcResetMarkState() {
1512 // This may be called during a concurrent phase, so lock to make sure
1513 // allgs doesn't change.
1514 forEachG(func(gp *g) {
1515 gp.gcscandone = false // set to true in gcphasework
1516 gp.gcAssistBytes = 0
1519 // Clear page marks. This is just 1MB per 64GB of heap, so the
1520 // time here is pretty trivial.
1521 lock(&mheap_.lock)
1522 arenas := mheap_.allArenas
1523 unlock(&mheap_.lock)
1524 for _, ai := range arenas {
1525 ha := mheap_.arenas[ai.l1()][ai.l2()]
1526 for i := range ha.pageMarks {
1527 ha.pageMarks[i] = 0
1531 work.bytesMarked = 0
1532 work.initialHeapLive = atomic.Load64(&gcController.heapLive)
1535 // Hooks for other packages
1537 var poolcleanup func()
1539 //go:linkname sync_runtime_registerPoolCleanup sync.runtime__registerPoolCleanup
1540 func sync_runtime_registerPoolCleanup(f func()) {
1541 poolcleanup = f
1544 func clearpools() {
1545 // clear sync.Pools
1546 if poolcleanup != nil {
1547 poolcleanup()
1550 // Clear central sudog cache.
1551 // Leave per-P caches alone, they have strictly bounded size.
1552 // Disconnect cached list before dropping it on the floor,
1553 // so that a dangling ref to one entry does not pin all of them.
1554 lock(&sched.sudoglock)
1555 var sg, sgnext *sudog
1556 for sg = sched.sudogcache; sg != nil; sg = sgnext {
1557 sgnext = sg.next
1558 sg.next = nil
1560 sched.sudogcache = nil
1561 unlock(&sched.sudoglock)
1563 // Clear central defer pool.
1564 // Leave per-P pools alone, they have strictly bounded size.
1565 lock(&sched.deferlock)
1566 // disconnect cached list before dropping it on the floor,
1567 // so that a dangling ref to one entry does not pin all of them.
1568 var d, dlink *_defer
1569 for d = sched.deferpool; d != nil; d = dlink {
1570 dlink = d.link
1571 d.link = nil
1573 sched.deferpool = nil
1574 unlock(&sched.deferlock)
1577 // Timing
1579 // itoaDiv formats val/(10**dec) into buf.
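// For example (a small illustration, assuming buf is large enough):
//
//	itoaDiv(buf[:], 12345, 3) // "12.345"
//	itoaDiv(buf[:], 42, 0)    // "42"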
1580 func itoaDiv(buf []byte, val uint64, dec int) []byte {
1581 i := len(buf) - 1
1582 idec := i - dec
1583 for val >= 10 || i >= idec {
1584 buf[i] = byte(val%10 + '0')
1585 i--
1586 if i == idec {
1587 buf[i] = '.'
1588 i--
1589 }
1590 val /= 10
1591 }
1592 buf[i] = byte(val + '0')
1593 return buf[i:]
1596 // fmtNSAsMS nicely formats ns nanoseconds as milliseconds.
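// For example (a small illustration, assuming buf is large enough):
//
//	fmtNSAsMS(buf[:], 1234567)  // "1.2"  (1.234567ms, two significant digits)
//	fmtNSAsMS(buf[:], 25000000) // "25"   (>= 10ms formats as whole milliseconds)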
1597 func fmtNSAsMS(buf []byte, ns uint64) []byte {
1598 if ns >= 10e6 {
1599 // Format as whole milliseconds.
1600 return itoaDiv(buf, ns/1e6, 0)
1602 // Format two digits of precision, with at most three decimal places.
1603 x := ns / 1e3
1604 if x == 0 {
1605 buf[0] = '0'
1606 return buf[:1]
1608 dec := 3
1609 for x >= 100 {
1610 x /= 10
1611 dec--
1613 return itoaDiv(buf, x, dec)
1616 // Helpers for testing GC.
1618 // gcTestIsReachable performs a GC and returns a bit set where bit i
1619 // is set if ptrs[i] is reachable.
1620 func gcTestIsReachable(ptrs ...unsafe.Pointer) (mask uint64) {
1621 // This takes the pointers as unsafe.Pointers in order to keep
1622 // them live long enough for us to attach specials. After
1623 // that, we drop our references to them.
1625 if len(ptrs) > 64 {
1626 panic("too many pointers for uint64 mask")
1629 // Block GC while we attach specials and drop our references
1630 // to ptrs. Otherwise, if a GC is in progress, it could mark
1631 // them reachable via this function before we have a chance to
1632 // drop them.
1633 semacquire(&gcsema)
1635 // Create reachability specials for ptrs.
1636 specials := make([]*specialReachable, len(ptrs))
1637 for i, p := range ptrs {
1638 lock(&mheap_.speciallock)
1639 s := (*specialReachable)(mheap_.specialReachableAlloc.alloc())
1640 unlock(&mheap_.speciallock)
1641 s.special.kind = _KindSpecialReachable
1642 if !addspecial(p, &s.special) {
1643 throw("already have a reachable special (duplicate pointer?)")
1645 specials[i] = s
1646 // Make sure we don't retain ptrs.
1647 ptrs[i] = nil
1650 semrelease(&gcsema)
1652 // Force a full GC and sweep.
1653 GC()
1655 // Process specials.
1656 for i, s := range specials {
1657 if !s.done {
1658 printlock()
1659 println("runtime: object", i, "was not swept")
1660 throw("IsReachable failed")
1662 if s.reachable {
1663 mask |= 1 << i
1665 lock(&mheap_.speciallock)
1666 mheap_.specialReachableAlloc.free(unsafe.Pointer(s))
1667 unlock(&mheap_.speciallock)
1670 return mask
1673 // onCurrentStack reports whether the argument is on the current stack.
1674 // It is implemented in C.
1675 func onCurrentStack(uintptr) bool
1677 // getBSS returns the start of the BSS section.
1678 // It is implemented in C.
1679 func getBSS() uintptr
1681 // gcTestPointerClass returns the category of what p points to, one of:
1682 // "heap", "stack", "data", "bss", "other". This is useful for checking
1683 // that a test is doing what it's intended to do.
1685 // This is nosplit simply to avoid extra pointer shuffling that may
1686 // complicate a test.
1688 //go:nosplit
1689 func gcTestPointerClass(p unsafe.Pointer) string {
1690 p2 := uintptr(noescape(p))
1691 if onCurrentStack(p2) {
1692 return "stack"
1694 if base, _, _ := findObject(p2, 0, 0, false); base != 0 {
1695 return "heap"
1697 bss := getBSS()
1698 if p2 >= getText() && p2 < bss {
1699 return "data"
1701 if p2 >= bss && p2 < getEnd() {
1702 return "bss"
1704 KeepAlive(p)
1705 return "other"