Bug 1854550 - pt 1. Fix a comment r=glandium
[gecko.git] memory/replace/phc/PHC.cpp (blob 5ff11be5b5f63ab1a0a2f7ada64e97d5e3478b28)
1 /* -*- Mode: C++; tab-width: 8; indent-tabs-mode: nil; c-basic-offset: 2 -*- */
2 /* vim: set ts=8 sts=2 et sw=2 tw=80: */
3 /* This Source Code Form is subject to the terms of the Mozilla Public
4 * License, v. 2.0. If a copy of the MPL was not distributed with this
5 * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
7 // PHC is a probabilistic heap checker. A tiny fraction of randomly chosen heap
8 // allocations are subject to some expensive checking via the use of OS page
9 // access protection. A failed check triggers a crash, whereupon useful
10 // information about the failure is put into the crash report. The cost and
11 // coverage for each user is minimal, but spread over the entire user base the
12 // coverage becomes significant.
14 // The idea comes from Chromium, where it is called GWP-ASAN. (Firefox uses PHC
15 // as the name because GWP-ASAN is long, awkward, and doesn't have any
16 // particular meaning.)
18 // In the current implementation up to kNumAllocPages allocations per process
19 // can become PHC allocations; these must be page-sized or smaller. Each PHC
20 // allocation gets its own page, and when the allocation is freed its page is
21 // marked inaccessible until the page is reused for another allocation. This
22 // means that a use-after-free defect (which includes double-frees) will be
23 // caught if the use occurs before the page is reused for another allocation.
24 // The crash report will contain stack traces for the allocation site, the free
25 // site, and the use-after-free site, which is often enough to diagnose the
26 // defect.
28 // Also, each PHC allocation is followed by a guard page. The PHC allocation is
29 // positioned so that its end abuts the guard page (or as close as possible,
30 // given alignment constraints). This means that a bounds violation at the end
31 // of the allocation (overflow) will be caught. The crash report will contain
32 // stack traces for the allocation site and the bounds violation use site,
33 // which is often enough to diagnose the defect.
35 // (A bounds violation at the start of the allocation (underflow) will not be
36 // caught, unless it is sufficiently large to hit the preceding allocation's
37 // guard page, which is not that likely. It would be possible to look more
38 // assiduously for underflow by randomly placing some allocations at the end of
39 // the page and some at the start of the page, and GWP-ASAN does this. PHC does
40 // not, however, because overflow is likely to be much more common than
41 // underflow in practice.)
43 // We use a simple heuristic to categorize a guard page access as overflow or
44 // underflow: if the address falls in the lower half of the guard page, we
45 // assume it is overflow, otherwise we assume it is underflow. More
46 // sophisticated heuristics are possible, but this one is very simple, and it is
47 // likely that most overflows/underflows in practice are very close to the page
48 // boundary.
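//
// As a minimal sketch (assuming a pointer already known to lie within a guard
// page; the real check is in PHCBridge::IsPHCAllocation() near the end of this
// file), the heuristic is just:
//
//   bool isLikelyOverflow = (uintptr_t(aPtr) % kPageSize) < (kPageSize / 2);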
50 // The design space for the randomization strategy is large. The current
51 // implementation has a large random delay before it starts operating, and a
52 // small random delay between each PHC allocation attempt. Each freed PHC
53 // allocation is quarantined for a medium random delay before being reused, in
54 // order to increase the chance of catching UAFs.
56 // The basic cost of PHC's operation is as follows.
58 // - The physical memory cost is kNumAllocPages pages plus some metadata
59 //   (including stack traces) for each page. This amounts to 16 MiB per
60 //   process on architectures with 4 KiB pages, and also 16 MiB on
61 //   macOS/AArch64, which uses 16 KiB pages but a quarter as many pages.
63 // - The virtual memory cost is the physical memory cost plus the guard pages:
64 //   roughly another kNumAllocPages pages, i.e. another 16 MiB per process.
65 //   PHC is currently only enabled on 64-bit platforms so the impact of the
66 //   virtual memory usage is negligible.
69 // - Every allocation requires a size check and a decrement-and-check of an
70 // atomic counter. When the counter reaches zero a PHC allocation can occur,
71 // which involves marking a page as accessible and getting a stack trace for
72 // the allocation site. Otherwise, mozjemalloc performs the allocation.
74 // - Every deallocation requires a range check on the pointer to see if it
75 // involves a PHC allocation. (The choice to only do PHC allocations that are
76 //   a page or smaller enables this range check, because the allocation pages are
77 // contiguous. Allowing larger allocations would make this more complicated,
78 // and we definitely don't want something as slow as a hash table lookup on
79 // every deallocation.) PHC deallocations involve marking a page as
80 // inaccessible and getting a stack trace for the deallocation site.
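//
// As an illustrative sketch, the range check on each deallocation boils down
// to a single pair of pointer comparisons (see the PtrKind constructor below):
//
//   bool mightBePhc = aPagesStart <= aPtr && aPtr < aPagesLimit;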
82 // Note that calls to realloc(), free(), and malloc_usable_size() will
83 // immediately crash if the given pointer falls within a page allocation's
84 // page, but does not point to the start of the allocation itself.
86 // void* p = malloc(64);
87 // free(p + 1); // p+1 doesn't point to the allocation start; crash
89 // Such crashes will not have the PHC fields in the crash report.
91 // PHC-specific tests can be run with the following commands:
92 // - gtests: `./mach gtest '*PHC*'`
93 // - xpcshell-tests: `./mach test toolkit/crashreporter/test/unit`
94 // - This runs some non-PHC tests as well.
96 #include "PHC.h"
98 #include <stdlib.h>
99 #include <time.h>
101 #include <algorithm>
103 #ifdef XP_WIN
104 # include <process.h>
105 #else
106 # include <sys/mman.h>
107 # include <sys/types.h>
108 # include <pthread.h>
109 # include <unistd.h>
110 #endif
112 #include "replace_malloc.h"
113 #include "FdPrintf.h"
114 #include "Mutex.h"
115 #include "mozilla/Assertions.h"
116 #include "mozilla/Atomics.h"
117 #include "mozilla/Attributes.h"
118 #include "mozilla/CheckedInt.h"
119 #include "mozilla/Maybe.h"
120 #include "mozilla/StackWalk.h"
121 #include "mozilla/ThreadLocal.h"
122 #include "mozilla/XorShift128PlusRNG.h"
124 using namespace mozilla;
126 //---------------------------------------------------------------------------
127 // Utilities
128 //---------------------------------------------------------------------------
130 #ifdef ANDROID
131 // Android doesn't have pthread_atfork defined in pthread.h.
132 extern "C" MOZ_EXPORT int pthread_atfork(void (*)(void), void (*)(void),
133 void (*)(void));
134 #endif
136 #ifndef DISALLOW_COPY_AND_ASSIGN
137 # define DISALLOW_COPY_AND_ASSIGN(T) \
138 T(const T&); \
139 void operator=(const T&)
140 #endif
142 static malloc_table_t sMallocTable;
144 // This class provides infallible operations for the small number of heap
145 // allocations that PHC does for itself. It would be nice if we could use the
146 // InfallibleAllocPolicy from mozalloc, but PHC cannot use mozalloc.
147 class InfallibleAllocPolicy {
148 public:
149 static void AbortOnFailure(const void* aP) {
150 if (!aP) {
151 MOZ_CRASH("PHC failed to allocate");
155 template <class T>
156 static T* new_() {
157 void* p = sMallocTable.malloc(sizeof(T));
158 AbortOnFailure(p);
159 return new (p) T;
163 //---------------------------------------------------------------------------
164 // Stack traces
165 //---------------------------------------------------------------------------
167 // This code is similar to the equivalent code within DMD.
169 class StackTrace : public phc::StackTrace {
170 public:
171 StackTrace() = default;
173 void Clear() { mLength = 0; }
175 void Fill();
177 private:
178 static void StackWalkCallback(uint32_t aFrameNumber, void* aPc, void* aSp,
179 void* aClosure) {
180 StackTrace* st = (StackTrace*)aClosure;
181 MOZ_ASSERT(st->mLength < kMaxFrames);
182 st->mPcs[st->mLength] = aPc;
183 st->mLength++;
184 MOZ_ASSERT(st->mLength == aFrameNumber);
188 // WARNING WARNING WARNING: this function must only be called when GMut::sMutex
189 // is *not* locked, otherwise we might get deadlocks.
191 // How? On Windows, MozStackWalk() can lock a mutex, M, from the shared library
192 // loader. Another thread might call malloc() while holding M locked (when
193 // loading a shared library) and try to lock GMut::sMutex, causing a deadlock.
194 // So GMut::sMutex can't be locked during the call to MozStackWalk(). (For
195 // details, see https://bugzilla.mozilla.org/show_bug.cgi?id=374829#c8. On
196 // Linux, something similar can happen; see bug 824340. So we just disallow it
197 // on all platforms.)
199 // In DMD, to avoid this problem we temporarily unlock the equivalent mutex for
200 // the MozStackWalk() call. But that's grotty, and things are a bit different
201 // here, so we just require that stack traces be obtained before locking
202 // GMut::sMutex.
204 // Unfortunately, there is no reliable way at compile-time or run-time to ensure
205 // this pre-condition. Hence this large comment.
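//
// In practice that means callers must follow this order, as MaybePageAlloc()
// and PageFree() below do:
//
//   StackTrace stack;
//   stack.Fill();                      // walk the stack *before*...
//   MutexAutoLock lock(GMut::sMutex);  // ...locking GMut::sMutex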
207 void StackTrace::Fill() {
208 mLength = 0;
210 #if defined(XP_WIN) && defined(_M_IX86)
211 // This avoids MozStackWalk(), which causes unusably slow startup on Win32
212 // when it is called during static initialization (see bug 1241684).
214 // This code is cribbed from the Gecko Profiler, which also uses
215 // FramePointerStackWalk() on Win32: Registers::SyncPopulate() for the
216 // frame pointer, and GetStackTop() for the stack end.
217 CONTEXT context;
218 RtlCaptureContext(&context);
219 void** fp = reinterpret_cast<void**>(context.Ebp);
221 PNT_TIB pTib = reinterpret_cast<PNT_TIB>(NtCurrentTeb());
222 void* stackEnd = static_cast<void*>(pTib->StackBase);
223 FramePointerStackWalk(StackWalkCallback, kMaxFrames, this, fp, stackEnd);
224 #elif defined(XP_MACOSX)
225 // This avoids MozStackWalk(), which has become unusably slow on Mac due to
226 // changes in libunwind.
228 // This code is cribbed from the Gecko Profiler, which also uses
229 // FramePointerStackWalk() on Mac: Registers::SyncPopulate() for the frame
230 // pointer, and GetStackTop() for the stack end.
231 # pragma GCC diagnostic push
232 # pragma GCC diagnostic ignored "-Wframe-address"
233 void** fp = reinterpret_cast<void**>(__builtin_frame_address(1));
234 # pragma GCC diagnostic pop
235 void* stackEnd = pthread_get_stackaddr_np(pthread_self());
236 FramePointerStackWalk(StackWalkCallback, kMaxFrames, this, fp, stackEnd);
237 #else
238 MozStackWalk(StackWalkCallback, nullptr, kMaxFrames, this);
239 #endif
242 //---------------------------------------------------------------------------
243 // Logging
244 //---------------------------------------------------------------------------
246 // Change this to 1 to enable some PHC logging. Useful for debugging.
247 #define PHC_LOGGING 0
249 #if PHC_LOGGING
251 static size_t GetPid() { return size_t(getpid()); }
253 static size_t GetTid() {
254 # if defined(XP_WIN)
255 return size_t(GetCurrentThreadId());
256 # else
257 return size_t(pthread_self());
258 # endif
261 # if defined(XP_WIN)
262 # define LOG_STDERR \
263 reinterpret_cast<intptr_t>(GetStdHandle(STD_ERROR_HANDLE))
264 # else
265 # define LOG_STDERR 2
266 # endif
267 # define LOG(fmt, ...) \
268 FdPrintf(LOG_STDERR, "PHC[%zu,%zu,~%zu] " fmt, GetPid(), GetTid(), \
269 size_t(GAtomic::Now()), __VA_ARGS__)
271 #else
273 # define LOG(fmt, ...)
275 #endif // PHC_LOGGING
277 //---------------------------------------------------------------------------
278 // Global state
279 //---------------------------------------------------------------------------
281 // Throughout this entire file time is measured as the number of sub-page
282 // allocations performed (by PHC and mozjemalloc combined). `Time` is 64-bit
283 // because we could have more than 2**32 allocations in a long-running session.
284 // `Delay` is 32-bit because the delays used within PHC are always much smaller
285 // than 2**32.
286 using Time = uint64_t; // A moment in time.
287 using Delay = uint32_t; // A time duration.
289 // PHC only runs if the page size is 4 KiB; anything more is uncommon and would
290 // use too much memory. So we hardwire this size for all platforms but macOS
291 // on ARM processors. For the latter we make an exception because the minimum
292 // page size supported is 16 KiB, so there's no way to go below that.
293 static const size_t kPageSize =
294 #if defined(XP_MACOSX) && defined(__aarch64__)
295 16384
296 #else
297 4096
298 #endif
301 // We align the PHC area to a multiple of the jemalloc and JS GC chunk size
302 // (both use 1MB aligned chunks) so that their address computations don't lead
303 // from non-PHC memory into PHC memory causing misleading PHC stacks to be
304 // attached to a crash report.
305 static const size_t kPhcAlign = 1024 * 1024;
307 static_assert(IsPowerOfTwo(kPhcAlign));
308 static_assert((kPhcAlign % kPageSize) == 0);
310 // There are two kinds of page.
311 // - Allocation pages, from which allocations are made.
312 // - Guard pages, which are never touched by PHC.
314 // These page kinds are interleaved; each allocation page has a guard page on
315 // either side.
316 static const size_t kNumAllocPages = kPageSize == 4096 ? 4096 : 1024;
317 static const size_t kNumAllPages = kNumAllocPages * 2 + 1;
319 // The total size of the allocation pages and guard pages.
320 static const size_t kAllPagesSize = kNumAllPages * kPageSize;
322 // jemalloc adds a guard page to the end of our allocation, see the comment in
323 // AllocAllPages() for more information.
324 static const size_t kAllPagesJemallocSize = kAllPagesSize - kPageSize;
326 // The default state for PHC. Either Enabled or OnlyFree.
327 #define DEFAULT_STATE mozilla::phc::OnlyFree
329 // The junk value used to fill new allocations in debug builds. It's the same
330 // value as the one used by mozjemalloc. PHC applies it unconditionally in debug
331 // builds. Unlike mozjemalloc, PHC doesn't consult the MALLOC_OPTIONS
332 // environment variable to possibly change that behaviour.
334 // Also note that, unlike mozjemalloc, PHC doesn't have a poison value for freed
335 // allocations because freed allocations are protected by OS page protection.
336 #ifdef DEBUG
337 const uint8_t kAllocJunk = 0xe4;
338 #endif
340 // The maximum time.
341 static const Time kMaxTime = ~(Time(0));
343 // The average delay before doing any page allocations at the start of a
344 // process. Note that roughly 1 million allocations occur in the main process
345 // while starting the browser. The delay range is 1..kAvgFirstAllocDelay*2.
346 static const Delay kAvgFirstAllocDelay = 64 * 1024;
348 // The average delay until the next attempted page allocation, once we get past
349 // the first delay. The delay range is 1..kAvgAllocDelay*2.
350 static const Delay kAvgAllocDelay = 16 * 1024;
352 // The average delay before reusing a freed page. Should be significantly larger
353 // than kAvgAllocDelay, otherwise there's not much point in having it. The
354 // delay range is (kAvgPageReuseDelay / 2)..(kAvgPageReuseDelay / 2 * 3). This
355 // is different to the other delay ranges in not having a minimum of 1, because
356 // that's such a short delay that there is a high likelihood of bad stacks in
357 // any crash report.
358 static const Delay kAvgPageReuseDelay = 256 * 1024;
360 // Truncate aRnd to the range (1 .. AvgDelay*2). If aRnd is random, this
361 // results in an average value of AvgDelay + 0.5, which is close enough to
362 // AvgDelay. AvgDelay must be a power of two for speed (this is enforced by the
363 // static_assert below).
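//
// For example, with AvgDelay = kAvgAllocDelay = 16 * 1024, the result lies in
// the range 1..32768 and averages 16384.5 over uniformly random inputs.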
364 template <Delay AvgDelay>
365 constexpr Delay Rnd64ToDelay(uint64_t aRnd) {
366 static_assert(IsPowerOfTwo(AvgDelay), "must be a power of two");
368 return aRnd % (AvgDelay * 2) + 1;
371 // Maps a pointer to a PHC-specific structure:
372 // - Nothing
373 // - A guard page (it is unspecified which one)
374 // - An allocation page (with an index < kNumAllocPages)
376 // The standard way of handling a PtrKind is to check IsNothing(), and if that
377 // fails, to check IsGuardPage(), and if that fails, to call AllocPageIndex().
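//
// For example, the pattern looks like this (it appears in replace_free() and
// friends below):
//
//   PtrKind pk = gConst->PtrKind(aPtr);
//   if (pk.IsNothing()) {
//     // Not a PHC pointer; defer to mozjemalloc.
//   } else if (pk.IsGuardPage()) {
//     // A bounds violation.
//   } else {
//     uintptr_t index = pk.AllocPageIndex();  // a PHC allocation page
//   }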
378 class PtrKind {
379 private:
380 enum class Tag : uint8_t {
381 Nothing,
382 GuardPage,
383 AllocPage,
386 Tag mTag;
387 uintptr_t mIndex; // Only used if mTag == Tag::AllocPage.
389 public:
390 // Detect what a pointer points to. This constructor must be fast because it
391 // is called for every call to free(), realloc(), malloc_usable_size(), and
392 // jemalloc_ptr_info().
393 PtrKind(const void* aPtr, const uint8_t* aPagesStart,
394 const uint8_t* aPagesLimit) {
395 if (!(aPagesStart <= aPtr && aPtr < aPagesLimit)) {
396 mTag = Tag::Nothing;
397 } else {
398 uintptr_t offset = static_cast<const uint8_t*>(aPtr) - aPagesStart;
399 uintptr_t allPageIndex = offset / kPageSize;
400 MOZ_ASSERT(allPageIndex < kNumAllPages);
401 if (allPageIndex & 1) {
402 // Odd-indexed pages are allocation pages.
403 uintptr_t allocPageIndex = allPageIndex / 2;
404 MOZ_ASSERT(allocPageIndex < kNumAllocPages);
405 mTag = Tag::AllocPage;
406 mIndex = allocPageIndex;
407 } else {
408 // Even-indexed pages are guard pages.
409 mTag = Tag::GuardPage;
414 bool IsNothing() const { return mTag == Tag::Nothing; }
415 bool IsGuardPage() const { return mTag == Tag::GuardPage; }
417 // This should only be called after IsNothing() and IsGuardPage() have been
418 // checked and failed.
419 uintptr_t AllocPageIndex() const {
420 MOZ_RELEASE_ASSERT(mTag == Tag::AllocPage);
421 return mIndex;
425 // Shared, atomic, mutable global state.
426 class GAtomic {
427 public:
428 static void Init(Delay aFirstDelay) {
429 sAllocDelay = aFirstDelay;
431 LOG("Initial sAllocDelay <- %zu\n", size_t(aFirstDelay));
434 static Time Now() { return sNow; }
436 static void IncrementNow() { sNow++; }
438 // Decrements the delay and returns the decremented value.
439 static int32_t DecrementDelay() { return --sAllocDelay; }
441 static void SetAllocDelay(Delay aAllocDelay) { sAllocDelay = aAllocDelay; }
443 static bool AllocDelayHasWrapped() {
444 // Delay is unsigned so we can't test for less than zero. Instead test if
445 // it has wrapped around by comparing with the maximum value we ever use.
446 return sAllocDelay > 2 * std::max(kAvgAllocDelay, kAvgFirstAllocDelay);
449 private:
450 // The current time. Relaxed semantics because it's primarily used for
451 // determining if an allocation can be recycled yet and therefore it doesn't
452 // need to be exact.
453 static Atomic<Time, Relaxed> sNow;
455 // Delay until the next attempt at a page allocation. See the comment in
456 // MaybePageAlloc() for an explanation of why it uses ReleaseAcquire
457 // semantics.
458 static Atomic<Delay, ReleaseAcquire> sAllocDelay;
461 Atomic<Time, Relaxed> GAtomic::sNow;
462 Atomic<Delay, ReleaseAcquire> GAtomic::sAllocDelay;
464 // Shared, immutable global state. Initialized by replace_init() and never
465 // changed after that. replace_init() runs early enough that no synchronization
466 // is needed.
467 class GConst {
468 private:
469 // The bounds of the allocated pages.
470 uint8_t* const mPagesStart;
471 uint8_t* const mPagesLimit;
473 // Allocates the allocation pages and the guard pages, contiguously.
474 uint8_t* AllocAllPages() {
475 // The memory allocated here is never freed, because it would happen at
476 // process termination when it would be of little use.
478 // We can rely on jemalloc's behaviour that when it allocates memory aligned
479 // with its own chunk size it will over-allocate and guarantee that the
480 // memory after the end of our allocation, but before the next chunk, is
481 // decommitted and inaccessible. Elsewhere in PHC we assume that we own
482 // that page (so that memory errors in it get caught by PHC) but here we
483 // use kAllPagesJemallocSize which subtracts jemalloc's guard page.
484 void* pages = sMallocTable.memalign(kPhcAlign, kAllPagesJemallocSize);
485 if (!pages) {
486 MOZ_CRASH();
489 // Make the pages inaccessible.
490 #ifdef XP_WIN
491 if (!VirtualFree(pages, kAllPagesJemallocSize, MEM_DECOMMIT)) {
492 MOZ_CRASH("VirtualFree failed");
494 #else
495 if (mmap(pages, kAllPagesJemallocSize, PROT_NONE,
496 MAP_FIXED | MAP_PRIVATE | MAP_ANON, -1, 0) == MAP_FAILED) {
497 MOZ_CRASH("mmap failed");
499 #endif
501 return static_cast<uint8_t*>(pages);
504 public:
505 GConst()
506 : mPagesStart(AllocAllPages()), mPagesLimit(mPagesStart + kAllPagesSize) {
507 LOG("AllocAllPages at %p..%p\n", mPagesStart, mPagesLimit);
510 class PtrKind PtrKind(const void* aPtr) {
511 class PtrKind pk(aPtr, mPagesStart, mPagesLimit);
512 return pk;
515 bool IsInFirstGuardPage(const void* aPtr) {
516 return mPagesStart <= aPtr && aPtr < mPagesStart + kPageSize;
519 // Get the address of the allocation page referred to via an index. Used when
520 // marking the page as accessible/inaccessible.
521 uint8_t* AllocPagePtr(uintptr_t aIndex) {
522 MOZ_ASSERT(aIndex < kNumAllocPages);
523 // Multiply by two and add one to account for allocation pages *and* guard
524 // pages.
525 return mPagesStart + (2 * aIndex + 1) * kPageSize;
529 static GConst* gConst;
531 // This type is used as a proof-of-lock token, to make it clear which functions
532 // require sMutex to be locked.
533 using GMutLock = const MutexAutoLock&;
535 // Shared, mutable global state. Protected by sMutex; all accessing functions
536 // take a GMutLock as proof that sMutex is held.
537 class GMut {
538 enum class AllocPageState {
539 NeverAllocated = 0,
540 InUse = 1,
541 Freed = 2,
544 // Metadata for each allocation page.
545 class AllocPageInfo {
546 public:
547 AllocPageInfo()
548 : mState(AllocPageState::NeverAllocated),
549 mBaseAddr(nullptr),
550 mReuseTime(0) {}
552 // The current allocation page state.
553 AllocPageState mState;
555 // The arena that the allocation is nominally from. This isn't meaningful
556 // within PHC, which has no arenas. But it is necessary for reallocation of
557 // page allocations as normal allocations, such as in this code:
559 // p = moz_arena_malloc(arenaId, 4096);
560 // realloc(p, 8192);
562 // The realloc is more than one page, and thus too large for PHC to handle.
563 // Therefore, if PHC handles the first allocation, it must ask mozjemalloc
564 // to allocate the 8192 bytes in the correct arena, and to do that, it must
565 // call sMallocTable.moz_arena_malloc with the correct arenaId under the
566 // covers. Therefore it must record that arenaId.
568 // This field is also needed for jemalloc_ptr_info() to work, because it
569 // also returns the arena ID (but only in debug builds).
571 // - NeverAllocated: must be 0.
572 // - InUse | Freed: can be any valid arena ID value.
573 Maybe<arena_id_t> mArenaId;
575 // The starting address of the allocation. Will not be the same as the page
576 // address unless the allocation is a full page.
577 // - NeverAllocated: must be 0.
578 // - InUse | Freed: must be within the allocation page.
579 uint8_t* mBaseAddr;
581 // Usable size is computed as the number of bytes between the pointer and
582 // the end of the allocation page. This might be bigger than the requested
583 // size, especially if an outsized alignment is requested.
584 size_t UsableSize() const {
585 return mState == AllocPageState::NeverAllocated
587 : kPageSize - (reinterpret_cast<uintptr_t>(mBaseAddr) &
588 (kPageSize - 1));
591 // The internal fragmentation for this allocation.
592 size_t FragmentationBytes() const {
593 MOZ_ASSERT(kPageSize >= UsableSize());
594 return mState == AllocPageState::InUse ? kPageSize - UsableSize() : 0;
597 // The allocation stack.
598 // - NeverAllocated: Nothing.
599 // - InUse | Freed: Some.
600 Maybe<StackTrace> mAllocStack;
602 // The free stack.
603 // - NeverAllocated | InUse: Nothing.
604 // - Freed: Some.
605 Maybe<StackTrace> mFreeStack;
607 // The time at which the page is available for reuse, as measured against
608 // GAtomic::sNow. When the page is in use this value will be kMaxTime.
609 // - NeverAllocated: must be 0.
610 // - InUse: must be kMaxTime.
611 // - Freed: must be > 0 and < kMaxTime.
612 Time mReuseTime;
615 public:
616 // The mutex that protects the other members.
617 static Mutex sMutex MOZ_UNANNOTATED;
619 GMut() : mRNG(RandomSeed<0>(), RandomSeed<1>()) { sMutex.Init(); }
621 uint64_t Random64(GMutLock) { return mRNG.next(); }
623 bool IsPageInUse(GMutLock, uintptr_t aIndex) {
624 return mAllocPages[aIndex].mState == AllocPageState::InUse;
627 // Is the page free? And if so, has enough time passed that we can use it?
628 bool IsPageAllocatable(GMutLock, uintptr_t aIndex, Time aNow) {
629 const AllocPageInfo& page = mAllocPages[aIndex];
630 return page.mState != AllocPageState::InUse && aNow >= page.mReuseTime;
633 // Get the address of the allocation page referred to via an index. Used
634 // when checking pointers against page boundaries.
635 uint8_t* AllocPageBaseAddr(GMutLock, uintptr_t aIndex) {
636 return mAllocPages[aIndex].mBaseAddr;
639 Maybe<arena_id_t> PageArena(GMutLock aLock, uintptr_t aIndex) {
640 const AllocPageInfo& page = mAllocPages[aIndex];
641 AssertAllocPageInUse(aLock, page);
643 return page.mArenaId;
646 size_t PageUsableSize(GMutLock aLock, uintptr_t aIndex) {
647 const AllocPageInfo& page = mAllocPages[aIndex];
648 AssertAllocPageInUse(aLock, page);
650 return page.UsableSize();
653 // The total fragmentation in PHC
654 size_t FragmentationBytes() const {
655 size_t sum = 0;
656 for (const auto& page : mAllocPages) {
657 sum += page.FragmentationBytes();
659 return sum;
662 void SetPageInUse(GMutLock aLock, uintptr_t aIndex,
663 const Maybe<arena_id_t>& aArenaId, uint8_t* aBaseAddr,
664 const StackTrace& aAllocStack) {
665 AllocPageInfo& page = mAllocPages[aIndex];
666 AssertAllocPageNotInUse(aLock, page);
668 page.mState = AllocPageState::InUse;
669 page.mArenaId = aArenaId;
670 page.mBaseAddr = aBaseAddr;
671 page.mAllocStack = Some(aAllocStack);
672 page.mFreeStack = Nothing();
673 page.mReuseTime = kMaxTime;
676 #if PHC_LOGGING
677 Time GetFreeTime(uintptr_t aIndex) const { return mFreeTime[aIndex]; }
678 #endif
680 void ResizePageInUse(GMutLock aLock, uintptr_t aIndex,
681 const Maybe<arena_id_t>& aArenaId, uint8_t* aNewBaseAddr,
682 const StackTrace& aAllocStack) {
683 AllocPageInfo& page = mAllocPages[aIndex];
684 AssertAllocPageInUse(aLock, page);
686 // page.mState is not changed.
687 if (aArenaId.isSome()) {
688 // Crash if the arenas don't match.
689 MOZ_RELEASE_ASSERT(page.mArenaId == aArenaId);
691 page.mBaseAddr = aNewBaseAddr;
692 // We could just keep the original alloc stack, but the realloc stack is
693 // more recent and therefore seems more useful.
694 page.mAllocStack = Some(aAllocStack);
695 // page.mFreeStack is not changed.
696 // page.mReuseTime is not changed.
699 void SetPageFreed(GMutLock aLock, uintptr_t aIndex,
700 const Maybe<arena_id_t>& aArenaId,
701 const StackTrace& aFreeStack, Delay aReuseDelay) {
702 AllocPageInfo& page = mAllocPages[aIndex];
703 AssertAllocPageInUse(aLock, page);
705 page.mState = AllocPageState::Freed;
707 // page.mArenaId is left unchanged, for jemalloc_ptr_info() calls that
708 // occur after freeing (e.g. in the PtrInfo test in TestJemalloc.cpp).
709 if (aArenaId.isSome()) {
710 // Crash if the arenas don't match.
711 MOZ_RELEASE_ASSERT(page.mArenaId == aArenaId);
714 // page.mBaseAddr (and hence UsableSize()) is left unchanged, for reporting
715 // on UAF, and for jemalloc_ptr_info() calls that occur after freeing (e.g. in
716 // the PtrInfo test in TestJemalloc.cpp).
718 // page.mAllocStack is left unchanged, for reporting on UAF.
720 page.mFreeStack = Some(aFreeStack);
721 Time now = GAtomic::Now();
722 #if PHC_LOGGING
723 mFreeTime[aIndex] = now;
724 #endif
725 page.mReuseTime = now + aReuseDelay;
728 static void CrashOnGuardPage(void* aPtr) {
729 // An operation on a guard page? This is a bounds violation. Deliberately
730 // touch the page in question to cause a crash that triggers the usual PHC
731 // machinery.
732 LOG("CrashOnGuardPage(%p), bounds violation\n", aPtr);
733 *static_cast<uint8_t*>(aPtr) = 0;
734 MOZ_CRASH("unreachable");
737 void EnsureValidAndInUse(GMutLock, void* aPtr, uintptr_t aIndex)
738 MOZ_REQUIRES(sMutex) {
739 const AllocPageInfo& page = mAllocPages[aIndex];
741 // The pointer must point to the start of the allocation.
742 MOZ_RELEASE_ASSERT(page.mBaseAddr == aPtr);
744 if (page.mState == AllocPageState::Freed) {
745 LOG("EnsureValidAndInUse(%p), use-after-free\n", aPtr);
746 // An operation on a freed page? This is a particular kind of
747 // use-after-free. Deliberately touch the page in question, in order to
748 // cause a crash that triggers the usual PHC machinery. But unlock sMutex
749 // first, because that self-same PHC machinery needs to re-lock it, and
750 // the crash causes non-local control flow so sMutex won't be unlocked
751 // the normal way in the caller.
752 sMutex.Unlock();
753 *static_cast<uint8_t*>(aPtr) = 0;
754 MOZ_CRASH("unreachable");
758 // This expects GMut::sMutex to be locked but can't check it with a parameter
759 // since we try-lock it.
760 void FillAddrInfo(uintptr_t aIndex, const void* aBaseAddr, bool isGuardPage,
761 phc::AddrInfo& aOut) {
762 const AllocPageInfo& page = mAllocPages[aIndex];
763 if (isGuardPage) {
764 aOut.mKind = phc::AddrInfo::Kind::GuardPage;
765 } else {
766 switch (page.mState) {
767 case AllocPageState::NeverAllocated:
768 aOut.mKind = phc::AddrInfo::Kind::NeverAllocatedPage;
769 break;
771 case AllocPageState::InUse:
772 aOut.mKind = phc::AddrInfo::Kind::InUsePage;
773 break;
775 case AllocPageState::Freed:
776 aOut.mKind = phc::AddrInfo::Kind::FreedPage;
777 break;
779 default:
780 MOZ_CRASH();
783 aOut.mBaseAddr = page.mBaseAddr;
784 aOut.mUsableSize = page.UsableSize();
785 aOut.mAllocStack = page.mAllocStack;
786 aOut.mFreeStack = page.mFreeStack;
789 void FillJemallocPtrInfo(GMutLock, const void* aPtr, uintptr_t aIndex,
790 jemalloc_ptr_info_t* aInfo) {
791 const AllocPageInfo& page = mAllocPages[aIndex];
792 switch (page.mState) {
793 case AllocPageState::NeverAllocated:
794 break;
796 case AllocPageState::InUse: {
797 // Only return TagLiveAlloc if the pointer is within the bounds of the
798 // allocation's usable size.
799 uint8_t* base = page.mBaseAddr;
800 uint8_t* limit = base + page.UsableSize();
801 if (base <= aPtr && aPtr < limit) {
802 *aInfo = {TagLiveAlloc, page.mBaseAddr, page.UsableSize(),
803 page.mArenaId.valueOr(0)};
804 return;
806 break;
809 case AllocPageState::Freed: {
810 // Only return TagFreedAlloc if the pointer is within the bounds of the
811 // former allocation's usable size.
812 uint8_t* base = page.mBaseAddr;
813 uint8_t* limit = base + page.UsableSize();
814 if (base <= aPtr && aPtr < limit) {
815 *aInfo = {TagFreedAlloc, page.mBaseAddr, page.UsableSize(),
816 page.mArenaId.valueOr(0)};
817 return;
819 break;
822 default:
823 MOZ_CRASH();
826 // Pointers into guard pages will end up here, as will pointers into
827 // allocation pages that aren't within the allocation's bounds.
828 *aInfo = {TagUnknown, nullptr, 0, 0};
831 #ifndef XP_WIN
832 static void prefork() MOZ_NO_THREAD_SAFETY_ANALYSIS { sMutex.Lock(); }
833 static void postfork_parent() MOZ_NO_THREAD_SAFETY_ANALYSIS {
834 sMutex.Unlock();
836 static void postfork_child() { sMutex.Init(); }
837 #endif
839 #if PHC_LOGGING
840 void IncPageAllocHits(GMutLock) { mPageAllocHits++; }
841 void IncPageAllocMisses(GMutLock) { mPageAllocMisses++; }
842 #else
843 void IncPageAllocHits(GMutLock) {}
844 void IncPageAllocMisses(GMutLock) {}
845 #endif
847 #if PHC_LOGGING
848 struct PageStats {
849 size_t mNumAlloced = 0;
850 size_t mNumFreed = 0;
853 PageStats GetPageStats(GMutLock) {
854 PageStats stats;
856 for (const auto& page : mAllocPages) {
857 stats.mNumAlloced += page.mState == AllocPageState::InUse ? 1 : 0;
858 stats.mNumFreed += page.mState == AllocPageState::Freed ? 1 : 0;
861 return stats;
864 size_t PageAllocHits(GMutLock) { return mPageAllocHits; }
865 size_t PageAllocAttempts(GMutLock) {
866 return mPageAllocHits + mPageAllocMisses;
869 // This is an integer because FdPrintf only supports integer printing.
870 size_t PageAllocHitRate(GMutLock) {
871 return mPageAllocHits * 100 / (mPageAllocHits + mPageAllocMisses);
873 #endif
875 // Should we make new PHC allocations?
876 bool ShouldMakeNewAllocations() const {
877 return mPhcState == mozilla::phc::Enabled;
880 using PHCState = mozilla::phc::PHCState;
881 void SetState(PHCState aState) { mPhcState = aState; }
883 private:
884 template <int N>
885 uint64_t RandomSeed() {
886 // An older version of this code used RandomUint64() here, but on Mac that
887 // function uses arc4random(), which can allocate, which would cause
888 // re-entry, which would be bad. So we just use time() and a local variable
889 // address. These are mediocre sources of entropy, but good enough for PHC.
890 static_assert(N == 0 || N == 1, "must be 0 or 1");
891 uint64_t seed;
892 if (N == 0) {
893 time_t t = time(nullptr);
894 seed = t ^ (t << 32);
895 } else {
896 seed = uintptr_t(&seed) ^ (uintptr_t(&seed) << 32);
898 return seed;
901 void AssertAllocPageInUse(GMutLock, const AllocPageInfo& aPage) {
902 MOZ_ASSERT(aPage.mState == AllocPageState::InUse);
903 // There is nothing to assert about aPage.mArenaId.
904 MOZ_ASSERT(aPage.mBaseAddr);
905 MOZ_ASSERT(aPage.UsableSize() > 0);
906 MOZ_ASSERT(aPage.mAllocStack.isSome());
907 MOZ_ASSERT(aPage.mFreeStack.isNothing());
908 MOZ_ASSERT(aPage.mReuseTime == kMaxTime);
911 void AssertAllocPageNotInUse(GMutLock, const AllocPageInfo& aPage) {
912 // We can assert a lot about `NeverAllocated` pages, but not much about
913 // `Freed` pages.
914 #ifdef DEBUG
915 bool isFresh = aPage.mState == AllocPageState::NeverAllocated;
916 MOZ_ASSERT(isFresh || aPage.mState == AllocPageState::Freed);
917 MOZ_ASSERT_IF(isFresh, aPage.mArenaId == Nothing());
918 MOZ_ASSERT(isFresh == (aPage.mBaseAddr == nullptr));
919 MOZ_ASSERT(isFresh == (aPage.mAllocStack.isNothing()));
920 MOZ_ASSERT(isFresh == (aPage.mFreeStack.isNothing()));
921 MOZ_ASSERT(aPage.mReuseTime != kMaxTime);
922 #endif
925 // RNG for deciding which allocations to treat specially. It doesn't need to
926 // be high quality.
928 // This is a raw pointer for the reason explained in the comment above
929 // GMut's constructor. Don't change it to UniquePtr or anything like that.
930 non_crypto::XorShift128PlusRNG mRNG;
932 AllocPageInfo mAllocPages[kNumAllocPages];
933 #if PHC_LOGGING
934 Time mFreeTime[kNumAllocPages];
936 // How many allocations that could have been page allocs actually were? As
937 // constrained by kNumAllocPages. If the hit ratio isn't close to 100% it's
938 // likely that the global constants are poorly chosen.
939 size_t mPageAllocHits = 0;
940 size_t mPageAllocMisses = 0;
941 #endif
943 // This will only ever be updated from one thread. The other threads should
944 // eventually get the update.
945 Atomic<PHCState, Relaxed> mPhcState =
946 Atomic<PHCState, Relaxed>(DEFAULT_STATE);
949 Mutex GMut::sMutex;
951 static GMut* gMut;
953 // When PHC wants to crash we first have to unlock so that the crash reporter
954 // can call into PHC to look up its pointer. That also means that before calling
955 // PHCCrash please ensure that the state is consistent. Because this can report an
956 // arbitrary string, use of it must be reviewed by Firefox data stewards.
957 static void PHCCrash(GMutLock, const char* aMessage)
958 MOZ_REQUIRES(GMut::sMutex) {
959 GMut::sMutex.Unlock();
960 MOZ_CRASH_UNSAFE(aMessage);
963 // On MacOS, the first __thread/thread_local access calls malloc, which leads
964 // to an infinite loop. So we use pthread-based TLS instead, which somehow
965 // doesn't have this problem.
966 #if !defined(XP_DARWIN)
967 # define PHC_THREAD_LOCAL(T) MOZ_THREAD_LOCAL(T)
968 #else
969 # define PHC_THREAD_LOCAL(T) \
970 detail::ThreadLocal<T, detail::ThreadLocalKeyStorage>
971 #endif
973 // Thread-local state.
974 class GTls {
975 public:
976 GTls(const GTls&) = delete;
978 const GTls& operator=(const GTls&) = delete;
980 // When true, PHC does as little as possible.
982 // (a) It does not allocate any new page allocations.
984 // (b) It avoids doing any operations that might call malloc/free/etc., which
985 // would cause re-entry into PHC. (In practice, MozStackWalk() is the
986 // only such operation.) Note that calls to the functions in sMallocTable
987 // are ok.
989 // For example, replace_malloc() will just fall back to mozjemalloc. However,
990 // operations involving existing allocations are more complex, because those
991 // existing allocations may be page allocations. For example, if
992 // replace_free() is passed a page allocation on a PHC-disabled thread, it
993 // will free the page allocation in the usual way, but it will get a dummy
994 // freeStack in order to avoid calling MozStackWalk(), as per (b) above.
996 // This single disabling mechanism has two distinct uses.
998 // - It's used to prevent re-entry into PHC, which can cause correctness
999 // problems. For example, consider this sequence.
1001 // 1. enter replace_free()
1002 // 2. which calls PageFree()
1003 // 3. which calls MozStackWalk()
1004 // 4. which locks a mutex M, and then calls malloc
1005 // 5. enter replace_malloc()
1006 // 6. which calls MaybePageAlloc()
1007 // 7. which calls MozStackWalk()
1008 // 8. which (re)locks a mutex M --> deadlock
1010 // We avoid this sequence by "disabling" the thread in PageFree() (at step
1011 // 2), which causes MaybePageAlloc() to fail, avoiding the call to
1012 // MozStackWalk() (at step 7).
1014 // In practice, realloc or free of a PHC allocation is unlikely on a thread
1015 // that is disabled because of this use: MozStackWalk() will probably only
1016 // realloc/free allocations that it allocated itself, but those won't be
1017 // page allocations because PHC is disabled before calling MozStackWalk().
1019 // (Note that MaybePageAlloc() could safely do a page allocation so long as
1020 // it avoided calling MozStackWalk() by getting a dummy allocStack. But it
1021 // wouldn't be useful, and it would prevent the second use below.)
1023 // - It's used to prevent PHC allocations in some tests that rely on
1024 // mozjemalloc's exact allocation behaviour, which PHC does not replicate
1025 // exactly. (Note that (b) isn't necessary for this use -- MozStackWalk()
1026 // could be safely called -- but it is necessary for the first use above.)
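//
// The disabling is done via the AutoDisableOnCurrentThread RAII class defined
// below. For example, MaybePageAlloc() does roughly:
//
//   AutoDisableOnCurrentThread disable;  // re-entry now falls back to mozjemalloc
//   StackTrace allocStack;
//   allocStack.Fill();                   // safe: cannot recurse into PHC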
1029 static void Init() {
1030 if (!tlsIsDisabled.init()) {
1031 MOZ_CRASH();
1035 static void DisableOnCurrentThread() {
1036 MOZ_ASSERT(!GTls::tlsIsDisabled.get());
1037 tlsIsDisabled.set(true);
1040 static void EnableOnCurrentThread() {
1041 MOZ_ASSERT(GTls::tlsIsDisabled.get());
1042 uint64_t rand;
1043 if (GAtomic::AllocDelayHasWrapped()) {
1045 MutexAutoLock lock(GMut::sMutex);
1046 rand = gMut->Random64(lock);
1048 GAtomic::SetAllocDelay(Rnd64ToDelay<kAvgAllocDelay>(rand));
1050 tlsIsDisabled.set(false);
1053 static bool IsDisabledOnCurrentThread() { return tlsIsDisabled.get(); }
1055 private:
1056 static PHC_THREAD_LOCAL(bool) tlsIsDisabled;
1059 PHC_THREAD_LOCAL(bool) GTls::tlsIsDisabled;
1061 class AutoDisableOnCurrentThread {
1062 public:
1063 AutoDisableOnCurrentThread(const AutoDisableOnCurrentThread&) = delete;
1065 const AutoDisableOnCurrentThread& operator=(
1066 const AutoDisableOnCurrentThread&) = delete;
1068 explicit AutoDisableOnCurrentThread() { GTls::DisableOnCurrentThread(); }
1069 ~AutoDisableOnCurrentThread() { GTls::EnableOnCurrentThread(); }
1072 //---------------------------------------------------------------------------
1073 // Page allocation operations
1074 //---------------------------------------------------------------------------
1076 // Attempt a page allocation if the time and the size are right. Allocated
1077 // memory is zeroed if aZero is true. On failure, the caller should attempt a
1078 // normal allocation via sMallocTable. Can be called in a context where
1079 // GMut::sMutex is locked.
1080 static void* MaybePageAlloc(const Maybe<arena_id_t>& aArenaId, size_t aReqSize,
1081 size_t aAlignment, bool aZero) {
1082 MOZ_ASSERT(IsPowerOfTwo(aAlignment));
1084 if (aReqSize > kPageSize) {
1085 return nullptr;
1088 MOZ_ASSERT(gMut);
1089 if (!gMut->ShouldMakeNewAllocations()) {
1090 return nullptr;
1093 GAtomic::IncrementNow();
1095 // Decrement the delay. If it's zero, we do a page allocation and reset the
1096 // delay to a random number. Because the assignment to the random number isn't
1097 // atomic w.r.t. the decrement, we might have a sequence like this:
1099 // Thread 1 Thread 2 Thread 3
1100 // -------- -------- --------
1101 // (a) newDelay = --sAllocDelay (-> 0)
1102 // (b) --sAllocDelay (-> -1)
1103 // (c) (newDelay != 0) fails
1104 // (d) --sAllocDelay (-> -2)
1105 // (e) sAllocDelay = new_random_number()
1107 // It's critical that sAllocDelay has ReleaseAcquire semantics, because that
1108 // guarantees that exactly one thread will see sAllocDelay have the value 0.
1109 // (Relaxed semantics wouldn't guarantee that.)
1111 // Note that sAllocDelay is unsigned and we expect that it will wrap after
1112 // being decremented "below" zero. It must be unsigned so that IsPowerOfTwo()
1113 // can work on some Delay values.
1115 // Finally, note that the decrements that occur between (a) and (e) above are
1116 // effectively ignored, because (e) clobbers them. This shouldn't be a
1117 // problem; it effectively just adds a little more randomness to
1118 // new_random_number(). An early version of this code tried to account for
1119 // these decrements by doing `sAllocDelay += new_random_number()`. However, if
1120 // new_random_number() is small, the number of decrements between (a) and (e)
1121 // can easily exceed it, whereupon sAllocDelay ends up negative after
1122 // `sAllocDelay += new_random_number()`, and the zero-check never succeeds
1123 // again. (At least, not until sAllocDelay wraps around on overflow, which
1124 // would take a very long time indeed.)
1126 int32_t newDelay = GAtomic::DecrementDelay();
1127 if (newDelay != 0) {
1128 return nullptr;
1131 if (GTls::IsDisabledOnCurrentThread()) {
1132 return nullptr;
1135 // Disable on this thread *before* getting the stack trace.
1136 AutoDisableOnCurrentThread disable;
1138 // Get the stack trace *before* locking the mutex. If we return nullptr then
1139 // it was a waste, but it's not so frequent, and doing a stack walk while
1140 // the mutex is locked is problematic (see the big comment on
1141 // StackTrace::Fill() for details).
1142 StackTrace allocStack;
1143 allocStack.Fill();
1145 MutexAutoLock lock(GMut::sMutex);
1147 Time now = GAtomic::Now();
1148 Delay newAllocDelay = Rnd64ToDelay<kAvgAllocDelay>(gMut->Random64(lock));
1150 // We start at a random page alloc and wrap around, to ensure pages get even
1151 // amounts of use.
1152 uint8_t* ptr = nullptr;
1153 uint8_t* pagePtr = nullptr;
1154 for (uintptr_t n = 0, i = size_t(gMut->Random64(lock)) % kNumAllocPages;
1155 n < kNumAllocPages; n++, i = (i + 1) % kNumAllocPages) {
1156 if (!gMut->IsPageAllocatable(lock, i, now)) {
1157 continue;
1160 #if PHC_LOGGING
1161 Time lifetime = 0;
1162 #endif
1163 pagePtr = gConst->AllocPagePtr(i);
1164 MOZ_ASSERT(pagePtr);
1165 bool ok =
1166 #ifdef XP_WIN
1167 !!VirtualAlloc(pagePtr, kPageSize, MEM_COMMIT, PAGE_READWRITE);
1168 #else
1169 mprotect(pagePtr, kPageSize, PROT_READ | PROT_WRITE) == 0;
1170 #endif
1172 if (!ok) {
1173 pagePtr = nullptr;
1174 continue;
1177 size_t usableSize = sMallocTable.malloc_good_size(aReqSize);
1178 MOZ_ASSERT(usableSize > 0);
1180 // Put the allocation as close to the end of the page as possible,
1181 // allowing for alignment requirements.
1182 ptr = pagePtr + kPageSize - usableSize;
1183 if (aAlignment != 1) {
1184 ptr = reinterpret_cast<uint8_t*>(
1185 (reinterpret_cast<uintptr_t>(ptr) & ~(aAlignment - 1)));
1188 #if PHC_LOGGING
1189 Time then = gMut->GetFreeTime(i);
1190 lifetime = then != 0 ? now - then : 0;
1191 #endif
1193 gMut->SetPageInUse(lock, i, aArenaId, ptr, allocStack);
1195 if (aZero) {
1196 memset(ptr, 0, usableSize);
1197 } else {
1198 #ifdef DEBUG
1199 memset(ptr, kAllocJunk, usableSize);
1200 #endif
1203 gMut->IncPageAllocHits(lock);
1204 #if PHC_LOGGING
1205 GMut::PageStats stats = gMut->GetPageStats(lock);
1206 #endif
1207 LOG("PageAlloc(%zu, %zu) -> %p[%zu]/%p (%zu) (z%zu), sAllocDelay <- %zu, "
1208 "fullness %zu/%zu/%zu, hits %zu/%zu (%zu%%), lifetime %zu\n",
1209 aReqSize, aAlignment, pagePtr, i, ptr, usableSize, size_t(aZero),
1210 size_t(newAllocDelay), stats.mNumAlloced, stats.mNumFreed,
1211 kNumAllocPages, gMut->PageAllocHits(lock),
1212 gMut->PageAllocAttempts(lock), gMut->PageAllocHitRate(lock), lifetime);
1213 break;
1216 if (!pagePtr) {
1217 // No pages are available, or VirtualAlloc/mprotect failed.
1218 gMut->IncPageAllocMisses(lock);
1219 #if PHC_LOGGING
1220 GMut::PageStats stats = gMut->GetPageStats(lock);
1221 #endif
1222 LOG("No PageAlloc(%zu, %zu), sAllocDelay <- %zu, fullness %zu/%zu/%zu, "
1223 "hits %zu/%zu (%zu%%)\n",
1224 aReqSize, aAlignment, size_t(newAllocDelay), stats.mNumAlloced,
1225 stats.mNumFreed, kNumAllocPages, gMut->PageAllocHits(lock),
1226 gMut->PageAllocAttempts(lock), gMut->PageAllocHitRate(lock));
1229 // Set the new alloc delay.
1230 GAtomic::SetAllocDelay(newAllocDelay);
1232 return ptr;
1235 static void FreePage(GMutLock aLock, uintptr_t aIndex,
1236 const Maybe<arena_id_t>& aArenaId,
1237 const StackTrace& aFreeStack, Delay aReuseDelay)
1238 MOZ_REQUIRES(GMut::sMutex) {
1239 void* pagePtr = gConst->AllocPagePtr(aIndex);
1241 #ifdef XP_WIN
1242 if (!VirtualFree(pagePtr, kPageSize, MEM_DECOMMIT)) {
1243 PHCCrash(aLock, "VirtualFree failed");
1245 #else
1246 if (mmap(pagePtr, kPageSize, PROT_NONE, MAP_FIXED | MAP_PRIVATE | MAP_ANON,
1247 -1, 0) == MAP_FAILED) {
1248 PHCCrash(aLock, "mmap failed");
1250 #endif
1252 gMut->SetPageFreed(aLock, aIndex, aArenaId, aFreeStack, aReuseDelay);
1255 //---------------------------------------------------------------------------
1256 // replace-malloc machinery
1257 //---------------------------------------------------------------------------
1259 // This handles malloc, moz_arena_malloc, and realloc-with-a-nullptr.
1260 MOZ_ALWAYS_INLINE static void* PageMalloc(const Maybe<arena_id_t>& aArenaId,
1261 size_t aReqSize) {
1262 void* ptr = MaybePageAlloc(aArenaId, aReqSize, /* aAlignment */ 1,
1263 /* aZero */ false);
1264 return ptr ? ptr
1265 : (aArenaId.isSome()
1266 ? sMallocTable.moz_arena_malloc(*aArenaId, aReqSize)
1267 : sMallocTable.malloc(aReqSize));
1270 static void* replace_malloc(size_t aReqSize) {
1271 return PageMalloc(Nothing(), aReqSize);
1274 static Delay ReuseDelay(GMutLock aLock) {
1275 return (kAvgPageReuseDelay / 2) +
1276 Rnd64ToDelay<kAvgPageReuseDelay / 2>(gMut->Random64(aLock));
1279 // This handles both calloc and moz_arena_calloc.
1280 MOZ_ALWAYS_INLINE static void* PageCalloc(const Maybe<arena_id_t>& aArenaId,
1281 size_t aNum, size_t aReqSize) {
1282 CheckedInt<size_t> checkedSize = CheckedInt<size_t>(aNum) * aReqSize;
1283 if (!checkedSize.isValid()) {
1284 return nullptr;
1287 void* ptr = MaybePageAlloc(aArenaId, checkedSize.value(), /* aAlignment */ 1,
1288 /* aZero */ true);
1289 return ptr ? ptr
1290 : (aArenaId.isSome()
1291 ? sMallocTable.moz_arena_calloc(*aArenaId, aNum, aReqSize)
1292 : sMallocTable.calloc(aNum, aReqSize));
1295 static void* replace_calloc(size_t aNum, size_t aReqSize) {
1296 return PageCalloc(Nothing(), aNum, aReqSize);
1299 // This function handles both realloc and moz_arena_realloc.
1301 // As always, realloc is complicated, and doubly so when there are two
1302 // different kinds of allocations in play. Here are the possible transitions,
1303 // and what we do in practice.
1305 // - normal-to-normal: This is straightforward and obviously necessary.
1307 // - normal-to-page: This is disallowed because it would require getting the
1308 // arenaId of the normal allocation, which isn't possible in non-DEBUG builds
1309 // for security reasons.
1311 // - page-to-page: This is done whenever possible, i.e. whenever the new size
1312 //   is at most one page (kPageSize). This choice counterbalances the
1313 // disallowing of normal-to-page allocations, in order to avoid biasing
1314 // towards or away from page allocations. It always occurs in-place.
1316 // - page-to-normal: This is done only when necessary, i.e. only when the new
1317 //   size is larger than one page (kPageSize). This naturally flows from the
1318 // prior choice on page-to-page transitions.
1320 // In summary: realloc doesn't change the allocation kind unless it must.
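//
// For example (illustrative sizes, assuming 4 KiB pages and that the first
// malloc happened to become a PHC page allocation):
//
//   void* p = malloc(100);   // PHC page allocation
//   p = realloc(p, 3000);    // page-to-page: stays in the same PHC page
//   p = realloc(p, 10000);   // page-to-normal: handed back to mozjemalloc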
1322 MOZ_ALWAYS_INLINE static void* PageRealloc(const Maybe<arena_id_t>& aArenaId,
1323 void* aOldPtr, size_t aNewSize) {
1324 if (!aOldPtr) {
1325 // Null pointer. Treat like malloc(aNewSize).
1326 return PageMalloc(aArenaId, aNewSize);
1329 PtrKind pk = gConst->PtrKind(aOldPtr);
1330 if (pk.IsNothing()) {
1331 // A normal-to-normal transition.
1332 return aArenaId.isSome()
1333 ? sMallocTable.moz_arena_realloc(*aArenaId, aOldPtr, aNewSize)
1334 : sMallocTable.realloc(aOldPtr, aNewSize);
1337 if (pk.IsGuardPage()) {
1338 GMut::CrashOnGuardPage(aOldPtr);
1341 // At this point we know we have an allocation page.
1342 uintptr_t index = pk.AllocPageIndex();
1344 // A page-to-something transition.
1346 // Note that `disable` has no effect unless it is emplaced below.
1347 Maybe<AutoDisableOnCurrentThread> disable;
1348 // Get the stack trace *before* locking the mutex.
1349 StackTrace stack;
1350 if (GTls::IsDisabledOnCurrentThread()) {
1351 // PHC is disabled on this thread. Leave the stack empty.
1352 } else {
1353 // Disable on this thread *before* getting the stack trace.
1354 disable.emplace();
1355 stack.Fill();
1358 MutexAutoLock lock(GMut::sMutex);
1360 // Check for realloc() of a freed block.
1361 gMut->EnsureValidAndInUse(lock, aOldPtr, index);
1363 if (aNewSize <= kPageSize && gMut->ShouldMakeNewAllocations()) {
1364 // A page-to-page transition. Just keep using the page allocation. We do
1365 // this even if the thread is disabled, because it doesn't create a new
1366 // page allocation. Note that ResizePageInUse() checks aArenaId.
1368 // Move the bytes with memmove(), because the old allocation and the new
1369 // allocation overlap. Move the usable size rather than the requested size,
1370 // because the user might have used malloc_usable_size() and filled up the
1371 // usable size.
1372 size_t oldUsableSize = gMut->PageUsableSize(lock, index);
1373 size_t newUsableSize = sMallocTable.malloc_good_size(aNewSize);
1374 uint8_t* pagePtr = gConst->AllocPagePtr(index);
1375 uint8_t* newPtr = pagePtr + kPageSize - newUsableSize;
1376 memmove(newPtr, aOldPtr, std::min(oldUsableSize, aNewSize));
1377 gMut->ResizePageInUse(lock, index, aArenaId, newPtr, stack);
1378 LOG("PageRealloc-Reuse(%p, %zu) -> %p\n", aOldPtr, aNewSize, newPtr);
1379 return newPtr;
1382 // A page-to-normal transition (with the new size greater than page-sized).
1383 // (Note that aArenaId is checked below.)
1384 void* newPtr;
1385 if (aArenaId.isSome()) {
1386 newPtr = sMallocTable.moz_arena_malloc(*aArenaId, aNewSize);
1387 } else {
1388 Maybe<arena_id_t> oldArenaId = gMut->PageArena(lock, index);
1389 newPtr = (oldArenaId.isSome()
1390 ? sMallocTable.moz_arena_malloc(*oldArenaId, aNewSize)
1391 : sMallocTable.malloc(aNewSize));
1393 if (!newPtr) {
1394 return nullptr;
1397 Delay reuseDelay = ReuseDelay(lock);
1399 // Copy the usable size rather than the requested size, because the user
1400 // might have used malloc_usable_size() and filled up the usable size. Note
1401 // that FreePage() checks aArenaId (via SetPageFreed()).
1402 size_t oldUsableSize = gMut->PageUsableSize(lock, index);
1403 memcpy(newPtr, aOldPtr, std::min(oldUsableSize, aNewSize));
1404 FreePage(lock, index, aArenaId, stack, reuseDelay);
1405 LOG("PageRealloc-Free(%p[%zu], %zu) -> %p, %zu delay, reuse at ~%zu\n",
1406 aOldPtr, index, aNewSize, newPtr, size_t(reuseDelay),
1407 size_t(GAtomic::Now()) + reuseDelay);
1409 return newPtr;
1412 static void* replace_realloc(void* aOldPtr, size_t aNewSize) {
1413 return PageRealloc(Nothing(), aOldPtr, aNewSize);
1416 // This handles both free and moz_arena_free.
1417 MOZ_ALWAYS_INLINE static void PageFree(const Maybe<arena_id_t>& aArenaId,
1418 void* aPtr) {
1419 PtrKind pk = gConst->PtrKind(aPtr);
1420 if (pk.IsNothing()) {
1421 // Not a page allocation.
1422 return aArenaId.isSome() ? sMallocTable.moz_arena_free(*aArenaId, aPtr)
1423 : sMallocTable.free(aPtr);
1426 if (pk.IsGuardPage()) {
1427 GMut::CrashOnGuardPage(aPtr);
1430 // At this point we know we have an allocation page.
1431 uintptr_t index = pk.AllocPageIndex();
1433 // Note that `disable` has no effect unless it is emplaced below.
1434 Maybe<AutoDisableOnCurrentThread> disable;
1435 // Get the stack trace *before* locking the mutex.
1436 StackTrace freeStack;
1437 if (GTls::IsDisabledOnCurrentThread()) {
1438 // PHC is disabled on this thread. Leave the stack empty.
1439 } else {
1440 // Disable on this thread *before* getting the stack trace.
1441 disable.emplace();
1442 freeStack.Fill();
1445 MutexAutoLock lock(GMut::sMutex);
1447 // Check for a double-free.
1448 gMut->EnsureValidAndInUse(lock, aPtr, index);
1450 // Note that FreePage() checks aArenaId (via SetPageFreed()).
1451 Delay reuseDelay = ReuseDelay(lock);
1452 FreePage(lock, index, aArenaId, freeStack, reuseDelay);
1454 #if PHC_LOGGING
1455 GMut::PageStats stats = gMut->GetPageStats(lock);
1456 #endif
1457 LOG("PageFree(%p[%zu]), %zu delay, reuse at ~%zu, fullness %zu/%zu/%zu\n",
1458 aPtr, index, size_t(reuseDelay), size_t(GAtomic::Now()) + reuseDelay,
1459 stats.mNumAlloced, stats.mNumFreed, kNumAllocPages);
1462 static void replace_free(void* aPtr) { return PageFree(Nothing(), aPtr); }
1464 // This handles memalign and moz_arena_memalign.
1465 MOZ_ALWAYS_INLINE static void* PageMemalign(const Maybe<arena_id_t>& aArenaId,
1466 size_t aAlignment,
1467 size_t aReqSize) {
1468 MOZ_RELEASE_ASSERT(IsPowerOfTwo(aAlignment));
1470 // PHC can't satisfy an alignment greater than a page size, so fall back to
1471 // mozjemalloc in that case.
1472 void* ptr = nullptr;
1473 if (aAlignment <= kPageSize) {
1474 ptr = MaybePageAlloc(aArenaId, aReqSize, aAlignment, /* aZero */ false);
1476 return ptr ? ptr
1477 : (aArenaId.isSome()
1478 ? sMallocTable.moz_arena_memalign(*aArenaId, aAlignment,
1479 aReqSize)
1480 : sMallocTable.memalign(aAlignment, aReqSize));
1483 static void* replace_memalign(size_t aAlignment, size_t aReqSize) {
1484 return PageMemalign(Nothing(), aAlignment, aReqSize);
1487 static size_t replace_malloc_usable_size(usable_ptr_t aPtr) {
1488 PtrKind pk = gConst->PtrKind(aPtr);
1489 if (pk.IsNothing()) {
1490 // Not a page allocation. Measure it normally.
1491 return sMallocTable.malloc_usable_size(aPtr);
1494 if (pk.IsGuardPage()) {
1495 GMut::CrashOnGuardPage(const_cast<void*>(aPtr));
1498 // At this point we know aPtr lands within an allocation page, due to the
1499 // math done in the PtrKind constructor. But if aPtr points to memory
1500 // before the base address of the allocation, we return 0.
1501 uintptr_t index = pk.AllocPageIndex();
1503 MutexAutoLock lock(GMut::sMutex);
1505 void* pageBaseAddr = gMut->AllocPageBaseAddr(lock, index);
1507 if (MOZ_UNLIKELY(aPtr < pageBaseAddr)) {
1508 return 0;
1511 return gMut->PageUsableSize(lock, index);
1514 static size_t metadata_size() {
1515 return sMallocTable.malloc_usable_size(gConst) +
1516 sMallocTable.malloc_usable_size(gMut);
1519 void replace_jemalloc_stats(jemalloc_stats_t* aStats,
1520 jemalloc_bin_stats_t* aBinStats) {
1521 sMallocTable.jemalloc_stats_internal(aStats, aBinStats);
1523 // We allocate our memory from jemalloc, so it has already counted our memory
1524 // usage within "mapped" and "allocated". We must subtract the memory we
1525 // allocated from jemalloc from "allocated" before adding in only the parts
1526 // that we have allocated out to Firefox.
1528 aStats->allocated -= kAllPagesJemallocSize;
1530 size_t allocated = 0;
1532 MutexAutoLock lock(GMut::sMutex);
1534 // Add usable space of in-use allocations to `allocated`.
1535 for (size_t i = 0; i < kNumAllocPages; i++) {
1536 if (gMut->IsPageInUse(lock, i)) {
1537 allocated += gMut->PageUsableSize(lock, i);
1541 aStats->allocated += allocated;
1543 // guards is the gap between `allocated` and `mapped`. In some ways this
1544 // almost fits into aStats->wasted since it feels like wasted memory. However
1545 // wasted should only include committed memory and these guard pages are
1546 // uncommitted. Therefore we don't include it anywhere.
1547 // size_t guards = mapped - allocated;
1549 // aStats.page_cache and aStats.bin_unused are left unchanged because PHC
1550 // doesn't have anything corresponding to those.
1552 // The metadata is stored in normal heap allocations, so it is measured by
1553 // mozjemalloc as `allocated`. Move it into `bookkeeping`.
1554 // It is also reported under explicit/heap-overhead/phc/fragmentation in
1555 // about:memory.
1556 size_t bookkeeping = metadata_size();
1557 aStats->allocated -= bookkeeping;
1558 aStats->bookkeeping += bookkeeping;
1559 }
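// (A worked sketch of the adjustments above, using hypothetical quantities:
// if jemalloc reports allocated = A, PHC's page reservation is
// kAllPagesJemallocSize = R, the in-use PHC pages have a combined usable size
// of U, and metadata_size() = M, then the reported values become
// allocated = A - R + U - M, and bookkeeping grows by M.)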
1561 void replace_jemalloc_ptr_info(const void* aPtr, jemalloc_ptr_info_t* aInfo) {
1562 // We need to implement this properly, because various code locations do
1563 // things like checking that allocations are in the expected arena.
1564 PtrKind pk = gConst->PtrKind(aPtr);
1565 if (pk.IsNothing()) {
1566 // Not a page allocation.
1567 return sMallocTable.jemalloc_ptr_info(aPtr, aInfo);
1568 }
1570 if (pk.IsGuardPage()) {
1571 // Treat a guard page as unknown because there's no better alternative.
1572 *aInfo = {TagUnknown, nullptr, 0, 0};
1573 return;
1574 }
1576 // At this point we know we have an allocation page.
1577 uintptr_t index = pk.AllocPageIndex();
1579 MutexAutoLock lock(GMut::sMutex);
1581 gMut->FillJemallocPtrInfo(lock, aPtr, index, aInfo);
1582 #if DEBUG
1583 LOG("JemallocPtrInfo(%p[%zu]) -> {%zu, %p, %zu, %zu}\n", aPtr, index,
1584 size_t(aInfo->tag), aInfo->addr, aInfo->size, aInfo->arenaId);
1585 #else
1586 LOG("JemallocPtrInfo(%p[%zu]) -> {%zu, %p, %zu}\n", aPtr, index,
1587 size_t(aInfo->tag), aInfo->addr, aInfo->size);
1588 #endif
1589 }
1591 arena_id_t replace_moz_create_arena_with_params(arena_params_t* aParams) {
1592 // No need to do anything special here.
1593 return sMallocTable.moz_create_arena_with_params(aParams);
1594 }
1596 void replace_moz_dispose_arena(arena_id_t aArenaId) {
1597 // No need to do anything special here.
1598 return sMallocTable.moz_dispose_arena(aArenaId);
1599 }
1601 void replace_moz_set_max_dirty_page_modifier(int32_t aModifier) {
1602 // No need to do anything special here.
1603 return sMallocTable.moz_set_max_dirty_page_modifier(aModifier);
1604 }
1606 void* replace_moz_arena_malloc(arena_id_t aArenaId, size_t aReqSize) {
1607 return PageMalloc(Some(aArenaId), aReqSize);
1608 }
1610 void* replace_moz_arena_calloc(arena_id_t aArenaId, size_t aNum,
1611 size_t aReqSize) {
1612 return PageCalloc(Some(aArenaId), aNum, aReqSize);
1613 }
1615 void* replace_moz_arena_realloc(arena_id_t aArenaId, void* aOldPtr,
1616 size_t aNewSize) {
1617 return PageRealloc(Some(aArenaId), aOldPtr, aNewSize);
1618 }
1620 void replace_moz_arena_free(arena_id_t aArenaId, void* aPtr) {
1621 return PageFree(Some(aArenaId), aPtr);
1622 }
1624 void* replace_moz_arena_memalign(arena_id_t aArenaId, size_t aAlignment,
1625 size_t aReqSize) {
1626 return PageMemalign(Some(aArenaId), aAlignment, aReqSize);
1627 }
1629 class PHCBridge : public ReplaceMallocBridge {
1630 virtual bool IsPHCAllocation(const void* aPtr, phc::AddrInfo* aOut) override {
1631 PtrKind pk = gConst->PtrKind(aPtr);
1632 if (pk.IsNothing()) {
1633 return false;
1634 }
1636 bool isGuardPage = false;
1637 if (pk.IsGuardPage()) {
1638 if ((uintptr_t(aPtr) % kPageSize) < (kPageSize / 2)) {
1639 // The address is in the lower half of a guard page, so it's probably an
1640 // overflow. But first check that it is not on the very first guard
1641 // page, in which case it cannot be an overflow, and we ignore it.
1642 if (gConst->IsInFirstGuardPage(aPtr)) {
1643 return false;
1644 }
1646 // Get the allocation page preceding this guard page.
1647 pk = gConst->PtrKind(static_cast<const uint8_t*>(aPtr) - kPageSize);
1649 } else {
1650 // The address is in the upper half of a guard page, so it's probably an
1651 // underflow. Get the allocation page following this guard page.
1652 pk = gConst->PtrKind(static_cast<const uint8_t*>(aPtr) + kPageSize);
1653 }
1655 // Make a note of the fact that we hit a guard page.
1656 isGuardPage = true;
1657 }
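// (Illustrative example, assuming a 4 KiB kPageSize: an access 16 bytes into
// a guard page lands in the lower half and is treated as an overflow off the
// end of the preceding allocation page, while an access 16 bytes before the
// end of a guard page lands in the upper half and is treated as an underflow
// off the start of the following allocation page.)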
1659 // At this point we know we have an allocation page.
1660 uintptr_t index = pk.AllocPageIndex();
1662 if (aOut) {
1663 if (GMut::sMutex.TryLock()) {
1664 gMut->FillAddrInfo(index, aPtr, isGuardPage, *aOut);
1665 LOG("IsPHCAllocation: %zu, %p, %zu, %zu, %zu\n", size_t(aOut->mKind),
1666 aOut->mBaseAddr, aOut->mUsableSize,
1667 aOut->mAllocStack.isSome() ? aOut->mAllocStack->mLength : 0,
1668 aOut->mFreeStack.isSome() ? aOut->mFreeStack->mLength : 0);
1669 GMut::sMutex.Unlock();
1670 } else {
1671 LOG("IsPHCAllocation: PHC is locked\n");
1672 aOut->mPhcWasLocked = true;
1673 }
1674 }
1675 return true;
1676 }
1678 virtual void DisablePHCOnCurrentThread() override {
1679 GTls::DisableOnCurrentThread();
1680 LOG("DisablePHCOnCurrentThread: %zu\n", 0ul);
1683 virtual void ReenablePHCOnCurrentThread() override {
1684 GTls::EnableOnCurrentThread();
1685 LOG("ReenablePHCOnCurrentThread: %zu\n", 0ul);
1688 virtual bool IsPHCEnabledOnCurrentThread() override {
1689 bool enabled = !GTls::IsDisabledOnCurrentThread();
1690 LOG("IsPHCEnabledOnCurrentThread: %zu\n", size_t(enabled));
1691 return enabled;
1692 }
1694 virtual void PHCMemoryUsage(
1695 mozilla::phc::MemoryUsage& aMemoryUsage) override {
1696 aMemoryUsage.mMetadataBytes = metadata_size();
1697 if (gMut) {
1698 MutexAutoLock lock(GMut::sMutex);
1699 aMemoryUsage.mFragmentationBytes = gMut->FragmentationBytes();
1700 } else {
1701 aMemoryUsage.mFragmentationBytes = 0;
1702 }
1703 }
1705 // Enable or disable PHC at runtime. If PHC is disabled, it will still trap
1706 // bad uses of previous allocations but won't track any new allocations.
1707 virtual void SetPHCState(mozilla::phc::PHCState aState) override {
1708 gMut->SetState(aState);
1709 }
1710 };
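// A minimal usage sketch of this bridge (illustrative only; GetPHCBridge() is
// a hypothetical accessor standing in for however a caller obtains the
// ReplaceMallocBridge):
//
//   phc::AddrInfo info;
//   if (GetPHCBridge()->IsPHCAllocation(aCrashAddr, &info)) {
//     // info now holds the base address, the usable size and, when present,
//     // the allocation/free stacks used to annotate a crash report.
//   }
//
//   // Temporarily stop PHC from sampling allocations on this thread:
//   GetPHCBridge()->DisablePHCOnCurrentThread();
//   // ... allocation-sensitive or reentrant work ...
//   GetPHCBridge()->ReenablePHCOnCurrentThread();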
1712 // WARNING: this function runs *very* early -- before all static initializers
1713 // have run. For this reason, non-scalar globals (gConst, gMut) are allocated
1714 // dynamically (so we can guarantee their construction in this function) rather
1715 // than statically. GAtomic and GTls contain simple static data that doesn't
1716 // involve static initializers so they don't need to be allocated dynamically.
1717 void replace_init(malloc_table_t* aMallocTable, ReplaceMallocBridge** aBridge) {
1718 // Don't run PHC if the page size isn't what we statically defined.
1719 jemalloc_stats_t stats;
1720 aMallocTable->jemalloc_stats_internal(&stats, nullptr);
1721 if (stats.page_size != kPageSize) {
1722 return;
1723 }
1725 sMallocTable = *aMallocTable;
1727 // The choices of which functions to replace are complex enough that we set
1728 // them individually instead of using MALLOC_FUNCS/malloc_decls.h.
1730 aMallocTable->malloc = replace_malloc;
1731 aMallocTable->calloc = replace_calloc;
1732 aMallocTable->realloc = replace_realloc;
1733 aMallocTable->free = replace_free;
1734 aMallocTable->memalign = replace_memalign;
1736 // posix_memalign, aligned_alloc & valloc: unset, which means they fall back
1737 // to replace_memalign.
1738 aMallocTable->malloc_usable_size = replace_malloc_usable_size;
1739 // malloc_good_size: the default suffices.
1741 aMallocTable->jemalloc_stats_internal = replace_jemalloc_stats;
1742 // jemalloc_purge_freed_pages: the default suffices.
1743 // jemalloc_free_dirty_pages: the default suffices.
1744 // jemalloc_thread_local_arena: the default suffices.
1745 aMallocTable->jemalloc_ptr_info = replace_jemalloc_ptr_info;
1747 aMallocTable->moz_create_arena_with_params =
1748 replace_moz_create_arena_with_params;
1749 aMallocTable->moz_dispose_arena = replace_moz_dispose_arena;
1750 aMallocTable->moz_arena_malloc = replace_moz_arena_malloc;
1751 aMallocTable->moz_arena_calloc = replace_moz_arena_calloc;
1752 aMallocTable->moz_arena_realloc = replace_moz_arena_realloc;
1753 aMallocTable->moz_arena_free = replace_moz_arena_free;
1754 aMallocTable->moz_arena_memalign = replace_moz_arena_memalign;
1756 static PHCBridge bridge;
1757 *aBridge = &bridge;
1759 #ifndef XP_WIN
1760 // Avoid deadlocks when forking by acquiring our state lock prior to forking
1761 // and releasing it after forking. See |LogAlloc|'s |replace_init| for
1762 // in-depth details.
1764 // Note: This must run after attempting an allocation so as to give the
1765 // system malloc a chance to insert its own atfork handler.
1766 sMallocTable.malloc(-1);
1767 pthread_atfork(GMut::prefork, GMut::postfork_parent, GMut::postfork_child);
1768 #endif
1770 // gConst and gMut are never freed. They live for the life of the process.
1771 gConst = InfallibleAllocPolicy::new_<GConst>();
1772 GTls::Init();
1773 gMut = InfallibleAllocPolicy::new_<GMut>();
1774 {
1775 MutexAutoLock lock(GMut::sMutex);
1776 Delay firstAllocDelay =
1777 Rnd64ToDelay<kAvgFirstAllocDelay>(gMut->Random64(lock));
1778 GAtomic::Init(firstAllocDelay);