(The fix and CVE number for this issue are unknown)
A specially crafted web-page can cause a type confusion in HTML layout in Microsoft Internet Explorer 11. An attacker might be able to exploit this issue to execute arbitrary code.
Microsoft Internet Explorer 11
An attacker would need to get a target user to open a specially crafted web-page. Disabling JavaScript should prevent an attacker from triggering the vulnerable code path.
Internally MSIE uses various lists of linked CTreePos objects to represent the DOM tree. For HTML/SVG elements a CTreeNode object is created, which embeds two CTreePos instances: one that contains information about the first child of the element and one that indicates the next sibling or parent of the element. For text nodes an object containing only one CTreePos is created, as such nodes never have any children. CTreePos instances have various flags set. This includes a flag that indicates if they are the first (fTPBegin) or second (fTPEnd) CTreePos instance for an element, or the only instance for a text node (fTPText).
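The role of these type flags can be sketched in a few lines of C++. This is only an illustration: the flag values are the best-guess values from the author's pseudo-code further down, while `PosKind`, `ClassifyTreePos` and `IsEmbeddedInTreeNode` are hypothetical names that do not exist in MSHTML.

```cpp
#include <cassert>
#include <cstdint>

// Type bits as listed in the author's pseudo-code (best guesses, not official).
constexpr std::uint32_t fTPBegin    = 0x01;
constexpr std::uint32_t fTPEnd      = 0x02;
constexpr std::uint32_t fTPText     = 0x04;
constexpr std::uint32_t fTPTypeMask = 0x0f;

// Hypothetical names for the cases described in the text.
enum class PosKind { ElementBegin, ElementEnd, Text, Other };

// Hypothetical helper: classify a CTreePos by its type bits only.
inline PosKind ClassifyTreePos(std::uint32_t flags) {
    switch (flags & fTPTypeMask) {
        case fTPBegin: return PosKind::ElementBegin;
        case fTPEnd:   return PosKind::ElementEnd;
        case fTPText:  return PosKind::Text;
        default:       return PosKind::Other;
    }
}

// Only begin and end positions are embedded in a CTreeNode.
inline bool IsEmbeddedInTreeNode(std::uint32_t flags) {
    const PosKind kind = ClassifyTreePos(flags);
    return kind == PosKind::ElementBegin || kind == PosKind::ElementEnd;
}

// Small usage example; bits outside the type mask are ignored.
inline void ExampleUsage() {
    assert(IsEmbeddedInTreeNode(fTPBegin | 0x20));
    assert(!IsEmbeddedInTreeNode(fTPText));
}
```

A test like `IsEmbeddedInTreeNode` is the kind of guard that any code converting a CTreePos pointer into a CTreeNode pointer would need to apply first.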
The CTreePos:: method of a CTreePos instance embedded in a CTreeNode can be used to calculate a pointer to the CTreeNode. It determines if the CTreePos instance is the first or second in the CTreeNode by looking at the fTPBegin flag and subtracts the offset of this CTreePos object in a CTreeNode object to calculate the address of the latter. This method assumes that the CTreePos instance is part of a CTreeNode and not a TextNode. It will yield invalid results when called on the latter. In a TextNode, the CTreePos does not have the fTPBegin flag set, so the code assumes this is the second CTreePos instance in a CTreeNode object and subtracts 0x24 from its address to calculate the address of the CTreeNode. Since the CTreePos instance is the first element in a TextNode, the returned address will be 0x24 bytes before the TextNode, pointing to memory that is not part of the object.
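The address arithmetic can be illustrated with plain integers, which avoids touching real memory. The sketch below is not the MSHTML code: the 0x24 offset comes from the description above, the 0x0C offset of the first CTreePos is taken from the author's best-guess pseudo-code further down, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t fTPBegin = 0x01;
constexpr std::uint32_t fTPEnd   = 0x02;
constexpr std::uint32_t fTPText  = 0x04;

// Offsets of the two embedded CTreePos instances inside a CTreeNode
// (best guesses from the author's pseudo-code).
constexpr std::uintptr_t kBeginOffset = 0x0C;
constexpr std::uintptr_t kEndOffset   = 0x24;

// Mirrors the behavior described in the text: only fTPBegin is inspected,
// so every position without it is treated as the second CTreePos.
inline std::uintptr_t UncheckedNodeAddress(std::uintptr_t posAddr,
                                           std::uint32_t flags) {
    const std::uintptr_t offset = (flags & fTPBegin) ? kBeginOffset : kEndOffset;
    assert(posAddr >= offset);  // keep the illustration free of wrap-around
    return posAddr - offset;
}

// Hypothetical checked variant: returns 0 for positions that are not
// embedded in a CTreeNode, such as the one inside a TextNode.
inline std::uintptr_t CheckedNodeAddress(std::uintptr_t posAddr,
                                         std::uint32_t flags) {
    if (!(flags & (fTPBegin | fTPEnd))) return 0;
    return UncheckedNodeAddress(posAddr, flags);
}
```

For a TextNode at address 0x1000, whose CTreePos sits at offset 0, `UncheckedNodeAddress(0x1000, fTPText)` yields 0x0FDC, which is 0x24 bytes before the object. The checked variant returns 0 for the same input.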
Note that this behavior is very similar to another issue I found around the same time, in that that issue also caused the code to access memory 0x24 bytes before the start of a memory region containing an object. Looking back, I believe that both issues may have had the same root cause and were fixed at the same time.
The CGeneratedContent::HasGeneratedSVGMarker method walks the DOM using one of the CTreePos linked lists. It looks for any descendant node of an element that has a CTreePos instance with a specific flag set. If found, the CTreePos:: method is called to find the related CTreeNode, without checking if the CTreePos is indeed part of a CTreeNode. If a certain flag is set on this CTreeNode, it returns true. Otherwise it continues scanning. If nothing is found, it returns false.
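To make the missing check concrete, here is a simplified model of such a scan with the check added. It is a sketch under several assumptions and not a reconstruction of the real method: `Pos` and `Node` are stand-ins for CTreePos and CTreeNode, `owner` replaces the pointer arithmetic with an explicit pointer, and `kScanFlag` is a made-up value because the write-up does not name the flag being tested.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t fTPBegin  = 0x01;
constexpr std::uint32_t fTPEnd    = 0x02;
constexpr std::uint32_t fTPText   = 0x04;
constexpr std::uint32_t kScanFlag = 0x1000;  // hypothetical "specific flag"

struct Node {                 // stand-in for CTreeNode
    bool hasMarkerFlag;       // stand-in for the "certain flag" on the node
};

struct Pos {                  // stand-in for CTreePos
    std::uint32_t flags;
    const Node*   owner;      // model only: nullptr for text positions
    const Pos*    next;       // next position in the list being walked
};

// Returns true if a flagged position that belongs to an element leads to a
// node with the marker flag set. Text positions are skipped, which is the
// check the text describes as missing.
inline bool HasMarkerChecked(const Pos* first) {
    for (const Pos* p = first; p != nullptr; p = p->next) {
        if (!(p->flags & kScanFlag)) continue;            // not a candidate
        if (!(p->flags & (fTPBegin | fTPEnd))) continue;  // not in a node
        if (p->owner != nullptr && p->owner->hasMarkerFlag) return true;
    }
    return false;
}

// Small usage example: a flagged text position is ignored safely.
inline void ExampleUsage() {
    Pos text{fTPText | kScanFlag, nullptr, nullptr};
    assert(!HasMarkerChecked(&text));
}
```

In this model a flagged text position is simply skipped. Without the second `continue`, the scan would go on to treat the text position as if it were embedded in a node, which corresponds to the bad cast described above.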
The repro creates a situation where the CGeneratedContent::HasGeneratedSVGMarker method is called on an SVG path element which has a TextNode instance as a descendant with the right flags set to cause it to call CTreePos:: on this TextNode. This leads to type confusion/a bad cast where a pointer that points before a TextNode is used as a pointer to a CTreeNode.
While reversing the relevant parts, I created the following pseudo-code to illustrate the issue:
enum eTreePosFlags {
  fTPBegin          = 0x01,  // if set, this is a markup node
  fTPEnd            = 0x02,  // if set, this is a markup node
  fTPText           = 0x04,  // if set, this is a markup node
  fTPPointer        = 0x08,  // if set, this is not a markup node
  fTPTypeMask       = 0x0f,
  fTPLeftChild      = 0x10,
  fTPLastChild      = 0x20,  // poNextSiblingOrParent => fTPLastChild ? parent : sibling
  fTPData2Pos       = 0x40,  // valid if fTPPointer is set
  fTPDataPos        = 0x80,
  fTPUnknownFlag100 = 0x100, // if set, this is not a markup node
}
struct CTreePos {
  /*offs size*/ // THE BELOW ARE BEST GUESSES BASED ON INADEQUATE INFORMATION!!
  /*0000 0004*/ eTreePosType fFlags00;
  /*0004 0004*/ UINT uCharsCount04; // Seems to be counting some chars - not sure what exactly
  /*0008 0004*/ CTreePos* poFirstChild; // can be NULL if no children exist.
  /*000C 0004*/ CTreePos* poNextSiblingOrParent; // fFlags00 & fTPLastChild ? parent end tag : sibling start tag
  /*0010 0004*/ CTreePos* poThreadLeft10; // fFlags00 & fTPBegin ? previous sibling or parent : last child or start tag
  /*0014 0004*/ CTreePos* poThreadRight14; // fFlags00 & fTPBegin ? first child or end tag :
  /*0018 0004*/ flags (0x10 = something with CDATA
  /*0028 0004*/
}
struct CTreeNode {
  /*offs size*/ // THE BELOW ARE BEST GUESSES BASED ON INADEQUATE INFORMATION!!
  /*0000 0004*/ CElement* poElement00;
  /*0004 0004*/ CTreeNode* poParent04;
  /*0008 0004*/ DWORD dwUnknown08; // flags?
  /*000C 0018*/ CTreePos oTreePosBegin0C; // represents the position in the document immediately before the start tag
  /*0024 0018*/ CTreePos oTreePosEnd24; // represents the position in the document immediately after the end tag
  /*003C ????*/ Unknown
}
struct TextNode { // I did not figure out what this is called in MSIE
  /*0000 0018*/ CTreePos oTreePosEnd00; // represents the position in the document immediately after the node.
  /*0018 0014*/ Unknown
}
CTreeNode* CTreePos::

If you replace the <q> tag with an <a> tag in the repro, or insert a <script> tag before the <svg> tag, the repro does not trigger an access violation. At that point it is possible to use document. as well as recursively walk document. to get an idea of what the DOM tree looks like around the time of the crash.
document.
:
<html>
  <head>
  </head>
  <body>
    <svg xmlns="http://www.w3.org/2000/svg">
      <path marker-start="url("#")">
        <title>
          <q>
            <button>      // no closing tag.
            <script>      // script is a sibling of button
              #text       // snipped
            </script>
          </q>
        </title>          // Things get really weird here:
        </title>
      </path>             // all svg close tags are doubled!?
      </path>
    </svg>                // Not sure what this means.
    </svg>
  </body>
</html>
Walking document.
:
<html>
  <head>
  <body>
    <svg>                 // I did not look at attributes
      <path>              // ^^^ same here
        <title>
          <q>
            <button>
              <script>    // script is a child of button
                #text     // snipped
I did not find any code path that could lead to exploitation. However, I did not do a thorough step through of the code to find out if and how I might control execution flow upwards in the stack. Also, it appears trivial to have MSIE survive the initial crash by massaging the heap. It might be possible that other methods are affected by a similar issue and that further DOM manipulations can be used to trigger a more interesting code path.
This report was generated using a predecessor of BugId, a Python script created to detect, analyze and id application bugs. Don't waste time manually analyzing issues and writing reports but try BugId out yourself today! You'll get even better reports than this one with the current version.
id: MSHTML.dll!CGeneratedContent::HasGeneratedSVGMarker Arbitrary~010 AVR(64BAE13E)
description: Security: Attempt to read from unallocated arbitrary memory (@0x0F14D010) in MSHTML.dll!CGeneratedContent::HasGeneratedSVGMarker
note: Based on this information, this is expected to be a security issue!