[mps-discussion] Microsoft C compiler hides references causing heap corruption

Richard Brooksby rb at ravenbrook.com
Thu Jul 10 11:24:22 BST 2014

The MPS team is currently investigating a case of the Microsoft C compiler optimising away the last reference to an object, causing it to be prematurely recycled and corrupt the heap.  I'm posting this to the mps-discussion list for feedback, but also so that I can invite discussion on other lists.

In this case, the compiler optimizes away the base pointer to an object, keeping on the stack only the difference between it and another object.  The function is recursively operating on a tree that it is discovering from the heap, this is the last reference, and in fact, every recursive call creates another unsafe difference.

This in the Memory Pool System <http://www.ravenbrook.com/project/mps/> when "gcbench" program is compiled with Microsoft C 16.00.40219.01 for x64 with the "/Og" "/Oy" "/Ob2" optimisation flags (all included in "/O2").  In another test case, "/Og" was sufficient.

This could have consequences for other conservative collectors (and the BDWGC in particular) as well.

This repro manifests the problem at the Visual Studio x64 command prompt:

   git clone https://github.com/Ravenbrook/mps-temporary.git
   cd mps-temporary
   git checkout 4cf086cf56
   cd code
   nmake /f w3i6mv.nmk VARIETY=hot gcbench.exe
   w3i6mv\hot\gcbench.exe -x 1307308329 amc

You should see this output (or a debugger launch):

   seed: 1763120249
   The MPS detected a problem!
   lockw3.c:78: MPS ASSERTION FAILED: lock->claims == 0

The assertion does not occur near the cause, and it took us a while to track this down, but that's a debugging story for another time.

The problem occurs in update_tree <https://github.com/Ravenbrook/mps-temporary/blob/4cf086cf56a3302ee1ed613dc6380eafe0449033/code/gcbench.c#L129>.

Here is the disassembly, followed by an analysis.

   /* Update tree to be identical tree but with nodes reallocated
    * with probability pupdate.  This avoids writing to vector slots
    * if unecessary. */
   static obj_t update_tree(mps_ap_t ap, obj_t oldtree, unsigned d) {
   00404F74  push        ebp  
   00404F75  mov         ebp,esp  
   00404F77  push        ebx  
   00404F78  push        edi  
     obj_t tree;
     size_t i;
     if (oldtree == objNULL || d == 0)
   00404F79  mov         edi,dword ptr [ebp+0Ch]  
   00404F7C  cmp         edi,0DECEA5EDh  
   00404F82  je          00405022  
   00404F88  xor         ebx,ebx  
   00404F8A  cmp         dword ptr [ebp+10h],ebx  
   00404F8D  je          00405022  
     -- d;
   00404F93  dec         dword ptr [ebp+10h]  
   00404F96  push        esi  
     if (rnd_double() < pupdate) {
   00404F97  call        00401A91  
   00404F9C  fcomp       qword ptr ds:[004942C0h]  
   00404FA2  fnstsw      ax  
   00404FA4  test        ah,5  
   00404FA7  jp          00404FEB  
       tree = mkvector(ap, width);
   00404FA9  push        dword ptr ds:[004942B0h]  
   00404FAF  push        dword ptr [ebp+8]  
   00404FB2  call        00404DB3  
   00404FB7  pop         ecx  
   00404FB8  pop         ecx  
   00404FB9  mov         dword ptr [ebp+0Ch],eax  
       for (i = 0; i < width; ++i) {
   00404FBC  cmp         dword ptr ds:[004942B0h],ebx  
   00404FC2  jbe         0040501C  
   00404FC4  lea         esi,[eax+8]  
   00404FC7  sub         edi,eax  
         obj_t oldsubtree = aref(oldtree, i);
         obj_t subtree = update_tree(ap, oldsubtree, d);
   00404FC9  push        dword ptr [ebp+10h]  
   00404FCC  mov         eax,dword ptr [edi+esi]  
   00404FCF  push        eax  
   00404FD0  push        dword ptr [ebp+8]  
   00404FD3  call        00404F74  
   00404FD8  add         esp,0Ch  
         aset(tree, i, subtree);
   00404FDB  mov         dword ptr [esi],eax  
   00404FDD  inc         ebx  
   00404FDE  add         esi,4  
   00404FE1  cmp         ebx,dword ptr ds:[004942B0h]  
   00404FE7  jb          00404FC9  
     } else {
   00404FE9  jmp         0040501C  
       tree = oldtree;
   00404FEB  mov         dword ptr [ebp+0Ch],edi  
       for (i = 0; i < width; ++i) {
   00404FEE  cmp         dword ptr ds:[004942B0h],ebx  
   00404FF4  jbe         0040501C  
       tree = oldtree;
   00404FF6  lea         esi,[edi+8]  
         obj_t oldsubtree = aref(oldtree, i);
         obj_t subtree = update_tree(ap, oldsubtree, d);
   00404FF9  push        dword ptr [ebp+10h]  
   00404FFC  mov         edi,dword ptr [esi]  
   00404FFE  push        edi  
   00404FFF  push        dword ptr [ebp+8]  
   00405002  call        00404F74  
   00405007  add         esp,0Ch  
         if (subtree != oldsubtree) {
   0040500A  cmp         eax,edi  
   0040500C  je          00405010  
           aset(tree, i, subtree);
   0040500E  mov         dword ptr [esi],eax  
       for (i = 0; i < width; ++i) {
   00405010  inc         ebx  
   00405011  add         esi,4  
   00405014  cmp         ebx,dword ptr ds:[004942B0h]  
   0040501A  jb          00404FF9  
     return tree;
   0040501C  mov         eax,dword ptr [ebp+0Ch]  
   0040501F  pop         esi  
   00405020  jmp         00405024  
       return oldtree;
   00405022  mov         eax,edi  
   00405024  pop         edi  
   00405025  pop         ebx  
   00405026  pop         ebp  
   00405027  ret  

The instruction at 0x404FB9 stores the result of the call to mkvector ('tree') in the stack slot for the second argument ('oldtree').  At that point, oldtree is still in edi, but then the instruction at 0x404FC7 overwrites it (by subtracting 'tree' from it).

So at the point of the recursive call to update_tree (0x404FD3), the stack looks like this:

   ebp + 10:  d
   ebp + 0c:  tree (stored in the argument slot for 'oldtree')
   ebp + 08:  ap
   ebp + 04:  return address
   ebp:       caller's ebp
              caller's ebx
              caller's edi
              caller's esi
   esp:       ap

and these are the registers:

   edi oldtree-tree
   ebx i
   eax oldtree[i]
   esi &tree[i]

So here, there is no pointer to 'oldtree', not even to the interior of it.  There is a pointer to 'tree' (on the stack at ebp + 0c) and a pointer to the interior of it (in the register esi), but not to 'oldtree'.  So that object can be moved or reclaimed in the recursive call (which calls mkvector).  Then after return from the recursive call, we go around the loop (0x404FE7 loops back to 0x404FC9), and (at 0x404FCC) the next oldtree[i] is loaded from the address edi+esi, which is no longer a pointer into the 'oldtree' object.

In the MPS the problem manifests in the single-threaded case, where a GC "flip" (scan of unprotectable references, including registers and stack) occurs during the allocation called from mkvector.  It also manifests in the multi-threaded case, where a GC "flip" can occur between any two mutator instructions.

We have done some work  on test case reduction. In particular we've tried, first, inlining the essential parts of the Dylan object format, and second, making the object format clearer by representing objects as unions (as in our Scheme examples) instead of opaque words.

What we found was that when I changed obj_t (the type of the Dylan objects) from mps_word_t (an integer type) to a pointer type, the bug went away. This suggests a hypothesis: maybe Microsoft C only applies the optimization we are worried about (storing the difference between two values) if the values belong to an integer type. If that’s right, then things aren’t as bad as we fear.

Converting a pointer to an integer and vice versa is a implementation-defined operation in ANSI Standard C. The defined conversions for pointers are given in § of the C99 standard, which says:

> 5. An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation.
> 6. Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined. If the result cannot be represented in the integer type, the behavior is undefined. The result need not be in the range of values of any integer type.

Microsoft Visual Studio's list of implementation-defined behaviours is here <http://msdn.microsoft.com/en-us/library/3d61ah0s.aspx> but we cannot find any information on pointer–integer conversion.

We don't think any of this affects the fact that there is an issue here for garbage collectors, because the way that mutator programs are written is somewhat out of our control.  It at least reduces the set of programs whose memory we can successfully manage.


"A Proposal for Garbage-Collector-Safe C Compilation" by Hans Boehm and David Chase from 1992 <http://hboehm.info/gc/papers/boecha.ps.gz>.

More information about the MPS-discussion mailing list