    README.md
    skipped 4 lines
    5 5  ## Intro
    6 6   
    7 7  This is an example implementation for _Thread Stack Spoofing_ technique aiming to evade Malware Analysts, AVs and EDRs looking for references to shellcode's frames in an examined thread's call stack.
    8  -The idea is to walk back thread's call stack and overwrite return addresses in subsequent function frames thus masquerading allocations containing malware's code.
    9  - 
    10  -An implementation may differ, however the idea is roughly similar to what commercial C2 frameworks offer for its agents.
     8 +The idea is to hide references to the shellcode on thread's call stack thus masquerading allocations containing malware's code.
    11 9   
    12 10  Implementation along with my [ShellcodeFluctuation](https://github.com/mgeeky/ShellcodeFluctuation) brings Offensive Security community sample implementations to catch up on the offering made by commercial C2 products, so that we can do no worse in our Red Team toolings. 💪
    13 11   
    14 12   
    15 13  ### Implementation has changed
    16 14   
    17  -Current implementation differs heavily to what was originally published. This is because I realised that there is a way simpler approach to terminate thread's call stack and hide shellcode's related frames by simply writing `0` to the return address of our handler:
     15 +Current implementation differs heavily to what was originally published.
     16 +This is because I realised there is a much simpler way to terminate the thread's call stack traversal and hide the shellcode-related frames by simply writing `0` to the return address of the first frame we control:
    18 17   
    19 18  ```
    20 19  void WINAPI MySleep(DWORD _dwMilliseconds)
    skipped 9 lines
    30 29   
    31 30  The previous implementation, utilising `StackWalk64` can be accessed in this [commit c250724](https://github.com/mgeeky/ThreadStackSpoofer/tree/c2507248723d167fb2feddf50d35435a17fd61a2).
    32 31   
    33  -This implementation works nicely on both `Debug` and `Release` under two architectures - `x64` and `x86`.
     32 +This implementation is much more stable and works nicely on both `Debug` and `Release` under two architectures - `x64` and `x86`.
    34 33   
    35 34   
    36 35  ## Demo
    skipped 6 lines
    43 42   
    44 43  ![spoofed](images/spoofed2.png)
    45 44   
    46  -Above we can see that the last frame on our call stack is our `MySleep` callback. That immediately brings opportunities for IOCs hunting for threads having call stacks not unwinding into following two commonly expected system entry points:
     45 +Above we can see that the last frame on our call stack is our `MySleep` callback.
     46 +One can wonder if that immediately brings opportunities for IOCs hunting for threads having call stacks not unwinding into following two commonly expected thread entry points within system libraries:
     47 + 
    47 48  ```
    48 49  kernel32!BaseThreadInitThunk+0x14
    49 50  ntdll!RtlUserThreadStart+0x21
    50 51  ```
    51 52   
    52  -However a brief examination of my system shown, that there are plenty of threads having call stacks not unwinding to the above handlers:
     53 +Although the call stack of the spoofed thread may look rather odd at first, a brief examination of my system showed that there are other threads whose call stacks do not unwind to the above handlers either:
    53 54   
    54 55  ![legit call stack](images/legit-call-stack.png)
    55 56   
    56  -The above screenshot shows unmodified, unhooked, thread of Total Commander x64.
     57 +The above screenshot shows a thread of unmodified **Total Commander x64**. As we can see, its call stack pretty much resembles our own in terms of initial call stack frames.
    57 58   
    58 59  Why should we care about carefully faking our call stack when there are processes exhibiting traits that we can simply mimic?
    59 60   
    60 61   
    61 62  ## How it works?
    62 63   
    63  -This program performs self-injection shellcode (roughly via classic `VirtualAlloc` + `memcpy` + `CreateThread`).
    64  -Then when shellcode runs (this implementation specifically targets Cobalt Strike Beacon implants) a Windows function will be hooked intercepting moment when Beacon falls asleep `kernel32!Sleep`.
    65  -Whenever hooked `MySleep` function gets invoked, it will spoof its own call stack leading to this `MySleep` function and begin sleeping.
    66  -Having awaited for expected amount of time, the Thread's call stack will get restored assuring stable return and shellcode's execution resumption.
    67  - 
    68 64  The rough algorithm is following:
    69 65   
    70 66  1. Read shellcode's contents from file.
    71 67  2. Acquire all the necessary function pointers from `dbghelp.dll`, call `SymInitialize`
    72 68  3. Hook `kernel32!Sleep` pointing back to our callback.
    73  -4. Inject and launch shellcode via `VirtualAlloc` + `memcpy` + `CreateThread`. A slight twist here is that our thread starts from a legitimate `ntdll!RltUserThreadStart+0x21` address to mimic other threads
     69 +4. Inject and launch shellcode via `VirtualAlloc` + `memcpy` + `CreateThread`. The thread should start from our `runShellcode` function to avoid having Thread's _StartAddress_ point into somewhere unexpected and anomalous (such as `ntdll!RtlUserThreadStart+0x21`)
    74 70  5. As soon as Beacon attempts to sleep, our `MySleep` callback gets invoked.
    75  -6. Overwrite last return address on the stack to `0` which effectively should finish the call stack.
     71 +6. We then overwrite last return address on the stack to `0` which effectively should finish the call stack.
    76 72  7. Finally a call to `::SleepEx` is made to let the Beacon's sleep while waiting for further communication.
    77 73  8. After Sleep is finished, we restore previously saved original function return addresses and execution is resumed.
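
Steps 6 to 8 above can be sketched roughly as follows. This is a minimal, portable illustration rather than the repository's code: `sleepWithTerminatedStack` and its `wait` callback are hypothetical names, and the slot pointer is assumed to be obtained inside the hook (on MSVC, for example, via the `_AddressOfReturnAddress()` intrinsic), with `wait` standing in for the call to `::SleepEx`.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical helper: temporarily zeroes a return-address slot while `wait` runs.
// `returnAddressSlot` is assumed to point at the return address of the first
// frame we control; `wait` stands in for the actual sleep call.
inline void sleepWithTerminatedStack(std::uintptr_t* returnAddressSlot,
                                     const std::function<void()>& wait)
{
    assert(returnAddressSlot != nullptr);

    // Save the original return address so it can be put back later.
    const std::uintptr_t original = *returnAddressSlot;

    // A null return address makes a stack walker stop at this frame.
    *returnAddressSlot = 0;

    // Sleep while the call stack appears to end here.
    wait();

    // Restore the slot so the function can return to its real caller.
    *returnAddressSlot = original;
}
```

The important design point is that the restore happens before the hook returns; otherwise the thread would jump to address `0` as soon as the sleep ends.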
    78 74   
    79  -Function return addresses are scattered all around the thread's stack memory area, pointed to by `RBP/EBP` register. In order to find them on the stack, we need to firstly collect frame pointers, then dereference them for overwriting:
     75 +Function return addresses are scattered all around the thread's stack memory area, pointed to by `RBP/EBP` register.
     76 +In order to find them on the stack, we need to firstly collect frame pointers, then dereference them for overwriting:
    80 77   
    81 78  ![stack frame](images/frame0.png)
    82 79   
    skipped 3 lines
    86 83   *(PULONG_PTR)(frameAddr + sizeof(void*)) = Fake_Return_Address;
    87 84  ```
    88 85   
    89  -This precise logic is provided by `walkCallStack` and `spoofCallStack` functions in `main.cpp`.
     86 +Initial implementation of `ThreadStackSpoofer` did that in `walkCallStack` and `spoofCallStack` functions, however the current implementation shows that these efforts _are not required to maintain stealthy call stack_.
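
For reference, the frame-pointer walk described above can be modelled without touching a real stack. The sketch below is an assumption-laden simulation, not the original `walkCallStack` / `spoofCallStack` code: `SimFrame`, `SavedSlot`, `collectReturnSlots`, `spoofSlots` and `restoreSlots` are hypothetical names, and each simulated frame simply mirrors the layout in which the saved frame pointer sits at `frame + 0` and the return address at `frame + sizeof(void*)`.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Simulated stack frame following the classic frame-pointer layout:
// the caller's frame pointer first, then the return address.
struct SimFrame {
    SimFrame* callerFrame;
    std::uintptr_t returnAddress;
};

// Remembers where a return address lives and what it originally held.
struct SavedSlot {
    std::uintptr_t* slot;
    std::uintptr_t original;
};

// Walk the frame chain and collect every return-address slot with its value.
std::vector<SavedSlot> collectReturnSlots(SimFrame* frame, std::size_t maxFrames)
{
    std::vector<SavedSlot> slots;
    while (frame != nullptr && slots.size() < maxFrames) {
        slots.push_back({&frame->returnAddress, frame->returnAddress});
        frame = frame->callerFrame;
    }
    return slots;
}

// Overwrite every collected return address with a fake value.
void spoofSlots(const std::vector<SavedSlot>& slots, std::uintptr_t fakeAddress)
{
    for (const auto& s : slots) {
        *s.slot = fakeAddress;
    }
}

// Put the original return addresses back.
void restoreSlots(const std::vector<SavedSlot>& slots)
{
    for (const auto& s : slots) {
        *s.slot = s.original;
    }
}
```

Collecting the slots first and writing afterwards keeps the original values available for the restore step, which is what lets execution resume safely after the sleep.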
    90 87   
    91 88   
    92 89  ## Example run
    skipped 194 lines