This is an example implementation of the _Thread Stack Spoofing_ technique, which aims to evade Malware Analysts, AVs and EDRs looking for references to shellcode's frames in an examined thread's call stack.
8
-
The idea is to walk back the thread's call stack and overwrite return addresses in subsequent function frames, thus masquerading allocations containing malware's code.
9
-
10
-
An implementation may differ, however the idea is roughly similar to what commercial C2 frameworks offer for their agents.
8
+
The idea is to hide references to the shellcode on the thread's call stack, thus masquerading allocations containing malware's code.
11
9
12
10
This implementation, along with my [ShellcodeFluctuation](https://github.com/mgeeky/ShellcodeFluctuation), gives the Offensive Security community sample implementations to catch up with what commercial C2 products offer, so that we can do no worse in our Red Team tooling. 💪
13
11
14
12
15
13
### Implementation has changed
16
14
17
-
The current implementation differs heavily from what was originally published. This is because I realised that there is a much simpler approach to terminate the thread's call stack and hide the shellcode's related frames, by simply writing `0` to the return address of our handler:
15
+
The current implementation differs heavily from what was originally published.
16
+
This is because I realised there is a much simpler approach to terminate the traversal of the thread's call stack and hide the shellcode's related frames, by simply writing `0` to the return address of the first frame we control:
18
17
19
18
```
20
19
void WINAPI MySleep(DWORD _dwMilliseconds)
skipped 9 lines
30
29
31
30
The previous implementation, utilising `StackWalk64`, can be accessed in this [commit c250724](https://github.com/mgeeky/ThreadStackSpoofer/tree/c2507248723d167fb2feddf50d35435a17fd61a2).
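
As a rough illustration of the "write `0`, sleep, restore" idea described above, here is a minimal portable sketch. It is not the project's `MySleep`: `ScopedZeroedSlot` and `sleepWithZeroedSlot` are hypothetical names, and the slot is an ordinary variable standing in for the hook's return-address slot (on MSVC that pointer would presumably come from the `_AddressOfReturnAddress()` intrinsic).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical RAII helper (not taken from the project): zeroes one
// pointer-sized stack slot and puts the saved value back on scope exit.
class ScopedZeroedSlot {
public:
    explicit ScopedZeroedSlot(std::uintptr_t* slot) : slot_(slot), saved_(0) {
        assert(slot_ != nullptr);
        saved_ = *slot_;  // remember the genuine return address
        *slot_ = 0;       // a null return address ends the stack walk here
    }
    ~ScopedZeroedSlot() { *slot_ = saved_; }  // restore before returning

    ScopedZeroedSlot(const ScopedZeroedSlot&) = delete;
    ScopedZeroedSlot& operator=(const ScopedZeroedSlot&) = delete;

private:
    std::uintptr_t* slot_;
    std::uintptr_t saved_;
};

// Hypothetical hook body. `slot` stands in for the hook's own return-address
// slot and `sleepFn` for the real sleep call. Returns the value a stack
// walker would read from the slot while the thread is "asleep".
template <typename SleepFn>
std::uintptr_t sleepWithZeroedSlot(std::uintptr_t* slot, SleepFn sleepFn) {
    ScopedZeroedSlot guard(slot);
    sleepFn();
    return *slot;  // evaluated before `guard` is destroyed
}
```

Using a scope guard here is only a design choice for the sketch: it guarantees the original value is written back on every exit path, which matters because returning through a zeroed slot would crash the thread.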
32
31
33
-
This implementation works nicely on both `Debug` and `Release` under two architectures - `x64` and `x86`.
32
+
This implementation is much more stable and works nicely on both `Debug` and `Release` under two architectures - `x64` and `x86`.
34
33
35
34
36
35
## Demo
skipped 6 lines
43
42
44
43
![spoofed](images/spoofed2.png)
45
44
46
-
Above we can see that the last frame on our call stack is our `MySleep` callback. That immediately brings opportunities for IOCs hunting for threads whose call stacks do not unwind into the following two commonly expected system entry points:
45
+
Above we can see that the last frame on our call stack is our `MySleep` callback.
46
+
One may wonder whether that immediately brings opportunities for IOCs hunting for threads whose call stacks do not unwind into the following two commonly expected thread entry points within system libraries:
47
+
47
48
```
48
49
kernel32!BaseThreadInitThunk+0x14
49
50
ntdll!RtlUserThreadStart+0x21
50
51
```
51
52
52
-
However, a brief examination of my system showed that there are plenty of threads having call stacks not unwinding to the above handlers:
53
+
Although the call stack of the spoofed thread may look rather odd at first, a brief examination of my system showed that there are other threads having call stacks not unwinding to the above handlers as well:
53
54
54
55
![legit call stack](images/legit-call-stack.png)
55
56
56
-
The above screenshot shows an unmodified, unhooked thread of Total Commander x64.
57
+
The above screenshot shows a thread of unmodified **Total Commander x64**. As we can see, its call stack pretty much resembles our own in terms of initial call stack frames.
57
58
58
59
Why should we care about carefully faking our call stack when there are processes exhibiting traits that we can simply mimic?
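
To make the hunting heuristic discussed above concrete, here is a small sketch of the check a defender might run over frames that have already been symbolised. It does no real stack walking; `stripOffset`, `unwindsToThreadEntry` and the sample frame strings in `demoUnwindCheck` are hypothetical, and only the two entry point names come from the text above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: "module!Symbol+0x14" -> "module!Symbol".
std::string stripOffset(const std::string& frame) {
    const std::size_t plus = frame.find('+');
    return plus == std::string::npos ? frame : frame.substr(0, plus);
}

// Returns true when the two outermost frames are the usual thread entry
// points. `frames` is ordered innermost first, as debuggers print it.
bool unwindsToThreadEntry(const std::vector<std::string>& frames) {
    if (frames.size() < 2) return false;
    const std::string& thunk = frames[frames.size() - 2];
    const std::string& start = frames[frames.size() - 1];
    return stripOffset(thunk) == "kernel32!BaseThreadInitThunk" &&
           stripOffset(start) == "ntdll!RtlUserThreadStart";
}

// Illustrative inputs only; the sleep-related frame strings are made up.
void demoUnwindCheck() {
    const std::vector<std::string> truncated = {
        "ntdll!NtDelayExecution", "KERNELBASE!SleepEx", "Example!MySleep"};
    const std::vector<std::string> regular = {
        "Example!Worker", "kernel32!BaseThreadInitThunk+0x14",
        "ntdll!RtlUserThreadStart+0x21"};
    assert(!unwindsToThreadEntry(truncated));
    assert(unwindsToThreadEntry(regular));
}
```

As the Total Commander screenshot suggests, a check like this would also flag legitimate threads, so on its own it is a weak indicator.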
59
60
60
61
61
62
## How it works?
62
63
63
-
This program performs shellcode self-injection (roughly via classic `VirtualAlloc` + `memcpy` + `CreateThread`).
64
-
Then, when the shellcode runs (this implementation specifically targets Cobalt Strike Beacon implants), a Windows function (`kernel32!Sleep`) will be hooked to intercept the moment when Beacon falls asleep.
65
-
Whenever the hooked `MySleep` function gets invoked, it will spoof its own call stack leading to this `MySleep` function and begin sleeping.
66
-
After waiting for the expected amount of time, the thread's call stack gets restored, assuring a stable return and resumption of the shellcode's execution.
67
-
68
64
The rough algorithm is as follows:
69
65
70
66
1. Read shellcode's contents from file.
71
67
2. Acquire all the necessary function pointers from `dbghelp.dll`, call `SymInitialize`
72
68
3. Hook `kernel32!Sleep` pointing back to our callback.
73
-
4. Inject and launch shellcode via `VirtualAlloc` + `memcpy` + `CreateThread`. A slight twist here is that our thread starts from a legitimate `ntdll!RtlUserThreadStart+0x21` address to mimic other threads.
69
+
4. Inject and launch shellcode via `VirtualAlloc` + `memcpy` + `CreateThread`. The thread should start from our `runShellcode` function to avoid having the thread's _StartAddress_ point somewhere unexpected and anomalous (such as `ntdll!RtlUserThreadStart+0x21`).
74
70
5. As soon as Beacon attempts to sleep, our `MySleep` callback gets invoked.
75
-
6. Overwrite the last return address on the stack with `0`, which should effectively terminate the call stack.
71
+
6. We then overwrite the last return address on the stack with `0`, which should effectively terminate the call stack.
76
72
7. Finally, a call to `::SleepEx` is made to let the Beacon sleep while waiting for further communication.
77
73
8. After Sleep is finished, we restore previously saved original function return addresses and execution is resumed.
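
Step 3 above does not say how the hook is installed. One common approach, shown here purely as an assumption and not as the project's code, is to overwrite the start of the target function with a short absolute jump to the callback. The sketch below builds the x64 byte sequence for `mov r10, imm64; jmp r10` and applies it to an ordinary byte buffer; `makeTrampolineX64` and `installPatch` are hypothetical names, and patching a real function would additionally require making its memory writable.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Builds the 13 bytes of `mov r10, imm64; jmp r10` for a 64-bit target.
std::array<std::uint8_t, 13> makeTrampolineX64(std::uint64_t target) {
    std::array<std::uint8_t, 13> code = {
        0x49, 0xBA, 0, 0, 0, 0, 0, 0, 0, 0,  // mov r10, <target>
        0x41, 0xFF, 0xE2                      // jmp r10
    };
    for (int i = 0; i < 8; ++i) {
        // The immediate operand is stored little-endian.
        code[2 + i] = static_cast<std::uint8_t>(target >> (8 * i));
    }
    return code;
}

// Overwrites the first bytes of `prologue` (a plain buffer standing in for
// the hooked function's start) and returns the displaced bytes so the hook
// can be removed later.
std::array<std::uint8_t, 13> installPatch(std::vector<std::uint8_t>& prologue,
                                          std::uint64_t target) {
    assert(prologue.size() >= 13);
    std::array<std::uint8_t, 13> saved{};
    std::copy_n(prologue.begin(), saved.size(), saved.begin());
    const std::array<std::uint8_t, 13> code = makeTrampolineX64(target);
    std::copy(code.begin(), code.end(), prologue.begin());
    return saved;
}
```

Returning the displaced bytes keeps the sketch reversible: writing them back restores the original function start.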
78
74
79
-
Function return addresses are scattered all around the thread's stack memory area, pointed to by `RBP/EBP` register. In order to find them on the stack, we need to firstly collect frame pointers, then dereference them for overwriting:
75
+
Function return addresses are scattered all around the thread's stack memory area, pointed to by `RBP/EBP` register.
76
+
In order to find them on the stack, we first need to collect frame pointers, then dereference them for overwriting:
This precise logic is provided by `walkCallStack` and `spoofCallStack` functions in `main.cpp`.
86
+
The initial implementation of `ThreadStackSpoofer` did that in the `walkCallStack` and `spoofCallStack` functions; however, the current implementation shows that these efforts _are not required to maintain a stealthy call stack_.
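
For reference, the "collect frame pointers, then dereference them" idea can be sketched on a simulated frame chain. This is not the code of `walkCallStack` or `spoofCallStack`: `FrameRecord`, `collectReturnSlots`, `overwriteSlots` and `restoreSlots` are hypothetical names. The layout is an assumption that holds only for code built with frame pointers, where the word at `[RBP]` is the caller's saved `RBP` and the next word is the return address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simulated frame record, mirroring the assumed [RBP] / [RBP + word] layout.
struct FrameRecord {
    FrameRecord* callerFrame;      // saved frame pointer of the caller
    std::uintptr_t returnAddress;  // where this function returns to
};

// Step 1: follow the saved-frame-pointer chain and collect a pointer to
// every return-address slot, innermost frame first.
std::vector<std::uintptr_t*> collectReturnSlots(FrameRecord* frame,
                                                std::size_t maxFrames) {
    std::vector<std::uintptr_t*> slots;
    while (frame != nullptr && slots.size() < maxFrames) {
        slots.push_back(&frame->returnAddress);
        frame = frame->callerFrame;
    }
    return slots;
}

// Step 2: overwrite each slot, returning the originals for a later restore.
std::vector<std::uintptr_t> overwriteSlots(
        const std::vector<std::uintptr_t*>& slots, std::uintptr_t fake) {
    std::vector<std::uintptr_t> saved;
    saved.reserve(slots.size());
    for (std::uintptr_t* slot : slots) {
        saved.push_back(*slot);
        *slot = fake;
    }
    return saved;
}

// Step 3: put the original return addresses back before execution resumes.
void restoreSlots(const std::vector<std::uintptr_t*>& slots,
                  const std::vector<std::uintptr_t>& saved) {
    assert(slots.size() == saved.size());
    for (std::size_t i = 0; i < slots.size(); ++i) {
        *slots[i] = saved[i];
    }
}
```

The `maxFrames` bound is a defensive choice for the sketch, since a corrupted or cyclic chain would otherwise never terminate.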