README.md
skipped 30 lines
31
31
The previous implementation, utilising `StackWalk64`, can be accessed in this [commit c250724](https://github.com/mgeeky/ThreadStackSpoofer/tree/c2507248723d167fb2feddf50d35435a17fd61a2).
32
32
33
33
34
+
## Demo
35
+
36
+
This is what a call stack may look like when it is **NOT** spoofed:
37
+
38
+
![not-spoofed](images/not-spoofed.png)
39
+
40
+
And this is how it looks when thread stack spoofing is enabled:
41
+
42
+
![spoofed](images/spoofed2.png)
43
+
44
+
Above we can see that the last frame on our call stack is our `MySleep` callback. That immediately creates an opportunity for IOCs that hunt for threads whose call stacks do not unwind into the following two commonly expected system entry points:
45
+
```
46
+
kernel32!BaseThreadInitThunk+0x14
47
+
ntdll!RtlUserThreadStart+0x21
48
+
```
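The hunting idea above can be sketched as a simple check over an already symbolized call stack. This is a minimal, portable illustration and not a real detector: `unwindsToThreadStart` and `stripOffset` are hypothetical names, frames are plain strings ordered innermost first, and a real tool would have to walk and symbolize the thread's stack itself.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: drops the "+0x14"-style offset from a symbolized frame.
std::string stripOffset(const std::string& frame) {
    assert(!frame.empty());  // an empty frame string is a caller bug
    return frame.substr(0, frame.find('+'));
}

// Returns true when the two outermost frames are the commonly expected
// thread entry points. `frames` is ordered innermost first, outermost last.
bool unwindsToThreadStart(const std::vector<std::string>& frames) {
    if (frames.size() < 2) return false;
    const std::string& outermost = frames[frames.size() - 1];
    const std::string& previous  = frames[frames.size() - 2];
    return stripOffset(previous) == "kernel32!BaseThreadInitThunk"
        && stripOffset(outermost) == "ntdll!RtlUserThreadStart";
}
```

A stack whose outermost frame is the `MySleep` callback fails this check, which is exactly the trait such an IOC would look for.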
49
+
50
+
However, a brief examination of my system showed that there are plenty of threads whose call stacks do not unwind to the above handlers:
51
+
52
+
![legit call stack](images/legit-call-stack.png)
53
+
54
+
The above screenshot shows an unmodified, unhooked thread of Total Commander x64.
55
+
56
+
Why should we care about carefully faking our call stack when there are processes exhibiting traits that we can simply mimic?
57
+
58
+
34
59
## How it works?
35
60
36
61
This program performs shellcode self-injection (roughly via the classic `VirtualAlloc` + `memcpy` + `CreateThread`).
skipped 25 lines
62
87
This precise logic is provided by `walkCallStack` and `spoofCallStack` functions in `main.cpp`.
63
88
64
89
65
-
## Demo
90
+
## Example run
91
+
92
+
Use case:
66
93
67
-
This is what a call stack may look like when it is **NOT** spoofed:
94
+
```
95
+
C:\> ThreadStackSpoofer.exe <shellcode> <spoof>
96
+
```
68
97
69
-
![not-spoofed](images/not-spoofed.png)
98
+
Where:
99
+
- `<shellcode>` is a path to the shellcode file
100
+
- `<spoof>` enables thread stack spoofing when set to `1` or `true`; any other value disables it.
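The argument rules above can be illustrated with a small parser. This is a sketch under assumptions, not the code from `main.cpp`: the `Options` struct and `parseArgs` function are hypothetical names, and the comparison is assumed to be case-sensitive.

```cpp
#include <cassert>
#include <string>

// Hypothetical container for the two documented command-line parameters.
struct Options {
    std::string shellcodePath;
    bool spoof = false;
};

// Fills `out` from the command line and returns false when the argument
// count does not match "<program> <shellcode> <spoof>".
bool parseArgs(int argc, const char* const argv[], Options& out) {
    assert(argv != nullptr);
    if (argc != 3) return false;
    out.shellcodePath = argv[1];
    const std::string flag = argv[2];
    // Only "1" and "true" enable spoofing; anything else disables it.
    out.spoof = (flag == "1" || flag == "true");
    return true;
}
```

With this sketch, `ThreadStackSpoofer.exe beacon.bin 1` would enable spoofing, while `ThreadStackSpoofer.exe beacon.bin 0` would not (`beacon.bin` is a placeholder file name).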
70
101
71
-
And this is how it looks when thread stack spoofing is enabled:
72
102
73
-
![spoofed](images/spoofed2.png)
103
+
Example run that spoofs beacon's thread call stack:
74
104
75
-
Above we can see that the last frame on our call stack is our `MySleep` callback. That immediately creates an opportunity for IOCs that hunt for threads whose call stacks do not unwind into the following two commonly expected system entry points:
[>] Original return address: 0x1926747bd51. Finishing call stack...
80
112
81
-
However, a brief examination of my system showed that there are plenty of threads whose call stacks do not unwind to the above handlers:
113
+
===> MySleep(5000)
82
114
83
-
![legit call stack](images/legit-call-stack.png)
115
+
[<] Restoring original return address...
116
+
[>] Original return address: 0x1926747bd51. Finishing call stack...
84
117
85
-
The above screenshot shows an unmodified, unhooked thread of Total Commander x64.
118
+
===> MySleep(5000)
86
119
87
-
Why should we care about carefully faking our call stack when there are processes exhibiting traits that we can simply mimic?
88
-
120
+
[<] Restoring original return address...
121
+
[>] Original return address: 0x1926747bd51. Finishing call stack...
122
+
```
89
123
124
+
---
90
125
91
126
## How do I use it?
92
127
skipped 7 lines
100
135
- **Clear out any leftovers from Reflective Loader** to avoid in-memory signatured detections
101
136
- **Unhook everything you might have hooked** (such as AMSI, ETW, WLDP) before sleeping and then re-hook afterwards.
102
137
138
+
139
+
---
103
140
104
141
## Actually this is not (yet) a true stack spoofing
105
142
106
-
As it's been pointed out to me, the technique here is not _yet_ truly holding up to its name for being a _stack spoofer_. Since we're merely overwriting return addresses on the thread's stack, we're not spoofing the remaining areas of the stack itself. Moreover we leave a sequence of `::CreateFileW` addresses which looks very odd and let the thread be unable to unwind its stack. That's because `CreateFile` was meant to solely act as an example, we're making the stack non-unwindable but still obscuring references to our shellcode memory pages.
143
+
As it's been pointed out to me, the technique here is not _yet_ truly holding up to its name for being a _stack spoofer_. Since we're merely overwriting return addresses on the thread's stack, we're not spoofing the remaining areas of the stack itself. Moreover, we're leaving our call stack _non-unwindable_, making it look anomalous since the system will not be able to properly walk the entire chain of call stack frames.
107
144
108
145
Although I'm aware of these shortcomings, I've left it as is for the moment, since I cared mostly about evading automated scanners that could iterate over processes, enumerate their threads, walk those threads' stacks and pick up on any return address pointing back to non-image memory (such as `MEM_PRIVATE`, the type allocated dynamically by `VirtualAlloc` and friends). A focused malware analyst would immediately spot the oddity and consider the thread rather unusual, hunting down our implant. I'm more than sure about it. Yet, I don't believe that today's automated scanners such as AV/EDR implement heuristics that would _actually walk each thread's stack_ to verify whether it's unwindable `¯\_(ツ)_/¯`.
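The kind of automated check described above can be sketched from the defender's side. This is a simplified, portable model and not a real scanner: `Region`, `classify` and `suspiciousReturnAddresses` are hypothetical names, and the region list stands in for what a real tool would query from the OS (on Windows, for example, `VirtualQueryEx` reports `MEM_IMAGE`, `MEM_MAPPED` or `MEM_PRIVATE` in `MEMORY_BASIC_INFORMATION::Type`).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified stand-ins for the memory types a scanner would get from the OS.
enum class RegionType { Image, Mapped, Private };

struct Region {
    std::uint64_t base;
    std::uint64_t size;
    RegionType type;
};

// Returns the type of the region containing `addr`.
// Assumption for this sketch: addresses outside every known region count as Private.
RegionType classify(const std::vector<Region>& regions, std::uint64_t addr) {
    for (const Region& r : regions) {
        assert(r.size > 0);  // an empty region would be a bug in the caller's data
        if (addr >= r.base && addr - r.base < r.size) return r.type;
    }
    return RegionType::Private;
}

// Collects every return address that does not point into image-backed memory.
std::vector<std::uint64_t> suspiciousReturnAddresses(
        const std::vector<Region>& regions,
        const std::vector<std::uint64_t>& returnAddresses) {
    std::vector<std::uint64_t> hits;
    for (std::uint64_t addr : returnAddresses) {
        if (classify(regions, addr) != RegionType::Image) hits.push_back(addr);
    }
    return hits;
}
```

A check of this shape only sees the return addresses it is handed, so once the address pointing into shellcode pages has been overwritten during sleep, it has nothing to flag. Verifying that the stack actually unwinds would require a different heuristic.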
109
146
110
147
Surely this project (and the commercial implementations found in C2 frameworks) gives AV & EDR vendors reasons to consider implementing appropriate heuristics covering such a novel evasion technique.
111
148
112
-
The research on the subject is not yet finished and hopefully will result in a better quality _Stack Spoofing_ in the upcoming days. Nonetheless, I'm releasing what I've got so far in the hope of sparking inspiration and interesting the community in further research into this area.
149
+
In order to improve this technique, one can aim for a true _Thread Stack Spoofer_ by inserting carefully crafted fake stack frames established in a reverse-unwinding process.
150
+
Read more on this idea below.
113
151
114
-
Next areas for improving the outcome are to research how we can _exchange_ or copy stacks with one of the following ideas:
115
152
116
-
1. utilising [`GetCurrentThreadStackLimits`](https://docs.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-getcurrentthreadstacklimits)/`NtQueryInformationThread` from a legitimate thread running `kernel32!Sleep(INFINITE)`
117
-
118
-
2. manipulating our Beacon's thread `TEB/TIB` structures and fields such as `TebBaseAddress`, `NT_TIB.StackBase / NT_TIB.StackLimit` by swapping them with values taken from another legitimate thread.
119
-
120
-
3. playing with `RBP/EBP` and `RSP/ESP` pointers on a paused Beacon's thread to change stacks in a similar manner to ROP chains - by swapping values of these registers while Beacon's thread is suspended.
121
-
122
-
4. Create a new user stack with `RtlCreateUserStack` / `RtlFreeUserStack` and exchange stacks from a Beacons thread into that newly created one
123
-
124
-
125
-
## Implementing a true Thread Stack Spoofer
153
+
### Implementing a true Thread Stack Spoofer
126
154
127
155
An hours-long conversation with [namazso](https://twitter.com/namazso) taught me that in order to aim for a proper thread stack spoofer we would need to reverse the x64 call stack unwinding process.
128
156
Firstly, one needs to carefully study the stack unwinding process explained in (a) linked below. When the system traverses a thread's call stack on the x64 architecture, it will not simply rely on return addresses scattered around the thread's stack, but rather it: