Can confirm. Matching decompilation in particular (where you pick the original compiler, guess at the source, compile, compare the assembly, and repeat until it matches) is very token-intensive, but it's now very viable: https://news.ycombinator.com/item?id=46080498
Of course LLMs see a lot more source-assembly pairs than even skilled reverse engineers, so this makes sense. Any area where you can get unlimited training data is one we expect to see top-tier performance from LLMs.
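For anyone who hasn't seen the loop, the "compare assembly" step can be sketched roughly like this. It's a minimal sketch, not anyone's actual tooling: the function names are hypothetical, it assumes `;` starts a comment in your disassembler's output, and the compile and disassemble steps are left out because they depend entirely on the toolchain being matched.

```python
import difflib
import re


def normalize_asm(listing: str) -> list[str]:
    """Drop comments, blank lines, and whitespace differences from a listing."""
    lines = []
    for raw in listing.splitlines():
        line = raw.split(";", 1)[0]          # assumption: ';' starts a comment
        line = re.sub(r"\s+", " ", line).strip()
        if line:
            lines.append(line)
    return lines


def asm_diff(target: str, candidate: str) -> list[str]:
    """Return unified-diff lines between two listings; empty means they match."""
    return list(
        difflib.unified_diff(
            normalize_asm(target),
            normalize_asm(candidate),
            "target",
            "candidate",
            lineterm="",
        )
    )


def is_match(target: str, candidate: str) -> bool:
    """True when the candidate assembly matches the target after normalization."""
    return not asm_diff(target, candidate)
```

The diff from `asm_diff` is what gets fed back for the next guess at the source, and the loop stops once `is_match` returns true.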
My own experience has been that "ghidra -> ask LLM to reason about ghidra decompilation" is very effective on all but the most highly obfuscated binaries.
Burning tokens by asking the LLM to compile, disassemble, compare assembly, recompile, repeat seems very wasteful and inefficient to me.
That matches my experience too - LLMs are very capable at "translating" between domains - one of the best experiences I've had with LLMs is turning "decompiled" source into "human readable" source. I don't think that "Binary Only" closed-source is the defense against this that some people here seem to think it is.
> Has anyone used an LLM to deobfuscate compiled Javascript?
Seems like a waste of money; wouldn't it be better to extract the AST deterministically, write it out, and only then ask an LLM to replace those auto-generated symbol names with meaningful ones?
Yes, but it requires some nudging if you don't want to waste tokens. It will happily grep and sed through massive JavaScript bundles, but if you tell it to first create tooling like Babel scripts to format the code, it will be much quicker.
I absolutely agree that Microsoft could do better, but they are making progress in removing support entirely for broken (from a security perspective) older protocols such as NTLMv1 (which uses DES as well: more here -- https://bit.ly/crackingntlmv1) and SMB1.
The financial incentives drive Microsoft to support every possible (mis)configuration, forever. It's the tireless work of a few folk at Microsoft like Ned Pyle, Steve Syfuhs, and Mark Morowczynski that has landed the changes so far.
There could absolutely be a "security check" tool deployed by default with Server 2025 or similar that looks for Kerberoastable user accounts (any account with a ServicePrincipalName is technically Kerberoastable, like computer accounts), AS-REP roastable accounts, weak encryption types, etc. That would probably get more traction than changing defaults out of the box for everyone, as that's another way to phrase "breaking customer environments when they upgrade".
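To make the idea concrete, here is a rough sketch of the checks such a tool could run. Everything here is hypothetical: the `Account` record and `audit` function are made up for illustration, and a real tool would pull these attributes from the directory over LDAP. The only hard facts used are the `userAccountControl` flag for "do not require Kerberos preauthentication" (0x400000) and the DES/RC4 bits of msDS-SupportedEncryptionTypes. Skipping computer accounts is my assumption, since their passwords are long and random.

```python
from dataclasses import dataclass, field

UF_DONT_REQUIRE_PREAUTH = 0x400000   # userAccountControl: no Kerberos preauth
WEAK_ETYPES_MASK = 0x1 | 0x2 | 0x4   # DES_CBC_CRC, DES_CBC_MD5, RC4


@dataclass
class Account:
    """Hypothetical record holding the attributes the checks need."""
    name: str
    spns: list[str] = field(default_factory=list)
    user_account_control: int = 0
    supported_enc_types: int = 0
    is_computer: bool = False


def audit(account: Account) -> list[str]:
    """Return the list of findings for one account."""
    findings = []
    # Any account with an SPN is technically Kerberoastable; this sketch
    # only reports user accounts (an assumption, see above).
    if account.spns and not account.is_computer:
        findings.append("kerberoastable")
    if account.user_account_control & UF_DONT_REQUIRE_PREAUTH:
        findings.append("asrep-roastable")
    # An unset value (0) falls back to domain defaults and is not judged here.
    if account.supported_enc_types & WEAK_ETYPES_MASK:
        findings.append("weak-etypes")
    return findings
```

A report like this changes nothing on its own, which is the point: it tells admins what would break before any default gets flipped.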
I don't think I understand you. The RC4 encryption type is msDS-SupportedEncryptionTypes 4 (i.e. 0b0100). DES_CBC_CRC is 1, DES_CBC_MD5 is 2. They do not "correlate".
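Since the attribute is a bitmask, decoding it makes the point obvious. A minimal sketch (the function name is just for illustration; the AES bit values 0x08 and 0x10 are not mentioned above but come from the same documented bit layout):

```python
# Bit values of msDS-SupportedEncryptionTypes, lowest bit first.
ENC_TYPE_BITS = {
    0x01: "DES_CBC_CRC",
    0x02: "DES_CBC_MD5",
    0x04: "RC4_HMAC",
    0x08: "AES128_CTS_HMAC_SHA1_96",
    0x10: "AES256_CTS_HMAC_SHA1_96",
}


def decode_enc_types(value: int) -> list[str]:
    """List the encryption types whose bits are set in the attribute value."""
    return [name for bit, name in ENC_TYPE_BITS.items() if value & bit]
```

A value of 4 decodes to RC4 alone, and a value of 3 decodes to the two DES types; each type has its own bit and they are independent of each other.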
And regarding the NT hash: the NT hash is named after NTLM, not Kerberos. NTLM is a completely different (and much less secure) authentication mechanism. And the NT hash is not DES at all, it's MD4. You may be confusing it with the LM hash, which is indeed DES, but does not support unicode and is not common anymore.
The LM hash is disabled on domains with an LmCompatibilityLevel of 4 or above. (It's accepted, but clients shouldn't send it, on the default LmCompatibilityLevel of 3, which, by the way, unless your domain has devices from the stone age, you can safely set to 5, disabling LM and NTLMv1.) Although, if you can, you should disable NTLM on the domain altogether, because it's a much more vulnerable protocol than Kerberos.
KeySavvy is the normal workaround for this. $99 extra cost to both sides for them to handle the title verification and shipping, and to act as the dealer to make it qualify for EV credits.
We don't advance as a society unless people ask new questions. Having folk willing to spend some time answering those questions (in public, no less!) helps others. It's really, really damn hard to predict how advancements in one area can help another.
All that said, thanks for your interesting new question, and thanks for spending time on it :D
Title needs a small fix, it should be `ping ff02::1` (with two colons) to be a valid IPv6 address, match the actual command, and match the original title.
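If anyone wants to check the difference quickly, Python's standard `ipaddress` module rejects the single-colon form. A small sketch (the helper name is hypothetical):

```python
import ipaddress


def is_valid_ipv6(text: str) -> bool:
    """Return True if text parses as an IPv6 address."""
    try:
        ipaddress.IPv6Address(text)
    except ValueError:  # AddressValueError is a subclass of ValueError
        return False
    return True
```

With this, `is_valid_ipv6("ff02::1")` is true and `is_valid_ipv6("ff02:1")` is false, and `ipaddress.IPv6Address("ff02::1").is_multicast` confirms it is a multicast address.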
(also, hi Thomas!)