That is a fascinating result. A remarkable mathematical achievement. And a night...

DannyBee · on May 25, 2023

Well, no.

What you are worried about is VBB (roughly can i take a program a and obfuscate it in a way that you get no information about the original from the obfuscated version), which has been proven impossible.

This is about indistinguishability, which is a form of obfuscation, but not the kind you are thinking about.

This is really something like "If you have two programs a and b that compute the same function, it's possible to obfuscate them in a way that the resulting programs are indistinguishable from each other"

So if program a is "bubble sort" and program b is "selection sort" (function here is the mathematical sense, so these compute the same function), obfuscation can make it so you can't tell them from each other, and you can't tell if the original program of what you are holding was bubble or selection .

Roughly: VBB is the ability to take any program and make it indistinguishable from random.

IO is the ability to take two programs that compute the same function and make them indistinguishable from each other.

"compute the same function" is actually what makes it feasible :)

The original use case was really trying to turn secret key systems into public key systems by hardwiring the secret key into the program and then obfuscating it.

So an example use case today would be "every DVD decoding program has a different key but computes the same function, can you make them all indinguishtable and thus hide the key" or something like that.

cs702 · on May 25, 2023

> The original use case was really trying to turn secret key systems into public key systems by hardwiring the secret key into the program and then obfuscating it.

Wow, that definitely qualifies as a "crown jewel." I, for one, would not look forward to a future in which software running on my hardware is able to hide secrets, like one-off media encryption keys, from me!

DannyBee · on May 25, 2023

In practice, as the thesis proves, you can use iO to generate a lot of things people care about in cryptography. I will go out on a limb and say you can use it to build almost everything interesting :)

Now, it happens for a lot of those we don't need iO and iO may be impractical, but as a theoretical building block, it's quite nice.

As for your concern, yes, this is now possible, at least in the sense that you do as good as is possible to do in making them computationally indistinguishable. This does not prevent you from attacking it in other ways :)

Two things, good or bad depending on how you look at it:

1. You can prove iO is as least as good as the best possible obfuscation scheme that can ever exist. So whatever that enables you to do or not, it's the limit.

2. It also means you can get away with handing over less secrets, ensure better isolation, etc.

The following things are possible with iO (trivially so), due to this paper:

Adaptively secure succinct garbled RAM - which would let you hand secure databases to untrusted providers and not worry about it.

Sender deniable encryption where you can't prove what the original plaintext was and various options are equally likely.

Fully deniable interactive encryption where secrecy of communication is protected even if the parties are later coerced (or willingly bribed) to expose their entire internal states: plaintexts, keys and randomness.

etc

These are just some examples.

Now, some of this, as I said, we know how to do already, some we don't. But this paper gives you iO as a building block that can do them without having to separately prove that it is as sound and secure as existing crypto systems are.

Again, iO only provides computational indistinguishability, not other things, but it is a nice primitive.

cs702 · on May 25, 2023

Thank you for taking the time to respond so thoroughly :-)

Everything you write makes sense to me -- though I had to lookup and skim Canetti and Holmgren's work to understand the bit about succint garbled RAM.

We live in interesting times!

eru · on May 26, 2023

What's iO in this context? Is it indistinguishable obfuscation?

eru · on May 26, 2023

> Wow, that definitely qualifies as a "crown jewel." I, for one, would not look forward to a future in which software running on my hardware is able to hide secrets, like one-off media encryption keys, from me!

That future is already here for most people. See things like Secure Enclave or Intel's SGX.

Tangurena2 · on May 25, 2023

We already have code that obfuscates itself. And runs differently based on whether it is running in a debugger or not. This has been the cat & mouse game played by:

virus writers vs anti-virus code.

game hackers vs anti-cheat software.

license checking code vs cheapskates.

Stuxnet.

> Code where no one can tell if it has a backdoor.

Back in 1984, Ken Thompson wrote Reflections on Trusting Trust [0]. The short answer is "no, you can never ever tell if it has a backdoor. Ever."

> The moral is obvious. You can't trust code that you did not totally create yourself. (Especially code from companies that employ people like me.) No amount of source-level verification or scrutiny will protect you from using untrusted code. In demonstrating the possibility of this kind of attack, I picked on the C compiler. I could have picked on any program-handling program such as an assembler, a loader, or even hardware microcode.

Back in the 90s, I was interested in hacking. The sort of hacking that starts with "this is the disassembler. Step one, hack the trial version of IDA Pro." It was fascinating at the time (and bores me now, my punishment for growing up, I guess) so I read & did lots of stuff like that.

0 - https://www.cs.cmu.edu/~rdriley/487/papers/Thompson_1984_Ref...

eru · on May 25, 2023

Just don't trust that code, and don't run it?

You should require positive proof that code does what you want.

Similar to how eg Haskell's type systems doesn't have to solve the halting problem [0]: it just rejects some programs that would be ok, but don't conform to the type system.

See also proof-carrying code. https://en.wikipedia.org/wiki/Proof-carrying_code

[0] Ok, unless you use undecidable instances or something like that.

Attrecomet · on May 25, 2023

> You should require positive proof that code does what you want.

That's just saying you should only ever solve simple problems with your computer.

eru · on May 26, 2023

Why? People have eg written proven-correct C compilers. And those things ain't 'simple'.

s3p · on May 25, 2023

Yes great so let's not use any software at all on our machines! only things we can program ourselves or are open-source. what a dystopia we live in...

eru · on May 26, 2023

Huh, why?

First, for already closed source programs the obfuscation results don't make a difference at all. They remain just as obscure before as they are now. (Unless you routinely try to de-compile all closed source programs?)

The obfuscation result in the submitted article really only makes a difference to open source programs.

But: thanks to advances in mathematics and cryptography, a vendor can prove to you that their program does what it's supposed to, without having to reveal the source code or anything else about the program.

ilyt · on May 25, 2023

> Just don't trust that code,

That's even harder problem to solve

zamalek · on May 25, 2023

Any program worth obfuscating is not worth trusting.

twelve40 · on May 25, 2023

Among other things, tons of desktop software and most mobile apps and even javascript you run from most websites is obfuscated. How is it possible to avoid that? Do you run 100% open source for everything with javascript disabled?

bentcorner · on May 25, 2023

Some people do take it that seriously: https://www.gnu.org/software/librejs/

eru · on May 26, 2023

People typically run website's javascript in a sandbox.

Mobile apps also typically have tightly controlled permissions these days.

And, of course, people don't typically review most softwares's sources anyway. They rely on trust and reputation.

Technotroll · on May 25, 2023

Wouldn't this be a great case for a sort of machine or program that delivers Zero Knowledge Proofs for each input and output?

eru · on May 26, 2023

You could deliver a zero knowledge proof that the program you are trying to sell me is benign, without revealing eg the source of the program.

I'm not sure what you would do with a zkp of each input and output of the program?

dataflow · on May 24, 2023

Is it just an existence proof or is it actually constructive? (I haven't read the dissertation.)

Animats · on May 24, 2023

I'm not sure. It seems to be applied to "circuits", which in this context seems to mean a one-way set of logic gates (such as AND, OR, NOT, NAND) which map a set of boolean inputs to a set of boolean outputs. In theory you can construct any finite digital function that way. It's a useful abstraction, like a Turing machine.

Unlike a Turing machine, a circuit is finite. All circuits are "solveable" (given a outputs, compute an input which yields them) by trying all the input patterns.. There's no undecidability and no halting problem. There's just difficulty. "Difficult" here means there's no way easier than trying all the patterns. This is the same property sound cryptosystems are supposed to have - there's no easier way than trying all the keys.

Whether this result can be extended to programs with iteration I'm not sure. The paper doesn't seem to mention iteration or storage.

jhanschoo · on May 25, 2023

A (polynomial-sized) circuit in this context is a sequence of DAGs whose nodes are binary logic gates where the nth DAG infers 1 output bit from n input bits, such that some polynomial bounds the size of the graphs of the sequence. See https://en.wikipedia.org/wiki/Circuit_complexity

The class of circuits in question is P/poly which includes BPP (which includes P, the class of polynomial-time programs). So the result is quite general, since most practical programs are in P.

For an introduction to complexity theory, see Arora, Barak (2009) (Draft available here https://theory.cs.princeton.edu/complexity/book.pdf ); same Barak that did the presentation "On the (Im)Possibility of Obfuscating Programs" (2001) mentioned in the abstract.

x-complexity · on May 25, 2023

IMO: Paranoid about the implications, slightly less worried about it applying in practice.

As shown by the DRM schemes used in modern games, this type of obfuscation comes at the cost of performance: Unless you want to compute a sensitive function at the user's end in an obfuscated manner, it'd be much simpler to just run that function on your end and optimize it in terms of running costs & performance.

Such a design would also runs counter to the everything-as-a-service model that companies are trending towards, as it places more power back at the user's end, even if the user can't decipher the obfuscated function's inner mechanisms. Such a design would reduce the need to phone home, and thus the need for EaaS.

anonymousDan · on May 25, 2023

Would it be possible to use this to obfuscate other cryptographic primitives (e.g. a hash function or a simple program that checks whether the hash of some input equals a hardcoded value)?

dmix · on May 25, 2023

The next question is why didn’t the high end shops spending a ton of money and recruiting effort on crypto and state malware like NSA .. or Russia(?) figure this out already or did they

whatshisface · on May 25, 2023

Because figuring out what a worm does doesn't help if it has already done it, and for that matter they're supposed to remain undetected. Furthermore people already cannot tell whether code has backdoors, or what ML classifiers do for that matter.

Unfortunately, the biggest industrial use case is making wi-fi routers that are absolutely impossible to install OpenWRT on, etc...

whitemary · on May 25, 2023

> people already cannot tell whether code has backdoors

I was feeling crazy after reading that.

tivert · on May 25, 2023

> The next question is why didn’t the high end shops spending a ton of money and recruiting effort on crypto and state malware like NSA .. or Russia(?) figure this out already or did they

There's no way to know. But the NSA (at least) has a documented history of making crypto breakthroughs and keeping them secret (e.g. public key crypto: https://en.wikipedia.org/wiki/Public-key_cryptography#Classi..., differential cryptanalysis: https://en.wikipedia.org/wiki/Differential_cryptanalysis#His...).

withinboredom · on May 25, 2023

The fun open secret about working as a researcher for the NSA is that even after you leave the NSA, any and all research must be approved by them before publishing.

KMag · on May 25, 2023

There are more efficient ways to hide malware from current scanners, and on the flip-side, the scanners that use heuristics/ML to classify the sequence of system calls made by the binary wouldn't be affected by this sort of obfuscation.

ihm · on May 25, 2023

As mentioned on page 9 of the thesis, IO as constructed in the thesis implies IO for RAM machines, so yes you can do programs with iteration.

scarmig · on May 25, 2023

Can't finite iteration and recursion just be unrolled? And storage just considered another part of the input?

jahewson · on May 25, 2023

This is how SMT solvers deal with bounded quantifiers.

grayhatter · on May 25, 2023

In theory? Absolutely; unrolling a loop is exactly what it sounds like. Recursion is just a fancy loop. But there's a reason no one does that, and why no compilers emit that as code.

User23 · on May 25, 2023

Fancy in the sense of being a loop with an implicit stack. Or maybe just some registers if TCO.

And yet the machines we can actually build are far closer to LBAs than TMs.

matthewdgreen · on May 24, 2023

It is not efficient enough that you need to worry about exciting developments using this scheme. But it’s a step towards more reasonable mathematical assumptions than previous constructions, assumptions that may actually be true. Give it a few more years and we might be able to use this stuff.

vlovich123 · on May 24, 2023

Looks like constructive but my math knowledge is probably equivalent to a freshman’s at best.

NoMoreNicksLeft · on May 24, 2023

Even if the vendor is willing to provide source, you can't be sure that the binary matches. Nice.

I think Saberhagen's Berserkers actually had this feature.

Animats · on May 25, 2023

> I think Saberhagen's Berserkers actually had this feature.

Yes. Code decoded other code, etc. So if the code wasn't running normally, it couldn't be analyzed. It was handwaving back then, but it may become real.

Tangurena2 · on May 25, 2023

It was standard coding practices if you were selling games - to prevent people from hacking the license key stuff. Or by cheating software to prevent detection by anti-cheat software. One of the hooks built into Windows is "is a debugger connected" helps the virus/cheat code determine if it is "safe" to run or not.

schoen · on May 25, 2023

You can be sure that it matches if the vendor (1) doesn't use an obfuscated binary, or (2) provides exact details of how the obfuscated binary was constructed (presumably including some kind of seed for the obfuscator).

Tangurena2 · on May 25, 2023

This was the point behind Reflections on Trusting Trust - you can't trust it. Unless you build the entire compiler/linker tool chain from source that you've already inspected.

https://www.cs.cmu.edu/~rdriley/487/papers/Thompson_1984_Ref...

Cyph0n · on May 24, 2023

I’m sure DRM companies like Denuvo are having a field day.

hgsgm · on May 24, 2023

Obfuscation is itself a malware signature.

Vervious · on May 25, 2023

I mean, the same argument has been applied to other cryptographic tools. Why encrypt your messages, unless you're sending something sketchy? Why obfuscate programs, unless you're hiding something?

Yet I think the utility of encryption is well-demonstrated, and not just theoretical. It can be used by good people to defend against adversarial intent. And it may well be that obfuscation (in its most direct application) has the same effect. (e.g. companies sharing proprietary algorithms for you to run at home, without revealing the secret sauce; or me delegating computation to AWS without revealing secrets)

Note that this is not why indistinguishability obfuscation (iO) is a crown jewel, here. Practically, iO is nowhere close to obfuscating anything larger than a tiny circuit. But it can still be useful to do things like obfuscate secret keys when designing cryptographic protocols. Theoretically, iO allows us to derive essentially every cryptographic primitive, which is why this paper is interesting, and why iO is called a crown jewel. And now, we can build iO for the first time from well-studied hardness assumptions.

moring · on May 25, 2023

> I mean, the same argument has been applied to other cryptographic tools. Why encrypt your messages, unless you're sending something sketchy? Why obfuscate programs, unless you're hiding something?

There is a huge difference: Encryption is used to hide something from a third party, while obfuscation is used to hide something from the intended recipient.

Or, if you'd like to argue that the intended recipient is the computer, not the user: To turn that computer into a deputy of the sender, while still formally belonging to the recipient.

consp · on May 25, 2023

> while obfuscation is used to hide something from the intended recipient.

No. Most obfuscation is done to increase the time needed to reverse the source of publicly available programs. Not the user (99% of the recipients) but the adversary is the intended target. The users are just cought in the crossfire.

Unless you consider every user of apps as someone who has both the skill and the need to reverse engineer your app.

Xelynega · on May 25, 2023

> No. Most obfuscation is done to increase the time needed to reverse the source of publicly available programs.

So to stop the intended recipient from deciphering the program.

"Adversaries" are the strawman used to hurt legitimate customers who are left with black boxes for device drivers and no functioning hardware since the manufacturers make it impossible to open source firmware updates.

ilyt · on May 25, 2023

In case of DRM sure, but there are cases like online games where "adversaries" are players (intended users of the app) trying to use cheat programs on game binary to get an advantage over other players.

I share the sentiment that in most cases it just serves DRM vendors and is disservice to users but that's not all the cases.

yjftsjthsd-h · on May 25, 2023

> I mean, the same argument has been applied to other cryptographic tools. Why encrypt your messages, unless you're sending something sketchy? Why obfuscate programs, unless you're hiding something?

It's easy to argue that everyone has legitimate interest in hiding some things in data, but what would be a legitimate case for hiding things in code?

Brotkrumen · on May 25, 2023

"This is a trade secret. We have not patented it, because it's so clever and valuable that it needs to be protected longer than a patent lasts"

Xelynega · on May 25, 2023

Except we know(and have example from cryptocurrency) that the problems with ownership in real life have little to do with cryptography.

The reason coca cola doesn't use Pepsi's recipe isnt because mixing the ingredients under heat creates an irreversible mixture that you can seperate to find the recipe. They don't use pepsis recipe because its illegal.

Attrecomet · on May 25, 2023

That's true for a recipe that was developed in a past millennium. Trade secrets are still very much relevant for companies that work with more modern technologies than soda from before the Great War, and a useful protection against competition. Even if you do believe that companies within the US or EU wouldn't touch other's IP with a 10 foot pole due to strong laws, you'd have to be very naive to think this is valid in all parts of the globe.

Edit: it's also trivially not true for trade secrets in the legal sense -- if you upload your trade secret willingly to AWS, it's no longer a trade secret, unless you have extra contracts in place. With obfuscation, you could skip a lot of red tape to keep it secret.

liminvorous · on May 25, 2023

You can't copyright a recipe, the thing reason it doesn't make sense for Pepsi to make an exact duplicate of coke is that Coke is better at selling Coke that Pepsi is, and it's better for them to be a niche alternative some people prefer than to be an undifferentiated competititor. Reverse engineering trade secrets is perfectly legal if you haven't agreed otherwise.

andrewflnr · on May 25, 2023

Since when is coca cola niche?

pxc · on May 25, 2023

That remains a fundamentally anti-social motivation...

joelfried · on May 25, 2023

Making it harder for bad actors to create malware would seem like one legitimate case.

ilyt · on May 25, 2023

> I mean, the same argument has been applied to other cryptographic tools.

Intent matters.

> Why encrypt your messages, unless you're sending something sketchy?

Because I am sending something private, I can't believe people bring this dumbass argument still, and even here.

> Why obfuscate programs, unless you're hiding something?

Why indeed. This one is the red flag, not message in your e-mail client being encrypted. Again. Intent, and context, matters.

Vervious · on May 26, 2023

You say the intent is different but don’t show why. What about obfuscation implies malicious intent? How come it doesn’t apply to encryption?

It can’t be about applications, because I can come up with a thousand applications for obfuscation with benign intent. Such as delegating private computation to AWS without Amazon learning my personal information.

denton-scratch · on May 25, 2023

> and why iO is called a crown jewel.

Ah, thanks. That's far from clear in the article.

Hnaomyiph · on May 25, 2023

Until it’s standard practice by all companies to protect up under the guise of consumer protection with all closed sourced software.

hulahoof · on May 25, 2023

like that time sony gave everyone a rootkit: https://en.wikipedia.org/wiki/Sony_BMG_copy_protection_rootk...

RockRobotRock · on May 25, 2023

similar to starforce for those that don't know: https://www.youtube.com/watch?v=p-wyIalhdPU

jrpt · on May 25, 2023

Sure but it's a very weak signal.

tptacek · on May 24, 2023

Isn't that where we're already at with ML classifiers?

astrange · on May 24, 2023

Well, we know they're ML classifiers and they're not making syscalls. So there's obviously a limit to "we don't know what they do".

tptacek · on May 25, 2023

You'd still know what syscalls a cryptographically obfuscated userland program was making.

sbierwagen · on May 25, 2023

If it's a total black box, wouldn't the NOBUS thing to do would be to have some large key that it's watching input for that flips it into a malicious mode?

If BB(6) took years to execute, how long would you have to spend feeding random input to a suspected-hostile 10000 symbol Turing machine (whose source code and state you can't examine) in a sandbox before you decided it was safe?

rini17 · on May 25, 2023

Easy, have all executable segments read-only with none (or fixed) syscall instructions.

astrange · on May 25, 2023

Think it'd be a bit dangerous to let them make any though, unless you were single stepping.

godelski · on May 25, 2023

Wouldn't this also be a field day for encryption maximalists?

I'd imagine if all code were to be perfectly obscured then to the public it wouldn't matter, but many would demand for more open source and that'd be the only way to trust.

amoss · on May 25, 2023

Ah, "code is law". Good luck with that.

blurbleblurble · on May 25, 2023

Just wait until the legislators catch wind of this.

sacnoradhq · on May 27, 2023

Autonomous, polymorphic malware utilizing secure computation to perform operations that aren't revealed to systems they infect.

Seems like there is an opportunity for bot nets to perform computations, say BTC mining, in a completely sealed and zero-knowledge environment, which now may not even need a backdoor or are able to hide it perfectly.

krageon · on May 25, 2023

> viruses and worms nobody can figure out

Sandboxing and modeling interaction are still possible and useful, and is a large part of figuring out viruses and worms. That's not going away

stuckkeys · on May 25, 2023

Yeah pretty much. It is only going to get wilder in the ransomware scene. Good times ahead.

elefanten · on May 25, 2023

Seems some of those cases (backdoors and classifiers) could be prevented by regulation and/or social convention (eg- don’t use programs/services obfuscated in this way).

The malware seems trickier. Maybe systems will need to require proof of unobfuscated source to run code?

m463 · on May 25, 2023

Could you steal source code, obfuscate it, and sell it as your own?

quickthrower2 · on May 25, 2023

Does it run at the same speed as the original?

omgwtfbyobbq · on May 25, 2023

Across the same universe of discourse.