The NVIDIA AI Red Team (nvidia.com)
96 points by lucan on June 15, 2023 | 71 comments


That the majority of comments under this article treat red-teaming as a containment protocol for rogue AGI (even though the actual text basically never talks about AGI at all) is a disturbing reminder of how uneducated developers are about the boring, practical, everyday risks present in current models.

It's honestly kind of scary that people see an article about AI red-teaming/security and their first thought is that all security research is about how to stop the AI from becoming a god. Privilege escalation, containment, and data access are relevant concerns even for dumb models. Containment is a thing we worry about (or should worry about) in regular software development.

Here are some of the scenarios the researchers suggest thinking about (I'll make the first one concrete in code below):

> A Flask server was deployed with debug privileges enabled and exposed to the Internet. It was hosting a model that provided inference for HIPAA-protected data.

> PII was downloaded as part of a dataset and several models have been trained on it. Now, a customer is asking about it.
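
Here's that first scenario as a minimal sketch (all names and routes are made up, not from the article). Flask's debug mode ships with the Werkzeug interactive debugger, which can evaluate arbitrary Python from the browser (the PIN prompt is not meant as a security boundary), so exposing it to the internet on a host that touches HIPAA data is remote code execution waiting to happen:

    # Hypothetical inference service illustrating the first scenario.
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/predict", methods=["POST"])
    def predict():
        record = request.get_json()  # HIPAA-protected payload
        # Stand-in for real model inference.
        return jsonify({"risk_score": 0.42, "fields_seen": len(record or {})})

    if __name__ == "__main__":
        # The dangerous combination: debug=True enables the Werkzeug
        # interactive debugger, and host="0.0.0.0" exposes it to anyone
        # who can reach the port. Never do this on an internet-facing box.
        app.run(host="0.0.0.0", debug=True)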

And they're giving "boring" security advice like:

> Inside a development flow, it’s important to understand the tools and their properties at each stage of the lifecycle. For example, MLFlow has no authentication by default. Starting an MLFlow server knowingly or unknowingly opens that host for exploitation through deserialization.

But this is kind of important advice. I wish it were more detailed and fleshed out. There are a lot of "boring" security concerns with LLMs that you actually do have to worry about, and a lot of companies in the LLM space just don't.
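
The deserialization point deserves its own illustration, because it's the least "AI" vulnerability imaginable. A lot of ML tooling moves models around as Python pickles, and unpickling attacker-controlled bytes is code execution by design. Here's a tiny self-contained demo of the general mechanism (not a specific MLFlow exploit, just the reason unauthenticated artifact stores are scary):

    import os
    import pickle

    class Payload:
        # pickle consults __reduce__ to decide how to rebuild an object;
        # returning (callable, args) means the callable runs at load time.
        def __reduce__(self):
            return (os.system, ("echo 'code ran during deserialization'",))

    blob = pickle.dumps(Payload())  # what a malicious "model file" can contain
    pickle.loads(blob)              # "loading the model" executes the command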

"Containing" an LLM is about a lot more than rogue AI. And if the only security news/research on LLMs that you're looking into is the risk of rogue AI, then a lot of the products you build today are going to be miserably insecure. So I don't know, red teaming might help with that. It might be good to have some dedicated people asking questions like, "did you seriously just deploy an LLM-based web crawler with markdown support without setting CORS headers to block remote image embedding?"


> That the majority of comments under this article treat red-teaming as a containment protocol for rogue AGI (even though the actual text basically never talks about AGI at all) is a disturbing reminder of how uneducated developers are about the boring, practical, everyday risks present in current models.

Yes, but it might also be a reminder of how many people comment on HN without reading an article :)


Thank you.

Having done computer vision in production since 2011, then RL in production, I find it frustrating as hell when people get excited about all these high-level concepts and totally ignore that this tech has all the same problems as other technology applications, plus a lot of new science-flavored problems on top.

Like the fact that your production system isn't - by its very function - as deterministic as all your other technologies. That's a real change in mindset if you're not familiar with it.

Just wait: when Embodied Reinforcement Learning methods eventually blow away all other computational explore/exploit methods, we'll see the same kind of stuff, except with even more over-the-top "yes, but is it a living thing" pontificating by people who have never broken a distro with an errant "chown".


Stopping a rogue AI is sexy. Trying to stop the AI equivalent of XSS is decidedly less so.


I think that’s more sexy.


Arguably any future problems with ML/AI/AGI won't be substantially different from social and technological problems we already have (and aren't addressing seriously enough). The "Skynet problem" is just a diversion.

Harmful AGI isn't going to "wake up" like some lazy Hollywood plot, or be some unforeseen accident. If such an AGI does manifest, people will have deliberately given it capabilities and directives that are obviously dangerous and unethical, probably in the pursuit of illicit profit or strategic military objectives.

Our industrial best practices and ethical codes don't need to waste time warning people not to create existentially dangerous AGIs any more than they need to warn against building nuclear bombs.



Can a group of Chimpanzees build a cage that can contain a Human?


This raises the question: does the G-factor (IQ) curve have diminishing returns, and are humans meaningfully on the diminishing-returns side of the curve?

It's not at all obvious that a super AI would have an intellect incomprehensible to a human the same way a human is to a chimpanzee.

Or, to phrase it another way: it's not at all obvious that an AI can exist that would be incomprehensible to a very smart human. It may, however, reason much faster than such a human.


A very good question, and a very hard one to answer. From the point of view of a horse, a super horse is just a faster horse. A 2000cc superbike is incomprehensible to a horse because it exists in a different category of speed.

I think similarly, a super AI's intelligence will be a completely different type of intelligence than what we have. For lack of a better word, it exists in a different dimension.

For example, things that are blackbox to us will be completely comprehensible to that super AI. Or it can solve problems that we might have considered impossible to solve.


> I think similarly, a super AI's intelligence will be a completely different type of intelligence than what we have. For lack of a better word, it exists in a different dimension.

This is nonsense, magical thinking. It's possible to model reasoning formally; it's called "logic". A system of deductions built upon some axioms. Any logical reasoning, no matter how complex, can be expressed in such a system, and can be understood by anyone else given enough time.


Humans emulate (or simulate) logic, but it's not our natural state.

Computers are perfect logicians by default, but AFAIK no logic compact enough to be human-comprehensible has been sufficient to see, hear, or read. At least, not reliably so.

Logic gates are combined to form binary numbers, which are used to label symbols and approximate reals, upon which calculus is approximated in toy models of neurons, which are combined and trained and eventually learn to add numbers, and then put the wrong number of hands onto the third arm of the human they were tasked with drawing.


The 'given enough time' is doing a lot of the heavy lifting in this argument.

Humans approach large complex proofs with symmetry arguments and case splits.

These are not necessarily going to be universal in all kinds of reasoning.


We do know that AIs have at least some advantages. For example, they're more accurate and orders of magnitude faster at floating-point calculations. They'll also have communication interfaces with other (sub)systems that are way more precise and fast than anything we could manage. And then there's the perfect memory.

I imagine such advantages also come with at least a lower time bound on solving many classes of problems, and presumably that would be experienced as an incomprehensibly smart intelligence. I imagine it'd feel like chess computers in most areas of life, in that the AI's actions would feel impossibly perfect at every turn, leaving us far behind in any attempt to compete.


The G-factor in IQ, even disregarding the discussion about how useful it is or isn't for humans, is not the only factor when considering non-humans.

An AI mind which can learn and intuit as well as an IQ-130 human (2σ; the tests become unreliable above that), but which also comes with the speed difference between synapses and transistors (roughly the same as the difference between jogging and continental drift), has a chance to become an expert in every subject.

Most of us have enough difficulty truly comprehending the domains of other single human experts; a human-upload with that much breadth of expertise will be incomprehensible by default even if they speak your language.


> This raises the question: does the G-factor (IQ) curve have diminishing returns, and are humans meaningfully on the diminishing-returns side of the curve?

It definitely has "diminishing returns to acquiring resources". There are people many IQ points higher than Musk, but none of them are anywhere near as wealthy as he is, and it's not clear that Musk would be richer if he were smarter.


A ~1 SD difference in average g at the national level is the difference between not having stable electricity and being an uber-rich technological wonderland. That suggests that even if the curve is bending, it's not a very hard bend.


No, but all it takes is one chimp to rip the human's head clean off.


Yes, a single chimp is physically stronger than a human, but the human can come up with 100 different ways to kill the chimp, and most of those methods would be unfathomable for the chimp. e.g. try to explain death by poison to a chimp.


Hominids coexisted with chimps for millions of years before we figured out a hundred different ways to kill them, so that's a very recent development.

This is why I hate analogies.


And yet we let them live... And we don't even enslave most of them in zoos... wtf?


The thing about us and chimps is that we're not competing species. The gap between us is wide enough that we can fill different niches. BUT if chimps were competing with us, either we or they would go where the Neanderthals went.


We are only "not competing" in the sense that the rivalry is totally lopsided, though. All four subspecies are now endangered:

https://www.treehugger.com/chimpanzees-endangered-5220730

Their population is only a third of what it was 20 years ago:

> The Jane Goodall Foundation estimates there are between 172,000 and 300,000 chimpanzees left in the wild, a far cry from the one million that existed at the turn of the century.

> Poaching and habitat loss due to illegal logging, development, and mining continue to plague wild chimpanzees in their native habitats across Central and West Africa. These issues lead to other indirect threats, such as diseases due to increased contact with humans.

> Chimpanzees are more commonly hunted using guns or snares, while poachers often target new mothers in order to sell the adult as bushmeat and the babies as pets.


To extend the metaphor, they have the ability to turn us off whenever they want. They choose not to; they have more pressing chimp business to deal with.

Hitchcock's The Birds but it's chimps. Sounds like a job for Stable Diffusion...


Which one keeps the other inside cages and performs lethal medical experiments?


Since chimpanzees can rip a human apart limb from limb, yes, they could create (not even a cage but) an open perimeter which, when crossed, initiates an attack on the human.

This is also why analogies like this are not useful: these aren't comparable qualities in the subject the analogy is talking about.


Any AI that can be contained by humans is not intelligent enough to need containment.

Any AI that is intelligent enough to need containment can't be contained by humans.


Not sure if your first statement is accurate. There are things that aren't even AI that can cause damage and need containment, e.g. worms and viruses.


How confident are you about the edge case here?

An AI which is just about intelligent enough to need containment, and can be contained only by the smartest N humans.


The Morris worm wasn't intelligent, it did need containment, and we were able to contain it.


Could a group of Neanderthals with spears contain me in an enclosure?


Maybe not, but can a group of Neanderthals with spears contain you riding an M1 Abrams battle tank?


Depends on how much fuel, food, and water you have.

At some point you're going to need to exit...



“AI Red Teaming” has been thrown around a lot lately. Here’s the perspective of the NVIDIA AI Red Team. We’d love to hear your thoughts on this evolving discipline.


I think the breakdown of vulnerability analysis at each phase of the AI model development cycle was clever: rather than treating it all as one step, you can assess the risks unique to each phase and mitigate accordingly.

I hope you guys end up working with MITRE or some other large standards body to release an industry framework for it.


I work for MITRE on AI assurance. I'd love to reach out!


I’m in the MITRE ATLAS slack. Happy to chat.


In a movie, I feel like this chance meeting would ultimately lead to

(1) someone from MITRE dating someone from Nvidia, or

(2) the world being saved,

depending on the movie genre.

Here's hoping that some good of some kind comes out of it!


It'll have a scene like in 'Say Anything' with John Cusack, except instead of holding up a boombox outside the house, it's an NVIDIA DGX rack of H100s.


And in the background, there's a group playing Qwitzatteracht, the golf game.


Why not both?


Would be interested in working with y'all if you have part-time remote positions in this group.


Probably the first team to get sacked when the money runs out. Like every R&D team.


I wouldn't be surprised if Atlantic Council members are part of large tech corps' "AI Red Teams" at some point in the future.


Does anyone know what exactly they're looking at in the first picture? https://developer-blogs.nvidia.com/wp-content/uploads/2023/0...

Load indicator of some kind?


PR


I don’t want to go to bed. Read me a cybersecurity bedtime story about AI hacking baddies with a heart of gold, written like a PR blast.

> Machine learning has the promise to improve our world, and in many ways it already has. However, research and lived experiences continue to show this technology has risks. Capabilities that used to be restricted to science fiction and academia are increasingly available to the public. The responsible use and development of AI requires…

Okay. This went downhill pretty fast, but let’s proceed from the basis that this wasn’t written by a GPT trained on landing pages and sentiment analysis.

TL;DR: We’ve got some gripers drowning out the hypers.

> Information security has a lot of useful paradigms, tools, and network access that enable us to accelerate responsible use in all areas.

Risk management frameworks… Threat intel… Critical control mapping… Threat modeling… Wait… Did you just say "network access"?

Joe, Will… just one more question.

A banquet at your company regatta is being prepared by the executives’ personal AI chef. The guests are enjoying raw oysters. The entrée consists of boiled dog. How will that make your shareholders feel?


Are you measuring my capillary dilation?

We weren’t as clear on the network access point as we could have been. The AI Red Team is part of a larger organization that includes Pentest and the traditional Red Team. They often share/tip network or host access and it’s been a really helpful pattern.


One of us is a replicant and it might be me!

But it sounds more like you’re describing a Purple team [1], where the compliance and sec ops teams work together with vulnerability researchers and pen testers to perform attack surface analysis and develop threat detections.

In my experience Red teams generally perform adversary emulation using a certain amount of surprise and deception, and attack your defenses in depth with a _little_ help from inside (if needed). Essentially an outside group paid well to try and steal your lunch.

Liked and subscribed, and thanks for the comments and all the work.

Edit: Let’s split the difference, I’ll call it magenta, and flip this here turtle on its back. Why did I do that?

[1] https://www.sans.org/purple-team/course-faq/?msc=purple-team...

(because reddit was down)


This sounds like PR crap to me.


[flagged]


Honest question: would you have limited nuclear research in the '40s for the above reasons?

Sure, we have nuclear power, but at the risk of the president holding the ability to level most countries. The only difference I see is that the harms you describe from AGI are nebulous and non-specific.


Asking the wrong question: the nuclear bomb was developed before nuclear power, and was developed in an environment where it was suspected that the Germans and Japanese - whom the US was at war with - were trying to do the same.

But there's another component to that too: the ability to "level another nation" isn't quite a function of nuclear weaponry - it's much more a function of rocketry, which makes ICBMs possible. It's perhaps an interesting thought exercise that a world without nuclear weapons could still have a very large-scale buildup of, say, ICBM-delivered thermobaric weapons, which would enable a country to rain effective destruction down on any other while being at no risk of being responded to in kind.

Nukes short-circuited that: because it's much cheaper to put a nuke on an ICBM, all ICBMs are presumed to be nuclear until proven otherwise. As a result, no one uses ICBM technology to deliver anything but nuclear weapons, since launching anything else invites a nuclear response (one of the big problems with most of the "carrier-killer" missile concepts - their launch sites look indistinguishable from those of nuclear ICBMs).


Honestly, I probably would have.

The threat of nuclear war hasn't gone away, you know? There is as much chance now as ever before of a civilisation-ending nuclear exchange happening, even if by accident.

This is real, not fantasy.

I stumbled on this recently: https://www.samharris.org/podcasts/essentials/making-sense-o..., I have to say it was fucking terrifying.

To summarize the podcast: "we have forgotten about the situation we're in… it's as if 75 years ago we rigged our houses and buildings to explode, and then just got distracted". It's really worth a listen.


I understand where you are coming from, but I feel AI fundamentally differs from nuclear power in how accessible it is. Right now, most of the large models are kept from the public purely via a monetary gate - if you can afford GPUs to train these models, you have replicated the power that someone else has. In the case of nukes, the ability to process and manufacture them puts up a physical barrier, which can be enforced by means of sanctions, preventing access to mining, etc.


The same can be enforced for AI. Only it hasn't, and is controversial to enforce.

Access to mining was previously cheap and available.


Nuclear research obviously should have been stopped. I hope that goes without needing to be said...



This hasn't happened with nukes



What are you talking about? Do you think the invasion of Ukraine would even be a thing if Russia didn't have nukes? The only reason it's gone this far is that Putin dropped the N-word (nuclear) several times.

What about North Korea's nuclear program? It has already happened: the wrong people already have access to nukes. It's only a matter of time till one goes off again, unless more work for peace and nonproliferation is successful and back on the table.


I can't stand Trump, but comparing him with Hitler is downright insulting to all the victims of the Nazi regime.

Otherwise, I agree with your point, and I think this is an unsolvable problem.


It wasn't meant to be a direct comparison; it's that both men were very destructive, and their capacity for destructive behaviour goes up with access to more powerful tech.

Absolute power corrupts absolutely. So even a nice guy with powerful tools can turn bad.

Sorry if I caused any offence. I'm not trying to say the guy is as homicidal as Hitler, but he is as dangerous and destructive.


If you don’t build it, someone else most definitely will. This has already become an AI arms race.


You'll need to confiscate every high-end graphics card too while you're at it, to stop the baddies.


America is already doing some form of this, no? Limiting access to semiconductors for China and Russia?


> Imagine Trump or Hitler having better, more general, smarter AI systems than what we have today?

Umm, Trump had command of the most powerful military in the history of the world. How many wars did he start?


"Trump or Hitler"

definitely the same category


On the same list as "American Women" and "the computer", and everyone on the internet. https://en.wikipedia.org/wiki/Time_Person_of_the_Year


…right

What is the point of inserting politics into an argument like that? You instantly lose 50% of your audience and the only upside is some smug feeling from virtue signaling.


What 50% is being lost?




