Even though I started a substack explicitly focused on how reckless we're being about AI, it's not that I believe rogue superintelligence is a guarantee, just that it's plausible enough that it's hard to kick back and relax about it, akin to how playing Russian roulette would be a white-knuckle experience. Just because some experts think something is hard or impossible does not make it so.
Scott Alexander likes to mention the example of Rutherford declaring the idea of energetic nuclear chain reactions to be "moonshine", Leo Szilard reading about that, taking it as a challenge, and then proceeding to figure out how to do it in a matter of hours.
Though of course, that's not quite right: experimentation was still necessary to get from Szilard's insight to nuclear bombs and reactors. That's probably why I'm not worried about a super-fast takeoff (basically the same point you made in this article), but there is still the likelihood that AI systems keep getting more useful and widespread, increasing the probability that rogue superintelligence comes about at some point.
Bottom line, I trust more the safety-minded experts (https://aidid.substack.com/p/what-is-the-problem), those who think AI poses major risks, than the ones who don't, not because I apply the precautionary principle in every situation, but because it does seem to fit in this one.
And like skybrian said, armchair arguments that certain scientific breakthroughs are impossible are suspect. As I once read a nihilist say (https://rsbakker.wordpress.com/essay-archive/outing-the-it-that-thinks-the-collapse-of-an-intellectual-ecosystem/), using philosophy to countermand science is "like using Ted Bundy’s testimony to convict Mother Theresa".
I think the short version of my thoughts is that Szilard very specifically had a mental model of a chain reaction that corresponded to reality in a rigorous and straightforward way, whereas "intelligence" is an over-broad nothing word that doesn't have that sort of actual meaning. I agree that existential risk dramatically changes how you should analyze things, and I'm hyper-conservative about this stuff - but I do think the table stakes for counting it as an existential risk are an empirical framework that actually has some idea of what intelligence is.
The point of the New York analogy is that basically we're assuming that things done by statistical correlations count as intelligence because *we* would use intelligence to do them. Refuting this requires arguing that something like GPT-3 becoming "more useful" increases the risk of intelligence forming, but I think that statistical correlations becoming more useful does NOT imply any sort of improvement towards intelligence, in the same way that a race car going from 30 mph to 3000 mph doesn't improve its ability to climb a ladder (even though we would personally use our legs both to run and to climb ladders, and we can't run 3000 mph).
Probably worth a full post level of detail, so I'll percolate on it. But that's the short argument for why it's not being a Dyatlov, in my opinion. Nuclear chain reactions can be shown to exist, and even before they were shown to exist, Szilard could define what the essential parts of a chain-reacting system would be. That statistical correlations represent any sort of progress towards true intelligence has *not* been shown, and I think there's plenty of evidence to the contrary when you start thinking about the parts of intelligence we're actually worried about w/r/t predicting things and developing novel ontologies and whatever.
So to be clear, I do think we might need to have these conversations about super-intelligent AI someday, when we invent an entirely new theory of what intelligence actually is, THEN perform rigorous experiments clarifying that theory, THEN create new methods of computation that interact with the world based on that theory. But I think there would be multiple clearly-defined bright lines separating "risky AI" from all computation as it stands today. Maybe the most extreme way to phrase my thoughts: let's say that OpenAI had developed a GPT-10, and it wrote a new Shakespearean play that no scholar could distinguish from Shakespeare himself. I argue that even this would represent something with exactly as much capacity for intelligent thought (i.e., zero) as any other bundle of statistical correlations. This may sound like I'm trolling at first blush, because clearly something that would take more "intelligence" for us to write would require more "intelligence" to replicate, so GPT-10 is more "intelligent" than GPT-3, right? But I think the burden of proof is squarely the other way round - saying "our observations of what words tend to follow other words have gotten more accurate, so computers are now better at {problem solving/predicting/any of the other attributes of intelligence we're actually worried about}" is a total non-sequitur, and an appeal to "but no, the observations of what words tend to follow other words are like, SUPER good now" doesn't cut it, even for arbitrarily Shakespearean values of super. Does that make sense?
Personally I would expect a 3000 mph wheeled vehicle to be able to reach as high as any ladder a human could plausibly climb, simply by inserting an appropriately curved ramp to convert horizontal motion to vertical. In many real-world cases a pre-existing hill quite some distance away might be sufficient.
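To put a rough, idealized number on that - ignoring friction and air resistance entirely, which at these speeds would of course be enormous - converting all of the kinetic energy into height via a ramp gives a ceiling of v²/(2g), far above any ladder:

```python
# Back-of-envelope only: all kinetic energy converted to height via an ideal ramp.
# Ignores air resistance and friction, which would dominate at these speeds.
MPH_TO_MS = 0.44704
g = 9.81  # m/s^2

v = 3000 * MPH_TO_MS           # ~1,341 m/s
max_height = v**2 / (2 * g)    # ~92,000 m, i.e. roughly 92 km
print(f"{max_height:,.0f} m")
```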
An indistinguishably-perfect imitation of some lost Shakespeare play would need to include not just poetic forms and a coherent narrative, but a mix of period-appropriate topical references and timeless commentary on the human condition - including some genuinely new stuff, not merely cribbed from existing works. Somebody could plausibly use that to autopilot their way through a thesis defense for a PhD in a field they otherwise know nothing about.
> Does that make sense?
Kind of, but the fact is, there are plenty of domain experts who do not find this view compelling. Stuart Russell even titled a chapter (in his book Human Compatible) "The Not-So-Great Debate", the chapter being about those who don't think there is any point in talking about AI risk yet - which is rather strong evidence that such skeptical arguments are very unpersuasive to at least some experts.
But zooming back in on the object-level claim, let me take a layman's whack at it. It appears to be an example of McAfee's Fallacy (https://twitter.com/esyudkowsky/status/852981816180973568?lang=en), in that it is precisely the problem, yet it is presented as if it does away with the problem. We could totally stumble on those things accidentally, without ever developing a theory of them. Or maybe the theory is developed rather suddenly, in an Einstein-like Annus Mirabilis event. Hell, some experts are warning about prosaic AGI - the notion that perhaps the current techniques are actually good enough to give us AGI and just need more data and more compute.
Basically, your views (and Mitchell's) are just nodes in a vast graph, and taking the graph as a whole, the thing looks like an incipient Chernobyl to me.
In the end, I don't think we have a real disagreement here though. What I will eventually argue for in the substack is that AI research needs to be conducted exclusively by High Reliability Organizations (https://en.wikipedia.org/wiki/High_reliability_organization), and I'll probably also attempt to figure out why no one on the planet is attempting human genetic engineering (see what happened to He Jiankui), since this seems relevant to pinpointing how certain scientific disciplines become safety- or ethics-minded. It doesn't seem like you would disagree with either of those.
While it’s true that many forms of learning will require real experiments and so can’t be infinitely fast, this only goes so far in reducing my concerns. Air travel is an accelerant for the spread of viruses and social media for the spread of memes. Accelerants are concerning because they reduce our ability to cope, and this is still true even if there are limits on how much they can speed things up.
I get that idea, but I do genuinely think this is more than just an upper bound. I think it matters that what machine learning techniques are doing isn't actual learning *at all* - that they capital-M Must consume the results of human experiments via the encoding methods of their training data. So a hard prerequisite for our current methods of ML achieving superintelligence is that they can do it solely using the data grammars of good old fallible humans, with no capacity to improve on them. While this is more of a normative question that doesn't have a "true" answer - you could believe that the corpus of the human written word is enough to become super-intelligent on, even if the data stream is limited to just the brute characters - the history of science is full of examples of theories hitting dead ends and needing new experiments to support new frameworks. When you look at how knowledge has advanced historically, I really do think it shows the paucity of theory-only advances, which in turn lowers anxiety about a theory-only intelligence overcoming an experimental one. Or lowers mine, anyway.
A better technical refutation would be something like Yarvin's thesis on how management theory puts a limit on AI growth. https://graymirror.substack.com/p/there-is-no-ai-risk
Yeah, I have some issues with the framing because I think the bounds of complexity need to be a bit more formally proven, but intuitively I think the diminishing returns on complexity is exactly right as a frame. The analogy I like to use is that if you want to predict the waves arbitrarily far into the future, eventually your storage medium needs to hold the tiniest details at such fidelity that it has to be so big and so wet that it must be the ocean itself. I actually started a post to that effect, but I've iced it for now because I don't have the information theory chops to speak rigorously about where that line is - just a sort of "look around you" sense that it must be several orders of magnitude below the point where we could have a disembodied actor with total mind control powers or whatever.
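To make that intuition slightly more concrete, here's a toy sketch (my own illustration, using the chaotic logistic map as a stand-in for ocean waves) of how each extra order of magnitude of precision about the initial state only buys a fixed handful of additional prediction steps - so pushing the horizon out keeps demanding exponentially more fidelity:

```python
# Toy illustration of diminishing returns on prediction in a chaotic system.
def logistic(x, r=4.0):
    return r * x * (1.0 - x)   # chaotic for r = 4

def horizon(initial_error, tolerance=0.1, max_steps=500):
    """Steps until two trajectories starting `initial_error` apart diverge by `tolerance`."""
    a, b = 0.3, 0.3 + initial_error
    for t in range(max_steps):
        if abs(a - b) > tolerance:
            return t
        a, b = logistic(a), logistic(b)
    return max_steps

for err in (1e-3, 1e-6, 1e-9, 1e-12):
    print(f"initial error {err:.0e} -> trustworthy for ~{horizon(err)} steps")
# Each thousand-fold improvement in precision buys only about ten extra steps here.
```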
I agree that there seems to be some trick, some secret sauce, the lack of which prevents today's machine learning from turning into "real" AI. Or possibly multiple tricks. But I don't think we can say anything about when such tricks will be discovered by some researchers somewhere. Past experience doesn't seem all that helpful in predicting future surprises, other than to show that we can sometimes be surprised.
And so I'm suspicious of armchair arguments that either say that such breakthroughs are inevitable or impossible. It seems better to humbly admit ignorance? We don't know what researchers will try.
Also, it seems like "real" experiments could happen at accelerated speeds in the domains of computer security, social networks, and financial systems?
Quick disagreements.
First of all, OpenAI Codex and GPT's capabilities feel fairly ... general. It's hard to name some capability, within the input text and below a complexity threshold, that they don't have. And much of it comes from extremely simple techniques - ones with a lot of refinement, tweaking, and complexity, but still extremely conceptually simple relative to the sheer number of floating point operations being applied at supercomputer scale. The "slippery slope fallacy" of assuming that because computing power has been increasing it will continue to increase seems like not a fallacy - Moore's law continues to push, and FLOP/s continues to multiply. GPT is already a good conversationalist relative to some people, and Codex can code better than 80% of people. You speak of "redefining the things computers do as intelligence" - but historically, the opposite happened. As the power of computers expanded, things previously seen as typifying and proving intelligence (energy, motion, vitality, calculation, speech, representation, control, chess, then Go, then language, then programming, then image synthesis) were redefined as outside the human sphere.
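(For a hedged sense of what "supercomputer scale" means here: a common rule of thumb from the scaling-laws papers is that training a dense transformer costs roughly 6 × parameters × training tokens in floating point operations, and plugging in the publicly reported GPT-3 figures gives a number in the 10^23 range.)

```python
# Rough back-of-envelope using the common "6 * N * D FLOPs" training-cost rule of thumb.
params = 175e9            # GPT-3's reported parameter count
tokens = 300e9            # approximate reported training tokens
train_flops = 6 * params * tokens
print(f"~{train_flops:.1e} FLOPs")   # ~3.2e23, i.e. hundreds of zettaFLOPs
```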
> it’s going to need it’s own ability to choose what parts of the infinitely detailed world to resolve into data
... this sounds like "attention", one of the new NN mechanisms behind GPT? And it applies to the context frame? Modern AI already does this.
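(For concreteness, here's a minimal numpy sketch of the scaled dot-product attention idea - toy dimensions, self-attention only, nothing like a full GPT implementation:)

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each query position weights every key position, then mixes the value vectors."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                       # query/key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)        # softmax over the context
    return weights @ V                                    # attention-weighted mix

rng = np.random.default_rng(0)
context = rng.normal(size=(4, 8))                         # 4 "tokens", 8-dim embeddings
out = scaled_dot_product_attention(context, context, context)
print(out.shape)                                          # (4, 8)
```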
> Once you accept that there’s no universal listing of facts, the AI needs to generate its own facts as part of becoming “more intelligent”. Determining whether your facts are good or bad requires experiment - real experiment, not the machine learning sense of “experiment” that involves slicing the same static dataset in different ways and running your algorithms on it.
Modern AI also already does this (sim2real for self-driving cars, for instance, or agents exploring simulated environments - those are not static datasets at all). And within the context of AI, the learned models absolutely are "generating facts"; an association learned by a model seems like a fact to me.
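(As a concrete illustration of the "not a static dataset" point, here's a minimal Gymnasium-style interaction loop - the environment and the random placeholder policy are just for illustration; the point is that the training data is produced by the agent's own actions:)

```python
import gymnasium as gym   # assumes the gymnasium package is installed

env = gym.make("CartPole-v1")
obs, info = env.reset(seed=0)

for _ in range(200):
    action = env.action_space.sample()   # placeholder policy; a learner would choose this
    obs, reward, terminated, truncated, info = env.step(action)
    # The (obs, action, reward) stream comes from the agent's own behaviour -
    # a fresh rollout every episode, not a re-slicing of a fixed corpus.
    if terminated or truncated:
        obs, info = env.reset()

env.close()
```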
Overall, unconvinced. I suspect a careful look at existing GAN/CLIP image generation, Codex, etc. will help clarify why.