My foundational worldview

Updated 29/7 2021

Reality — What is this thing?

I am well aware that the topics I write about on this blog are weird: the potential risk of superhumanly competent goal-driven software containing unintended flaws, causing us to lose all or most future value (here); the possibility that we live in a multiverse that includes copies of us, giving us a kind of immortality (here); and the simulation argument and how it interacts with the multiverse (here). Maybe their weirdness, their apparent otherworldliness, is what attracted me. Even if personality traits such as high openness to experience can explain why I was attracted to them, some further explanation is in order for how I can take them seriously. In this post, I will try to convey some of the foundations of the worldview they fit into. This attempt will involve some hand-waving and elaborating on subjects I’m far from an expert in. It is therefore likely that I will misrepresent facts in some of these subjects. I nevertheless think this post will provide some useful context for the previous ones.

Strange times

It’s easy to take modern technology for granted, but if you think about it for a second you’ll realize that it’s quite remarkable. Sometimes it can feel like magic. At the risk of rehashing a boring cliché, I will remind you of some of the miracles of modern technology. There are wagons made of metal pulled by invisible superhorses. Stiff giant birds gracefully carrying hundreds of people thousands of kilometers (or miles) in the sky. People touching a slab of lit-up glass in their hands that can send and receive messages, talk to almost anyone regardless of distance, play any music, display any image or video, buy almost anything, get directions to anywhere and distribute information invisibly through the air that is accessible to billions of people virtually without delay, each slab powered by billions of small electrical switches, switching on and off billions of times a second (1). There are robots on Mars, and we landed one on a comet. People are floating in a space station cruising around the Earth faster than the fastest bullet.

It feels like in the last couple of centuries, the development of technology has left the runway. I think this is supported by the evidence. For instance, change in economic output has gone from almost horizontal to almost vertical when plotted on a graph over the last two millennia (2).

[Figure: GDP since 1 AD]

For the first time in human history, economic growth has outpaced population growth, making most people significantly richer. The global average income has increased tenfold (3) and the fraction of the world population living in extreme poverty has declined from about 90 percent to 10 percent since the 19th century (4). The effect of modern technology is not always positive, though. Our impact on the climate demonstrates the unchecked and sometimes unintended power of our technology. A sign of it can be seen in the carbon dioxide record, which looks normal for hundreds of thousands of years until about 1950, when it shoots straight up. With deliberate climate engineering, a large country could usher in an ice age if it felt like it by spending a tiny fraction of its GDP; humanity wields terrifying power (5).

It can feel as though the present is the normal state for things to be in. The AI safety pioneer Eliezer Yudkowsky jokingly points out in one of his talks that the world seems to tend towards greater and greater ”normality” over time. He brings up that women used to be unable to vote, which seems odd now! Going back further, things were even stranger. However, if you could take an outside view of human life since the origin of our species, it would be in the last few pages of this epic that things became strange. After all, we have spent about 95% of our existence on this planet so far as hunter-gatherers. I regard it as somewhat likely that this trend towards greater strangeness will continue in the upcoming chapters.

Our ability to manipulate the world to an ever greater degree is strongly linked to our increasing scientific understanding of the world. Philosopher Rebecca Goldstein says about science that it ”gets reality itself to collaborate with us”, by ”prob[ing] reality so that it will answer us back when we are getting it wrong” (6). For instance, in the 19th century, people were confused by Mercury’s orbit because it did not seem to follow the laws of Newtonian physics. When Einstein proposed general relativity, where gravity is conceptualized as the curvature of spacetime, Mercury’s orbit made sense (7). This was a big point in favor of the theory. Scientists make progress by coming up with ideas and collaboratively testing their merit through argument and experiment.

One seemingly necessary condition for successful science is mathematics. Galileo Galilei, who helped birth the scientific revolution, compared nature to a book written in the language of mathematics (8). Physicist Eugene Wigner wrote about ”The Unreasonable Effectiveness of Mathematics in the Natural Sciences”, the (perhaps surprising) fact that nature follows intelligible mathematical rules. After Maxwell discovered his equations for electromagnetism, we mastered electricity. Rockets use Newton’s law of gravitation to reach orbit. Nuclear power, as well as nuclear weapons, was developed through discoveries in particle physics, and the computer industry depends on quantum mechanics to make transistors work (9). Engineering students are taught physics and math because we make technologies work by understanding the forces and laws that govern them. In some cases, the necessity of science for technological development might be overstated. Some technologies that came after scientific breakthroughs could perhaps have been invented and perfected without scientific theories, relying only on trial and error and heuristics, but it would likely have taken a lot more time and resources. I suspect that for modern technology in general, this approach would be infeasible.

Design processes

One way to benchmark the pace of technological progress is by comparing it to the first ”design process” (10) in the universe: evolution. Humans have in a few centuries developed technology with abilities that took evolution by natural selection millions of years to acquire and perfect: flying, seeing and hearing (i.e. detecting photons and sound waves), and harnessing and storing solar energy, to take a few examples. Sometimes, when we have put serious effort into an ability, we have pushed it further than evolution ever has. Airplanes fly faster than any bird, and boring machines can dig through harder material than any animal can. This is partly because we arguably use a more efficient ”design process” than evolution: we make use of our (imperfect) scientific understanding, reasoning, and planning ability, whereas evolution lacks any understanding or foresight. It is also partly because we can choose any desired set of features to optimize, whereas evolution can be said to optimize only inclusive fitness. We can also make use of designs unavailable to evolution, because the designs of evolution can only improve gradually. A classic example is the wheel: there is no use in a half-finished pair of wheels, and there are no roads in nature. Nuclear power is another technology that will arguably never be available to the “design process” of evolution because it seems to require rare, enriched non-organic materials, large facilities, and planning.

An unbreakable pattern

So far I’ve suggested that we live in a strange era in which we are rearranging matter into useful stuff much faster than evolution accomplished similar things, in large part thanks to scientific progress. I want to make two points with this.

Firstly, it should not be too far-fetched to think that we can keep this pace up and eventually outperform evolution in the design of intelligent things, perhaps not far from now. This should unlock even faster technological development (a telescoping of the future, to use Nick Bostrom’s words): a superintelligence could possibly invent in a day what would take humans thousands of years. The upcoming chapters could be strange indeed. Intelligence is an evolved ability (governed by the laws of physics, I might add) that we are making progress on understanding and reproducing artificially, and just like with many other technologies, we can probably take the desired ability to greater heights than can be found in nature. It has been pointed out (though I can’t remember where) that once a certain human cognitive ability has been automated, it can instantly or soon after be done faster and better artificially.

The second thing I would point out is that if you work from the assumption that the world is intelligible and constituted of physical stuff interacting according to mathematical rules, you can make remarkably precise predictions (11) and do remarkably powerful stuff. We have, to my knowledge, never found anything that contradicts this assumption. Lightning was thought to be angry gods, disease and natural disaster were thought to be God punishing the sinners, and sacrifice was thought to control the weather. Supernatural accounts of natural phenomena have one by one been overtaken by scientific explanations; never has the refinement of explanation gone the other way. This makes me think that the world might be constituted entirely of physical stuff following mathematical rules. This means that there would be no purpose at the fundamental level of reality, just clockwork. We might therefore be left to our own devices here. The only thing we can count on is that the particles (or waves in quantum fields) that we and everything else are made of will mindlessly and without exception follow a partially understood set of rules (12). As long as it’s allowed by the laws of physics, it can happen. The physicist Sean Carroll describes the world poetically as caught in the grip of an unbreakable pattern. The rough idea is that reality is a sequence of states, snapshots, following each other according to mathematical rules. Each state determines the next, like the Fibonacci sequence or the states of Conway’s game of life evolving according to a fixed set of rules.
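To make the "each state determines the next" idea concrete, here is a minimal sketch of Conway's game of life in Python. The representation (a set of live cell coordinates on an unbounded grid) and the names `step` and `blinker` are my own choices for illustration, not anything taken from a particular library.

```python
from collections import Counter


def step(live):
    """Return the next state, given a set of live (row, col) cells."""
    # For every cell next to at least one live cell, count its live neighbours.
    counts = Counter(
        (r + dr, c + dc)
        for (r, c) in live
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # Fixed rule: a cell is alive in the next state if it has exactly three
    # live neighbours, or if it has exactly two and is alive already.
    return {cell for cell, n in counts.items() if n == 3 or (n == 2 and cell in live)}


# A "blinker": three cells in a row, which flips between horizontal and vertical.
blinker = {(1, 0), (1, 1), (1, 2)}
after_one = step(blinker)
after_two = step(after_one)
print(sorted(after_one))
print(after_two == blinker)
```

Nothing outside the current state and the rule enters the computation, which is the sense of "unbreakable pattern" used above. Whether actual physics works like this is, of course, the assumption under discussion and not something the sketch shows.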

The world can be described at different levels, with physics being the most fundamental to the best of our knowledge. It’s not useful to talk only in terms of fundamental particles and the laws of physics when explaining why someone delivered a pizza or how plate tectonics works, even though everything that happens is determined by them. As Dan Dennett notes in a discussion about this topic, someone who knew everything about the universe at the fundamental level but nothing else, i.e. had no conception of chairs, people, and pizza delivery apps (the so-called Laplace’s demon), would be surprised by the efficiency of a simple human who can predict with fairly high reliability that a pizza will be delivered to his door in 30 minutes without knowing the position and velocity of every particle in the universe, and without the computing power required to calculate their trajectories. Understanding the world at a higher level of description, with emergent features like money, social trust, and language, is indeed indispensable to us, even though these features are not necessary for Laplace’s demon to make perfect predictions.

Do we have free will on this view, if we’re just collections of particles mindlessly following rules? I’m a compatibilist, which means that I think free will (the ability to make choices, see more in note 13) is compatible with determinism, the theory that every event is the necessary result of what happened before it. I’m not sure if the universe is deterministic, but free will had better be compatible with determinism if we are to have it, because introducing randomness doesn’t seem to help. I’d rather my decisions were determined by my previous experiences, intuition, and deliberation than by a roll of the dice (I think).

A concept I find useful is the distinction between the manifest and the scientific image of the world: the world as it appears to our senses, and the world as it is unveiled by science. The manifest image and the scientific image aren’t always easy to reconcile. In the case of free will, the feeling that you could have done otherwise, which some people see as essential for free will, clashes with the scientific view of brains as collections of particles obeying physical laws. In this case, I think the feeling is mistaken and that our notion of free will should be adjusted to fit the scientific view. The manifest image and the scientific image are true in different senses. The manifest image is true in the sense that what you experience of the world is a real experience. Even if you’re a brain in a vat, you can claim things like “I experience a red car” and not be mistaken, but you can be mistaken about the causes of your experience (the car might exist only as a representation in signals fed into your visual system). The scientific image is our best account of the causes of our experiences, and we can reach beyond our raw senses with microscopes, telescopes, infrared cameras, and other tools to get a richer view of reality. I think we should trust our scientific understanding over our hunches and raw perception when they disagree about the cause of our experiences, since we know from experience that our senses can easily be fooled by, for example, conjuring tricks and visual illusions. It shouldn’t be surprising if the universe violates our common sense: we evolved to deal with the challenges of the African savanna, not to understand the nature of reality (I recommend this talk by Richard Dawkins on the topic).


The most important thing I can’t fit into the scientific, naturalist view of reality is consciousness. I don’t understand how it can exist. But it is among the last things we would deny. While apparent causes of experiences can be illusory, conscious experience itself, I strongly believe, cannot be an illusion. It’s the very stage where illusions can appear. However, it seems to be a hard problem to explain why and how physical processes give rise to subjective experience. ”As physicists work toward completing a theory of the universe and biologists unravel the molecular complexity of life, a glaring incompleteness in this scientific vision becomes apparent. The ‘theory of everything’ that appears to be emerging includes everything but us […] We need a ‘theory of everything’ that does not leave it absurd that we exist.” says the back of Incomplete Nature by Terrence Deacon, which sets out his attempt to fill in the missing piece. I’ve read two different accounts of consciousness by naturalists, Incomplete Nature and Dennett’s Consciousness Explained, as well as Brian Tomasik’s writings on consciousness, which echo Dennett’s. Reading these accounts didn’t get me closer to feeling satisfied with a resolution, though I can’t claim that I fully understand them. Steven Pinker summarises (14) the function that consciousness is believed to play, according to some neuroscientists, as a blackboard in the brain where ”a diverse set of computational modules can post their results in a common format that all the other modules can ’see’”. These modules include perception, memory, language, and action planning. This seems like a plausible account of the biological function of consciousness to me, but it doesn’t explain the nature of first-person experience.

How come some physical processes are accompanied by consciousness? This question seems kind of hopelessly difficult (15). Would any answer feel satisfactory? To any answer of the type “because of this particular process”, you could reply: why isn’t that process going on “in the dark” like any other process allegedly is? It is tempting to believe that consciousness is coming from somewhere else and that the physical process is merely summoning the consciousness, which exists outside the physical world. But how could a physical system know and report that it is conscious unless consciousness is interacting with that physical system? The non-physical would have to interact with the physical but this would seem to make the non-physical just an extension of the physical and the problem remains. The problem of consciousness is discussed lucidly in this episode of the Making Sense podcast.


According to my worldview, we live in highly unusual times from a zoomed-out historical perspective, and we should regard it as plausible that things will get increasingly strange. Science, technology, and human creativity are incredibly powerful processes. For many of the features that are relevant to us, they progress much faster and produce more powerful designs than the first “design process” in the universe, evolution. I anticipate that these faster processes will eventually outcompete evolution in the design of intelligent systems. My worldview also says that the world runs like Conway’s game of life, a mathematical pattern with simple beginnings that grew in complexity over time. It operates mindlessly with no intended purpose. The unbreakable (or flawless if you prefer) mathematical pattern underlies and determines everything, including the human mind. The human mind is the first and only known entity in the universe that has learned about the unbreakable pattern and has come to know the rules the pattern follows in everyday life: the physical laws that govern us and our local surroundings. I interpret our best scientific theories, with their amazingly accurate predictions and the technological progress they have enabled, as suggesting this view. However, the biggest unanswered question in my worldview is consciousness. How can first-person experiences emerge from a mathematical pattern?


(1) I like this poetic description of computer chips by a programmer on Reddit:
“We draw magic runes on sand with light then capture lighting in it and make it dream. As coders we whisper arcane spells to the dreaming stones.”
(2) As I’m required to mention by the CC-license: graph colors are edited.
(3) Page 426 of “An Introduction to Global Health” by Michael Seear, Obidimma Ezezika
(5) David Keith: A surprising idea for “solving” climate change
(6) Waking Up with Sam Harris #120 – What Is and What Matters (with Rebecca Goldstein and Max Tegmark)
(7) Tests of general relativity – Wikipedia
(8) Quote by Galileo: “Philosophy [nature] is written in that great bo…”
(9) Cool fact: GPS satellites compensate for the effects of Einstein’s theory of relativity to transmit the right time. I think this is a surprising application of relativity, although the theory might not have been needed for GPS to work: arguably we would have figured out the correction with a space-clock experiment if nobody had yet figured out the theory of relativity, see
(10) There is disagreement among naturalists about whether evolution should be called a ”design process” or not, so I use the term in scare quotes.
(11) In quantum electrodynamics, one of the most accurate theories in physics, the agreement between predictions of the theory and observations is within 10 parts in a billion for a certain test.
(12) Unless we are simulated.
(13) There are different definitions of ”free will”. The one I’m using is from Wikipedia: ”Free will is the ability to choose between different possible courses of action unimpeded.” More specifically, I think of free will ”as a psychological capacity, such as to direct one’s behavior in a way responsive to reason”, which is the definition used by compatibilists. If you think the notion of ”could have done otherwise even if the conditions before the decision were exactly the same” is important in the definition of free will, then you would not be a compatibilist. I like this post by Sean Carroll on determinism and free will.
(14) Page 426, Enlightenment Now.
(15) Maybe that’s why it’s called the hard problem.

The simulation argument and many worlds

Language improved 30/7 2021

As I brought up in my previous post, humanity might have lots of clone species. They could be far away in space (if space is sufficiently large) and in other Everett branches (if the many-worlds interpretation of quantum mechanics is correct). In this post, I want to explore how this possibility interacts with the simulation argument and my other reflections on Bostrom’s mind-bending idea.

It strikes me as quite likely that we, or at least a significant fraction of our cosmic counterparts (if we have any), will eventually create a superintelligence, or in some other way get the capability to create ancestor simulations (sentient simulations of people like us). If there are more people like us in ancestor simulations than in real history, we are more likely to be in one of the many simulated histories, according to the simulation argument (1). It seems plausible that many possible civilizations and superintelligences would make these simulations if they could, since the simulations would probably have high instrumental (and recreational) value (2) and advanced civilizations appear to have lots of computing power at their disposal, enough to make trillions of ancestor simulations each. It therefore seems likely that if people like us exist in many places in the multiverse, there will exist very many ancestor simulations, unless almost all civilizations either have a universal ban on ancestor simulations or go extinct (in some other fashion than by a misaligned superintelligence) before they can make any.

One reason to think that a ban on these simulations would be enforced is that future civilizations might deem sentient simulations immoral (as I hope they would (3)), but this assumes that just about every civilization succeeds in keeping its AIs and citizens under control for possibly millions of years. A second reason to think that we are not in a simulation is that even unfriendly agents might refrain from making simulations to reduce the likelihood that they are in a simulation themselves (4). However, it only takes one in a thousand civilizations making a thousand ancestor simulations each for the simulation hypothesis (5) to become likely. I think the likelihood that we are simulated is substantial (maybe 40% if I had to pick a number (6)), especially if there will ever be any large number of civilizations at our stage of development in the base-level multiverse. A final possibility is that it is impossible to create simulations with sentient life. I don’t think this is plausible, since I think the human brain functions in accordance with physical laws, which seem to be computable.
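The "one in a thousand making a thousand" arithmetic can be checked with a toy calculation. This is only a sketch under assumptions that are mine, not part of the simulation argument itself: every history (real or simulated) contains the same number of observers, simulations are not nested, and `simulated_fraction` is a hypothetical helper name.

```python
from fractions import Fraction


def simulated_fraction(share_simulating, sims_per_simulator):
    """Fraction of all histories (real plus simulated) that are simulated.

    share_simulating: fraction of real civilizations that run simulations.
    sims_per_simulator: ancestor simulations run by each such civilization.
    """
    # Average number of simulated histories per real history.
    simulated_per_real = Fraction(share_simulating) * sims_per_simulator
    return simulated_per_real / (1 + simulated_per_real)


# One in a thousand civilizations, a thousand simulations each.
print(simulated_fraction(Fraction(1, 1000), 1000))
# Same share of civilizations, a million simulations each.
print(simulated_fraction(Fraction(1, 1000), 10**6))
```

Under these assumptions, the first case gives as many simulated histories as real ones, so the odds of being simulated come out even. Any larger number of simulations per civilization pushes the fraction towards one.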

If the simulation hypothesis is correct, our cosmic endowment could turn out to be a giant wallpaper, or the simulation might stop after a certain amount of resources have been used up by the computer running it. Because of this, and because of potential correlations between our decisions across simulations (if we make the world better in one simulation, it might make it more likely that the conditions in similar simulations improve as well), Brian Tomasik thinks that improving the relatively short term for its own sake looks comparatively more important than it otherwise would. His argument doesn’t depend on the simulation hypothesis being very likely, which I find counter-intuitive. I think his reasoning and math seem sound, but I don’t think they justify ignoring far-future concerns, and neither does he. Without considering the simulation argument, efforts to improve the “short term” (meaning decades or centuries in this context) for its own sake seem basically negligible compared to the far future (spanning millions or billions of years) in expected impact. When the simulation argument is taken into account, and appropriately huge error bars are added to the values of the variables in this calculation, the “short term” starts to appear comparable to the far future in importance.

Max Tegmark thinks (7) that the simulation argument “logically self-destructs” when you reflect on the fact that all the simulated civilizations are likely to make simulations of their own, so we are more likely (according to the simulation argument) to be in a simulation within a simulation, and even more likely to be in a simulation in a simulation in a simulation, and so on. I don’t find this counterargument very persuasive: there is no infinite regress as long as the computer running the simulation doesn’t have limitless computing resources.

Simulation branching

Consider the case where the simulation hypothesis is correct (we live in a simulation), and the many-worlds interpretation is correct in our simulator’s reality (I’ll call it base level). Depending on what hardware is used to run the simulation, we get different effects. If observed quantum events in the simulation are determined by quantum computations in base-level reality, we should expect the universe where the simulation is running to branch when people inside the simulation do a quantum measurement or quantum computations. Quantum computing can be simulated by classical computers but would use up so much computing power in base-level reality that I guess it would be impractical (8). It’s also possible in principle to classically simulate a universe that behaves according to many-worlds, but this would require huge amounts of computing power unless it was crudely approximated.
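A rough way to see why brute-force classical simulation gets impractical is to count how many numbers a full quantum state description needs. The sketch below assumes the textbook state-vector picture, where n two-level systems (qubits) need 2^n complex amplitudes, and it assumes 16 bytes per amplitude (two 64-bit floats). Both function names are hypothetical, and cleverer approximate methods are not covered.

```python
def amplitudes_needed(n_two_level_systems):
    """Number of complex amplitudes in a full state vector."""
    return 2 ** n_two_level_systems


def bytes_needed(n_two_level_systems, bytes_per_amplitude=16):
    """Memory for the full state vector, assuming 16 bytes per amplitude."""
    return amplitudes_needed(n_two_level_systems) * bytes_per_amplitude


for n in (10, 50, 200):
    print(n, amplitudes_needed(n), bytes_needed(n))
```

The count doubles with every added system, so the cost is set by the exponent and not by how fast the classical hardware is. This says nothing about how cheaply a simulator could merely fake the appearance of quantum events.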

If many-worlds is incorrect in base-level reality but space is sufficiently large, we should expect the same simulation to be run elsewhere in that space. If it is run in many places and uses quantum computers, we should get the same effect as if many-worlds is correct (9). That is, the simulation automatically branches if it’s run with quantum computers and base-level reality is sufficiently large and/or many-worlds is true. The simulation could also branch locally if it is designed to fork under certain conditions, to approximate many-worlds, or to study counterfactuals.

If we live in a simulation, can we know anything about the laws of physics in our simulator’s reality? Yes, we can at least infer that if we are in a simulation, their laws of physics must allow for such a simulation to exist. Base-level reality must also permit the existence of intelligent agents and the creation of advanced technology for the simulation argument to hold, which constrains the set of possible physical laws in base-level reality. Remember, the reason we entertain the simulation hypothesis is the simulation argument, and it relies on base-level reality once containing agents in a position similar to ours. The simulators had some reason for creating the simulation. If they use it to predict the development of life elsewhere, they would presumably want to design the physics in the simulation to approximate their laws of nature. The simulation might serve another purpose, in which case it might resemble their reality to a lesser extent. I nevertheless think we should take seriously the possibility that if we live in a simulation, it might branch as if the many-worlds interpretation is true inside the simulation.

Does it make sense to ask if we live either in a simulation or in base-level reality? It might make more sense to think of oneself not as a particular instance of matter that is either simulated or not, but rather as a decision algorithm implemented in different places across the multiverse, some in base-level reality and some in simulations.


(1) The physicist Sean Carroll has noticed an interesting flaw in this logic.
(2) This point relies partly on recognizing that simulations are useful to us now and partly on the idea called “Computational Irreducibility”, that the behavior of a complex system usually cannot be predicted without simulating it.
(3) Unless they can prevent more suffering by running the simulations than they create, which might be morally justified in some cases.
(4) I was not the first to suggest this, see “Multiverse-wide Cooperation via Correlated Decision Making” page 99.
(5) The simulation argument and the simulation hypothesis should not be confused, see Bostrom’s FAQ, where he also responds to some common counterarguments.
(6) This would be a lot higher if I were more certain that my thinking isn’t completely confused.
(7) (Our Mathematical Universe, p 347) In a recent panel discussion, Tegmark was asked what likelihood he would assign to the simulation hypothesis, to which he replied 17%. David Chalmers, who was also on the panel, gave it a 42% chance tongue-in-cheek.
(8) See Signal-based classical emulation of a universal quantum computer and Quantum simulator (Wikipedia). My reason for guessing that it would be impractical to simulate observations of quantum events on classical computers comes from Scott Aaronson’s Quantum Computing Since Democritus: “Describing the state of 200 particles takes more bits than there are particles in the universe” (page 217). It might however be easier to fool us into thinking we are observing quantum events than to run the necessary quantum computation.
(9) [PDF] The Multiverse Hierarchy (page 8), see also “Unifying the Inflationary & Quantum Multiverses” (Max Tegmark)

On whether to prioritize extinction risks or s-risks

Language improved 30/7 2021

Human extinction might surprisingly not be an all-or-nothing thing. Some fraction of perfect or close copies of humanity in the multiverse (if it exists) might go extinct while others don’t. These copies could be very far away in space, outside the observable universe (level I multiverse), in other bubble universes where cosmic inflation has ended (level II), in other branches in so-called Hilbert space (level III), or possibly in other mathematical structures (level IV) (1). If any of these possibilities is true, it suggests that humanity in a non-local sense is unlikely to go extinct for a very long time. My increased level of confidence in the existence of multiple instances of humanity increases my interest in reducing suffering risk relative to reducing extinction risk, since my primary reason for prioritizing extinction risk was that I thought we would lose all future value if we (a particular instance of humanity) went extinct.

Additionally, reducing suffering risk looks more robustly beneficial to me than reducing extinction risk. Reducing involuntary suffering is always good (holding everything else constant), while reducing extinction risk might not be, depending on the quality of the future and your preferences (your E- and N-ratios). Suffering risks are also more neglected than extinction risks (2). Nevertheless, it would be an inconceivable tragedy if humanity only ever existed on this Earth and never came anywhere close to reaching its full potential. This seems like a real possibility, so I’m very sympathetic to the cause of reducing the risk of extinction.


(1) See Max Tegmark’s four levels on Wikipedia. From what I gather, each level is more controversial than the previous one, with the first one being uncontroversial among cosmologists and the last one highly speculative. Even if level I is uncontroversial, we don’t know if it is big enough to host other instances of humanity, but cosmological evidence suggests that it could be infinite. From Wikipedia: “Arguments have been put forward that the observational data best fit with the conclusion that the shape of the global universe is infinite and flat, but the data are also consistent with other possible shapes.”
(2) In “Against Wishful Thinking”, Brian Tomasik explains why he thinks people don’t pay enough attention to the risk that suffering could get greatly multiplied in the future.

The cosmic endowment and superintelligence

Update 25/7 2021: I have changed my mind and no longer agree with parts of this post. Keeping it up for archiving purposes. Note below.

Humanity’s cosmic endowment (all the resources available to us if we develop the ability to colonize space) allows for the computation of about 10^85 operations if we convert all the accessible cosmic resources into efficient computers, according to an estimate by philosopher Nick Bostrom (1). What could we do with all that computing power? There have existed about 10^25 (give or take a few orders of magnitude) sentient life forms in the entire history of life on Earth (2). Simulating all their neural activity, every experience that every organism has ever felt, would require about 10^39 to 10^52 operations according to Nick Bostrom’s estimate (3). If we take the high, conservative estimate, we could run “sentient history” about 10^33 times with our cosmic supercomputers (4). If we assume that consciousness doesn’t go away when carbon and neurons are replaced with silicon and transistors, the cosmic endowment seems to allow for the creation of conscious experience comparable to at least a billion trillion trillion copies of the history of life on Earth (5). To put this number into perspective, that’s roughly as many copies as there would be grains of sand on all beaches in the Milky Way galaxy if every star in it had a planet just like Earth.
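The division behind these figures can be written out explicitly. The sketch below reads the figures in the paragraph above as powers of ten (10^85 operations available, 10^39 to 10^52 operations per run of sentient history) and uses short-scale number names (billion = 10^9, trillion = 10^12). The constant names are my own.

```python
# Figures from the paragraph above, read as powers of ten.
ENDOWMENT_OPS = 10 ** 85      # operations the cosmic endowment allows
HISTORY_OPS_LOW = 10 ** 39    # low estimate for one run of sentient history
HISTORY_OPS_HIGH = 10 ** 52   # high (conservative) estimate for one run

# Number of times sentient history could be run.
runs_conservative = ENDOWMENT_OPS // HISTORY_OPS_HIGH
runs_optimistic = ENDOWMENT_OPS // HISTORY_OPS_LOW

# "A billion trillion trillion" in the short scale.
billion_trillion_trillion = 10 ** 9 * 10 ** 12 * 10 ** 12

print(runs_conservative == 10 ** 33)
print(runs_conservative == billion_trillion_trillion)
print(runs_optimistic == 10 ** 46)
```

Python integers are arbitrary precision, so the comparisons are exact. The large uncertainty lies in the input estimates, not in the arithmetic.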

What we make of the cosmic endowment is therefore plausibly trillions of trillions of times more morally significant than everything that has ever taken place on Earth (6). There are plenty of uncertainties in this estimation, but the result would have to be off by tens of orders of magnitude for my conclusion to change: ensuring that smarter-than-human AI is as benevolent as possible looks incredibly important. Why? Because it looks as if a superintelligence, which can be viewed as an extremely competent goal-achieving system, with a wrongly specified goal might grab the cosmic endowment (including Earth) and turn it into whatever structures best fulfill its goal. If you want to understand why I think this is plausible enough to be taken seriously as a possibility, I primarily recommend reading the best-seller Superintelligence by Nick Bostrom. If you want the gist without investing several hours, I can recommend this succinct Vox article and Robert Miles’s excellent videos, especially these two that explain core concepts: The Orthogonality Thesis, Intelligence, and Stupidity and Why Would AI Want to do Bad Things? Instrumental Convergence. The advent of a superintelligence could plausibly happen this century (7) and lead to astronomical amounts of suffering in at least three ways: as a side effect if a superintelligence finds instrumental value in creating suffering agents, in a potential conflict situation (for instance if threats of creating intense suffering are used to extort other agents), or if the creation of suffering is part of its goal, which could conceivably happen due to a faulty implementation of human values into the AI (8).

Right now, there is one organization I’m aware of whose main focus is to understand and reduce the risk of astronomical suffering, namely the Center on Long-Term Risk, an organization in the Effective Altruism community. Another organization that might be beneficial in my opinion is MIRI, which works on AI alignment research: technical research into ensuring that a future superintelligence produces good outcomes. However, it has been speculated that the worst outcomes might come from an almost aligned superintelligence, and that research on AI alignment might make those outcomes more likely (9). That said, I can imagine the opposite argument: that without alignment research we will reach almost aligned AI anyway, and that we will need alignment research to get out of the “hyper-existential pit” in the design landscape.

My most preferred option would be for humanity to not make any superintelligences — it’s just not worth the risks in my opinion — but that might be asking for too much, taking into account the economic incentives and enthusiasm in AI research: “the prospect of discovery is too sweet” (10).

Note: The part I now disagree with is the “‘hyper-existential pit’ in the design landscape” idea. The idea is similar to the uncanny valley idea and says that AI safety research could get us into the valley but fail to get past it. I now think that this is overly simplified. There could be other peaks far from the hypothesized valley.


(1) Bostrom’s estimate of our cosmic endowment (in Superintelligence, p 102) assumes that there don’t exist any other technological civilizations within our cosmic horizons. If they do exist, we would have to cooperate or compete with them for the resources. It also assumes that we don’t live in a computer simulation with limited computing resources (Bostrom, Are You Living In a Simulation?). If we choose to aestivate, the computing resources could increase by 30 orders of magnitude.
(2) Calculating low- and high-end estimates for the number of sentient animals that have existed.
High end:
Current number of animals: about 10^22 (mostly nematodes)
x 52 x 500,000,000 ~ 10^32
(one-week lifespan on average assumed, animals have existed for half a billion years, no change in populations over that time assumed)
Low end:
Total number of mammals that have ever existed: probably about 10^20
Current number of mammals: about 10^11
x 200,000,000 ~ 10^19
(one-year lifespan on average assumed, mammals have existed for 200 million years, no change in populations over that time assumed so it’s likely an overestimate)
It’s hard to estimate the total number of sentient life forms that have existed, but it’s likely in the range of 10^19 to 10^32. It depends on where we draw the line between sentient and non-sentient. It’s closer to 10^30 if we include insects. Numbers from Tomasik, How Many Wild Animals Are There?
See also:
How many animals have ever lived?
How many organisms have ever lived on Earth?
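The high- and low-end figures in footnote (2) can be reproduced with a short Python sketch. It uses only the inputs stated there (a steady population, a one-week or one-year average lifespan, and the number of years the group has existed). The names are made up for illustration, and the steady-population assumption is a simplification taken from the footnote.

```python
import math

# High end: all animals, mostly nematodes, with a one-week average lifespan.
CURRENT_ANIMALS = 1e22
ANIMAL_GENERATIONS_PER_YEAR = 52
YEARS_OF_ANIMALS = 500_000_000

# Low end: mammals only, with a one-year average lifespan.
CURRENT_MAMMALS = 1e11
MAMMAL_GENERATIONS_PER_YEAR = 1
YEARS_OF_MAMMALS = 200_000_000


def total_ever_lived(population, generations_per_year, years):
    """Individuals that have ever lived, assuming the population never changed."""
    return population * generations_per_year * years


high_end = total_ever_lived(CURRENT_ANIMALS, ANIMAL_GENERATIONS_PER_YEAR, YEARS_OF_ANIMALS)
low_end = total_ever_lived(CURRENT_MAMMALS, MAMMAL_GENERATIONS_PER_YEAR, YEARS_OF_MAMMALS)

for label, value in (("high end", high_end), ("low end", low_end)):
    print(f"{label}: about 10^{math.floor(math.log10(value))}")
```

The products come out at roughly 2.6 x 10^32 and 2 x 10^19, which matches the orders of magnitude given in the footnote.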
(3) Number of operations required to simulate “sentient history” is derived from Superintelligence, p 26: “If we were to simulate 10^25 neurons over a billion years of evolution (longer than the existence of nervous systems as we know them), and we allow our computers to run for one year, these figures would give us a requirement in the range of 10^31-10^44 FLOPS.”
There are about 3 x 10^7 seconds in a year, so I multiplied by that. Note that Bostrom’s 10^25 estimate is the number of neurons in nature at any given time, while my 10^25 estimate is the number of sentient animals in the history of Earth. His estimate also counts non-sentient animals.
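As a check on the conversion from operations per second to total operations, here is a minimal Python sketch. It takes Bostrom’s range as 10^31 to 10^44 FLOPS and a year as about 3 x 10^7 seconds, both from footnote (3). The names are made up for illustration.

```python
import math

SECONDS_PER_YEAR = 3e7          # rounded figure used in the footnote
FLOPS_LOG10_RANGE = (31, 44)    # Bostrom's range, in operations per second


def total_ops_log10(flops_log10, seconds):
    """Base-10 exponent of the operations performed at a given rate over a given time."""
    # Multiplying by the number of seconds adds log10(seconds) to the exponent.
    return flops_log10 + math.log10(seconds)


low_ops, high_ops = (total_ops_log10(e, SECONDS_PER_YEAR) for e in FLOPS_LOG10_RANGE)

print(f"low:  10^{low_ops:.1f}, rounded up to 10^{math.ceil(low_ops)}")
print(f"high: 10^{high_ops:.1f}, rounded up to 10^{math.ceil(high_ops)}")
```

The products are about 3 x 10^38 and 3 x 10^51 operations. Rounding up to the next power of ten gives the 10^39 to 10^52 range used in the post, which slightly overstates the cost and therefore keeps the final estimate on the conservative side.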
(4) We get this by dividing the 10^85 operations allowed by the cosmic endowment by the 10^52 operations representing all neural activity. Alternatively, we could create 10^58 digital humans with 100-year lifespans, interacting with each other in virtual worlds (Superintelligence, p 25–26, p 102–103).
(5) If we use the high estimate for the number of operations needed to simulate all “sentient history” and we assume moral significance is proportional to the number of operations (an extreme oversimplification), “sentient history” on Earth is morally comparable to 10^25 human lives. If we use the low “sentient history” estimate, “sentient history” is comparable to 10^12 human lives.
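One way to reproduce the human-life figures in footnote (5) is sketched below in Python. It rests on an assumption that the footnote does not state outright: that the cost of one 100-year digital human life is the 10^85 operations of the endowment divided by the 10^58 digital humans mentioned in footnote (4). The names are made up for illustration.

```python
# All quantities are base-10 exponents taken from the footnotes.
ENDOWMENT_OPS_LOG10 = 85
DIGITAL_HUMANS_LOG10 = 58                     # 100-year digital lives the endowment could run
HISTORY_OPS_LOG10 = {"low": 39, "high": 52}   # cost of one run of "sentient history"

# Assumed cost of one digital human life: endowment divided by number of lives.
ops_per_life_log10 = ENDOWMENT_OPS_LOG10 - DIGITAL_HUMANS_LOG10

# "Sentient history" expressed in human-life equivalents, under the
# simplification that moral significance scales with operations.
human_life_equivalents_log10 = {
    label: cost - ops_per_life_log10 for label, cost in HISTORY_OPS_LOG10.items()
}

for label, exponent in human_life_equivalents_log10.items():
    print(f"{label} estimate: 10^{exponent} human lives")
```

Under that assumption one life costs 10^27 operations, and the two estimates come out at 10^12 and 10^25 human lives, as in the footnote.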
(6) I’m assuming here that total moral significance is linear with the amount of conscious experiences with moral significance. It doesn’t seem far-fetched to think that headaches are twice as bad if they are twice as intense, are experienced for a period that is twice as long, or occur to twice as many (all else being equal).
(7) Nobody knows if and when superintelligence will be developed, but in a 2016 survey, the mean of AI expert opinion was that it is 50 percent likely that AIs will outperform humans in all tasks by around 2060. Superintelligence could also arrive through what’s called whole brain emulation. This path should be easier to predict because it doesn’t depend on any theoretical breakthrough, “just” continued incremental progress in computing, microscopy, automatic image recognition, and neuroscience. Oxford researcher Anders Sandberg estimates that there is a 50 percent chance of this technology being available in the 2060s, and about 90 percent by 2100. Technological forecasting is very difficult, so one should take these predictions with a big pinch of salt. However, the suggestion that artificial superintelligence will arrive this century doesn’t appear ridiculous. I like this quote by researchers Sotala and Yampolskiy: “If the judgment of experts is not reliable, then, probably, neither is anyone else’s. This suggests that it is unjustified to be highly certain of AGI being near, but also of it not being near.”
(8) See the paper “Superintelligence as a Cause or Cure for Risks of Astronomical Suffering” by Kaj Sotala and Lukas Gloor.
(9) See “Separation from hyperexistential risk” for proposed attempts to mitigate this risk.
(10) Quote by Geoffrey Hinton in the New Yorker.