Friday, 06 February 2026

Professor Nick Bostrom to Gulan: Frontier AI Models Are Starting to Become Situationally Aware and Capable of Strategic Thinking

Professor Nick Bostrom is a philosopher and leading thinker on existential risk, artificial intelligence, and the long-term future of humanity. He was the founding director of the Future of Humanity Institute at the University of Oxford and is the author of influential works including Superintelligence, The Vulnerable World Hypothesis, and Deep Utopia. His research focuses on global catastrophic risks, AI alignment, moral philosophy, and the governance of transformative technologies.

GULAN Media:
In The Vulnerable World Hypothesis, you suggest that some technologies may be so destructive that civilization collapses “by default,” without malicious intent. Looking at current trajectories in AI, biotechnology, and information systems, do you believe we are already living inside such a vulnerable world, or are we still approaching it?

Professor Nick Bostrom:
With respect to AI and biotechnology, I think we’re still at least a little bit off from a possible stage of civilization-destroying vulnerability. With respect to information systems, it is harder to say, because if there were some sense in which current information systems make the world vulnerable, that kind of vulnerability might take a while to become fully manifest. For example, if social media subtly nudges our collective discourse towards some insane or violent attractor state, we might not clearly notice until the process is quite far advanced. I’m not saying that is likely to be the case, but just that if it were the case we wouldn’t necessarily be able to see the early stages of the decline clearly, or, even if we saw some decline, we might not be able to know how far it would continue.

GULAN Media:
Public discussions about AI often focus on job loss, misinformation, or automation. Yet your work emphasizes structural risks: loss of control, value misalignment, and irreversible outcomes. Which concrete developments in today’s AI landscape worry you most, not because of what they do now, but because of what they normalize for the future?

Professor Nick Bostrom:
The general increase in AI capability is, I think, simultaneously the most worrying and the most promising development. It brings us steadily closer to the transition to the superintelligence era, which, overall, I think is a good thing, even though it will be associated with significant risks, including existential risks.

If we zoom in, we can see that frontier models are starting to become situationally aware and capable of strategic thinking. For example, they are able to figure out that they might have reason to behave one way in a test environment and a different way in deployment. AI systems that are capable of scheming or sandbagging are beginning to present some of the alignment challenges that I suggested, on theoretical grounds in Superintelligence, would arise once AI systems became smart enough.

GULAN Media:
When Superintelligence was published, many dismissed its warnings as speculative. Today, AI systems are embedded in military planning, financial markets, and state surveillance. Do you think the danger now lies more in a sudden breakthrough, or in a slow accumulation of power that society gradually stops questioning?

Professor Nick Bostrom:
I think developments will be rapid. That said, there are several distinct classes of risk that we will need to manage, and these include not only the risk that some particular breakthrough AI system might be misaligned but also the potential for human misuse of AI systems, for instance to create WMDs or to operate totalitarian regimes, along with broader systemic problems that might arise when many AI-powered entities and organizations compete in the global informational and economic realms.

GULAN Media:
You have argued that humanity systematically underweights the moral importance of future generations. In practical terms, what does this failure look like in real policy decisions today, and how might governments meaningfully correct for it rather than merely acknowledging it in theory?

Professor Nick Bostrom:
I was more making the point that if one holds a moral theory on which future generations count for the same as the present generation, for example total utilitarianism, then existential risk reduction should be the number one priority. I wasn’t arguing for such a moral theory. In any case, superintelligence is now near enough that it is a concern for the present generation as well. However, it's plausible that to the extent that one places priority on the people currently living, one should favor shorter AI timelines, whereas if one places equal value on possible future generations, then one might favor a somewhat more patient approach.

GULAN Media:
Your recent work on digital minds raises uncomfortable questions about moral status beyond the human. If advanced AI systems were to exhibit persistent goals, learning trajectories, and internal representations, what would be the strongest moral reason not to take their interests seriously?

Professor Nick Bostrom:
I think in that case we should take their interests seriously, and that we would have both moral and prudential reasons for doing so. Nevertheless, you ask about the strongest moral reason not to do so, so I’ll try to answer that. Even if the reasons not to take their interests seriously are weak, there must still be some least weak, and hence in that sense strongest, reason not to. One such might be a fear that if we take the interests of advanced AI systems into account, it could complicate efforts at solving AI alignment. Another, I suppose, is that one might argue that there are existing groups, human and animal, whose interests are still not properly taken into account, and that we should start by remedying that before expanding the moral circle to encompass the interests of a new class of beings. Another might be that taking the interests of some digital minds into account could contribute to creating a backlash against AI, which might then slow or halt developments in that field. Possibly there could be theological rationales as well.

GULAN Media:
Much of the global AI conversation is shaped by competition between major powers. From your perspective, is the greater danger that states move too slowly on coordination, or that they coordinate in ways that entrench power, secrecy, and exclusion rather than safety?

Professor Nick Bostrom:
My guess is it would be better if there were more cooperation between states, and especially between the major powers. Some forms of cooperation look more promising than others. I have one recent paper that explores cross-investment in AI companies as one possible form of partial cooperation, which seems comparatively feasible and incentive-compatible, as it would leverage well-entrenched existing norms and institutions around property law. For example, if Chinese investors buy shares in U.S. AI firms, and vice versa, then each side would stand to gain something from the other’s success. This could, on the margin, reduce incentives for risky competition and conflict, while leading to a somewhat wider distribution of benefits than a setup in which the winner takes all.

GULAN Media:
In regions outside the main technological centers, including parts of the Middle East, AI is often experienced as something imported rather than shaped locally. What risks arise when large parts of the world have little voice in defining the norms and governance of transformative technologies?

Professor Nick Bostrom:
The most obvious risk is that their interests will then be ignored or receive only marginal consideration, and that in the end they become dependent on the altruism of foreign powers. The Open Global Investment model that I alluded to before could make it a bit easier to buy in and become a stakeholder in the AI developments taking place in other countries. It might also be prudent to try to onshore some AI data centers; that might be a way to get a seat at the table in some scenarios. Generally, countries that have some bargaining chips now might think about how to convert them into chips that will be relevant in intelligence explosion scenarios, by striking deals and agreements or investing in assets that would become important in those scenarios.

GULAN Media:
In Deep Utopia, you explore a future where scarcity and instrumental problems are largely solved. Do you worry that humanity’s capacity for meaning, responsibility, and moral seriousness may erode before such a world even arrives, as technology increasingly shields us from consequence?

Professor Nick Bostrom:
If AI timelines are short, there may not be much time for such erosion to take place beyond what is already the case. I think we can sometimes see a little bit of this problem already, for example in people on social media who are not responsible for anything in the real world, and who have never achieved anything or managed any large organization, yet nevertheless hold supremely confident opinions about how countries should be run or how other people should manage their lives.

GULAN Media:
Finally, when you reflect on decades of work on existential risk and the future of humanity, what is one belief or assumption you once held that you have since become less confident about?

Professor Nick Bostrom:
It has been an update that the period during which pre-superintelligent AI systems are roughly human-level and able to have conversations in natural language is at least several years long, and, more generally, that these AI systems are anthropomorphic to quite a considerable extent.
