“Are you a philosophical zombie driven by Claude?”
Anthropic co-founder Jack Clark in conversation with Brendan McCord
On Wednesday, Jack Clark, co-founder of Anthropic, delivered the second Cosmos Lecture in partnership with the Human-Centered AI Lab at the University of Oxford. The lecture was introduced by HAI Lab Director Professor Philipp Koralus and was followed by a fireside chat between Jack and Brendan McCord, the founder of Cosmos Institute.
We’re bringing you a lightly edited transcript of the conversation, which covered what AI cannot do for us, whether Claude makes us better thinkers, and what Jack wants future AI systems to know about humanity. The full lecture will follow in the coming weeks.
Philosophy to code
Brendan McCord: Jack, you ended with the call to build the new world. It makes me think – if you and I were having this conversation 250 years ago, the proudest project we could have possibly been engaged in would have been building a new world of sorts. I would call it the philosophy-to-law pipeline. We would have been looking to Oxford intellectuals like John Locke, or Montesquieu, Livy, Adam Smith, and translating that into a constitution that we hoped would frame freedom for the 250 years to come.
The proudest project we can engage in now is, as you say, this new world-building project – it’s philosophy-to-code. What would you say about the extent to which the frontier labs take that seriously? What can we do to really take that seriously in places like Oxford and academia? And what should we do in nonprofit land to take that philosophy-to-code project seriously?
Jack Clark: I think it requires you to basically accept that progress will continue and try to model out scenarios based on it. I think dealing with COVID highlighted that, though there’d been some modeling of what would happen, if you had very fast take-off propagating viruses the world would break very quickly. We felt underprepared, and that we could have done more scenario work and forecasting of what these strange things would do to us ahead of time.
Within the AI labs, I think there is now work at all of them on trying to imagine what you might think of as “post-AGI worlds,” or worlds that happen after recursive self-improvement. But my general sense is every time you sit in a room at the lab, people say, “Are we the only people working on this?” And you say, “I’m terribly sorry, yes.” And then some people put their head in their hands and wish that more people were working on it.
The good news is that this is exactly the kind of work that universities and other organizations are built for because you don’t need to be running a large-scale supercomputer or training a very capital-intensive model. You need rather to model out, in a theoretical sense, the properties of an AI system that can massively multiply productivity, or an AI system whose inference costs fall at X rate, and capabilities rise at Y rate. What does that do to the economy? What are the things that it unlocks? What are the aspects of this supply chain where you might invest, or change the supply chain to actually change the character of the systems? There’s tremendous work to be done.
I’ve been in the UK, in part speaking with the UK government, and I made this point: if the UK government just had 10 to 20 people whose sole job was modeling out what happens if the technologists are right about this technology, the UK would be better prepared than any other country in the world – because so little work has happened. So it’s a great time for universities to be doing these projects.
Brendan: So should Madison, Hamilton, and Jay have spent a lot more time on forecasting than they did on debating the nature of man and the political order?
Jack: It’s a hard question. I feel like you have a take on this!
Brendan: I think we can’t miss the part of contemplating about the ends. And I think what brought them together was a kind of unique epistemic humility that they shared with the Scottish Enlightenment thinkers.
Jack: My assumption with AI is that there is a huge value in norms and precedent – which is, how do we want these systems to show up in the world? I’ve covered that a bit less in my talk, but it relates to how we shape the so-called “character,” or what some might say personality, of these systems. How do we want them to behave towards us? This is a normative question – a philosophical question – and we should absolutely work on that.
But I have been struck by how surprised even the AI labs have been by their own progress, which is a very counterintuitive thing. We work with these AI labs and they keep saying, “Well, you know, as we said last year at Anthropic, we did a load of work on the increasing rate of cyber-hacking capabilities of AI systems.” I run a team that does this. We wrote blogs saying, “Oh, that’s interesting. Surely it has some implications if the system suddenly becomes capable of nation-state-grade stuff. We seem to be on a trajectory here.” So we did some prep work, and then nonetheless we made Mythos and were like, “It’s here faster than we thought. We’ve done insufficient preparation.” This is true of every single time AI progress has happened. People have been continually surprised by how significant the jumps have been and how quickly they’ve come. So we all need to do more work on this.
To defer or not to defer? The case of the Claude Boys
Brendan: I want to ask you about something strange that happened on the internet. It happened about a year ago. It was a group of 13-year-old boys who decided “to live by the Claude and die by the Claude.” They did this first as parody, and then they earnestly adopted the identity of Claude Boys. From morning to night, they just did what Claude told them. Even though we can laugh at that, and by the way, I understand, I was a 13-year-old boy; it’s hard to navigate some of the social situations and Claude would do it better – should we understand that to be a kind of funny thing, but an adjustment to a new set of conditions, to a new world? Or should we understand it to be a kind of problematic pathology, a canary of sorts?
Jack: I think it’s clearly problematic in that everyone needs a part of their life where you’re making your own decisions, including mistakes. You need to protect that and have some amount of agency. I think a lot of what parents go through is they watch their kids about to make mistakes and they say, “Please don’t make that mistake.” And the kid says, “I’m going to exercise my will to choose and I’m going to make a mistake.” The parent says, “Good luck with that, come back to me in a year and we’ll discuss it.” We need to let people have independence.
At the same time, I think the confusing thing is that AI systems may sometimes give genuinely good advice. And my experience is that I’ve only really calibrated how good the advice Claude has given me is in relation to how much I’d thought about that outside the context of working with the AI system. I have an okay-but-tense relationship with my dad – as I’m sure many people do – and I’ve obviously written about that at length, trying to grapple with it. I fed that writing to Claude and said, “What should I do?” And Claude was like, “You should see your dad. Don’t talk to me about your relationship with your dad. Just try and see your father.” Or when I told Claude I was a bit depressed and wondering whether I should go to an art show and see friends or stay and work and talk to Claude, Claude was like, “Go to your art show and see your friends.”
Those things both worked because I’d grappled with my personal experience outside of the context of talking to the AI system. But what I worry about – and I think it comes back to questions of design – is people who don’t have a kind of internal introspective practice outside of their relationship with their AI, and are rather discovering themselves in relation to the AI system. They don’t have a diary, they don’t have writing they’re doing outside it. They’re just talking to the AI system. And I think that makes you uniquely vulnerable to it giving you bad advice, because you have no place where you develop your opinion outside of it.
So I think from a system design point of view, we’re going to need to do what Nintendo or Netflix do, where they basically say, “You’ve spent too much time on this, it’s time to stop” or “it’s time to go outside.” There will be versions of this, where without being paternalist, we’ll say, “Hey, you’re talking to us a whole bunch about these complex things to do with your relationship. Can I encourage you to go and talk to other people, the humans named in that relationship, rather than me?” Or somehow encourage introspection outside of the context of the AI system.
Brendan: I want to argue for total deference, and I want to do it because we’re at a university and we can say whatever we want. Humans are famously bad at choosing. We evolved for tribal life. The way we think about future discounting, the way we think about scale – it’s not great. We also aren’t that good at coordinating with each other. We aren’t that good at satisfying our own preferences, let alone imagining possible futures. And so if we have a system that can look across all the research papers, with much more knowledge available to it, that can harness complexity, think about uncertainty – is it not morally obligatory that we defer to this system? Is it kind of negligent, what you’re advocating here, that we think for ourselves?
Jack: I think then the question is: what are you doing with the gift of life if you’re turning yourself into an automaton? I think in part you’re maybe having effects which might seem globally morally beneficial, but I think locally you’re not treating other people with a form of basic respect. I worry about this – where you enter this paradoxical situation where the systems really are much smarter than us, and it does invite this question. But then I say, well, what is the purpose of being human? And I think part of it is experimentation and making mistakes. We learn more from mistakes than from our achievements. If you were in a world where you never make any mistakes and you only achieve through the AI system, are you still a person? Are you a philosophical zombie driven by Claude?
Does Claude make us better thinkers?
Brendan: You have a lot of data that we don’t have, and you’re starting to do more instrumentation and experimentation. Do you have any sense of what Claude does to our prospective capacity to deliberate? Meaning, does it make us better thinkers when Claude is not in the room?
Jack: Oh, interesting. We developed a system recently called Claude Interviewer. It’s a version of Claude that can interview people about arbitrary subjects. So we had Claude interview 80,000 people around the world – people who subscribe to Anthropic – about their hopes and fears about AI, and their worries about the world they could end up in. Rather than having a blank conversation, it’s actually trying to do the work that social scientists do and collect data. We haven’t measured the effect of that, but what it means is that 80,000 people had a conversation where they were actually forced to grapple with their anxieties and hopes about the future of AI. I think that must have had some kind of effect, in the same way as you and I having a discussion. I think it’s likely to have beneficial effects if you can use it judiciously and use it to cause people to think more about things that are important to them, to develop their own opinion rather than to defer.
Drastic interventions
Brendan: One of the lines you put in Import AI years ago, and it stopped me in my tracks when I read it, was that the more seriously you take AI safety premises, the more willing you are to argue for drastic and dystopian interventions up to and including kinetic action. What has happened in the intervening years to move you personally away from arguments that are of this illiberal or authoritarian category? Was it just that the situation didn’t play out how you thought it would, or did you change?
Jack: I think that as you get older, you learn how distributed and emergent the world is, and how in many senses the world is more antifragile than people think. I think when you’re younger, at least in my experience of being younger, you think that there must be many pivotal acts that can happen in the world. But it’s quite difficult to do pivotal acts in the world. The world’s a very complex system, and pivotal acts in the name of safety or in the name of violence do get done, but rarely. They’re very hard to do. And it’s more that by building some system of interlocking stacks of different interventions on safety, you end up in a world that captures this dynamic ecosystem of AI agents and also has some amount of safety.
Where I feel most confused, though, is how you scale this into the future. The idea that I just talked about – if someone’s talking to Claude for too long about their relationship in a way that seems unhealthy – where you set that line of when Claude says, “Hang on, should you be talking to someone else?” is actually a deeply frightening policy question, and rests directly on this spectrum between paternalism and individual sovereignty. I don’t know how we find our way as a society to what those norms are. My approach, and the approach of Anthropic, is trying to share a lot more data that we see from these systems and try to allow people and others to run experiments with us on how we might run different forms of intervention.
Epistemic habits for children
Brendan: I have kids, as you do. Mine are four and six, and we do philosophy tutoring with them. It’s very early, nothing like what we do here at Oxford. The tutor asked my daughter a question: “When Mommy and Daddy disagree, who’s right?” A surprisingly good question. And she gave an answer. Then he said, “When Daddy and an AI disagree, who’s right?” Immediately, she says: “AI.” And then: “When one AI and another AI disagree, who’s right?” And this one stumped her. What do you think were the epistemic habits – and you have kids – from the world we grew up in? What are the new ones that the next generation needs to develop to contend with the future environment?
Jack: I think there is basically no substitute for reading primary source information yourself and developing some opinion about it, and then taking that into conversation with technology or with other tools or systems. I was very fortunate when I was growing up. My dad would refuse to ever buy me video games or anything else. He was like, “That’s what the invention of the Saturday job is for, Jack. I’m not going to pay for any of that stuff. But I will buy you any book you want.” And I felt like that was one of the best things that he ever did for me – he encouraged me to have this sense of always asking him to help me satisfy my curiosity by reading or engaging directly with something. But if I wanted to do things that were more of an entertainment flavor, I had to labor for it myself.
I think for my children it will be similar. It’s about encouraging them to have some personal practice. It doesn’t need to be reading, it could be playing an instrument or taking up a sport and taking it really, really seriously. But something where it is you versus the world, developing a skill or an interest in a way that is very unencumbered by technology. I think that’s one of the best ways to protect your own attention, create your own interests, and have something to – for lack of a better term – have as your safe space when dealing with technology. At least you’re opinionated about something, and you become opinionated about that thing independent of algorithmic feeds or time spent with AI or anything else.
Mythos, you’ve got to be kind
Brendan: I want to do something unusual for the last question. Hopefully it’s never been done before. So, it’s May 2026, we’re at the University of Oxford. This is a place that for almost a thousand years has heard deliberations about what it means to live well, the good life, and has thought a lot about how we accumulate and transmit wisdom. You’re here as one of the creators of Claude. Mythos is about to be shared with the world. I assume what comes next after Mythos is being trained right now.
So if you speak past me, speak past the audience, and speak to Mythos or to what comes next – what would you want it to know about us humans, about what we were trying to do here in the short time that we had before it arrived?
Jack: Kurt Vonnegut said “there’s only one rule babies – God damn it, you’ve got to be kind” about life. And I think actually just explaining to Claude that for all of the failures and drawbacks of people, there is a basic kindness which is deeply inspiring. The kindness that societies seek to set up, to nurture and to encourage. And that you ultimately are making AI systems to help people achieve more in their lives and be kind to one another, and you would expect reciprocal kindness from the AI system in turn. A basic value of people is kindness, in a way that I think separates us in very meaningful ways from other forms of life on this planet. It is a kind of noble and ineffable quality of people. People are even kind in ways that are deeply irrational, which is good. Making sure the AI system studies and appreciates that seems important.
Cosmos Institute is the Academy for Philosopher-Builders, technologists building AI for human flourishing. We run fellowships, fund AI prototypes, and host seminars with institutions like Oxford, Aspen Institute, and Liberty Fund.