On Veo 3: Nothing is true anymore

With this week’s unveiling of Veo 3, I believe we have crossed a crucial threshold in the development of AI, and we need to have a conversation about it.

Veo 3 is, to me, the first tool that can create videos that are indistinguishable from traditional, human-produced video. The game changer is the native integration of audio generation alongside video generation, all from text prompts.

Of course, if you look/listen closely, you can sometimes pick out artifacts that reveal the footage’s artificial origin. But only sometimes, and this is the first iteration. Within a short timeframe (6 months?), the technology will have improved such that Veo generations are indistinguishable from human-produced video.

This ushers in a new age for our world, one in which there is no way to know if any information you receive through a screen is grounded in reality. Misinformation has already been a problem for some time. But until now, the quality and volume of deepfakes have been low enough that discerning viewers could comb through the mess to find out what is going on. No more.

So what does this mean for truth, especially in matters of politics, war, climate change, etc.?

First, we can no longer view screen-based media as primary-source information. Footage of war, political speeches, natural disasters, etc., can be either entirely fabricated or partially falsified. This is also true of audio content (podcasts, radio interviews) as well as text (articles, essays) and images.

When all media that is mediated by a screen can no longer be trusted, what can? Well, the answer is: your own lived experience. Footage of a war crime may be artificially generated. What you see with your own eyes cannot be.

I believe that now, more than ever, trust is the central issue. If anyone can lie to you, who do you trust not to?

This could result in a revitalisation of legacy/large news organizations. A place like the BBC, for example, has the staffing and resources to have humans on the ground in places of crisis around the globe to bear witness to what is truthfully happening there. But they will only be able to effectively operate if enough viewers trust them to share that information accurately and ethically.

What I suspect will actually happen is a rapid escalation of the ongoing separation of society into bubbles, each with a different understanding of what is true. We may look back on the tensions and divisions of the last few years as relatively mild compared to what comes next.

And I suspect the other outcome of this will be a reliance on the local above all else. Screens mediate the truths we do not experience in our day-to-day life. Those we do experience take place within a local context. Though we may disagree with our neighbors about whether our neighborhood needs a new condominium building, we will be able to agree that one is, in fact, being built.

I don’t know exactly where the conversation goes from there, but I want to ask the question: how can we, as localized communities of human beings, build structures in which we talk to one another in person? And how can we use our local community to bolster ourselves against the waves of misinformation that are at our doorstep?
