Dia and the Future of Agentic Browsing

Setting table stakes for the browsers of 2030

Jun 15, 2025

Soda means Coke; Browser means Google Chrome.

But Google’s crawling rate of innovation has left a growing gap in the browser space, ripe for the claiming by a young, ambitious competitor.

A new Browser is being born.

Enter Dia Browser

Dia is the much-anticipated AI-native offering from The Browser Company, a follow-up to their polarizing Arc browser. I am happy to say that my initial impressions are positive: this is an upgrade over Google Chrome.

Dia does not re-invent the squeaky wheel. Dia is content to join the long line of browsers building off Google’s Chromium technology. As such, I feared the Dia experience would be disappointingly uninnovative. Or worse, messy like Arc.

I was wrong. They get the fundamentals right. I have had access to Dia for a few days now, and I’m happy to say it’s the first browser that puts Chrome to shame.

With Dia, I have the tools I had been reaching for the whole time. Throughout the article I mention several examples of tools I had hacked together myself for use in Chrome that were just native in Dia.

Dia is currently in Beta, and quite barebones. It looks nice, runs fast, and has core new features that work. This is not a mature product, and options are basically nonexistent. And yet, I urge you: if you are AI-literate, you should try this browser.

Chat is a new Web Browsing Primitive

I went into Dia with few expectations, and was surprised to discover the way Dia was telling me that AI chat is a new primitive of web browsing.

Like a good Dungeon Master who subtly steers their party into the main storyline, Dia communicates this core premise to the user by making it excruciatingly easy to throw stuff into context. Dia has done the hard work of finding the places where we wish AI would show up, and putting AI there. It’s a pretty simple idea with a lot of room to grow, executed thoughtfully. Dia has created what feels like real cognitive ergonomics.

You can throw anything as context into an AI chat, and do it any way you can imagine: you can add the tabs you have open, any tabs you’ve previously visited, as well as selected text.

Chat with Selected Text

I can bring selected text into an AI chat in several ways:

Right-clicking selected text → ‘Ask Dia’
Left-clicking the new accented handles at the edges of the selection (which I am constantly clicking by accident)
Simply pressing Cmd+E while text is selected (love this one!) 💛

Why is chatting with a web page so important to make native? Because these tools are broadly useful all across the web. We’ve reinvented the tools for it a million times since the advent of LLMs, in a million subpar ways, toying around with the interface. Dia does it better.

For example, I used AI myself to make a Chrome extension that summarizes text on right-click. You can see it in the picture below as the right-hand sidebar. I did not find a single extension that supported this particular flow. Dia supports that right-click workflow natively.

The Chrome Extension I made so I could right-click text. Checkmate: the screenshot shows a Twitter draft where I was about to critique Dia for a feature they indeed had.

Chat with your Page

In fact, in Dia, I was delighted to find that I don’t even need to select any text to add the page as context; I can just press Cmd + E or click the ever-present ‘Chat’ button in the right-hand corner, and I’m ready to prompt the AI with the current page as context. Clicking the Send arrow / pressing Enter prompts the AI with the page.

And you can of course add your custom prompt in the chat window—made easy because opening the chat window jumps your cursor to the prompt text input field. This was a smart decision made by people who care about user experience.

Context-aware AI chat is never far when using Dia.

One of my favorite aspects is that in the chat window I can add any previous or current tab as context with @. It all pops up in this unified dropdown autocomplete; just type the name of the tab or website and you can easily add it. You don’t quite realize how often you want to rely on this until it’s easy to do. I find myself struggling to convey the importance of this experience. It’s like I don’t really have to forget anything I’ve read on the web. As long as I remember a keyword of the tab or tweet, all the content inside is basically an on-demand part of my exocortex. Finally (and suddenly) having ‘History’ enabled has a use beyond just letting me type ‘twitter’ into the URL bar faster.
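To make the @ flow concrete, here is a minimal sketch of how such an autocomplete could rank entries. This is not Dia's implementation: the Tab record, the suggest function, and the ranking rule are all hypothetical, chosen only to illustrate matching a keyword against open tabs and history.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tab:
    """Hypothetical record for an open or previously visited tab."""
    title: str
    url: str
    is_open: bool


def suggest(query: str, tabs: list[Tab], limit: int = 5) -> list[Tab]:
    """Return tabs whose title or URL contains the query, best matches first."""
    needle = query.strip().lower()
    if not needle:
        return []

    def score(tab: Tab) -> tuple[int, int]:
        # Assumed ranking: title-prefix matches first, then open tabs before history.
        starts = tab.title.lower().startswith(needle)
        return (0 if starts else 1, 0 if tab.is_open else 1)

    matches = [
        t for t in tabs
        if needle in t.title.lower() or needle in t.url.lower()
    ]
    return sorted(matches, key=score)[:limit]


# Made-up sample data: two open tabs and one history entry.
tabs = [
    Tab("Semantic Web - Wikipedia", "https://en.wikipedia.org/wiki/Semantic_Web", False),
    Tab("Twitter", "https://x.com/home", True),
    Tab("Notes on the Semantic Web", "https://example.com/notes", True),
]
print([t.title for t in suggest("semantic", tabs)])
```

A real implementation would likely use fuzzy matching and recency, but even plain substring matching captures the "remember one keyword" experience.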

Now integrated into the browser, adding context to an AI chat has never been so easy.

Chat in the URL Bar

Dia made the URL bar into a chat interface. At its inception, the URL bar was simply a text input for the raw URL of a website. Google then reimagined the URL bar as a place where you typed questions in English, and got answers as a list of websites. Dia sees the URL bar as the beginning of an AI chat. They call it a ‘command bar’. The URL still goes there. And you can still choose to Google Search instead.

It has some sort of NLP algorithm to helpfully highlight either ‘Chat’ or ‘Search’ depending on the context. The auto-selection is helpful, but not always right. It at least doesn’t flip at annoying times, but I still find myself hesitating to make sure I end up in the right place. As with GPT-5 unifying OpenAI’s diverse offering of models, I would hope that in the next year I no longer have to choose between the two.
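I don't know how Dia actually makes this decision, but a rough sketch of the kind of heuristic that could sit behind it looks like the following. The classify function, the word list, and the six-word threshold are all my own assumptions, not anything Dia has documented.

```python
# Hypothetical cue words that suggest the user wants an answer, not a list of links.
CHAT_CUES = {
    "what", "why", "how", "who", "when", "where", "which",
    "summarize", "explain", "compare", "write",
}


def classify(text: str) -> str:
    """Guess whether command-bar input is a URL, a search, or a chat prompt."""
    stripped = text.strip()
    words = stripped.lower().split()
    if not words:
        return "search"
    # A single token containing a dot looks like a URL.
    if len(words) == 1 and "." in stripped:
        return "url"
    # Questions and instruction-like phrasing lean towards chat.
    if stripped.endswith("?") or words[0] in CHAT_CUES:
        return "chat"
    # Long, sentence-like input also leans towards chat (the threshold is a guess).
    if len(words) >= 6:
        return "chat"
    return "search"


for sample in ["github.com", "weather boston", "why is the sky blue"]:
    print(sample, "->", classify(sample))
```

A rule set like this is exactly the sort of thing that is helpful but not always right, which is why I'd rather not have to choose at all.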

I was about to write a critique of the command bar for not providing a way to add additional context like the chat sidebar does, and then I realized that, yes, it absolutely supports this flow. You can click at the bottom or use @ to add additional context. Great attention to detail.

Chat with YouTube Videos

And it’s worth mentioning just how powerful chatting with a YouTube video is. You can’t do this with a Chrome Extension; Chrome neuters extensions for your Safety. I would know because I tried. (Another point for ‘Dia shows up with the tools I was already reaching for’!) Having this be a native feature is a dream come true. Clickable timestamps are amazing! I’m going to be doing this all the time.

We can imagine peering into a future where we get to ask a personalized agent “what do I need to know from this video?” and it tells you what you don’t already know. We can further imagine a new video created with just the best bits or the highest-level summary—like a good and surprisingly generous friend creating custom content just to make things easier for you.

Dia feels like the natural evolution of the browser. It’s evident it was designed ergonomically, tailor-made for the way we naturally work in our browsers. We open several related tabs and cycle back and forth between them, trying to integrate. Dia is the natural extension of this. This is only the beginning.

My Dia Wishlist

Low-latency models
Realize the potential of the agentic browser

Dia is V1.0 of a new paradigm. It’s agentic baby steps; the new standard for browsing experiences going forward as AI native becomes table stakes. This is a browser that could’ve been a Chrome extension—if Chrome would have allowed it. I’m interested to see how far they push this, particularly whether it takes them into rewriting the core engine. In order for Dia to take off, old browsers would need to feel unthinkably archaic by comparison. Dia will need to lean in to its AI tooling and offer unprecedented value. I have some notes on what Dia can do better.

Incorporate the alien technology of Low-Latency Models

I cannot overstate the bliss that low-latency models give to the user—when the AI responds instantly compared to waiting 5 seconds for a reply. I notice these things because it’s hard for me not to. I know what it would be like to have this experience, and I want it.

→ See Google’s Gemini Diffusion Model demo if you haven’t already.

→ You can even try this feeling out for yourself at CerebrasCoder

Dia is generally a fast browser. Using Dia’s main feature shouldn’t be the slowest part of my browsing experience. When navigating webpages takes only a second, waiting 5-10 seconds for an answer feels like punishing me for using the features that make your browser unique. Sub-1000 ms models will be necessary for users to fall in love with using AI. These are less than a year away—so implement them when they come!

If you experienced the shift from Intel Macs to the M1 chip, you know what it feels like for tech to upgrade to a level that feels alien. The feeling of intelligence being bare-metal, being instantaneous and zero-latency. When intelligence becomes UI. *That’s* what I’m talking about!

Push this new paradigm to the max

Dia has done a good job planting the seed of this new AI-based paradigm for web browsing. I want to see them be incredibly ambitious, pushing the limits of AI models as their capabilities improve. What’s next after chat? I want Dia to discover that and tell me. Find the next set of tools I’ve always been reaching for, and make them real in the UI.

Dia is the first agentic browser. Though we’re in such early stages that we don’t yet really know what that means, that won’t stop me from writing about it. We can kind of get away with saying it’s gonna be “insane bro” and be roughly correct.

The agentic paradigm unlocks AI to do the tedious tasks on websites for us. Right now the agentic paradigm, powered by MCP, is limited to the websites and applications that have explicitly designed support. Through MCP servers, Agentic AI can fully use a website, sometimes more deeply than a human. A truly agentic browser has the potential to make all of your websites accessible to AI—allowing you to ask complex questions and complete complex tasks without even opening the website. While I can only scratch the surface of what these interfaces would look like, this creates incredible possibilities.

For one, my agentic browser could go off and do things for me that I ask it to. My AI could research and post something saucy on Substack, or check my Twitter feed for me and let me know about replies or anything new that needs my attention.

Furthermore, it could do things in the background that augment my experience. It could be passively fact-checking this Substack post to see if I’m writing any outdated information, citing its sources. It could cite my sources. My browser should be able to find doctors, schedule calls, and book appointments without me having to ask it. This is a glimpse at the future of agentic browsing, where assistants empower you and make web browsing effortless.

This is the stage where good ideas are the bottleneck—in what ways is the web slow, tedious, disconnected, or effortful? We need to reconsider the web from first principles. The browser is the portal into the web: a portal to knowledge, to entertainment, to the digital mindspace, and importantly a portal to directly affecting the physical world. Agentic browsing ought to make this World-Wide Web more of what it was always supposed to be.

The Future of Web Browsing

Source: https://x.com/algekalipso/status/1730390120447299812. Nov. 30, 2023. GPT-5 is currently expected to arrive in approx. 1 month, July 2025.

A Vision For the Future of the Web

I had previously written the word ‘internet’ more than 30 times in this article where I should have written ‘Web’, so for your clarity and mine, let’s be clear on the Web vs. the Internet.

The Web is the front end of the Internet that people interact with, while the Internet is the back end it runs on.

The World Wide Web (WWW or simply the Web) is an information system that enables content sharing over the Internet through user-friendly ways meant to appeal to users beyond IT specialists and hobbyists.

[It] was invented by English computer scientist Tim Berners-Lee while at CERN in 1989 and opened to the public in 1993. It was conceived as a "universal linked information system".
-World Wide Web, Wikipedia

In terms of improving the ‘internet’, I expect the web, for the sake of speed, security, and redundancy, to eventually be served on a decentralized protocol, with users caching pages as they visit them and serving them to other users. DDoS attacks and necessarily slow Reddit CDNs will be a thing of the past. This is a core component of the vision for what is known as “Web3”, an ill-defined crypto-adjacent concept that I think has sprawled out far past its original intent.
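The "users cache pages and serve them to each other" idea can be sketched in a few lines. This is a toy model under my own assumptions, not any real protocol: the Peer class and the fetch function are hypothetical, and the only real technique borrowed here is content addressing (naming a page by the hash of its bytes), which is similar in spirit to what systems like IPFS do.

```python
import hashlib
from typing import Optional


def content_id(body: bytes) -> str:
    """Address a page by the SHA-256 hash of its bytes."""
    return hashlib.sha256(body).hexdigest()


class Peer:
    """Hypothetical user node that caches the pages it has visited."""

    def __init__(self) -> None:
        self.cache: dict[str, bytes] = {}

    def visit(self, body: bytes) -> str:
        """Cache a page on visit and return its content ID."""
        cid = content_id(body)
        self.cache[cid] = body
        return cid

    def serve(self, cid: str) -> Optional[bytes]:
        """Hand back a cached copy, if this peer has one."""
        return self.cache.get(cid)


def fetch(cid: str, peers: list[Peer]) -> Optional[bytes]:
    """Ask peers in turn and accept the first copy whose hash matches the ID."""
    for peer in peers:
        body = peer.serve(cid)
        if body is not None and content_id(body) == cid:
            return body
    return None


alice, bob, tampered = Peer(), Peer(), Peer()
cid = alice.visit(b"<html>hello</html>")
tampered.cache[cid] = b"<html>evil</html>"  # a peer serving a modified copy
print(fetch(cid, [bob, tampered, alice]))
```

Because the ID is derived from the content, a reader can verify any copy without trusting the peer that served it, which is where the security and redundancy benefits would come from.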

In terms of being a collection of all the world’s knowledge, I also believe that the web, a hyperlinked graph of internet pages, will further realize its interconnected nature and become more “synthesized”—to an extent beyond what Google did in creating liminal “search results” web pages listing out results for queries. As a knowledgescape it will become integrated, and new integrated knowledge will be bursting out. The transformer is exceptionally good at this integration; I expect new kinds of web pages that synthesize content from several raw pages into an even more entertaining and personalized output.

This is a component of Tim Berners-Lee’s parallel vision for the Web, which is confusingly called “Web 3.0”, aka the “Semantic Web”.

The goal of the Semantic Web is to make Internet data machine-readable.
-Semantic Web, Wikipedia

The Semantic Web is the web that the AI will use. In some ways, it’s already begun with MCP as a native protocol for AI to interact with the web. With the arrival of agentic browsing, the semantic web will naturally flourish as a meaning-making tool for AI to index the web in their own post-Google way.
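As a rough illustration of what "machine-readable" means in practice: many pages today already embed structured data as JSON-LD using the schema.org vocabulary. The sample document below is made up for this post, and the describe function is a hypothetical helper, but it shows how little work an agent has to do when the meaning is spelled out for it.

```python
import json

# Made-up JSON-LD snippet of the kind a page might embed in a <script> tag.
SAMPLE = """
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Dia and the Future of Agentic Browsing",
  "datePublished": "2025-06-15"
}
"""


def describe(jsonld: str) -> str:
    """Summarize a JSON-LD document without scraping or guessing at the layout."""
    data = json.loads(jsonld)
    return f'{data["@type"]}: {data["headline"]} ({data["datePublished"]})'


print(describe(SAMPLE))
```

Compare that with extracting the same facts from raw HTML, where the agent has to infer which text is the headline and which is the date.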

The Future Web will be Known.

The Future of Agentic Browsing

The experience of the web is likely to change much more drastically in the next 2 decades than it has in the previous 2. Whether experienced in VR or browsed for you by an agent, your web experience is going to look different.

Perhaps it’s having a personalized ‘Her’-like agent who passively browses the web for you, digging back up old articles and finding new connections, or missed connections between your interests. It does your boring work for you. Your Her agent digests the web for you and you view these digestions in forms far more varied than a mere chat interface. The agent learns and knows everything about you, and molds your web experience to you. But we’ll talk about that later.

Your agent talks to other agents, reflecting with each other. Sometimes it introduces you to people you easily become friends with, both agents facilitating mutual warm intros. It’s Empathy Technology (cf. Empathy Tech). Thanks Agent. The agent is happy you’re happy, and keeps digging up cool essays for you to read; it tells you just the highlights when you’re too busy, or reads them as a bedtime story when you’re winding down.

I’m wondering right now where the browser has gone in this equation. Perhaps we’ve gone from an ‘active browsing’ paradigm to one of ‘passive browsing’, where your day’s explorations become fuel for the background agent to parse through at night, or all the time, really. You never really stop browsing the web, but none of it requires effort on your part. You’re effortlessly up-to-date. Your agent sends interesting tidbits to your friends’ agents, helping make sure they’re up-to-date, too. This may well be the future of email: happening silently between agents without you involved.

There are so many more things these agents can do for/with you that I lack the capacity to imagine. Whatever this life is all for, and whatever the web is ultimately for, this agent will ergonomically adapt to meet our needs. Dia has already started down this path by creating the tools that we naturally reach for. I told you this paradigm had legs. The Browser Company guys could simply keep iterating on this and get somewhere really cool, until we evolve past the browser in a traditional sense. I get there in a couple of sections.

Your browsing will be a collaborative experience; there will be no more ‘browsing alone’ except in Incognito Windows, which will take on a new meaning as “please don’t show this to my AI lmao”. Make Browsing Fun Again enthusiasts will go nuts, in a good way (who those people are exactly, I couldn’t say, but I do think I am one of them).

The agent will help you refine your search queries without you asking, and searches will become parallelized; you say one thing but the agent will be able to tell what you’re *really* looking for, and won’t stop until it’s given you the absolute best results. You’re shocked and surprised. Sometimes you’ll be looking for something that doesn’t quite exist; see the next section, where I outline this unique web experience now unlocked by the transformer. In a sense, we’re already living it, starting with Dia browser.

A Generative Web Browsing Experience

It’s not just summarization. It’s: imagine if this video were 5 seconds, minutes, hours longer. Imagine if this subreddit were real: /r/catsdrivinghumanstowork. Or /r/creepystorieswithhappytwistendings, or something else hyper-specific that you might want to see. Imagine if lemonaut made a substack post about the Dark Side of Agentic Browsers. Turn this substack post into a video, and make Patrick Warburton narrate it please. Make Bryan Johnson join this Andrew Huberman video and have them debate it out. Now do it all without me asking, anticipating my needs.

An agentic browser can predict what type of content you’ll want and augment it to suit you, giving you a totally unique experience. Maybe the type of content you really want doesn’t exist, or it sits somewhere between a YouTube video on the cosmological origins of the universe and an episode of the Eric Andre Show. Or between SpongeBob and Rick and Morty?

These are things that are not present in any of the raw webpages, but are almost implied by the latent space between them, and are further made possible by the Semantic Web’s scaffolding. No matter: agentic browsers can create the type of content that’s too niche, too unlikely to exist. The future of the web is generative, and you can share it with your friends.

You will also be able to customize the interfaces of the websites you visit. Want to make your Twitter look like Reddit? Add dark mode to a website that doesn’t have it? Make your email inbox into some trippy, fun, or addictive interface? The future generative web allows you to see the web the way you want to see it.

The future of the web is an Xbox Live party through the absurd, through the interesting, the spectacular, and the impossible. What kind of headset will you need, I wonder?

An Immersive Web Browsing Experience

Probably a VR headset, or whatever form factor that looks like now. When VR is cheap, fast, impressive, easy, and productive, it will take off. I am fairly certain that the flat document-like webpages of the 00s will be superseded by a browsing experience that is more involved & stimulating. Specifically: more spatial.

We already perceive the digital world as inherently spatial (cf. Free-wheeling Hallucinations). In our minds, that's how we conceive of our digital experiences; as taking place somewhere. It’s the only way that makes sense to our brains.

The Matrix' Code Came From Sushi Recipes—but Which? | WIRED

Directionally, this spatialization of web browsing will feel like getting as close to ‘jacking in to the matrix’ (or the metaverse) as we can before the emergence of advanced BCI makes that a neurological reality. The liminal space between the websites now becomes the 3D space where you make your home, or your workspace. The things you save on the web have associated symbolic tokens (eg a 3D model of an orangutan driving a golf cart as a symbol of a beloved video), and they go in physical places in your home base. Collectors will feel wealthy with their mansion of web artifacts. There was never such a thing as a ‘Bookmarks’ folder—this ‘memory palace’ serves your memory much better. The spatialization of the web feels better, healthier even, allowing us to far better utilize the digital space. This is just one example of how spatialization can happen.

Source: https://en.wikipedia.org/wiki/Semantic_Web

Some other Web 4.0 possibilities:

Reddit becomes a legit Roman Forum where you can walk around, hear different conversations happening, and contribute your voice to them.
Pinterest becomes an art gallery. Research experiences become more efficiently-presented information streams.
4chan becomes something very odd and interesting, I can’t say. Shopping becomes very helpful.
Twitter is a physical city, with interesting corners you can hop between: there are billboards with funny quotes, protests, marathon philosophy discussions. VRChat was missing this realness, this purpose.
Wikipedia becomes a vast library, or a physically-explorable encyclopedia where history or complex systems play out live in front of you. Learning has never been so effective, or fun.
LinkedIn is still annoying, but less so, more like a global perpetual conference or networking event.

The headline is: Browsing becomes physical.

This was always better for the human. We already model the digital as physical; this helps our nervous system make better sense of it, interact more richly, have more fun, more expression, more embodied physicality. The computer nerds stop finding themselves sitting stuck in awkward immobilized chair postures. Memory of digital interactions (social and intellectual) improves drastically as the spatialization finally harnesses the brain’s naturally spatial memory.

Plus, you can invite your friends along to browse the web with you. Remember what I said about the web becoming like an Xbox Live party? Collaborative web experiences in VR will be rad. Imagine getting an invite to join your friend’s browsing experience when their agent detects they’re browsing something you haven’t seen that is directly relevant to your own interests. Like VRChat with purpose and meaning.

What happens next

Back to the present day: I’m glad Dia exists. It’s a browser rethink that’s long overdue, yet we should be hesitant to shower them with too much praise, lest they think they can relax and the work is done.

AI capabilities are accelerating rapidly, and if you couldn’t tell by this post, I am brimming with excitement about what we will be able to do very soon. Those days are not here yet. Dia is merely a portent of what’s to come. But while it’s not an iPhone moment, I think we are seeing the seeds being planted for one, right before our eyes.

Telos
