GPT Chat’s Path to Profit: Web Search

Consumer-facing GPT chatbots will inevitably start to monetize their products. Cost pressures from burning through cash and the high costs of inference may accelerate this change. But even without the cost pressure, it seems unlikely that tech companies will leave any potential profits on the table. The standard game plan for tech companies providing “free” services to consumers is to make money through data harvesting and advertising.

The data harvesting lever is already in place. The conversational nature of GPT chat interactions allows for a particularly rich data set. User intent, beliefs, and preferences are much more explicit in chat interactions than in, say, keyword searches. Social media interaction data is probably the closest parallel, and we already know how valuable that data is. Because users consider a chat session private, or at least more private than posting on social media, those interactions will likely provide even more detailed personal information about users. The more personal the data, the more data brokers will pay for it. Not only can companies sell the usage data, they can also use it to maximize other income channels.

Let’s pause here and pose the question, “What exactly is the product for GPT chat services?” The value proposition has never really been defined; in fact, it has been left to users to figure out what the value will be. Users want a coding assistant, a doctor, a friend, and many other things. But the overwhelming use case is search. I would also broaden the common definition of search to cover the wider field of information retrieval. GPT chatbots retrieve information by translating user intent into search queries and then merging and formatting the results into a unified response.

Web search is a massively valuable market, and capturing even a relatively small percentage of it could be enough to make GPT-driven chatbots profitable. Google operates an effective monopoly on the web search business. This total capture may, in fact, be what contributes to the current opportunity for GPT chat to steal market share. Growth is the key driver for tech companies to meet investor expectations. Google faced the challenge of maintaining revenue growth when there were no more users left to acquire. Because of its dominant position in search, Google had the option to make the experience worse without worrying about losing users. This is exactly what it elected to do in order to show more ads and increase revenue. If you have to look through more pages of search results to find what you are looking for, then Google gets to show you more ads.

Google launched with a very simple and minimalist interface that provided search results that were far more accurate than those of its competitors. Both minimalism and accuracy have declined over time. ChatGPT is appealing as a search replacement because of its minimalist style. ChatGPT does a better job of showing you the content that you care about while filtering out noise. Content creators often monetize their content through their own advertising. While this is reasonable, it can lead to a cluttered and frustrating user experience when navigating content. If Google ranks ad-laden websites in its top results, then users will need to parse through those ads to find the actual information that they are seeking. If a GPT chatbot cuts out the ads and simply gives the user the valuable information they seek, it is a better user experience. Yes, this breaks the current monetization model for content creators, but that is a problem for another day.

Google rose to dominance because their search results were better than everyone else’s. This was due to their use of the PageRank algorithm, which is a form of authority weighting. Websites that are linked to more frequently, especially if they are linked to from popular websites, are likely to be better results than websites that are not. PageRank is relatively straightforward to implement and can be calculated at indexing time. That means that search is fast.
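To make the authority-weighting idea concrete, here is a minimal sketch of the textbook PageRank iteration in Python. It is an illustration on a made-up four-page web, not Google's production system; the function name `pagerank`, the `toy_web` data, and the damping factor of 0.85 are assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Textbook PageRank by repeated redistribution of rank.

    `links` maps each page to the list of pages it links to.
    All names and defaults here are illustrative assumptions.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with rank spread evenly across all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page keeps a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over everyone.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy web: three pages link to "hub", nothing links to "orphan".
toy_web = {
    "blog": ["hub"],
    "news": ["hub", "blog"],
    "hub": ["news"],
    "orphan": ["hub"],
}
ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: item[1], reverse=True):
    print(f"{page}: {score:.3f}")
```

In this toy web, "hub" ends up with the highest score because it collects the most inbound links, and "orphan" ends up lowest because nothing links to it. The scores depend only on the link graph, not on the query, which is why they can be computed ahead of time.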

PageRank is a heuristic to determine authoritative sources. It is relatively easy to calculate and, in practice, helps produce good search results. But it doesn’t cover every scenario. If an expert in a particular field creates web content that doesn’t have a particularly high PageRank, their subject matter expertise may get buried under more popular, but less expert content. In an ideal world, an expert would review search results and determine what was best based on the actual content contained in the websites. Having a human expert perform this task doesn’t scale to indexing the web. But we could now implement an authority assessment with reasoning models. It might work or it might not. My point is that there is more than one way to judge an authoritative source.

Search result ranking matters a lot when you have a traditional search results interface like Google. Most users will only consider the top results. If you are wrapping your search results inside a chatbot interface that summarizes the content, the results ranking may be less critical to success. The end user may never see a ranked list of search results; they simply see a response from the chatbot. This means that you can do results analysis on the client side as well. Let’s say the chat interface runs a search or several searches that return a hundred different websites. It can look through all of those results and decide which ones to add to its summary.
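As a rough sketch of that last step, the Python below takes a batch of candidate results and picks a few to feed into the summary. The scoring is plain word overlap between the query and each snippet, which is only a stand-in for whatever judgment a real chatbot would apply; the function names, the result fields, and the example URLs are all hypothetical.

```python
import re


def tokenize(text):
    """Lowercase a string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def select_for_summary(query, results, k=3):
    """Pick up to k results whose snippets share the most words with the query.

    Word overlap is a deliberately crude placeholder for the analysis
    a chatbot could run over the full set of returned pages.
    """
    query_terms = tokenize(query)
    scored = []
    for result in results:
        overlap = len(query_terms & tokenize(result["snippet"]))
        scored.append((overlap, result))
    # sorted() is stable, so ties keep the search engine's original order.
    scored = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [result for overlap, result in scored[:k] if overlap > 0]


# Hypothetical candidates, as if merged from several searches.
candidates = [
    {"url": "https://example.com/a", "snippet": "How to repot a tomato plant without damaging roots"},
    {"url": "https://example.com/b", "snippet": "Best pizza places downtown"},
    {"url": "https://example.com/c", "snippet": "Tomato plant care: watering and repotting schedule"},
]
chosen = select_for_summary("repot tomato plant", candidates, k=2)
print([result["url"] for result in chosen])
```

The point of the sketch is that the order in which the search engine returned the candidates matters less here: the selection step looks at every candidate before deciding what the user will see.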

User expectations matter and impact what technical options you have available when designing an information retrieval system. Users expect a traditional web search to be fast and extremely accurate on the first attempt. GenAI chatbots are conversational and multi-turn by nature. Users understand that they may need to add more context to get to their desired results. Natural language can often be both well-formed and ambiguous at the same time. Search has to deal with this as well. You may notice a “Did you mean?” suggestion or even a note that the actual search performed was different than your original request. Disambiguation in search is rudimentary. Search transactions are stateless and don’t build context throughout the interaction. Search wrapped in a GenAI, on the other hand, has as many turns as needed to ask the user questions that will help refine the intent of their search.

Web search isn’t personalized; a search query by you will likely return the same results as it will for me. You could argue that this is a desirable feature that makes the search more objective. It also fits into the narrative of search as a public service for information retrieval. While there is a lot of value in information retrieval as a public service, the reality is that web search is run by for-profit corporations. People like content that agrees with their perspectives on the world. Search running behind GenAI chat could personalize how the results are presented to the user. That mirrors how data collected from social media interactions influences what content users see. While you and I might see the same top one hundred search results, our individual profiles within a GenAI chat environment could influence which of those results get picked for the summary. This would create another echo chamber of self-affirming content. As long as users are engaging, that’s all that matters. This can then become a virtuous (so to speak) cycle of data collection and increasing personalization, which builds a more valuable data profile on users.
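Here is a hypothetical sketch of how that selection could work, in Python. Nothing here describes a real vendor's system: the `relevance` scores, the `tags`, the user profiles, and the `personalization` weight are all invented to show how the same candidate list can yield different summaries for different users.

```python
def pick_for_user(results, profile, k=2, personalization=0.5):
    """Blend search relevance with a user's affinity for each result's tags.

    `profile` maps a tag to how strongly the user favors it (0.0 to 1.0).
    `personalization` of 0.0 ignores the profile entirely.
    All fields and weights are illustrative assumptions.
    """
    def score(result):
        affinity = sum(profile.get(tag, 0.0) for tag in result["tags"])
        return (1.0 - personalization) * result["relevance"] + personalization * affinity

    return sorted(results, key=score, reverse=True)[:k]


# The same candidates are available to both users.
results = [
    {"title": "neutral-overview", "relevance": 0.9, "tags": ["neutral"]},
    {"title": "pro-view", "relevance": 0.8, "tags": ["pro"]},
    {"title": "con-view", "relevance": 0.8, "tags": ["con"]},
]

# Hypothetical profiles built from earlier chat history.
user_a = {"pro": 1.0}
user_b = {"con": 1.0}

picks_a = [r["title"] for r in pick_for_user(results, user_a)]
picks_b = [r["title"] for r in pick_for_user(results, user_b)]
print(picks_a)
print(picks_b)
```

With the personalization weight at 0.5, each user's summary leads with the page that matches their existing view, and the opposing page never makes the cut. Setting the weight to 0.0 gives both users the same picks.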

Ok, now the companies behind GPT chatbots can start to make some sweet moolah. We have a bunch of personalized user data, which we are probably already selling. The next step is setting up an advertising platform on top of our chatbot. We will explore this in part two.