retroanime style, two monsters fighting like their dancing. colorful (Playgroundai.com)

Why does ChatGPT fascinate people so much? Why are people so excited?

These are the questions that I have been trying to answer for the last two months.

Having studied and worked on NLP for a long time, I saw ChatGPT as merely a new version of the language models that have been developing for a while (for more than two decades, actually), but of course with much larger training data and model size, resulting in much better response quality. It is still amazing to try it out myself, but, knowing how it works and its limitations, I wasn’t too hyped.

I realized I was suffering from the “curse of knowledge”, a serious cognitive bias that was hindering me from answering those questions: when I read headlines like “ChatGPT will change everything!”, I wasn’t buying it.

To overcome this, I started talking to friends without an NLP background, joining discussion groups, and reading more non-technical articles to gain more perspectives. These efforts have definitely been helpful.

I don’t think ChatGPT will change everything “today”, but it will gradually change everything over the next decade or more. I recently read a book called “The Innovator's Dilemma” and found out that new technologies are usually born in great firms’ research labs (note: guess where transformers, the building block of GPT, were invented?), but fail to get adopted because they are not “good enough” compared to the state of the art at the time. Will history repeat itself?

In this post, I want to discuss my thoughts on how ChatGPT will change the future of search and the web ecosystem. It’s a huge topic, but as someone who works for a search company and writes web content, I could not NOT think about this.

* Disclaimer: The ideas in my blog are my own and do not represent my employer, Google. I only use public information to write.

This post is also posted in Korean.

What aspect of ChatGPT fascinates people?

When search engines first came out, it was fascinating just to get blue links to websites with relevant content. But now this is not new at all. People want answers right away, rather than scrolling through websites.

This trend is reflected in people using Reddit to ask questions in a community (and also using Google to search Reddit). People don’t want to surf the web anymore. According to some, “It’s full of trash and ads.” They want answers right away.

Google has already been trying to get answers to users faster by introducing features like snippets, which show part of the information from a web page before users click on it. Google is also using its Knowledge Graph to show factual answers inside snippets.

answers shown in the snippets (top box)

While Google's snippets present well-organized information, a Chat+Search interface is much more convenient when the query becomes more complex (assuming the answer it returns is correct). It’s basically like asking a smart friend or assistant. I’ve also been using my Google Assistant at home, and when it works, it is very, very nice. (The problem is that when it doesn’t, it is not.)

Indeed, the conversational interface is convenient, and it is the ultimate interface we humans have been using since language was invented. Forget books, radio, GUI, etc. Conversation is THE most natural interface we are used to. ChatGPT is fascinating because it seems like the first computer to reach our level!

I believe that the Chat+Search interface is an inevitable direction, because it is the most natural human-computer interface.

However, the change will not happen all at once; it will evolve gradually.

But what about the web?

ChatGPT, Google Bard, or any other large language model is basically trained on web text data and then reinforced with additional task-specific data. These models rely on the web the way fish need water to survive. And what is the web? The web is a conglomerate of content creators like myself who generate knowledge and share it with the world.

Let’s think about the incentives for creators to produce content:

1. Direct Monetization

Platforms like YouTube, Instagram, and TikTok share advertising revenue with creators. This has been the most important factor in their rapid growth. For the general web, Google AdSense achieves the same thing. For example, bloggers can get compensated by placing Google AdSense ads on their blogs.

2. Secondary Monetization

The content may be associated with another service that people may pay for. For example, a clinic posts medical advice, or an outdoor e-commerce business writes camping tips.

3. Self-fulfillment, self-promotion

Not everything is related to money. People are also motivated by non-monetary incentives like a sense of self-fulfillment and self-promotion: put simply, “attention”. For example, I feel happy when someone likes my post or leaves a comment with feedback (yeah, go for it!).

--

The key here is traffic. How many people visit your site?

While my Korean newsletter <Weekly NLP> has 4000 email subscribers, 78% of the traffic came from organic search in 2023. (This is a huge jump from 64% in 2022, probably due to ChatGPT.)

Google Analytics of jiho-ml.com

I am 100% sure that ChatGPT & Bard are using my blog posts as training data, so they will answer questions related to NLP based on my and other people’s content*. If this happens, fewer and fewer people will visit my blog, lowering my incentive to write.

A lower incentive means fewer human writers on the web. Fewer writers and less content mean a lower-quality web.

While a bit off-topic, I wonder how I would feel or act when a Chat+Search answer is based on my writing. I admit that even my posts are not truly original, since they reference other papers, tutorials, etc., so I am not free from using other people’s content as training data, but I still try to include citations.

Search Giant’s Dilemma

The search giant (we all know who we are talking about here) has a huge dilemma. The company has a strong incentive to maintain a healthy web ecosystem, because the better the quality of the web, the better the quality of the search engine.

The better the quality of the search engine, the more people use it and consume the web content it links to. Ads appear on both the search engine and the individual content pages (via AdSense).

I do not believe the search giant will throw away its existing search and replace it with a Chat interface; rather, it will try to augment it. Even so, a significant amount of traffic may be lost, hurting the web ecosystem and its ads business.

Choosing not to change is not an option either, as Bing+ChatGPT has already entered the market. ChatGPT-enhanced Bing could actually eat a good portion of the market share.

“I hope that with our innovation they will come out and show that they can dance” - Microsoft CEO Satya Nadella

The search giant has already started dancing (quite hastily, unfortunately). The stock market showed that it has much more to lose than its competitor. While I am not a business expert, anyone can see that by looking at the revenue and profit streams of each company.

VisualCapitalist, How Big tech Revenue and Profit Breaks down, by company

Microsoft has given the search giant a dilemma: come out and innovate! But be careful, it may destroy “your” core business! Then you won’t have the money to fund your competition against our Office (<=>Workspace), Azure (<=>GCP), etc.

This reminds me of how Spotify affected the music ecosystem (which I learned about through the Netflix series The Playlist). The traditional record companies had no choice but to come out and dance with the streaming technology. Did it make the creators and the listeners better off?

Interesting, but what will it mean for users? In the short term, finding knowledge will be easier than ever. However, in the long run, in a dystopian world of Chat+Search, web content may dry up. In that world, the content you see on the open web will most likely be written by language models, not based on real experience and a review of facts.

<45 ways to find purpose of life>, authored by ChatGPT in Korean

Conversely, language-model-based writing tools like ChatGPT, Translate, Grammarly, etc., may increase productivity, resulting in more high-quality content on the web. Who knows?


In today’s post I shared my thoughts on ChatGPT’s influence on search and the web ecosystem. Predicting the future is almost impossible, so I am likely to be wrong.

But I know for sure that the change will not happen in the short term. There are still a lot of hurdles that large language models need to overcome, the most challenging one being factuality. Also, I’m not sure how the economics of Chat+Search will work. Running such large models on thousands of GPUs for everyday search still sounds insanely expensive. And this is not even considering the huge retraining cost to reflect up-to-date facts and error fixes.

Nevertheless, change will happen slowly, for better or worse. Remember that small computing devices like PDAs came out in the 1990s, yet it took 20-30 years to get to where we are now with smartphones.