LLMs and a possible future for Search

Grigory Sapunov
7 min read · Dec 22, 2022

The recent surge of Generative AI has pushed the boundaries of what was previously thought possible.

This year, in particular, the progress made in image generation has been colossal, starting with the DALL·E 2 hype and continuing with Midjourney taking huge strides in this area. Stable Diffusion, the leading open-source image generation model, is constantly being updated to meet the demands of new applications. I personally like the idea of generating movie images as if they were shot by another director or in a different culture, say, “Star Wars” by Akira Kurosawa or a Bollywood adaptation.

“Star Wars” by Akira Kurosawa by Alex Grekov and Midjourney V4

An impressive recent feat of this technology has been the rise of the Lensa application, which has become virally popular in the last month. With the potential of Generative AI becoming ever more realized, it is likely that this technology will continue to have a huge impact on our lives. More applications and startups to come!

“Star wars” Bollywood adaptation by Alex Grekov and Midjourney

Right now, a similar phase transition is happening with text generative AI, a field older than image generation, which reached a mass audience with the GPT-3 publication.

The recent leader, ChatGPT, is a GPT-3 descendant: its current version is based on GPT-3.5 with the addition of Reinforcement Learning from Human Feedback (RLHF), as in the InstructGPT model.

ChatGPT is a revolutionary new technology that is quickly becoming essential for writers. It is capable of generating text not only in English but in many other human languages as well. It has even gone beyond human languages and can generate code in programming languages like Python (and the generated code often runs correctly).

ChatGPT generated a Python program for calculating 2D convolution. It’s not the best possible implementation, but a pretty good place to start.
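The generated program itself isn't reproduced here, so as an illustration, here is a minimal NumPy sketch of the kind of naive "valid"-mode 2D convolution ChatGPT tends to produce (this is my own sketch, not the model's actual output; function and variable names are illustrative):

```python
import numpy as np

def conv2d(image, kernel):
    """Naive 2D convolution ("valid" mode): flip the kernel, slide it
    over the image, and sum the elementwise products at each position."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    flipped = kernel[::-1, ::-1]  # true convolution flips the kernel
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * flipped)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.array([[1.0, 0.0], [0.0, -1.0]])
print(conv2d(image, kernel))
```

As the caption says, the double loop is far from the fastest implementation (a real one would use `scipy.signal.convolve2d` or an FFT), but it is a perfectly good place to start.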

Even the earlier versions of GPT were able to generate quality texts. We have our own case of preparing a product announcement, for which we did A/B testing. One text was written by a copywriter; we spent a week explaining what we needed, and a decent amount of money. The other text was generated by one of the recent GPT-3 models. Surprisingly, the CTR for opening the letter and clicking on the link was 37x higher for the AI-generated text.

The newest ChatGPT looks even better (and we’re still waiting for GPT-4 to be announced; there are too many rumors about it). Recently, a man generated and published a children’s book on Amazon in just a weekend, using Midjourney for images and ChatGPT for text. Another interesting case of text generation is Dramatron by DeepMind (it uses DeepMind’s own models, not GPT-3), a system designed to co-write theatre scripts and screenplays. ChatGPT and similar models are quickly becoming a go-to tool for writers looking to increase their productivity and quality.

Interestingly, in some cases, such a model can replace an internet search engine (a recent article in The New York Times says that Google takes this possibility seriously).

This chat session is a good replacement for an internet search for me. The search engine didn’t help me find what I was looking for, but this chat finally gave me the correct idea, though with the wrong author :) I was looking for the “Avogadro Corp” story, which is the first book of the Singularity Series, and the “Singularity Series” was the second suggestion.

There are still cases where the model gets things wrong.

First of all, the model can generate false statements and produce garbage (usually pretty beautiful garbage). In the example above, ChatGPT made up its own version of what “The Three-Body Problem” is about. Search engines can produce wrong results as well, especially when indexing poor-quality sites and/or when malicious actors try to influence them. While that problem can be mitigated by controlling source quality, this is not enough for GPT-like models: even with a fully correct and curated training dataset, the model can still produce incorrect answers. Such models will require a separate model to estimate the quality of the answer, or some other tools for fact-checking and post-processing. There is already a set of APIs for toxicity detection and related issues (for example, the Perspective API from Google or the Moderation API from OpenAI), but detecting truthfulness, correctness, and, more generally, answer quality will require a separate set of tools.

Second, the model may simply not know something, either because it is a specialized domain the model wasn’t trained on (say, a narrow topic in surgery or organic chemistry) or because the world has changed since training (another person became the president of the US or, an even more frequent case, the prime minister of the UK; a scientific breakthrough just happened; and so on). For this class of problems, there is already a solution: retrieval-based (or retrieval-augmented) models, which can access external storage or even internet search. One instance of this class, OpenAI’s WebGPT model, used a text-based browser to search the internet. There have been other significant results in this field during the last few years: RETRO from DeepMind, REALM and LaMDA from Google, and RAG and Atlas from Meta, to name a few. I’d expect the next breakthrough version of GPT, say SearchGPT, to use this technology.
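The retrieval-augmented recipe is conceptually simple: find the documents most relevant to the query in an external store and prepend them to the prompt, so the model answers from fresh text rather than from its frozen weights. A toy sketch of that pipeline (bag-of-words cosine similarity stands in for the dense-vector retrievers these systems actually use; all names and documents here are illustrative):

```python
import math
import re
from collections import Counter

def tokens(text):
    """Lowercase word tokens; a stand-in for a real embedding model."""
    return re.findall(r"[a-z0-9]+", text.lower())

def similarity(a, b):
    """Cosine similarity over bag-of-words counts."""
    va, vb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(va[w] * vb[w] for w in va)
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    return sorted(documents, key=lambda d: similarity(query, d), reverse=True)[:k]

def build_prompt(query, documents, k=2):
    """Prepend retrieved context so the language model can answer
    from up-to-date text instead of stale training data."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

docs = [
    "Rishi Sunak became prime minister of the UK in October 2022.",
    "The Three-Body Problem is a science fiction novel by Liu Cixin.",
    "Avogadro Corp is the first book of the Singularity Series.",
]
print(build_prompt("Who is the prime minister of the UK?", docs))
```

Replacing the word-count retriever with a learned dense index (and the `print` with a call to the language model) is, in essence, what WebGPT, RETRO, REALM, RAG, and Atlas do at scale.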

A user search session on a web search engine may become much more dialogue-like than it is today. Today’s query language is more like a separate language with its own grammar, even when it looks like English or any other language. The query structure is not the same as a typical sentence structure, and these search languages can be treated as separate dialects (even without special query operators). The query language of the near future could be much more natural.

Looking a little further into the future, there is an interesting research direction toward performing actions. The whole field of reinforcement learning is about taking actions in an environment to obtain as much reward as possible, but there are approaches much closer to large language models (LLMs). In fact, even WebGPT was a model with actions: it could make search queries to the Bing API, click on links, scroll the page up and down, search a page for particular text, and so on.

Actions the WebGPT model can take. Source: https://arxiv.org/abs/2112.09332
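Expressed as code, such an action interface can be as simple as a small text-command language: the model emits commands, and a browser environment executes them. The sketch below is illustrative only (the verbs loosely follow the paper's action list, and the handlers are stubs that record what a real environment would do):

```python
def make_env():
    """A toy WebGPT-style environment: parses model-emitted text
    commands and logs the browser operation each one would trigger."""
    log = []
    handlers = {
        "search": lambda arg: log.append(f"issue search query: {arg}"),
        "click":  lambda arg: log.append(f"open link #{arg}"),
        "scroll": lambda arg: log.append(f"scroll {arg}"),
        "find":   lambda arg: log.append(f"find in page: {arg}"),
        "quote":  lambda arg: log.append(f"save quote: {arg}"),
        "back":   lambda arg: log.append("go back"),
        "answer": lambda arg: log.append("compose final answer"),
    }
    def step(command):
        """Execute one command of the form 'verb: argument'."""
        verb, _, arg = command.partition(":")
        verb = verb.strip().lower()
        if verb not in handlers:
            raise ValueError(f"unknown action: {verb}")
        handlers[verb](arg.strip())
    return step, log

step, log = make_env()
for cmd in ["search: avogadro corp novel", "click: 2",
            "scroll: down", "quote: the first book of the series", "answer:"]:
    step(cmd)
print("\n".join(log))
```

The hard part, of course, is not the dispatcher but training the model to emit a useful sequence of such commands, which is what the RLHF pipeline in the paper is for.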

A transformer for actions, like ACT-1 (created, by the way, by a company co-founded by authors of the original paper on the transformer architecture), is a possible next step. It can use a web browser to click, type, scroll, and interact with the UI elements available on a page. With just one sentence, you can achieve what would otherwise require ten or more clicks.

Some foundation models (like Codex or even ChatGPT) can generate code to interact with different APIs. Merging Codex, ChatGPT, and WebGPT is around the corner, and it is possible that such merged models, rather than junior programmers, will become the most active users of sites like StackOverflow 🤔. At the moment, StackOverflow has the opposite problem: it introduced a temporary ban on ChatGPT-generated answers (because “the average rate of getting correct answers from ChatGPT is too low”). They take care of dataset quality 😉

Generated by Midjourney: “a computer program that writes code, searches internet and solves problems instead of a junior programmer; art style”

I believe this future is not too distant. Actually, I’ll be surprised if something like this merged model doesn’t appear next year. The more distant future is a separate interesting topic on its own.

Generative AI is already disrupting many workflows in the present day, and the disruption is only going to increase as more useful products and applications emerge. Venture Capital firms have taken note and are investing heavily in this field, signaling that the future of generative AI is very bright indeed. Here is a great post by Bessemer Venture Partners, a beautiful post by Sequoia, and a selection of posts by a16z. Together with Flint Capital, we wrote a post about the ethical implications of generative AI.

We are only beginning to scratch the surface of the potential of generative AI, and it is likely that we will see many innovative products come out of the technology in the coming years. For now, the disruption is already here, and we can only imagine the possibilities for the future.

Author: Grigory Sapunov
Director of Photography: Midjourney
Writers: GPT-3.5, ChatGPT
Translators: Intento Translation Portal, Google Cloud Translation
Editors: Grammarly


Grigory Sapunov

ML/DL/AI expert. Software engineer with 20+ years programming experience. Loves Life Sciences. CTO and co-Founder of Intento. Google Developer Expert in ML.