April 20, 2024

5 questions answered about the OpenAI search engine

OpenAI is reportedly working on a search engine that would directly challenge Google. But details missing from the report raise questions about whether OpenAI is creating a standalone search engine or whether there is another reason for the announcement.

OpenAI Web Search Report

The report, published in The Information, says that OpenAI is developing a web search product that will compete directly with Google. A key detail of the report is that it will be powered in part by Bing, Microsoft’s search engine. Beyond that, there are no details, not even whether it will be a standalone search engine or integrated into ChatGPT.

All reports indicate that it will be a direct challenge to Google, so let’s start there.

1. Is OpenAI presenting a challenge to Google?

OpenAI is said to be using Bing Search as part of the rumored search engine: a combination of GPT-4 and Bing Search, plus something in between to coordinate the two.

In that scenario, OpenAI is not developing its own search indexing technology. It is using Bing’s.

What is left for OpenAI to do, then, is to figure out how the search interface interacts with GPT-4 and Bing.

And that’s a problem Bing has already solved using what Microsoft calls the orchestration layer. Bing Chat uses retrieval augmented generation (RAG) to improve responses by adding web search data as context for the responses GPT-4 creates. (To learn more about how orchestration and RAG work, watch the keynote by Kevin Scott, Microsoft’s CTO, at the Microsoft Build 2023 event, starting at 31:45.)
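The orchestration pattern described above can be sketched in a few lines of Python. This is only an illustration of the general RAG flow, not Microsoft’s or OpenAI’s actual implementation: `search_web`, `build_prompt`, `generate`, and `answer_with_rag` are hypothetical names, the in-memory index stands in for a real search backend, and the model call is a stub.

```python
import re

# Hypothetical toy index standing in for a web search backend (assumption).
TOY_INDEX = {
    "https://example.com/rag": "Retrieval augmented generation adds search results to the prompt.",
    "https://example.com/bing": "Bing Chat grounds GPT-4 responses in web search data.",
    "https://example.com/cats": "Cats sleep for most of the day.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search_web(query, top_k=2):
    """Stub search: rank pages by how many query words they share."""
    query_tokens = tokenize(query)
    scored = [
        (len(query_tokens & tokenize(text)), url, text)
        for url, text in TOY_INDEX.items()
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(url, text) for overlap, url, text in scored[:top_k] if overlap > 0]


def build_prompt(query, results):
    """Orchestration step: place the retrieved snippets ahead of the question."""
    context = "\n".join(
        f"[{i + 1}] {text} ({url})" for i, (url, text) in enumerate(results)
    )
    return f"Answer using only these sources:\n{context}\n\nQuestion: {query}"


def generate(prompt):
    """Stub model call: a real system would send the prompt to a language model."""
    return "MODEL OUTPUT for prompt:\n" + prompt


def answer_with_rag(query):
    """Retrieve first, then generate with the retrieved text as context."""
    results = search_web(query)
    return generate(build_prompt(query, results))


print(answer_with_rag("How does Bing Chat use search data?"))
```

The point of the sketch is the order of operations: the search step runs first, and the model only sees the question together with the retrieved snippets.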

If OpenAI is mounting a challenge to Google Search, what exactly is left for OpenAI to do that Microsoft isn’t already doing with Bing Chat? Bing is a mature search technology backed by years of experience, and that is experience OpenAI does not have.

Is OpenAI challenging Google? A more plausible answer is that Bing is challenging Google through OpenAI as a proxy.

2. Does OpenAI have the momentum to challenge Google?

ChatGPT is the fastest-growing app of all time, with around 180 million users today, having achieved in two months what took Facebook and Twitter years.

However, despite that momentum, Google’s lead is a steep hill for OpenAI to climb. Consider that Google has approximately 3 to 4 billion users worldwide, dwarfing OpenAI’s 180 million.

Assuming that OpenAI’s 180 million users each performed an average of 4 searches per day, that would amount to 720 million searches per day.

Statista estimates that there are 6.3 million Google searches per minute, which is equivalent to more than 9 billion searches per day.
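The comparison is easier to see with the arithmetic written out. The short Python sketch below uses only the figures quoted above (the 4 searches per user per day is the article’s own assumption), and the variable names are just illustrative.

```python
# Back-of-the-envelope comparison using the figures quoted in the article.
CHATGPT_USERS = 180_000_000              # ChatGPT users cited above
SEARCHES_PER_USER_PER_DAY = 4            # the article's assumption
GOOGLE_SEARCHES_PER_MINUTE = 6_300_000   # Statista estimate cited above
MINUTES_PER_DAY = 24 * 60

openai_daily = CHATGPT_USERS * SEARCHES_PER_USER_PER_DAY
google_daily = GOOGLE_SEARCHES_PER_MINUTE * MINUTES_PER_DAY
share = openai_daily / google_daily

print(f"Hypothetical OpenAI searches per day: {openai_daily:,}")   # 720,000,000
print(f"Estimated Google searches per day:    {google_daily:,}")   # 9,072,000,000
print(f"OpenAI volume as a share of Google's: {share:.1%}")        # 7.9%
```

Even under that generous assumption, the hypothetical OpenAI volume comes to roughly 8% of Google’s estimated daily searches.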

If OpenAI wants to compete, it will have to offer a useful product with a compelling reason to use it. For example, Google and Apple have a captive audience in the mobile device ecosystem, which embeds them in the daily lives of their users, both at work and at home. Clearly, creating a search engine is not enough on its own to compete.

Realistically, how can OpenAI achieve that level of ubiquity and utility?

OpenAI faces an uphill battle not only against Google but also against Microsoft and Apple. If we count Internet of Things apps and devices, Amazon joins that list of competitors that already have a presence in the daily lives of billions of users.

OpenAI doesn’t have the momentum to launch a search engine that competes with Google because it doesn’t have the ecosystem to support integration into users’ lives.

3. OpenAI lacks experience in information retrieval

Search is formally known as information retrieval (IR) in research papers and patents. A search of the arXiv.org research paper repository turns up no papers by OpenAI researchers related to information retrieval. The same can be said of a search for patents related to IR. OpenAI’s own list of research papers also lacks IR-related studies.

It’s not that OpenAI is being secretive. OpenAI has a long history of publishing research papers on the technologies it is developing. Its research on IR simply does not exist. So if OpenAI really plans to challenge Google, where is the smoke from that fire?

It’s reasonable to assume that search is not something OpenAI is developing right now. There are no signs that it is even flirting with creating a search engine. There is nothing there.

4. Is the OpenAI search engine a Microsoft project?

There is substantial evidence that Microsoft is heavily researching how to use LLMs as part of a search engine.

All of the following research papers are classified as belonging to the fields of information retrieval (also known as search), artificial intelligence, and natural language computing.

Here are some research papers from 2024:

Enhancing Human Annotation: Leveraging Large Language Models and Efficient Batch Processing
This research paper is about using AI to classify search queries.

Structured Entity Extraction Using Large Language Models
This research paper describes a way to extract structured information from unstructured text (such as web pages). It’s like converting a web page (unstructured data) into a machine-understandable format (structured data).

Improving Text Embeddings with Large Language Models
This research paper discusses a way to obtain high-quality text embeddings that can be used for information retrieval (IR). Text embeddings are representations of text that algorithms can use to understand semantic meanings and the relationships between words.

The research paper explains how text embeddings are used:

“Text embeddings are vector representations of natural language that encode its semantic information. They are widely used in various natural language processing (NLP) tasks such as information retrieval (IR), question answering, etc. In the field of IR, first-stage retrieval often relies on text embeddings to efficiently retrieve a small set of candidate documents from a large-scale corpus using approximate nearest neighbor search techniques.”
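To make the quoted idea of first-stage retrieval concrete, here is a minimal Python sketch. It is a simplification under stated assumptions, not the method from the paper: the `embed` function is a toy word-count vector rather than a learned embedding, the search is an exact brute-force scan rather than an approximate nearest neighbor index, and the corpus and function names are made up for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: a sparse word-count vector.
    Real systems use learned dense vectors from a model (assumption noted above)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, corpus, top_k=2):
    """First-stage retrieval: return the top_k documents closest to the query.
    This scans every document; an ANN index would approximate it at scale."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(doc)), doc) for doc in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_k]]


# Hypothetical four-document corpus.
corpus = [
    "How to bake sourdough bread at home",
    "Text embeddings for information retrieval",
    "Approximate nearest neighbor search over embeddings",
    "Weekend weather forecast for Seattle",
]

candidates = retrieve("embeddings for search and retrieval", corpus)
print(candidates)
```

The small candidate set returned here is what a later, more expensive ranking stage (or a language model) would then work with.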

Microsoft has published more search-related research, but these are the papers that specifically combine search with large language models (like GPT-4).

Following the trail of breadcrumbs leads directly to Microsoft as the technology powering whatever search engine OpenAI is supposedly planning… if that rumor is true.

5. Is the rumor meant to steal attention from Gemini?

The rumor that OpenAI is launching a competing search engine was published on February 14. The next day, February 15, Google announced the release of Gemini 1.5, after announcing Gemini Advanced on February 8.

Is it a coincidence that the OpenAI news completely overshadowed the Gemini announcement that came the next day? The timing is incredible.

At this point the OpenAI search engine is just a rumor.

Featured image from Shutterstock/rafapress
