Since the introduction of generative AI, large language models (LLMs) have conquered the world and found their way into search engines.
But is it possible to proactively influence AI performance via large language model optimization (LLMO) or generative AI optimization (GAIO)?
This article discusses the evolving landscape of SEO and the uncertain future of LLM optimization in AI-powered search engines, with insights from data science experts.
GAIO aims to help companies position their brands and products prominently in the outputs of leading LLMs, such as GPT and Google Bard, as these models can influence many future purchase decisions.
For example, if you search Bing Chat for the best running shoes for a 96-kilogram runner who runs 20 kilometers per week, Brooks, Saucony, Hoka and New Balance shoes will be suggested.
When you ask Bing Chat for safe, family-friendly cars that are big enough for shopping and travel, it suggests Kia, Toyota, Hyundai and Chevrolet models.
Potential methods such as LLM optimization aim to have certain brands and products preferred in the answers to such transaction-oriented questions.
Suggestions from Bing Chat and other generative AI tools are always contextual. The AI mostly uses neutral secondary sources such as trade magazines, news sites, association and public institution websites, and blogs as a source for recommendations.
The output of generative AI is based on statistical frequencies. The more often words appear in sequence in the source data, the more likely the model is to reproduce that sequence in its output.
Words that are frequently mentioned together in the training data are statistically more similar or semantically more closely related.
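The frequency idea can be illustrated with a deliberately simplified sketch. It counts which word follows which in a tiny invented corpus and turns the counts into probabilities. This is a plain bigram model, not how a transformer works internally; the corpus and the helper name `next_word_probabilities` are assumptions made up for this illustration.

```python
from collections import Counter, defaultdict

# Toy corpus; the sentences are invented for illustration only.
corpus = [
    "the suburban is a safe family car",
    "the suburban is a safe choice",
    "the suburban is a large suv",
]

# Count how often each word is directly followed by each other word.
follow_counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for current, following in zip(words, words[1:]):
        follow_counts[current][following] += 1

def next_word_probabilities(word):
    """Return the relative frequency of each word seen directly after `word`."""
    counts = follow_counts[word]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

probs = next_word_probabilities("a")
most_likely = max(probs, key=probs.get)
print(probs, most_likely)
```

In this toy data, "safe" follows "a" more often than "large" does, so it receives the higher probability. Real LLMs learn far richer patterns, but the intuition that frequent sequences become likely outputs is the same.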
Which brands and products are mentioned in a certain context can be explained by the way LLMs work.
Modern transformer-based LLMs such as GPT or Bard are based on a statistical analysis of the co-occurrence of tokens or words.
To do this, texts and data are broken down into tokens for machine processing and positioned in semantic spaces using vectors. Vectors can also represent whole words (Word2Vec), entities (Node2Vec), and attributes.
In semantics, the semantic space is also described as an ontology. Since LLMs rely more on statistics than semantics, they are not ontologies. However, the AI gets closer to semantic understanding due to the amount of data.
Semantic proximity can be measured in semantic space by the Euclidean distance between two vectors or by their cosine similarity, that is, the cosine of the angle between them.
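Both measures are easy to compute by hand. The following sketch uses three invented three-dimensional vectors; real embeddings have hundreds or thousands of dimensions, and the numbers here are assumptions chosen only to make the contrast visible.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two vectors (smaller means closer)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (closer to 1 means more similar)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, invented for this example.
vectors = {
    "suburban": [0.9, 0.8, 0.1],
    "family": [0.8, 0.9, 0.2],
    "racing": [0.1, 0.2, 0.9],
}

cos_family = cosine_similarity(vectors["suburban"], vectors["family"])
cos_racing = cosine_similarity(vectors["suburban"], vectors["racing"])
dist_family = euclidean_distance(vectors["suburban"], vectors["family"])
dist_racing = euclidean_distance(vectors["suburban"], vectors["racing"])

print(f"cosine:    family={cos_family:.3f}  racing={cos_racing:.3f}")
print(f"euclidean: family={dist_family:.3f}  racing={dist_racing:.3f}")
```

With these made-up vectors, "suburban" sits closer to "family" than to "racing" under both measures: the cosine similarity is higher and the Euclidean distance is smaller.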
If an entity is frequently mentioned in connection with certain other entities or properties in the training data, there is a high statistical probability of a semantic relationship.
The method of this processing is called transformer-based natural language processing.
NLP describes a process of transforming natural language into a machine-understandable form that enables communication between humans and machines.
NLP comprises natural language understanding (NLU) and natural language generation (NLG).
When training LLMs, the focus is on NLU, and when outputting AI-generated results, the focus is on NLG.
Identifying entities via named entity extraction plays a special role in semantic understanding and in determining an entity’s meaning within a thematic ontology.
Due to the frequent co-occurrence of certain words, these vectors move closer together in the semantic space: the semantic proximity increases, and the probability of membership increases.
The results are output via NLG according to statistical probability.
For example, suppose the Chevrolet Suburban is often mentioned in the context of family and safety.
In that case, the LLM can associate this entity with certain attributes such as safe or family-friendly. There is a high statistical probability that this car model is associated with these attributes.
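The Suburban example can be sketched as a simple co-occurrence count. The snippet below checks, for each document that mentions the entity, which attributes appear alongside it. The documents and the attribute list are invented for illustration, and counting per document is an assumed simplification: LLMs do not store explicit counts like this, but the statistical signal they pick up is related.

```python
from collections import Counter

# Invented mini-corpus; each string stands in for one source document.
documents = [
    "the chevrolet suburban is a safe and family-friendly suv",
    "families praise the suburban as a safe car for long trips",
    "the suburban offers plenty of room for a family",
    "the sports coupe is fast but not family-friendly",
]

entity = "suburban"
attributes = ["safe", "family-friendly", "fast"]

# Count documents in which the entity and an attribute appear together.
cooccurrence = Counter()
entity_docs = 0
for doc in documents:
    tokens = set(doc.split())
    if entity not in tokens:
        continue
    entity_docs += 1
    for attribute in attributes:
        if attribute in tokens:
            cooccurrence[attribute] += 1

# Share of the entity's documents that also mention each attribute.
shares = {a: cooccurrence[a] / entity_docs for a in attributes}
print(shares)
```

In this toy corpus, "safe" appears in two of the three documents that mention the Suburban, "family-friendly" in one, and "fast" in none, so "safe" is the attribute most strongly tied to the entity.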
I haven’t heard conclusive answers to this question, only unfounded speculation.
To get closer to an answer, it makes sense to approach it from a data science perspective. In other words, from people who know how large language models work.
I asked three data science experts from my network. Here’s what they said.
Kai Spriestersbach, Applied AI researcher and SEO veteran:
Barbara Lampl, Behavioral mathematician and COO at Genki:
Philip Ehring, Head of Business Intelligence at Reverse-Retail:
There are two possible approaches here: E-E-A-T and ranking.
We can assume that the providers of the well-known LLMs only use sources as training data that meet a certain quality standard and are trustworthy.
There would be a way to select these sources using Google’s E-E-A-T concept. Regarding entities, Google can use the Knowledge Graph for fact-checking and fine-tuning the LLM.
The second approach, as suggested by Philip Ehring, is to select training data based on relevance and quality as determined by the actual ranking process. Top-ranking content for the corresponding queries and prompts would then automatically be used for training the LLMs.
This approach assumes that the information retrieval wheel does not have to be reinvented and that search engines rely on established evaluation procedures to select training data. This would then include E-E-A-T in addition to relevance evaluation.
However, tests on Bing Chat and SGE have not shown any clear correlations between the referenced sources and the rankings.
It remains to be seen whether LLM optimization or GAIO will really become a legitimate strategy for influencing LLMs in terms of their own goals.
On the data science side, there is skepticism. Some SEOs believe in it.
If this is the case, there are the following goals that need to be achieved:
I have explained what measures to take to achieve this in the article How to improve E-A-T for websites and entities.
The chances of success with LLM optimization decrease as the size of the market grows. The more niche a market is, the easier it is to position yourself as a brand in the respective thematic context.
This means that fewer co-occurrences in the qualified media are required to be associated with the relevant attributes and entities in the LLMs. The larger the market, the more difficult this is, as many market participants have large PR and marketing resources and a long history.
GAIO or LLM optimization requires significantly more resources than classic SEO to influence public perception.
At this point, I would like to refer to my concept of Digital Authority Management. You can read more about this in the article Authority Management: A New Discipline in the Age of SGE and E-E-A-T.
Suppose LLM optimization turns out to be a sensible SEO strategy. In that case, large brands will have significant advantages in search engine positioning and generative AI results in the future due to their PR and marketing resources.
Another perspective is that one can continue with search engine optimization as before, since well-ranking content can also be used for training the LLMs. In that case, one should also pay attention to co-occurrences between brands or products and attributes or other entities, and optimize for them.
However, tests on Bing Chat and SGE have not yet shown clear correlations between referenced sources and rankings.
Which of these approaches will be the future for SEO is unclear and will only become apparent when SGE is finally introduced.
Opinions expressed in this article are those of the guest author and not necessarily Search Engine Land. Staff authors are listed here.
© 2024 Third Door Media, Inc. All rights reserved.
Third Door Media, Inc. is a publisher and marketing solutions provider incorporated in Delaware, USA, with an address 88 Schoolhouse Road, PO Box 3103, Edgartown, MA 02539. Third Door Media operates business-to-business media properties and produces events. It is the publisher of Search Engine Land the leading Search Engine Optimization digital publication.