We’ve been talking with search industry pros and innovators about persistent challenges, trending opportunities, and the technologies people and companies are using to stay relevant in competitive search results.
One trend driving massive advancements in search technology is the shift from keywords to data that better represents the meaning of the query, and what’s known about it.
Keyword search has been driving content discovery since 1230 AD. That’s when French cardinal and biblical commentator Hugh de St Cher completed the first known index in history.
Vector search marks a major shift from this traditional method of information retrieval to a future in which all of the complex data that makes up modern content assets can be put to work.
So what do you need to know about it right now?
We reached out to Edo Liberty, the former head of Amazon’s AI lab and now CEO of Pinecone, for a primer on vector search and why you may want to have the associated technologies on your radar.
We asked Liberty:
- How will vector search redefine traditional keyword search?
- How would you explain vector search to a 5-year-old?
- What are some of the challenges that you faced using ML algorithms for Amazon Web Services (AWS) customers, and how did you overcome them?
- What is Pinecone and what does it do?
- What tips or advice do you have for SEO beginners who are just stepping into the world of ML and AI?
Let’s start with this – why is natural language processing (NLP) so important to the future of SEO, and how can marketers prepare for what’s next?
We’ve Burned The Ships Of Keyword Search
Edo Liberty: “Just as SEOs mastered the PageRank algorithm, they now need to know about NLP in order to succeed and beat the competition.
Unlike PageRank, however, the field of NLP is growing fast and has thousands of contributors.
It’s going to take more effort than following Matt Cutts (from Google) on Twitter and tracking SERP changes.
Thankfully, although NLP is a more complicated topic, it is not shrouded in mystery like PageRank is.
A lot of the work on NLP is being done in the open, with free and abundant research papers, open-source software, and no-cost online courses on NLP.
One thing is clear about NLP: It’s here to stay.
It’s far from perfect, but it’s improving fast. The big tech companies have burned the ships of keyword search, and there is no going back.”
Vector Search Enables Us To Search The Way We Speak
How will vector search redefine traditional keyword search?
Edo Liberty: “Vector search doesn’t redefine keyword search; it replaces it whole-cloth.
Instead of working with keywords – and their synonyms and misspellings – vector search works with vector embeddings.
That’s a piece of data that represents the meaning of the search phrase along with other information known about the query or the user.
(To a human, the vector embedding is unrecognizable and just looks like a long array of numbers.)
This representation of the search phrase and the user is then used to sort through massive collections of embeddings that represent other content and user preferences to find the most relevant result.
From the user’s perspective, this means they can search the way they speak.
They no longer need to learn the quirks and syntaxes of search engines.
From the SEO’s perspective, this means they can truly focus on themes and topics without worrying about precise keywords.”
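To make the idea of sorting through embeddings concrete, here is a minimal Python sketch. It is an illustration only, not how Pinecone or any search engine implements it. The three-number embeddings, the page names, and the `vector_search` helper are all made up for this example; real embeddings come from a trained model and typically have hundreds of dimensions. The sketch uses cosine similarity, one common way to measure how close two vectors are.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def vector_search(query_vec, index, top_k=2):
    """Score every stored embedding against the query and return the top_k ids."""
    scored = [(cosine_similarity(query_vec, vec), doc_id) for doc_id, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:top_k]]

# Hypothetical, hand-written embeddings (assumption: a real model would produce these).
index = {
    "guide-to-trail-running-shoes": [0.9, 0.1, 0.0],
    "best-sneakers-for-jogging": [0.8, 0.2, 0.1],
    "how-to-bake-sourdough": [0.0, 0.1, 0.9],
}

# Hypothetical embedding for the query "comfortable shoes for running".
query_embedding = [0.85, 0.15, 0.05]

results = vector_search(query_embedding, index)
print(results)
```

Both shoe pages rank above the baking page, even though "sneakers for jogging" shares no keyword with the query. The match comes from the vectors pointing in a similar direction, not from overlapping words.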
How Would You Explain Vector Search To A 5-year-old?
Edo Liberty: “Our article explaining vector search basics comes close.
The ELI5 version, as I’ve practiced on my own family, is this: If I say ‘Italian food,’ you might think of pizza or pasta.
You’ve learned that those things are related because you remember eating pizza at an Italian restaurant or learning that pasta is popular in Italy.
But a computer never learned that. So the phrase ‘Italian food’ means exactly that and doesn’t contain information to say it’s related to pasta or pizza.
So, when I ask a computer to search for an ‘Italian restaurant,’ it might leave out the pizza places.
Machine learning is a way of helping computers understand the meaning of what we say or type.
And vector search is a way for those computers to search through everything they know, based on meaning and not exact words.
So now, if I ask the computer to recommend an Italian place, it might suggest your favorite pizza place just like you would.
Organizations can finally focus on creating and organizing content for humans.
There are many thousands of scientists and engineers working tirelessly to make ML and NLP resemble the human mind.
Do you really want to go against that? The winning strategy for SEO is to optimize for the human mind.”
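The Italian restaurant example can be sketched in a few lines of Python. Everything here is a toy assumption: the restaurant names, the three hand-picked dimensions (roughly "Italian-ness," "casual dining," and "seafood"), and the 0.9 similarity threshold are invented for illustration. The point is only to contrast exact word matching with matching on meaning.

```python
import math

# Hypothetical embeddings; dimensions are (italian-ness, casual dining, seafood).
places = {
    "Luigi's Pizza Place": [0.9, 0.8, 0.1],
    "Trattoria Italian Restaurant": [1.0, 0.6, 0.0],
    "Sakura Sushi Bar": [0.0, 0.7, 0.9],
}

def keyword_search(query, names):
    """Return names that contain every word of the query (exact word matching)."""
    words = query.lower().split()
    return [name for name in names if all(w in name.lower() for w in words)]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def meaning_search(query_vec, embedded, threshold=0.9):
    """Return names whose embedding is close to the query, best match first."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in embedded.items()]
    return [name for score, name in sorted(scored, reverse=True) if score >= threshold]

keyword_hits = keyword_search("Italian restaurant", places)

# Hypothetical embedding a model might assign to the phrase "Italian restaurant".
italian_restaurant_vec = [1.0, 0.5, 0.0]
meaning_hits = meaning_search(italian_restaurant_vec, places)

print("Keyword search:", keyword_hits)
print("Vector search: ", meaning_hits)
```

The keyword search finds only the trattoria, because the pizza place never uses the words "Italian" or "restaurant." The vector search returns both, and still leaves out the sushi bar.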
Overcoming Challenges In Machine Learning
What are some of the challenges that you faced using ML algorithms for Amazon Web Services (AWS) customers, and how did you overcome them?
Edo Liberty: “I can’t speak about specific projects or challenges from AWS. More broadly, in my experience, ML algorithms are no longer the bottleneck.
To be sure, they are far from perfect, and there’s a lot of work to be done, but that work is happening at breakneck speed.
The next challenge is in running those algorithms at the scale needed to support consumer products and enterprise applications.
Those representations I mentioned earlier, vector embeddings, are computationally costly to search through.
An index of just 1M items (vector embeddings) already requires specialized software along with careful tuning; an index of 100M items requires specialized software and infrastructure; an index of 1B or more items requires you to be Google or Amazon.
(As an aside, this is why I started Pinecone: To make it easy for engineering teams to add vector search to their applications.)”
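A rough back-of-the-envelope calculation shows why index size matters so much. The Python sketch below estimates the raw memory and the per-query work for a naive search that compares the query against every stored vector. The numbers are assumptions chosen for illustration, not figures from Liberty or Pinecone: 768 dimensions per embedding and 4 bytes per value (32-bit floats). Production systems generally avoid this exhaustive approach by using approximate nearest neighbor indexes, which is part of the specialized software mentioned above.

```python
def brute_force_cost(num_vectors, dims=768, bytes_per_value=4):
    """Estimate raw storage (in GB) and multiply-adds per query for exhaustive search.

    dims and bytes_per_value are illustrative assumptions, not measured values.
    """
    memory_gb = num_vectors * dims * bytes_per_value / 1e9
    multiply_adds_per_query = num_vectors * dims  # one dot product per stored vector
    return memory_gb, multiply_adds_per_query

for n in (1_000_000, 100_000_000, 1_000_000_000):
    gb, ops = brute_force_cost(n)
    print(f"{n:>13,} vectors: {gb:,.1f} GB of embeddings, {ops:,} multiply-adds per query")
```

Under these assumptions, the embeddings alone take about 3 GB at 1M items, about 307 GB at 100M, and about 3 TB at 1B, before counting any index structures, replicas, or metadata.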
What Is Pinecone?
What is Pinecone and what does it do?
Edo Liberty: “Today, Pinecone makes it easy for engineers to build fast, fresh, and filtered vector search into their applications.
It gives engineering teams the search infrastructure needed to run vector search at scale, all packaged in a managed service with an easy API.
(We’ve dropped the version numbers because the releases come fast, and because as a managed service, users always get the latest version and don’t need to worry about updates.)
Working with algorithms is extremely fun and absolutely worth the challenges.
With vector search, we’re at the intersection of cutting-edge algorithms, database architectures, and serverless applications.
And we get to see our customers apply this technology to products that are revolutionizing both consumer and enterprise applications such as semantic search, recommendation systems, IT security, wearables, computer vision, and more.”
Getting Started In ML & AI
What tips or advice do you have for SEO beginners who are just stepping into the worlds of ML and AI?
Edo Liberty: “Don’t feel intimidated. Even the brightest researchers in this field are ‘figuring things out.’
Learning about AI/ML beyond the surface-level articles will make you a better SEO professional, and there are plenty of free resources that help you do that.
For those interested in careers in this field, we are currently hiring across all teams: engineering, research, customer success, sales, marketing, and operations.”