Before we all take the latest "Google API Leak" as confirmation of ancient theories, let's take a breath and contextualize what this document actually might be. Roger Montti has a great breakdown on SEJ that I think we should all consider. https://lnkd.in/ga6F6BsE
John McAlpin’s Post
-
If anyone wants to understand how Google works and ranks its results, here is a recent analysis (thanks to Natzir Turrado https://lnkd.in/eUR86E7u) of the latest leak of Google documents: algorithms and components. It makes for exciting reading and provides food for thought. P.S. I suppose that, very soon, questions like "Name all Google algorithms and components" will become common in interviews. https://lnkd.in/en4sTrvr
Google's Algorithms Uncovered: How the Search Engine Works According to Leaked Documents
https://www.analistaseo.es
-
Top Digital Marketing Consultant for Small Businesses & Startups | Outreach Experts, Blogger (Niche: Tech, SaaS, CyberSecurity, Casino & Marketing, IT Services etc) 100K Plus Guest Blogger Network, Link Exchange Services
A Google research paper describes a remarkable framework called TW-BERT that improves search ranking without requiring major changes - https://shorturl.at/nuNS8 #Google / #BERT / #GoogleAI
Google Ranking Algorithm Research Introduces TW-BERT
searchenginejournal.com
-
You want the Google algorithm? Here it is:

R(q,X) = (Sqx / NDL) + a * Q(X)

q is a query. X is a document. Both are vectors in this equation.
Sqx is the salience of the query to the document. Think cosine similarity, but likely a mix of other metrics like BM25F and some entity signals.
NDL is normalized document length.
a is a quality modifier. Think Navboost.
Q(X) is the quality of the document. In the old days this was just PageRank. Now it's a whole bunch of ML systems and scores aggregated up. Think one score that combines PageRank, Panda, SpamBrain, HCU, etc.

OK, vastly oversimplified, but that's literally the algorithm behind any search engine.
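The toy formula above can be put into code. This is a minimal sketch, not Google's actual system: the salience stand-in is a naive term-overlap count, the average document length of 100 terms is assumed, and the quality score and alpha weight are hypothetical inputs.

```python
def relevance(query_terms, doc_terms, quality_score, alpha=0.5):
    """Toy scoring in the spirit of R(q,X) = (Sqx / NDL) + a * Q(X).

    All names and weights are illustrative, not Google's real signals.
    """
    # Sqx: salience of query to document; a crude term-overlap count
    # stands in for cosine similarity / BM25F here.
    sqx = sum(1 for t in query_terms if t in doc_terms)
    # NDL: document length normalized by an assumed corpus average of 100 terms.
    ndl = len(doc_terms) / 100
    # a * Q(X): aggregated quality signal scaled by a modifier.
    return (sqx / ndl) + alpha * quality_score
```

Swapping the overlap count for a real similarity metric and the quality score for an aggregated ML signal is exactly where the "vastly oversimplified" caveat kicks in.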
-
TW-BERT is a remarkable framework that appears to improve the accuracy of information retrieval systems and could be in use by #Google. #seoupdates #onpageseo #searchengineranking
Google Ranking Algorithm Research Introduces TW-BERT
searchenginejournal.com
-
Google Ranking Algorithm Research Introduces TW-BERT

Google's research paper describes a remarkable framework called TW-BERT that improves search ranking without requiring major changes.

- TW-BERT is an end-to-end query term weighting framework that bridges two paradigms to improve search results
- It integrates with existing query expansion models and improves performance
- Deploying the new framework requires minimal changes

https://lnkd.in/g8EUVHmv
Google Ranking Algorithm Research Introduces TW-BERT
searchenginejournal.com
-
Senior SEO Specialist | Google Colab, Screaming Frog, Ahrefs, Sistrix | I Utilise My Academic Background To Improve & Automate SEO Processes, Tactics and Strategies | SEO Research
Sharing the Google algorithm attempt:

R(q,X) = (Sqx / NDL) + a * Q(X)

q: query
X: document
Sqx: salience of the query to the document (e.g., cosine similarity, BM25F, etc.)
NDL: normalized document length
a: quality modifier (e.g., Navboost)
Q(X): quality of the document (combines PageRank, Panda, SpamBrain, HCU, etc.)

The above is the closest you can get to figuring out how the algorithm works. #algorithm #seo #seotips
-
A very good start by Jon Gillhams of Originality on an analysis of the first "hand job" by Google: the pure-spam manual actions issued so far. A manual action is assumed if a site had zero pages indexed after March 5 but had at least one page indexed in February. This is a good approximation for the 1,446 manual actions, and I'd say a possible error of 10% is realistic and acceptable. Why an error at all? De-indexation can happen for many reasons. But this approach is probably the most pragmatic way to filter those 1,446 out of over 79,000 websites. The second part of the article, the "AI content test", was done for only 14 of those 1,446 sites; it could cover far more for better insights, and it may also suffer from bias. Testing 1% of the potentially penalized websites is just not enough; the result could lead to false assumptions with such a small sample. But it's a great start, thanks for this. https://lnkd.in/dk7vadfS
-
How the Search Engine Works According to Leaked Documents: https://lnkd.in/e_BjFSJ9
Google's Algorithms Uncovered: How the Search Engine Works According to Leaked Documents
https://www.analistaseo.es
-
💥💥 The open-source community rocks!!! Stack Overflow adopted a semantic search solution using Weaviate. The process involved employing a pre-trained BERT model from the SentenceTransformers library to create embeddings. They opted for Weaviate due to its open-source nature, allowing self-hosting to ensure data privacy. Follow their steps to create your own. Read more here: #llm #languagemodels #stackoverflow https://lnkd.in/g_qJYtkF
Ask like a human: Implementing semantic search on Stack Overflow
stackoverflow.blog
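The core retrieval step in a pipeline like the one described can be sketched as cosine-similarity search over embeddings. In the Stack Overflow setup the vectors come from a SentenceTransformers BERT model and are served by Weaviate; this sketch uses plain NumPy over precomputed vectors so it stays self-contained, and the shapes and names are illustrative only.

```python
import numpy as np

def semantic_search(query_vec, doc_vecs, top_k=3):
    """Rank documents by cosine similarity to a query embedding.

    query_vec: 1-D embedding of the query.
    doc_vecs:  2-D array, one embedding per document.
    Returns (doc_index, similarity) pairs, best first.
    """
    # Normalize so the dot product equals cosine similarity.
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    sims = d @ q
    # Take the top_k highest-scoring documents.
    idx = np.argsort(-sims)[:top_k]
    return list(zip(idx.tolist(), sims[idx].tolist()))
```

In the real pipeline, `query_vec` and `doc_vecs` would be produced by something like `SentenceTransformer(...).encode(texts)`, and a vector database such as Weaviate replaces the brute-force dot product with an approximate nearest-neighbor index.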
-
Fascinating look into some search algorithm research. ⭐️ One major way this "term weighting (TW)" framework improves search is by giving more or less weight to parts of a search query, like "Nike" being the more heavily weighted part of the query "nike running shoes", to return more accurate results. While Google hasn't confirmed they've integrated this into their search algorithm, it could explain the recent ranking and traffic volatility many people have been seeing. And if not, given how easy this framework is to deploy into an existing algorithm, I'd definitely expect to see it in use soon! https://lnkd.in/gtGVDVsG
Google Ranking Algorithm Research Introduces TW-BERT
searchenginejournal.com
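The term-weighting idea above can be illustrated with a small sketch: a BM25-style scorer where each query term carries its own weight. In TW-BERT those weights are predicted by a BERT model; here they are supplied by hand, and all parameter values, the IDF table, and the average document length are assumed for the example.

```python
def weighted_score(query_weights, doc_terms, corpus_idf,
                   avg_len=100, k1=1.5, b=0.75):
    """BM25-style scoring with a per-term weight on each query term.

    query_weights: {term: weight}; TW-BERT would learn these, here hand-set.
    corpus_idf:    {term: idf} precomputed over an assumed corpus.
    """
    score = 0.0
    dl = len(doc_terms)
    for term, weight in query_weights.items():
        tf = doc_terms.count(term)
        idf = corpus_idf.get(term, 0.0)
        # Standard BM25 term contribution, scaled by the learned weight.
        denom = tf + k1 * (1 - b + b * dl / avg_len)
        score += weight * idf * (tf * (k1 + 1)) / denom
    return score
```

With a higher weight on "nike" than on "shoes", a document dominated by "nike" outranks one dominated by "shoes" for the same query, which is exactly the effect described for "nike running shoes" above.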