This document summarizes several patents related to query parsing and semantic search. It describes patents for multi-stage query processing, query breadth, query analysis, midpage query refinements (search suggestions), context vectors, and categorical quality (re-ranking search results based on the category of the query). Each patent is briefly described, including inventors, filing dates, and some technical details. The document aims to provide an overview of the evolution of semantic search and query understanding technologies at Google.
1) Knowledge graphs are structured databases that represent real-world entities and their relationships to each other. They help search engines like Google understand topics at a deeper level.
2) Entities (topics) are becoming more important than keywords for search engines to understand content. Google's entity understanding can be checked using their natural language processing tool.
3) Semantic SEO techniques like tightly linking topics both internally and to relevant external pages can help improve how search engines understand and represent the entities within a website through their knowledge graphs.
What percentage of an Inbound marketer's day doesn't involve working with spreadsheets? How much of this work is time-consuming and repetitive? In this interactive session, you will learn how to manipulate Google Sheets to automate common data analysis workflows using Python, a very easy to use programming language.
William slawski-google-patents- how-do-they-influence-search
Bill Slawski presented a webinar on analyzing patents related to search engines and SEO. He discussed 12 Google patents covering topics like PageRank, Google's news ranking algorithm, analyzing images to detect brand penetration, and building user location history. The patents described Google's work in building knowledge graphs from web pages, ranking entities in search results, question answering, and determining quality visits to local businesses.
1) Google uses various techniques to extract structured information like entities, relationships, and properties from unstructured text on the web and databases. This extracted information is then used to generate knowledge graphs and provide augmented responses to user queries.
2) One key technique is to identify patterns in which tuples of information are stored in databases, and then extract additional tuples by repeating the process and utilizing the identified patterns.
3) Google also extracts entities from user queries and may generate a knowledge graph to answer questions by providing information about the entities from sources like its own knowledge graph and information extracted from the web.
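The pattern-based extraction in point 2 can be illustrated with a toy bootstrapping loop. This is only a minimal sketch, assuming plain sentences instead of database records and a single literal "connector" string as the pattern; the function names (`learn_patterns`, `extract_tuples`, `bootstrap`) are hypothetical and not taken from any patent.

```python
import re

def learn_patterns(sentences, known_tuples):
    """Collect the text found between the two halves of each known tuple."""
    patterns = set()
    for left, right in known_tuples:
        for sentence in sentences:
            match = re.search(re.escape(left) + r"(.+?)" + re.escape(right), sentence)
            if match:
                patterns.add(match.group(1))
    return patterns

def extract_tuples(sentences, patterns):
    """Reuse each learned connector to pull out new (left, right) pairs."""
    found = set()
    for pattern in patterns:
        # Assumption: both halves of a tuple are single capitalized words.
        regex = re.compile(r"([A-Z]\w*)" + re.escape(pattern) + r"([A-Z]\w*)")
        for sentence in sentences:
            found.update(regex.findall(sentence))
    return found

def bootstrap(sentences, seeds, rounds=2):
    """Alternate between learning patterns and extracting tuples."""
    known = set(seeds)
    for _ in range(rounds):
        known |= extract_tuples(sentences, learn_patterns(sentences, known))
    return known

sentences = [
    "Paris is the capital of France.",
    "Berlin is the capital of Germany.",
    "Tokyo is the capital of Japan.",
]
print(sorted(bootstrap(sentences, {("Paris", "France")})))
# [('Berlin', 'Germany'), ('Paris', 'France'), ('Tokyo', 'Japan')]
```

Starting from one seed tuple, the loop learns the connector " is the capital of " and then finds the other two pairs. A production system would need to score patterns and tuples to stop noisy patterns from spreading.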
The document summarizes a presentation given by Bill Slawski at the Semantic Technology & Business Conference in San Jose. The presentation discussed how adding semantic information and structuring content around entities can help websites better optimize for search engines and provide more relevant experiences for users. It also provided several examples of how search engines are using entities and knowledge graphs to enhance search results and anticipate related queries.
How to approach SEO in a world where Google has moved from strings and keywords to things, topics and entities. Dixon Jones is the CEO of InLinks, which has built a proprietary NLP algorithm and Knowledge Graph designed for the SEO industry.
The document discusses using Python for SEO applications such as data extraction, preparation, analysis, machine learning and deep learning. It provides an agenda and examples of using Python to solve challenging SEO problems from site migrations and traffic losses. Methods demonstrated include pulling data from Google Analytics, storing in DataFrames, regular expression grouping, and training machine learning models on page features to classify page groups and identify losses. Later sections discuss using deep learning with computer vision models to classify web pages from screenshots.
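The regular expression grouping step might look roughly like the following. It is a sketch under stated assumptions: the URL patterns and group names are hypothetical, and plain tuples with a `Counter` stand in for the Google Analytics export and DataFrames that the presentation describes.

```python
import re
from collections import Counter

# Hypothetical page-group rules; a real site needs its own patterns.
PAGE_GROUPS = [
    ("product", re.compile(r"^/products?/[^/]+/?$")),
    ("category", re.compile(r"^/category/")),
    ("blog", re.compile(r"^/blog/")),
]

def classify(path):
    """Return the first page group whose pattern matches the URL path."""
    for name, pattern in PAGE_GROUPS:
        if pattern.search(path):
            return name
    return "other"

def sessions_by_group(rows):
    """Sum sessions per page group; rows is an iterable of (path, sessions)."""
    totals = Counter()
    for path, sessions in rows:
        totals[classify(path)] += sessions
    return totals

def losses(before, after):
    """List page groups whose sessions dropped, largest drop first."""
    drops = {g: before[g] - after.get(g, 0) for g in before if before[g] > after.get(g, 0)}
    return sorted(drops.items(), key=lambda item: item[1], reverse=True)
```

Running `sessions_by_group` on traffic exports from before and after a migration and passing both results to `losses` gives a quick view of which page groups lost traffic.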
This document summarizes how Google search results are evolving to include more semantic data through direct answers, structured snippets, and rich snippets. It provides examples of direct answers being extracted from authoritative sources using natural language queries and intent templates. It also discusses how including structured data like tables, schemas, and markup can help search engines understand and display page content in a more standardized way. While knowledge-based trust is an interesting concept, current search ranking still primarily relies on link analysis and does not consider factual correctness.
How to Automatically Subcategorise Your Website Automatically With Python
The document describes a Python script that can automatically generate new subcategories for an ecommerce website based on clustering product names. It discusses:
- Using NLTK to generate n-grams from product names to cluster related products
- Filtering the n-grams to keep only those with commercial value by checking for search volume and CPC data
- Running the script on a large home improvement site to identify over 1,650 new subcategory opportunities with a total search volume of over 13 million
- Sharing the script so others can automate subcategory identification for their own sites to scale up an important SEO tactic.
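The first two steps above can be sketched as follows. This is a minimal illustration, not the shared script: it uses plain whitespace tokenization from the standard library instead of NLTK, and the `search_volume` lookup is a hypothetical dictionary standing in for real keyword data.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return the n-word phrases contained in a token list."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def candidate_subcategories(product_names, n=2, min_products=2):
    """Keep phrases that occur in at least min_products product names."""
    counts = Counter()
    for name in product_names:
        tokens = name.lower().split()
        counts.update(set(ngrams(tokens, n)))  # count each phrase once per product
    return {phrase: c for phrase, c in counts.items() if c >= min_products}

def filter_by_volume(candidates, search_volume, min_volume=100):
    """Drop phrases without enough search demand (hypothetical volume data)."""
    return sorted(p for p in candidates if search_volume.get(p, 0) >= min_volume)

products = ["Oak Dining Table", "Glass Dining Table", "Oak Coffee Table", "Pine Coffee Table"]
candidates = candidate_subcategories(products)
print(filter_by_volume(candidates, {"dining table": 5000, "coffee table": 50}))
# ['dining table']
```

Counting each phrase once per product keeps a single repetitive product name from inflating a candidate.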
Google's search results now include entities and concepts. Entities refer to people, places, and things, and 20-30% of queries are for named entities. Google uses metadata sources like Freebase to build a taxonomy of entities and their relationships. This supports features like the Knowledge Graph, which provides information panels, and allows querying of nearby entities, which may soon be available in search results.
SEO Case Study - Hangikredi.com From 12 March to 24 September Core Update
This document provides SEO metrics and comparisons for the website hangikredi.com over several time periods between April 2019 and September 2019. It shows substantial increases in key metrics like organic traffic, clicks, impressions, and average position after Google algorithm updates in May, June, July, and September. However, it also shows significant drops in these metrics during a server outage in early August. Overall the data demonstrates the site's strong SEO performance and organic growth over the 6-month period analyzed.
This document discusses digital marketing strategies focused on establishing authority through valuable, timeless content. It recommends creating content such as articles, videos, and academic papers on topics that will remain relevant for years to establish expertise. Creating a steady stream of high-quality content over time builds an online presence and credibility without major risks of losses, and may lead to job offers, clients, or other opportunities. It provides examples of interactive dashboards and open-source software that gained popularity and users by continuously publishing improvements and documentation without needing to rely on things like resumes or company profiles.
BrightonSEO March 2021 | Dan Taylor, Image Entity Tags
My talk from BrightonSEO 2021; focusing on using Google's image category labels (glancing into the Knowledge Graph and Google's image annotation processes) for better topic research and content optimization.
BrightonSEO October 2022 - Log File Analysis - Steven van Vessum.pdf
This document discusses how log file insights can help companies improve their crawling, indexing and organic marketing performance. It outlines some of the common issues companies face like not understanding search engine behavior and not reflecting on their past work. With log file insights accessible in real-time and automatically distilled, companies can answer critical questions to speed up their crawl times, see how search engines are handling their updated content and troubleshoot issues. The presenter promotes their solution, ContentKing, which provides real-time log file analysis from CDN logs to help companies learn what search engines know and keep sharpening their SEO strategies.
Semantic Publishing and Entity SEO - Conteference 20-11-2022
Semantic Publishing is publishing a page on the Internet by adding a semantic layer (i.e., semantic enrichment) in the form of structured data that describes the page itself.
Presentation given at the British Library Turing workshop on Software Citation, considering what lessons could be learned from the world of data citation
The document discusses best practices for preparing data for open publication. It recommends thinking openly and planning early by creating detailed data management plans. It provides examples of repositories like GenBank, ClinicalTrials.gov, FlyBase, Figshare, and Dryad that accept different types of data. The document emphasizes documenting data thoroughly with metadata and standards and following ethical guidelines for sharing and preserving data in the long term.
How to manage the complete content strategy in WordPress using plugins. Do your content inventory in WordPress -- no spreadsheets! Do content modeling using custom post types, taxonomies, and fields. Video: http://wordpress.tv/2013/08/02/stephanie-leary-content-strategy-wordpress-case-studies/
The document summarizes Bill Slawski's presentation on search and social media patents from 2012 and beyond. It discusses various patents Google has acquired related to search, social media, hardware, fiber optic networks, and more. It also outlines patents for phrase-based indexing, concept-based indexing, ranking pages based on user interactions, building a knowledge graph, and developing a planet-scale distributed search index. Slawski suggests Google may expand into hardware, entertainment, internet service provision, and more based on its patent portfolio.
This document provides guidance on evaluating electronic information sources for research. It discusses formulating a research question and where to find answers, such as books, academic journals, newspapers, magazines and the internet. It outlines criteria for evaluating sources, such as checking the author and publisher, assessing the purpose and reliability of the information. It also discusses using search techniques like Boolean operators and quotation marks to refine searches. Domain name extensions like .edu and .gov are explained to determine the type of website. Steps of the research process, including developing search strategies and assessing source quality, are also outlined.
Open Access and Open Education: Background, lobby tips, and continuing the di...
This document summarizes a presentation on open access and open education. It discusses the Right to Research Coalition and SPARC's work promoting open access to research and educational resources. Key points covered include the growth of the open access movement, challenges of high journal and textbook costs, policies advancing open access, and ways students can advocate for open access and open educational resources on their campuses.
An update on public access activities at the National Agricultural Library and next steps, presented 11 January 2017 at the Earth Science Information Partners (ESIP) meeting in Bethesda, Maryland.
Workshop - finding and accessing data - Cambridge August 22 2016
Finding and accessing human genomic data for research
University of Cambridge, United Kingdom | Seminar Room G
Monday, 22 August 2016 from 10:00 to 12:00 (BST)
Charlotte, Nadia and Fiona presented an overview of data sources around the world where you can find genomics data for your research and gave examples of the data access application for dbGaP and EGA with specific details relevant for University of Cambridge researchers.
This document provides information about searching online, including:
- The size of the internet has grown tremendously, making proper searching skills more important.
- IPv6 was launched in 2012 to accommodate more internet addresses as devices increase.
- Search engines, directories, and databases are described as important tools for online research. Keywords, Boolean searches, and other search techniques are also outlined.
- Criteria like authority, purpose, currency and bias are important to evaluate sources found in online searches.
SciDataCon 2014 Data Papers and their applications workshop - NPG Scientific ...
Part of the SciDataCon14 workshop on "Data Papers and their applications" run by myself and Brian Hole to help attendees understand current data-publishing journals and trends and help them understand the editorial processes on NPG's Scientific Data and Ubiquity's Open Health Data.
Finding things to write about can be difficult for bloggers. Here is how to get the most out of your content by using resources already available to you.
Lesson 8 in a set of 10 created by DataONE on Best Practices for Data Management. The full module can be downloaded from the DataONE.org website at: http://www.dataone.org/educaiton-modules. Released under a CC0 license, attribution and citation requested.
There is a method to it: Making meaning in information research through a mix...
This document summarizes a presentation on using mixed methods approaches in information behavior research. It discusses how log analysis of search queries can be combined with user interviews to provide context around search behaviors. As an example, the presentation examines a project that analyzed WorldCat Discovery search logs and conducted interviews with users to better understand how they navigate from search to accessing information. The methodology provides benefits like context around quantitative log data, but also challenges in terms of resources needed. Overall it argues for the value of mixed methods approaches in gaining a richer understanding of user behaviors and experiences.
Using Gale In Context: Biography Instructional Presentation
Gale in Context is a subscription-based biography database containing over 600,000 entries from all time periods and fields. It provides biographical profiles, images, audio/video selections, and articles from journals and newspapers. As a Chicago Public Library patron, you can access it for free on their website. You can search by name, browse categories, or use advanced search filters to find profiles. Once you find a relevant source, the database tools allow you to highlight/annotate, translate, generate citations, or find related content. Librarians are available to help navigate and use this resource.
This document discusses sharing research data. It describes the Data Services Center, which provides data services including finding and providing access to datasets. It notes that funders and publishers require data sharing, and that shared data receives more citations. It recommends sharing the minimum data needed to reproduce results, and considering timing, usability and granularity of data sharing. For sharing methods, it recommends using disciplinary or general repositories like UR Research, Dryad and REACTUR, which provide long-term preservation and access. Workshops and help are available for data management and sharing.
This document provides guidance on finding argument sources for a research topic. It discusses defining searchable keywords, identifying relevant databases, and learning advanced search techniques. The document outlines different types of sources - background, exhibit, argument, and method - and where to find each type. It suggests brainstorming keywords, identifying subject areas, practicing database searches, and using techniques like advanced search options and citations to find full-text sources. The overall aim is to help students effectively search databases and identify significant scholarship related to their research topic.
Scaling Recommendations, Semantic Search, & Data Analytics with solr
This presentation is from the inaugural Atlanta Solr Meetup held on 2014/10/21 at Atlanta Tech Village.
Description: CareerBuilder uses Solr to power their recommendation engine, semantic search, and data analytics products. They maintain an infrastructure of hundreds of Solr servers, holding over a billion documents and serving over a million queries an hour across thousands of unique search indexes. Come learn how CareerBuilder has integrated Solr into their technology platform (with assistance from Hadoop, Cassandra, and RabbitMQ) and walk through api and code examples to see how you can use Solr to implement your own real-time recommendation engine, semantic search, and data analytics solutions.
Speaker: Trey Grainger is the Director of Engineering for Search & Analytics at CareerBuilder.com and is the co-author of Solr in Action (2014, Manning Publications), the comprehensive example-driven guide to Apache Solr. His search experience includes handling multi-lingual content across dozens of markets/languages, machine learning, semantic search, big data analytics, customized Lucene/Solr scoring models, data mining and recommendation systems. Trey is also the Founder of Celiaccess.com, a gluten-free search engine, and is a frequent speaker at Lucene and Solr-related conferences.
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
February 18 2015 NISO Virtual Conference Scientific Data Management: Caring for Your Institution and its Intellectual Wealth
Learning to Curate Research Data
Jennifer Doty, Research Data Librarian, Emory Center for Digital Scholarship, Emory University, Robert W. Woodruff Library
How to evaluate the whole web (without being Google)
Could you build your own, private view of the Internet? One that isn't reliant on Google or Bing? Majestic has done this and now has one of the largest web indexes on the planet. Whilst known as a backlink analysis engine, Majestic in fact has its own, unique view of the Internet and is able to derive meaning, influence and context out of its dataset. Here's how they did it. (2018)
Powering Up Your Digital Strategy, Amplifying the Potential of Performance-Ba...
In our rapidly evolving digital landscape, yesterday's strategies simply won't suffice. Join us for a groundbreaking session on revenue based marketing where we'll explore cutting-edge approaches and the latest strategies that can supercharge your digital marketing plans. Discover how to leverage performance-based PR, influencer marketing, and affiliate marketing to drive revenue, optimize your campaigns, and achieve measurable results. We'll dive into effective methods for building brand awareness, cultivating deep engagement, driving conversions, and fostering lasting customer loyalty. Prepare to gain fresh ideas, valuable insights, and innovative methodologies designed to elevate your digital marketing efforts to new heights. Don't miss this opportunity to transform your strategy and stay ahead of the curve!
Key Takeaways:
1. Advanced Revenue-Driven Strategies: Learn how performance-based PR, influencer marketing, and affiliate marketing can drive revenue and optimize your marketing efforts.
2. Building and Engaging Your Audience: Discover effective methods for increasing brand awareness and cultivating deep engagement with your target audience.
3. Driving Conversions and Loyalty: Gain insights into strategies for driving conversions and fostering lasting customer loyalty to sustain your brand's growth.
10 Advantages and Disadvantages of Social Media Marketing in 2024
Explore the dynamic landscape of social media marketing in 2024 with our comprehensive presentation. Delve into the top 10 advantages and disadvantages that digital marketers face in leveraging social media platforms. Understand the opportunities for growth, engagement, and brand visibility, as well as the challenges and potential pitfalls that come with navigating the ever-evolving digital ecosystem. This presentation will provide valuable insights and actionable strategies for maximizing the benefits of social media marketing while mitigating its drawbacks, tailored specifically for the needs of Markonik.
An Odyssey into Composable Digital Solutions - Brian McKeiver
Much like Odysseus's fabled journey, the venture of an organization into creating compelling websites, easy-to-use digital solutions, and flawless user experience is laden with trials and triumphs. This session explores a BizStream customer case study that demonstrates how crafting composable digital solutions with headless CMS and headless commerce is possible. The result now serves as a modern-day Athena, navigating the customer through the stormy seas of digital transformation. Attendees can expect to learn how to embrace modern composable solutions, understand the benefits they bring, and identify which of Odysseus's conflicts to avoid.
Key Takeaways:
What makes up a composable digital solution.
Why content is still king in a composable world.
How Headless CMS and Headless Commerce are different.
Struggling to get high-quality backlinks? Our latest presentation reveals the strategies you need to succeed in 2024. Learn practical tips to boost your SEO and elevate your website’s authority. Click below to access the full presentation!
Full blog here - https://digitalmarketingphilippines.com/how-to-get-high-quality-backlinks-in-2024/
NIMA2024 | Hoe Danone Trends vertaalt naar Strategie voor het versterken van ...
Develop a category & retail vision to drive business impact today
Join Arnoud from Danone and Tris from Ipsos Strategy3 as they guide you on a journey through the art of leveraging trends and foresights to craft a category and retail vision. Discover the crucial need of future readiness, and understand how the future can lead to new opportunities, here and now. Be prepared to unlock the future potential of your enterprise!
SEO for Revenue, Grow Your Business, Not Just Your Rankings - Dale Bertrand
Let’s be honest. Improvements in search rankings and organic traffic don’t always translate into sales. Yet, you spend the majority of your SEO resources on driving rankings and traffic. What if you built your SEO content with conversion in mind from the beginning? You’d generate more organic traffic that actually converts into revenue! Join 20-year search marketing veteran Dale Bertrand as he unveils his framework for developing SEO content with conversion in mind every step of the way ‒ from keyword strategy to content development and publication.
Takeaways:
Tactics and benchmarks for SEO content that converts in 2024
Page layouts and content formats that convert organic traffic
Crafting keyword strategy and calls-to-action for conversion
Digital marketing metrics everyone must know in 2024
The "Digital Marketing Metrics" PDF by Digital Scape provides a detailed guide to essential metrics used in digital marketing. It explains the importance of metrics in tracking and optimizing marketing efforts, offering definitions, formulas, and examples for each metric. The document covers metrics such as Return on Ad Spend (ROAS), Customer Lifetime Value (CLV), Cost of Acquisition (COA), Click Through Rate (CTR), Conversion Rate (CVR), Cost Per Sale (CPS), Bounce Rate, and Lead Conversion Rate (LCR). The aim is to equip marketers with the knowledge needed to make data-driven decisions and enhance campaign performance.
Learn what metrics are, how they differ, the different types of metrics, and how to calculate them.
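A few of the listed metrics reduce to simple ratios. The sketch below uses commonly cited definitions, which are an assumption here because the PDF's own formulas are not reproduced in this summary; the function names are illustrative only.

```python
def roas(revenue, ad_spend):
    """Return on Ad Spend: revenue earned per unit of ad spend."""
    return revenue / ad_spend

def ctr(clicks, impressions):
    """Click Through Rate: share of impressions that were clicked."""
    return clicks / impressions

def cvr(conversions, clicks):
    """Conversion Rate: share of clicks that converted."""
    return conversions / clicks

def bounce_rate(single_page_sessions, total_sessions):
    """Bounce Rate: share of sessions that viewed only one page."""
    return single_page_sessions / total_sessions

# Example campaign: 1,000 impressions, 50 clicks, 5 sales, 500 revenue on 100 spend.
print(roas(500, 100))  # 5.0
print(ctr(50, 1000))   # 0.05
print(cvr(5, 50))      # 0.1
```

Each function raises `ZeroDivisionError` when its denominator is zero, so callers should decide how to report campaigns with no spend, impressions, or clicks.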
Top 10 Cases of Amnesia A Journey through Memory Loss.pptx
Amnesia, the loss of memory, is a fascinating and complex condition that has captured the imagination of scientists, storytellers, and the general public alike. It can be triggered by various factors such as brain injury, psychological trauma, or even certain medical conditions. This article delves into ten intriguing cases of amnesia each offering unique insights into the human mind and the fragile nature of memory.
Importance of SEO to support holistic marketing strategies and the rise of n...
A presentation for the Digital Marketing World Forum by Jessica Redman and Andrew Fox.
Discusses how SEO supports numerous marketing channels and how user search behaviour is changing.
Discover how to optimise social media posts for discoverability and learn about Topical Domination.
PHP (Hypertext Preprocessor) is a widely-used open-source scripting language that is particularly suited for web development and can be embedded into HTML. It is primarily used for server-side scripting but can also be used as a general-purpose programming language. PHP is renowned for its simplicity, flexibility, and ease of integration with various databases and web servers, making it one of the most popular languages for building dynamic websites and web applications. Webs Jyoti, led by Mr. Hirdesh Bharadwaj, is an ideal choice for summer training in PHP in Delhi. With Mr. Bharadwaj's 15 years of experience in the field, Webs Jyoti offers top-notch training in PHP development.
One notable aspect of Webs Jyoti is its unique approach. It's not just a training institute but also functions as a development agency. This means that students not only receive theoretical knowledge but also gain practical experience by working on real-world projects.
Ducat offers comprehensive PHP training with a strong focus on practical implementation and live projects. Their course covers the latest industry standards and trends, ensuring that students are well-prepared for job placements.
Webs Jyoti: This institute provides 100% practical classes, study materials written by the founder, and training on 2-3 live projects. They also offer job placement assistance and grooming sessions for job seekers.
ACIL Computer Education: Known for its industry-standard training, ACIL offers various PHP courses ranging from basic to advanced levels. They emphasize hands-on training with real-world simulations and provide job assistance and placement guarantees for certain courses.
APTRON Gurgaon: APTRON offers a well-structured PHP course with modules on basic to advanced PHP concepts, webs jyoti, and CodeIgniter. They also provide live project experience and job placement assistance.
SLA Consultants India: SLA offers an advanced PHP training program designed by experienced professionals. Their course includes live projects, instructor-led classroom sessions, and extensive practical exposure to ensure students are industry-ready.
Each of these institutes has its own strengths, so you might choose one based on specific criteria such as course content, faculty experience, or placement records. Webs Jyoti ensures that students receive top-notch education and support to kickstart their careers in coding and software development.
In 2024, digital marketing is not just an optional strategy for businesses; it's a fundamental component of any successful marketing plan. The rapid evolution of technology and changing consumer behaviors have made digital marketing more critical than ever. Here’s why digital marketing is indispensable in 2024 and how a digital marketing agency can propel your business to new heights.
EyekooTech is committed to helping businesses navigate the complexities of digital marketing. Whether you're a small startup or a large enterprise, our innovative strategies and data-driven approach can elevate your brand and connect you with your target audience.
Create Content in Half the Time with Generative AI - Nick Mattar
Nick will present his "best of" findings from reviewing and testing more than 200 generative AI platforms over the last three years. While some programs will save you more than half the time, you can expect to save at least 50% of your time creating content if you begin using these tools.
Key Takeaways:
Attendees will walk away with a comprehensive list of generative AI programs that will make their lives easier. From blogging to video production and even AI marketing assistants, you will learn about nearly 20 AI platforms that are guaranteed to make your life easier in some way.
The Top 6 Facebook Ad Hacks of 2024, Targeting the Untargetable - Larry Kim
It’s been a difficult few years for Facebook Ads due to signal loss from iOS/Firefox/Chrome and the associated loss of ad targeting precision and ROAS. In this session, delve into 100% new high-impact strategies for thriving in Facebook advertising in a world without 3rd party cookies.
You'll uncover the top 7 Facebook ad hacks of 2024, all centered around first party ad signal data restoration and how to coax the new default Meta Audience+ ad targeting system to do what you want it to do, each backed by solid results and case studies. Learn how to skyrocket your landing page conversions by 20-25%, how to scale ads like never before, and target niche audiences with strategies that defy traditional norms.
Plus, gain insights into critical privacy regulations and how to maintain full compliance with them.
Coronavirus and Future of SEO: Digital Marketing and Remote Culture - Koray Tugberk GUBUR
I attended a great SEO and digital marketing webinar with Mr. Mert Erkal, founder of Stradiji and SEMRush Turkey lead, and my dear friend, SEO consultant Atakan Erdoğan.
Small note: after I uploaded the presentation, Google launched a new Covid-19 news page, similar to Bing/covid-19. You may want to look at it -> https://www.google.com/covid-19
I have prepared a presentation about the effects of the coronavirus on Search Engine Optimization (SEO).
You will find how the coronavirus is changing digital marketing and the psychology of global society when using search engines.
I have also focused on how search engines, social media brands, and e-commerce sites have reacted to the coronavirus pandemic.
You will see the websites and categories that gain traffic and those that lose it. You will also see the conversion rate differences caused by the coronavirus.
I also cover the differences between search engines, their attitude toward the coronavirus pandemic, their future, and their updates during the pandemic.
In the last part, you will see some new 2020 web technology and design trends with AI.
There is also Google research on better search engine technologies.
Questions:
1- What are the differences between Yandex, Google, Bing, and Duckduckgo for Coronavirus Pandemic?
2- Twitter, Instagram, Amazon or Apple, what are they doing?
3- What do people search most for during the Coronavirus Crisis?
4- What changes from country to country?
5- What are the future technologies of Web and App?
6- How and why do search engines improve AI, and what are the latest developments?
7- Which sites lose traffic and which gain more?
8- Lots of quotes from International SEOs about the pandemic.
And more...
I am Koray Tuğberk GÜBÜR, a Holistic SEO Expert.
I sincerely thank my dearest friend Atakan Erdoğan and Mr. Mert Erkal for this awesome webinar opportunity and experience.
To watch the webinar, please visit Stradiji's Official Youtube Channel.
https://www.youtube.com/watch?v=V4sJTNcRqaM&t=100s
Quality Content at Scale Through Automated Text Summarization of UGC - Hamlet Batista
The document discusses using automated text summarization techniques to generate quality content at scale from user-generated content like online product reviews. It proposes a technical plan to download Amazon reviews, remove duplicate sentences using neural semantic textual similarity, and then generate frequently asked questions and corresponding FAQ schema by feeding the review text into a neural question generation model. The goal is to leverage user content and machine learning to automatically create helpful content for websites.
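The duplicate-removal step can be approximated without a neural model. The sketch below swaps in plain Jaccard token overlap for the neural semantic textual similarity named above, so it only catches near-verbatim repeats; the function names and the 0.8 threshold are assumptions for illustration.

```python
import re

def tokens(sentence):
    """Lowercase a sentence and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9']+", sentence.lower()))

def jaccard(a, b):
    """Share of tokens two sets have in common (1.0 means identical sets)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def drop_near_duplicates(sentences, threshold=0.8):
    """Keep a sentence only if it is not too similar to one already kept."""
    kept, kept_tokens = [], []
    for sentence in sentences:
        current = tokens(sentence)
        if all(jaccard(current, seen) < threshold for seen in kept_tokens):
            kept.append(sentence)
            kept_tokens.append(current)
    return kept

reviews = [
    "Battery life is great.",
    "battery life is great!",
    "The screen scratches easily.",
]
print(drop_near_duplicates(reviews))
# ['Battery life is great.', 'The screen scratches easily.']
```

Replacing `jaccard` with a similarity score from a sentence embedding model would bring this closer to the approach in the presentation, since token overlap misses paraphrases.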
Slawski New Approaches for Structured Data:Evolution of Question Answering Bill Slawski
Google has moved from Search to Knowledge, and Focusing on Answering questions with knowledge graph entity information provides has led to answering queries with Knowledge graphs for those questions, with confidence scores between entities and other entities or attributes of entities, based upon freshness, reliabilillity, popularity, and proximity between an entity and another entity or an attribute.
1) Knowledge graphs are structured databases that represent real-world entities and their relationships to each other. They help search engines like Google understand topics at a deeper level.
2) Entities (topics) are becoming more important than keywords for search engines to understand content. Google's entity understanding can be checked using their natural language processing tool.
3) Semantic SEO techniques like tightly linking topics both internally and to relevant external pages can help improve how search engines understand and represent the entities within a website through their knowledge graphs.
The Python Cheat Sheet for the Busy MarketerHamlet Batista
What percentage of an Inbound marketer's day doesn't involve working with spreadsheets? How much of this work is time-consuming and repetitive? In this interactive session, you will learn how to manipulate Google Sheets to automate common data analysis workflows using Python, a very easy to use programming language.
William slawski-google-patents- how-do-they-influence-searchBill Slawski
Bill Slawski presented a webinar on analyzing patents related to search engines and SEO. He discussed 12 Google patents covering topics like PageRank, Google's news ranking algorithm, analyzing images to detect brand penetration, and building user location history. The patents described Google's work in building knowledge graphs from web pages, ranking entities in search results, question answering, and determining quality visits to local businesses.
Semantic search Bill Slawski DEEP SEA ConBill Slawski
1) Google uses various techniques to extract structured information like entities, relationships, and properties from unstructured text on the web and databases. This extracted information is then used to generate knowledge graphs and provide augmented responses to user queries.
2) One key technique is to identify patterns in which tuples of information are stored in databases, and then extract additional tuples by repeating the process and utilizing the identified patterns.
3) Google also extracts entities from user queries and may generate a knowledge graph to answer questions by providing information about the entities from sources like its own knowledge graph and information extracted from the web.
The document summarizes a presentation given by Bill Slawski at the Semantic Technology & Business Conference in San Jose. The presentation discussed how adding semantic information and structuring content around entities can help websites better optimize for search engines and provide more relevant experiences for users. It also provided several examples of how search engines are using entities and knowledge graphs to enhance search results and anticipate related queries.
How to approach SEO in a world where Google has moved from strings and keywords to things, topics and entities. Dixon JOnes is the CEO of InLinks, who have build a proprietory NLP algorithm and Knowledge Graph designed for the SEO Industry.
The document discusses using Python for SEO applications such as data extraction, preparation, analysis, machine learning and deep learning. It provides an agenda and examples of using Python to solve challenging SEO problems from site migrations and traffic losses. Methods demonstrated include pulling data from Google Analytics, storing in DataFrames, regular expression grouping, and training machine learning models on page features to classify page groups and identify losses. Later sections discuss using deep learning with computer vision models to classify web pages from screenshots.
Semantic seo and the evolution of queriesBill Slawski
This document summarizes how Google search results are evolving to include more semantic data through direct answers, structured snippets, and rich snippets. It provides examples of direct answers being extracted from authoritative sources using natural language queries and intent templates. It also discusses how including structured data like tables, schemas, and markup can help search engines understand and display page content in a more standardized way. While knowledge-based trust is an interesting concept, current search ranking still primarily relies on link analysis and does not consider factual correctness.
How to Automatically Subcategorise Your Website Automatically With Pythonsearchsolved
The document describes a Python script that can automatically generate new subcategories for an ecommerce website based on clustering product names. It discusses:
- Using NLTK to generate n-grams from product names to cluster related products
- Filtering the n-grams to keep only those with commercial value by checking for search volume and CPC data
- Running the script on a large home improvement site to identify over 1,650 new subcategory opportunities with a total search volume of over 13 million
- Sharing the script so others can automate subcategory identification for their own sites to scale up an important SEO tactic.
Bill Slawski SEO and the New Search ResultsBill Slawski
Google's search results now include entities and concepts. Entities refer to people, places, things, and 20-30% of queries are for name entities. Google uses meta data like Freebase to build a taxonomy of entities and their relationships. This supports features like the Knowledge Graph, which provides information panels, and allows querying of nearby entities which may soon be available in search results.
SEO Case Study - Hangikredi.com From 12 March to 24 September Core UpdateKoray Tugberk GUBUR
This document provides SEO metrics and comparisons for the website hangikredi.com over several time periods between April 2019 and September 2019. It shows substantial increases in key metrics like organic traffic, clicks, impressions, and average position after Google algorithm updates in May, June, July, and September. However, it also shows significant drops in these metrics during a server outage in early August. Overall the data demonstrates the site's strong SEO performance and organic growth over the 6-month period analyzed.
This document discusses digital marketing strategies focused on establishing authority through valuable, timeless content. It recommends creating content such as articles, videos, and academic papers on topics that will remain relevant for years to establish expertise. Creating a steady stream of high-quality content over time builds an online presence and credibility without major risks of losses, and may lead to job offers, clients, or other opportunities. It provides examples of interactive dashboards and open-source software that gained popularity and users by continuously publishing improvements and documentation without needing to rely on things like resumes or company profiles.
BrightonSEO March 2021 | Dan Taylor, Image Entity TagsDan Taylor
My talk from BrightonSEO 2021; focusing on using Google's image category labels (glancing into the Knowledge Graph and Google's image annotation processes) for better topic research and content optimization.
BrightonSEO October 2022 - Log File Analysis - Steven van Vessum.pdfSteven van Vessum
This document discusses how log file insights can help companies improve their crawling, indexing and organic marketing performance. It outlines some of the common issues companies face like not understanding search engine behavior and not reflecting on their past work. With log file insights accessible in real-time and automatically distilled, companies can answer critical questions to speed up their crawl times, see how search engines are handling their updated content and troubleshoot issues. The presenter promotes their solution, ContentKing, which provides real-time log file analysis from CDN logs to help companies learn what search engines know and keep sharpening their SEO strategies.
Semantic Publishing and Entity SEO - Conteference 20-11-2022Massimiliano Geraci
Semantic Publishing is publishing a page on the Internet by adding a semantic layer (i.e., semantic enrichment) in the form of structured data that describes the page itself.
Presentation given at the British Library Turing workshop on Software Citation, considering what lessons could be learned from the world of data citation
The document discusses best practices for preparing data for open publication. It recommends thinking openly and planning early by creating detailed data management plans. It provides examples of repositories like GenBank, ClinicalTrials.gov, FlyBase, Figshare, and Dryad that accept different types of data. The document emphasizes documenting data thoroughly with metadata and standards and following ethical guidelines for sharing and preserving data in the long term.
How to manage the complete content strategy in WordPress using plugins. Do your content inventory in WordPress -- no spreadsheets! Do content modeling using custom post types, taxonomies, and fields. Video: http://wordpress.tv/2013/08/02/stephanie-leary-content-strategy-wordpress-case-studies/
Search and social patents for 2012 and beyondBill Slawski
The document summarizes Bill Slawski's presentation on search and social media patents from 2012 and beyond. It discusses various patents Google has acquired related to search, social media, hardware, fiber optic networks, and more. It also outlines patents for phrase-based indexing, concept-based indexing, ranking pages based on user interactions, building a knowledge graph, and developing a planet-scale distributed search index. Slawski suggests Google may expand into hardware, entertainment, internet service provision, and more based on its patent portfolio.
This document provides guidance on evaluating electronic information sources for research. It discusses formulating a research question and where to find answers, such as books, academic journals, newspapers, magazines and the internet. It outlines criteria for evaluating sources, such as checking the author and publisher, assessing the purpose and reliability of the information. It also discusses using search techniques like Boolean operators and quotation marks to refine searches. Domain name extensions like .edu and .gov are explained to determine the type of website. Steps of the research process, including developing search strategies and assessing source quality, are also outlined.
Open Access and Open Education: Background, lobby tips, and continuing the di...Nicole Allen
This document summarizes a presentation on open access and open education. It discusses the Right to Research Coalition and SPARC's work promoting open access to research and educational resources. Key points covered include the growth of the open access movement, challenges of high journal and textbook costs, policies advancing open access, and ways students can advocate for open access and open educational resources on their campuses.
Public access to research results at USDACyndy Parr
An update on public access activities at the National Agricultural Library and next steps, presented 11 January 2017 at the Earth Science Information Partners (ESIP) meeting in Bethesda, Maryland.
Workshop - finding and accessing data - Cambridge August 22 2016Fiona Nielsen
Finding and accessing human genomic data for research
University of Cambridge, United Kingdom | Seminar Room G
Monday, 22 August 2016 from 10:00 to 12:00 (BST)
Charlotte, Nadia and Fiona presented an overview of data sources around the world where you can find genomics data for your research and gave examples of the data access application for dbGaP and EGA with specific details relevant for University of Cambridge researchers.
This document provides information about searching online, including:
- The size of the internet has grown tremendously, making proper searching skills more important.
- IPV6 was launched in 2012 to accommodate more internet addresses as devices increase.
- Search engines, directories, and databases are described as important tools for online research. Keywords, boolean searchers, and other search techniques are also outlined.
- Criteria like authority, purpose, currency and bias are important to evaluate sources found in online searches.
SciDataCon 2014 Data Papers and their applications workshop - NPG Scientific ...Susanna-Assunta Sansone
Part of the SciDataCon14 workshop on "Data Papers and their applications" run by myself and Brian Hole to help attendees understand current data-publishing journals and trends and help them understand the editorial processes on NPG's Scientific Data and Ubiquity's Open Health Data.
Finding things to write about can be difficult for bloggers. Here is how to get the most out of your content by using resources already available to you.
Lesson 8 in a set of 10 created by DataONE on Best Practices for Data Management. The full module can be downloaded from the DataONE.org website at: http://www.dataone.org/educaiton-modules. Released under a CC0 license, attribution and citation requested.
There is a method to it: Making meaning in information research through a mix...Lynn Connaway
This document summarizes a presentation on using mixed methods approaches in information behavior research. It discusses how log analysis of search queries can be combined with user interviews to provide context around search behaviors. As an example, the presentation examines a project that analyzed WorldCat Discovery search logs and conducted interviews with users to better understand how they navigate from search to accessing information. The methodology provides benefits like context around quantitative log data, but also challenges in terms of resources needed. Overall it argues for the value of mixed methods approaches in gaining a richer understanding of user behaviors and experiences.
Using Gale In Context: Biography Instructional Presentationrikkimoore
Gale in Context is a subscription-based biography database containing over 600,000 entries from all time periods and fields. It provides biographical profiles, images, audio/video selections, and articles from journals and newspapers. As a Chicago Public Library patron, you can access it for free on their website. You can search by name, browse categories, or use advanced search filters to find profiles. Once you find a relevant source, the database tools allow you to highlight/annotate, translate, generate citations, or find related content. Librarians are available to help navigate and use this resource.
This document discusses sharing research data. It describes the Data Services Center, which provides data services including finding and providing access to datasets. It notes that funders and publishers require data sharing, and that shared data receives more citations. It recommends sharing the minimum data needed to reproduce results, and considering timing, usability and granularity of data sharing. For sharing methods, it recommends using disciplinary or general repositories like UR Research, Dryad and REACTUR, which provide long-term preservation and access. Workshops and help are available for data management and sharing.
This document provides guidance on finding argument sources for a research topic. It discusses defining searchable keywords, identifying relevant databases, and learning advanced search techniques. The document outlines different types of sources - background, exhibit, argument, and method - and where to find each type. It suggests brainstorming keywords, identifying subject areas, practicing database searches, and using techniques like advanced search options and citations to find full-text sources. The overall aim is to help students effectively search databases and identify significant scholarship related to their research topic.
Scaling Recommendations, Semantic Search, & Data Analytics with solrTrey Grainger
This presentation is from the inaugural Atlanta Solr Meetup held on 2014/10/21 at Atlanta Tech Village.
Description: CareerBuilder uses Solr to power their recommendation engine, semantic search, and data analytics products. They maintain an infrastructure of hundreds of Solr servers, holding over a billion documents and serving over a million queries an hour across thousands of unique search indexes. Come learn how CareerBuilder has integrated Solr into their technology platform (with assistance from Hadoop, Cassandra, and RabbitMQ) and walk through api and code examples to see how you can use Solr to implement your own real-time recommendation engine, semantic search, and data analytics solutions.
Speaker: Trey Grainger is the Director of Engineering for Search & Analytics at CareerBuilder.com and is the co-author of Solr in Action (2014, Manning Publications), the comprehensive example-driven guide to Apache Solr. His search experience includes handling multi-lingual content across dozens of markets/languages, machine learning, semantic search, big data analytics, customized Lucene/Solr scoring models, data mining and recommendation systems. Trey is also the Founder of Celiaccess.com, a gluten-free search engine, and is a frequent speaker at Lucene and Solr-related conferences.
February 18 2015 NISO Virtual Conference Scientific Data Management: Caring for Your Institution and its Intellectual Wealth
Learning to Curate Research Data
Jennifer Doty, Research Data Librarian, Emory Center for Digital Scholarship, Emory University, Robert W. Woodruff Library
How to evaluate the whole web (without being Google)Dixon Jones
Could you build your own, private view of the Internet? One that isn't reliant on Google or Bing? Majestic has done this and now has one of the largest web indexes on the planet. Whilst known and a backlink analysis engine, Majestic infact has its own, unique view of the Internet and is able to derive meaning, influence and context out of its dataset. Here's how they did it. (2018)
2. @KorayGubur
About Me
Koray Tuğberk GÜBÜR
Owner and Founder of Holistic SEO & Digital
• Educates his team
• Publishes SEO Case Studies, Research & Guides
• Twitter: @KorayGubur
• Email: ktgubur@holisticseo.digital
• Official Site: https://www.holisticseo.digital
6. @KorayGubur
What is Query Parsing?
• Query Parsing is the process of understanding the different sections of a query.
• Types: Entity-seeking Query, Substitute Term, or Synonym Term.
• Canonical and Represented Versions: A Canonical Query can represent its close variations.
• Query Character: Affects the SERP Design and the Dominant and Minor Search Intent Assignments.
• Query Processing: Another name for Query Parsing.
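As a rough illustration of how a canonical query can stand in for its close variations, the minimal sketch below groups logged queries by a normalized key and picks the most frequent spelling as the canonical one. The `normalize` and `canonical_queries` helpers and the normalization rules (lowercasing, dropping punctuation) are assumptions made for this example, not a description of how any search engine does it.

```python
import re
from collections import Counter, defaultdict


def normalize(query):
    """Lowercase the query and keep only alphanumeric tokens (assumed rule)."""
    return " ".join(re.findall(r"[a-z0-9]+", query.lower()))


def canonical_queries(query_log):
    """Map each normalized key to its most frequent surface form."""
    groups = defaultdict(Counter)
    for query in query_log:
        groups[normalize(query)][query] += 1
    # The most common variant represents the whole group of close variations.
    return {key: variants.most_common(1)[0][0] for key, variants in groups.items()}


log = ["Query Parsing", "query parsing", "query-parsing", "query parsing", "SEO"]
canonical = canonical_queries(log)
```

Here the three spellings of "query parsing" collapse into one group, and the lowercase form wins because it appears most often in the toy log.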
7. @KorayGubur
Multi-Stage Query Processing
• The first patent that talks about the «Context of Words».
• It tries to remove the stop words.
• It stems the concrete words.
• It expands words with Synonyms and Co-occurrence.
• Some Criteria: Absent Queries, Boolean Logic, Query Term Weights, Document Popularity, Word Proximity (Distance), Word Adjacency.
• It uses «VIPS» and Web Page Layout.
Inventors: Jeffrey Adgate Dean, Paul G. Haahr, Olcan Sercinoglu, and Amitabh K. Singhal
US Patent Application 20060036593
Filed: August 13, 2004
Published February 16, 2006
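The stages listed on this slide (stop-word removal, stemming, synonym expansion) can be sketched as a small pipeline. This is only a toy illustration: the `STOP_WORDS` and `SYNONYMS` tables are hypothetical, and `naive_stem` is a crude suffix stripper standing in for a real stemmer. None of it is taken from the patent itself.

```python
import re

# Hypothetical toy resources; a real system would use far larger ones.
STOP_WORDS = {"the", "a", "an", "of", "for", "in", "to", "is"}
SYNONYMS = {"car": {"automobile"}, "cheap": {"inexpensive"}}


def naive_stem(word):
    """Strip a few common suffixes (a crude stand-in for a real stemmer)."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def process_query(query):
    """Run the three stages and return the stems plus their expansions."""
    tokens = re.findall(r"[a-z0-9]+", query.lower())
    content = [t for t in tokens if t not in STOP_WORDS]  # stage 1: stop words
    stems = [naive_stem(t) for t in content]              # stage 2: stemming
    # Stage 3: expand each stem with its known synonyms, if any.
    expanded = {s: sorted({s} | SYNONYMS.get(s, set())) for s in stems}
    return stems, expanded


stems, expanded = process_query("The cheap cars for renting")
```

For the sample query, the stop words "the" and "for" are dropped, "cars" and "renting" are stemmed, and "cheap" and "car" pick up one synonym each.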
8. @KorayGubur
Query Breadth
• This is for «adjacent words» and «unknown entities».
• It uses the related document count to measure the ‘query breadth’.
• Query Breadth can be decreased with the ‘adjacent word’ count.
• Query Breadth can be used for ‘Named Entity Recognition’, or Triple Creation (a Subject, a Predicate, and an Object).
Invented by Karl Pfleger and Brian Larson
Assigned to Google
US Patent 7,925,657
Granted April 12, 2011
Filed: March 17, 2004
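To make the idea of query breadth more concrete, here is a minimal sketch over a toy corpus. It assumes breadth is the share of documents that contain every query term, and that requiring the terms to be adjacent narrows that set. The `DOCS` corpus and the three helper functions are hypothetical, and the patent's actual measure may differ.

```python
# Hypothetical toy corpus, already lowercased.
DOCS = [
    "new york pizza delivery",
    "pizza recipes from new chefs in york",
    "york city guide",
    "new york hotels",
]


def matching_docs(query, docs):
    """Documents containing every query term, in any position."""
    terms = query.lower().split()
    return [d for d in docs if all(t in d.split() for t in terms)]


def adjacent_docs(query, docs):
    """Documents containing the query terms as one adjacent phrase."""
    phrase = query.lower().split()
    n = len(phrase)
    hits = []
    for d in docs:
        words = d.split()
        if any(words[i:i + n] == phrase for i in range(len(words) - n + 1)):
            hits.append(d)
    return hits


def query_breadth(query, docs):
    """Share of the corpus related to the query (0.0 narrow, 1.0 broad)."""
    return len(matching_docs(query, docs)) / len(docs)


broad = matching_docs("new york", DOCS)
narrow = adjacent_docs("new york", DOCS)
```

In this corpus, three documents mention both "new" and "york", but only two contain the adjacent phrase, so the adjacency requirement reduces the breadth of the query.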
9. @KorayGubur
Query Analysis
• Selection Over Time: A document can be chosen more frequently in some timespans than in others.
• Documents with Hot Topics: Rising Queries can boost documents that include these queries.
• Documents with Related Hot Topics: Queries related to rising queries can boost the documents that include those related queries.
• Constant Queries with Consistently Changing Results: A Constant Query is an always-popular query whose topic information keeps changing.
• Freshness of Documents: The date of the information on the web page, not the date of the document’s last version.
Invented by Karl Pfleger and Brian Larson
Assigned to Google
US Patent 7,925,657
Granted April 12, 2011
Filed: March 17, 2004
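The "hot topics" idea on this slide can be illustrated with a small sketch: find queries whose volume has grown sharply between two time windows, then boost documents that include them. The growth threshold, the boost factor, and both function names are assumptions chosen for the example, not values from the patent.

```python
def rising_queries(previous_counts, current_counts, min_growth=2.0):
    """Queries whose volume grew by at least min_growth between two windows."""
    rising = set()
    for query, now in current_counts.items():
        before = previous_counts.get(query, 0)
        # max(before, 1) avoids treating every unseen query as infinite growth.
        if now >= min_growth * max(before, 1):
            rising.add(query)
    return rising


def boost_scores(base_scores, doc_queries, rising, boost=1.2):
    """Multiply the score of documents that include at least one rising query."""
    boosted = {}
    for doc, score in base_scores.items():
        hot = doc_queries.get(doc, set()) & rising
        boosted[doc] = score * boost if hot else score
    return boosted


previous = {"masks": 10, "weather": 100}
current = {"masks": 50, "weather": 110, "vaccine": 5}
rising = rising_queries(previous, current)
scores = boost_scores({"a": 1.0, "b": 2.0}, {"a": {"masks"}, "b": {"weather"}}, rising)
```

In this toy data, "masks" and "vaccine" count as rising while "weather" stays flat, so only document "a" receives the boost.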
10. @KorayGubur
Query Analysis
• Staleness of Documents: Historical Data
amount can be a positive ranking signal
for a page for a query.
• Overly Broad Pages: Includes discordant
queries, a signal for spam.
• A continuation patent was filed in 2011
for the «document locator», and some
terms were changed.
@KorayGubur
Inventors: DEAN; Jeffrey; (Palo Alto,
CA) ; Haahr; Paul; (San Francisco, CA) ;
Henzinger; Monika; (Corseaux, CH) ;
Lawrence; Steve; (Mountain View, CA) ;
Pfleger; Karl; (Mountain View, CA) ;
Sercinoglu; Olcan; (Mountain View, CA) ;
Tong; Simon; (Mountain View, CA)
Assignee: GOOGLE INC.
Mountain View
CA
Family ID: 34381362
Appl. No.: 13/244853
Filed: September 26, 2011
11. @KorayGubur
Query Analysis
• Trends Related to Topics and Search Terms: Grouping
Topics, and Subtopics announced for Trending Queries.
• Access Times to Determine Freshness and Staleness:
Compares the First Access and Last Access time for
certain documents.
• Frequency of Selection: Compares the selection count
for an earlier and a later period.
• When Staleness Might be Preferred: Even if there are
fresh news items or documents, the user can choose the
stale document. Such documents are not affected
negatively by being stale.
• Spam Determination Based Upon Breadth of Rankings,
and Authority: If the document is popular, or
authoritative (link-based), or the source is relevant
enough, it will be an exception.
12. @KorayGubur
Query Analysis
• Continuation of the Historical Data
Patent.
• Speaks about Topics, and Query
Categorization based on Topics.
• It is important because, shortly afterwards
(May 2012), Google launched its Knowledge
Graph with more than 500 million entities
and 3.5 billion facts.
@KorayGubur
13. @KorayGubur
Midpage Query Refinements
• In 2006, Google published
«Midpage Query Refinements», known
today as Search Suggestions.
• The GUI test ran between 2004 and 2006.
• The patent was filed in 2003.
• Includes Semantic Query Clusters for
Different Contexts.
• A Matcher, a Clusterer, A Scorer, and A
Presenter.
@KorayGubur
Inventors: Haahr, Paul; (San Francisco, CA) ; Baker, Steven;
(Mountain View, CA)
Correspondence Address:
PATRICK J S INOUYE P S
810 3RD AVENUE
SUITE 258
SEATTLE
WA
98104
US
Family ID: 34228721
Appl. No.: 10/668721
Filed: September 22, 2003
14. @KorayGubur
Midpage Query Refinements
• Precomputation Engine has four parts.
• Associator: Query and Document
Association.
• Selector: Document and Query Section
Selector.
• Regenerator: Checks the query logs to
refresh the selections.
• Inverter: Checks the Cached Data for
presenting.
@KorayGubur
15. @KorayGubur
Midpage Query Refinements
• Query Ambiguity: If the query is ambiguous,
the Search Engine can use the query clusters.
• Homonyms, General Terms, Improper
Context, and Narrow Terms can create a
stateless SERP Instance.
• To prevent this, Semantic Grouping,
Centroids, and Centroid distance are used.
• A Query Cluster and a Document Cluster can
be paired. If the Document Cluster is larger
or more relevant, the query cluster will be
used as a query suggestion.
@KorayGubur
16. @KorayGubur
Midpage Query Refinements
• Matcher: Stored query variations are put into a
cluster, and document phrase variations are
matched.
• Clusterer: The matched query variations and
documents are clustered together. These
clusters differ from query clusters.
• Scorer: Determines the centroid of each cluster.
If the term vectors are distant from the centroid,
the Clusterer chooses another cluster for the
Scorer.
• Presenter: The created clusters and centroids
are presented to the user. According to the
preferred choices, the Presenter uses
sub-centroids.
@KorayGubur
17. @KorayGubur
Midpage Query Refinements
• In 2017, an updated version of the patent
was granted.
• The Scorer method was changed.
• Representative Queries are chosen based
on centroids.
• For every cluster, a representative query is
chosen.
• The clusters are ordered according to
cluster size and relevance scores.
• Sub-queries are used as the
refinement queries.
@KorayGubur
Inventors: Paul Haahr and Steven D. Baker
Assignee: Google Inc.
The United States Patent 9,552,388
Granted: January 24, 2017
Filed: January 31, 2014
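A rough sketch of how a representative query could be picked per cluster: build a bag-of-words vector for each query, average the vectors into a centroid, and keep the query closest to it. The cosine measure, the example queries, and every function name here are assumptions for illustration; the patent's actual scoring is not reproduced.

```python
import math
from collections import Counter


def vectorize(query):
    """Bag-of-words term vector for a query."""
    return Counter(query.lower().split())


def centroid(vectors):
    """Average of the term vectors in a cluster."""
    total = Counter()
    for vector in vectors:
        total.update(vector)
    return {term: count / len(vectors) for term, count in total.items()}


def cosine(a, b):
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def representative(cluster):
    """The query whose vector lies closest to the cluster centroid."""
    center = centroid([vectorize(q) for q in cluster])
    return max(cluster, key=lambda q: cosine(vectorize(q), center))


# Hypothetical clusters for the ambiguous query "jaguar".
clusters = [
    ["jaguar animal", "jaguar animal habitat"],
    ["jaguar car", "jaguar car price", "jaguar car dealer"],
]
# Larger clusters are listed first; relevance scores are left out of this sketch.
ordered = sorted(clusters, key=len, reverse=True)
refinements = [representative(c) for c in ordered]
```

Here the car cluster is larger, so its representative query is suggested first.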
18. @KorayGubur
Midpage Query Refinements
• Inventors of the Midpage Query Refinement
Methodology are Paul Haahr and Steven D.
Baker.
• Steven Baker wrote the Google Synonyms
blog post for Google’s Synonym Update,
before the RankBrain announcement.
• Helping Computers Understand Language:
https://googleblog.blogspot.com/2010/01/helping-computers-understand-language.html
• Paul Haahr gave the «How Google Works»
presentation at SMX West, which includes
lots of useful insights.
@KorayGubur
19. @KorayGubur
Context-Vectors
• Midpage Query Refinements and Query-
Document Logical Pairs with Centroids and
Clusters are the beginning of RankBrain.
• Context-Vectors were the second step for
completing the journey.
• Word Vectors and Context Vectors are
different from each other.
• Word Vectors are combinations of words.
• Context Vectors are lists of word
combinations for a Contextual Domain.
• A Term Vector is a word combination from
a Contextual Domain.
@KorayGubur
Inventors: David C. Taylor
Application Date: 09/04/2012
Grant Number: 09449105
Grant Date: 09/20/2016
21. @KorayGubur
Context-Vectors
• Context-Vectors are close to the ‘Lexicon’
described in Google's first research paper,
«The Anatomy of a Large-Scale Hypertextual
Web Search Engine».
• Context-Vectors are a version of the Lexicon
with different Contextual Domains.
• Context-Vectors are located in Domain List
Terms.
• Domain List Terms can include 800,000
words and word combinations.
• Domain List Terms can include a
macro-context, and a sub-context with
sub-portions.
@KorayGubur
22. @KorayGubur
Context-Vectors
• Context-vectors use ‘Topical Entries’.
• A Topical Entry can be used for a
macro-context.
• These topical entries can be used for
question generation.
• Generated questions can be used for
differentiating the different sub-contexts
from each other.
• A Macro-context can have a Dominant
Knowledge Domain. A Context-Vector can
be used for intersectional areas.
@KorayGubur
23. @KorayGubur
Categorical Quality
• This is a ‘Re-ranking’ Algorithm Patent.
• There is a strong difference between
Re-ranking and Initial Ranking.
• Re-ranking Algorithms modify the Query
Results after the initial ranking.
• The inventor is Trystan Upstill, author of
the Evidence-based Ranking research.
• Categorical Quality doesn’t focus on
relevance or authoritativeness; it focuses
on understanding the Category of the
Query.
@KorayGubur
Inventors: Trystan G. Upstill, Abhishek Das, Jeongwoo
Ko, Neesha Subramaniam, and Vishnu P. Natchu
US Patent Application: 20190155948
Published on: May 23, 2019
Filed: March 31, 2015
24. @KorayGubur
Categorical Quality
• This patent mentions the ‘social media shares’
and community size.
• If the query satisfies the ‘categorical query’
conditions, the search results will also be
evaluated for related and close queries.
• If a resource also satisfies the related categorical
queries, a categorical quality score will be
assigned to the source.
• Categorical Quality Methodology collects
Navigational Queries for different sources.
• If the source has more navigational queries, it
means that it has a popularity for the category.
• Categorical Quality mentions «Topicality Score».
@KorayGubur
25. @KorayGubur
Categorical Quality
• If a source includes all query terms for a
topic, it will have more Categorical Quality
and Topicality Score.
• This method also mentions ‘Click
Selection.’
• To measure the model’s success, they
do not take every click or CTR into
account.
• They take CTR and clicks into account only
if they meet certain criteria such as time,
frequency, or personal interest.
@KorayGubur
26. @KorayGubur
Substitute Query
• A Substitute Query is a query that can replace
another query. These queries are used for
bolding some sections of the content.
• Substitute Queries make ‘context’ more
important, because synonyms can change
the context. For example, car and auto can be
the same thing for ‘repair’, but they are not
the same for ‘railroad’.
• There is a railroad car, but not a railroad auto.
• Thus, Substitute Queries are not plain
synonyms. They are words that can be
replaced without changing the context.
@KorayGubur
Invented by Daisuke Ikeda and Ke Yang
Assigned to Google
US Patent 8,504,562
Granted August 6, 2013
Filed: April 3, 2012
27. @KorayGubur
Substitute Query
• A Co-occurrence Matrix and Phrase-based
Indexing are used to support
Substitute Queries.
• The method uses a vector space
to compare word vectors with each
other.
• If two queries are similar to each
other, with enough co-occurring
words, they can substitute for
each other.
@KorayGubur
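The car/auto example can be sketched with toy co-occurrence vectors: two terms count as substitutes in a given context only if the candidate also co-occurs with that context word and the two vectors are similar overall. The counts, the 0.8 threshold, and the names `COOCCURRENCE` and `can_substitute` are made-up assumptions, not values from the patent.

```python
import math
from collections import Counter

# Hypothetical counts of how often each term appears near a context word.
COOCCURRENCE = {
    "car": Counter({"repair": 8, "dealer": 6, "railroad": 4}),
    "auto": Counter({"repair": 7, "dealer": 5}),
}


def cosine(a, b):
    dot = sum(weight * b[term] for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def can_substitute(term, candidate, context, threshold=0.8):
    """True if `candidate` may replace `term` next to `context`."""
    a, b = COOCCURRENCE[term], COOCCURRENCE[candidate]
    # The candidate must itself be seen with the context word,
    # and the two co-occurrence vectors must be similar overall.
    return b[context] > 0 and cosine(a, b) >= threshold


repair_ok = can_substitute("car", "auto", "repair")
railroad_ok = can_substitute("car", "auto", "railroad")
```

With these toy counts, "auto" can stand in for "car" next to "repair" but not next to "railroad".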
28. @KorayGubur
Synthetic Query
• Synthetic Query is the re-written version of
the query of the user by the search engine.
• A search engine can re-write a query by
augmenting it to diversify the SERP
Features and increase the chance of a
satisfying search session.
• Some score types that Synthetic Queries
include are ‘Edit Distance Score’, ‘Similarity
Score’, ‘Transformation Cost Score’.
• Synthetic Queries can be collected from web
documents, Structured Data, and Similarity
Between Documents.
@KorayGubur
Inventors: Anand Shukla, Mark Pearson, Krishna
Bharat and Stefan Buettcher
Assignee: Google LLC
US Patent: 9,916,366
Granted: March 13, 2018
Filed: July 28, 2015
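Of the score types named above, edit distance is the easiest to illustrate. The sketch below computes the standard Levenshtein distance between two query strings; turning it into a 0-to-1 "score" by dividing by the longer length is an assumption made here, since the slide does not say how the patent normalizes it.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum single-character edits from a to b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def edit_distance_score(query, candidate):
    """Hypothetical normalization: 1.0 means identical strings."""
    longest = max(len(query), len(candidate)) or 1
    return 1.0 - edit_distance(query, candidate) / longest
```

For example, `edit_distance("kitten", "sitting")` is 3, the classic textbook case.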
29. @KorayGubur
Synthetic Query and
Query Templates
• Query Templates are intermediary forms between the
Seed Queries and Synthetic Queries.
• Synthetic Queries are helpful for a Search Engine to
create pre-defined and pre-ordered SERP Instances.
• Synthetic Queries can be generated from HTML Tags,
IDF Scores, Close Phrases.
• If a Document has «Dorothy Parker Biography» as H1,
and «Sylvia Plath» as H2.
• Search Engine can use the «Sylvia Plath Biography» as
a synthetic query.
• If the results are good enough for relevance and
quality, the Synthetic Query will become a Seed
Query.
@KorayGubur
Invented by Steven D. Baker, Michael Flaster,
Nitin Gupta, Paul Haahr, Srinivasan Venkatachary,
and Yonghui Wu
Assigned to Google
US Patent 8,346,792
Granted January 1, 2013
Filed: November 9, 2010
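The Dorothy Parker / Sylvia Plath example can be written as a two-step sketch: turn the seed query into a template by replacing the entity with a slot, then fill the slot with another entity found in the same document. The function names and the `{entity}` slot syntax are illustrative assumptions.

```python
def to_template(seed_query, entity):
    """Replace the known entity in a seed query with a slot."""
    return seed_query.replace(entity, "{entity}")


def synthesize(template, entities):
    """Fill the slot with other entities to get synthetic queries."""
    return [template.format(entity=e) for e in entities]


# H1 of the document supplies the seed query, H2 supplies a new entity.
template = to_template("Dorothy Parker Biography", "Dorothy Parker")
synthetic = synthesize(template, ["Sylvia Plath"])
```

The template is the intermediary form between the seed query and the synthetic query.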
30. @KorayGubur
Synthetic Query and
Query Templates
• Synthetic Queries can be generated from
the same author, journal, source, or
time period.
• Synthetic Queries and Open Information
Extraction are closely related to each
other.
• Before entering the world of entities,
understanding the world of phrases is
important.
• Open Information Extraction, Unknown
Phrases, and Entities are connected
to each other.
@KorayGubur
31. @KorayGubur
Open Information Extraction
• Google bought Wavii for a reported
$30,000,000 in 2013.
• Open Information Extraction is about ‘fact
extraction’ around nouns.
• It connects different nouns to each
other based on relations.
• A classifier assigns a confidence score to
a relation between two nouns.
• This is a text-to-data example.
• Wavii was originally a news aggregator
based on topics, not phrases.
@KorayGubur
Invented by Michael J. Cafarella, Michele Banko,
and Oren Etzioni
Assigned to: University of Washington through its
Center for Commercialization
United States Patent 7,877,343
Granted January 25, 2011
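As a data-structure sketch, an extracted fact can be stored as a tuple of two noun phrases, the relation phrase between them, and a confidence value. The pattern matching and the fixed confidence numbers below are a deliberately naive stand-in for the trained classifier the patent describes; `RelationTuple`, `RELATION_PHRASES`, and `extract` are hypothetical names.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationTuple:
    subject: str
    relation: str
    obj: str
    confidence: float


# Hypothetical relation phrases with hand-picked confidence values.
RELATION_PHRASES = {
    "was created by": 0.9,
    "is the author of": 0.9,
    "is from": 0.7,
}


def extract(sentence):
    """Return the relation tuples whose phrase appears in the sentence."""
    tuples = []
    for phrase, confidence in RELATION_PHRASES.items():
        pattern = r"^(.+?)\s+" + re.escape(phrase) + r"\s+(.+?)\.?$"
        match = re.match(pattern, sentence)
        if match:
            tuples.append(
                RelationTuple(match.group(1), phrase, match.group(2), confidence)
            )
    return tuples


facts = extract("The cartoon was created by Gary Larson.")
```

This is the text-to-data step in miniature: a sentence goes in, a structured record comes out.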
32. @KorayGubur
Open Information Extraction
• The relational tuples include at least two
nouns connected to each other by at least
one verb phrase, such as ‘created by’,
‘author of’, ‘is from’, ‘located there’.
• ‘... Moreover, the number and complexity
of entity types on the Web means that
existing NER systems are inapplicable...’
• Open IE is for Unknown Entities, and for
recognizing Minor Entities that are not
registered in the Knowledge Base.
@KorayGubur
33. @KorayGubur
Answer-seeking Query
• Answer-seeking Queries have specific
elements within the questions, and
answers.
• Google’s purpose is to extract
question and answer formats for
answer-seeking queries.
• Answer-seeking queries require concise
answers without any skepticism.
• Answer-seeking Query is an important
bridge between the Natural Language
Queries with an Intent.
@KorayGubur
Inventors: Yi Liu, Preyas Popat, Nitin Gupta, and Afroz
Mohiuddin
Assignee: Google LLC
US Patent: 10,592,540
Granted: March 17, 2020
Filed: June 28, 2016
34. @KorayGubur
Answer-seeking Query
• Question Elements are Entity Instance,
Entity Class, Part of Speech Class, Root
Word, N-Gram, and Question Triggering
Words.
• Answer Elements are Measurement,
N-Gram, Verb, Preposition, Entity Instance,
N-Gram Near Entity, Verb Near Entity,
Preposition Near Entity, Verb Class, and
Skip Grams.
• Answer-seeking Queries trigger the Answer
Scoring Engine.
@KorayGubur
35. @KorayGubur
Natural Language Queries
• Natural Language Queries are queries
written in everyday language.
• They do not follow proper grammar rules
or form complete sentences.
• They do not explicitly state their intent.
• That’s why these queries are also called
Intent Queries, or Queries with a specific
minor intent.
• For such a query, a Search Engine should
return an answer without lots of detail
or structure.
@KorayGubur
International Application No WO/2014/197227
Published:11.12.2014
International Filing Date: 23.05.2014
Applicant: Google
Inventors: Tomer Shmiel, Dvir Keysar, and Yonatan Erez
36. @KorayGubur
Natural Language Queries
• Natural Language Queries are not factual queries; this is
the main difference from Answer-seeking Queries.
• Natural Language Queries are related to Intent
Template Generation.
• A Natural Language Query can have multiple intents with
non-factual information, such as ‘How do I make
hummus?’.
• There are different methods for making hummus and
different types of hummus. The query also includes ‘I’,
and no one can know how you, specifically, make hummus.
• The answer-seeking version of this query is ‘How to make
hummus’.
• One important methodology point here is that
Google creates ‘heading-text’ pairs to understand the
topics of an article’s sub-sections.
@KorayGubur
37. @KorayGubur
Natural Language Queries
• Variable and Non-Variable Portions are
important concepts for intent templates.
• The non-variable portion of the intent for the
previous query is ‘hummus’.
• The variable portion can be a ‘place, method,
tool, or style’. And ‘I’ can vary: a child,
a woman, a man, an adult, or a blind person.
• For Natural Language Queries, Intent
Templates can be applied to different
Query Patterns such as «X Causes» and
«X Reasons».
• If someone searches for only X, the intent
templates are used to assign the natural
language results to the query.
@KorayGubur
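The «X Causes» / «X Reasons» patterns can be sketched as templates with one slot: a bare search for X is expanded into the templated queries, and a templated query can be parsed back into its template and X. The template strings, the `{x}` slot syntax, and the function names are assumptions for illustration only.

```python
import re

# Hypothetical intent templates; {x} is the slot filled by the topic.
INTENT_TEMPLATES = ["{x} causes", "{x} reasons"]


def expand(topic):
    """Queries a bare topic could be mapped to through the templates."""
    return [template.format(x=topic) for template in INTENT_TEMPLATES]


def parse(query):
    """Return (template, slot value) for a templated query, else None."""
    normalized = query.lower().strip()
    for template in INTENT_TEMPLATES:
        prefix, suffix = template.split("{x}")
        pattern = "^" + re.escape(prefix) + "(?P<x>.+)" + re.escape(suffix) + "$"
        match = re.match(pattern, normalized)
        if match:
            return template, match.group("x")
    return None


expanded = expand("headache")
parsed = parse("Headache causes")
```

A query that fits no template, such as a bare "hummus", simply returns `None` from `parse`.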
38. @KorayGubur
Query Rewriting for Same
Intent Across Languages
• Google has tried to connect different search
intents, the data for these intents, and the
phrases that represent them, in order to
improve search results.
• This is called Query Expansion. Query
Expansion can compare the results for a
query in one language with the results for the
same query in a different language.
• If the click satisfaction possibility is higher
in another language for the same intent, the
search engine can re-rank the results for the
first language.
@KorayGubur
Invented by Stefan Riezler, Alexander L. Vasserman
Assigned to Google
US Patent Application 20080319962
Published December 25, 2008
Filed: March 17, 2008
39. @KorayGubur
Seed-Queries
• Seed Queries can be synthetic queries or
user-generated queries. The main
requirement for a seed query is that it
is satisfied by a set of documents.
• If a query is logical, popular, and satisfying
for the user, it will be marked as a seed
query whether it is synthetic or
searcher-generated.
• Seed Queries are used to determine the
representative queries for query
variations, and for query and intent templates.
@KorayGubur
Inventors Manaal Faruqui and Dipanjan Das
Applicants Google LLC
Publication Number 20200167379
Filed: January 18, 2019
Publication Date May 28, 2020
40. @KorayGubur
End of Phrase-based Indexing and Query
Processing Chaos
• Query Parsing
• Seed Query
• Substitute Query
• Natural Language Query
• Answer-seeking Query
• Factual Query
• Non-factual Query
• Non-variable Portion in Query
• Variable Portion in Query
• Discordant Query
• Query Re-writing
• Open Information Extraction
• Synthetic Query
• Categorical Query
• Contextual Vectors
• Term Vectors
• Intent Templates
• Question and Answer Elements
• Co-occurrence Matrix
• Query Expansion
• Query Term Weight
• Multi-stage Query Processing
• Query Breadth
• Query Template
• Relation Types and Noun Tuples
• Macro-context
• Topical Entry
• Mid-page Query Refinement
• Query Ambiguity
• Query Cluster – Document Cluster for Logical Pair
• Associator, Matcher, Scorer for Query, Document
Association
• ‘Edit Distance Score’, ‘Similarity Score’, ‘Transformation
Cost Score’.
• Phrase-based Indexing
• Contextual Domains
• Contextual Domain Word List
• Query Analysis
• Representative Query
• Canonical Query
• Minor Intent
• Space Vectors
• Navigational Query as a
Popularity Signal
• Evidence Based Ranking
• Word Proximity
• Word Adjacency
41. @KorayGubur
First Semantic Web Announcement
• The Semantic Web Road Map was published
in September 1998 by Tim Berners-Lee.
• Semantic HTML, the Semantic Web, and
Semantic User Patterns were the principles
of Semantic Search.
• The main purpose of the Semantic Web is
to make the web understandable to machines
so that machines can help human beings
surf the web better.
• Tim Berners-Lee talked about Agents,
Ontology, Structured Data, RDFa, or Semantic
HTML Tags, and Digital Signatures.
• ‘Such an agent coming to the clinic's Web
page will know not just that the page has
keywords such as "treatment, medicine,
physical, therapy" (as might be encoded
today) but also that Dr. Hartman works at
this clinic on Mondays, Wednesdays and
Fridays and that the script takes a date
range in yyyy-mm-dd format and returns
appointment times. And it will "know" all
this without needing artificial intelligence...’
‘The Semantic Web is an extension of the current web in
which information is given well-defined meaning, better
enabling computers and people to work in cooperation.’
-Tim Berners-Lee
42. @KorayGubur
First Semantic Search Patent
• Google’s first Semantic Search Engine patent
dates from 1999, one year after Tim
Berners-Lee’s announcement.
• The inventor is Sergey Brin himself.
• The document doesn’t use legal language, like
other early Google patents.
• The document states that things of a similar
type share the same features.
• Things on the web can be collected for a
certain type of information and stored with
this information.
@KorayGubur
Invented by Sergey Brin
Assigned to Google
US Patent 6,678,681
Granted January 13, 2004
Filed: March 9, 2000
43. @KorayGubur
First Semantic Search Patent
• Sergey Brin encountered problems such as
Named Entity Recognition, Main Entity
detection, and Entity Relation Detection.
• These problems were not yet described in
terms of Entities, but the books in question
were entities with string representations.
• Even a single-letter difference caused big
problems for Sergey Brin.
• Some books didn’t have a price or a proper
title, and some of them were not even real
books.
• In the first attempt, the cost was high, the
process was slow, and the results were
partial, but Google kept going.
@KorayGubur
44. @KorayGubur
Knowledge Graph Launch
• ‘Things, not strings.’ is the motto of
Knowledge Graph. Everything on the web is
divided into different entities, entity types,
entity connections.
• Named Entity Recognition and Natural
Language Processing increased in value and
prominence within the algorithmic hierarchy
of Google.
• Knowledge Graph supported the Knowledge
Panels.
• Fact Extraction, Question Answering,
Accuracy Audit, and Entity Relations are the
pillars of an Entity-oriented Search Engine.
• ‘Wouldn’t it be great understanding every
word of user, instead of matching words?’, by
Jack Menzel.
@KorayGubur
Inventors: John R. Provine
Assignee: Google LLC
US Patent: 10,922,326
Granted: February 16,
2021
Filed: March 14, 2013
45. @KorayGubur
Browsable Fact Repository
• The Browsable Fact Repository is the main and
primitive version of the Google Knowledge
Graph.
• There are three important problems for
the Browsable Fact Repository.
1. Updating the Knowledge Graph.
2. Extracting the New Entities.
3. Auditing the Fact Accuracy.
@KorayGubur
Invented by Andrew W.
Hogue and Jonathan T.
Betz
Assigned to Google Inc.
US Patent 7,774,328
Granted August 10, 2010
Filed: February 17, 2006
46. @KorayGubur
Entity-seeking Query
• Today’s last Query type.
• Entity-seeking Queries are one of the
basic pillars of Entity-oriented search.
• The search engine identifies whether the
Query seeks a single entity or several
things of the same type.
• If it is singular, the entity-seeking query will
match the term and the entity based on
an attribute.
• Entity-seeking Queries include a Semantic
Dependency Tree and a Relevance Threshold.
@KorayGubur
Inventors: Mugurel Ionut Andreica, Tatsiana Sakhar,
Behshad Behzadi, Marcin M. Nowak-Przygodzki, and
Adrian-Marius Dumitran
US Patent Application: 20190370326
Published: December 5, 2019
Filed: May 29, 2018
48. @KorayGubur
Structured Search Engine
@KorayGubur
• Sergey Brin said ‘Structured Form’ in 1999.
• In 2011, Andrew Hogue said ‘Structured
Search Engine’.
• Andrew Hogue introduced Open-Domain
Fact Extraction methodologies for
extracting and clustering entities from the web.
• Andrew Hogue showed future Google
engineers some concrete examples of the
direction they wanted to head in.
Cartoon is created by Gary Larson.
49. @KorayGubur
Semantic Search Engine
@KorayGubur
• Google can extract all attributes of an entity
to understand its general features.
• According to the Source Attribute, these
features can be changed, detected or
altered.
• Based on the entity types, and candidate
entities, Google can generate more entity
types, and connections between them.
• Structured Search Engine’s other name is
Semantic Search Engine.
• Semi-structured Text Understanding,
Question Generation from Keywords, and
Question-Answer Pairing are the main
objectives of Semantic Search Engine.
51. @KorayGubur
Semantic Search Engine
@KorayGubur
This is a Query Parsing Example from a Google Engineer for
Entity-oriented Search.
Source: The Structured Search Engine by Andrew Hogue
Named Entity Recognition process for the
query.
• Entity-seeking Queries are the backbone
of the entity oriented search.
• Recognizing an entity from a Query is not
easy, or cheap.
• Neural Matching, RankBrain, Sub-topic
Update, or BERT, MuM, LaMDA... All of
them are used for recognizing the entity,
and its related attributes.
52. @KorayGubur
Semantic Search Engine
@KorayGubur
Second step is Entity Resolution.
• Entity Resolution, and Attribute
Extraction are for understanding the
related attribute of the entity.
• Entity-seeking Queries usually try to find
an Entity’s Attribute such as look, height,
taste, inception or history.
• After the entity and its attribute are taken
from the query, at the next step,
Question Format will be taken.
53. @KorayGubur
Semantic Search Engine
@KorayGubur
Third step is Synonym Extraction.
• Synonym Extraction strengthens the
confidence score.
• Another function of Synonym Extraction
is that it helps in using alternate
documents for the same question.
• According to the Synonyms, the question
format can change.
54. @KorayGubur
Semantic Search Engine
@KorayGubur
The question format is necessary to understand
the query by increasing the confidence
score and matching similar successful
documents.
• The question format is important for
determining the answer format.
• Question term order and answer term
order can increase the success rate.
• The last important thing here is the
‘answer data type’, which is a date.
55. @KorayGubur
Semantic Search Engine
@KorayGubur
Fourth step is Entity Reconciliation and data accuracy audit.
At the next step, Google can check the related search
activity and possible search activity, and choose the best
answer.
• The answer formats and answer phrases will be used
for entity reconciliation.
• Entity reconciliation includes the standardization of the
entity with the correct information.
• Five Rand Fishkin entity records exist in the Knowledge
Graph for the same Rand Fishkin.
56. @KorayGubur
Semantic Search Engine
@KorayGubur
Entity Reconciliation
Inventors: Oksana Yakhnenko and Norases
Vesdapunt
Assignee: GOOGLE LLC
US Patent: 10,331,706
Granted: June 25, 2019
Filed: October 4, 2017
Entity Reconciliation is another patent from Google.
• It includes checking multiple sources to complete the missing
information in the Knowledge Graph.
• It also uses a similarity threshold between different sources and the
knowledge graph.
• If the source is authoritative, it will be easier to modify the
Knowledge Graph.
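A toy sketch of the similarity-threshold idea: compare the attributes a source record shares with an existing graph record, and only copy over missing attributes when the overlap is high enough. The Jaccard measure, the 0.5 threshold, the example records, and the function names are assumptions made for illustration; the patent's actual measure is not reproduced, and source authority is left out.

```python
def jaccard(a, b):
    """Share of attribute/value pairs that two records have in common."""
    pairs_a, pairs_b = set(a.items()), set(b.items())
    union = pairs_a | pairs_b
    return len(pairs_a & pairs_b) / len(union) if union else 0.0


def reconcile(graph_record, source_record, threshold=0.5):
    """Fill attributes missing from the graph if the source is similar enough."""
    # Only attributes present in both records are compared.
    shared = {k: v for k, v in source_record.items() if k in graph_record}
    known = {k: graph_record[k] for k in shared}
    if jaccard(known, shared) < threshold:
        return dict(graph_record)  # treated as a different entity
    merged = dict(source_record)
    merged.update(graph_record)  # existing graph values win on conflict
    return merged


# Hypothetical records for a made-up person.
graph = {"name": "Jane Doe", "occupation": "Engineer"}
same_person = {"name": "Jane Doe", "occupation": "Engineer", "city": "Austin"}
other_person = {"name": "Jane Doe", "occupation": "Chef", "city": "Boston"}

merged = reconcile(graph, same_person)
unchanged = reconcile(graph, other_person)
```

Only the first source passes the threshold, so only its missing attribute is added to the graph record.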
57. @KorayGubur
Semantic Search Engine
@KorayGubur
“For other people it can be a little more complicated. Like me, for
example, John Mueller. If you search for me you’ll find Wikipedia pages,
barbecue restaurants, bands, all kinds of people who are called John
Mueller.
And if, on my site, I don’t specify who I actually am, then it could
happen that our systems look at my page and go: “oh this is that guy
that runs that barbecue restaurant.” And suddenly I’m associated with
a barbecue restaurant, which might be a move up, I don’t know.
But these subtle things make it easier for us to recognize who is
actually behind something. We call that reconciliation when it comes to
structured data, kind of recognizing which of these entities belong
together.”
John Mueller
64. @KorayGubur
Semantic Search Engine
@KorayGubur
Semantic Role Labeling
Named Entity Resolution
Named Entity Extraction
Relation Detection
Lexical Semantics
Taxonomy
Ontology
Onomastics
Important Terms and Concepts for NER and Semantic Search Engine
65. @KorayGubur
Semantic Search Engine
@KorayGubur
Entity Extraction
• Entity extraction is a complementary step for
Named Entity Recognition.
• A recognized entity can be extracted from the
text to be stored in a Knowledge Base.
• Entity Extraction uses attributes to connect
the entity with its meaning, prominence and
attributes.
• Take the sentence ‘The 46th President of the
United States (US) had decided to go to Paris
on Monday, 2nd June, 2002.’
• ‘46th President of the United States’ is the
named entity.
• The president’s decision, together with its
date, is the attribute that is included
in entity extraction.
66. @KorayGubur
Semantic Search Engine
@KorayGubur
Entity Resolution
• Entity Resolution has two phases.
• The first phase is finding the mentioned
entity’s correct identity.
• The second phase is finding the correct profile
of the mentioned entity.
• For instance, Bill Clinton was a U.S. President,
but also an Actor in Hollywood. An American
Football Player can also be a cook or a
journalist.
• To find the right entity from the entity
reference, a Search Engine can use related
entities and their types.
• Entity Resolution helps feed the
text-to-data systems of Search Engines.
• If you say ‘Barry Schwartz entered the
classroom and asked the students
questions’, Entity Resolution will decide
that it is Professor Barry, not our Barry.
67. @KorayGubur
Relation Detection
• Relation Detection is the process of
understanding the relation type and labels
between different entities within a text.
• There are different types of relations, such as
‘isSimilarOf’, ‘locatedIn’, ‘superiorOf’,
‘closeTo’, ‘sameAs’.
• Some of these relation types are familiar
from the Structured Data.
• Some of the relation types are unique for
specific entities and specific topics.
• Relation Detection draws on
Lexical Semantics.
• Relation detection can be used for visual-to-text
algorithms too.
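As a rough illustration of relation detection, the sketch below turns a few surface patterns into (subject, relation, object) triples that use relation labels from this slide. The patterns and the `detect_relations` helper are assumptions made for this example; real systems learn relations from data instead of using a handful of fixed phrases.

```python
import re

# Hypothetical surface patterns mapped to relation labels from the slide.
PATTERNS = [
    (re.compile(r"(?P<s>[A-Z]\w+) is located in (?P<o>[A-Z]\w+)"), "locatedIn"),
    (re.compile(r"(?P<s>[A-Z]\w+) is close to (?P<o>[A-Z]\w+)"), "closeTo"),
    (re.compile(r"(?P<s>[A-Z]\w+) is the same as (?P<o>[A-Z]\w+)"), "sameAs"),
]


def detect_relations(text):
    """Return (subject, relation, object) triples found in the text."""
    triples = []
    for pattern, label in PATTERNS:
        for match in pattern.finditer(text):
            triples.append((match.group("s"), label, match.group("o")))
    return triples


text = "Paris is located in France. Versailles is close to Paris."
triples = detect_relations(text)
print(triples)
```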
68. @KorayGubur
Lexical Semantics
• Every human being should know Lexical
Semantics in order to think and speak
clearly.
• Lexical Semantics covers the semantic
connections between different words.
• Lexical Semantics is used to understand the
relational connections between named
entities.
• For instance, ‘boy’ carries the meanings ‘single’, ‘teenage’,
‘male’, ‘young’ by default. But
some of these meanings are highly probable,
others less so.
• For instance, someone young, male and teenage
can also be married.
• Lexical Semantics is used for
named entity resolution and for connecting entities
with other things.
Lexeme: a unit that cannot be analyzed further by itself.
Lexicon: the list of lexemes.
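The idea that a word carries default meanings with different likelihoods can be sketched as a small weighted lexicon. The weights below are invented purely for illustration (they are not measured values), and `LEXICON` and `likely_features` are hypothetical names.

```python
# Hypothetical lexicon: each lexeme maps to default meanings with a made-up
# plausibility weight between 0 and 1 (illustrative values, not measurements).
LEXICON = {
    "boy": {"male": 1.0, "young": 0.9, "single": 0.8, "teenage": 0.5},
}


def likely_features(lexeme, threshold=0.7):
    """Return the default meanings whose weight reaches the threshold."""
    features = LEXICON.get(lexeme, {})
    return sorted(name for name, weight in features.items() if weight >= threshold)


print(likely_features("boy"))                 # strong defaults only
print(likely_features("boy", threshold=0.0))  # every default meaning
```

Because the meanings are weighted defaults and not hard rules, a context such as "the boy is married" can override a lower-weight default like "single" without contradicting the lexicon.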
69. @KorayGubur
Semantic Role Labeling
• Semantic Role Labeling is the process of
understanding the parts of a sentence by
assigning related labels.
• Semantic Role Labeling draws on
Lexical Semantics and Part-of-Speech tagging.
• Semantic Role Labeling helps Relation
Detection.
• There are more than 32 Semantic Roles.
• For Semantic Role Labeling, the most
important part is finding the theme,
predicate, agent, and effect.
• Semantic Role Labeling is useful for auditing
the content’s accuracy and for extracting facts
from propositions.
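To make the agent, predicate and theme roles concrete, here is a deliberately naive sketch. It assumes the sentence has already been Part-of-Speech tagged (the tags are supplied by hand here) and that it is a simple active-voice sentence, so everything before the first verb is treated as the agent and everything after it as the theme. Real semantic role labelers use trained models; this positional heuristic is only an illustration.

```python
def label_roles(tagged_tokens):
    """Assign agent / predicate / theme using the position of the first verb."""
    verb_index = next(
        (i for i, (_, tag) in enumerate(tagged_tokens) if tag == "VERB"), None
    )
    if verb_index is None:
        return {}

    def phrase(tokens):
        # Join the words of a span, skipping punctuation.
        return " ".join(word for word, tag in tokens if tag != "PUNCT")

    return {
        "agent": phrase(tagged_tokens[:verb_index]),
        "predicate": tagged_tokens[verb_index][0],
        "theme": phrase(tagged_tokens[verb_index + 1:]),
    }


# Hand-tagged example sentence (the tags are an assumption of this sketch).
tagged = [
    ("The", "DET"), ("president", "NOUN"), ("visited", "VERB"),
    ("Paris", "PROPN"), (".", "PUNCT"),
]
roles = label_roles(tagged)
print(roles)
```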
70. @KorayGubur
Taxonomy
• Taxonomy, from the Greek taxis (arrangement) and nomos (law), means the arrangement of
things.
• It was first used for animal classification, in Ancient
Greece.
• In the modern era, it was used for classifying all living things
in biology, and then for classifying
chemicals and other kinds of existing things.
• In the field of Search Engine Optimization, Semantic
Entity Types and the Semantic Dependency Tree are
important.
• Creating a hierarchy between entities based on their
type and size, prominence, or superiority and
inferiority is important for increasing contextual
relevance and specifying the relevance of the article.
• Every entity type has a different attribute group, and the
hierarchy can be refreshed.
• If the context is the size of cities, ‘berlin’, ‘paris’, ‘istanbul’
can have a different taxonomy, in terms of big, small, and
medium cities.
• If the context is the countries of these cities, the taxonomy
can be aligned with country names and region or
continent names.
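The point that the same entities fall into different taxonomies depending on the context can be sketched as grouping by a chosen attribute. In this minimal example the country values are factual, while the size labels are rough relative placeholders chosen for illustration; `CITIES` and `build_taxonomy` are hypothetical names.

```python
from collections import defaultdict

# The three cities from the slide. "size" holds illustrative relative labels,
# not official classifications.
CITIES = {
    "berlin": {"country": "Germany", "size": "medium"},
    "paris": {"country": "France", "size": "small"},
    "istanbul": {"country": "Turkey", "size": "big"},
}


def build_taxonomy(entities, context):
    """Group entity names under the value of the attribute named by `context`."""
    groups = defaultdict(list)
    for name, attributes in entities.items():
        groups[attributes[context]].append(name)
    return {label: sorted(names) for label, names in groups.items()}


by_country = build_taxonomy(CITIES, "country")
by_size = build_taxonomy(CITIES, "size")
print(by_country)
print(by_size)
```

Switching the `context` argument re-arranges the same entities under different parent nodes, which is the refresh of the hierarchy that the slide describes.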
71. @KorayGubur
Ontology
• Ontology completes the taxonomy.
• From the Greek ontos (being) and logos (study): the essence of things.
• It is a branch of philosophy.
• Ontology is a reflex for all human beings.
• An ontology can be created based on the mutual
points of different entities.
• Depending on the mutual attribute between
entities, the taxonomy can change, and the
ontology can follow it.
• If three named entities are from the same region,
the region name is the mutual attribute, and they
can have other types of connections based
on this.
72. @KorayGubur
Onomastics
• Onomastics is the science of naming and of
analyzing name patterns in different
languages.
• Every entity type has a different naming pattern.
• Name patterns are used to recognize entities,
entity types, and attributes of entities.
• It comes from the Greek onomastikos (of naming), from onoma
(name).
• Different science names, city names, event
names, situation names, or institution names can
have naming patterns.
• Some examples of onomastics sub-types:
1. helonyms: proper names of swamps, marshes and bogs.
2. limnonyms: proper names of lakes and ponds.
3. oceanonyms: proper names of oceans.
4. pelagonyms: proper names of seas and maritime bays.
5. potamonyms: proper names of rivers and streams.
• Onomastics can be used for taxonomy and
ontology creation too. Even a body of water can have
multiple naming patterns based on sub-types.
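Name patterns like these can be turned into a very simple recognizer. The sketch below maps English generic words in a proper name to the five sub-types listed on this slide. The keyword table and `classify_name` are assumptions for illustration, and a pure pattern match has obvious limits (the Dead Sea, for example, is a lake despite its name).

```python
# Hypothetical keyword table: the generic word in a name hints at its sub-type.
SUBTYPE_BY_KEYWORD = {
    "swamp": "helonym", "marsh": "helonym", "bog": "helonym",
    "lake": "limnonym", "pond": "limnonym",
    "ocean": "oceanonym",
    "sea": "pelagonym", "bay": "pelagonym",
    "river": "potamonym", "stream": "potamonym",
}


def classify_name(name):
    """Return the onomastic sub-type suggested by the name pattern, if any."""
    for word in name.lower().split():
        if word in SUBTYPE_BY_KEYWORD:
            return SUBTYPE_BY_KEYWORD[word]
    return None


for name in ["Lake Geneva", "Atlantic Ocean", "Black Sea", "Mississippi River", "Berlin"]:
    print(name, "->", classify_name(name))
```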
74. @KorayGubur
BERT - SMITH
BERT uses a Masked Language Model.
It masks 15% of the input tokens and trains the model to predict them.
It uses Bidirectional Language Understanding.
It reads the whole sentence at once, from both directions.
It also predicts the next sentence.
SMITH handles inputs longer than BERT's 512 tokens.
It uses a fine-tuning based representation model.
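The 15% masking step can be sketched as follows. This is a simplified illustration and not BERT's actual preprocessing code: it works on whitespace tokens, and it always substitutes `[MASK]`, whereas the original BERT recipe replaces only 80% of the selected tokens with `[MASK]`, swaps 10% for a random token and leaves 10% unchanged. The `mask_tokens` function and its parameters are hypothetical.

```python
import random


def mask_tokens(tokens, mask_rate=0.15, seed=0):
    """Hide roughly `mask_rate` of the tokens and keep the originals as labels."""
    if not tokens:
        return [], {}
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n_to_mask = max(1, round(len(tokens) * mask_rate))
    positions = set(rng.sample(range(len(tokens)), n_to_mask))
    masked = ["[MASK]" if i in positions else tok for i, tok in enumerate(tokens)]
    labels = {i: tokens[i] for i in sorted(positions)}  # what the model must predict
    return masked, labels


tokens = ("the quick brown fox jumps over the lazy dog near the quiet "
          "river bank today while birds sing loudly above").split()
masked, labels = mask_tokens(tokens)
print(" ".join(masked))
print(labels)
```

With 20 input tokens, three positions are masked, and `labels` records the original token at each masked position as the prediction target.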
75. @KorayGubur
MuM
The research papers date from March 2021.
In May 2021, they announced MuM.
In June 2021, they announced that they had started to use MuM.
The whole system is about understanding ‘Related Search Activity’ in order to predict future queries.
78. @KorayGubur
Conversational Search
Conversational Search is close to Conversational AI.
It connects different entities, concepts, and intents to each
other.
It creates new Contextual Domains and Co-occurrence
Matrices.
The Conversational Search announcement includes only
past queries.
MuM and LaMDA include future queries.
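A co-occurrence matrix over a search session can be sketched with a simple pair counter. The example assumes whitespace-tokenized queries and counts how often two terms appear in the same query; the session data and the `cooccurrence` function are hypothetical, and nothing here reflects how Google actually builds such matrices.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(queries):
    """Count how often each pair of terms appears together in one query."""
    counts = Counter()
    for query in queries:
        terms = sorted(set(query.lower().split()))
        for pair in combinations(terms, 2):
            counts[pair] += 1
    return counts


# Hypothetical past queries from one search session.
session = ["paris hotels", "paris museums", "cheap paris hotels"]
matrix = cooccurrence(session)
for pair, count in matrix.most_common():
    print(pair, count)
```

Pairs with high counts, such as "hotels" and "paris" here, are the kind of signal that could connect entities and intents across consecutive queries.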
81. @KorayGubur
ReALM
Inventors: Kenton Chiu Tsun Lee,
Kelvin Gu, Zora Tung, Panupong
Pasupat, and Ming-Wei Chang
Assignee: Google LLC
US Patent: 11,003,865
Granted: May 11, 2021
Filed: May 20, 2020
First a Research Paper,
Then, a Patent.
Lastly, an Update with Official Statement,
Or Non-Official Statement.
87. @KorayGubur
‘Without understanding Query Processing through the eyes of the
Search Engine, you can’t create a relevant and satisfying
document based on minor and dominant search activity
types.’
Thank You