There are three main components of information retrieval systems: query understanding, document-query relevance understanding, and document clustering and ranking. The path from a search query to a search document involves several steps like query parsing, processing, augmenting, scoring, ranking, and clustering. Query understanding is where search engine optimization (SEO) begins, while document creation and ranking are other areas where SEO is applied. Cranfield experiments in the late 1950s helped develop the concept of a "search query language" which is different from the language used in documents. Formal semantics and components like tense, aspect, and mood can help machines better understand human language for information retrieval tasks.
Semantic SEO and the Evolution of Queries | Bill Slawski
This document summarizes how Google search results are evolving to include more semantic data through direct answers, structured snippets, and rich snippets. It provides examples of direct answers being extracted from authoritative sources using natural language queries and intent templates. It also discusses how including structured data like tables, schemas, and markup can help search engines understand and display page content in a more standardized way. While knowledge-based trust is an interesting concept, current search ranking still primarily relies on link analysis and does not consider factual correctness.
How to Automatically Subcategorise Your Website With Python
The document describes a Python script that can automatically generate new subcategories for an ecommerce website based on clustering product names. It discusses:
- Using NLTK to generate n-grams from product names to cluster related products
- Filtering the n-grams to keep only those with commercial value by checking for search volume and CPC data
- Running the script on a large home improvement site to identify over 1,650 new subcategory opportunities with a total search volume of over 13 million
- Sharing the script so others can automate subcategory identification for their own sites to scale up an important SEO tactic.
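The n-gram clustering step described above can be sketched in plain Python (NLTK's `ngrams` helper does the same thing; the product names and frequency threshold here are illustrative assumptions, and the search-volume/CPC filtering step is left out):

```python
from collections import Counter
from itertools import islice

def ngrams(tokens, n):
    """Yield successive n-grams (as tuples) from a token list."""
    return zip(*(islice(tokens, i, None) for i in range(n)))

def candidate_subcategories(product_names, n=2, min_count=2):
    """Count n-grams across product names; frequent ones suggest subcategories."""
    counts = Counter()
    for name in product_names:
        counts.update(ngrams(name.lower().split(), n))
    return {" ".join(g): c for g, c in counts.items() if c >= min_count}

products = [
    "brass door handle",
    "chrome door handle",
    "oak interior door",
    "pine interior door",
]
# Bigrams shared by several products surface as subcategory candidates:
# "door handle" and "interior door" each appear twice
print(candidate_subcategories(products))
```

In the full workflow each surviving n-gram would then be checked against keyword data, keeping only candidates with meaningful search volume.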
1) Knowledge graphs are structured databases that represent real-world entities and their relationships to each other. They help search engines like Google understand topics at a deeper level.
2) Entities (topics) are becoming more important than keywords for search engines to understand content. Google's entity understanding can be checked using their natural language processing tool.
3) Semantic SEO techniques like tightly linking topics both internally and to relevant external pages can help improve how search engines understand and represent the entities within a website through their knowledge graphs.
The Python Cheat Sheet for the Busy Marketer | Hamlet Batista
What percentage of an inbound marketer's day doesn't involve working with spreadsheets? How much of this work is time-consuming and repetitive? In this interactive session, you will learn how to manipulate Google Sheets to automate common data analysis workflows using Python, a very easy-to-use programming language.
Google Patents: How Do They Influence Search? | William Slawski
Bill Slawski presented a webinar on analyzing patents related to search engines and SEO. He discussed 12 Google patents covering topics like PageRank, Google's news ranking algorithm, analyzing images to detect brand penetration, and building user location history. The patents described Google's work in building knowledge graphs from web pages, ranking entities in search results, question answering, and determining quality visits to local businesses.
SEO Case Study - Hangikredi.com From 12 March to 24 September Core Update
This document provides SEO metrics and comparisons for the website hangikredi.com over several time periods between April 2019 and September 2019. It shows substantial increases in key metrics like organic traffic, clicks, and impressions, along with improvements in average position, after Google algorithm updates in May, June, July, and September. However, it also shows significant drops in these metrics during a server outage in early August. Overall, the data demonstrates the site's strong SEO performance and organic growth over the six-month period analyzed.
This document discusses digital marketing strategies focused on establishing authority through valuable, timeless content. It recommends creating content such as articles, videos, and academic papers on topics that will remain relevant for years to establish expertise. Creating a steady stream of high-quality content over time builds an online presence and credibility without major risks of losses, and may lead to job offers, clients, or other opportunities. It provides examples of interactive dashboards and open-source software that gained popularity and users by continuously publishing improvements and documentation without needing to rely on things like resumes or company profiles.
BrightonSEO March 2021 | Dan Taylor, Image Entity Tags
My talk from BrightonSEO 2021; focusing on using Google's image category labels (glancing into the Knowledge Graph and Google's image annotation processes) for better topic research and content optimization.
Semantic Publishing and Entity SEO - Conference 20-11-2022
Semantic Publishing is publishing a page on the Internet by adding a semantic layer (i.e., semantic enrichment) in the form of structured data that describes the page itself.
Everything You Didn't Know About Entity SEO | Sara Taher
This document provides an overview of entity SEO, including:
- What an entity is and why entity SEO is important as search engines have evolved from information engines to knowledge engines
- How search algorithms like Panda, Penguin, and Hummingbird helped drive this transition by prioritizing high-quality content over low-quality sites
- Techniques for entity SEO including entity research, topical maps, schema, internal linking, and case studies
- Tools like Google's Knowledge Graph that can help with entity research and understanding how entities are ranked
How to approach SEO in a world where Google has moved from strings and keywords to things, topics, and entities. Dixon Jones is the CEO of InLinks, who have built a proprietary NLP algorithm and Knowledge Graph designed for the SEO industry.
Semantic Search | Bill Slawski, DEEP SEA Con
1) Google uses various techniques to extract structured information like entities, relationships, and properties from unstructured text on the web and databases. This extracted information is then used to generate knowledge graphs and provide augmented responses to user queries.
2) One key technique is to identify patterns in which tuples of information are stored in databases, and then extract additional tuples by repeating the process and utilizing the identified patterns.
3) Google also extracts entities from user queries and may generate a knowledge graph to answer questions by providing information about the entities from sources like its own knowledge graph and information extracted from the web.
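The pattern-bootstrapping idea in point 2 can be illustrated with a deliberately simplified sketch (the sentences, seed tuple, and plain string matching are illustrative assumptions; real systems operate over web-scale text with far more robust matching):

```python
def learn_patterns(sentences, seeds):
    """Learn the text between known (x, y) tuples, e.g. ' is the capital of '."""
    patterns = set()
    for x, y in seeds:
        for s in sentences:
            if x in s and y in s and s.index(x) < s.index(y):
                patterns.add(s[s.index(x) + len(x):s.index(y)])
    return patterns

def apply_patterns(sentences, patterns):
    """Extract new (x, y) tuples by splitting sentences on a learned pattern."""
    found = set()
    for p in patterns:
        for s in sentences:
            if p in s:
                x, _, rest = s.partition(p)
                found.add((x.strip(), rest.rstrip(".").strip()))
    return found

sentences = [
    "Paris is the capital of France.",
    "Rome is the capital of Italy.",
]
patterns = learn_patterns(sentences, {("Paris", "France")})
new_pairs = apply_patterns(sentences, patterns)  # now includes ("Rome", "Italy")
```

Repeating the loop, with newly found tuples fed back in as seeds, is what lets this style of extraction grow a knowledge base from a handful of examples.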
SEO and the New Search Results | Bill Slawski
Google's search results now include entities and concepts. Entities refer to people, places, and things, and 20-30% of queries are for named entities. Google uses metadata sources like Freebase to build a taxonomy of entities and their relationships. This supports features like the Knowledge Graph, which provides information panels, and allows querying of nearby entities, which may soon be available in search results.
The document discusses using Python for SEO applications such as data extraction, preparation, analysis, machine learning and deep learning. It provides an agenda and examples of using Python to solve challenging SEO problems from site migrations and traffic losses. Methods demonstrated include pulling data from Google Analytics, storing in DataFrames, regular expression grouping, and training machine learning models on page features to classify page groups and identify losses. Later sections discuss using deep learning with computer vision models to classify web pages from screenshots.
Internal Linking - The Topic Clustering Way | Dixon Jones
This document discusses internal linking strategies and techniques. It begins by explaining the benefits of connecting entities within content, rather than just words, and translating those connections into internal links. It then provides an overview of technologies like PageRank, the reasonable surfer algorithm, topical PageRank, chunking, and natural language processing that search engines use to understand context, and how those ideas can be applied to internal linking at scale. Specific approaches to internally linking existing pages are also outlined.
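The PageRank idea referenced above can be sketched as a power-iteration loop over a toy internal-link graph (the page names and damping factor are illustrative; this ignores the weighted-edge refinements of reasonable-surfer models):

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iterative PageRank over a {page: [outlinks]} adjacency dict."""
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / len(pages) for p in pages}
        for page, outlinks in links.items():
            if not outlinks:  # dangling page: spread its rank evenly
                for p in pages:
                    new[p] += damping * rank[page] / len(pages)
            else:
                for target in outlinks:
                    new[target] += damping * rank[page] / len(outlinks)
        rank = new
    return rank

site = {
    "home": ["hub"],
    "hub": ["post-a", "post-b"],
    "post-a": ["hub"],
    "post-b": ["hub", "home"],
}
ranks = pagerank(site)
print(max(ranks, key=ranks.get))  # the hub page accumulates the most rank
```

Even on this tiny graph, the page that everything links back to ends up with the highest score, which is why topic-cluster hubs tend to benefit from systematic internal linking.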
Passage indexing is likely more important than you think
Whilst passage indexing may seem like a small tweak to search ranking, it is potentially much more symptomatic of the beginning of a fundamental shift in the way that search engines understand unstructured content, determine relevance in natural language, and rank efficiently and effectively.
It could also be a means of assessing overall quality of content and a means of dynamic index pruning. We will look at the landscape, and also provide some takeaways for brands and business owners looking to improve quality in unstructured content overall in this fast changing landscape.
Patrick Hanks - Why lexicographers should take more notice of phraseology, co...
English dictionaries since 1755 have attempted to present succinct statements of the meaning(s) of each word. A word may have more than one meaning but, so the theory goes, each meaning can in principle be summarized in a neat paraphrase that is substitutable (in context) for the target word (the definiendum). Such paraphrases must be so worded that the substitution can be made without changing the truth of what is said – salva veritate, in Leibniz’s famous phrase. Building on Leibniz, philosophers of language such as Anna Wierzbicka have argued that the duty of the lexicographer is to “seek the invariant”.
In this presentation, I argue that this view of word meaning and definition may be all very well as a principle for developing stipulative definitions of terminology in scientific discourse, but it has led to serious misunderstandings about the nature of meaning in natural language, creating insuperable obstacles for the understanding of how word meaning works. As a result, linguists from Bloomfield to Chomsky and philosophers of language from Leibniz to Russell – great thinkers all – have been unable to say anything true or useful about meaning in language.
I argue that, instead, lexicographers should aim to discover patterns of word use in large corpora, and associate meanings with patterns instead of (or as well as) words in isolation.
They should also distinguish normal uses of each word from exploitations of norms.
This lecture provides students with an introduction to natural language processing, with a specific focus on the basics of two applications: vector semantics and text classification.
(Lecture at the QUARTZ PhD Winter School (http://www.quartz-itn.eu/training/winter-school/) in Padua, Italy, on February 12, 2018)
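The vector-semantics part of such a lecture reduces to comparing term-count vectors; a minimal cosine-similarity sketch (the toy vocabulary and vectors are illustrative):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Toy term-count vectors over the vocabulary ["cat", "dog", "piano"]
doc_pets = [2, 1, 0]
doc_vet = [1, 1, 0]
doc_music = [0, 0, 3]
print(cosine(doc_pets, doc_vet))    # high: the documents share "cat" and "dog"
print(cosine(doc_pets, doc_music))  # 0.0: no terms in common
```

Text classification then amounts to assigning a document to whichever labeled group its vector is closest to under a measure like this.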
Ontology is the study or concern of what kinds of things exist: what entities there are in the universe. The word derives from the Greek onto (being) and logia (written or spoken). Ontology is a branch of metaphysics, the study of first principles or the root of things.
Exploring the US 2010 Plain Language Act, while other countries explore their own options. Paul Danon (UK) compares guides and discusses what is available, the need for collaboration, and the ethical implications.
folksonomy, social tagging, tag clouds, automatic folksonomy construction, word clouds, wordle, context-preserving word cloud visualisation, CPEWCV, seam carving, inflate and push, star forest, cycle cover, quantitative metrics, realized adjacencies, distortion, area utilization, compactness, aspect ratio, running time, semantics in language technology
The document summarizes discovery services adoption rates among libraries. EBSCO's discovery service, EDS, has the most subscribers with over 5,600 libraries. OCLC reports over 1,700 libraries have access to WorldCat Local, though fewer use it as their primary interface. Ex Libris has licensed Primo to over 1,400 libraries, and ProQuest reports 673 libraries using Summon. The document also discusses features of EDS, including integration with library catalogs and course management systems, relevance ranking, and development of applications using the EDS API.
This document discusses strategies for legal reading, research, and writing. It begins by exploring how people read texts, maps, and music, and how these insights could apply to reading law. It then addresses organizing legal research using citation managers. Finally, it provides guidance on academic legal writing, including different forms of writing, strategies for writing within constraints, planning approaches, and addressing introductions and problem-solving writing specifically. Throughout, it draws on research and references various scholars to support its discussion.
The document discusses natural language and natural language processing (NLP). It defines natural language as languages used for everyday communication like English, Japanese, and Swahili. NLP is concerned with enabling computers to understand and interpret natural languages. The summary explains that NLP involves morphological, syntactic, semantic, and pragmatic analysis of text to extract meaning and understand context. The goal of NLP is to allow humans to communicate with computers using their own language.
Michael Oakes (UoW) - Natural Language Processing for Translation
This document discusses information retrieval and describes its three main phases: 1) asking a question to define an information need, 2) constructing an answer by matching queries to documents, and 3) assessing the relevance of the retrieved answers. It also covers several important information retrieval concepts like keywords, indexing documents, stemming words, calculating TF-IDF weights, and evaluating system performance using recall and precision.
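The TF-IDF weighting it covers can be computed in a short sketch (this uses the plain log(N/df) variant without smoothing; the example documents are illustrative):

```python
import math

def tf_idf(docs):
    """Compute TF-IDF weights for each term in each tokenized document."""
    n = len(docs)
    df = {}
    for doc in docs:
        for term in set(doc):
            df[term] = df.get(term, 0) + 1
    weights = []
    for doc in docs:
        w = {}
        for term in set(doc):
            tf = doc.count(term) / len(doc)
            idf = math.log(n / df[term])
            w[term] = tf * idf
        weights.append(w)
    return weights

docs = [
    "cheap flights to rome".split(),
    "cheap hotels in rome".split(),
    "rome travel guide".split(),
]
w = tf_idf(docs)
# "rome" appears in every document, so its weight is 0 everywhere,
# while rarer terms like "cheap" carry positive weight
```

This is exactly why TF-IDF downweights terms that occur everywhere: such terms cannot help a system discriminate relevant documents from the rest.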
Pragmatics is the study of language use and context. It examines how the context, both situational and linguistic, affects the meaning of utterances. An utterance is the smallest unit of speech studied in pragmatics. Pragmatics focuses on the speaker's intended meaning rather than just the grammatical form. The interpretation of an utterance depends on its semantic content and environment. Contextual factors like the social and situational background condition both the production and understanding of utterances.
Engineering Intelligent NLP Applications Using Deep Learning – Part 1
This document discusses natural language processing (NLP) and language modeling. It covers the basics of NLP including what NLP is, its common applications, and basic NLP processing steps like parsing. It also discusses word and sentence modeling in NLP, including word representations using techniques like bag-of-words, word embeddings, and language modeling approaches like n-grams, statistical modeling, and neural networks. The document focuses on introducing fundamental NLP concepts.
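The bag-of-words representation it introduces can be shown in a few lines (the example sentences are illustrative):

```python
from collections import Counter

def bag_of_words(sentences):
    """Map each sentence to a count vector over a shared, sorted vocabulary."""
    vocab = sorted({w for s in sentences for w in s.lower().split()})
    vectors = []
    for s in sentences:
        counts = Counter(s.lower().split())
        vectors.append([counts.get(w, 0) for w in vocab])
    return vocab, vectors

vocab, vecs = bag_of_words(["the cat sat", "the cat sat on the mat"])
# vocab: ['cat', 'mat', 'on', 'sat', 'the']
# the second vector counts "the" twice and ignores word order entirely
```

The order-blindness shown here is the main limitation that n-gram models, and later word embeddings and neural language models, were introduced to address.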
Relationship of Descriptive Linguistics in the Following Areas
This document discusses several key concepts in linguistics including:
1. The autonomy of syntax - the theory that syntax operates independently of meaning and pragmatics.
2. Compositionality - the principle that the meaning of a phrase or sentence can be derived from the meanings of its parts and their structure.
3. Conservative vs innovative forms - conservative forms change little over time while innovative forms undergo more recent changes.
4. Prescriptivism - the belief that there are correct and incorrect ways to use language based on explicit rules imposed on speakers.
5. Methods of linguistic research include collecting primary and secondary data using tools like interviews, observations, and questionnaires for qualitative and quantitative analysis.
Dove, "A Model of the User's Psychological State as a Framework for Understan...
This presentation was provided by John G. Dove of Credo Reference during the NISO event "Next Generation Discovery Tools: New Tools, Aging Standards," held March 27 - March 28, 2008.
This document provides an overview of the legal research process. It defines what law is and discusses the different types of legal authorities, including statutes passed by Congress, regulations by executive agencies, and case law interpretations by courts. It then outlines the steps to conduct legal research, including developing search terms, choosing appropriate research tools like legal databases and books, searching strategically, evaluating sources, and refining searches. The document provides examples of searching legal databases like LexisNexis and Westlaw and managing citations. The overall process is iterative, beginning with forming a research question and repeating searches across different tools and terms until enough information is found.
The document discusses knowledge representation (KR) and different approaches to KR, including:
1) KR provides a surrogate for reasoning about the world by representing knowledge in a computable format. It determines how an agent thinks about the world.
2) Logics like propositional and predicate/first-order logic use symbols and rules to represent knowledge unambiguously, though they have limitations in expressiveness.
3) Semantic networks, frames, and conceptual graphs are other, non-logical KR approaches that favor expressiveness and simplicity over the formality of logic-based representations. They provide flexible ways to represent objects, attributes, and relationships.
Keyword Research and Topic Modeling in a Semantic Web | Bill Slawski
The document discusses keyword research and topic modeling in the semantic web. It covers identifying named entities, adding schema markup to pages, and verifying listings on Google My Business. It also discusses using context and related phrases to improve search engine optimization, including looking at knowledge bases, disambiguation pages, and clustering related meanings. The document provides examples of using related words and phrases for semantic topic clustering and ranking documents based on included phrases.
Quality Content at Scale Through Automated Text Summarization of UGC | Hamlet Batista
The document discusses using automated text summarization techniques to generate quality content at scale from user-generated content like online product reviews. It proposes a technical plan to download Amazon reviews, remove duplicate sentences using neural semantic textual similarity, and then generate frequently asked questions and corresponding FAQ schema by feeding the review text into a neural question generation model. The goal is to leverage user content and machine learning to automatically create helpful content for websites.
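The duplicate-removal step would use a neural semantic-similarity model as described; a much simpler lexical stand-in (Jaccard token overlap, with an illustrative threshold and example reviews) shows the shape of the idea:

```python
def jaccard(a, b):
    """Token-set overlap between two sentences, in [0, 1]."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

def dedupe(sentences, threshold=0.6):
    """Keep a sentence only if it is not too similar to one already kept."""
    kept = []
    for s in sentences:
        if all(jaccard(s, k) < threshold for k in kept):
            kept.append(s)
    return kept

reviews = [
    "Great blender very easy to clean",
    "Very easy to clean great blender",
    "The motor is loud but powerful",
]
print(dedupe(reviews))  # the reordered near-duplicate is dropped
```

A neural similarity model replaces `jaccard` with embedding distance, which also catches paraphrases that share no surface tokens.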
New Approaches for Structured Data: Evolution of Question Answering | Bill Slawski
Google has moved from search to knowledge. Focusing on answering questions with knowledge graph entity information has led to answering queries directly from knowledge graphs, with confidence scores between entities and other entities or entity attributes, based upon freshness, reliability, popularity, and the proximity between an entity and another entity or an attribute.
3. Conservative vs innovative forms - conservative forms change little over time while innovative forms undergo more recent changes.
4. Prescriptivism - the belief that there are correct and incorrect ways to use language based on explicit rules imposed on speakers.
5. Methods of linguistic research include collecting primary and secondary data using tools like interviews, observations, and questionnaires for qualitative and quantitative analysis.
This presentation was provided by John G. Dove of Credo Reference during the NISO event "Next Generation Discovery Tools: New Tools, Aging Standards," held March 27 - March 28, 2008.
This document provides an overview of the legal research process. It defines what law is and discusses the different types of legal authorities, including statutes passed by Congress, regulations by executive agencies, and case law interpretations by courts. It then outlines the steps to conduct legal research, including developing search terms, choosing appropriate research tools like legal databases and books, searching strategically, evaluating sources, and refining searches. The document provides examples of searching legal databases like LexisNexis and Westlaw and managing citations. The overall process is iterative, beginning with forming a research question and repeating searches across different tools and terms until enough information is found.
The document discusses knowledge representation (KR) and different approaches to KR, including:
1) KR provides a surrogate for reasoning about the world by representing knowledge in a computable format. It determines how an agent thinks about the world.
2) Logics like propositional and predicate/first-order logic use symbols and rules to represent knowledge unambiguously, though they have limitations in expressiveness.
3) Semantic networks, frames, and conceptual graphs are other non-logical KR that focus on expressiveness, simplicity, and formality over logic-based representations. They provide flexible ways to represent objects, attributes, and relationships.
Library Research for Legal Researchers at UCSDAnnelise Sklar
This document provides a step-by-step guide for legal researchers on how to conduct library research. It outlines choosing a topic and keywords, selecting appropriate research tools and databases, constructing search strategies, running searches, obtaining citation information, accessing full texts, and evaluating sources. Key databases for legal research include Westlaw Next, LexisNexis Academic, and HeinOnline. The guide stresses developing a focused research question and using subject headings and cited references to expand searches.
Haas and Flower Slideshow for Composition IIrslyons
This document summarizes key concepts from an academic article by Haas and Flower about reading as a constructive process. The summary includes:
1) Haas and Flower studied how readers of varying experience levels construct meaning as they read aloud. They categorized reader strategies as content-focused, feature/function-focused, or rhetorical-focused.
2) Inexperienced readers (students) focused more on content and features, while experienced readers (graduates) used more rhetorical strategies to understand context and purpose.
3) The study suggests readers can improve by actively applying rhetorical strategies like analyzing audience and purpose, in addition to content strategies. This helps readers mature and better comprehend academic texts.
This document provides an overview of the legal research process. It begins by defining what law is, then discusses the different types of legal authorities such as statutes, regulations, and court opinions. It explains that Congress makes statutes, agencies make regulations, and courts interpret laws through opinions. The document then outlines the steps of the legal research process, including choosing search terms related to the research topic, selecting appropriate research tools like legal databases and libraries, searching and refining searches, evaluating sources, and repeating the process until enough information is found. Key legal research tools discussed are Westlaw, LexisNexis, HeinOnline, and government websites. The goal of the process is to find authorities to help answer a specific legal question.
This document summarizes discovery service adoption rates among major library vendors. It reports that EBSCO has the largest number of subscribers to its discovery service (EDS) at 5,612 libraries. OCLC reports 1,717 libraries using WorldCat Local, and Ex Libris has licensed Primo to 1,407 libraries. The document also provides subscriber numbers for ProQuest Summon. It examines themes from user research on discovery services and outlines features and capabilities of EBSCO's EDS product.
Similar to Lexical Semantics, Semantic Similarity and Relevance for SEO (20)
Foundations in Content Optimization: How to Optimize Your Content to Fuel Organic Traffic Growth
A highly tactical introduction to best practices in optimizing your owned content to drive sustainable (and converting) organic traffic. In this Master Class, we'll cover everything from identifying the right keywords to use to how to strategically apply them to your content strategy to generate ROI.
Key Takeaways:
- A breakdown of key content-related factors Google and other Search Engines use when ranking content.
- Understand the basics of keyword research and where to begin.
- Learn how to apply those keyword learnings to optimize your content strategy and owned assets to maximize their organic visibility.
- Learn what to measure to articulate ROI and feed back into your strategy.
It’s been a difficult few years for Facebook Ads due to signal loss from iOS/Firefox/Chrome and the associated loss of ad targeting precision and ROAS. In this session, delve into 100% new high-impact strategies for thriving in Facebook advertising in a world without 3rd party cookies.
You'll uncover the top 7 Facebook ad hacks of 2024, all centered around first party ad signal data restoration and how to coax the new default Meta Audience+ ad targeting system to do what you want it to do, each backed by solid results and case studies. Learn how to skyrocket your landing page conversions by 20-25%, how to scale ads like never before, and target niche audiences with strategies that defy traditional norms.
Plus, gain insights into critical privacy regulations and how to maintain a full compliance therein.
In today's digital age, German auto repair shops can leverage digital marketing to enhance visibility, engage customers through personalized interactions, and reach targeted demographics effectively. By optimizing online presence, managing reputation, and analyzing performance data, shops can achieve cost-efficient growth and competitive advantage in the market.
[Webinar - VWO] AI-First Strategies to Drive Traffic and Conversions for 2024...VWO
Discover how Eric Siu’s agency, Single Grain, drove over 1 million new website visitors at a +59% higher conversion rate in 90 days by integrating innovative AI-driven strategies into their CRO and SEO practices, known as programmatic CRO (pCRO) and SEO (pSEO).
Imagine: At the click of a button, your landing pages dynamically adapt to feature content and elements specific to the keywords and products they are targeting. That’s the power of pCRO, transforming generic pages into highly personalized experiences. With pSEO, generate quality pages at scale that rank at the top of search results for relevant long-tail keywords, driving traffic that then converts.
Excited? In this session, Eric will guide you through how to implement these game-changing techniques for your own business, enhancing your digital strategy and maximizing your ROI.
10 Advantages and Disadvantages of Social Media Marketing in 2024Markonik
Explore the dynamic landscape of social media marketing in 2024 with our comprehensive presentation. Delve into the top 10 advantages and disadvantages that digital marketers face in leveraging social media platforms. Understand the opportunities for growth, engagement, and brand visibility, as well as the challenges and potential pitfalls that come with navigating the ever-evolving digital ecosystem. This presentation will provide valuable insights and actionable strategies for maximizing the benefits of social media marketing while mitigating its drawbacks, tailored specifically for the needs of Markonik.
10 Powerful Strategies to Solve Common Payroll Problems in SMEs_.pdfTop Klickz
Managing payroll in SMEs can indeed be challenging, but there are several effective strategies to solve common problems.
Invest in robust payroll software that automates calculations, tax deductions, and compliance requirements. This reduces errors and saves time.
You'll learn about proven systems and effective workflows to maintain a consistent and engaging social media strategy. Additionally, you'll gain actionable strategies and practical tactics to drive engagement, increase followers, and convert them into loyal customers.
Let’s be honest. Improvements in search rankings and organic traffic don’t always translate into sales. Yet, you spend the majority of your SEO resources on driving rankings and traffic. What if you built your SEO content with conversion in mind from the beginning? You’d generate more organic traffic that actually converts into revenue! Join 20-year search marketing veteran as he unveils his framework for developing SEO content with conversion in mind every step of the way ‒ from keyword strategy to content development and publication.
Takeaways:
Tactics and benchmarks for SEO content that converts in 2024
Page layouts and content formats that convert organic traffic
Crafting keyword strategy and calls-to-action for conversion
The continued disappearance of the third party cookie has both targeting and tracking implications for open-web advertising and marketing. We'll discuss where context, identity graphs and first party data are being used to substitute for third-party cookies. We'll also discuss where CTV and other newer media channels are maturing to allow for household or personal targeting.
Key Takeaways:
Learn how to effectively target prospects and consumers at various stages using a combinations of targeting types across channels.
Revolutionizing Advertising with Billion Broadcaster Standee Screen MediaVikasYadav194549
Billion Broadcaster's standee screen media is revolutionizing the advertising landscape with innovative digital screens placed in high-traffic areas such as malls, airports, and residential complexes. These dynamic screens capture attention with vibrant multimedia content, offering a visually engaging platform for advertisers.
Much like Odysseus's fabled journey, the venture of an organization into creating compelling websites, easy-to-use digital solutions, and flawless user experience is laden with trials and triumphs. This session explores a BizStream customer case study that demonstrates how crafting composable digital solutions with headless CMS and headless commerce is possible. The result now serves as a modern-day Athena, navigating the customer through the stormy seas of digital transformation. Attendees can expect to learn how to embrace modern composable solutions, understand the benefits they bring, and identify which of Odysseus's conflicts to avoid.
Key Takeaways:
What makes up a composable digital solution.
Why content is still king in a composable world.
How Headless CMS and Headless Commerce are different.
Traditional Foods Of Australia and The HistoryThe Aussie Way
One of the most iconic foods in Australia is the meat pie. This handheld snack or meal consists of a pastry shell filled with minced meat, most commonly beef, and savoury gravy. It is often enjoyed at sporting events or as a quick and satisfying lunch option.
Visit - https://theaussieway.com.au/category/food/
Chemical Industry- Rashtriya Chemical Fertilizers (RCF) .pptxmayurparate000
Research on chemical industry with considering one of PSU as an example Rashtriya Chemical Fertilizers (RCF). Chemical Industry trend, strengths, weaknesses. Chemical Industry market position as well as RCF position. RCF revenue, profit, EBITDA, forecast, technology, past performance. State wise revenue of chemical industry and RCF as well
Top 7 PPC and SEO strategies to drive more traffic and conversions. Most companies already own the data, let's share the data and boost the performance of both channels.
Key Takeaways:
Why PPC and SEO teams should strategize together7 ways PPC and SEO can work together to drive even more conversions
PPC and SEO Synergies - Strategies Every Company Should Deploy - Benjamin Lund
Lexical Semantics, Semantic Similarity and Relevance for SEO
1. Lexical & Query Semantics Differences for Information Retrieval
Why PageRank is Sometimes Better for Semantics
2. Closing the Gap between Search Query Language and Document Language
• There are three components of Information Retrieval Systems:
• Query Understanding
• Document-Query Relevance Understanding
• Document Clustering and Ranking
• The path from a "search query" to a "search document" involves query parsing, processing, augmenting, scoring, ranking and clustering.
• Query Understanding is where SEO starts.
• Document Creation is where SEO continues.
• Document Ranking is where SEO repeats itself.
Source: Query Language Determination Using Query Terms and Interface Language
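The path above can be sketched end to end as a toy pipeline. This is only an illustration of the steps named on the slide, not any search engine's actual implementation; the synonym map and the overlap scorer are invented for the example.

```python
# Toy query-to-document pipeline: parse -> augment -> score -> rank.
# The synonym map and the scoring function are assumptions for illustration.
SYNONYMS = {"buy": {"purchase"}, "car": {"automobile"}}

def parse(query):
    # Query parsing: lowercase and tokenize.
    return query.lower().split()

def augment(terms):
    # Query augmenting: expand each term with its known synonyms.
    expanded = set(terms)
    for term in terms:
        expanded |= SYNONYMS.get(term, set())
    return expanded

def score(terms, doc):
    # Scoring: count how many expanded query terms the document contains.
    tokens = set(doc.lower().split())
    return len(terms & tokens)

def rank(terms, docs):
    # Ranking: order documents by descending score.
    return sorted(docs, key=lambda d: score(terms, d), reverse=True)

docs = ["purchase an automobile today", "weather forecast for tomorrow"]
print(rank(augment(parse("buy car")), docs)[0])
```

Even this crude sketch shows why query augmentation matters: without the synonym step, "buy car" would match neither document at all.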
3. What is this Search Query Language?
• The "Search Query Language" concept was invented in the Cranfield Experiments of the late 1950s.
• Scientists realized that while "querying a document", the language gets densified and words change their meaning.
• There is a huge vocabulary difference between "queries" and "documents".
• People do not know what to ask a search engine; they only know what represents the topic.
• The "query language" uses "knowledge representation" with "dense vectors".
• Query Term Weight Calculation was born during these experiments.
Source: Augmenting Queries With Synonyms From Synonyms Map
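Query term weighting can be illustrated with a minimal inverse-document-frequency calculation. This is a simple modern stand-in for the term-weighting idea, not the original Cranfield formula:

```python
import math

def idf_weights(query_terms, documents):
    # Terms that appear in fewer documents get higher weight.
    n = len(documents)
    weights = {}
    for term in query_terms:
        df = sum(1 for doc in documents if term in doc.lower().split())
        weights[term] = math.log((n + 1) / (df + 1)) + 1  # smoothed IDF
    return weights

docs = ["the shark swims", "the beach is warm", "the shark bites"]
weights = idf_weights(["shark", "the"], docs)
print(weights["shark"] > weights["the"])  # the rarer term weighs more
```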
4. Query Search Language
• The Cranfield Experiments, led by Cyril W. Cleverdon, were among the first Information Retrieval experiments.
• They were designed to test the efficiency of indexing systems.
• Vannevar Bush's "As We May Think" paper is cited in the research.
• The Cranfield Experiments invented the "Search Language" concept to admit the fact that words change their meanings inside search queries, even when they are used the same way inside the document.
• Information Retrieval has to make a distinction between "understanding relevance" and "understanding the query".
• To understand the query, a search engine can't use the language model it uses for understanding the documents.
• Document language and query language are completely different.
• Inside the documents, we see "lexical semantics".
• Inside the queries, we see "query semantics" with "search language".
Source: "As We May Think" – Vannevar Bush; Cranfield Experiments, Cyril W. Cleverdon, 1958.
5. An Algorithm doesn't have to be liked by your logic
• An algorithm doesn't have to make sense.
• An algorithm has to be useful.
• The Cranfield Experiments have been debated for decades, yet they are still cited by new research.
• The Cranfield Experiments do not explain why their method works; they just show that it works.
• The experiments had test subjects take documents from the "aerospace" topic and write "keywords", or "search queries", for that topic.
• Test subjects ranked the documents based on their own query terms and their own judgement.
• The Cranfield Experiments created the concepts of "search language" and "document language", along with "natural language query".
Source: Query Generation Using Structural Similarity Between Documents
6. Lexical Semantics
• Lexicosemantics involves word-sense disambiguation with word compositionality and the language syntax-semantics interface.
• Lexicosemantics helps Formal Semantics (Natural Language).
• Formal Semantics studies the grammatical meaning of natural language with theoretical computer science.
• Lexical Semantics helps with the construction of WordNets, FrameNets, Knowledge Bases and Index Tiers.
• Lexical Semantics is useful for Search Engines to process a text item and understand the "Semantic Scope" of sentences with "modality", "tense", "binding", "aspect", and pragmatics.
• Lexical semantics involves hyponymy, hypernymy, antonymy, homonymy, polysemy, meronymy, holonymy and semantic networks.
Source: Query Generation Using Structural Similarity Between Documents
7. Do You Remember Google Merge?
• What if Google became a semantic search engine by buying another one?
• Oingo was the first search engine focused on meaning-based relevance and advertisement.
• They became "Applied Semantics" in 2001.
• Google and Applied Semantics merged on 18 April 2003.
8. Applied Semantics (Oingo): The First Conceptual Search Engine
• Applied Semantics was created by Eytan Elbaz in 1999.
• Information Extraction and Information Responsiveness work differently than Information Retrieval.
• Lexical Relations do not carry the meaning in query terms, but Query Semantics does. Thus, query semantics were used for the first time to augment and expand a query.
• It is one of the first designs that mentions "semantic distance" and "relationship strength" to create a semantic network of concepts.
• It paved the way to "Index Tiering".
"Typically, search engines match the search terms to the documents as a whole. If the user is interested in specific information, for example, 'sharks', but a particular document about 'beaches around the world', for example, only has one sentence about sharks, it is unlikely that the search engine would return the document. Documents like the one described are likely to score very low under the query for 'sharks', if at all, because the document as a whole is not 'about' sharks."
Source: Methods and systems for detecting and extracting information
9. Do You Remember Google Merge?
• Similarity ("gluttonous" is similar to "greedy") – near synonyms
• Membership ("commissioner" is a member of "commission")
• Metonymy (whole/part relations) ("motor vehicle" has part "clutch pedal")
• Substance (e.g. "lumber" has substance "wood")
• Product (e.g. "Microsoft Corporation" produces "Microsoft Access")
• Attribute ("past", "preceding" are attributes of "timing")
• Causation (e.g. travel causes displacement/motion)
• Entailment (e.g. buying entails paying)
• Lateral bonds (concepts closely related to one another, but not in one of the other relationships, e.g. "dog" and "dog collar")
• Capitonyms (Polish (nation), polish (shining))
• Troponyms (walking -> hustle, trot, crawl)
• Eponyms (Tommy John Surgery; Biswanath Panda -> Panda Update)
• Demonyms (New Yorkers -> population of New York City, not the State)
• Acronyms (NASA: North American Saxophone Alliance, National Auto Sport Association, National Association of Students of Architecture)
Source: Bill Slawski
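A semantic network built from typed relations like these can be sketched as a simple edge table. The edges below only encode the slide's own examples; a production system would mine such edges from corpora and knowledge bases.

```python
# Typed edges of a tiny semantic network, using relation types from the list above.
EDGES = {
    ("gluttonous", "greedy"): "similarity",
    ("commissioner", "commission"): "membership",
    ("motor vehicle", "clutch pedal"): "metonymy (whole/part)",
    ("lumber", "wood"): "substance",
    ("buying", "paying"): "entailment",
    ("dog", "dog collar"): "lateral bond",
}

def relation(a, b):
    # Look up the relation type in either direction; None if unrelated.
    return EDGES.get((a, b)) or EDGES.get((b, a))

print(relation("dog collar", "dog"))  # lateral bond
```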
10. Formal Semantics
• Formal Semantics involves the philosophy of language and linguistics together.
• Denotations of natural language expressions are used to understand the compositionality of words and their references.
• The nature of meaning is the philosophical part of formal semantics.
• The nature of meaning involves the meanings that come from our nature (Constructivist, Coherence, Correspondence, Consensus, Pragmatic Theories).
• Formal Semantics has two approaches:
• Truth Conditions
• Compositionality
• Formal Semantics is related to Lexical Semantics because, based on lexical relations, the compositionality and truth conditions change.
11. Formal Semantics and Inquisitive Semantics
• Inquisitive Semantics involves raising new but related issues against a truth value.
• For example: "Aspirin is used against headache. Does it work against toothache?"
• The "toothache" and "headache" here have lexical relations to each other as "meronyms".
• The Formal Semantics here helps to understand the truth value of "Aspirin" and its functions.
• Formal Semantics and Truth Conditions have two approaches:
• Dynamic Semantics: the raised issues have to change the context, and the first premise has to be correct.
• Static Semantics: the raised issue doesn't have to be relevant, and the premise doesn't have to be true.
• For example: "John gives SEO suggestions as a Googler. Does John give useful SEO suggestions as a Googler?"
• Technically, John's occupation is not connected to the suggestions' usefulness.
• Dynamic Semantics changes the context of the previous sentence based on interpreter and receiver.
• Multi-stage or chained reasoning is highly relevant to Dynamic Semantics for "context direction".
Source: Multi-level Recommendation Reasoning over Knowledge Graphs with Reinforcement Learning
12. Formal Semantics and Compositionality
• Compositionality is about understanding the lexical relations between the subjects and objects.
• The easiest way to get a formal-semantics view of compositionality is to remove all the meaningful lexical units from the sentence.
• For the sentence "Contadu is the best technology for creating a semantical understanding to optimize content":
• "C is t-t for s-u to o-c".
• The structure here gives the composition of words, and how lexical relations are constructed with constituent rules.
Source: Compositionality by Henk J. Verkuyl, Utrecht University
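The idea of stripping content words to expose the composition can be sketched with a crude function-word whitelist. The whitelist is an assumption made for this example, and the reduction will not exactly match the slide's shorthand:

```python
# Keep function words, reduce content words to their initial letter,
# exposing the sentence's compositional skeleton.
FUNCTION_WORDS = {"is", "the", "a", "an", "for", "to", "and", "of", "in"}

def skeleton(sentence):
    out = []
    for token in sentence.lower().split():
        word = token.strip(".,")
        out.append(word if word in FUNCTION_WORDS else word[0])
    return " ".join(out)

print(skeleton("Contadu is the best technology for creating "
               "a semantical understanding to optimize content"))
# -> c is the b t for c a s u to o c
```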
13. Formal Semantics and Scope
• Scope determines the range over which a specific declaration is valid.
• Formal semantics helps machines process human language to understand that specific scope.
• For example:
• "Every student has a favourite teacher." -> It is not clear whether every student has the same favourite teacher, all of them have different favourite teachers, or some share a favourite teacher while others do not.
• "When three more votes are taken from the court, the decision will be as we want." -> The unclear part here is: why three, and which three? Does the court have different layers of officials with different vote values, or are specific "X, Y, Z" officials needed to vote, and which other decision-makers are against the decision that the person wants? This is an example of Inquisitive Semantics; use it for question generation.
• There are other types of scopes, such as "scope islands" and "exceptional scopes".
Source: Context-Sensitivity and Individual Differences in the Derivation of Scalar Implicature
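The two readings of "Every student has a favourite teacher" are the two quantifier scopings:

```latex
% surface scope: each student s has some (possibly different) teacher t
\forall s\,\big(\mathrm{student}(s) \rightarrow \exists t\,(\mathrm{teacher}(t) \land \mathrm{favourite}(s,t))\big)

% inverse scope: one single teacher t is every student's favourite
\exists t\,\big(\mathrm{teacher}(t) \land \forall s\,(\mathrm{student}(s) \rightarrow \mathrm{favourite}(s,t))\big)
```

The second reading entails the first, but not the other way around, which is exactly why the sentence is ambiguous.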
14. Formal Semantics and Scope
• Scope islands are called islands because they can't be taken out of their scope (the island).
• For example: "If every elephant in the sanctuary gains 5 pounds every 6 months, I will get a promotion." The person doesn't get another promotion each time an elephant gains 5 pounds in 6 months; it happens once.
• Exceptional scope reverses the scope island with the indefinite "a".
• For example: "If an elephant gains 5 pounds, I will take a promotion." The ambiguity and the repetitiveness occur together.
• Scope is important for Compositionality, and Compositionality is important for Lexical Semantics.
Source: Creation of inferred queries for use as query suggestions
15. Formal Semantics and Modality
• Modality is part of Formal Semantics, with propositional content and philosophical logic. There are different modalities:
• Permissible: expresses the acts that are allowed.
• Possible: expresses the acts that are possible.
• Quintessential: expresses the acts' features.
• Evidential: expresses the facts with a factual source.
• Habitual: expresses the habits.
• Iterative: expresses the repeated acts.
• Frequentative: expresses the permanent facts.
Source: Semantic frame identification with distributed word representations
16. Formal Semantics and Binding
• Binding creates a bond between the predicate and the subject. Anaphors are used to express the connections between bound predicates and subjects.
• Modality expresses the lexical relations' features, while binding is for the lexical relations' direction.
• In the sentence "Nancy Pelosi must be the next presidential candidate for her career", the "must be" involves "strong possibility", while "career" is bound to "Nancy Pelosi".
• Set theory works here to create the set "people who must be the next candidates for the presidential election"; "being a presidential candidate" is a possible "political career improvement" act, and "presidential candidate" becomes a topic connected to other types of "candidacies", while "political career steps" and "political discussions" are connected to it.
• Binding and modality work together to create an Information Graph.
• If the sentence changes to "Nancy Pelosi is the best possible candidate for every Democrat in the US", the sentence has a possibility from a different "modality", and the concept of "scope" works here again.
• The declaration says that "Nancy Pelosi is a candidate" for "every Democrat in the US". This illustrates the "scope" and the "compositionality".
• The compositionality here is "N is a c for e d in the U.S".
• The main issue here is that the scope doesn't make sense. If a Democrat goes outside of the US, does it mean that "Nancy Pelosi is suddenly not the best candidate" anymore? Or is she literally the best candidate for every Democrat?
• Thus, the scope here further affects the "modality", and makes the "possibility" "opinionated" rather than a "factual possibility".
• The Formal Semantics components affect each other.
• The output of Formal Semantics affects Lexical Semantics.
• Lexical Semantics affects the Lexical Relations.
• Lexical Relations affect the Information Graph, and Extraction.
• Information Extraction determines the Knowledge Base (Raw Knowledge Graph).
Source: Providing result-based query suggestions
17. Formal Semantics and T-A-M (Tense-Aspect-Mood)
• Tense-aspect-mood has different combinations to extract information and relate lexicosemantics to each other within a data graph.
• Tense involves the position of the action on the timeline.
• Past, Present, Future
• Aspect involves the extension of the state of the action on the timeline.
• Unitary – happened once and suddenly.
• Continuous – happens over time.
• Repeated – happened repeatedly, will happen again.
• Mood (modality) involves the actuality of the action.
• Possibility: might happen.
• Necessity: should happen.
Source: Extracting Semantic Classes from Text
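Extracting the mood signal from a query can be sketched with a modal-marker lookup. The marker table is a toy assumption; real systems use parsers and trained classifiers rather than a word list.

```python
# Map modal verbs to a coarse mood label.
MOOD_MARKERS = {
    "should": "necessity",
    "must": "necessity",
    "can": "possibility",
    "might": "possibility",
    "will": "prediction",
}

def detect_mood(query):
    for token in query.lower().split():
        if token in MOOD_MARKERS:
            return MOOD_MARKERS[token]
    return "indicative"  # no modal marker found

print(detect_mood("what should happen to someone who has hemophilia"))
# -> necessity
```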
18. Transition from Lexical Semantics to Query Semantics
• Query Semantics and Lexical Semantics are different from each other, but highly similar.
• Lexically, words might appear irrelevant to each other while being relevant in Query Semantics.
• For example, "buy" and "sell" are opposites, or antonyms of each other.
• In Query Semantics, "buy" and "sell" are synonyms; in other words, they mean the same thing.
• "Soft Drinks" is a different concept than "Coca Cola". "Soft Drinks" is a hypernym of Coca Cola in Lexical Semantics, but in Query Semantics they might be synonyms.
19. Transition from Lexical Semantics to Query Semantics
• Query Semantics is used for "Query Inference" and "Query Phrasification".
• The query "best temperature for soft drink" is a query for a hypernym in Lexical Semantics.
• Query Semantics is used to generate the same search query for the other members of the same set, because at the same time they are synonyms in query semantics.
• "Soft drinks such as Coca Cola" and "Coca Cola (Soft Drink)" don't represent the same thing in Query Semantics.
• The second phrase is more relevant to "Coca Cola", while the first one is more relevant to the entire "class of things".
20. Transition from Lexical Semantics to Query Semantics
• The query "best temperature for pepsi" requires further query processing with lexicosemantics and query semantics.
• "Best temperature for pepsi" has a missing part:
• for drinking
• for serving
• for producing
• for storing
• for mixing
• All the possible "verbs" come from "lexical semantics" and how they are used in the "query search" language.
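Generating the same query for the other members of a set can be sketched like this. The hypernym map is invented for illustration; a search engine would derive set membership from a knowledge base.

```python
# Members of a set are query-semantic synonyms: the same query template
# can be re-instantiated for each of them.
SET_MEMBERS = {"soft drink": ["coca cola", "pepsi", "fanta"]}

def expand_query(query):
    expansions = []
    for members in SET_MEMBERS.values():
        for member in members:
            if member in query:
                expansions += [query.replace(member, other)
                               for other in members if other != member]
    return expansions

print(expand_query("best temperature for pepsi"))
# -> ['best temperature for coca cola', 'best temperature for fanta']
```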
21. Formal Semantics and T-A-M (Tense-Aspect-Mood)
• Formal Semantics and T-A-M affect lexical semantics.
• The "tense", "aspect" and "mood" combinations create different lexical relations with contexts.
22. Transition from Lexical Semantics to Query Semantics
• The smallest query and word differences can create ranking changes,
• even if the search intent is the same,
• or they mean the same thing.
Source: Compositionality by Henk J. Verkuyl, Utrecht University
Example queries:
what should happen to someone who has hemophilia
what can happen to someone who has hemophilia
what happens to someone who has hemophilia
23. Formal Semantics and T-A-M (Tense-Aspect-Mood)
• The modality "should" represents a responsibility, and a solution for a problem.
• Thus, the result focuses on "treatment" or "precaution", even if the rest of the sentence is the same.
Example queries:
what should not happen to someone who has hemophilia
what will not happen to someone who has hemophilia
what happened to someone who has hemophilia
24. Formal Semantics and T-A-M (Tense-Aspect-Mood)
• Lemmatization, such as "effected" and "effective", brings answers closer.
• The predicate "show" is closer to "demonstrate", and to "metrics" or "tests".
• The predicates and possible compositionalities have different types of themes.
Example queries:
what shows happen to someone who has hemophilia
what effected to someone who has hemophilia
33. Query Semantics
• We also see that "cat" and "dog" can be synonyms.
• "Part-time" and "full-time" can be synonyms.
• But sometimes they are also not synonyms.
• For the query "find job", they might be synonyms.
• For the query "buy pet", they might be synonyms.
• But for "dog food", they are not synonyms.
• "Sign in" and "sign on" might or might not be synonyms.
• "Address" might mean contact information, or just the address as well.
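Context-dependent synonymy like this can be modeled as synonym pairs keyed by query intent. The table below only encodes the slide's own examples; a real system would learn these pairs from query logs.

```python
# Synonym pairs that hold only under a given query intent.
CONTEXT_SYNONYMS = {
    "find job": {frozenset(("part-time", "full-time"))},
    "buy pet": {frozenset(("cat", "dog"))},
}

def query_synonymous(intent, a, b):
    # Two terms are query-semantic synonyms only within the right intent.
    return frozenset((a, b)) in CONTEXT_SYNONYMS.get(intent, set())

print(query_synonymous("buy pet", "cat", "dog"))   # True
print(query_synonymous("dog food", "cat", "dog"))  # False
```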
34. Query Semantics
• New York is not York.
• "York hotels" doesn't mean "New York hotels".
• But Vegas is always Las Vegas.
• If you search from Latin America, York is New York.
• If you search from Africa, still, York is New York.
• If you search from France, it is 50/50.
• If you search from the UK, it is not New York.
35. Query Semantics
• "New" appears alone a lot.
• "York" appears without "New" sometimes.
• The combination of phrases from the documents helps search engines relate these things to each other, or differentiate them.
• How documents use the query phrases determines how people search.
• How people search affects how people use query phrases.
36. Query Semantics
• Bonus: is it worth indexing?
• Even if 1,000,000 searches happen every day?
• What are the synonyms of facial expressions?
37. Query Semantics
• "Prove the cost is worth it."
• Are you worth that cost if you do not use lexicosemantics?
38. Let's talk about "porn".
• This is Matt Cutts.
• His first big task at Google was finding "spammy", but sometimes not spammy, just highly "sexual" queries.
• Why?
• S A F E S E A R C H.
39. Let's talk about "porn".
• And how do you find all this porn?
• How do people search for porn?
• Matt Cutts was an expert on Web Spam, because adult websites use spam a lot.
• "Think twice if your manager asks you what you think about porn."
• – Matt Cutts
40. Let's talk about "porn".
• Matt Cutts used 69 languages and synonyms to find good phrases that could relate to porn.
• "I didn't think about this before. People search for porn with lots of different weird words."
• Matt Cutts tried to convince Google employees to search for porn in weird ways.
• He distributed "cookies"; this is how the "Google Cookie Porn" events happened.
• Lexicosemantics and Query Semantics were tested for the first time across all of Google.