SharePoint Search - SPSNYC 2014
- 3. WHO AM I?
• Brian Caauwe
• SharePoint Consultant & Speaker
• Avtex Solutions (Minneapolis, MN)
• Email: bcaauwe@avtex.com
• Twitter: @bcaauwe
• Blog: http://blog.avtex.com/author/bcaauwe
• Unfortunate Sports Fan
• Minnesota Twins
• Minnesota Vikings
• Technical Editor
• Professional SharePoint 2013 Administration
• Certifications
• MCM: SharePoint Server 2010
- 4. THANK YOU EVENT
SPONSORS
• Please visit them and inquire about their
products & services
• To win prizes make sure to get your
bingo card stamped by ALL sponsors
- 5. POLL
• SharePoint Version
• 2007 – WSS, MOSS
• 2010 – SPF, Server, FAST
• 2013 – SPF, Server
• Work Roles
• SharePoint Administrator
• SharePoint Developer
• Business User
• Other
- 7. SEARCH EDITIONS
• SharePoint Foundation 2013
• SharePoint Server 2013
• Standard
• Enterprise
• ALL editions now use the SAME search service
• osearch15
• TechNet Reference: http://technet.microsoft.com/en-us/library/cb36484c-0e8f-480e-be88-5daa8bf2d47d#bkmk_SearchfeaturesOnPrem
- 9. SEARCH EDITIONS
SHAREPOINT SERVER 2013 - STANDARD
• Scalable components
• People Search
• Promoted Results
• Customized Sorting
• Graphical Refiners
• Search Server web parts
- 10. SEARCH EDITIONS
SHAREPOINT SERVER 2013 - ENTERPRISE
• Content by Search web part
• Entity Extraction
• Content Processing Enrichment
• Video Search
• Item Recommendations
- 13. SEARCH COMPONENTS
ADMINISTRATION COMPONENT
Component
• Monitors states of all other components
• Manages topology changes
• Finally scalable
• Only one active at a time
Database
• Search Admin Database
• Configuration data
• Topology
• Crawl, Query rules
• Property Mappings
• Content Sources, Crawl Schedules
• Analytics Settings
- 14. SEARCH COMPONENTS
CRAWL COMPONENT
Component
• Performs the crawling
• Invokes connectors / protocol handlers
• SharePoint content
• Business Applications
• File Shares
• More…
• Delivers crawled items AND metadata to Content Processing Component
• Communicates with ALL crawl databases
Database(s)
• Crawl Database
• Crawl history
• Information on crawled items
• Scale out for each 20 million items crawled
• Host distribution
• 2010 Handled by Host URL
• 2013 Handled by Content DB
- 15. SEARCH COMPONENTS
CONTENT PROCESSING COMPONENT (CPC)
Component
• Handles document parsing and iFilters
• Extracts data for Document Parsing and Property Mappings
• Performs linguistic processing
• Entity Extraction
• Generates phonetic name variations (people search)
• Sends items to the Index Component
Database(s)
• Link Database
• Receives information about links and URLs from CPC
• Stores unprocessed information for use in analytics
• Information on search clicks
• # of times people click on results
• Scale out for each 60 million items crawled
• Scale out for each 100 million queries / year
- 16. SEARCH COMPONENTS
ANALYTICS PROCESSING COMPONENT (APC)
Component
• Performs Search Analytics
• Pulls information from Links DB
• Stores information for search reports
• Performs Usage Analytics
• Pulls information from event store
• Generates recommendations, usage and statistics reports
• Sends results to the content processing component to be pushed to the index
Database(s)
• Analytics Reporting Database
• Results of usage analytics
• Statistics information from the analyses
• Scale out when size > 200 GB
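The scale-out thresholds quoted on the component slides lend themselves to a quick back-of-the-envelope estimate. The following is a minimal sketch only: `Get-ScaleOutCount` is a hypothetical helper (not a SharePoint cmdlet), and the item and query volumes are made-up numbers for illustration.

```powershell
# Hypothetical helper (not a SharePoint cmdlet): number of databases needed
# when one database is added per $PerUnit items or queries.
function Get-ScaleOutCount {
    param(
        [Parameter(Mandatory = $true)][double]$Total,
        [Parameter(Mandatory = $true)][double]$PerUnit
    )
    # Round up, and never return fewer than one database.
    $count = [int][math]::Ceiling($Total / $PerUnit)
    return [math]::Max(1, $count)
}

# Illustrative volumes only (assumptions, not measurements).
$crawledItems   = 45000000     # items crawled
$queriesPerYear = 250000000    # expected queries per year

# Crawl databases: one per 20 million crawled items.
$crawlDbs = Get-ScaleOutCount -Total $crawledItems -PerUnit 20000000

# Link databases by query volume: one per 100 million queries per year.
$linkDbsByQueries = Get-ScaleOutCount -Total $queriesPerYear -PerUnit 100000000

"Crawl databases: $crawlDbs"
"Link databases (by query volume): $linkDbsByQueries"
```

The Analytics Reporting Database is sized by disk usage (split above 200 GB) rather than by item count, so it is left out of this calculation.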
- 17. SEARCH COMPONENTS
INDEX COMPONENT
Component
• Logical representation of an index replica
• Mapped one-to-one to an index replica
• Each partition holds one or more index replicas
• Receives processed items from content processing component and writes to index
• Receives queries from query processing component
• Returns result sets to the query processing component
On File index
• Located ON SharePoint servers housing index component
• Index update groups
• Default (majority of managed properties)
• Security (ACL managed property)
• Link (managed properties related to link structure)
• Usage (managed properties related to usage data)
• People (managed properties related to people search)
• Full-text index
• Contains text from searchable managed properties
• Multiple replicas / server supported after October 2013 CU
- 18. SEARCH COMPONENTS
QUERY PROCESSING COMPONENT (QPC)
Component
• Analyses and processes queries
• Decides which query rules are applicable
• Submits query to index component
• Determines which index partition to send query to
• Performs pre processing
• Receives result sets from index component
• Performs post processing
• Sends result set back to requestor
• Performs linguistic processing at query time
• Word breaking, stemming, spellchecking, thesaurus
- 19. SEARCH COMPONENTS
COMPONENT PARTNERS
Name CPU Network Disk Memory
Administration ● ● ● ●
Crawl ●● ●●● ●● ●●
Content Processing (CPC) ●●● ●● ●●●
Analytics Processing (APC) ●● ●●● ●● ●●
Index ●●● ●● ●●● ●●●
Query Processing (QPC) ● ●● ●●
The content of this slide is borrowed from Neil Hodgkinson (@nellymo)
- 21. SEARCH ADMINISTRATION
MAPPING TERMINOLOGY FROM 2010 TO 2013
2010 Term                 2013 Term
Scopes                    Result Source
Federated Location        Result Source
Keyword                   Query Rule
Best Bets                 Promoted Result
Managed Property          Schema > Managed Property
Crawled Property          Schema > Crawled Property
Search Result Removal     Crawl Log > URL View > Remove the item from the Index
XSLT                      Display Templates
N/A                       Result Types
N/A                       Result Block
N/A                       Continuous Crawl
Host Distribution Rule    N/A
- 23. SEARCH ADMINISTRATION
SEARCH TOPOLOGY - POWERSHELL
## Get Service ##
$svc = Get-SPEnterpriseSearchServiceInstance -Identity "servername"
## Start Service ##
Start-SPEnterpriseSearchServiceInstance -Identity $svc
## Get Search Service Application ##
$ssa = Get-SPEnterpriseSearchServiceApplication
## Get Active Topology ##
$activeTop = Get-SPEnterpriseSearchTopology -SearchApplication $ssa -Active
## Clone Topology ##
$clone = New-SPEnterpriseSearchTopology -SearchApplication $ssa -SearchTopology $activeTop -Clone
- 24. SEARCH ADMINISTRATION
SEARCH TOPOLOGY - POWERSHELL
## New Administration Component ##
$adminComp = New-SPEnterpriseSearchAdminComponent -SearchTopology $clone `
    -SearchServiceInstance $svc
## New Analytics Processing Component ##
$apc = New-SPEnterpriseSearchAnalyticsProcessingComponent -SearchTopology $clone `
    -SearchServiceInstance $svc
## New Crawl Component ##
$crawlComp = New-SPEnterpriseSearchCrawlComponent -SearchTopology $clone `
    -SearchServiceInstance $svc
## New Content Processing Component ##
$cpc = New-SPEnterpriseSearchContentProcessingComponent -SearchTopology $clone `
    -SearchServiceInstance $svc
- 25. SEARCH ADMINISTRATION
SEARCH TOPOLOGY - POWERSHELL
## New Query Processing Component ##
$qpc = New-SPEnterpriseSearchQueryProcessingComponent -SearchTopology $clone `
    -SearchServiceInstance $svc
## New Index Partition / Replica ##
$idx = New-SPEnterpriseSearchIndexComponent -SearchTopology $clone `
    -SearchServiceInstance $svc -IndexPartition 0 -RootDirectory "D:\SPSearchIndex"
## Activate New Topology ##
$clone.Activate()
## OR ##
Set-SPEnterpriseSearchTopology -Identity $clone
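After activation it is worth confirming that the new topology is the active one and that every component reports healthy. A minimal sketch, assuming it is run in a SharePoint 2013 Management Shell on a farm server with a single Search Service Application:

```powershell
# Load the SharePoint snap-in when running outside the Management Shell.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Assumes exactly one Search Service Application in the farm.
$ssa = Get-SPEnterpriseSearchServiceApplication

# List the components of the currently active topology and their servers.
$active = Get-SPEnterpriseSearchTopology -SearchApplication $ssa -Active
Get-SPEnterpriseSearchComponent -SearchTopology $active |
    Select-Object Name, ServerName

# Report the state of each search component as plain text.
Get-SPEnterpriseSearchStatus -SearchApplication $ssa -Text
```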
- 26. SEARCH ADMINISTRATION
SEARCH TOPOLOGY
Topology Recap
• Ensure service is “online” before using in search topology
• To clone topology, use New-SPEnterpriseSearchTopology -Clone
• Otherwise you won’t have component IDs
• Index Component
• When specifying a root directory, it MUST exist but be empty
• Also, if referencing a remote server, the cmdlet checks for the directory on the LOCAL server
• Always specify a partition, otherwise it chooses 0
• When adding a new partition, it must have the same number of replicas as existing partitions
• After adding a new partition, the index WILL be repartitioned … the amount of time it takes depends on index size
• You can ADD a partition, but not DELETE
• Clean up old topologies / components
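Two of the recap points can be scripted: waiting for the search service instance to come online, and removing topologies that are no longer active. A minimal sketch, assuming a farm server, a single Search Service Application, and the placeholder server name "servername":

```powershell
# Load the SharePoint snap-in when running outside the Management Shell.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

$serverName = "servername"   # placeholder, replace with a real server

# Start the service instance, then poll until it reports Online.
$svc = Get-SPEnterpriseSearchServiceInstance -Identity $serverName
Start-SPEnterpriseSearchServiceInstance -Identity $svc
do {
    Start-Sleep -Seconds 10
    $svc = Get-SPEnterpriseSearchServiceInstance -Identity $serverName
} until ($svc.Status -eq "Online")

# Clean up: remove every topology that is not the active one.
$ssa = Get-SPEnterpriseSearchServiceApplication
Get-SPEnterpriseSearchTopology -SearchApplication $ssa |
    Where-Object { $_.State -eq "Inactive" } |
    ForEach-Object {
        Remove-SPEnterpriseSearchTopology -Identity $_ -SearchApplication $ssa -Confirm:$false
    }
```

The polling loop has no timeout; in a real script a retry limit would be a sensible addition.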
- 29. SEARCH ADMINISTRATION
FARM ADMINISTRATION
Queries and Results
• Authoritative Pages
• Result Sources
• Query Rules
• Query Client Types
• Search Schema
• Query Suggestions
• Enabled / Disabled
• Always / Never Suggest
• Import AND Export
• Search Dictionaries (Term Store Management)
• Company Exclusion / Inclusion
• Query Spelling Exclusion / Inclusion
• Search Result Removal
- 30. SEARCH ADMINISTRATION
FARM ADMINISTRATION
Search Schema (Managed / Crawled Properties)
• Searchable
• Advanced Searchable Settings
• Full-text index
• Weight group
• Queryable
• Retrievable
• Allow Multiple Values
• Refinable
• Sortable
• Safe for Anonymous
• Alias
• Token Normalization
• Complete Matching
• Company Name Extraction
• Custom Entity Extraction
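The same schema settings can be applied from PowerShell. The sketch below creates a text managed property and maps it to a crawled property; the names "CompanyName" and "ows_Company" are hypothetical, and a single Search Service Application is assumed.

```powershell
# Load the SharePoint snap-in when running outside the Management Shell.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

$ssa = Get-SPEnterpriseSearchServiceApplication

# Create a managed property; -Type 1 means Text.
# "CompanyName" is a hypothetical name used for illustration.
$mp = New-SPEnterpriseSearchMetadataManagedProperty -SearchApplication $ssa `
    -Name "CompanyName" -Type 1 -Queryable $true -Retrievable $true

# Refinable and Sortable are set on the object, then saved.
$mp.Refinable = $true
$mp.Sortable  = $true
$mp.Update()

# Map it to an existing crawled property ("ows_Company" is hypothetical).
$cp = Get-SPEnterpriseSearchMetadataCrawledProperty -SearchApplication $ssa -Name "ows_Company" |
    Select-Object -First 1
New-SPEnterpriseSearchMetadataMapping -SearchApplication $ssa `
    -ManagedProperty $mp -CrawledProperty $cp
```

Schema changes like these generally take effect only after the affected content has been re-crawled.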
- 31. SEARCH ADMINISTRATION
FARM ADMINISTRATION - POWERSHELL ONLY
## Result Types ##
$owner = Get-SPEnterpriseSearchOwner -Level Ssa
$word = Get-SPEnterpriseSearchResultItemType -SearchApplication $ssa -Owner $owner | ?{$_.Name -eq "Microsoft Word"}
$pdf = Get-SPEnterpriseSearchResultItemType -SearchApplication $ssa -Owner $owner | ?{$_.Name -eq "PDF"}
$wordPDF = New-SPEnterpriseSearchResultItemType -SearchApplication $ssa -Name "WordPDF" -Owner $owner `
    -ExistingResultItemType $pdf -ExistingResultItemTypeOwner $owner
Set-SPEnterpriseSearchResultItemType -Identity $wordPDF -SearchApplication $ssa -Owner $owner `
    -RulePriority 1 -DisplayTemplateUrl $word.DisplayTemplateUrl
## Thesaurus ##
Import-SPEnterpriseSearchThesaurus -SearchApplication $ssa -FileName "\\server\share\thesaurus.csv"
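The thesaurus file is a plain CSV with the columns Key, Synonym and an optional Language. The sketch below writes a small sample to the temp folder and reads it back to check its shape; the entries and the file location are illustrative assumptions, not part of the original deck.

```powershell
# Build a sample thesaurus file in the temp folder (illustrative entries only).
$path = Join-Path ([System.IO.Path]::GetTempPath()) "thesaurus.csv"
$lines = @(
    "Key,Synonym,Language",
    "IE,Internet Explorer",
    "Internet Explorer,IE",
    "UN,United Nations,en"
)
Set-Content -Path $path -Value $lines -Encoding UTF8

# Read the file back to confirm it parses as expected.
$entries = @(Import-Csv -Path $path)
$entries | Format-Table

# On a farm server the file would then be imported from a UNC path the
# search service can read, for example:
# Import-SPEnterpriseSearchThesaurus -SearchApplication $ssa -FileName <UNC path>
```

Synonyms are one-directional, which is why the sample lists both "IE" and "Internet Explorer" as keys.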
- 32. SEARCH ADMINISTRATION
SITE ADMINISTRATION
Result Types
• Map results to display templates
Consumes farm settings, but allows site independent settings
• Result Sources
• Query Rules
• Search Schema
• Map Existing Managed Properties to Crawled Properties
• New Managed Properties - Types: Text or Yes/No
• Cannot make Sortable, Refinable, Multiple Values
- 35. SEARCH CUSTOMIZATIONS
CRAWL COMPONENT
Custom Connectors
• Really means BCS
• LOBSystemInstance needs ShowInSearchUI to show in Central Admin for content source
• DisplayUriField set on method otherwise URLs in search will start with bdc3://
• LastModifiedTimeStampField set and ChangedIdEnumerator and DeletedIdEnumerator implemented if you want incremental crawls
MSDN Reference: http://msdn.microsoft.com/en-us/library/gg294165.aspx
- 36. SEARCH CUSTOMIZATIONS
CONTENT PROCESSING COMPONENT (CPC)
Content Enrichment Web Service
• Web service call outside of SharePoint to:
• Clean data
• Remove from index
• Augment properties
• Configurations
• Trigger Expression
• Input Managed Properties
• Output Managed Properties
• Failure Mode
• Debug Mode
MSDN Reference: http://msdn.microsoft.com/en-us/library/jj163968.aspx
- 37. SEARCH CUSTOMIZATIONS
CONTENT PROCESSING COMPONENT (CPC)
Content Enrichment Web Service
• Registering the service in PowerShell
$ssa = Get-SPEnterpriseSearchServiceApplication
$cewsConfig = New-SPEnterpriseSearchContentEnrichmentConfiguration
$cewsConfig.Endpoint = "http://externalserver/cews.svc"
$cewsConfig.InputProperties = "Title", "Company"
$cewsConfig.OutputProperties = "Title", "Company", "Prop3"
$cewsConfig.Trigger = 'Contains(Company, "CoName")'
$cewsConfig.FailureMode = "Error"
$cewsConfig.DebugMode = $false
Set-SPEnterpriseSearchContentEnrichmentConfiguration -SearchApplication $ssa `
    -ContentEnrichmentConfiguration $cewsConfig
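Once registered, the configuration can be inspected or removed with the matching Get and Remove cmdlets. A minimal sketch, assuming a farm server and a single Search Service Application:

```powershell
# Load the SharePoint snap-in when running outside the Management Shell.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

$ssa = Get-SPEnterpriseSearchServiceApplication

# Show the currently registered content enrichment configuration.
Get-SPEnterpriseSearchContentEnrichmentConfiguration -SearchApplication $ssa

# Unregister the web service call-out (left commented out on purpose).
# Remove-SPEnterpriseSearchContentEnrichmentConfiguration -SearchApplication $ssa
```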
- 38. SEARCH CUSTOMIZATIONS
CONTENT PROCESSING COMPONENT (CPC)
Custom Entity Extraction
• Different Extraction types
• Word Extraction
• 5 Dictionaries
• Microsoft.UserDictionaries.EntityExtraction.Custom.Word.n
• Word Part Extraction
• 5 Dictionaries
• Microsoft.UserDictionaries.EntityExtraction.Custom.WordPart.n
• Word Exact Extraction
• One Dictionary
• Microsoft.UserDictionaries.EntityExtraction.Custom.ExactWord.1
• Word Part Exact Extraction
• One Dictionary
• Microsoft.UserDictionaries.EntityExtraction.Custom.ExactWordPart.1
TechNet Reference: http://technet.microsoft.com/en-us/library/jj219480.aspx
- 39. SEARCH CUSTOMIZATIONS
CONTENT PROCESSING COMPONENT (CPC)
## Entity Extraction ##
Import-SPEnterpriseSearchCustomExtractionDictionary -SearchApplication $ssa `
    -DictionaryName Microsoft.UserDictionaries.EntityExtraction.Custom.Word.1 `
    -FileName "\\server\share\dictionary.csv"
Custom Entity Extraction
• Sample File
• Import through PowerShell
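A custom extraction dictionary is a CSV file with two columns, Key and Display form. The sketch below writes a small sample to the temp folder and reads it back; the company names and the file location are made up for illustration.

```powershell
# Build a sample dictionary file in the temp folder (illustrative entries only).
$path = Join-Path ([System.IO.Path]::GetTempPath()) "dictionary.csv"
$lines = @(
    "Key,Display form",
    "Contoso,Contoso Ltd",
    "Fabrikam,",
    "Northwind,Northwind Traders"
)
Set-Content -Path $path -Value $lines -Encoding UTF8

# Read the file back to confirm both columns parse as expected.
$entries = @(Import-Csv -Path $path)
$entries | Format-Table

# On a farm server the file would then be imported from a UNC path, e.g.:
# Import-SPEnterpriseSearchCustomExtractionDictionary -SearchApplication $ssa `
#     -DictionaryName Microsoft.UserDictionaries.EntityExtraction.Custom.Word.1 `
#     -FileName <UNC path>
```

An empty Display form, as on the "Fabrikam" row, means the key itself is shown in the refiner.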
- 41. SEARCH CUSTOMIZATIONS
QUERY PROCESSING COMPONENT (QPC)
Ranking Models
• Customize ranking based on YOUR logic
• VERY complex… a LOT of math
Registered in PowerShell
MSDN Reference: http://msdn.microsoft.com/en-us/library/sharepoint/dn169052.aspx
$ssa = Get-SPEnterpriseSearchServiceApplication
$owner = Get-SPEnterpriseSearchOwner -Level Ssa
$customModel = [string](Get-Content .\CustomModel.xml)
$newModel = New-SPEnterpriseSearchRankingModel -SearchApplication $ssa `
    -Owner $owner -RankingModelXML $customModel
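A common starting point for a custom model is a copy of the default one. The sketch below exports the default ranking model's XML to a file for editing; it assumes a farm server and a single Search Service Application, and the output file name is only an example.

```powershell
# Load the SharePoint snap-in when running outside the Management Shell.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

$ssa   = Get-SPEnterpriseSearchServiceApplication
$owner = Get-SPEnterpriseSearchOwner -Level Ssa

# Find the ranking model currently flagged as the default.
$default = Get-SPEnterpriseSearchRankingModel -SearchApplication $ssa -Owner $owner |
    Where-Object { $_.IsDefault }

# Save its XML definition so it can be edited into a custom model.
$default.RankingModelXML | Out-File -FilePath .\CustomModel.xml -Encoding UTF8
```

Before registering the edited copy, the model's id and name in the XML typically need to be changed so they do not collide with the original.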
- 42. SEARCH CUSTOMIZATIONS
QUERY PROCESSING COMPONENT (QPC)
Security Trimming
• Pre
• Augments claims
• Processed BEFORE index lookup
• Accurate refiner counts
• Post
• Secondary security checkpoint
• Processed AFTER index lookup
• Negatively affects refiner counts
Needs to be deployed to GAC
Registered in PowerShell
MSDN Reference: http://msdn.microsoft.com/en-us/library/sharepoint/ee819930.aspx
$ssa = Get-SPEnterpriseSearchServiceApplication
New-SPEnterpriseSearchSecurityTrimmer -Id 1 -SearchApplication $ssa -TypeName "<strongly typed assembly>"
- 43. UX
SEARCH CUSTOMIZATIONS
USER EXPERIENCE
Display Templates
• New way to change search results
• Goodbye XSLT
• Get used to JavaScript
• Available through Design Manager
• Live in Master Page Gallery
• Separate folders for Content by Search and Core Search
• .HTML file
• .JS file (DO NOT TOUCH)
MSDN Reference: http://msdn.microsoft.com/en-us/library/jj945138.aspx
- 45. UX
SEARCH CUSTOMIZATIONS
USER EXPERIENCE
Search Web Parts
• Search Results
• Query Builder
• Auto Refine
• Sorting
• Query Rules
• Inline testing
• Content by Search
• Search Results Web Part settings plus
• Term Navigation
• Tuned for use out of search center
- 47. HOW TO CONTACT ME
• Brian Caauwe
• SharePoint Consultant & Speaker
• Email: bcaauwe@avtex.com
• Twitter: @bcaauwe
• Blog: http://blog.avtex.com/author/bcaauwe
- 48. REFERENCES
SharePoint 2013 training for IT pros
• http://technet.microsoft.com/en-US/sharepoint/fp123606
Search Edition Features
• http://technet.microsoft.com/en-us/library/cb36484c-0e8f-480e-be88-5daa8bf2d47d#bkmk_SearchfeaturesOnPrem
BCS Connector
• http://msdn.microsoft.com/en-us/library/gg294165.aspx
Content Enrichment Web Service
• http://msdn.microsoft.com/en-us/library/jj163968.aspx
- 49. REFERENCES
Custom Entity Extraction
• http://technet.microsoft.com/en-us/library/jj219480.aspx
Ranking Models
• http://msdn.microsoft.com/en-us/library/sharepoint/dn169052.aspx
Security Trimming
• http://msdn.microsoft.com/en-us/library/sharepoint/ee819930.aspx
Display Templates
• http://msdn.microsoft.com/en-us/library/jj945138.aspx