Version 2

    Search is one of the most critical components of a healthy community. A solid search tool helps you find people, places, and content quickly and easily. It lets you search far and wide or narrow your question to very specific parameters. It can make your job much easier - especially if you know how to make it work for you.

     

    We’ve found that the best way to use search effectively in Jive is to understand how it works. In this post, we’ll cover the nitty-gritty of Jive’s search engine so that you’re armed with the knowledge you need to search with success.

     

     

    Let’s start with the basics

     

    To make sure we’re all on the same page, let’s start with some vocabulary.

    • Field: A single piece of information within the content or user profile you're searching. For example, in a document, you have the title/subject, the content, and the tags. For a user, you have first name, last name, expertise, tags, and many more.
    • Place: An area of your site that contains content: spaces, groups, projects, etc.
    • Spotlight search: The search box included in the header of every page. It pops up a limited number of results from content, users and places.

     

     

    What’s included in Jive content search?

    What types of content are covered by Jive search? Unless filtered by content type, the system will search for the search phrase in all of these content types:

    • Direct message
    • Poll
    • Blog post
    • Idea
    • Announcement
    • Document
    • Discussion
    • Question
    • File
    • Photo
    • Status update
    • Task
    • Event
    • Video
    • External Activity
    • Comments on content

     

    Spotlight search vs. advanced search

    What’s the difference between spotlight search and advanced search?

    The spotlight search appears at the top of each page. It’s intended as the “quick and easy” search feature, with only a few options to narrow your search. It also adds a wildcard (*) to the end of your search term, since it searches as you type and expects that you may not have finished typing yet. This means that it anticipates what you’re searching for; if you’re searching for “Library of Congress” and pause while typing “Librar”, it will search for library, libraries, etc., not just “librar”.

    Advanced search takes place on the main search page. It offers many more options to refine your search and does not apply the wildcard, as it expects you to provide all of your detailed criteria for the most specific results.

    Both search options use the same algorithm to return results. The algorithm uses “OR” search, which means that it will find results that include at least one of the words in the search phrase. In other words, results don’t have to include every word in the search query. The algorithm also searches all included text, including attachments and comments - not just the initial blog post, document, or discussion.
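
    The “OR” behavior and the spotlight wildcard can be illustrated with a minimal sketch. This is not Jive's implementation: the function name matches and its spotlight flag are hypothetical, the wildcard is assumed to apply to the last term typed, and stemming is ignored for simplicity.

```python
def matches(query: str, text: str, spotlight: bool = False) -> bool:
    """Return True if text contains at least one query term (OR search).

    With spotlight=True the last term is treated as a prefix, imitating
    the wildcard (*) that spotlight search appends while you type.
    (Hypothetical helper; stemming is not modeled here.)
    """
    terms = query.lower().split()
    words = text.lower().split()
    for i, term in enumerate(terms):
        is_last = i == len(terms) - 1
        if spotlight and is_last:
            # Prefix match stands in for the trailing wildcard.
            if any(word.startswith(term) for word in words):
                return True
        elif term in words:
            return True
    return False


# OR search: one matching term is enough.
print(matches("library congress", "Visit the Congress today"))          # True
# Without the wildcard, the unfinished word "librar" matches nothing.
print(matches("librar", "Public libraries near you"))                   # False
# With the spotlight wildcard, "librar" matches "libraries".
print(matches("librar", "Public libraries near you", spotlight=True))   # True
```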

     

     

    How to get the most relevant search results: 6 parameters that influence rankings

    Relevant search results are critical to the success of your community. Let’s explore what parameters impact the relevancy score for a piece of content and the rank it will get when you search for a specific search phrase.

    Several parameters impact the rank of a piece of content and can provide a boost to get it to the top of the search results. This gets a bit technical, but provides a comprehensive overview of how Jive search “thinks”.

     

     

    1. Similarity Score

     

    When searching for a phrase, the system looks at each word in the phrase and checks the match type and the place of the match. Each combination of match type and place has a boost score. The boost score is normalized by the number of times the term appears in the given content: the more often it appears, the better. It is also normalized by the number of times the term appears in the search index as a whole, but in the opposite direction: the more common a term is across the index, the less impact it has on the rank.

     

    Match types reflect how well your search query matches the results:

    • Raw: exact match of the search term
    • Analyzed: matches created by the language analyzer, which uses stemming to find the root of the word. For example, “focusing” will find “focus”, “focused”, etc.
    • Edgengram: for wildcard search matches and for search-as-you-type queries

     

    Place of match is exactly what it sounds like: where in content the match was discovered.

    • Subject - Title field
    • Body - Content
    • Tags - tags added to the content

     

    The combination of these parameters determines the content’s similarity to the search query and boosts the more similar results accordingly. The higher the boost score, the more relevant the result.

    Match type / Place                                            Subject  Body  Tags
    Raw (full match)                                              1.0      0.1   0.5
    Analyzed (When language analyzer states it is the same word)  1.0      0.1   -
    Edgengram (Partial match, for wildcards only)                 1.0      0.1   0.5
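
    The boost table can be read as a simple lookup keyed by match type and place. The sketch below is only an illustration: the names SIMILARITY_BOOST and similarity_boost are hypothetical, and the two Analyzed values are assumed to belong to Subject and Body, since no Tags value is listed for that row.

```python
from typing import Optional

# Boost values copied from the table. No Tags value is listed for
# Analyzed matches, so that combination is deliberately absent.
SIMILARITY_BOOST = {
    ("raw", "subject"): 1.0,
    ("raw", "body"): 0.1,
    ("raw", "tags"): 0.5,
    ("analyzed", "subject"): 1.0,   # assumed column: Subject
    ("analyzed", "body"): 0.1,      # assumed column: Body
    ("edgengram", "subject"): 1.0,
    ("edgengram", "body"): 0.1,
    ("edgengram", "tags"): 0.5,
}


def similarity_boost(match_type: str, place: str) -> Optional[float]:
    """Look up the boost for a match type and place, or None if unlisted."""
    return SIMILARITY_BOOST.get((match_type.lower(), place.lower()))


# A raw match in the subject outweighs the same match in the body.
print(similarity_boost("Raw", "Subject"))   # 1.0
print(similarity_boost("Raw", "Body"))      # 0.1
print(similarity_boost("Analyzed", "Tags")) # None
```

    The lookup covers only the boost itself; the frequency normalization described earlier is not modeled here because the source does not give its formula.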

     

    2. Proximity Score

    The proximity score measures how closely the phrase the user searches for matches what appears in the content. When a user searches for a phrase made up of several words, the phrase may appear in the content exactly as typed, or it may appear in a slightly different form. For example, content with the term "product one-pager brochure" is an approximate match when searching for "product brochure".

    This proximity is also used to boost more relevant results. Exact matches get boosted more than proximity matches.

    • Exact match: when all the search terms appear in the content next to each other.
    • Proximity match: when all the search terms appear fewer than 3 words apart from each other.

     

    Place    Proximity boost  Exact match boost
    Subject  Default: 0.5     Default: 1.6
    Body     Default: 0.5     Default: 1.0
    Tags     Default: 0.1     Default: 1.0
    Note: having a proximity score on Tags is unlikely to happen.
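
    To make the exact/proximity distinction concrete, here is a rough sketch under stated assumptions: “fewer than 3 words apart” is read as a position difference of 1 or 2 between consecutive search terms, terms are expected in query order, and a result with no match gets a boost of 0.0. The names classify_match, proximity_boost and PROXIMITY_BOOST are hypothetical.

```python
from typing import Optional

# Default boosts copied from the table, keyed by place and match kind.
PROXIMITY_BOOST = {
    "subject": {"proximity": 0.5, "exact": 1.6},
    "body": {"proximity": 0.5, "exact": 1.0},
    "tags": {"proximity": 0.1, "exact": 1.0},
}


def classify_match(query: str, content: str, max_gap: int = 3) -> Optional[str]:
    """Return "exact", "proximity", or None for a phrase against content."""
    terms = query.lower().split()
    tokens = content.lower().split()
    n = len(terms)
    if n == 0:
        return None
    # Exact: all terms appear next to each other, in order.
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == terms:
            return "exact"

    # Proximity: each term follows the previous one within max_gap - 1 positions.
    def chain(term_idx: int, pos: int) -> bool:
        if term_idx == n:
            return True
        for nxt in range(pos + 1, min(pos + max_gap, len(tokens))):
            if tokens[nxt] == terms[term_idx] and chain(term_idx + 1, nxt):
                return True
        return False

    for start, token in enumerate(tokens):
        if token == terms[0] and chain(1, start):
            return "proximity"
    return None


def proximity_boost(query: str, content: str, place: str) -> float:
    """Boost for the given place; 0.0 when there is no match (assumption)."""
    kind = classify_match(query, content)
    return PROXIMITY_BOOST[place][kind] if kind else 0.0


print(classify_match("product brochure", "product one-pager brochure"))  # proximity
print(classify_match("product brochure", "our product brochure is here"))  # exact
print(proximity_boost("product brochure", "our product brochure", "subject"))  # 1.6
```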

     

    We also look at the frequency. The score has a lot to do with how many occurrences of the word you're searching for exist in the field. For instance, if you write a 20,000 word essay that makes a single reference to the movie "Finding Nemo" somewhere in the document and you have another document in the system (or a status update or a blog post or a thread, etc.) that's only 50 words and includes "Finding Nemo", the system assumes that the latter is more relevant to a query for "nemo".
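
    The “Finding Nemo” example can be approximated with a simple term-density calculation. The exact normalization Jive applies is not given here, so the ratio below (occurrences divided by word count) is only an assumed stand-in, and term_density is a hypothetical helper.

```python
def term_density(term: str, text: str) -> float:
    """Share of words in text that equal term (0.0 for empty text)."""
    words = text.lower().split()
    if not words:
        return 0.0
    return words.count(term.lower()) / len(words)


# A 20,000-word essay with a single mention versus a 50-word post.
long_essay = " ".join(["filler"] * 19_999 + ["nemo"])
short_post = " ".join(["filler"] * 49 + ["nemo"])

# The short post has the higher density, so it would be treated as more relevant.
print(term_density("nemo", short_post) > term_density("nemo", long_essay))  # True
```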

     

    3. Outcome type

    Content in Jive can be marked with structured outcomes. These outcomes impact the score of that content in the search results.

    Outcome  Final  Outdated  Official  Default
    Boost    1.4    0.1       2.0       1.0

     

    The content’s rank score is multiplied by the boost in the table based on its structured outcome. A higher boost will result in content being ranked higher in the search results and vice versa, so the 0.1 score for outdated documents significantly reduces their rank.

     

    4. Object type

    Similarly, content is boosted in search results based on its type. Documents and blogs are ranked higher as these are usually used for more comprehensive content that may be more relevant for the searching user. Documents and blogs get a boost of 1.4, while discussions, questions, polls, ideas, videos, and status updates get a boost of 1.0.
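
    The outcome and object type boosts both act as multipliers, so their combined effect can be sketched as a product of two lookups. The dictionary and function names are hypothetical, and falling back to 1.0 for a type not listed above is an assumption.

```python
# Boosts as listed for structured outcomes and object types.
OUTCOME_BOOST = {"final": 1.4, "outdated": 0.1, "official": 2.0, "default": 1.0}
OBJECT_TYPE_BOOST = {
    "document": 1.4,
    "blog": 1.4,
    "discussion": 1.0,
    "question": 1.0,
    "poll": 1.0,
    "idea": 1.0,
    "video": 1.0,
    "status update": 1.0,
}


def type_multiplier(outcome: str, object_type: str) -> float:
    """Combined outcome and object type multiplier (1.0 if unlisted: assumption)."""
    return OUTCOME_BOOST.get(outcome, 1.0) * OBJECT_TYPE_BOOST.get(object_type, 1.0)


# An official document is multiplied far more than an outdated discussion.
official_doc = type_multiplier("official", "document")
outdated_discussion = type_multiplier("outdated", "discussion")
print(official_doc > outdated_discussion)  # True
```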

     

    5. Recency

    Recency (also known as time decay) lowers the score for older content. The impact of content can be seen this way:

    The Recency score calculation is based on these parameters:

    • Drop speed: set to 50. This determines how fast the algorithm reduces the content score by age.
    • Max value: set to 4 weeks. All the new content from the last 4 weeks has the same score without decay.
    • Minimum score: set to 0.9. This makes the maximum score difference of a very old document and a just-created document 2x. It is set so that even the oldest relevant content can still be found, but fresh content retains precedence.

     

    6. Social Score

    The algorithm calculates a social score for a piece of content based on given user activities, follows and other behavioral connections.

     

     

    Final Rank

    The final rank of a piece of content is based on a combination of all of these parameters. This is how it is calculated:

    Rank = (SimilarityScore + ProximityScore) * OutcomeType * ObjectType * Recency * SocialScore
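
    As a worked illustration of the formula, the sketch below plugs in sample factor values. The function name final_rank and the input numbers are illustrative only; the recency and social factors are set to 1.0 because their exact calculation is not spelled out here.

```python
def final_rank(similarity: float, proximity: float, outcome: float,
               object_type: float, recency: float, social: float) -> float:
    """Rank = (SimilarityScore + ProximityScore) * OutcomeType * ObjectType * Recency * SocialScore."""
    return (similarity + proximity) * outcome * object_type * recency * social


# Same text match (subject match plus exact phrase in the subject),
# once on an official document and once on an outdated document.
official = final_rank(1.0, 1.6, outcome=2.0, object_type=1.4, recency=1.0, social=1.0)
outdated = final_rank(1.0, 1.6, outcome=0.1, object_type=1.4, recency=1.0, social=1.0)

print(round(official, 3))  # 7.28
print(round(outdated, 3))  # 0.364
```

    Because every factor after the sum is a multiplier, a single low value such as the 0.1 outcome boost pulls the whole rank down, regardless of how well the text matches.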

     

    The final rank number determines what will be displayed in your search results and in what order, with the objective of surfacing the most relevant content first.

     

     

    Admin tip: using synonyms to improve search

    You can define common synonyms for your particular system, like "docs" and "documentation". To add synonyms, go to Admin Console > System > Settings > Search > Synonyms, enter a pair of words separated by a comma in the Synonyms box, then click Add Synonym.
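
    Conceptually, a synonym pair lets a search for one word also find the other. How Jive applies synonyms internally is not described here, so the following is only a sketch of the idea: parse_synonym_pair and expand_query are hypothetical helpers, and treating each pair as working in both directions is an assumption.

```python
from typing import List, Tuple


def parse_synonym_pair(entry: str) -> Tuple[str, str]:
    """Split a comma-separated entry such as "docs, documentation" into a pair."""
    left, right = (part.strip().lower() for part in entry.split(","))
    return left, right


def expand_query(query: str, pairs: List[Tuple[str, str]]) -> List[str]:
    """Return the query terms plus any synonyms, treating pairs as two-way (assumption)."""
    expanded: List[str] = []
    for term in query.lower().split():
        if term not in expanded:
            expanded.append(term)
        for a, b in pairs:
            if term == a and b not in expanded:
                expanded.append(b)
            elif term == b and a not in expanded:
                expanded.append(a)
    return expanded


pairs = [parse_synonym_pair("docs, documentation")]
print(expand_query("docs setup", pairs))  # ['docs', 'documentation', 'setup']
```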

     

     

     

    Searching for people and places

     

    In addition to searching for content, you can also search for users and places, such as spaces, groups and projects. There are some important differences to note with these types of search.

     

    @mentioning

    When you @mention someone or something, Jive searches in a similar way to spotlight search. The search algorithm takes what you've typed in so far and adds a wildcard to it; like spotlight search, this means that no stemming is done with this search. The main difference from spotlight search is that @mentioning only searches the title (of content or place) and username, name and email of a user.

    User search

    When searching for users, the system searches for the phrase in each of the profile fields that the person performing the search has access to according to their user settings (for instance, an admin will have access to more fields than a standard user). This includes searching for users through the front end (spotlight and advanced search) as well as searching for users in the admin console People tab.

     

     

    Places search

    When searching places, such as spaces, groups or projects, Jive searches the title, the description, and the tags.

    The same search algorithm applies here; a field that contains 5 words, one of which is a match, will receive a higher score than a field that contains 25 words, one of which is a match. If you're having trouble getting your place to show up at the top of a search for a particular term, be sure to use the search term in the title, description and tag field as many times as possible, with as few other words as possible.

     

    Searchable place types:

    • Space
    • Group
    • Project
    • Personal blogs

     

     

     

    Tips for tweaking your search

    Finally, here are some tips and tricks for searching more effectively in Jive.

    • Add wildcards (*) to your search.
      • Note that wildcards can't be used as the first character of a search. This means that you can't search for all users with a particular email domain. For example, a search for '*@jivesoftware.com' will return no results (unless you have a user who has the exact string '@jivesoftware.com' as part of their profile, such as their username.)
    • Use filters to narrow down the default search range.
      • You can choose a different search range (other than the default “all”) if you're only looking for more recent items, specific content types, or a particular author, for example. You can also narrow the search by outcome types.
    • Change the order of your search results.
      • The order of search results is set by the system according to relevance. You can change the order in the advanced search page by sorting by last modification date or turning social search on or off.
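
    The wildcard rule from the first tip can be expressed as a small pre-check before running a search. This is a sketch, not part of Jive: is_valid_wildcard_query is a hypothetical helper that only tests the first character of the search.

```python
def is_valid_wildcard_query(query: str) -> bool:
    """Return False if the search is empty or starts with a wildcard (*)."""
    stripped = query.strip()
    return bool(stripped) and not stripped.startswith("*")


print(is_valid_wildcard_query("librar*"))             # True: trailing wildcard is fine
print(is_valid_wildcard_query("*@jivesoftware.com"))  # False: leading wildcard
```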

     

     

    Default v5 Parameter configuration

    Here are the v5 search index parameters currently in use. They show which fields, content types, and boosts are weighted more heavily and how the search relevancy ranking is determined.

     

    Hopefully this provides a better understanding of how Jive search works and how to make it work for you. Questions? Let us know in the comments.