1 Reply Latest reply on Oct 3, 2013 12:42 PM by gdinning

    Single-source of data for indexing Jive discussions?

    ryankelley

      One of our third-party technical partners is trying to crawl our Jive instance, specifically the discussions. Their crawler gets through only a few pages before stalling, because of the sheer volume of content and the pagination that comes with it. We're looking for help using the Jive API to crawl this content. Any information I can pass on to the vendor would be appreciated.

        • Re: Single-source of data for indexing Jive discussions?
          gdinning

          Jive already indexes content and uses Lucene in the background, so they may be able to use the existing Lucene index. If not, they could restrict the crawler to HTML elements that have certain IDs and classes, and possibly limit which links it follows using regular expressions. These pages definitely have a lot going on, so they will probably want to narrow down what they crawl.
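
          A rough sketch of the last two suggestions, using only the Python standard library. The URL pattern and the class name here are assumptions for illustration, not taken from any particular Jive instance; the vendor would need to inspect the actual page markup and swap in the real values.

```python
import re
from html.parser import HTMLParser

# Assumption: discussion pages live under /thread/<id> or /message/<id>.
FOLLOW = re.compile(r"^/(thread|message)/\d+$")
# Assumption: post bodies are wrapped in an element carrying this class.
CONTENT_CLASS = "jive-rendered-content"
# Void elements never get a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class DiscussionParser(HTMLParser):
    """Collects whitelisted links and the text inside content elements."""

    def __init__(self):
        super().__init__()
        self.links = []   # hrefs that match FOLLOW
        self.text = []    # text fragments found inside content elements
        self._depth = 0   # > 0 while inside a content element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            href = attrs.get("href") or ""
            if FOLLOW.match(href):
                self.links.append(href)
        if tag in VOID_TAGS:
            return
        classes = (attrs.get("class") or "").split()
        if self._depth or CONTENT_CLASS in classes:
            self._depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self._depth:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth and data.strip():
            self.text.append(data.strip())


def extract(html):
    """Return (links worth following, text of the content elements)."""
    parser = DiscussionParser()
    parser.feed(html)
    return parser.links, " ".join(parser.text)
```

          Calling extract() on each fetched page gives the crawler a short list of links to queue and just the post text to index, so profile pages, navigation, and footers are never followed or stored. It assumes reasonably well-formed HTML; badly nested markup would need a more forgiving parser.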