1 Reply Latest reply on Feb 17, 2017 2:08 PM by jgoldhammer

    Jive REST API: how to get HTML resources (URLs) of different content types to create a sitemap

    saeed

      I would like to create a sitemap to use as input for an external crawler job that builds an external search index.

       

      For this purpose I started using the Jive REST API and parse the "html" part of the JSON response:

      "html" : {

        "allowed" : [ "GET" ],

        "ref" : "https://domain.com/docs/DOC-374862"

        },
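
      To illustrate the parsing step, here is a minimal Python sketch. It assumes the parsed response holds its items in a top-level "list" array and that each item carries the "html" entry under "resources" (as fields=resources(html) suggests). The function name and the sample payload are made up for illustration.

```python
import json


def extract_html_refs(payload):
    """Collect the "ref" URL of the "html" resource of every item.

    Assumption: items live in a top-level "list" array and each item
    exposes its links under "resources". Items without an "html" ref
    are skipped.
    """
    refs = []
    for item in payload.get("list", []):
        html = item.get("resources", {}).get("html", {})
        ref = html.get("ref")
        if ref:
            refs.append(ref)
    return refs


# Hypothetical sample payload shaped like the fragment above.
sample = json.loads("""
{"list": [
  {"resources": {"html": {"allowed": ["GET"],
                          "ref": "https://domain.com/docs/DOC-374862"}}},
  {"resources": {}}
]}
""")

print(extract_html_refs(sample))
```

      The collected URLs can then be written out one per sitemap entry.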

       

      I page through the results by increasing the startIndex value.

       

      Two Approaches:

       

      1) REST API query

      http://domain.com/api/core/v3/contents?

      &filter=type(discussion,document,favorite,file,idea)

      &fields=resources(html)

      &abridged=false

      &status=published

      &includeBlogs=false

      &count=100

      &startIndex=101

       

      Result: as long as startIndex is small, the JSON response time is OK, but as startIndex increases, performance becomes very bad and I get a proxy timeout (502 Proxy Error).
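
      For reference, the paging for this first query can be sketched in Python as below. It only builds the request URLs and does not send them. The parameters are copied from the query above; the helper names and the total of 250 items are hypothetical, and the sketch assumes startIndex is zero-based, so the second page starts at 100.

```python
from urllib.parse import urlencode

# Endpoint taken from the query above.
BASE = "http://domain.com/api/core/v3/contents"


def page_url(start_index, count=100):
    """Build the URL for one page of the first query."""
    params = [
        ("filter", "type(discussion,document,favorite,file,idea)"),
        ("fields", "resources(html)"),
        ("abridged", "false"),
        ("status", "published"),
        ("includeBlogs", "false"),
        ("count", count),
        ("startIndex", start_index),
    ]
    # Keep parentheses and commas unescaped so the filter stays readable.
    return BASE + "?" + urlencode(params, safe="(),")


def page_urls(total, count=100):
    """Yield one URL per page (assumption: startIndex is zero-based)."""
    for start in range(0, total, count):
        yield page_url(start, count)


# Hypothetical total of 250 items, which gives three pages.
for url in page_urls(250):
    print(url)
```

      Each page is a separate request whose offset keeps growing, which is the pattern that runs into the timeouts described here.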

      ----------------

       

      2) REST API query

      http://domain.com/api/core/v3/contents?

      &sort=latestActivityDesc

      &fields=resources(html)

      &abridged=false

      &status=published

      &includeBlogs=false

      &count=100

      &startIndex=101

       

      Result: the JSON response time is better, but this query also ends in a proxy timeout once startIndex gets near 30,000.

       

      Question: Is this the right way to get the URLs of all Jive content items? Or is there a better way to do so?

       

      Any help is appreciated.