4 Replies Latest reply on Mar 23, 2016 5:24 AM by ryanrutan

    Gateway timeout on API calls with high start indexes

    iainbrown

      Hi Ryan Rutan, Rashed Talukder,

       

      I hope you're both recovering well from Jiveworld!

       

      I seem to remember one of you mentioning at one point that there may be an alternative way to connect to REST APIs in order to get past gateway timeouts on long-running API queries.

       

      We see this often on the /contents API (as well as others, e.g. /statics), especially when there is a large piece of content included in the results, or if the startIndex is high, e.g. 40000 to 90000.

       

      We've written some workarounds that successively reduce the number of items requested until a response comes back, but the result is not very usable in this state.
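
A minimal sketch of the kind of workaround described above: halve the requested count after every 504 until a response comes back. The `fetch` callable and its `(status, items)` return shape are assumptions made for illustration, not part of the Jive API; in practice it would wrap whatever HTTP client issues the GET on /contents.

```python
from typing import Callable, List, Optional, Tuple

GATEWAY_TIMEOUT = 504

# Hypothetical signature: fetch(start_index, count) -> (http_status, items).
Fetch = Callable[[int, int], Tuple[int, List[dict]]]


def fetch_with_shrinking_count(
    fetch: Fetch, start_index: int, count: int
) -> Optional[List[dict]]:
    """Request one page, halving `count` after each 504.

    Returns the items of the first successful (200) response, or None if
    even count=1 times out. Any other status is treated as a hard error.
    """
    while count >= 1:
        status, items = fetch(start_index, count)
        if status == 200:
            return items
        if status != GATEWAY_TIMEOUT:
            raise RuntimeError(f"unexpected HTTP status {status}")
        if count == 1:
            # Nothing smaller to try: the timeout does not depend on count.
            return None
        count //= 2
    return None
```

A `None` result here means that shrinking the page size did not help, which points at something other than the amount of content requested.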

       

      Am I dreaming that you mentioned a different URL for API calls that bypasses the proxies and reduces the occurrence of these timeouts? If I'm not, can you please let me know what it was?

       

      Thanks!

        • Re: Gateway timeout on API calls with high start indexes

          In the town hall, you may have heard me referencing a timeout issue when loading all people from the API and trying to iterate over them.  If you are running into this issue on other content-types, let me know and we can file some tickets against it. 

           

          As a note, the /people/@all service is the mechanism for loading all users efficiently, page by page, without running into timeouts caused by the cache paging/loading.
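
A rough sketch of paging through /people/@all, under two assumptions that should be checked against your instance: each response is a JSON object with the results in a `list` array, and the URL of the following page (if any) is in `links.next`. The `get_json` callable is hypothetical; it stands in for whatever authenticated HTTP client you use.

```python
from typing import Callable, Dict, Iterator, Optional


def iter_all_people(
    get_json: Callable[[str], Dict], first_url: str
) -> Iterator[Dict]:
    """Yield every person, following the assumed `links.next` URL per page."""
    url: Optional[str] = first_url
    while url:
        page = get_json(url)
        # Assumed response shape: {"list": [...], "links": {"next": "..."}}
        yield from page.get("list", [])
        url = page.get("links", {}).get("next")
```

Because the loop only ever follows the link returned by the previous page, the client never has to compute a large startIndex itself.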

          • Re: Gateway timeout on API calls with high start indexes
            iainbrown

            Yes, that was it.

             

            We do see this on other APIs, specifically documents and statics.

             

            sionascu, joelee could you help with some more information on the timeouts we're seeing?

             

            Ryan, we have opened some cases on this; the suggestion was to iterate down until we're requesting one piece of content at a time to work around the timeout. Stefan and Joseph are the ones experiencing this directly, so I hope they can give you some more context.

             

            Thanks.

            • Re: Gateway timeout on API calls with high start indexes
              sionascu

              Hi Ryan Rutan / Iain Brown,

               

              On both the /contents and /statics endpoints, I get a "504 Gateway Time-out" when the startIndex parameter exceeds certain limits.

               

              For example, a GET on /contents?startIndex=130000&count=100 returns a "Gateway Time-out" response for cisco.jiveon.com.

              Following a suggestion from "Gateway Timeout Error", if I decrease the "count" parameter to 50, 25... down to 1, I still get the same error. This suggests that the cause is neither a specific piece of content nor the number of items requested, but the high "startIndex" value.

              Even assuming that the content at startIndex=130000 has a problem, increasing the startIndex further (to try to skip the problem) returns the same "504 Gateway Time-out."

              Likewise for /statics.
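
The test described above can be sketched as a small diagnostic: try one startIndex with progressively smaller counts and see whether any of them gets a response. The `get_status` callable is hypothetical (it would issue the GET and return the HTTP status code), and the default count sequence is only an example.

```python
from typing import Callable, Iterable


def classify_timeout(
    get_status: Callable[[int, int], int],
    start_index: int,
    counts: Iterable[int] = (100, 50, 25, 10, 1),
) -> str:
    """Classify a 504 at `start_index` by retrying with smaller counts.

    "no-timeout"        : the first (largest) count already responds
    "count-bound"       : a smaller count responds after a larger one timed out
    "start-index-bound" : every count, down to the smallest, returns 504
    """
    counts = list(counts)
    if not counts:
        raise ValueError("counts must not be empty")
    for position, count in enumerate(counts):
        # Any status other than 504 is treated as "the server responded".
        if get_status(start_index, count) != 504:
            return "no-timeout" if position == 0 else "count-bound"
    return "start-index-bound"
```

With the behaviour reported here (504 at every count from 100 down to 1), the function would return "start-index-bound".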

               

              Please note that, depending on the response time of Jive at the time of the query, the startIndex that times out can fluctuate to lower numbers (e.g. 20000).

              jzawadzk - I believe you experienced a Time-out with a lower count?

               

              Thanks,

              Stefan

                • Re: Gateway timeout on API calls with high start indexes

                  Yes, this is a known catch that happens when results are loaded into cache and paged through for return.  If the server is slammed, then it will be able to process fewer results before the request timeout hits.  As discussed at JiveWorld, we need to get issues logged for your scenarios and look into implementing similar patterns to the @all service that go straight to the response rather than through the intermediary cache.  It's worth noting that the API cache design works extremely well for targeting content.  However, when loading extra-large data sets (such as a comprehensive list of content/users), it tends to break down because the operations can't complete before the request timeout hits.  Given enough time, the request would complete, but each iteration would take a little longer than the previous one (all things being equal).

                   

                  Let's start with the tickets you've filed and we can go from there.