1 Reply Latest reply on Mar 8, 2017 7:08 AM by mpetrosky

    Scraping users and their stream notification settings via API

    mpetrosky

      Hi all, I'll preface this by saying I am a novice programmer at best.

       

      I need to figure out which of our Jive-x users have modified their notification preferences and, as a result, are not receiving emails when we publish certain information in streams.  There is no easy way to do this (that I'm aware of), so I decided to tackle it with the API and Python.  Looking at the API results, I saw that each person object has a streams resource, and each stream in it carries that stream's email delivery setting.

       

      https://xxx.jiveon.com/api/core/v3/people/xxx/streams
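      To make the shape concrete, here is a minimal sketch of how I read that response. The field names (list, name, source, receiveEmails) are the ones my script uses; the sample payload, the user ID and the helper name email_prefs are made up for illustration and are not real data from our instance.

```python
def email_prefs(user_id, streams_payload):
    """Return (user_id, source, name, receiveEmails) tuples for one user's streams."""
    return [
        (user_id, s.get('source'), s.get('name'), s.get('receiveEmails'))
        for s in streams_payload.get('list', [])
    ]


# Made-up sample payload, shaped like the JSON returned by .../people/<id>/streams
sample = {'list': [
    {'name': 'Connections Stream', 'source': 'connections', 'receiveEmails': False},
    {'name': 'Company News', 'source': 'custom', 'receiveEmails': True},
]}

# Streams for which this (hypothetical) user has switched email delivery off
muted = [row for row in email_prefs('2001', sample) if row[3] is False]
```

      Filtering on receiveEmails like this is what I ultimately want to do for every user.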

       

      I figured that if I first built a list of all of the user IDs, I could then loop through each one, gather that user's email preferences, and write them to a CSV file.  The resulting code below works, but it is REALLY slow, since it makes one API request per user.  Has anyone else tried to pull a nested value from every user in the community, and is this the most efficient way to go about it?  Thanks!

       

       

      import base64
      import csv

      import requests

      # jive api pagination variables
      start_index = 0
      items_per_page = 25
      # jive api username
      username = 'xxxxxxxxxx'
      # take the base64 encoded value for the password, then convert it from a byte string to a string
      password = base64.b64decode(b'xxxxxxxxxxxxxxxxxxx').decode('utf-8')
      # auth= makes requests build the Authorization header, so no manual header is needed
      auth = (username, password)
      people_url = 'https://xxxxxxxx.jiveon.com/api/core/v3/people'

      # step 1: page through /people and collect every user id
      user_id_list = []
      while True:
          params = {'filter': 'include-disabled', 'count': items_per_page, 'startIndex': start_index}
          rest_data = requests.get(people_url, params=params, auth=auth)
          # convert the return into a json blob
          blob = rest_data.json()
          # parse the blob to pull out each user's id
          for person in blob['list']:
              user_id_list.append(person['id'])
          # a page with fewer items than we asked for means we've reached the end
          if len(blob['list']) < items_per_page:
              break
          # increment the starting point for the next page
          start_index += items_per_page

      # step 2: fetch each user's streams once and write one csv row per stream
      # (raw string so the backslashes in the path are not read as escapes;
      # text mode with newline='' is what the csv module expects on Python 3)
      with open(r'c:\users\xxxxxxxxxx\documents\emailpref.csv', 'w', newline='', encoding='utf-8') as csvfile:
          writer = csv.writer(csvfile, delimiter=',')
          for user_id in user_id_list:
              stream_url = people_url + '/' + user_id + '/streams'
              stream_blob = requests.get(stream_url, auth=auth).json()
              for stream in stream_blob['list']:
                  email_pref = stream['receiveEmails']
                  email_source = stream['name']
                  email_type = stream['source']
                  writer.writerow([user_id, email_type, email_source, email_pref])
              print(user_id + ' done')
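      On the speed question: the script makes one streams request per user, one after another, so most of the run time is probably spent waiting on the network. One idea I'm considering (a sketch only, not tested against our instance) is to issue those per-user requests from a small thread pool. The names collect_stream_rows and fake_fetch_streams are my own, the sample values are made up, and the fake fetch function stands in for the real HTTP call so the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor


def collect_stream_rows(user_ids, fetch_streams, max_workers=8):
    """Fetch streams for many users concurrently and flatten them into CSV-style rows.

    fetch_streams is a hypothetical callable: user_id -> list of stream dicts.
    """
    rows = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as user_ids
        for user_id, streams in zip(user_ids, pool.map(fetch_streams, user_ids)):
            for stream in streams:
                rows.append([user_id, stream['source'], stream['name'], stream['receiveEmails']])
    return rows


# Stand-in for the real call; in practice this would do something like
# requests.get(people_url + '/' + user_id + '/streams', auth=auth).json()['list']
def fake_fetch_streams(user_id):
    return [{'source': 'connections', 'name': 'Connections Stream',
             'receiveEmails': user_id != '2002'}]


rows = collect_stream_rows(['2001', '2002'], fake_fetch_streams, max_workers=2)
```

      The rows come back in the same order as the ID list, so they can be passed straight to writer.writerows(rows). I would keep max_workers small so as not to hammer the server, and I don't know what rate limits (if any) apply to our instance.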