We don't crawl it as such; we use SQL and some Perl scripts to generate our own sitemaps of the content for Google.
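For readers who want a picture of the sitemap approach, here is a minimal sketch. It is written in Python rather than Perl, and it assumes the SQL query has already returned (URL, last-modified date) rows. The `build_sitemap` function, the example.com URLs, and the row data are all hypothetical and are not taken from the setup described above.

```python
from datetime import date
from xml.sax.saxutils import escape


def build_sitemap(rows):
    """Render (url, last_modified) rows as a sitemap XML string."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for url, last_modified in rows:
        lines.append("  <url>")
        # escape() turns &, < and > into XML entities so the URL stays valid
        lines.append(f"    <loc>{escape(url)}</loc>")
        lines.append(f"    <lastmod>{last_modified.isoformat()}</lastmod>")
        lines.append("  </url>")
    lines.append("</urlset>")
    return "\n".join(lines)


# Hypothetical rows standing in for the result of the SQL query
rows = [
    ("https://community.example.com/docs/DOC-1001", date(2014, 3, 2)),
    ("https://community.example.com/thread/2002?start=0&tstart=0", date(2014, 3, 5)),
]

sitemap_xml = build_sitemap(rows)
print(sitemap_xml)
```

The resulting file would then be placed where the search engine can fetch it. Which content the query selects is a separate decision that this sketch does not cover.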
I'm very interested in this, but I'm afraid I don't have any practical experience to share. I do have a bunch of questions, though, if you don't mind sharing. What do you mean when you say 'crawl'? I'm curious how your connector extracts the content from Jive. Is it via the REST API? Is it the Content service? I'm also curious how the search engine honours the access control of private and secret groups, or of documents with restricted readership.