12 Replies Latest reply on Sep 2, 2016 7:54 AM by b.taub

    Community Manager Reports versus DES Data

    Kathryn

      Hi - I have an administrator who has 40 groups in Jive and has to consolidate CMRs every month (I know others know that pain). He asked if we could do something for him, so I looked at one of his groups and saw that it has 904 active users. I then exported the group's activity using DES (concatenating all four files) and see only 706 unique names. I'm using the same 30-day time period as the CMR - does anyone know why there would be such a discrepancy? Shouldn't all the activity that makes for an "active" user be captured in DES?
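For anyone repeating this check, here is a minimal sketch of counting unique names across several exported files. The column name `actor_name` and the sample rows are assumptions made for illustration, not the actual DES export schema. It normalizes case and whitespace so that the same person is not counted twice, or missed, because of formatting differences between files.

```python
import csv
import io

def unique_actors(csv_texts, actor_column="actor_name"):
    """Return the set of distinct, normalized actor names across several CSV exports.

    actor_column is a hypothetical header name; substitute whatever the real export uses.
    """
    names = set()
    for text in csv_texts:
        reader = csv.DictReader(io.StringIO(text))
        for row in reader:
            # Normalize so "Alice" and "alice " count as one person.
            name = (row.get(actor_column) or "").strip().lower()
            if name:
                names.add(name)
    return names

# Tiny made-up exports standing in for two of the four downloaded files.
activity_csv = "actor_name,action\nAlice,VIEW\nBob,LIKE\n"
content_csv = "actor_name,action\nalice ,CREATE\nCarol,VIEW\n"

actors = unique_actors([activity_csv, content_csv])
print(len(actors))  # 3
```

With real files, read each one from disk and pass its text in the same way. A count that still falls well short of the CMR figure would suggest the gap comes from what is recorded or how "active" is defined, not from the concatenation step.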

        • Re: Community Manager Reports versus DES Data
          Ted Hopton

          I gave up long ago on trying to reconcile CMR data and Jive DES data, Kathryn. I hope you get an answer, but I suspect the reasons for the discrepancy may be complex rather than a simple-to-explain issue. Just my guess. For example, the fact that I can pull CMR reports at one point in time and then pull identical reports months later and get drastically different results has led me to seriously doubt the integrity of CMR data.

          • Re: Community Manager Reports versus DES Data
            Ted Hopton

            Helen Chen, that's a good question. I have not seen differences in Jive DES data when I pull the same reports later. I think that's because Jive DES is simply data -- a set of records. CMR, on the other hand, is really a set of reports and Jive is performing all kinds of calculations on the data, including filtering behind the scenes. I don't know how the CMR data gets stored, but when it's displayed for us in a report, and thus becomes downloadable, it's not a pure record in a system. It's been altered and I suspect that's where things go wrong.

             

            Kathryn Everest, I do think the better data source is DES, for several reasons:

            • I can drill down into the DES data to see every little detail. Thus, when I find suspicious stuff -- as I have -- I can learn where the strangeness is and so decide whether or not to make an adjustment to the data that I use. That's huge, because it is inevitable that there will be either errors in the data at times or events that distort the data. With CMR, it's a black box. There's no way for me to dig in and figure out why something looks odd or why reports are inconsistent at different points in time.
            • I have not found DES data to vary over time. As I explained in my comment to Helen, just above, DES is pure transactional data while CMR is getting mucked with behind the scenes before it's presented to us.

             

            That said, I have no evidence that the number of transactions recorded in DES for any item is more accurate than the number recorded in CMR (to go back to your original question). I just don't know. But it would be pretty easy to test. Create a test script and execute the actions that should trigger the data you're evaluating. Then look in both DES and CMR and see which one more accurately recorded what you know actually happened. That's how I have learned which search activity codes to use in DES, for example. And then post the results of your test! :-)
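The test described above can be scored with a small comparison like the sketch below. The action names and the "recorded" lists are invented for illustration; in practice they would be the actions you deliberately performed and what each source later reports for your test account.

```python
from collections import Counter

def compare_to_expected(expected, recorded):
    """Return {action: (expected_count, recorded_count)} for every action whose counts differ."""
    exp, rec = Counter(expected), Counter(recorded)
    return {
        action: (exp[action], rec[action])
        for action in sorted(set(exp) | set(rec))
        if exp[action] != rec[action]
    }

# Hypothetical test script: the actions a test account actually performed.
performed = ["FOLLOW", "LIKE", "LIKE", "COMMENT"]

# Hypothetical results read back from each source afterwards.
seen_in_des = ["FOLLOW", "LIKE", "LIKE", "COMMENT"]
seen_in_cmr = ["FOLLOW", "LIKE", "COMMENT"]

des_mismatches = compare_to_expected(performed, seen_in_des)
cmr_mismatches = compare_to_expected(performed, seen_in_cmr)
print(des_mismatches)  # {}
print(cmr_mismatches)  # {'LIKE': (2, 1)}
```

An empty result means the source recorded exactly what was performed; any entry shows which action was over- or under-counted.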

              • Re: Community Manager Reports versus DES Data
                Kathryn

                Thanks Ted - I had a few "anomalies" in the DES data which made me suspicious of it (I filed cases, but Support couldn't replicate them so they weren't pursued - though I can replicate them). I also don't always understand how they broke out the data between people, content, places and activities, and wonder if there are any activities that CMR has access to that I don't. I haven't had enough time to research this, and am having trouble finding complete, up-to-date and understandable lists. Thanks for your help here.

                  • Re: Community Manager Reports versus DES Data
                    Ted Hopton

                    Yep, you've raised some good points, Kathryn. I do think DES data has problems. If I have to choose between DES problems and CMR problems, I just feel my opportunity to understand and correct issues is much greater with DES than CMR. Lesser of two evils in that sense :-)

                      • Re: Community Manager Reports versus DES Data
                        Kathryn

                        I just haven't hit my stride with DES yet. So, just as you said, I wanted to test something. I followed a social group and then used DES to show me the activity. I filtered by Destination and used the URL for the group. My activity doesn't appear. (I blurred it, but trust me.) I then downloaded every file (because I still can't figure out the four files and why stuff appears where) and there was no mention of me.

                         

                         

                         

                        I then filtered by actor, typed my name, and my activity appears. Why isn't it showing up when I filter for the group?

                         

                         

                        Yes - I know it says I unfollowed three times, but this is a bug I tried to report; Support could not reproduce it, so they closed the case.

                          • Re: Community Manager Reports versus DES Data
                            Kathryn

                            Oh geez, why is following a group not part of the "destination" filter? In the actor shot, it shows the group as the action_object. When I want to know what happens in a group, do I have to filter for both destination AND action.object? Any others?
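One way to avoid missing events like the follow above is to match the group against more than one field at once. The sketch below assumes a CSV export with columns named `destination_url` and `action_object_url`; those headers, the URLs, and the sample rows are hypothetical and only illustrate the idea of an either-field filter.

```python
import csv
import io

def rows_touching_place(csv_text, place_url,
                        columns=("destination_url", "action_object_url")):
    """Return rows where the place appears in ANY of the given columns.

    The column names are assumptions; use the headers from the real export.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader
            if any(row.get(col, "") == place_url for col in columns)]

group = "https://example.com/groups/demo"

# Made-up export: the FOLLOW row names the group only as the action object.
sample = (
    "actor,action,action_object_url,destination_url\n"
    "kathryn,FOLLOW,https://example.com/groups/demo,\n"
    "ted,CREATE,https://example.com/docs/1,https://example.com/groups/demo\n"
    "helen,VIEW,https://example.com/docs/2,https://example.com/groups/other\n"
)

matches = rows_touching_place(sample, group)
print([row["actor"] for row in matches])  # ['kathryn', 'ted']
```

Filtering on the destination column alone would return only the second row here, which mirrors the missing follow event described above. Whether other fields also need checking is still an open question in this thread.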

                            • Re: Community Manager Reports versus DES Data
                              Ted Hopton

                              Ah, you are using the portal interface. I have limited myself to just the Activity CSVs, as that's the one that Dirk McNealey's script pulls. I've only used that portal to investigate things I could not get from the Activity CSVs, such as search activity details.

                               

                              I don't really understand how the whole dataset works, nor why or how they split things among the four CSV download options. The bulk of my DES experience is just with what you can get from the Activity CSV. I do all my analytics on those.

                               

                              I'm interested in what you can learn. I hope to get access to the other CSVs through Benjamin Taub's 411 Labs InSite tool, so I will want to understand how all four sources fit together.

                                • Re: Community Manager Reports versus DES Data
                                  Kathryn

                                  What I learned is that this is just as frustrating. Perhaps I need to get the data the same way you do - in ONE complete file. I'm hoping Benjamin Taub becomes my white knight as well - but in the meantime, I'm in data hell. I'm bailing on the portal now that I see it downloads stuff in strange increments that are incomplete. Off to research Dirk's script.

                                    • Re: Community Manager Reports versus DES Data
                                      b.taub

                                      I'm here! I'm here!  Wow, Ted Hopton and Kathryn Everest calling me to the bar. I feel like a genie who has been summoned.

                                       

                                      So, no concrete answer, but a philosophical/architectural one. DES is the behavioral data; the APIs are the static data. CMRs report data that is sometimes one, sometimes the other, sometimes neither - the UploadedProfilePhoto event, for example, seems to appear in neither.

                                       

                                      Because you can't really see immediately where a particular CMR is pulling its data from, I believe this is part of the reason for the discrepancies in your reports. Many of you also see this in the user count differences between the CMRs and the user counts in the Admin Console. The data is not different - the algorithm used to derive the different counts is different - but you need to be pretty long in the tooth in your Jive experience in order to know this in any detail.
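To make the "same data, different algorithm" point concrete, here is a small sketch in which two plausible definitions of "active user" give different counts from identical events. Neither definition is claimed to be what Jive uses; the event names and both rules are assumptions chosen only to show the effect.

```python
# Made-up event log: (user, action) pairs.
events = [
    ("alice", "VIEW"),
    ("bob", "VIEW"),
    ("bob", "LIKE"),
    ("carol", "CREATE"),
]

def active_any(events):
    """Definition A (hypothetical): anyone with at least one recorded event."""
    return {user for user, _ in events}

def active_contributing(events, passive=("VIEW",)):
    """Definition B (hypothetical): anyone with at least one non-passive event."""
    return {user for user, action in events if action not in passive}

count_a = len(active_any(events))
count_b = len(active_contributing(events))
print(count_a, count_b)  # 3 2
```

Two reports built on rules like these would disagree without either one being wrong, which is why knowing the rule matters more than the raw total.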

                                       

                                      So, our philosophy for the development of InSite is to pull Everything we can (I think we are the first partner to pull both API and DES data into a single database) and let You choose the algorithm or filter yourself.

                                       

                                      I am certain that if you sat down with a Jive data architect, they would be able to explain in great detail the algorithm used AND why they made the decision to filter that report in that way. It would even make sense.

                                       

                                      We just think that folks like us shouldn't have to suss out the reason each report is the way it is. Instead, we just give you the data and let you do the slicing.

                                       

                                      Having said that, there are still issues - the sheer volume of the DES data is the first one that comes to mind.

                                       

                                      For a more tactical answer I am calling mikhalchuk into the conversation.

                                      • Re: Community Manager Reports versus DES Data
                                        b.taub

                                        I'm posting this reply on behalf of Andrey:

                                         

                                        Of the two data sources I think DES is more trustworthy – mostly because the source/algorithm is “known” – subject to my dropout caveat below.  Having said that, we think one of the values of InSite is that we jump through the data extraction hoops so you don’t have to.  We’re thinking about publishing our algorithms (subject to NDA) so that the “how” is explicit for our customers.

                                         

                                        One way to think about DES data is as “telemetry” from your Jive server – similar to a satellite talking to ground control. DES and your Jive servers are hosted separately and because of this there is a data connection where your Jive server transmits the telemetry data to the DES server.  This connection can be interrupted and we haven’t seen anything about how the data that was created when the connection was down is re-transmitted when the connection is restored. 

                                         

                                        Because of this, DES may have gaps (dropouts) in the data. In our release after next we think we should be able to actually display the dropouts (but maybe not be able to restore them – sorry).  This is just my speculation right now.  This may also be the source of Kathryn’s discrepancies in her two runs.  Perhaps a Jiver monitoring this discussion could clarify how this works now.
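A rough way to look for possible dropouts in an export is to scan the event timestamps for unusually long silences. This is only a sketch under assumptions: the six-hour threshold and the sample timestamps are invented, and a quiet stretch can just as easily be a night or weekend as a lost connection.

```python
from datetime import datetime, timedelta

def find_gaps(timestamps, max_quiet=timedelta(hours=6)):
    """Return (start, end) pairs where consecutive events are further apart than max_quiet.

    max_quiet is an arbitrary illustrative threshold, not a documented DES value.
    """
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > max_quiet]

# Made-up event times standing in for a timestamp column from an export.
event_times = [
    datetime(2016, 9, 1, 8, 0),
    datetime(2016, 9, 1, 9, 30),
    datetime(2016, 9, 2, 7, 0),
    datetime(2016, 9, 2, 7, 45),
]

gaps = find_gaps(event_times)
for start, end in gaps:
    print(start, "->", end, "quiet for", end - start)
```

On a busy community, a sensible threshold would probably be tuned per site, and any flagged gap would still need checking against known activity before calling it a dropout.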

                                         

                                        As Ben said, CMRs are sort of a black box, because you can see the raw data in the downloaded CSVs, but not the source of that data, nor the extraction (filtering) algorithm. Also, I think the algorithm for generating those reports changes over time. That may be why you can run a report and then run the same report a couple of months later and get two different results.

                                         

                                        The DES bug Kathryn is referring to may be just a time zone anomaly we think we are seeing also – where it appears DES may use server time and APIs may use GMT or something similar.  Or it could be something else. I’m not positive on this and would need to see the raw data, not blurred. There also may be a problem with Kathryn’s filter for the destination.  I would need to see the raw data and the query.
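If the time zone suspicion is right, one way to check it is to convert both sets of timestamps to UTC before comparing them. The sketch below assumes a server clock seven hours behind UTC, which is a made-up offset, and uses a fixed offset that ignores daylight saving changes.

```python
from datetime import datetime, timedelta, timezone

def to_utc(naive_local, utc_offset_hours):
    """Interpret a naive timestamp as local time at a fixed UTC offset and convert to UTC.

    The offset is supplied by the caller; nothing here knows the real server zone.
    """
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    return naive_local.replace(tzinfo=local_tz).astimezone(timezone.utc)

# Hypothetical: the same event as stamped by a server at UTC-7 and by an API in UTC.
des_stamp = datetime(2016, 9, 1, 20, 30)
api_stamp = datetime(2016, 9, 2, 3, 30, tzinfo=timezone.utc)

same_moment = to_utc(des_stamp, -7) == api_stamp
dates_differ = des_stamp.date() != api_stamp.date()
print(same_moment, dates_differ)  # True True
```

The two stamps describe the same moment but fall on different calendar days, so a report that buckets by date, or cuts off a 30-day window at midnight, could place the event differently depending on which clock it trusts.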

                                         

                                        Sorry for the speculation and sorry Ben has to post this.  I am heads down making InSite better.

                                        Andrey
