Version 2

    Introduction

    If there are discrepancies between user, document or other counts displayed in the Community Manager Reports (CMR) and the counts displayed in the Jive Application UI, this could indicate an issue in the Jive Analytics database summary tables.

     

    The Jive Application database is the operational or transactional database. It stores user names, documents, spaces, groups and settings. In general, the Jive UI displays data pulled from the Jive Application database. Depending on privileges, a user can add, delete or modify a document or make changes to user accounts or create spaces in the Jive Application database.

     

    The Jive Analytics database is a data warehouse or analytical database. Its primary function is to support reports and queries which may span wide temporal dimensions. To ensure those queries and reports are generated quickly, the Jive Analytics database uses aggregate and summary tables indexed by dates and other data elements designed to support specific queries. From a typical Jive user's perspective, the Jive Analytics database is read only. The Jive Analytics database is populated by an automated process that runs as part of nightly Extract, Transform and Load (ETL) processing, which can also be started manually through the Jive Admin Console. For more information about the Jive Analytics data model, see Analytics DB Data Model.

     

    To understand the relationship between the Jive Application database and the Jive Analytics database better, imagine a Jive Application database that stores thousands of documents. You want to know which months those documents were created and see it in a trend line chart.

     

    To pull that information from the Jive Application database, the database engine would pull each document, look at the created timestamp, extract the month and year and then sort all the documents by the month and year. Depending on the number of documents in the database, that query could use a lot of processing power and take a considerable amount of time.

     

    The Jive Analytics database was developed as a solution to make querying Jive statistics faster and easier. The Jive Analytics database stores a metadata record for every document's creation date in a table called a fact table and rolls up or aggregates those records into summary tables, allowing the database engine to perform calculations like count once and store the results. The Jive Analytics database stores information about more than documents. It stores information about a multitude of events that occurred in the Jive Application database. For a list of those events, see Analytics DB Data Model.

     

    A simplified depiction of the Jive Application and Jive Analytics summary table relationship.

     

    When the counts and aggregates in the Jive Analytics summary tables don't match the counts in the Jive Application database, use the resummarization process to empty the summary tables, recalculate the counts derived from the Jive Analytics fact table and store the new values back in the summary tables.

     

    Prerequisites

    To complete this tutorial, you need:

     

    • Access to the Jive Analytics database with an account that has the privilege to run the truncate command.
    • A SQL client to connect to the Jive Analytics database.
    • Administrator access to the Jive Administrator Console.
    • Access to Jive Community Manager Reports (CMR). For more information about CMR, see Community Manager Reports Plugin.

     

    Step 1 - Identify and document the problem

    Typically, discrepancies between user counts, document counts and other counts are noticed when comparing the Jive Application UI counts against Jive CMR counts.

     

    Common places to look for count discrepancies:

     

    • User count
    • Discussions count
    • Blog post count
    • Status update count
    • Poll count
    • Question count
    • Question answered count

     

    Taking screenshots of the varying counts in the Jive Application UI and the Jive CMR UI is one way to document the discrepancies. In addition, if you have access to the Jive Application database, you can issue the following queries:

     

    User counts:

     

    SELECT COUNT(userid) FROM jiveUser WHERE visible = 1 AND initiallogindate > 0
    select count(userid) from jiveuser where userenabled = 1 and visible = 1;
    select count(userid) from jiveuser where lastloggedin <> 0 and userenabled = 1 and visible = 1;

     

     

    Step 2 - Estimate the amount of time the CMR system will be unavailable

    The resummarization process reads all of the records in the jivedw_activity_fact table. The jivedw_activity_fact table holds records for many Jive events. To see the events sent to Jive Analytics, from the Jive Administration Console click  System > Settings > Analytics > Events.

     

    There is no method to get an absolute measure of time a resummarization will take. The resummarization process reads every record in the fact table, generates aggregates of those records and inserts those aggregate records into summary tables. The speed of the CPU, the load on the CPU during resummarization, the memory and the speed of the I/O read and writes are factors in the resummarization processing time.

     

    To estimate the amount of time a resummarization could take, run the following query to return a count of the number of records in the Jive Analytics fact table.

     

    select count(user_id) from jivedw_activity_fact;

     

     

    Number of records in jivedw_activity_factApproximate processing time in days
    250,0001-5
    500,0003-5
    1,000,0005+

     

    If the jivedw_activity_fact table has 100 million or more records, resummarization will take a very long time and consume a lot of CPU, I/O and memory resources. If the Jive Analytics database with 100 million+ records in the fact table and the Jive Application database run in the same machine, a full resummarization is not recommended. In this case, you should contact Jive Professional Services to design a custom solution for your specific scenario.

     

    Step 3 - Alert users the CMR system will be unavailable

    While the resummariztion process runs, the data in the CMR reports will be inaccurate and/or unavailable.

     

    Step 4 - (Optional) Backup the current Jive Analytics database

    This step is optional. If your standard operating procedures require backing up a database prior to performing mass changes, feel free to back it up. The only tables changed in a resummarization are summary tables derived from data in the jivedw_fact_activity table. The jivedw_activity_fact table is the primary table and wont be modified.

     

    Backing up a database prior to performing mass changes is never a bad idea. However, if you restore the Jive Analytics database to a previous version after your Jive Application database has been running and completed an ETL process, you risk losing event data that occurred since the date and time of the back up.

     

    Step 5 - Clear the summary table data in the Jive Analytics database

    In this step, you empty the Jive Analytics summary tables, using a database command named "truncate". The truncate command is similar to the delete command, but it runs much faster because it does not record every delete transaction in the database transaction log.

     

    truncate jivedw_summary_stats;
    truncate jivedw_adoption_glb_summary;
    truncate  jivedw_adoption_spc_summary;
    truncate  jivedw_places_u_summary;
    truncate  jivedw_content_u_glb_summary;
    truncate  jivedw_adoption_grp_summary;
    truncate  jivedw_content_u_spc_summary;
    truncate  jivedw_content_u_grp_summary;
    truncate  jivedw_users_grp_summary;
    truncate  jivedw_daily_grp_summary;
    truncate  jivedw_user_summary;
    truncate  jivedw_adoption_u_glb_summary;
    truncate  jivedw_content_glb_summary;
    truncate  jivedw_activity_summary;
    truncate  jivedw_adoption_u_spc_summary;
    truncate  jivedw_content_spc_summary;
    truncate  jivedw_user_init_summary;
    truncate  jivedw_adoption_u_grp_summary;
    truncate  jivedw_content_summary;
    truncate  jivedw_places_summary;
    truncate  jivedw_content_grp_summary;
    truncate  jivedw_daily_u_glb_summary;
    truncate  jivedw_question_summary;
    truncate  jivedw_users_glb_summary;
    truncate  jivedw_daily_glb_summary;
    truncate  jivedw_daily_u_spc_summary;
    truncate  jivedw_user_points_summary;
    truncate  jivedw_users_spc_summary;
    truncate  jivedw_daily_spc_summary;
    truncate  jivedw_daily_u_grp_summary;

    Step 6 - Start the ETL process

    There are two ways to start the ETL process:

     

    • From the Jive Admin Console, click System > Settings > Analytics > Data Load > Run ETL now

     

    • Wait for the ETL process to automatically start. By default, the ETL process is scheduled to start at 2a local time. To review the date and times the ETL process previously started and ended, click System > Settings > Analytics > Data Load and review the data in the table labeled ETL Job History.

     

    Step 7 - Monitor resummarization progress

     

    From the official Jive docs,  Core Help: Monitoring Basics:

    The easiest way to monitor this is by querying the jivedw_etl_job table with something like this:

     

    select state, start_ts, end_ts from jivedw_etl_job where etl_job_id = (select max(etl_job_id) from jivedw_etl_job);

     

    • If the state is 1, the ETL is running.
    • If any state is 3, there is a hard failure that you need to investigate.If the difference between  start_ts and  end_ts is too big, you may need to increase the resources for the Analytics database.

     

    Step 8 - Compare new CMR results against old

    After the ETL process finishes, compare the CMR data to the Jive Application UI counts and to the old CMR data you captured in Step 1.