Analytics Data Model

    This document describes the data model for the Jive SBS analytics feature. You'll find the specifics of the schema -- tables, relationships and columns in the product documentation and for example here is the SchemaSpy - Jive v6 Analytics schema. For more on the optional analytics module, see the Jive web site.

    Overview of Design

    The analytics data model is a star schema form. The schema is comprised of a fact table that represents the events in Jive SBS, along with corresponding dimension tables that represent the actors and objects that take part in those events. Each column in the fact table contains a key that relates to an entry in the corresponding dimension table. For example, the user_id column contains IDs that can be found in the jivedw_user table. A basic query against the analytics schema will be a SELECT from the fact table, optionally performing INNER JOINS against the dimension tables. The dimension tables are used both constrain the results of the query based on their attributes and also to provide attribute data, such as the usernames of users.

    Activity Fact Table

    At the heart of the model is the activity fact table. Each record in the fact table represents a unique event that occurred. Essentially each record conveys the following:


    At <TIME>, <USER> performed <ACTION> on <OBJECT> (optionally with <INDIRECT OBJECT>) in <CONTAINER>


    Each of the variables in <> is represented as an ID in one of the columns of the fact table.


    The activity_ts column provides the timestamp when the activity occurred in the system. Since timestamps represent times down to the second or nanosecond level, this column provides the ability to analyze activity down to a very fine granularity. However, a foreign key to the JIVEDW_DAY table has also been provided to allow for selecting records using a time dimension. The JIVEDW_DAY dimension is the smallest granularity of the time dimensions available.


    The user_id column contains the ID of the user that initiated the action, sometimes referred to as the actor. This is a foreign key to the JIVEDW_USER table.


    The activity_type column contains a code indicating the action or event that occurred. Refer to the table on event codes. Most common activity codes will be VIEW(10), or CREATE(20), but there are many more.


    The dw_direct_object_id contains the ID of the action's object and its entry can be found in the JIVEDW_OBJECT dimension table. Objects represent all the types of data entities in the application. Most of these types are content types such as documents, blog posts, discussions, comments and any custom content type implemented by a plug-in. However there are other types of entity that can act as an object of an event such as a user or container. For example, an event may represent a connection being made between two users, or another event may describe a user viewing another user’s profile.


    The type of the object in the direct_dw_object_id column will depend on the activity type of the event. For example, a WATCH event will contain any type of object that represents watchable content, but will not contain a user object. For convenience, the object type is also stored in the fact record, although it can also be determined from the DW_OBJECT dimension table.


    The indirect_dw_object_id is an optional field that provides the ID of a secondary object in the event. Not all events require a secondary object. An example of an event that has a secondary object is the MOVE event. With the move event the direct object is the ID of the object being moved and the indirect object is the ID of the container it is being moved to.


    The container ID column contains the ID of the container of the object. This is not applicable to all events.

    Aggregate Fact Tables

    The activity fact table provides the finest level of granularity of events – each event is recorded as a record in the table. Reporting does not often require this level of granularity, especially when performing analysis over a large time frame. Additionally there maybe multiple events that are the same -- for example, a user may read the same document multiple times in the same day. This leads to an explosion in the size of the table that ultimately has a detrimental effect on query performance.


    Aggregate tables provide summarized data across all or specific dimensions. Instead of representing each unique event, each record provides a count of the number of events that occurred for the combination of dimension values. For example, if the aggregate is representing the day granularity, and a user viewed a document multiple times on that day, there will be a single record recording that activity but with a count indicating the number of times it occurred. Aggregate tables will contain significantly fewer records than the base fact table, providing much better query performance.


    The analytics data model provides aggregate tables that summarize the unique events at the Day, Week and Month periods. Additionally, some aggregate tables represent unique events for “All Users”, such as by removing the user dimension from the fact table and summarizing the events. For example, if three users view the same document, it will be represented as a single record but with a count of three. This aggregate is also provided at the Day, Week and Month periods.


    Note: When aggregating up to a particular time period, it is not possible to retain the timestamp in the aggregate table. Instead of the time key of the period can be retained. Fortunately, each of the time dimensions contains beginning and end timestamps representing the period's range. Therefore you can still use timestamps in queries.

    TableDescription
    JIVEDW_ACTIVITY_AGG_DAYActivity aggregated for each day period
    JIVEDW_ACTIVITY_AGG_WEEKActivity aggregated for each week period
    JIVEDW_ACTIVITY_AGG_MONTHActivity aggregated for each month period
    JIVEDW_ACTIVITY_AGG_USER_DAYActivity aggregated for all users, for each day period
    JIVEDW_ACTIVITY_AGG_USER_WEEKActivity aggregated for all users, for each week period
    JIVEDW_ACTIVITY_AGG_USER_MONTHActivity aggregated for all users, for each month period

    Jive Object Dimension

    The Jive object dimension, JIVEDW_OBJECT, contains a record representing each Jive object entity in the system. A Jive object can be any content type, such as a document, discussion message, blog post, comment, and so on. It can also be a container, such as a group, space, or user container. Even a user can be considered an object. Each type of object has a unique object type ID, and for each object type the object ID is unique.


    The Jive object dimension provides these attributes, and also provides a generic “Name” attribute. The name will contain either the title or subject of the content, if it’s available, for any content type that has a specific data load process. For all other content types in the system the name will simply be “CONTENT”.


    For certain content types there are some additional dimension tables that can be joined to the object table – so-called "snowflake" dimensions. These provide additional attributes for those types of objects. The following snowflake dimensions exist:

     

    TableDescription
    JIVEDW_DOCUMENTContains attributes for documents. NOTE: Object types DOCUMENT (102) and DOCUMENT_VERSION (120) refer to the same documents and thus share the same id. The document dimension will contain the most recent name of the document – which is the name of the DOCUMENT_VERSION with the greater version number.
    JIVEDW_MESSAGEContains attributes for discussion messages.
    JIVEDW_BLOGPOSTContains attributes for blog posts.

    User Dimension

    The user dimension contains a record for each user defined in the system and provides attributes such as username, name, firstname and last name. The user ID is unique. The user_id column can be joined to the user_id in the activity fact and aggregate tables to query activity for that user.

    User Profile Fields

    The JIVEDW_USER_PROFILEFIELD dimension is a snowflake from the user dimension. It contains the values for profile fields set for the user and can be used to constrain analysis queries for certain types of users.

    Profile Fields

    The profile fields table contains the names and IDs of all profile field and custom field types defined in the application. It can be joined to the JIVEDW_USER_PROFILEFIELD table to provide the field name if necessary.

    Time Dimensions

    The Time dimensions provide unique keys that represent a time period. The time dimension also provide attributes to assist in selecting those keys, such as the day of the month, the month of the year and the year. They also provide beginning and ending timestamps that represent the boundaries of the time period. This allows queries to select a period using a particular timestamp.


    The primary keys of the time dimensions are used as foreign keys in their corresponding aggregate tables. For example, the MONTH_ID of TIME_MONTH is used in JIVE_ACTIVITY__AGG_MONTH and JIVEDW_ACTIVITY_AGG_USER_MONTH.


    Note: For convenience, the base fact table contains a DAY_ID column in addition to the actual timestamp of the event.

     

    Analytics Event Code Reference

    Event CodeValue
    VIEW10
    CREATE20
    DELETE30
    MODIFY40
    HIDE50
    COPY60
    MOVE70
    UNDELETE80
    EXPIRE90
    RATE_OR_VOTE100
    LIKE/ACCLAIM

    101

    UNLIKE102
    MODERATE110
    SEND120
    APPROVE130
    ONLINE140
    OFFLINE150
    LOG_IN160
    LOG_OUT161
    SEARCH_CONTENT170
    SEARCH_USERS180
    SPOTLIGHT_SEARCH190
    ASSOCIATE200
    DISSOCIATE210
    REGISTER220
    VALIDATE230
    REJECT240
    WATCH250
    REMOVE_WATCH260
    FOLLOW270
    UNFOLLOW280
    AIDED290
    RESOLVED300
    SEARCH TAGS310
    DOWNLOADED320
    MENTIONED330

    Object Type Reference

    Object TypeValue
    ACCLAIM-1177427622
    ACCLAIM VOTE-786106556
    NULL-1
    DISCUSSION THREAD1
    DISCUSSION MESSAGE2
    USER3
    GROUP4
    ATTACHMENT13
    COMMUNITY (Space)14
    POLL18
    PRIVATE_MESSAGE20
    ANNOUNCEMENT22
    AVATAR26
    QUESTION27
    BLOG37
    BLOGPOST38
    TRACKBACK40
    TAG41
    TAG SET42
    USER STATUS48
    USER RELATIONSHIP49
    USER RELATIONSHIP LIST53
    DOCUMENT102
    COMMENT105
    RATING107
    SEARCH QUERY109
    DOCUMENT VERSION120
    DOCUMENT VERSION COMMENT121
    PROFILE IMAGE501
    PROJECT600
    SOCIAL GROUP700
    BOOKMARK800
    BOOKMARK (EXTERNAL)801
    VIDEO1100
    USER CONTAINER2020
    WALL ENTRY1464927464
    EVENT96891546

     

    Special User Reference

    User TypeValue
    Anonymous (Guest) User-1
    System User1

     

    For additional descriptions of analytics events, check out Definitions of Analytics Events.