Legacy XML Migration Documentation

Version 1

    This document is outdated and kept here only for reference. Please consult the current XML Content Migration Documentation.

     

    Overview

    The XML Migration Plugin is an internal framework used by Jive Professional Services to aid in customer migration engagements. The plugin processes a set of XML files, which represent content exported from the customer's source platform. The high-level data flow during the actual migration is as follows:

    [Figure: Migration Framework XML data flow]

     

    Overview

    1. Download the attached Samples and XSD archive.
    2. Extract it and inspect the folder structure.
    3. Create an XML export for one of your source object types, typically users.
      1. Match the XML bean definition provided in the samples.
      2. Match the file structure, i.e. more than one bean per file, but no more than 100 MB compressed per XML file.
      3. Match the provided folder structure exactly for each content type.
      4. Ensure that generated XML is exclusively UTF-8 encoded.
      5. Wrap non-conforming content in <![CDATA[  ... ]]> sections.
    4. Validate the generated XML using the XSD.
    5. Validate XHTML contents of any object type that contains markup.
    6. Repeat for all necessary source content types.
    7. ZIP all contents into a single archive and send to Jive for review/testing.
      1. Do not use TAR and/or gzip formats.
      2. Ensure you are using the correct compression tool parameters to encode file names as UTF-8.
      3. Please be aware that Windows has a 255 character limit for file name lengths, but with the folder structure provided you should not run into that.
    8. Tip: When migrating binary documents, it can be very helpful to have a small program that runs through the generated XML and validates that the file paths and filenames match what is actually on the filesystem in the /attachments folder. It is much easier for a client to create such a program than to let the migration framework find the problems on very large sets of files.
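
    The tip above can be sketched as a short Python script. This is only an illustration under stated assumptions: the export root contains an attachments folder, every <url> value is a path relative to that folder, and the function names (find_missing_files, check_export) are hypothetical helpers, not part of the migration framework.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def local_name(tag: str) -> str:
    """Reduce a namespaced tag such as '{http://...}url' to 'url'."""
    return tag.rsplit("}", 1)[-1]


def find_missing_files(xml_file: Path, attachments_dir: Path) -> list[str]:
    """Return every <url> value in xml_file with no matching file on disk."""
    missing = []
    for element in ET.parse(xml_file).iter():
        if local_name(element.tag) != "url" or not element.text:
            continue
        # Assumption: <url> values are relative to the attachments folder.
        relative = element.text.strip().lstrip("/")
        if not (attachments_dir / relative).is_file():
            missing.append(element.text.strip())
    return missing


def check_export(export_root: Path) -> dict[str, list[str]]:
    """Map each XML file under export_root to its unresolved <url> values."""
    attachments_dir = export_root / "attachments"
    report = {}
    for xml_file in sorted(export_root.rglob("*.xml")):
        missing = find_missing_files(xml_file, attachments_dir)
        if missing:
            report[str(xml_file)] = missing
    return report
```

    Running check_export(Path("xml-data")) before zipping the export returns an empty dictionary when every referenced file is present.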

     

    Requirements

    1. The names of the element type folders are mandatory: they must match the provided names exactly and be uppercase.
    2. The XML file names within each folder are arbitrary.
    3. All files must start/end with the <importSource> tag.
    4. All dates must be provided as milliseconds since the epoch, for example 1242365364000 for Fri, 15 May 2009 05:29:24 GMT.
    5. Numeric values must not contain leading zeros; for example, 00123 is invalid.
    6. The entire export must be compressed into a single ZIP archive named <customer>-xml-YYYY-MM-DD-N.zip, where YYYY-MM-DD is the date of the export and N is a running number starting at 1, so that more than one export can be provided per day.
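
    The date, numeric, and archive-name requirements can be illustrated with a few Python helpers. This is a minimal sketch: the function names are hypothetical, and the customer name used in the usage note is a placeholder.

```python
import re
from datetime import date, datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(moment: datetime) -> int:
    """Convert a timezone-aware datetime to milliseconds since the epoch."""
    if moment.tzinfo is None:
        raise ValueError("naive datetime: attach a timezone before converting")
    # Integer division of timedeltas avoids floating-point rounding.
    return (moment - EPOCH) // timedelta(milliseconds=1)


def has_leading_zero(value: str) -> bool:
    """True for numeric strings such as '00123'; a lone '0' is allowed."""
    return re.fullmatch(r"-?0\d+", value) is not None


def archive_name(customer: str, export_date: date, run: int) -> str:
    """Build the <customer>-xml-YYYY-MM-DD-N.zip archive name."""
    return f"{customer}-xml-{export_date:%Y-%m-%d}-{run}.zip"
```

    For the date used in requirement 4, to_epoch_millis(datetime(2009, 5, 15, 5, 29, 24, tzinfo=timezone.utc)) yields 1242365364000, and archive_name("acme", date(2024, 1, 5), 1) yields acme-xml-2024-01-05-1.zip.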

     

    XML Schema Specifications

    Most requirements for the XML beans are encoded in the attached XSD files. Use schema1.xsd for validation.

    1. There are several abstract bean types defined with common element definitions, from which all other bean definitions derive. It is generally easier to start from the samples and use them as documentation rather than navigating the XSD, unless your XML tool has full XSD support.
    2. Each bean must have a unique ID among all beans of the same type. The same ID, however, can be reused across different types. For example there can be both a blog post and a discussion message each with the ID 1001.
    3. Any ID that references another bean's ID must actually exist in the other bean definition. For example, a blog post's UserID must correspond to an existing bean of type USER with that same ID.
    4. The SourceID is generally not provided. Each bean's ID element is considered your source system's ID.
    5. For any bean deriving from PropertyBean, a <properties> element is required, even if it is empty. These properties are occasionally used for custom migration and transformation requirements.
    6. Any bean derived from ContainableBean can specify a container to which it belongs, which is always a combination of container type (see folder names for permissible values) and container ID.
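
    Rules 2 and 3 (unique IDs per type, and references that resolve to existing beans) lend themselves to a pre-flight check. The following Python sketch assumes that beans are direct descendants of <importSource> in the namespace shown in the samples, and that user references are stored in a <userID> element; the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace taken from the sample files in this document.
NS = "{http://www.jivesoftware.com/migration}"


def bean_ids(xml_text: str, bean_tag: str) -> list[str]:
    """Collect the <ID> value of every bean of the given type."""
    root = ET.fromstring(xml_text)
    return [bean.findtext(f"{NS}ID", "").strip()
            for bean in root.iter(f"{NS}{bean_tag}")]


def duplicate_ids(ids: list[str]) -> list[str]:
    """Return the IDs that occur more than once within one bean type."""
    return sorted(i for i, count in Counter(ids).items() if count > 1)


def dangling_user_refs(xml_text: str, bean_tag: str,
                       known_user_ids: set[str]) -> list[str]:
    """Return IDs of beans whose <userID> matches no known user bean."""
    root = ET.fromstring(xml_text)
    dangling = []
    for bean in root.iter(f"{NS}{bean_tag}"):
        user_id = bean.findtext(f"{NS}userID")
        if user_id is not None and user_id.strip() not in known_user_ids:
            dangling.append(bean.findtext(f"{NS}ID", "").strip())
    return dangling
```

    Because IDs only need to be unique within one type, duplicate_ids should be called once per bean type rather than on the combined IDs of all types.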

     

    User

    By default, a source user is mapped to an existing user if there is an exact match on either username or email. <registrationStatus> should be REGISTERED.

     

    User Profile

    Valid visibility levels for element levelID are:

    • 1000: ALL_USERS (includes guests)
    • 1001: REGISTERED_USERS
    • 1002: COLLEAGUES
    • 1003: CONNECTIONS
    • 1004: CONNECTIONS_AND_COLLEAGUES
    • 1005: OWNER (private, visible only to user)
    • 1006: ALL_BUT_PARTNER_USERS
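
    An export tool may find it convenient to keep these levels in one place. The enumeration below is only a sketch that mirrors the list above; the class and function names are hypothetical.

```python
from enum import IntEnum


class VisibilityLevel(IntEnum):
    """Visibility levels accepted for the levelID element."""
    ALL_USERS = 1000               # includes guests
    REGISTERED_USERS = 1001
    COLLEAGUES = 1002
    CONNECTIONS = 1003
    CONNECTIONS_AND_COLLEAGUES = 1004
    OWNER = 1005                   # private, visible only to the user
    ALL_BUT_PARTNER_USERS = 1006


def parse_level(value: str) -> VisibilityLevel:
    """Parse a levelID string; raises ValueError for unknown levels."""
    return VisibilityLevel(int(value))
```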

     

    Community (Space)

    1. <containableTypes> must list every content type the community can contain, for example FORUM if you are loading discussions and DOCUMENT if you are loading documents; otherwise the discussions or documents will not show up once migrated. The valid types are, again, the sample folder names.
    2. <parentCommunityID> should have a value of 1, unless you are loading a hierarchy of communities, in which case the ID must refer to another community bean that comes before it in the file(s).
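
    The ordering rule for community hierarchies can be checked before the export is sent. The Python sketch below works on (community ID, parent ID) pairs listed in export order, so it makes no assumption about element names; the function name is hypothetical.

```python
def misordered_communities(communities: list[tuple[str, str]],
                           root_parent_id: str = "1") -> list[str]:
    """Return IDs of communities whose parent has not appeared earlier.

    communities holds (community_id, parent_id) pairs in export order.
    A parent ID equal to root_parent_id marks a top-level community.
    """
    seen: set[str] = set()
    problems = []
    for community_id, parent_id in communities:
        if parent_id != root_parent_id and parent_id not in seen:
            problems.append(community_id)
        seen.add(community_id)
    return problems
```

    For example, misordered_communities([("10", "1"), ("11", "10"), ("12", "13"), ("13", "1")]) reports community 12, because its parent 13 only appears after it.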

     

    Social Group

    1. <ID> in <socialGroupBean>: must be unique across all socialGroupBean elements. It must be a positive integer with no leading zeros (not 00123).
    2. <containableTypes>: must list every content type the social group can contain, for example FORUM if you are loading discussions, DOCUMENT if you are loading documents, BLOG for blogs, and so on; otherwise that content will not show up once loaded.
    3. <userID> in <socialGroupBean>: must match the <ID> of the userBean for the social group's owner (see the userBean docs).

     

    Social Group Member

    The Social Group Member content type is how users are added to a social group. The owner of the social group is typically set in the socialGroupBean itself (see above), so socialGroupMemberBean elements are normally used only to add further members.

    1. <memberTypes>: pick exactly one member type per socialGroupMemberBean. This is typically MEMBER, since the owner is likely already set by the socialGroupBean (see the Social Group section above). See the memberType enum for the complete list of member types.
    2. <userID> in <socialGroupMemberBean>: must match the <ID> of the userBean for the member (see the userBean docs).

     

    Project

    Projects are somewhat unusual: they are contained by either a social group or a space (community), and they themselves contain other content such as documents or blogs. A project may only contain content types that are allowed by its parent container.

    1. <containerID> in <projectBean>: must match the ID of the container for the project (see communityBean and socialGroupBean).
    2. <containerType>: must be COMMUNITY or SOCIAL_GROUP (all caps), depending on the type of container the project is migrated into.
    3. <userID> in <projectBean>: must match the ID of the user for the project (see userBean).

     

    Document

    The primary elements to focus on are currentVersion (type documentVersionBean) and binaryBody (type binaryDocumentBodyBean).

    1. <containerID> in <documentBean>: must match the ID of the container for the document.
    2. <containerType>: must be COMMUNITY or SOCIAL_GROUP (all caps), depending on the type of container the document is migrated into.
    3. <userID> in <documentBean>: must match the ID of the user for the document (see userBean).
    4. <ID> in <currentVersion>: can simply be set to 1. It only needs to be unique across all versions of the document within <documentBean>.
    5. <userID> in <currentVersion>: set this to the same userID value as in <documentBean>.
    6. <ID>, <containerID>, <containerType>, <userID> in <binaryBody>: set all of these to the same values as in <documentBean>.
    7. <name> in <binaryBody>: the name of the file.
    8. <url> in <binaryBody>: the folder path and filename where the file can be found on the filesystem, given as a path relative to the /attachments/ folder.
      For example, the export directory structure has an /xml-data folder with an /attachments/ folder underneath it: /xml-data/attachments/
      If the file to migrate is called test.pdf and sits in a folder within the attachments directory, such as /xml-data/attachments/sales/pdfs/test.pdf,
      then the <url> value in the XML should be: <url>/sales/pdfs/test.pdf</url>
      (Be aware of some gotchas with folder and filename paths; see the tip on validating file paths above.)
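
    Deriving the <url> value from a file's location can be sketched in Python as follows. The default root /xml-data/attachments comes from the example above; the function name is a hypothetical helper.

```python
from pathlib import PurePosixPath


def attachment_url(file_path: str,
                   attachments_root: str = "/xml-data/attachments") -> str:
    """Return the <url> value for a file stored under the attachments folder.

    Raises ValueError if file_path is not located under attachments_root.
    """
    relative = PurePosixPath(file_path).relative_to(attachments_root)
    return "/" + relative.as_posix()
```

    With the example above, attachment_url("/xml-data/attachments/sales/pdfs/test.pdf") gives /sales/pdfs/test.pdf. PurePosixPath is used so that the result contains forward slashes regardless of the operating system that runs the export.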

     

    Blog

    1. containerID must match the ID value defined in the User elements if this is a personal blog; if the blog goes into a social group, it must match the ID value of that SocialGroup. It does not need to be the user ID of the Jive user account in the existing instance.
    2. For a personal blog, containerType should be USER_CONTAINER.
    3. For a social group blog, containerType should be SOCIAL_GROUP.
    4. For a system blog (SystemBlog), containerType should be SYSTEM_CONTAINER and containerID should be 0 (zero).

    SystemBlog:

        <blogBean>
            <ID>11</ID>
            <properties/>
            <containerID>0</containerID>
            <containerType>SYSTEM_CONTAINER</containerType>
            ...

     

    Blog Post

    1. The attachmentState element must be included and set to PRESENT if the blog post has an attachment.
    2. blogID must be the same as the ID element value of the parent Blog definition.
    3. creationDate, modificationDate, and publishedDate are required (even though the XSD declares minOccurs=0).
    4. The value of the body element must be valid XHTML: the contents that would sit inside an XHTML body element. It is recommended to put the contents of this element in a CDATA block. The literal characters <, &, >, ', and " must be encoded as &lt;, &amp;, &gt;, &apos;, and &quot;.
    5. The combination of blogID, publishedDate, and permalink must be unique among all blog posts. Jive uses this tuple to identify a blog post, so if several posts share the same tuple, only one of them will be displayed.
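
    The uniqueness rule in item 5 is easy to violate when the source system allows several posts per timestamp. A minimal Python sketch of a duplicate check follows; it assumes each post has already been read into a dictionary whose keys mirror the element names, and the function name is hypothetical.

```python
from collections import defaultdict


def duplicate_post_keys(posts: list[dict]) -> dict[tuple, list]:
    """Group post IDs by (blogID, publishedDate, permalink).

    Returns only the tuples shared by more than one post.
    """
    seen = defaultdict(list)
    for post in posts:
        key = (post["blogID"], post["publishedDate"], post["permalink"])
        seen[key].append(post["ID"])
    return {key: ids for key, ids in seen.items() if len(ids) > 1}
```

    Any tuple reported by this check needs a distinct permalink or publishedDate on all but one of the affected posts before the export is generated.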

     

    Comment

    1. containerID should be the ID of the Blog Post that the comment belongs to.
    2. containerType should be the container type BLOG_POST.
    3. parentObjectID should be set to -1.
    4. body must be valid XHTML.
    5. parentCommentID should be 0 (or omitted) for non-nested comments. For nested comments, it should be the ID of the parent comment.
    6. commenterInformation should contain the ID of the User who made the comment.

     

    Attachment

    1. containerID should be the ID of the Blog Post that the attachment should be attached to.
    2. containerType should be the container type BLOG_POST.
    3. mimeContentType should be the MIME type of the attachment, e.g., application/pdf.
    4. name should be the filename of the attachment.
    5. The contents of the attachment can be specified either by the data element, as a base64-encoded value, or by the url element, as a path. Using the url element is recommended if there are a large number of attachments to migrate. One of these two elements must exist.
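
    The choice between data and url can be sketched in Python as follows. The element name attachmentBean and the element order are assumptions modeled on the imageBean samples in this document, so check both against the XSD; the function name is hypothetical, and the sketch reads "one of these two elements" as exactly one.

```python
import base64
import xml.etree.ElementTree as ET
from typing import Optional


def attachment_bean(bean_id: int, container_id: int, name: str, mime_type: str,
                    data: Optional[bytes] = None,
                    url: Optional[str] = None) -> ET.Element:
    """Build an attachment bean with inline base64 data or a url path."""
    if (data is None) == (url is None):
        raise ValueError("provide exactly one of data or url")
    bean = ET.Element("attachmentBean")  # assumed element name
    ET.SubElement(bean, "ID").text = str(bean_id)
    ET.SubElement(bean, "properties")
    ET.SubElement(bean, "containerID").text = str(container_id)
    ET.SubElement(bean, "containerType").text = "BLOG_POST"
    ET.SubElement(bean, "mimeContentType").text = mime_type
    ET.SubElement(bean, "name").text = name
    if data is not None:
        # Inline contents are stored as base64 text.
        ET.SubElement(bean, "data").text = base64.b64encode(data).decode("ascii")
    else:
        ET.SubElement(bean, "url").text = url
    return bean
```

    Inline data increases the size of the XML by roughly a third over the raw file size, which is one reason the url variant suits large attachment sets better.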

     

    Tagged Content & Tags

    1. containerID should be the ID of the content (e.g., document, blog post, message) that the tags are set on. It should not be the ID of the container that holds the content.
    2. containerType should be the content type that the tag is set on (e.g., DOCUMENT, BLOG_POST, DISCUSSION_MESSAGE).
    3. The TagBean being referenced must exist.

     

    Embedded Images

     

    1. containerType will be the type of the content in which the image is embedded, e.g., BLOG_POST

    Example:

    This example uses the "data" element, but you can also use "url" (see 2nd example)

     

    <?xml version="1.0" encoding="UTF-8"?>
    <importSource xmlns='http://www.jivesoftware.com/migration'>
      <imageBean>
          <ID>21</ID>
          <properties/>
          <containerID>245</containerID>
          <containerType>BLOG_POST</containerType>
          <mimeContentType>image/jpeg</mimeContentType>
          <data>/9j/4AAQSkZJRgABAgEAYABgAAD/QQd7=</data>
          <name>test.jpg</name>
      </imageBean>
    </importSource>

     

    The migration code looks for embedded images in the body of discussion messages, including the root message of a discussion. The message body needs to contain an img tag with an attribute called "__embedded_id", whose value must match an ID in the images XML file you are importing.

    This works the same way for blog posts and other content types.


    For example, the reference to the above image within the body of a blog post would look like:


    <p>image test <img __embedded_id="21" src="test.jpg" alt="this is a test image" /></p>
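
    A quick way to catch broken image references is to compare the __embedded_id values found in a body with the IDs of the exported image beans. The Python sketch below uses a regular expression rather than a full XHTML parser, which is an approximation; the function names are hypothetical.

```python
import re

# Matches an img tag carrying a quoted, numeric __embedded_id attribute.
EMBEDDED_ID = re.compile(
    r'<img\b[^>]*?\b__embedded_id\s*=\s*["\'](\d+)["\']', re.IGNORECASE)


def embedded_ids(body: str) -> set[str]:
    """Return every __embedded_id value referenced in an XHTML body."""
    return set(EMBEDDED_ID.findall(body))


def unresolved_embedded_ids(body: str, image_ids: set[str]) -> set[str]:
    """Return referenced image IDs that have no matching image bean."""
    return embedded_ids(body) - image_ids
```

    For the blog post body shown above, embedded_ids returns {"21"}, which resolves against the first imageBean sample (ID 21) but not against the second (ID 56789).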


    This example uses the "url" element to indicate the folder/path & filename of the file on the filesystem

     

    <?xml version="1.0" encoding="UTF-8"?>
    <importSource xmlns="http://www.jivesoftware.com/migration">
        <imageBean>
            <ID>56789</ID>
            <properties/>
            <containerID>101</containerID>
            <containerType>BLOG_POST</containerType>
            <userID>2001</userID>
            <mimeContentType>image/jpeg</mimeContentType>
            <name>dentist1.jpeg</name>
            <url>dentist1.jpeg</url>
        </imageBean>
    </importSource>

     

    Follow

    The migration framework can set up users to follow certain places and content. (It could be extended to support following other users if needed.)

    A "follow" is therefore treated like any other content type and is supplied as an XML source data file.

    Migrating a follow is fairly straightforward.

    Even though a follow is not really "content" and is not "contained" by any container, it is treated as if it were, so that parts of the migration framework can easily be reused.

    1. <ID> in <followBean>: must be unique across all followBean elements.
    2. <containerID> in <followBean>: must match the ID of the place or content being followed (see communityBean, socialGroupBean, projectBean, and so on).
    3. <containerType>: must be COMMUNITY, SOCIAL_GROUP, or PROJECT (all caps), depending on the type of "container" being followed; see the example XML for further container types.
    4. <userID> in <followBean>: must match the ID of the user who follows the container (see userBean).

     

    View Counts

    In previous versions of the XML schema, view counts were migrated as part of the object on which they occurred. They are now migrated as a separate ViewCountBean:

    <viewCountBean>
        <ID>1000</ID>
        <objectType>14</objectType>
        <objectId>1</objectId>
        <viewCount>17</viewCount>
    </viewCountBean>

    1. <ID> is an arbitrary unique ID for the view count instance.
    2. <objectType> is the numeric type ID of the object for which the view count is migrated (see Reference – Object Type IDs); 14 in the example above is a COMMUNITY.
    3. <objectId> uniquely identifies the "viewed" object.
    4. <viewCount> specifies the number of views for that object.
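
    Generating these beans from source-system statistics can be sketched in Python with the standard library. The function name is hypothetical, and the sketch produces a single bean without the surrounding <importSource> wrapper.

```python
import xml.etree.ElementTree as ET


def view_count_bean(bean_id: int, object_type: int,
                    object_id: int, view_count: int) -> ET.Element:
    """Build a viewCountBean with the four elements from the sample."""
    if view_count < 0:
        raise ValueError("view_count must not be negative")
    bean = ET.Element("viewCountBean")
    fields = (("ID", bean_id), ("objectType", object_type),
              ("objectId", object_id), ("viewCount", view_count))
    for tag, value in fields:
        # str() of a Python int never produces leading zeros.
        ET.SubElement(bean, tag).text = str(value)
    return bean
```

    Calling ET.tostring(view_count_bean(1000, 14, 1, 17), encoding="unicode") reproduces the sample above on a single line.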

     

    On-Prem Instructions

    The migration is initiated from the Jive admin console and thus executes in a running instance of Jive:

    1. Remove the node that is running the migration from any cluster configuration.
    2. Shut down all other nodes.
    3. Ensure EAE is up and running and healthy (in the Jive instance's admin console).
    4. Configure and execute the migration. Details on how to do so will be provided by the PS Engineer engaged on your migration project.
    5. It is recommended to rebuild content indexes and to restart Jive when the migration completes.

     

    Reference – Object Type IDs

     

    Object Type                 Value
    ACCLAIM                     -1177427622
    ACCLAIM VOTE                -786106556
    NULL                        -1
    DISCUSSION THREAD           1
    DISCUSSION MESSAGE          2
    USER                        3
    GROUP                       4
    ATTACHMENT                  13
    COMMUNITY (Space)           14
    POLL                        18
    PRIVATE_MESSAGE             20
    ANNOUNCEMENT                22
    AVATAR                      26
    QUESTION                    27
    BLOG                        37
    BLOGPOST                    38
    TRACKBACK                   40
    TAG                         41
    TAG SET                     42
    USER STATUS                 48
    USER RELATIONSHIP           49
    USER RELATIONSHIP LIST      53
    DOCUMENT                    102
    COMMENT                     105
    RATING                      107
    SEARCH QUERY                109
    DOCUMENT VERSION            120
    DOCUMENT VERSION COMMENT    121
    PROFILE IMAGE               501
    PROJECT                     600
    SOCIAL GROUP                700
    BOOKMARK                    800
    BOOKMARK (EXTERNAL)         801
    VIDEO                       1100
    USER CONTAINER              2020
    WALL ENTRY                  1464927464
    EVENT                       96891546