Introduction to Sitecore Experience Extractor

Note: Experience Extractor requires direct access to MongoDB and Sitecore’s item database, and is currently not compatible with Sitecore xDB Cloud Service.

Experience Extractor is a new tool for Sitecore XP 8.0 and above. Its goal is to get data out of the Sitecore platform and into other data analysis tools.

Let’s face it: there are lots of great data analysis tools, and we needed a better way of piping Sitecore data into them. Marketers and data analysts usually know these third-party tools REALLY well and know very little about Sitecore (sometimes by choice). But in my experience, these people are incredibly resourceful and can manipulate the raw data into whatever format they need once they have it in a tool they are familiar with.

Here is a snippet from the documentation that really stuck with me:

By executing the jobs in Sitecore context using Sitecore’s API rather than querying the underlying databases directly, Experience Extractor provides an option for data integrations that are more robust towards future upgrades of Sitecore, blends data from xDB and Sitecore’s item database, and enables reuse of custom logic for data stored in xDB.

Source: https://github.com/Sitecore/experience-extractor

Particular attention should be given to “blends data from xDB and Sitecore’s item database”, which tells me that Experience Extractor is not limited to xDB data as I expected. Instead, it could potentially be used for exporting data held within the standard Sitecore databases. This is different from one of Sitecore’s previous offerings, Sitecore Engagement Intelligence, which was solely for exporting DMS data.

Nothing is going to be completely future-proof, but “using Sitecore’s API rather than querying the underlying databases directly” is as good as you’ll get with Sitecore.

Using Experience Extractor

Download and install Experience Extractor as a normal Sitecore package from the Marketplace or GitHub. It should be ready to use right after installation, but pay special attention to the guidance that advises against connecting directly to the primary MongoDB instance in a production environment, as doing so could affect performance.

After installation you will be able to open “Experience Extractor” from the Sitecore launch pad.

[Image: EE_app]

Opening the Experience Extractor app presents an interface with various controls to help you build jobs, along with a shell containing the YAML that defines the job (YAML is a human-readable data serialization format).

[Image: EE_interface]

Once you are happy with the job specification (YAML or JSON in the shell), you can submit the job, which creates a zip file that is available for download. The zip file contains a handful of files, including files with the requested data in CSV format.
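As a rough sketch of what working with the download might look like, the Python below opens a zip archive and reads every CSV file in it into a list of rows. The file names, column names, and the UTF-8 encoding used here are made up for illustration; the actual files in an Experience Extractor export will differ.

```python
import csv
import io
import zipfile


def read_csv_tables(zip_bytes):
    """Return {file name: list of row dicts} for every .csv file in a zip archive."""
    tables = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue  # skip the non-CSV files in the archive
            with archive.open(name) as handle:
                # Assumption: files are UTF-8, possibly with a byte order mark.
                text = io.TextIOWrapper(handle, encoding="utf-8-sig", newline="")
                tables[name] = list(csv.DictReader(text))
    return tables


def build_demo_zip():
    """Build a small in-memory zip that stands in for a downloaded job result."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # Hypothetical file and column names, for illustration only.
        archive.writestr("visits.csv", "Country,Visits\nDK,12\nUS,30\n")
        archive.writestr("readme.txt", "not a CSV file")
    return buffer.getvalue()


if __name__ == "__main__":
    for name, rows in read_csv_tables(build_demo_zip()).items():
        print(name, rows)
```

From here the rows can be handed to whatever analysis tool or library you prefer.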

You can also include optional post-processing steps in the job definition that export the requested data directly to a database (mssql or msaccess). This is a great feature that opens up a lot of possibilities, which will be discussed later.

It is important to note that all of this work is done via a REST API, which means you can call the API directly instead of using the user-friendly interface available within Sitecore. Awesome!
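To give a feel for calling the API directly, here is a minimal Python sketch that builds (but does not send) an HTTP POST request carrying a JSON job specification. The endpoint URL and the contents of the job specification are placeholders I made up, not the real Experience Extractor endpoint or schema; check the documentation on GitHub for those, and for how your Sitecore instance expects requests to be authenticated.

```python
import json
import urllib.request

# Placeholder endpoint: the host and path are assumptions, not the real
# Experience Extractor job endpoint.
JOBS_URL = "https://sitecore.example/experience-extractor/jobs"


def build_job_request(job_spec, url=JOBS_URL):
    """Build (but do not send) an HTTP POST request carrying a JSON job spec."""
    body = json.dumps(job_spec).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder spec: copy a real job definition from the shell instead.
    request = build_job_request({"example": "job specification goes here"})
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
    # Sending it would be urllib.request.urlopen(request), plus whatever
    # authentication the Sitecore instance requires.
```

Keeping the request construction separate from the sending step makes it easy to inspect exactly what would go over the wire before pointing it at a real server.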

Differences between Versions of Experience Extractor

It is an interesting choice to use YAML as the format for building jobs in the Experience Extractor interface, since everything works with JSON in the backend. YAML is a bit easier to read for non-programmers and might make things more user-friendly for them. The good news is that JSON is still accepted in the shell, which is awesome since the documentation still references only JSON examples.
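To illustrate why the two formats are interchangeable for this purpose, the sketch below prints the same nested data as JSON and as block-style YAML. The field names are invented and are not the real job schema, and the `to_yaml` helper is a tiny illustrative renderer I wrote for this post, not a full YAML library (it assumes plain string keys).

```python
import json


def to_yaml(value, indent=0):
    """Render nested dicts/lists as block-style YAML (minimal, illustration only)."""
    pad = "  " * indent
    lines = []
    if isinstance(value, dict) and value:
        for key, item in value.items():
            if isinstance(item, (dict, list)) and item:
                lines.append(f"{pad}{key}:")
                lines.append(to_yaml(item, indent + 1))
            else:
                # JSON scalars (and empty {} / []) are also valid YAML.
                lines.append(f"{pad}{key}: {json.dumps(item)}")
    elif isinstance(value, list) and value:
        for item in value:
            if isinstance(item, (dict, list)) and item:
                lines.append(f"{pad}-")
                lines.append(to_yaml(item, indent + 1))
            else:
                lines.append(f"{pad}- {json.dumps(item)}")
    else:
        lines.append(f"{pad}{json.dumps(value)}")
    return "\n".join(lines)


# Hypothetical job-like structure, not the real Experience Extractor schema.
DEMO_SPEC = {
    "name": "demo",
    "fields": ["Country", "Visits"],
    "options": {"zip": True},
}

if __name__ == "__main__":
    print(json.dumps(DEMO_SPEC, indent=2))
    print()
    print(to_yaml(DEMO_SPEC))
```

The YAML version drops the braces and most of the punctuation, which is the readability gain for non-programmers, while the underlying structure stays identical.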

The initial release of Experience Extractor did not include an app in the Sitecore launch pad and instead made use of a page located at /sitecore/admin/experienceextractor/shell.aspx. This page uses JSON instead of YAML for the job definition and is still available in the most recent release. It is a great resource because it displays an example JSON job definition by default, including the proper connection string format for an MSSQL database. This is helpful since calls to the Experience Extractor API still use the JSON format, and the example is a good one to build from and test.

In Closing

I really like the business problem that Experience Extractor is attempting to solve. After years of writing custom reporting and data manipulation scripts, I’m excited to give Experience Extractor a more in-depth trial with real-world scenarios. Like I said, let’s get Sitecore data into tools like Excel that marketers and data analysts are familiar with. You may be surprised how powerful these tools are and the awesome things people can do with the data.

