One of the most laborious tasks in any CMS, Sitecore included, is the mass processing and transformation of data. It is especially noticeable when you spend a day or two manually selecting and editing individual items. The first time I felt that pain, I became obsessed with finding a tool that could automate, even minimally, this kind of job.
A bit of History
Surprisingly, I couldn’t find anything, so my only options were to sit and wait for something to appear or to create my own tool. Being eager by nature, I could not sit still and wait to be parachuted, sooner or later, into my next nightmare. I couldn’t sleep with that!
When I started looking for options, based on some of my experiences with other systems, my first thought was to build or extend an ETL tool. ETL, which stands for “extract, transform and load”, is commonly used in data warehouse systems to retrieve, filter and modify data in bulk across different sources (SQL servers, XML files, Excel spreadsheets, web services, etc.).
With so many good tools on the market, building an ETL tool from scratch would not be a smart move on my part. My best choice would be to pick an existing tool and make it able to read from and write to Sitecore databases. The ItemWebAPI would make that possible, but it is still very low-level. It would also require significant effort to create the configurations, business logic and interfaces needed to connect and talk to Sitecore.
It kept reverberating in my mind…
I also considered creating an extension to Sitecore Rocks. That sounded like a good idea, as Sitecore Rocks already handles a considerable portion of what is needed to connect to and interact with Sitecore databases. It also comes with XPath and query builders, whose logic could be used by my module to retrieve data from Sitecore. On the other hand, it would require Visual Studio to run, which would limit the module’s reach, and some UI and low-level implementation would still be required.
And I let it keep reverberating in my mind…
A Sitecore Module?
In terms of reach, my tool would preferably be a Sitecore module rather than anything else. This way I would save time connecting to Sitecore, as the code would be executed inside it. I would also save energy translating data back and forth when reading from and writing to Sitecore, since I could simply use the Sitecore API as usual. It would also make things easier because the whole environment is familiar to me, and I wouldn’t have to dig into technologies I don’t know. I was trying to be pragmatic in my goal of shielding myself from another nightmare with minimum effort. Extending an existing Sitecore feature would be the best option.
My first real attempt was to extend the Sitecore Buckets feature (image below). It has a nice UI for filtering and listing items, as well as some Search Operations that we can use to modify the items selected.
The native Buckets system would still have to be modified to do what I wanted, but it looked like the best option at that time. Some investigation showed that the change wouldn’t be trivial and that more study would be needed. I was also not very satisfied with the way items are filtered and conditions are programmed, or with how user-friendly it is to build a query.
For some time it simply kept reverberating in my mind…
The Rules Engine – An insight from Digital Marketing
It was during the preparation of a Sitecore training course for Content Editors, targeting Digital Marketing features, that I first considered using the Sitecore Rules Engine for my purposes. Although it is originally used for other things, such as content personalization or event-triggered custom tasks, the experience was very close to what I had in mind. The way conditions and actions are chained is flexible and also very user-friendly.
That was actually perfect for my purposes:
- Rules would be a way to save a data transformation pattern
- Conditions could be used to select and filter content
- Actions would be responsible for data processing and transformations
- They are also very easy to code
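The roles listed above (rule, condition, action) can be sketched in a few lines. The snippet below is only an illustration in Python, not the module’s code and not the Sitecore Rules Engine API: `Item`, `Rule`, `template_is` and `set_field` are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Item:
    # Hypothetical stand-in for a content item: a name, a template and some fields.
    name: str
    template: str
    fields: Dict[str, str] = field(default_factory=dict)


Condition = Callable[[Item], bool]   # decides whether an item is selected
Action = Callable[[Item], None]      # transforms a selected item in place


@dataclass
class Rule:
    # A rule is a saved transformation pattern: conditions plus actions.
    conditions: List[Condition]
    actions: List[Action]

    def matches(self, item: Item) -> bool:
        # An item is selected only if every condition holds.
        return all(condition(item) for condition in self.conditions)

    def apply(self, items: List[Item]) -> List[Item]:
        # Run every action against every matching item; return the items processed.
        matched = [item for item in items if self.matches(item)]
        for item in matched:
            for action in self.actions:
                action(item)
        return matched


def template_is(template: str) -> Condition:
    return lambda item: item.template == template


def set_field(field_name: str, value: str) -> Action:
    def action(item: Item) -> None:
        item.fields[field_name] = value
    return action


items = [
    Item("article-1", "News Article", {"Title": "Old"}),
    Item("home", "Page", {"Title": "Home"}),
]
rule = Rule([template_is("News Article")], [set_field("Title", "Updated")])
processed = rule.apply(items)
```

Here only `article-1` is selected and changed; `home` does not match the condition and is left alone.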
Then the idea started to flow out of my mind into a real tool…
The Sitecore Rule Processor
Available on the Sitecore Marketplace, Sitecore Rule Processor is the result of my efforts to automate data transformation in Sitecore. Once it is installed, an icon is shown whenever a Rule is selected in the Content Editor.
When clicked, it brings up a window to process the rule, where users can easily filter the items that match the rule’s conditions, then execute its actions against all or some of those items:
This way any user can visually build queries to retrieve items, and set up actions to transform them.
The module comes with a series of custom Actions, to increase the number of things a user can do to process and modify items, such as:
- Add a version to the item in a certain language;
- Change the item’s template;
- Copy, Move or Clone items to a certain path;
- Delete items;
- Empty an item’s field;
- Log a message with info taken from the item;
- Publish the item;
- Replace a string in a field of the item;
- Replace a string in an item’s name;
- Run a script;
- Serialize the item;
- Set a value in an item’s field;
- Set the value of an item’s field as the value of another item’s field;
- Set the value of an item’s field as the ID of another item;
- Set a workflow state.
I still have some useful actions in my backlog, which will further expand the value of this module.
Real life experience: existing actions
Then it finally happened: the next “monkey job” was lurking in the clouds, ready to be parachuted onto me without notice. One of our projects came with this demand: a “news article” template had its look and feel partly updated, which meant replacing some content of a Rich Text field in all items (there were around 200 items in the repository). We had to replace every occurrence of class=”old-style” with our new class=”new-style”.
My rule was then composed of:
- Where template=’News Article’
- Replace a string in a field of the item
- Replace class=”old-style”
- By class=”new-style”
- At field “Body”
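As an illustration of what this rule does to the data, here is a minimal sketch in Python. It is not the module’s code: items are modelled as plain dictionaries, and `replace_in_field` is a hypothetical helper written for this example.

```python
def replace_in_field(items, template, field_name, old, new):
    """Replace `old` with `new` in `field_name` of every item based on `template`.

    Returns the number of items whose field actually changed.
    """
    changed = 0
    for item in items:
        if item["template"] != template:
            continue  # condition: where template = 'News Article'
        value = item["fields"].get(field_name, "")
        updated = value.replace(old, new)  # action: replace a string in a field
        if updated != value:
            item["fields"][field_name] = updated
            changed += 1
    return changed


# Hypothetical sample data standing in for the news repository.
news_items = [
    {"template": "News Article", "fields": {"Body": '<p class="old-style">Hello</p>'}},
    {"template": "News Article", "fields": {"Body": "<p>No class here</p>"}},
    {"template": "Landing Page", "fields": {"Body": '<p class="old-style">Skip me</p>'}},
]

changed_count = replace_in_field(
    news_items, "News Article", "Body", 'class="old-style"', 'class="new-style"'
)
```

Only the first item is modified: the second has nothing to replace, and the third uses a different template.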
Real life experience: a custom action
The above covered part of our needs. Unfortunately, some replacements were not that straightforward. For instance, some of the markup had to be replaced by different tags. In our case, some tables inserted by the client had to be replaced by better-formed <figure> tags.
This demanded the creation of a custom action, which would parse the HTML, apply the replacement logic to its Document Object Model, then save it back to the item. To learn how to create a custom action, please check this article from John West.
My rule ended up very similar to the previous one:
- Where template=’News Article’
- My custom action for DOM replacements
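The kind of DOM replacement described above can be sketched as follows. This is a Python illustration under loud assumptions, not the custom action we wrote: it assumes the field content is a well-formed XML fragment (real rich text often is not, and would need a proper HTML parser), and it assumes a hypothetical table layout where one cell holds an <img> and the other cells hold the caption text. `tables_to_figures` is a name invented for this example.

```python
import xml.etree.ElementTree as ET


def tables_to_figures(html: str) -> str:
    """Replace every <table> that wraps an <img> with a <figure> element.

    Assumes `html` is a well-formed XML fragment. Tables without an image
    are treated as real data tables and left untouched.
    """
    # Wrap the fragment in a temporary root so it parses as one document.
    root = ET.fromstring(f"<root>{html}</root>")
    replaced = False
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag != "table":
                continue
            img = child.find(".//img")
            if img is None:
                continue
            figure = ET.Element("figure")
            figure.append(ET.Element("img", dict(img.attrib)))
            # Collect the non-empty text of all cells as the caption.
            cell_texts = ("".join(td.itertext()).strip() for td in child.iter("td"))
            caption = " ".join(text for text in cell_texts if text)
            if caption:
                ET.SubElement(figure, "figcaption").text = caption
            figure.tail = child.tail  # keep any text that followed the table
            parent[index] = figure
            replaced = True
    if not replaced:
        return html
    serialized = ET.tostring(root, encoding="unicode")
    # Strip the temporary wrapper again.
    return serialized[len("<root>"):-len("</root>")]


sample = (
    "<p>Intro</p>"
    '<table><tr><td><img src="sunset.png" alt="Sunset" /></td></tr>'
    "<tr><td>Sunset over the bay</td></tr></table>"
    "<p>Outro</p>"
)
result = tables_to_figures(sample)
```

Working on the parsed tree instead of on the raw string is what makes this different from the previous rule: the replacement depends on the structure of the markup, not on a fixed piece of text.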
Please feel free to download and test the module (available on the Sitecore Marketplace at this link), and expand it as you need. Also, please let me know if you have any questions or feedback!