First of all, let me say that this post does not pretend to explain how to do dynamic updates in WF. There are a lot of posts about that, including some that I strongly recommend on Jon Flanders' blog. I'm writing this because I have concerns about the performance of dynamic updates. Because we're using dynamic updates extensively in a project, and have done some pretty hard profiling analysis, I now have a pretty good picture of this issue.
Let me try to explain how we are using dynamic updates in WF. One of our projects, for a Portuguese bank, consists of developing a platform to expose business services that will be consumed by the different business channels (e.g., Internet, Branch, Phone, ATM). We've decided to use Windows Communication Foundation to expose our services and Windows Workflow Foundation for the implementation. Basically, a business service has some validation rules and an execution flow. For instance, considering an account transfer, our validation rules typically are: validate that the source account is correct (this depends on the current channel: on the Internet we must guarantee that the account is owned by the customer in session, while in the branch we just look at the check digit), validate that the destination account has a valid check digit, validate that the amount is correct, and so on.
Then the service implementation just needs to call the mainframe gateway to execute the transaction. Sounds simple, right? Well, it's not so simple. Depending on the business channel, the type of service, and the profile of the customer, we have different behaviors. In Internet and phone banking we have aggressive security requirements that must be satisfied before we execute the transaction: we can ask for a security code, we can send an SMS to the phone registered with the customer and ask for that code, we can ask for an authorization from another customer, etc. However, this validation does not make sense if the customer just wants to see the account balance. And we need more validation in the branch and phone channels, where the customer is assisted by an operator, which implies some validations on the operator profile.
What I'm trying to say is that we have a set of operation patterns implemented as workflows. We call these workflows Master Workflows or Template Workflows. These workflows have extensibility points in the form of hooks, which are specific to the service logic and can be injected. So, when we receive a message, we inspect it and, through configuration, we decide which workflow template to use and which hooks must be injected into that workflow template.
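To make the idea of templates and hooks more concrete, here is a minimal sketch of the selection step. It is written in Python rather than against the WF API, and every name in it (the templates, the routing table, the hook names) is invented for illustration; it only shows the shape of the configuration lookup, not our actual implementation.

```python
# Hypothetical configuration: every name below is invented for illustration.
# A template is an ordered list of steps; "hook:<name>" marks an extensibility point.
TEMPLATES = {
    "simple": ["hook:validate", "hook:execute"],
    "secured": ["hook:validate", "ask_security_code", "hook:execute"],
    "assisted": ["check_operator_profile", "hook:validate", "hook:execute"],
}

# Which template applies to a given (channel, service) pair.
ROUTING = {
    ("Internet", "AccountBalance"): "simple",
    ("Internet", "AccountTransfer"): "secured",
    ("Branch", "AccountTransfer"): "assisted",
}

# Service-specific hooks to inject into the template's extensibility points.
HOOKS = {
    "AccountBalance": {"validate": "validate_account_owner", "execute": "read_balance"},
    "AccountTransfer": {"validate": "validate_transfer", "execute": "call_mainframe_transfer"},
}

HOOK_PREFIX = "hook:"


def compose_workflow(channel, service):
    """Pick the template for the message and fill its hooks with service logic."""
    template = TEMPLATES[ROUTING[(channel, service)]]
    hooks = HOOKS[service]
    steps = []
    for step in template:
        if step.startswith(HOOK_PREFIX):
            # Replace the extensibility point with the hook configured for this service.
            steps.append(hooks[step[len(HOOK_PREFIX):]])
        else:
            # Steps that belong to the template itself are kept as they are.
            steps.append(step)
    return steps
```

In this sketch the same AccountTransfer hooks end up in different templates depending on the channel, which is the reason the composition has to happen per request.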
As you've already noticed, this injection is made using Dynamic Updates. We do that for every request received, in order to create the workflow instance to run. In our profiling analysis we noticed that Dynamic Updates accounted for approximately 50% of the total execution time. Of course our team did not accept that, and promptly started searching for alternatives.
Our first approach consisted of caching the workflow definition produced by the first request of each kind of workflow, for further reuse. How can that be done? There is only one entity responsible for creating workflow instances, which is the WorkflowRuntime. To create a workflow from a definition, we must call the CreateWorkflow method, passing XML that contains the Xoml definition. In our case, we are authoring workflows in code, and therefore creating workflow instances through a type.
Our first try was to generate the Xoml definition after the dynamic updates, so that we could reuse that Xoml later. The problem with this approach was that the Xoml definition associated with a workflow that is already compiled is opaque. Basically, the Xoml definition is just a reference to the CLR type used before the dynamic updates occur. We also tried to navigate through the child activities to generate the Xoml definition of each activity, and compose them into a single Xoml. The problem was that we were using Code Activities that reference handlers in the original class, which would have to be replicated in the new type that we were trying to create. Quite complicated, we thought. We also evaluated the possibility of using CodeDom to generate the source code of the new type; however, it was a little overwhelming for a team that practices and stands for KISS.
Finally, the solution adopted consisted of creating the workflows in the background through a set of worker threads. These worker threads have the responsibility of maintaining a pool of workflows prepared for use. In this way, we ensure that there's always a workflow available for each request received, ready to be used. Our response time decreased by 50%. However, if we stress the server to reach ~100% of CPU usage there's no gain, since there's no CPU available to do the background work.
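The pooling idea can be sketched as follows. This is a simplified illustration in Python, not our WF code: build_prepared_workflow is a hypothetical stand-in for the expensive step (creating the instance and applying the dynamic update), and the class and method names are assumptions made up for the example.

```python
import queue
import threading


def build_prepared_workflow(template_name):
    """Hypothetical stand-in for the expensive step: create and dynamically update."""
    return {"template": template_name, "hooks_injected": True}


class PreparedWorkflowPool:
    """Keeps up to `size` prepared workflows ready, refilled by background workers."""

    def __init__(self, template_name, size, workers=2):
        self._template_name = template_name
        self._ready = queue.Queue(maxsize=size)
        self._stop = threading.Event()
        self._workers = [
            threading.Thread(target=self._refill, daemon=True) for _ in range(workers)
        ]
        for worker in self._workers:
            worker.start()

    def _refill(self):
        while not self._stop.is_set():
            workflow = build_prepared_workflow(self._template_name)
            # Wait for a free slot, waking up regularly to notice a shutdown request.
            while not self._stop.is_set():
                try:
                    self._ready.put(workflow, timeout=0.05)
                    break
                except queue.Full:
                    pass

    def take(self):
        """Return a prepared workflow, building one inline if the pool is empty."""
        try:
            return self._ready.get_nowait()
        except queue.Empty:
            return build_prepared_workflow(self._template_name)

    def ready_count(self):
        return self._ready.qsize()

    def close(self):
        self._stop.set()
        for worker in self._workers:
            worker.join()
```

The fallback in take mirrors what we saw under stress: when the workers cannot keep up, the request thread pays the full preparation cost itself and the gain disappears.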
I confess that, at this point, the team was disappointed with dynamic updates. We still don't understand why there's no simple way to cache or reuse a workflow definition. As a humble suggestion to the WF Team, we propose an overload of the CreateWorkflow method that receives an Activity as an argument, which would be the root activity of a workflow definition. In our case, after the dynamic updates, we would keep the workflow definition around and later call this overload to create another workflow instance with that definition. I would love to have this feature in the final release ;-).
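To show what we would do with such an overload, here is a rough sketch of the caching pattern it would enable. Again this is plain Python and not the WF API: the overload does not exist, and DefinitionCache, build_updated_definition and the dictionary used as a "definition" are all hypothetical names chosen for the example.

```python
import copy


class DefinitionCache:
    """Builds each updated definition once, then hands out independent copies."""

    def __init__(self, build_definition):
        self._build_definition = build_definition
        self._definitions = {}
        self.builds = 0  # how many times the expensive build actually ran

    def create_instance(self, key):
        if key not in self._definitions:
            # Expensive path, taken only for the first request of each kind.
            self._definitions[key] = self._build_definition(key)
            self.builds += 1
        # Each instance gets its own copy, so running it cannot alter the cached one.
        return copy.deepcopy(self._definitions[key])


def build_updated_definition(key):
    """Hypothetical stand-in for 'create the template and apply the dynamic update'."""
    channel, service = key
    return {"channel": channel, "service": service, "steps": ["validate", "execute"]}
```

The sketch is not thread-safe as written; a real version would need a lock around the first build for each key.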