Recently, we had a situation with a BizTalk installation at one of our customers, where the message processing time started to increase. We started our investigation by analyzing threads, disabling the anti-virus, etc. This BizTalk installation consists of a cluster in active/active mode at the application tier. At the database tier we also have a cluster, in active/passive mode.
The behaviour was very strange, since the processing time was increasing linearly even though the system wasn't under load. We checked the MessageBox size by counting the rows in the Spool table, but everything was OK.
We started to investigate all the performance counters related to BizTalk, checking the SDK documentation. Along the way, our attention went to a performance counter named Message publishing delay (ms) under the category BizTalk:Message Agent. We started to watch this counter with PerfMon. Its value kept increasing, which explained our bad processing time. We then had to understand the cause of this increasing delay. There is a chapter in the BTS SDK dedicated to Throttling, which is a mechanism to moderate the workload associated with a host instance. Apparently we were being throttled without knowing it.
To confirm our suspicions, we checked the performance counter Message publishing throttling state under the category BizTalk:Message Agent. This counter tells us what caused the throttling, and we had a value of 6. Our first reaction was "what the hell is 6?". Checking the documentation again, the value of 6 means:
"Host message queue size, the spool table size or the tracking table size exceed the specified threshold.
Possible reasons for this condition include:
- The SQL Agent jobs used by BizTalk Server to maintain the BizTalk Server databases are not running or are running slowly.
- Down-stream components are not processing messages from the in-memory queue in a timely manner.
- Number of suspended messages is high.
- Maximum sustainable load for the system has been reached."
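To avoid looking up the number every time, the counter values can be kept in a small lookup table. This is only a sketch: the value 6 is the one we actually observed, and the other entries are listed as I recall them from the BizTalk documentation, so check them against the documentation for your version. The names PUBLISHING_THROTTLING_STATES and describe_publishing_state are made up for this example.

```python
# Values of the "Message publishing throttling state" counter.
# Assumption: only 6 is confirmed by our case; the remaining entries are
# recalled from the BizTalk documentation and should be verified.
PUBLISHING_THROTTLING_STATES = {
    0: "Not throttling",
    2: "Imbalanced message publishing rate (input rate exceeds output rate)",
    4: "Process memory pressure",
    5: "System memory pressure",
    6: "Database growth (host queue, spool or tracking table over threshold)",
    8: "High session count",
    9: "High thread count",
    11: "User override on publishing",
}


def describe_publishing_state(value: int) -> str:
    """Return a readable description for a throttling state value."""
    return PUBLISHING_THROTTLING_STATES.get(value, f"Unknown state {value}")


print(describe_publishing_state(6))
```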
It was a relief to know what the problem was: we were being throttled. But we still had to understand which condition had triggered the throttling. In terms of mitigation strategy, the documentation says:
"Ensure that the SQL Agent jobs used by BizTalk Server to maintain the BizTalk Server databases are running and are not failing.
Terminate and resume suspended instances as needed.
Increase the default value for the Message count in database threshold taking into consideration the space requirements of the SQL server that houses the BizTalk databases.
If your database is sized appropriately to handle additional message backlog, consider increasing the ThrottlingSpoolMultiplier and ThrottlingTrackingDataMultiplier registry values to allow additional backlog in the Spool and Tracking tables. For more information about changing the values see"
As I said before, we had already checked the Spool table, and it was almost empty. However, we hadn't yet checked the tracking tables. The name of the tracking table in the MessageBox database is TrackingData. We counted its records and found about 800k. Reading the documentation again:
"The Message count in database setting also indirectly defines the threshold for a throttling condition based on the number of messages in the spool table or tracking table. If the number of messages in the spool table or tracking table exceeds 10 times this value then a throttling condition will be triggered. By default the Message count in database value is set to 50,000, which will cause a throttling condition if the spool table or the tracking table exceeds 500,000 messages."
With about 800k records against a threshold of 500,000, the cause of the throttling was in the tracking tables.
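The arithmetic from the documentation can be written down as a quick check. This is a minimal sketch of the rule as quoted above, not BizTalk's actual implementation: the function name and parameters are hypothetical, the multipliers default to the factor of 10 mentioned in the documentation, and it only models the spool and tracking table part of the condition. The spool row count of 100 is just an illustrative stand-in for "almost empty".

```python
def is_db_size_throttling(message_count_in_db, spool_rows, tracking_rows,
                          spool_multiplier=10, tracking_multiplier=10):
    """Check the spool/tracking table part of the database size condition.

    Assumption: a "Message count in database" value of 0 disables the check,
    which is the setting we used as a temporary workaround.
    """
    if message_count_in_db == 0:
        return False
    spool_limit = message_count_in_db * spool_multiplier
    tracking_limit = message_count_in_db * tracking_multiplier
    return spool_rows > spool_limit or tracking_rows > tracking_limit


# Default of 50,000 gives a limit of 500,000; our TrackingData had about 800k rows.
print(is_db_size_throttling(50_000, spool_rows=100, tracking_rows=800_000))  # True
# With the threshold set to 0 the check no longer triggers.
print(is_db_size_throttling(0, spool_rows=100, tracking_rows=800_000))       # False
```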
For now, we decided to disable throttling based on the message count in the database by setting the "Message count in database" threshold to 0. Meanwhile, we are investigating why our TrackingData has so many records.
I recommend that all BizTalk developers and administrators read the chapters about Throttling in the SDK documentation. They contain great information, and you never know when your BizTalk might suddenly slow down its processing.
BFC