A few months ago I wrote a post about implementing a streaming download pipeline component for BizTalk which would download data from CRM. The motivation was that some of the queries executed against CRM could return a large amount of data, and the amount we downloaded could grow over time. With this in mind I developed the streaming approach so that we could minimize the amount of data flowing through the BizTalk message box, which would improve performance. The idea was that the FetchXML (the query input) was sent to the message box and picked up by a send port. In the send port we had a custom pipeline component which replaced the stream being used by BizTalk, so that the query was executed and the results were paged down as BizTalk read the stream.
The benefits of this approach were:
- Reduce the amount of data going through the message box
- Control the memory usage as the message was sent through the pipeline
The approach involved a custom stream, and if your output adapter supported streaming, as the File adapter does, you could download a very large amount of data without a big spike in memory usage on your BizTalk server.
OK, so that was great for our CRM requirements, but in the back of my mind something was nagging away that I could make this approach more generic, so I did. Now we have a stream that can download anything, and you construct it by passing in a class which gets the data from the data source. The custom stream is probably the most painful bit to get right, so I have provided it with this code sample. You write a class that implements the IDownload interface I provide and then pass that into the stream. In BizTalk you should then find it very easy to reuse this stream to download anything you want.
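To give a feel for the idea, here is a minimal sketch of a stream that only fetches data when it is read. It is a simplified illustration rather than the code in the download: the `DownloadNextChunk` member, the `SketchDownloadStream` class and the `InMemoryDownload` stand-in are names I have assumed for this example, and the real IDownload and GenericDownloadStream may well look different.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Assumed shape of the interface: each call returns the next page of data,
// or null/empty when the data source has nothing more to give.
public interface IDownload
{
    byte[] DownloadNextChunk();
}

// Read-only, forward-only stream that asks the IDownload for more data
// only once the reader has consumed the current chunk.
public sealed class SketchDownloadStream : Stream
{
    private readonly IDownload _source;
    private byte[] _chunk = new byte[0];
    private int _chunkOffset;
    private bool _finished;
    private long _position;

    public SketchDownloadStream(IDownload source)
    {
        if (source == null) throw new ArgumentNullException("source");
        _source = source;
    }

    public override bool CanRead { get { return true; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return false; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { return _position; }
        set { throw new NotSupportedException(); }
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        // Current chunk used up: pull the next one from the data source.
        while (_chunkOffset == _chunk.Length && !_finished)
        {
            byte[] next = _source.DownloadNextChunk();
            if (next == null || next.Length == 0)
            {
                _finished = true;
            }
            else
            {
                _chunk = next;
                _chunkOffset = 0;
            }
        }

        int available = _chunk.Length - _chunkOffset;
        if (available == 0) return 0; // nothing left: signal end of stream

        int toCopy = Math.Min(count, available);
        Buffer.BlockCopy(_chunk, _chunkOffset, buffer, offset, toCopy);
        _chunkOffset += toCopy;
        _position += toCopy;
        return toCopy;
    }

    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
    public override void Write(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
}

// Stand-in data source for trying the stream out; a real implementation
// would page results from CRM, a database, blob storage and so on.
public sealed class InMemoryDownload : IDownload
{
    private readonly Queue<byte[]> _pages;

    public InMemoryDownload(params byte[][] pages)
    {
        _pages = new Queue<byte[]>(pages);
    }

    public byte[] DownloadNextChunk()
    {
        return _pages.Count > 0 ? _pages.Dequeue() : null;
    }
}
```

Only one chunk is held in memory at a time, which is where the memory benefit comes from. Copying `new SketchDownloadStream(new InMemoryDownload(pageOne, pageTwo))` to a file or memory stream triggers one `DownloadNextChunk` call per page, plus a final call that reports the end of the data.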
Let's take a look at a video showing how this works.
What about BizTalk?
In the demo, to keep it simple, I have just used C# to show how the stream is read and how reading it triggers the download. If you wanted to use this within a BizTalk pipeline you would do exactly what I did previously and use something like the code example below:
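As a rough sketch of the shape that pipeline component code takes, the example below is trimmed down to the Execute method. It needs the BizTalk pipeline assemblies and the IDownload and GenericDownloadStream types from the code download in order to compile. The class name and the `CreateDownload` hook are hypothetical names used for illustration, and I am assuming the GenericDownloadStream constructor takes the IDownload directly.

```csharp
using System.IO;
using Microsoft.BizTalk.Component.Interop;
using Microsoft.BizTalk.Message.Interop;

// Sketch only: a real component also implements IBaseComponent,
// IPersistPropertyBag and IComponentUI, which are omitted here.
public abstract class StreamingDownloadComponentBase
{
    // Hypothetical hook: turn the small request message (for example the
    // FetchXML or a pointer to a blob) into the IDownload that fetches the data.
    protected abstract IDownload CreateDownload(string request);

    public IBaseMessage Execute(IPipelineContext pContext, IBaseMessage pInMsg)
    {
        // The inbound body is only the query or pointer, so it is small
        // enough to read fully into a string.
        var reader = new StreamReader(pInMsg.BodyPart.GetOriginalDataStream());
        string request = reader.ReadToEnd();

        // Swap the body for the lazy stream; nothing is downloaded yet.
        // The download happens as the adapter reads the stream.
        Stream download = new GenericDownloadStream(CreateDownload(request));
        pInMsg.BodyPart.Data = download;

        // Let BizTalk dispose of the stream when the message is finished with.
        pContext.ResourceTracker.AddResource(download);
        return pInMsg;
    }
}
```

The key point is that Execute returns straight away with the new stream attached, so the large payload never passes through the message box.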
Where might I use this?
I think this is a handy approach to have in your toolbox. Some examples might include:
- Download from CRM
I have already demonstrated this approach in the previous post.
- Download from Azure Blob Storage
Using the sample code in this article's code download, you could put the code for the Azure Blob download into a pipeline component and then flow a message which points to a file held in Azure Blob storage. In the pipeline component you set up the GenericDownloadStream so that it downloads the file from the cloud as BizTalk processes your output message and writes it to disk. I have seen a few times a requirement where a service in the cloud uploads a file to cloud storage and BizTalk then has to pull it down and save it on the SAN somewhere on premise. With this approach, rather than flowing the entire file through BizTalk (in the case I'm thinking of, these were scanned PDF documents), a pointer to the file in the cloud could be flowed through Service Bus, and the send pipeline that saves to disk on the SAN could use this stream to download the file from the cloud. The send port would also give you all of the BizTalk retry capabilities and the capabilities of whichever adapter you chose.
- Download from a database
Using a similar approach to what I did for CRM, you could use this to pull scheduled data from a database in the BizTalk send pipeline rather than having to read it all into the message box. I remember one customer who didn't have an established ETL tool or dev/ops setup for managing SSIS, so we could have used this to schedule some extracts that were pulled from the database and sent over FTP to a B2B partner.
- REST/HTTP Download
I remember one project where a friend needed to develop a way to download a set of data from a B2B partner's website on a daily basis and publish it to a spreadsheet for some business users. It was a bit more complicated than that, but the key thing was that last step of downloading the data and pushing it to a file. I can't remember exactly why, but he was unable to use the out-of-the-box BizTalk adapters, so he ended up writing a .NET component which was called by an orchestration. With this approach, a simple bit of .NET code implementing IDownload, combined with the GenericDownloadStream in a custom pipeline component, would mean the data could be any size, and you would leverage all of the port and pipeline config without having to manage any config in a custom way.
The code sample is available here: