Warning: session_start() [function.session-start]: open(C:\Windows\SERVIC~2\NETWOR~1\AppData\Local\Temp\\sess_e4c0e5a77e5e8a10933e4fcb3a956d7b, O_RDWR) failed: Permission denied (13) in E:\hshome\snagy\azure.snagy.name\blog\wp-content\plugins\si-captcha-for-wordpress\si-captcha.php on line 763

Warning: session_start() [function.session-start]: Cannot send session cookie - headers already sent by (output started at E:\hshome\snagy\azure.snagy.name\blog\wp-content\plugins\si-captcha-for-wordpress\si-captcha.php:763) in E:\hshome\snagy\azure.snagy.name\blog\wp-content\plugins\si-captcha-for-wordpress\si-captcha.php on line 763

Warning: session_start() [function.session-start]: Cannot send session cache limiter - headers already sent (output started at E:\hshome\snagy\azure.snagy.name\blog\wp-content\plugins\si-captcha-for-wordpress\si-captcha.php:763) in E:\hshome\snagy\azure.snagy.name\blog\wp-content\plugins\si-captcha-for-wordpress\si-captcha.php on line 763
Above The Cloud » Worker Role


 16 May 2010 @ 10:04 PM 

You may not have heard this name before. In fact, I Googled it and got 0 responses. However the pattern is very important and is very well publicised in Azure circles. Since there seems to be no actual name for this pattern, I’m seeking to give it a name, so that we can all speak a common language. First I’ll explain the pattern in concept, then I’ll explain it in the context of Azure.

Definition

The Asynchronous Work Queue Pattern allows workers to pull work items that are guaranteed to be unique from a robust, redundant queuing mechanism, in a fashion that is ignorant to leasing and locking of the work items provided. In other words, the leasing and locking functionality is removed from the worker which can concentrate on the work to be done, and the queue guarantees that no work item enqueued will ever be dequeued more than once.

Concurrent Workloads – The Problem

As a developer I’ve worked on a lot of large systems and have often found myself dealing with the problem of resource contention. Whether multiple threads are trying to access a resource for the same reason, or perhaps two different tiers want the same piece of information for two different reasons, sharing resources can be hard.

Imagine an event that occurs as the result of some interaction with a website. That event might require some data to be saved, an email to be sent, a log service to be called, and a bunch of other things. We never want this to happen all in the original request; we like our UI to be responsive, otherwise the user is just going to press the submit button again right?

To get around this problem we create a WorkItem class and our UI thread now saves a work item. Running on a Windows Service in the background (probably on another server) we have a work processor whose job it is to check the work item table every 15 minutes and pickup any work items that need doing and process them. We sit back and put up our feet, comfortable that our separation of work items from the UI has made our application extremely responsive, and added some robustness to boot.

A month goes by and all of a sudden the sales team lands three massive clients and our site traffic has increased ten fold! Our web front end is doing great though, especially since it can just hand off work items and return control to the user very quickly. However the backend Windows Service is choking under the pressure and work items are coming in faster than it can process them.

No problem, we decide to add a second server and install the Windows Service there as well. But wait; this won’t work will it? Both services hit the database and get the next work item; they get the SAME work item! So now we need to consider taking a lease over a certain number of work items to indicate that they are being processed. We pick an arbitrary number; each service will pick off ten work items and put a flag next to them saying they are being processed. In doing so we realise we are polluting our data schema with information about how the data is being used, but we really have no other choice.

Of course we then ponder; what happens if they both query and get work items at the same time? They might not have the flag and could therefore still process the same records. So now we need a double verification. The complexity grows.

Concurrent Workloads – The Solution

By providing a queue implementation that ensures that the ‘next’ work item cannot be dequeued by more than one requester, the workers can focus on the work that needs to be done and can remove complex code pollution required when worrying about leases.

The primary advantage of this approach is that it scales extremely well. In the problem scenario depicted above, adding another 20 windows services will result in each other service slowing down because more lease checking occurs when looking for free work items to process, and as a result less work gets done. But in the queue scenario, 20 times more services will mean 20 times more productivity.

In Context Of Azure

In Azure the equivalent scenario would be worker roles with multiple instances. Windows Azure Storage provides a highly scalable Queue that is accessible via REST. In essence, the Windows Azure Storage Queue service was designed specifically for asynchronous work distribution/consumption. Each retrieval of a work item from the queue is guaranteed to be unique, except where the worker fails to notify the queue of successful processing, in which case the work item is automatically re-enqueued after a certain amount of time. This ensures the work item is not lost due to a worker failure. Also, Azure Queues are at least three times redundant, ensuring no work item is ever lost.

An Example

In some example code I’ve posted from previous presentations, there is an application that searches for images based on their colour content. The Asynchronous Work Queue Pattern is applied in this example application; a search is made against the Flickr API for a specific keyword and a bunch of results are returned. Each result is placed into a queue, and multiple workers listen at the other end, waiting to pick up an image that they will chop up and analyse for colour content.

Summary

To be honest I don’t care what its called; its just important that you are aware of the pattern and know why to use it.

Update – 17/5/2010

It was just pointed out to me by Paul Stovell that this fits very closely with the Message Dispatcher pattern as identified in the book Enterprise Integration Patterns. The key difference is that the dispatcher pushes work to the consumers whereas workers will pull the work load from the queue instead. The message dispatcher pattern makes the assumption that the message channel is in fact dumb, whereas Windows Azure Storage Queues are smart and can ensure no message duplication, as well as reliable messaging, even in the event of consumer failure. Currently I believe these are still separate patterns however am happy to have the discussion over a beer or two.

Tags Tags: , , , ,
Categories: Azure
Posted By: Steven Nagy
Last Edit: 17 May 2010 @ 10 04 PM

E-mailPermalinkComments (3)

I’ve seen a few demos of Windows Azure and one of the common themes I see around the worker role is that people want to demonstrate scalability through increasing the number of instances in their service definition. That’s fine but we need to also remember that we are scaling up an entire virtual machine each time we increase our instance count, and maybe that just isn’t necessary when we consider parallelization instead.

When we create a new worker role, we get some boiler plate code that guides us into overriding the “Start” and “GetHealthStatus” methods. The latter is irrelevant to this discussion; the former is more important. The default template constructs the Start method in a way that indicates it should do some work and then Sleep.

public override void Start()
{
    while (true)
    {
        // Do Work
        Thread.Sleep(10000);
    }
}

The problem with this convention is two-fold. First, it assumes that one worker should be performing one role, but that’s not the case a lot of the time. Secondly, it doesn’t leverage parallel processing, in particular multi-threading.

The next obvious step is to break up your Start method and kick of multiple threads, each working on their own little piece of the application. But this kind of code is quite common, and I thought, wouldn’t it be better if we could just focus on what our individual application roles are without needing to think about how many workers/instances we need, and whether or not each of those roles should be run multi-threaded?

Thus, the need for a framework (although I shudder at the term) to help us separate that logic, and to provide some common classes to aid with parallelism. I’m not great with names, but for this release, I’ll call it: “WorkSharing”.

So What Does It Do?

The WorkSharing assemblies let you focus more on the roles of your application, the work that actually needs to be done, and it keeps this separate from when/where the work occurs.

You begin as usual by creating a Cloud application with a worker role. For the moment we don’t need to think about the worker role itself; instead we want to focus on what actual work our application needs to perform. We create a new class for every unit of work we need to do. In this example we’ll save time and create just one worker who will be instantiated a few different times.

public class MyCustomWorker : IWorker
    {
        private readonly string name;
        private static Random random = new Random();

        public MyCustomWorker(string name)
        {
            this.name = name;
        }

        public bool HasWorkToDo() { return true; }
        public void DoWork()
        {
            RoleManager.WriteToLog("Information", name + ": Before sleep");
            var wait = random.Next(700, 1200);
            Thread.Sleep(wait);
            RoleManager.WriteToLog("Information", name + ": After sleep");
        }
    }

The key is implementing the IWorker interface – this forces your class to expose two methods: “HasWorkToDo” and “DoWork”, both fairly self explanatory. The actual work that our class performs is two log entries with a random period of sleep in between. This will help us show the async work being performed later on.

Here is where you would define all the workers for your application. These don’t have to be in the same project as your cloud worker role – in fact it usually won’t be. If the roles are small but numerous, I’d put them in their own project. If they are each very large, you’d probably have one project per role. Either way, we’ve focused our attention on the work that needs to be done without coupling it with the “where” and “when”.

Heading back to our Azure worker role, we need to make some changes to the default template code. Strip out the override methods (although you can leave the health status override if you have special logic that needs to go there). We then change the inheritance of the worker role. Currently it inherits from ‘RoleEntryPoint’ but we will change this to ‘WorkSharingRole’ instead – this is a new class in the WorkSharing assemblies.

We want a parameterless constructor for our worker role as well, and it is in here that we define what it needs to do and how to do it. Here is some code illustrating sample usage of our custom worker:

public class WorkerRole : WorkSharingRole
{
    public WorkerRole()
    {
        base.SleepTime = 1000; 
        base.PrimaryWorker = new MyCustomWorker("Custom");
    }
}

Let’s step through the code. First, you can see that our worker inherits from ‘WorkSharingRole’ as mentioned before, and we have a parameterless constructor where we refer to two properties on the base class. ‘SleepTime’ is essentially the period (in milliseconds) between “executions” (a term I will explain further in a moment). The ‘PrimaryWorker’ property is of type IWorker which means we can assign our custom worker to it. That’s all that’s needed to get up and running, and when we fire it up, we can see in our development fabric the output from our worker role:

07/13/2009 03:48:38:648,Event=Information,Level=Info,ThreadId=4316,=Custom: Before sleep
07/13/2009 03:48:39:695,Event=Information,Level=Info,ThreadId=4316,=Custom: After sleep
07/13/2009 03:48:40:695,Event=Information,Level=Info,ThreadId=4316,=Working
07/13/2009 03:48:40:695,Event=Information,Level=Info,ThreadId=4316,=Custom: Before sleep
07/13/2009 03:48:41:744,Event=Information,Level=Info,ThreadId=4316,=Custom: After sleep

 

Ok so there’s nothing special there just yet, we haven’t really change much of the face of our app except that we’ve hidden away the loop. However there are a number of other IWorker implementations that are part of the WorkSharing assemblies, and it is these guys that will let you easily parallelize your work. Consider this change to the constructor:

public WorkerRole()
{
    base.SleepTime = 1000;
    base.PrimaryWorker = new WeightedWorker()
        .Add(new MyCustomWorker("AAA"), 2)
        .Add(new WaitingAsyncWorker()
            .Add(new MyCustomWorker("BBB"))
            .Add(new MyCustomWorker("CCC")));
}

Here we’ve introduced two new classes: WeightedWorker and WaitingAsyncWorker. The WeightedWorker allows us to state how often the worker role should be focusing on this particular task. it acts much like a ratio. In the above example, the WeightedWorker accepts a custom worker called ‘AAA’ with a weight of “2”, and a WaitingAsyncWorker with a weight of “1”. This means the custom worker will be called twice as often.

The WaitingAsyncWorker is one of 2 kinds of async workers that let you parallelize your work. In this case, it takes two custom workers, ‘BBB’ and ‘CCC’ which it will execute side-by-side. The WaitingAsyncWorker will then wait for both its children to finish executing.

Here’s the output you would see in the development fabric:

07/13/2009 02:37:06:688,Event=Information,Level=Info,ThreadId=5624,=AAA: Before sleep
07/13/2009 02:37:07:755,Event=Information,Level=Info,ThreadId=5624,=AAA: After sleep
07/13/2009 02:37:07:755,Event=Information,Level=Info,ThreadId=3528,=BBB: Before sleep
07/13/2009 02:37:07:755,Event=Information,Level=Info,ThreadId=5452,=CCC: Before sleep
07/13/2009 02:37:08:606,Event=Information,Level=Info,ThreadId=5452,=CCC: After sleep
07/13/2009 02:37:08:947,Event=Information,Level=Info,ThreadId=3528,=BBB: After sleep
07/13/2009 02:37:08:947,Event=Information,Level=Info,ThreadId=5624,=AAA: Before sleep
07/13/2009 02:37:10:061,Event=Information,Level=Info,ThreadId=5624,=AAA: After sleep

07/13/2009 02:37:11:062,Event=Information,Level=Info,ThreadId=5624,=AAA: Before sleep
07/13/2009 02:37:12:236,Event=Information,Level=Info,ThreadId=5624,=AAA: After sleep
07/13/2009 02:37:12:236,Event=Information,Level=Info,ThreadId=3528,=BBB: Before sleep
07/13/2009 02:37:12:236,Event=Information,Level=Info,ThreadId=5452,=CCC: Before sleep
07/13/2009 02:37:13:291,Event=Information,Level=Info,ThreadId=3528,=BBB: After sleep
07/13/2009 02:37:13:413,Event=Information,Level=Info,ThreadId=5452,=CCC: After sleep
07/13/2009 02:37:13:413,Event=Information,Level=Info,ThreadId=5624,=AAA: Before sleep
07/13/2009 02:37:14:245,Event=Information,Level=Info,ThreadId=5624,=AAA: After sleep

As you can see, the ‘AAA’ task executes twice, whilst tasks ‘BBB’ and ‘CCC’ could happen in any order and are both starting before the other finishes.

There are 2 other predefined workers in the WorkSharing framework. The ContinuousAsyncWorker also operates in an asynchronous fashion, however where the WaitingAsyncWorker will wait for all its children to finish before itself finishing, the ContinuousAsyncWorker will continuously keep working all of its children over and over. When each child finishes its work, it will kick off a new thread to start it again. This makes it a little different from all the other workers because it never really finishes its DoWork method – it just keeps kicking off its children over and over. For this reason, the DoWork method will return immediately after being called.

The final worker worth mentioning (or perhaps not really) is the LinearWorker which simply performs a number of tasks in the order defined. You can think of it as similar to the WeightedWorker except that it only fires off its children once each.

Is That A Fluent Interface I Doth Behold?

Yes and a meager one at that. Only the Add method will return the same object, letting you chain your adds together. But more important here is that you can make all workers children of other workers. In the above example we saw a WeightedWorker own a WaitingAsyncWorker. You could have any number of workers owning any others, as many levels deep as you want. It also lets you be creative – you might need to dynamically discover what work this worker will perform.

You Didn’t Explain ‘Executions’ Yet

Lets take the WaitingAsyncWorker from our example. The total amount of work that needs to be done by this worker is:

  1. Start all of its children
  2. Wait for all of its children to finish

After this, it has completed its DoWork method, which means it has completed its ‘Execution’. It may or may not get fired again. Now consider the WeightedWorker – its total sum of work is:

  1. Randomise the order of its children
  2. Call each of its children one at a time, waiting for each to finish before proceeding

… after which it has completed its execution. However in this example, its execution includes the execution of the WaitingAsyncWorker and all its children.

The top of the hierarchy is always the PrimaryWorker property of the WorkSharingRole class. Therefore its execution includes the execution of both the WeightedWorker and the WaitingAsyncWorker and all their children.

Between each execution of the PrimaryWorker there is a user defined pause. This is simply a Thread.Sleep call and defaults to 0 which means it won’t sleep between executions (which is a perfectly valid scenario).

Anything Else I Should Know About The Internals?

When the base WorkSharingRole class starts an execution, it calls the HasWorkToDo method on the PrimaryWorker. If the result is true then it proceeds to call the DoWork method. In the case of the prebaked workers, all calls to HasWorkToDo will actually forward the checks onto their children. If any child has work to do, it returns true – this is important because these workers don’t have any work of their own to do, they just call out to their children.

Likewise, when DoWork is called on one of the prebaked workers, it will in turn call the DoWork methods on its children (as per the descriptions given of each earlier).

Summary

You can download the framework here (full source and example applications including an Azure worker and a console app). Its free to use but I’d like to know how you think it can be improved. So far here is the list of things I’d like to add:

  • Events to hook into (such as OnInitialized and OnStart) [I don’t like the dependency on constructor]
  • More workers such as a worker where you can specify the number of repetitions of a task
  • Ability to change the worker hierarchy after the worker role has started
  • Configurability of worker hierarchy via XML

Please let me know what you think and have fun playing.

DOWNLOAD

Tags Tags: , , , ,
Categories: Azure
Posted By: Steven Nagy
Last Edit: 14 Jul 2009 @ 08 26 AM

E-mailPermalinkComments (5)

Very recently David Lemphers posted about one of the cool methods of debugging available in Windows Azure. In his blog post he talks about a method of spinning up a process that can execute command line type requests via the Process class, and then pushing the output back to your webform.

Its an excellent example of what you can do under full trust in Windows Azure. I liked this demo of David’s so much that I decided to throw it together quickly myself to have a play. David only shows a snippet of the code he used, so I thought I’d post the whole thing as a single ASPX file (no code behind) that you can just plonk into your cloud app. Remember to set ‘enableNativeCodeExecution’ to ‘true’ in your service definition file before attempting to use this diagnostic page.

Have fun, and thanks David

(create a new diagnostics.aspx file with no code behind, and copy this into its contents)

<%@ Page Language="C#" AutoEventWireup="true" %>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" >
<head id="Head1" runat="server">
    <title>Diagnostics</title>
    <script language="CS" runat="server">       
        protected void btnRun_Click(object sender, EventArgs e)
        {
            var newProc = new System.Diagnostics.Process();

            newProc.StartInfo.UseShellExecute = false;
            newProc.StartInfo.RedirectStandardOutput = true;
            newProc.StartInfo.FileName = "cmd";
            newProc.StartInfo.Arguments = "/c " + txtInput.Text;

            newProc.EnableRaisingEvents = false;
            newProc.Start();

            txtOutput.Text += String.Format("{0}\r\n", newProc.StandardOutput.ReadToEnd());

            newProc.Close();
        }           
    </script>
    <style type="text/css">   
        textarea { width: 800px; }
        textarea.CommandOutput { height: 600px; background: black; color: White; }   
    </style>
</head>
<body>
    <form id="form1" runat="server">
    <div>   
        <asp:ScriptManager ID="ScriptManager1" runat="server" />       
        <asp:UpdatePanel ID="UpdatePanel1" runat="server">
            <ContentTemplate>
                Output: <br />
                <asp:TextBox id="txtOutput" runat="server" TextMode="MultiLine" CssClass="CommandOutput" /><br />
                <br />

                Input: <br />
                <asp:TextBox id="txtInput" runat="server" TextMode="MultiLine" CssClass="CommandInput" /><br />
                <asp:Button ID="btnRun" runat="server" Text="Run" onclick="btnRun_Click" /><br />
            </ContentTemplate>
        </asp:UpdatePanel>   
    </div>
    </form>
</body>
</html>

Tags Tags: , ,
Categories: Azure
Posted By: Steven Nagy
Last Edit: 13 Jul 2009 @ 12 56 PM

E-mailPermalinkComments (3)
 04 Feb 2009 @ 4:29 PM 

As of the date of this post, when you are working with Windows Azure Services, you get the option to create either a web role or a worker role (or both) for your Cloud Services project. But what are these roles exactly and what do they do?

Well you can think of each role instance as its own project. I tend to liken the two roles to existing Visual Studio project templates.

The Web Role is similar to a ‘Web Application’ – it has aspx pages and code behinds, but can also server anything that uses the http protocol, such as a WCF service using basicHttpBinding. The Web Role is driven by UI – the user interacts with a web page or service and this causes some processing to happen. As far as I can tell, the http pipeline is very similar to standard ASP.NET requests. Just think of it as a good old ASP.NET web application.

The Worker Role is similar to a windows service. It starts up and is running all the time. Instead of a timer, it uses a simple while(true) loop and a sleep statement. When it ‘ticks’ it performs some kind of maintenance work. This is great for background processing.

What the worker is working on is up to you of course, but usually it needs some data to work with. This data can come from a number of places. For example, the AzureServicesKit has examples that show you how to communicate between a worker and a web role via the use of a queue. The idea here is that the worker role doesn’t care how stuff got into the queue: it just does its job of processing items in the queue. It is the web role that is user driven and causes data to go into the queue. The web role can then monitor the queue (via some nicely placed Ajax) and show the results when they have been processed.

So you might think: I’ll always need a web role if I have a worker role. However, this is not true.

Think for a second about Live Mesh. Here you have a folder of all your holiday photos and you’ve modified the actual photos with metadata containing the name of the town or city you took the photo. Live Mesh is built on the same services as the rest of Azure, so communicating between the two is quite easy. You could potentially create a Cloud Service with just a worker role that monitors a particular Live Mesh folder for photos, and when a new one is added, it checks for that metadata with the city name, and inserts latitude/longitude coordinates as additional metadata.

Lets run with that example for a moment. Imagine that the Live Mesh folder is actually a shared folder with thousands of people having access to it (you can do that in Mesh). People are uploading their photos from cities all around the world. Your single worker role can’t keep up with the demand! That’s where Azure steps up: this is what it was truly built for. You can very easily add another instance of your Worker Role. All of a sudden, your processing time is halved!

So there you go. Currently there is a limitation of 2 instances per role. And you can only have one worker role and one web role in your solution at the most. This will seem very restricting once you start to really get into Azure but I suppose there has to be some limitations while Microsoft builds its datacenters around the world.

Steve Marx stated about a month ago (see the comment at the bottom of that link):

we’re absolutely interested in giving the ability to have multiple worker roles, and multiple web roles, and some other kinds of roles, and direct synchronous communication between them …

He goes on to mention that Microsoft are careful about releasing functionality in a staged manner. I’m predicting that some of the roles we see will more closely match some of the existing projects in Visual Studio, in particular an ‘MVC Web Role’.

Oh before I wrap up, you can also write your worker roles in F# now! Check out the Microsoft site for the download.

 

Technorati Tags: ,,,,
Tags Tags: , , , ,
Categories: Azure
Posted By: Steven Nagy
Last Edit: 10 Feb 2009 @ 06 31 PM

E-mailPermalinkComments (13)
\/ More Options ...
Change Theme...
  • Users » 25016
  • Posts/Pages » 64
  • Comments » 191
Change Theme...
  • VoidVoid
  • LifeLife
  • EarthEarth
  • WindWind « Default
  • WaterWater
  • FireFire
  • LiteLight
  • No Child Pages.

Warning: Unknown(): open(C:\Windows\SERVIC~2\NETWOR~1\AppData\Local\Temp\\sess_e4c0e5a77e5e8a10933e4fcb3a956d7b, O_RDWR) failed: Permission denied (13) in Unknown on line 0

Warning: Unknown(): Failed to write session data (files). Please verify that the current setting of session.save_path is correct () in Unknown on line 0