23 Mar 2009 @ 8:44 AM 
 

The Azure Fabric Controller

 

The Azure Fabric Controller (FC) is the service that monitors, maintains and provisions the machines that host the applications we (the developers) create and store in the Microsoft cloud.

Previously I’ve helped define the word ‘fabric’ and, specifically, discussed the details of the Azure Fabric and the Development Fabric. The Azure Fabric Controller is responsible for managing all the nodes and edges in the Azure Fabric, which are essentially servers (both provisioned and not), load balancers (usually hardware balancers), power-on automation devices, switches, routers, etc.

The Fabric Controller manages different devices in different ways. For example, hardware load balancers are supported through a driver model. Each balancer could be a different hardware type, from a different vendor, etc. Azure abstracts the communication by exposing the balancer through a custom driver for that specific model. However, how it manages powered-on servers is slightly different.

There is a special service that runs on all powered-on servers/instances and the Fabric Controller communicates with the server via this service. The service tracks two things: the ‘current state’ of the server and the ‘goal state’. A goal might be to run a worker instance, or it might also be to remain idle as part of the free inventory. The current state might be something like ‘initialising’ or ‘idle’. The Fabric Controller and the local service can then manage how the system gets to the goal state from the current state.
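To make the current state / goal state idea concrete, here is a minimal sketch in Python. It is not how the real agent is written: the state names, the `TRANSITIONS` table and the `Node`, `step` and `reconcile` names are all assumptions made up for illustration. The only thing taken from the description above is the pattern of repeatedly nudging the current state towards the goal state.

```python
from dataclasses import dataclass

# Hypothetical transition table: (current state, goal state) -> next current state.
# The real service's states and transitions are not documented in this post.
TRANSITIONS = {
    ("idle", "running_worker"): "initialising",
    ("initialising", "running_worker"): "running_worker",
    ("running_worker", "idle"): "idle",
}


@dataclass
class Node:
    name: str
    current_state: str = "idle"
    goal_state: str = "idle"


def step(node):
    """Move the node one transition closer to its goal; return True if it changed."""
    if node.current_state == node.goal_state:
        return False
    node.current_state = TRANSITIONS[(node.current_state, node.goal_state)]
    return True


def reconcile(node, max_steps=10):
    """Step until the current state matches the goal; return the states visited."""
    visited = [node.current_state]
    for _ in range(max_steps):
        if not step(node):
            break
        visited.append(node.current_state)
    return visited


node = Node("rack1-server7")
node.goal_state = "running_worker"  # the controller asks for a worker instance
print(reconcile(node))
```

The controller only ever sets the goal; everything else is the node working its way there one transition at a time.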

When an error occurs, the service detects the fault and changes the current state accordingly (something like ‘faulted’). Once again the Fabric Controller and the service can manage what’s required to get back to the goal state. This might mean a reboot, or perhaps reprovisioning the whole server. The Fabric Controller can take alternative options like provisioning another resource to host your instance.

This mechanism is quite useful in that repeated patterns of failures or hardware faults can be easily identified, and a server can be marked as ‘inoperable’.
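As a rough illustration of spotting repeat offenders, the sketch below counts faults per server and escalates from a reboot, to a reprovision, to marking the server inoperable. The threshold of three faults, the action names and the `FaultTracker` class are my own assumptions; the real escalation policy isn't something I have details on.

```python
from collections import Counter

FAULT_THRESHOLD = 3  # assumed value, purely for illustration


class FaultTracker:
    """Counts faults per server and picks an escalating recovery action."""

    def __init__(self, threshold=FAULT_THRESHOLD):
        self.threshold = threshold
        self.faults = Counter()
        self.inoperable = set()

    def record_fault(self, server):
        """Record one fault and return the (hypothetical) action to take."""
        if server in self.inoperable:
            return "none"  # already pulled from the inventory
        self.faults[server] += 1
        count = self.faults[server]
        if count >= self.threshold:
            self.inoperable.add(server)
            return "mark_inoperable"
        if count == 1:
            return "reboot"
        return "reprovision"


tracker = FaultTracker()
for _ in range(3):
    print(tracker.record_fault("rack1-server7"))
```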

One of the key roles of the Fabric Controller is to provision resources based on the needs of the applications written by the developer. To manage this it has a declarative service model that defines exactly what is needed by the application. This model covers things like what roles the application performs and how those roles communicate, what operating system requirements there are (does it need IIS, for example), how much CPU is needed, bandwidth required, etc. It can even specify what guest operating system to use, and whether a dedicated box is required or virtualisation is enough.
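To give a feel for what ‘declarative’ means here, this is a small sketch of a service model expressed as plain data. It is not the real Azure service model format: the `RoleSpec` and `ServiceModel` classes and all their field names are invented for illustration, loosely following the kinds of things listed above (roles, how they communicate, IIS, CPU, dedicated vs virtualised).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleSpec:
    """One role in the application (all field names are hypothetical)."""
    name: str
    instances: int
    needs_iis: bool
    cpu_cores: int
    dedicated_machine: bool = False  # False means a virtualised guest is enough


@dataclass(frozen=True)
class ServiceModel:
    """Declares what the application needs, not how to provision it."""
    name: str
    roles: tuple
    channels: tuple = ()  # pairs of (from_role, to_role) that communicate

    def validate(self):
        """Return a list of problems; an empty list means the model is consistent."""
        known = {role.name for role in self.roles}
        problems = []
        for src, dst in self.channels:
            for role_name in (src, dst):
                if role_name not in known:
                    problems.append(f"unknown role: {role_name}")
        return problems


model = ServiceModel(
    name="shop",
    roles=(
        RoleSpec("web", instances=2, needs_iis=True, cpu_cores=1),
        RoleSpec("worker", instances=1, needs_iis=False, cpu_cores=1),
    ),
    channels=(("web", "worker"),),
)
print(model.validate())
```

The point of the shape is that nothing in it says which server to use; that decision stays with the controller.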

There is also some redundancy tolerance at the provisioning level, referred to as ‘fault domains’ and ‘update domains’. For example, you can specify that a particular application be distributed over 3 fault domains, meaning that your application will be located in different parts of the fabric such that server or switch failures will only bring down 1 instance. The Fabric Controller can model a certain amount of risk to sections of the fabric based on areas of single point of failure, and it uses these statistics when deploying your application into the fabric. This also applies to ‘update domains’ which essentially ensure that system updates that take services offline will affect your application 1 piece at a time, meaning you can ensure continuous availability.
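Here is a tiny sketch of why spreading instances over domains helps. It assumes a simple round-robin placement, which is my own simplification; the real controller weighs risk across the fabric in ways I can't show. All function names are hypothetical.

```python
def assign_domains(instance_count, domain_count):
    """Round-robin placement: instance i goes to domain i mod domain_count."""
    return [i % domain_count for i in range(instance_count)]


def surviving_instances(assignment, failed_domain):
    """Instances still running after one fault domain goes down."""
    return [i for i, domain in enumerate(assignment) if domain != failed_domain]


def rolling_update_batches(assignment):
    """Group instances by update domain; each batch is taken offline in turn."""
    batches = {}
    for instance, domain in enumerate(assignment):
        batches.setdefault(domain, []).append(instance)
    return [batches[domain] for domain in sorted(batches)]


fault_domains = assign_domains(instance_count=3, domain_count=3)
print(surviving_instances(fault_domains, failed_domain=1))

update_domains = assign_domains(instance_count=5, domain_count=2)
print(rolling_update_batches(update_domains))
```

With 3 instances over 3 fault domains, losing any one domain leaves 2 instances running. The update batches work the same way: only one batch is offline at any moment.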

When it comes time to provision a resource for your instance (being 1 instance of 1 role), the Fabric Controller will examine those specified requirements, and look through its inventory (fabric) for a resource that matches. It then changes the goal state of the node and the provisioning process begins.
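The matching step could be sketched like this. It is a first-fit search over a list, which is almost certainly simpler than what really happens; the `Node` fields, the `provision` function and the `run:<role>` goal-state naming are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    ram_gb: float
    disk_gb: int
    goal_state: str = "idle"  # "idle" means part of the free inventory


def provision(inventory, min_ram_gb, min_disk_gb, role):
    """Pick the first idle node that meets the requirements and set its goal state.

    Returns the chosen node, or None if nothing in the inventory matches.
    """
    for node in inventory:
        if (node.goal_state == "idle"
                and node.ram_gb >= min_ram_gb
                and node.disk_gb >= min_disk_gb):
            node.goal_state = f"run:{role}"
            return node
    return None


inventory = [
    Node("a", ram_gb=1.0, disk_gb=100),
    Node("b", ram_gb=1.7, disk_gb=250),
    Node("c", ram_gb=4.0, disk_gb=500),
]
chosen = provision(inventory, min_ram_gb=1.7, min_disk_gb=250, role="worker")
print(chosen.name, chosen.goal_state)
```

Note that `provision` only changes the goal state. Getting the node from ‘idle’ to actually running the role is left to whatever drives the node towards its goal.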

At this point you might be saying: “Hey, I can’t configure any of this stuff right now, you liar!” And you’d be right: as developers we don’t yet have this fine-grained control over how the Fabric Controller manages our applications. For the CTP, what we have been given instead is some templates, known to you and me as the Web and Worker Roles. On the Azure side, however, these templates are translated into predefined specifications around fault and upgrade domains, software requirements, and machine-level resources. For example, an instance will always be Server 2008 Enterprise running x64, with 1.7 GB of RAM and 250 GB of disk space. In the future we will see more specific control become available, particularly to organisations who pursue an SLA route with Microsoft. We should see some of this later in the year (2009).

The Fabric Controller itself is highly redundant, with 5 to 7 replicas available at any given time. The state of all the nodes in the fabric is replicated across all of these replicas to ensure that no matter which Fabric Controller is managing your particular node, its state tracking is fully up to date. In the event that all Fabric Controller nodes go down, all existing services will still continue to run. However, the provisioning and fault tolerance aspects will obviously be offline.

What I find really interesting is that the Fabric Controller replicas are all managed by a miniature version of Azure as well. This means that there is a service definition for the “Azure Fabric Controller” application which is deployed as a set number of instances, and has support for all the same kinds of fault and update domains. A new fabric controller can be provisioned automatically should there be failures, etc.

That’s about all I wanted to cover with the Fabric Controller. There’s a lot more to learn about how Azure manages its infrastructure, especially around deployment of host images and virtualised guests’ images. A future post perhaps.

Categories: Azure
Posted By: Steven Nagy
Last Edit: 23 Mar 2009 @ 08:44 AM


Responses to this post » (11 Total)

 
  1. New and Notable 307 : Sam Gentile's Blog (if (DeveloperTask == Communication && OS == Windows) said...
    12:56 am - March 24th, 2009

    [...] The Azure Fabric Controller - read What Is: The Azure Fabric and the Development Fabric first [...]

  2. Above The Cloud » Blog Archive » Windows Azure Geo Locations said...
    8:50 am - May 3rd, 2009

    [...] should be located near to each other in the data centre for optimal performance. As mentioned in previous posts of mine, the Fabric Controller is already equipped with the ability to provision resources on the fabric in [...]

  3. Above The Cloud » Blog Archive » Secret Azure Feature For PDC Release? Maybe.. said...
    4:30 pm - October 26th, 2009

    [...] instance. In this case it would be custom Server 2008 instances that could still be managed by the Fabric Controller. Microsoft summarises the concept like [...]

  4. Above The Cloud » Blog Archive » Windows Azure Development Deep Dive: Working With Configuration said...
    10:27 pm - March 7th, 2010

    [...] used in the service configuration file to configure your app. Think of it as an instruction to the fabric controller: “My app can be configured with these 4 values”. Also, once you’ve uploaded your application, [...]

  5. SpotCloud Goes Where Others Fear « Cloud Comments .net said...
    11:03 pm - November 30th, 2010

    [...] instead of building a proprietary platform that is difficult to sell, should’ve built their Azure fabric controller directly into Windows Server so that they could create a cloud market using their existing [...]

  6. Azure History and Introduction - Across Boundaries - ( Cipto ) said...
    7:52 pm - January 6th, 2011

    [...] Fabric controller, the man behind the screen who does all the Automation. from hardware management,service type setting, service life cycle .Details  [...]

  7. Azure@home Part 14: Inside the VMs - Jim O'Neil - Developer Evangelist - Site Home - MSDN Blogs said...
    4:25 pm - January 7th, 2011

    [...] it’s the Windows Azure Fabric Controller running in the data center that has decided upon that specific machine, but it’s the service [...]


 
