Just the other day the Microsoft Azure team shifted the CDN functionality into release mode and offered up a pricing model. I reflected on the release in my post ‘Azure CDN Pricing’; however, some of the information I provided was in fact incorrect.
This article is to correct some of those mistakes, and also offer a few more insights into the Microsoft CDN. I spoke with Jason Sherron from the Azure CDN team this morning and clarified a number of points and learnt more about Microsoft’s CDN offering.
Some years ago it was true that Microsoft used partner networks like Akamai and Limelight for content delivery. The team responsible for managing the global CDN was the “Edge Computing Network” (ECN) team.
Microsoft also had/has extensive co-location spaces: racks of servers sitting in a 3rd party data centre. Usually these have dedicated backbones and fibre. This localised service has extended over the years to include Microsoft.com, MSN.com, Bing Maps, etc.
Slowly Microsoft has started removing dependencies on 3rd party providers and moving on to their own infrastructure. This article gives some indication of where Microsoft are going with their services. Jason indicated to me that today, Microsoft serves 60% of edge content themselves.
Jason went on further to explain that as of this launch of the Azure CDN, Microsoft are now hosting 100% of your blob storage at those edges around the world. Your data does not sit with Akamai or Limelight, as I previously indicated.
Currently Microsoft has a single point of presence in Australia. The goal of any edge is to be located close to the key internet egress and ingress points in the local area, typically the local internet exchange. For us, this is in Sydney, and while Microsoft doesn’t have an entire data centre here, they partner with someone who does, yet the racks are all Microsoft servers. Hopefully we’ll see more presence in the future in other capital cities.
We discussed briefly “what next” with the CDN. While nothing is bedded down completely, the team is investigating the possibility of Silverlight smooth streaming (apparently one of the most requested features) and also the potential of ‘compute at the edge’. How this latter service would differ from an implementation of the Azure Fabric is beyond me at this stage, and Jason was not at liberty to provide further information. I’m certainly very interested to see what this is about though.
In yesterday’s article I indicated you pay twice for CDN retrievals. This is partially true but really should be clarified.
The first time your data gets requested at the edge, the node has to retrieve the blob from Azure storage. You pay at the Azure storage data centre (normal Azure bandwidth charges) and then you pay again when it is delivered from the edge to the user (CDN charges). The content is of course cached at that point. Subsequent requests will hit the cache, which means only one charge.
Essentially if your data is “hot” then you only pay once. If you are constantly finding that your data is “cold” then perhaps CDN isn’t for you.
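To make the “hot” versus “cold” arithmetic concrete, here is a minimal sketch of the charging model described above. The function name and the per-GB rates are placeholders I have made up for illustration, not Microsoft’s published prices, and storage transaction charges are ignored for simplicity.

```python
def cdn_delivery_cost(requests, size_gb, hit_ratio,
                      storage_rate_per_gb, cdn_rate_per_gb):
    """Estimate the bandwidth cost of serving one blob through the CDN.

    Every request is delivered from the edge, so it incurs the CDN charge.
    Only cache misses also pull the blob from Azure storage, so only they
    incur the storage bandwidth charge as well.
    All rates are hypothetical inputs supplied by the caller.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    misses = requests * (1.0 - hit_ratio)
    edge_cost = requests * size_gb * cdn_rate_per_gb
    origin_cost = misses * size_gb * storage_rate_per_gb
    return edge_cost + origin_cost


# Hypothetical figures: 1,000 requests for a 10 MB blob, with the same
# made-up rate for storage and CDN bandwidth.
hot = cdn_delivery_cost(1000, 0.01, hit_ratio=0.99,
                        storage_rate_per_gb=0.15, cdn_rate_per_gb=0.15)
cold = cdn_delivery_cost(1000, 0.01, hit_ratio=0.0,
                         storage_rate_per_gb=0.15, cdn_rate_per_gb=0.15)
print(f"hot: {hot:.4f}  cold: {cold:.4f}")
```

With equal rates, a completely cold blob costs twice as much as a fully cached one, which is the “pay twice” case; as the hit ratio approaches 1 the origin charge fades away.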
You have the option of either specifying the time-to-live for your blob object or you can rely on the heuristics of the cache network to determine the best time-to-live. More information can be found in this article: Delivering High-Bandwidth Content with the Windows Azure CDN.
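The effect of a time-to-live can be sketched with a toy edge cache. This is only an illustrative model under my own assumptions, not how the Azure edge nodes are actually implemented: the `EdgeCache` class, its `fetch_from_origin` callback and the explicit `now` argument are all hypothetical, and the network’s own heuristics are not modelled.

```python
class EdgeCache:
    """Toy model of one edge node with a fixed time-to-live per blob."""

    def __init__(self, fetch_from_origin, ttl_seconds):
        self._fetch = fetch_from_origin   # stands in for a read from blob storage
        self._ttl = ttl_seconds
        self._entries = {}                # blob name -> (content, expires_at)
        self.origin_fetches = 0           # each one would incur a storage charge

    def get(self, name, now):
        entry = self._entries.get(name)
        if entry is not None and now < entry[1]:
            return entry[0]               # cache hit: served from the edge copy
        content = self._fetch(name)       # miss or expired: go back to the origin
        self.origin_fetches += 1
        self._entries[name] = (content, now + self._ttl)
        return content


# The origin is a plain dict here; times are seconds on a made-up clock.
origin = {"logo.png": "version 1"}
cache = EdgeCache(origin.__getitem__, ttl_seconds=60)

first = cache.get("logo.png", now=0)      # miss: fetched from the origin
origin["logo.png"] = "version 2"          # the blob changes in storage
stale = cache.get("logo.png", now=30)     # still inside the TTL: old copy
fresh = cache.get("logo.png", now=60)     # TTL has expired: refetched
print(first, stale, fresh, cache.origin_fetches)
```

A longer time-to-live means fewer trips back to storage but a longer window in which the edge can serve an out-of-date copy, so the right value depends on how often the blob changes.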
Thanks to Jason for giving up some time to chat with me today, and for putting up with my follow-up emails. It’s important to remember that the authoritative source of your content is still your blob storage account, and the CDN cache is just a copy. The cache expiry will also affect the ‘freshness’ of your content, so keep this in mind if you have content that changes frequently.
* Update: Please note this article is now redundant. Please refer instead to this clarification: Azure CDN Updated
The pricing structure for the CDN aspect of Windows Azure has just been announced. You may remember that I previously wrote about Global Foundation Services and the mechanism that Microsoft uses to globally distribute its own content. Since CDN might become more interesting to you now that it is officially released, I thought I’d summarise two important points.
I did cover this in greater detail in the previous post about GFS mentioned above; Microsoft does not have a CDN of their own and 3rd party services are utilised to achieve this. While I don’t see it generally being a problem, this might bother some people, mostly the fact that they can’t be sure where their data is actually sitting when it is cached in a CDN node.
CDN nodes are not part of the Microsoft network, therefore you will pay for outbound data transactions and bandwidth from the Azure Storage service, as well as for connections to the CDN node. This means you are really paying a premium for this service. To quote the original release:
Any data transfers and storage transactions incurred to get data from Windows Azure Storage to the CDN will be charged separately at our normal Windows Azure Storage rates.
You have been warned!
Late last year Microsoft announced the availability of a Content Delivery Network (CDN) in Azure that you can use to distribute your Windows Azure Blobs to over 18 different edges, including Australia.
However, this technology is not actually a new offering. Microsoft has been building/leasing CDNs for some time now. Prior to 2007 Microsoft leveraged the CDNs of a variety of companies such as Akamai and Level 3. Then in early 2007 Limelight Networks announced that they were supporting Silverlight with a focus on streaming media, and not long after, a Microsoft and Limelight joint partnership was announced where Microsoft would lease some key CDN technologies from Limelight and start building their own CDN. Rumours spread about Microsoft acquiring the company, but those were quickly dismissed.
Like all other things infrastructure related, Global Foundation Services (GFS) is the department that is responsible for Microsoft’s CDN technology. At the end of the day you have to remember that Azure is not the first product to come along that requires massive storage and compute resources. Bing, Virtual Earth, Photo Gallery, Hotmail, etc are all services that require massive scale and Microsoft has been working in data centre technology for a long time. GFS is responsible for all of that and more. In their own words:
Global Foundation Services (GFS) is the engine that powers Microsoft’s Software Plus Services strategy. We focus on smart growth, high efficiency, and delivering a trusted experience to customers and partners worldwide
Microsoft has been building data centres for some time and it makes sense that they would start improving on their own CDN offering. And even with the Limelight deal in 2007, Microsoft has started scaling back its other outsourced CDN needs, improving on its own CDN capability instead. At the Content Delivery Summit of 2009, the GM of Microsoft’s Edge Computing Network, Jeff Cohen, released some interesting statistics around Microsoft’s CDN services, as this article states:
..one of the points that really stood out was how quickly Microsoft is moving away from relying on third party CDNs for delivery and instead, using their own internal CDN..
In 2007 about 95% of Microsoft content was delivered by 3rd party CDN, however by the end of 2010 they expect this number to have dropped to a lowly 40%. It would seem that Microsoft is moving away from partner CDNs for small/large file transmissions, opting instead to use its own CDN technology, but sees its video streaming still sitting firmly in the hands of other companies like Akamai and Limelight. It makes sense. As I said before, Microsoft is in the market of building data centres for its various products and has already built data centres all around the world, including this one in Dublin (there was a feasibility study done for an Australian based data centre a couple years ago and Microsoft opted not to build one here yet).
Well I hunted around and couldn’t find any specific information stating one way or the other whose CDN the blob storage uses, but I can tell you that it would appear to be hosted on 3rd party CDNs. How do I know for sure? I don’t, but there are some clues that hint at the fact, and I’ve dropped 2 key clues in this article for you to find. The first one to work out the 2 clues/indicators and post them in a comment to this post will win an Azure T-Shirt! (Australian residents only please).
Is there any real cause for alarm that my blob data could be hosted on a 3rd party CDN provider? Well no – you can only enable CDN availability on blob containers that are public access, which means everyone can get your data anyway.
In a country like Australia where we have communication ministers who want to introduce ISP level internet censorship and telco-providers refusing to improve bandwidth, CDNs will be critical to content delivery speed and adoption of higher quality information (eg HD streaming).
During the early CTP of Azure you could only select 2 locations for your Windows Azure compute and storage accounts, and 1 location for SQL Azure (SQL Data Services) and AppFabric (.Net Services). Now we have a lot more options for all 3 main technologies:
On top of this we now have a single location for Dallas accounts:
The Microsoft CDN is now also usable from Windows Azure Blob Storage. Microsoft has been building a CDN for some time and uses it for a variety of purposes. There are over 18 edges in the CDN, Australia being one of those nodes.
So next time you’re wondering about the locations available, hopefully you won’t need to go into the actual portal project creation process just to find out.