Thursday, September 23, 2010

My Windows Azure Wish List – The Future Cloud I Hope to See by 2012

What will cloud computing be like in a couple of years? I got my first look at Windows Azure two years ago, and the rate of progress has been nothing short of amazing--and it shows no sign of slowing down. Where should the cloud go from here? Here’s where I’d like to see it go over the next couple of years:

1. Auto-Sizing: Out-of-box Governance

Many people don’t seem to be aware that cloud computing brings with it a new management responsibility. A big selling point for the cloud is its elasticity and the cost efficiency that comes with it—but you only get that if you monitor activity and manage the size of your assets in the cloud. That is not by any means automatic today, so you must elect to do it yourself or through a third party, either by automated means or human oversight.

We could debate whether this is the cloud provider’s responsibility or the customer’s, and in fact it needs to be a partnership between the two. Since this is something everyone needs to do, however, it seems fair to expect the cloud provider to more than meet us halfway. In the Future Cloud, I’d like to be able to easily set technical or financial thresholds and have the cloud monitor them for me—notifying me about changes and trends and taking action as per my marching orders.

We may get some of these capabilities as cloud integrations become available to operations monitoring software such as System Center—but that’s not a full realization of this idea. The modern start-up may run 100% in the cloud with no on-premise IT. Those companies need a completely in-cloud way to do governance.
Human beings shouldn’t have to babysit the cloud, at least not beyond an oversight/approval level of involvement. It should watch itself for us, and governance should be an out-of-box cloud service.
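A governance service like this boils down to simple arithmetic the cloud could run for us. Here’s a minimal Python sketch of the kind of threshold check I have in mind; the function name and the idea of projecting month-end spend from the run rate so far are my own illustration, not any existing Azure feature.

```python
def check_thresholds(spend_to_date, day_of_month, days_in_month,
                     warn_at, act_at):
    """Project month-end spend from the run rate so far and classify it."""
    projected = spend_to_date / day_of_month * days_in_month
    if projected >= act_at:
        return ("act", projected)    # e.g. automatically scale instances down
    if projected >= warn_at:
        return ("warn", projected)   # e.g. notify the account owner
    return ("ok", projected)

# $300 spent by day 10 of a 30-day month projects to $900 by month end.
print(check_thresholds(300.0, 10, 30, warn_at=800.0, act_at=1000.0))
# -> ('warn', 900.0)
```

The point isn’t the math, which is trivial, but who runs it: the cloud itself, continuously, against policies we set once.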

2. Auto Shut-off: App and Data Lifetime Management

I don’t know about you, but my house probably would have burned down long ago and my electric bills gone through the roof if it were not for the auto shut-off feature of many household appliances such as irons and coffee-makers. You only have to browse the forums to see the daily postings of people who are in shock because they left the faucet running or didn’t realize other so-called hidden costs of the cloud.

It’s human nature to be forgetful, and in the cloud forgetfulness costs you money. Every application put in the cloud starts a run of monthly charges that will continue perpetually until you step in and remove it someday. Every datum put in the cloud is in the same boat: ongoing charges until you remove it. It’s extremely unwise to do either without thinking about the endgame: when will this application need to come out of the cloud? What is the lifetime for this data? You might think you won’t forget about such things, but think about what it will be like when you are using the cloud regularly and have many applications and data stores online.

What we need to solve this problem is lifetime management for assets in the cloud. In the Future Cloud, I’d like to see lifetime policies you can specify up-front when putting applications and data into the cloud—with automated enforcement. You can imagine this including ‘keep it until I delete it’ and ‘keep until [time]’—similar to the options you get on your DVR at home. Auto-delete could be dangerous, of course, so we will want more sophisticated options such as an ‘archive’ option, where we take something offline but don’t lose it altogether. Perhaps the best choice we could be given is a lease option, where the app or data’s expiration period gets renewed whenever it is used. This is how auto-shutoff works for many appliances: the shut-off timer gets reset whenever we use them, and only after a certain period of inactivity does deactivation take place.
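To make the lease idea concrete, here’s a small Python sketch of the renewal-on-use behavior. The class and method names are mine, and an injectable clock stands in for real time so the logic is easy to follow:

```python
import time

class LeasedAsset:
    """An app or data item whose expiration is renewed on every use."""

    def __init__(self, lease_seconds, clock=time.time):
        self.lease_seconds = lease_seconds
        self.clock = clock          # injectable so we can simulate time below
        self.last_used = clock()

    def touch(self):
        """Any use of the asset resets the shut-off timer."""
        self.last_used = self.clock()

    def expired(self):
        return self.clock() - self.last_used > self.lease_seconds

# Simulate with a fake clock: a 60-second lease, with activity at t=50.
now = [0.0]
asset = LeasedAsset(60, clock=lambda: now[0])
now[0] = 50.0
asset.touch()                # activity renews the lease
now[0] = 90.0
print(asset.expired())       # False: only 40s since last use
now[0] = 200.0
print(asset.expired())       # True: 150s of inactivity
```

The appliance analogy maps directly: `touch` is pressing the button on the coffee maker, and `expired` is the auto shut-off deciding enough idle time has passed.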

As with the previous wish list item, this is something everyone needs and is therefore a valid ask of cloud providers. Let us set lifetime policies for our apps and data when we put them in the cloud, and enforce them for us.

3. Mothballing & Auto-Activation: Dehydrate & Rehydrate Apps and Data

As described in the previous wish list item, an ideal implementation of lifetime management for applications and data would include decommissioning and archiving. That is, apps and data that become inactive should be mothballed automatically where they cost us far less than when they are formally deployed.

Along with mothballing comes the need for reactivation. Here I think we can take an idea from workflow technologies such as WF and BizTalk Server, where long-running workflows are dehydrated so that they do not consume finite resources such as threads and memory. They get persisted, and the workflow engine knows what events to look for in order to rehydrate them back into running, active entities.

In the Future Cloud, I’d like apps and data to be dehydrated when inactive and rehydrated when needed again—with greatly reduced costs during the inactive period. We can thus imagine an app that people start to use less and less, and eventually stop using altogether. An example of this might be a health care plan enrollment portal, only used once or twice a year. As the app moves to an inactive state, an expiration policy would cause the cloud to remove all of the server instances. However, the “light would be on”: a future access to the application would bring it back online. We can similarly imagine account data that moves into archive mode when inactive: kept around, but not at the premium rate.

The best realization of this concept would be that mothballed apps and data cost us nothing until they are re-activated. That might be a little unrealistic since the cloud provider is keeping the light on for us, but a mothballed asset should certainly cost a small fraction of an activated one.

4. Automatic Encryption

Most customers go through a period of considering risks and concerns (real or imagined) before they start using the cloud. A common concern that surfaces is the use of shared resources in the cloud and the specter of your critical data somehow falling into the wrong hands. The best way to feel okay about that is to encrypt all data transmitted and stored by your application. That way, if data does fall into the wrong hands—remote as that may be—it won’t be intelligible to them. In the Future Cloud, I’d like all data I store—database and non-database—to be automatically encrypted.

This is another example of something I believe we will all be doing: encryption of data will become a standard practice for all data we put into the cloud. As previously mentioned, whenever there is something everyone wants to do in the cloud it’s fair to ask the cloud provider to provide a service rather than each of us having to separately implement the capability. Naturally, the customer should remain in control of keys and strong encryption methods should be used.
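The mechanics are straightforward to sketch: encrypt before the data leaves your hands, and the cloud only ever stores ciphertext. The Python below illustrates the shape of the idea with a toy keystream cipher built from SHA-256; this construction is for illustration only--a real application would use a vetted library and an authenticated cipher such as AES, with proper key management.

```python
import hashlib
import itertools

def keystream(key: bytes, nonce: bytes):
    """Illustrative keystream: SHA-256 over key, nonce, and a counter."""
    for counter in itertools.count():
        block = hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        yield from block

def xor_cipher(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypts and decrypts (XOR is its own inverse)."""
    return bytes(b ^ k for b, k in zip(data, keystream(key, nonce)))

key, nonce = b"customer-held secret key", b"unique-per-blob-nonce"
plaintext = b"account=1234;balance=5000"
ciphertext = xor_cipher(key, nonce, plaintext)   # what the cloud stores
recovered = xor_cipher(key, nonce, ciphertext)   # only the key holder can do this

print(ciphertext != plaintext)   # True: stored data is unintelligible
print(recovered == plaintext)    # True: round-trips with the key
```

Notice the key never appears in the cloud at all; that’s the property that makes “data falls into the wrong hands” survivable.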

5. Get Closer to True Consumption-based Pricing

Cloud computing has great appeal because of the consumption-based pricing model and the analogies we can make to electricity and other utilities. However, the implementation of that idea today leaves room for improvement. While we do have consumption-based pricing, it’s very coarse-grained.

For example, let’s consider Windows Azure hosting. For each VM you allocate, you are reserving that ‘machine’ and are paying $0.12/hour or more for wall clock time. The actual usage of each VM has nothing to do with your charges. Is this really consumption-based pricing? Yes, but at a coarse level of granularity: you add or remove servers to match your load. Can we imagine something more ideal? Yes, charging for the machine hours used to service actual activity. This would work well in combination with an auto-sizing feature as previously discussed.
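The gap between the two models is easy to quantify. This Python sketch contrasts today’s reserved-instance billing with the hypothetical activity-based billing described above, using the $0.12/hour rate and a nominal 730-hour month (the 200 busy machine-hours figure is invented for illustration):

```python
RATE_PER_HOUR = 0.12
HOURS_PER_MONTH = 730

def reserved_cost(instances):
    """Today's model: pay wall-clock time for every allocated VM."""
    return instances * RATE_PER_HOUR * HOURS_PER_MONTH

def activity_based_cost(busy_hours):
    """The finer-grained ideal: pay only for machine-hours spent on real work."""
    return busy_hours * RATE_PER_HOUR

# Two reserved instances vs. the same workload that is only busy
# 200 machine-hours a month:
print(round(reserved_cost(2), 2))          # 175.2
print(round(activity_based_cost(200), 2))  # 24.0
```

A workload that is mostly idle pays the same $175 either way today; under the finer-grained model the bill would track the work actually done.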

We can make the same observation about SQL Azure. Today, you buy a database bucket in a certain size, such as 1GB or 10GB or 50GB. Whether that database is full, half full, or even completely empty does not affect the price you pay. Is this really consumption-based pricing? Yes, but again at a very coarse level. We can imagine a future where the amount of database storage in use drives the price, and we don’t have to choose a size bucket at all.

In the Future Cloud, I’d like to see more granular consumption-based pricing that more naturally lines up with usage and activities the way the customer thinks about them. It’s when the pricing model is at a distance from actual activity that surprises and disappointments come in using the cloud. We’ve already sold the ‘metering’ concept: now we need to give the customer the kind of meter they are expecting and can relate to.

6. Public-Private Portability: Doing Things the Same Way On-Prem or in the Cloud

I’m convinced many, many more businesses would be exploring the cloud right now if they could move portable workloads between cloud and on-premise effortlessly. Today, the cloud is a bit of a big step that requires you to change some things about your application. The cloud would be far more approachable if, instead of that one big step, an enterprise could take several small, reversible steps.

In the Future Cloud, I’d like to be able to host things the same way in the cloud and on-premise so that I can effortlessly shuttle portable workloads between cloud and on-prem. Portable workloads would be huge. It doesn’t seem realistic that existing enterprise apps are going to just work in the cloud unchanged, because they weren’t designed to take advantage of a cloud environment. What does seem realistic is that you can update your apps to work “the cloud way” but be able to host identical VMs locally or in the cloud, giving you the ability to change your workload split anytime. The advent of private cloud will play a big role in making this possible.

7. Hybrid Clouds: Joining My Network to the Cloud

Today, on-premise and in-cloud are two very separate places divided by big walls. IT assets are either “over here” or “over there”, and special activities are needed to move applications, data, or messages between them. This makes certain scenarios a poor fit for the cloud today. Consider what I call the “Molar” pattern: an application with so many internal integrations that its deep roots make it impractical to extract out of the enterprise and move into the cloud.

In the Future Cloud, I’d like to be able to bridge parts of my local network to my assets in the cloud. The picture of what makes sense in the cloud changes radically if we can make connections between the cloud and our local network. That molar pattern, for example, might now be a suitable thing for the cloud because the in-cloud application now has a direct way to get to the internal systems it needs to talk to.

We know this is coming for Windows Azure. “Project Sydney”, announced at PDC 2009, will provide us with a gateway between our local networks and our assets in the cloud. What we can expect from this is that in addition to the “first wave” of applications that make sense in the cloud now, there will be a second wave.

8. Effortless Data Movement

Moving data to and from the cloud is not particularly hard—if it’s small, and of the type where you have a convenient tool at hand. When working with large amounts of data, your options are reduced and you may find yourself doing a lot of manual work or even creating your own tools out of necessity.

It’s not just moving data into the cloud and out that’s at issue: you may want to copy or move data between projects in the data center; or you may want to copy or move data to a different data center. In the Future Cloud, I’d like to be able to easily move data between on-premise and cloud data centers around the world, regardless of the amount of data.

9. A Simpler Pricing Model

If you look at Azure ROI Calculators and TCO tools, you’ll see that there are many dimensions to the pricing model. As we continue to get more and more services in the cloud, those dimensions will only multiply. Although there’s something to be said for the transparency of separately accounting for bandwidth, storage, and so on, it certainly puts a burden on customers to estimate their costs correctly. It’s very easy to get the wrong idea about costs by overlooking even one dimension of the pricing model. In the Future Cloud, I’d like to see a simpler, more approachable pricing model. This might mean a less itemized version of the pricing model where you consume at a simple flat rate, with the ability to reduce your costs somewhat if you are willing to go the itemized route. This would be similar to tax returns, where you can choose between the easy form and the itemized form.

10. Provide SaaS Services

Software-as-a-Service providers are ISVs who face a common set of challenges: they need to provide multi-tenancy and engineer their solutions in a way that protects tenants well. This includes protection and isolation of data, and may involve customer-controlled encryption keys. SaaS providers also have to deal with provisioning of new accounts, which they would like to be as automated as possible. Change management is another consideration, where there is a tension between the ability to provide customizations and the use of a common deployment to serve all customers.

In the Future Cloud, I’d like to see services and a framework for SaaS functionality. Microsoft itself is solving these problems for SaaS offerings such as SharePoint Online and CRM Online. Why not offer provisioning, multi-tenancy, and data isolation services to SaaS ISVs as a general cloud service?

11. E-Commerce Services in the Cloud

In line with the BizSpark program and other Microsoft initiatives to support emerging businesses, e-commerce services in the cloud would be highly useful. A cloud-based shopping cart and payment service would be an excellent beginning, perhaps best implemented in conjunction with a well-known payment service such as PayPal. For more established businesses, we could imagine a deeper set of services that might include common ERP and commerce engine features. In the Future Cloud, I’d like to see shopping, payment, and commerce services.

12. Basic IT Services in the Cloud

It may be unrealistic to expect enterprises will put everything they have in the cloud, but start-ups are another matter altogether. For many start-ups, all of their IT will be in the cloud. They won’t have any local IT assets whatsoever beyond laptops. That means the basics, such as email, conferencing, Active Directory, domain management, and backup/restore will need to be in the cloud. We have a start on that today with Exchange Online, Office Communications Online, and Live Meeting in BPOS, but more is needed to complete the picture. In the Future Cloud, I’d like to see basic IT services provided by the cloud to support the fully-in-the-cloud customer.

Well, there’s my wish list. What do you think needs to be in the future cloud? Send me your comments.

Friday, September 17, 2010

Stupid Cloud Tricks #1: Hosting a Web Site Completely from Windows Azure Storage

Can you host a web site in Windows Azure without using Windows Azure Compute? Sure you can: you can ‘host’ an entire web site in Windows Azure Storage, 100% of it, if the web site is static. I myself am currently running several web sites using this approach. Whether this is a good idea is a separate discussion. Welcome to “Stupid Cloud Tricks” #1. Articles in this series will share interesting things you can do with the Windows Azure cloud that may be non-obvious and whose value may range from “stupid” to “insightful” depending on the context in which you use them.

If you host a web site in Windows Azure the standard way, you’re making use of Compute Services to host a web role that runs on a server farm of VM instances. It’s not uncommon in this scenario to also make use of Windows Azure blob storage to hold your web site assets such as images or videos. The reason you’re able to do this is that blob storage containers can be marked public or private, and public blobs are accessible as Internet URLs. You can thus have HTML <IMG> tags or Silverlight <Image> tags in your application that reference images in blob storage by specifying their public URLs.

Let’s imagine we put all of the files making up a web site in blob storage, not just media files. The fact that Windows Azure Storage is able to serve up blob content means there is inherent web serving in Windows Azure Storage. And this in turn means you can put your entire web site there—if it’s of the right kind: static or generated web sites that serve up content but don’t require server-side logic. You can however make use of browser-side logic using JavaScript or Ajax or Silverlight.

How does ‘hosting’ a static web site out of Windows Azure Storage compare to hosting it through Windows Azure Compute?
  • With the standard Windows Azure Compute approach, a single VM of the smallest variety @$0.12/hr will cost you about $88/month--and you need at least 2 servers if you want the 3 9's SLA. In addition you’ll pay storage fees for the media files you keep in Windows Azure storage as well as bandwidth fees.
  • If you put your entire site in Windows Azure storage, you avoid the Compute Services charge altogether but you will now have more storage to pay for. As a reminder, storage charges include a charge for the amount of storage @$0.15/GB/month as well as a transaction fee of $0.01 per 10,000 transactions. Bandwidth charges also apply but should be the same in either scenario.
So which costs more? It depends on the size of your web site files. In the Compute Services scenario the biggest chunk of your bill is likely the hosting charges, which are a fixed cost. In the storage-hosted scenario you’re converting this aspect of your bill to a charge for storage, which is not fixed: it’s based on how much storage you are using. It’s thus possible for your storage-hosted web site charges to be higher or lower than the Compute-hosted approach depending on the size of the site. In most cases the storage scenario is going to be less than the Compute Services scenario.
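For concreteness, here’s the comparison as a quick Python calculation using the rates quoted above. The example site size and traffic figures are invented, and bandwidth is omitted since it’s the same in either scenario:

```python
def compute_hosted(instances=2, rate=0.12, hours=730):
    """Standard hosting: two small VMs billed by the wall-clock hour."""
    return instances * rate * hours

def storage_hosted(site_gb, requests_per_month,
                   storage_rate=0.15, txn_rate=0.01, txn_unit=10_000):
    """Storage 'hosting': $0.15/GB/month plus $0.01 per 10,000 transactions."""
    return site_gb * storage_rate + requests_per_month / txn_unit * txn_rate

# A 2 GB static site serving 5 million requests a month:
print(round(compute_hosted(), 2))              # 175.2
print(round(storage_hosted(2, 5_000_000), 2))  # 5.3
```

Even at heavy traffic, the transaction fees stay small relative to the fixed cost of reserved VMs, which is why the storage approach usually wins for a static site.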

As noted, this is only useful for a limited set of scenarios. It’s not clear what this technique might cost you in terms of SLA or Denial of Service protection, for example. Still, it’s interesting to consider the possibilities given that Windows Azure Storage is inherently a web server. The reverse is also true: Windows Azure Compute inherently comes with storage--but that’s another article.

Saturday, August 28, 2010

Threat Modeling the Cloud


If there’s one issue in cloud computing you have to revisit regularly, it’s security. Security concerns, real or imagined, must be squarely addressed in order to convince an organization to use cloud computing. One highly useful technique for analyzing security issues and designing defenses is threat modeling, a security analysis technique long used at Microsoft. Threat modeling is useful in any software context, but is particularly valuable in cloud computing due to the widespread preoccupation with security. It’s also useful because technical and non-technical people alike can follow the diagrams easily. Michael Howard provides a very good walk-through of threat modeling here. At some level this modeling is useful for general cloud scenarios, but as you start to get specific you will need to have your cloud platform in view, which in my case is Windows Azure.

To illustrate how threat modeling works in a cloud computing context, let’s address a specific threat. A common concern is that the use of shared resources in the cloud might compromise the security of your data by allowing it to fall into the wrong hands—what we call Data Isolation Failure. A data isolation failure is one of the primary risks organizations considering cloud computing worry about.

To create our threat model, we’ll start with the end result we’re trying to avoid: data in the wrong hands.


Next we need to think about what can lead to this end result that we don’t want. How could data of yours in the cloud end up in the wrong hands? It seems this could happen deliberately or by accident. We can draw two nodes, one for deliberate compromise and one for accidental compromise; we number the nodes so that we can reference them in discussions. Either one of these conditions is sufficient to cause data to be in the wrong hands, so this is an OR condition. We’ll see later on how to show an AND condition.

Let’s identify the causes of accidental data compromise (1.1). One would be human failure to set the proper restrictions in the first place: for example, leaving a commonly used or easily-guessed database password in place. Another might be a failure on the part of the cloud infrastructure to enforce security properly. Yet another cause might be hardware failure, where a failed drive is taken out of the data center for repair. These and other causes are added to the tree, which now looks like this:


We can now do the same for the deliberately compromised branch (1.2). Some causes include an inside job, which could happen within your business but could also happen at the cloud provider. Another deliberate compromise would be a hacker observing data in transmission. These and other causes could be developed further, but we’ll stop here for now.


If we consider these causes sufficiently developed, we can explore mitigations to the root causes, the bottom leaves of the tree. These mitigations are shown in circles in the diagram below (no mitigation is shown for the “data in transmission observed” node because it needs to be developed further). For cloud threat modeling I like to color code my mitigations to show the responsible party: green for the business, yellow for the cloud provider, red for a third party.


You should not start to identify mitigations until your threat tree is fully developed, or you’ll go down rabbit trails thinking about mitigations rather than threats. Stay focused on the threats. I have deliberately violated this rule just now in order to show why it’s important. At the start of this article we identified the threat we were trying to model as “data in the wrong hands”. That was an insufficiently described threat, and we left out an important consideration: is the data intelligible to the party that obtains it? While we don’t want data falling into the wrong hands under any circumstances, we certainly feel better off if the data is unintelligible to the recipient. The threat tree we have just developed, then, is really a subtree of a threat we can state more completely as: Other parties obtain intelligible data in cloud. The top of our tree now looks like this, with 2 conditions that must both be true. The arc connecting the branches indicates an AND relationship.


The addition of this second condition is crucial, for two reasons. First, failing to consider all of the aspects in a threat model may give you a false sense of security when you haven’t examined all of the angles. More importantly, though, this second condition is something we can easily do something about by having our application encrypt the data it stores and transmits. In contrast we didn't have direct control over all of the first branch's mitigations. Let’s develop the data intelligible side of the tree a bit more. For brevity reasons we’ll just go to one more level, then stop and add mitigations.

Mitigation is much easier in this subtree because data encryption is in the control of the business. The business merely needs to decide to encrypt, do it well, and protect and rotate its keys. Whenever you can directly mitigate rather than depending on another party to do the right thing you’re in a much better position. The full tree that we've developed so far now looks like this.


Since the data intelligible and data in the wrong hands conditions must both be true for this threat to be material, mitigating just one of the branches mitigates the entire threat. That doesn’t mean you should ignore the other branch, but it does mean one of the branches is likely superior in terms of your ability to defend against it. This may enable you to identify a branch and its mitigation(s) as the critical mitigation path to focus on.
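Incidentally, the AND/OR logic of a threat tree is simple enough to capture in a few lines of code. This Python sketch models the tree above (node names follow the article; the structure is my own shorthand) and confirms that mitigating just the ‘data intelligible’ branch defuses the whole AND threat:

```python
def threat_live(node, mitigated):
    """A leaf is live unless mitigated; OR needs any live child, AND needs all."""
    if isinstance(node, str):
        return node not in mitigated
    op, children = node
    results = [threat_live(c, mitigated) for c in children]
    return all(results) if op == "AND" else any(results)

tree = ("AND", [
    ("OR", ["accidental compromise", "deliberate compromise"]),  # wrong hands
    "data intelligible",
])

# Encrypting defeats the 'data intelligible' branch, and the AND at the top
# means the overall threat is no longer live:
print(threat_live(tree, mitigated={"data intelligible"}))  # False
print(threat_live(tree, mitigated=set()))                  # True
```
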

While this example is not completely developed I hope it illustrates the spirit of the technique and you can find plenty of reference materials for threat modeling on MSDN. Cloud security will continue to be a hot topic, and the best way to make some headway is to get specific about concerns and defenses. Threat modeling is a good way to do exactly that.

Saturday, August 21, 2010

Hidden Costs in the Cloud, Part 2: Windows Azure Bandwidth Charges

In Part 1 of this series we identified several categories of “hidden costs” of cloud computing—that is, factors you might overlook or underestimate that can affect your Windows Azure bill. Here in Part 2 we’re going to take a detailed look at one of them, bandwidth. We’ll first discuss bandwidth generally, then zoom in on hosting bandwidth and how your solution architecture affects your charges. Lastly, we’ll look at how you can estimate or measure bandwidth using IIS Logs and Fiddler.

KINDS OF BANDWIDTH
Bandwidth (or Data Transfer) charges are a tricky part of the cloud billing equation because they’re harder to intuit. Some cloud estimating questions are relatively easy to answer: How many users do you have? How many servers will you need? How much database storage do you need? Bandwidth, on the other hand, is something you may not be used to calculating. Since bandwidth is something you’re charged for in the cloud, and could potentially outweigh other billing factors, you need to care about it when estimating your costs and tracking actual costs.

Bandwidth charges apply any time data is transferred into the data center or out of the data center. These charges apply to every service in the Windows Azure platform: hosting, storage, database, security, and communications. You therefore need to be doubly aware: not only are you charged for bandwidth, but many different activities can result in bandwidth charges.

In the case of Windows Azure hosting, bandwidth charges apply when your cloud-hosted web applications or web services are accessed. We’re going to focus specifically on hosting bandwidth for the remainder of this article.

An example where Windows Azure Storage costs you bandwidth charges is when a web page contains image tags that reference images in Windows Azure blob storage. Another example is any external program which writes to, reads from, or polls blob, queue, or table storage. There is a very complete article on the Windows Azure Storage Team Blog I encourage you to read that discusses bandwidth and other storage billing considerations: Understanding your Windows Azure Storage Billing.

Bandwidth charges don’t discriminate between people and programs: they apply equally to both human and programmatic usage of your cloud assets. If a user visits your cloud-hosted web site in a browser, you pay bandwidth charges for the requests and responses. If a web client (program) invokes your cloud-hosted web service, you pay bandwidth charges for the requests and responses. If a program interacts with your cloud-hosted database, you pay bandwidth charges for the T-SQL queries and results.

WHEN BANDWIDTH CHARGES APPLY
The good news about bandwidth is that you are not charged for it in every situation. Data transfer charges apply only when you cross the data center boundary: That is, something external to the data center is communicating with something in the data center. There are plenty of scenarios where your software components are all in the cloud; in those cases, communication between them costs you nothing in bandwidth charges.

It’s worth taking a look at how this works out in practice depending on the technologies you are using and the architecture of your solutions.

SCENARIO 1: CLOUD-HOSTED ASP.NET WEB SITE
Let’s consider a typical ASP.NET solution, where you have an ASP.NET web site in the cloud whose web services and database are also in the cloud. When a user interacts with your web site, you’re incurring bandwidth charges as the user’s browser sends and receives data to and from the web site. If your site has image or media tags that reference blobs in Windows Azure Storage, you’re also incurring bandwidth charges for accessing them. Fortunately, browser image caching will keep that from getting out of hand. The web site in turn talks to its web services, and those web services in turn interact with a database. There are no bandwidth charges for the web service communication or the database communication because all of the parties are in the data center. In the diagram below the thicker green arrows show interactions that cross the data center boundary and incur bandwidth charges. In contrast the thinner black arrows show interactions that never leave the data center and incur no bandwidth charges.



Bandwidth Profile of an ASP.NET Solution in the Cloud

SCENARIO 2: CLOUD-HOSTED SILVERLIGHT APPLICATION
Now let’s consider nearly the same scenario but this time the front end is a Silverlight application. It’s largely the same composition as the previously described ASP.NET solution, except that when a user browses to the site a Silverlight application is downloaded that then runs locally. As the front end is now running locally on the user’s computer, the bandwidth picture changes. First off, there is less bandwidth consumption from the UI because we are no longer hitting the web server repeatedly to go to different pages: instead, the interaction is all local to the Silverlight application (except for image or media tags that reference blobs in Windows Azure storage). However, there is also more bandwidth consumption from web services because the client, the Silverlight application, is now outside the data center. Whereas web service calls registered no bandwidth charge in the prior scenario, now it’s something you’re paying for. Database interaction continues to incur no bandwidth charges because that’s coming from the web services which are still in the cloud.



Bandwidth Profile of a Silverlight Solution in the Cloud

SCENARIO 3: CLOUD-HOSTED BATCH APPLICATION
In a batch application where there is no outside interaction, your solution could incur no bandwidth charges whatsoever on a day-to-day basis. Presumably you need to insert data and retrieve results from time to time, which is where bandwidth charges will enter the picture.



Bandwidth Profile of a Batch Application in the Cloud

SCENARIO 4: HYBRID ON-PREMISE/CLOUD APPLICATION
In a hybrid application where parts of the solution are in the cloud and parts of the solution are on-premise, you’ll incur bandwidth charges for wherever it is you cross the data center boundary. That might mean bandwidth charges for web service calls, database access, or storage access depending on where the on-premise/cloud connection(s) are. In the case where web services and backing database are in the cloud and the consuming clients are on-premise, it’s just web service interaction that will have bandwidth charges.



Bandwidth Profile of a Hybrid Application

WHAT BANDWIDTH COSTS
Below is a copy of the Windows Azure price sheet for standard monthly consumption-based use of the cloud (note: you should always check the official pricing on Azure.com in case the pricing model or rates change over time). Looking at the Data Transfer pricing at the bottom, we see that in North America and Europe data transfers cost $0.10/GB into the data center and $0.15/GB out of the data center. In Asia the prices are higher at $0.30/GB in and $0.45/GB out.
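Turning those rates into a bill is simple multiplication; here’s a quick Python helper using the figures above (the 50 GB in / 200 GB out example volumes are invented):

```python
# Data transfer rates as of summer 2010, in $/GB.
RATES = {
    "na_eu": {"in": 0.10, "out": 0.15},
    "asia":  {"in": 0.30, "out": 0.45},
}

def transfer_cost(gb_in, gb_out, region="na_eu"):
    r = RATES[region]
    return gb_in * r["in"] + gb_out * r["out"]

# 50 GB in and 200 GB out per month:
print(round(transfer_cost(50, 200), 2))          # 35.0
print(round(transfer_cost(50, 200, "asia"), 2))  # 105.0
```

Note how the same traffic costs three times as much from the Asian data centers, which is worth remembering when choosing a region.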



Windows Azure Standard Monthly Pricing (as of summer 2010)

PRICING OPTIONS AND SPECIAL OFFERS
The rates we looked at in the previous section are for consumption-based pricing which is month-to-month with no term commitment. However, Windows Azure pricing comes in a few different flavors. For a time commitment, there is subscription pricing where a certain amount of bandwidth may be included; in these cases, you start paying for bandwidth once your usage exceeds that amount.

In addition, there are special offers. For example, the Windows Azure Platform Introductory Special that is being offered at the time of this writing gives you 500MB of in/out data transfer each month at no charge. It’s only when you exceed that usage that you pay bandwidth charges.


Windows Azure Introductory Special (as of summer 2010)
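Billing under an offer with an included allowance is just an overage calculation: you pay only for usage beyond the included amount. A small sketch, assuming the 500MB (roughly 0.5 GB) allowance and the standard out-of-data-center rate:

```python
# Hypothetical sketch: bandwidth charge under an offer that includes a
# free monthly allowance, such as the Introductory Special's 500MB.
def overage_cost(gb_used, free_gb, rate_per_gb):
    """Charge only for usage beyond the included allowance."""
    return max(0.0, gb_used - free_gb) * rate_per_gb

# 0.4 GB out stays within the 0.5 GB allowance, so it costs nothing;
# 2.0 GB out leaves 1.5 GB of overage billed at $0.15/GB.
print(overage_cost(0.4, 0.5, 0.15))
print(overage_cost(2.0, 0.5, 0.15))
```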

ESTIMATING BANDWIDTH USAGE
Now that we’ve established the importance of taking bandwidth into account, how can you estimate what your usage will be? This is definitely easier when you are migrating an existing application because you have something you can measure and extrapolate from. Here are two approaches you can use to estimate bandwidth charges in the cloud:

1. Measuring server-side bandwidth. If you are going to be migrating an existing application, you can measure current overall bandwidth usage at the server and extrapolate from there.
2. Estimating client-side bandwidth. If you can measure or estimate the bandwidth of various kinds of client interactions with your application, you can multiply that by expected load to arrive at expected overall bandwidth usage.

MEASURING SERVER-SIDE BANDWIDTH
If your application is web-oriented and uses Microsoft technologies, there’s a good chance it is IIS hosted. For IIS-hosted applications you can use IIS logs to measure overall application bandwidth. For other kinds of applications you’ll need to see if a similar facility is available or investigate using a server-side network monitoring tool.

You can control logging from IIS Manager. In the IIS / Logging area you can set up and configure logging, choosing from various formats, schedules, and fields. If you’re using the IIS defaults, you’re probably set up for W3C format logs, and the output fields don’t include bandwidth counts. To change that, click Select Fields and ensure Bytes Sent (sc-bytes) and Bytes Received (cs-bytes) are selected. While you’re setting up logging the way you want, also note where the log files are written.



With logging set up to capture bytes sent and bytes received, you’ll be collecting the raw data from which you can measure bandwidth. Once you have some log data, take a look at a log file and verify that bytes sent and received are being tracked. With the W3C format, you’ll see a text file similar to the listing below, with one line of values for each web request/response. Depending on the fields you’ve selected, the lines may be very long. The line beginning with #Fields gives you the legend to the data on each line. In the example shown, the sc-bytes and cs-bytes fields are the next-to-last values.

#Software: Microsoft Internet Information Services 7.5
#Version: 1.0
#Date: 2010-08-21 14:23:47
#Fields: date time cs-method cs-uri-stem cs-uri-query s-port sc-status sc-substatus sc-win32-status sc-bytes cs-bytes time-taken
2010-08-21 14:23:47 GET / - 80 - 200 0 0 936 565 250
2010-08-21 14:23:49 GET /welcome.png - 80 - 200 0 0 185196 386 1531
2010-08-21 14:23:49 GET /favicon.ico - 80 - 404 0 64 0 333 31
2010-08-21 14:23:54 GET /myapp.aspx - 80 - 404 0 0 1749 591 375
2010-08-21 14:24:00 GET /myapp - 80 - 401 2 5 1509 586 140
2010-08-21 14:24:00 GET /myapp - 80 301 0 0 691 3269 78
2010-08-21 14:24:00 GET /myapp/ - 80 200 0 0 3570 3270 312
2010-08-21 14:24:00 GET /myapp/Silverlight.js - 80 200 0 0 8236 3116 281
2010-08-21 14:24:01 GET /favicon.ico - 80 404 0 2 1699 3016 62
2010-08-21 14:24:04 GET /myapp/ClientBin/myapp.xap - 80 200 0 0 846344 3062 2625
2010-08-21 14:24:13 POST /myapp/mysvc.svc/mysvc.svc - 80 200 0 0 752 3461 359
2010-08-21 14:24:13 POST /myapp/mysvc.svc/mysvc.svc - 80 200 0 0 3262 3513 62
2010-08-21 14:24:13 POST /myapp/mysvc.svc/mysvc.svc - 80 200 0 0 6314 3453 250
2010-08-21 14:24:15 POST /myapp/mysvc.svc/mysvc.svc - 80 200 0 0 846 3564 1281
2010-08-21 14:24:15 POST /myapp/mysvc.svc/mysvc.svc - 80 200 0 0 847 3565 718
2010-08-21 14:24:30 POST /myapp/mysvc.svc/mysvc.svc - 80 200 0 0 834 3647 14609
2010-08-21 14:24:31 POST /myapp/mysvc.svc/mysvc.svc - 80 200 0 0 8028 3461 218

Next we need to sum the sc-bytes and cs-bytes values to get overall bandwidth for the period covered by the log. We can do this with a utility that parses IIS logs; the one I use is called Log Parser, a free download from Microsoft.

LogParser.exe "SELECT SUM(cs-bytes),SUM(sc-bytes) FROM u_ex10082114.log"
SUM(ALL cs-bytes) SUM(ALL sc-bytes)
----------------- -----------------
683352 1265207

Statistics:
-----------
Elements processed: 202
Elements output: 1
Execution time: 0.00 seconds
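If you’d rather not depend on Log Parser, the same totals can be computed with a short script. The sketch below assumes W3C-format logs with a #Fields directive, as in the listing above; the file name is hypothetical. An optional path filter is included, since (as discussed below) you may only be charged for a subset of the traffic:

```python
def sum_log_bandwidth(lines, path_filter=None):
    """Sum sc-bytes and cs-bytes from W3C-format IIS log lines.

    Uses the #Fields directive to locate the byte columns, so it works
    regardless of which fields logging was configured with. An optional
    path_filter keeps only requests whose cs-uri-stem starts with it,
    useful for isolating the traffic you'd actually be charged for.
    """
    fields, sc_total, cs_total = [], 0, 0
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            fields = line.split()[1:]          # column legend
            continue
        if not line or line.startswith("#"):
            continue                           # skip other directives
        values = line.split()
        if len(values) != len(fields):
            continue                           # skip malformed rows
        row = dict(zip(fields, values))
        if path_filter and not row.get("cs-uri-stem", "").startswith(path_filter):
            continue
        sc_total += int(row.get("sc-bytes", 0))
        cs_total += int(row.get("cs-bytes", 0))
    return sc_total, cs_total

# Hypothetical usage against a log file:
# with open("u_ex10082114.log") as f:
#     sc, cs = sum_log_bandwidth(f)
#     print(f"bytes sent: {sc}, bytes received: {cs}")
```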

Be mindful of the solution architecture discussion earlier in this article: it’s possible some of the bandwidth you’re measuring will be charged for in the cloud and some will not. If that’s the case, you’re going to need to parse your log files with selection filters to find the subset of bandwidth you would be charged for.

IIS logs are equally useful for monitoring bandwidth once your applications have been deployed to the cloud. Karsten Januszewski’s article Downloading and Parsing IIS Logs from Windows Azure explains how to do this.

ESTIMATING CLIENT-SIDE BANDWIDTH
Sometimes it’s easier to look at bandwidth from the client-side. If you can measure or estimate the bandwidth of an individual client session, you can multiply that by the expected load to arrive at overall bandwidth. If your application already exists and is web-based, you can measure the bandwidth of client-side interactions using the popular Fiddler tool (described in detail in the following section).

If you do this too coarsely, the information won’t be valuable. Whether you are measuring client bandwidth or estimating it, you need to consider the major usage patterns for your application and their frequency. Ask yourself what the different kinds of user are, what tasks they perform, and what interaction that entails. Once you have bandwidth figures for the various usage patterns, multiply them by the expected number of users per month for each pattern.

In the example analysis below, Fiddler was used to measure the number of requests, bytes sent, and bytes received for various tasks in a Silverlight-based training portal. Next, the number of user sessions per month for each task was estimated. Multiplying session bandwidth by session count gives us total expected in and out bandwidth. Although the numbers look large at this point, we’re only charged pennies per gigabyte. When we round up the number of in and out gigabytes and multiply by $0.10/GB in and $0.15/GB out, we have our final figure: a mere $9.70/month. Although the bandwidth charge is low in this particular example, you can’t assume that will always be the case.
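The arithmetic can be sketched as follows. The task measurements and session counts here are hypothetical placeholders, not the actual figures from the example analysis:

```python
import math

# Hypothetical per-session measurements (bytes, as Fiddler would report
# them) and estimated monthly session counts for each usage pattern.
tasks = [
    # (task,          bytes_sent, bytes_received, sessions/month)
    ("browse catalog",    50_000,      2_000_000,  5_000),
    ("watch lesson",      20_000,     30_000_000,  1_000),
    ("take quiz",         80_000,      1_500_000,  2_000),
]

GB = 1024 ** 3
total_in = sum(sent * n for _, sent, _, n in tasks)    # into the data center
total_out = sum(recv * n for _, _, recv, n in tasks)   # out of the data center

# Round up to whole gigabytes, then apply the consumption rates.
gb_in = math.ceil(total_in / GB)
gb_out = math.ceil(total_out / GB)
cost = gb_in * 0.10 + gb_out * 0.15
print(f"{gb_in} GB in, {gb_out} GB out -> ${cost:.2f}/month")
```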



You will want to use similar techniques to estimate bandwidth for other uses of the cloud, including storage, database, security, and communication services. Once deployed, inspect your monthly bill to see how well actual bandwidth aligns with predicted bandwidth. If you see a large difference, something wasn’t taken into account.

USING FIDDLER TO MEASURE CLIENT-SIDE BANDWIDTH
The Fiddler tool can be used to measure the bandwidth of client interactions with a web application. To measure bandwidth with Fiddler, follow these steps:

1. Identify a Concrete Measurement Objective.

Have a clear idea of what it is you are going to measure:

• Page or Session? Are you measuring a single web page access, or a complete session where the user will be navigating to a site and then interacting with it?
• User Role and Intended Tasks. If you are measuring a session, what role is the user in and what do they intend to accomplish? You’ll want to log that information along with the measurements you make. It may be helpful to create a script: the list of steps you will perform on the web site.
• First-time or Repeat Visit? Is this a first-time visit, or a repeat visit? This is important because the first-time visit scenario doesn’t benefit from browser caching of temporary files.

2. Ensure Proper Starting Conditions.

We don’t want to taint our results or our conclusions, so it’s important to have the right starting conditions.

a. Browser Caching. If you are measuring the repeat visit (cached) scenario, you need to have previously visited the site in the same way you will be using it now. If on the other hand you are measuring the first-time (non-cached) scenario, clear out your browser cache. In Internet Explorer 8, you do this by selecting Tools > Internet Options, clicking the Delete… button, specifying what to delete (just Temporary Internet Files), and clicking Delete.



b. Close Browser Windows and Web Applications. Close down any existing browser windows and other applications that generate web traffic, as we don’t want to include unintended Internet traffic in our measurements.

c. New Browser Instance. Launch a new browser instance.

3. Prepare Fiddler.

a. Launch Fiddler. Bring up Fiddler. You’ll see a large Web Sessions window on the left side of the display and a tabbed detail area to the right.

b. Clear Previous Sessions. If the Web Sessions window isn’t empty, select everything in it (you can use Ctrl+A) and press the Delete button. If you see ongoing activity and the window doesn’t stay empty, something still running is generating web traffic. Hunt it down, shut it down, and return to this step.

c. View Statistics. From the menu, select View > Statistics (or use the shortcut key, F7).



4. Perform Web Activity.

Perform the web activity you want to measure. Do this carefully, so that you include everything you want to but nothing superfluous. For a measurement of a web page access, simply navigate to the page and let it load in your browser. For a session, navigate to the web site and then start interacting as planned. Note: If you are trying to estimate activity for a web site that doesn’t actually exist yet, find a web property you feel is similar in terms of content density and navigation and use that as a rough gauge.

5. Capture Results.

a. Return to Fiddler. You should see data in the Web Sessions window reflecting your web activity (if you don’t, verify that File > Capture Traffic is checked in the menu).



b. Select All Activity. In the Web Sessions window, select all (Ctrl+A). This will give you summarized statistics for all of your web activity in the Statistics tab at right.



c. Capture Bandwidth Results. Select the entire Statistics window content, copy to the clipboard, and paste into Word or Excel where you can save it. Right at the top is key bandwidth information: the number of requests, the number of bytes sent, and the number of bytes received.

d. Capture Bandwidth Breakdown by Content Type. While it’s useful to know the bandwidth in terms of size, it’s also important to understand how that bandwidth usage breaks down. Click the Show Chart link at the bottom of the Statistics page and Fiddler will show you the breakdown by content type along with a chart. As in the previous step, you can select, copy, and paste the textual information. To copy the chart, click the Copy this Chart link at the bottom of the window. This can be very revealing: in the example below, we can see that images and JavaScript are taking up the lion’s share of the bandwidth. You may be able to optimize the bandwidth consumption of your application based on this information, for example by reducing your images to a smaller format and resolution.



e. Get Additional Information from Fiddler. Taking the time to learn more about Fiddler will give you deeper insight into not only the size of your bandwidth usage but also its nature.

SUMMARY
Bandwidth may or may not be a large factor in your Windows Azure billing. The magnitude of bandwidth charges depends on just one thing, how much data you pass in and out of the data center, but there are many factors that determine that: the number of cloud services you use, the architecture of your solution, the efficiency and chattiness of your interactions, usage patterns, and load.

Tuesday, August 17, 2010

The Enigma of Private Cloud

If you swim in cloud computing circles you cannot escape hearing the term private cloud. Private cloud is surely the feature most in demand by the cloud computing market—yet perhaps the longest in coming, as cloud computing vendors have gone from initial resistance to the idea to coming to terms with the need for it and figuring out how to deliver it. The concept is something of a paradox, made worse by the fact that private cloud means different things to different people. There are at least five meanings of private cloud in use out there, and they differ widely. Despite all this, the market pressure for private cloud is so great that cloud computing vendors are finding ways to deliver it anyway. Let’s take a deeper look at what’s going on here.

What’s Behind The Demand For Private Cloud?
The desire for private cloud is easy enough to appreciate. Organizations are enamored with the benefits of cloud computing but don’t like certain aspects of it, such as the loss of direct control over their assets or sharing resources with other tenants in the cloud. This is where the paradox comes in, because management by cloud data centers and shared resources are core to what cloud computing is and why its costs are low. The market isn’t required to be logical or think through the details, however, and when there’s sufficient demand vendors find ways to innovate. Thus, while private cloud may seem at odds with the general premise of cloud computing, it turns out we need it and will have it.

There are some other drivers behind the need for private cloud that are hard to get around. Governments may have requirements for physical control of data that simply cannot be circumvented. In some countries there are regulations that business data must be kept in the country of origin. Another influence is the future dream of things working the same way in both the cloud and the enterprise. When that day comes, solutions won’t have to be designed differently for one place or the other and enterprises will be able to move assets between on-premise and cloud effortlessly.

Defining Private Cloud
How then is private cloud to be brought about? This is where we get into many different ideas about what private cloud actually is. My pet peeve is people who use the term private cloud without bothering to define what they mean by it. Let’s take a look at understandings that are in widespread use.

1. LAN Private Cloud
Some people use private cloud to simply mean their local network, similar to how the Internet can be referred to as the cloud without any specific reference to cloud computing proper. This use of the term is rather non-specific so we can’t do much with it. Let’s move on.

2. Gateway Private Cloud
This use of private cloud centers on the idea of securely connecting your local network to your assets in the cloud. Amazon’s Virtual Private Cloud is described as “a secure and seamless bridge between a company’s existing IT infrastructure and the AWS cloud” which “connects existing infrastructure to isolated resources in the cloud through a VPN connection.” In the Windows Azure world, Microsoft is working on something in this category called Project Sydney. Sydney was mentioned at PDC 2009 last year but until it debuts we won’t know how similar or different it will be to the Amazon VPC approach. Stay tuned.

This type of private cloud is valuable for several reasons. It potentially lets you apply your own network security and operations monitoring infrastructure to your assets in the cloud. It also potentially lets your cloud assets access something they need on your local network, such as a server that you can’t or won’t put in the cloud.

3. Dedicated Private Cloud
In this flavor of private cloud you use a cloud computing data center where an area is dedicated to your exclusive use. From this you get the benefits you’re used to in the cloud, such as automated provisioning, management, and elasticity, along with the comfort of isolation from other tenants.

Microsoft Online Services has offered this kind of private cloud with a dedicated version of the Business Productivity Online Suite (“BPOS-D”) for customers with a large enough footprint to qualify.

It seems axiomatic that dedicated private cloud will always be more expensive than shared use of the cloud.

4. Hardware Private Cloud
In hardware private cloud, cutting edge infrastructure like that used in cloud computing data centers is made available for you to use on-premise. Of course there’s not only hardware but software as well. Microsoft’s recent announcement of the Windows Azure Appliance is in this category.

The nature of hardware private cloud makes it expensive and therefore not for everybody, but it is important that this kind of offering exist. First, it should allow ISPs to offer alternative hosting locations for the Windows Azure technology in the marketplace. Second, it allows organizations that must keep data on their premises, such as some government bodies, to still enjoy cloud computing. Third, it solves the “data must stay in the country of origin” problem, which is a significant issue in Europe.

Is there something like the hardware private cloud that’s a bit more affordable? There is, our next category.

5. Software Private Cloud
Software private cloud emulates cloud computing capabilities on-premise such as storage and hosting using standard hardware. While this can’t match all of the functionality of a true cloud computing data center, it does give enterprises a way to host applications and store data that is the same as in the cloud.

An enterprise gets some strong benefits from software private cloud. They can write applications one way and run them on-premise or in the cloud. They can move assets between on-premise and cloud locales easily and reversibly. They can change their split between on-premise and cloud capacity smoothly. Lock-in concerns vanish. One other benefit of a software private cloud offering is that it can function as a QA environment, something missing right now in Windows Azure.

We don’t have software private cloud in Windows Azure today but there’s reason to believe it can be done. Windows Azure developers already have a cloud simulator called the Dev Fabric; if the cloud can be simulated on a single developer machine, why not on a server with multi-user access? There’s also a lot of work going on with robust hosting in Windows Server AppFabric and perhaps the time will come when the enterprise and cloud editions of AppFabric will do things the same way. Again, we’ll have to stay tuned and see.

Should I Wait for Private Cloud?
You may be wondering if it’s too soon to get involved with cloud computing if private cloud is only now emerging and not fully here yet. In my view private cloud is something you want to take into consideration—especially if you have a scenario that requires it—but is not a reason to mothball your plans for evaluating cloud computing. The cloud vendors are innovating at an amazing pace and you’ll have plenty of private cloud options before you know it. There are many reasons to get involved with the cloud early: an assessment and proof-of-concept now will bring insights from which you can plan your strategy and roadmap for years to come. If the cloud can bring you significant savings, the sooner you start the more you will gain. Cloud computing is one of those technologies you really should get out in front of: by doing so you will maximize your benefits and avoid improper use.

Summary
There you have it. Private cloud is important, both for substantive reasons and because the market is demanding it. The notion of private cloud has many interpretations which vary widely in nature and what they enable you to do. Vendors are starting to bring out solutions, such as the Windows Azure Appliance. We’ll have many more choices a year from now, and then the question will turn from “when do I get private cloud” to “which kind of private cloud should we be using?”

And if you have private cloud fever, please explain which kind you mean!

Upcoming Cloud Computing for Public Sector Webcast

I'll be giving a webcast on Microsoft Cloud Computing and why it makes business sense for Public Sector on Wednesday August 18 from 10-11a PT.

10 Reasons to use Microsoft's Cloud Computing Strategy in Public Sector
https://www.clicktoattend.com/invitation.aspx?code=149746

Neudesic’s David Pallmann discusses why cloud computing is compelling from a business perspective and how it can be a high-value platform in the Public Sector. We examine why cloud computing on the Microsoft platform is fiscally responsible, keeps costs under control, and allows you to spend your I.T. dollars more efficiently. The discussion will include how to compute your monthly charges and how to determine the ROI of migrating existing applications to the cloud.

Upcoming Radio Talk on Windows Azure

On August 18th 5-6p PT I'll be joining David Lynn and Ed Walters of Microsoft on the radio for the Computer Outlook program to discuss Microsoft Cloud Computing. We'll be focusing on Windows Azure and a customer example.