Saturday, August 21, 2010

Hidden Costs in the Cloud, Part 2: Windows Azure Bandwidth Charges

In Part 1 of this series we identified several categories of “hidden costs” of cloud computing: factors you might overlook or underestimate that can affect your Windows Azure bill. Here in Part 2 we’re going to take a detailed look at one of them, bandwidth. We’ll first discuss bandwidth generally, then zoom in on hosting bandwidth and how your solution architecture affects your charges. Lastly, we’ll look at how you can estimate or measure bandwidth using IIS Logs and Fiddler.

KINDS OF BANDWIDTH
Bandwidth (or Data Transfer) charges are a tricky part of the cloud billing equation because they’re harder to intuit. Some cloud estimating questions are relatively easy to answer: How many users do you have? How many servers will you need? How much database storage do you need? Bandwidth, on the other hand, is something you may not be used to calculating. Since bandwidth is something you’re charged for in the cloud, and could potentially outweigh other billing factors, you need to care about it when estimating your costs and tracking actual costs.

Bandwidth charges apply any time data is transferred into the data center or out of the data center. These charges apply to every service in the Windows Azure platform: hosting, storage, database, security, and communications. You therefore need to be doubly aware: not only are you charged for bandwidth, but many different activities can result in bandwidth charges.

In the case of Windows Azure hosting, bandwidth charges apply when your cloud-hosted web applications or web services are accessed. We’re going to focus specifically on hosting bandwidth for the remainder of this article.

One example where Windows Azure Storage costs you bandwidth charges is a web page containing image tags that reference images in Windows Azure blob storage. Another is any external program that writes to, reads from, or polls blob, queue, or table storage. The Windows Azure Storage Team Blog has a thorough article on bandwidth and other storage billing considerations that I encourage you to read: Understanding your Windows Azure Storage Billing.

Bandwidth charges don’t discriminate between people and programs: they apply equally to both human and programmatic usage of your cloud assets. If a user visits your cloud-hosted web site in a browser, you pay bandwidth charges for the requests and responses. If a web client (program) invokes your cloud-hosted web service, you pay bandwidth charges for the requests and responses. If a program interacts with your cloud-hosted database, you pay bandwidth charges for the T-SQL queries and results.

WHEN BANDWIDTH CHARGES APPLY
The good news about bandwidth is that you are not charged for it in every situation. Data transfer charges apply only when you cross the data center boundary: that is, when something external to the data center communicates with something in the data center. There are plenty of scenarios where your software components are all in the cloud; in those cases, communication between them costs you nothing in bandwidth charges.

It’s worth taking a look at how this works out in practice depending on the technologies you are using and the architecture of your solutions.

SCENARIO 1: CLOUD-HOSTED ASP.NET WEB SITE
Let’s consider a typical ASP.NET solution, where you have an ASP.NET web site in the cloud whose web services and database are also in the cloud. When a user interacts with your web site, you incur bandwidth charges as the user’s browser sends data to and receives data from the web site. If your site has image or media tags that reference blobs in Windows Azure Storage, you also incur bandwidth charges when the browser fetches them. Fortunately, browser caching of images will keep that from getting out of hand. The web site in turn talks to its web services, and those web services in turn interact with a database. There are no bandwidth charges for the web service communication or the database communication because all of the parties are in the data center. In the diagram below the thicker green arrows show interactions that cross the data center boundary and incur bandwidth charges. In contrast the thinner black arrows show interactions that never leave the data center and incur no bandwidth charges.



Bandwidth Profile of an ASP.NET Solution in the Cloud

SCENARIO 2: CLOUD-HOSTED SILVERLIGHT APPLICATION
Now let’s consider nearly the same scenario but this time the front end is a Silverlight application. It’s largely the same composition as the previously described ASP.NET solution, except that when a user browses to the site a Silverlight application is downloaded that then runs locally. As the front end is now running locally on the user’s computer, the bandwidth picture changes. First off, there is less bandwidth consumption from the UI because we are no longer hitting the web server repeatedly to go to different pages: instead, the interaction is all local to the Silverlight application (except for image or media tags that reference blobs in Windows Azure storage). However, there is also more bandwidth consumption from web services because the client, the Silverlight application, is now outside the data center. Whereas web service calls incurred no bandwidth charge in the prior scenario, now you are paying for them. Database interaction continues to incur no bandwidth charges because it comes from the web services, which are still in the cloud.



Bandwidth Profile of a Silverlight Solution in the Cloud

SCENARIO 3: CLOUD-HOSTED BATCH APPLICATION
In a batch application where there is no outside interaction, your solution could run day to day without incurring any bandwidth charges at all. Presumably you need to load data and retrieve results from time to time, and that is where bandwidth charges enter the picture.



Bandwidth Profile of a Batch Application in the Cloud

SCENARIO 4: HYBRID ON-PREMISE/CLOUD APPLICATION
In a hybrid application where parts of the solution are in the cloud and parts are on-premise, you’ll incur bandwidth charges wherever communication crosses the data center boundary. That might mean bandwidth charges for web service calls, database access, or storage access depending on where the on-premise/cloud connection(s) are. In the case where web services and backing database are in the cloud and the consuming clients are on-premise, only the web service interaction will incur bandwidth charges.



Bandwidth Profile of a Hybrid Application
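
The rule behind all four scenarios can be sketched in a few lines of Python: a link is billable only when exactly one of its endpoints sits inside the data center. The component names and the Link type below are illustrative assumptions, not part of any Windows Azure API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """One communication path between two components (hypothetical model)."""
    source: str
    target: str


def billable_links(links, in_data_center):
    """Return the links that cross the data center boundary.

    A link crosses the boundary when exactly one endpoint is in the cloud.
    """
    return [
        link for link in links
        if (link.source in in_data_center) != (link.target in in_data_center)
    ]


# Components assumed to be hosted in the cloud for Scenarios 1 and 2.
IN_CLOUD = {"web site", "web services", "database", "blob storage"}

ASPNET = [
    Link("browser", "web site"),
    Link("browser", "blob storage"),
    Link("web site", "web services"),
    Link("web services", "database"),
]

SILVERLIGHT = [
    Link("browser", "web site"),        # initial download of the application
    Link("browser", "blob storage"),
    Link("browser", "web services"),    # the client now calls services directly
    Link("web services", "database"),
]

for name, links in (("ASP.NET", ASPNET), ("Silverlight", SILVERLIGHT)):
    crossing = billable_links(links, IN_CLOUD)
    print(name, [f"{link.source} -> {link.target}" for link in crossing])
```

Moving a component in or out of the IN_CLOUD set is enough to model a hybrid layout and see which links start or stop being billable.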

WHAT BANDWIDTH COSTS
Below is a copy of the Windows Azure price sheet for standard monthly consumption-based use of the cloud (note: you should always check the official pricing on Azure.com in case the pricing model or rates change over time). Looking at the Data Transfer pricing at the bottom, we see that in North America and Europe data transfers cost $0.10/GB into the data center and $0.15/GB out of the data center. In Asia the prices are higher at $0.30/GB in and $0.45/GB out.



Windows Azure Standard Monthly Pricing (as of summer 2010)
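
As a rough illustration of how these rates turn into a monthly figure, here is a minimal Python sketch. The rates are the summer 2010 numbers quoted above; the region keys and the function name are made up for this example, and the usage figures are hypothetical.

```python
# USD per GB, from the summer 2010 consumption price sheet quoted above.
# Check the official pricing before relying on these numbers.
RATES = {
    "north_america_europe": {"in": 0.10, "out": 0.15},
    "asia": {"in": 0.30, "out": 0.45},
}


def transfer_cost(gb_in, gb_out, region="north_america_europe"):
    """Return the data transfer charge in USD for one month."""
    rate = RATES[region]
    return round(gb_in * rate["in"] + gb_out * rate["out"], 2)


# Hypothetical month: 20 GB into the data center, 50 GB out of it.
print(transfer_cost(20, 50))
print(transfer_cost(20, 50, region="asia"))
```

With these hypothetical volumes the charge comes to $9.50 in North America or Europe and $28.50 in Asia, which shows how much the region alone can change the bill.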

PRICING OPTIONS AND SPECIAL OFFERS
The rates we looked at in the previous section are for consumption-based pricing which is month-to-month with no term commitment. However, Windows Azure pricing comes in a few different flavors. For a time commitment, there is subscription pricing where a certain amount of bandwidth may be included; in these cases, you start paying for bandwidth once your usage exceeds that amount.

In addition, there are special offers. For example, the Windows Azure Platform Introductory Special that is being offered at the time of this writing gives you 500MB of in/out data transfer each month at no charge. It’s only when you exceed that usage that you pay bandwidth charges.


Windows Azure Introductory Special (as of summer 2010)
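
An included allowance changes the arithmetic slightly: only usage above the allowance is billed. The sketch below makes several assumptions that the offer text above does not settle: that 500MB is treated as 0.5 GB, that the allowance applies separately to inbound and outbound transfer, and that overage is billed at the standard $0.10/$0.15 rates. Check the actual offer terms before using it.

```python
def billable_gb(used_gb, included_gb):
    """Return the portion of usage that exceeds the included allowance."""
    return max(0.0, used_gb - included_gb)


def transfer_cost_with_allowance(
    gb_in,
    gb_out,
    included_in_gb=0.5,    # assumption: 500MB allowance on inbound transfer
    included_out_gb=0.5,   # assumption: 500MB allowance on outbound transfer
    rate_in=0.10,          # assumption: overage billed at standard rates
    rate_out=0.15,
):
    """Return the monthly data transfer charge in USD after the allowance."""
    cost = (
        billable_gb(gb_in, included_in_gb) * rate_in
        + billable_gb(gb_out, included_out_gb) * rate_out
    )
    return round(cost, 2)


# Hypothetical light month: everything fits inside the allowance.
print(transfer_cost_with_allowance(0.3, 0.4))

# Hypothetical busier month: 2 GB in and 10 GB out are billable.
print(transfer_cost_with_allowance(2.5, 10.5))
```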

ESTIMATING BANDWIDTH USAGE
Now that we’ve established the importance of taking bandwidth into account, how can you estimate what your usage will be? This is definitely easier when you are migrating an existing application because you have something you can measure and extrapolate from. Here are two approaches you can use to estimate bandwidth charges in the cloud:

1. Measuring server-side bandwidth. If you are going to be migrating an existing application, you can measure current overall bandwidth usage at the server and extrapolate from there.
2. Estimating client-side bandwidth. If you can measure or estimate the bandwidth of various kinds of client interactions with your application, you can multiply that by expected load to arrive at expected overall bandwidth usage.

MEASURING SERVER-SIDE BANDWIDTH
If your application is web-oriented and uses Microsoft technologies, there’s a good chance it is IIS hosted. For IIS-hosted applications you can use IIS logs to measure overall application bandwidth. For other kinds of applications you’ll need to see if a similar facility is available or investigate using a server-side network monitoring tool.

You can control logging from IIS Manager. In the IIS / Logging area you can set up and configure logging. There are various formats, schedules, and fields you can select for logging. If you’re using IIS defaults, you’re probably set up for W3C format logs and the output fields don’t include bandwidth counts. To change that, click Select Fields and ensure Bytes Sent (sc-bytes) and Bytes Received (cs-bytes) are selected. While you’re setting up logging the way you want, also note the location where the log files are written.



With logging set up to capture bytes sent and bytes received, you’ll be collecting the raw data from which you can measure bandwidth. Once you have some log data, take a look at a log file and verify that the bytes sent and received are being tracked. With the W3C format, you’ll see a text file similar to the listing below, with one line of values for each web request/response. Depending on the fields you’ve selected the lines may be very long. The line beginning with #Fields gives you the legend to the data on each line. In the example shown, the sc-bytes and cs-bytes values come just before the final time-taken value.

#Software: Microsoft Internet Information Services 7.5
#Version: 1.0
#Date: 2010-08-21 14:23:47
#Fields: date time cs-method cs-uri-stem cs-uri-query s-port sc-status sc-substatus sc-win32-status sc-bytes cs-bytes time-taken
2010-08-21 14:23:47 GET / - 80 - 200 0 0 936 565 250
2010-08-21 14:23:49 GET /welcome.png - 80 - 200 0 0 185196 386 1531
2010-08-21 14:23:49 GET /favicon.ico - 80 - 404 0 64 0 333 31
2010-08-21 14:23:54 GET /myapp.aspx - 80 - 404 0 0 1749 591 375
2010-08-21 14:24:00 GET /myapp - 80 - 401 2 5 1509 586 140
2010-08-21 14:24:00 GET /myapp - 80 301 0 0 691 3269 78
2010-08-21 14:24:00 GET /myapp/ - 80 200 0 0 3570 3270 312
2010-08-21 14:24:00 GET /myapp/Silverlight.js - 80 200 0 0 8236 3116 281
2010-08-21 14:24:01 GET /favicon.ico - 80 404 0 2 1699 3016 62
2010-08-21 14:24:04 GET /myapp/ClientBin/myapp.xap - 80 200 0 0 846344 3062 2625
2010-08-21 14:24:13 POST /myapp/mysvc.svc/mysvc.svc - 80 200 0 0 752 3461 359
2010-08-21 14:24:13 POST /myapp/mysvc.svc/mysvc.svc - 80 200 0 0 3262 3513 62
2010-08-21 14:24:13 POST /myapp/mysvc.svc/mysvc.svc - 80 200 0 0 6314 3453 250
2010-08-21 14:24:15 POST /myapp/mysvc.svc/mysvc.svc - 80 200 0 0 846 3564 1281
2010-08-21 14:24:15 POST /myapp/mysvc.svc/mysvc.svc - 80 200 0 0 847 3565 718
2010-08-21 14:24:30 POST /myapp/mysvc.svc/mysvc.svc - 80 200 0 0 834 3647 14609
2010-08-21 14:24:31 POST /myapp/mysvc.svc/mysvc.svc - 80 200 0 0 8028 3461 218

Next we need to sum the sc-bytes and cs-bytes values so we know the overall bandwidth for the period covered by the log. We can do this with a utility that parses IIS logs. The one I use is called Log Parser and is a free download from Microsoft.

LogParser.exe "SELECT SUM(cs-bytes),SUM(sc-bytes) FROM u_ex10082114.log"
SUM(ALL cs-bytes) SUM(ALL sc-bytes)
----------------- -----------------
683352 1265207

Statistics:
-----------
Elements processed: 202
Elements output: 1
Execution time: 0.00 seconds

Be mindful of the solution architecture discussion earlier in this article: it’s possible some of the bandwidth you’re measuring will be charged for in the cloud and some will not. If that’s the case, you’re going to need to parse your log files with selection filters to find the subset of bandwidth you would be charged for.
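
If you would rather script the filtering yourself, the following Python sketch sums sc-bytes and cs-bytes from W3C-format lines and optionally restricts the total to requests whose cs-uri-stem starts with a given prefix. It is a minimal illustration, not a replacement for Log Parser: the function name and the prefix filter are assumptions, and it simply skips any row whose column count does not match the #Fields line.

```python
def sum_bytes(log_lines, uri_prefix=""):
    """Sum (sc-bytes, cs-bytes) over rows whose cs-uri-stem starts with uri_prefix."""
    fields = []
    sent = received = 0
    for line in log_lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The #Fields directive names the columns of the rows that follow.
            fields = line.split()[1:]
            continue
        if not line or line.startswith("#") or not fields:
            continue
        values = line.split()
        if len(values) != len(fields):
            continue  # row does not match the declared field list
        row = dict(zip(fields, values))
        if row["cs-uri-stem"].startswith(uri_prefix):
            sent += int(row["sc-bytes"])
            received += int(row["cs-bytes"])
    return sent, received


# Three rows taken from the listing above, with its #Fields line.
SAMPLE = """#Fields: date time cs-method cs-uri-stem cs-uri-query s-port sc-status sc-substatus sc-win32-status sc-bytes cs-bytes time-taken
2010-08-21 14:24:00 GET /myapp/ - 80 200 0 0 3570 3270 312
2010-08-21 14:24:04 GET /myapp/ClientBin/myapp.xap - 80 200 0 0 846344 3062 2625
2010-08-21 14:24:13 POST /myapp/mysvc.svc/mysvc.svc - 80 200 0 0 752 3461 359
"""

print(sum_bytes(SAMPLE.splitlines()))                      # all requests
print(sum_bytes(SAMPLE.splitlines(), "/myapp/mysvc.svc"))  # web service calls only
```

To run it against a real log, pass an open file object instead of the sample lines, for example `with open(path) as f: sum_bytes(f, "/myapp/mysvc.svc")`.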

IIS logs are equally useful for monitoring bandwidth once your applications have been deployed to the cloud. Karsten Januszewski’s article Downloading and Parsing IIS Logs from Windows Azure explains how to do this.

ESTIMATING CLIENT-SIDE BANDWIDTH
Sometimes it’s easier to look at bandwidth from the client-side. If you can measure or estimate the bandwidth of an individual client session, you can multiply that by the expected load to arrive at overall bandwidth. If your application already exists and is web-based, you can measure the bandwidth of client-side interactions using the popular Fiddler tool (described in detail in the following section).

If you do this too coarsely, the information won’t be valuable. Whether you are measuring client bandwidth or estimating it, you need to consider the major usage patterns for your application and their frequency. Ask yourself what the different kinds of user are, what tasks and scenarios they perform, and what interaction that entails. Once you have bandwidth figures for the various usage patterns, multiply them by the expected number of user sessions per month for each pattern.

In the example analysis below, Fiddler was used to measure the number of requests, bytes sent, and bytes received for various tasks for a Silverlight-based training portal. Next the number of user sessions per month for each task was estimated. Multiplying session bandwidth by session count gives us total expected in and out bandwidth. Although the numbers are looking large at this point, we’re only charged pennies per gigabyte. When we round up the number of in and out gigabytes and multiply by $0.10/GB in, $0.15/GB out, we have our final figure—a mere $9.70/month. Although the bandwidth charge is low in this particular example, you can’t assume that will always be the case.



You will want to use similar techniques to estimate bandwidth for other uses of the cloud, including storage, database, security, and communication services. Once deployed, you would want to inspect your monthly bill and see how well actual bandwidth aligns with predicted bandwidth. If you see a large difference, something wasn’t taken into account.
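
The per-task estimate described above (session bandwidth times session count, rounded up to whole gigabytes, times the rate) can be sketched as follows. Every number in the task list is hypothetical and is not taken from the training portal analysis; the sketch also assumes 1 GB = 1024^3 bytes and the standard $0.10/$0.15 rates.

```python
import math
from dataclasses import dataclass

BYTES_PER_GB = 1024 ** 3  # assumption: binary gigabytes


@dataclass
class TaskProfile:
    """Measured or estimated bandwidth for one kind of user session."""
    name: str
    bytes_sent: int          # client to cloud, i.e. into the data center
    bytes_received: int      # cloud to client, i.e. out of the data center
    sessions_per_month: int


def monthly_estimate(tasks, rate_in=0.10, rate_out=0.15):
    """Return (gb_in, gb_out, cost_usd) with gigabytes rounded up."""
    total_in = sum(t.bytes_sent * t.sessions_per_month for t in tasks)
    total_out = sum(t.bytes_received * t.sessions_per_month for t in tasks)
    gb_in = math.ceil(total_in / BYTES_PER_GB)
    gb_out = math.ceil(total_out / BYTES_PER_GB)
    return gb_in, gb_out, round(gb_in * rate_in + gb_out * rate_out, 2)


# Hypothetical usage patterns for illustration only.
TASKS = [
    TaskProfile("first visit", 20_000, 900_000, 5_000),
    TaskProfile("take a course", 150_000, 2_000_000, 20_000),
]

print(monthly_estimate(TASKS))
```

Keeping one TaskProfile per usage pattern makes it easy to revisit the estimate later and compare each assumption against the actual bill.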

USING FIDDLER TO MEASURE CLIENT-SIDE BANDWIDTH
The Fiddler tool can be used to measure the bandwidth of client interactions with a web application. To measure bandwidth with Fiddler, follow these steps:

1. Identify a Concrete Measurement Objective.

Have a clear idea of what it is you are going to measure:

• Page or Session? Are you measuring a single web page access, or a complete session where the user will be navigating to a site and then interacting with it?
• User Role and Intended Tasks. If you are measuring a session, what roles is the user in and what are they intending to accomplish? You’ll want to log that information along with the measurements you make. It may be helpful to create a script, the list of steps you will perform on the web site.
• First-time or Repeat Visit? Is this a first-time visit, or a repeat visit? This is important because the first-time visit scenario doesn’t benefit from browser caching of temporary files.

2. Ensure Proper Starting Conditions.

We don’t want to taint our results or our conclusions, so it’s important to have the right starting conditions.

a. Browser Caching. If you are measuring the repeat visit (cached) scenario, you need to have previously visited the site in the same way you will be using it now. If on the other hand you are measuring the first-time (non-cached) scenario, clear out your browser cache. In Internet Explorer 8, you do this by selecting Tools > Internet Options, clicking the Delete… button, specifying what to delete (just Temporary Internet Files), and clicking Delete.



b. Close Browser Windows and Web Applications. Close down any existing browser windows or Web-active applications as we don’t want to include unintended Internet traffic in our measurements.

c. New Browser Instance. Launch a new browser instance.

3. Prepare Fiddler.

a. Launch Fiddler. Bring up Fiddler. You’ll see a large Web Sessions window on the left side of the display and a tabbed detail area to the right.

b. Clear Previous Sessions. If the Web Sessions window isn’t empty, select everything in it (you can use Ctrl+A) and press the Delete key. If you see ongoing activity and the window doesn’t stay empty, something is still running and generating web traffic. Hunt it down, shut it down, and return to this step.

c. View Statistics. From the menu, select View > Statistics (or use the shortcut key, F7).



4. Perform Web Activity.

Perform the web activity you want to measure. Do this carefully, so that you are including everything you want to but nothing superfluous. For a measurement of a web page access, simply navigate to the page and let it load in your browser. For a session, navigate to the web site and then start interacting as planned. Note: If you are trying to gauge activity for a web site that doesn’t actually exist yet, find a web property you feel is similar in terms of content density and navigation and use that as a rough gauge.

5. Capture Results.

a. Return to Fiddler. You should see data in the Web Sessions window reflecting your web activity (if you don’t, verify that File > Capture Traffic is checked in the menu).



b. Select All Activity. In the Web Sessions window, select all (Ctrl+A). This will give you summarized statistics for all of your web activity in the Statistics tab at right.



c. Capture Bandwidth Results. Select the entire Statistics window content, copy to the clipboard, and paste into Word or Excel where you can save it. Right at the top is key bandwidth information: the number of requests, the number of bytes sent, and the number of bytes received.

d. Capture Bandwidth Breakdown by Content Type. While it’s useful to know the bandwidth in terms of size, it’s also important to understand how that bandwidth usage breaks down. Click on the Show Chart link at the bottom of the Statistics page and Fiddler will show you the breakdown by content type along with a chart. As in the previous step you can select, copy and paste the textual information. To copy the chart, click the Copy this Chart link on the bottom of the window. This can be very revealing: in the example below, we can see that images and JavaScript are taking up the lion’s share of the bandwidth. You may be able to optimize the bandwidth consumption of your application based on this information, for example by converting images to a more compact format or a lower resolution.



e. Get Additional Information from Fiddler. Taking the time to learn more about Fiddler will allow you to gain deeper insights into not only the size of your bandwidth but its nature.

SUMMARY
Bandwidth may or may not be a large factor in your Windows Azure billing. The magnitude of bandwidth charges depends on just one thing, how much data you pass in and out of the data center, but many factors determine that amount: the number of cloud services you use, the architecture of your solution, the efficiency and chattiness of your interactions, usage patterns, and load.

2 comments:

Anonymous said...

Hi,

Great article - very helpful.

One question - when analysing the traffic in Fiddler, should we only select the traffic that relates to the host website we are analysing? For example, I'm analysing totaljobs.com, and part of the response I see in Fiddler when requesting a page from totaljobs.com is a response from Virtual Earth (I suspect for maps). Should we include this in our bandwidth calculations?

Anonymous said...

David,

Any suggestions on calculating our SharePoint farm bandwidth consumption?