Coradiant

Archive for August, 2006

Change impact management


Tuesday, August 29th, 2006 Posted by: Alistair Croll

One of the best things about this job is the number of people I get to meet and talk with. It lets me see the patterns that form across the industry in fascinating and sometimes unpredictable ways.

One of the big shifts I’m seeing a lot of lately is a move away from simple error detection towards change impact management. It’s been said that the only constant is change, and this is certainly true in web applications. One of our customers rolls out six or eight changes a day—and these are code changes!

Early on, people thought Real User Monitoring would be a great way to detect problems and get to work on them before the phone rang. And it is. But an increasing number of people are using our technologies to measure the before and after of a change.

The change may be as small as a memory upgrade or a new layout, or as big as a data center move or switch from Java to .net. But in every case, they want to know two things:

  • Did the change I made do what it was supposed to?
  • Did I inadvertently break something when I changed that?

The two questions seem almost the same. But they’re different in subtle ways. An engineer might make a change to address a problem (like poor performance.) Or they might try to add a new feature or function. In the former case, they have an intended outcome—reduced latency—that they want to verify. In the latter, there’s not supposed to be an impact.

One of the biggest impacts is knowing whether a new version can handle the same traffic as its predecessor. Our customers do this by isolating the specific function (using a technology called a Watchpoint) and then plotting the relationship between performance and load. They then make the change, and compare before and after.

This lets them say things like, “before, 95 per cent of users got the report in under 5 seconds when we had 40 hits a second; now, that’s only true at 25 hits a second. That change cost me 15 hits a second, you bonehead!” To be fair, that’s usually what the product manager says to the coder; and more often than not, we see improvements from release to release. But you get the idea.
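The before-and-after comparison above can be sketched in a few lines. This is only an illustration, not how a Watchpoint works internally: the function names, the nearest-rank percentile, and the sample numbers are all assumptions made up for the example. It finds the highest load at which 95 per cent of requests still finished in under 5 seconds, then compares two releases.

```python
import math
from collections import defaultdict


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = math.ceil(len(ordered) * pct / 100)
    return ordered[max(rank, 1) - 1]


def max_load_meeting_target(samples, target_seconds=5.0, pct=95):
    """Highest load (hits/second) at which the pct-th percentile stays under target.

    samples is an iterable of (hits_per_second, response_seconds) pairs.
    Loads are checked from lowest to highest; the search stops at the first
    load level that misses the target. Returns None if no level qualifies.
    """
    by_load = defaultdict(list)
    for load, seconds in samples:
        by_load[load].append(seconds)

    best = None
    for load in sorted(by_load):
        if percentile(by_load[load], pct) < target_seconds:
            best = load
        else:
            break
    return best


# Invented measurements for one function, before and after a release.
before = [(25, 2.0), (25, 3.0), (40, 4.0), (40, 4.5), (50, 6.0), (50, 7.0)]
after = [(25, 3.0), (25, 4.0), (40, 5.5), (40, 6.0)]

cost = max_load_meeting_target(before) - max_load_meeting_target(after)
print(f"The change cost {cost} hits a second")
```

With these invented samples the old release holds the target up to 40 hits a second and the new one only up to 25, which is the 15-hit difference from the example above.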

The other thing that’s interesting about change impact management is the number of people who care. Operations teams don’t like change—in fact, one of our support team has a big sign that says, “what changed?” taped above his desk because change is nearly always the root cause of a problem. On the other hand, engineers live for change. They get paid for it. Whether they’re altering a network, or modifying a piece of code, or rewriting a query, they’re always changing something.

More and more of the people using Coradiant’s products are engineers. And that signals an expansion of the role of Real User Monitoring within IT, as everyone starts to realize how they can benefit from measuring actual users.

What’s Real User Monitoring, anyway?


Monday, August 14th, 2006 Posted by: Alistair Croll

We use the term Real User Monitoring to explain what Coradiant’s technology does. The term sounds a bit nebulous, but it does the job. Of course, there are lots of people who think they do real user monitoring; so I’m going to try to explain the distinctions between what we do and what they do.

Synthetic tests

First off are the synthetic testing companies. Their tools—usually sold as recurring monthly services—run scripts at regular intervals from all over the world. These scripts simulate what an ideal user would do: Transactions like checking in, putting something in a cart, or getting an account balance.
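To make the idea concrete, here is a minimal sketch of what one scripted run might look like. It is not any vendor's tool: `run_synthetic_transaction` and the step names are hypothetical, and the steps are plain callables so the example stays self-contained. A real service would issue HTTP requests and repeat the run on a schedule from many locations.

```python
import time


def run_synthetic_transaction(steps, clock=time.perf_counter):
    """Run scripted steps in order and time each one.

    steps is a list of (name, callable) pairs. Returns a list of
    (name, succeeded, seconds) tuples. The run stops at the first step
    that raises, the way a scripted checkout would stop at a failed login.
    """
    results = []
    for name, action in steps:
        start = clock()
        try:
            action()
            succeeded = True
        except Exception:
            succeeded = False
        results.append((name, succeeded, clock() - start))
        if not succeeded:
            break
    return results


# Stand-in steps; each would normally fetch a page and check its content.
steps = [
    ("check_in", lambda: None),
    ("add_to_cart", lambda: None),
    ("get_balance", lambda: None),
]
for name, succeeded, seconds in run_synthetic_transaction(steps):
    print(f"{name}: {'ok' if succeeded else 'FAILED'} in {seconds:.4f}s")
```

Note what the script measures: the same idealized path, every time. That is exactly why it is repeatable, and exactly why it says nothing about anyone who wanders off that path.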

Lots of people like synthetic tests because they’re repeatable and predictable. They’re great for baselining; in fact, Chris Looseley of Keynote Systems did a great job explaining this for us at the Webops sessions of Interop Las Vegas.

But they’re not monitoring real users. They’re simulating idealized users from controlled environments. Real users might be miserable while the synthetic tests work just fine.

Synthetic testing

These tools are essential to web operators; but they won’t tell you anything about the volume of traffic to a site, or whether end users are actually getting the performance that the tests report.

Web log analysis

A second way of collecting information on web health is via weblogs. Each time a server gets a hit, it writes down information on that hit in a logfile (usually following a format called ELF, or Extended Log Format.) The logfile tells you a lot about the request: Where it came from, what it requested, and when it occurred. It might even tell you about the timing of the request.
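As a rough illustration of what a logfile holds, the sketch below reads a few lines in the Extended Log Format, where a `#Fields:` directive names the columns that follow. The sample entries are invented, and the parser assumes fields are separated by whitespace and never quoted, which real logs do not guarantee.

```python
def parse_extended_log(lines):
    """Turn Extended Log Format lines into a list of dicts keyed by field name.

    Assumes whitespace-separated fields with no quoted strings. Lines that
    start with '#' are directives; only '#Fields:' is interpreted here.
    Entries whose column count does not match the directive are skipped.
    """
    fields = []
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            if line.lower().startswith("#fields:"):
                fields = line.split(":", 1)[1].split()
            continue
        values = line.split()
        if fields and len(values) == len(fields):
            records.append(dict(zip(fields, values)))
    return records


sample = [
    "#Version: 1.0",
    "#Fields: date time c-ip cs-method cs-uri-stem sc-status time-taken",
    "2006-08-14 09:15:02 192.0.2.10 GET /cart/add 200 0.412",
    "2006-08-14 09:15:03 192.0.2.11 GET /account/balance 200 1.873",
]
for record in parse_extended_log(sample):
    print(record["c-ip"], record["cs-uri-stem"], record["time-taken"])
```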

Web logs are monitoring real user activity. On their own, they’re not that useful. But feed them into a web log analysis tool (like Analog, Webtrends, Sane, or Sawmill) and you’ll find out lots of details: What people searched for, where they went on the site, what browser they used, and so on. More commonly, companies use web analytics firms like CoreMetrics, WebTrends Live, Omniture, Clicktracks, or WebSideStory, which collect activity using Javascript. Often, activity is displayed in funnels of user activity by step, cross-referenced with search terms.

Web funnel view

Web log analysis doesn’t offer much performance data. It won’t break requests down into the elements of latency, or show network forensics. It’s also aimed mainly at public-facing B2C sites: Analytics products are seldom used to explain activity on an intranet or a back-end B2B application.

Sniffers

Technically, sniffing traffic is real user monitoring—after all, real users made all those packets. But even viewing the traffic in a sniffer screen doesn’t tell you much about users. WildPackets, Network General, Niksun and ClearSight are good examples of sniffers I’ve seen, but most people I know use Ethereal, which is free and amazing.

A sniffer screen from Ethereal

Flow monitoring products

Higher up the stack than sniffers are what I call “flow monitors.” These work in a variety of ways, generally by asking other devices about traffic they saw (using RMON or NetFlow). A more open version of NetFlow, called IPFIX, is making this more and more attractive to people.

Flow monitoring across TCP ports

Response time monitoring

Another way to measure application response time is to sniff traffic from span ports and measure the round-trip time of sessions (rather than collecting flow data from network devices.) For example, NetQOS’ SuperAgent measures the end-to-end time between networks and hosts by listening to span ports or taps.

We announced a partnership with NetQOS back in April. Their reporter/analyzer product collects NetFlow and IPFIX data; and their SuperAgent product is a response-time monitoring product that watches the TCP/IP sessions between networks and hosts. It assembles and aggregates these so you can see how much traffic flowed from what network to what port on a server. And it measures performance data—how long the packets took to travel, how long the server thought about them, and so on.

What does what

A flow monitoring product summarizes things at the time of collection (i.e. on the router) so it can’t peer within the flow. Response time monitors can look within the traffic, but are generally protocol-agnostic: They don’t “understand” a web, e-mail, or IM session across individual traffic flows. This means that if a protocol-agnostic monitoring tool sees that there were 50 Kbytes of data between a network and a web host, the operator still doesn’t know whether that was one 50-Kbyte object, or 50 1-Kbyte objects.

As a result, I don’t know if this session was one user, or 10 users behind a NATting firewall. I don’t know how long individual pages took, or how many pages a user requested in a visit, and so on. And I can’t tell things like browser type or search string, or what they entered in a form.
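A toy example shows how the detail disappears. The record layout here is hypothetical (it is not NetFlow, IPFIX, or any product's format): each request is a small dict, and `flow_summary` keeps only what a protocol-agnostic monitor would, a byte total per network and server port.

```python
def flow_summary(requests):
    """Collapse per-request detail into byte totals per (client network, port)."""
    totals = {}
    for request in requests:
        key = (request["client_net"], request["server_port"])
        totals[key] = totals.get(key, 0) + request["bytes"]
    return totals


# One 50-Kbyte object...
one_big = [
    {"client_net": "198.51.100.0/24", "server_port": 80,
     "url": "/report.pdf", "bytes": 50 * 1024},
]

# ...versus fifty 1-Kbyte objects from the same network.
fifty_small = [
    {"client_net": "198.51.100.0/24", "server_port": 80,
     "url": f"/img/{i}.gif", "bytes": 1024}
    for i in range(50)
]

print(flow_summary(one_big))
print(flow_summary(fifty_small))
print(flow_summary(one_big) == flow_summary(fifty_small))
```

Both inputs collapse to the same 51,200-byte total, so once the URLs are thrown away there is no way to tell the two cases apart.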

On the other hand, flow monitors and response time monitors are great for comparing the amount of traffic across all kinds of applications. A sudden increase in Voice-Over-IP (VOIP) traffic might mean that web traffic takes longer to get through; someone running a backup late at night might inadvertently make late-night shoppers miserable. And this kind of activity is completely invisible to a product that’s watching HTTP. So if you’re trying to troubleshoot and measure networks, you need a flow monitor (preferably from our friends at NetQOS; they also have great SNMP monitoring tools to collect device health, and a centralized performance console).

Real User Monitoring products

TrueSight falls into this class. Basically, it’s able to discern individual users and page load times.

So a complete web operations team has a variety of monitoring tools at their disposal:

  • Synthetic testing to detect problems when there’s no activity and set baselines for controlled, known processes.
  • Web analytics to show conversion rates, funnels, search terms, and the like to marketing.
  • Sniffers to capture traces for network engineers.
  • Flow-based monitors to understand the breakdown of traffic across all protocols and how one application impacts the others.
  • Real user monitoring to measure the performance and availability experienced by actual users, diagnose individual incidents, and track the impact of a change.

So when Coradiant partners with NetQOS, it’s a way of giving customers the best of both worlds: Deep web analysis, alongside broad multiprotocol monitoring.

Viewing performance deep and wide

Picking raspberries: Why transaction funnels won’t last


Monday, August 7th, 2006 Posted by: Alistair Croll

I spent a rather prickly half-hour this weekend picking wild raspberries.  I meandered opportunistically through tangles of bramble, eyes searching for the deep-red clusters. And each time I found one, a glance showed me more. Some were right at hand, others tantalizingly out of reach.

At the end of the half hour, I walked up a short hill and looked back where I’d been as I picked thorns from my hands and wiped juices from a few stolen berries from my mouth.  And what I saw was interesting: No clear path.  I didn’t follow a trajectory.  I zig-zagged, apparently randomly, through the field. I was guided by my choices and by what caught my fancy.

A few short years ago, consumer-facing websites were transactional.  Users followed a tightly ordered series of steps, from goal to completion. Whether buying a car or applying for a loan, booking a flight or shopping for shoes, the fundamentals were the same.

Today, however, the assumption of a transaction, and the ubiquitous metaphor of the “funnel,” is fading fast. Modern sites do all they can to distract the user from a simple transaction. They show you other products; offer upgrades, promotions, and discounts; and try to forward you to their partners.

Indeed, today’s business-to-consumer (B2C) site bears far more resemblance to my fruit-gathering afternoon than to any funnel. Visitors meander opportunistically, following some distractions and ignoring others.  They linger, move back and forth, add to wishlists. And often the steps of a traditional transaction take place asynchronously: The site already knows the visitor’s billing details; visitors add things to their cart after they’ve started the payment process; and so on.

One of our customers, a leader in dynamic web platforms for retail and support, told me a little while back that their checkout pages are the lightest, least resource-intensive element of their site. In fact, it’s the personalization and customization that comes beforehand that consumes all of their processing.

But most of the Web Analytics applications I’ve seen are still stuck in funnels. They try to map goal attainment to costs (such as search words or campaign mailouts.)  And while this works reasonably well for prescriptive, task-oriented transactions, it’s less effective for the kinds of meandering interactions that are more and more common on today’s Internet.

This is one of the reasons that many of the companies I talk to are looking beyond traditional web analytics.  They’re pumping web data into Business Intelligence systems, where it can be analyzed and massaged to provide real insight into user behavior.

Tying this kind of analysis back to lead generation (search engines, e-mail campaigns, and the like) is only part of the challenge.  It also needs to be linked to the back-end resources to understand which dynamic content is working, and into operational systems to ensure that visitors get acceptable service levels.

We’re seeing the application of well-known technologies such as Business Intelligence to the challenges of web operations. And as sites move from the sterile assumption of a funnel to a more real-world understanding of visitor behavior, new metaphors will undoubtedly emerge.  But for many sites, the funnel’s time is up.

Now if I can just get that Raspberry Patch visualization to work…

Why percentiles are your friend


Friday, August 4th, 2006 Posted by: Alistair Croll

I recently spoke at the Computer Measurement Group’s Connecticut and New York regional meetings. The good folks at CMG are concerned with measuring how computers perform, planning their capacity, and trying to predict what’s going to break before anyone gets hurt.

They’re one of the few groups that understand percentiles already. Most of us have only a smattering of statistics, and go around quoting bad measurements all the time.

Most people think of the average as a way of understanding something. But averages are very misleading. I once had a stats professor who drove this home clearly. He strode into a dank, humid room late in a September that was lasting far longer than it had a right to. He squinted over bent glasses at a roomful of expectant students, stroking his bristling beard. And then, in a thick Northern European accent, he began.

“I apologize for the unseasonably warm weather. I realize that it will be hard to concentrate in this oppressive heat.”

The class sweated its agreement damply.

“But I assure you,” he continued, “that the building is poorly insulated, and that in the winter you will leave your jackets on and write with gloved hands. Indeed, it will be hard to work in that biting cold.”

As one, through the sweltering haze, we started to regret our choice of classes.

He looked up with a twinkle in his eye. “We can console ourselves with the knowledge that, on average, the temperature is really quite pleasant. My name is Stig; welcome to Statistics 201.”

Stig’s example was a good one. In this case, a standard deviation—a measure of how far from average the measurements are—would have revealed that while the average was indeed comfortable, daily temperatures varied widely from it.
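Stig's point is easy to check with invented numbers. The temperatures below are made up for illustration (five sweltering days, five freezing ones); the sketch uses Python's standard `statistics` module to show a comfortable average sitting next to a very large standard deviation.

```python
import statistics

# Invented classroom temperatures in degrees Fahrenheit:
# five sweltering days and five freezing ones, nothing in between.
temperatures = [90, 90, 90, 90, 90, 40, 40, 40, 40, 40]

average = statistics.mean(temperatures)
spread = statistics.pstdev(temperatures)  # population standard deviation

print(f"average: {average} F")
print(f"standard deviation: {spread} F")
```

The average works out to a pleasant 65 degrees, but the standard deviation is 25 degrees, and not a single day was actually anywhere near 65.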

One of the common misconceptions is that data is always distributed according to a normal, or bell, curve. In other words, it clusters around the average. Imagine our classroom: When I tell you that the average temperature is 65 degrees Fahrenheit, you assume it looks like this:

Normal curve

In our badly-insulated math class, there might have been many very hot days, and many very cold days: But no days that were actually temperate. In other words, the data looked like this:

Well curve

Some people refer to this as a well curve. It’s characteristic of many things—for example, the performance of broadband users (on the left) and dialup users (on the right) on a website.

There are lots of others. The much-blogged Long Tail, subject of a forthcoming book by a former editor of Wired Magazine, is common in things like online bookstores. A few books are hugely popular; but millions of other books are of roughly equal rank. If we had lots of cold days and only a few hot ones, my graph would look like this.

Long tail curve

All of these charts are histograms. They simply show how frequently something occurs at a particular measurement. People are starting to understand that averages are misleading, which reduces the confusion around statistics somewhat. And they’re starting to use histograms and frequency distributions to better communicate the health of something.
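For anyone who wants to see one built, here is a small sketch that bins invented classroom temperatures into 10-degree ranges and draws the counts as rows of `#` characters. The data and the bin width are assumptions chosen to produce the two-humped shape described earlier.

```python
from collections import Counter


def histogram(values, bin_width):
    """Count how many values fall into each bin, keyed by the bin's lower edge."""
    counts = Counter((value // bin_width) * bin_width for value in values)
    return dict(sorted(counts.items()))


# Invented temperatures in degrees Fahrenheit: cold days and hot days only.
temperatures = [38, 41, 44, 42, 39, 88, 91, 93, 90, 87]

bins = histogram(temperatures, 10)
for lower_edge in range(30, 100, 10):
    bar = "#" * bins.get(lower_edge, 0)
    print(f"{lower_edge:>3}-{lower_edge + 9:<3} {bar}")
```

The rows for the 50s, 60s, and 70s come out empty, even though the average of these numbers is about 65.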

In my experience, performance for most web sites looks something like the following chart.

Typical web performance

There’s another, bigger issue at work here. Not everybody matters as much. The odds that a user with very bad performance complains, doesn’t buy, or otherwise undermines your site are high. So companies should care more about those stragglers on the far-right-hand side of the graphs. Consider the following chart, which shows how many people were upset by slow performance.

Typical web redyellowgreen coloring

There are two good ways to understand this information relatively clearly. The first is to have a threshold (say, 10 seconds) and then to count the number of people that were below that threshold.

Hard 10 second threshold

In this example, 74% of requests (37 out of 50) were completed in less than 10 seconds. And the 13 users on the right are the ones we care about (assuming, of course, that these users are impatient enough to go away when the page takes 10 seconds or more. More on this later.)

But what if we change the threshold? What if users suddenly become more demanding, so that now 6 seconds is all they’ll tolerate?

Hard 6 second threshold

Now, only 27 users (54%) are satisfied. And this is the problem with a threshold approach—thresholds change. Once you’ve set a threshold, you can never go back and change it for something that happened in the past. You’ve counted how many people were over 10 seconds; but you can’t go back in time and now count up how many were under 6 seconds.
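The arithmetic of the two thresholds can be reproduced with a made-up set of 50 response times. The individual values are invented so that they match the counts in the examples above (27 requests under 6 seconds, 37 under 10); only the counts come from the charts.

```python
def percent_under(times, threshold):
    """Return (count, percentage) of response times strictly below threshold."""
    under = sum(1 for seconds in times if seconds < threshold)
    return under, 100 * under / len(times)


# 50 invented response times, in seconds.
times = [3.0] * 27 + [8.0] * 10 + [12.0] * 13

print(percent_under(times, 10))  # 10-second threshold
print(percent_under(times, 6))   # 6-second threshold
```

This only works because the sketch kept every raw measurement. A monitor that had stored just the number 37 for the 10-second threshold would have nothing left from which to work out the 6-second figure.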

The second way to store an accurate understanding of traffic is called a frequency distribution. Essentially, it involves making a “bucket” for each measurement, from which you can calculate thresholds and other good things. Here’s an example of that.

Frequency buckets

Storing performance data this way is difficult; it requires some pretty exotic math to be able to store it efficiently and make sense of it. But it allows people to say things like, “95 per cent of visitors received the page in under 7 seconds.” That’s a nice, meaningful, pithy statement. My stats professor would be proud.
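Here is a deliberately naive sketch of the bucket idea, using one-second buckets and the same invented 50 response times as a stand-in for real traffic. It is not how TrueSight stores data, and it skips the efficiency tricks mentioned above; the function names are hypothetical. It does show that once the buckets exist, any whole-second threshold or percentile can be answered after the fact.

```python
import math
from collections import Counter


def build_buckets(times, width=1.0):
    """Count response times per bucket; bucket n covers [n*width, (n+1)*width)."""
    return Counter(int(seconds // width) for seconds in times)


def percent_under_from_buckets(buckets, threshold, width=1.0):
    """Percentage of requests below threshold (assumed to be a multiple of width)."""
    limit = int(threshold // width)
    total = sum(buckets.values())
    under = sum(count for bucket, count in buckets.items() if bucket < limit)
    return 100 * under / total


def percentile_upper_bound(buckets, pct, width=1.0):
    """Smallest bucket boundary that at least pct per cent of requests fall under."""
    total = sum(buckets.values())
    needed = math.ceil(total * pct / 100)
    running = 0
    for bucket in sorted(buckets):
        running += buckets[bucket]
        if running >= needed:
            return (bucket + 1) * width
    return None


# 50 invented response times, in seconds.
times = [3.0] * 27 + [8.0] * 10 + [12.0] * 13
buckets = build_buckets(times)

print(percent_under_from_buckets(buckets, 10))
print(percent_under_from_buckets(buckets, 6))
print(percentile_upper_bound(buckets, 95))
```

The raw measurements can be thrown away once the buckets are built, and both thresholds are still answerable, along with statements like "95 per cent of requests completed in under 13 seconds" for this invented data. The price is resolution: answers are only as fine as the bucket width.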

When it came time to build TrueSight, I dusted off some of my old stats textbooks to be sure we did it right. And I’ve always struggled to explain why it’s important to do it this way—once people are familiar with percentile reports, they never go back; but explaining it is always difficult.

I should have left it to the guys at CMG. After my Connecticut presentation, one of the audience members walked up to the podium and handed me a scrap of paper. I’ve got it sitting on my desk. It says:

“H. Barry Merrill (MXG) on why averages are misleading and invalid: ‘The average person in the U.S. has one breast and one testicle.’”

Is virtualization the killer app for Real User Monitoring?


Tuesday, August 1st, 2006 Posted by: Alistair Croll

I’m a bigot. I think everything should be done by watching end users. It’s no big surprise—since that’s what we make. But at a deep, visceral level it seems foolish to me to try and guess how users are doing when we can just watch them.

For years, people have had to settle for approximations and estimations of end-user experience. The technology wasn’t there to watch them directly: It took too much horsepower; it was complicated; the protocols weren’t standardized. So they settled for test-it-yourself, or wait-for-the-phone-to-ring.

Nowadays, there’s no excuse: Real User Monitoring is affordable and transparent (we just announced a $20K box that installs in an hour.) And yet I still find myself coming up with analogies and evangelizing why everyone should watch their users, in real time, to determine whether things are working properly.

I think virtualization’s going to change all that for me.

If you run a data center or a web application, you’re about to be hit by wave upon wave of virtualization. And this can only help the case for real user monitoring: The complexity of many layers of indirection makes it nearly impossible to determine user health by looking at the components that serve up a web page.

Don’t believe the hype? Let’s look a little more closely at virtualization.

Any technology that tries to separate something from something else through a layer of abstraction is virtualized. If you have a LAN in your house, the little router connected to your telephone company has a Virtual IP address on the outside, and some (or many, if you’re a geek like me) real IP addresses on the inside. Your phone company doesn’t know how many PCs you have—if they did, they’d try to charge you for it. In the same way, one could consider a dual-core CPU as a form of virtualization: It tries to make a single CPU look like two.

Consider the following layers of virtualization that run in nearly every data center today:

  • TCP/IP sessions between a web browser and a server allow the client to pull in many objects on a page simultaneously, making many connections look like one to the user. This is particularly true in later versions of browsers with pipelining and persistence.
  • Load-balancers make many servers look like a single, larger machine.
  • Servers run the Java Virtual Machine or other services that spawn new threads on a single platform.
  • The operating systems manage memory to make each application think it has its own environment.
  • Storage Area Networks and Network-Attached Storage (SAN/NAS) make one disk look like many and many like one.

But all of these pale in comparison with what’s on the horizon.

  • Virtualization managers like those from VMware create a “master” layer (a hypervisor) that runs several independent machines. Each virtual machine makes the programs it’s running think they have their own computer. This is also true for open-source contender Xen, which Novell’s SUSE Linux is helping to mainstream (and this article shows just how difficult it is to do).
  • In a different approach, Microsoft announced their virtualization strategy—this one lets applications run virtually within the operating system.
  • Of course, there’s grid computing. Whether you’re doing some edge processing with Akamai or Netli, or actually relying on a grid for computing, some part of your application is virtualized.
  • And what’s a web service if not a function that doesn’t care where it’s run, turning a piece of software into a distributed mass of remote procedure calls?

After a few days of listening to virtualization presentations from the editors of Network Computing and from BMC’s research team, I drew the following schematic of all the layers of virtualization:

Layers of virtualization

With all of this complexity and layering, it becomes nearly impossible to measure health from within. Like a prisoner of the Matrix, if we’re living in a virtualized environment we can’t tell whether our frame of reference is reliable.

(Woah…)

But without reliable internal indicators, we can only measure health from outside the system. It’s no longer a question of whether or not we should monitor real users; it’s now a matter of not being able to get reliable data anywhere else.