
Coradiant

Archive for the 'Simplicity' Category

Correlating End-User Performance and Conversions


Monday, January 11th, 2010 Posted by: Sri Raghavan

Companies spend an inordinate amount of money setting up complex IT systems to manage and optimize service offerings to customers.  Often these management systems only look at a single metric, like conversions or performance. What if both could be correlated?

Imagine you are a high-traffic on-line ticket vendor for major events nationwide.  As the head of IT operations, you want to maximize conversions by ensuring that you can handle the traffic volumes generated by a major concert event or marketing push and that the response times for customer queries remain optimal. In spite of your best efforts, however, the latest concert promotion brought your site to its knees and some of your systems went down for a short period of time.  The impact of that outage on your users might be unacceptably slow response times or actual system errors in their browser.  In either case, if they risk losing out on good seats for the concert while waiting for your systems to recover, they will abandon your site in favor of a competitor that is more responsive and you will have lost that revenue. On the assumption that you could have sold 1000 tickets at $100 apiece during that outage, you could lose up to $100k of revenue, depending on the number of users affected.  That’s a lot of crumpets, as they say.
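The back-of-the-envelope estimate above can be written down as a small calculation. This is only a sketch: the function name and the `affected_fraction` parameter are invented for illustration, and the figures are the ones assumed in the scenario, not measured data.

```python
def revenue_at_risk(tickets_sellable: int, price: float, affected_fraction: float) -> float:
    """Upper-bound estimate of revenue lost during an outage.

    affected_fraction is the (assumed) share of would-be buyers who hit
    the outage and gave up, between 0.0 and 1.0.
    """
    if not 0.0 <= affected_fraction <= 1.0:
        raise ValueError("affected_fraction must be between 0 and 1")
    return tickets_sellable * price * affected_fraction


# Worst case from the scenario: every one of the 1000 buyers is affected.
worst_case = revenue_at_risk(1000, 100.0, 1.0)

# A milder, hypothetical case in which 40% of buyers are affected.
partial = revenue_at_risk(1000, 100.0, 0.4)

print(f"worst case: ${worst_case:,.0f}, partial: ${partial:,.0f}")
```

The point of the `affected_fraction` knob is the "depending on the number of users affected" caveat: without data on who actually hit the outage, the $100k figure is only a ceiling.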

Poor response times and unreachable sites are bad for revenue. But there is also a third reason why your site traffic could be impacted that is not caused by the performance issues described above.  It may be that your site content is unappealing to the average eye or that your offerings are priced out of reach for the average consumer.  This is the typical explanation that marketing gives to conversion dips, but it is clearly not the only root cause.

This is where Coradiant Analytics In A Box can help.  Our passive capture technology, combined with the included Google Urchin software, collects performance and availability metrics about your site while it tracks user behavior. This lets you look at conversions (e.g. ticket purchases) in the context of performance and availability, and so categorize conversion dips into those that are infrastructure related and those that are marketing and content related.  For example, your analytics data tells you that 22% of your users bought tickets between 12 and 2 PM on Monday. You also see that this is about 35% less than the norm for a corresponding period.  Coradiant allows you to check whether this dip coincides with an increase in page load time or with availability problems.
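To make the idea of categorizing dips concrete, here is a minimal sketch of the kind of rule one could apply once conversion and performance data sit side by side. It is not how Analytics In A Box works internally; the `HourlySample` type, the field names, and all three thresholds are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class HourlySample:
    period: str
    conversion_rate: float  # fraction of visitors who bought
    baseline_rate: float    # norm for a corresponding period
    page_load_s: float      # average page load time, in seconds
    error_rate: float       # fraction of requests that failed


def classify_dip(sample: HourlySample,
                 dip_threshold: float = 0.25,
                 slow_page_s: float = 4.0,
                 max_error_rate: float = 0.02) -> str:
    """Label a period as normal, an infrastructure dip, or a marketing/content dip."""
    if sample.baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    drop = (sample.baseline_rate - sample.conversion_rate) / sample.baseline_rate
    if drop < dip_threshold:
        return "normal"
    # Conversions dipped: blame infrastructure only if performance or
    # availability was also degraded during the same period.
    if sample.page_load_s > slow_page_s or sample.error_rate > max_error_rate:
        return "infrastructure"
    return "marketing/content"


# 22% conversions against a norm of about 34% is roughly a 35% drop.
slow_monday = HourlySample("Mon 12-2 PM", 0.22, 0.34, page_load_s=7.5, error_rate=0.01)
fast_monday = HourlySample("Mon 12-2 PM", 0.22, 0.34, page_load_s=1.8, error_rate=0.001)

print(classify_dip(slow_monday))
print(classify_dip(fast_monday))
```

The same dip gets two different labels depending on what the site was doing at the time, which is exactly the distinction that conversion data alone cannot make.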

End-user monitoring coupled with analytics provides a powerful solution that helps to maximize customer visits and conversions.  It places those responsible for business results and for performance in the driver’s seat by providing them with the data they need to understand the real business impact of system outages, thus enabling them to provide the best possible service to their Line-of-Business managers.

How can Marketing free itself from IT bondage and privacy regulation?


Sunday, December 13th, 2009 Posted by: Jonathan Ginter

I feel real sympathy for Marketing departments.  They are saddled with the enormously difficult task of profiling users in order to achieve more effective marketing efforts.  In order to do that, they must understand who the user is, what they did on the site and how they reacted to various marketing efforts.  Up to now, they have been relying on traditional Web Analytics solutions to get that data.  However, the price for that data is the insertion of page tags and tracking cookies.  Cloud-based services add on an additional price – the exporting of analytic data out to the cloud.  All of that might seem like a small price, but it is actually a true Faustian deal.

To start with, Marketing must subject itself to the busy timelines of Development, QA and IT whenever it wants to change its tagging or cookies.  Since those departments are often busy rolling out new features, Marketing often has to wait weeks or even months to get its new data.  In many cases, Marketing discovers a problem with the page tagging during the course of a campaign and is unable to roll out a fix quickly enough. In companies where this problem is recognized, the situation is reversed: Marketing is allowed to hijack the roll-out process with an emergency patch to its analytics tagging, often delaying the delivery of important new features.  Both Marketing and IT would benefit if this link could be severed.

The other problem that Marketing is facing comes from Europe, where a wave of privacy regulation is forcing existing Web Analytics solutions to run for cover, leaving Marketing departments with little to help them.  Germany has passed very strict laws prohibiting the use of page tagging and tracking cookies without the user’s consent.  Moreover, shipping analytics data to a hosted service for processing is specifically forbidden.  Although privacy is often talked about in every part of the world, Europe is the first to have passed these kinds of laws about it – a trend that could easily spread outside EMEA.

The irony is that most of the data that Marketing requires is already contained within the traffic stream before any tagging or tracking takes place: where the user is from, what browser and OS they use, which ISP they used, where they came from, where they went, what they looked at, what they bought, etc.  It is either embedded in the HTTP protocol, available through the web server’s natural ability to maintain a stateful application, or directly within the request or response content.
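As a rough illustration of how much a single request already carries, the sketch below pulls a few of those facts out of a raw HTTP request using only the Python standard library. The function name, the sample request, and the host names are made up for the example, and this is not a description of Coradiant's capture technology.

```python
from urllib.parse import urlsplit, parse_qs


def extract_visit_facts(raw_request: str, client_ip: str) -> dict:
    """Pull analytics-relevant fields out of one raw HTTP request."""
    head = raw_request.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    method, target, _version = lines[0].split(" ", 2)

    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    url = urlsplit(target)
    return {
        "client_ip": client_ip,                      # where the user is from
        "user_agent": headers.get("user-agent", ""), # browser and OS
        "referrer": headers.get("referer", ""),      # where they came from
        "host": headers.get("host", ""),
        "method": method,
        "path": url.path,                            # what they looked at
        "query": parse_qs(url.query),
    }


sample_request = (
    "GET /tickets/search?event=concert&city=Boston HTTP/1.1\r\n"
    "Host: tickets.example.com\r\n"
    "User-Agent: Mozilla/5.0 (Windows NT 6.1) Firefox/3.5\r\n"
    "Referer: http://www.example.org/promo\r\n"
    "\r\n"
)

facts = extract_visit_facts(sample_request, "203.0.113.7")
print(facts["path"], facts["query"], facts["referrer"])
```

None of these fields required a page tag or a tracking cookie; they are part of what any browser sends.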

Consequently, as announced last week, Coradiant has teamed up with Google to create our Analytics In A Box (AIB) solution.  This revolutionary new product uses Coradiant’s existing technology to passively process all user traffic to and from the web site, producing full-featured web analytics data that remains in-house.  Instead of inserting page tags or using special cookies, Marketing can define AIB rules that will extract the information directly from the traffic stream. This approach will allow Marketing to define new metrics whenever they want to, even in the middle of a campaign during peak hours.  The solution also refreshes its data every hour, providing Marketing with prompt results from any changes.  No other department has to be involved.  And because the data is kept in-house, it never has to be shipped to a hosted service, which avoids the privacy problems described above.

As Bogey said, “this could be the start of a beautiful friendship”.

What makes a “must-have” IT product?


Tuesday, February 26th, 2008 Posted by: Tony Tissot

Patrick Gardella, of Discovery Communications, recently spoke to Network World and said about Coradiant TrueSight, “Basically, it allows us to identify very rapidly what is happening with actual users on the site, and then it helps us debug those things.”

“With our huge online shopping site — and other Web systems that require major user interaction – users have problems. When that happens, we get e-mails saying, ‘Your site is broken’ and not much more. The reason I like Coradiant is that it offers a very simple, easy-to-use appliance that can find out what’s happening with those individual users, as well as how many other people are having those same problems.”

For the full article see: Network World 

How Microsoft broke Skype by accident


Monday, August 20th, 2007 Posted by: Alistair Croll

Skype broke.

This should serve as a lesson to us all. Sometimes the old ways are the best, and we ignore them at our peril.

The folks at Skype said:

On Thursday, 16th August 2007, the Skype peer-to-peer network became unstable and suffered a critical disruption. The disruption was triggered by a massive restart of our users’ computers across the globe within a very short timeframe as they re-booted after receiving a routine set of patches through Windows Update.

Yep, that’s right. Microsoft sent out a patch, and it brought down Skype.

TCP is a great example of a simple, elegant implementation. Admittedly, it is breaking at the seams: it doesn’t support enough ports; it’s a jack-of-all-trades transport that isn’t particularly efficient; it requires a lot of computation; and it’s redundant in a lot of encryption and compression systems. Companies like Netli (acquired by Akamai) built businesses on the inefficiency of TCP. Making TCP efficient is a major factor in how Application Front End products (like Citrix’s NetScaler) speed up sites and reduce the load on servers.

But TCP is elegant. One of the things it does best is recover from problems. Wikipedia tells us:

“Modern implementations of TCP contain four intertwined algorithms: Slow-start, congestion avoidance, fast retransmit, and fast recovery (RFC2581).”

Ethernet does this well, too. When a collision occurs, senders keep transmitting long enough to make sure everyone has detected it, then back off for a random length of time. From Wikipedia, again:

“This can be likened to what happens at a dinner party, where all the guests talk to each other through a common medium (the air). Before speaking, each guest politely waits for the current speaker to finish. If two guests start speaking at the same time, both stop and wait for short, random periods of time (in Ethernet, this time is generally measured in microseconds). The hope is that by each choosing a random period of time, both guests will not choose the same time to try to speak again, thus avoiding another collision. Exponentially increasing back-off times (determined using the truncated binary exponential backoff algorithm) are used when there is more than one failed attempt to transmit.”

Think about that for a second. The guys who built these protocols realized that congestion would happen, and built models for dealing with unpredictable situations by backing off a random time, and for detecting congestion and avoiding it. And this was back in the day when there were only a few nodes on the Internet. Yet they function reasonably well even today.
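The truncated binary exponential backoff mentioned in the Wikipedia quote is simple enough to sketch in a few lines. The constants below are the ones commonly cited for classic 10 Mbit/s Ethernet (a 51.2 microsecond slot time, an exponent capped at 10, and giving up at 16 attempts); treat them, and the function name, as assumptions of this sketch rather than a reference implementation.

```python
import random

SLOT_TIME_US = 51.2  # slot time for 10 Mbit/s Ethernet, in microseconds
MAX_EXPONENT = 10    # the exponent stops growing here (the "truncated" part)
MAX_ATTEMPTS = 16    # after this many failed attempts the frame is dropped


def backoff_delay_us(collisions: int, rng: random.Random) -> float:
    """Delay to wait before retrying, after the given number of collisions."""
    if collisions < 1:
        raise ValueError("collisions must be at least 1")
    if collisions >= MAX_ATTEMPTS:
        raise RuntimeError("too many collisions; giving up on this frame")
    k = min(collisions, MAX_EXPONENT)
    # Pick a whole number of slots uniformly from 0 .. 2**k - 1.
    slots = rng.randrange(2 ** k)
    return slots * SLOT_TIME_US


rng = random.Random()
for n in (1, 2, 3, 10, 15):
    upper = (2 ** min(n, MAX_EXPONENT) - 1) * SLOT_TIME_US
    print(f"after collision {n}: wait {backoff_delay_us(n, rng):.1f} us (max {upper:.1f} us)")
```

Each further collision doubles the range the random wait is drawn from, so two stations that keep colliding become less and less likely to pick the same slot again.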

So why didn’t Skype work properly? Without getting into too many details, the folks at Skype explained:

Normally Skype’s peer-to-peer network has an inbuilt ability to self-heal, however, this event revealed a previously unseen software bug within the network resource allocation algorithm which prevented the self-healing function from working quickly.

There are two important lessons to be learned here:

  • First, it’s critical to look at traffic volumes. Many of the people who buy our UPM equipment used to rely on synthetic testing to monitor their sites. Often, they couldn’t answer simple questions like, “how many users do you have on your site today?” Their marketing department might know, through web analytics tags, how many sessions were active; but there was no way to stitch together traffic levels and performance.
  • And second, the Skype incident is a great example of how complex systems can fail in unexpected ways, and how everything on the Internet is intertwingled. Microsoft’s practice of updating and automatically rebooting billions of computers independent of owner control creates tremendous traffic spikes — and this is true of web-connected services such as antivirus updates and desktop plug-ins. But the impact of these spikes isn’t tracked or understood.

Understanding the relationship between load and performance is critical for anyone running a production web application. Applications will break; and without the right information at your disposal, you won’t be able to detect problems or fix them effectively.
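Once traffic levels and response times are recorded side by side, a first check on that relationship can be as simple as a correlation coefficient. The sketch below computes a plain Pearson correlation by hand; the sample numbers are invented, and a real analysis would want far more data points and a look at the shape of the curve, not just one number.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


# Hypothetical measurements taken during a traffic spike.
requests_per_minute = [100, 200, 400, 800, 1600]
avg_response_s = [0.5, 0.6, 0.9, 1.8, 4.0]

r = pearson(requests_per_minute, avg_response_s)
print(f"load vs. response time correlation: {r:.2f}")
```

A value close to 1 here says that response time climbs with load, which is the kind of question synthetic tests alone cannot answer because they do not see real traffic volumes.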

With billions of nodes on the Internet and millions of changes a day to production systems, Sod’s Law (a variant of Murphy’s law) is definitely true: “Anything that can go wrong, will.” But it’s also possible to invoke Hanlon’s razor, a corollary to Murphy, that says, “Never assume malice when stupidity will suffice.”

Why movies teach us bad things about IT tools


Monday, August 6th, 2007 Posted by: Alistair Croll

I watched the Bourne trilogy this weekend.

I have to confess that I love the series. One of the things I most admire about it is that the hero actually thinks. I mean, in the first film, he grabs a radio off an opponent, rips a floor map off a wall, and uses that to evade capture and get out of the building. Sure, the films have some crazy car chases (which, by the way, result in a lot of accidents — how unusual!) And there are flight scenes and explosions. But they’re always reasonable.

It’s sad that I’m so impressed by someone acting wisely and normally. As I thought more about it, it occurred to me that Hollywood fills films with convenience. They do this so much that cleverness and pragmatism are refreshing. We’re so used to the MacGuffin that when there isn’t one, we’re actually pleasantly surprised.

Peter’s Evil Overlord List is a great, and growing, list of silly conceits from movies. It does a better job than I can of making my point. Some examples:

  1. My Legions of Terror will have helmets with clear plexiglass visors, not face-concealing ones.
  2. My ventilation ducts will be too small to crawl through.
  3. Shooting is not too good for my enemies.
  4. One of my advisors will be an average five-year-old child. Any flaws in my plan that he is able to spot will be corrected before implementation.
  5. No matter how well it would perform, I will never construct any sort of machinery which is completely indestructible except for one small and virtually inaccessible vulnerable spot.
  6. I will never build only one of anything important. All important systems will have redundant control panels and power supplies.
  7. For the same reason I will always carry at least two fully loaded weapons at all times.
  8. Once my power is secure, I will destroy all those pesky time-travel devices.
  9. If I have massive computer systems, I will take at least as many precautions as a small business and include things such as virus-scans and firewalls.
  10. No matter how many shorts we have in the system, my guards will be instructed to treat every surveillance camera malfunction as a full-scale emergency.

So what does this have to do with IT? Well, often demos are so convenient they lull buyers into a false sense of security. We want to accept the convenient explanations, because they make things simple.

I remember movies from the seventies in which the bad guy locked Our Hero in the sauna, hoping he’d steam to death.

Oh, come on. How many saunas have doors that lock from the outside?

For that matter, how many data centers have big “self destruct” buttons, clearly marked? How many security guards have nametags without photos on them? How many times is the back door to the secret lair conveniently ajar? None of these things happen in the real world; but they happen in movies, and we accept them. Software demos do the same thing. We see a demo, and it looks fine. We want to believe it can save us. We’re willing to accept the coincidences. Salvation is real and imminent.

But reality is a lot more bleak. The tools are seldom as straightforward as they were in the demo. In our field — user performance management for online applications — there are plenty of examples of how things in the real world aren’t nearly as convenient.

Here’s my list of ten differences between the demo and the real world for web monitoring technologies.

1. There’s always a security problem

Whenever you try to deploy new software, there are always security issues. Applications require ports for communication, and have to be tested by the security department. Capturing user data means compliance and oversight; depending on your industry, you may have to store it for seven years. And physical devices may be subject to attacks or may run an unsupported operating system. Good, secure tools that work out of the box without annoying your security officer are worth their weight in gold.

2. URIs aren’t sensible

Sites don’t always have easy-to-read names. Sure, Wikipedia might have http://en.wikipedia.org/wiki/Evil_Overlord_List as a URL that’s pretty easy to parse. But more often than not, it’ll be something like http://www.ifaw.org/ifaw/general/default.aspx?splash&oid=17767 (which, by the way, is the home page for the International Fund for Animal Welfare — but you wouldn’t know it from the URL.) Assume that for something to be useful, it has to be flexible enough to accommodate the quirks of your site’s structure.
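One way to picture that flexibility is a small table of rules that maps opaque URLs to readable page names. The sketch below is only an illustration: the rule format, the page names, and the `name_page` function are invented here, and the two example URLs are the ones quoted above.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical rules, most specific first: a path prefix plus optional
# query parameters that must be present with the given value.
PAGE_RULES = [
    {"path_prefix": "/ifaw/general/default.aspx", "params": {"oid": "17767"},
     "name": "IFAW home page"},
    {"path_prefix": "/ifaw/general/", "name": "IFAW general page"},
    {"path_prefix": "/wiki/", "name": "Wiki article"},
]


def name_page(url: str, rules=PAGE_RULES) -> str:
    """Return the readable name of the first rule that matches the URL."""
    parts = urlsplit(url)
    # keep_blank_values keeps bare flags such as "splash" in the query.
    query = parse_qs(parts.query, keep_blank_values=True)
    path = parts.path.lower()
    for rule in rules:
        if not path.startswith(rule["path_prefix"]):
            continue
        wanted = rule.get("params", {})
        if all(value in query.get(key, []) for key, value in wanted.items()):
            return rule["name"]
    return "unclassified"


print(name_page("http://www.ifaw.org/ifaw/general/default.aspx?splash&oid=17767"))
print(name_page("http://en.wikipedia.org/wiki/Evil_Overlord_List"))
```

The rules live in data rather than in code, so accommodating a new quirk of the site means adding a row, not redeploying anything.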

3. The things you’re testing change

Nothing is static. We have customers whose websites’ code changes daily. For them, a simple test isn’t really relevant; it’s useful for a day. If a key function is a constantly moving target, make sure your tools can stick to that target like glue. Otherwise, when something breaks you’ll be looking at yesterday’s data. Ask yourself whether a tool can adapt quickly to changes in the site.

4. All functions aren’t equal

The typical website has dozens of functions, from login to reporting to search to account management. We don’t expect all of them to take the same time. Logging in should be relatively quick; but generating a detailed report could take a while. And we’re okay with that. Unfortunately, performance measurement isn’t. Most web performance tools have a “one size fits all” approach to thresholding. This means that you’re either flooded with false alarms (which you’ll turn off) or missing important ones. Does the monitoring technology recognize the context of a function and a user, and automatically adjust to different functions?
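The difference between the two approaches fits in a few lines. In this sketch the function names and every threshold value are assumptions made up for illustration; a real tool would presumably learn or configure them per site.

```python
# Hypothetical per-function response-time thresholds, in seconds.
THRESHOLDS_S = {"login": 2.0, "search": 3.0, "report": 30.0}
DEFAULT_THRESHOLD_S = 5.0


def is_slow(function_name: str, elapsed_s: float) -> bool:
    """Context-aware check: each function is judged against its own threshold."""
    return elapsed_s > THRESHOLDS_S.get(function_name, DEFAULT_THRESHOLD_S)


def is_slow_one_size(elapsed_s: float, threshold_s: float = 5.0) -> bool:
    """The "one size fits all" check: a single threshold for everything."""
    return elapsed_s > threshold_s


measurements = [("login", 4.0), ("report", 12.0), ("search", 1.0)]
for name, elapsed in measurements:
    print(f"{name:7s} {elapsed:5.1f}s  per-function={is_slow(name, elapsed)}  "
          f"one-size={is_slow_one_size(elapsed)}")
```

With a single 5-second threshold, the 4-second login goes unnoticed while the perfectly normal 12-second report raises an alarm; the per-function table gets both right.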

5. Every site breaks in its own special ways

I used to have a bounty for broken sites. Over the years, people have sent me hundreds of screenshots of applications breaking in new and unexpected ways. (To this day, one of my favourites is http://www.starwars.com/welcome/404.html.) Some sites try to hide their errors behind polite apologies. Others give detailed error information on the page. Some errors don’t even produce data: a premature server reset or excessive TCP retransmissions, for example, happen outside the realm of HTTP, but they are still a problem. What if your site breaks in ways that aren’t in the demo you’re seeing?

6. No matter what reports you’ve got, you don’t have the right one

You can never tell what you’re going to need to look at. Sure, it might be useful to see which server is busiest, which browser is slowest, or which page has the most errors. But sooner or later you’re going to get a “complicated” question: “Are Firefox browsers from China who search by zipcode generating more errors?” (seriously, one of our customers needed to know this.) If the tool can only slice data in predefined ways, you’re going to be stuck guessing. How flexibly can you focus the analysis of the tool on specific segments of traffic? Can you drill into it?
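A question like the Firefox-from-China one is easy to answer if every hit is kept as a record with its attributes, so that any combination of them can be used as a filter after the fact. The sketch below shows the idea; the field names and the sample hits are invented, and it counts HTTP 5xx statuses as errors purely as a simplifying assumption.

```python
def error_rate(hits, **criteria):
    """Share of hits with a 5xx status among those matching all criteria.

    Returns None when no hit matches, so "no data" is not mistaken for
    "no errors".
    """
    selected = [h for h in hits
                if all(h.get(key) == value for key, value in criteria.items())]
    if not selected:
        return None
    errors = sum(1 for h in selected if h["status"] >= 500)
    return errors / len(selected)


# Hypothetical captured hits.
hits = [
    {"browser": "Firefox", "country": "CN", "search_type": "zipcode", "status": 500},
    {"browser": "Firefox", "country": "CN", "search_type": "zipcode", "status": 200},
    {"browser": "Firefox", "country": "US", "search_type": "zipcode", "status": 200},
    {"browser": "IE", "country": "CN", "search_type": "zipcode", "status": 200},
    {"browser": "Firefox", "country": "CN", "search_type": "keyword", "status": 200},
]

overall = error_rate(hits)
segment = error_rate(hits, browser="Firefox", country="CN", search_type="zipcode")
print(f"overall: {overall:.0%}, Firefox/CN/zipcode: {segment:.0%}")
```

Because the criteria are arbitrary keyword arguments, nobody had to anticipate this particular slice when the data was collected.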

7. The installation of agents always has issues

The software agent is the IT equivalent of a dentist saying, “trust me, this won’t hurt a bit.” Agents need management and updating. They have to transmit data, and present points of attack. They’re silent when the servers they run on are broken. They generate network traffic. And they’re sandboxed, trapped within the environment on which they run.

Sure, agent-based monitoring is a necessary evil. But it should be used judiciously, and you need to deploy agents with a recognition that things won’t be as rosy as they sound. You’ll have to lobby for their deployment. You’re going to jump through hoops to get them communicating with your management systems. When you’re looking at a demo that has complete visibility, spend a lot of time on the organizational cost of that visibility.

8. Editing tags has hidden costs and limited visibility

The web alternative to agents is tags. These embedded pieces of Javascript provide some monitoring by asking the browser to report on performance and errors. Javascript and tagging are a big headache. For marketing departments, tagging is an invaluable tool, but Gartner claims that maintaining tags and scripts is the biggest downside to web analytics.

Using tags for monitoring sounds easy in principle. In practice, however, it’s fraught with peril. Javascript collection makes the assumption that the page loaded properly (otherwise, how did you get the Javascript?) It also assumes that the client will run the script (which isn’t the case for many phones, for non-HTML content, and for users with privacy settings turned on.) And the client is sandboxed: For security reasons, the Javascript on the client doesn’t have access to the networking stack or facts about the network. What’s worse, the act of including Javascript can often slow down the page load time. Consider the organizational cost and the amount of technical information you’ll get when things go wrong.

9. Users don’t follow simple paths

Most e-commerce sites like to think they have simple transactions. Users put things in a cart, check out, pay for their goods, and confirm the shipping address. The reality is, users don’t follow prescribed routes. They meander around the site, going backwards and forwards, opening new tabs, changing their minds. For IT operations, what matters more is the health of key steps in a process, and which users encountered problems at those steps. Don’t assume users will do what you expect.

10. It’s always expensive to run things

Many studies have shown that the real cost of IT is operational. Eric Dean, CIO of United Airlines, told Forbes that for every dollar he spends on a package, he must spend $5 to $7 more on consulting to make it work. Network Appliance estimates that for every dollar of storage, users spend $5 to $7 to manage it (though their tools claim to get that down to $2 to $3, partly due to their appliance focus). And the Seybold Group estimates that with even standard packaged software, for every dollar spent on software a company spends $5 on consulting, systems integration, and custom programming. So when you’re seeing an IT offering, ask yourself: How much will this cost to run? Will it take care of itself?

Back to the real world

Demos often feature nice, simple sites where users are well behaved, installation is assumed, reports show the right data, and security’s not an issue. That’s the IT sales equivalent of the hero defusing the bomb with two seconds left, then finding an escape pod. It’d be nice, but it’s no way to run a business.

Next time you’re evaluating IT tools, think of the cheap tricks that movies pull to conveniently move the plot along. Then think about how much of what you’re seeing is conveniently tweaked for an ideal story.

We used to run websites, so when we started making tools for web operators, we vowed never to make things that looked better in the demo. In fact, we don’t have demo boxes. We have production units that prospective customers buy. They almost never come back. We don’t really believe in demos: If the product is going to be useful, you should be using it from day one.

In short: If you can’t get results from it the day you plug it in, it’s probably not going to get used once you sign the check.

I’m going to finish this off with a joke, even though you’ve probably heard it and I may have already given away the punchline.

A software salesperson is killed trying to save a schoolbus full of orphans. St. Peter says, “I’m a little unsure what to do. On the one hand, you gave your life so others could live. On the other hand, you sold software that promised far more than it could actually deliver in the real world. So I don’t know whether you go to heaven or hell.”

The salesperson replies, “well, what’s the difference between the two?”

St. Peter answers, “I’m willing to let you visit both places briefly, if it will help your decision.”

First, St. Peter sends the salesperson to hell. And it’s beautiful! Sunny, clear, with attractive people enjoying delicious food, frolicking in the ocean.

“This is great!” says the salesperson. “If this is hell, I really want to see heaven!”

St. Peter snaps his fingers and they’re in heaven. It’s high above fluffy clouds, with angels singing and playing soft, Enya-like music.

The salesperson thinks for a minute, then says, “I guess I’ll take hell.”

Two weeks later, St. Peter decided to see how his charge was doing. When he got there, he found the poor salesperson in chains, hair singed off, screaming as he was tormented by fireball-tossing imps and succubi.

“How’s it working out?” he asked.

The salesperson sobbed, “this is nothing like the hell I visited two weeks ago! What happened?”

“Oh, I’m sorry,” said St. Peter. “That was the demo.”

I guess the moral of the story is, there’s no substitute for seeing the real thing.

Don’t underestimate the importance of products that do what they say they do, well, the day you get them.

Simple is good


Monday, February 12th, 2007 Posted by: Alistair Croll

We used to be a managed service provider.

Before we started building TrueSight, a lot of what we did involved sitting in chairs, trying to fix things, and answering the phone. We had boxes we relied on daily. Simple things — like a power strip that you could SSH into and reboot a machine through — saved us thousands of dollars and untold hours on planes.

And we had software. Complicated, hard-to-use, never-implemented software. I personally authorized a huge software implementation — half a million dollars — that we never got running (I won’t tell you the name of the company, to spare myself the lawsuits.) It was painful, and wasteful, and led to a ridiculous amount of finger pointing. In the end, a smart employee built something in Lotus Notes in his spare time that let us remotely manage an entire rack of multivendor equipment.

One of the things we’re brutally aware of, then, is complexity. When we started making equipment for data centers, we vowed to keep it simple. It’s one of the reasons we make things in an appliance format. We think that user performance monitoring should be transparent, secure, and as easy to maintain as a load-balancer or router.

It’s also one of the reasons you can see, at a glance, an entire user’s session and drill into it. And it’s one of the things that’s fueling the creation of a wide range of simple, intuitive visualizations (we call them Lenses, and we haven’t really shown them to anyone but our customers yet, but everyone who sees them can’t stop staring and clicking.) More on these in a future post.

So with web operations scars all over me, it makes me really happy when I see quotes from our customers that suggest we’re on the right track. Someone forwarded me a note on Friday that made my weekend:

“Every time a client calls up, the first thing someone says is, ‘Are they on Coradiant?’ So it’s become sort of the standard for troubleshooting. It went from sort of an exploratory product to one that has become almost mission-critical in regards to its usage … I mean, I’ve seen a lot of products that tout all their abilities, but this is one that actually truly delivers and delivers out-of-the-box, which is really the best part of it.”

Simplicity is great. But simple doesn’t mean limited features. In fact, we were talking to a very big software company last week and they looked at me an hour into the discussions and said, “this thing is like Excel: I can start using it in minutes, but there’s just so much depth and detail here I can get it to do pretty much anything.” (Although you can’t use it to play pac-man, as this completely unnecessary Excel macro does.)

I’ve often thought that product management isn’t about giving people what they ask for. It’s about not giving them what they don’t really need.

This year, we’re working on communicating that simplicity to people so they can grasp what we do and understand how useful it is. This video (about the iPod, but apparently created by folks in Redmond) is a great example of how to position for simplicity.

Part of making a simple product is understanding the problem you’re trying to solve really, really well. We spend a lot of time agonizing over small details and using smart defaults so our end users don’t have to. A lot of times, we don’t give people choices; instead, we make a choice for them. Choices, it turns out, are the enemy of great software. Every choice doubles the number of test cases you’ve got. Great technology is smart enough to make its own choices, and let the advanced users adjust those choices if they feel like running with scissors.
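The claim that every choice doubles the number of test cases is easy to verify for yes/no options by enumerating the combinations. In this sketch the option names are made up, and it assumes each choice is an independent on/off switch.

```python
from itertools import product


def configurations(options):
    """Return every on/off combination of the given yes/no options."""
    return [dict(zip(options, values))
            for values in product([False, True], repeat=len(options))]


three = configurations(["compress", "cache", "verbose"])
four = configurations(["compress", "cache", "verbose", "strict"])

print(len(three), len(four))
```

Three options give 8 configurations to test and a fourth brings that to 16; by ten options the count is 1,024, which is why a smart default that removes a choice is worth so much.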

A friend showed me fellow Montrealer Andy Nulman’s blog on surprise — it’s great reading if you’re a marketer — and in his first post he claimed that:

“I have always shouted that consumers don’t know what they want. Well, I lied.
They know what they want.
They WANT to be led.
They WANT us to lead them.
They WANT to follow.
And they WANT to be surprised.”

That might sound a bit arrogant. And I’m confident that a lot of technology companies think it doesn’t apply to them: They’re selling non-consumer products, so their buyers make rational, economic buying decisions. Right?

Wrong.

Most people are inundated with rational pitches. I’m with Andy: Provided that the surprise includes novel, efficient, easy-to-grasp solutions to known problems, web operators are OK with us innovating on their behalf. Lots of our customers rely on us to plow the snow on their behalf; in other words, they don’t have time to figure out where web performance is going, so they expect us to do so.

Having sat in their seats for a few years, we feel their pain.