Coradiant

Archive for the 'End-User Experience Management' Category

Your site’s performance is important to Google


Wednesday, May 5th, 2010 Posted by: Jonathan Ginter

Poor performance can now degrade your business in an even more real and meaningful way.  Recent changes by Google will allow site performance to affect whether traffic is driven to your site.  This places a new urgency on the ability to accurately measure performance in terms of the end user.

“Time is ticking out” by flickr user Mao Lini. Used under Creative Commons license.

Recently, the Mashable blog reported on a decision by Google to add performance as another deciding factor in how they rank their search results.  This marks a significant milestone in which performance will affect your site’s ability to attract visitors.  Essentially, the faster sites will receive more attention.  By taking this action, Google hopes to shine a spotlight on performance and drive better overall development practices.  I have no doubt that this is a direct result of Steve Souders’ move to Google from Yahoo, where he had been busy leading the YUI best practices effort that produced YSlow.  Since that time, Google has become much more of an advocate for web performance.  Here’s a direct quote from their blog that sums up their current perspective:

“We encourage you to start looking at your site’s speed … not only to improve your ranking in search engines, but also to improve everyone’s experience on the Internet.”

Essentially, Google has now made performance an important factor in driving business to your site.  At the same time, performance is one of the hardest things to accurately and properly measure for web-based traffic.  Google suggests a number of very good tools, most of which run inside your browser.  These tools look at performance in terms of the end user, which is now a recognized best practice.  However, they only measure the performance of your site while you are actively browsing it.  They won’t tell you anything about how your site performs for others or how it performs during the other 99% of the time when you are not personally measuring it with your browser.  Synthetic testing suffers from the same drawbacks.

More importantly, these approaches miss the elusive problems that affect specific people or that occur at specific times.  Nor will they find the problems that only occur under specific circumstances.  The fact is, these elusive issues are the most common.  They are the ones that plague every web site administrator because they are hard to find and nearly impossible to reproduce.  They eat up days of investigation time and are the biggest destroyer of public confidence in your site.  To find these problems, you need to be watching ALL traffic on your site every second of every day.  Moreover, you must do so while maintaining the end user perspective.

“Houston we have a problem...” by flickr user Mihael Mafy. Used under Creative Commons license.

There are very few solutions out there that can do this effectively.  Consequently, we are justifiably proud of our TrueSight product line and its ability to tackle this very difficult problem.  Once installed in your data center behind your firewall, TrueSight is able to monitor all traffic from your load balancer all the way back to your database and third-party tiers – from web request right down to code, SOA and SQL calls.  We auto-discover new applications and web servers as they are deployed without any need for additional configuration.  Moreover, our monitoring solution continues to perform its duties, even as you gradually virtualize your infrastructure or integrate back-end cloud services.

In fact, using a unique ground-breaking integration, we can provide full visibility of any deployments using Akamai’s Application Delivery Assurance services, like acceleration or caching.  Our unique relationship with Akamai has also recently led to the first co-developed solution that is being offered directly from Akamai for managing their services.

Furthermore, our technology is capable of providing accurate overall performance measurements for mashups or hybrid solutions.

If you have blind spots in your performance monitoring, you will not be protected from the negative consequences of this new trend.  If you lack insight on your end users’ actual experience on your site, you should seriously start planning to acquire it.

Akamai and Coradiant show where Cloud application acceleration brings real value


Tuesday, April 27th, 2010 Posted by: John Overton

Akamai and Coradiant have developed a practical approach to show how performance across the Cloud produces real value. In an industry first, Akamai is offering cloud visibility assurance by making the jointly-developed Coradiant TrueSight Edge for Akamai available to their APS Enterprise customers.  Akamai customers can understand how effective Web application delivery is to any area around the globe, and gauge where and when to implement Akamai application acceleration services to normalize end-user experience across the globe.

This marks an important milestone for the cloud. How effective is it? How is it supporting my business?

Coradiant TrueSight Edge for Akamai includes a software-based virtual appliance that is easy to deploy in the customer’s data center. Interactive, built-in dashboards and detailed statistics let IT staff monitor the end user performance of their applications. It also provides a side-by-side comparison of the performance of applications delivered with and without Akamai.

Coradiant TrueSight Edge for Akamai offers a detailed view of traffic and performance metrics with interactive dashboards and geographic drill-down capabilities. The product is designed to help enterprises gain greater visibility into end user performance to ensure a consistent experience and better adoption of Web applications across the globe.

Key benefits include:

•  Executive-level global map displays with bubble charts to reflect critical business delivery and response
•  Dynamic views of response times and traffic volumes (real-time and historical)
•  Playback to help assess problems, and for planning and delivery optimization
•  “Vanity Displays” for lobbies, demo centers and employees
•  Comparison so customers can see the Akamai acceleration benefits for all users

Perhaps the most striking feature is the intuitive visual displays. Displays reveal the overall health and the quality of the end-user experience of a global application. Easy-to-interpret dashboards show traffic patterns and service level achievement around the globe. A scrolling traffic ticker displays relative changes in traffic volume or service levels over the last five minutes.

These displays give executives and line-of-business managers an instant view of delivery success for live application traffic and may also be used to view historical data for up to 30 days. Marketing can also assess campaign interaction and balance the need for delivering rich content while maintaining good performance and a good user experience.

Accelerate your online business globally with confidence with Coradiant TrueSight Edge for Akamai.

Cloud Connect Conference – Managing Cloud Environments Brings Challenges


Monday, April 12th, 2010 Posted by: Hon Wong

I had the opportunity a few weeks ago at the Cloud Connect conference to co-chair a half-day workshop on Cloud Performance Optimization. The consensus from the mix of site operators, content providers and vendors is that monitoring and measuring the performance of cloud applications is critical.

“Cloud-served” applications can be a complex mix of content and application logic that is served from multiple locations. Unlike “conventional” client-server applications of the past, no single set of server- or application- oriented performance optimization metrics would be relevant for cloud-served online applications.  In the cloud, the only constant is where the application comes together – at the end user’s browser.

Often the cloud-served portion of an application is outside of a site operator’s direct control, even though the operator is ultimately responsible for the delivery performance of that content. A universal topic of discussion was how to be sure that users are getting the application performance and availability that keeps them satisfied.

Managing cloud environments brings many challenges that traditional monitoring options can’t solve.  The metrics they measure, such as CPU, memory and disk utilization or network bandwidth, aren’t meaningful in the virtualized cloud computing model as far as application performance is concerned.

Ultimately, the goal is to overcome the visibility gap between IT executives and their web applications no matter how the applications are served. What’s abundantly clear is that understanding the user experience provides a single point with which to judge how effective sites are – and understand how effective the cloud actually is. Knowing the user perspective assists in governing IT effectiveness at the senior executive level and also helps find problems so that IT operations professionals can fix performance issues quickly.  Coradiant has been dedicated to the approach of knowing the user experience since 2000. It’s great to see this approach taking hold as more and more applications move to the cloud.

End user impact is still the most important metric


Wednesday, February 3rd, 2010 Posted by: Jonathan Ginter

Coradiant started out several years ago with a single clear message – that understanding the end user experience (whether actual human beings or back-office clients) was fundamental to managing any application.  Since that time, the industry has evolved significantly, as it always does, leaving in its wake the broken remains of once-rising companies and rejected approaches.  Often, as experience and depth of understanding increase, initial ideas are overthrown in favor of more accurate theories that reflect lessons learned in the field.  Not so with end user experience.  Not only has this idea won through unscathed, it has seen wide adoption at every level of the performance monitoring space.  Analysts write about its importance and vendors of all sizes have integrated it into their solutions.  There is not a single first-tier or second-tier vendor of APM solutions that has not built or acquired end-user monitoring technology and added it to its offering – e.g. CA, HP, Compuware and Quest all have an end-user component in their solutions.  Companies in North America no longer doubt its value and expect to have such intelligence out of the box.  Europe is trailing behind this trend, but I expect that it will rapidly catch up over the next two years.

The fact that end user experience has survived indicates that it is a fundamental truth of application monitoring – if you don’t understand how your user is being impacted by your application, you cannot effectively assess whether you have an actual problem.  For example, if a primary database that has a backup goes down over a holiday weekend, should the IT manager agree to pay double overtime to have it fixed immediately?  If, in an attempt to prevent a repeat of the failure, the IT manager presents the CTO with a purchase order for a clustered set of industrial servers with fibre-channel connections, should he agree to that very expensive purchase?  These questions cannot be answered unless the impact of the problem is well understood and this impact must be expressed in terms of how it affected the end users of the business.  No other metric is nearly as important.  If few or no end users were affected, the expense of an emergency repair or a massive upgrade may not be truly justified.  The days of watching CPU, RAM and disk usage as a means of determining real impact are over.  Those metrics, although good to know and important in their own right, cannot effectively reflect the impact to the business the same way that user experience does.

The fact is that some truths never change.

Correlating End-User Performance and Conversions


Monday, January 11th, 2010 Posted by: Sri Raghavan

Companies spend an inordinate amount of money setting up complex IT systems to manage and optimize service offerings to customers.  Often these management systems only look at a single metric, like conversions or performance. What if both could be correlated?

Imagine you are a high-traffic on-line ticket vendor for major events nationwide.  As the head of IT operations, you want to maximize conversions by ensuring that you can handle the traffic volumes generated by a major concert event or marketing push and that the response times for customer queries remain optimal. In spite of your best efforts, however, the latest concert promotion brought your site to its knees and some of your systems went down for a short period of time.  The impact of that outage on your users might be unacceptably slow response times or actual system errors in their browser.  In either case, if they risk losing out on good seats for the concert while waiting for your systems to recover, they will abandon your site in favor of a competitor that is more responsive and you will have lost that revenue. On the assumption that you could have sold 1000 tickets at $100 apiece during that outage, you could lose up to $100k of revenue, depending on the number of users affected.  That’s a lot of crumpets, as they say.
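The arithmetic behind that figure is easy to rerun with your own numbers.  The following is only a rough sketch: the function name and the `fraction_affected` parameter are assumptions made for illustration, and the result is an upper bound rather than a forecast.

```python
def revenue_at_risk(tickets_expected: int, price: float, fraction_affected: float) -> float:
    """Upper bound on revenue lost while the site is degraded.

    tickets_expected:  tickets you would normally sell in the outage window
    price:             average ticket price
    fraction_affected: share of would-be buyers who hit the problem (0.0 to 1.0)
    """
    if not 0.0 <= fraction_affected <= 1.0:
        raise ValueError("fraction_affected must be between 0 and 1")
    return tickets_expected * price * fraction_affected


# The example from the text: 1000 tickets at $100 apiece, every buyer affected.
print(revenue_at_risk(1000, 100.0, 1.0))   # 100000.0
# If only a quarter of the buyers were affected, the exposure shrinks accordingly.
print(revenue_at_risk(1000, 100.0, 0.25))  # 25000.0
```

The point of the last parameter is the phrase "depending on the number of users affected": without a measurement of how many users actually saw the problem, the loss estimate is a guess.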

Poor response times and unreachable sites are bad for revenue. But there is also a third reason why your conversions could suffer, one that has nothing to do with the performance issues described above.  It may be that your site content is unappealing to the average eye or that your offerings are priced out of reach for the average consumer.  This is the typical explanation that marketing gives for conversion dips, but it is clearly not the only possible root cause.

This is where Coradiant Analytics In A Box can help.  Our passive capture technology, combined with the included Google Urchin software, collects performance and availability metrics about your site while it is tracking user behavior. This allows you to look at conversions (e.g. ticket purchases) in the context of performance and availability, and so to separate conversion dips that are infrastructure related from those that are marketing and content related.  For example, your analytics data tells you that 22% of your users bought tickets between 12 and 2 PM on Monday. You also see that this is about 35% less than the norm for a corresponding period.  Coradiant allows you to correlate this event with an increase in page load time or availability problems.
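To make the idea concrete, here is a minimal sketch of how a conversion dip might be sorted into one bucket or the other.  It is not how the product works internally; the `Window` fields, the thresholds and the sample load times and error rates are all assumptions chosen for illustration.  Only the 22% conversion rate and the roughly 35% shortfall come from the example above.

```python
from dataclasses import dataclass


@dataclass
class Window:
    """Metrics for one time window (hypothetical structure)."""
    conversion_rate: float  # fraction of visitors who bought tickets
    median_load_s: float    # median page load time, in seconds
    error_rate: float       # fraction of requests that returned errors


def classify_dip(current: Window, baseline: Window,
                 dip_threshold: float = 0.20,
                 load_factor: float = 1.5,
                 error_threshold: float = 0.02) -> str:
    """Label a conversion dip by whether performance or availability degraded too."""
    drop = (baseline.conversion_rate - current.conversion_rate) / baseline.conversion_rate
    if drop < dip_threshold:
        return "no significant dip"
    slow = current.median_load_s > load_factor * baseline.median_load_s
    failing = current.error_rate > error_threshold
    if slow or failing:
        return "infrastructure-related"
    return "marketing- or content-related"


# A norm of about 33.8% makes 22% roughly 35% below it, as in the example.
norm = Window(conversion_rate=0.338, median_load_s=2.0, error_rate=0.005)
monday_noon = Window(conversion_rate=0.22, median_load_s=4.8, error_rate=0.01)
print(classify_dip(monday_noon, norm))  # infrastructure-related
```

A label like this is a starting point for investigation, not a verdict.  A dip that coincides with slow pages still has to be traced to the users who actually experienced the slowdown.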

End-user monitoring coupled with analytics provides a powerful solution that helps to maximize customer visits and conversions.  It places those responsible for business results and for performance in the driver’s seat by providing them with the data they need to understand the real business impact of system outages, thus enabling them to provide the best possible service to their Line-of-Business managers.

Coradiant presents Web Application Visibility solution in Manhattan


Friday, October 30th, 2009 Posted by: Jonathan Ginter

Coradiant is holding an informative afternoon executive event in the Penthouse at Gary’s Loft in Manhattan focused on end-to-end Web application visibility.

You can register for the event here (https://www.123signup.com/event?id=jbxhq).
Presenters include:

  • Dave Anderson, Principal Architect/Co-founder of Peopleclick
  • Dennis Callaghan of The 451 Group
  • Fred Dumoulin of Coradiant
  • Andreas Grabner of dynaTrace software

Gary’s Loft in Manhattan features a stunning 360 degree view of the Manhattan skyline.  Light food, beverages and hospitality will be provided by Tip Of The Tongue NYC.

Date: Thursday, November 12, 2009
Time: 4:30 PM to 6:30 PM ET

Gary’s Loft
28 West 36th Street
Penthouse
New York, New York 10018

Customers Lose Patience with Poorly Performing Retail Sites


Thursday, September 17th, 2009 Posted by: Tony Tissot

Online retail customers have less and less patience for poorly performing Web applications.

Forrester’s recent report, “eCommerce Web Site Performance Today,” clearly shows that customer frustration leads to lost sales. The report was commissioned by Akamai Technologies and is available (after registering) at www.akamai.com/2seconds.

      • 69% of dissatisfied online shoppers indicated that they are less likely to buy from a poorly performing site again.
      • 64% would simply purchase from another online store.
      • 47% of consumers expect a Web page to load in 2 seconds or less.
      • 40% of consumers will wait no more than 3 seconds for a Web page to render before abandoning a site.

    These are sobering statistics indeed, particularly for etailers that don’t proactively manage and monitor site performance and actual, delivered service levels. Synthetic testing alone does not provide the needed level of visibility into actual, delivered performance because it misses far too much of the action.

    The clear trend is that consumers are even more demanding now than in the past. The study reveals that forty-seven percent of consumers expect a Web page to load in 2 seconds or less. That consumer expectation is a sea change from a similar 2006 Forrester study which showed that the majority of customers expected page loads of 4 seconds or less.

    Most revealing is the finding that forty percent of consumers will wait no more than 3 seconds for a Web page to render before abandoning the site.

    The report concludes: “It is clear that there are serious consequences for an online retailer with an underperforming site. However, by taking steps to improve site features and performance, online retailers can look to increase overall consumer satisfaction and ultimately increase sales. Forrester recommends that online retailers test their Web site performance, fix easy site features and performance issues before attempting to address larger problems, as well as improve the multichannel experience by addressing content and functionality issues on the retail site.”

    Every business that relies on the Web needs to understand exactly the service quality they are providing to online customers. How can you accomplish that?

    Coradiant provides TrueSight Automated Incident Management and TrueSight Edge which are specifically designed to help solve the Web application performance problem. With Coradiant, reports on performance, availability, and traffic volumes are only a click away. When problems occur, they can quickly be detected, localized, and resolved. IT operations can now manage user performance and optimize and troubleshoot important functions, including Akamai traffic, and then drill down to specific parts of the infrastructure to see how they handle transactions.

    Coradiant TrueSight is the most cost-effective means to achieve a single, comprehensive “voice of the online customer.”

Blind Spots in Web Application Performance Monitoring


Thursday, August 6th, 2009 Posted by: Jonathan Ginter

Contrary to popular belief, the brain is not a Personal Video Recorder, recording everything submitted by your various senses.  That would be too much data for any brain to handle.  Instead, it sifts through sensory input looking for relevant data points that it can trust and throws everything else away.  The important words in that last sentence are “relevant” and “trust”.

If a data point is not relevant, then it is considered to be a distraction.  There are well-known studies on Inattentional Blindness and Change Blindness which demonstrate that even large-scale events can be filtered out by the brain if they are considered irrelevant to the task at hand.  Similarly, if the data point cannot be trusted, the brain tosses it out as well (whether your senses can be trusted has been a heated debate in philosophy for centuries, but I digress).  Trust and relevance are crucial to the brain’s ability to eliminate useless noise and derive good results.

These same principles apply to monitoring your web applications.  Instead of monitoring the universe, you should be reducing your data flood to those points that are relevant.  Moreover, you should only be using the most trusted tools and methodologies to draw conclusions.

For web applications, the most relevant data is the data that directly describes or explains your user’s experience and places it in context.  In order to identify that data, you must be able to draw a direct line from your user’s experience to those data points.  If you cannot do that, you are probably chasing your tail and wasting a lot of valuable resources. It is important to realize that a lot of tools cannot draw a direct line from user experience to monitoring data without leaving a few gaps and logical leaps of faith.

As an example, operations teams love to know whether a database is down.  Although this is valuable data, is it relevant?  If users experienced worse performance around the same time, does that mean that fixing the database will solve the performance problem?  In fact, in a well-architected environment, the loss of a web server, app server or database should have little, if any, effect on the end user’s experience due to clustering and load-balancing. A lot of solutions love to use time correlation as a magnificent leap of faith, but it simply makes unreliable conclusions look enticing.

To draw that line between user experience and environmental monitoring, you need a tool that can see the actual users’ experience and is able to relate it directly to problems in your network, application design, deployment, code quality, etc.  Moreover, it must prove itself to be a trusted source of information, returning results quickly and reliably without drowning you in irrelevant data.  In other words, it must be trusted to extract and analyze relevant information and return high-quality results.

Handling the Truth


Wednesday, December 24th, 2008 Posted by: Jonathan Ginter

Coradiant’s TrueSight End-User Experience Management product evokes a number of interesting reactions when we first start monitoring a customer’s Web traffic.  One of the most common reactions is amazement at just how many bugs exist – even in the best Web applications.  One customer speaking at a luncheon described the experience as being similar to turning on the light in your apartment and seeing big ugly cockroaches everywhere – you are appalled, you are embarrassed … and you feel a strong urge to simply turn off the light.

This might seem like a damning statement to make about one’s own environment.  And yet we repeatedly hear how such confrontations with the ugly truth have provided insights that resulted in the correction of long-standing problems, some of which had never even made it onto the radar of Web Operations.  I think you would have to struggle to find a Coradiant customer that did not have a similar story to tell. One customer discovered that 30% of their traffic consisted of cache hits (where the server reports that nothing has changed) or redirects.  By simply tweaking the caching parameters returned by their web servers, they reduced the load on their servers significantly.  Another customer discovered that some pages were taking up to 1.5 minutes to be handled by the server before a response was being sent back to the browser.  Yikes.
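If you want a quick feel for how much of your own traffic falls into those two buckets, a rough count of response status codes is enough to get started.  The sketch below assumes you already have the status code of each response (from an access log, say); the function name is made up for illustration.  It treats HTTP 304 Not Modified as the "nothing has changed" case and the common 3xx redirect codes as redirects.

```python
from collections import Counter
from typing import Dict, Iterable

REDIRECT_CODES = (301, 302, 303, 307, 308)


def status_shares(status_codes: Iterable[int]) -> Dict[str, float]:
    """Return the share of responses that were 304s and the share that were redirects."""
    counts = Counter(status_codes)
    total = sum(counts.values())
    if total == 0:
        return {"not_modified": 0.0, "redirect": 0.0}
    redirects = sum(counts[code] for code in REDIRECT_CODES)
    return {
        "not_modified": counts[304] / total,
        "redirect": redirects / total,
    }


# Made-up sample: 70 normal responses, 20 "not modified", 10 redirects.
sample = [200] * 70 + [304] * 20 + [302] * 10
shares = status_shares(sample)
print(f"304s: {shares['not_modified']:.0%}, redirects: {shares['redirect']:.0%}")
```

A high share of 304s means browsers keep asking whether content has changed.  Longer cache lifetimes on static content let them skip the question altogether, which is the kind of tweak the customer above made.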

How many users are hitting your site?  How many errors are being returned?  How slow are the pages?  How reliable is the network?  Some customers are clearly floundering without any real ability to answer these fundamental questions.  Other customers believe they already have a solid handle on such issues.  We have found that almost all of them have a real shock in store.  Some of our most loyal customers are those that firmly believed they knew the truth already.

Often, though, the insight doesn’t have to be that deep to be a revelation.  It never ceases to amaze me how often web sites are thrown over the fence to be supported by a team that hasn’t the first clue about what they have taken on.  We offer a fairly simple feature that reports lists of traffic attributes sorted by popularity – e.g., URLs, hosts, client IP address blocks, geographic regions, cookie keys, etc.  Our customers can define their own fields as well, pulling whatever they would like out of the traffic to do so (e.g., database error codes, product IDs, etc).  We use this feature to help populate configuration fields.  However, the contents of those lists proved to be such a revelation to our customers that we re-categorized the feature under “Reports”.  Using this simple feature, one of our customers noticed that we were seeing internal traffic that we should not have been able to see.  This led him to realize that his routers were improperly configured.
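The idea behind such a report is simple enough to sketch in a few lines.  This is not our implementation, just an illustration: each hit is assumed to be a dictionary of attributes, and the field names and sample values are invented.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def top_values(hits: Iterable[Dict[str, str]], field: str, n: int = 5) -> List[Tuple[str, int]]:
    """Return the n most frequent values of one traffic attribute, most popular first."""
    counts = Counter(hit[field] for hit in hits if field in hit)
    return counts.most_common(n)


# Invented sample traffic; 10.0.0.0/8 stands in for an internal address block.
hits = [
    {"url": "/cart", "client_block": "203.0.113.0/24"},
    {"url": "/cart", "client_block": "203.0.113.0/24"},
    {"url": "/cart", "client_block": "10.0.0.0/8"},
    {"url": "/home", "client_block": "198.51.100.0/24"},
    {"url": "/home", "client_block": "203.0.113.0/24"},
    {"url": "/help", "client_block": "203.0.113.0/24"},
]

print(top_values(hits, "url", 2))       # [('/cart', 3), ('/home', 2)]
print(top_values(hits, "client_block"))
```

Even on this toy data, the second list surfaces an internal address block that arguably should not be visible at this point in the network, which is the sort of surprise the customer above ran into.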

The customer that we had invited to speak at our luncheon finished off his presentation by advising others – somewhat jokingly – to consider carefully whether they were truly ready to handle the truth about their traffic.  Ugly as it may be, facing it can reveal real solutions to real problems.  I highly recommend it.

The Benefits of Immediate Data


Friday, November 21st, 2008 Posted by: Jonathan Ginter

As the world moves on-line for most of its social and business interactions, it becomes more and more important for us to be able to react quickly when the systems that support those interactions exhibit problematic behavior.  Since problematic behavior is not always reflected by the health of your infrastructure, this has to be measured from the end user’s perspective.  In other words, if the end user’s experience degrades in any way, the application has become problematic.

This can present quite a problem on several fronts:

  • Measuring the end user’s experience
  • Being notified quickly that the user’s experience has degraded
  • Discovering that a potential fix has failed to address the problem so that it can be rolled back before too many users are negatively affected

As an example, let’s imagine that the on-line store on one of our web servers suddenly experiences an internal problem, which causes its performance to tank with no outward sign of distress (no log entries, for example).  Traditional methods of detection and notification will not work for us here.

Moreover, time is of the essence.  In our various field deployments, I have noticed that having 5000 users on your site every hour – on average – is quite common.  In fact, some sites have been known to average about 100k users every hour.  So if it takes you an hour to even notice that you have a problem, you have already upset quite a few users.  You need to be able to react quickly.

Challenge #1

How do we detect the problem?  We need to monitor the end user’s experience and it needs to happen in real-time.  This challenge has become less of a problem as various End User Experience Management (EUEM) tools have emerged to address this, some more successfully than others and each with its own unique feature set.  However, this is not the immediate focus of this article.  So, let’s assume that we already have such a tool in place.

Challenge #2

How quickly can we be notified that the user’s experience has gone south?  That depends upon the immediacy of the data, which in turn depends largely on the tool that’s been chosen.

When an application starts to collapse, there are typically two major symptoms:

  • A drop in performance
  • A drop in volume as users abandon the application

If we typically receive 5000 users per hour in our on-line store, we can assume that 80 or more users are negatively impacted every minute.  Moreover, so far we’re only talking about being notified.  Once that happens, we will still have to analyze and deal with the problem.  All the while, the problem on the site is spreading to more and more users.
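The numbers are worth working through for your own site.  This small sketch assumes traffic is spread evenly across the hour, which real traffic never is, and the function name is invented for illustration.

```python
def users_affected(users_per_hour: int, minutes_to_notice: float) -> float:
    """Estimate how many users arrive before anyone is notified, assuming even traffic."""
    return users_per_hour * minutes_to_notice / 60


# The two traffic levels mentioned in this article.
for rate in (5000, 100_000):
    for minutes in (1, 5, 60):
        print(f"{rate} users/hour, noticed after {minutes} min: "
              f"about {users_affected(rate, minutes):.0f} users affected")
```

At 5000 users per hour, one minute of delay works out to roughly 83 users, which is where the "80 or more" figure comes from.  At 100k users per hour, five minutes of delay already exceeds 8000.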

Assuming that the problem may only be noticeable as a trend, waiting for several minutes for enough data to be gathered to predict the trend might be necessary.  However, the lag time should be kept to that order of magnitude.  Waiting for an hour or more to be notified should be completely unacceptable.

Moreover, if the problem can be detected from a single hit on the site – e.g., the application is throwing back pages with error codes embedded in them – then the notification should be almost immediate.  The lag time from seeing a hit on the wire to the time that an alert can be sent about that hit should be within a few minutes, at most.
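Both kinds of detection can be sketched briefly.  What follows is an illustration under stated assumptions, not a description of any particular tool: the thresholds, function names and error markers are all hypothetical, and a real system would tune them per application.

```python
from statistics import median
from typing import List, Sequence


def check_window(recent_load_times: Sequence[float],
                 baseline_median_s: float,
                 baseline_hits_per_min: float,
                 window_minutes: float,
                 slow_factor: float = 2.0,
                 volume_floor: float = 0.5) -> List[str]:
    """Trend check over a short window: look for the two symptoms of a collapse."""
    alerts = []
    hits_per_min = len(recent_load_times) / window_minutes
    if hits_per_min < volume_floor * baseline_hits_per_min:
        alerts.append("volume drop")
    if recent_load_times and median(recent_load_times) > slow_factor * baseline_median_s:
        alerts.append("performance drop")
    return alerts


def check_hit(status: int, body: str,
              error_markers: Sequence[str] = ("ORA-", "Exception")) -> bool:
    """Single-hit check: a server error, or an error marker embedded in the page."""
    return status >= 500 or any(marker in body for marker in error_markers)


# Five minutes of traffic that is arriving at a normal rate but loading slowly.
print(check_window([4.0, 5.0, 6.0] * 10, baseline_median_s=2.0,
                   baseline_hits_per_min=6, window_minutes=5))
# A page that returns 200 OK but carries a database error in its body.
print(check_hit(200, "<html>ORA-00942: table or view does not exist</html>"))
```

The window check needs a few minutes of data before it can say anything, while the single-hit check can fire on the very first bad page.  That difference is why the acceptable lag is not the same for the two cases.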

Challenge #3

Immediacy of data is also a concern when a fix is being rolled out and we need to validate that the problem has truly been addressed.  Rolling out a fix and waiting for an hour to gather the results is unacceptable in this day and age.  The only organization that should be willing to accept such a lag time is NASA (and at least they have a good reason for it).  If potential fixes cannot be validated within minutes, then users are being treated like piñatas.  Generally speaking, users don’t appreciate that.
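As a rough sketch of what validating a fix within minutes could look like, the snippet below compares page load times from just before the fix with those gathered just after it.  The function name, the sample-size cutoff and the verdict labels are assumptions for illustration only.

```python
from statistics import median
from typing import Sequence


def fix_verdict(before: Sequence[float], after: Sequence[float],
                target_s: float, min_samples: int = 30) -> str:
    """Compare load times (in seconds) before and after a fix was rolled out."""
    if not before:
        raise ValueError("need load times from before the fix for comparison")
    if len(after) < min_samples:
        return "keep watching"            # not enough post-fix traffic yet
    if median(after) <= target_s:
        return "fix holds"
    if median(after) >= median(before):
        return "roll back"                # no better than before the fix
    return "improved but above target"


before_fix = [6.0] * 40
print(fix_verdict(before_fix, [1.5] * 40, target_s=2.0))  # fix holds
print(fix_verdict(before_fix, [6.5] * 40, target_s=2.0))  # roll back
```

On a site that receives 5000 users an hour, thirty samples arrive in well under a minute, so a check like this can return a verdict almost as soon as the fix goes live.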

You own that data.  You deserve to have access to it as fast as possible.  Your users will thank you.