Coradiant

Archive for the 'Standards' Category

Your site’s performance is important to Google


Wednesday, May 5th, 2010 Posted by: Jonathan Ginter

Poor performance can now degrade your business in an even more real and meaningful way.  Recent changes by Google will allow site performance to affect whether traffic is driven to your site.  This places a new urgency on the ability to accurately measure performance in terms of the end user.

“Time is ticking out” by flickr user Mao Lini. Used under Creative Commons license.

Recently, the Mashable blog reported on a decision by Google to add performance as another deciding factor in how they rank their search results.  This marks a significant milestone in which performance will affect your site’s ability to attract visitors.  Essentially, the faster sites will receive more attention.  By taking this action, Google hopes to shine a spotlight on performance and drive better overall development practices.  I have no doubt that this is a direct result of Steve Souders’ move to Google from Yahoo, where he had been busy leading the YUI best practices effort that produced YSlow.  Since that time, Google has become much more of an advocate for web performance.  Here’s a direct quote from their blog that sums up their current perspective:

“We encourage you to start looking at your site’s speed … not only to improve your ranking in search engines, but also to improve everyone’s experience on the Internet.”

Essentially, Google has now made performance an important factor in driving business to your site.  At the same time, performance is one of the hardest things to accurately and properly measure for web-based traffic.  Google suggests a number of very good tools, most of which run inside your browser.  These tools look at performance in terms of the end user, which is now a recognized best practice.  However, they only measure the performance of your site while you are actively browsing it.  They won’t tell you anything about how your site performs for others or how it performs during the other 99% of the time when you are not personally measuring it with your browser.  Synthetic testing suffers from the same drawbacks.

More importantly, these approaches miss the elusive problems that affect specific people or that occur at specific times.  Nor will they find the problems that only occur under specific circumstances.  The fact is, these elusive issues are the most common.  They are the ones that plague every web site administrator because they are hard to find and nearly impossible to reproduce.  They eat up days of investigation time and are the biggest destroyer of public confidence in your site.  To find these problems, you need to be watching ALL traffic on your site every second of every day.  Moreover, you must do so while maintaining the end user perspective.

“Houston we have a problem...” by flickr user Mihael Mafy. Used under Creative Commons license.

There are very few solutions out there that can do this effectively.  Consequently, we are justifiably proud of our TrueSight product line and its ability to tackle this very difficult problem.  Once installed in your data center behind your firewall, TrueSight is able to monitor all traffic from your load balancer all the way back to your database and third-party tiers – from web request right down to code, SOA and SQL calls.  We auto-discover new applications and web servers as they are deployed without any need for additional configuration.  Moreover, our monitoring solution continues to perform its duties, even as you gradually virtualize your infrastructure or integrate back-end cloud services.

In fact, using a unique ground-breaking integration, we can provide full visibility of any deployments using Akamai’s Application Delivery Assurance services, like acceleration or caching.  Our unique relationship with Akamai has also recently led to the first co-developed solution that is being offered directly from Akamai for managing their services.

Furthermore, our technology is capable of providing accurate overall performance measurements for mashups or hybrid solutions.

If you have blind spots in your performance monitoring, you will not be protected from the negative consequences of this new trend.  If you lack insight on your end users’ actual experience on your site, you should seriously start planning to acquire it.

Using Existing Technology to Track Users Reliably


Friday, August 29th, 2008 Posted by: Jonathan Ginter

I spend a portion of my time working with Coradiant customers on session tracking strategies that ensure a complete view of the end-user experience. Session tracking gives us the ability to see clearly every action, every page and every object associated with a user. Proper session tracking achieves what we refer to as user awareness. This is different from identity awareness, which allows us to put a face to the session. I’ve written in more detail about the differences between identity awareness and user awareness, but I can recap briefly for the purpose of this article.

Identity awareness means that you know the unique ID of a specific user on the site, allowing you to put a name (or even a face) to that user’s activity – e.g., user XYZ can now be referred to as “Jonathan” or “jginter”.  Oddly enough, this is fairly simple since it only requires that users identify themselves (by providing an ID) at least once during their session – e.g., as part of a login procedure, etc.  If even one hit during the session carries the user’s personal ID, we can be configured to pull it out and place it on the session we are tracking.

I just said a mouthful and didn’t underscore it, though.  I said that we were tracking a session already, which means we would have already achieved user awareness.  To do this is no small feat, since it requires that we be able to identify the subset of hits that represent the activity of a single unique user.  This is the foundation of any good monitoring solution.  Moreover, we must find these related needles out of the haystack that is the river of traffic we are monitoring.  Ironically, this could be very simple, but most sites make it very hard.  However, we can resolve this through a couple of simple suggestions.
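To make the distinction concrete, here is a minimal sketch of the two steps: grouping hits by a session identifier (user awareness), then labeling the session once any hit carries a user ID (identity awareness). The field names `session_id`, `user_id` and `url`, along with the sample hits, are assumptions made for illustration and do not reflect how any particular monitoring product stores its data.

```python
from collections import defaultdict


def build_sessions(hits):
    """Group hits by session ID and label a session once any hit carries a user ID."""
    sessions = defaultdict(lambda: {"user": None, "urls": []})
    anonymous = []
    for hit in hits:
        sid = hit.get("session_id")
        if sid is None:
            # Nothing on this hit ties it to a user, so it stays anonymous.
            anonymous.append(hit["url"])
            continue
        session = sessions[sid]
        session["urls"].append(hit["url"])
        # A single identified hit is enough to name the whole session.
        if hit.get("user_id") and session["user"] is None:
            session["user"] = hit["user_id"]
    return dict(sessions), anonymous


# Hypothetical traffic: one login hit, one follow-up page, one untagged stylesheet.
hits = [
    {"url": "/login.jsp", "session_id": "abc", "user_id": "jginter"},
    {"url": "/account.jsp", "session_id": "abc"},
    {"url": "/style.css"},
    {"url": "/home.jsp", "session_id": "xyz"},
]
sessions, anonymous = build_sessions(hits)
```

In this sample, the stylesheet request ends up in `anonymous` because it carries no session identifier, which is exactly the kind of unintentionally anonymous traffic discussed next.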

As I have said before, most sites host traffic that is unintentionally anonymous – i.e., there is nothing on the hit that would identify who it belonged to.  Even sites that are diligently trying to place identifiers in their traffic are likely to have anonymous traffic that they are not aware of.  Why is that?  The simple fact is that most sites suffer from tunnel vision.  They focus on their most important traffic – JSP, HTML, ASP, etc.  These hits represent their application, as they perceive it.  When they track a user’s progression through their site, these are the URLs that they care most about.  What are often overlooked are the support files – stylesheets, javascript, images, etc.  Although secondary in importance, they can be critical when it comes to understanding the performance seen by the user.  Organizations must solve this problem in order to monitor their traffic properly.

There are two basic problems that need to be addressed:

  1. Failing to use cookies to identify sessions
  2. Failing to control the domain of those cookies

Whatever identifier is being used on the primary URLs should also be present on all secondary URLs.  For those applications that use something other than a cookie to carry session IDs, this can be a challenge if not downright impossible.  Switching to cookies allows the browser – the most authoritative source for identifying a unique user – to properly label every hit using a natural aspect of the protocol.  Even when browsers do not support cookies, web servers have been designed to rewrite the URLs in the content being sent so that the cookie values are embedded in the URL paths themselves.  The beauty of this is that browsers and web servers are designed to do this with little effort on the part of the application developer.  All that is required is that the web server be configured to open a new session for any traffic it sees – something that is not usually done if the developers are trying to maintain a stateless application.

Recommendation #1: all web servers should be configured to track sessions even if the application is stateless.  Opening a session does not mean you are required to maintain any state.
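As a rough illustration of this recommendation, the sketch below wraps a stateless application in a small piece of WSGI middleware that issues a session cookie to any client that does not already have one. It stores no state on the server. The cookie name `SESSIONID` and the function names are hypothetical, and most web servers offer an equivalent configuration switch, so treat this as one possible approach rather than the required one.

```python
import uuid
from http.cookies import SimpleCookie

COOKIE_NAME = "SESSIONID"  # hypothetical cookie name


def session_tagging(app):
    """Wrap a WSGI app so that any client without a session cookie is given one."""
    def wrapper(environ, start_response):
        cookies = SimpleCookie(environ.get("HTTP_COOKIE", ""))
        has_session = COOKIE_NAME in cookies

        def tagged_start_response(status, headers, exc_info=None):
            if not has_session:
                fresh = SimpleCookie()
                fresh[COOKIE_NAME] = uuid.uuid4().hex
                fresh[COOKIE_NAME]["path"] = "/"
                # Append the Set-Cookie header; no server-side state is kept.
                headers = list(headers) + [
                    ("Set-Cookie", fresh[COOKIE_NAME].OutputString())
                ]
            return start_response(status, headers, exc_info)

        return app(environ, tagged_start_response)
    return wrapper


def stateless_app(environ, start_response):
    """A trivial application that keeps no state at all."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


app = session_tagging(stateless_app)
```

The application itself is untouched; the session ID exists purely so that every hit from the same browser can be recognized as belonging together.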

So now your web servers are diligently placing a session cookie on all traffic that they are serving up, which will solve the problem … most of the time.  You can still be easily defeated by deployments in which some or all of the secondary files are hosted on an alternate server.  This is a problem because it places those files in a different domain and cookies are sensitive to domains (and paths).  It is considered a bad practice to ask browsers to send cookies to servers that don’t care about them (it bloats the network traffic and forces web servers to do extra work for nothing).  Therefore, web servers often configure their cookies so that browsers only send them back to the servers (i.e., domains) that issued them.  Thus, browsers usually do not send cookies set by the primary server to the secondary servers.  This is easy to fix, however, by changing the configuration of the web servers.  Most of the time, the secondary servers have the same root domain – e.g., www.coradiant.com vs. images.coradiant.com – so a single cookie domain that covers both hosts (.coradiant.com) can be associated to the session ID cookie, allowing the browser to send it on all traffic to the monitored site.  That will ensure that all secondary traffic is also clearly tagged with a unique session ID.  Although this goes against best practice, the session ID cookie is typically very small and the best practice was meant to avoid needlessly sending multiple cookies and / or fat cookies with heavy payloads.  Moreover, what I am suggesting is hardly “needless”, since it solves a significant problem.

Recommendation #2: define domain patterns that cover your whole site and associate those patterns to the cookie(s) that will be carrying the session IDs.  Each web server vendor may do this differently, so consult the administration manuals for your specific web servers for more information on how this can be done.
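For illustration, here is a small sketch of what widening the cookie's scope looks like at the HTTP level: a `Set-Cookie` header with a `Domain` attribute, plus a simplified version of the domain-matching check a browser applies before sending the cookie back. The host names under `example.com`, the cookie name and both function names are hypothetical, and the matching logic is a simplification of the standard cookie rules rather than a full implementation.

```python
from http.cookies import SimpleCookie


def session_cookie_header(session_id, domain):
    """Build a Set-Cookie header value scoped to a whole domain instead of one host."""
    cookie = SimpleCookie()
    cookie["SESSIONID"] = session_id  # hypothetical cookie name
    cookie["SESSIONID"]["domain"] = domain
    cookie["SESSIONID"]["path"] = "/"
    return cookie["SESSIONID"].OutputString()


def browser_would_send(request_host, cookie_domain):
    """Simplified domain match: the exact host or any subdomain of cookie_domain."""
    host = request_host.lower()
    domain = cookie_domain.lower().lstrip(".")
    return host == domain or host.endswith("." + domain)


header = session_cookie_header("abc123", "example.com")
primary = browser_would_send("www.example.com", "example.com")
secondary = browser_would_send("images.example.com", "example.com")
other_tld = browser_would_send("images.example.ca", "example.com")
```

With the cookie scoped to the shared parent domain, both the primary and the secondary host receive it. Domain matching works on suffixes, so a host under a different top-level domain (the `other_tld` case) would not match and would need its own cookie.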

Reconstructing sessions reliably does not have to be complicated, but people often get in their own way.  These simple recommendations should help solve a world of problems, no matter what monitoring solution you are using.

Dependency mapping in web applications


Wednesday, June 14th, 2006 Posted by: Alistair Croll

The Configuration Management Database (CMDB) is a map of the components in an IT environment and how they depend on one another. It’s at the core of the Information Technology Infrastructure Library (ITIL), a set of best practices developed by the UK government and broadly adopted in recent years.

Even if you’re not an ITIL shop, you may already be using some of the ITIL directives as a part of the Microsoft Operations Framework (MOF). Various schemas also describe specific environments: the Common Information Model (CIM), and various iterations of it such as the Data Center Markup Language (DCML), attempt to describe elements of an IT environment using a structured format such as XML.

Yikes. Okay, enough acronym soup. CMDB is particularly interesting when applied to a specific environment. In a data center, there are lots of pieces and lots of dependencies. A particular web application might depend on a set of servers, which in turn depend on power supplies and contain CPUs running operating systems.

One of the challenges of managing a web application is answering the question, “what happened?” Many times, a change to one component has far-reaching consequences. Consider a modification to a back-end web service. Lots of other parts of the application are related to that change:

  • The servers on which the service runs
  • The virtual IP addresses (VIPs) that forward traffic to the servers
  • Any TCP ports that provide the service
  • Any web pages that call the back-end service
  • Any transactions that involve that web page in a process
  • Any business processes or user groups that rely on those transactions
  • Contractual obligations affected by those processes or groups
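The chain listed above can be sketched as a directed graph in which each edge points from a component to the things that depend on it. The component names below are entirely hypothetical, and the walk is a plain breadth-first traversal, not the schema of any particular CMDB product.

```python
from collections import deque

# Edges point from a component to its direct dependents. All names are hypothetical.
DEPENDENTS = {
    "service:pricing": ["page:/checkout", "page:/cart"],
    "page:/checkout": ["transaction:purchase"],
    "page:/cart": ["transaction:purchase"],
    "transaction:purchase": ["process:order-to-cash"],
    "process:order-to-cash": ["contract:retail-sla"],
}


def impacted_by(changed, dependents):
    """Breadth-first walk returning everything downstream of the changed component."""
    seen = {changed}
    queue = deque([changed])
    order = []
    while queue:
        node = queue.popleft()
        for child in dependents.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


impacted = impacted_by("service:pricing", DEPENDENTS)
```

Asking the graph what a change to the back-end service touches returns the pages, transactions, business processes and contracts downstream of it, which is the answer to "what happened?" that an infrastructure-only map cannot give.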

Unfortunately, most of the dependency mapping tools and schemas have little or no visibility into these dependencies. They tend to tie an application (“E-Mail”) to infrastructure (“server 12”) without breaking the application into services, components, users, or processes.

The “holy grail” of application performance management continues to be an association between the content, application, network, and infrastructure that is derived from user or user group experience and measured across performance, availability, and traffic levels.