Error handling is often one of the last items on the backlog for the first release of a product. Or, more to the point, it often doesn't make it in at all until users start reporting unhandled exceptions and it suddenly becomes a management priority!
CBS, where I work, is predominantly a web development shop, and up until now we've been using ELMAH to do our error handling. It's very easy to install and use, and so far we haven't had any real complaints. Having said that, I've been thinking for a while how nice it would be to have one portal where we could view any exceptions being thrown across our products and set up notifications for relevant team members. We are also doing more mobile and desktop application development, and it would be nice to make use of a product that seamlessly spans all those environments.
Raygun from Mindscape looks promising so I thought I'd dive in and take a look.
First you'll need to sign up for a Raygun account. Once you've done that you will be prompted to create an application for which you will be handling errors, and you are then given instructions on how to configure Raygun appropriately. This is a very slick and well thought out process.
I then installed Raygun into my ASP.NET MVC project using the NuGet package manager, once again nice and easy. After that you need to decide how you want to capture exceptions - if you want to do it on a global level you can do so by adding a configuration section to your web.config, or you can do it manually by handling exceptions in your code and using the Raygun client classes.
One nice touch is that when manually handling exceptions you can provide metadata with each exception. This is particularly useful if you want to add some context to your exceptions. Unfortunately I couldn't see an easy way to filter or search for this information through the web portal.
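To show the kind of thing I mean by context, and the filtering I was hoping for, here's a minimal sketch in Python. It is not Raygun's API: the capture and find_by_context functions, the record layout and the customer_id key are all names I've made up for illustration.

```python
def capture(exc, context, sink):
    """Store an exception together with caller-supplied metadata.

    'sink' is just a list standing in for wherever errors get sent
    (hypothetical, not a real Raygun client).
    """
    record = {
        "type": type(exc).__name__,
        "message": str(exc),
        "context": dict(context),  # copy so later changes don't leak in
    }
    sink.append(record)
    return record


def find_by_context(records, key, value):
    """Return every captured record whose metadata has key == value."""
    return [r for r in records if r["context"].get(key) == value]


captured = []
try:
    int("not a number")
except ValueError as exc:
    capture(exc, {"customer_id": 42, "screen": "checkout"}, captured)

try:
    {}["missing"]
except KeyError as exc:
    capture(exc, {"customer_id": 7, "screen": "basket"}, captured)

# All errors seen by customer 42, which is the search I wanted in the portal.
matches = find_by_context(captured, "customer_id", 42)
print([m["type"] for m in matches])
```

Being able to run a query like the last one from the portal would make the metadata far more useful.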
The Raygun Web Portal
This is where Raygun really needs to be adding value to the exception tracking life cycle. There are plenty of products that allow for the catching and storing of errors but what I'm looking for from Raygun is a single port of call where I can see all the exceptions my apps are throwing, easily analyse the history for each exception and set up the appropriate notifications. So does it measure up?
Firstly Raygun does provide the ability to set up multiple applications to be monitored - yay! I can see a timeline of exceptions being thrown and a dashboard summary of all exceptions, along with some great summary information for each exception group. One of the most significant features of Raygun is that instances of an exception are grouped together - it's not exactly clear how this is done but I would think it's some sort of hash of where the exception occurred, exception type, message etc. This makes it very easy to deal with all the instances of a particular type of exception.
There is also the ability to assign a number of states to a particular exception grouping - I can see this being useful if there are a large number of exceptions being thrown and multiple team members with access to the portal. However, some states, such as Resolved, are better off living in your issue tracking software. Fortunately Raygun does provide integration with popular issue trackers such as JIRA, FogBugz etc. through plugins, so that new issues can be created for particular exception types.
When you click on a particular exception grouping you are taken to the details screen for that type of exception. This is probably my least favourite screen. It's dominated by a graph that doesn't let you change the selected date range, so it has very little relevance unless you are dealing with a heap of exceptions that have occurred in the last 12 hours. There is also no easy way to see a list of all the instances of that exception - instead you have to make do with previous and next buttons to work through them. In my opinion it would be better to get rid of the graph and show a table of the errors along with the date and time, or at least to offer a toggle between the table and the graph. Having said that, the information it does capture about each exception instance is exactly what you would expect: a full stack trace, environment information, request information etc.
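To make my guess about the grouping concrete, here's a minimal sketch in Python of hashing the exception type, message and location into a group key. This is purely my assumption about how such grouping could work, not Raygun's actual algorithm, and the function names and sample errors are invented for the example.

```python
import hashlib
from collections import defaultdict


def group_key(exc_type, message, location):
    """Hypothetical grouping key: a short hash of type, message and throw site."""
    raw = "|".join([exc_type, message, location])
    return hashlib.sha1(raw.encode("utf-8")).hexdigest()[:12]


def group_errors(errors):
    """Bucket error instances so identical failures land in the same group."""
    groups = defaultdict(list)
    for err in errors:
        key = group_key(err["type"], err["message"], err["location"])
        groups[key].append(err)
    return dict(groups)


# Made-up sample data: the same failure twice, plus one unrelated error.
errors = [
    {"type": "NullReferenceException", "message": "Object reference not set",
     "location": "OrderController.Details"},
    {"type": "NullReferenceException", "message": "Object reference not set",
     "location": "OrderController.Details"},
    {"type": "TimeoutException", "message": "The operation timed out",
     "location": "ReportService.Run"},
]

groups = group_errors(errors)
print(sorted(len(instances) for instances in groups.values()))  # [1, 2]
```

A real implementation would presumably need to normalise messages that contain variable data (ids, timestamps and so on), otherwise every instance would end up in its own group.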
The other feature of this screen is the ability to enter a comment against a particular type of exception. I think this is of fairly limited use for most professional software shops as they would most likely be tracking these exceptions in their Issue Tracking software.
The notification system is very easy to use. When the system logs an exception you can set it to send you an email. If you like you can get a daily digest. Nice and simple and it works. I might be tempted to tweak it so that if no exceptions have been logged then you aren't sent a digest, the danger being that if you get too many empty digests you might begin to ignore them.
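The tweak I have in mind is tiny. Here's a rough sketch in Python of a digest builder that simply returns nothing when there is nothing to report; build_digest and the message format are my own invention, not anything from Raygun.

```python
from typing import List, Optional


def build_digest(errors: List[str]) -> Optional[str]:
    """Return the digest text, or None when no exceptions were logged.

    Returning None lets the caller skip sending the email altogether.
    """
    if not errors:
        return None
    lines = [f"{len(errors)} exception(s) logged today:"]
    lines += [f"- {summary}" for summary in errors]
    return "\n".join(lines)


for todays_errors in ([], ["TimeoutException in ReportService.Run"]):
    digest = build_digest(todays_errors)
    if digest is None:
        print("No exceptions logged, no digest sent")
    else:
        print(digest)
```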
General User Interface
There are some parts of the UI that I find annoying. Culprit number one is the "Upgrade Now" banner that takes up the top 70px of each page! I know calls to action are important but this one sticks in the throat when it's dominating prime real estate and you can't hide it.
The second issue I have is with the amount of relevant content that one can easily see and how it is displayed. I found that there is far too much empty space not being used (particularly whitespace in the error details) and much more information could be conveyed with a bit of restyling - this is a tool for use by software professionals who are used to dealing with large amounts of data - give it to them! I also think an opportunity has been missed by not making it a Single Page Application. The web interface isn't exceptionally complicated and the full page refreshes when moving around feel slow and somewhat clunky.
In addition I would love to see exceptions being shown in real time on the dashboard without refreshing the page (Web Sockets anyone?).
So the million dollar question - what does it cost? Well, the pricing's not bad - the 'Micro' package is USD$14/month for 1 application plus as many users as you want. However, the real downside to this is that you only get 2 weeks' worth of data retention - this could be a real issue if you have an intermittent exception that you're finding hard to reproduce and track down. It also means that you're not going to have any meaningful stats that can be used over an extended period of time. The 'Small' package at USD$39/month gives you 5 monitored apps and 30 days retention, which is better. There are 2 larger packages which only really start making sense if you have a large number of applications that you want to be monitoring. So while the pricing isn't super expensive, I think most people are going to opt for the Small package.
I really like the functionality of Raygun; it's a product that I was crying out for before it came along. At the moment it does feel like a work in progress. While it has enough functionality to be useful, I'm sure there's plenty more value that can be added to the exception management lifecycle.
So would I use it? Well, that depends - if I could run enough applications through it to justify the Small or Medium packages then I probably would. If I were looking to support just one web-based product then I might stick to ELMAH. If I were looking to capture errors from multiple sources like mobile, desktop apps and web applications then I think it's a much more compelling option.
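To put the two smaller plans side by side, here's a quick back-of-the-envelope calculation in Python using only the prices, app counts and retention periods quoted above. The dictionary layout and the cost_per_app helper are just my own scaffolding, and it assumes you actually use every app slot on the plan.

```python
# Figures as quoted above; the larger packages are left out because
# I haven't listed their prices here.
plans = {
    "Micro": {"usd_per_month": 14, "apps": 1, "retention_days": 14},
    "Small": {"usd_per_month": 39, "apps": 5, "retention_days": 30},
}


def cost_per_app(plan):
    """Monthly cost per monitored application if every slot is used."""
    return plan["usd_per_month"] / plan["apps"]


for name, plan in plans.items():
    print(f"{name}: USD${cost_per_app(plan):.2f} per app per month, "
          f"{plan['retention_days']} days retention")
```

With all 5 slots in use, Small works out at USD$7.80 per app against USD$14.00 on Micro, with roughly double the retention.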
All in all with some UI enhancements, extended dashboard functionality and more integration with other products I think Raygun has a promising future ahead of it!