Yeah, I know Monitoring Sucks, but it can be pretty neat as well. It’s a hard problem, so there are all kinds of interesting approaches to it. One I came across recently is the Assimilation Monitoring Project. I met Alan Robertson (Linux-HA founder) at a recent Cloud Computing meetup, and in the post-meetup discussions he talked a bit about his project. One of the things he mentioned was that our monitoring systems spend the majority of their time (proportional to the quality of your system, I guess) detecting and reporting that everything is OK. His project aims to distribute that task across your systems in a way that scales to quite large infrastructures and is inherently redundant.
I’m not sure how ready it is for prime time, but take a look – it sounds pretty interesting.
I think I’ll always have an interest in monitoring problems – not because I have to as part of my job, but because the problem has such a wide variety of potential solutions with different benefits. It’s also one of those areas that’s very rarely just plug and go; you have to architect it like any other service, which keeps it interesting.