www.jayntguru.com

August 30, 2010

scom web application monitoring – making it useful – part 1

Filed under: annoyances, computer geek stuff, iis, scom, scripting — jayntguru @ 12:39 pm

I could go on for days about SCOM URL monitoring and how it needs to be improved. Honestly, it kinda sucks. So here I will attempt to describe what I think is wrong with it and how I work around it. The items in bold below are what I feel are failures in the way this was designed.

Also I am not writing this as strictly a “how to monitor a web app” post, there are already plenty of those. This is just about the changes required to make this useful. Here is a good article with the basics on setting up a web application monitor in SCOM.

  • Requirements

To begin with, you will need to figure out what you need to monitor. In many cases it is simple enough to pull up the main page of a website, and as long as it comes up in a reasonable timeframe and returns an HTTP status code of 200, you’re OK. This sort of monitoring is useful, but you can get a lot more out of it. What I like to do is get the devs to code up something special through some sort of bribery or blackmail. In our case they defined 5 business processes, for example “make a payment”, and created a page for each that does the back end work of making that transaction and then cleans up after itself. What you get in the end isn’t exactly user experience, but it’s a good way to track the ongoing performance of a process relative to itself, and it’s a very good up/down indicator.

Since we have dev environments as well, I have those on a development SCOM server, and I have the web monitoring described below in place there as well, in the first production-like environment. This allows our QA folks to compare state and response time and see if the environment is working before they release code or start a test. They can also see the impact of new code by comparing response times from before and after the release.

  • Once you have your URLs, it’s time to get to work.

Create a web application monitor and give it your URL. The problem with the default settings is that you are only logging the transaction response time, not alerting on it. From an alert standpoint, there is no timeout for your web request. As a matter of fact, the only thing SCOM will tell you out of the box is whether it was eventually able to pull up the URL without getting an HTTP response code of 400 or higher. This default setting is not useful!

To fix this, what you want to do is add response time criteria like this.

image

Because of a problem with the service level dashboard that I will explain later, I only put one HTTP request in each web application monitor. This brings me to a little UI weirdness here because you can also set response times in the “configure settings” for the specific URL pull like this.

 image

I always leave these performance criteria blank because I can see the other one more easily and get more out of it. This one just seems redundant.
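For reference, the kind of check the response time criteria add can be sketched outside SCOM in a few lines of Python. This is only an illustration, not how SCOM implements it: the function names, the 5 second threshold and the 30 second timeout are all made up, and treating any status of 400 or above as a failure is an assumption.

```python
import time
import urllib.error
import urllib.request


def evaluate(status_code, elapsed_seconds, max_seconds):
    """Classify one request as 'error', 'warning' or 'healthy'."""
    # No status at all (timeout, DNS failure, refused connection) is an error.
    if status_code is None or status_code >= 400:
        return "error"
    # The page came up, but too slowly to be considered OK.
    if elapsed_seconds > max_seconds:
        return "warning"
    return "healthy"


def check_url(url, max_seconds=5.0, timeout=30.0):
    """Fetch `url` once and classify the result (hypothetical helper)."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read()
            status = response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 404 or 500.
        status = exc.code
    except (urllib.error.URLError, OSError):
        # No usable answer at all, including a timed out request.
        status = None
    elapsed = time.monotonic() - start
    return evaluate(status, elapsed, max_seconds)
```

Calling `check_url("http://example.com/")` would return one of the three states. The point is that a slow page and a dead page are both reported, instead of only the dead one.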

  • Seeing the data

Once you have gathered some data you will want to, well, see what’s going on. To do this, create a new performance view in the monitoring console, scope it to “collected by specific rules”, and then go manually pick your rules. This is where Microsoft fails again, because the list of rules is not searchable and they all have arbitrary names. For web requests I figured out they are called “Performance Collection: Transaction response time total for Name of web app monitor”, like this screenshot.

image
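If you can get the rule display names out of the console into a plain list (how you export them is up to you and not shown here), a few lines of Python will do the searching the UI won’t. A minimal sketch, assuming the naming pattern above holds; the function names and sample monitor names are hypothetical.

```python
# Prefix taken from the rule naming pattern described above.
RULE_PREFIX = "Performance Collection: Transaction response time total for "


def rule_name_for(monitor_name):
    """Build the rule name expected for a given web app monitor."""
    return RULE_PREFIX + monitor_name


def monitors_with_rules(rule_names, search=""):
    """Return sorted monitor names whose response time rule matches `search`."""
    matches = []
    for name in rule_names:
        if name.startswith(RULE_PREFIX):
            monitor = name[len(RULE_PREFIX):]
            # Case-insensitive substring match; an empty search matches all.
            if search.lower() in monitor.lower():
                matches.append(monitor)
    return sorted(matches)


# Made-up sample data standing in for an exported rule list.
rules = [
    rule_name_for("Make a payment"),
    "Some unrelated rule",
    rule_name_for("Login page"),
]
print(monitors_with_rules(rules, "pay"))
```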

Now that you have done that, you will be able to see a nice blank performance chart with some stuff to check.

image

Now when we pick one, we get a pretty graph like this.

image

This brings me to my next issue with all of this: the performance chart settings are user specific, meaning I cannot create a view of any sort that contains performance information and have the counters checked already. No matter which ones I put in, and whether you are using a performance view or a dashboard view that contains a performance view, the counters have to be selected every time. This is a pain!

This also means that if you wanted to say, get fancy with a URL to a specific view, you cannot just create one of these and have folks click the link and end up at a pretty performance chart with the counters already checked. The fact that you cannot do this is a serious limitation with SCOM, IMO.

  • setting up alert parameters (what you cannot change)

You will likely have to play with the values a bit in order to get them not to false alert. This brings me to my next problem with SCOM web monitoring: you cannot change anything about how it samples other than where it samples from (what host) and how often. What I would love to do is say “only alert when two consecutive samples exceed the threshold”, but that’s not an option. We get a lot of failures at night during our backup window that cause a single transaction to go out of SLA, and we get alerts based on that. As a result, we have to set our response time thresholds high enough that we aren’t false alerted every night, but that makes them so high that the alerting becomes less useful during the daytime. As of now I do not have a workaround for this.
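The “two consecutive” rule is simple enough to state in code. SCOM does not offer this as a setting; the sketch below (Python, with a hypothetical function name and made-up numbers) just shows the logic a custom script would need.

```python
def alert_points(response_times, threshold, consecutive=2):
    """Return the sample indexes at which an alert would fire.

    An alert fires once, at the moment `consecutive` samples in a row
    have exceeded `threshold`. A sample at or under the threshold
    resets the count.
    """
    streak = 0
    fired = []
    for index, value in enumerate(response_times):
        if value > threshold:
            streak += 1
            if streak == consecutive:
                fired.append(index)
        else:
            streak = 0
    return fired


# One slow sample during a backup window (index 1) is ignored.
# A sustained slowdown (indexes 3 to 5) alerts once, at index 4.
samples = [1.2, 9.0, 1.1, 8.5, 9.3, 9.9, 1.0]
print(alert_points(samples, threshold=5.0))
```

With a rule like this the threshold could stay tight enough to be useful during the day, since a single bad sample at night would no longer page anyone.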

  • stopping duplicate alerts

When you do get your first alert you will see that two are sent: one for the URL pull and one for the aggregate monitor on the web application monitor. It doesn’t make sense to me why it would be set up this way, so let’s fix it.

Start by right clicking on one of the alerts and opening the health explorer for it. Expand it out and you will see something like this.

image

Each of the red lines has an alert set up for it, and the lower one for the actual request rolls up into the web application one. In my mind the web application one is redundant, so I am going to disable it. Right click, choose “monitor properties”, go to alerting, and uncheck it.

image

Now you will receive one alert instead of two.

  • useful alert details

Of course the text of the alerts isn’t useful at all out of the box (it doesn’t tell you if the URL failed for time, SSL, http response, or anything). I am using this article as a basis for fixing this, but I don’t have it totally worked out yet. This will continue to require some further tweaking.

This post ended up being longer than I intended (there’s a lot to fix) so I am going to break it up into two parts and get the service level dashboard stuff into a 2nd post.

July 21, 2010

Microsoft, you HAVE to do a better job than this

Filed under: annoyances, computer geek stuff, scom — jayntguru @ 5:54 pm

Here’s an error from SCOM.

Performance data collection process was unable load SQL Server Authentication configuration information. Account for RunAs profile in workflow "Microsoft.SystemCenter.DataWarehouse.CollectPerformanceData", running for instance "INFMGT02.accessgeneral.com" with id:"{81890C12-35B3-7AEA-C0FF-3EFCA7486E97}" is not defined. Workflow will not be loaded. Please associate an account with the profile. Management group "Access"

OK guys WHICH profile? Come on.. how hard is this? I mean I can guess, and I have, and guess what? It has one associated.

July 1, 2010

SCOM 2007 R2 – workgroup/DMZ server notes

This is harder than it should be. Here are my notes on doing this.

1. On the cert server, go here: http://blah/certsrv/

2. Request a cert. Choose type “other” and paste in the OID below.

3. OID = 1.3.6.1.5.5.7.3.1,1.3.6.1.5.5.7.3.2

4. Make sure to mark the key as exportable. Use the FQDN of the server for the name and common name.

5. Open up server management for the certificate manager and approve the request.

6. Go back to the website and install the cert.

7. Open mmc with the certificates snap-in, go to personal, and export the cert. Make the private key exportable.

8. Copy the cert to the client server.

9. On the client server, open mmc, import the cert, and mark it as exportable.

10. Run MOMCertImport on the client and choose the cert.

11. Restart the System Center Management service on the client.

12. Wait a minute, then go to the MOM console, administration, pending management. Approve it.

13. Done!
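For what it’s worth, the two OIDs in step 3 are the standard extended key usage identifiers for server authentication and client authentication. A small Python sketch that splits the comma separated value and names each one (the function name is just for illustration):

```python
# Standard extended key usage OIDs and their usual display names.
EKU_NAMES = {
    "1.3.6.1.5.5.7.3.1": "Server Authentication",
    "1.3.6.1.5.5.7.3.2": "Client Authentication",
}


def describe_oids(oid_field):
    """Split a comma separated OID string and label each entry."""
    described = []
    for part in oid_field.split(","):
        oid = part.strip()
        if oid:
            described.append((oid, EKU_NAMES.get(oid, "unknown")))
    return described


for oid, name in describe_oids("1.3.6.1.5.5.7.3.1,1.3.6.1.5.5.7.3.2"):
    print(oid, "=", name)
```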

May 30, 2010

Dear SCOM. You blew it

Filed under: annoyances, computer geek stuff, scom, scripting — jayntguru @ 11:46 am

In case you weren’t aware, for SCOM to work against a non-domain machine, all manner of certificates are required between the RMS and the agents. Not only are they required, but you have to use the fairly archaic certificate tools provided, oh, and you will need your own certificate authority too. This is such a complete and utter #FAIL that I don’t really know where to start. Mainly my issue is that it doesn’t need to be this hard. If someone wants to see the CPU time on my webserver, then by all means, hack in, but damn if I care enough to go through this level of work for it. And that brings me to my second issue, the shit just doesn’t work. Sure you could say this is a “rush it out the door” kinda thing, but this happened back in 2007 and there have been plenty of releases including an R2 version, yet this useless and archaic process is still in place.

So in short, the SCOM guys failed by over-complicating something that isn’t needed, and then making it 10 times more difficult than necessary. FAIL.

March 25, 2010

Obama and health care – worst president ever and one of the worst ideas ever

Filed under: annoyances, politics — jayntguru @ 12:59 pm

In case you didn’t know, I think Obama has taken the torch from Jimmy Carter as the worst president we have ever had, for many reasons. Especially, I believe that this health care thing is a piss poor idea for many reasons, it’s too expensive (we don’t have the money – we cannot afford it), it increases the entitlement state, diminishes the concept of personal responsibility and is unconstitutional. I have posted about it many times on twitter and on facebook. I don’t want to get into a long discussion here as to why, mainly because I don’t have the time, and I feel like a lot of it has been said before, but what I did want to do was register my complaints in an open forum (the internets) so that google could pick up on them and to be sure that my feelings are.. what are those words? Oh yeah open and transparent.

I will probably come back later and update this post with some more info, but for now I wanted to put in some quotes I have run across in the past couple of days that I think are applicable.

If you’re not a liberal at 20, you have no heart, and if you’re not a conservative at 40, you have no head! – Winston Churchill

Government, even in its best state, is but a necessary evil; in its worst state, an intolerable one. – Thomas Paine

Amendment 10 – Powers of the States and People. Ratified 12/15/1791. The powers not delegated to the United States by the Constitution, nor prohibited by it to the States, are reserved to the States respectively, or to the people.

Everyone should read this.

March 23, 2010

hiding users from the welcome screen in windows

Filed under: annoyances, computer geek stuff, w7 — jayntguru @ 9:57 am

When vista or w7 boots and shows you the list of users, I personally find that annoying. Here’s a link to a fix.
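The linked fix is not reproduced here. One widely documented way to do this, which may or may not be what the link describes, is a DWORD value named after the account, set to 0, under the Winlogon SpecialAccounts\UserList registry key. A Python sketch that only builds the text of a .reg file for that (the function name and account name are placeholders, and nothing is written to the registry):

```python
# Registry key commonly used to hide accounts from the welcome screen.
USERLIST_KEY = (
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT"
    r"\CurrentVersion\Winlogon\SpecialAccounts\UserList"
)


def hide_user_reg(username):
    """Return .reg file text that hides `username` from the welcome screen."""
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        "[" + USERLIST_KEY + "]",
        # A DWORD of 0 hides the account; 1 would show it again.
        '"' + username + '"=dword:00000000',
        "",
    ]
    return "\r\n".join(lines)


# "svc_backup" is a made-up account name.
print(hide_user_reg("svc_backup"))
```

Saving the output as a .reg file and importing it needs admin rights, and it is worth backing up the registry first.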

March 12, 2010

keeping flash full screen when you use the mouse

Filed under: annoyances, computer geek stuff, w7 — jayntguru @ 2:58 pm

Since I’m a bit of a mediacenter advocate, I use the internets on the tv a lot. One of my issues with flash is that it never stays full screen if you click the mouse on another monitor. Today I ran across a how-to at lifehacker on how to fix this. It’s temporary (it will break when they upgrade versions) but it’s better than nothing.

February 18, 2010

the vista snipping tool (where is it?)

Filed under: annoyances, computer geek stuff, w7 — jayntguru @ 12:43 pm

I went to use the vista screenshot tool just now and couldn’t find it. After some investigation I realized that the snipping tool is included with the “tablet pc components” in vista and w7. So if you uninstall things that aren’t needed (like the tablet pc components), then you won’t have this.

Why is this included with the “tablet pc components”? I have no idea. This doesn’t make any sense to me.

February 5, 2010

deleting a partition during the w7 install

Filed under: annoyances, computer geek stuff, w7 — jayntguru @ 12:22 am

I had an issue tonight when reinstalling w7 where the install would not let me delete the partitions on one of the disks… they could be formatted, etc, but the delete button was grayed out for some reason. Why it did this, I can’t tell you. What I finally found to fix it was this:

  • on the first welcome screen of the w7 install, hit shift-f10 to get a command prompt
  • run diskpart
  • list disk
  • select disk 0 (if this is the disk you want)
  • clean (this wipes every partition on the selected disk)
  • exit

Then you can continue with the install on a fresh, clean drive.

February 4, 2010

jusched.exe #fail

Filed under: annoyances, computer geek stuff, w7 — jayntguru @ 7:03 pm

I really dislike this stupid java update scheduler already for a whole lot of reasons, but with windows 7 it’s an extra hassle, at least for me. This is because new systray icons are hidden by default, so you don’t notice the (stupid) thing running. Just now I started to use my laptop and realized it was very slow. Here’s why:

image

 

Total and complete #fail.
