A few days ago I presented at PQR’s itGalaxy, a yearly event organized to share knowledge and connect with peers and others. I presented a session about End 2 End monitoring, because I think it’s missing in so many environments, and today it’s nearly impossible to live up to the needs of users without it.
In this blog post I will repeat most of what I said in my session, without me standing in front of the screen.
Introduction to end 2 end monitoring
To make sure we all board the same train I’ll start by explaining what E2E monitoring is. To do that, let’s start with a chain: to understand E2E monitoring it’s important to understand chaining.
Components and chaining
Below is a drawing of a very basic chain. The chain consists of multiple components that together provide the user with the application, desktop or whatever resource he/she needs to access.
A chain is a set of components, including the business consumer’s endpoint device. E2E monitoring is chain monitoring: you want to make sure that the product used for monitoring understands that a desktop is not just a desktop. That desktop is only available when all the components it depends on are available. When a product does not understand that, you have a point solution monitoring single components.
If we look at a component, let’s say a file server, the functionality it provides is only there when certain layers are available. A file server needs an operating system running, it needs networking, and it needs healthy hardware. All these layers, and more, need to be monitored when a component is monitored as part of a chain.
Mature E2E monitoring products understand this. They understand that a component is not a single-layer item, they understand the complexity, and they will show you that issues on one layer may affect other layers or the whole component.
We’re almost there, hold on.
Now that we’ve talked about the components and the layers you think that’s it… let’s monitor. Well, not so fast: all components (or at least most of them) depend on other components. These are not direct in-line components that the user crosses to access the requested resource; they are components that provide functionality to a component in the chain.
Simple examples are a domain controller or a license server, needed to make sure the user is authenticated and a license is provided. When you create a chain for the most important resource in your organization to be monitored, make sure you draw the whole picture. It will look like a spiderweb in the end, but at least you then know the complexity of your network and what to monitor.
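To make the idea concrete, here is a minimal sketch (my own illustration, not any vendor’s model) of why chain awareness matters: a component is only “up” when all of its dependencies are up too. The component names are hypothetical.

```python
# Sketch of chain-aware availability: a component is only available
# when its own layers are healthy AND all its dependencies are available.
from dataclasses import dataclass, field

@dataclass
class Component:
    name: str
    healthy: bool = True                      # status of this component's own layers
    depends_on: list["Component"] = field(default_factory=list)

    def available(self) -> bool:
        # The E2E view: the component itself and everything it relies on.
        return self.healthy and all(dep.available() for dep in self.depends_on)

# Hypothetical chain: desktop -> broker -> domain controller / license server
dc = Component("domain controller")
lic = Component("license server", healthy=False)   # simulate an outage
broker = Component("broker", depends_on=[dc, lic])
desktop = Component("virtual desktop", depends_on=[broker])

# A point solution would report the desktop itself as healthy;
# the chain view shows it is effectively unavailable.
print(desktop.available())   # False
```

A real product builds this graph for you; the point is that availability is computed over the whole spiderweb, not per component.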
Why would I need to monitor?
You perhaps wonder why you would need to monitor at all. Well, let me take you on a small trip to explain why I think you should. Let’s go back to the time I started working… it was 1993.
Customers I visited at that time had FAT clients on their desks, running WfW311 or, later on, Win95. Applications were installed locally most of the time and WordPerfect was still king. If you read the part above, you now realize the chain to the application was pretty short. The application was on the desktop of the user and only a few layers were needed to provide the functionality.
Also, if one desktop went down, and they did, it would affect only that one user. Life was good 🙂
There were some applications on servers, and that evolved more and more in the nineties, but the server was physical too. The chain from the user to the server and the application running there was short.
If a server went down, as they did in those days, only those central applications would be inaccessible. Basic applications would still work, perhaps not as well as you would like, but they would work.
2001 and forward
Then in 2001 VMware brought out ESX 1.0 and that changed the world, or at least the IT world.
Small note: I’ll skip Citrix and Terminal Services in this story, because in my humble opinion they really took off when we started to virtualize.
So in 2001 we started to virtualize our servers, applications, desktops and datacenters, and in 2013 at VMworld network virtualization (NSX) was announced. We’ve come a long way, and I’m glad I was part of this journey.
If, however, we look at the impact this had on the chain and the layers, it becomes clearer why we need to focus on monitoring.
Layering after 2001
After 2001 servers were virtual (I know not all servers were right away, but it sounds nice), so instead of those physical layers we had before, we now had some extra layers: the virtual ones.
So if you had a product to monitor the server, it now had to understand that a server is more than just an IP address and a piece of hardware; it’s running on a hypervisor.
The chain for the user stayed the same: the server was still the same server with the same name and IP address. The complexity of the component changed, though, and we had to change with it. But did we?
Then we started with desktop virtualization. I don’t care whether you call it VDI or SBC/RDS/TS or any other name it’s been given; one way or the other, we’ve been transferring the user’s desktop to the data center.
This meant a lot to the user. The trustworthy FAT client with all its local applications, and the pictures of the wife and kids next to it, was replaced with a thin client or a re-purposed FAT client. Users were forced to work in flex offices with no personal stuff on their desks.
The chain to the applications was longer all of a sudden, because no matter which technique you use, there is a broker, a web server, a load balancer or something like that in between. The user doesn’t need to know that we’ve made it complex; they just need their applications.
Something we overlooked at that time, I think, is that a failure of the desktop environment no longer impacted one user but all users. No matter how hard we tried to make it highly available, there was always a good chance of hurting multiple users at once.
Hosting is hot
Let’s move on. The next step was that we decided it was better to put all the desktops and all of our servers in an external data center. Before, most of this was still running in our on-premises data center, but outsourcing and hosting were hot.
Again we added a few components to the chain the user crosses to access the resource they need. Of course, we also made it possible for the user to access the resource anytime, anywhere, anyhow… but only when everything is working fine.
Around this time even our dull and uninspiring politicians talked about a new way of working: users should be able to start at home and drive to the office once the traffic jams are gone (over here that means you work from home all the time 🙂 ).
Demands for any-any-any were getting higher and organizations demanded 24/7 access to their resources. IT looked up and said: hey, we’re there already, we outsourced our desktops and are ready to give you access 24/7.
One thing they didn’t have in place was a way to monitor whether the desktop was actually running. The user that started early was their pro-active monitoring tool: if the first few could log on fine, why shouldn’t the rest? All they had in place were the old monitoring tools that watched the hardware or some parts of the network. The evolution of our networks and of the way we work needs one more step to guarantee availability and pro-active signaling of issues: it needs E2E monitoring.
I’ve looked at some products, far from a complete list (I have to work sometimes too 😉 ).
I’m not gonna tell you which one is the best or which one fits your environment best.
The goal behind this blog was to make you aware of the need of monitoring due to the growing chain and demand for resources 24/7.
So if we look at this list, the highlighted products provide E2E monitoring in some form, some more advanced than others. I’ll give some examples of products.
There are point solutions on the list, like Citrix EdgeSight. EdgeSight is a very good tool to monitor a Citrix environment, and I highly recommend it. For E2E monitoring you will need more, though: EdgeSight doesn’t monitor other components, like a DHCP scope.
What I looked for in products was whether they understood what they monitored. I’ve seen products monitor a DHCP server without being able to tell me how many addresses were available. I don’t care about the service; I care about the scope and the percentage of IP addresses left for my desktop pool.
If you look at SolarWinds, for instance, they monitor a DHCP scope the way it should be monitored.
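To show what “understanding the scope” means in practice, here is an illustrative sketch of the calculation a scope-aware monitor should report. The numbers and the warning threshold are made up for the example; this is not how any particular product implements it.

```python
# Illustrative only: scope utilization as a scope-aware monitor would report it.
def scope_utilization(total_addresses: int, leased: int, reserved: int = 0) -> float:
    """Percentage of the usable scope that is currently leased out."""
    usable = total_addresses - reserved
    return 100.0 * leased / usable

# Hypothetical /24 scope feeding a desktop pool
used = scope_utilization(total_addresses=254, leased=230)
print(f"scope is {used:.1f}% full")

if used > 85.0:   # made-up threshold
    print("WARN: desktop pool may soon fail to get leases")
```

The point is the metric, not the code: a service check that only says “DHCP is running” tells you nothing about whether the next hundred desktops will actually get an address.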
Not an E2E monitoring tool, but very aware of what it’s monitoring, is uberAgent, an add-on for Splunk created by Helge Klein. It gives you a breakdown of user logon times and what is causing them. It’s the same kind of report EdgeSight gives for Citrix, but now for any desktop or server.
Another product that does similar monitoring is SPS GenSys, from a Dutch company.
I’ve included a video of their End User Experience Monitoring tool; the talking is in Dutch but I’m sure you’ll get the idea. They create a script that replays the actions a user performs, and run that test every hour or so. The data is sent to the monitoring console, so IT has information on the performance of the desktop or application and on the trend.
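The synthetic-transaction idea described above can be sketched in a few lines. Everything here is a placeholder of my own: a real script would log on, launch an application, open a document, and the results would go to a monitoring console rather than to stdout.

```python
# Rough sketch of a synthetic user transaction: replay a scripted action
# on a schedule, time it, and feed the result to monitoring for trending.
import time

def timed_step(name: str, action) -> float:
    """Run one scripted user action and return how long it took."""
    start = time.perf_counter()
    action()
    elapsed = time.perf_counter() - start
    print(f"{name}: {elapsed:.3f}s")   # in reality: send to the monitoring console
    return elapsed

def open_desktop():
    time.sleep(0.1)   # stand-in for the real action (logon, launch app, ...)

# Run once; a scheduler would repeat this every hour or so and alert on the trend.
duration = timed_step("open desktop", open_desktop)
if duration > 5.0:   # made-up baseline
    print("WARN: slower than the baseline, investigate before users notice")
```

The value is in the trend: if the same scripted action takes twice as long as last week, you know about it before the first user calls.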
We’ve been doing great stuff the last 12 years, but we forgot to make sure we can pro-actively monitor what we provide. The demand for 24/7 access is there, and it’s no longer acceptable to lose a desktop environment. We need to be in control of our environment and know that something is going to happen long before a user reports it. We need to solve issues before they hurt our users.
I’ve talked about some products; please check them out. Depending on your requirements you will find one that suits them. Look at the differences, such as agent versus agent-less, to compare.
If you have any thoughts on this subject, or perhaps disagree with me, please let me know… I’m always in for a good discussion. This is my view on monitoring right now.