From Cloud Monitoring to Effective Cloud Management

13 July 2010

This is the next in a series of guest blog posts by Intralinks’ collaborators, partners, and vendors. Andreas Grabner is Sr. Architect and Technology Strategist at dynaTrace, a technology leader in application performance management (APM). In September 2009, Intralinks selected dynaTrace Continuous Application Performance Management to improve the operation of its platform during the production lifecycle.

The Cloud has become the hot topic in the past year, with many major technology leaders (Amazon, Google, Microsoft, etc.) extensively promoting public cloud services. Equally interesting is the growth of the so-called “private cloud”, whether that means large enterprises transitioning their datacenter infrastructure to cloud-based paradigms or solution providers using their own cloud-based platforms to deliver solutions more efficiently.

An innovative example of the latter is Intralinks, a leading provider of critical information exchange solutions. Intralinks’ solution is used by companies in industries such as financial services, life sciences and legal, which typically take part in complex business opportunities involving a large volume of documents, most of them highly confidential. Not only must Intralinks ensure that its cloud-based platform is secure, but it also needs to give users instant access as soon as documents are uploaded, which is vital during the multi-million dollar business transactions that Intralinks helps to accelerate.

To provide highly available solutions in the cloud, leading solution providers such as Intralinks have transitioned from cloud monitoring to effective cloud management. The difference is important: cloud monitoring enables teams to detect and triage performance problems and to assign responsibility to the appropriate group (system or network engineering, application development), which then needs to dig deeper to resolve the underlying issues. The problem with such monitoring is that it fails to provide the insight into the dynamic nature of the cloud needed to quickly isolate, diagnose, and resolve critical performance issues.

Effective cloud management, on the other hand, continuously provides the transparency and depth of visibility needed to assure performance and scalability in these highly dynamic environments. As no two transactions necessarily take the same path under the same conditions in the cloud, the connection between symptom and root cause of issues impacting performance and scalability is often impossible to determine. To be effective, cloud management has the following fundamental requirements:

  • Proactive: because a cloud environment is continuously changing, you have to be more proactive than ever before
  • Business-centric: focusing on software components is no longer adequate for effective cloud management; the focus must now be on customer experience
  • Monitoring and diagnosis integrated in real-time: it is no longer enough to alert on a problem – you must capture data in real time enabling rapid resolution of issues

We’ll be hosting a webinar with Intralinks in July to discuss how they have moved from monitoring to management of their cloud-based solution. In advance of that webinar, we expand on these fundamental requirements to move from cloud monitoring to cloud management.

You Must Become More Proactive

One of the key principles in moving from cloud monitoring to cloud management is becoming even more proactive. The easiest way to be proactive is to find issues earlier in the lifecycle, when they are least expensive to resolve. Cloud management requires extending application performance management processes beyond production to include technicians in test and development – as well as the architects who designed the application – so that issues are found and fixed cheaply before they become problems lost in the cloud. The other big benefit of such proactive management is that problems found earlier in the lifecycle never impact customers in the first place.

No matter how proactive you become early in the lifecycle, however, it is an immutable fact that some issues will find their way into the cloud. Thus, cloud management requires becoming proactive in the cloud itself.

The cloud’s dynamic nature essentially creates a multiplier effect on complexity: now that every component of the infrastructure and application is in constant flux, effective cloud management requires continuous vigilance. You can no longer wait for problems to arise and then troubleshoot them; you have to seek out issues early, before they become problems impacting end-users. In effect, you need to create an early-warning system for the cloud – capturing the data your key developers need to spot and resolve such issues before they grow into problems causing performance degradation or outages.
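
As a rough illustration of what an early-warning check could look like, the sketch below flags response times that drift above a historical baseline before any hard outage occurs. The function name, the three-sigma rule, and the sample numbers are all assumptions made for this example; they are not taken from Intralinks’ or dynaTrace’s actual implementation.

```python
from statistics import mean, stdev

def early_warning(baseline_ms, recent_ms, sigmas=3.0):
    """Return True when the recent mean exceeds baseline mean + sigmas * stdev.

    Hypothetical rule for illustration: a real system would pick its own
    baseline window and sensitivity.
    """
    limit = mean(baseline_ms) + sigmas * stdev(baseline_ms)
    return mean(recent_ms) > limit

# Hypothetical response times in milliseconds.
baseline = [100, 105, 98, 102, 101, 99, 103, 100]
normal_window = [101, 104, 99]
drifting_window = [140, 155, 150]

print(early_warning(baseline, normal_window))    # no warning expected
print(early_warning(baseline, drifting_window))  # warning expected
```

The point of the design is that the check fires on a trend relative to normal behavior rather than on a fixed failure threshold, which is what lets a team act before end-users notice.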

You Must Become Business Centric

In a cloud environment, users share the same, dynamically-scalable infrastructure – so understanding if any one of them is experiencing performance issues is no longer as simple as looking at traditional monitoring tools. What’s needed to optimize customer experience in multi-tenant, cloud-based environments is the ability to group and analyze transactions by tenant or user type. This way, when the experience of a specific user or tenant is less than expected, IT can resolve the issue quickly.

Effective cloud management also calls for a second way of grouping business transactions. While grouping transactions by end-user is essential to optimize customer experience on an individual level, you may need to look at transactions by functional type as well.

For example, since Intralinks’ users often have access to literally millions of documents during any given project, Intralinks provides a lot of functionality around search, enabling users to find those documents in many different ways. The ability to group transactions by functionality type as well is necessary to ensure not only that such functionality is available to end-users, but also that it delivers the kind of performance Intralinks’ users have come to expect.
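
To make the two groupings concrete, here is a minimal sketch that summarizes the same set of transactions first by tenant and then by functional type. The record layout, tenant names, and timings are hypothetical and exist only to show the idea.

```python
from collections import defaultdict
from statistics import median

# Hypothetical records: (tenant, function, response time in ms).
transactions = [
    ("tenant-a", "search", 120),
    ("tenant-a", "upload", 300),
    ("tenant-b", "search", 950),
    ("tenant-b", "search", 1010),
    ("tenant-a", "search", 130),
]

def median_by(records, key_index):
    """Group response times by the chosen field and return the median per group."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key_index]].append(record[2])
    return {key: median(times) for key, times in groups.items()}

by_tenant = median_by(transactions, 0)    # customer-experience view
by_function = median_by(transactions, 1)  # functional view, e.g. search

print(by_tenant)
print(by_function)
```

In this made-up data the per-tenant view shows that tenant-b is having a poor experience, while the per-function view shows that search is where the time is going. Neither view alone tells the whole story.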

Moreover, an important element of Intralinks’ search function is provided by a third-party service. The Intralinks service must work seamlessly with the third-party service to provide the solution its users require. Intralinks needs to understand not only the overall customer experience when users access the search function, but also whether any issues impacting the search experience are being caused by that third-party service.

Effective cloud management requires such a business-centric approach, making customers rather than infrastructure or applications the focus of attention.

Monitoring & Diagnosis Integrated in Real-Time

Cloud management requires Intralinks to be able to monitor every transaction moving through its platform continuously. More than a million professionals have used Intralinks’ solutions, including users from 800 of the Fortune 1000. In this high-volume environment, Intralinks cannot use a solution that creates overhead impacting performance. At the same time, Intralinks can’t afford not to monitor every transaction – data aggregated over 5- or 15-minute intervals doesn’t provide the visibility needed to satisfy its users. Effective cloud management therefore requires a proper balance between depth of visibility into every transaction and low overhead.
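
A small worked example shows why interval averages are not enough. The numbers below are invented for illustration: 98 fast transactions and two very slow ones inside a single five-minute window.

```python
from statistics import mean

# Hypothetical response times (ms) for one 5-minute window:
# 98 fast transactions and 2 very slow ones.
window = [100] * 98 + [5000, 6000]

average = mean(window)                   # what an aggregated view reports
slow = [t for t in window if t > 1000]   # what per-transaction data reveals

print(f"window average: {average} ms")
print(f"transactions over 1000 ms: {slow}")
```

The average for the window comes out at 208 ms, which looks healthy, yet two users waited five seconds or more. Only per-transaction data exposes them.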

Simply monitoring every transaction at low overhead, however, is not sufficient for effective cloud management. Real-time data from those transactions must be captured at a depth that provides actionable evidence to the developers who must resolve the issues that simple monitoring discovers. Such diagnostic evidence must not only be granular on a transaction basis, but must also achieve code-level granularity. Only with such depth (including necessary context, such as logging, exceptions, arguments and bind values) can key IT personnel quickly isolate the root cause of an issue, resolve it, and mitigate the impact on end-users.

Transient issues are a particularly difficult example of the need for such evidence. The problem with transient issues is not only that they can occur randomly at widely-separated, unpredictable intervals, but that they can also be very short-running. While they happen intermittently and sometimes for only a short time, they still can significantly impact the users who have the bad luck to be accessing the solution at that time. Effective cloud management not only enables IT to spot such transient issues, but captures the full execution path of each transaction that exhibits such behavior. Proper cloud management can eliminate the days or weeks of time wasted by IT combing manually through log files trying to reproduce such issues.
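
The idea of capturing the execution path at the moment a transient issue occurs, rather than reproducing it later from logs, can be sketched as follows. This is a simplified illustration under stated assumptions: the class, the `finish` helper, the step names, and the threshold are all hypothetical, and a real APM product would instrument code automatically rather than by hand.

```python
import time
from contextlib import contextmanager

class TransactionTrace:
    """Records the timed steps of one transaction (hypothetical helper)."""

    def __init__(self, name):
        self.name = name
        self.steps = []  # list of (step name, duration in seconds)

    @contextmanager
    def step(self, step_name):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.steps.append((step_name, time.perf_counter() - start))

    def total(self):
        # Steps are assumed to be sequential, not nested.
        return sum(duration for _, duration in self.steps)

captured = []  # traces kept as diagnostic evidence

def finish(trace, threshold_s):
    """Keep the full step list only for transactions slower than the threshold."""
    if trace.total() > threshold_s:
        captured.append(trace)

# A fast transaction: traced, but not retained.
fast = TransactionTrace("search-fast")
with fast.step("query index"):
    pass
finish(fast, threshold_s=0.02)

# A transiently slow transaction: its full path is retained.
slow = TransactionTrace("search-slow")
with slow.step("query index"):
    pass
with slow.step("call third-party service"):
    time.sleep(0.05)  # simulated transient delay
finish(slow, threshold_s=0.02)

for trace in captured:
    print(trace.name, [name for name, _ in trace.steps])
```

Because the evidence is stored when the slow transaction happens, nobody has to wait for the issue to recur or search log files to reconstruct what the transaction did.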

To learn more about moving from cloud monitoring to effective cloud management, and how Intralinks manages their cloud-based solution, register for the upcoming webinar, July 15, 2010 at Noon ET.