JAVA Application Performance – Monitoring, Diagnosis and Reporting with eG Enterprise v6 [DEMO]


Watch the on-demand demo “JAVA Application Performance – Monitoring, Diagnosis and Reporting with eG Enterprise v6”.


Java apps are powering many business-critical IT services. They are also getting more complex and interconnected with other tiers and layers of the IT infrastructure. An issue anywhere in the Java stack can quickly cascade and negatively impact end user experience.

Watch the on-demand demo to see how next-generation application performance monitoring & analytics provides deep visibility into the Java stack to accelerate the diagnosis of application performance issues and quickly restore user experience. During the demonstration, you will see how to:

  • Have a single unified monitoring solution that addresses your application monitoring, diagnosis, analytics, and reporting needs;
  • Use intelligent analytics to analyze and correlate performance inside the Java stack and across the tiers of your IT environment to provide unparalleled speed & ease of proactive alerting, diagnosis & analysis;
  • View best-in-class customizable dashboards that integrate Java application performance metrics to provide real-time role-based and domain-based views on user experience, system and service health, resource consumption, capacity and more;
  • Report on historical performance and trends and analyze usage patterns to right-size and optimize your IT infrastructure for maximum ROI.

 

We look forward to seeing you online!

 

Java Application Performance – Join the Live Demonstration


Join the live demo “JAVA Application Performance – Monitoring, Diagnosis and Reporting with eG Enterprise v6” on October 22, 2014 at 11am ET | 10am CT | 8am PT | 4pm UK | 5pm CET. Register now: https://www4.gotomeeting.com/register/576795527

Java apps are powering many business-critical IT services. They are also getting more complex and interconnected with other tiers and layers of the IT infrastructure. An issue anywhere in the Java stack can quickly cascade and negatively impact end user experience.

Join this live demo to see how next-generation application performance monitoring & analytics provides deep visibility into the Java stack to accelerate the diagnosis of application performance issues and quickly restore user experience. During the live demonstration, we will show how to:

  • Have a single unified monitoring solution that addresses your application monitoring, diagnosis, analytics, and reporting needs;
  • Use intelligent analytics to analyze and correlate performance inside the Java stack and across the tiers of your IT environment to provide unparalleled speed & ease of proactive alerting, diagnosis & analysis;
  • View best-in-class customizable dashboards that integrate Java application performance metrics to provide real-time role-based and domain-based views on user experience, system and service health, resource consumption, capacity and more;
  • Report on historical performance and trends and analyze usage patterns to right-size and optimize your IT infrastructure for maximum ROI.

Title:  JAVA Application Performance – Monitoring, Diagnosis and Reporting with eG Enterprise v6

Registration:  https://www4.gotomeeting.com/register/576795527

Date:  October 22, 2014 at 11am ET | 10am CT | 8am PT | 4pm UK | 5pm CET

Presenters: Bala Vaidhinathan (CTO, eG Innovations), Holger Schulze (VP Marketing, eG Innovations)

We look forward to seeing you online!

Unified Monitoring, Diagnosis and Reporting of IT Infrastructure Performance with eG Enterprise v6 [LIVE DEMO]


Join the live demo “Unified Monitoring, Diagnosis and Reporting of IT Infrastructure Performance with eG Enterprise v6” on October 9, 2014 at 11am ET | 10am CT | 8am PT | 4pm UK | 5pm CET.

Register now: https://www4.gotomeeting.com/register/971626975

See live the brand-new release of eG Enterprise v6 – the first intelligent performance monitoring solution designed to simplify the management of today’s complex and distributed IT environments. Find out how eG Enterprise helps you make IT Operations more productive, reduce IT support cost & complexity, and keep your end users happy & productive. During the live demonstration, we will show how you can:

  • Have a single unified solution that addresses your application monitoring, database monitoring, server monitoring, network monitoring, virtualization monitoring, service monitoring and even mobile device monitoring needs;
  • Use intelligent analytics to analyze and correlate performance across the tiers to provide unparalleled speed & ease of proactive alerting, diagnosis & analysis;
  • View best-in-class customizable dashboards that integrate performance metrics to provide real-time role-based and domain-based views on user experience, system and service health, resource consumption, capacity and more;
  • Report on historical performance and trends and analyze usage patterns to right-size and optimize your IT infrastructure for maximum ROI;
  • Address gaps in your current monitoring for Citrix XenApp/XenDesktop, virtual desktop infrastructures (VDI), multi-tier Java applications and heavily virtualized IT environments – in the cloud or on-premises.

Title:  Unified Monitoring, Diagnosis and Reporting of IT Infrastructure Performance with eG Enterprise v6

Registration:  https://www4.gotomeeting.com/register/971626975

Date:  October 9, 2014 at 11am ET | 10am CT | 8am PT | 4pm UK | 5pm CET

Presenters: Bala Vaidhinathan (CTO, eG Innovations), Holger Schulze (VP Marketing, eG Innovations)

  • Do you spend hours troubleshooting problems across multiple different tools?
  • Do you yearn for a single-pane-of-glass view into your entire IT infrastructure?
  • Do you wish you could drill down and, with just one click, determine where the root cause of a problem lies and call the right expert to get it fixed quickly?

Get your answer on October 9.

Troubleshooting Java Application Deadlocks – Diagnosing ‘Application Hang’ situations


Users expect applications to respond instantly. Deadlocks in Java applications cause ‘application hang’ situations that lead to unresponsive systems and poor user experience.

This blog post explains what deadlocks are, what their consequences are, and what options exist to diagnose them.

In a subsequent blog post, we’ll explore how the eG Java Monitor helps in pinpointing deadlock root causes down to the code level.

 

A typical production scenario

It is 2 am and you get woken up by a phone call from the helpdesk team. The helpdesk is receiving a flood of calls from application users. The application is reported to be slow and sluggish. Users are complaining that the browser keeps spinning and eventually all they see is a ‘white page’.

Still somewhat heavy-eyed, you go through the ‘standard operating procedure’. You notice that no TCP traffic is flowing to or from the app server cluster. The application logs aren’t moving either.

You are wondering what could be wrong when the VP (Operations) pings you over Instant Messenger asking you to join a war room conference call. You will be asked to provide answers and pinpoint the root cause – fast.

What are Java application deadlocks?

A deadlock occurs when two or more threads form a cyclic dependency on each other.

For example, ‘thread 2’ is in a wait state waiting on Resource A owned by ‘thread 1’, while ‘thread 1’ is in a wait state waiting on Resource B owned by ‘thread 2’.

In such a condition, these two threads are ‘hanging’ indefinitely without making any further progress.

This results in an “application hang” where the process is still present but the system does not respond to user requests.
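
The circular wait described above can be reproduced in a few lines. The following is a minimal sketch, not production code: the class and method names are hypothetical, the two plain objects stand in for Resource A and Resource B, and the threads are marked as daemons only so that the demo JVM can still exit. It then asks the JVM's standard `ThreadMXBean` whether it sees a monitor deadlock.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

class DeadlockDemo {

    /** Starts two threads that deadlock on each other; returns how many deadlocked threads the JVM reports. */
    static int createAndDetectDeadlock() {
        Object resourceA = new Object();
        Object resourceB = new Object();
        // Opens only once both threads hold their first resource.
        CountDownLatch bothHoldFirst = new CountDownLatch(2);

        // Opposite acquisition order is what creates the cycle.
        startDaemon("thread 1", () -> lockBoth(resourceA, resourceB, bothHoldFirst));
        startDaemon("thread 2", () -> lockBoth(resourceB, resourceA, bothHoldFirst));

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        for (int attempt = 0; attempt < 100; attempt++) {
            long[] deadlocked = threads.findMonitorDeadlockedThreads(); // null when no cycle exists
            if (deadlocked != null) {
                return deadlocked.length;
            }
            sleepQuietly(50);
        }
        return 0;
    }

    private static void lockBoth(Object first, Object second, CountDownLatch bothHoldFirst) {
        synchronized (first) {
            bothHoldFirst.countDown();
            try {
                bothHoldFirst.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            synchronized (second) { // blocks forever: the other thread owns this monitor
                System.out.println("never reached");
            }
        }
    }

    private static void startDaemon(String name, Runnable body) {
        Thread thread = new Thread(body, name);
        thread.setDaemon(true); // lets the JVM exit even though the threads hang
        thread.start();
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("Deadlocked threads reported: " + createAndDetectDeadlock());
    }
}
```

Note that neither thread burns CPU while it is stuck; both simply sit in the BLOCKED state, which is why the process looks alive but does no work.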

“The JVM is not nearly as helpful in resolving deadlocks as database servers are.

When a set of Java threads deadlock, that’s the end of the game. Depending on what those threads do, the application may stall completely”

Brian Goetz et al., authors of “Java Concurrency in Practice”

Consequences of deadlocks

1. Poor user experience

When a deadlock happens, the application may stall. Typical symptoms include “white pages” in web applications while the browser continues to spin, eventually resulting in a timeout.

Often, users retry their request by clicking refresh or re-submitting a form, which compounds the problem further.

2. System undergoes exponential degradation

When threads go into a deadlock situation, they take longer to respond. In the intervening period, a fresh set of requests may arrive into the system.

When deadlocks manifest in app servers, fresh requests get backed up in the ‘execution queue’. The thread pool hits its maximum utilization, so new requests are denied service. This causes further exponential degradation of the system.
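
The backlog effect can be illustrated with a plain `java.util.concurrent.ThreadPoolExecutor`. This is a sketch under stated assumptions: real application servers use their own thread pool and execution queue implementations, the class name and pool sizes here are hypothetical, and a latch that is never opened stands in for threads that are stuck in a deadlock.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PoolSaturationDemo {

    /** Submits requests that all hang and returns how many the pool refused to accept. */
    static int countRejected(int workers, int queueCapacity, int requests) {
        CountDownLatch stuck = new CountDownLatch(1); // simulates a lock that is never released
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                workers, workers, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity)); // bounded 'execution queue'
        int rejected = 0;
        try {
            for (int i = 0; i < requests; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            stuck.await(); // the worker thread hangs here
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++; // all workers busy and the queue is full
                }
            }
        } finally {
            stuck.countDown();
            pool.shutdownNow(); // clean up the demo pool
        }
        return rejected;
    }

    public static void main(String[] args) {
        System.out.println("Rejected requests: " + countRejected(2, 3, 10));
    }
}
```

With two workers and a queue of three, only five of the ten requests are accepted; everything after that is turned away until a worker frees up, which never happens while the threads stay stuck.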

3. Cascading impact on the entire app server cluster

In multi-tier applications, Web Servers (such as Apache or IBM HTTP Server) receive requests and forward them to Application Servers (such as WebLogic, WebSphere or JBoss) via a ‘plug-in’.

If the plug-in detects that an Application Server is unhealthy, it will “fail over” to another, healthy application server, which then has to accept a heavier load than usual, resulting in further slowness.

This may cause a cascading slowdown effect on the entire cluster.

Why are deadlocks difficult to troubleshoot in a clustered, multi-tier environment?

Application support teams are usually caught off-guard when faced with deadlocks in production environments.


1. Deadlocks do not usually exhibit the familiar symptoms of a performance problem, such as a spike in CPU or memory usage. This makes it hard to track and diagnose deadlocks based on basic operating system metrics.

2. Deadlocks may not show up until the application is moved to a clustered environment. Single application server environments may not manifest latent defects in the code.

3. Deadlocks usually manifest at the worst possible time: under heavy production load. They may not show up under low or moderate load, and for the same reason they are difficult to replicate in a testing environment.

Options to diagnose deadlocks

There are various options available to troubleshoot deadlock situations.

1. The naïve way: Kill the process and cross your fingers

You could kill the application server process and hope that when the application server starts again, the problem will go away.

However, restarting the app server is a temporary fix that does not resolve the root cause. The deadlocks can be triggered again once the app server comes back.

 

2. The laborious way: Take thread dumps in your cluster of JVMs

You could take thread dumps. On UNIX, you trigger a thread dump by sending the JVM process a SIGQUIT signal (a “kill -3” command); on Windows, you press “Ctrl-Break” in the console in which the JVM is running.

Typically, you would need to capture a series of thread dumps (example: 6 thread dumps spaced 20 seconds apart) to infer any thread patterns – just a static thread dump snapshot may not suffice.

If you are running the application server as a Windows service (which is usually the case), it is a little more complicated because there is no console to send “Ctrl-Break” to. If you are running the HotSpot JVM, you could use the jps utility to find the process id and then use the jstack utility to take thread dumps. You can also use the jconsole utility to connect to the process in question.
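
For comparison, a series of dumps can also be captured from inside the JVM through the standard `java.lang.management` API. This is only a sketch of the idea: `ThreadDumpSeries` and its output format are hypothetical, the code has to run inside the JVM being examined (unlike jstack, which attaches from outside), and the count and interval simply mirror the example figures above.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

class ThreadDumpSeries {

    /** Captures `count` thread dumps of the current JVM, spaced `intervalMillis` apart. */
    static List<String> capture(int count, long intervalMillis) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        List<String> dumps = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            // Ask for lock details only where the JVM supports reporting them.
            ThreadInfo[] infos = threads.dumpAllThreads(
                    threads.isObjectMonitorUsageSupported(),
                    threads.isSynchronizerUsageSupported());
            StringBuilder dump = new StringBuilder();
            for (ThreadInfo info : infos) {
                dump.append('"').append(info.getThreadName()).append("\" ")
                    .append(info.getThreadState());
                if (info.getLockName() != null) {
                    dump.append(" waiting on ").append(info.getLockName());
                }
                if (info.getLockOwnerName() != null) {
                    dump.append(" owned by \"").append(info.getLockOwnerName()).append('"');
                }
                dump.append('\n');
                for (StackTraceElement frame : info.getStackTrace()) {
                    dump.append("    at ").append(frame).append('\n');
                }
            }
            dumps.add(dump.toString());
            if (i < count - 1) {
                try {
                    Thread.sleep(intervalMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break; // stop early but keep what was captured
                }
            }
        }
        return dumps;
    }

    public static void main(String[] args) {
        // Six dumps, twenty seconds apart, as in the example above.
        for (String dump : capture(6, 20_000)) {
            System.out.println(dump);
            System.out.println("----");
        }
    }
}
```

Threads that show the same BLOCKED state, the same lock and the same lock owner in every dump of the series are the ones worth a closer look.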

You would have to forward the thread dumps to the development team and wait for them to analyze and get back. Depending on the size of the cluster, there would be multiple files to trawl through and this might entail significant time.

This is not a situation you want to be in at 2 am when the business team is waiting on a quick resolution.

Manual processes to troubleshoot deadlocks can be time consuming

The manual approach of taking thread dumps assumes that you know which JVM(s) are suffering from deadlocks.

Chances are that the application is hosted in a high-availability, clustered Application Server farm with tens (if not hundreds) of servers.


If only a subset of JVMs are undergoing the deadlock problem, you may not be in a position to precisely know which JVM is undergoing thread contention or deadlocks. You would have to resort to taking thread dumps across all of your JVMs.

This becomes a trial-and-error approach which is laborious and time consuming. While this approach may be viable for a development or staging environment, it is not viable for a business-critical production environment where ‘Mean Time To Repair’ (MTTR) is key.
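
Part of that trial and error can be scripted. The sketch below asks each JVM in a list whether it currently sees deadlocked threads, using the standard JMX remote API. It rests on several assumptions that may not hold in your environment: every JVM was started with the out-of-the-box remote JMX agent enabled, the connection needs no authentication or SSL, and the host:port values passed on the command line are placeholders you supply. The class name is hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

class ClusterDeadlockScan {

    /** Returns the number of deadlocked threads seen by the JVM behind the given connection. */
    static int deadlockedThreadCount(MBeanServerConnection connection) {
        try {
            ThreadMXBean threads = ManagementFactory.newPlatformMXBeanProxy(
                    connection, ManagementFactory.THREAD_MXBEAN_NAME, ThreadMXBean.class);
            long[] ids = threads.isSynchronizerUsageSupported()
                    ? threads.findDeadlockedThreads()         // monitors and java.util.concurrent locks
                    : threads.findMonitorDeadlockedThreads(); // monitors only
            return ids == null ? 0 : ids.length;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Connects to one remote JVM; hostAndPort is an assumed "host:port" of its JMX agent. */
    static int scanRemote(String hostAndPort) throws IOException {
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://" + hostAndPort + "/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            return deadlockedThreadCount(connector.getMBeanServerConnection());
        }
    }

    public static void main(String[] args) {
        for (String hostAndPort : args) {
            try {
                System.out.println(hostAndPort + ": " + scanRemote(hostAndPort) + " deadlocked thread(s)");
            } catch (IOException | UncheckedIOException e) {
                System.out.println(hostAndPort + ": unreachable (" + e.getMessage() + ")");
            }
        }
    }
}
```

A scan like this only narrows down which JVMs to dump; it still leaves the analysis of the offending code paths to a person or to a monitoring tool.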

 

3. The smart way: Leverage an APM

While an APM (Application Performance Management) product cannot prevent deadlocks from happening, it can certainly provide deeper visibility into the root cause, down to the code level, when they do happen.

In the next blog post, we’ll explore how the eG Java Monitor can help provide an end-to-end perspective of the system in addition to pinpointing the root cause for deadlocks.

Stay tuned!

 

About the Author

Arun Aravamudhan has more than 15 years of experience in the Java enterprise applications space. In his current role with eG Innovations, Arun leads the Java Application Performance Management (APM) product line.

New White Paper: Managing Java Application Performance


Java-based applications are powering many business-critical IT services. Performance monitoring and diagnosis of the Java Virtual Machine can provide key insights into performance issues that can have a significant impact on the business services it supports.

For example, a single run-away thread in the JVM could take up significant CPU resources, slowing down performance for the entire service. Alternatively, a deadlock between two key threads could bring the business service to a grinding halt.

Read the white paper “Managing Java Application Performance” and find out how to deliver:

  • Reliable performance assurance and user satisfaction
  • Complete performance visibility across your service environment
  • Automatic, rapid root cause performance diagnosis and analytics for even the most complex performance problems
  • Pre-emptive problem detection and alerting
  • Rapid ROI and cost savings through right-sizing and optimization

Designing High Performance Java / J2EE Applications is not Easy!


Business applications developed in Java have become incredibly complex. Java developers have to have expertise with numerous technologies – JSPs, Servlets, EJBs, Struts, Hibernate, JDBC, JMX, JMS, JSF, Web services, SOAP, thread pools, object pools, etc., not to forget the core Java principles like synchronization, multi-threading, caching, etc. Malfunctioning of any of these technologies can result in slow-downs, application freezes, and errors with key business applications.

Anatomy of a Java developer

In one of the articles I was reading last week, I saw a very interesting table that highlighted the different types of failures commonly seen in J2EE applications. Below is an adaptation of this table listing common J2EE problems and their causes. This table gives a very good idea of why designing high performance Java/J2EE applications requires a lot of expertise (and of course, you need to have the right tools handy to be able to troubleshoot such applications rapidly, with minimal effort).

| Java Programming Disease | Description | Symptoms | Causes or Cures |
|---|---|---|---|
| Bad Coding: Infinite Loop | Threads become stuck in while(true) statements and the like. This comes in CPU-bound and wait-bound/spin-wait variants. | Foreseeable lockup | You’ll need to perform an invasive loop-ectomy. |
| Bad Coding: CPU-bound Component | This is the common cold of the J2EE world. One bit of bad code or a bad interaction between bits of code hogs the CPU and slows throughput to a crawl. | Consistent slowness; slower and slower under load | The typical big win is a cache of data or of performed calculations. |
| The Unending Retry | This involves continual (or continuous in extreme cases) retries of a failed request. | Foreseeable backup; sudden chaos | It might just be that a back-end system is completely down. Availability monitoring can help there, or simply differentiating attempts from successes. |
| Threading: Chokepoint | Threads back up on an over-ambitious synchronization point, creating a traffic jam. | Slower and slower under load; sporadic hangs or aberrant errors; foreseeable lockup; sudden chaos | Perhaps the synchronization is unnecessary (with a simple redesign) or perhaps more exotic locking strategies (e.g., reader/writer locks) may help. |
| Threading: Deadlock / Livelock | Most commonly, it’s your basic order-of-acquisition problem. | Sudden chaos | Treatment options include detecting if locking is really necessary, using a master lock, deterministic order-of-acquisition, and the banker’s algorithm. |
| Over-Usage of External Systems | The J2EE application abuses a backend system with requests that are too large or too numerous. | Consistent slowness; slower and slower under load | Eliminate redundant work requests, batch similar work requests, break up large requests into several smaller ones, tune work requests or back-end system (e.g., indexes for common query keys), etc. |
| External Bottleneck | A back end or other external system (e.g., authentication) slows down, slowing the J2EE app server and its applications as well. | Consistent slowness; slower and slower under load | Consult a specialist (responsible third party or system administrator) for treatment of said external bottleneck. |
| Layer-itis | A poorly implemented bridge layer (JDBC driver, CORBA link to legacy system) slows all traffic through it to a crawl with constant marshalling and unmarshalling of data and requests. The disease is easily confused with External Bottleneck in its early stages. | Consistent slowness; slower and slower under load | Check version compatibility of bridge layer and external system. Evaluate different bridge vendors if available. Re-architecture may be necessary to bypass the layer altogether. |
| Internal Resource Bottleneck: Over-Usage or Under-Allocation | Internal resources (threads, pooled objects) become scarce. Is over-utilization occurring in a healthy manner under load or is it because of a leak? | Slower and slower under load; sporadic hangs or aberrant errors | Under-allocation: increase the maximum pool size based on highest expected load. Over-usage: see Over-Usage of External Systems. |
| Linear Memory Leak | A per-unit (per-transaction, per-user, etc.) leak causes memory to grow linearly with time or load. This degrades system performance over time or under load. Recovery is only possible with a restart. | Slower and slower over time; slower and slower under load | This is most typically linked with a resource leak, though many exotic strains exist (for example, linked-list storage of per-unit data or a recycling/growing buffer that doesn’t recycle). |
| Exponential Memory Leak | A leak with a doubling growth strategy causes an exponential curve in the system’s memory consumption over time. | Slower and slower over time; slower and slower under load | This is typically caused by adding elements to a collection (Vector, HashMap) that are never removed. |
| Resource Leak | JDBC statements, CICS transaction gateway connections, and the like are leaked, causing pain for both the Java bridge layer and the backend system. | Slower and slower over time; foreseeable lockup; sudden chaos | Typically, this is caused by a missing finally block or a simpler failure to close objects that represent external resources. |
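
To make one of the cures concrete, here is a minimal sketch of deterministic order-of-acquisition, the treatment listed for the Deadlock row. Everything in it is illustrative: the `Account` class, its numeric `id` and the transfer scenario are hypothetical, and the approach assumes every lockable object carries a unique, comparable identifier.

```java
import java.util.concurrent.locks.ReentrantLock;

class OrderedLocking {

    /** Hypothetical resource guarded by its own lock; ids are assumed to be unique. */
    static final class Account {
        final long id;
        final ReentrantLock lock = new ReentrantLock();
        long balance;

        Account(long id, long balance) {
            this.id = id;
            this.balance = balance;
        }
    }

    /** Moves money between accounts, always locking the lower id first. */
    static void transfer(Account from, Account to, long amount) {
        // The lock order depends only on the ids, never on the direction of the transfer,
        // so two opposing transfers cannot each hold one lock and wait for the other.
        Account first = from.id < to.id ? from : to;
        Account second = first == from ? to : from;
        first.lock.lock();
        try {
            second.lock.lock();
            try {
                from.balance -= amount;
                to.balance += amount;
            } finally {
                second.lock.unlock();
            }
        } finally {
            first.lock.unlock();
        }
    }

    /** Runs transfers in both directions concurrently and returns the final balances. */
    static long[] runOpposingTransfers(int iterations) {
        Account a = new Account(1, 1_000);
        Account b = new Account(2, 1_000);
        Thread forward = new Thread(() -> {
            for (int i = 0; i < iterations; i++) transfer(a, b, 1);
        });
        Thread backward = new Thread(() -> {
            for (int i = 0; i < iterations; i++) transfer(b, a, 1);
        });
        forward.start();
        backward.start();
        try {
            forward.join();
            backward.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new long[] {a.balance, b.balance};
    }

    public static void main(String[] args) {
        long[] balances = runOpposingTransfers(100_000);
        System.out.println("a=" + balances[0] + ", b=" + balances[1]);
    }
}
```

If `transfer` locked `from` before `to` instead, the two threads in `runOpposingTransfers` would acquire the locks in opposite orders, which is exactly the order-of-acquisition problem the table describes.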

As you can see from the above table, monitoring a J2EE application end-to-end requires:

  • Tracking key metrics specific to the application server in use (e.g., WebLogic, WebSphere, JBoss, Tomcat, etc.)
  • Monitoring of the external dependencies of the Java application tier – e.g., databases, Active Directory, messaging servers, networks, etc.
  • Finally, all of the metrics have to be correlated together – based on time, and based on inter-dependencies between applications in the infrastructure, so that when a problem occurs, administrators are equipped to quickly determine what is causing the problem – i.e., network? database? application? web?

Below are several relevant links about how eG Enterprise helps with end-to-end monitoring, diagnosis, and reporting for J2EE applications.

Also of interest is this on-line webinar titled “Managing N-Tiers without Tears”. Click here to view this webinar >>>

Java Monitoring Made Easy! How We Eat Our Own Dog Food :-)


Several years ago when we started to use Java technology in our products, this technology was in its infancy. We had a lot of teething problems, but multi-platform support was important to us and we continued to pull along with Java technologies.

For many years, Java has lacked cost-effective, easy to use tools and methodologies to monitor applications. Troubleshooting was often a manual, trial and error approach. 

As our monitoring application got bigger, troubleshooting got far more complex. Byte-code instrumentation has been one of the common techniques used by monitoring and troubleshooting tools for Java applications. This technology has been very expensive and resource intensive, and hence is often used in development environments and not in production.

The last couple of Java releases (JDK 1.5 and higher) have incorporated excellent monitoring and diagnostic interfaces that can be used to troubleshoot Java applications. The need to understand how our own Java-based monitoring application was behaving necessitated that we take a closer look at these monitoring specifications and interfaces for the Java Virtual Machine.
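
As an illustration of what these built-in interfaces expose, here is a small sketch that reads a few values through the `java.lang.management` API that ships with the JDK. It is not how our monitoring module is implemented; the class name and the choice of metrics are arbitrary and only meant to show the flavor of the API.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.management.ThreadMXBean;
import java.util.LinkedHashMap;
import java.util.Map;

class JvmSnapshot {

    /** Collects a handful of JVM health metrics from the platform MXBeans. */
    static Map<String, Long> collect() {
        Map<String, Long> metrics = new LinkedHashMap<>();

        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        metrics.put("heapUsedBytes", heap.getUsed());
        metrics.put("heapCommittedBytes", heap.getCommitted());

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        metrics.put("liveThreads", (long) threads.getThreadCount());
        metrics.put("peakThreads", (long) threads.getPeakThreadCount());

        long collections = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 when the collector does not report it
            if (count > 0) {
                collections += count;
            }
        }
        metrics.put("gcCollections", collections);
        return metrics;
    }

    public static void main(String[] args) {
        collect().forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Because the same MXBeans can be reached over JMX, values like these can be read from a running production JVM without byte-code instrumentation.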

The result – a new Java monitoring module that is going to be available as an integral part of our next major product release.

Monitoring Java Applications to the Code-level using the eG Java Monitor

This software has been extensively used in our labs for the last couple of years, and I have experienced first hand how effective this technology is. The level of visibility and the precision of the diagnostics are incredible. This module has saved us endless hours of troubleshooting time and we hope that when this gets to our customers, they will benefit in the same way.

We’re quite excited about this capability – the instrumentation provided in the JVM has been great, and hats off to our developers for providing a very clean and easy-to-use interface. It should be simple for any support person to use, and it will also appeal to any Java programmer because it provides navigation similar to many of the tools built into the JVM.

You can read more about this technology here. You can also take a sneak peek at this technology by viewing a recorded demonstration here. Please do contact us if you are interested in getting access to an early release of this software.