Saturday, April 9

WebLogic Server 8.1 Administration

Agenda
• Basic design concepts of WebLogic Server
• WebLogic Threading Model
• WebLogic Classloading
• WebLogic Clustering (communication between Apache and WebLogic cluster)
• WebLogic Application Deployment
• WebLogic JDBC, JMS, JMX, JNDI
• WebLogic Performance tuning
• WebLogic Troubleshooting
• Enhancements in WLS 9.0


Basic design concepts of WebLogic server - Domain

• Domain
A domain is a logically related group of WebLogic Server resources that you manage as a unit. A domain always includes exactly one WebLogic Server instance called the Administration Server, which serves as the central point of contact for server instances and system administration tools.

A domain may also include additional WebLogic Server instances called Managed Servers.

• All domain configuration information is stored in config.xml.
Attributes in config.xml are held as configuration MBeans (ConfigMBeans) in the memory of the Admin server and the managed servers.

• Server startup:
Admin server: As part of its startup process, the server parses config.xml and creates the MBeans.
For example, from the Server entry for myserver in config.xml, the myserver instance creates a ConfigMBean of type ServerMBean with the attributes ListenPort=7001 and Name=myserver.
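
The parsing step can be pictured with a minimal sketch. It assumes a heavily trimmed, hypothetical config.xml containing a single Server element, and it uses a plain map in place of WebLogic's ServerMBean. The class name ConfigXmlSketch and the sample XML are illustrative only and are not taken from a real domain.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.NodeList;

// Illustrative only: a plain map stands in for the ServerMBean that WebLogic builds.
class ConfigXmlSketch {

    // Hypothetical, trimmed config.xml; a real file carries many more elements and attributes.
    static final String SAMPLE =
          "<Domain Name=\"mydomain\">"
        + "<Server Name=\"myserver\" ListenPort=\"7001\"/>"
        + "</Domain>";

    // Returns the attributes of the first <Server> element whose Name matches,
    // or an empty map if no such element exists.
    static Map<String, String> serverAttributes(String xml, String serverName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList servers = doc.getElementsByTagName("Server");
            for (int i = 0; i < servers.getLength(); i++) {
                Element server = (Element) servers.item(i);
                if (serverName.equals(server.getAttribute("Name"))) {
                    Map<String, String> attrs = new LinkedHashMap<>();
                    NamedNodeMap raw = server.getAttributes();
                    for (int j = 0; j < raw.getLength(); j++) {
                        attrs.put(raw.item(j).getNodeName(), raw.item(j).getNodeValue());
                    }
                    return attrs;
                }
            }
            return new LinkedHashMap<>();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse the sample configuration", e);
        }
    }

    public static void main(String[] args) {
        // Prints the attribute map for myserver (attribute order depends on the XML parser).
        System.out.println(serverAttributes(SAMPLE, "myserver"));
    }
}
```

Each attribute of the Server element becomes one entry in the map, in the same way that each attribute becomes a getter/setter pair on the real MBean.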

Basic design concepts of WebLogic server - Administration Server
The Admin server is just another server instance, but with added functionality.

It is equipped to drive deployments (it is the master deployer), monitor the managed servers, and so on.
All weblogic.Admin utility commands are initiated from the Admin server.
The Admin server acts as the central point for managing the domain.

Basic design concepts of WebLogic server - Managed Server
Managed Server:
A managed server connects to the Admin server over an HTTP connection, gets a copy of the config.xml file, and creates the related MBeans. These MBeans exist in the server's memory for its lifetime.

Once a managed server has started successfully, it no longer needs the Admin server. Applications hosted on managed servers can be accessed whether or not the Admin server is up and running. The Admin server is needed only for deployments and for administrative activity such as accessing the console.

Basic design concepts of WebLogic server - Managed Server Independence mode
• Managed Server Independence (MSI) mode:
A managed server can also start without the Admin server if MSI mode is enabled. In this mode, the managed server looks for the file “msi-config.xml” under its root directory and creates its MBeans from that file in order to start up.
The Admin server creates msi-config.xml and copies it to the managed server if MSI replication is enabled in the Admin console.

For details look under:
http://e-docs.bea.com/wls/docs81/adminguide/failures.html#1118666
Basic design concepts of WebLogic server - Server startup
Server startup:
WebLogic Server can be started in the following ways:
• Using scripts
• Node Manager
• weblogic.Admin START command
• On Windows, WebLogic Server can be started as a Windows service (beasvc.exe)

Using Scripts
The WebLogic installation creates the startup scripts. We can modify them to suit our environment.
All the scripts contain:
java weblogic.Server

The main class, weblogic.Server, initiates the server startup sequence.
As part of the startup sequence (irrespective of how the server is started), various services are initialized, such as the RMI service, cluster service, and JNDI service.
Basic design concepts of WebLogic server - Node Manager

• Node Manager is a Java utility that runs as a separate process from WebLogic Server and allows you to perform common operations tasks for a Managed Server, regardless of its location with respect to its Administration Server.
• There should be one Node Manager process on each physical machine.
• Node Manager is domain independent.

Basic design concepts of WebLogic server - beasvc
• WebLogic Server can be started as a Windows service.

The executable that initiates the server startup is beasvc.exe.
The scripts used to create the service are generated during installation.

Run the script (edit the classpath and JVM arguments if needed) to create the service; this also updates the registry.
Start the service from the Control Panel.

Basic design concepts of WebLogic server - Server lifecycle

The following phases are involved in the server startup cycle:
SHUTDOWN -> STARTING -> STANDBY (optional) -> RUNNING

The following high-level activities occur during server startup:
• The server obtains its configuration data. The Admin server reads it from config.xml; a managed server gets it from the Admin server.
• It starts the logging and timer services and loads the license.bea file.
• It initializes internal component services such as the RMI service, cluster service, IIOP service, and Deployment Manager service.
• JDBC connection pools are created.
• Startup classes (e.g. Wily Introscope) are executed.
• Applications are deployed.

• The following phases are involved in the graceful shutdown cycle (when the weblogic.Admin utility is used or the server is stopped from the console):

RUNNING -> SUSPENDING -> STANDBY -> SHUTTING DOWN -> SHUTDOWN

A graceful shutdown allows all in-flight work to finish before the server shuts down.

The first services to shut down are the RMI service and the web container, so that no new requests can come through.

The server waits up to the “Graceful Shutdown Timeout” before shutting down.

Depending on “Ignore Sessions During Shutdown”, existing HTTP sessions are either destroyed immediately or the server waits until they expire.
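
The two cycles can be read as a small state machine. The sketch below is an illustration only: it encodes just the transitions listed in the text (forced shutdown and failure states are deliberately left out), and the names LifecycleSketch, State, and isValidPath are hypothetical, not WebLogic APIs.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Illustrative state machine covering only the transitions named in the text.
class LifecycleSketch {

    enum State { SHUTDOWN, STARTING, STANDBY, RUNNING, SUSPENDING, SHUTTING_DOWN }

    static final Map<State, Set<State>> ALLOWED = new EnumMap<>(State.class);
    static {
        ALLOWED.put(State.SHUTDOWN, EnumSet.of(State.STARTING));
        // STANDBY is optional during startup, so STARTING may go straight to RUNNING.
        ALLOWED.put(State.STARTING, EnumSet.of(State.STANDBY, State.RUNNING));
        // STANDBY leads to RUNNING on startup and to SHUTTING_DOWN on graceful shutdown.
        ALLOWED.put(State.STANDBY, EnumSet.of(State.RUNNING, State.SHUTTING_DOWN));
        ALLOWED.put(State.RUNNING, EnumSet.of(State.SUSPENDING));
        ALLOWED.put(State.SUSPENDING, EnumSet.of(State.STANDBY));
        ALLOWED.put(State.SHUTTING_DOWN, EnumSet.of(State.SHUTDOWN));
    }

    // Returns true if every consecutive pair in the path is an allowed transition.
    static boolean isValidPath(State... path) {
        for (int i = 0; i + 1 < path.length; i++) {
            if (!ALLOWED.get(path[i]).contains(path[i + 1])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValidPath(State.SHUTDOWN, State.STARTING, State.RUNNING));
        System.out.println(isValidPath(State.RUNNING, State.SUSPENDING, State.STANDBY,
                State.SHUTTING_DOWN, State.SHUTDOWN));
    }
}
```

In this model a direct RUNNING -> SHUTDOWN jump is rejected, which mirrors the point of a graceful shutdown: the server passes through SUSPENDING so that in-flight work can finish first.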


Basic design concepts of WebLogic server - Production vs. Development modes
WebLogic Server can be started in development mode or production mode.
In development mode there is an application poller.
This is a simple utility that polls the “/applications” directory every 3 seconds (the default). If it finds a different version of an application in the directory, WebLogic redeploys the application automatically.
This adds a little overhead, which is acceptable in a development environment.

In production mode the application poller is turned off.
The user has to deploy either from the console or with the weblogic.Deployer utility.

Basic design concepts of WebLogic server - Domain-Wide Administration Port
This optional feature separates administration traffic from application traffic.
A separate port is configured, and all administrative tasks (including access to the Admin console) go through that port.
The drawbacks of this setup are:
• The Administration Server and all Managed Servers in your domain must be configured with support for the SSL protocol.
• After enabling the administration port, all Administration Console traffic must connect via the administration port.

Basic design concepts of WebLogic server - Admin Server recovery
If the Admin server machine is down and the Admin server has to be moved to another machine:
1) Back up the config.xml file.
2) Back up the security data (the ldap directory and SerializedSystemIni.dat).
Put these files into the domain on the new machine and start the Admin server.
The new Admin server automatically reconnects to the running managed servers.

Basic design concepts of WebLogic server
• WebLogic 8.1 is J2EE 1.3 compatible.
• Managed Servers within the domain can be at different Service Pack levels as long as the Admin Server is at the same Service Pack level as, or a higher level than, its Managed Servers.
• All servers that are part of a cluster should be at the same Service Pack level.

Basic design concepts of WebLogic server - WebLogic internal applications
WebLogic deploys some internal applications by default to perform internal activities.
console.war: The WebLogic Admin console is a web application deployed only on the Admin server.
FileDistributionServlet: This servlet handles communication between the Admin server and managed servers (deployment, managed server startup, etc.).
E.g. during deployment, this servlet copies the application to the staging directory of every managed server.
uddi.war/uddiexplorer.war: These serve web service requests.

WebLogic Classloading
A classloader is the part of the JVM that loads classes into memory. It is responsible for finding classes at runtime.
The following classloaders are involved with WebLogic Server:

• Bootstrap classloader (JVM)
• Extensions classloader (JVM)
• Application classloader (WebLogic)

Points to remember:
• A Java classloader asks its parent classloader for a class before searching itself. If a class with the same name exists both in an EAR/WAR file and on the system classpath, the class from the system classpath is picked up by default, because the system classpath classloader is the parent of the application classloader (as in the flow diagram above).
• Classes in a WAR file can see and load EJB classes but not vice versa, because the WAR classloader is a child of the EJB classloader.
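
Parent-first delegation can be observed with plain JDK classes. The sketch below is a generic illustration, not WebLogic's classloader hierarchy: it creates a child URLClassLoader with no URLs of its own, so anything it resolves must come from a parent. The class name DelegationSketch is hypothetical.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Generic JDK illustration of parent-first delegation (not WebLogic's own loaders).
class DelegationSketch {

    // Loads a class through a child loader that has no classpath entries of its own.
    static Class<?> loadThroughChild(String className) {
        ClassLoader parent = DelegationSketch.class.getClassLoader();
        try (URLClassLoader child = new URLClassLoader(new URL[0], parent)) {
            return child.loadClass(className);
        } catch (Exception e) {
            throw new IllegalStateException("could not load " + className, e);
        }
    }

    public static void main(String[] args) {
        Class<?> viaChild = loadThroughChild("java.lang.String");
        // The child delegated upward, so this is the very same Class object.
        System.out.println(viaChild == String.class);
        // Classes loaded by the bootstrap loader report a null classloader.
        System.out.println(viaChild.getClassLoader());
        // Walk the parent chain of the loader that loaded this sketch.
        for (ClassLoader cl = DelegationSketch.class.getClassLoader(); cl != null; cl = cl.getParent()) {
            System.out.println(cl);
        }
    }
}
```

The same mechanism explains the two points above: a child loader (for example a WAR's loader) sees everything its parents can load, while a parent never sees classes that only a child can load.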

WebLogic Threading Model
• WebLogic Server is a multi-threaded Java application that uses an internal component called an “Execute Thread” (which extends java.lang.Thread) to perform its work.

• Threads are created out of thread pools. Each thread pool serves a specific purpose.

The important thread pools on WebLogic 8.1 are:
• Thread Group: weblogic.socket.Muxer
Defaults to 3 threads on Unix systems and 2 on Windows. Threads from this group are used for socket reading.
• Thread Group: weblogic.health.CoreHealthMonitor
CoreHealthService creates an instance of this thread to periodically monitor the server's memory and the state of its threads. It reports stuck threads.
• Thread Group: weblogic.admin.RMI
Threads from this group are used for communication between the Admin server and managed servers.
E.g. deployment of an application.
• Thread Group: weblogic.kernel.System
Threads from this group are used for WebLogic internal activities such as RJVM heartbeats and getting the HTTP state dump for JNDI updates in a cluster.
• Thread Group: weblogic.admin.HTTP
Threads from this group serve all console-related requests.
• Thread Group: weblogic.kernel.Non-Blocking
Threads from this group are used for session replication.
• There are other threads created by the JVM.
For example, with the Sun JVM we can see:

"Signal Dispatcher" daemon prio=10 tid=0x009226e8 nid=0xc24 waiting on condition[0..0]
The Signal Dispatcher is responsible for forwarding native events.
For example, when a user issues kill -3, this thread initiates the thread dump.

• Thread Group: weblogic.kernel.Default
The threads from this group are the main worker threads that do the real work.
These threads service requests to the applications hosted on WebLogic.
There are 15 threads by default. The number of execute threads can be changed through the Admin console.

WebLogic Threading - Custom Queues

Custom thread queues are user defined.
Setting one up takes two steps:
• Create a thread queue. This can be done from the Admin console.
• Assign the thread queue to an application by including the queue name in the application's descriptor file.
For example:
The WebLogic Admin console (console.war) is a web application that has a custom thread queue. In the weblogic.xml of console.war, the custom queue is assigned as:


<wl-dispatch-policy>weblogic.admin.HTTP</wl-dispatch-policy>


Suppose we create a sample thread queue called “my-sample-queue”.
To assign it to our web application, add this entry to the weblogic.xml of the web application:

<wl-dispatch-policy>my-sample-queue</wl-dispatch-policy>



WebLogic Application hosting
• Applications deployed on WebLogic Server can be:
1) EAR ( WAR + EJB jar + Custom jars)
Descriptor files are:
META-INF/application.xml
META-INF/weblogic-application.xml

2) WAR ( Servlet/JSPs + java libraries,classes)
Descriptor files are:
WEB-INF/web.xml
WEB-INF/weblogic.xml

3) EJB (EJB interfaces, java libraries/classes)
Descriptor files:
META-INF/ejb-jar.xml
META-INF/weblogic-ejb-jar.xml

4) RAR ( J2EE connector)
Descriptor files:
META-INF/ra.xml
META-INF/weblogic-ra.xml

Applications can be deployed in archived format (EAR, WAR) or in exploded (directory) format, although EJBs are rarely deployed in directory format.



Deployment modes

Deployment can be done through the Admin console or from the command line using the weblogic.Deployer utility.
More about the deployer utility can be found at:
http://e-docs.bea.com/wls/docs81/deployment/tools.html#wldeployer

The following are the modes of deployment:
• No Stage
• Stage
• External_Stage

Deployment
• WebLogic deploys in two phases (two-phase deployment).

Prepare: This phase makes sure that the application is in a state in which it can be deployed reliably. During this phase, the Admin server copies files to the targets (managed servers).

Activate: This phase performs the actual deployment, i.e. the classes are loaded into memory. It takes place only if the prepare phase is successful.

Deployment Modes

No Stage
In this mode, WebLogic deploys the application from the location given by the Path= attribute of the application's entry in config.xml.
WebLogic does not copy the application to a staging directory, nor does it distribute a copy to the managed servers. All servers have to share a common file system so that they all reference the application from one physical location.

Stage
WebLogic copies the application (WAR/EAR) into the staging directory of each managed server, handling all the file transfer itself, and deploys it from there.
This is the default deployment mode unless changed.

External_Stage
WebLogic deploys from the staging directory, but we (the users) are responsible for getting the application (EAR/WAR) into the staging directory. WebLogic does no file transfer to the managed servers.

The difference between No Stage and External_Stage is that No Stage is intended for people who have one shared disk, so the Path= attribute is meaningful for every server in the domain.
External_Stage is for people who don't use a shared disk, so Path= has meaning only on the Admin server. Each managed server has a stagedPath that makes sense for that server.

WebLogic Deployment
At Narayanasetti's company we used External_Stage as our deployment mode.

Our scripts do the file transfer and WebLogic does the deployment.

Moreover, we use the command-line utility weblogic.Deployer for deployment.

WebLogic JDBC
Connection pool:
A connection pool is a named group of identical JDBC connections to a database that are created when the connection pool is deployed. Pooling reduces the overhead of creating connection objects at runtime.

JDBC flow logic:
When the server starts up, connections are created according to the initial capacity of the JDBC pool, using the information provided in the pool attributes.
At runtime, when a client asks for a connection, WebLogic takes an existing connection from the pool and pings it (the test query is user defined).
If the ping succeeds, WebLogic gives the connection to the client; otherwise it recreates the connection and then hands it over.
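
The flow above can be sketched with a tiny generic pool. This is a minimal illustration under stated assumptions, not WebLogic's implementation: the connection type is a fake object rather than java.sql.Connection, the "ping" is any user-supplied predicate, and the names JdbcPoolSketch, Pool, and FakeConnection are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Minimal test-on-reserve pool; a stand-in for the flow described in the text.
class JdbcPoolSketch {

    // Fake connection so that the sketch runs without a database.
    static final class FakeConnection {
        boolean alive = true;
    }

    static final class Pool<C> {
        private final Deque<C> idle = new ArrayDeque<>();
        private final Supplier<C> factory; // opens a new physical connection
        private final Predicate<C> ping;   // user-defined test, e.g. a cheap query

        // Connections are created up front according to the initial capacity.
        Pool(int initialCapacity, Supplier<C> factory, Predicate<C> ping) {
            this.factory = factory;
            this.ping = ping;
            for (int i = 0; i < initialCapacity; i++) {
                idle.push(factory.get());
            }
        }

        // Take a pooled connection, ping it, and recreate it if the ping fails.
        synchronized C reserve() {
            C candidate = idle.isEmpty() ? factory.get() : idle.pop();
            return ping.test(candidate) ? candidate : factory.get();
        }

        synchronized void release(C connection) {
            idle.push(connection);
        }
    }

    public static void main(String[] args) {
        Pool<FakeConnection> pool = new Pool<>(1, FakeConnection::new, c -> c.alive);
        FakeConnection first = pool.reserve();
        first.alive = false;          // simulate a connection that went stale
        pool.release(first);
        FakeConnection second = pool.reserve();
        // The stale connection failed the ping and was replaced by a fresh one.
        System.out.println(second != first && second.alive);
    }
}
```

The ping costs one extra round trip per reservation, which is the price paid for never handing a dead connection to the client.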
WebLogic JDBC (Flow chart)
WebLogic JDBC configuration
1) Create a connection pool from the Admin console.
Provide the JDBC driver attributes such as driver class, driver URL, and test table name.
2) Target the connection pool to the list of servers.
3) Create a DataSource and assign it to the pool created.
4) Target the DataSource to the servers where the connection pool is targeted.

WebLogic JDBC (Drivers)
• WebLogic has integrated JDBC drivers.
All the drivers that WebLogic ships are JDBC Type 4 thin drivers.
(All the WebLogic JDBC drivers are from DataDirect Technologies.)

WebLogic also includes some frequently used drivers with the installation. These are shipped for convenience, and it is advisable to use the latest drivers available from the vendor's website.

WebLogic JDBC
The following drivers come with the WebLogic installation (apart from the WebLogic Type 4 integrated drivers).
They are located under /beahome/weblogic81/server/ext/jdbc:
• Oracle 10g, 9.2 thin driver
• Sybase jConnect 5.5, 4.x
Any driver that implements the JDBC spec can be used with WebLogic. The driver should also be thread safe (suitable for multithreaded applications).
WebLogic supports Type 2 OCI drivers too. However, the thin driver is recommended because it is 100% Java and not prone to JVM crashes or native memory leaks. In the case of Oracle, the performance gap between the thin driver and the OCI driver has largely closed with JDK 1.4 (as per Oracle).

WebLogic JDBC MultiPool
MultiPool
This is a pool of connection pools. The pools can be connected to different DBMS instances or to a specific setup such as Oracle RAC.
Choose the algorithm based on application requirements (only one algorithm can be applied):

1. High Availability:
All connection requests are served from the first pool in the list. *Only* if that pool is not available (a connection fails and cannot be refreshed) is the request routed to the next pool in the list.

2. Load balancing:
Connection requests are distributed round-robin across the list of pools. However, if a connection request to a connection pool is not successful and the connection cannot be recreated (such as when the DB is down or the pool is suspended), the request is routed to the next connection pool in the list.

Oracle RAC:
The MultiPool concept can be applied to Oracle RAC, where each connection pool is connected to one of the Oracle instances and we choose the algorithm.
There are quite a few limitations to MultiPools when global (XA) transactions are involved.
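
The two algorithms differ only in where the search for a usable pool starts. The sketch below illustrates that selection logic under simplifying assumptions: a pool is reduced to a name and an availability flag, and the names MultiPoolSketch, Pool, Algorithm, and choose are hypothetical rather than WebLogic APIs.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative pool selection for the two MultiPool algorithms.
class MultiPoolSketch {

    enum Algorithm { HIGH_AVAILABILITY, LOAD_BALANCING }

    // Stand-in for a connection pool: only its name and availability matter here.
    static final class Pool {
        final String name;
        volatile boolean available = true;
        Pool(String name) { this.name = name; }
    }

    private final Algorithm algorithm;
    private final List<Pool> pools;
    private final AtomicInteger counter = new AtomicInteger();

    MultiPoolSketch(Algorithm algorithm, List<Pool> pools) {
        this.algorithm = algorithm;
        this.pools = pools;
    }

    // High availability always starts at the first pool; load balancing starts
    // at the next pool in round-robin order. Both fall through to the next
    // pool in the list when the chosen one is unavailable.
    Pool choose() {
        int size = pools.size();
        int start = algorithm == Algorithm.LOAD_BALANCING
                ? Math.floorMod(counter.getAndIncrement(), size)
                : 0;
        for (int i = 0; i < size; i++) {
            Pool candidate = pools.get((start + i) % size);
            if (candidate.available) {
                return candidate;
            }
        }
        throw new IllegalStateException("no connection pool is available");
    }

    public static void main(String[] args) {
        Pool a = new Pool("poolA");
        Pool b = new Pool("poolB");
        MultiPoolSketch ha = new MultiPoolSketch(Algorithm.HIGH_AVAILABILITY, Arrays.asList(a, b));
        System.out.println(ha.choose().name); // first pool while it is available
        a.available = false;
        System.out.println(ha.choose().name); // fails over to the second pool
    }
}
```

With Oracle RAC, each Pool in this picture would correspond to a connection pool bound to one RAC instance.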

WebLogic JMX
• WebLogic implements Sun JMX 1.0 (Java Management Extensions).
All WebLogic Server resources are managed through these JMX-based services.
For easy understanding, consider a JDBC connection pool (snippet from config.xml):


The JDBC connection pool resource is exposed as an MBean called “JDBCConnectionPoolMBean”.
• JDBCConnectionPoolMBean has many other attributes with default values (such as MinCapacity and MaxCapacity). The values that we set through the console override the defaults.
An MBean is just a concrete Java class that is coded as per the JMX specification and provides setters and getters for its attributes.
In our example, we can look up JDBCConnectionPoolMBean and then change an attribute such as the initial capacity from a JMX program.
• There is another set of MBeans called runtime MBeans.
As the name implies, these are read-only and are used to monitor runtime values.

Extending the same JDBC pool example, the related runtime MBean is JDBCConnectionPoolRuntimeMBean.
In its API we can see methods like:
getActiveConnectionsCurrentCount()
These return runtime values at that particular instant.

As a rule of thumb, MBeans whose changes are persisted to config.xml (JDBC pool capacity, driver name, timeout value, etc.) are configuration MBeans.
MBeans that expose read-only data are runtime MBeans.
The API documentation gives a better picture of all the methods.

As shown in the example, there are configuration MBeans as well as runtime MBeans for all WebLogic resources (JDBC, JMS, EJB, etc.).
This gives other vendors, such as Vignette, an easy way to monitor and administer WebLogic resources.
When are these configuration MBeans created?
At startup. The Admin server creates these MBeans (one for each resource) from config.xml; managed servers, when they start up, connect to the Admin server and get a copy of these MBeans into their local memory. Runtime MBeans are created at runtime and reflect instantaneous values.
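
The getter/setter idea behind an MBean can be tried with the JMX classes that ship with the JDK. The sketch below is not WebLogic's JDBCConnectionPoolMBean: PoolConfigMBean is a hypothetical one-attribute interface, the object name is made up, and the bean is registered in the JVM's platform MBean server rather than in a WebLogic domain.

```java
import java.lang.management.ManagementFactory;
import javax.management.Attribute;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.StandardMBean;

// Illustrative standard MBean with one configurable attribute.
class JmxSketch {

    // Hypothetical management interface; the real WebLogic MBean has many more attributes.
    public interface PoolConfigMBean {
        int getInitialCapacity();
        void setInitialCapacity(int capacity);
    }

    static final class PoolConfig implements PoolConfigMBean {
        private volatile int initialCapacity = 1;
        public int getInitialCapacity() { return initialCapacity; }
        public void setInitialCapacity(int capacity) { initialCapacity = capacity; }
    }

    // Registers the bean, changes InitialCapacity through JMX, and reads it back.
    static int changeInitialCapacity(int newValue) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName("sketch:Type=PoolConfig,Name=MyPool");
            if (server.isRegistered(name)) {
                server.unregisterMBean(name);
            }
            server.registerMBean(new StandardMBean(new PoolConfig(), PoolConfigMBean.class), name);
            server.setAttribute(name, new Attribute("InitialCapacity", newValue));
            return (Integer) server.getAttribute(name, "InitialCapacity");
        } catch (Exception e) {
            throw new IllegalStateException("JMX call failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(changeInitialCapacity(5));
    }
}
```

An attribute that has a getter but no setter is exposed as read-only, which is the shape runtime values such as getActiveConnectionsCurrentCount() take.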

WebLogic JNDI
• JNDI provides naming and directory functionality to applications.
Applications use naming services to locate objects such as data sources, EJBs, JMS destinations, and mail sessions on the network.
JNDI provides an abstraction, so the client need not worry about changes on the server side.
For example, take a DataSource associated with a JDBC connection pool:
the connection pool properties can be changed, or the pool can even be pointed at a different DB, without any code change in the client.
All the client has to do is look up the JNDI name and get the connection.
• WebLogic Server provides its own implementation, weblogic.jndi.WLInitialContextFactory, that uses the standard JNDI interfaces.
• To check the JNDI bindings on a server, open the WebLogic Admin console, right-click on the server name, and view the JNDI tree.
• In a WebLogic cluster, all the JNDI information is replicated across all the servers in the cluster through multicast.
WebLogic JMS
• JMS enables applications to communicate with one another through the exchange of messages. All J2EE servers implement the JMS specification.
In JMS, a message producer creates and sends a message, and a message consumer consumes it and possibly takes some further action.

The common messaging models in JMS are:

1) Publish-Subscribe Messaging
Publish-subscribe messaging is used when multiple applications need to receive the same messages.
The main concept in this model is the “topic”.
A publisher sends messages to a topic, and all applications that are subscribed to the topic receive them.
E.g. consider a stock quote application. The publisher broadcasts the updated stock quote (the topic), and all the subscribers that are interested in (subscribed to) that stock quote receive it.

2) Point-To-Point Messaging (PTP)
The main concept in this model is the “queue”.
This is a one-to-one model: each message has one sender and is consumed by one receiver.
A queue sender (producer) sends a message to a specific queue. A queue receiver (consumer) receives messages from a specific queue.
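
The difference between the two models is who gets a copy of each message. The sketch below shows only those delivery semantics with in-memory collections; it does not use the javax.jms API, and the names MessagingModelsSketch, Topic, and PtpQueue are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// In-memory illustration of topic versus queue delivery (not the JMS API).
class MessagingModelsSketch {

    // Publish-subscribe: every subscriber's inbox receives its own copy.
    static final class Topic {
        private final List<List<String>> inboxes = new ArrayList<>();

        List<String> subscribe() {
            List<String> inbox = new ArrayList<>();
            inboxes.add(inbox);
            return inbox;
        }

        void publish(String message) {
            for (List<String> inbox : inboxes) {
                inbox.add(message);
            }
        }
    }

    // Point-to-point: each message is handed to exactly one receiver.
    static final class PtpQueue {
        private final Queue<String> messages = new ArrayDeque<>();

        void send(String message) {
            messages.add(message);
        }

        // Returns the next message, or null when the queue is empty.
        String receive() {
            return messages.poll();
        }
    }

    public static void main(String[] args) {
        Topic quotes = new Topic();
        List<String> subscriberOne = quotes.subscribe();
        List<String> subscriberTwo = quotes.subscribe();
        quotes.publish("BEAS 12.50");
        System.out.println(subscriberOne + " " + subscriberTwo); // both got the quote

        PtpQueue orders = new PtpQueue();
        orders.send("order-1");
        System.out.println(orders.receive()); // first receiver gets the message
        System.out.println(orders.receive()); // nothing left for a second receiver
    }
}
```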

WebLogic Clustering
Basics
• WebLogic Server provides clustering services for
WebApplications
EJB and RMI applications
JNDI
• Essential services of clustering:
LoadBalancing
Failover
• Underlying mechanism/protocol
Heartbeats and JNDI updates – Multicast (UDP)
Session replication –IP sockets(TCP)
JNDI state dump – http url connection

• The J2EE spec does not cover clustering.

Clustering is therefore a value-added feature that most application server vendors provide.
Since the spec does not define it, each vendor implements it in its own way. The ultimate objectives, however, are always load balancing and failover.

So all the cluster-related information is defined in weblogic.xml (the vendor-specific descriptor file).

• Replication
WebLogic Server provides clustering support for servlets and JSPs by replicating the HTTP session state of clients that access clustered servlets and JSPs. WebLogic Server can maintain HTTP session state in memory, in a filesystem, or in a database.
Replication involves copying all the session information to another server (called the secondary server). In WebLogic this is an RMI (t3) call.

WebLogic has five different implementations of session persistence (set in weblogic.xml):

• Memory (default; session info stored in memory)
• File system persistence (session info stored in a file)
• JDBC persistence (session info stored in a database)
• Cookie-based session persistence (session info stored in the client browser)
• In-memory replication (session info stored in the memory of both the primary and secondary servers; most commonly used in a cluster setup)
Communication between Apache and WebLogic Cluster
A typical journey of a request (Narayanasetti shared platform):
The request comes through the hardware load balancer and is forwarded to one of the Apache servers.
The WebLogic Apache plugin is invoked and delegates the HTTP request to one of the servers in the cluster. (It gets the list of WebLogic instances from the httpd.conf file.)
The WebLogic server creates a session cookie and sends it back in an HTTP response header.
As long as the browser is alive, the session cookie is sent with each request (as an HTTP header), and the Apache plugin parses it and routes the request to the same server.

Performance tuning
LAN Capacity
A network of at least 100 Mbit/s is recommended.
It is recommended to host the DB on a separate hardware machine.

JVM Heap Size
A large heap leads to slower but less frequent GC.
A small heap leads to faster but more frequent GC.
If the heap is larger than the available RAM, it will cause page swapping.

How to choose the Max Heap
Run the application with -Xaprof and do the load test. After the load test, kill the server process.
When the process exits, the JVM prints a summary that includes the maximum amount of memory that was used.
This serves as a baseline, and the max heap can be set higher than this value.
There are also commercial tools for recording heap usage, such as integrating LoadRunner with Wily Introscope so that memory usage can be monitored and recorded.

Garbage Collection timings

It is good to keep the JVM from spending more than 5 to 10 seconds on a full GC, because a full GC pauses the application threads. (Some of the latest JVMs have settings that reduce the pause time, but they do not eliminate it.)

If full GC takes longer than that, try other GC settings (parallel GC, concurrent GC, or incremental GC) and see whether they reduce the GC times.

Set the Execute Thread Count to make good use of the CPUs
Too many threads cause too much context switching, because the OS has to schedule the threads so that they all get a fair share.

Too few threads cause CPU underutilization and requests waiting for a thread.

Set an optimal value based on thread and CPU usage.
This can be done by taking periodic thread dumps during a load test to see whether the threads are idle or busy most of the time.

JDBC tuning


• Use a pool large enough to service the largest number of concurrent database users.
• It is recommended to set the max pool capacity equal to the execute thread count. (This can be application dependent.)
• Process most of the data inside the database to minimize network traffic.
• Commit after a batch of statements rather than after each statement.


WebLogic tuning


• On Solaris, set the TCP time-wait interval to 60 seconds.
The TCP values suggested by BEA are at:
http://e-docs.bea.com/wls/docs81/perform/HWTuning.html#1121083

• An application is usually either CPU intensive or I/O intensive. Typically, I/O-intensive applications need more threads because I/O operations tend to be slow.

• Use native IO wherever possible.
Native IO is faster than pure Java IO because in Java the sockets need to be polled.
Troubleshooting WebLogic
Logging

1) Server log
WebLogic server creates server log file by default under:
///.log
The location is configurable.

2) JDBC log
All SQL statements and DB related exceptions/errors.
This file is created under //jdbc.log

3) STDout log (If the process is redirected to STDout)

4) Domain log
All domain level information is logged into this file.
This is subset of server log file.
/.log

5) Access log
All http requests are recorded in this log file
//access.log

6) Transaction log
All servers record transaction in the tlog file
//.tlog

WebLogic Troubleshooting (Server crash)
Server Crash
This means the WebLogic Java process no longer exists.
A server crash can be caused only by native code. (Pure Java code cannot cause a process to crash.)

Determine all potential sources of native code used by WebLogic Server:
• Native IO.

• Type 2 JDBC driver. (At Narayanasetti we use the thin driver for most applications.)

• Native libraries accessed with JNI calls.

• SSL native libraries.

• The JVM itself. Most of the time the crash comes from the JVM.

Sometimes the JVM produces a small log file (hs_err_pid*.log) that may contain useful information about which library the crash originated from.


Server Crash Analysis
When a JVM crashes, a core file (a binary image of the process) is created. Run pmap and pstack against the core file to find the library that caused the crash.

Demo to figure out the offending library using existing pmap & pstack output files.

Check list:

1) hs_err_pid*.log (look for the library that caused the crash)

2) pmap core (the core file is created in the JVM root dir)
pstack core

3) Use a debugger (gdb, dbx, adb) if the above two steps do not provide any information.

Server Hang

A server is said to be hung when:
1) The process is still alive.
2) The server does not accept any requests because all the execute threads are busy or stuck for some reason.
3) No response is sent to clients.
4) The java weblogic.Admin PING command does not return a normal response.

WebLogic Troubleshooting (Server hang)

Server Hang Analysis:
The first step is to take multiple thread dumps.
• A thread dump is a snapshot of the JVM at a particular instant.
• Multiple thread dumps are necessary to conclude that the threads are stuck and not progressing.

Procedure to take thread dumps:
Unix:
Open a shell window and issue the command kill -3 PID
where PID is the Java process ID of WebLogic. Thread dumps are written to the STDOUT file.

Windows:
Press Ctrl-Break in the command window where WebLogic is running.
Thread dumps are written to the same command window.

Windows Service:
Open a command prompt and issue the command (make sure beasvc.exe is in the PATH):
c:\> beasvc -dump -svcname:service-name
Thread dumps are written to the defined log file.
While creating the service, we can provide the log option in the installservice script as:
-log:"d:\bea\domains\mydomain\myserver-stdout.txt"

• Before we analyze thread dumps, it is important to know the common thread states:
1) Runnable [marked as R in some VMs]:
This state indicates that the thread is either running currently or is ready to run the next time the OS thread scheduler schedules it.

2) Object.wait() [marked as CW in some VMs]:
Indicates that the thread is waiting for some condition to be fulfilled.

3) Waiting for monitor entry [marked as MW in some VMs]:
Indicates that the thread is waiting to enter a synchronized block.

Threads in this state are worth watching because they indicate lock contention: the thread is waiting for a lock on an object while some other thread holds that lock.
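
The "waiting for monitor entry" state is easy to reproduce. In the sketch below, one thread holds a lock while a second thread tries to enter a synchronized block on the same object; the second thread is then reported as BLOCKED by java.lang.Thread.State, which is how this state appears through the Java API (Java 5 and later, so newer than the JDK that WebLogic 8.1 ran on). The names ThreadStateSketch and observeBlockedThread are hypothetical.

```java
// Reproduces lock contention and reports the state of the waiting thread.
class ThreadStateSketch {

    // Holds a lock, starts a thread that needs the same lock, and returns
    // the state observed for that thread while the lock is still held.
    static Thread.State observeBlockedThread() {
        final Object lock = new Object();
        Thread waiter = new Thread(() -> {
            synchronized (lock) {
                // Reached only after the main thread releases the lock.
            }
        }, "waiter");

        Thread.State seen;
        synchronized (lock) {
            waiter.start();
            long deadline = System.currentTimeMillis() + 5000;
            // Poll until the waiter is stuck at the monitor (or give up after 5 seconds).
            while (waiter.getState() != Thread.State.BLOCKED
                    && System.currentTimeMillis() < deadline) {
                Thread.yield();
            }
            seen = waiter.getState();
        }

        try {
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(observeBlockedThread());
    }
}
```

In a real thread dump, the same situation shows one thread with "locked <address>" in its stack and one or more threads "waiting to lock" the same address.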

In WebLogic, the main worker threads are from the group weblogic.kernel.Default:
"ExecuteThread: '1' for queue: 'weblogic.kernel.Default'" ...
This is the set of threads to examine for hang or slow-performance issues.
Below is a snapshot of an idle thread waiting for work to be assigned.
On an idle system you would see many threads in this state:

"ExecuteThread: '1' for queue: 'weblogic.kernel.Default'" daemon prio=5 tid=0x031a6308 nid=0x980 in Object.wait() [2dff000..2dffd8c]
at java.lang.Object.wait(Native Method)
- waiting on <0x112cf2c0> (a weblogic.kernel.ExecuteThread)
at java.lang.Object.wait(Object.java:429)
at weblogic.kernel.ExecuteThread.waitForRequest(ExecuteThread.java:153)
- locked <0x112cf2c0> (a weblogic.kernel.ExecuteThread)
at weblogic.kernel.ExecuteThread.run(ExecuteThread.java:172)

• As for thread dump analysis and conclusions, let's look at a sample thread dump and drill into it further.
Demo of RSD thread dump (Thread stuck issue on UAT)
WebLogic Troubleshooting (Server slow)

Server performing slowly
There are many reasons for a server to perform slowly.
The first step is to take thread dumps and see what the threads are doing. If nothing is wrong with the threads, there are other possible reasons:

Process runs out of memory:
If the Java heap is full, the server process appears to be hung and does not accept any requests, because each request needs heap for allocating objects.
So if the heap is full, none of the requests get served; they all fail with java.lang.OutOfMemoryError.

WebLogic troubleshooting

• OutOfMemory Analysis:
OutOfMemory can occur because of a real memory crunch or because of a memory leak that fills the heap with objects that are never released.
The first step is to enable GC logging and run the server again (-XX:+PrintGCDetails).
The STDOUT file then shows the garbage collection details.
If the error is caused by a memory leak, we need profilers such as Introscope or OptimizeIt to find the source of the leak.

Process size = Java heap + native memory + memory occupied by the executables and libraries.
On 32-bit operating systems, the virtual address space of a process can go up to 4 GB. This is the address-size limit (2^32 bytes).

Out of this 4 GB, the OS kernel reserves some part for itself (typically 1 to 2 GB).
This limit does not apply on 64-bit machines such as Solaris (SPARC) or Windows running on Itanium (64-bit).

WebLogic Troubleshooting Fragmentation
• OutOfMemory Analysis
OOM can also occur due to fragmentation. In this situation we can see free memory available but still get OutOfMemory errors.
To understand fragmentation, we first need to know the following fact:
Each heap allocation has to be contiguous. If a request needs 2 MB of memory, the JVM has to provide a contiguous 2 MB chunk.
Over a period of time, allocations become scattered across the heap and there might not be enough contiguous memory available.
A full GC might not be able to reclaim enough contiguous space.
This is called fragmentation.
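
A toy model makes the idea concrete. The sketch below treats the heap as an array of equal-sized units and shows that the total free space can exceed a request while no contiguous run is long enough to satisfy it. It is a simplified illustration, not how any particular JVM lays out its heap, and the names FragmentationSketch, totalFree, largestFreeRun, and canAllocate are hypothetical.

```java
// Toy heap: each array slot is one fixed-size unit, true = in use, false = free.
class FragmentationSketch {

    // Total number of free units, regardless of where they are.
    static int totalFree(boolean[] heap) {
        int free = 0;
        for (boolean used : heap) {
            if (!used) {
                free++;
            }
        }
        return free;
    }

    // Length of the longest run of adjacent free units.
    static int largestFreeRun(boolean[] heap) {
        int best = 0;
        int current = 0;
        for (boolean used : heap) {
            current = used ? 0 : current + 1;
            best = Math.max(best, current);
        }
        return best;
    }

    // An allocation succeeds only if one contiguous run is large enough.
    static boolean canAllocate(boolean[] heap, int units) {
        return largestFreeRun(heap) >= units;
    }

    public static void main(String[] args) {
        // Used and free units alternate: half the heap is free, but in single-unit holes.
        boolean[] heap = { true, false, true, false, true, false };
        System.out.println(totalFree(heap));       // 3
        System.out.println(largestFreeRun(heap));  // 1
        System.out.println(canAllocate(heap, 2));  // false
    }
}
```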

WebLogic Troubleshooting
For example, the -verbose:gc output might look like the following if the heap is fragmented. Free memory is available, but the JVM still throws an OOM error.
(Most of the fragmentation bugs are resolved in Sun JDK 1.4.2_xx.)

[GC 4673K->3017K(32576K), 0.0050632 secs]
[GC 5047K->3186K(32576K), 0.0028928 secs]
[GC 5232K->3296K(32576K), 0.0019779 secs]
[GC 5309K->3210K(32576K), 0.0004447 secs]
java.lang.OutOfMemoryError
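
To read such output programmatically, each line can be split into "used before", "used after", and the heap size shown in parentheses. The sketch below assumes exactly the line format shown above; other JVMs and GC options print different formats, and the names GcLineSketch, parse, and freeAfterK are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Parses lines of the form "[GC 4673K->3017K(32576K), 0.0050632 secs]".
class GcLineSketch {

    private static final Pattern GC_LINE =
            Pattern.compile("\\[GC (\\d+)K->(\\d+)K\\((\\d+)K\\), [0-9.]+ secs\\]");

    // Returns {usedBeforeK, usedAfterK, heapSizeK}, or null if the line does not match.
    static long[] parse(String line) {
        Matcher m = GC_LINE.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new long[] {
            Long.parseLong(m.group(1)),
            Long.parseLong(m.group(2)),
            Long.parseLong(m.group(3))
        };
    }

    // Free space, in KB, that the line reports after the collection.
    static long freeAfterK(String line) {
        long[] values = parse(line);
        if (values == null) {
            throw new IllegalArgumentException("not a recognised GC line: " + line);
        }
        return values[2] - values[1];
    }

    public static void main(String[] args) {
        // Last GC line before the OutOfMemoryError in the sample output.
        System.out.println(freeAfterK("[GC 5309K->3210K(32576K), 0.0004447 secs]")); // 29366
    }
}
```

For the last line of the sample, roughly 29 MB of the 32 MB heap is reported free immediately before the error, which is the pattern the text associates with fragmentation.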

• OutOfMemory Analysis
Fragmentation-related issues are caused by bugs in the JVM.
The best approach is to try the latest minor version of the JVM; if that does not work out, we need to work with the vendor to get it fixed.
• The following commands on Solaris provide good information:
vmstat:
The vmstat command reports statistics about kernel threads, virtual memory, disks, traps, and CPU activity.
sar:
An OS utility known as the system activity reporter.

• If the application uses SSL, the server performs slower than without SSL.
SSL reduces the capacity of the server by about 33 to 50 percent, depending on the strength of encryption used in the SSL connections.

• Process running out of file descriptors. The server cannot accept further requests because sockets cannot be created. (Each socket consumes a file descriptor.)
The following exception is thrown in such cases:
java.net.SocketException: Too many open files
OR
java.io.IOException: Too many open files
In this case the lsof utility helps: it lists all open file descriptors. From the list of open files, we (the application owners) can easily figure out whether it is a bug or expected behavior. If it is expected behavior, the number of FDs needs to be increased. (The default is 1024.)

• GC taking a long time (more than 20 seconds).
To end users this looks like a hang.
In this case we need to tune the GC parameters and try the other GC options available. In some cases where GC was taking a very long time, incremental GC (-Xincgc) has been useful.

Before going into high CPU analysis, it helps to know the Solaris threading model.

• Process consuming high CPU

A process using a lot of CPU is not always bad; this is a common misconception. In fact, we want our application to use the CPU efficiently.
What we do not want is a single thread, or a couple of threads, consuming all the CPU forever (in an infinite loop) and not allowing other threads to get their CPU share (timeslice).

High CPU analysis:

1) Run prstat to find the process that is consuming the most CPU.

2) Run prstat -L -p PID, where PID is the ID of the process that is consuming the most CPU.
This shows the LWPs that are consuming the most CPU.

3) Run pstack PID.
From the pstack output, we can see each LWP mapped to a native thread.
For eg:
----------------- lwp#
