Too often I have seen customers who are simply unaware that the authentication providers can and should be tuned from their default settings. The default values in the LDAP authentication providers are better sized to development environments (and in some cases the development environments of 5 years ago) than they are to today’s production environments. So, the first step is awareness that the authentication providers include settings related to cache performance, connection management, and handling of group lookups that can and should be tuned in order to maximize the performance of your applications.
So, in this post I’d like to go through the authentication provider settings that affect performance, discuss what each setting does, and offer some guidelines for tuning them and how those guidelines differ from the defaults.
First, let’s briefly discuss how to find the settings you’ll want to tune. Log in to the WLS admin console, and on the left-hand side under Domain Structure click Security Realms and then “myrealm”. From there, click on the Providers tab and select the LDAP authentication provider that you want to tune. Once you are in the authentication provider configuration screen, you’ll want to look in the “Provider Specific” and “Performance” tabs to modify the settings we are about to discuss. We will also discuss one setting that is located under the “Performance” tab of “myrealm” (or whatever you have named your active security realm) itself.
If all goes well, clicking the provider specific configuration tab should bring up the provider’s configuration screen.
[Screenshot: provider specific configuration tab of the LDAP authentication provider]
In discussing the tuning of LDAP authentication providers, I like to divide the settings into 3 categories: LDAP connection settings, cache settings, and group lookup settings. If you’d like to follow along with what the documentation says you can do so here: http://download.oracle.com/docs/cd/E21764_01/web.1111/e13707/atn.htm#i1216261
Connection related settings:
Connection Timeout Limit – The maximum time in seconds to wait for the connection to the LDAP server to be established. The default is set to 0 which means that there is no maximum time limit. Note that this setting only comes into play when the authenticator is trying to open up a new connection in its pool of connections to the directory.
Connection Retry Limit – The number of times the server will attempt to connect to the LDAP server if the initial attempt fails. The default is 1. Again this setting applies only to situations where the authenticator is trying to open up new connections.
Now you may ask what happens if the connection timeout limit is reached and all retry attempts fail. The answer is that the authenticator will simply give up on the request for which it was trying to open a connection and return a failure. Subsequent requests will be serviced from the available pool of connections until a new connection must be opened, at which time the process will repeat itself.
I could be wrong but I see no good reason why one would want to wait forever. What you want to avoid is a cycle of death situation where degradation in LDAP performance is handled poorly and leads to things backing up more than they have to in the authentication provider. The specific values that you should go with for these settings are fairly environment specific in that they depend on your directory and network infrastructure but I think that 120 seconds for the Connection Timeout Limit and 5 for the Connection Retry Limit are good starting points.
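To put numbers on these two settings, here is a rough sketch of the worst-case time a single request could spend trying to open a connection. It assumes one initial attempt plus the configured number of retries, each blocking for the full timeout, with no pause between attempts. That is my simplification, not documented provider behavior, and the function name is made up for illustration.

```python
def worst_case_connect_wait(timeout_seconds, retry_limit):
    """Upper bound, in seconds, on how long one request can spend connecting.

    Assumption: one initial attempt plus `retry_limit` retries, each of
    which blocks for the full timeout, back to back. A timeout of 0 means
    "no limit", so the wait is unbounded and None is returned.
    """
    if timeout_seconds == 0:
        return None
    attempts = 1 + retry_limit
    return timeout_seconds * attempts


if __name__ == "__main__":
    print(worst_case_connect_wait(0, 1))    # defaults: no upper bound
    print(worst_case_connect_wait(120, 5))  # starting points suggested above
```

Under these assumptions the suggested starting points bound the wait at 12 minutes per request, which is long but finite. If that is too long for your environment, lower the timeout rather than the retry count first.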
Cache related settings:
Cache TTL and Cache Size
There are two cache settings on the “provider specific” tab of most LDAP authentication providers. These are Cache TTL and Cache Size. These settings refer to a “user related” cache in the authentication providers that caches the DN lookup that translates login names to full LDAP distinguished names and possibly caches some common attribute values following the lookup. I must stress that the authentication providers do not cache usernames and passwords. Real username/password authentication always results in a call to the directory. With that out of the way:
Cache TTL – is the time-to-live of entries in the cache in seconds. The default is 60 which seems low to me. I would consider upping this to 5 minutes and going from there.
Cache Size – is the size of the cache in kilobytes. The default is an absurdly low 32 KB. The per-entry size of this cache is low but I don’t think upping this to 2-4 MB would hurt.
The exception to the above recommendation would be a situation where you really don’t expect an individual user to hit the authentication provider twice in a fixed period of time. In this case I would still up the Cache Size some but might leave the TTL alone at 60 seconds.
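To see why the TTL matters, here is a toy time-to-live cache for the login-name-to-DN lookup. It is only a sketch of the general technique, not the provider's implementation: the class, the function names, and the example DN are all made up, and it ignores the size limit in kilobytes entirely.

```python
import time


class TtlCache:
    """Toy time-to-live cache keyed by login name (illustrative only)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self.entries = {}  # login name -> (DN, expiry time)

    def get(self, login, lookup):
        """Return the cached DN, calling lookup(login) on a miss or after expiry."""
        now = self.clock()
        hit = self.entries.get(login)
        if hit is not None and hit[1] > now:
            return hit[0]
        dn = lookup(login)
        self.entries[login] = (dn, now + self.ttl_seconds)
        return dn


def count_directory_lookups(ttl_seconds, request_times):
    """Count DN lookups for one user making requests at the given times (seconds)."""
    now = [0.0]
    calls = []
    cache = TtlCache(ttl_seconds, clock=lambda: now[0])

    def lookup(login):
        calls.append(login)
        return "uid=%s,ou=people,dc=example,dc=com" % login  # made-up DN

    for t in request_times:
        now[0] = t
        cache.get("jdoe", lookup)
    return len(calls)
```

For a user who comes back every 90 seconds, `count_directory_lookups(60, [0, 90, 180, 270])` goes to the directory on every request, while the same pattern with a TTL of 300 needs only the first lookup.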
Principal Validator Cache
The Principal Validator Cache is actually a setting associated with the entire realm rather than the authentication provider and is configured in the “performance” tab of the realm itself. There are two settings associated with the cache: Enable WebLogic Principal Validator Cache and Max Weblogic Principals in Cache. It is enabled by default with the max number of principals defaulting to 500.
This cache is mentioned in the documentation in vague terms as something that can improve performance and indeed it is a fairly mysterious construct. What this cache does is cache signed Abstract Principals which are used in RMI calls when a Principal Validation Provider is being used.
The long and short of it is that this cache won’t have much effect for most people, and even in situations where it will be heavily hit, it is common for the validations to be associated with a limited number of service accounts. So, for the most part you can just leave these settings as is. However, don’t be afraid to bump up the number of cached principals; the default setting is very low considering the hardware you are likely to be running WLS on in production.
Group lookup related settings:
One of the most important pieces of tuning you can do to a WLS LDAP authenticator, if not the most important, is to change the Group Membership Searching from unlimited to limited and set the Max Membership Search Level to an appropriate value. Not only will this improve your performance, it will prevent you from encountering a loop that prevents users from logging in when two groups are members of each other. I blogged extensively about these two settings in a previous post entitled Weblogic, LDAP Authenticators, and Groups.
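The loop problem is easy to reproduce in miniature. The sketch below is a deliberately naive recursive group lookup with an optional depth limit. It is not WebLogic's actual code: the function name and the group data are made up, and I am assuming that a level of 0 means "direct groups only", with each extra level following nesting one step further.

```python
def member_of(direct_groups, name, max_level, _level=0):
    """Naive lookup of every group `name` belongs to, directly or via nesting.

    direct_groups maps a user or group name to the groups it is a direct
    member of. max_level=None means an unlimited search; 0 returns only
    direct groups; N follows nested membership N further levels.
    There is deliberately no "already visited" check, to mimic a search
    that can loop when groups contain each other.
    """
    found = set()
    for group in direct_groups.get(name, ()):
        found.add(group)
        if max_level is None or _level < max_level:
            found |= member_of(direct_groups, group, max_level, _level + 1)
    return found


# Two groups that are members of each other (hypothetical data).
CYCLIC = {"alice": ["admins"], "admins": ["operators"], "operators": ["admins"]}

if __name__ == "__main__":
    print(sorted(member_of(CYCLIC, "alice", 2)))  # limited search terminates
    try:
        member_of(CYCLIC, "alice", None)          # unlimited search never ends
    except RecursionError:
        print("unlimited search looped")
```

With a limit, the search stops after a fixed number of levels no matter what the directory contains, so a pair of mutually nested groups costs a few wasted lookups instead of a failed login.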
Rounding out the group lookup related settings are a group of settings that can be found in the performance tab of LDAP authentication providers.
These settings all deal with the group membership hierarchy cache. This cache stores the results of recursive group membership lookups or, put another way, it stores which groups are members of other groups.
The settings for this cache include a check box to enable the cache which is called Enable Group Membership Lookup Hierarchy Caching, a setting that controls the size of the cache called Max Group Hierarchies in Cache, and a setting that controls the time-to-live (in seconds) of cache entries called Group Hierarchy Cache TTL.
I recommend that you enable this cache if you utilize nested groups. I recommend that you set the Max Group Hierarchies in Cache to a value larger than the total number of groups in your directory. Finally, I recommend that you set the Group Hierarchy Cache TTL to a safe appropriate number. The default is 60 seconds which will improve performance, catch changes to the hierarchy fairly quickly, but still result in a fair amount of recursive group lookups. If you up this value to 5 minutes (300 seconds) which should still be safe for most people who aren’t doing funky things with dynamic groups then you should be able to improve performance a little more with no downside.
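As a quick recap, the defaults and starting points discussed in this post can be captured in a small table and used to flag settings that are still at their defaults. This is only a sketch: the keys are the console labels used above rather than MBean attribute names, the helper name is made up, and the Cache Size figure is the low end of the 2-4 MB range suggested earlier.

```python
# Defaults and suggested starting points as discussed in this post.
# Keys are admin console labels, not MBean attribute names.
SUGGESTED = {
    "Connection Timeout Limit": {"default": 0, "suggested": 120},    # seconds
    "Connection Retry Limit": {"default": 1, "suggested": 5},
    "Cache TTL": {"default": 60, "suggested": 300},                  # seconds
    "Cache Size": {"default": 32, "suggested": 2048},                # kilobytes
    "Group Membership Searching": {"default": "unlimited", "suggested": "limited"},
    "Group Hierarchy Cache TTL": {"default": 60, "suggested": 300},  # seconds
}


def settings_still_at_default(current):
    """Return the labels in `current` whose value equals the default listed above."""
    return sorted(
        label
        for label, value in current.items()
        if label in SUGGESTED and SUGGESTED[label]["default"] == value
    )


if __name__ == "__main__":
    current = {"Cache TTL": 60, "Cache Size": 4096, "Connection Timeout Limit": 0}
    for label in settings_still_at_default(current):
        print("%s is still at its default; consider %r"
              % (label, SUGGESTED[label]["suggested"]))
```

Treat the suggested values as starting points to test against your own directory and network, not as fixed targets.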
Conclusion
Tuning of WLS LDAP Authenticators is an overlooked component of successful WLS production deployments. Taking just a little time to change the LDAP authenticator performance related configuration settings from default values to values which are appropriate for your production environment can result in a much faster and more stable system.
Hi Brain,
Another great post.
Could you also provide some information on what changes when the LDAP connection is over SSL.
I have an issue where Non-SSL works fine however with SSL the performance decreases and in the thread dump we see the thread waiting like below
at jrockit/net/SocketNativeIO.readBytesPinned(Ljava/io/FileDescriptor;[BIII)I(Native Method)
at jrockit/net/SocketNativeIO.socketRead(SocketNativeIO.java:32)
at java/net/SocketInputStream.socketRead0(Ljava/io/FileDescriptor;[BIII)I(SocketInputStream.java)
at java/net/SocketInputStream.read(SocketInputStream.java:129)
at com/sun/net/ssl/internal/ssl/InputRecord.readFully(InputRecord.java:293)
at com/sun/net/ssl/internal/ssl/InputRecord.read(InputRecord.java:331)
at com/sun/net/ssl/internal/ssl/SSLSocketImpl.readRecord(SSLSocketImpl.java:798)
^-- Holding lock: java/lang/Object@0x15943d9f8[thin lock]
at com/sun/net/ssl/internal/ssl/SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1138)
^-- Holding lock: java/lang/Object@0x15943d9c8[thin lock]
at com/sun/net/ssl/internal/ssl/SSLSocketImpl.startHandshake(SSLSocketImpl.java:1165)
at com/sun/net/ssl/internal/ssl/SSLSocketImpl.startHandshake(SSLSocketImpl.java:1149)
at oracle/security/idm/providers/stdldap/LDSSLSocketFactory.init(LDSSLSocketFactory.java:181)
Hi,
you talked about "the available pool of connections", but where is this pool configured ? Which size is predefined and how to change or configure this one ?
Anyway, thanks for this useful post,
JLM
The size of the pool is configurable in the LDAP authenticator's provider specific configuration: http://docs.oracle.com/cd/E21764_01/web.1111/e13707/atn.htm#BABIDIFB