Monday, April 26, 2010

Custom caching for OES attributes

First, in the interest of full disclosure: this wasn't my idea. It's exactly the sort of thing I usually come up with, but in this case someone else suggested it to me.


One of the most powerful things about OES is the ability to write policies on attributes. I talked about this in a post about writing OES policies back in December, but for those of you who missed that post, here's a Cliffs Notes version:

Objects can have attributes hanging off of them
So can Roles, Groups, and Users.
The best way to write policies that make sense and scale is often to use a constraint to compare the values of attributes.

I gave a decent example in that post so I'm not going to go back into the details.

One of the things people ask about OES is whether it caches the values it retrieves. The answer is "of course!", and then we get into a whole bunch of questions about how the attribute cache works. Yes there are TTLs on the attributes. Yes they are configurable per attribute. Yes the caches get flushed automatically at appropriate times. Yes there's an API to flush the attributes out of cache.

Even so, sometimes there are cases where customers need to do something slightly different.

First an example Policy:
Deny access if AccountStatus is set to Disabled

In OES' internal format this looks like
deny( any, //app/policy, //role/Everyone ) if AccountStatus="Disabled"
If getting AccountStatus is "cheap" then you can just go get it from the database whenever you need it and not bother to cache it at all; then if the account gets disabled you can deny access to everything immediately. If getting AccountStatus is "expensive" (e.g. it requires a call back to your mainframe) you don't want to make that call on every access attempt, so you'll want to cache the value for a while. But caching the value in a regular LRU cache with a TTL of, say, an hour means you might continue granting the user access to resources even after their account has been disabled. Set the TTL too high and you have a security problem; set it too low and you have a cost problem. I'm using cost here in the most generic sense possible - it could mean compute cycles, CPU time, clock time or literally dollars and cents.

Customers usually have to figure out how to balance these two requirements on their own. It's one of my maxims that all security winds up coming down to exactly this trade-off.

What if you didn't have to make that trade off?

Oracle Coherence lets you build a cache that runs on individual machines in the same VM as the rest of your code, but automatically syncs up across your network. Coherence's Java API, at its simplest, is a Map you can put stuff into or get stuff out of. If the data you want is already known by someone in the cluster your call to get() it out comes back seamlessly. Similarly when you put something into the cache anyone else can get it out. And if someone removes data from the cache it gets removed everywhere in the grid. Coherence handles all of the painful stuff like node failure and recovery as well as nodes being added to the grid. Coherence can also persist the data or go get it for you if you want, but I'm going to try to keep this more generic and ignore those features and treat Coherence as a simple data storage cache and nothing more.
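
To make that concrete, here's a minimal sketch of the Map-style API (the cache name here is just an example I made up):

import com.tangosol.net.CacheFactory;
import com.tangosol.net.NamedCache;

public class CoherenceQuickLook {
    public static void main(String[] args) {
        // Join (or start) the cluster and look up a named cache
        CacheFactory.ensureCluster();
        NamedCache cache = CacheFactory.getCache("example-cache");

        // put() makes the entry visible to every node in the grid
        cache.put("hello", "world");

        // get() pulls it back from wherever it lives in the cluster
        System.out.println(cache.get("hello"));

        // remove() evicts it everywhere
        cache.remove("hello");

        CacheFactory.shutdown();
    }
}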

Steve Poz blogged about using a Coherence cache for OES attribute data back in September. I'm going to build on that work. Because if there's one thing I know it's that good programmers write good code, but great programmers know where to steal good code.

OES ships with a bunch of attribute retrievers for sources like databases and LDAP directories out of the box. If you're like me you don't want to build that code again; configuration strings, connection pools, handling data retrieval failures and logging are all solved problems so why waste your time reinventing the wheel?

Back to my example policy:
deny( any, //app/policy, //role/Everyone ) if AccountStatus="Disabled"
AccountStatus comes from an existing OES attribute retriever. Let's leave that as it is and go ahead and create a new attribute CachedAccountStatus. We'll write an OES Attribute Retriever that looks for any attribute that begins with the keyword Cached and just layer a cache on top of the existing attribute data.

To do that we import com.bea.security.providers.authorization.asi.AttributeRetrieverV2 and implement the AttributeRetrieverV2 interface.

The two methods are getHandledAttributeNames() and getAttributeValue(). For the former we just return null so that we get called for every attribute and have an opportunity to resolve the data or pass. Here's what the code looks like:


public String[] getHandledAttributeNames() {
    // Returning null means we get called for every attribute
    // and can either resolve it or pass
    return null;
}

public Object getAttributeValue(String name,
                                RequestHandle requestHandle,
                                Subject subject,
                                Map roles,
                                Resource resource,
                                ContextHandler contextHandle) {

    // Set default value
    String attrValue = null;

    if (name.startsWith("Cached")) {

        LOGGER.debug("Request received for Cached attribute.");
        LOGGER.debug("Requested attribute name '" + name + "'");

        // skip forward 6 characters (the length of "Cached")
        // to get the real attribute's name
        String uncachedName = name.substring(6);

        LOGGER.debug("Uncached attribute name '" + uncachedName + "'");

        try {
            AttributeElement ae = requestHandle.getAttribute(uncachedName, false);

            if (null == ae) {
                LOGGER.error("Failed to get attribute named '" + uncachedName + "'");
            } else {
                // this code only "does" Strings. Real code would need to be smarter
                attrValue = (String) ae.getValue();

                LOGGER.debug("Uncached attribute '" + uncachedName + "' => '" + attrValue + "'");
            }
        }
        catch (Exception ex) {
            LOGGER.error("Exception caught getting attribute", ex);
        }
    }

    return attrValue;
}
Then the only thing left is to wire in the actual caching, and Steve showed us how to do that on his blog. Basically you add the necessary Coherence jars to your classpath and add a couple of -D options to the line that starts your VM. Then tweak the code above so that the constructor connects to the cache:

private NamedCache myCache = null;

public MyAttributeRetriever() {
    // Join the Coherence cluster and grab the shared attribute cache
    CacheFactory.ensureCluster();
    myCache = CacheFactory.getCache("oesattributecache");
}
Then add a line to my code that tries to get the data from the grid first, and another that inserts the value into the cache when the real retriever finds it.

I'll leave the copy/pasting to you. :-)
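
But in case you want to see roughly where things end up, here's a sketch of the middle of getAttributeValue with the cache wired in. Note that the key format and the subjectName() helper are my own inventions - real code would need to pick a key that matches how your attribute values vary (per user, per resource, etc):

// Build a cache key; this format and the subjectName() helper are
// hypothetical - use whatever uniquely identifies the value
String cacheKey = subjectName(subject) + ":" + uncachedName;

// Check the grid first
attrValue = (String) myCache.get(cacheKey);

if (attrValue == null) {
    // Cache miss - fall back to the existing (expensive) retriever
    AttributeElement ae = requestHandle.getAttribute(uncachedName, false);
    if (ae != null) {
        attrValue = (String) ae.getValue();

        // Publish the value so every node in the grid can reuse it
        myCache.put(cacheKey, attrValue);
    }
}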

The only other bit we need to solve is how to remove the cached value when necessary. Basically you have the system that changes the value, or the one that stores it, call the Coherence API to remove the cached value (or at least kick that process off). For example, if you disable users through OIM you could have OIM treat the Coherence cache as a provisioned endpoint. Or you could use a database trigger. Or you could listen for changes to your LDAP directory. Or any of an infinite number of other ways.
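
However the change gets detected, the eviction itself is a one-liner from any node that has joined the grid (again using my example cache and key names):

// Remove the stale entry; Coherence propagates the removal grid-wide
CacheFactory.getCache("oesattributecache").remove(cacheKey);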

Let us know if you try this out on your own!

Tuesday, April 20, 2010

OES 10gR3 CP4 Now Available

New Certifications



9057333 Certify WLP 10.3 SSM
9383690 Certify OES Admin and WLS SSM on WLS 10.3.2
9198055 OES 10gR3 Admin/SSM certification for Tomcat 5.5.26
9471409 Certify 10gR3 ON HPUX 11.31 64 bit OS

Some bug fixes



9361693 Backport config tool to support scoped and non-scoped policies.
The properties file used as input for ConfigTool now includes a new
configuration (scope.policy.model = true) to indicate whether you want
organization-scoped policies or the non-organization-scoped ALES 3.0
style policies. If the scoped model is set to true, the config tool prompts
for an organization name to scope the policy entities; otherwise the tool
prompts for the resource root to scope the policies.

9248336 Policy distribution takes too long for large policy sets
A new configuration, PD.simpleDistributionThreshold=1000, is included as
part of the WLESblm.properties file. If the calculated delta to distribute
exceeds the configured value, a full policy distribution is made.

9367053 BLM API has slow performance with large resource datasets
BLM API now uses lazy loading for handling resources and attributes.
To turn off lazy loading and work in the older mode, a new configuration
BLM.lazyLoadResource=false is added to the WLESblm.properties configuration
file.

9207105 Removal of wsdl4j from OES
Replaced wsdl4j.jar with the Oracle WSDL implementation to meet common code
compliance requirements across Oracle products.

8826278 Policies imported from resource discovery not visible under EUI
Fixed resource discovery modules.

9328300 Using SYS_DEFINED aborted ATZ evaluation for missing attribute retriever attribute
Fixed attribute retriever modules.

9081515 Audit logging does not log granted role names
Fixed audit modules to log role names.

9025166 Caching does not seem to be working for policies created from EUI
Fixed caching modules.


I wanted to call attention to a few items. The first is the certification for OES on WLS 10.3.2. As Chris pointed out previously, since the certification is in the CP, you need to patch to the CP first and then do the install.

The next item is the enhancements to the config tool. As people will recall from this summer, there were a number of issues around the scoped orgs and apps and the config tool. These are all fixed.

Finally, the policy distribution for large policy sets. As we've discussed on this blog, there are times where you do want to model a lot (read: thousands) of resources in OES, and this can result in some very large policy distributions. CP4 has some enhancements for these types of models.
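
For reference, here's what the two new WLESblm.properties settings from the fix list look like together (the values shown are just the defaults/examples from the release notes):

# Force a full distribution when the computed delta exceeds this many changes
PD.simpleDistributionThreshold=1000

# Set to false to turn off lazy loading and restore the pre-CP4 behavior
BLM.lazyLoadResource=false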

So, my advice is to take a look at CP4, and if it makes sense, go ahead and upgrade. The engineering team has done a really good job of simplifying the patching process.

Monday, April 12, 2010

By Request - Multiple Realms in WebLogic Server

Every once in a while I see a request for support for multiple security realms in WebLogic Server. Just to clarify, what people want is multiple active realms, specifically with regard to authentication providers. There are multiple realms in WebLogic Server today, but they are really for administrative convenience. There are a couple of valid use cases for wanting more than one active realm.

The first is running multiple applications inside a single WebLogic Server domain where each application has its own set of users in a different directory. This forces the administrator to choose which provider to put first. You end up with a single provider for each application, each one with the JAAS control flag set to SUFFICIENT. The container goes provider by provider checking to see if the credentials are there, which can lead to performance issues for the applications further down the list.

Another use case is the handling (or mishandling) of the weblogic or system users. It is a best practice to leave the weblogic user in the Default Authenticator; this way you can still boot the server even if the other directory is down. This is fine, but the consequence is that the other directory needs to consider the "weblogic" user. Assuming that the application's authentication provider is configured first, it will get called with the weblogic username/password at start-up and whenever administrators use admin tools like the console. It's fine that the user isn't there, but this can create additional, unwanted traffic. Another issue is that malicious users could temporarily lock out the weblogic user by attempting to log in multiple times with username weblogic and bogus passwords - a possible DoS attack. In theory this is possible whenever the same users have access to the console application, and it's probably a good reason to use a different administrative user than weblogic.

Finally, the last use case is customers migrating from other application servers that make heavy use of JAAS and basically have the ability to associate a JAAS login configuration with an application. They got used to this functionality and expect it to be there in WebLogic Server. WebLogic Server at its core uses JAAS LoginModules for its authentication, but wraps the JAAS LoginModule in an MBean - the AuthenticationProvider. One thing that is really nice about the authentication provider, beyond being a simple MBean, is that many authentication providers also implement the optional Authentication SSPI MBeans. This allows a single library to provide both authentication and management - something JAAS alone does not provide. I want people to understand this point before I go and explain the solution - you're giving up all of the management aspects that WLS provides and going with a strictly JAAS-based - authentication only - approach.

That having been said, once again it's the little-known ServletAuthenticationFilter to the rescue. The idea is to create a ServletAuthenticationFilter that intercepts the request and then calls the regular JAAS LoginModule. The result is a javax.security.auth.Subject. You can then push this Subject onto the WLS stack using ServletAuthentication.runAs().


public void doFilter(ServletRequest servletRequest,
                     ServletResponse servletResponse,
                     FilterChain filterChain) throws IOException, ServletException {

    System.out.println("In the filter....");
    HttpServletRequest httpServletRequest = (HttpServletRequest) servletRequest;

    String uri = httpServletRequest.getRequestURI();

    if (!uri.startsWith(this.URI)) {
        // Not a path we protect - pass the request straight through
        filterChain.doFilter(servletRequest, servletResponse);
        return;
    } else {
        System.out.println("Processing " + uri);
        this.initalizeMultiRealmConfig();
    }

    try {
        // Run the configured JAAS login entry, with a callback handler
        // that pulls credentials from the HTTP request
        LoginContext lc = new LoginContext(this.jaasLoginEntry,
                                           null,
                                           new MultiRealmCallbackHandler((HttpServletRequest) servletRequest),
                                           this.multiRealmConfig);
        lc.login();
        Subject subject = lc.getSubject();
        System.out.println("The subject is " + subject);

        // Push the authenticated Subject onto the WLS security stack
        ServletAuthentication.runAs(subject, (HttpServletRequest) servletRequest);

    } catch (LoginException le) {
        le.printStackTrace();
        HttpServletResponse httpResponse = (HttpServletResponse) servletResponse;
        httpResponse.sendError(HttpServletResponse.SC_FORBIDDEN);
        return;
    }

    filterChain.doFilter(servletRequest, servletResponse);
}


In this example, I did something a little "fancy": I made the javax.sql.DataSource defined in the application available to the login module. This supports the very common use case where applications want to authenticate their users against the database. The way I did this was by creating my own javax.security.auth.login.Configuration that makes the DataSource available as an entry in the options map of the JAAS LoginModule. Why do it this way? In this example, I tried to eliminate dependencies between the JAAS LoginModules getting called and the ServletAuthenticationFilter calling them. Passing the DataSource through the map eliminated the need to create a proprietary CallbackHandler. It's a little extra code on my side, but it simplifies the implementation for the JAAS LoginModules.


public AppConfigurationEntry[] getAppConfigurationEntry(String name) {

    AppConfigurationEntry[] theEntries = this.theConfiguration.getAppConfigurationEntry(name);
    AppConfigurationEntry[] theNewEntries = new AppConfigurationEntry[theEntries.length];

    for (int i = 0; i < theEntries.length; i++) {

        AppConfigurationEntry entry = theEntries[i];

        // Copy the existing options and slip the DataSource in alongside them
        Map<String, Object> newMap = this.copyMap(entry.getOptions());
        newMap.put("DataSource", this.ds);

        theNewEntries[i] = new AppConfigurationEntry(entry.getLoginModuleName(),
                                                     entry.getControlFlag(),
                                                     newMap);
    }

    return theNewEntries;
}




This is all well and good, except that WLS needs to know about the principals. If the JAAS LoginModule is creating instances of WLSUserImpl or WLSGroupImpl then you're good; otherwise you'll need to use a custom PrincipalValidator. The principal validator that I included in the project is VERY MINIMAL - all principals are trusted. If you have access to the JAAS LoginModules, the simplest thing to do is to modify the principals to extend weblogic.security.principal.WLSAbstractPrincipal; then you can use the OOTB WebLogic principal validator. The details of principal validation can be found here.
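
To give you an idea of how small that change is, here's a hedged sketch (MyAppUser is a made-up class name):

import weblogic.security.principal.WLSAbstractPrincipal;

// Hypothetical principal; extending WLSAbstractPrincipal means the
// out-of-the-box WebLogic principal validator can sign and validate it
public class MyAppUser extends WLSAbstractPrincipal {

    public MyAppUser(String name) {
        super();
        setName(name); // inherited from WLSAbstractPrincipal
    }
}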

Building and deploying the sample



  • Download the sample from Subversion
  • Modify the MultiRealmFilter.java - specifically the MultiRealmCallbackHandler inner class

    By default, the sample always passes a hardcoded username/password of foo/foo. You'll probably want it to do more: modify it to fetch credentials from the HttpServletRequest and create the Callbacks that your login modules need.


    /**
    * This callback handler gets information from the HttpRequest - cookies, username/password
    * @author jbregman
    */
    class MultiRealmCallbackHandler implements CallbackHandler {

        private HttpServletRequest req;

        MultiRealmCallbackHandler(HttpServletRequest req) {
            super();
            this.req = req;
        }

        @Override
        public void handle(Callback[] callbacks) throws IOException,
                UnsupportedCallbackException {

            for (int i = 0; i < callbacks.length; i++) {

                Callback callback = callbacks[i];

                if (callback instanceof NameCallback) {

                    NameCallback nc = (NameCallback) callback;

                    // In here, you'd go and do something with the request
                    System.out.println("The name is foo");
                    nc.setName("foo");

                } else if (callback instanceof PasswordCallback) {

                    PasswordCallback pc = (PasswordCallback) callback;

                    System.out.println("The password is foo");
                    pc.setPassword("foo".toCharArray());

                } else {

                    System.out.println("Some other callback " + callback);

                }
            }
        }
    }


  • Build the sample
    You'll need to modify the build.xml for your environment
  • Restart your domain, and create an instance of the "MultiRealmIdentityAsserter"
    Under the provider-specific properties, you need to set up a few values:
    JAAS Config Entry - The entry in the JAAS login config that you want called

    Principal Class - If you're using the principal validator provided, then this is the base class of all the principals that the login module is creating. If you modified the principals so that they work with the WebLogic principal validator, then you should also update MultiRealmIdentityAsserterProviderImpl.java to return null.

    URI - The path this ServletAuthenticationFilter applies to.
  • Modify your domain to point to the JAAS login config
    Modify setDomainEnv.cmd/.sh:

    set JAVA_OPTIONS=%JAVA_OPTIONS% -Djava.security.auth.login.config=multirealm_jaas.config

    By default WLS will look for the config in the domain's home directory. Here's the very simple login config that I used:

    Sample {
        multirealm.someotherloginmodule.MyLoginModule required debug=true;
    };

  • Make sure the resource in the application is protected by the container
    If the resource isn't protected, the ServletAuthenticationFilter never gets called.
  • Make sure that the users authenticated by the JAAS Login Modules wind up in the JEE role that is protecting the application.
    For example, if your weblogic.xml looks like this:

    <wls:weblogic-web-app xmlns:wls="http://www.bea.com/ns/weblogic/weblogic-web-app" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd http://www.bea.com/ns/weblogic/weblogic-web-app http://www.bea.com/ns/weblogic/weblogic-web-app/1.0/weblogic-web-app.xsd">
    <wls:weblogic-version>10.3</wls:weblogic-version>
    <wls:security-role-assignment>
    <wls:role-name>multirealm</wls:role-name>
    <wls:principal-name>multirealm</wls:principal-name>
    </wls:security-role-assignment>
    <wls:context-root>App1</wls:context-root>
    <wls:library-ref>
    <wls:library-name>jstl</wls:library-name>
    <wls:specification-version>1.2</wls:specification-version>
    <wls:exact-match>true</wls:exact-match>
    </wls:library-ref>
    <wls:library-ref>
    <wls:library-name>wls-commonslogging-bridge-war</wls:library-name>
    <wls:specification-version>1.0</wls:specification-version>
    <wls:exact-match>true</wls:exact-match>
    </wls:library-ref>
    <wls:library-ref>
    <wls:library-name>jsf</wls:library-name>
    <wls:specification-version>1.2</wls:specification-version>
    <wls:exact-match>true</wls:exact-match>
    </wls:library-ref>
    <wls:resource-description>
    <wls:res-ref-name>dataSource</wls:res-ref-name>
    <wls:jndi-name>multiRealmDataSource</wls:jndi-name>
    </wls:resource-description>
    </wls:weblogic-web-app>

    Then you need to make sure that the authenticated Subject contains a principal named multirealm (a sketch follows this list).
  • Restart the server and attempt to access the protected resources.
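
Here's the sketch promised above - roughly what the commit() of your LoginModule needs to do so the Subject carries a principal matching the role assignment. MyAppUser and MyAppGroup are the same kind of made-up WLSAbstractPrincipal subclasses as before:

// Inside your LoginModule's commit() - make sure the Subject ends up
// with a principal whose name matches the weblogic.xml mapping
subject.getPrincipals().add(new MyAppUser(username));

// "multirealm" matches the wls:principal-name in weblogic.xml;
// MyAppGroup is another hypothetical WLSAbstractPrincipal subclass
subject.getPrincipals().add(new MyAppGroup("multirealm"));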

Summary


What you're basically doing with this approach is re-inventing the wheel. This will not work with the out-of-the-box WLS authentication providers. It will pass the application DataSource, but if you're working with LDAP then you're totally on your own. You'll also probably need to modify the MultiRealmFilter further to work with forms-based authentication... HTTP redirect to an unprotected page, and then have that page POST back to the resource with the ServletAuthenticationFilter configured. You also lose all of the WebLogic management capabilities - lockouts, password policy composition, etc. I wonder if all of this complexity is worth it. What do you think?

Thursday, April 8, 2010

SAML is good, but it's no replacement for WAM

My recent posts about SAML got me thinking about a couple of common misconceptions I see from customers surrounding the technology.

The first and most important misconception is articulated by this quote:
"there is no SAML Fairy"
- Brian Eidelman

In other words there's nothing magical about SAML. Browsers don't "speak" SAML. SAML isn't like an HTTP cookie (and it's not like a chocolate chip one either, but I digress). SAML is just a means to convey identity from one place to another, so adding SAML into your architecture doesn't do anything to make it more secure. Application sessions will still be managed with cookies, hidden form fields, information in the query string or whatever it is that the app already did.

The other misconception/misunderstanding is that SAML can take care of all of your SSO problems. Or as a customer recently put it: "if I just set up one authentication point for my enterprise and then use SAML to sign onto all of my applications I don't need a Web SSO solution like [OAM, OpenSSO, SiteMinder, etc.]". This product space is usually called Web Access Management, abbreviated WAM, which is pronounced like, but is unrelated to, the band with a similar sounding name.

I've seen this same idea discussed by customers, application vendors and others, so it seems pretty common. That the idea is common is bad, but it's the fact that it's both wrong and widespread that concerns me, and that's why I'm writing this post.

The motivations people have put forward for using SAML in this way have included:
  • using "the standard"
  • only having one thing to integrate to get SSO for everything
  • lots of applications support SAML out of the box
  • simplifying the environment
  • loosely coupling infrastructure components
and of course...
  • avoiding license costs
From a customer's perspective these are all valid desires and laudable goals. Unfortunately the reality is that SAML is simply not a replacement for a true web access management solution. The WAM space is one of those classic "elephant in the distance" problems - it's a whole lot bigger than you think it is when you start.

The main features of a WAM product (according to Wikipedia) are:
  • Authentication Management
  • Policy-based Authorization
  • Audit & Reporting Services (optional)
  • Single sign-on Convenience
Authentication Management includes obvious things like checking a username and password (i.e. HTTP Basic authentication or an HTML form) or certificates. It also includes features like requiring different authentication methods for different resources. The rest of the above are fairly self-explanatory. I'd add a few more features that are common across most of the products in the space:
  • Session management - including enforcing maximum session time, idle timeouts, administratively terminating a session, etc
  • Security in depth - e.g. securing at least both the web tier and the app tier
If you use SAML for SSO you can centralize authentication, and if you do things carefully you can almost certainly get a limited form of Single Sign-On working. But trying to get most of the other items on my lists above is actually a whole lot harder than you might think.

Consider the use case where you have two applications protected with either a WAM solution or some sort of SAML integration. With a conventional WAM product you'd be able to log into either one and move back and forth seamlessly; your session would be active on either one or both apps for as long as specified in the central configuration and if you log out of one you would be logged out of all. Contrast this with the SAML solution...

You could do central login by configuring the existing login page of each application to kick off a Service Provider initiated SSO, and as long as your session was live at the central IdP you could go back there to get an assertion for the application. Of course you'd need to make sure that the central session didn't time out too early or you lose all SSO capabilities. You could also configure SAML to support Single Logout (SLO), though doing SLO across more than a handful of Service Providers gets problematic quickly.

So what are you missing? Lots of stuff...
  • Authentication technologies. WAM products generally support a wide variety of authentication methods out of the box: from the common username and password, certificates, smart cards, SecurID and Kerberos to less common things like biometrics and multifactor methods. Enabling an additional authentication method with a WAM product generally requires simply following a set of documented configuration steps. Contrast this with what most people think of as how they'll "do" a SAML IdP for their internal SSO project... a web app deployed on an app server. In that case you'll be able to support username and password, certificates, probably Kerberos and not much more very easily (for appropriate values of "very"). Further, you start running into problems when you try to configure one web application to require more than one method - J2EE apps, for example, can only support one authentication method.
  • Idle timeouts across the environment. Idle timeouts can't be enforced because SAML doesn't include a profile for session synchronization. Period. End of statement. Simply not possible with SAML. So once a user SSOs over to an SP there's no way to let the IdP know that the user is still actively using the app.
  • Session time limits. Maximum session time limits generally can't be enforced because SAML only specifies a time period during which an assertion will be valid (NotBefore and NotOnOrAfter). The SAML specification does include another attribute (SessionNotOnOrAfter) which is supposed to specify the maximum session time the SP should allow; see the snippet after this list. In my experience this support is poorly implemented in all but the most advanced SAML-supporting products. Anecdotally, and FWIW, I've gone to a number of interop test events and I've never seen this feature included in the conformance test suite. Caveat emptor!
  • Central authorization policies. WAM products include "coarse-grained access control" features. You generally wouldn't use a WAM product to secure individual features of an application, but using a WAM product to decide who is allowed to reach each application is common.
  • Central Authorization Reporting. Centralizing policies means that you can easily figure out who is going to have access to each application. You can then use this information to generate useful things like attestation reports which are often required by security auditors.
  • Central auditing. When a WAM product is deployed across applications you gain not only the ability to report on who can do what (above) but also the ability to centrally track usage (who actually did what?). Most products will record user actions in flat files, a database, or both. Generating reports from that data then becomes a simple matter of pointing your reporting system at it and clicking "Generate Report".
There are some other important features of WAM products, but those are the ones that come to mind immediately.
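
Since I mentioned SessionNotOnOrAfter above, here's roughly where those attributes live in a SAML 2.0 assertion (the timestamps are made up):

<saml:Assertion xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion">
    <!-- NotBefore/NotOnOrAfter bound the validity of the assertion itself -->
    <saml:Conditions NotBefore="2010-04-08T12:00:00Z"
                     NotOnOrAfter="2010-04-08T12:05:00Z"/>
    <!-- SessionNotOnOrAfter caps the SP session - the rarely honored bit -->
    <saml:AuthnStatement AuthnInstant="2010-04-08T12:00:00Z"
                         SessionNotOnOrAfter="2010-04-08T20:00:00Z"/>
</saml:Assertion>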

Then in addition to all of the features you're missing, you introduce a bunch of additional problems. Such as? How about the fact that each Service Provider requires an X.509 certificate? And the fact that setting up a SAML Service Provider is still painful (even with the metadata exchange protocol). And the fact that when you stand up a new SP you have to configure settings on the IdP.

None of which you have to worry about with a WAM product.

So when it comes to web browsers, use SAML for the things it was intended to do - propagating identity across boundaries - and use WAM products for your intra-company SSO.

Do you disagree? Are you using SAML for your web SSO solution successfully? Have you figured out something I haven't?
Let me know below.

Wednesday, April 7, 2010

Security Clarification: OAM Identity Asserter for WebLogic

I want to clarify something rather important about the Oracle Access Manager (OAM) identity asserter for WebLogic. The OAM identity asserter is invoked based on the token "OBSSO Cookie" (though I think it now also lets you choose "OAM_REMOTE_USER"). It also contains provider specific configuration for connecting to an OAM access server. This might lead you to conclude that the identity asserter is validating the OBSSO cookie and then asserting the name extracted from the cookie.

The reality though is that even when the OBSSO cookie is the configured token for invoking the identity asserter, the asserter itself is simply asserting the value of the OAM_REMOTE_USER header. So, what about all the configuration for the connection to the access gate? The access gate connection only comes into play in an OAM/OWSM integration scenario where there is no webgate on a web server in front of WLS. If you are using the identity asserter to propagate an identity to an OAM protected web app then you don’t even need to fill in all that info.

If you had the wrong impression, don’t feel too bad; it is an easy and common mistake to make. The documentation touches on this here, as step 2 is about establishing trust with WLS, but glosses over it in its description (10.2.2.2) of how the identity asserter works, which is why I thought this blog post was necessary.

So, given that the OAM identity asserter is asserting an identity contained in a clear text header that is inserted into the request at the web server, it is vitally important that measures are taken to ensure that all requests to OAM protected WLS resources come through the OAM webgate enabled web server.

As Chris mentioned in his post a few weeks ago this can be done by:
  1. Taking network security measures.
  2. Two-way SSL.
  3. Utilizing the WLS connection filter to lock down what IP addresses WLS will service requests from.

Thursday, April 1, 2010

SAML, REST, smart phones and you

(or Smart devices, not so smart protocols)

I've been working on and off with a customer on a project that involves all sorts of cool buzzwords - iPhone/Android/Blackberry Apps as clients, using REST to invoke Web Services, authenticating via SAML. While I can't go into the details or reveal too much about the project there is one line of discussion that is really interesting.

First the background:
  1. A thick client, running on a smartphone, will do some sort of handshake with one web server to authenticate the user.
  2. Once the user is authenticated that server will issue the user a SAML Assertion.
  3. The client will then use the SAML assertion to authenticate to a different server and will send REST-style requests to invoke services on that server.
So something like this:



This raises the question: why not just use a conventional web SSO product like Oracle Access Manager? An excellent question, the short answer to which is that the Authentication Server and the REST Server are run by two different companies and don't share any infrastructure (a more common situation than you might think).

SAML actually solves a whole bunch of painful problems in this architecture - the AuthN server can sign the SAML Assertion to prove its validity and protect it from alteration and encrypt it to prevent the user from even seeing its contents. SAML also allows the AuthN server to send additional information about the user (i.e. attributes) in an extensible way - and additional attributes can be added later without needing to change any infrastructure, communication protocols or even the client.

Did you fill up your Buzzword bingo card yet?

So moving on to the problems...

Transmitting the SAML Assertion
If we were using SOAP to go from the device to the server, WS-Security would have solved all of our problems: that standard spells out exactly how to attach a SAML assertion to a SOAP message so that both the client and server can understand it. Unfortunately almost all "smart" devices lack a full SOAP stack. Further complicating matters is the fact that REST is an (air quote) "standard" intended to be a very lightweight way to send requests to a server and get back a structured response. Because it's intended to be so much simpler than SOAP there are very few (read: no) standards around things like authentication, encryption, signing or any of the things that make SOAP a bit complicated at times.

All of which is just a long way to say that you're basically on your own figuring out how to use SAML with REST.

There are a few obvious options of how to use SAML with REST.
  1. Send the SAML assertion to the server and swap it for a cookie. Your deployment then becomes nothing more than a standard web SSO situation and your application doesn't need to worry about the SAML bits.
  2. Send the SAML assertion in every request as part of the POST data. This places the responsibility for parsing the SAML assertion into your application logic or something that can see and handle the HTTP POST data stream.
  3. Send the SAML assertion in every request as an HTTP header. This is a slight variant of #2 that is more similar to SOAP's WS-Security model where the authentication information is separated from the actual input/output parameters of the call.
I like option 1 because it pushes handling the SAML assertion out of scope, or in other words into an S.E.P. On the other hand, having a thick client interacting and cooperating with a web SSO solution introduces a whole raft of other issues, including properly handling things like idle and session timeouts, dealing with redirects, and a long list of others. Web SSO products were designed to interact with web browsers, and their 'on the wire' behavior can be difficult to understand from an HTTP client's perspective. I'm not convinced that this is the best solution to our problem, so on to options 2 and 3.

Options 2 and 3 are nearly identical, differing only in where the assertion goes in the HTTP request. That subtle difference is actually kind of a big deal, and after thinking about it for a while I vastly prefer option 3. Besides the logical separation of authentication and inputs, I have a few other reasons, such as the fact that POST data can generally only be consumed once - which is really important for my next trick.
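
On the client side, option 3 is just one extra header per request. A quick sketch using plain java.net (the URL and header name are made up; your auth filter would need to look for the same header):

import java.net.HttpURLConnection;
import java.net.URL;

public class RestClientSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical service URL
        URL url = new URL("https://rest.example.com/services/account");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();

        // The header name is arbitrary - whatever your auth filter expects.
        // The value is the (typically base64-encoded) SAML assertion that
        // the AuthN server handed you.
        conn.setRequestProperty("X-SAML-Assertion", "...");

        System.out.println("HTTP " + conn.getResponseCode());
    }
}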

You probably know about Servlet Filters, and if you've been reading this blog for a while you probably know about WebLogic's security framework (often called the SSPI framework). What you may not know is how to put them together into a Servlet Authentication Filter. Basically you write a Servlet Filter that takes on responsibility for getting the authentication token and then asks WebLogic to call the authentication provider for you. If everything works out WebLogic goes ahead and establishes a session for you. Then when your actual service wants to know who is invoking it, it can ask by calling weblogic.security.Security.getCurrentSubject().

No fuss no muss. And most importantly you don't have to commingle service logic with any code to deal with SAML, encryption keys, XML parsing or anything else unrelated to actually doing your actual work!
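
Inside the service, identity is then one call away. A minimal sketch (the servlet itself is hypothetical; SubjectUtils is the usual convenience for pulling the username out):

import java.io.IOException;

import javax.security.auth.Subject;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import weblogic.security.Security;
import weblogic.security.SubjectUtils;

public class WhoAmIService extends HttpServlet {
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        // The auth filter already established the session, so there's
        // no SAML handling anywhere near the service logic
        Subject subject = Security.getCurrentSubject();
        resp.getWriter().println("Hello " + SubjectUtils.getUsername(subject));
    }
}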

Session Management
One of the concerns with sending the SAML assertion along with every request is the performance impact of the XML and crypto operations. If you are invoking a simple service (hello world, for example) the overhead of all that SAML processing seems like it might be awfully expensive. If you had to pay that price with every request the overhead would quickly eat up your CPU cycles, grinding even a reasonably fast machine to a halt under load.

Thankfully WebLogic's designers thought about this very problem.

The first and most obvious solution is to act just a little bit more like a browser. When you authenticate to WebLogic it automatically creates a session for you and sends a cookie (usually named JSESSIONID) back to your browser. If you include that cookie with subsequent requests there's no need to authenticate again. So if you smarten up the client so that it handles cookies gracefully, you'll avoid WebLogic having to re-parse and validate your SAML assertion. In fact, if I'm reading the relevant iPhone SDK docs (just to cite one example) correctly, the iPhone SDK handles cookies properly for you automatically by default! Android includes Apache HttpClient, which makes cookie handling almost trivial. And as for BlackBerry, well, it's J2ME, which means you'll have to do cookie parsing by hand - which, while unfortunate, isn't the end of the world.
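
For the Android case, here's a sketch of what "almost trivial" means with Apache HttpClient (the URL and header name are carried over from my earlier made-up examples):

import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.util.EntityUtils;

public class CookieAwareClientSketch {
    public static void main(String[] args) throws Exception {
        // One client instance = one cookie store, so the JSESSIONID that
        // WebLogic sends back is captured and replayed automatically
        DefaultHttpClient client = new DefaultHttpClient();

        HttpGet first = new HttpGet("https://rest.example.com/services/account");
        first.setHeader("X-SAML-Assertion", "..."); // assertion sent once
        EntityUtils.toString(client.execute(first).getEntity());

        // This call rides the session cookie - no SAML re-validation needed
        HttpGet second = new HttpGet("https://rest.example.com/services/account");
        EntityUtils.toString(client.execute(second).getEntity());

        client.getConnectionManager().shutdown();
    }
}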

As long as you do the right thing with the cookies coming from WebLogic your session will be fine. If something happens to your session (e.g. the server gets rebooted, you get shunted off to another server that doesn't know about your session, your session times out, etc) the auth filter will automatically reestablish a session as long as your SAML assertion is still OK.

But that's only one part of the solution. If you disable WebLogic's issuance of cookies, or you choose not to handle cookies in your thick client's code, WebLogic has still got your back.

Weblogic's Identity Assertion Cache
Decrypting a chunk of XML, parsing it, and extracting some data takes some CPU cycles, but isn't all that slow. Searching an LDAP directory to find a user, then doing another search to chase down all of the group memberships on the other hand takes real, measurable clock time. Some of the time is because you're doing a search and some is because you're going over a physical wire to talk to the LDAP server and those wires (AFAIK) are still subject to the laws of physics.

The WebLogic docs describe the setting in some detail. The Javadoc for the Authentication Provider says:
The assertIdentity() method of an Identity Assertion provider is called every time identity assertion occurs, but the LoginModules may not be called if the Subject is cached. The -Dweblogic.security.identityAssertionTTL flag can be used to affect this behavior (for example, to modify the default TTL of 5 minutes or to disable the cache by setting the flag to -1).
And the command line reference fills in some more details:
When using an Identity Assertion provider (either for an X.509 certificate or some other type of token), Subjects are cached within the server. This greatly enhances performance for servlets and EJB methods with run-as tags as well as for other places where identity assertion is used but not cached (for example, signing and encrypting XML documents). There might be some cases where this caching violates the desired semantics.
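In an architecture like this one you might even stretch that TTL out. Borrowing the setDomainEnv style from the multi-realm post (the value is in seconds as far as I can tell, so 1800 is 30 minutes - pick whatever matches your risk tolerance):

set JAVA_OPTIONS=%JAVA_OPTIONS% -Dweblogic.security.identityAssertionTTL=1800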
Wrapping it all up
So to summarize:
  • SAML is cool
  • smart devices are pretty cool, but they lack a SOAP stack
  • WebLogic's SSPI framework is cool
  • the WebLogic engineering team thought of darn near everything
Oh, and if you combine a Servlet Auth Filter, the SAML SSPI Identity Asserter, and a teensy bit of code to handle cookies on the client side, you can do some pretty clever things.

Got a comment or question? Let me know below!
----
Update: After having this up a few days I had a talk with someone out of band who in effect said "you said #1 is probably not the best way, but then you went through a whole discussion about 2 and 3 and wound up describing #1 and saying it was best." So I obviously need to clarify.

What I was talking about in #1 is explicitly invoking a specific login service - in other words calling login(String samlAssertion) and having a token come back. The problem with that solution is that it's complicated, and if the token isn't acceptable for some reason you need to know how to go back and get a fresh one.

In the rest of the post I describe sending the SAML Assertion on every request and doing "the right thing" when it comes to cookies. If the server sees the cookies and can find the associated session it won't bother checking the SAML assertion. If you don't have a cookie, the cookie or session is invalid or something else goes wrong then the server will go ahead and validate the SAML Assertion and establish a new one.

Hope that clears things up.