Thursday, 6 February 2014

Stateful and Stateless Applications

In the early days, web pages were stateless and static. No matter how many times you visited a page, you received the same content. However, as web applications became more and more complicated, people found a need to provide customized, dedicated and dynamic content. To achieve that, authentication has become a must-have feature for most modern web applications.

However, things are not that straightforward, because HTTP is an unsecured and stateless protocol. To make things worse, HTTP on its own, from HTTP 1.0 onward, does not remember any information about the web client that initiated a request.

To overcome this obstacle, the common solution is to include some kind of ID as a cookie in subsequent requests. With this simple technique, the server can identify the requests that come from the same client and serve dedicated content to each client. This solution has become so popular that it is built into most web servers that support dynamic content. For example, if you check the cookies of a web page and see JSESSIONID, the page is most likely implemented in Java; PHP and .NET have their own equivalents (PHPSESSID and ASP.NET_SessionId). I once tested how the server recognizes a Java session by copying the JSESSIONID cookie from Chrome to Firefox, and the session was still maintained. It proves that, at least for Tomcat, the server only checks JSESSIONID, not the browser version or any other information.

As you may guess from the above, the server needs some way to recognize the session cookie of the web client. The strategy used to handle the session cookie splits web applications into stateful and stateless applications.

The stateful approach is supported by the Servlet API. In any web server that implements the Servlet API, the server or web application stores the session data somewhere, keyed by the session cookie, and uses it to reconstruct the HttpSession object. For better performance, most server implementations keep this session information in memory and only dump it to the file system when memory is running low or to persist sessions before a restart.
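To make the stateful idea concrete, here is a minimal sketch of an in-memory session table. It is not how Tomcat or any Servlet container is actually implemented; the class name InMemorySessionStore, its methods and the 30-minute timeout are all assumptions made for illustration.

```python
import secrets
import time

SESSION_TIMEOUT_SECONDS = 30 * 60  # assumed timeout, not taken from any server


class InMemorySessionStore:
    """Hypothetical session table: the cookie holds only an opaque ID,
    while the session data itself stays on the server."""

    def __init__(self, timeout=SESSION_TIMEOUT_SECONDS):
        self._timeout = timeout
        self._sessions = {}  # session_id -> (last_access_time, data dict)

    def create(self, now=None):
        """Create an empty session and return the ID to send as a cookie."""
        now = time.time() if now is None else now
        session_id = secrets.token_hex(16)
        self._sessions[session_id] = (now, {})
        return session_id

    def get(self, session_id, now=None):
        """Return the session data, or None if the ID is unknown or expired."""
        now = time.time() if now is None else now
        entry = self._sessions.get(session_id)
        if entry is None:
            return None
        last_access, data = entry
        if now - last_access > self._timeout:
            del self._sessions[session_id]
            return None
        # Each successful lookup counts as activity and extends the session.
        self._sessions[session_id] = (now, data)
        return data

    def purge_expired(self, now=None):
        """Remove expired sessions; a stateful server has to do this regularly."""
        now = time.time() if now is None else now
        expired = [sid for sid, (last, _) in self._sessions.items()
                   if now - last > self._timeout]
        for sid in expired:
            del self._sessions[sid]
        return len(expired)
```

Note that the lookup uses nothing but the ID, which matches the Chrome-to-Firefox experiment above: whoever presents the ID gets the session.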

The stateless approach is not part of the Servlet API. Most of the time, developers need to implement it themselves unless they use a framework that supports stateless sessions, such as the Play framework. Because of this, I will go further into explaining how to implement stateless sessions in this article.

To implement a stateless session, the cookie is used as the place to store all of the session content. The server signs this content with a secret key, which can later be used to verify that the session cookie has not been tampered with. Depending on the nature of your application, you can decide whether or not to encrypt the content of the session cookie. In the Play framework, the session cookie is signed but kept in plain text. In that case, you need to take care not to store anything confidential in the session cookie. Normally, a session has a timeout, so you may want to include a timestamp in your session cookie. In a stateful implementation, the server needs to regularly check for and clean up expired sessions, but with stateless sessions there is nothing to clean up. If the session cookie has timed out, reject it; otherwise, refresh the timestamp on every request to keep the session alive.
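The steps above can be sketched as follows. This is a minimal illustration using HMAC-SHA256 from the Python standard library, not the cookie format of Play or any other framework; SECRET_KEY, the timeout, the "payload.signature" layout and the function names are all assumptions. The content is signed but not encrypted, so anyone holding the cookie can read it.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET_KEY = b"change-me"          # hypothetical key; keep the real one private
SESSION_TIMEOUT_SECONDS = 30 * 60  # assumed timeout


def _sign(payload: bytes) -> bytes:
    """Hex-encoded HMAC-SHA256 of the payload under the server secret."""
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest().encode("ascii")


def encode_session(data, now=None):
    """Build the cookie value: base64(JSON with timestamp) + '.' + signature."""
    now = time.time() if now is None else now
    body = json.dumps({"data": data, "ts": int(now)}, sort_keys=True)
    payload = base64.urlsafe_b64encode(body.encode("utf-8"))
    return (payload + b"." + _sign(payload)).decode("ascii")


def decode_session(cookie, now=None):
    """Return the session data, or None if the cookie is forged or timed out."""
    now = time.time() if now is None else now
    try:
        raw = cookie.encode("ascii")
    except UnicodeEncodeError:
        return None
    payload, sep, signature = raw.rpartition(b".")
    if not sep:
        return None
    # Constant-time comparison avoids leaking how much of the signature matched.
    if not hmac.compare_digest(_sign(payload), signature):
        return None
    session = json.loads(base64.urlsafe_b64decode(payload))
    if now - session["ts"] > SESSION_TIMEOUT_SECONDS:
        return None
    return session["data"]
```

To keep the session alive, a request handler would call decode_session on the incoming cookie and, if it returns data, send back encode_session(data) so that the timestamp is refreshed.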

There is also one more variation of the stateless session, available in the Ruby on Rails framework, where the session is stored in the DB rather than on the server itself. With this strategy, RoR still gets the silent failover of a stateless application and can still store confidential data in the session.

Now that you know about both strategies, I would like to offer some thoughts on the advantages and disadvantages of stateful and stateless sessions.

In terms of complexity, they are pretty equal. However, as stateful sessions are part of the Servlet API, you do not need to implement them yourself. If you go stateless, you may not be so lucky unless you pick one of the specific frameworks that support stateless sessions.

In terms of efficiency, both strategies have weaknesses. Stateful sessions are very vulnerable to DoS attacks. Even if you choose the option to dump sessions to file system storage, a flood of requests that each create a new, empty session can quickly fill up your session table and make the server bear the pain of maintaining it. Loss of sessions when a server goes down is another major weakness in a clustered environment.

Stateless sessions have their own problems. Cookie size is pretty limited (about 4 KB). Hence, you will run into some funny exceptions if you store too much information in the session. When you need to store more things in the session, you may need to build your own layer on top of it. Simply speaking, you may need to set up your own filter to populate the full user profile based on the limited data you have in the client-side session (you may only store the user ID in the session cookie). This means that frequent DB access is required, and you had better implement some smart caching here to avoid overloading your DB. For RoR, I am still skeptical of the strategy of storing sessions in the DB. The biggest compromise is performance. For most of my career, the biggest bottleneck we have needed to solve for high-scale applications has been the DB, and writing to the DB for every session creation still leaves your app vulnerable to DoS attacks. In this case, I guess they only achieve silent failover.
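As a rough sketch of the filter-plus-caching idea, assuming the session cookie carries only a user ID: the loader below asks the DB for a profile only when it has no fresh cached copy. The class CachedProfileLoader, the load_from_db callback and the 60-second lifetime are all hypothetical; a real application would more likely use a shared cache so that every server in the cluster benefits.

```python
import time


class CachedProfileLoader:
    """Hypothetical helper a request filter could call with the user ID
    taken from the session cookie."""

    def __init__(self, load_from_db, ttl_seconds=60):
        self._load_from_db = load_from_db  # function: user_id -> profile
        self._ttl = ttl_seconds            # assumed cache lifetime
        self._cache = {}                   # user_id -> (loaded_at, profile)

    def get(self, user_id, now=None):
        """Return the profile, querying the DB only on a miss or stale entry."""
        now = time.time() if now is None else now
        cached = self._cache.get(user_id)
        if cached is not None and now - cached[0] <= self._ttl:
            return cached[1]
        profile = self._load_from_db(user_id)
        self._cache[user_id] = (now, profile)
        return profile
```

The trade-off is staleness: a profile changed in the DB may not be visible until the cached entry expires.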

With this information, I hope you can decide for yourself which strategy fits your application best.