Generalization of odd SaaS driven requirements and their technical fulfillments

ajaym259 | General

Overview of the typical SaaS application :

Most SaaS applications fulfill these requirements :

– there is a single deployment of one codebase

– there is a single deployment of one database

– multiple and distinct user groups use it, view it as their “own” customized application

– it is web based, accessible over browsers

For example, the SaaS provider deploys a single application called, say, App, with an AppDB. Many companies then subscribe to it by paying license fees to the SaaS provider. Inside one such company called, say, MyCompany, the users (a distinct user group) perceive and use it as their own customized MyApp. Once logged in, the users are completely inside their own silo application, and have no relation with any other subscribing company's users, data or operations.

Multi-tenanting : In order to achieve this, it is quite common to design the AppDB in a multi-tenanted mode. Which means that while the users are using their MyApp, their data is also being stored in a logical compartment called say MyAppDB.

While there are other ways of creating this logical MyAppDB, the most common is at the database design level. For example, a root entity could be called the Licensor (i.e. the SaaS provider). Each subscriber is added to it as a Company entity. Users are then added to the Company. From then on, every piece of data, every table, must have the Company as a column (attribute). Whenever the user reads or writes any data, the request must first specify the CompanyID, so that only the silo of data carrying the same CompanyID is retrieved.
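A minimal sketch of this CompanyID-per-row design, using Python's built-in sqlite3 module with an in-memory database. The table names, column names and sample rows are assumptions made for illustration only; they are not taken from any particular SaaS product.

```python
import sqlite3


def make_db():
    """Build a tiny multi-tenanted schema (hypothetical tables and data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE orders (
            id           INTEGER PRIMARY KEY,
            company_id   INTEGER NOT NULL REFERENCES company(id),
            order_number TEXT NOT NULL,
            order_value  REAL NOT NULL
        );
        """
    )
    conn.executemany(
        "INSERT INTO company (id, name) VALUES (?, ?)",
        [(1, "CompanyA"), (2, "CompanyB")],
    )
    conn.executemany(
        "INSERT INTO orders (company_id, order_number, order_value) VALUES (?, ?, ?)",
        [(1, "X-100", 250.0), (1, "X-101", 75.0), (2, "X-100", 990.0)],
    )
    return conn


def orders_for_company(conn, company_id):
    """Every read carries the tenant marker in its WHERE clause."""
    rows = conn.execute(
        "SELECT order_number, order_value FROM orders "
        "WHERE company_id = ? ORDER BY order_number",
        (company_id,),
    )
    return rows.fetchall()


conn = make_db()
print(orders_for_company(conn, 1))
print(orders_for_company(conn, 2))
```

Note that both companies hold an order numbered X-100, and only the company_id filter keeps them apart. Forgetting that filter in a single query is enough to leak one company's data to another.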

Another way of multi-tenanting is to use separate database instances. In which case, immediately after a login, the user’s requests are made to access only that database URL which pertains to a specific Company.
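The separate-database approach could be sketched roughly as below. TenantRouter is a hypothetical helper, and the in-memory SQLite databases merely stand in for the per-company database URLs; a real deployment would map each company to its own server or schema.

```python
import sqlite3


class TenantRouter:
    """Hands out one database connection per company (hypothetical helper)."""

    def __init__(self, urls):
        self._urls = dict(urls)      # company id -> database URL or path
        self._connections = {}       # opened lazily, one per company

    def connection_for(self, company_id):
        if company_id not in self._urls:
            raise KeyError(f"unknown company: {company_id}")
        if company_id not in self._connections:
            self._connections[company_id] = sqlite3.connect(self._urls[company_id])
        return self._connections[company_id]


# Each ":memory:" connection is a separate, private database.
router = TenantRouter({"CompanyA": ":memory:", "CompanyB": ":memory:"})

for company, value in [("CompanyA", 250.0), ("CompanyB", 990.0)]:
    db = router.connection_for(company)
    # No company_id column is needed: the whole database belongs to one tenant.
    db.execute("CREATE TABLE orders (order_number TEXT PRIMARY KEY, order_value REAL)")
    db.execute("INSERT INTO orders VALUES (?, ?)", ("X-100", value))


def order_value(company_id, order_number):
    """Look up an order inside the database that belongs to the given company."""
    row = router.connection_for(company_id).execute(
        "SELECT order_value FROM orders WHERE order_number = ?", (order_number,)
    ).fetchone()
    return None if row is None else row[0]


print(order_value("CompanyA", "X-100"), order_value("CompanyB", "X-100"))
```

The tenant is chosen once, when the connection is picked, so the individual SQL statements stay free of any CompanyID condition.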

Appeal of SaaS applications : Quite often the one-deployment-multiple-licensees concept is strongly attractive to the business community, because it is thought that such apps are easy to maintain, have low entry and maintenance costs for end-users, and are “controllable” from a consolidated standpoint. However, as one analyzes the complete life-cycle (5-6 years) of many SaaS products, and goes deep into the technical aspects of such apps, one may find this viewpoint too simplistic. Only in some common and generic domains do SaaS apps deliver on this promise. We shall examine many aspects of SaaS apps in the paragraphs below and leave that judgement to the reader.

Data (or Class) models : The model view is very often like this : (note: Company can also mean user group).

So every object accessed must use a Company identification in addition to any other identification. This has an impact on every database access statement and write statement.

Authentication : In the simplest case, any user approaching the website must first select the “Company” under which he/she wishes to log in. This is too naive, so the user is usually asked to select the Company after he/she logs in. If it is guaranteed that one user belongs to one and only one company, the system will do this selection automatically.

Therefore, a scenario of “self-registering” in such apps has to go via a CompanyAdmin. Users cannot register themselves to any arbitrary company of their choice. Very often, users are created by such CompanyAdmin logins and then sent an email invite to self-register themselves with other necessary data.

It is simplistic to assume that one user will always belong to one and only one company. For example, some users may be “Consultants” or “Vendors” who have relations with more than one company, and who access only specific orders or deal with only specific customers. The “data silo” therefore has to be sub-divided further, and a very strict “roles” definition over all entities/operations has to be followed, with the “CompanyID” being a part of each and every SQL statement.

Authorizations (Roles, Permissions) : It is at this stage that the application starts to get a little more complicated.

If one were to use out-of-the-box authorization packages (for the cases where a user can belong to more than one Company), most such packages would need heavy customization. Roles and permissions in out-of-the-box packages will rarely include the “WHERE CompanyID=xyz AND” clause automatically.

Very often, SaaS technical personnel are left with no choice but to design their own authorization and roles settings, often keeping roles/permission bags/permissions in a set of database tables.

If the authorizations are simple, i.e. they do not extend to ACLs over specific entities (say, one user may only see customerOrders 3, 6, and 9), one may go ahead with an out-of-the-box auth provider like ACEGI. But this is rarely so, and in most cases database storage of fine-grained roles, permissions and ACLs over entities is the only way out.
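As a rough illustration of database-stored ACLs over specific entities, the sketch below keeps a per-user list of visible orders in its own table and joins against it on every fetch. The table names, the user id 42 and the sample data are all assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer_order (
        id         INTEGER PRIMARY KEY,
        company_id INTEGER NOT NULL
    );
    -- One row per (user, order) pair the user is allowed to see.
    CREATE TABLE order_acl (
        user_id  INTEGER NOT NULL,
        order_id INTEGER NOT NULL REFERENCES customer_order(id),
        PRIMARY KEY (user_id, order_id)
    );
    """
)
# Ten orders for company 1; the hypothetical user 42 may see only 3, 6 and 9.
conn.executemany(
    "INSERT INTO customer_order (id, company_id) VALUES (?, ?)",
    [(i, 1) for i in range(1, 11)],
)
conn.executemany("INSERT INTO order_acl VALUES (?, ?)", [(42, 3), (42, 6), (42, 9)])


def visible_order_ids(conn, user_id, company_id):
    """Apply the tenant filter and the per-entity ACL in the same query."""
    rows = conn.execute(
        """
        SELECT o.id
        FROM customer_order o
        JOIN order_acl a ON a.order_id = o.id
        WHERE a.user_id = ? AND o.company_id = ?
        ORDER BY o.id
        """,
        (user_id, company_id),
    )
    return [row[0] for row in rows]


print(visible_order_ids(conn, 42, 1))
```

Every list screen now pays for an extra join, which is one reason why such fine-grained rules are hard to cache per session.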

This has an effect on scaling. Because it now becomes impossible to store the set of roles and permissions in the session, the “RolesAndPermissions” have to be fetched before or after every search operation.

When the user belongs to 2 or more companies, the scenario can be something like this – “fetch a list of orders where the user is a consultant to a customer of the company this user is currently logged in for”. One can easily envision that a multi-tenanted database multiplies the time taken for this operation.

Even the rendering of the screens can become overly complicated. Say such a user can do OperationX when logged in under CompanyA, but cannot do OperationX when logged in under CompanyB.
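The OperationX case can be sketched with roles that hang off the (user, company) pair rather than off the user alone. The role names, the user "asha" and the in-memory dictionaries are assumptions for illustration; as discussed above, a real application would usually keep this data in database tables.

```python
# Role -> permissions it grants (hypothetical names).
ROLE_PERMISSIONS = {
    "employee": {"view_orders", "operation_x"},
    "consultant": {"view_orders"},
}

# Roles are attached to a (user, company) pair, not to the user alone.
MEMBERSHIPS = {
    ("asha", "CompanyA"): {"employee"},
    ("asha", "CompanyB"): {"consultant"},
}


def can(user, company, permission):
    """True if any role the user holds in this company grants the permission."""
    roles = MEMBERSHIPS.get((user, company), set())
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in roles)


# The same user gets a different answer depending on the company context.
print(can("asha", "CompanyA", "operation_x"))
print(can("asha", "CompanyB", "operation_x"))
```

The screen-rendering code would have to ask this question for every button or menu item, using whichever company the user is currently logged in under.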

The reason we are discussing Authorizations in such excessive detail is because such scenarios (user belonging to two companies) are quite common as soon as the SaaS app tries to model enterprise business processes. Enterprises, in order to be truly web-connected, need to extend their set of users to more than just their employees. They need to include vendors, consultants, government officials etc.

One last issue before we let go of authorizations. Very often, each Company, believing that they have paid for MyApp and have full flexibility, will want to customize MyApp in a very fine-grained way, according to their own internal authorization rules. So a do-it-yourself set of RolesSetUp screens is often a mandatory part of SaaS apps. As demands grow and fine-grained customizations increase, the developers and testers on the provider's side can hardly keep track of the extreme complications of ACL-based object retrievals (why are no orders being fetched on this screen?).

The object graphs inevitably tend to have multiple parents and cyclic paths.

Integration : Let us say that an enterprise has several other applications, an accounting application, a time-sheet application, some client-server apps and some other SaaS applications, that it wishes to now integrate with this current SaaS application.

The technical issues here remain the same: for every database read or write, the integrating applications must necessarily “know” the identities in the main SaaS multi-tenanted database. For example, let us say CompanyA’s accounting application needs to know the orderValue for order X. It is not enough for it to query by using an enterprise-wide “Order Number”, since such an Order Number might very well be in use in CompanyB too. So either the accounting application has to “know” the OrderID in the SaaS database, or it has to say “Order Number X for CompanyA”.
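A small sketch of the “Order Number X for CompanyA” lookup an integrating application would need. The schema, the OrderIDs 501 and 502 and the function name resolve_order_id are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE orders (
        order_id     INTEGER PRIMARY KEY,   -- internal SaaS identity
        company      TEXT NOT NULL,
        order_number TEXT NOT NULL,         -- business key, unique per company only
        order_value  REAL NOT NULL,
        UNIQUE (company, order_number)
    )
    """
)
conn.executemany(
    "INSERT INTO orders (order_id, company, order_number, order_value) "
    "VALUES (?, ?, ?, ?)",
    [(501, "CompanyA", "X", 120.0), (502, "CompanyB", "X", 80.0)],
)


def resolve_order_id(company, order_number):
    """Translate a company-scoped business key into the internal OrderID."""
    row = conn.execute(
        "SELECT order_id FROM orders WHERE company = ? AND order_number = ?",
        (company, order_number),
    ).fetchone()
    return None if row is None else row[0]


print(resolve_order_id("CompanyA", "X"), resolve_order_id("CompanyB", "X"))
```

The uniqueness constraint covers the pair (company, order_number), which is what allows two companies to use the same Order Number without colliding.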

Here Order Number is a “business key”, a universally understood identifier for an order. But for some odd reason, let us say, CompanyB does not maintain its orders by their Order Numbers; it does so instead by a different business key, say Inquiry Number.

There can be a proliferation of business keys in the Order model. This is not a good design, because an Order will then not represent reality for either CompanyA or CompanyB, but will be a union of both.

After many modifications, SaaS databases typically tend to have large “holes”: large numbers of nullable fields that often remain empty. If such fields are not varchars, the database size bloats for no real reason.

Multiple fields for subtly different attributes : This is not just about business keys. It is indeed very difficult to settle on “one and only one model of an Order” for all companies under the sun in a typical SaaS app. Some may want their order number to be a number, i.e. an integer; some may want a specifically formatted string.

The implications of such wish-lists are many. Validations of front-end entries become company specific. Data types tend to become too generic (Strings, HashMaps). In extreme cases, SaaS applications often need to use things like Custom Fields.
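Company-specific validation might look something like the sketch below, where each company registers its own rule for what a valid order number is. The two patterns, including the “INQ-” format, are invented for illustration.

```python
import re

# Per-company rule for what a valid order number looks like (hypothetical rules).
ORDER_NUMBER_RULES = {
    "CompanyA": re.compile(r"\d+"),                 # plain integer
    "CompanyB": re.compile(r"INQ-\d{4}-[A-Z]{2}"),  # formatted string
}


def is_valid_order_number(company, value):
    """Validate a front-end entry against the rule registered for the company."""
    rule = ORDER_NUMBER_RULES.get(company)
    # fullmatch requires the whole value to fit the pattern, not just a prefix.
    return rule is not None and rule.fullmatch(value) is not None


print(is_valid_order_number("CompanyA", "12345"))
print(is_valid_order_number("CompanyB", "12345"))
```

Since both formats have to fit into one column, the stored type ends up being a string, which is the kind of type loosening described above.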

Custom Fields : The problem of “one Order model does not suit all companies” is ultimately solved by a concept called “custom fields”. In this mode, a CompanyA can create, for its own special needs, a set of CustomOrder fields which get appended to a core set of Order fields on every query. The front-end shows these custom fields, they can be updated just like normal Order fields.

Behind the scenes, it is often a single column table storing string values, and a set of “per Company custom field definitions”.

It would not matter so much if the numbers were few. But very often, it is the custom fields which become vitally important differentiators for a company to continue using the SaaS application. Yet custom fields are not ordinary database fields. They cannot be used directly in a “WHERE xyz =” clause. In queries, reports, data aggregations and integration code, they have to be referred to by their “meta-data”.
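A minimal sketch of such a custom-field store, assuming one table of per-company field definitions and one table with a single string value column. All table and field names (“region”, “priority”) and the sample data are hypothetical. It shows why a filter on a custom field has to go through the field's meta-data instead of a plain column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id INTEGER PRIMARY KEY, company_id INTEGER NOT NULL);
    -- Per-company definitions: the "meta-data" of each custom field.
    CREATE TABLE custom_field_def (
        id         INTEGER PRIMARY KEY,
        company_id INTEGER NOT NULL,
        name       TEXT NOT NULL,
        data_type  TEXT NOT NULL,
        UNIQUE (company_id, name)
    );
    -- A single value column holding everything as a string.
    CREATE TABLE custom_field_value (
        order_id INTEGER NOT NULL REFERENCES orders(id),
        field_id INTEGER NOT NULL REFERENCES custom_field_def(id),
        value    TEXT,
        PRIMARY KEY (order_id, field_id)
    );
    """
)
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 1), (2, 1), (3, 2)])
conn.executemany(
    "INSERT INTO custom_field_def VALUES (?, ?, ?, ?)",
    [(10, 1, "region", "string"), (11, 1, "priority", "integer"),
     (20, 2, "region", "string")],
)
conn.executemany(
    "INSERT INTO custom_field_value VALUES (?, ?, ?)",
    [(1, 10, "EMEA"), (1, 11, "2"), (2, 10, "APAC"), (3, 20, "EMEA")],
)


def orders_where_custom(conn, company_id, field_name, value):
    """Filter orders on a custom field by looking the field up by name."""
    rows = conn.execute(
        """
        SELECT o.id
        FROM orders o
        JOIN custom_field_value v ON v.order_id = o.id
        JOIN custom_field_def d   ON d.id = v.field_id
        WHERE o.company_id = ? AND d.company_id = ?
          AND d.name = ? AND v.value = ?
        ORDER BY o.id
        """,
        (company_id, company_id, field_name, str(value)),
    )
    return [row[0] for row in rows]


print(orders_where_custom(conn, 1, "region", "EMEA"))
```

What would have been a one-line condition on an ordinary column becomes a two-join query, and the comparison happens on strings even for the field declared as an integer.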

On the front-end too, custom fields make the screens overly complicated. Because each custom field has a “presentation” aspect, they can be a select box in one company and a radio-group in another, a number in one and a formatted string in another.

Reports : Most applications have a “reports” module. Apart from a few “canned” reports, there is usually a query builder to create custom reports. All the issues discussed above – data bifurcations, data permissions, data identities, data meta-data – come together in this query builder and stand out sharply. SaaS applications may find it difficult to pick an out-of-the-box query builder.

Scaling : Imagine a scenario where a SaaS application is deployed, and initially finds only small companies as its licensees, which means, say, 20 users per company. Then it becomes popular, and large companies become licensees, with say 10000 users each. After this, there is a bandwagon effect, so every small SME becomes a licensee, with say 2 users per SME.

SaaS applications find it difficult to handle scaling by means other than “add more hardware”. Request load balancing runs into the technical issues of copying sessions across servers and dispatching requests based on CompanyID. When the applications have messaging queues and long-running workflows, load balancing becomes even more custom coded.
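One simple way to dispatch requests by CompanyID is to hash the company identifier onto a fixed list of servers, so that all of a company's requests (and hence its sessions) land on the same machine. This is only a sketch under that assumption; the server names are invented, and the scheme ignores the very different sizes of licensees discussed above.

```python
import hashlib

SERVERS = ["app-1", "app-2", "app-3"]  # hypothetical server names


def server_for(company_id, servers=SERVERS):
    """Pick a server from a stable hash of the company id.

    hashlib is used instead of the built-in hash(), which is randomized
    per process for strings and so would not be stable across restarts.
    """
    digest = hashlib.sha256(company_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


# Every request for a given company is routed to the same server.
print(server_for("CompanyA"), server_for("CompanyB"))
```

A plain modulo like this reshuffles most companies whenever a server is added or removed, and a single 10000-user company can still overload its server, which is where the custom coding tends to begin.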

A study of loads of two different companies (each with 10000 users) may reveal significant use pattern differences, which can be taken advantage of.

The very nature of the “performance metrics” needed by 10000 users versus 2 users is different. 2-user companies want fast data refresh; people at 10000-user companies are more used to slow applications and take long coffee breaks while keeping thousands of sessions open (just an example).

Support tickets also get handled differently. The SaaS provider often does long and agonizing calculations over per-user-revenue versus per-licensee-costs, on both feature additions and support issues.

Adding new licensees and legacy data ETL : As a 2-member or a 10000-member company gets added, its legacy data needs to get into the SaaS app before the company can start using it. In the meantime, its legacy apps continue to be used.

If the legacy model differs significantly from the SaaS model, ETL time taken is high. Because SaaS data models are often unnaturally complicated by CompanyIDs and custom fields meta-data, this often happens.

Off-the-shelf ETL tools are hardly designed for such things. So bespoke ETL tools (often XML based) are created.

If the data load is huge, the entire SaaS app is taken down during the load, affecting all customer companies. Very often, it is useful to do one large legacy data load and then follow it with a smaller one for the residual data the company has accumulated on its still-running legacy applications.
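The two-step load can be sketched as a split on a cutoff time, assuming (and this is an assumption, since many legacy systems do not offer it) that every legacy record carries a last-modified timestamp. The record layout and the function name are hypothetical.

```python
from datetime import datetime


def split_legacy_load(records, cutoff):
    """Split legacy records into the bulk load and the later residual load."""
    bulk = [r for r in records if r["modified"] <= cutoff]
    residual = [r for r in records if r["modified"] > cutoff]
    return bulk, residual


legacy_records = [
    {"order_number": "X-100", "modified": datetime(2020, 1, 10)},
    {"order_number": "X-101", "modified": datetime(2020, 2, 20)},
    {"order_number": "X-102", "modified": datetime(2020, 3, 5)},  # after the bulk load
]

# The bulk load covers everything up to the cutoff; the rest is picked up later.
bulk, residual = split_legacy_load(legacy_records, cutoff=datetime(2020, 3, 1))
print(len(bulk), len(residual))
```

Only the bulk load needs the long maintenance window; the residual load is small enough to run shortly before the company switches over.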

Some recommendations :

Do less : It often makes sense not to do everything in one large, complicated, all-encompassing application. It is far better to have small applications that each do a small set of things well, and that have very good integrative capabilities so they can properly integrate with each other.

Public APIs : Early in the design, thought must be given to create public APIs which can be used by licensees to create their own fine grained customizations, using the SaaS app only as a large grained validations and data store.

Multitenancy via multiple databases : It is far better to take user requests to their own separate database instances via a switch, rather than do multi-tenancy by markers like “CompanyID”.

Avoid fine-grained customization requests : Too-frequent model changes and code changes at the request of customer A or B lead to noodle code, from which it becomes impossible to grow into a better product.
