Now that we have some ideas about databases, we quickly run into another requirement. Many websites will want to control who has access to what. Once embarked on this route, it turns out there are many situations where access control is appropriate, and they can easily become very complex. So in this chapter we look at the most highly regarded model, role-based access control, and find ways to implement it. The aim is to achieve a flexible and efficient implementation that can be exploited by increasingly sophisticated software. To show what is going on, the example of a file repository extension is used.
We need to design and implement a role-based access control (RBAC) system, demonstrate its use, and ensure that the system can provide:
- a simple data structure
- flexible code that provides a usable RBAC interface
- efficiency so that RBAC avoids heavy overheads
Discussion and Considerations
Computer systems have long needed controls on access. Early software commonly fell into the category that became known as access control lists (ACL). But these were typically applied at a fairly low level in systems, and referred to basic computer operations. Further development brought software designed to tackle more general issues, such as control of confidential documents. Much work was done on discretionary access control (DAC) and mandatory access control (MAC).
A good deal of academic research has been devoted to the whole question of access controls. The culmination of this work is that the model most widely favored is the role-based access control system, such a mouthful that the acronym RBAC is used hereafter. Now although the academic analysis can be abstruse, we need a practical solution to the problem of managing access to services on a website. Fortunately, rather like the relational database discussed in the last chapter, the concepts of RBAC are simple enough.
RBAC involves some basic entities. Unfortunately, terminologies are not always consistent, so let us keep close to the mainstream, and define some that will be used to implement our solution:
- Subject: A subject is something that is controlled. It could be a whole web page, but might well be something much more specific such as a folder in a file repository system. This example points to the fact that a subject can often be split into two elements, a type and an identifier. So the folders of a file repository count as a type of subject, and each individual folder has some kind of identifier.
- Action: An action arises because we typically need to do more than simply allow or deny access to RBAC subjects. In our example, we may place different restrictions on uploading files to a folder and downloading files from the folder. So our actions might include 'upload' and 'download'.
- Accessor: The simplest example of an accessor is a user. The accessor is someone or something who wants to perform an action. It is unduly restrictive to assume that accessors are always users. We might want to consider other computer systems as accessors, or an accessor might be a particular piece of software. Accessors are like subjects in splitting into two parts. The first part is the kind of accessor, with website users being the most common kind. The second part is an identifier for the specific accessor, which might be a user identifying number.
- Permission: The combination of a subject and an action is a permission. So, for example, being able to download files from a particular folder in a file repository would be a permission.
- Assignment: In RBAC there is never a direct link between an accessor and permission to perform an action on a subject. Instead, accessors are allocated one or more roles. The linking of an accessor and role is an assignment.
- Role: A role is the bearer of permissions and is similar to the notion of a group. It is roles that are granted one or more permissions.
It is easy to see that we can control what can be done by allocating roles to users, and then checking to see if any of a user's roles has a particular permission. Moreover, we can generalize this beyond users to other types of accessor as the need arises. The model built so far is known in the academic literature as RBAC0.
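To make the relationships between these entities concrete, here is a minimal sketch in Python. It is an illustration only, not the Aliro implementation: the names `Rbac`, `permit`, `assign`, and `check`, and the sample role `uploader`, are invented for the example, and everything is held in memory rather than in a database.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Permission:
    """A subject (type plus identifier) combined with an action."""
    subject_type: str
    subject_id: str
    action: str


@dataclass
class Rbac:
    # role -> set of permissions granted to that role
    grants: dict = field(default_factory=dict)
    # (accessor type, accessor id) -> set of roles assigned to that accessor
    assignments: dict = field(default_factory=dict)

    def permit(self, role, action, subject_type, subject_id):
        """Grant a role the permission to perform an action on a subject."""
        perm = Permission(subject_type, str(subject_id), action)
        self.grants.setdefault(role, set()).add(perm)

    def assign(self, role, accessor_type, accessor_id):
        """Link an accessor to a role (an assignment)."""
        key = (accessor_type, str(accessor_id))
        self.assignments.setdefault(key, set()).add(role)

    def check(self, accessor_type, accessor_id, action, subject_type, subject_id):
        """True if any role held by the accessor carries the permission."""
        wanted = Permission(subject_type, str(subject_id), action)
        roles = self.assignments.get((accessor_type, str(accessor_id)), set())
        return any(wanted in self.grants.get(role, set()) for role in roles)


rbac = Rbac()
rbac.permit("uploader", "upload", "folder", 7)   # role -> permission
rbac.assign("uploader", "user", 42)              # accessor -> role
print(rbac.check("user", 42, "upload", "folder", 7))
```

Note that the accessor never appears in `grants`, and permissions never appear in `assignments`; the role is the only link between the two.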
As RBAC can operate at a much more general level than ACL, it will often happen that one role embraces another. Taking the example of a hospital, the role of consultant might include the role of doctor. Not everyone who has the role of doctor would have the role of consultant, but all consultants are doctors.
At present, Aliro implements hierarchy purely for backwards compatibility with the Mambo and Joomla! schemes, where there is a strict hierarchy of roles for ACL. The ability to extend hierarchy more generally is feasible, given the Aliro implementation, and may be added at some point.
The model with the addition of role hierarchies is known as RBAC1.
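One way to picture a role hierarchy is as a mapping from each senior role to the roles it embraces, which is expanded before any permission check. The sketch below assumes such a mapping; the function name `expand_roles` and the extra role `staff` are hypothetical, and this is not a description of how Aliro stores its hierarchy.

```python
def expand_roles(roles, includes):
    """Return the given roles plus every role they include, directly or not.

    `includes` maps a senior role to the list of roles it embraces.
    """
    seen = set()
    pending = list(roles)
    while pending:
        role = pending.pop()
        if role not in seen:          # also protects against accidental cycles
            seen.add(role)
            pending.extend(includes.get(role, ()))
    return seen


# Hypothetical hospital hierarchy: consultants are doctors, doctors are staff.
includes = {"consultant": ["doctor"], "doctor": ["staff"]}
print(sorted(expand_roles({"consultant"}, includes)))
```

A consultant therefore picks up every permission granted to doctors, while a doctor gains nothing that was granted only to consultants.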
In general data processing, situations arise where RBAC is expected to implement constraints on the allocation of roles. A typical example would be that the same person is not permitted to have both purchasing and account manager roles. Restrictions of this kind derive from fairly obvious principles to limit scope for fraud.
While constraints can be powerful additions to RBAC, they do not often arise in web applications, so Aliro does not presently provide any capability for constraints. The option is not precluded, since constraints are typically grafted on top of an RBAC system that does not have them.
Adding constraints to the basic RBAC model creates an RBAC2 model, and if both hierarchy and constraints are provided, the model is called RBAC3.
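Although Aliro does not provide constraints, the idea of grafting them on top can be sketched briefly. The example below assumes constraints are expressed as sets of mutually exclusive roles that are checked whenever a role is assigned; the function names and the accessor key `user:42` are invented for illustration.

```python
def conflicts(held_roles, new_role, exclusive_sets):
    """Return the held roles that may not be combined with new_role."""
    clash = set()
    for group in exclusive_sets:
        if new_role in group:
            clash |= (set(group) - {new_role}) & set(held_roles)
    return clash


def assign_role(assignments, accessor, role, exclusive_sets):
    """Assign a role unless it breaks a mutual-exclusion constraint."""
    held = assignments.setdefault(accessor, set())
    clash = conflicts(held, role, exclusive_sets)
    if clash:
        raise ValueError(f"{role!r} cannot be combined with {sorted(clash)}")
    held.add(role)


# The purchasing and account manager roles must not be held together.
EXCLUSIVE = [{"purchasing", "account manager"}]
assignments = {}
assign_role(assignments, "user:42", "purchasing", EXCLUSIVE)
```

Because the check happens only at assignment time, the permission-checking code needs no changes, which is why constraints can be added to a system that was built without them.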
Avoiding Unnecessary Restrictions
When it comes to designing an implementation, it would be a pity to create obstacles that will be troublesome later. To achieve maximum flexibility, few restrictions are placed on the information that is stored by the RBAC system.
Subjects and accessors both have types and identifiers. The types can be strings, and there is no need for the RBAC system to limit what can be used in this respect. A moderate limitation on length is not unduly restrictive. It is up to the wider CMS to decide, for example, what kinds of subjects are needed. Our example for this chapter is the file repository, and the subjects it needs are known to the designer of the repository. All requests to the RBAC system from the file repository will take account of this knowledge.
Identifiers will often be simple numbers, probably derived from an auto-increment primary key in the database. But it would be unduly restrictive to insist that identifiers must be numbers. It may be that control is needed over subjects that cannot be identified by a number. Maybe the subject can only be identified by a nonnumeric key such as a URI, or maybe it needs more than one field to pick it out.
For these reasons, it is better to implement the RBAC system with the identifiers as strings, possibly with quite generous length constraints. That way, the designers of software that makes use of the RBAC system have the maximum opportunity to construct identifiers that work in a particular context. Any number of schemes can be imagined that will combine multiple fields into a string; after all, the only thing we will do with the identifier in the RBAC system is to test for equality. Provided identifiers are unique, their precise structure does not matter. The only point to watch is making sure that whatever the original identifier may be, it is consistently converted into a string.
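As one possible scheme for combining several fields into a single identifier, the sketch below percent-encodes each part and joins the parts with a slash, so that a separator inside a field can never be confused with the boundary between fields. The helper name `make_identifier` is hypothetical, and many other schemes would serve equally well.

```python
from urllib.parse import quote


def make_identifier(*parts):
    """Combine one or more fields into a single string identifier.

    Every part is converted with str() and percent-encoded (including any
    '/' it contains), so distinct field combinations give distinct strings.
    """
    if not parts:
        raise ValueError("an identifier needs at least one part")
    return "/".join(quote(str(part), safe="") for part in parts)


print(make_identifier(7))             # a plain auto-increment key
print(make_identifier("a/b", "c"))    # two fields, the first containing '/'
```

Routing every identifier through one helper of this kind is a simple way of ensuring the conversion to a string is always done the same way, whether the caller starts with the number 7 or the string "7".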
Actions can be simple strings, since they are merely arbitrary labels. Again, their meaning is important only within the area that is applying RBAC, so the actual RBAC system does not need to impose any restrictions. Length need not be especially large.
Roles are similar, although systems sometimes include a table of roles because extra information is held, such as a description of the role. But since this is not really a requirement of RBAC, the system built here will not demand descriptions for roles, and will permit a role to be any arbitrary string. While descriptions can be useful, it is easy to provide them as an optional extra. Avoiding making them a requirement keeps the system as flexible as possible, and makes it much easier to create roles on the fly, something that will often be needed.
Some Special Roles
Handling access controls can be made easier and more efficient by inventing some roles that have their own special properties. Aliro uses three of these: visitor, registered, and nobody.
Everyone who comes to the site is counted as a visitor, and is therefore implicitly given the role visitor. If a right is granted to this role, it is assumed that it is granted to everybody. After all, it is illogical to give a right to a visitor, and deny it to a user who has logged in, since the user could gain the access right just by logging out.
For the sake of efficient implementation of the visitor role, two things are done. One is that nothing is stored to associate particular users with the role, since everyone has it automatically. Second, since most sites offer quite a lot of access to visitors prior to login, the visitor role is given access to anything that has not been connected with some more specific role. This means, again, that nothing needs to be stored in relation to the visitor role.
Almost as extensive is the role registered, which is automatically applied to anyone who has logged in, but excludes visitors who have not logged in. Again, nothing is stored to associate users with the role, since it applies to anyone who identifies themselves as a registered user. But in this case, rights can be granted to the registered role. Rather like the visitor role, logic dictates that if access is granted to all registered users, any more specific rights are redundant, and can be ignored.
Finally, the role of "nobody" is useful because of the principle that where no specific access has been granted, a resource is available to everyone. Where all access is to be blocked, then access can be granted to "nobody" and no user is permitted to be "nobody". In fact, we can now see that no user can be allocated to any of the special roles since they are always linked to them automatically or not at all.
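The rules for the three special roles can be summarized in a short sketch. It assumes that, for a given action on a given subject, we already know which roles have been granted the permission; the function names and the role `editor` are invented for the example and are not taken from Aliro.

```python
SPECIAL_ROLES = {"visitor", "registered", "nobody"}


def effective_roles(stored_roles, logged_in):
    """Combine stored assignments with the automatically applied roles."""
    roles = set(stored_roles) - SPECIAL_ROLES   # special roles are never stored
    roles.add("visitor")                        # everyone is a visitor
    if logged_in:
        roles.add("registered")                 # applied to anyone logged in
    return roles                                # "nobody" can never appear here


def is_allowed(granted_roles, stored_roles, logged_in):
    """Decide access, given the roles that were granted the permission."""
    if not granted_roles:
        return True    # nothing specific was granted: open to all visitors
    return bool(effective_roles(stored_roles, logged_in) & set(granted_roles))


print(is_allowed(set(), [], logged_in=False))               # unrestricted
print(is_allowed({"nobody"}, ["editor"], logged_in=True))   # blocked for all
```

Granting a permission to "nobody" makes the set of granted roles non-empty, which switches off the default openness, while guaranteeing that no accessor can ever match.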
Clearly an RBAC system may have to handle a lot of data. More significantly, it may need to deal with a lot of requests in a short time. A page of output will often consist of multiple elements, any or all of which may involve decisions on access.
A two-pronged approach can be taken to this problem, using two different kinds of cache. Some RBAC data is general in nature, an obvious example being the role hierarchy. This applies equally to everyone, and is a relatively small amount of data. Information of this kind can be cached in the file system so as to be available to every request.
Much RBAC information is linked to the particular user. If all such data were to be stored in the standard cache, it is likely that the cache would grow very large, with much of the data irrelevant to any particular request. A better approach is to store RBAC data that is specific to the user as session data. That way, it will be available for every request by the same user, but will not be cluttered up with data for other users. Since Aliro ensures that there is a live session for every user, including visitors who have not yet logged in, and also preserves the session data at login, this is a feasible approach.
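The division between the two kinds of cache might be sketched as follows. Plain dictionaries stand in for the file-based cache and for the session, and the loader functions stand in for database queries; all of the names here are assumptions made for the illustration.

```python
class CachedRbac:
    """Keeps general data in a shared cache and user data in the session."""

    def __init__(self, load_hierarchy, load_user_roles, shared_cache, session):
        self.load_hierarchy = load_hierarchy
        self.load_user_roles = load_user_roles
        self.shared_cache = shared_cache    # one store used by every request
        self.session = session              # one store per user session

    def hierarchy(self):
        # The same for everyone, so it is loaded once and shared.
        if "rbac_hierarchy" not in self.shared_cache:
            self.shared_cache["rbac_hierarchy"] = self.load_hierarchy()
        return self.shared_cache["rbac_hierarchy"]

    def user_roles(self):
        # Specific to one user, so it lives with that user's session.
        if "rbac_roles" not in self.session:
            user_id = self.session.get("user_id")
            self.session["rbac_roles"] = self.load_user_roles(user_id)
        return self.session["rbac_roles"]


calls = {"hierarchy": 0, "roles": 0}       # counts simulated database reads


def load_hierarchy():
    calls["hierarchy"] += 1
    return {"consultant": ["doctor"]}


def load_user_roles(user_id):
    calls["roles"] += 1
    return ["doctor"] if user_id == 42 else []


shared = {}
first = CachedRbac(load_hierarchy, load_user_roles, shared, {"user_id": 42})
second = CachedRbac(load_hierarchy, load_user_roles, shared, {"user_id": None})
```

Here `second` represents a visitor who has not logged in but still has a session, so the same mechanism serves both cases.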
Where are the Real Difficulties?
Maybe you think we already have enough problems to solve without looking for others? The sad fact is that we have not yet even considered the most difficult one! In my experience, the real difficulties arise in trying to design a user interface to deal with actual control requirements.
The example used in this chapter is relatively simple. Controlling what users can do in a file repository extension does not immediately introduce much complexity. But this apparently simple situation is easily made more complex by the kind of requests that are often made for a more advanced repository.
In the simple case, all we have to worry about is that we have control over areas of the repository, indicating who can upload, who can download, and who can edit the files. Those are the requirements that are covered by the examples below.
Going beyond that, though, consider a situation that is often discussed as a possible requirement. The repository is extended so that some users have their own area, and can do what they like within it. A simple consequence of this is that we need to be able to grant those users the ability to create new folders in the file repository, as well as to upload and edit files in the existing folders. So far so good! But this scenario also introduces the idea that we may want the user who owns an area of the repository to be able to have control over certain areas, which other users may have access to. Now we need the additional ability to control which users have the right to give access to certain parts of the repository. If we want to go even further, we can raise the issue of whether a user in this position would be able to delegate the granting of access in their area to other users, so as to achieve a complete hierarchy of control.
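The three levels described here (having access, granting access, and delegating the right to grant) can be modeled as flags attached to a permission. The following is a purely hypothetical sketch of that idea; the flag names and the rule that delegation itself cannot be passed on are assumptions chosen to keep the example small.

```python
from enum import IntFlag


class Right(IntFlag):
    ACCESS = 1      # may perform the action on the subject
    GRANT = 2       # may give ACCESS to others
    DELEGATE = 4    # may give GRANT to others


def can_grant(holder, requested):
    """Can someone holding `holder` rights hand out the `requested` rights?"""
    if requested & Right.DELEGATE:
        return False                    # assumption: never passed on
    needed = Right(0)
    if requested & Right.ACCESS:
        needed |= Right.GRANT           # giving access requires GRANT
    if requested & Right.GRANT:
        needed |= Right.DELEGATE        # giving GRANT requires DELEGATE
    return (holder & needed) == needed


owner = Right.ACCESS | Right.GRANT | Right.DELEGATE
deputy = Right.ACCESS | Right.GRANT
member = Right.ACCESS
print(can_grant(deputy, Right.ACCESS))
```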
Handling the technical requirements here is not too difficult. What is difficult is designing user interfaces to deal with all the possibilities without creating an explosion of complexity. For an individual case it is feasible to find a solution. An attempt to create a general solution would probably result in a problem that would be extremely hard to solve.