Sunday, July 6, 2014

From Obvious To Agile

What do you do when obvious isn't?

Installing new fence posts

Many years ago I had a fence that needed to be repaired. I got a recommendation for a fence repair man from a friend and had him come out to take a look. He said the panels between the posts were fine and did not need to be replaced, I just needed new posts. He quoted me a price for installing new fence posts that seemed quite reasonable, and I accepted his bid.

A few days later he came back to do the job. After he had been out there working for a while, I went out to take a look. I was surprised when I saw how he had installed the new fence posts. He had not removed the old posts and put new posts in their places, as I had assumed; instead, he simply planted a new post next to each old post and strapped them together. I was flabbergasted, and complained to him that my expectation was that he was going to take out the old posts and replace them with new posts. He was unfazed. "I told you I would install new posts," he said. "Taking out the old posts would be way more work, and I would have to charge you more."

Well, he had me: he had indeed said only that he would install new posts. I was the one who assumed he would take out the old posts. I grumbled, paid him extra to replace a few of the old posts where it was particularly troublesome to have an extra post sticking out, and had the whole fence replaced the right way a few years later.

Keep using gmail

One of the startups at which I worked used gmail and was acquired by a large company that used Exchange. Concerned about the possibility of having to move to what we felt was a worse system, we asked what would happen with email. We were relieved when they said we could keep using gmail.

On the very first day that we were officially part of the new company, we were all told that we now had Exchange email accounts. "Hey!" we said, "you told us we could keep our gmail accounts." "Yes, you can," came the response, "but you also need to have an Exchange account for all official company email."

This was, of course, not what we had expected when we asked if we could keep our gmail accounts. But, as with the new fence posts, they had in fact kept their word and let us keep our gmail accounts; it was we who assumed that that would continue to be our only email account.

Everything under SCCS

At one of the places I worked, we hired a contractor to work on a subsystem. At one point we became concerned about how he was managing his source code, so we asked how he was doing that. "Everything is under sccs," he said. (This was well before the days of Git, Subversion, CVS, or even RCS; at the time, SCCS, the Source Code Control System, was what most people in our industry were using.) When he finally delivered the source code to us, we were annoyed to discover that he simply had a directory named "sccs", and all of his source code was contained in that directory; there was in fact no versioning or history.

Once again, this was not what we had expected. When he said "sccs" we assumed he was talking about the source code control system, when in fact he was just referring to a directory name; and when he said "under" we assumed he meant "managed by", when in fact he just meant "contained in."

A new and improved version of Android

My first smart phone was an Android phone running version 2.2. I watched as the newer versions of Android came out, filled with interesting new features. Finally, an over-the-air update was available for my phone. I eagerly updated and started playing with the new features. My first disappointment was with the new and definitely not improved performance: my phone was slow and laggy, and it no longer lasted even one day on a full charge.

I was even more dismayed to discover that they had removed USB Mass Storage Mode (MSC or UMS) and replaced it with a significantly less functional alternative, MTP (Media Transfer Protocol). In my case, it was completely non-functional for my use, because my home desktop machine was running Linux, and at the time there was not a working Linux driver for MTP mode.

I was, as you might expect, pretty ticked off. I had assumed without thinking about it that they would not remove a significant feature from a new version of the software, but they never said that.

Alternate Interpretations

Ask yourself: when reading the above anecdotes, did you realize, before each denouement, what the problem would be? If it had been you, would you have made the same assumptions I did?

Sometimes something seems so obvious to us that it does not even cross our minds that there might be an alternate interpretation.

I don't think it is possible for us to see these alternative interpretations in every case; often it is something with which we have had no experience, so we could not be expected to know it. We do, of course, sometimes consider alternative interpretations. In the future, if someone tells me they will install new fence posts, I will be sure to ask for more details. But we have to make assumptions as we deal with the world every day. If we examined every statement and every experience for alternative interpretations, that would consume all of our time, and we would not have any time left to pursue new thoughts. We learn to make instant and unconscious judgment calls: as long as what we hear and see has a high enough probability of an unambiguous interpretation, the possibility that there is an alternate interpretation does not bubble up to our conscious minds. Overall this is a very effective strategy that lets us focus our mental energies on situations where an unusual outcome is more likely. But this does mean that every once in a while we will miss something, with undesired results.

Going beyond obvious

I have already given my recommendation to State The Obvious. However, as you can see from the above anecdotes, this is not always enough. But what else can we do?

If you consider the anecdotes above, you might notice that, in most of them, by the time I realized that I had made an incorrect assumption, the deed was done and I was stuck with an undesired result. But the fence post story was a little different: in that case, I checked up on the work before it was done. Because I discovered the problem while it was happening, I was able to ask for changes and get a result that was closer to what I wanted.

Software Development

Not all of my blog posts are about software development, but in this case the application is obvious. Well, it seems obvious to me, but just in case it is not obvious to everyone, I will follow my own advice and explain in detail.

In the traditional waterfall process, a complete and detailed specification of the desired system is created before doing any of the implementation work. Once that spec is done, the system is built to match it. But, as we have seen from the anecdotes above, even a very simple spec, such as "install new fence posts", might be interpreted in a bizarre way that still matches the letter of the specification. In this case, the result might be something that arguably matches what was specified, but is not what was wanted.

Based on my personal experience and anecdotes I have heard from others, I believe that it is very difficult to write a good spec for something new, and impossible to write a spec that cannot be interpreted by somebody in some bizarre way that satisfies the spec but is not the desired result.

Given that we can't guarantee that we can write a spec that will not be misinterpreted, what is the alternative? I think the only alternative is to do what I did in the fence-post case: check up on the work and make corrections along the way. This is embodied in a couple of the value statements in The Agile Manifesto: "Customer collaboration over contract negotiation" and "Responding to change over following a plan".

If you are asking someone to create something that is very similar to things that have been created before, and through previous common experience there is already a shared vocabulary sufficient to describe how the desired result compares to those previous creations, then you can perhaps write a spec that will get you what you want. The closer the new thing is to those previously created things, the easier that will be. But in software development, where the goal is often specifically to create something novel, this is particularly difficult. In that situation, I think that creating and then relying solely on a detailed spec is unlikely to produce a satisfactory outcome; I believe the key to good results is an agreement on direction and major points, followed by keeping a close eye on progress, with particular attention whenever something is being done for the first time.

Writing a Spec

I'm not saying don't write a spec. I'm saying you need to recognize that a spec won't take you all the way, and a poorly written spec can hinder your progress. Writing a spec is like looking at a map and planning your route: often necessary but seldom sufficient. You need to be prepared for construction closures, blocking accidents, or even additional interesting sights you might decide to see along the way. For any of these diversions, you will need to reexamine your route in the middle of the trip and select an alternative. For a short trip, you might not run into any such problems and thus not need to modify your route, but the longer the journey the more likely that at some point you will need or want to deviate from your original route.

If you are familiar with the roads and have a clear destination, you might be able to dispense with the initial route planning completely: just head in the right direction and follow the signs. Or if you are on a discovery road trip and don't have a specific destination, then heading out without a planned route is fine. In most cases, though, some level of advance route planning will save time. You just need to stay agile and be prepared to change your route along the way.

Sunday, November 3, 2013

Code Guidelines

A list of basic goals for creating code.

In our team project at work, we wanted to have a set of style guidelines to allow everyone to more easily and quickly read the codebase and to avoid spurious code reformatting changes. As you might expect, there were different opinions on many points. To avoid fruitless "my way is just better" discussions, I wanted to step back and make sure we could all agree on some general goals. With that agreement in place, we could at least ask people to explain how their preferred style on some point supports our general goals. If nobody can provide an argument to support a favored construct, we might as well flip a coin.

Below are the goals I proposed and with which the team agreed. I think many of these are obvious, but then I usually believe in stating the obvious. The first two criteria below are also listed in my post on Software Quality Dimensions. Your team may choose slightly different guiding principles, but I think having the team agree on and write down their principles and asking people to justify their proposed standards against those principles can help short-circuit disagreements that might otherwise take longer to resolve.

Goals

In order of priority, with the most important goals first:

First, we want our code to be correct. This means that the code must:
  • perform the desired primary behavior.
  • behave in a defined way for expected error conditions.
  • not have undesirable side-effects.
  • not have security vulnerabilities such as buffer overflows or injections.
  • not have memory problems such as leaks or use of released or uninitialized memory.
  • run fast enough for the intended use cases (but without premature optimization).
Second, we want our code to be robust.
This means that the code should be written in such a way as to minimize the probability of incorrect behavior under a wide range of conditions, including when:
  • it receives unexpected, corrupted, or no input data (graceful degradation).
  • a programmer unfamiliar with the code makes changes to it.
  • the functionality of neighboring code changes.
  • the development environment or toolset changes.
Third, we want our developers to be as productive as possible.
This means the code should be written such that:
  • developers are unlikely to misunderstand what the code does (principle of least surprise).
  • developers can read and understand the code quickly.

Wednesday, October 24, 2012

Role-Based Authorization

A simple, uniform, powerful and extensible authorization model.

Introduction

The "three As" of security are:
  • Authentication - assuring that the user is who he says he is.
  • Authorization - allowing each authenticated user to perform selected privileged actions.
  • Audit - recording privileged actions to allow review of changes or potential abuse of privileges.
Given authentication and auditing, it is pretty simple to add a bit more monitoring that is very useful for billing purposes and resource management, so you will more often see the combination AAA (Authentication, Authorization, Accounting) or AAAA (Authentication, Authorization, Audit, Accounting).

In this post I discuss only authorization. Authentication and auditing are each big topics, so I won't try to cover them here. Similarly, I assume that the code and data are themselves secure. In particular, I do not cover the issue of multiple security domains and the problem of having lower security code make requests to higher security code.

With my focus only on authorization, in the discussion below I assume that the user has been authenticated so that we can trust that piece of data within the application.

I will use the language of relational databases in this post because it is well-known and precise. An implementation of this model can use some other mechanism to store and query the authorization data. The SQL examples provide precision to the discussion, but you should be able to skip the SQL code and still gain a basic understanding of the model.

In the SQL example code I indicate replacement variables within braces; for example the string {user} in a SQL statement indicates that the application should plug in the user name at that point in the expression. For a real implementation, the actual syntax would depend on the database access package in use.

I have run into authorization systems that were intended to provide a powerful set of capabilities for a complex situation but were, unfortunately, themselves so complex that it was difficult to understand how they were supposed to work, and, even after having them explained, difficult to remember, because there was no simple underlying model to tie it all together.

In this post I present an approach to authorization that I believe provides a very high level of power with a model that is relatively simple to understand and to extend as needed. This model initially implements a Role-Based Access Control (RBAC) mechanism, a widely used approach to security that is now a NIST standard. I add a few extensions to the common model that make it start to look more like an Attribute-Based Access Control (ABAC) model.

Separation of Concerns

In an authorization system, we want to separate the management of authorization from the application. The application should ask the authorization system for permission to do what it wants to do; the management of the granting of authorizations is handled entirely within the authorization system, completely outside of the application. If you build a system in which any of the abstractions used in the management of authorizations, such as roles, appear in the application, then, as they say, you are doing it wrong.

In this post I focus only on the part of the system that determines whether to grant authorization. A separate system is required to maintain the data that is used by the authorization system. That maintenance can become quite complex in enterprise systems, but I will not be discussing it further in this post except to mention that the authorization mechanism described here can be applied to the system that maintains the authorization data in order to control who is allowed to modify what parts of that data.

Users

Let's start with perhaps the simplest useful authorization model possible. We begin with a one-column user table containing user names.

create table user(name varchar(32) primary key);
When the application wants to check for our sole authorization, it takes a passed-in authenticated user name and calls the authorization function with that value. The authorization function just checks to see if that user exists in the table. If so, the user is authorized and the authorization function returns true; if not, the user is not authorized and the authorization function returns false.
-- authorized if count>0
select count(*) from user where name={user};
The user-only model is too simple for most applications.
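To make the check concrete, here is a minimal sketch of the user-only authorization function using Python's sqlite3; the table is the one defined above, and the user data is purely illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table user(name varchar(32) primary key)")
conn.execute("insert into user(name) values ('alice')")

def is_authorized(user):
    # authorized if count>0, i.e. the authenticated user is in the table
    (count,) = conn.execute(
        "select count(*) from user where name=?", (user,)).fetchone()
    return count > 0

print(is_authorized("alice"))    # True
print(is_authorized("mallory"))  # False
```

Note that the `?` placeholder is sqlite3's parameter syntax, standing in for the `{user}` replacement variable used in the SQL in this post.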

Actions

The next step is to add a one-column action table containing actions. We will assume each action is represented by a string name, although for performance reasons some might choose a different representation.
create table action(name varchar(32) primary key);
We add one row to this table for each restricted action; for example, we might have entries for login, reboot_system, and view_system_users.

With the addition of the action table we can no longer just look up users in the user table. We add a third table called grant (or auth_grant, since grant is typically a reserved word in SQL) with two columns that are foreign-key columns to the user and action tables. Each row of the grant table refers to a user and an action, with the meaning that that user is granted authorization to perform that action.
create table auth_grant(
    user varchar(32) not null,
    action varchar(32) not null,
    constraint FK_grant_user foreign key(user)
        references user(name),
    constraint FK_grant_action foreign key(action)
        references action(name)
);
Our authorization function will now accept a combination of values. We will refer to this combination as the requested operation (the NIST standard uses transaction as the unit for which permissions are granted). When an application wants to perform a potentially restricted operation, it takes the passed-in authenticated user name, adds the action it wants to perform, and passes that data to the authorization function. The authorization function takes the passed-in user and action arguments and looks in the grant table for a row in which the passed-in values for user and action match the values in the corresponding columns in the table. That row defines a permission to execute the requested operation. If that row exists, the operation is authorized; if that row does not exist, the operation is not authorized.
-- authorized if count>0
select count(*) from auth_grant where
    user={user} and
    action={action};
The user+action model is sufficient for many simple systems, such as granting login rights to some users and admin rights to other users.
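The same pattern extends directly to the user+action check; a short sketch in Python's sqlite3 (the grant data is illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table auth_grant(user varchar(32) not null,
                        action varchar(32) not null);
insert into auth_grant values ('alice', 'login');
insert into auth_grant values ('alice', 'reboot_system');
insert into auth_grant values ('bob', 'login');
""")

def is_authorized(user, action):
    # authorized if count>0: some grant row matches both user and action
    (count,) = conn.execute(
        "select count(*) from auth_grant where user=? and action=?",
        (user, action)).fetchone()
    return count > 0

print(is_authorized("bob", "login"))          # True
print(is_authorized("bob", "reboot_system"))  # False
```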

Objects

With just users and actions, each action granted to a user effectively has global scope within the system. This is fine for actions such as login which truly are intended to be global in scope, but we would also like to be able to specify that certain actions can be performed on specific objects. Modern operating systems include mechanisms to grant different access rights, such as read-file or write-file, to specific files based on the user.

We add a one-column object table containing references to the objects in our system for which we want to be able to issue grants, with one row for each such object. We are making the simplifying assumption that each object already has a unique identifier that can be stored in our database.
create table object(name varchar(32) primary key);
We add a third column to our grant table that is a foreign-key column to the object table, exactly analogous to the existing references to the user and action tables. Each row of the grant table now refers to a user, an action and an object, with the meaning that that user is granted authorization to perform that action on that object.
create table auth_grant(
    user varchar(32) not null,
    action varchar(32) not null,
    object varchar(32) not null,
    constraint FK_grant_user foreign key(user)
        references user(name),
    constraint FK_grant_action foreign key(action)
        references action(name),
    constraint FK_grant_object foreign key(object)
        references object(name)
);
If we still want to have actions with global scope, such as the example of a login action in the user+action model, we can add a special system object that can be used in that situation.

Our authorization requests from the application now include three pieces of data. We modify our function for authorizing a restricted operation to take an argument specifying the object, along with the user and action arguments that we already have. The authorization function looks in the grant table as before, but it now must find a row that matches all three fields rather than only user and action.
-- authorized if count>0
select count(*) from auth_grant where
    user={user} and
    action={action} and
    object={object};
The user+action+object model presented here is used in many databases, with the objects being database tables or views and the actions being the four database actions of select, insert, update and delete. There may also be additional actions such as grant (the ability to create additional grants on an object) or actions that allow creating and modifying users or databases.

Roles

In order to simplify the maintenance of grants when we have a large number of users, we add a mechanism that allows us to group users together and grant permissions to a group of users rather than just to a single user. Users are grouped according to the roles they play; example roles are user, administrator, and superuser.

We add a role table with one row for each role we define. (We will look at other possible implementations later, but this choice serves well for explaining the concepts.)
create table role(name varchar(32) primary key);
In order to indicate which users have been granted (assigned) which roles, we add a user_role table with two columns: the user column is a foreign key to the user table that references the user, and the role column is a foreign key to the role table. A user having a role is indicated by adding a row to the user_role table referencing that user and that role. When granting authorization, a user will receive authorization for all roles he has.
create table user_role(
    user varchar(32) not null,
    role varchar(32) not null,
    constraint FK_userrole_user foreign key(user)
        references user(name),
    constraint FK_userrole_role foreign key(role)
        references role(name)
);
We also add a role column to our grant table. This column is a foreign key to the one column in our role table. A row in the grant table can now refer either to a user or to a role. It must reference one or the other; while it might be possible to set up a structure to enforce that constraint directly in the database, we will skip that exercise and instead suggest that this constraint could be enforced by an application-level database consistency check.
create table auth_grant(
    user varchar(32),
    role varchar(32),
    action varchar(32) not null,
    object varchar(32) not null,
    constraint FK_grant_user foreign key(user)
        references user(name),
    constraint FK_grant_role foreign key(role)
        references role(name),
    constraint FK_grant_action foreign key(action)
        references action(name),
    constraint FK_grant_object foreign key(object)
        references object(name)
);
The addition of roles is entirely an abstraction within the authorization system; the application is not aware of roles. An operation is defined by the same three values as before, and the application calls the authorization function in the same way as before to see if an operation is authorized, but the authorization function has to do a little more work now.

The application still passes the user, action and object arguments to the authorization function, and the authorization function still looks in the grant table to see if that combination is authorized. But now, in addition to looking for a row that exactly matches those three values, it also looks up all of the roles the specified user has, and looks for a row in the grant table whose action and object exactly match the values passed in and whose role is one of the user's roles. If the authorization function finds a row that exactly matches the action and object, and that matches either the user or any of the user's roles, the operation is authorized; if no such row is found, the operation is not authorized.
-- authorized if count>0
select count(*) from auth_grant where
    (user={user} or role in (select role from user_role where user={user})) and
    action={action} and
    object={object};
The (user+role)+action+object model presented here has been used in the Unix filesystem for many years, with the objects being files and directories, the actions being read, write and execute/search, and the roles called groups.

In the NIST RBAC model permissions can only be assigned to roles, not to users. A strict implementation of this aspect simply drops the user check from our authorization test (which also means we can drop the user column in the grant table):
-- authorized if count>0
select count(*) from auth_grant where
    role in (select role from user_role where user={user}) and
    action={action} and
    object={object};
Alternatively, we could think of each user as automatically being assigned a unique role whose name is the same as the user name. Or, we can choose never to assign any permissions to a user, only assigning them to roles.
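Putting the role lookup together with the grant check, a sketch in Python's sqlite3 (schema simplified, data illustrative) shows a user receiving a permission only through a role:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table user_role(user varchar(32), role varchar(32));
create table auth_grant(user varchar(32), role varchar(32),
                        action varchar(32), object varchar(32));
insert into user_role values ('alice', 'admin');
-- the permission is granted to the admin role, not to alice directly
insert into auth_grant(role, action, object)
    values ('admin', 'reboot_system', 'system');
""")

def is_authorized(user, action, object_):
    # a row authorizes if it names the user directly or one of the
    # user's roles, and its action and object match exactly
    (count,) = conn.execute("""
        select count(*) from auth_grant where
            (user=? or role in (select role from user_role where user=?))
            and action=? and object=?""",
        (user, user, action, object_)).fetchone()
    return count > 0

print(is_authorized("alice", "reboot_system", "system"))  # True
print(is_authorized("bob", "reboot_system", "system"))    # False
```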

Role Activation

The NIST RBAC standard includes a concept called Role Activation (or Role Authorization). When a user logs in, some subset of his roles can be activated. Allowing a user to activate and deactivate his assigned roles gives the user a way to ensure that he (or some program he is running) does not perform a privileged operation when he is not expecting it. Permissions are only granted for active roles, so even if a user has been given permissions through a role, a program will not be able to take advantage of them unless the user has activated a role that grants those permissions.

We can implement role activation globally by adding an is_active column to the user_role table.
create table user_role(
    user varchar(32) not null,
    role varchar(32) not null,
    is_active boolean not null default false,
    constraint FK_userrole_user foreign key(user)
        references user(name),
    constraint FK_userrole_role foreign key(role)
        references role(name)
);
When checking for authorization, we only include roles that are active for that user. If we continue to allow user-based permissions, then we would need to add an is_active flag for those permissions as well. When using activation it is simpler to exclude user-based permissions, as is done in the NIST RBAC model.
-- authorized if count>0
select count(*) from auth_grant where
    role in (select role from user_role where user={user} and is_active) and
    action={action} and
    object={object};
The NIST RBAC standard uses session-based activation rather than global activation. This allows a user to have multiple sessions open simultaneously with different roles active for each session. To implement this, rather than adding an is_active column to the user_role table, we create a session table that keeps track of our sessions and a session_role table that lists the roles that are active for each session.
create table session(
    id varchar(32) primary key,
    user varchar(32) not null,
    constraint FK_session_user foreign key(user)
        references user(name)
);

create table session_role(
    session_id varchar(32) not null,
    role varchar(32) not null,
    constraint FK_sessionrole_sessionid foreign key(session_id)
        references session(id),
    constraint FK_sessionrole_role foreign key(role)
        references role(name)
);
When testing for authorization we only want to use roles that are both assigned (in the user_role table) and active (in the session_role table). Assuming the mechanism that maintains the session_role table ensures that only roles assigned in the user_role table can be activated, we can modify the authorization function to accept an additional session_id argument and change our implementation SQL:
-- authorized if count>0
select count(*) from auth_grant where
    role in (select session_role.role from session_role
        join session on session_role.session_id = session.id
        where session.user={user} and session.id={session_id}) and
    action={action} and
    object={object};
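A sketch of the session-based check in Python's sqlite3 (data illustrative); because session_role has no user column, the subquery joins through the session table to tie the session back to the authenticated user:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table session(id varchar(32), user varchar(32));
create table session_role(session_id varchar(32), role varchar(32));
create table auth_grant(role varchar(32), action varchar(32),
                        object varchar(32));
insert into session values ('s1', 'alice');
insert into session values ('s2', 'alice');
insert into session_role values ('s1', 'admin');  -- admin active only in s1
insert into auth_grant values ('admin', 'reboot_system', 'system');
""")

def is_authorized(user, session_id, action, object_):
    # only roles active in this user's session grant permissions
    (count,) = conn.execute("""
        select count(*) from auth_grant where
            role in (select session_role.role
                     from session_role join session
                       on session_role.session_id = session.id
                     where session.user=? and session.id=?)
            and action=? and object=?""",
        (user, session_id, action, object_)).fetchone()
    return count > 0

print(is_authorized("alice", "s1", "reboot_system", "system"))  # True
print(is_authorized("alice", "s2", "reboot_system", "system"))  # False
```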
At this point our model includes the capabilities of RBAC0, the first level of the NIST RBAC standard (although the NIST model does not include action and object as presented above). However, in order to keep the discussion of the other aspects of the model less cluttered, I will generally not be including role activation in the remainder of this discussion except where noted.

Role Hierarchies

Given the ability to group users into roles and thus simplify the number of grants we need to create, we can generalize on that concept by also allowing roles to be grouped into other roles.

In the discussion of Roles above, we added a user_role table that allowed us to assign roles to users. We now add a role_hierarchy table with parent and child columns that allows us to assign roles (children) to other roles (parents).
create table role_hierarchy(
    parent varchar(32),
    child varchar(32),
    constraint FK_rolehierarchy_parent foreign key(parent)
        references role(name),
    constraint FK_rolehierarchy_child foreign key(child)
        references role(name)
);
When collecting the list of roles for a user, we now have to recursively consult the role_hierarchy table to collect all of the child roles for any role the user has. How this is actually done is heavily dependent on the implementation. Some SQL databases include the ability to formulate recursive queries, but most do not.

We hide this implementation detail inside a view that collects the closure of the role-role relationships, effectively flattening our hierarchy. Defining this flattening in a view allows us to change how we collect the closure of the roles without affecting the queries that invoke this view. In this particular example, our view is defined using a non-recursive query that will suffice for a hierarchy of limited depth.
-- not a full closure if the hierarchy is more than three levels deep
create view role_closure as
    select user, role from user_role
    union
    select user_role.user, a1.child as role from user_role
        join role_hierarchy as a1 on user_role.role = a1.parent
    union
    select user_role.user, a2.child as role from user_role
        join role_hierarchy as a1 on user_role.role = a1.parent
        join role_hierarchy as a2 on a1.child = a2.parent
    union
    select user_role.user, a3.child as role from user_role
        join role_hierarchy as a1 on user_role.role = a1.parent
        join role_hierarchy as a2 on a1.child = a2.parent
        join role_hierarchy as a3 on a2.child = a3.parent
    ;
We can now use the role_closure view in place of the user_role table:
-- authorized if count>0
select count(*) from auth_grant where
    (user={user} or role in (select role from role_closure where user={user})) and
    action={action} and
    object={object};
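In databases that do support recursive queries, the full closure can be computed at any depth with a recursive common table expression. A sketch using Python's sqlite3 (modern SQLite supports with recursive; the hierarchy data here is illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table user_role(user varchar(32), role varchar(32));
create table role_hierarchy(parent varchar(32), child varchar(32));
insert into user_role values ('alice', 'admin');
insert into role_hierarchy values ('admin', 'operator'),
                                  ('operator', 'viewer');
""")

# The recursive CTE follows parent->child edges to any depth; using
# UNION (not UNION ALL) discards duplicates, so cycles also terminate.
closure_sql = """
with recursive reachable(user, role) as (
    select user, role from user_role
    union
    select reachable.user, role_hierarchy.child
        from reachable join role_hierarchy
        on reachable.role = role_hierarchy.parent
)
select role from reachable where user = ? order by role
"""
roles = [r for (r,) in conn.execute(closure_sql, ("alice",))]
print(roles)  # ['admin', 'operator', 'viewer']
```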
If we want to use session-based activation, we can do that by modifying our role_closure view to be based on the session_role table rather than the user_role table:
-- not a full closure if the hierarchy is too deep
create view role_closure as
    select session_id, user, role from session_role
    union
    select session_role.session_id, session_role.user, a1.child as role
        from session_role
        join role_hierarchy as a1 on session_role.role=a1.parent
    union
    select session_role.session_id, session_role.user, a2.child as role
        from session_role
        join role_hierarchy as a1 on session_role.role=a1.parent
        join role_hierarchy as a2 on a1.child=a2.parent
    union
    select session_role.session_id, session_role.user, a3.child as role
        from session_role
        join role_hierarchy as a1 on session_role.role=a1.parent
        join role_hierarchy as a2 on a1.child=a2.parent
        join role_hierarchy as a3 on a2.child=a3.parent
    ;
As above when adding session-based role activation, the authorization SQL includes the session-id and we no longer allow user-based permissions:
-- authorized if count>0
select count(*) from auth_grant where
    role in (select role from role_closure where user={user} and session_id={session_id}) and
    action={action} and
    object={object};
This change can be made in any of the authorization SQL statements given below to add session-based authorization where it is otherwise not included.

Note that although we stated that the role parent/child relationships form a hierarchy, there is actually no reason to limit them to that, and our design does not preclude defining role relationships that form a more complex graph. We do want to avoid cycles in our role graph, since a cycle provides no useful benefit, but we also need to ensure that our implementation does not blow up if the role graph happens to contain one. If we use the role_closure view implementation provided above, an incidental benefit is that the closure mechanism is so simple and limited that cycles will not cause any problems other than wasting a bit of processing power.

The NIST RBAC standard defines both general and restricted forms of hierarchy as part of the RBAC1 level. The restricted form is a tree structure and the general form is an arbitrary partial order. Our model above supports the general form.

NIST RBAC levels RBAC2 and RBAC3 add Constraints (to support Separation of Duties) and Symmetry (the ability to review permission-role assignments as well as user-role assignments). Both capabilities are available with the simple database implementation presented here.

Alternate Hierarchy Implementations

In the implementation of user roles and role hierarchies above, we added a role table, a user_role table, and a role_hierarchy table; added a role column to the grant table; added a role_closure view; and modified our example SQL select statement for checking authorization to use that view. In this section I present three alternate approaches to this step when using a relational database; there are of course other approaches, not discussed here, that are not based on a relational database. These implementation alternatives do not affect the basic model being developed.

In the first alternate approach, after defining the role table we define a user_or_role view that is the union of the user and role tables.
create view user_or_role as
    (select name from user)
    union all
    (select name from role)
;
In the grant table, rather than adding a role column and keeping the user column as a foreign key to the user table, we make the user column a foreign key to the user_or_role view. Unfortunately, it is typically not possible to declare a foreign key to a view, in which case this relationship would have to remain implicit and not enforced by the database (it could be part of our application-level database consistency checks). Nonetheless, SQL statements that join on this column will work the same as if the foreign key were declared, although performance may suffer if the user_or_role view cannot be indexed. With a materialized view it might be possible to index the view and have a foreign key refer to it, but then we would need to rematerialize the view every time we changed the contents of the user or role tables.

Instead of creating a role_hierarchy table, we do the same thing to the user column of the user_role table as we did to the grant table, making it a foreign key to the user_or_role view rather than to the user table. This allows the user_role table to represent which roles have other roles as well as which roles users have directly been given.

In our second alternate implementation, we start by defining user_or_role as a table that contains the rows for both users and roles, with an is_role column that indicates which one a given row represents. We then create user and role as views into that table.
create table user_or_role (
    name varchar(32) primary key,
    is_role boolean not null default false
);

create view user as
    select name from user_or_role where not is_role;

create view role as
    select name from user_or_role where is_role;
As in our first alternate implementation, the grant table points to the user_or_role table, as does the user column in the user_role table.
create table auth_grant(
    user_or_role varchar(32) not null,
    action varchar(32) not null,
    object varchar(32) not null,
    constraint FK_grant_user foreign key(user_or_role)
        references user_or_role(name),
    constraint FK_grant_action foreign key(action)
        references action(name),
    constraint FK_grant_object foreign key(object)
        references object(name)
);

create table user_role(
    user varchar(32) not null,
    role varchar(32) not null,
    constraint FK_userrole_user foreign key(user)
        references user_or_role(name),
    constraint FK_userrole_role foreign key(role)
        references role(name)
);
Many databases, including MySQL, do not allow indexes or foreign keys on views, so neither of the above two alternate implementations will work very well on those databases, and the table statements would have to be modified not to declare foreign keys to view columns.

If we want to use indexes and foreign keys, we have to compromise our data model a bit and not use views when we need foreign keys, which leads us to our final alternative.

In our third alternate implementation, we don't have a separate role table or view. Instead, we use the user_or_role approach as in the second alternative above: we place the role names into the user table and add an is_role column that indicates whether a row represents a user or a role.
create table user (
    name varchar(32) primary key,
    is_role boolean not null default false
);
In our user_role table, in which the role column was a foreign key to the role table, we make that column instead be a foreign key to the user table, where we are now storing our role names.
create table user_role(
    user varchar(32) not null,
    role varchar(32) not null,
    constraint FK_userrole_user foreign key(user)
        references user(name),
    constraint FK_userrole_role foreign key(role)
        references user(name)
);
We don't need a role_hierarchy table because we can now represent those role-to-role relationships in the user_role table. In our role_closure view we replace the role_hierarchy references with user_role references.
create view role_closure as
    select distinct a0.user, a3.role from user_role as a0
        join user_role as a1 on a0.role=a1.user or
            (a0.user=a1.user and a0.role=a1.role)
        join user_role as a2 on a1.role=a2.user or
            (a0.user=a2.user and a0.role=a2.role)
        join user_role as a3 on a2.role=a3.user or
            (a0.user=a3.user and a0.role=a3.role)
    ;
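To verify that this view really does flatten the hierarchy, here is a minimal sqlite3 run using the view exactly as written above (constraints and the is_role column omitted for brevity; the role-to-role rows live in the user_role table as described):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    create table user_role(user varchar(32), role varchar(32));
    -- alice -> admin -> editor -> viewer: one direct role plus a
    -- three-level chain of role-to-role rows
    insert into user_role values
        ('alice', 'admin'), ('admin', 'editor'), ('editor', 'viewer');

    create view role_closure as
        select distinct a0.user, a3.role from user_role as a0
            join user_role as a1 on a0.role=a1.user or
                (a0.user=a1.user and a0.role=a1.role)
            join user_role as a2 on a1.role=a2.user or
                (a0.user=a2.user and a0.role=a2.role)
            join user_role as a3 on a2.role=a3.user or
                (a0.user=a3.user and a0.role=a3.role)
        ;
""")
roles = sorted(r[0] for r in db.execute(
    "select role from role_closure where user='alice'"))
print(roles)  # ['admin', 'editor', 'viewer']
```

The pass-through conditions let each join fall back to matching the original a0 row, which is why the direct role and every intermediate role appear in the output, not only the deepest ones.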
Because we are now storing our roles in the user table, the user column in our grant table can refer to either a user or a role, depending on what we are storing in the user table, so we don't need the role column and we can go back to the previous definition that did not have that column.

With this implementation our foreign key constraints all work because we are not dealing with any views, and our table structure is simpler because we have combined users and roles into one table. Putting roles into the user table is a convenient fiction that simplifies our implementation, justified because there are some situations in which we want to treat users and roles the same. But if we forget about the difference and start treating them the same in other situations, we can easily start getting absurd behavior from our system.

(I have a mental image of our legal system as having a people table, and a law table with a foreign key to the people table. At some early point, someone wanted some laws that applied to corporations as well as people, so they said, "I know, let's just add an is_corporation flag to the people table and put the corporations in there, then our foreign keys from the law table will still work and we won't need to add a bunch more structure to our law schema!" With the passage of time, law programmers who should have been paying attention to the is_corporation flag started ignoring it more and more often, until finally the law programmers were saying, "Well, those corporations are in the people table, so they must be people." If you are concerned that this kind of situation might happen to you, you might not want to put roles into the user table.)

For the remainder of this discussion, we will use this third alternate implementation approach.

Interlude

In the above discussions, I have been assuming that the names of users, actions, objects and roles are also their key values. This implies that each of those names is unique. Given that I have discussed a couple of implementations in which users and roles are mixed together, you might wonder whether it would cause problems to add a user whose name is the same as a role. In the simple implementation above the answer is "yes", and the system would have to disallow that. A real system is likely to be a bit more complex, using unique IDs as primary keys rather than names. The problem of keeping names unique thus moves from a database issue to an application-level issue: the system implementer must decide under what circumstances it is acceptable to have duplicate names, and there must be a way to distinguish those duplicates for someone operating the system.

We have reached a point in the development of our authorization model that is similar in power to many existing systems. People who need more flexibility than this model provides might diverge at this point into custom authorization systems with various forms of exceptions and extensions that rapidly start adding complexity to the model.

There are still a number of extensions we can make to our authorization model that will improve its power while adding only a small amount to the cognitive load of understanding how it all works. Let's get back to our model and add some more power to it.

Tasks

In the same way that we allow specifying a group of users having a role, we add the ability to specify a group of actions, which we call a task. The relation between tasks and actions is exactly analogous to the relation between users and roles. Each action can be assigned to multiple tasks, a task can be assigned other tasks, and an authorization grant can refer either to an action or to a task.

Analogous to our third alternate implementation above, in which we added an is_role column to the user table and put roles into the user table, for the equivalent addition of tasks we add an is_task column to the action table, add an action_task table with columns action and task that are both foreign key references to the action table, and add a task_closure view.
create table action(
    name varchar(32) primary key,
    is_task boolean not null default false
);

create table action_task(
    action varchar(32) not null,
    task varchar(32) not null,
    constraint FK_actiontask_action foreign key(action)
        references action(name),
    constraint FK_actiontask_task foreign key(task)
        references action(name)
);

create view task_closure as
    select distinct a0.action, a3.task as task from action_task as a0
        join action_task as a1 on a0.task=a1.action or
            (a0.action=a1.action and a0.task=a1.task)
        join action_task as a2 on a1.task=a2.action or
            (a0.action=a2.action and a0.task=a2.task)
        join action_task as a3 on a2.task=a3.action or
            (a0.action=a3.action and a0.task=a3.task)
    ;
We expand our authorization query to look for tasks in the same way as we expanded it to handle roles, with the same caveats about hierarchy depth.
-- authorized if count>0
select count(*) from auth_grant where
    (user={user} or user in (select role from role_closure where user={user})) and
    (action={action} or action in (select task from task_closure where action={action})) and
    object={object};
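Here is a minimal sqlite3 sketch of the expanded query; role_closure and task_closure are plain tables standing in for the views, and the single grant names a role and a task rather than a specific user and action:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    create table auth_grant(user varchar(32), action varchar(32),
                            object varchar(32));
    -- plain tables standing in for the role_closure and task_closure views
    create table role_closure(user varchar(32), role varchar(32));
    create table task_closure(action varchar(32), task varchar(32));
    insert into role_closure values ('alice', 'editor');
    insert into task_closure values ('update', 'edit_content');
    -- grant the edit_content task (a group of actions) to the editor role
    insert into auth_grant values ('editor', 'edit_content', 'page');
""")

def authorized(user, action, object_):
    (count,) = db.execute("""
        select count(*) from auth_grant where
            (user=? or user in (select role from role_closure where user=?)) and
            (action=? or action in (select task from task_closure where action=?)) and
            object=?
    """, (user, user, action, action, object_)).fetchone()
    return count > 0

print(authorized('alice', 'update', 'page'))  # True: editor role, edit_content task
print(authorized('alice', 'delete', 'page'))  # False: delete is not in edit_content
```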

Domains

Roles and tasks give us the ability to group users and actions. We complete the pattern by adding the ability to collect objects into groups that we call domains (not to be confused with internet domain names). As with the tasks example above, we add an is_domain column to the object table, create an object_domain table to allow defining groups of objects, create a domain_closure view, and modify the authorization function to check for either objects or domains in the same way as we modified it to check for either actions or tasks. All of these steps are exactly analogous to what we did when we added tasks.

Intermediate Summary

Let's take stock of what our model looks like:
  • There are three dimensions: user, action, and object.
  • The handling of the three dimensions is completely symmetric (unless role activation is being used, in which case the user dimension has that extra wrinkle).
  • The application passes those three values to the authorization function, which returns true if that operation is authorized, false if not.
  • For each dimension, there is a grouping mechanism: role for user group, task for action group, domain for object group.
  • The grouping mechanism supports a hierarchy of groups, or more generally a (directed acyclic) graph of groups (a partial ordering).
  • To determine if a request should be authorized, take each dimension, collect the closure of the groups for that dimension, and look for a grant in which each dimension of the grant matches any of the items in the closure for that dimension.
The model presented above is easy to understand, but despite its simplicity it is quite powerful. Yet it does not suffice for everyone. Let's see how we can continue to enhance its power without significantly increasing its complexity.

Times, Periods and Schedules

In some systems it is desirable to allow some operations only at specified times. For example, one might want to allow users to log in to the system only during their work shift.

We define another dimension, the time dimension, and we define a time range as a period, where a period is an interval of time such as 8AM to 5PM, or Sunday, or 8AM to 5PM on weekdays. We add the time dimension to our definition of an operation, so when the application calls the authorization function, it must now pass the current time as a fourth argument.

The dimensions we have defined previously are all discrete dimensions, with only one matching value for each definition. The time dimension is different in that it is a continuous dimension: there are multiple time values that can match a period. This makes the authorization function a little more difficult to write, but it does not add much complexity to the user's conceptual model.

The other dimensions all have groups, so it would not add to the complexity of the model to add groups of periods. In fact, the model would be more complex if we did not add groups of periods, as that would make this dimension different from all the others in that aspect, which would be an additional detail that the user would have to factor into his mental model.

We add a group called schedule. As with all the other groups, a period can be included in any number of schedules, and schedules can contain other schedules. When checking authorization, we collect all the periods that match the current time and the closure of all the schedules for those periods, and we search for grants that include any of those in the period column.
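A minimal Python sketch of period matching, assuming a hypothetical representation of a period as a set of weekdays plus a time-of-day interval (a real system would need richer period definitions):

```python
from datetime import datetime, time

# Hypothetical period representation: a set of weekdays plus a
# half-open time-of-day interval.
class Period:
    def __init__(self, name, weekdays, start, end):
        self.name = name
        self.weekdays = weekdays          # 0=Monday .. 6=Sunday
        self.start, self.end = start, end

    def matches(self, t: datetime) -> bool:
        return t.weekday() in self.weekdays and self.start <= t.time() < self.end

work_shift = Period("weekday_work", {0, 1, 2, 3, 4}, time(8), time(17))

# Collect all periods matching the current time; the closure of their
# schedules would then be matched against the period column of the grants.
def matching_periods(periods, t):
    return [p.name for p in periods if p.matches(t)]

print(matching_periods([work_shift], datetime(2014, 7, 7, 9, 30)))   # a Monday
print(matching_periods([work_shift], datetime(2014, 7, 6, 9, 30)))   # a Sunday
```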

Locations, Areas and Regions

By now the pattern should be pretty clear. If the system requires other dimensions, they are easy to add by following the same pattern. Keeping to the pattern keeps the complexity of the user's mental model low, even when a new dimension differs in some small way from the existing ones, as the time dimension did from the three previously defined dimensions. And when a dimension does introduce a small model extension, as time did, we can leverage that concept when adding later dimensions.

Location is a system-specific concept. For some systems it might be a logical location, such as "console", "secure terminal", or "dial up". Since these are discrete values, it would suffice to have a location table, group locations in regions, and handle it in the same manner as the other discrete dimensions such as user.

For other systems a location might mean a physical location specified by one or more continuous values, such as latitude and longitude, in which case we define an area analogously to a period, where one area includes a range of locations. The area might be defined with a center point and radius, it might be defined with a bounding box, it might be defined as a polygon, using splines, or in some other even more complex way. As with periods, the complexity of the definition of an area has an effect on the difficulty of implementing the authorization function that has to determine whether a location is or is not in an area, but has little effect on the complexity of the user's mental model of the authorization. For the user, it is sufficient to know that a given location will be either contained in or not contained in an area, and that grants are based on areas.
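A minimal Python sketch of area containment, assuming hypothetical circle and bounding-box area types; the authorization function only ever needs a contains test, however the area itself is defined:

```python
from math import hypot

# Hypothetical area shapes; each only needs to answer "is this location
# inside this area?" for the authorization function.
class CircleArea:
    def __init__(self, cx, cy, radius):
        self.cx, self.cy, self.radius = cx, cy, radius
    def contains(self, x, y):
        return hypot(x - self.cx, y - self.cy) <= self.radius

class BoxArea:
    def __init__(self, x1, y1, x2, y2):
        self.x1, self.y1, self.x2, self.y2 = x1, y1, x2, y2
    def contains(self, x, y):
        return self.x1 <= x <= self.x2 and self.y1 <= y <= self.y2

hq = CircleArea(0.0, 0.0, 10.0)
campus = BoxArea(-5.0, -5.0, 50.0, 50.0)
print(hq.contains(3.0, 4.0))       # True: distance 5 is within radius 10
print(campus.contains(60.0, 0.0))  # False: x=60 is outside the box
```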

Our group for an area is a region, and it groups together areas and other regions in the same way as the groups in the other dimensions.

Denials

The approach described above is essentially a "whitelist" approach, which is the standard approach to authorization. If an operation is listed in the grant table then it is allowed; any operation which is not listed is not allowed.

It is also possible to use a "blacklist" approach: rather than allowing what is listed and denying everything else, we can deny what is listed and allow everything else. In this case we would create a denial table that is exactly like the grant table except that it contains operations to be denied rather than operations to be allowed. The authorization function would do the same search as before, except that it would deny the operation if any matching records were found, and allow the operation otherwise.

Using a blacklist approach to authorization as just described is generally not recommended (in fact the NIST RBAC standard specifically recommends against "Negative permissions", although it does not outright disallow them). Since the default action is to allow an operation, if a new operation is added to the system and through oversight the appropriate denials are not added, then there is no protection for the new operations.

Exceptions

We can combine the original grant approach and the denial approach described just above to give us the ability to have both a whitelist and a blacklist. We start with our original grant table approach, following the recommended position that the default is to deny any operation unless it is explicitly granted; on top of that, we add the denial table as exceptions to the grants.

Our authorization function first looks in the denial table; if a matching record is found, then the request is denied. If no matching record is found, then the function looks in the grant table; if a matching record is found, then the request is granted; otherwise it is denied.
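A minimal Python sketch of this deny-overrides-grant check; the Rule class and its wildcard matching are hypothetical stand-ins for the closure-based matching queries described above:

```python
from collections import namedtuple

Op = namedtuple("Op", "user action object")

class Rule:
    def __init__(self, user, action, object_):
        self.user, self.action, self.object = user, action, object_
    def matches(self, op):
        # '*' as a stand-in for "matches anything" keeps the sketch short;
        # the real system matches via the group-closure queries instead.
        return all(r in ('*', v) for r, v in
                   [(self.user, op.user), (self.action, op.action),
                    (self.object, op.object)])

def authorized(op, denials, grants):
    if any(rule.matches(op) for rule in denials):
        return False                      # a matching denial always wins
    return any(rule.matches(op) for rule in grants)

grants = [Rule('*', 'read', 'wiki')]      # everyone may read the wiki...
denials = [Rule('mallory', '*', '*')]     # ...except mallory, denied everything

print(authorized(Op('alice', 'read', 'wiki'), denials, grants))    # True
print(authorized(Op('mallory', 'read', 'wiki'), denials, grants))  # False
```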

This allows the admin to think in terms of exceptions: grant privileges to all of X, except for Y. In some situations this allows expressing the intended grants more simply than if one is restricted to just additive grants.

We could also flip the grant and denial tables around, first looking in the grant table for a match, then looking in the denial table for a match, then granting if nothing is found. As discussed in the previous section, this is not recommended, but understanding that it is possible is conceptually useful, and leads us to our last enhancement.

Prioritization

The structure of the grant and denial tables are identical, and their contents are checked in the same way, with the only difference being an inversion of the interpretation of the results in one case as compared to the other. We can easily combine both of these tables into a single auth table that includes an additional allow column that is true for all records from the grant table and false for all the records from the denial table. We can also add a priority column that we use to determine which records we should attend to first.
create table auth(
    id integer auto_increment primary key,
    allow boolean not null default true,
    priority integer not null default 0, -- higher values take precedence
    user varchar(32) not null,
    action varchar(32) not null,
    object varchar(32) not null,
    period varchar(32) not null,
    area varchar(32) not null,
    constraint FK_auth_user foreign key(user)
        references user(name),
    constraint FK_auth_action foreign key(action)
        references action(name),
    constraint FK_auth_object foreign key(object)
        references object(name),
    constraint FK_auth_period foreign key(period)
        references period(name),
    constraint FK_auth_area foreign key(area)
        references area(name)
);
If we define the priority value such that higher values are more important than lower values, then we can get the same behavior as described in the first part of the previous section by setting the priority on all the denial records to 2 and setting the priority on all the grant records to 1. Our authorization function then looks in the auth table for the matching record with the highest priority value and looks at the allow value for that record.

If we wanted to get the (non-recommended) behavior as described at the end of the previous section, we could do that by setting the priority of all the grant records to 2 and setting the priority of all the denial records to 1, plus making the default behavior (when no matching rows are found) to allow the operation.

Given this structure, we can of course put in records with any priority value. This allows building up a series of toggling exceptions, much as the way leap years in the Gregorian calendar are defined (each year has 365 days, except every 4th year is a leap year with 366 days, except every 100 years is not a leap year, except every 400 years is a leap year).
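The leap-year definition can in fact be written directly as prioritized rules, with the highest-priority matching rule winning, just as in the auth table. A minimal Python sketch, resolving equal priorities in favor of deny:

```python
# The Gregorian leap-year definition expressed as prioritized allow/deny
# rules: the highest-priority matching rule wins.
rules = [
    # (priority, predicate, allow)
    (1, lambda y: y % 4 == 0,   True),    # every 4th year is a leap year...
    (2, lambda y: y % 100 == 0, False),   # ...except every 100th is not...
    (3, lambda y: y % 400 == 0, True),    # ...except every 400th is.
]

def is_leap(year, default=False):
    matching = [(p, allow) for p, pred, allow in rules if pred(year)]
    if not matching:
        return default                    # no matching rule: use the default
    # highest priority wins; on equal priorities, deny (False) wins
    return max(matching, key=lambda m: (m[0], not m[1]))[1]

print([y for y in (1900, 2000, 2014, 2016) if is_leap(y)])  # [2000, 2016]
```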

Since we can stack up alternating grant and denial records, the only distinction between the "whitelist" and "blacklist" approaches discussed earlier is the question of what the default is when no matching records are found in the auth table (the default for whitelisting is deny, the default for blacklisting is grant). Given that using a default of allow is not recommended, we define the system to use a default of deny, but we provide a way that the system can effectively be set up with a default of allow if desired.

To simulate a default of allow, the admin can create a group for each of the dimensions in our authorization model (user, action, etc) that includes all elements of that dimension. Thus there would be an AllUsers role, an AllActions task, an AllObjects domain, etc. The admin then creates a rule that includes all of these groups with allow set to true and priority set to zero. Since the rule has been defined to include all elements of every dimension, it will always match every operation, so there will never be a case where there are no matches and the system default of deny is used. Assuming all other priority values are greater than zero, this rule will be the lowest priority, so it will only have an effect if there are no other matches, and thus it acts as the default.

As described above, there is one more potential ambiguity to resolve: what happens if there are two rules with the same priority but opposite allow values? (Two rules with the same priority and the same allow are not a problem, as they both give the same result.) We resolve this ambiguity by defining the denial records to take precedence over the grant records when they have the same priority value. This definition reduces nicely to the desired behavior for the simplest denial+grant case when all records have the same priority.

Our authorization function thus looks for all matching records in the auth table, sorts first by priority then by allow, picks the first one, and uses its allow value to determine whether to allow the operation. If no matching records are found, the operation is not allowed.

Ignoring for now the more complicated portions of the WHERE clause for selecting time and location, here is our SQL statement for determining if an operation is authorized:
-- The single selected value is true if authorized; if false or no records, not authorized
select allow from auth where
    (user={user} or user in (select role from role_closure where user={user})) and
    (action={action} or action in (select task from task_closure where action={action})) and
    (object={object} or object in (select domain from domain_closure where object={object}))
    order by priority desc, allow asc
    limit 1
Adding prioritization like this adds a new concept to the authorization model, but provides a good amount of additional power relative to the additional mental load required to understand the model. However, creating well-structured rules using prioritization is trickier than it seems at first glance. It has the same essential problem as the blacklist approach described above: mistakes in setting up the conceptual layers of the different priority levels can result in unexpected security holes. If you can figure out how to set up your authorizations using grants only, without denials, you should do that. But if the grant-only model is not sufficient, then adding prioritization as described in this section is a reasonable way to take the model to the next level of power; just remember that you have to be more careful in how you set up your rules.

Summary

With the addition of prioritization in the previous section, our authorization model is complete. Let's review the complete model.
  • There are two kinds of dimensions: discrete and continuous.
  • There are five dimensions: user, action, object, time and location.
  • User, action and object are discrete; time is continuous; location can be either discrete or continuous, depending on how the system defines it.
  • Additional dimensions can be added if necessary, following the pattern of the existing dimensions.
  • The handling of every discrete dimension is completely symmetrical with every other discrete dimension (unless session-based role activation is included, in which case the user dimension is a little different); the handling of each continuous dimension is close to completely symmetrical with the other continuous dimensions; and there is a high level of symmetry between the discrete and the continuous dimensions.
  • The application passes a value for each dimension to the authorization function. This collection of dimension values is the operation for which the application is requesting authorization. The authorization function returns true if that operation is authorized, false if not.
  • For each continuous dimension, there is a range defined as the basic match: period for time, area for location.
  • For each dimension, there is a grouping mechanism: role for user group, task for action group, domain for object group, schedule for period group, region for area or location group.
  • The grouping mechanism supports a hierarchy of groups, or more generally a (directed acyclic) graph of groups.
  • There is a set of rules that is used to determine whether an operation is authorized. Each rule includes a set of comparison values, one for each dimension, a priority, and an allow flag that tells whether that rule specifies that authorization for a matching operation should be granted or denied.
  • To determine if a request is authorized, take the value for each dimension in the request, collect the closure of the groups for that value, and collect the records in which each dimension of the grant matches any of the items in the closure for that dimension. Pick the record with the highest priority, giving preference to deny records over grant records, and use the allow value of that record to determine whether to authorize or deny the operation. If no matching records are found, the operation is denied.
This conceptual model is no longer trivial, but the above rules are still relatively concise and easy to understand. The model is general enough and powerful enough that it should be suitable for a wide variety of applications.

In our model the application passes in a set of values to the authorization function, which uses its abstractions (in the form of groups) and rules (in the form of prioritization) to determine whether or not to grant permission for an operation. If we need more power, the application can pass in additional information, whether it is additional attribute information about the user, the environment, or other aspects of the operation, and the authorization system can apply even more complex rules. This is the approach used by Attribute-Based Access Control, with a rules engine used in place of the mechanisms described here.

Monday, April 30, 2012

Git Rebase Across Many Commits

Not all git merge conflicts are real.

The Scenario

In both my personal and my work projects I prefer to use git rebase to keep my commit histories simple and readable. To make this work in a team setting, we never work on the master branch, instead always working on a feature branch in our local repositories. Our process flow looks something like this:
$ git branch feature           #create the working branch
$ git checkout feature          #do all development work on that branch
#Edit files, etc.
$ git commit -m "Implement Feature"
#Repeat the above as desired during development.
#When ready to merge to master, do the following:
$ git checkout master
$ git pull                      #update master from shared repository
$ git checkout feature
$ git rebase master             #optionally with -i if squashing is desired
$ git checkout master
$ git merge feature
$ git push origin master
$ git branch -d feature
Because we never use our local master branch for development, the git pull on master is always a fast-forward merge. Likewise, because we have just rebased the feature branch against the master right before we merge that feature branch back into master, that merge is also always a fast-forward merge. Looking at it another way, we don't have any merge conflicts when updating or merging master because we resolve all of the merge conflicts when we rebase the feature branch against the latest master.

The Problem

At work, we have a large codebase and a handful of active developers who typically merge feature branches to the master using the above workflow multiple times each day. Sometimes somebody has a feature branch that takes a long time to finish, so that between the time the branch was started and the time it is ready to go into master, there may have been 40 or 50 other commits made to master. In this situation we will occasionally rebase the feature branch against the latest master a few times during development, but inevitably there are occasions when a large rebase across many commits ends up being done.

Even if there are many commits on the master branch, if none of those commits touched any of the same code as the commits on the feature branch, then there should be no merge conflicts when rebasing the feature branch against the updated master. In my experience, however, this is not always the case: sometimes git rebase reports merge conflicts when I think there should not be any. Since I don't generally know exactly what code the other team members have edited, I can't immediately tell whether the merge conflicts make sense.

The normal advice for how to handle merge conflicts is to edit the named file, look for the conflict markers, inspect the conflicting code fragments, determine what to keep, edit out what is not being kept along with the conflict markers, git add the repaired file, and git rebase --continue to let it tell you about the next merge conflict.
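As a command sequence, with a hypothetical conflicted file named src/widget.c, that procedure looks like this:
$ git rebase master
CONFLICT (content): Merge conflict in src/widget.c
#Edit src/widget.c: find the <<<<<<<, =======, and >>>>>>> markers,
#decide what to keep, and delete the rest along with the markers.
$ git add src/widget.c
$ git rebase --continue           #moves on to the next conflict, if any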

That's a lot of work, and it might all be completely unnecessary.

The Solution

It seems that git sometimes just gets confused when doing a rebase across a large number of commits. Sometimes if you rebase in smaller steps, git will happily rebase each smaller step with no merge conflicts, until you have stepped all the way up to the latest master, at which point your rebase is done.

You could rebase against every single commit and work your way up to master, but that, too, is a lot of work. Here's what I do when the initial rebase of the feature branch against the latest master tells me there are merge conflicts.

When the initial git rebase reports a merge conflict, I immediately run git rebase --abort to undo that rebase attempt. Then I use gitk --all to view the commit tree, which shows me both the tip of the master branch and the commit at which my feature branch diverges from master. I select a commit on master about half way between those two points, copy its commit ID, and paste it into a rebase command that looks something like this:
$ git rebase 8bc85584989e4435c2d98b13447bcab37648ba7f
If this rebase reports no merge conflicts, then I try rebasing against master and repeat the process.

If there are merge conflicts, then I abort the rebase and pick another commit half way again to the branch point. I repeat this until either the rebase succeeds or I am trying to rebase across a single commit. At that point, if there are still merge conflicts, they are real and I address them in the normal way. Since the conflict is only across a single commit, it is easier to see the cause of the conflict and to resolve it.

After resolving the conflict across that one commit, I go back to the first step and try rebasing against master again, repeating the process.
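The whole procedure can be sketched as a small shell function. This is just an illustration of the halving idea, not a hardened tool: bisect_rebase and try_rebase are names I made up for this sketch, and it assumes a clean working tree, a branch named master, and that the current branch is the feature branch.

```shell
# Attempt a rebase onto the given commit; on conflict, abort and report failure.
# Output is suppressed here for brevity, which also hides the conflict details.
try_rebase() {
  git rebase "$1" >/dev/null 2>&1 && return 0
  git rebase --abort
  return 1
}

# Rebase the current branch up to master, halving the step whenever a
# rebase attempt reports conflicts.
bisect_rebase() {
  master=$(git rev-parse master)
  target=$master
  while [ "$(git merge-base HEAD "$master")" != "$master" ]; do
    if try_rebase "$target"; then
      target=$master            # made progress; aim at master again
      continue
    fi
    base=$(git merge-base HEAD "$target")
    count=$(git rev-list --count "$base..$target")
    if [ "$count" -le 1 ]; then
      echo "conflict across a single commit; resolve it manually" >&2
      return 1
    fi
    # Pick the commit roughly half way between the branch point and target.
    target=$(git rev-list --reverse "$base..$target" | sed -n "$((count / 2))p")
  done
}
```

When the function stops with a real single-commit conflict, resolve it by hand as described above, then simply call it again to continue stepping toward master.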

I have followed this process a number of times. In the majority of cases, a few rounds of binary division let me step piecemeal through the commits until I have rebased against master without ever resolving a conflict. The remaining times, I typically have to resolve one or two small conflicts across a single commit, after which the rebase against master succeeds.

The next time you do a rebase across more than one commit and git tells you there are merge conflicts, try this approach. You might save yourself a lot of work.

Thursday, December 8, 2011

Levels of Expertise

An attempt to improve the objectivity of skill self-ratings.

Discussion

We are often asked to rate things on a scale, typically 1 to 5 or 1 to 10. Rarely is there an attempt to define what those different numbers mean. From a statistician's point of view, this makes the values useful for the sole purpose of comparing a single individual's ratings against other ratings of that individual. In particular, without a good definition of what the various levels mean, I don't see how there can be any effective communication from one person to another of the meaning of such a rating.

When my doctor asks me to tell him how much something hurts on a scale of 1 to 10, I have no idea what information he expects to get when I say "3" or "7".

I once asked an acquaintance to rate, on a scale of 1 (bad) to 10 (good), a movie he had just seen. He said it was a 9. I was suspicious of this answer, so I asked him how he would rate Star Wars, which I knew to be his all-time favorite movie, on the same 1-to-10 scale. He said 12.

I personally consider it an aspect of innumeracy, but people often try to emphasize something by using numbers that are outside of the valid range. We may chuckle when Nigel says he likes his amp better because it goes to 11, but how often have you heard someone talking in all seriousness about putting in a "110% effort"? What does that actually mean? How would you know if someone were putting in 110% versus 100%? If 110% is a valid number, then presumably so is 120%, so anyone suggesting a mere 110% is clearly not asking for enough effort.

People tend to overestimate how good they are at all sorts of things, including cognitive, social and physical skills. If we all overrate ourselves by the same amount, I suppose that could all cancel out and you could still compare people's ratings - but without knowing a priori what their ratings should be, we don't know how much they might be overrating themselves.

When people consider their own expertise, it is common for those with less expertise to overvalue themselves more than people with more expertise. With more expertise comes more awareness of what one could do better. Einstein said, "As our circle of knowledge expands, so does the circumference of darkness surrounding it." Relative beginners easily fall into the Sophomore Illusion of thinking they know a lot because the circumference of their knowledge is not yet large enough for them to recognize the size of the surrounding darkness.

In 1989, psychologist John Hayes at Carnegie Mellon University identified what is now called the "ten-year rule" (although the idea has earlier antecedents, including Herbert Simon, who was also at CMU). As Leonard Mlodinow says in "The Drunkard's Walk", "Experts often speak of the 'ten-year rule,' meaning that it takes at least a decade of hard work, patience and striving to become highly successful in most endeavors." The ten-year rule is related to the idea that it takes about 10,000 hours of practice at something to become an expert; at 5 hours of practice per business day and 200 business days per year, it would take ten years to rack up that many hours. If you find yourself thinking how wonderfully expert you are in something you have practiced for only a few years, perhaps you should consider the ten-year rule and temper your evaluation.

Given that people are so bad at these ratings, it seems to me that the only way to get any useful information from someone when asking this kind of self-rating question is to have an objective definition of what each level means.

One way to think about a scale is by how many people fall into each level. There are currently 7 billion people in the world, or almost 10 to the 10th power. This conveniently maps to a logarithmic scale from 0 to 10, allowing us to define eleven levels starting with level 0 containing all approximately 10 billion people in the world and with each higher level having one tenth the number of people as the level just below it. If the descriptions of a level are hard to interpret, perhaps the size of that level will help give an indication of whether a person should be rated there.

Years ago, during a job interview, I was asked to rate my level of expertise in various subjects, such as programming languages and development tools. This was not an unusual question, I had been asked this question before and have been asked it since. What was different that time was that the interviewer included a scale with some relatively objective descriptions for determining level of expertise. I rather liked the scale, so although I don't recall the exact definition of his levels, I have tried to reproduce that concept here, using descriptions somewhat similar to those given by that interviewer. Unfortunately, I don't remember who introduced that scale to me, so I am unable to give credit.

There are many reasons one might want a scale of expertise, including rating potential employees or creating a summary of the amount of expertise within a company. The scale I present here is intended to be very general; given its logarithmic nature that can include the entire world population, it is capable of allowing comparison of expertise across everyone in the world. You might think that would make it suboptimal for rating (potential) employee expertise, but I think there are enough levels to make it useful for that purpose.

Scale

The scale below includes the following columns:
  • Level: a number for the level, from 0 to 10, with 10 being the highest level of expertise.
  • Name: a name for the level. These are taken from a set of expertise level names proposed by the Traveling School of Life. My use of them probably doesn't quite match their intent, but I liked the names and thought the ten words matched my levels pretty well, so I applied them to my levels and added "ignorant" for level 0.
  • Description: a brief description of the level. The descriptions are worded as if for a technical tool; for application to other areas or concepts, modify accordingly. Comments referring to companies assume a large company (10,000+ people) with large divisions (1000+ people); being a company-wide guru in a company with 100 people might not get you past level 6.
  • Size: the approximate number of people expected to be at that level worldwide. As mentioned above, this is a simple logarithmic scale: the number of people in a level is 10^(10-L), where L is the level number.
  • Practice: the approximate amount of practice that could be required to reach that level of expertise. Putting in that many hours does not guarantee reaching that level, and reaching that level does not necessarily require putting in that many hours. The conversion factors are 1,000 hours per year or 5 hours per day.
All of these different factors are rough estimates, not intended as absolutes but merely as guidelines to help people rank themselves in a way that allows for more meaningful results. I don't have any research to show how well my guesses about Description, Size and Practice correlate; if anyone knows of something along those lines, that would be interesting.
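The Size column in the table below is just this formula evaluated at each level; if you want to check it, a throwaway awk one-liner (my own, not part of the original scale) prints the same numbers:

```shell
# Print the expected population of each level: 10^(10 - L).
awk 'BEGIN { for (L = 0; L <= 10; L++) printf "level %2d: %.0f people\n", L, 10 ^ (10 - L) }'
```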

Level | Name | Description | Size | Practice
0 | ignorant | I have never heard of it. | 10,000,000,000 | none
1 | interested | I have heard a little about it, but don't know much. | 1,000,000,000 | 1 hour
2 | pursuing | I have read an article or two about it and understand the basics of what it is, but nothing in depth. | 100,000,000 | 1 day (5 hours)
3 | beginner | I have read an in-depth article, primer, or how-to book, and/or have played with it a bit. | 10,000,000 | 1 week (25 hours)
4 | apprentice | I have used it for at least a few months and have successfully completed a small project using it. | 1,000,000 | 3 months (250 hours)
5 | intermediate | I have used it for a year or more on a daily or regular basis, and am comfortable using it in moderately complex projects. | 100,000 | 1 year (1,000 hours)
6 | advanced | I have been using it for many years, know all of the basic aspects, and am comfortable using it as a key element in complex projects. People in my group come to me with their questions. | 10,000 | 5 years (5,000 hours)
7 | accomplished | I am a local expert, with ten or more years of solid experience. People in my division come to me with their questions. | 1,000 | 10 years (10,000 hours)
8 | master | I am a company-wide guru with twenty or more years of experience; people from other divisions come to me with their questions. | 100 | 20 years (20,000 hours)
9 | grandmaster | I am a recognized international authority on it. | 10 | 30 years (30,000 hours)
10 | great-grandmaster | I created it, and am the number 1 expert in the world. | 1 | 50 years (50,000 hours)
