Wednesday, October 22, 2008

Software Quality Dimensions

In my role as a Software Systems Architect, in positions as both employee and consultant, I have critiqued and contributed to existing and planned systems. When doing so I consider a number of different issues, most of which I was not taught in school. Although the title of this post refers to software, I also like to consider the environment in which the software is developed and used, so some of the issues discussed below are purely about software, while others relate to hardware or to other aspects of the development or operating environments. There are relationships between some of these issues, but for the most part they are orthogonal, so each must be considered separately to ensure sufficient quality along that dimension. Strip out the comments after the colons and you could use this as a checklist to see how your system measures up.

Basics

Most developers know about these issues. They apply to almost every software system, no matter how small.
  • Functional: This is the area most people initially think of when considering software quality, and is the area most visible to end users.
    • Correct: the system must perform the right operations, preferably in a way that matches the user's expectations. The "principle of least surprise" is applicable here.
    • Usable: the system should be as easy to use as is practical. In colloquial terms, it should be user friendly.
    • Consistent: the different parts of the system should appear similar to the user. There should be a model of the system presented to the user that is easily understandable and as simple as possible. Consistency among the different parts of the system allows the user to understand those parts with less overall to learn.
  • Performant: Performance improvements are usually thought of as quantitative changes, but a big enough change in performance becomes a qualitative difference in the system. As one of my friends says, performance is a feature.
    • Fast: the system must be fast enough to satisfy any real-time or near real-time constraints. Worst case scenarios must be considered: can the system keep up under the biggest possible load, and if not, what happens?
    • Small: the system must be able to operate within its resource constraints. If you can make the system smaller, that usually means cheaper, faster, and easier to maintain.
  • Maintainable: Systems can last for a long time, sometimes far longer than the original designers might have thought (as happened with the Y2K problem). For such systems, the cost of maintenance over the life of the system can be far more than the original cost of development.
    • Modular: the system should be subdivided into separable parts with well-defined integration points between the parts. This allows separate replacement or upgrading of just some parts rather than the whole system.
    • Documented: significant aspects of the system design should be captured in written form to reduce the loss of critical knowledge when employees leave the project.
    • Commented: a specific form of documentation. Code is easier to write than to read, but is typically read far more often than it is written. Appropriate comments simplify the task of reading and understanding the code.
    • Standardized: the system should require only a minimal set of standard languages, tools, and subsystems. Using fewer languages, tools, and subsystems (such as databases, operating systems, and hardware platforms) simplifies finding the resources needed to maintain the system.
    • Versioned (change-controlled): all source code and other system configuration information should be stored in a version-controlled repository so that new versions can easily be created when fixes are required, and old versions can be recovered when necessary to audit problems or restore functionality that was unintentionally lost in an upgrade.
    • Testable: the system should include automated unit and system tests to provide confidence that changes to the system have not broken anything. The system should be designed to support effective tests (see the sketch below).
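
As a minimal sketch of that last point: code factored into small, side-effect-free functions is easy to cover with automated checks. The function and test below are hypothetical, and a real project would use a test framework such as JUnit rather than bare asserts:

//a small pure function is easy to test: no setup, no side effects
def celsiusToFahrenheit(c:Double):Double = c * 9 / 5 + 32

//a minimal automated check, to be run as part of every build
def testCelsiusToFahrenheit() {
  assert(celsiusToFahrenheit(0) == 32)
  assert(celsiusToFahrenheit(100) == 212)
}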

Hardening

For larger systems or systems with more valuable data, these capabilities have particular importance. They are more expensive to implement than the Basics, so are often omitted from smaller systems. They can also be very difficult to retrofit to a system that was not originally designed with them in mind.
  • Secure: Systems must be protected against unauthorized use.
    • Access control: the first line of defense.
      • Physical: anyone who has physical access to the system can cause problems. Console access is often not secured as well as remote access, and even if the person can't get into the software, he can still disconnect or physically damage the system. Depending on the size of the system and the value of its data and services, this could mean physical locks on machines, locked rooms, or secure data centers with controlled access, including ID cards with photo, signature, and biometrics such as a handprint or retina scan.
      • Network: the system should include firewalls to limit access to only the services the system is designed to provide. For more secure systems, IP address filtering, VPN connections, or TLS certificate requirements can limit who can get to the system.
    • Authentication: users must be identified. This is typically done by requiring a username and password. More secure systems can require biometrics, such as a thumb scan, or a physical device such as an RFID security card. Every user should have a separate account to allow tracking exactly who it is that is logging in to the system.
    • Authorization: once a user is identified through the authentication process, the system can load up the privileges assigned to that user. Each user should have only the privileges necessary to do his job. Giving everyone complete access to the system increases the probability that someone will do something he should not do, whether intentionally or by accident.
    • Auditing: the system should record all pertinent information to allow a definitive determination of history for all significant security events. Details should include at a minimum what was done, who did it, and when it happened.
    • Internal checks (against bugs and attacks): for improved security, systems should run self-checks to ensure that critical parts of the system continue to run correctly. Ideally the self-check applications should be able to monitor themselves (or each other if more than one) to guard against problems within the self-check programs (including malicious users attempting to circumvent the self-checks).
    • Resource management/limits: the system should include mechanisms to limit the resources consumed by each user, including disk space and CPU usage. In addition to allowing fairer use of the system by preventing one user from hogging resources, these mechanisms help prevent denial of service (DoS) attacks.
  • Robust: Nothing is perfect. A well-designed system takes into account many different kinds of possible failure. The probability of failure cannot be completely eliminated, but it can be made very small given sufficient resources. The cost of failure must be compared to the cost of building a system with a sufficiently low probability of failure.
    • Redundant: the system should have no (or minimal) single points of failure. Redundancy can be applied at many levels: a single box can have multiple power supplies, multiple CPUs, multiple disks, and multiple network connections. Within one data center there can be multiple boxes and multiple network routers, with battery backup power. For maximum redundancy there can be multiple geographically separated data centers with multiple network routes connecting them.
    • Diverse: monocultures are more susceptible to damage. Just as with biological systems, diversity provides protection against problems that affect a "species". Sharing responsibilities among different operating systems and different applications provides defense against viruses that attack specific operating systems and applications, and against bugs in those components. This aspect of robustness can be very expensive, so is not often considered.
    • Forgiving (fault-tolerant): the larger the system, the higher the probability that it will have some problems, including bugs in the software. The system should tolerate small problems in the parts; a small problem should remain a small problem, not be amplified by cascading failures into a larger problem.
    • Self-correcting: Self-monitoring can be done at multiple levels, from memory parity up to application-level system checks. More sophisticated techniques allow for automatic correction of errors, such as Reed-Solomon coding instead of simple memory parity. Care must be taken to ensure that the probability of failure or errors in the error-correcting step is less than the parts that it is monitoring.
  • Scalable: If you expect usage of the system to grow over time, your design should allow incremental expansion of the system and should continue to perform well at the new usage levels. Scalability should be considered at multiple levels, depending on how far you need to scale. When scaling up by adding more identical units, you also get the benefits of redundancy, because with more units, the portion that you must set aside purely for redundancy can be reduced. Stated another way, effective redundancy can be added to a system much less expensively when that system is already using a collection of identical units for scaling purposes.
    • Scalable algorithms: algorithms should have appropriate big-O performance. Attention should be paid to word size limits to prevent overflow on large data sets (see the sketch after this list).
    • Resource-tight: even a small memory leak can bring down an application when there is enough usage. The system should be tested under load and monitored to ensure it does not run out of memory, file descriptors, database connections, or other resources that could leak from poor coding.
    • Parallelizable to multiple threads or processes: the first step in spreading out to multiple units is to use multiple threads or processes on one machine. Areas of concern are shared memory and concurrency: deadlock or livelock, excessive blocking due to excessive synchronization, and stale or corrupt data due to insufficient synchronization. On a multiprocessor machine, be sure you understand the memory model, and be aware of possible cache coherence problems that can arise if code is not properly synchronized.
    • Parallelizable to multiple machines: when one machine is not enough, scaling up to multiple machines clustered in one location is the next step. For this level of parallelization, typical issues of concern are network bandwidth limits, how to distribute data for improved throughput, and what to do when one host in the cluster is down.
    • Geo-redundant: for the largest applications, having multiple data centers in geographically separated locations provides redundancy and disaster security, as well as sometimes providing improved performance as compared to a single location because of reduced network delays. Typical issues are dealing with network latency, managing routing between data centers, and data replication between data centers for improved performance and redundancy.
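
As a sketch of the word-size point above (hypothetical values, in Scala, where Int arithmetic silently wraps at 32 bits):

//averaging two large Ints: the intermediate sum overflows
val a = 2000000000
val b = 2000000000
val badAvg = (a + b) / 2                         //a+b wraps negative, giving -147483648
val goodAvg = ((a.toLong + b.toLong) / 2).toInt  //widen to Long before adding

The same class of bug famously lurked for years in binary search implementations that computed the midpoint as (lo + hi) / 2.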

Business

In theory you could build a great system without considering these issues, but in practice you had better pay attention to them, or your business will not last very long.
  • Affordable: The art of engineering includes balancing costs against other qualities. The following costs should be considered:
    • Design and development: the labor costs of building the system.
    • Hardware and external software: the purchase costs of the system.
    • Maintenance: the ongoing costs of repairing and upgrading the system.
    • Operational: the data center costs of operation.
  • Timely: When a product hits the market can determine whether or not it succeeds. When a product is scheduled to hit the market should be factored into the design, both in terms of how much time is available for development, and in terms of what the expected environment will be like then.
    • Soon enough: taking too long to get to market is a well-known concern.
    • Late enough: the first to market is not always the most successful. Sometimes it is better to let someone else bear the costs of opening up the market.
    • Available support and parts: a good design will plan on using parts that will be cost effective when the product is shipping, which is not necessarily the same as what is cost effective when the product is being designed. This requires predicting the future, so can be tricky to get right.
  • Operational: If the system is successful, it may be used for a long time or by many people, amplifying the value of making the system easy to operate.
    • Visibility: the operator should be able to easily verify that the system is functioning properly. This includes being able to determine what level of resources is being used and whether the system is under attack. All of the following pieces of information should be readily available to the operator:
      • What version of what component is running where.
      • What is happening now.
      • What has happened recently.
      • Who is accessing now or recently.
      • Performance and other resource statistics.
      • Usage statistics (e.g. what function is most popular).
      • Error and warning conditions.
      • Debugging information (e.g. logging).
      • Post-mortem info (for bug fixes).
    • Control: the operator should be able to adjust aspects of the system as appropriate to maintain its proper operation and to protect its security. This includes being able to do the following:
      • Start and stop processes, applications or hosts.
      • Manage access control, including adding, removing or modifying accounts, privileges and resource limits for users and groups.
      • Enable and disable features.
      • Perform maintenance.
    • High availability: for systems that require high availability once put into production, in addition to the robustness items listed above the following operations should be possible:
      • Piecemeal upgrades or replacement of components while running.
      • Ability to run with mixed versions of components (soft rollout), particularly on a system with multiple copies of the same component.
    • Granular control: for optimal resource management and flexibility in managing the business model of usage and payment, the system should support these capabilities:
      • Per-user restrictions, e.g. by time of day, stop when out of money.
      • Ability to charge customers per transaction or by other metrics (CPU usage, disk usage, number of calls, etc.).
    • Image: you might not be able to judge a book by its cover, and beauty is only skin deep, but people respond to how things look, so it is important to remember to maintain consistency and quality in these areas:
      • Branding: consistent look and feel concerning logos, company colors, slogans, etc.
      • Beauty: too often not included in software products.
    • Backups: any system with data of any value should be backed up. This is valid even for small systems, such as your laptop or home network.
      • You should have a backup schedule based on how much you would be bothered by losing your data. For a simple system, a regular schedule of full backups plus an automated incremental backup should suffice and is generally relatively easy to set up. For a larger system, you might want a full backup less often (due to the amount of data) with a layered schedule of weekly and daily backups.
      • Backup media should be validated. At a minimum, the backup media should be checked to make sure it contains the data that was intended to be written to it. For a more complete test, the backup data should be used to recreate a system in the same manner as would need to be done following a disaster.
      • Copies of backups should be stored off-site for disaster security. If your backups are stored on-site and the building is destroyed, those backups won't do you much good.
      • Sensitive data on backup media should be encrypted to prevent its use in the event that media is stolen.
I am always interested in continuing to learn, so if you think I have left out anything from my list, please let me know.

Monday, October 13, 2008

Polymorphism Using Implicit Conversions

In the Java implementation of one of my applications I had a few methods that accepted or returned null to indicate a missing value. These were originally designed this way because they were layered on top of some Swing calls which used the same practice. When I converted the app from Java to Scala, I decided to use Option for these interfaces in order to avoid nulls.

Motivation

To start with, I implemented some Scala methods that called the Java methods that might return null and converted their return values into an Option. I wanted to be able to pass these values directly to some other Scala methods, so I designed those methods to accept Option values as well.

In one case, the Java code included a class with two methods overloading the same name, one with an optional File argument and the other with an optional String argument (representing a file name). Simplified for expository purposes, the Java methods looked something like this:
//Java code
import java.io.File;

public class Foo {
    //name is optional, may be null
    public void bar(String name) { }

    //f is optional, may be null
    public void bar(File f) { }
}
In my initial conversion attempt to Scala, I tried some code that looked something like this:
import java.io.File

class Foo {
  def bar(fOpt:Option[File]):Unit = { }
  def bar(nameOpt:Option[String]):Unit = { }
}
However, this did not compile, because the signatures of the two methods are identical after type erasure, which makes Option[File] indistinguishable from Option[String].

Solution

After some discussion and suggestions from the Scala mailing list, I came up with a design using implicit conversions, and realized it is in fact another way to implement functions which are polymorphic on the type of one argument.

Rather than defining multiple functions with different signatures, I defined a single function which accepts a special type that I defined. That type is a sealed abstract class whose case subclasses represent the set of types that my function understands.

Although the actual function implementation accepts a single type, the use of implicit conversions makes it effectively behave the same as if there were multiple functions, each accepting a value of one of the implicit conversion input types. The function receives enough information in the parameter to determine the type of the argument passed in, so it can arbitrarily modify its behavior based on the type of the argument, which is a requirement of true polymorphism.

In this case, since my original functions understood String, File and null arguments, my type definition looked like this:
import java.io.File

sealed abstract class BarArg
case class BarString(s:String) extends BarArg
case class BarFile(f:File) extends BarArg
case object BarNothing extends BarArg
There is now a single bar function that combines the functionality of the previous multiple functions of that name. For this example, the bar function looks like this:
def bar(b:BarArg) = {
  b match {
    case BarString(s) => println("Got a String "+s)
    case BarFile(f) => println("Got a File "+f)
    case BarNothing => println("Got a Nothing")
  }
}
The BarArg class is sealed, so the Scala compiler can figure out that the cases in the match statement are complete, and we don't need a default case.

With the above definitions I can call bar(BarString("abc")), but I want to be able to pass it a String or File directly. I also want to be able to pass in an Option of either type and have it appropriately converted from None to BarNothing or from Some to BarString or BarFile. In order to do this, I created a set of implicit conversions:
object BarArg {
  implicit def stringToBarString(s:String):BarArg =
    if (s==null) BarNothing else BarString(s)
  implicit def fileToBarFile(f:File):BarArg =
    if (f==null) BarNothing else BarFile(f)
  implicit def optionStringToBarString(s:Option[String]):BarArg = s match {
    case None => BarNothing
    case Some(ss) => BarString(ss)
  }
  implicit def optionFileToBarFile(f:Option[File]):BarArg = f match {
    case None => BarNothing
    case Some(ff) => BarFile(ff)
  }
}
The implicit conversions are packaged up in an object so that they can be imported into the application calling bar:
import BarArg._
With this set of implicit conversions in scope, the following all work:
bar("abc") bar(new File("f")) bar(BarNothing) bar(BarFile(new File("g"))) bar(BarString("def"))

Modifications

Unfortunately, bar(None) does not work because the compiler does not know if None is an Option[String] or an Option[File], so the implicit conversions are ambiguous and it is unable to apply one. The compiler gives the same complaint for bar(Some("abc")) (as of Scala version 2.7.2, although this may be a bug). We can work around this limitation in two ways: either by explicitly declaring the type on the None or Some we are passing in, or by setting the Some value into a variable (or setting the None value into a variable of the appropriate declared type) and passing that to bar. The following examples work:
bar(None:Option[String])
bar(Some("abc"):Option[String])
bar(Some[String]("abc"))
val n:Option[String] = None; bar(n)
bar{val n:Option[String] = None; n}
val s = Some("abc"); bar(s)
bar{val s = Some("abc"); s}
In addition, we can pass in Some("abc") directly if we slightly modify our implicit conversion functions to accept type parameters with an upper bound:
implicit def optionStringToBarString[T<:String](s:Option[T]):BarArg = s match {
  case None => BarNothing
  case Some(ss) => BarString(ss)
}

implicit def optionFileToBarFile[T<:File](f:Option[T]):BarArg = f match {
  case None => BarNothing
  case Some(ff) => BarFile(ff)
}
Since the BarArg type is only used when calling our bar function, the implicit conversions will only be applied to those calls, so it is safe for those implicit conversions always to be in scope.

Summary

We can use this technique as an alternative to function overloading when we want a polymorphic function that accepts multiple types for one of its arguments. We do this with the following steps:
  • Define a set of case classes for that argument representing the set of types we accept
  • Ensure that our function handles those types
  • Define a set of implicit conversions to those case classes
  • Import those implicit conversions into our calling application
This technique allows you to create a single method with a parameter that can accept one of a selected set of types, which can be disjoint (such as String, Int and Point), yet still have the compiler type-check for you to prevent passing an unsupported type (such as Double) for that parameter. This is useful in situations where polymorphism by overloading cannot be done because of type erasure. Although similar in some respects to ad-hoc type-coercion polymorphism, in this case we can set things up so as to detect which type was actually passed and behave differently based on that type.
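
As a sketch of that generalization, here is the same pattern applied to the disjoint types String, Int and java.awt.Point. The plot method and all of its associated names are hypothetical, invented purely for illustration:

import java.awt.Point

sealed abstract class PlotArg
case class PlotString(s:String) extends PlotArg
case class PlotInt(i:Int) extends PlotArg
case class PlotPoint(p:Point) extends PlotArg

object PlotArg {
  implicit def stringToPlotArg(s:String):PlotArg = PlotString(s)
  implicit def intToPlotArg(i:Int):PlotArg = PlotInt(i)
  implicit def pointToPlotArg(p:Point):PlotArg = PlotPoint(p)
}

import PlotArg._

def plot(a:PlotArg) = a match {
  case PlotString(s) => println("Labeling with "+s)
  case PlotInt(i) => println("Plotting value "+i)
  case PlotPoint(p) => println("Plotting point "+p)
}

plot("title")        //compiles, via stringToPlotArg
plot(42)             //compiles, via intToPlotArg
plot(new Point(1,2)) //compiles, via pointToPlotArg
//plot(3.14)         //does not compile: no conversion from Double to PlotArg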

Update 2008-11-04: Michael Dürig posted his solution to this problem (writing a method to accept two different types that are identical after type-erasure) back in January of this year. His approach is slightly different, but also uses implicit conversions.

Update 2009-03-12: Mitch Blevins has a better implementation that he calls stuttering-or, using "or" as a type infix operator to allow combining disjoint types.

Monday, October 6, 2008

Avoiding Nulls

One of the reasons I like Scala better than Java is because I believe it promotes better programming practices. Avoiding null is one of these practices.

In Java, null is often used as a marker to indicate a missing value for an object. For example, System.getProperty returns null if the property is not defined.

In Scala, the accepted way to do this is to use an instance of the Option class.

The Option Class

Scala's Option class has two subclasses: None and Some. Option itself is an abstract class, so it can't be instantiated. Thus a value of type Option must be either None or an instance of Some (well, it is possible for it to be null, but that's generally considered bad form). The None value is used in the same way as null is typically used in Java, to indicate a missing value. If the value is an instance of Some, it contains the data of interest.

Using Option has these advantages over using null to indicate a missing value:
  • A None value unambiguously means the optional value is missing. A null value may mean a missing value or it may mean the variable was not initialized.
  • The fact that Option contains explicit methods to get at the actual value makes you think about the possibility that it might not be there and how to handle that situation, so you are less likely to write code that mistakenly assumes there is a value when there is not.
  • If you do write code that assumes there is a value and it is executed when there is not, the NoSuchElementException you get when using an Option is more specific than a NullPointerException and so should be easier to interpret, track down, and fix.
  • Option contains an assortment of methods to get at or manipulate the optional value, which make for more concise coding.
Let's take a look at that last point in more detail.

How would we want to define getProperty in Scala? We would of course make it return Option[String] rather than a String value which might be null.

Here is a code sample that shows how a Scala implementation of getProperty could be used if that hypothetical method returned an Option that can contain a String (remember that Scala uses [] for types, the way Java uses <> in generics):
val opt : Option[String] = getProperty("MY_PROPERTY")
opt match {
  case None => println("MY_PROPERTY is not defined")
  case Some(v) => println("MY_PROPERTY is "+v)
}
The Option class has methods such as isEmpty, isDefined, get, and getOrElse that can be used to test and use the contained value. It also has some more interesting methods such as map and filter that are less familiar to imperative programmers but can be used in very nice ways, especially when more than one level of Option is involved.
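
Here is a quick sketch of those basic methods in action (the comments show each result):

val someVal:Option[String] = Some("abc")
val noVal:Option[String] = None

someVal.isDefined          //true
someVal.isEmpty            //false
someVal.get                //"abc"
someVal.getOrElse("dflt")  //"abc"
noVal.isDefined            //false
noVal.getOrElse("dflt")    //"dflt"
noVal.get                  //throws NoSuchElementException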

For example, say we want to have a method that returns an Int value for a property, so we define a method getIntProperty that takes a String property name argument and returns Option[Int]. In other words, it returns None if the property is not defined, or Some[Int] (such as Some(123)) if the property is defined. Assuming we already have the getProperty method mentioned above that returns Option[String], we can build our getIntProperty method from getProperty plus a method that accepts an Option[String] and parses it as an integer to produce an Option[Int]. Let's call this method parseOptionalInt. With this parseOptionalInt, our getIntProperty method is simple:
def getIntProperty(s:String) : Option[Int] = parseOptionalInt(getProperty(s))
We could write parseOptionalInt like this:
def parseOptionalInt(sOpt:Option[String]) : Option[Int] = {
  if (sOpt.isEmpty)
    return None
  else
    return Some(sOpt.get.toInt)
}
Or, using pattern matching, we could write it like this:
def parseOptionalInt(sOpt:Option[String]) : Option[Int] = {
  sOpt match {
    case None => None
    case Some(s) => Some(s.toInt)
  }
}
But there is another interesting way of writing it, using the map method.

Using map and flatMap

Using the map method of Option, our method definition looks like this:
def parseOptionalInt(sOpt:Option[String]) : Option[Int] = sOpt map (_.toInt)
Using map allows us to reduce the body of our parseOptionalInt method from four lines of code down to one line.

The map method on Option has the property that None always maps to None. It only applies the mapping function to Some() values. The flatMap and filter methods have this same behavior. This allows you to chain operations together, and if any of them generate None from their input, the result will be None.

As a more extended example of this chaining, assume we have the following functions available to us, each of which returns None if it can't load the requested item:
//Like Java's System.getProperty
def getSystemProperty(key:String) : Option[String]

//Given a filename, load a .properties file into a Scala Map
def loadPropertyFile(filename:String) : Option[Map[String,String]]
We also take advantage of Map.get, which returns an Option.

Here's what we want to do: read a System property called PROPFILE that is the name of a properties file, load that properties file, read the value of the TIMEOUT property from it and convert it to an integer. If the System property is not set, or the file does not exist, or the property does not exist in the file, then return a default value of 60. Here's the Scala code to do that:
val x = (getSystemProperty("PROPFILE")
         flatMap loadPropertyFile
         flatMap (_.get("TIMEOUT"))
         map (_.toInt)
         getOrElse 60)
Because Option implements map, flatMap and filter, we can also use the for syntax on Options, as with any of the Scala classes that implement those functions.
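
For example, the TIMEOUT lookup above can be rewritten as a for comprehension, using the same hypothetical methods as before:

val x = (for {
  filename <- getSystemProperty("PROPFILE")
  props <- loadPropertyFile(filename)
  timeoutStr <- props.get("TIMEOUT")
} yield timeoutStr.toInt) getOrElse 60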

Legacy Java

Java has a lot of methods that return null or accept null as an argument to indicate a missing value. Scala can easily call these methods directly and use null in this way, but it would be nicer if those Java methods could take and return instances of Option instead.

We could implement an Option class in Java, with subclasses Some and None as in Scala and many of the Option methods such as isEmpty, isDefined, get, and getOrElse. Unfortunately, we can't implement map and filter as elegantly as in Scala because Java does not have function literals, although this may change in a future version of Java.

The other approach is to write wrapper functions in Scala that call the Java functions and translate between null values and None values. There are two halves to this: accepting an Option and passing the contents of that Option or null to Java, and accepting a value that might be null from Java and converting it to an Option that might be None.

I implemented these two conversions by creating an object SomeOrNone with an apply method that creates an Option and an implicit conversion to convert from an Option to a raw value or null.
object SomeOrNone {
  class OptionOrNull[T](x:Option[T]) {
    def getOrNull():T = if (x.isDefined) x.get else null.asInstanceOf[T]
  }
  implicit def optionOrNull[T](x:Option[T]) = new OptionOrNull(x)
  def apply[T](x:T) = if (x==null) None else Some(x)
}
The application includes the following line to pick up the implicit conversion:
import SomeOrNone.optionOrNull
With the implicit conversion in scope, the getOrNull method can be applied to any Option, making it easy to convert from the Option to a value or null when passing an argument to a legacy Java method.

On the return side, SomeOrNone can be applied as a function to the value returned by the Java method in order to get an Option that will be None if the value was null.

With this simple helper class, we can now trivially write the Scala version of getProperty that we mentioned above:
def getProperty(key:String) = SomeOrNone(System.getProperty(key))
As an example of passing in a value, which may be null, that is taken from an Option, here is how we would call the two-argument version of Java's System.getProperty, where the second argument is a default value, which may be null, to be returned if the specified property value is not found:
import SomeOrNone.optionOrNull

def getProperty(key:String, dflt:Option[String]) =
  SomeOrNone(System.getProperty(key, dflt.getOrNull))
We can write wrapper methods as shown above, or we can just use the SomeOrNone and getOrNull functions in-line when calling a legacy Java method from our Scala code. Either way, we need never deal with those pesky null-as-missing-value markers again.

Other Blogs

Here are some other blogs about Scala's Option type or dealing with nulls:
  • David Pollak (2007-04-13) on how Option is used in his Lift web framework, including how Option fits into for comprehensions.
  • Tony Morris (2008-01-16) on using higher-order functions with Option to replace pattern matching.
  • Debasish (2008-03-11) comparing to the Maybe Monad.
  • Daniel Wellman (2008-03-30) on using Option to avoid NullPointerExceptions.
  • Daniel Spiewak (2008-04-07), including a Java implementation of Option. Read the comments for some good info (and links) about Option as a Monad (which Daniel mentions at the start of his article, but he does not discuss that aspect much), and a pointer to another Java implementation.
  • Luke Plant (2008-05-07) on why Haskell Maybe is better than null pointers.
  • Ted Neward (2008-06-27) on Option as a container, with a sidebar comparing it to C# 2.0 nullable types.
  • Stephan Schmidt (2008-08-06) on a hack to implement an Option monad in Java.
  • Tom Adams (2008-08-20) comparing the handling of nulls in Java and Scala.

Wednesday, October 1, 2008

State The Obvious

It is a standard plot device, often used in sitcoms, to have two people talking to each other, each thinking he knows what the conversation is about, while each actually believes the conversation is about something completely different from what the other is thinking. Each goes away thinking he has communicated, and bases his continued activities on that assumption, eventually leading to a comical denouement when the truth comes out.

Unfortunately, if it happens to you, I suspect it would be much less likely to seem funny. I try to avoid getting into this situation by following a simple principle: State The Obvious. Because what is obvious to me may not be obvious to the other person.

I enjoyed a scene from the recent movie Mamma Mia! that took this standard plot device and turned it around: the scene where Harry and Bill are in the boat discussing their recent revelations. As an audience member, I thought I understood that each was talking about discovering that Sophie was (as he thought) his daughter while believing the other was talking about something else - only to realize at the end of the movie that perhaps they really were each talking about what the other thought they were! A nice little two-sided double-entendre conversation.

When I was in college, I heard this anecdote:
A professor is lecturing in a math class, writing derivations on the board as he talks through a proof. At one point, he says, "From here it is obvious that -" but then he turns away from the board, saying "Just a moment." He flips open his notebook and scribbles for a few minutes, then turns back to the proof on the board: "Yes, I was right, it is obvious."
As E.T. Bell said, "'Obvious' is the most dangerous word in mathematics."

"Obvious" is like "intuitive": not necessarily the same for different people. Many years ago I was developing an application using the Athena widgets on X-Windows. I asked one of my co-workers to try it out so that I could get some early usability feedback. At one point he came to a window with a scroll bar. He had quite a bit of difficulty making the scroll bar work, and he was getting rather annoyed. Finally he said, "I don't understand this scroll bar. It's not intuitive. It's not like the Mac." He could have decided to leave off that final part of his complaint, where he mentioned the Mac, assuming I would understand. Fortunately, he stated the obvious. That made it an Aha! moment for me, because I finally understood what people really mean when they say something is intuitive.

Earlier this year I watched a Mythbusters episode in which they tested the question of whether an airplane on a treadmill can take off. I thought to myself, "Well, that's a pretty stupid one, the answer is obvious to anyone who understands any physics." Partway through the episode Jamie made a similar comment, saying it was one of the stupidest tests they had ever done because the answer was obvious. What I found rather curious was that they said this was one of the most hotly debated myths ever discussed on their web site. Why would there be so much debate over something so obvious? I was even more surprised when Jamie said "of course it can", given that my thought had been "of course it can't"!

The discrepancy became clear once they finished the test and their airplane took off: Jamie and I were making different assumptions about the interpretation of the test. Jamie was assuming a very long treadmill, long enough that the plane could still accelerate and move relative to the air, whereas I had been assuming a treadmill similar in size to the plane, which would require the plane to take off essentially from a standstill with respect to the air.

The difference in our interpretations of the problem explained not only why our conclusions were different, but also why there was so much debate: people were not sufficiently explaining their assumptions, probably because they were "obvious", so each side ended up thinking the other side were idiots who did not understand physics.

Although there are some people who like to make fun of being obvious (and I admit I enjoy their humor), we are also reminded in music and other places to state the obvious. It is also culturally important. And while I agree that we should not be required to state the obvious, doing so can sometimes save us some trouble.

I believe that companies should also follow the State The Obvious principle. Here are a couple of statements of company policy that I would hope would be obvious for all companies, each of which I have seen explicitly stated by a different large company:
  • Don't Be Evil.
  • We obey all the laws of the land.
There are many companies that abide by these principles without explicitly stating them, but I can think of at least one large American company with a reputation for violating both of them.

As with the company, I also believe you should adhere to the State The Obvious principle for your own work at the office. For example, if you are working with your peers to select a new vendor, product, or technology, make sure you have explicitly stated and agreed on the requirements or selection criteria, even if they are obvious to you. You may think it is obvious that single-supplier lock-in is a bad thing, but perhaps your peers do not share that position. Once you have stated and agreed on those criteria, you can ask your peers to justify their selections based on how those selections support the agreed-upon criteria - but be prepared to do the same for any selection you propose.

Of course, there is always the possibility that, once you do State The Obvious, the other person will refuse to accept your statement, no matter how much you belabor it. Or, even worse, he may try to change the situation such that your obviously true statement is no longer true, which could have unpleasant consequences for you ("whop").

Actually, this whole discussion should be unnecessary. After all, it's obvious that your life will be better if you follow the principle of State The Obvious. Isn't it?