The arity of equality

The symbol “=” is the fulcrum of a balance. It declares that the things on either side of it, whatever they may be, are equivalent, identical, alike, indistinguishable, the same. [*]

The metaphor of the balance implies that exactly two things are being weighed, one against the other; thus equality is usually taken as a pairwise, or binary, relation. The other relational operators, ≠, <, >, ≤ and ≥, get a similar interpretation as comparisons of just two entities. The pairwise constraint is reinforced by the infix notation of mathematics and of many programming languages. When you write x=y or u>v, there’s no convenient place for more than two operands.

The restriction to two arguments no longer seems quite so natural and necessary when you begin thinking in a language with either prefix notation (Lisp) or postfix notation (Forth or Postscript). In Lisp an equality predicate is written (= x y), and there’s nothing to stop you from going on to write (= w x y) or (= w x y z). Furthermore, it seems easy to assign a meaning to such an expression: (= x y z) is true if and only if x, y and z are all equal. Similarly, (< x y z) is true if and only if x, y and z form a strictly increasing sequence.

If we can cope with comparisons of more than two arguments, what about fewer than two? What should we make of an expression such as (= x) or (< y)? How about relational functions with zero arguments, which would be written in Lisp as (=) or (<)? This question arose recently on a mailing list concerned with standards for the Scheme programming language (a dialect of Lisp). The message that initiated the thread was posted last Saturday evening; three days later, the discussion has spawned a couple of satellite threads and there are more than 75 responses.

It’s been a lively exchange, with quite a wide spectrum of views expressed. I can’t do full justice to the dialogue; if you’re at all interested in the issue, please read the original messages. Here I’ll just lay out some of the positions argued:

  • The concept of comparison is intrinsically binary; a comparison should have exactly two arguments and otherwise constitutes an error.
  • Comparisons with more than two arguments make sense, but not those with fewer than two.
  • Comparisons with one argument are okay, and likewise those with two or more arguments, but there’s no need or reason to allow zero arguments.
  • Comparison operators should be able to handle any number of arguments. Those invoked with zero arguments or one argument should return false.
  • Comparison operators should be able to handle any number of arguments. Those invoked with zero arguments or one argument should return true.

What’s the right answer? At the outset, I was leaning toward the ultraliberal, “anything goes” view, but now I’m not so sure.

On first thought, the extension from exactly two arguments to at least two arguments seems like a harmless convenience. After all, even infix notations allow expressions such as x=y=z or 1<2<3. In a straightforward way we could define the more-than-two-argument versions of the relational operators in terms of a strictly binary operator:

(= x y z)(and (= x y) (= y z))

Here we rely on the transitivity of equality to dispense with the third comparison, (= x z). If for some reason you don’t believe in transitivity, life is going to be hard for you. But even for those of us who make the leap of faith about x=z, there are some subtle traps here. For strictly binary comparison operators, we can always rely on a few logical identities:

(≠ x y) = (not (= x y))

(≤ x y) = (not (> x y))

(≥ x y) = (not (< x y))

With more than two arguments, these identities no longer hold. Consider (≤ 1 2 1) and (> 1 2 1), which are both false. In the case of = and ≠, it’s not even entirely clear what the semantics ought to be. Is (≠ 1 2 1) true or false? In other words, should the symbol ≠, when applied to more than two arguments, be read as “not all the same” or as “all different”? If we choose the latter interpretation, then we can no longer trust transitivity. The expansion of (≠ x y z) becomes (and (≠ x y) (≠ y z) (≠ x z)). (Note that the number of pairs to be checked grows quadratically with the number of arguments.)

Looking in the other direction—at comparisons of fewer than two arguments—a first question is why anyone would ever want to try such a thing. What’s the point of comparing just one number? Or no numbers? But the counterargument asks why we should forbid such operations. If an expression such as (< 3) or (=) can be assigned a meaning consistent with the semantics of the rest of the language, why impose an arbitrary limitation?

One way of approaching this issue is by analogy with other procedures or operators. Consider addition. In this case too, infix notation gives the impression that “+” ought to be a binary operator, but in a prefix language there’s no reason to enforce that restriction. An expression such as (+ 3 4 5) has an obvious meaning; it returns the value 12. Moreover, there’s no mathematical offense in declaring that (+ 3) evaluates to 3 or that the zero-argument form, (+), yields 0. This last result is justified by noting that 0 is the identity element of the integers under addition.

This reasoning leads to an admirably simple and spare definition of summation. Suppose we are given a binary addition operator, binary+, and we want to build a generic summation operator, list+, which can be applied to any number of arguments from zero on up. To avoid introducing extraneous details, I’m going to assume that the arguments come already wrapped up in a list, numlist, and that all the elements of this list are guaranteed to be numbers. The definition of list+ looks like this:

(define (list+ numlist)
  (if (null? numlist)
      0
      (binary+ (car numlist)
               (list+ (cdr numlist)))))

The definition says that the sum of a list of numbers is 0 if the list is empty and otherwise is equal to the first element of the list plus the sum of the remaining elements. The simple recursive structure of this procedure depends crucially on applying the same rules to a list of any length, including a length of zero. Trying to impose restrictions on the number of arguments would just clutter up the code with needless tests and exceptions.

It’s easy to see that we can deal with multiplication in exactly the same way, except that the identity element is 1 rather than 0. Greatest-common-divisor and least-common-multiple allow a very similar treatment, with the zero-argument (gcd) yielding 0 and (lcm) returning 1.

The logical connectives and and or also lend themselves to this kind of implementation. Given the primitive operation binary-and, we can define an n-ary version as follows:

(define (list-and booleanlist)
  (if (null? booleanlist)
      #t
      (binary-and (car booleanlist)
                  (list-and (cdr booleanlist)))))

Here the argument is a list of booleans. In the base case, applying and to an empty list returns #t (the Scheme token for true) as an identity element; otherwise we invoke list-and recursively on the tail of the list. For the corresponding or procedure, the base case is #f, or false.

It seems we have a pattern—a general method for converting a binary operator into an n-ary one. But does it always work?

Think about max and min, which select the largest and the smallest of their arguments. It looks as if we could fit them into the framework constructed above for “+” and for and—but there’s a problem. What is the base case for the recursion? There’s no doubt that (max 1 2) is 2, and no one would object to saying that (max 1) is 1, but what numerical value is appropriate for the zero-argument form (max)? Mathematically, I suppose the best answer is negative infinity. Some versions of Scheme include a representation of this number, but they don’t return it as the value of (max), and as a matter of practical programming, I don’t think it would be a good idea for them to do so. In fact, both Scheme and Common Lisp require max and min to have at least one argument. (They apply the same rule to subtraction and division.)

Getting back to the relational operators, it’s interesting to note that Common Lisp treats them the same as max and min: They work for one or more arguments but raise an error if invoked on zero arguments. No rationale for this choice is given in the standards documents. The Scheme standard specifies two or more arguments for the relationals, but apparently some implementations of the language also allow zero arguments or one argument, returning true in those cases.

The two dialects also differ in how they deal with the ambiguity of (≠ x y z). Common Lisp declares that “≠” means “all different,” accepting the oddity that “≠” is not the same as “not =”. Scheme has a slyer way of resolving the conflict: The language simply doesn’t offer a “≠” predicate.

I suppose this whole issue is one that can safely be left to the language lawyers. The arity of equality is not likely to become a major issue in software engineering, and I’d be wary of any program whose behavior depends critically on whether (=) is true or false. What intrigues me is simply that issues like these can remain unresolved and the subject of lively debate in spite of the best efforts of very smart people to find right answers. Lisp is celebrating its 50th birthday, and Scheme has been around for roughly 30 years. The standards that define the languages are not slapdash documents; in particular, their treatment of numbers and arithmetic is unsurpassed in the world of programming languages. But it seems we still have to sweat the details.

This entry was posted in computing.

14 Responses to The arity of equality

  1. Jim Ward says:

    When you say (+ 3 4 5) this is an implicit (+ 0 3 4 5) because + has an identity operator, so (+3) is really 0 + 3 and (+) is 0. Some operators, like max, min, and = don’t have an identity operator.

  2. John Cowan says:

    Common Lisp’s /= operator is particularly annoying because it’s linearithmic (it has to sort its operands to make sure they are all different, although a hash table might be helpful), whereas all the other operators are linear.

  3. Derek Jones says:

    I don’t think it is possible to rely on the identity:

    (≠ x y) = (not (= x y))

    where programming languages are concerned.

    See sentences (number in larger point size in the margin) 577, 602-604, 1197, 1212, 1219 and probably more in
    http://www.coding-guidelines.com/cbook/cbook1_1.pdf

  4. brian says:

    @ Derek Jones: Everything depends on how numbers are represented. With inexact numbers (e.g., floating point), all bets are off. Reasoning about questions like these is pretty much hopeless except in languages that offer some kind of exact number representation. (Most Lisps have exact rationals.)

  5. Aaron says:

    Max and min do have identity elements: The smallest and largest numbers, respectively. These do vary by domain, of course, but so?

  6. Carl Witty says:

    @ Brian: It is certainly possible and sometimes useful to reason about the exact behavior of floating point numbers. Granted, it’s much more difficult and annoying than reasoning about exact numbers, but floating point is important; we shouldn’t just give up on understanding/reasoning about it.

    @ Aaron: For the answer of “What should (max) return”, you have to decide without picking a domain (because the function doesn’t know what context it’s being called in). Also, we usually pretend that the bignums don’t actually have a smallest or largest element (but not always; see http://jwz.livejournal.com/854482.html).

    @ Brian (again): I still wish these comments had a “preview” button :)

  7. Mike Stay says:

    Congratulations! You’ve rediscovered the idea of initial and terminal objects in a category.

    If your category has at most one morphism between any two objects, then you’ve got a poset (partially ordered set), and we say that A is less than or equal to B if there’s an arrow from A to B. If for any pair of objects A, B it’s true that either A

  8. Mike Stay says:

    Well, crap. There were seven more paragraphs full of examples and illustrations that got cut, with no recovery! A warning would have been nice. Here’s the terse version:

  9. Mike Stay says:

    - poset=partially ordered set
    - toset=totally ordered set
    - morphism A->B in a poset is A less-or-eq B or A divides B or A subset B etc.
    - initial object is = every object in a poset
    - product x is intersection (gcd) in a poset, min in a toset
    - coproduct + is union (lcm) in a poset, max in a toset
    - in a distributive category, x distributes over +
    - also in a d.c. the product of all objects is initial and the coproduct over all objects is terminal.
    - so gcd of all integers is 1, lcm of all integers is 0

  10. Mike Stay says:

    That should have been
    - initial object is less-or-eq <= every object
    - terminal object is gt-or-eq >= every object

    (Darn parser thought it was a tag…)

  11. Mike Stay says:

    And I meant distributive poset, not distributive category: clearly the disjoint union of all sets is not the one-element set.

  12. brian says:

    My apologies for the primitive comment interface. I’m embarrassed by it, enough so that I have marked my calendar to attempt a WordPress update this coming weekend. (I’m not sure the update is all that I need, but it’s a necessary first step.)

  13. brian says:

    Further to Aaron’s and Carl Witty’s comments on (max):

    The last time this came up for me in a real program, I was applying max to a list of numbers representing the pairwise distances between points in a plane. In this context there is a readily available least element: 0, or perhaps 0.0. But even if I had a version of max set up so that (max) ==> 0.0, I wouldn’t be happy with it. Because then I’d have to check every 0.0 result to see if it came from (max) or from (max 0.0 0.0 …).

    Of course this is an instance of a more widespread problem of distinguishing exception or error signals from normal results, but it raises the question of whether the at-least-one-argument approach might not have some practical justification, even if there’s no good, principled rationale for it.

  14. Rob Mayoff says:

    In Mathematica, Min[] returns Infinity and Max[] returns -Infinity.