Consequences of inability to add natural numbers in C

Tag: haskell , types , language-design , turing-complete Author: yuanqinglin Date: 2011-11-15

In System F I can define the genuine total addition function using Church numerals.
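As a concrete sketch of what this means, here is the Church-numeral encoding transcribed into Haskell's System-F-like fragment (the names `Church`, `czero`, `csucc`, `cadd`, and `toInt` are my own illustrative choices, not from the question). Within the recursion-free, `undefined`-free fragment this addition is total, mirroring the System F definition; in full Haskell, of course, bottom still inhabits the `Church` type, which is exactly the questioner's point:

```haskell
{-# LANGUAGE RankNTypes #-}

-- A Church numeral applies a "successor" function n times to a "zero".
type Church = forall a. (a -> a) -> a -> a

czero :: Church
czero _ z = z

csucc :: Church -> Church
csucc n s z = s (n s z)

-- Addition: apply the successor x times on top of y.
cadd :: Church -> Church -> Church
cadd x y s z = x s (y s z)

-- Convert to a machine integer for inspection.
toInt :: Church -> Int
toInt n = n (+ 1) 0

main :: IO ()
main = print (toInt (cadd (csucc (csucc czero)) (csucc czero)))  -- prints 3
```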

In Haskell I cannot define that function because of the bottom value. For example, in Haskell, if x + y = x, I cannot conclude that y is zero: if x is bottom, then x + y = x for any y. So addition is not true addition but an approximation of it.
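A minimal sketch of this failure of cancellation, using my own lazy-naturals encoding (the names `Nat`, `plus`, `inf`, and `observe` are illustrative, not from the question). Every finite observation of `plus inf y` agrees with `inf` regardless of `y`, so x + y = x does not let us conclude y = 0:

```haskell
data Nat = Z | S Nat

plus :: Nat -> Nat -> Nat
plus Z     y = y
plus (S x) y = S (plus x y)

-- An infinite (partial) natural number: inf = S (S (S ...))
inf :: Nat
inf = S inf

-- Observe only the first k constructors, so we can compare lazily
-- without diverging on infinite values.
observe :: Int -> Nat -> String
observe 0 _     = "..."
observe _ Z     = "Z"
observe k (S n) = 'S' : observe (k - 1) n

main :: IO ()
main = do
  putStrLn (observe 5 (plus inf (S (S Z))))  -- prints SSSSS...
  putStrLn (observe 5 inf)                   -- prints SSSSS...
```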

In C I cannot define that function because the C specification requires everything to have a finite size. So in C the possible approximations are even worse than in Haskell.

So we have:

In System F it's possible to define addition, but it's not possible to have a complete implementation (because there is no infinite hardware).

In Haskell it's not possible to define addition (because of bottom), and it's not possible to have a complete implementation either.

In C it's not possible to define the total addition function (because the semantics of everything is bounded), but compliant implementations are possible.

So all 3 formal systems (Haskell, System F and C) seem to have different design tradeoffs.

So what are the consequences of choosing one over another?

What do you mean by "C specification requires everything to have finite size" if not "there is no infinite hardware"? C does limit dynamically allocated chunks of memory to fewer than pow(2, 8 * sizeof(size_t)) bytes, but you can just chain those together if you have hardware with enough memory.
Is System F Turing complete? If so I don't think you can avoid the x + y = x problem because y can be infinitely recursive.
@Owen: System F is not Turing complete; there are many ways to encode some complex recursive programs, but it's still strongly normalizing
@acfoltzer in that case I would say that's one of the big tradeoffs ;)
@MikeSamuel I can't chain those indefinitely because the pointer size is fixed. So I cannot represent an arbitrary integer in C as an arbitrary-length linked list; the semantics always impose a limit. In contrast, the semantics of System F or Haskell don't specify any limits.

Best Answer


This is a strange problem because you're working with a vague notion of =. _|_ = _|_ only "holds" (and even then you should really use ⊑) at the domain semantic level. If we distinguish between information available at the domain semantic level and equality in the language itself, then it's perfectly correct to say that True ⊑ x + y == x --> True ⊑ y == 0.

It's not addition that's the problem, and it's not natural numbers that are the problem either -- the issue is simply distinguishing between equality in the language and statements about equality or information in the semantics of the language. Absent the issue of bottoms, we can typically reason about Haskell using naive equational logic. With bottoms, we can still use equational reasoning -- we just have to be more sophisticated with our equations.

A fuller and clearer exposition of the relationship between total languages and the partial languages defined by lifting them is given in the excellent paper "Fast and Loose Reasoning is Morally Correct".


You claim that C requires everything (including addressable space) to have a finite size, and therefore that C semantics "impose a limit" on the size of representable naturals. Not really. The C99 standard says the following: "Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined. If the result cannot be represented in the integer type, the behavior is undefined. The result need not be in the range of values of any integer type." The rationale document further emphasizes that "C has now been implemented on a wide range of architectures. While some of these architectures feature uniform pointers which are the size of some integer type, maximally portable code cannot assume any necessary correspondence between different pointer types and the integer types. On some implementations, pointers can even be wider than any integer type."

As you can see, there's explicitly no assumption that pointers must be of a finite size.


I'm only interested in the semantics of the language
@nponeccop: My point is that semantically, addition in Haskell is still "true addition" -- it's just that it is defined over a lifted domain. If that's not what you're looking for, then your question is really just about the tradeoffs of total functional programming. On which see Turner (2004) and the papers which cite it, and see, e.g., this LtU discussion.
I bet mathematicians will laugh in your face when you say that finite ring arithmetic is the same as Peano arithmetic and that addition in finite rings is just addition defined over a finite domain. My question was about the relative ease of reasoning when working in N, lifted N, and finite rings or other sensible finite "approximations".
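For a concrete sense of how a finite-ring approximation differs from Peano addition, here is a small Haskell sketch using `Word8` (arithmetic modulo 256). Cancellation x + y = x ⇒ y = 0 actually does hold there (the additive structure is a group), but wraparound breaks Peano-style laws such as x + y ≥ x:

```haskell
import Data.Word (Word8)

main :: IO ()
main = do
  print (255 + 1 :: Word8)            -- prints 0: addition wraps around
  print ((200 + 100 :: Word8) > 200)  -- prints False: x + y can be < x
```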

Other Answer1

You have a set of theories as frameworks to do your reasoning with; finite reality, Haskell semantics, and System F are just some of them.

You can choose an appropriate theory for your work, build a new theory from scratch, or assemble one from big pieces of existing theories. For example, you can consider the set of always-terminating Haskell programs and safely employ bottom-free semantics. In this case your addition will be correct.

For a low-level language there may be reasons to build finiteness in, but for a high-level language it is worth omitting such things, because more abstract theories allow wider application.

While programming, you use not a "language specification" theory but a "language specification + implementation limitations" theory, so there is no difference between the case where memory limits appear in the language specification and the case where they appear in the language implementation. The absence of limits becomes important when you start building purely theoretical constructions within the framework of the language semantics. For example, you may want to prove some program equivalences or language translations, and find that every unneeded detail in the language specification brings much pain to the proof.

Other Answer2

I'm sure you've heard the aphorism that "in theory there is no difference between theory and practice, but in practice there is."

In this case, in theory there are differences, but all of these systems deal with the same finite amount of addressable memory so in practice there is no difference.


Assuming you can represent a natural number in any of these systems, you can represent addition in any of them. If the constraints you are concerned about prevent you from representing a natural number, then you can't represent addition on naturals either.

Represent a natural number as a pair: a heuristic lower bound on the maximum bit size, and a lazily evaluated list of bits.

In the lambda calculus, you can represent the list as a function that, called with true, returns the 1's bit and, called with false, returns a function that does the same for the 2's bit, and so on.

Addition is then an operation applied to the zip of those two lazy lists that propagates a carry bit.

You of course have to represent the maximum-bit-size heuristic as a natural number itself, but if you only instantiate numbers whose bit count is strictly smaller than the number being represented, and your operators preserve that invariant, then the bit size is inductively a smaller problem than the numbers you want to manipulate, so operations terminate.
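The lazy bit-list representation and its carry-propagating addition can be sketched in Haskell as follows (the names `Bits`, `addBits`, `toBits`, and `fromBits` are mine; I use a plain lazy list rather than the lambda-calculus function encoding, and I omit the bit-size heuristic for brevity):

```haskell
-- A lazily evaluated, little-endian list of bits (least significant first).
type Bits = [Bool]

-- Ripple-carry addition over the zip of two lazy bit lists.
addBits :: Bits -> Bits -> Bits
addBits = go False
  where
    go c []     []     = if c then [True] else []
    go c xs     []     = go c xs [False]          -- pad the shorter list
    go c []     ys     = go c [False] ys
    go c (x:xs) (y:ys) =
      let s     = x /= y /= c                     -- sum bit: x xor y xor carry
          carry = (x && y) || (c && (x /= y))     -- majority of x, y, carry
      in s : go carry xs ys

toBits :: Integer -> Bits
toBits 0 = []
toBits n = odd n : toBits (n `div` 2)

fromBits :: Bits -> Integer
fromBits = foldr (\b acc -> (if b then 1 else 0) + 2 * acc) 0

main :: IO ()
main = print (fromBits (addBits (toBits 25) (toBits 17)))  -- prints 42
```

Because the result list is produced lazily, consumers that inspect only a prefix of the sum force only as much carry propagation as they need.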

On the ease of accounting for edge cases, C will give you very little help. You can return special values to represent overflow/underflow, and even try to make them infectious (like IEEE 754 NaN), but you won't get complaints at compile time if you fail to check. You could try to trap problems with a signal handler, overloading SIGFPE or something similar.
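By contrast, a richer type system can make the overflow case impossible to ignore. A minimal Haskell sketch (the name `checkedAdd` is my own, not a standard library function): the result type forces every caller to handle the overflow case explicitly, which is exactly the compile-time complaint C won't give you.

```haskell
import Data.Word (Word8)

-- Checked addition: overflow is an explicit Nothing that the caller
-- must pattern-match on; there is no silent wraparound.
checkedAdd :: Word8 -> Word8 -> Maybe Word8
checkedAdd x y
  | y <= maxBound - x = Just (x + y)
  | otherwise         = Nothing

main :: IO ()
main = do
  print (checkedAdd 100 100)  -- prints Just 200
  print (checkedAdd 200 100)  -- prints Nothing
```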

I cannot say that y is zero - if x is bottom, x + y = x for any y.

If you're looking to do symbolic manipulation, Matlab and Mathematica are implemented in C and C-like languages. That said, Python has a well-optimized bigint implementation that is used for all integer types. It's probably not suitable for representing really, really large numbers though.


The question is not about computational power - it's mostly about the ease of accounting for edge cases during programming.
@nponeccop, I'm still unsure what you're asking for. Please see my edit.