26
$\begingroup$

An integer range type like 0..100 means that a value can be any integer in that range. Arithmetic operations on range types produce other range types, for example if a: 10..20 and b: 5..7 then a + b: 15..27. (Each range here includes both endpoints, for the sake of the example.) To be clear, here 0..100 is a type, not a value; the values are integers in that range.

Checking if a number is in a range can be treated as a fallible type conversion, e.g. converting x to type 0..100 could return an optional value which is None if x is not in that range. Alternatively, it can be treated as a type predicate, so that if(x in 0..100) { ... } narrows the type of x within the conditional branch.

Range types are used in a few languages, notably Ada (which only has integer types with defined ranges). They have some benefits, such as eliding bounds-checking for arrays of known length, avoiding overflows by using wide enough integers for any operation, and preventing division-by-zero errors. They also seem relatively easy to support, unlike dependent types which are more powerful and can represent types like 0..n where n isn't a compile-time constant. So what are the reasons why most statically-typed languages don't support integer range types?

$\endgroup$
15
  • 2
    $\begingroup$ @BoppreH All languages with fixed-width integers require programs to make such assumptions, just with quite wide ranges like 0..4294967295, but they don't provide the advantages of letting the user specify that range arbitrarily, and they don't prevent overflows or arithmetic errors. That said, the main use-cases are for integers which are logically bounded (like an index into a constant array), not integer values which theoretically ought to be unbounded but must be bounded for practical reasons (like a counter for the number of users). $\endgroup$
    – kaya3
    Commented Jun 30, 2023 at 15:36
  • 2
    $\begingroup$ That said, the answer might well be "because not enough people need integer range types" and that wouldn't be too surprising, but I'd like to know if there are other reasons, or e.g. if there's some empirical evidence showing that few programs would benefit from them. The kind of use-cases I'm thinking of are like grid[x + 9 * y] where grid is an array of length 81 in a sudoku program, where bounds-checking should be unnecessary because x and y are of type 0..8. $\endgroup$
    – kaya3
    Commented Jun 30, 2023 at 15:39
  • 1
    $\begingroup$ How consistent are semantics for range types in different languages? You mentioned the case of Ada where (T1: a..b) + (T2: c..d) == (T3: (a+c)..(b+d)), but does every user of a range type want that behaviour instead of bound-checking that it stays within T1 or T2? In a template language like C++ it's fairly easy for the user to implement those kinds of types and you don't risk imposing unwanted semantics. $\endgroup$
    – kouta-kun
    Commented Jun 30, 2023 at 15:45
  • 1
    $\begingroup$ @kouta-kun If the user wants arithmetic to stay within a..b, but the arithmetic expression they write doesn't logically have a result that is necessarily in that range, then they need to do a bounds check before they can assign that value back to a variable of type a..b. So the semantics of (a+c)..(b+d) also allow the alternative semantics in a straightforward way, and both ways require the user to handle the condition when the arithmetic result is out of bounds. That said, if there are with languages with more restrictive range types like you describe, answers could mention them. $\endgroup$
    – kaya3
    Commented Jun 30, 2023 at 15:49
  • 2
    $\begingroup$ @Seggan-OnStrike As far as I know, Rust does not have integer range types; it does have integer range values, i.e. there is a Range type in the standard library for representing ranges as values, and a specific syntax for constructing Range values. So in Rust 0..100 is a value but it is not a type. $\endgroup$
    – kaya3
    Commented Jun 30, 2023 at 17:10

6 Answers 6

12
$\begingroup$

The most popular reason is simply that having additional types is more work for the authors of compilers and libraries. Every feature starts with -100 points, and only gets implemented when there's a compelling reason to do so.

In low-level languages like C, automatic bounds-checking was deliberately avoided because it was seen as inefficient. Better to let a program crash with a segfault or divide-by-zero error than to waste precious CPU instructions validating conditions that the programmer should have already explicitly checked.

If you do implement range types, you'll have to figure out the semantics of doing arithmetic on them. So, using your example of a: 10..20 and b: 5..7, you'd need:

  • a + b: 15..27
  • a - b: 3..15
  • a * b: 50..140
  • a / b: 1..4

What happens if you define c : 1..100 and then write c = a * b;? Do you give a compile-time error? Throw an exception at runtime if a * b > 100?

Moreover, a lot of real-world values just don't have well-defined bounds. For example, if you're writing a Date class based on the Gregorian calendar, defining month: 1..12 and day: 1..31 seems obvious, but what bounds do you put on year? Do you start with 0, 1, 1601, 1753, 1900, 1970, 2000, or -3760? Do you end with 2038, 2099, 2999, or 9999? You'd have to make an arbitrary decision.

It's just easier to use plain int.

$\endgroup$
6
  • 2
    $\begingroup$ "Do you give a compile-time error?" ─ Yes, specifically a type error because the type of the expression a * b does not match the type of the variable you want to assign it to. If you want the value of a * b without having to check that the result is in range, you can just let the compiler infer what range the result is in, or you can declare c as an unbounded integer. The last paragraph of your answer seems to assume that range types and plain int cannot coexist, but support for range types doesn't mean you can't also have an int type. $\endgroup$
    – kaya3
    Commented Jun 30, 2023 at 17:15
  • $\begingroup$ What happens with c = a * b;? Of course it's a compile-time error. The meaning of a: 10..20 is that a can take the value 20, and so forth, so you'll have overflow for some values, and the compiler is telling you so. $\endgroup$
    – Pablo H
    Commented Jun 30, 2023 at 17:15
  • 1
    $\begingroup$ Also, plain int is an inter ranged type, as OP said. The compiler just doesn't flag potential overflows (except in some case where it has static/literal values). Just check out how many commits FFmpeg has for code with overflow errors detected by fuzzer... $\endgroup$
    – Pablo H
    Commented Jun 30, 2023 at 17:28
  • $\begingroup$ Plain int isn't an integer range type and I haven't said that it is; what I did say is that plain int types still require the programmer to make assumptions about what range their values will be in, due to overflow, unless the plain int type in your language is a bigint. The thing that makes int32 semantically different to -2147483648..2147483647 is that int32 arithmetic will overflow to stay in bounds, whereas arithmetic on range-typed integers will produce results of a different range type. $\endgroup$
    – kaya3
    Commented Jun 30, 2023 at 17:42
  • 1
    $\begingroup$ Also, keep in mind that segfaults and processes that can be crashed from and the like are fancy-schmancy OS facilities. When you're doing really low-level programming, you don't have those! $\endgroup$
    – hegel5000
    Commented Aug 15, 2023 at 15:19
8
$\begingroup$

In addition to the other fine answers, I'll note that many languages outsource their underlying type system to a "lowest common denominator" type system provided by a runtime, and then layer some small additional features on top of that. The languages of the .NET runtime all have basically the same type system with minor variations; similarly for the JVM languages, and so on.

This has pros and cons for the language designer. On the pro side, if you like the type system then you save on time and effort designing one. On the con side, we spent a fair amount of time on the C# team asking "why does the verifier not like this typesafe-in-C# program?" and discovering that there was some subtle rule of the runtime's type verifier that we were violating.

But the big benefit for a platforms company like Microsoft is interoperability between languages: that you can write libraries in one language and use them in any compatible language.

That then leads to the big cost: what happens when two or more languages disagree on the design for a type not represented in the underlying type system, or represented in a way that doesn't meet all the needs of a particular language? We saw this play out between C# and F# over some decades as the designs for various kinds of tuple types evolved.

If the designers of C# wanted to add ranged integers -- and it's an entirely reasonable suggestion on the face of it -- one of the first pushbacks they'd give you is "will adding this feature to C# create work for the runtime team, the F# team, or anyone else who provides a language in this ecosystem?"

Creating work for others is a lot of points against a feature, particularly when there are potential new features that produce benefits for maintainers of other languages in the ecosystem, rather than costs.

$\endgroup$
4
$\begingroup$

TL;DR: Integer Ranges are a poor, niche feature, subsumed by (Constant) Generics.

Primitive Obsession

Many developers, especially junior developers, tend to underuse static type systems, instead using int or string as the type for every single "scalar". At the extreme, this is known as Primitive Obsession: only deviating from primitive types when forced to.

Underusing the type system means that the compiler cannot assist you. For example, if a function takes 3 parameters: Year (Int), Month (Int), and Day (Int) and the caller passes Day/Month/Year... there will be no compile-time error pointing out their mistakes.

Unfortunately, Integer Ranges lead the user astray towards more Primitive Obsession.

The very example of using 1..12 for a Month, while well-meaning, demonstrates this very issue: 1..12 is also the exact range for the Hour hand of a clock... even though both have different semantics from one another.

Narrow Scope

There are many invariants which may make sense, and the bounds of a range are only one such invariant. It may make sense to restrict an integral to be either a power of 2, a power of 10, a prime, an odd or even number, etc...

Integer Ranges bless the range as a special invariant, and completely (and dubiously) ignore all others.

(Constant) Generics

Constant generics -- the ability to parameterize by constants -- allow to implement Integer Ranges as library, as well as an infinite number of invariants: the user imagination is the limit.

Hence, in any language with support for Constant Generic, Integer Ranges are syntactic sugar for what could be a library feature...

Examples:

Additional Integration Work

Still, why not?

One of the difficulty with any language feature is that it rarely is completely orthogonal from all other features.

In particular, meta-programming -- be it via macros, templates, generics, mix-ins, ... -- is generally all-encompassing, interacting with every single other feature.

As an example, Checked Exceptions and Template/Generics are a notorious example of poor integration, both in Java and C++, as there are no good facilities to manipulate the checked exception list. This results in Stream.map having no exception specification, when it should transparently propagate them.

This means that adding Integer Ranges in a language which supports, or plans to support, meta-programming requires extra work to properly integrate the two features.

$\endgroup$
2
  • $\begingroup$ Could you show an example of how you would use constant generics (e.g. in Rust) to implement range types where a + b would have type 15..27 as in the question? $\endgroup$
    – kaya3
    Commented Jul 24, 2023 at 13:34
  • $\begingroup$ @kaya3-supportthestrike: Can I use C++ instead? Rust Const Generics story is still a wee bit underwhelming ;) $\endgroup$ Commented Jul 24, 2023 at 14:09
2
$\begingroup$

Generics or templates are relatively new. Many languages with generics don't have this concept when the language was firstly created. And it would be too late for them to include range types as a basic feature later.

Without generics, and mechanisms to read information from types, specifically the type parameters of a generic, there is little to no observable effects of a variable being in the range type instead of a full integer. Range checking could be a thing, but that tends to be optimized out even in languages with range types. And without generics, it may not be easy for a programmer to specify the exact behavior after failing a range check.

Notably in Pascal-like languages, including Ada, range types are used to specify array dimensions. I'd say it's a very reasonable design. But the problem is we use arrays with dynamic sizes much more in modern programming, and won't really care that much about the few cases that still need fixed size arrays. The benefits couldn't apply easily in arrays with dynamic sizes.

$\endgroup$
2
$\begingroup$

It seems like ranges might solve problems like out-of-bounds checks.

def f(array, index): return array[index]

I could add a type for array that specifies the size of the array, but the safety of this function depends on the relationship between array's range of valid indices and index's value.

Types are about a value, but contracts are about relationships between values, so I think a contract system that incorporates range arithmetic would be more useful than merely range types.

I believe Wuffs uses ranges and contracts because it is a designed for writing codecs and codecs often have to deal with binary files with layout determined by the file format specification.

$\endgroup$
2
  • 1
    $\begingroup$ Welcome! In a language like C++ with templates that are expanded before type-checking, you would be able to write this function generically, with a parameter for the length of the array. In that case N would be a fixed constant when the template gets expanded, and then the range type 0..N would be expressible. $\endgroup$
    – kaya3
    Commented Jul 5, 2023 at 0:46
  • $\begingroup$ Yeah. Compile-time metaprogramming often lets you do that if you know statically the length of the array. Contract systems let you push the responsibility to the caller: don't call me unless you checked the bounds or have a contract that pushes responsibility yet farther up. $\endgroup$ Commented Jul 5, 2023 at 0:52
1
$\begingroup$

Among the true and potentially useful statements one could make describing aspects of a program's execution, the fraction that can be easily statically proven is often only a small portion of the fraction that can be statically proven at all. That in turn is only a small fraction of the portion which would be true for all inputs a program could possibly receive, which is often in turn only a small fraction of the portion that would be true for all inputs that a program would be expected to process usefully.

Range types may be useful in some situations where one wants to be able to prove things that are, in fact, easy to prove statically. For example, if a program reads in six four-digit decimal integers and adds them all together, a compiler that understands the process of reading four-digit decimal numbers could fairly easily statically prove that the result would fit in a sixteen-bit unsigned integer.

If a language had range types with the semantics that trapping program execution if a range was exceeded, or performing arithmetically-correct computations without regard for the range, would both be equally acceptable, then range types could improve efficiency by letting compilers judge the cost of doing a range check versus having to support computations with larger numbers, and select whichever course of action would be more efficient. Unfortunately, compiler philosophy today seems more focused on "anything can happen" UB which doesn't satisfy any program requirements than on trying to generate the most efficient machine-code programs that actually accomplish what needs to be done.

$\endgroup$

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged .