A business needs to bucket sets of consecutive, ordinal, data for aggregation purposes like age group, salary band, period, life cycle phase, and so on. This is different from grouping non-consecutive, nominal, data into buckets, such as hierarchies.
Modify each boundary name to show the inclusive or exclusive state, e.g. since/until for periods and from/up to for numbers. Boundaries hold real data values or the data's sorting values: e.g. January through December are ordinal if you use numbers 1 through 12 to sort them. Identify the correct data type, and its precision and scale, or grain.
Enforce uniqueness if overlap between ranges cannot be tolerated; this is not easy.
The exclusive upper boundary removes +/-1 calculations, except for ranges the size of a single data point; the grain. This allows data types to change to a finer grain — DECIMAL(6,2) becomes DECIMAL(8,4) — without having to update code or values, because the niggling +/-1 is obsolete[1].
The Range pattern cannot use the ANSI-SQL BETWEEN operator, which would be used to compare a single data point against a [closed-closed] range. You can write out the greater-than-or-equal-to and less-than logic and most RDBMS allow overloaded, deterministic functions as an alternative. Three-valued logic, if needed, would invalidate BETWEEN as well.
As a final consequence, joins between consecutive ranges are straight equi-joins without calculations, so both sides of the join can benefit from indexes. And you can query a continuous chain of ranges with a recursive CTE.[2]
| Principle Trade-offs | ||
|---|---|---|
| Extend your understanding | > | Know your limits |
| Stay relevant | < | Build enduring |
| Do right | = | Share |
- If you like a paper copy better…↑
- See also Common Table Expressions and Hierarchical Query↑