I’ve mentioned previously how disappointing most type systems are. OK, pretty much all of them.
Why do they prevent truly strict typing? Why are they so loosey-goosey? I mean, seriously, it’s more than two decades into the third millennium — shouldn’t we be able to specify precisely what we expect?
Union types the easy way
I’m no expert on type theory, so maybe I’m missing some key limitation of type systems, but I don’t see why we can’t create very specific types.
For example, take unions. Yes, we can create union types of various strings.
But the first question that comes to mind is this: Why must I create this long list of types which change occasionally? Isn’t there a single source of truth for which protocols can be used in a URL? There ought to be.
So let’s pretend that there is a simple list at, say, truth.net/url/protocols
. Why couldn’t I do something like this?
That would be nice. We could create union types easily for many things. For example, every possible US Zip Code+, e.g., 60615-1234. The possibilities are endless, and the type would be both exact and up to date. Or maybe we could version it so that we could lock in a version and then upgrade when we had time to fix the issues.
Heck, maybe some of these types should be baked in to the type system, so common are they.
Is there even a way to type the ZipCode with a regular expression so we know that it is exactly five digits with an optional hyphen and exactly four more digits? Not that I’m aware of.
String types via configuration objects
We could also allow specification of minimum and maximum lengths of strings (or arrays, etc.) as well as allowing or disallowing characters or even specifying a regular expression that the string must match.
I know the idea of pattern matching has at least been discussed. What are we waiting for?
If only I could do this:
Strings from patterns
When I specify a string, why can’t I bind it to a specific pattern with optional parts and alternative values?
OK, we can do some of this using template strings or guard functions (ack!) or other very tricky and complex hacks. But why? Why not just a string type builder?
Please guard me from this
I swiped this example from a StackOverflow post showing how simple it is (um, no) to create a string type with a minimum and maximum length. All you have to do is this:
Easy peasy, right? Although if I’d have written this, I’d have put the min and max first in a curried function so I could do this:
const stringOfLength = (min, max) => (input) => { ... }
const stringOfOneToTenChars = stringOfLength(1, 10)
stringOfOneToTenChars("hello") // type checks
stringOfOneToTenChars("") // fails type check
But check out that last bit:
const a: StringOfLength<0, 10> = ‘hello’ // FAILS TYPE CHECK!
Hello indeed! Isn’t that exactly what I wanted to do? But I get an error! Why?
Oh, but crap! I also note that the max and min are of type number
. What happens if I do this:
stringOfLength('hello', 1, 4.9999999999999999999999999999999999)
Oh, look! It type checks! WTF? Which brings up this (for TypeScript): why is it that I still can’t type something as an integer? By golly, that seems kind of important.
Why do we keep bending over and saying, well, I guess this is all I deserve? When are we going to demand better from the JS/TS folks? But no, we just take whatever we’re given and say “thank you, sir, may I have another?” Ugh.
Here’s what I want:
type StringOfOneToTenChars = string<1,10>
const greeter: StringOfOneToTenChars = "Boy howdy!" // type checks
const failure: StringOfOneToTenChars = "" // fails type check
// And this:
type StringOfFiveToEightChars = string<5,7.99> // ERROR!
I don’t care about the specific syntax, just keep it simple, consistent, and reasonably terse. And it sure wouldn’t break my heart if those mins and maxs could be set dynamically.
And, frankly, guard functions suck. I want my type system to work at both compile time and runtime. Having to write complex guard functions defeats the whole purpose, and makes my code needlessly complex. I mean, seriously? That should all be handled in the interpreter. Why ship all that code over and over again?
I guess the academic fan boys (and girls) love this shit. Hey, let’s make everything ultra-complex so only the “illuminati” can grok it. Screw the poor coders who have to try to make sense out of this mess.
Here’s a hint, boys and girls: when you’re writing way more type code than program code, something is truly messed up.
Idris leads the way
The only programming language I’ve seen (there are probably others) that attempts to implement this feature is Idris. Unfortunately, like Haskell, it is a language of, by, and for academics, hence pretty damn difficult for the lowly coder to grok.
From the introduction to the Idris documentation, we find this example:
Dependent types allow types to “depend” on values — in other words, types are a first class language construct and can be manipulated like any other value. The standard example is the type of lists of a given length,
Vect n a
, wherea
is the element type andn
is the length of the list and can be an arbitrary term. [emphasis mine]When types can contain values, and where those values describe properties, for example the length of a list, the type of a function can begin to describe its own properties. Take for example the concatenation of two lists. This operation has the property that the resulting list’s length is the sum of the lengths of the two input lists. We can therefore give the following type to the
app
function, which concatenates vectors:
app : Vect n a -> Vect m a -> Vect (n + m) a
Fucking brilliant. How would this work in something like TypeScript? How about this (I’ve switched to strings as arrays are more complicated):
function concatenate (a: string<m>, b: string<n>): string<m + n> { ... }
So simple. And so powerful. This says if I concatenate a string of length m with a string of length n, then the string returned should be of length m + n. Duh.
We can has kewl types?
Why not move the type declaration into a comment (where it belongs) and make it optional (let’s pretend that //+
marks a type annotation rather than a comment):
//+ (string<m>, string<n>) => string<m + n>
function concatenate (a, b) { return `${a}${b}` }
const show = "Show" // length 4
const me = "Me" // length 2
concatenate(show, me) // "ShowMe" (length 6 = 4 + 2)
Ideally, this would fail to compile:
//+ (string<m>, string<n>) => string<m + n>
function concatenate (a, b) { return `${a} ${b}` } // OOPS! (note space)
Again, any reasonable syntax is OK with me. I’m equally fine with Idris’s Haskell-style syntax (and what a joy if all JS functions were automatically curried!):
//+ string m -> string n -> string (m + n)
function concatenate (a, b) { return `${a}${b}` }
Now that the type declaration is in a comment, anything goes. We could have type functions such as reverse
that reverse ordering, so:
//+ string -> reverse(string)
function reverse (s) { return s.split('').reverse().join('') }
reverse("Bob's yer uncle") // "elcnu rey s'boB"
Idris does something similar with type functions and where
clauses:
Functions can also be defined locally using
where
clauses. For example, to define a function which reverses a list, we can use an auxiliary function which accumulates the new, reversed list, and which does not need to be visible globally:
reverse : List a -> List a
reverse xs = revAcc [] xs where
revAcc : List a -> List a -> List a
revAcc acc [] = acc
revAcc acc (x :: xs) = revAcc (x :: acc) xs
Ha, ha. Those academy types! Totally lost me there, but I get the gist of it.
The point is, our current type systems are woefully inadequate to the task. We should be able to specify the data shape precisely. And we need the type system to be available at runtime, which is when most of the data with which we’re going to work shows up.
Is that really asking too much?