Static typing is not at all bad; usual static typing mechanisms are, due to the fact that most of them force you into designing your types before you even have a working prototype.
(Whether you should first understand your problem before writing any code, or instead use code to help you understand the problem and then write more code, is up for debate, of course. Typing is an invaluable tool of thought to help you understand your problem, yes, but it is just one more tool in the toolbox.)
The sad thing I noticed, though, having hacked on a mostly untyped Python code base during the last few days, is that making sure all the typing annotations you add to the code base are sound is a big PITA, to put it bluntly, and I would much prefer to work with a statically typed language in this particular case. If your typing discipline is decoupled from the ability to run code, people simply remove that obstacle from their way. It ends up being treated pretty much like tests and documentation: indispensable in theory, relegated to second place in practice.
So my impression is that gradual typing almost got it right, but it may do a great disservice to overall code quality as well. I am now inclined to think that some kind of barbell strategy on typing will yield better results: use both a completely untyped dynamic language such as Tcl, where everything is a string, and a static, strongly typed programming language (ideally a proof assistant with dependent types). You prototype with the former and move into production after translating your solution to the latter. If you ever need to push untyped code into production, it will be obvious to everyone involved and, since it is so decoupled, it cannot accidentally degrade the quality of the statically typed codebase.
> Static typing is not at all bad; usual static typing mechanisms are, due to the fact that most of them force you into designing your types before you even have a working prototype.
> Typing is an invaluable tool of thought to help you understand your problem, yes, but it is just one more tool in the toolbox
For me those two sentences contradict each other. Types force you to draw an outline of your solution. For any non-trivial problem this will be a time saver, even in the prototype phase.
I can't see how they do, sorry. In fact, they complement each other. The usual static typing mechanisms I mentioned force you to approach a problem with their specific mindset, very much like math problems that add unreasonable constraints to their statements in order to check whether you grasped a specific concept really well. In a real-world scenario you should be able to reason your way out of the problem without such artificial constraints. So types should not be forced on anybody who can accurately explain a concept or implement a program without them.
> use both a completely untyped dynamic language such as Tcl, where everything is a string, and a static, strongly typed programming language (ideally a proof assistant with dependent types).
This is gradual typing. The real problem with gradual typing is that you largely forgo the performance advantages of static typing, since you spend a lot of compute time translating data in and out of "dynamically-typed" (i.e. tagged/runtime dispatched) representations. This is especially obvious in stringly-typed languages ala Tcl, but applies elsewhere as well.
I don’t think so. My hope here is that, by having two orthogonal code bases in extremely different languages, there would be no chance of typed and untyped code intertwining and thereby compromising typing accuracy. I regard this possibility as a serious problem with gradually typed languages, one of those that risks being addressed by lengthy software engineering books, the kind cited as mandatory knowledge in every single programmer job interview by the year 2039. But I digress.
“Abrupt typing” would describe this approach more precisely, if there were such jargon. A program module has no types while it is a prototype, but it must be completely typed (which is just another term for “having its correctness proved with the best tools available in this day and age”) before it goes into production.
Static typing isn't bad, but it's only as good as the type system is strong and detailed. Haskell's QuickCheck doesn't understand biases or values of infinite types. Hypothesis's approach of explicit Strategies (as in the design pattern) is far more useful for producing serious test suites, since less time has to be spent fretting about which invariants can be hoisted into the type system.
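For readers unfamiliar with the Hypothesis style, here is a minimal sketch of what passing strategies explicitly looks like (the property and names are invented for illustration, not taken from the article):

```python
# A Hypothesis property: the generator ("strategy") is passed explicitly,
# rather than being derived from the argument's type as in QuickCheck.
from hypothesis import given, strategies as st

# Non-empty lists of finite floats, stated directly at the test site,
# with no NonEmptyList/FiniteFloat newtype wrappers needed.
finite_floats = st.floats(allow_nan=False, allow_infinity=False)

@given(st.lists(finite_floats, min_size=1))
def test_max_is_a_member(xs):
    # a trivially true property, just to show the shape of a test
    assert max(xs) in xs

test_max_is_a_member()  # runs the property over many generated lists
```

The constraints ("non-empty", "no NaN/inf") live at the call site rather than in wrapper types, which is exactly the trade-off this subthread is debating.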
So I use QuickCheck often and I'm a true believer, but the point this post is making is absolutely justified. Tying the test generation to the type often requires some creative ways of writing `Arbitrary` instances (the specification of how test cases should be generated).
However, I don't think what the author is suggesting in its place is better. The advantage of tying the test case generation to the type is that it's compositional. Lots of sub-`Arbitrary` instances come together to generate the test case, and in the long term I find this to be more maintainable. Whereas the author's approach discourages compositionality.
The real problem is that the types in vanilla Haskell are too loose, so there are lots of inhabitants of a type that we are not interested in. The author mentions in passing that dependent types might be a solution, and that is exactly right. What is needed is to make the irrelevant terms illegal altogether. This is pretty achievable and is already somewhat in use. For example, the program synthesis (proof search) mechanism of Idris 2[0] uses linear[1] dependent types[2] to heavily constrain which terms are possible candidates for a given type. The specificity of the type reduces the search space, and hence increases the effectiveness of the proof search.
I find it rather disrespectful to summarise a 20-year-old, very successful, and at the time truly revolutionary piece of tech like QuickCheck with a simple “this is a bad idea”. If you stand on the shoulders of giants, at least pay some respect.
The title of this HN post is completely out of context, and the discussion here is unrelated to the article. The author is not at all arguing against types or Haskell, nor is he bashing QuickCheck (he’s developed a QuickCheck-like framework for Python). He’s merely complaining about some details of QuickCheck’s design.
I agree with your observation, except for the part where he's not bashing QuickCheck (he is, but you have to click the first link in the article to find it).
> One of the big differences between Hypothesis and Haskell QuickCheck is how shrinking is handled.
> Specifically, the way shrinking is handled in Haskell QuickCheck is bad and the way it works in Hypothesis (and also in test.check and EQC) is good. If you’re implementing a property based testing system, you should use the good way. If you’re using a property based testing system and it doesn’t use the good way, you need to know about this failure mode.
> The big difference is whether shrinking is integrated into generation.
> Integrating shrinking into generation has two large benefits...
> The first is mostly important from a convenience point of view...
> But the second is *really* important, because the lack of it makes your test failures potentially extremely confusing.
> In Haskell, traditionally we would fix this with a newtype declaration which wraps the type. We could find a newtype NonEmptyList and a newtype FiniteFloat and then say that we actually wanted a NonEmptyList[FiniteFloat] there. […] But why should we bother? Especially if we’re only using these in one test, we’re not actually interested in these types at all, and it just adds a whole bunch of syntactic noise when you could just pass the data generators directly.
Did you know it's one function call to change your `NonEmptyList FiniteFloat` into `[Float]`? It's called `coerce`. There's very little extra syntactic baggage.
And you know what, carrying data generators around is a bad idea. It forces you to think about how you want the random data to be generated, and not what the random data should look like.
In order to understand "shrinking", an unfamiliar term to me, I read the linked blog post on test case reduction: https://blog.regehr.org/archives/1284
The primary assumption of whatever's in the first page is that types are sets of values right? Integrated shrinking assumes the rules that form the set while type based shrinking assumes subsetting. Am I right in reading this?
If so.. why are we still using the notion that types = sets of values?
Static typing (in any language) is just like training wheels on a bike. It's great to learn with them but eventually, as your skill increases, the training wheels become a problem and then they must come off.
Most things in the real world don't have a fixed set of attributes; attributes can change over time, and as software developers we need to model that. There is a reason why the largest package managers (with the most packages published) are for dynamically typed languages: they're simply more flexible. These packages can work with anything; there's no need to import type definitions that don't fit within your type system. They are unopinionated, they return only simple objects, and therefore they are highly composable.
Early in my career, dynamically typed languages allowed me to push hard and fast and learn a lot in a short space of time via rapid prototyping.
If you don't care about error handling then you can wrangle hard with a dynamic language and it won't catch up with you for a long time. But eventually...
Now? I generally write even Python scripts over a few tens of lines with mypy running, because otherwise it becomes necessary to adopt some form of Hungarian notation anyway, and then you're just doing typing badly.
For the most part static typing prevents you from doing things you know you shouldn't anyway. Like tacking arbitrary attributes onto objects that get passed around.
APIs change indeed; and you want to know whether a type has changed, not simply be surprised.
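To make the mypy-over-Hungarian-notation point above concrete, here is a small sketch of the kind of annotation involved (`parse_port` is a made-up example): the checker forces the caller to handle the `None` case before using the result.

```python
# Sketch of the annotations mypy checks statically; no type suffixes
# in names needed, since the signature carries that information.
from typing import Optional

def parse_port(raw: str) -> Optional[int]:
    """Return the port number, or None if raw is not a valid TCP port."""
    if raw.isdigit() and 0 < int(raw) < 65536:
        return int(raw)
    return None

port = parse_port("8080")
if port is not None:        # mypy narrows Optional[int] to int here
    next_port = port + 1    # without the check, mypy flags this line
```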
Do the Lisp/Clojure people find dynamic types a problem? I guess I'm curious whether dynamically typed languages share similar problems around refactoring, or does Clojure have features that mitigate them?
The thing that makes dynamic types quite easy to work with in Clojure is the fact that there's really not a whole lot of them there. In Clojure, pure data is king and creating your own types is strongly discouraged. It's possible to create types, mainly to allow for interop with Java and JS, but Clojure programmers really don't do it if it can be avoided.
The flip side here is that ALL of the core functions work on ALL of the core data types, so no matter what type of data you're working on (sets, maps, lists/stacks, vectors, even strings) you run into the same handful of data manipulation functions. All of the persistent data structures are immutable, memory-efficient, and allow for easy value-to-value comparisons. The Clojure data structures can also be applied as functions themselves, so e.g. a map is both a data structure and a mapping function from one set of values to another. Everything fits neatly together like Lego bricks; you NEVER write any adapter code in Clojure.
You do sometimes miss type information, especially when working with devs that aren't consistent in function parameter naming, but the trade-off is that you get a remarkably succinct and flexible language that really seems purpose-built to avoid the Lisp curse.
I think refactoring is even worse with Lisp/Clojure because the effects of macros are even harder to predict. I need an intimate knowledge of the macro library of the local codebase to be able to refactor it.
I can't speak for other Lisps, but Clojure code really doesn't feature a lot of macros (by convention) and I've honestly never had to use any macro knowledge to refactor Clojure code. I work full time on a 100% Clojure/ClojureScript stack with 60 other Clojure devs.
Maybe you can give us an example where this is an issue?
> Static typing (in any language) is just like training wheels on a bike.
Well, no. It is more like ABS once you've switched from a bike to a car that can go well over 100 km/h. In theory you can take every corner without it, but it will be dangerous, you have to do more work, and you cannot go quite as fast. And while everyone thinks they understand how it works, its inner mechanism is only understood by specialists.
> Static typing (in any language) is just like training wheels on a bike. It's great to learn with them but eventually, as your skill increases, the training wheels become a problem and then they must come off.
LOL, what?
If anything, it's the opposite.
Dynamic typing is the easier machine to start with (say, a toy bike), and as the needs grow (safety, performance, and guarantees for working with larger codebases/teams) you add types...
>Most things in the real world don't have a fixed set of attributes; attributes can change over time, and as software developers we need to model that.
Which is irrelevant. Our programs (functions, custom types, etc) have a fixed set of attributes at any given time -- as the code must know about them to use them in the first place.
For things that we model and don't have fixed attributes, we can and do build ADTs and data structures to accommodate them.
This is nonsense. Static types can represent things with a non-fixed set of attributes just fine.
Also, note that the article isn't implying what you are: "Where stronger types are naturally present in the program, taking advantage of them to get better testing is certainly worth doing..."
They're more popular with beginners because they're more expressive and simpler to reason about.
I find naming types to be a major pain. In complex projects, you often end up with multiple types that are almost the same but differ in a few attributes. In reality these are the same type; it's just not supposed to have a single fixed schema.
For example, how would you model a tadpole which develops into a frog with a statically typed language? Do you need two different types for this? What about all the awkward intermediate states between tadpole and frog? How do you model those? Do you really need to create multiple different instances in the code to represent that one entity?
You model every state that your program is expected to handle. How else will the computer know what to do?
If the state is modeled in numeric terms then you can represent the intermediate state as a range of integers. You need to do this whether or not your program is typed.
Without types you're allowing the program to fall into a state that it doesn't know how to handle. This works because the program doesn't try to handle these intermediate invalid states. What happens when it eventually does try to handle that state? That's when bugs happen.
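A quick sketch of that numeric encoding (the stage names are invented for the example): the intermediate states become explicit, ordered values, and capabilities are derived from them instead of being tacked on.

```python
from enum import IntEnum

class Stage(IntEnum):
    EGG = 0
    TADPOLE = 1
    FROGLET = 2   # the awkward in-between state, modeled explicitly
    FROG = 3

def can_swim(stage: Stage) -> bool:
    return stage >= Stage.TADPOLE

def can_walk(stage: Stage) -> bool:
    # legs appear partway through metamorphosis
    return stage >= Stage.FROGLET
```

One entity, one type, and every state the program handles is named; a checker can then verify that no code path asks a `Stage.EGG` to walk.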
Static type systems are a brute force way to eliminate a certain class of bug. A dynamic language appears simpler because it doesn't even attempt to capture those errors. It's like saying your payment validation is "simpler" because it doesn't do CCV checks.
In a dynamic language it's entirely likely you still have those types, but they're just not defined anywhere. You the developer have to validate them in your head.
In dynamically typed languages, you never check or think about the types, instead you only think about the features. Does this object have a run() method? Does it have a color property? It doesn't matter what type it is... Maybe it's a new kind of type that you could not have imagined when you first wrote the code, but if it supports the necessary method, it's going to work and you don't need to change or recompile your code to account for it.
>Does this object have a run() method? Does it have a color property? It doesn't matter what type it is..
Only it very much matters...
And to track it you have an ad-hoc, incomplete, custom, buggy, implementation of type checking in your head -- as opposed to in the program, and statically checked.
>Maybe it's a new kind of type that you could not have imagined when you first wrote the code, but if it supports the necessary method, it's going to work and you don't need to change or recompile your code to account for it.
You obviously need to change your code to have objects of that "new kind of type" (even if they are dynamic and untyped). Where would the new objects come from without changing the code in some place, the ether?
Supporting the necessary method is a very poor check for "it's going to work".
A "bomb" and a "lock" object might both have an "activate" method. You wouldn't want to call the former when you meant the latter.
You're still doing the same checks, only now you do only half of them, in your head, and anything goes...
Asking the question "what are the features of this thing" is a potentially very complicated one in a dynamic language. Each and every line of code could have added or removed a feature. Dynamic languages make for very messy and unreliable computing.
But the choice between static and dynamic typing (or duck typing as described here) is not cost neutral.
Dynamically typed languages can be much quicker and cheaper to write in, making for more rapid business product cycles. The lack of type safety can largely be made up for by adding to the already necessary unit tests.
Isn't that how the YouTube startup out-manoeuvred Google's much larger dev team working with static types?
> Dynamically typed languages can be much quicker and cheaper to write...
Code isn't just written, though. It's read, modified, and maintained. Sure, you can write it faster. But are you building that program for this week, or for the next two decades?
(Yes, I know, you can be in a situation where writing it faster now is life-or-death for the company. Most of us won't spend most of our careers in that situation, though.)
What you're describing is structural typing, and it doesn't have to be dynamic. Look at OCaml and its object model to see an example of a statically typed system with automatic type inference, that nevertheless checks all these things at compile-time. Which is to say, you can write:
let frob something = something#run ...
and it will enforce the fact that whatever is passed to frob, it must have a method named "run", and it must have the requisite number of arguments and their types.
The types are still there, though. They're just types like, "some object with method "run" with ...".
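The same statically checked structural idea exists outside OCaml too; for example, Python's `typing.Protocol` lets mypy do a comparable compile-time check (the `Runnable` and `frob` names below are invented to mirror the OCaml sketch):

```python
from typing import Protocol

class Runnable(Protocol):
    def run(self) -> str: ...

def frob(something: Runnable) -> str:
    # mypy accepts any argument whose type has a matching run() method;
    # no explicit subclassing of Runnable is required
    return something.run()

class Robot:                 # never mentions Runnable anywhere
    def run(self) -> str:
        return "whirr"

print(frob(Robot()))         # structurally typed, statically checked
```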
I've programmed dynamically typed langs (Perl, PHP, Python, Ruby, JS) and statically typed langs (C, Java, Dart). I've worked on teams for projects that grew and matured, and my experience has been that strong typing and static analysis help prevent code defects that are common in dynamic langs. The tradeoffs can be well worth it for some kinds of larger team based projects.
I think if you gave an example implementation of your tadpole model in a dynamically typed language, people would be happy to sketch their version in a statically typed one. The question is much too vague so far.
function createFrog() {
  // Tadpole state, can swim but cannot walk.
  return {
    color: 'black',
    size: 10,
    speed: 5,
    direction: 3.14,
    swim: function () {
      // ...
    }
  };
}
function allowFrogToWalk(frog) {
frog.walk = function () {
// ...
};
}
function allowFrogToJump(frog) {
frog.jump = function () {
// ...
};
}
function makeFrogToxic(frog) {
frog.intoxicate = function (animal) {
// ...
};
}
let frog = createFrog();
// ...
// Then when the frog reaches a certain size:
allowFrogToWalk(frog);
// After the frog spends a certain amount of time on land.
allowFrogToJump(frog);
// If frog eats a certain toxic plant.
makeFrogToxic(frog);
// ...
// Let's say our game has many different kinds of animals and we manage them
// from a single place in the code (and they can interact with each other).
// When an animal tries to go on land:
if (animal.walk) {
// ...
}
// If animal encounters an obstacle:
if (animal.jump) {
// ...
}
// When an animal tries to go on water:
if (animal.swim) {
// ...
}
// When two animals touch each other
if (touches(animalA, animalB)) {
if (animalA.intoxicate) {
animalA.intoxicate(animalB);
}
// ...
}
Doing this with a statically typed language would add a lot of unnecessary complexity.
You'd need to create a lot of types/interfaces:
- Animal (which has all the generic properties of an animal)
- WaterAnimal (which inherits from Animal and can swim but not walk)
- LandAnimal (which inherits from Animal and can walk but not swim)
- Tadpole (inherits from WaterAnimal)
- ToxicTadpole (inherits from Tadpole but adds intoxicate method)
- YoungFrog (inherits from WaterAnimal and LandAnimal and is constructed from a Tadpole instance, cannot jump)
- ToxicYoungFrog (inherits from YoungFrog but adds intoxicate method)
- MatureFrog (inherits from WaterAnimal and LandAnimal and is constructed from a YoungFrog instance, has the ability to Jump)
- ToxicMatureFrog (inherits from MatureFrog but adds intoxicate method)
As you add more complex functionality, you either have to support more permutations of different types or you have to come up with some kind of complex decorator pattern which subverts the whole purpose of having static types to begin with.
This is cool, but I think the OP wanted to be able to change a given instance's capabilities rather than allocate new wrappers around it. But it wasn't very clear.
Yeah, so changing capabilities at runtime is essentially dynamic typing, and here I'm suggesting a statically typed alternative to that (IMHO safer, since you don't have to deal with mutation and its weird effects).
In this case I would use something like std::function in C++. An std::function can also be uninitialized, and one can test for that using 'if'. Obviously, one does not want an inheritance hierarchy as complicated as the one you sketch. It does seem like a dynamic language maps a bit more directly to this problem. However, if one uses a dynamic language there still is the complication that there is not a single place with a list of all possible attributes that the Animal can have. This is one main pain of dynamic objects, and one especially feels it when trying to find out how something works. When something can potentially be any type, one might have to read a great amount of code to find out what operations are available for that thing.
I also have to note that this is not a very typical problem. I don't remember ever having had to model something like this in a programming language.
This kind of problem with using heavily inheritance-based systems for behavior can be addressed by using entity component systems. Often found in game development. The general guidance these days is to use "is-a" relationships less and "has-a" relationships more.
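As a minimal illustration of the has-a style (all names here are invented, and a real ECS is considerably more involved), capabilities can be plain data components attached to an entity, with systems querying for the component rather than the type:

```python
from dataclasses import dataclass, field

@dataclass
class Walk:                       # a capability as plain data
    speed: float

@dataclass
class Toxin:
    potency: int

@dataclass
class Entity:
    name: str
    components: dict = field(default_factory=dict)

    def add(self, component) -> None:
        self.components[type(component)] = component

    def get(self, kind):
        return self.components.get(kind)

frog = Entity("frog")
frog.add(Walk(speed=5.0))          # gained at runtime; no new class needed
assert frog.get(Walk) is not None  # movement system: "does it have Walk?"
assert frog.get(Toxin) is None     # poison system skips this frog
```

This keeps the runtime flexibility of the JS sketch while each component itself remains a small, statically describable type.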
Yes, but my point is that if you allow an instance of a specific class to either have or not have a specific method/behavior at runtime, then you lose type safety, which is essentially what a dynamically typed language does anyway.
The point that I'm trying to make is that true compile-time type safety is not possible in my example without some kind of rigid complex inheritance structure that is predefined at compile time.
Thanks for this spec. Here's a complete, runnable implementation in OCaml:
type walker = WalkFn of (animal -> unit)
and jumper = JumpFn of (animal -> unit)
and swimmer = SwimFn of (animal -> unit)
and intoxicator = ToxicFn of (animal -> animal -> unit)
and animal = Animal of animal_data
and animal_data = {
name: string;
color: string;
mutable walk: walker option;
mutable jump: jumper option;
mutable swim: swimmer option;
mutable intoxicate: intoxicator option
}
let can_walk = function
| Animal { walk = Some _ } -> true
| _ -> false
let walk animal =
match animal with
| Animal { walk = Some (WalkFn walk_fn) } -> walk_fn animal
| _ -> invalid_arg "this animal can't walk"
let default_walker = function
| Animal { name; color } -> Printf.printf "the %s %s walks\n" color name
let allow_to_walk (Animal data) =
data.walk <- Some (WalkFn default_walker)
(* --- more of the same for the other actions --- *)
let can_jump = function
| Animal { jump = Some _ } -> true
| _ -> false
let jump animal =
match animal with
| Animal { jump = Some (JumpFn jump_fn) } -> jump_fn animal
| _ -> invalid_arg "this animal can't jump"
let default_jumper = function
| Animal { name; color } -> Printf.printf "the %s %s jumps\n" color name
let allow_to_jump (Animal data) =
data.jump <- Some (JumpFn default_jumper)
let can_swim = function
| Animal { swim = Some _ } -> true
| _ -> false
let swim animal =
match animal with
| Animal { swim = Some (SwimFn swim_fn) } -> swim_fn animal
| _ -> invalid_arg "this animal can't swim"
let default_swimmer = function
| Animal { name; color } -> Printf.printf "the %s %s swims\n" color name
let allow_to_swim (Animal data) =
data.swim <- Some (SwimFn default_swimmer)
let can_intoxicate = function
| Animal { intoxicate = Some _ } -> true
| _ -> false
let intoxicate animal victim =
match animal with
| Animal { intoxicate = Some (ToxicFn toxic_fn) } -> toxic_fn animal victim
| _ -> invalid_arg "this animal can't intoxicate"
let default_intoxicator animal victim =
match animal, victim with
| Animal attacker, Animal victim ->
Printf.printf "the %s %s poisons the %s %s\n"
attacker.color attacker.name victim.color victim.name
let allow_to_intoxicate (Animal data) =
data.intoxicate <- Some (ToxicFn default_intoxicator)
(* --- end of actions --- *)
let create_frog () =
Animal {
name = "frog";
color = "black";
walk = None;
jump = None;
swim = None;
intoxicate = None
}
let create_cow () =
Animal {
name = "cow";
color = "green";
walk = None;
jump = None;
swim = None;
intoxicate = None
}
let story () =
let frog = create_frog () in
Printf.printf "newborn frog, can it walk? %s\n"
(if can_walk frog then "yes" else "no");
allow_to_walk frog;
Printf.printf "and now, can it walk? %s\n"
(if can_walk frog then "yes" else "no");
walk frog;
allow_to_swim frog;
swim frog;
allow_to_jump frog;
jump frog;
allow_to_intoxicate frog;
intoxicate frog (create_cow ());
Printf.printf "the end\n"
(* now run all this *)
let _ = story ()
And here's what it looks like when it's run:
$ ocaml frog_story.ml
newborn frog, can it walk? no
and now, can it walk? yes
the black frog walks
the black frog swims
the black frog jumps
the black frog poisons the green cow
the end
It's true that the animal type must have types and fields for all possible actions an animal might ever take. It would be possible to compartmentalize things a bit more, maybe. But for a quick prototype this should work quite well, I think.