Big upvotes for this article. I'm glad it was written, because I've seen nothing but hype for Unikernels on Hacker News (and in ACM, etc.) for the last 2 years. It's great to see the other side of the story.
The biggest problem with Unikernels like Mirage is the single language constraint (mentioned in the article). I actually love OCaml, but it's only suitable for very specific things... e.g. I need to run linear algebra in production. I'm not going to rewrite everything in OCaml. That's a nonstarter.
And I entirely agree with the point that Unikernel simplicity is mostly a result of their immaturity. A kernel like seL4 is also simple, because like unikernels, it doesn't have that many features.
If you want secure foundations, something like seL4 might be better to start from than Unikernels. We should be looking at the fundamental architectural characteristics, which I think this post does a great job on.
It seems to me that unikernels are fundamentally MORE complex than containers with the Linux kernel, because you can't run Xen by itself -- you run Xen along with Linux for its drivers.
The only thing I disagree with in the article is debugging vs. restarting. In the old model, where you have a sys admin per box, yes you might want to log in and manually tweak things. In big distributed systems, code should be designed to be restarted (i.e. prefer statelessness). That is your first line of defense, and a very effective one.
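To make "designed to be restarted" concrete, here is a minimal sketch in Python. The names `supervise` and `max_restarts` are made up for illustration, and it assumes the task keeps no state between runs, so that a restart is always safe. A real deployment would normally leave this to an init system or cluster scheduler rather than an in-process loop.

```python
import logging
from typing import Callable


def supervise(task: Callable[[], None], max_restarts: int = 3) -> int:
    """Run `task`, restarting it after a crash.

    Returns the number of restarts that were needed. Re-raises the last
    exception once `max_restarts` restarts have been used up.
    """
    restarts = 0
    while True:
        try:
            task()
            return restarts
        except Exception:
            # Log the crash so the failure is not silently swallowed.
            logging.exception("task crashed (restarts so far: %d)", restarts)
            if restarts >= max_restarts:
                raise
            restarts += 1
```

The restart only helps because nothing inside `task` has to be repaired by hand first; that is the statelessness assumption doing the work.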
> The only thing I disagree with in the article is debugging vs. restarting. In the old model, where you have a sys admin per box, yes you might want to log in and manually tweak things. In big distributed systems, code should be designed to be restarted (i.e. prefer statelessness). That is your first line of defense, and a very effective one.
But if you never understand why it was a bad state in the first place you're doomed to repeat it. Pathologies need to be understood before they can be corrected. Dumping core and restarting a process is sometimes appropriate. But some events, even with stateless services, need in-production, live, interactive debugging in order to be understood.
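As a rough illustration of the "dump and restart" middle ground (not a replacement for live debugging), here is a sketch in Python that records a postmortem report, including local variables, before the process would be restarted. `run_with_postmortem` is a hypothetical helper name; the only library call it relies on is the standard `traceback.TracebackException.from_exception` with `capture_locals=True`.

```python
import traceback
from typing import Callable, Optional


def run_with_postmortem(task: Callable[[], None]) -> Optional[str]:
    """Run `task`; on failure return a text report instead of raising.

    The report contains the stack trace plus the local variables of each
    frame, which is a (much weaker) analogue of a core dump.
    """
    try:
        task()
        return None
    except Exception as exc:
        report = traceback.TracebackException.from_exception(
            exc, capture_locals=True
        )
        return "".join(report.format())
```

A report like this tells you what the state was at the moment of failure, but not how the system got there, which is the gap the comment above is pointing at.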
> But some events, even with stateless services, need in-production, live, interactive debugging in order to be understood.
The question then becomes whether the problem is reproducible, since "debuggable when not running normally" seems to be the common thread of unikernels, such as being able to host the runtime in Linux directly rather than on a VM.
I think if you try a low level language these kinds of things are going to bite you, but a fleshed out unikernel implementation could be interesting for high level languages, since they typically don't require the low level debugging steps in the actual production environment.
In either case unikernels have a lot of ground to cover before they can be considered for production.
Running unikernels on seL4 is a perfectly sane thing to do. seL4 does not provide the network stack, or much application interface, so a unikernel is a great thing to put on top.
I really do feel like at the end of this container experiment we're going to reinvent microkernels. Possibly badly, but I'm hopeful that it won't work out that way.
" I'm glad it was written, because I've seen nothing but hype for Unikernels on Hacker News (and in ACM, etc.) for the last 2 years. "
That much is true. I'm countering what I can where it gets overblown. Just part of something going mainstream... crowd effects and so on...
"The only thing I disagree with in the article is debugging vs. restarting. In the old model, where you have a sys admin per box, yes you might want to log in and manually tweak things. In big distributed systems, code should be designed to be restarted (i.e. prefer statelessness). That is your first line of defense, and a very effective one."
Additionally, there's no inherent reason that I see that unikernels are impossible to debug. We can debug everything from live hardware up to apps with existing tooling. So, if unikernels are lacking, it just means they're still young and nobody has adapted proven techniques to debugging them. I imagine the simple ones on simpler HW will make that even easier.
Enough said about what? That there's no current debugging (my claim) or that it's impossible (his)?
"Now, could one implement production debugging tooling in unikernels? In a word, no"
That's a lie: you could implement it. Debugging has been implemented for everything from ASICs to kernel mode to apps in other categories. Whether the unikernel crowd wants to or will is another question. Not looking good there per Google. Him talking like it's impossible shows he has an agenda, though.
I'm guessing he didn't tell everyone to ditch the entire concept of Joyent's offering early on because they were lacking some important features or properties. Just a guess but I'd bet on it.
EDIT: Changing search terms to "debugging" "xen" "guests" got me two results showing foundations are already built. Weak compared to UNIX but there.
> The biggest problem with Unikernels like Mirage is the single language constraint
Do you write all your software in C? Of course not. The single language constraint doesn't exist, for the same reasons we can write Go software that runs on the Linux kernel.
But the entire point is that unikernels aren't like the Linux kernel, that there isn't a userspace/kernelspace boundary in the way that there is on Linux and other traditional OSes.
> We present unikernels, a new approach to deploying cloud services via applications written in high-level source code. Unikernels are single-purpose appliances that are compile-time specialised into standalone kernels, and sealed against modification when deployed to a cloud platform. In return they offer significant reduction in image sizes, improved efficiency and security, and should reduce operational costs. Our Mirage prototype compiles OCaml code into unikernels that run on commodity clouds and offer an order of magnitude reduction in code size without significant performance penalty.
> An important decision is whether to support multiple languages within the same unikernel. [...] The alternative is to eschew source-level compatibility and rewrite system components entirely in one language and specialise that toolchain as best as possible. [...] existing non-OCaml code can be encapsulated in separate VMs and communicated with via message-passing [...]
> We did explore applying unikernel techniques in the traditional systems language, C, linking application code with Xen MiniOS, a cut-down libc, OpenBSD versions of libm and printf, and the lwIP user-space network stack.
That is, there is absolutely a single language constraint, on purpose, as arguably the primary differentiation from non-unikernels.
Just because some concrete system call interface is missing doesn't mean you can't glue layers of code together from different languages, that's some logical jump I don't understand.
The single language constraint you're continuing to claim exists is addressed by their own blog posts: https://mirage.io/blog/modular-foreign-function-bindings . After reading this, I can see approximately 50 lines of code that would let me run a Python interpreter (another mostly memory-safe language, btw) within the unikernel, assuming enough of the Unix syscall interface was already implemented in shims back into OCaml land, something I presume will likely be released in the coming months in preparation for making their stuff useful to the general public.
Yes, you can, but you shouldn't. The continuation of the third passage I quoted explains why they think that several of the advantages of unikernels disappear if you try writing things in not-OCaml. It looks like the instructions you're linking are intended for third-party libraries that already exist in C, not for entire applications (although, yes, that would work).
I mean, you also can port applications to Linux kernelspace. (Remember the Tux web server?) But that's not really the point of Linux, and if you want that, you should... use a unikernel.
That said, sure, it's entirely possible that as they shift focus from an academic project to a commercial one, they'll give up on this distinction and its performance advantages, and start marketing a product that lets you just write C. (Just like they may well give up on hypervisor-based parallelism and add fork().) But that's not how they're currently envisioning the concept.
High-assurance security approaches on separation/MILS kernels have been doing this successfully for years. It's common for those RTOSes to have a native target on microkernel, a safety-critical runtime for Ada/Java, a featured runtime for them, a POSIX layer, user-mode Linux... all containing pieces of the system or even an application working together through robust middleware.
So, it's a proven model that's literally flying through the air right now due to aerospace take-up. It would likely work for unikernels, too, so long as they included same checks/mediation at interface points or middleware that prior model required. The only real questions should be about the resulting attributes of that system: is it a good approach vs regular unikernels w/ performance, containment, etc (theory vs practice)? Or just ditch them to enhance separation kernels, micro-hypervisor platforms, or capability systems?
Personally, I'm not sold on unikernels for resilience: prior security models were better, field-proven, and survived advanced pentesting. Under-utilized imho. Cross-language is similar in both, though, with attributes of one application likely carrying to the other. The real problem is the TCB being complex & insecure, breaking the isolation paradigm.
It's much easier than you think. OCaml can expose functions with a C ABI. You can put newlib on top of MirageOS, calling down to OCaml for basics (and newlib needs only a handful of IO and memory functions to be provided, so it isn't particularly hard), and then you have instant POSIX. Now you can run most anything you want from C-land.
Are we talking about microservices that talk over TCP of some sort? If that is the case it is a moot point, as the other end need not even compile against a unikernel at all; it can sit on a traditional OS.
Exactly. They seem to be creating unsolvable, strawman problems instead of assuming we start with the middleware approach to getting things to work together. And there's both very efficient and very robust tooling available for that these days.
No, we're talking about running things under unikernels.
I mean yes, you could have some of your app implemented in OCaml and then it talks over TCP to some Fortran running on Linux to do linear algebra, but that approach has its own problems.
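For what that split looks like in practice, here is a toy sketch, written in Python purely for illustration. The numeric side is a tiny TCP service that multiplies a matrix by a vector, and the caller only needs a socket, not a shared language or kernel. The newline-delimited JSON wire format and the names `MatVecHandler`, `start_service` and `remote_matvec` are all invented for this example; a real setup would put Fortran or BLAS behind the service and pick a proper serialization format.

```python
import json
import socket
import socketserver
import threading


class MatVecHandler(socketserver.StreamRequestHandler):
    """Stand-in for the numeric service: one JSON request line in, A*x out."""

    def handle(self):
        request = json.loads(self.rfile.readline())
        a, x = request["a"], request["x"]
        # Plain-Python matrix-vector product; real code would call BLAS here.
        y = [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]
        self.wfile.write((json.dumps({"y": y}) + "\n").encode())


def start_service():
    """Start the service on a free loopback port in a background thread."""
    server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), MatVecHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def remote_matvec(address, a, x):
    """Client side: send A and x over TCP, return the product."""
    with socket.create_connection(address) as conn:
        conn.sendall((json.dumps({"a": a, "x": x}) + "\n").encode())
        reply = conn.makefile("r").readline()
    return json.loads(reply)["y"]
```

Even in this toy, every call pays for serialization and a network round trip, and the failure modes of the remote side become the caller's problem, which is roughly what "its own problems" means here.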
You are splitting the problem at the wrong place. Microservices are a well-known pattern and are what is lighting the fire under unikernel research; you cannot simply ignore the aspect of "do the hard thing in a different environment" when talking about microservices.
IMO the popularity of microservices is a lot more about languages that provide very weak isolation guarantees (i.e. Ruby). OCaml is a strongly typed language with an excellent module system, and so microservices offer relatively little value.