This type of stuff is one reason I like vendoring all my deps in golang. You have to be very explicit about updating dependencies, which can be a big hassle, but you're required to commit all the changes, which gives you a good opportunity to actually browse through the diffs. If you update dependencies incrementally, it's not even that big a job. Of course, this doesn't guarantee I won't miss any malicious code, but they'd have to go to much greater lengths to hide it since I'm actually browsing through all the code. I'm not sure reading that amount of code would be feasible in python, though. Definitely not for most nodejs projects, for example.
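
A minimal sketch of that update-and-review loop, with a placeholder module path:

    go get example.com/some/dep@v1.2.4   # bump one dep at a time
    go mod tidy
    go mod vendor                        # refresh vendor/ with the new code
    git diff --stat vendor/              # eyeball what actually changed
    git commit -am "bump example.com/some/dep to v1.2.4"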

I think it's an interesting cultural phenomenon that different language communities have different levels of dependency fan-out in typical projects. There's no technical reason golang folks couldn't end up in this same situation, but for whatever reason they don't as much. And why is nodejs so much more dependency-happy than python? The languages themselves didn't cause that.



> And why is nodejs so much more dependency-happy than python?

Part of it, though I'm sure not all, is that the core language was really, really bad for decades. People imported packages just to make the language tolerable (competing ones, so you could end up with several in the same project via other imports, and then multiples of the same package at different versions!), plus polyfills to make targeting the browser non-crazy-making, so package counts were bound to bloat from these factors alone.

Relatedly, there wasn't much of a stdlib. You couldn't have as pleasant a time using only 1st-party libraries as you can with something like Go. Even really fundamental stuff like dealing with time for very simple use cases is basically hell without a 3rd party library.
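
For contrast, a rough sketch of how far Go's first-party time support alone goes (nothing here is outside the stdlib):

    package main

    import (
        "fmt"
        "time"
    )

    func main() {
        // parse, do arithmetic, and format, all stdlib
        t, err := time.Parse(time.RFC3339, "2021-11-26T10:00:00Z")
        if err != nil {
            panic(err)
        }
        fmt.Println(t.Add(72 * time.Hour).Format("Mon, 02 Jan 2006"))
    }

In JS you'd historically reach for moment or date-fns for the same thing.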

JavaScript has also been, for whatever reason, a magnet for people who want to turn it into some other language entirely, so they'll import libraries to do things JavaScript can already do just fine, but with different syntax. Underscore, Ramda, that kind of thing. So projects often end up with a bunch of those kinds of libraries as transitive dependencies, even if they don't use them directly.


It’s worth mentioning that Underscore started before browsers widely implemented the same features in standard JavaScript. Underscore is much less necessary now that Internet Explorer has been EoL’d.


The problem is the tree of dependencies you'd have to check. Sure, you can check the changes in a direct dependency, but when that dependency updates a few others and those update a few others, the number of lines you need to read grows very quickly.


Golang flattens the entire dependency tree into your vendor directory. It's still not that big. The current project I am working on has 3 direct external dependencies, which expands out into 22 total dependencies, 9 of which are golang.org/x packages (high level of scrutiny/trust). It's really quite manageable.
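
If you want to see that flattened list for your own module, the toolchain can enumerate it (the grep pattern is just one way to slice it):

    go list -m all                               # every module in the build
    go mod graph                                 # which module pulls in which
    go list -m all | grep -c 'golang.org/x/'     # count the golang.org/x ones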


Indeed, gophers often make it a point of pride to have no dependencies in their packages.


> And why is nodejs so much more dependency-happy than python?

Could it be that nodejs has implemented package management more consistently and conveniently than other languages/platforms?


That's one thing, the other is the almost complete absence of a standard library.


Yeah, I think this is a big one. One of the things that I have always liked about Golang is that the standard library is quite complete, and the implementations are (usually) not bare-bones ones that you need to immediately replace with something "prod-ready" when you build a real project. There are exceptions, of course, but I think it's very telling that most of my teammates go so long without introducing new dependencies that they usually have to ask me how to do it. (I never said the UX was fantastic :) This also goes to GP's "consistent and convenient" argument.


Totally agree. It feels like there is a pretty strong inverse correlation between standard library size and average depth of a dependency tree for projects in a given language. In our world, that is pretty close to attack surface.


Rust is another example of this. Just bringing in grpc and protobuf gets you about a hundred dependencies, some of them seemingly unrelated. For a language aimed at avoiding security bugs, I find this to be an issue. A good dependency manager combined with a small (or optionally absent) stdlib has led to highly granular dependencies and to pulling in giant libs for tiny bits of functionality.
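
You can measure the fan-out yourself with a recent cargo's built-in tree view; the crate names here are just one common grpc/protobuf pairing, and exact counts vary by version:

    cargo add tonic prost                          # grpc + protobuf crates
    cargo tree --prefix none | sort -u | wc -l     # rough unique-dependency count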


pip throws your dependencies in some lib directory either on your system (default if you use sudo), in your home directory (default if you don't use sudo), or inside your virtualenv's lib directory.

npm pulls dependencies into node_modules as a subdirectory of your own project as default.

Python really should consider doing something similar. Dependencies shouldn't live outside your project folder. We are no longer in an era of hard drive space scarcity.
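
Something close to that is already possible by keeping the virtualenv inside the project; it's just not the default, and the directory name is only a convention (paths shown are Unix-style):

    python -m venv .venv              # project-local, like node_modules
    .venv/bin/pip install requests    # lands in .venv/lib/.../site-packages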


Have you seen how much space a virtualenv uses? It can easily be >1 GB. For every project, this adds up. (Not to mention the bandwidth, which is not always plentiful).


Well, npm uses a cache so it won't re-download every package every time you install it.
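
You can see where those caches live; pip has an equivalent these days too:

    npm config get cache    # usually ~/.npm
    pip cache dir           # pip's wheel/download cache (pip 20.1+)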


4TB hard drives are $300 these days.


4 TB HDDs are closer to $80 now, but that reinforces your point :). Even SSDs are now close to $300 for 4 TB!


Yeah, I meant 4 TB SSDs; who uses magnetic HDDs anymore lol


As of Python 3, pip install into the system Python lib directory is strongly discouraged. ISTR that even using pip to update pip results in a warning.

That’s not to say there aren’t still some libs out there whose docs haven’t been updated to get with the times.


More distros should adopt the Debian practice of installing into dist-packages and leaving site-packages as a /usr/local equivalent for pip to use on its own.


It also blows up the size of your git checkouts pretty fast though.

I don't think you really gain much either; vendoring was useful before modules, but now that we have modules and go.sum, I don't really see the advantage. If you have "github.com/foo/bar" specified at version 1.0.4, go.sum will ensure you have EXACTLY that version, or it will issue an error in case of any tomfoolery.
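
For reference, the go.sum entries for that module would look roughly like this (hashes made up):

    github.com/foo/bar v1.0.4 h1:abc123...=
    github.com/foo/bar v1.0.4/go.mod h1:def456...=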


Vendoring also means your builds don’t need an Internet connection.

Going on a trip somewhere without an Internet connection? Check out the repo on your laptop and go. Without vendoring: oh shoot, I forgot to download the deps, I guess I’m going to be forced into a work-life balance. With vendoring: no additional step needed after checking out the repo. The repo has everything you need to work.

Another case: the repo of your dependency is removed, or force-pushed to, overwriting history. You’ve lost the ability to build your project, and need to either find another source for your dependency or rewrite it. With vendoring: everything still works; you don’t even notice the dep repo went under.

Generally, with vendoring your code is in just one place instead of being a distributed organism that crumbles when any part of it gets sick.
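
With a vendor/ directory present, Go 1.14+ uses it automatically, and you can force the point:

    go build -mod=vendor ./...    # build strictly from vendor/, no network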

Moreover, relying on checksums seems a bit overcomplicated to me. It’s like going to a pub and giving each drink from a stranger to a chemist for verification to make sure they didn’t slip any pills in, when you could just carry your own drink around and cover the top with your hand.


You should have the modules downloaded to the module cache for the occasional case when you don't have direct internet access.

> Another case: the repo of your dependency is removed, or force-pushed to, overwriting history. You’ve lost the ability to build your project, and need to either find another source for your dependency or rewrite it.

The GOPROXY (https://proxy.golang.org/) still contains that removed repo, and since everything is checksummed, people can't just force-overwrite it. Plus, you still have it in the module cache locally.
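
If you want to lean on that locally, the relevant knobs are (paths differ per machine):

    go env GOMODCACHE                                    # where the local copies live
    go mod download                                      # prefetch everything in go.mod
    go env -w GOPROXY=https://proxy.golang.org,direct    # the default since Go 1.13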

You can of course always come up with "but what if...?" scenarios where any of the above fails, and all sorts of things can happen, but they're also not especially likely to happen. So the question isn't "is it useful in some scenario?" but rather "is it worth the extra effort?"

> Moreover, relying on checksums to me seems a bit overcomplicated.

It's built-in, so no extra complications needed.


> You should have the modules downloaded to the module cache for the occasional case when you don't have direct internet access.

That’s assuming I’ve built the thing previously on that same computer. I’m talking about the common case of working on a normal desktop day-to-day and then switching to a laptop when travelling to a place without internet (or internet of such poor quality you might as well not bother). With vendoring I don’t need to think about any other steps than copying/checking out the repo. The repo is self-contained. Without it, I’m making the quantum leap to a checklist.


You need internet access to either check out or update the repo; you can use "go mod download" (or just go build, test, etc.) to fetch the modules too. It's an extra step, but so is vendoring stuff all the time.
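
One way to make that step cheap: prefetch, then confirm the build works offline (GOPROXY=off makes the build fail rather than touch the network):

    go mod download               # before leaving
    GOPROXY=off go build ./...    # later: uses only the module cache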

But like I said, it's not about "is it useful in some scenarios?" but "is it worth the extra effort?" I'm a big fan of having things be as self-contained as possible, but for this kind of thing modules "just work" without any effort. Very occasionally you might go "gosh, I wish I had vendored things!", but I think that's an acceptable trade-off.



