Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

If you were ready to throw away a bit of archival quality, I wonder if writing Apache Arrow straight to disk and then just mmap your columns back in for a query.


It's certainly possible. But Parquet and Arrow are being built with each other in mind. I doubt the tradeoffs on storage, durability, and future-proofness are worth skipping the marginal overhead to load a Parquet file into memory.


Isn't that what Feather, the arrow file format, is? https://arrow.apache.org/docs/python/feather.html




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: