toomuchtodo's comments | Hacker News


Write a book, send a copy to the Internet Archive, upload the digital version. Leave your kids the ISBN or Archive.org item identifier. Donate $2/GB uploaded if you can afford it.

If the above is too much trouble, you could also have the Internet Archive crawl your site to preserve it; it would then be accessible through the Wayback Machine.

https://help.archive.org/help/how-do-i-make-a-physical-donat...

https://help.archive.org/help/uploading-a-basic-guide/

https://hackernoon.com/the-long-now-of-the-web-inside-the-in...

https://news.ycombinator.com/item?id=46611593
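
To make the upload step concrete, here is a minimal sketch using the internetarchive Python library (pip install internetarchive, then run "ia configure" once to store credentials). The identifier, filenames, and metadata are placeholders, not a prescribed scheme:

  from internetarchive import upload

  # Run `ia configure` first so your IA credentials are stored locally.
  # The identifier, files, and metadata below are placeholders.
  responses = upload(
      "my-book-2025",                       # the item identifier you leave your kids
      files=["my-book.pdf", "my-book.epub"],
      metadata={
          "title": "My Book",
          "creator": "Jane Doe",
          "mediatype": "texts",
          "isbn": "978-0-00-000000-0",      # placeholder ISBN
      },
  )
  print([r.status_code for r in responses])  # expect 200s on success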


How fast you can FIRE is a function of your savings rate and your burn rate. The more you save and the lower your burn rate, the faster you FIRE.

https://www.mrmoneymustache.com/2012/01/13/the-shockingly-si...
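
The linked post reduces to one compounding formula. A back-of-envelope Python version (my own parameter choices, not from the comment: 5% real returns and a 25x-annual-expenses target, i.e. a 4% withdrawal rate):

  import math

  def years_to_fire(savings_rate, real_return=0.05, multiple=25):
      """Years until the portfolio reaches `multiple` x annual spending.

      Normalizes income to 1.0/yr: you save `savings_rate` and spend
      (1 - savings_rate) each year; contributions compound at `real_return`.
      """
      save = savings_rate
      target = multiple * (1 - savings_rate)
      # Solve save * ((1 + r)^n - 1) / r = target for n:
      return math.log(1 + target * real_return / save) / math.log(1 + real_return)

  for s in (0.10, 0.25, 0.50, 0.75):
      print(f"savings rate {s:.0%}: ~{years_to_fire(s):.0f} years")
  # ~51, ~32, ~17, ~7 years -- matching the table in the MMM post.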



Worry not: you can't change the US, but you can leave for a developed country.





Yes, there are docs and third-party projects indicating that it has a free public API, but I haven't been able to get it to work. I presume a paid API would have better availability and the possibility of support.

I just tried waybackpy and I'm getting errors with it too when I try to reproduce their basic demo operation:

  >>> from waybackpy import WaybackMachineSaveAPI
  >>> url = "https://nuclearweaponarchive.org"
  >>> user_agent = "Mozilla/5.0 (Windows NT 5.1; rv:40.0) Gecko/20100101 Firefox/40.0"
  >>> save_api = WaybackMachineSaveAPI(url, user_agent)
  >>> save_api.save()
  Traceback (most recent call last):
    File "<python-input-4>", line 1, in <module>
      save_api.save()
      ~~~~~~~~~~~~~^^
    File "/Users/xxx/nuclearweapons-archive/venv/lib/python3.13/site-packages/waybackpy/save_api.py", line 210, in save
      self.get_save_request_headers()
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
    File "/Users/xxx/nuclearweapons-archive/venv/lib/python3.13/site-packages/waybackpy/save_api.py", line 99, in get_save_request_headers
      raise TooManyRequestsError(
      ...<4 lines>...
      )
  waybackpy.exceptions.TooManyRequestsError: Can not save 'https://nuclearweaponarchive.org'. Save request refused by the server. Save Page Now limits saving 15 URLs per minutes. Try waiting for 5 minutes and then try again.

Reach out to patron services: support @ archive dot org. Also, your API limits will be higher if you make requests with the API key from your IA account rather than anonymously.
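
For what it's worth, here is a minimal sketch of an authenticated Save Page Now request using requests. The endpoint and the "LOW" auth scheme are from IA's SPN2 docs; the keys are the S3-style pair from archive.org/account/s3.php, filled in with placeholders here:

  import requests

  # S3-style keys from https://archive.org/account/s3.php (placeholders).
  ACCESS_KEY = "your-access-key"
  SECRET_KEY = "your-secret-key"

  resp = requests.post(
      "https://web.archive.org/save",
      headers={
          "Accept": "application/json",
          "Authorization": f"LOW {ACCESS_KEY}:{SECRET_KEY}",
      },
      data={"url": "https://nuclearweaponarchive.org"},
      timeout=60,
  )
  resp.raise_for_status()
  # The JSON response includes a job_id you can poll at /save/status/<job_id>.
  print(resp.json())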

Pick the items you want to mirror and seed them via their torrent file.

https://help.archive.org/help/archive-bittorrents/

https://github.com/jjjake/internetarchive

https://archive.org/services/docs/api/internetarchive/cli.ht...
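
As a sketch of the torrent route (the item identifier is a placeholder), the internetarchive library linked above can grab just the .torrent file, which you then seed with the BitTorrent client of your choice:

  from internetarchive import download

  # Fetch only the .torrent file for a (placeholder) item into ./mirror.
  download(
      "some-item-identifier",
      glob_pattern="*.torrent",
      destdir="mirror",
      verbose=True,
  )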

u/stavros wrote a design doc for a system (codename "Elephant") that would scale this up: https://news.ycombinator.com/item?id=45559219

(no affiliation, I am just a rando; if you are a library, museum, or similar institution, ask IA to drop some racks at your colo for replication, and as always, don't forget to donate to IA when you're able and be kind to their infrastructure)


There are real problems with the torrent files for collections. They are generated automatically when a collection is first created and uploaded, so they only include the files of the initial upload. For very large collections (100+ GB) it is common for a creator to add files in batches, but the torrent file is never regenerated, so downloading via the torrent yields only a small subset of the entire collection.

https://www.reddit.com/r/torrents/comments/vc0v08/question_a...

The solution is to use one of the several IA downloader scripts on GitHub, which download content via the collection's file list. I don't like direct downloading since I know it costs IA more, but torrents really aren't an option for some collections.
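
A sketch of that file-list approach with the internetarchive library (the collection ID is a placeholder): walking the collection via the metadata API picks up files added after the initial upload, unlike the stale torrent.

  from internetarchive import get_item, search_items

  # Enumerate every item in a (placeholder) collection and list its files.
  for result in search_items("collection:some-collection-id"):
      item = get_item(result["identifier"])
      for f in item.files:
          print(item.identifier, f["name"], f.get("size"))
      # item.download() would fetch everything; mind IA's bandwidth.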

Turns out, there are a lot of 500GB-2TB collections of ROMs/ISOs for video game consoles through the 7th and 8th generations available on the IA...


Is this something the Internet Archive could fix? I would have expected the torrent to get regenerated when an upload changes, maybe with some kind of 24-hour debounce.

"They're working on it." [1]

It sounds like they put a mechanism in place that stops incrementally regenerating large torrents after it caused massive slowdowns for them, and they haven't finished building something that fixes affected torrents automatically, but they will fix individual ones on demand for now.

[1] - https://www.reddit.com/r/theinternetarchive/comments/1ij8go9...
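
Purely for illustration, the 24-hour debounce floated above might look like this (an assumption about the design, not how IA actually implements it):

  import time

  DEBOUNCE_SECONDS = 24 * 60 * 60
  _last_change = {}  # item identifier -> timestamp of most recent upload

  def note_upload(identifier):
      """Record a change; torrent regeneration is deferred, not immediate."""
      _last_change[identifier] = time.time()

  def torrents_due_for_regen():
      """Items quiet for a full debounce window, safe to regenerate."""
      now = time.time()
      return [i for i, t in _last_change.items() if now - t >= DEBOUNCE_SECONDS]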


It is on my desk to fix this soon.
