IA needs to do what Usenet has done. Have a bunch of mission-aligned but unrelated orgs (under different ownership and distributed around the world) that peer with each other, distribute all the content obtained by any of the orgs to each other, but that have no technical channel nor capability to distribute DMCA complaints and takedown requests.
This is (AFAIK) basically how Usenet piracy works. You send your warez to one provider, and that provider instantly replicates them to all the providers they peer with, recursively, until they eventually reach the entire network. When any of those providers get a DMCA complaint, they remove the offending files (as they're required to do by law), but they don't inform other providers that they've received a DMCA notice, so those providers keep serving those files. This makes it much harder to remove data from the network than it is to add it.
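To make the asymmetry concrete, here is a minimal sketch of the flooding described above. The peering graph and provider names are made up for illustration, and real Usenet propagation (NNTP feeds, retention windows) is far more involved than this.

```python
from collections import deque

def flood(peers, origin):
    """Return every provider reached when `origin` accepts a post
    and each provider forwards it to all of its peers."""
    reached = {origin}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for peer in peers[node]:
            if peer not in reached:
                reached.add(peer)
                queue.append(peer)
    return reached

# Hypothetical peering graph (assumption, not real provider data).
peers = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}

# One upload to A reaches the whole network.
holders = flood(peers, "A")

# A takedown notice sent to C removes the file at C only,
# because nothing forwards the notice to C's peers.
holders.discard("C")

print(sorted(holders))
```

Adding takes one upload; removing takes one notice per provider.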
IMO personal security would only be improved if we diversified away from "the open web".
"Flood the field" with protocols and pre-shared key networks where we have to generate keys together in meat space, make it too expensive to operate the panopticon.
Everyone putting their eggs in the open web basket, gathering in that public commons means all it takes is one bomb on us all, so to speak.
BitTorrent allows untrusted users (read: industry plants) to connect and slurp down direct IP addresses to swarm participants. It's an unanswered legal question whether low-level uploading (such as the percentages one would get as a "leech", connecting to the torrent and then disconnecting immediately after completion) might fall under "fair use" or "fair dealing" statutes in various jurisdictions.
US-centric here: I feel that uploading a small percentage of a file as a condition of downloading the whole thing may very well fall under fair use - most BT traffic is noncommercial, the portion of the covered work uploaded by "leeches" is very small and probably would be covered by the "30-second" rule often quoted in fair use discussions. The only really arguable point is the "effect on the work's value", but then again an average leech is not uploading enough of the work to have that much of a material effect on the work's value.
In Germany at least, uploading even a single byte of content is illegal. We don't really have Fair Use here; there are only a few, very narrow exceptions.
It is also not even required to show that that single byte was uploaded, your IP getting logged as part of the swarm suffices. The burden of proof is on you now. It was much, much worse than in the US.
While all this is technically still true today, a new law a few years ago luckily mostly blocked the path. It was badly needed, because the situation was horribly abused by law firms.
> It is also not even required to show that that single byte was uploaded, your IP getting logged as part of the swarm suffices
What if someone would release software that would connect to random swarms and not upload or download anything? Would they still be criminally liable? You could disguise the purpose by saying it's measuring swarm diversity.
Of course telcos can choose to be involved, perhaps accept payment to look up and snitch, etc., but for the most part a number of ISPs in Au just wash their hands of devoting resources to play connect the dots for others.
Same in Japan. There's allegedly someone making big bucks going after bittorrent users, straining ISP abuse teams and judicial systems. Interesting that Germany has laws against that.
“Here’s byte 0x67, which is at offset 0x729B1A38 of Copyrighted_Blockbuster.4k.mkv, as requested” is different from “here’s byte 0x67, and it’s the first byte of my text response to your comment”.
There are only 3-4 providers because the system is spammed with hundreds of terabytes of new data per day by actors seeking to destroy it. They can't moderate the spam because the pirated data is all encrypted so indistinguishable from random data, and because moderation would destroy their pretense of not knowing what content is being posted.
The binary Usenet is the one that Internet Archive would be like. It receives hundreds of terabytes of new data every day. Most of it is just random bits designed to waste space on the providers.
Suppose you don't have ten hosts that each have 175PB of data but rather a million hosts that each have an average of 1.75TB, and therefore the equivalent of 10 full copies. And then something that periodically checks if there is any given subset of the data with too few copies and makes more.
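A rough sketch of that idea, under stated assumptions: the arithmetic uses the figures from the comment (decimal units, 1 PB = 1000 TB), while the `repair` function, the chunk names, and the target of three copies are hypothetical and only illustrate the "check and re-replicate" loop, not any system the Internet Archive actually runs.

```python
import random

# Figures from the comment above.
TB_PER_PB = 1000
full_copy_pb = 175
host_count = 1_000_000
avg_host_tb = 1.75

total_pb = host_count * avg_host_tb / TB_PER_PB
equivalent_copies = total_pb / full_copy_pb
print(total_pb, equivalent_copies)

def repair(placement, host_ids, target, rng):
    """Top up every chunk to `target` copies.

    placement maps chunk id -> set of host ids that hold it.
    Returns the list of (chunk, new_host) copies that were scheduled.
    """
    actions = []
    for chunk, holders in placement.items():
        # Only hosts that do not already hold the chunk are candidates.
        candidates = [h for h in host_ids if h not in holders]
        rng.shuffle(candidates)
        while len(holders) < target and candidates:
            new_host = candidates.pop()
            holders.add(new_host)
            actions.append((chunk, new_host))
    return actions

# Toy example: 20 hosts, two chunks, one of them under-replicated.
host_ids = list(range(20))
placement = {"chunk-0": {0, 1, 2}, "chunk-1": {3}}
actions = repair(placement, host_ids, target=3, rng=random.Random(0))
print(actions)
```

A periodic job would run `repair` against the current placement map; the hard parts in practice are knowing which hosts are still alive and verifying that they hold what they claim.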
> Internet Archive Switzerland joins a growing group of mission-aligned organizations, alongside Internet Archive, Internet Archive Canada, and Internet Archive Europe. Together, these independent libraries strengthen a shared vision: building a distributed, resilient digital library for the world.
I was interested in the others, but https://www.internetarchive.eu is a horrible corporate-looking site with a hero image, a boast about AI, a carousel of news that won't scroll without doing its slow scroll animation, a huge "meet the team" section with mugshots and boring profiles, social media links, a newsletter signup form, and nothing to say where the actual archive is.
Reading what little information they have there, they aren't a public facing or public serving organization. They seem to provide their services to institutions only:
"working with dozens of European libraries and government agencies to build web collections, Internet Archive Europe prioritized collaboration with cultural heritage organizations to safeguard our collective history."
They do exist and are involved in archiving. Someone reached out to our amateur radio club and offered to archive any documents we might have. They even asked to archive the video recording of one of our monthly meetings.
That website is really struggling. Very tempting to go to a mirror on archive.org to view it :)
This seems very distinct from Internet Archive in the US, I wonder how separate it is.
Internet Archive Canada (I worked there in 2024) operated like it was a subsidiary, even though I think it was technically an independent organization with some shared directors. Same Slack, same archive.org email domain, etc.
IA.ch has Brewster and Caslon on the board.
I suspect that for the political threats of the current decade the different Internet Archive organisations need to start operating more independently, especially when it comes to funding?
Can you share more about your time at the Canadian one? I feel like there was a big hullabaloo about it years ago, but it's not really clear what they do.
Not sure what hullabaloo -- they do provide a bunch of services to Canadian institutions (including Libraries and Archives Canada) and they perform physical services like book scanning and in the last few years I believe they are the parent organization for the physical Canadian datacentre _somewhere in BC_.
For my work, I worked in their Archiving & Data Services department, on https://archive-it.org/ -- I didn't know this before I joined, but Internet Archive offers various for-pay services to other cultural institutions, mostly around archiving their stuff or white-labelling playback of archives.
On one hand this is neat, as IA have expertise around this, but on the other hand (as a Canadian) I don't like that it's not actually sovereign and that it looks like it's run by our government but that it's not. Tradeoffs, I guess.
You can't register a .ch domain with fewer than 3 characters. It's showing as available because the thing that checks availability only looks at whether it's registered, not whether it's allowed.
> We are a team of change-makers who believe that every helping hand can raise a child and create a better future for them.
Which I found weird. And searching for this phrase yields many site-hits verbatim, which is even weirder. Anyone know what is up with that? Is it some kind of filler text?
Edit: I guess it's from a template, the Contact section is also mumbo-jumbo (address: 123 Fifth Avenue, NY and so on).
Huh. I can’t find the actual... archive. It mentions an AI archive less than 10 sentences in, and has a couple of links, but seems void of any actually archived content.
We’re just constantly in denial that the internet actually does the thing we want it to do.
The internet archive is an excellent demonstration of how to do it.
It’s primarily getting a ragtag group to pool resources and manage them and then gossip with other groups that are doing the same thing.
I’ve spent so much time around the archive that I plainly see a divide between internet people online that can’t connect the dots and internet people in real life that are confused as to why the dots aren’t connecting.
The easiest way to see the dots is to:
1. Stop trying to make money
2. Tally the things that cost money
3. Amortize the upkeep over time
E.g. where do we source resources from, where do we store resources and how do we secure them.
Like HTTP, but for physical materials, not digital.
None of those things help with the problem of centralization. Centralization isn't limited to moneymaking enterprises, or the modern internet. A centralized server operated by donations for free can just as easily go down, be seized by law enforcement, have its domain or internet service taken offline by government action, and so on.
The internet is not the thing we want (or not sufficient alone), because the internet's resources, and the communication systems between them, are largely centralized.
Yeah, them as a single instance is centralized, but if you actually go (show up at 300 Funston on a Friday at 1pm) you can hear about the research into how to replicate and become the resiliency in the network to make it decentralized.
A lot of it is ancient Unix philosophy like “this massive text file is a seekable index” and “rsync does basically most of the heavy lifting” and you’ll quickly realize decentralization is a social problem and not a technical one.
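As an illustration of the "massive text file is a seekable index" idea, here is a minimal sketch that binary-searches a sorted text file by byte offset instead of loading it. The `key value` line format, the function names, and the sample records are assumptions for the example, not the Archive's actual index format.

```python
import os
import tempfile

def line_at_or_after(f, pos):
    """Return the first full line that starts at byte offset >= pos."""
    if pos == 0:
        f.seek(0)
    else:
        f.seek(pos - 1)
        f.readline()  # skip the rest of the line containing byte pos - 1
    return f.readline()

def lookup(path, key):
    """Find `key` in a file of 'key value' lines sorted by key (byte order)."""
    target = key.encode()
    with open(path, "rb") as f:
        lo, hi = 0, os.path.getsize(path)
        while lo < hi:
            mid = (lo + hi) // 2
            line = line_at_or_after(f, mid)
            if not line or line.split(b" ", 1)[0] >= target:
                hi = mid
            else:
                lo = mid + 1
        line = line_at_or_after(f, lo)
    if line:
        found, _, value = line.rstrip(b"\n").partition(b" ")
        if found == target:
            return value.decode()
    return None

# Build a tiny sorted index (hypothetical records) and query it.
records = {
    "example.org/": "offset=0",
    "example.org/a": "offset=512",
    "example.org/b": "offset=2048",
}
with tempfile.NamedTemporaryFile("wb", suffix=".idx", delete=False) as tmp:
    for k in sorted(records):
        tmp.write(f"{k} {records[k]}\n".encode())

hit = lookup(tmp.name, "example.org/a")
miss = lookup(tmp.name, "example.org/missing")
os.remove(tmp.name)
print(hit, miss)
```

Each lookup touches only a logarithmic number of lines, so the same approach should work on an index far larger than memory, and the file itself can be copied around with rsync like any other.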
They’re shifting more and better data than the centralized services we’re complaining about— we need better education, not innovation at this current juncture.
The technology exists, the will of the people is lacking in spirit.
They've been constantly trying to set up P2P solutions. Torrents, DWEB, IPFS, Filecoin, WebTorrent, YJS, whole bunch of tech acronyms. I'm not sure much of it has really caught on?
St Gallen has been archiving knowledge for over a thousand years. Now they are archiving AI models before they get retrained out of existence. The location is not a coincidence…
Typical for something made in St. Gallen. A sensible web developer from Zurich interested in the topic would have created this website with just a single HTML file and an optional CSS file.
a dev from ZH would've added a blockchain, mobile app and hosted it on an over-allocated kubernetes cluster. 97% uptime and you need a macbook pro so the website doesn't stutter.
Normally we'd reply with "please don't do regional flamewar on HN" but this sounds so good-humored to me that I've canceled the (no doubt well-intentioned) downvotes instead.
Edit: now someone is going to tell me how mean internecine Swiss conflict actually is...
I've previously contributed to the IA, and what I see here is some clout-chasing both the IA and AI, with a badly made and run WordPress website. St. Gallen is mostly known for having a business school, and the behavior fits the stereotype. The good thing is, it's out of the way from people actually getting stuff done.
"Made in St. Gallen" is at the bottom right on that website like a badge of honor, but professionally obscured by the back-to-top button.
Anyone who actually wants to contribute to the cause should donate to the IA or provide infrastructure, not words.
As someone who lives in Switzerland, but is not Swiss, I love this kind of thing. It’s an insight into an internal cultural understanding I didn’t get growing up and doesn’t really come up in the conversations I have day to day.
I might be overlooking something, but is a mirror of the Internet Archive even mentioned as a plan anywhere here? It was my first thought after reading the headline, too, but the website only speaks of archiving LLMs and, vaguely, some other collections, but not, for instance, the Wayback Machine.
> But about time the Internet Archive had a US-independent backup.
Agreed!
> The Internet Archive Switzerland, online at https://internetarchive.ch/, is a newly-formed Swiss non-profit foundation that will operate independently within its national context.
Anything that is being built today, based on assumptions about the future that extend over multiple years, is bound to fade away. Because "the future is no longer what it used to be". What's the envisaged future context and purpose where this would save the world?
Ideally in English, but anything is translatable.
What if someone would release software that would connect to random swarms and not upload or download anything? Would they still be criminally liable? You could disguise the purpose by saying it's measuring swarm diversity.
https://en.wikipedia.org/wiki/Roadshow_Films_Pty_Ltd_v_iiNet...
Of course telcos can choose to be involved, perhaps accept payment to look up and snitch, etc., but for the most part a number of ISPs in Au just wash their hands of devoting resources to play connect the dots for others.
“Here’s byte 0x67, which is at offset 0x729B1A38 of Copyrighted_Blockbuster.4k.mkv, as requested” is different from “here’s byte 0x67, and it’s the first byte of my text response to your comment”.
I heard a rumour that this byte also exists in the Legend of Zelda! Now go get 'em, Mr Policeman!
Softlink data to the appropriate mount
The options are endless and tech nerds can 1:1 help friends and family
Locking the knowledge into corporate silos is a huge security risk. The masses should be just as competent and informed so they don't panic
Minority say over the economy and government is just fascism. These people are not deities. They're normal meat and bone
We have processes to replace politicians and workers; we need processes to replace the rich.
Free speech is a circular right and there is no freedom from consequences of speech. They can face consequences too
It’s centralised in the way you describe now that it’s only used for large files / piracy, but it used to be much more diverse.
In a best-case scenario, this eventually becomes the replacement for the (let's be honest) absurdly awful archive.org front and backend.
So: an expansion into the EU market. And yes, a honeypot for grant funds, because why not? Good for them.
The Slack has (had?) hundreds of guest accounts due to volunteers and allied organizations. It’s an interesting (and cool) institution!
For my work, I worked in their Archiving & Data Services department, on https://archive-it.org/ -- I didn't know this before I joined, but Internet Archive offers various for-pay services to other cultural institutions, mostly around archiving their stuff or white-labelling playback of archives.
For example https://webarchiveweb.bac-lac.canada.ca/ (the Government of Canada's own Internet Archive) is actually outsourced to ADS within Internet Archive.
On one hand this is neat, as IA have expertise around this, but on the other hand (as a Canadian) I don't like that it's not actually sovereign and that it looks like it's run by our government but that it's not. Tradeoffs, I guess.
I am not saying the user in question is malicious. I am sorry to repeat myself, but URLs don't admit abbreviations
I hate this fucking website sometimes.
Good point we shouldn’t abbreviate URLs in case they get typosquatted? Just raised in a very indirect fashion
I've noticed that this domain now hosts content subject to copyright.
As an example: entire seasons of Star Trek: Voyager are randomly hosted there as direct downloads.
Why? Is that not a liability?
The one in Egypt doesn't get updated.
If tpb dot org can still exist ...
At least these people tried. We need a p2p archive solution ASAP. Before our history is entirely re-written.
No one has cracked this one yet.
The internet itself is the thing we want.
That’s the crucial social layer that powers all of the everything else on the decentralized internet.
Take git as a social platform.
SSH is the social protocol.
GitHub centralized most of the git+ssh net, but that was a choice and we use all these other git+ssh services to not give them a monopoly.
https://blog.archive.org/tag/decentralized-web/
https://github.com/internetarchive/dweb-transports
Third-party attempt:
https://wiki.archiveteam.org/index.php/INTERNETARCHIVE.BAK
Turns out it's hard! Or maybe just too niche. But you can also help them today, by seeding some of the collections that are available as torrents.
I have neither the technical nor financial abilities to address this problem.
However, as one of the greatest technical collectives of all time, the users of this website might be capable of doing such a thing.
This is likely the greatest challenge of our time.
I don't understand what this means?
Edit: now someone is going to tell me how mean internecine Swiss conflict actually is...
At the time of my comment, the linked URL was https://internetarchive.ch/.
Why would they want to collect the AI wave?!
But about time the Internet Archive had a US-independent backup.
I think the Wikipedia Editors will have to decide whether they will add it to the existing page. The Operations section is still listing only U.S. data centers: https://en.wikipedia.org/wiki/Internet_Archive#Operations
Isn't this a nightmare for privacy?