[Architecture] Hub-Spoke model for federation? · Issue #3245 · LemmyNet/lemmy
Question One of the biggest issues I am currently seeing with Lemmy, is federation. Either... Federation between instances having issues. Federation is backed up. Federation / Syncing is not scalin...

https://github.com/LemmyNet/lemmy/issues/3245

I posted far more details in the issue than I am putting here.

But, just to bring some math in: with the current full-mesh federation model, assuming 10,000 instances, that will require nearly 50 million connections (n*(n-1)/2 = 10,000 * 9,999 / 2 = 49,995,000).

Each comment, each vote, each post will have to be sent 50 million separate times.

In the proposed hub-spoke model, we can reduce that by over 99%, so that each post/vote/comment/etc. only has to be sent 10,000 times (plus n*(n-1)/2 times, where n = number of hub servers).
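To make the arithmetic concrete, here is a rough sketch of the link counts behind those figures. It assumes every instance attaches to exactly one hub and that the hubs are fully meshed with each other; the function names and the hub counts tried at the bottom are made up for illustration, not taken from the issue.

```python
def full_mesh_links(n: int) -> int:
    """Pairwise links when every instance talks to every other instance."""
    return n * (n - 1) // 2


def hub_spoke_links(n: int, hubs: int) -> int:
    """One spoke link per instance, plus a full mesh between the hubs.

    Assumption: each instance attaches to exactly one hub.
    """
    return n + hubs * (hubs - 1) // 2


if __name__ == "__main__":
    n = 10_000
    mesh = full_mesh_links(n)
    print(f"full mesh: {mesh:,} links")
    for hubs in (10, 100, 500):  # hypothetical hub counts
        hs = hub_spoke_links(n, hubs)
        saving = 100 * (1 - hs / mesh)
        print(f"{hubs} hubs: {hs:,} links ({saving:.2f}% fewer)")
```

These are link counts only. How many deliveries a single post actually needs also depends on which instances subscribe to the community, which this sketch ignores.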

The current full-mesh architecture will not scale. I predict exponential growth will continue to occur.

Let’s work on a solution to this problem together.

HTTP_404_NotFound (creator)

My logic was to move the load away from the primary instance server, onto a service/server that only focuses on handling federation duties.

My reasoning is to break apart the two workloads, and hopefully build a more scalable federation tier that can scale independently of the primary instance server.

@andscape@feddit.it

I understand the logic, and you’re right to think about how to improve Lemmy’s scalability. But I’m not sure if this is the way to go.

If you build a dedicated federation proxy for an instance, you’ve really just slightly moved the problem. The federation proxy is going to have the same scalability issues, and if anything the total load goes up.

If you build multi-instance hubs, you suddenly introduce a lot of new issues.

  • Security: I think Lemmy checks the source of an update to verify that it comes from the legitimate host. You would have to introduce some kind of signatures to verify that the activity originated from the legitimate host.
  • Privacy: now your users have to trust the hub owners with their data, not just the instance.
  • Motive: who would be running the hubs, and why? They would have to be even bigger than the instances, and there would be much less incentive to do it.

HTTP_404_NotFound (creator)

I would agree with all of your above points.

That said, the most recent idea in https://github.com/LemmyNet/lemmy/issues/3245 has popped up a few times on both Lemmy and now GitHub, and actually sounds like a potential solution as well.

It just uses signed messages between instances, which can be transmitted P2P instead of only directly.
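As a very rough sketch of that idea: the origin attaches an authentication tag to each activity, any peer can forward the envelope untouched, and the final receiver checks the tag instead of checking who delivered it. To stay within the Python standard library this uses HMAC with a shared secret as a stand-in; a real design would need public-key signatures so that receivers can verify without sharing a secret with the origin. The activity fields and function names are hypothetical.

```python
import hashlib
import hmac
import json


def sign_activity(activity: dict, key: bytes) -> dict:
    """Wrap an activity in an envelope carrying an authentication tag."""
    payload = json.dumps(activity, sort_keys=True).encode("utf-8")
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"payload": payload.decode("utf-8"), "tag": tag}


def verify_activity(envelope: dict, key: bytes) -> bool:
    """Recompute the tag over the payload and compare in constant time."""
    payload = envelope["payload"].encode("utf-8")
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, envelope["tag"])


def relay(envelope: dict) -> dict:
    """A hypothetical intermediate peer: forwards the envelope unchanged."""
    return dict(envelope)


if __name__ == "__main__":
    key = b"demo-secret"  # assumption: shared by origin and receiver only
    activity = {"type": "Like", "actor": "alice@a.example", "object": "post/1"}
    envelope = relay(sign_activity(activity, key))
    print("untampered:", verify_activity(envelope, key))

    tampered = dict(envelope, payload=envelope["payload"].replace("Like", "Dislike"))
    print("tampered:", verify_activity(tampered, key))
```

The relay never needs the key, so it can forward messages but cannot alter them without the receiver noticing.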

@andscape@feddit.it

Yeah, what’s being described there is basically a P2P model. I still think it wouldn’t make a huge difference in the chattiness of the protocol. At best it would redistribute the load for outgoing federation messages, but not for incoming ones. An instance still has to receive each message individually, regardless of where it comes from.
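A toy count illustrates the point. It compares direct delivery with a relay tree in which each instance forwards a message to a fixed number of instances that have not been reached yet. The fanout value and the tree-shaped relay are assumptions for illustration, not anything specified in the issue.

```python
from collections import deque


def direct_delivery(n: int) -> tuple[int, int, int]:
    """Origin sends to every other instance itself.

    Returns (sends by origin, max receives per instance, total messages).
    """
    return n - 1, 1, n - 1


def relayed_delivery(n: int, fanout: int) -> tuple[int, int, int]:
    """Each instance forwards to at most `fanout` not-yet-reached instances.

    Returns the same triple as direct_delivery.
    """
    received = [0] * n
    sent = [0] * n
    queue = deque([0])  # instance 0 is the origin
    next_unreached = 1
    while queue and next_unreached < n:
        sender = queue.popleft()
        for _ in range(fanout):
            if next_unreached >= n:
                break
            target = next_unreached
            next_unreached += 1
            sent[sender] += 1
            received[target] += 1
            queue.append(target)
    return sent[0], max(received), sum(sent)


if __name__ == "__main__":
    print("direct :", direct_delivery(10_000))
    print("relayed:", relayed_delivery(10_000, fanout=10))
```

With 10,000 instances and a fanout of 10, the origin's outgoing sends drop from 9,999 to 10, but every instance still receives one copy of the message and the network as a whole still carries 9,999 messages.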
