For example, if I (on kbin.run - which is Mbin, but for the purposes of this let’s just assume it’s Kbin) go to a random magazine on kbin.social, I will often see a prompt that the magazine may be incomplete and that I should visit the original instance for all the content.

Why doesn’t the request to that magazine automatically trigger a “pull” from that instance for that magazine, or at least cause it to check if the number of threads is the same (and conditionally pull on that)? I would think by pulling the changes then, magazines would never be out-of-date.

I get that it would put a much heavier load on the servers, but combined with good caching (maybe waiting a day or so before the next pull, idk) I feel like that could be mitigated.
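To make the idea concrete, here is a rough sketch of what I mean (not how Kbin or Mbin actually work). It only pulls when the cached copy is older than a TTL and the remote thread count differs from the local one. `fetch_remote_count` and `pull_threads` are hypothetical callables standing in for whatever network calls an instance would make, and the one-day TTL is just the figure I floated above.

```python
import time
from dataclasses import dataclass, field
from typing import List

ONE_DAY = 24 * 60 * 60  # TTL in seconds; the "1 day" figure is only a guess


@dataclass
class MagazineCache:
    threads: List[str] = field(default_factory=list)
    last_checked: float = float("-inf")  # never checked yet


def maybe_pull(cache, fetch_remote_count, pull_threads, now=None, ttl=ONE_DAY):
    """Refresh `cache` at most once per `ttl`, and only if counts differ.

    `fetch_remote_count` and `pull_threads` are hypothetical callables.
    Returns True if a pull happened.
    """
    now = time.time() if now is None else now
    if now - cache.last_checked < ttl:
        return False  # checked recently: serve the cached copy
    cache.last_checked = now
    if fetch_remote_count() == len(cache.threads):
        return False  # same number of threads: assume nothing is missing
    cache.threads = pull_threads()
    return True
```

Comparing counts is a weak check (one deleted thread plus one new thread cancel out), so it would only be a cheap hint before doing the expensive pull.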

Is this maybe an implementation detail of ActivityPub?

Thank you!

  • Skull giver@popplesburger.hilciferous.nl
    7 upvotes · 10 months ago

    ActivityPub is almost exclusively push-based. There are APIs for retrieving content, of course, but those aren’t meant to be the primary method of federation. Kbin can expose a pull API of its own, of course, but other servers that host objects that may be represented as magazines won’t expose that API.

    Fediverse servers sometimes lose connectivity as well, for example when another server is under DDoS attack and the ActivityPub endpoints get shut down. That means the code still needs to be designed to deal with the occasional out-of-dateness.

    With the size of some magazines, the sync process can involve hundreds or thousands of objects every hour. After all, every vote is a federated ActivityPub object. To prevent abuse, any receiving server would also need to verify all of those objects’ signatures (so a server cannot pretend to be super popular as easily). With a couple hundred Kbin servers, that can be quite a big load compared to the push-based system ActivityPub is built around.

    This problem is particularly annoying on Mastodon, where the sync process is almost entirely broken. Very few servers see all reactions under a post, and tools like FediFetcher do exactly what you propose Kbin should do. In my experience this adds quite significant load to Mastodon, because every incoming message needs a bunch of fetches, and the more incoming messages you get, the worse the problem becomes.

    You can probably write a tool to make Kbin sync in the same fashion, but I’m not sure if it’ll be taken up.

  • Rimu@piefed.social
    7 upvotes · edited · 10 months ago

    There is no good reason for it. Just a choice by the coder of kbin.

    PieFed retrieves the last 50 posts when a Lemmy community is added for the first time. It only takes a few seconds because all 50 posts can be retrieved at once by making a GET request to the community outbox. It doesn’t do this for Kbin magazines because the Kbin developer chose not to make an outbox.
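For illustration, reading an outbox looks roughly like the sketch below. This is not PieFed's actual code: the function names are made up, and the assumption that posts arrive wrapped in activities (for example an `Announce` containing a `Create` containing the post) is a guess about what an outbox may return, so the unwrapping is deliberately generic.

```python
import json
import urllib.request

ACTIVITY_JSON = "application/activity+json"


def build_outbox_request(outbox_url):
    """Build (but do not send) a GET request asking for ActivityPub JSON."""
    return urllib.request.Request(outbox_url, headers={"Accept": ACTIVITY_JSON})


def unwrap(item):
    """Follow nested `object` fields until we reach the innermost object."""
    while isinstance(item, dict) and isinstance(item.get("object"), dict):
        item = item["object"]
    return item


def posts_from_outbox(collection, limit=50):
    """Return up to `limit` innermost objects from an OrderedCollection dict."""
    items = collection.get("orderedItems", [])
    return [unwrap(i) for i in items[:limit]]


def fetch_posts(outbox_url, limit=50):
    """Perform the GET and parse the response (needs network access)."""
    with urllib.request.urlopen(build_outbox_request(outbox_url)) as resp:
        return posts_from_outbox(json.load(resp), limit)
```

The point is that it is a single request per community, which is why the initial backfill only takes a few seconds.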

  • NaN@lemmy.sdf.org
    English · 4 upvotes · edited · 10 months ago

    I don’t think the issue is usually that content is out of date, but rather that history is missing. Generally, ActivityPub instances don’t receive historical data; they only receive data from the point someone subscribes, so your local instance may be missing old threads and comments that exist on the hosting instance. If nobody is subscribed, though, your instance may not be getting new content either.

    That’s one reason there are tools available like bots that “mass subscribe” to content on various instances when you set up a new one, otherwise it will stay pretty empty.
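A toy model of that gap, purely for illustration: it assumes delivery is push-only and ignores anything an instance might fetch on demand. `locally_known` is a made-up helper, not part of any real server.

```python
from datetime import datetime


def locally_known(threads, first_subscribed_at):
    """Threads a subscriber instance would hold under push-only federation.

    `threads` is a list of (title, published) pairs. Anything published
    before the first local subscription was never pushed to us.
    """
    if first_subscribed_at is None:
        return []  # nobody subscribed: nothing is delivered at all
    return [title for title, published in threads
            if published >= first_subscribed_at]


threads = [
    ("old thread", datetime(2023, 6, 1)),
    ("new thread", datetime(2023, 9, 1)),
]
print(locally_known(threads, datetime(2023, 8, 1)))  # only the newer thread
print(locally_known(threads, None))                  # empty: no subscribers
```

The earlier someone (or a bot) subscribes, the less history is missing, which is what the mass-subscribe tools exploit.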