There’s been a lot of talk about Meta’s new Twitter clone, Threads, because it will federate with other ActivityPub apps. I’ve seen several posts about Meta possibly using the app as a way to embrace, extend, and extinguish ActivityPub.

Another, more immediate concern I have is that Meta will now be able to harvest data from users of other ActivityPub social networks like Lemmy and Mastodon. If Alice on Threads follows Bob on Mastodon, for example, that means Bob’s Mastodon instance will send information about all of Bob’s posts and everyone who interacts with them to Meta so that Alice can see it.
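
As a rough illustration of the Alice and Bob example, here is a minimal Python sketch of how a server might address one public post and pick the inboxes to deliver it to. The actor URLs, the inbox URLs, and the `build_create_activity` and `delivery_targets` helpers are all hypothetical. The `Create`/`Note` shape and the public-collection address follow the ActivityStreams vocabulary that ActivityPub uses, but a real server such as Mastodon does considerably more (request signing, retries, and so on).

```python
# Special address that marks an ActivityPub object as public.
PUBLIC = "https://www.w3.org/ns/activitystreams#Public"


def build_create_activity(actor, note_id, content):
    """Wrap a public Note in a Create activity addressed to the actor's followers."""
    followers = f"{actor}/followers"  # hypothetical followers collection URL
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Create",
        "actor": actor,
        "to": [PUBLIC],
        "cc": [followers],
        "object": {
            "id": note_id,
            "type": "Note",
            "attributedTo": actor,
            "content": content,
            "to": [PUBLIC],
            "cc": [followers],
        },
    }


def delivery_targets(follower_inboxes):
    """Return the distinct inbox URLs that should receive the activity."""
    return sorted(set(follower_inboxes.values()))


# Hypothetical actors: two followers on one server, one on another.
bob = "https://mastodon.example/users/bob"
follower_inboxes = {
    "https://threads.example/users/alice": "https://threads.example/inbox",
    "https://threads.example/users/carol": "https://threads.example/inbox",
    "https://lemmy.example/u/dave": "https://lemmy.example/inbox",
}

activity = build_create_activity(bob, f"{bob}/statuses/1", "Hello, fediverse")
targets = delivery_targets(follower_inboxes)
print(targets)
```

In this sketch a single follower on a remote server is enough for that server to receive a copy of the post, which is the situation the example with Alice and Bob describes.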

This is a concern specifically with Meta and other big tech companies running ActivityPub-enabled servers, because their primary motive is to harvest user data to use for advertising. The scariest part to me is that users on networks like Mastodon specifically migrated to Mastodon to get away from big tech, and Meta is still able to harvest their data with Threads.

  • oatmilkmaid@possumpat.io
    46 upvotes · 1 year ago (edited)

    Meta doesn’t need threads.net to harvest data from ActivityPub instances. In fact, literally anyone can do it without much effort. You’re not protecting your data by being here. It’s as public as old phpBB forums used to be, even more so. That bit is a non-issue (well, it is an issue…)

    My issue is just with the quality of content from Meta-owned instances and what it’ll bring to other federated instances.

    • fubo@lemmy.world
      27 upvotes · 1 year ago

      “Protecting your data” is probably a nonsense thing to say. You can’t at the same time make a post public and “protect” it from someone copying, reading, or learning from it.

      • jecxjo@midwest.social
        7 upvotes · 1 year ago

        It is theoretically possible to publish in public while protecting information but it requires the added step of maintaining authorization for those you wish to see that data.

        You could create a protocol where poster info is encrypted and only the appropriate parties are given the keys to access that information. To the general public it’s posted by an anonymous user.

        The question to ask is who really cares that much about your posts? Use a VPN and make an alt account and Bob’s your Uncle.

        • orclev@lemmy.world
          3 upvotes · 1 year ago

          There’s also the question of the value of that metadata. If something like an email address or phone number was tied to your account in a publicly available fashion, that would be one thing, as those are relatively unique and can be used to tie back to an individual person. But if all you’re getting is a username with nothing else attached, that’s a lot more nebulous. If you care to obfuscate your activity, you could easily just create an account with a name you’ve never used before. That’s probably far more effective at preventing your “data” from being harvested by Meta than some kind of encryption scheme would be, unless you’re willing to go all in on E2E encryption, which creates a ton of other problems that would need to be solved.

          • jecxjo@midwest.social
            2 upvotes · 1 year ago

            Exactly. But you can now get burner numbers much more easily, so obfuscating the metadata outside the service is getting easier. Not perfect, but easier.

            If you are doing things where you need full-on clean-room-level obfuscation, you can never use any type of service outside of your full control. At that point you’re talking one-time pads and posting messages in the want ads of newspapers.

    • DoucheAsaurus@kbin.social
      1 upvote · 1 year ago (edited)

      I’ve always assumed every part of the internet to be tracked and logged somehow. What I take issue with is the use of that data to shove targeted ads down my throat at every opportunity.

  • Boiglenoight@lemmy.world
    47 upvotes, 2 downvotes · 1 year ago

    I don’t care about data harvesting. This stuff’s public.

    What I don’t want is for the community I joined to be impacted by a larger community filled with bigots, fascists, the hateful. That’s going to be my cue to exit. So I hope that the community I’m a part of values me and others like me more than tolerating the above for the sake of growth or what have you.

  • Muddybulldog@mylemmy.win
    20 upvotes · 1 year ago

    If Meta, Reddit, Twitter, etc. aren’t already harvesting all of the fediverse, it’s only because they don’t see it as big enough to be of value.

    It’s a trivial task.

  • Tosti@feddit.nl
    10 upvotes · 1 year ago

    Since pubfed is open, it will be practically impossible to stop outside actors from harvesting the content. All they would need to do is deploy a custom instance, and that way they would have access.

    I think this will be an interesting struggle as it evolves, protecting against a giant like Meta who won’t hesitate to spend serious money if it suits them.

    • Ferk@kbin.social
      2 upvotes · 1 year ago (edited)

      Or simply query the API from an existing instance, like third party clients do. Though I suppose then it’d have to account for rate limits.

  • red@feddit.de
    6 upvotes · 1 year ago (edited)

    Anyone can set up a Lemmy instance, write a small script/bot to find and follow all the communities on all the instances in the Fediverse and store all that data. It’s not even hard, maybe a day of work for a proof of concept if you start from zero. (Then you have to figure out how to scale it properly, how to detect you’re getting defederated and how to change domains to restart without the defederations. Maybe a week’s worth of effort.)

    Threads would be way overkill to achieve this goal. You don’t need any users. You don’t want any users. Just your one account that follows everything.

    Edit: or you can just set up a web crawler like the one Google Search uses to find and store all the data you’re looking for. You don’t necessarily need to be federated or use ActivityPub.

  • zalack@kbin.social
    2 upvotes · 1 year ago

    I think it would be interesting to explore an API affordance for attaching licenses to fediverse content, with the admins being able to set a default license for their server.

    So if Meta wants to get content from the fediverse, it has to check the headers of each post and make sure it’s licensed for commercial use.
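
The license idea in the last comment could look something like the following Python sketch. Everything here is an assumption made for illustration: the `license` field on a post, the server-wide default, and the helper names are hypothetical, and nothing in the thread says ActivityPub carries such a field today. License names use SPDX-style identifiers, and anything unlabelled or unrecognised is treated as not cleared for commercial use.

```python
# Licenses that permit commercial reuse (a deliberately small, illustrative set).
COMMERCIAL_OK = {"CC0-1.0", "CC-BY-4.0", "CC-BY-SA-4.0"}


def effective_license(post, server_default=None):
    """Use the post's own license if present, otherwise the admin's server default."""
    return post.get("license") or server_default


def allows_commercial_use(post, server_default=None):
    """Return True only if the effective license is known to allow commercial use."""
    return effective_license(post, server_default) in COMMERCIAL_OK


# Hypothetical posts: one permissive, one non-commercial, one with no license set.
posts = [
    {"id": 1, "license": "CC-BY-4.0"},
    {"id": 2, "license": "CC-BY-NC-4.0"},
    {"id": 3},
]

# The admin's default here is non-commercial, so post 3 inherits that restriction.
usable = [
    p["id"]
    for p in posts
    if allows_commercial_use(p, server_default="CC-BY-NC-SA-4.0")
]
print(usable)
```

A check like this only works if the party consuming the content chooses to honor it; the code cannot enforce the license by itself.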