I’m really enjoying Lemmy. I think we’ve got some growing pains in UI/UX, and we’re missing some key features (like community migration and actual redundancy). But how are we going to collectively pay for this? I saw an (unverified) post claiming that Reddit received $400M from ads last year. Lemmy isn’t going to be free. Can someone with actual server experience chime in with some back-of-the-napkin math on how expensive it would be if everyone migrated from Reddit?

  • SalamanderA · 11 up, 1 down · 2 years ago

    > I think this underestimates how users will naturally gravitate towards more centralized instances, or they’ll give up because the bigger instances are closed.

    (This is purely my personal opinion, of course!) In a scenario in which a few large instances dominate, the idea of the fediverse has failed. One may estimate the likelihood of success or failure based on how one expects humans to behave, but in the end experiment beats theory. I think that for the fediverse to work, a significant cultural shift has to occur, but I don’t think that it is an impossible shift. I would like the fediverse to succeed, and so I choose to take part in the experiment.

    > This also ignores that the system isn’t horizontally scalable at all, so scaling up gets even more expensive.

    Yes, that might cause some serious issues. The project is still in an early-development phase, and I don’t understand the technical aspects well enough yet to be able to identify whether there is a fundamentally insurmountable barrier when it comes to scalability. My optimistic hope is that the developers are able to optimize horizontal scalability fast enough to meet the demand for scale. If it turns out to be impossible to scale, then only sufficiently rich parties would be able to run viable instances, and that could be a reason for failure.

      • SalamanderA · 11 up · 2 years ago

        This is what I think, but if anyone understands it differently please correct me.

        Vertical scalability refers to scaling within a single instance. More users join and they post more content, increasing the disk space needed to store that content, the network bandwidth needed to handle many users downloading comments and images at once, and the processing power required.

        Horizontal scaling refers to the lemmyverse growing through the addition of new instances. The problem with this form of scaling comes from the resources that an instance has to spend on its interactions with other instances. You may create a small instance without a lot of users, but the instance might still need a lot of resources if it attempts to retrieve a lot of information (posts, comments, user information, etc.) from other, larger instances. For example, at some point a community on lemmy.ml might be so popular that subscribing to it from a small instance would be too much of a burden, because of the storage required to save the constant stream of new posts. Horizontal scaling becomes a problem when the lemmyverse grows so large that a machine with only a small amount of resources can no longer be part of it, because its storage fills up in a few hours or days.
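
        As a rough illustration of the fill-up scenario described above, here is a minimal Python sketch. The function name `days_until_full` and every number in it are made up for illustration and are not measurements of any real Lemmy instance; it also assumes the small instance keeps everything it receives.

```python
def days_until_full(free_disk_gb: float, posts_per_day: float, avg_post_kb: float) -> float:
    """Days before inbound content from followed communities fills the free disk."""
    daily_gb = posts_per_day * avg_post_kb / 1_000_000  # KB -> GB (decimal units)
    if daily_gb <= 0:
        raise ValueError("inbound volume must be positive")
    return free_disk_gb / daily_gb


# Hypothetical inputs, chosen only to show the shape of the estimate.
text_only = days_until_full(free_disk_gb=20, posts_per_day=50_000, avg_post_kb=2)
with_media = days_until_full(free_disk_gb=20, posts_per_day=50_000, avg_post_kb=400)
print(f"text only: {text_only:.0f} days, with cached media: {with_media:.1f} days")
```

        With these invented inputs the average item size dominates the result: the same post rate lasts months as text but about a day once media is stored alongside it.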

        • Jeremy [Iowa]@midwest.social · 10 up · 2 years ago

          You can summarize by thinking of vertical scaling as “make machine bigger / more powerful” with horizontal scaling as “make more machines”.

        • honk@feddit.de · 4 up · 2 years ago

          I don’t believe this is how it works though.

          Let’s say your tiny 3-person instance is connected to a big one. I believe it only pulls in content from the communities somebody from the small instance is subscribed to. Correct me if I’m wrong.

          • panoptic@fedia.io · 4 up · 2 years ago

            That’s what they’re saying.

            Essentially - if someone from the small instance subscribes to a community that has a ton of data (huge post volume, images, whatever), the small instance needs to pull data over from the larger instance. At some point there may be communities that are so large that small instances can’t pull them in without tanking.

            • Silviecat44@vlemmy.net · 2 up · edited · 2 years ago

              I wonder if there is a way to get around this? Maybe smaller instances will have to be text-only?

              • panoptic@fedia.io · 2 up · 2 years ago

                If I’m reading the protocol right, it’s probably larger instances that will avoid more duplication, since:

                1. There’s a higher chance they’re going to have more communities shared among users (really tiny instances will probably see a lot of overlap, since their users likely have interconnected interests; I expect that overlap to fall off quickly as an instance grows, then converge again at scale).
                2. The larger number of users will mean they ‘use’ more of the content they’re pulling down (I can’t read all of a highly active community in a day, but 1000 people together checking through the day might ‘use’ it all).

                I’m not sure I see where you see caching fitting in.
                I am surprised I don’t see some kind of lower-resolution digest concept in the protocol (which might be what you’re looking for).

            • honk@feddit.de · 0 up · 2 years ago

              Maybe I phrased that poorly and you didn’t understand what I was trying to say. The size of the bigger instance shouldn’t matter at all, because the only data pulled is from communities that a member of the smaller instance is subscribed to. So whether the bigger instance has 1,000 members or 2 million members wouldn’t make a difference. The only thing relevant should be how active the communities are that members are subscribed to.
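
              A small Python sketch of this argument, under the assumption (not checked against the Lemmy source) that an instance receives each followed community's activity once, no matter how many local users follow it. The user names, community names, and activity counts are hypothetical.

```python
from typing import Dict, Set


def inbound_activities_per_day(
    local_subscriptions: Dict[str, Set[str]],
    community_activity: Dict[str, int],
) -> int:
    """Sum daily activity over the distinct communities that local users follow.

    The remote instance's member count is deliberately not an input.
    """
    followed = set().union(*local_subscriptions.values())
    return sum(community_activity[name] for name in followed)


# Hypothetical 3-person instance following communities on one big instance.
subscriptions = {
    "alice": {"memes@big.example", "linux@big.example"},
    "bob": {"memes@big.example"},
    "carol": {"linux@big.example"},
}
activity = {
    "memes@big.example": 40_000,
    "linux@big.example": 3_000,
    "knitting@big.example": 200,  # nobody local follows this, so it costs nothing
}
load = inbound_activities_per_day(subscriptions, activity)
print(load)
```

              In this model the load changes only when a followed community gets busier or a new community is followed; the big instance could have any number of members.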

              • panoptic@fedia.io · 1 up · 2 years ago

                Sure, the size of the communities is what matters (multiplied by the number of communities that users on the server care about).
                I think most of us are assuming larger instances are more likely to host the larger communities.

                Actually, if I’m reading the protocol right, it’d be hard for a small server to host a highly active community anyway (for some value of highly active). So yes, some 2 person instance that was created to offload stuff could be the primary host for a massive community, but in practice it won’t.

                • honk@feddit.de · 1 up · 2 years ago

                  We are arguing about very specific things here anyway. And I generally do share your concerns about how well this is going to scale. I want this to do well.

          • flambonkscious@sh.itjust.works · 1 up · 2 years ago

            That’s what I’ve gathered, but I don’t believe there’s a way for instance owners to limit what’s fetched - a user crafts the query and the server does the needful.

            I imagine this could amount to a denial of service attack of sorts, if some high-churn communities are imported into tiny instances. How bad that could be, I have no idea - I’m speaking pretty theoretically, here. Text is tiny, after all, so it’s probably not much of a concern, since most of the media is actually handled elsewhere…

            • honk@feddit.de · 2 up · 2 years ago

              I’m not a web developer. I’m sort of a sysadmin, so I have some experience maintaining machines for web apps for other people. And you are right… text will not create massive amounts of data. But a lot of tiny transactions can bring down machines surprisingly fast, even if the total amount of data is relatively small.

              I guess we are here to experience it first hand. I don’t think anybody…not even the developers have a clear idea of how well this will scale. There is only one way to find out lol
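
              To put rough numbers on the "many tiny transactions" point, here is a minimal Python sketch. It assumes one inbound request per federated activity, and the rate and payload size are invented for illustration, not taken from any real instance.

```python
from typing import NamedTuple


class StreamProfile(NamedTuple):
    megabits_per_sec: float
    requests_per_day: float


def profile_stream(activities_per_sec: float, avg_activity_bytes: int) -> StreamProfile:
    """Compare the bandwidth of an activity stream with its request count."""
    megabits = activities_per_sec * avg_activity_bytes * 8 / 1_000_000
    requests = activities_per_sec * 86_400  # seconds in a day
    return StreamProfile(megabits, requests)


# Hypothetical stream: small text payloads arriving at a steady rate.
busy = profile_stream(activities_per_sec=200, avg_activity_bytes=2_000)
print(f"{busy.megabits_per_sec:.1f} Mbit/s, {busy.requests_per_day:,.0f} requests/day")
```

              The bandwidth figure stays modest while the request count runs into the millions per day, and each request carries fixed work (parsing, validation, a database write) that does not shrink with the payload.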

        • suspicious_dog@lemmy.world · 2 up · 2 years ago

          Interesting, so would the smaller instance in this case have to perpetually store all content from the remote community, or does it just store the most recent X posts with the rest archived on the instance hosting the community? Or is it more an issue of the resources required to handle the transactions rather than the amount of data per se?

      • bobaduk@lemmy.world · 3 up · 2 years ago

        Some things can go faster if you add more workers; some things can only go faster if you make the workers bigger or faster.

        If you’re tidying a garden you can get it all done more quickly, and tackle bigger gardens, by getting your friends to help. That’s horizontal scaling.

        If you need to get a parcel from your house to Burkina Faso the only way to do that more quickly is to use a bigger, faster machine. That’s vertical scaling.

        The way Lemmy is designed right now (says the OP, I don’t know the details), you can only support more users by making the server bigger and more expensive, not by using lots of smaller servers.
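
        The garden-versus-parcel contrast is commonly formalized as Amdahl's law, which is brought in here as an illustration and is not something the thread itself cites. A minimal Python sketch follows; the parallel fractions are invented, not measured on Lemmy.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Speedup from `workers` machines when only part of the work can be split."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Hypothetical workloads: a mostly splittable job and one that cannot be split.
garden = amdahl_speedup(parallel_fraction=0.95, workers=8)
parcel = amdahl_speedup(parallel_fraction=0.0, workers=8)
print(f"garden: {garden:.2f}x, parcel: {parcel:.2f}x")
```

        When nothing can be split, adding workers gives no speedup at all, which is the situation where only a bigger or faster machine helps.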

    • monobot@lemmy.ml · 2 up · 2 years ago

      I think that having a few big instances is not failing. It is natural for a social network (of which Lemmy is one representation) to be a scale-free network, which has big hubs and a bunch of smaller nodes connected to them.

      Most people would go to general instances, but artists will probably go to some art-focused instance, developers to programming.dev… But we will have big hubs; there is no way around it.
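
      One way to see why hubs tend to emerge is a toy "rich-get-richer" simulation, sketched below in Python. The joining rule is an assumption about user behavior, not something measured on Lemmy, and the function name and parameters are hypothetical: each new user either starts a new instance or joins the instance of a randomly chosen existing user.

```python
import random
from statistics import median
from typing import List


def simulate_instances(users: int, new_instance_prob: float, seed: int = 0) -> List[int]:
    """Return instance sizes, largest first, under preferential attachment."""
    if users < 1:
        raise ValueError("need at least one user")
    rng = random.Random(seed)
    sizes = [1]    # one instance holding the first user
    members = [0]  # members[i] is the instance index that user i joined
    for _ in range(users - 1):
        if rng.random() < new_instance_prob:
            sizes.append(1)
            members.append(len(sizes) - 1)
        else:
            # Copying a random existing user's choice picks an instance
            # with probability proportional to its current size.
            target = members[rng.randrange(len(members))]
            sizes[target] += 1
            members.append(target)
    return sorted(sizes, reverse=True)


sizes = simulate_instances(users=20_000, new_instance_prob=0.05, seed=1)
print(f"instances: {len(sizes)}, largest: {sizes[0]}, median: {median(sizes)}")
```

      Under this rule a handful of instances end up far larger than the typical one, while many small instances still exist alongside them.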

      • SalamanderA · 2 up · 2 years ago

        Yes, you are right, I should have formulated that better.

        I would expect that there would be a few big instances. What I should have said is that having only a few big instances and no small instances would be a failure. It would be totally acceptable to have a few big instances and lots of smaller instances that can still interact with the fediverse. The failure would be if you have something like 20 very big instances that only interact with each other and that are inaccessible to the small instances, either because they close their federation or because it is too resource-intensive for small instances to interact with them. In this case you end up with a centralized system again, no better and potentially worse than something like Reddit. As long as someone can spin up their own instance if they want to and be part of the larger ecosystem, it would be a success.

    • soulnull@burggit.moe · 2 up · edited · 2 years ago

      Sadly I have to agree with the original point of people just wanting to click something and see kitties. The vast majority of people will just go to one of the bigger ones. “Bigger is better, they must be doing something right I guess”.

      I don’t normally want to be wrong, but I do want to be very wrong here… I admit I’m cynical and just think most people are going to glaze over when someone tells them something other than a simple link to type into their browser to click and see kitties with, or just install this app and click the dancing kitty. Once they have to look at stuff to make a decision, even if it’s not life or death or of actual importance to the experience, they’ll just go back to what they’re comfortable with. We’re creatures of habit, and most people don’t like learning new routines.

      My mother still pays DirecTV $150 a month to watch the same programs I’ve already set up on her Android box for less than that a year, because, survey says, “I don’t understand it”. No desire to learn it. I literally mapped all the channels to the DirecTV numbers. All you do is click the icon and scroll through the guide the exact same way you do with DirecTV, except there are categories that’ll let you narrow down which channels you want to show up in the guide… which is apparently the part she nopes out at, because decisions are scary. She’s also a Reddit user, and I don’t anticipate she’ll be joining us on the fedi…

      But again, I hope I’m wrong. I truly do.