So after we’ve extended the virtual cloud server twice, we’re at the max for the current configuration. And with this crazy growth (almost 12k users!!), the server is already getting close to capacity again.
Therefore I decided to order a dedicated server. Same one as used for mastodon.world.
So the bad news… we will need some downtime. Hopefully, not too much. I will prepare the new server, copy (rsync) stuff over, stop Lemmy, do a last rsync and change the DNS. If all goes well it should take maybe 10 minutes of downtime, 30 at most. (With mastodon.world it took 20 minutes, mainly because of a typo :-) )
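A rough sketch of that sequence in Python, written as a dry run that only builds and prints the commands. The paths, the `newserver.example` hostname, and the assumption that Lemmy runs as a systemd unit called `lemmy` are all placeholders, not the actual setup. The DNS change is left as a manual step since it depends on the provider.

```python
import shlex

# All names below are hypothetical placeholders, not the real lemmy.world setup.
SRC = "/srv/lemmy/"                      # assumed data directory on the old server
DEST = "newserver.example:/srv/lemmy/"   # assumed target on the new server


def migration_plan(src=SRC, dest=DEST, service="lemmy"):
    """Return the migration steps in order as argument lists (nothing is run)."""
    rsync = ["rsync", "-a", "--delete", src, dest]
    return [
        rsync,                              # 1. bulk copy while the site is still up
        ["systemctl", "stop", service],     # 2. downtime starts here
        rsync,                              # 3. final pass copies only what changed
        # 4. switch DNS to the new server (manual, provider-specific)
    ]


if __name__ == "__main__":
    for step in migration_plan():
        print(shlex.join(step))
```

The point of running rsync twice is that the second pass only has to transfer what changed since the first one, which is what keeps the downtime window short.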
For those who would like to donate to cover server costs, you can do so at our OpenCollective or Patreon.
Thanks!
Update: The server was migrated. It took around 4 minutes of downtime. For those who asked, it now runs on a dedicated server with an AMD EPYC 7502P 32-core “Rome” CPU and 128GB RAM. Should be enough for now.
I will be tuning the database a bit, which may cause a few extra seconds of downtime, but just refresh and it’s back. After that I’ll look further into the cause of the slow posting. Thanks @veroxii@lemmy.world for assisting with that.
Yes. It’s called performance testing. Basically an engineer would need to set up test user transactions to simulate live traffic and load test the system to see how everything scales, where it breaks, etc. Then you can use the results of the tests to figure out how big an instance you should use for your projected number of users.
JMeter and Locust (locust.io) are the two biggest open source performance testing tools.
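To give an idea of what such a test does under the hood, here is a minimal standard-library sketch. It is not how JMeter or Locust work internally, and `fake_transaction` is a made-up stand-in: a real test would send actual requests to the instance instead of sleeping.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def fake_transaction():
    """Stand-in for one user action (e.g. loading the front page).

    Hypothetical: a real test would send an HTTP request to the instance.
    """
    time.sleep(0.01)


def load_test(transaction, users, requests_per_user):
    """Run `transaction` from `users` concurrent workers and collect latencies."""

    def one_user(user_id):
        latencies = []
        for _ in range(requests_per_user):
            start = time.perf_counter()
            transaction()
            latencies.append(time.perf_counter() - start)
        return latencies

    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=users) as pool:
        per_user = list(pool.map(one_user, range(users)))
    elapsed = time.perf_counter() - started

    all_latencies = [x for user in per_user for x in user]
    return {
        "requests": len(all_latencies),
        "throughput_per_s": len(all_latencies) / elapsed,
        "median_s": statistics.median(all_latencies),
        "max_s": max(all_latencies),
    }


if __name__ == "__main__":
    # Step the load up and watch where latency starts to climb.
    for users in (1, 5, 20):
        print(users, load_test(fake_transaction, users, requests_per_user=10))
```

Stepping the user count up and comparing latency and throughput at each level is how you find the point where the system stops keeping up.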
The alternative is to take a wild guess, see how the system behaves, and make adjustments in real time… like what @ruud@lemmy.world is currently doing.
Worth noting that typical apps do not scale linearly, and hardware caps out at some point (with diminishing returns up to that point). Federation helps with that much more cheaply, where normally a company would just have to throw more money at more servers itself :)
Yup. You don’t have to explain that to me. It’s funny when folks assume:
The reality is as soon as you remove one bottleneck you will find the next bottleneck.
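A toy illustration of that, assuming a very simplified model where total throughput is capped by the slowest component. The component names and numbers are made up for the example, not measurements from any real instance.

```python
# Hypothetical per-component capacities in requests per second (made-up numbers).
capacity = {"database": 300, "web_workers": 800, "network": 2000}


def bottleneck(capacity):
    """Return (component, limit) for the component that caps total throughput."""
    name = min(capacity, key=capacity.get)
    return name, capacity[name]


if __name__ == "__main__":
    print(bottleneck(capacity))            # database is the limit
    tuned = dict(capacity, database=1500)  # say tuning lifts the database cap
    print(bottleneck(tuned))               # now the web workers are the limit
```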
Yeah, I meant specific data using lemmy.world as a datum, not the theoretical “check and see if you guessed right” method.