Experiments in Ceph (with Proxmox)

HTTP_404_NotFound@lemmyonline.com · 1 year ago

eros@lemmy.world · 1 year ago

Nice writeup. As long as you can throw fast drives, fast networking, and plenty of RAM at it, Ceph is happy.

Ceph seems to work fine on my cluster at work. For less than $40k I replaced my whole VMware vSAN cluster, and we’re saving as much again in software licensing over the next 5 years while buying support from Proxmox. It’s also much lighter in terms of the administrative work needed to keep it up to date and running well.

3x Supermicro SSG-110P-NTR10

Intel Xeon Gold 5713
256 GB RAM
10 Intel D7-P5510 3.84TB NVMe
2 Micron 5400 Max
Onboard dual 10GbE
Mellanox ConnectX4 Dual SFP28 25GbE
5 year NBD parts warranty

HTTP_404_NotFound@lemmyonline.com · 1 year ago

Have you done any measurements of IOPS? Just curious to know.

eros@lemmy.world · 1 year ago

I haven’t, but I’ll run some and try to remember to post back.

MangoPenguin@lemmy.blahaj.zone · 1 year ago

Ceph seems neat, but the fact that it can’t even function with normal SSDs points to something very wrong with how it’s designed. It seems like it has an absurd overhead.

HTTP_404_NotFound@lemmyonline.com · 1 year ago

I believe it’s a data-safety thing, similar to how ZFS’s ZIL works.

That is, a write isn’t completed until it’s actually written. In the case of consumer SSDs, this means waiting for the write to reach the flash. In the case of enterprise SSDs, this means the write only has to reach the drive’s write cache, which is safe thanks to PLP (power loss protection).

As with anything, though, you can disable those safety features.
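A rough way to see the cost of that guarantee on a single drive is to time small writes with and without forcing each one to stable storage. The sketch below is not how Ceph issues I/O. It is a minimal illustration that assumes plain `os.write` plus `os.fsync` on a local file, and the function names, block size, and write count are made up for the example.

```python
import os
import tempfile
import time


def time_writes(path, count=200, block_size=4096, sync=False):
    """Return the average seconds per write of `block_size` bytes.

    With sync=True every write is followed by os.fsync(), so the call
    only returns once the OS reports the data as flushed to the device.
    """
    payload = b"\0" * block_size
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, payload)
            if sync:
                os.fsync(fd)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return elapsed / count


def compare(directory=None):
    """Time buffered vs. synced writes in a temp dir under `directory`."""
    with tempfile.TemporaryDirectory(dir=directory) as tmp:
        path = os.path.join(tmp, "probe.bin")
        buffered = time_writes(path, sync=False)
        synced = time_writes(path, sync=True)
    return buffered, synced


if __name__ == "__main__":
    buffered, synced = compare()
    print(f"buffered: {buffered * 1e6:.1f} us/write")
    print(f"synced:   {synced * 1e6:.1f} us/write")
```

Pass a directory on the drive you care about as `directory`. Consumer drives without PLP tend to show a much larger gap between the two numbers than enterprise drives do.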

"absurd overhead."

Actually a massive understatement. I threw together over 5 million IOPS worth of disks to barely squeeze 100k IOPS out of the cluster! It’s EXTREMELY inefficient compared to… well, pretty much any other option. I mean, writing encrypted zip files to SD card storage can be faster in some circumstances. lol

But it’s reliable, fault-tolerant storage, which is instantly available (i.e., no replication, syncing, etc.).
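To put the figures quoted in this comment in proportion, here is the arithmetic as a small sketch. The inputs are the rounded numbers from the comment ("over 5 million" raw, "barely 100k" delivered), so the result is only a ballpark, and the function name is made up for illustration.

```python
def cluster_efficiency(raw_iops, delivered_iops):
    """Fraction of the disks' combined rated IOPS that the cluster delivers."""
    return delivered_iops / raw_iops


# Rounded figures from the comment, not measured values.
ratio = cluster_efficiency(raw_iops=5_000_000, delivered_iops=100_000)
print(f"{ratio:.1%}")  # prints 2.0%
```

In other words, roughly one fiftieth of the raw disk capability made it through to clients in that test.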

redcalcium@lemmy.institute · 1 year ago

How is Ceph latency compared to plain old NFS on the same (single) hardware? Especially when your apps require reading a lot of small files, where latency matters more than raw speed? NFS is pretty awful for this, so I’m interested if there are any good alternatives.

HTTP_404_NotFound@lemmyonline.com · 1 year ago

I am going to guess normal NFS is going to be faster…

There is really nothing about Ceph that even remotely says “fast” to me.

One alternative for you might be MinIO, if object storage works for you. In my experience, it performs pretty well.

ShatteredScales@lemmy.world · 1 year ago

Well that’s some weird behavior on the latency.

I have several Samsung 870 Evos across three hosts, and they’re all ~7ms.

HTTP_404_NotFound@lemmyonline.com · 1 year ago

Might be due to the load?

Or perhaps a cache setting. I think one of the issues consumer drives have is the lack of PLP.

https://forum.proxmox.com/threads/vm-i-o-performance-with-ceph-storage.120929/

This particular thread had some really good info around half way down.

https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/2P5ROTWHR5Y2VWI6MA3IKQKUTC3WKYFB/
