I’ve been running an HPC system for a science group for a while now and have built a couple of different systems based on common HPC infrastructures (ROCKS or OpenHPC). These have been built on top of the rebuilt RHEL distros (mostly CentOS), but I don’t really need the level of stability that these provide and would actually like the sort of updates that you get from something like CentOS Stream, so this seems like a good time to try it.
The problem is that I haven’t found an HPC framework which natively supports this, so I’m potentially going to have to roll my own. I don’t need anything fancy, just some way to automatically deploy nodes and set up Slurm to get jobs queued.
Any pointers to suitable frameworks or tools which would help with this and which aren’t tied to older distros?
I’m not entirely certain about the actual HPC stuff, but there’s no good reason CentOS Stream wouldn’t do what you need.
Yeah, conceptually I like it. A while back I used to run my systems on Fedora which was great in that I always had the latest of everything, but doing updates every 6 months got tedious. Stream seems like a good compromise on the way to that.
I mean, if you know what software you need to make it work on RHEL, it might take a bit of work on your part, but I can’t imagine getting it installed on CentOS Stream will be that onerous a task.
I would not be using CentOS in your use case as it is a rolling release and as such not considered stable for production environments. In recent times Ubuntu server has taken over where CentOS was once used.
In regards to a framework for HPC, I would be looking at grid computing and using one of the scientific workflow management solutions which is compatible with your requirements and a Linux environment.
https://en.m.wikipedia.org/wiki/Grid_computing
https://en.m.wikipedia.org/wiki/Scientific_workflow_system
The lack of stability is actually quite attractive to me. In a scientific environment we’re normally running fairly new, often unstable code, and we often hit problems because of using older versions of libraries / packages / compilers, so something which stays a bit more current would be good and we can deal with breakage if it happens. The trouble is the management systems around HPC assume you’re working on enterprise systems, which isn’t really true in our case.
I’ve looked at things like OpenHPC but they’re still on RHEL8 (RHEL9 is in testing but not released yet), and even lower level tools like Warewulf still only support RHEL8 at the moment, which is getting too old for me to want to build a new system from it.
I’ve looked at more generic tools like Ansible and Chef / Puppet but before I go down that rabbit hole I’d like a sanity check that there isn’t something more suited that I’m missing.
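For a sense of how small the "roll my own" Slurm piece could be, here is a minimal sketch in Python that renders the node and partition lines of a slurm.conf from a node inventory. The `Node` class, the node names, the hardware figures and the partition name `main` are all made up for illustration; only the `NodeName`, `CPUs`, `RealMemory`, `State`, `PartitionName`, `Nodes`, `Default` and `MaxTime` keys are real slurm.conf parameters. It covers only that fragment, not provisioning or the rest of the config.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical inventory record for one compute node."""
    name: str
    cpus: int
    memory_mb: int


def node_line(node: Node) -> str:
    # One NodeName definition per node, using standard slurm.conf keys.
    return (
        f"NodeName={node.name} CPUs={node.cpus} "
        f"RealMemory={node.memory_mb} State=UNKNOWN"
    )


def partition_line(partition: str, nodes: list[Node]) -> str:
    # A single default partition containing every node in the inventory.
    names = ",".join(n.name for n in nodes)
    return (
        f"PartitionName={partition} Nodes={names} "
        "Default=YES MaxTime=INFINITE State=UP"
    )


def render_slurm_fragment(nodes: list[Node], partition: str = "main") -> str:
    # Node definitions first, then the partition that references them.
    lines = [node_line(n) for n in nodes]
    lines.append(partition_line(partition, nodes))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Assumed example inventory; replace with the real node list.
    inventory = [Node("node01", 32, 128000), Node("node02", 32, 128000)]
    print(render_slurm_fragment(inventory), end="")
```

Something like this could be run by whatever deployment tool ends up being used (Ansible, Puppet, a plain script) to regenerate the fragment whenever the node list changes, so it does not depend on any particular distro version.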
It’s a misconception that CentOS Stream is a rolling release. It comes in versioned releases that track ahead of Red Hat by a few months and have 5-year support cycles.
It literally states on the CentOS site that it is a, and I quote:
“Continuously delivered distro that tracks just ahead of Red Hat Enterprise Linux (RHEL) development…”
That’s a rolling release.
I don’t see how that quote means anything. It factually just isn’t a rolling release model. Rolling releases are like Arch, which don’t have versions and are instead continuously updated. Point releases have a versioning system in place. CentOS (as the quote says) tracks ahead of Red Hat, so CentOS Stream 9 was released a few months before RHEL 9. In the future there will be a 10 and an 11. That makes it a point release schedule, not a rolling release schedule.
The wiki article literally states in the first line:
“Rolling release, also known as rolling update or continuous delivery…”
You are just trying to argue for argument’s sake. Just stop it. CentOS is a rolling release.
CentOS is 100%, factually not a rolling release. Rolling releases don’t deploy based on version, they do it based on snapshot. That is quite literally the only defining characteristic of a rolling release and CentOS does not share it. CentOS deploys by version, AKA a point release schedule. CentOS 9. CentOS 10. CentOS 11. Actual rolling releases don’t have this characteristic. There isn’t an Arch 5, or an openSUSE Tumbleweed 23.1, or a Void Linux 4.8. There is just Arch, openSUSE Tumbleweed, and Void Linux. Maybe you have Arch 2023.29.6 snapshot but that is not the same thing.
“This is in contrast to a standard or point release development model which uses software versions that must be reinstalled over the previous version.”
This is exactly how CentOS works. It’s also how Red Hat, Ubuntu and Debian work. Are Red Hat, Ubuntu and Debian rolling releases?