- cross-posted to:
- amd@lemmy.ml
- fedora@kbin.social
What is “end-to-end GPU acceleration”? Is it for playing back video, or for rendering stuff like in Blender?
Data science term. Means everything runs inside the GPU entirely. No CPU or system RAM outside of the (usually Python) interface that started, monitors, and collects the result of the job.
ROCm is AMD’s counterpart to CUDA, which fills the same role for nVidia.
For years I’ve wondered what ROCm was, too lazy to figure it out. Thank you for this!
Both are vendor-specific implementations of processing on GPUs, as opposed to open standards like OpenCL, which a lot of the exascale big boys out there mostly use.
nVidia spent a lot of cash on “outreach” to get CUDA into a lot of packages in R, Python, and whatnot. That displaced a lot of OpenCL stuff. These libraries are what a lot of folks get started on, since most of the legwork is already done for them in the library. With the exascale rigs, you literally have a team that does nothing but code very specific things for the machine in front of them, so yeah, they go with the thing that is the most portable, but that doesn’t exactly yield libraries for us mere mortals to use.
AMD has only recently had the cash to start paying folks to write libs for their stuff, so we’re starting to see it come to Python libs and whatnot. Likely, once it becomes a fight of CUDA v ROCm, people will start heading back over to OpenCL. The “worth it” of vendor lock-in for CUDA and ROCm will diminish more and more over time. But as it stands, with CUDA you do get a good bit of “squeezing that extra bit of steam out of your GPU” by selling your soul to nVidia.
That last part also plays into the “why” of CUDA and ROCm. If you happen to NOT have a rig with 10,000 GPUs, then the difference between getting 98% of your GPU and 99.999% of your GPU means a lot to you. If you do have 10,000 GPUs, a 1% inefficiency is okay: the loss is barely noticeable at that scale, and it’s not worth giving up OpenCL’s portability to win it back.
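For a rough feel of those numbers, here is a quick Python sketch. The utilization figures are just the illustrative ones from this comment, not measurements, and the helper names are made up for the example.

```python
# Back-of-the-envelope arithmetic for the utilization figures quoted above.
# All percentages are illustrative values from the comment, not benchmarks.

def effective_gpus(gpu_count: int, utilization: float) -> float:
    """GPU-equivalents of useful work, given a fractional utilization."""
    return gpu_count * utilization


def idle_gpus(gpu_count: int, utilization: float) -> float:
    """GPU-equivalents left unused at the given utilization."""
    return gpu_count - effective_gpus(gpu_count, utilization)


# One GPU: portable code path (assumed 98%) vs. vendor-tuned path (assumed 99.999%).
single_portable = effective_gpus(1, 0.98)
single_tuned = effective_gpus(1, 0.99999)

# A 10,000-GPU rig running at an assumed 99% utilization.
cluster_effective = effective_gpus(10_000, 0.99)
cluster_idle = idle_gpus(10_000, 0.99)

print(f"single GPU, portable: {single_portable:.5f} GPU-equivalents")
print(f"single GPU, tuned:    {single_tuned:.5f} GPU-equivalents")
print(f"cluster effective:    {cluster_effective:.1f} GPU-equivalents")
print(f"cluster idle:         {cluster_idle:.1f} GPU-equivalents")
```

The relative loss is the same either way. What changes is that the big rig still has thousands of GPU-equivalents left over, while the single-GPU user has no spare capacity to absorb the gap.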
Ah okay dope
This is the best summary I could come up with:
Fedora 40 is looking at shipping the AMD ROCm 6.x GPU compute stack to offer “end-to-end open-source GPU acceleration” with ease for this Red Hat funded Linux distribution.
Fedora has been among the Linux distributions already working on packaging up AMD’s ROCm to make it easier to deploy this GPU compute solution on their platform.
This has often been a headache for those wanting to use AMD ROCm outside of the few officially supported enterprise Linux distributions.
This change proposal is being pursued by Red Hat’s Tom Rix.
To address this feedback, several packages are in the process of being added to Fedora, including rocFFT, rocSOLVER, hipBLASLt, and MIOpen.
… Fedora has finally end-to-end open source GPU acceleration.
The original article contains 362 words, the summary contains 117 words. Saved 68%. I’m a bot and I’m open source!
HIP is amazing. For everyone saying “nah it can’t be the same, CUDA rulez”, just try it. It works on NVidia GPUs too (there are basically macros and stuff that remap everything to CUDA API calls), so if you code for HIP you’re basically targeting at least two GPU vendors. ROCm is the only framework that allows me to do GPGPU programming in CUDA style on a thin laptop sporting an AMD APU while still enjoying 6 to 8 hours of battery life when I don’t do GPU stuff. With CUDA, in terms of mobility, the only choices you get are a beefy and expensive gaming laptop with a pathetic battery life and heating issues, or a light laptop + SSHing into a server with an NVidia GPU.
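To give an idea of how close the two APIs are, here is a toy Python sketch of the name correspondence. The HIP and CUDA identifiers in the table are real runtime names, but the `remap_to_cuda` helper is hypothetical and only does text substitution. As far as I understand it, the actual remapping on NVidia hardware happens in HIP's C++ headers at compile time, not like this.

```python
import re

# A handful of real HIP runtime names and their CUDA counterparts.
# This table is deliberately tiny; it is not a complete mapping.
HIP_TO_CUDA = {
    "hipMalloc": "cudaMalloc",
    "hipMemcpy": "cudaMemcpy",
    "hipMemcpyHostToDevice": "cudaMemcpyHostToDevice",
    "hipFree": "cudaFree",
    "hipDeviceSynchronize": "cudaDeviceSynchronize",
    "hipError_t": "cudaError_t",
    "hipSuccess": "cudaSuccess",
}

# Match whole identifiers only, trying longer names first so that
# hipMemcpyHostToDevice is never partially rewritten as hipMemcpy.
_PATTERN = re.compile(
    r"\b(" + "|".join(sorted(HIP_TO_CUDA, key=len, reverse=True)) + r")\b"
)


def remap_to_cuda(source: str) -> str:
    """Hypothetical helper: rewrite known HIP names to their CUDA equivalents."""
    return _PATTERN.sub(lambda match: HIP_TO_CUDA[match.group(1)], source)


snippet = (
    "hipMalloc(&buf, n); "
    "hipMemcpy(buf, host, n, hipMemcpyHostToDevice); "
    "hipFree(buf);"
)
print(remap_to_cuda(snippet))
```

The point is just that most calls differ only by the `hip`/`cuda` prefix, which is why a thin compatibility layer is enough.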
The problem with ROCm is that it’s very unstable and a ton of applications break on it. Darktable only renders half an image on my Radeon 680M laptop. HIP in Blender is also much slower than OptiX. We’re still waiting on HIP-RT.