Re: PyTorch with ROCm
From: Ludovic Courtès
Subject: Re: PyTorch with ROCm
Date: Tue, 02 Apr 2024 16:00:34 +0200
User-agent: Gnus/5.13 (Gnus v5.13)
Hello!
(Cc’ing my colleague Romain who may work on related things soon.)
David Elsing <david.elsing@posteo.net> writes:
> It is the same as for other HIP/ROCm libraries, so the GPU architectures
> chosen at build time are all available at runtime and automatically
> picked. For reference, the Arch Linux package for PyTorch [1] enables 12
> architectures. I think the architectures which can be chosen at compile
> time also depend on the ROCm version.
Nice. We’d have to check what the size and build time tradeoff is, but
it makes sense to enable a bunch of architectures.
>>> I'm not sure they can be combined however, as the GPU code is included
>>> in the shared libraries. Thus all dependent packages like
>>> python-pytorch-rocm would need to be built for each architecture as
>>> well, which is a large duplication for the non-GPU parts.
>>
>> Yeah, but maybe that’s OK if we keep the number of supported GPU
>> architectures to a minimum?
>
> If it's not an issue for the build farm, it would probably be good to
> include a set of default architectures (the officially supported
> ones?) like you suggested, and make it easy to recompile all dependent
> packages for other architectures. Maybe this can be done with a
> package transformation like for '--tune'? IIRC, building
> composable-kernel for the default architectures with 16 threads
> exceeded 32 GB of memory before I cancelled the build and set it to
> only one architecture.
Yeah, we could think about a transformation option. Maybe
‘--with-configure-flags=python-pytorch=-DAMDGPU_TARGETS=xyz’ would work,
and if not, we can come up with a specific transformation and/or a
procedure that takes a list of architectures and returns a package.
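If the generic flag transformation works as sketched, a concrete
invocation might look like this (hypothetical and untested; the target
list borrows the gfx1030;gfx90a pair mentioned later for rocblas, and
the semicolon means the value must be quoted for the shell):

```shell
# Hypothetical use of the transformation discussed above (untested);
# quoting protects the ';' in the AMDGPU_TARGETS list from the shell.
guix build python-pytorch \
  --with-configure-flags='python-pytorch=-DAMDGPU_TARGETS=gfx1030;gfx90a'
```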
>>> - Many tests assume a GPU to be present, so they need to be disabled.
>>
>> Yes. I/we’d like to eventually support that. (There’d need to be some
>> annotation in derivations or packages specifying what hardware is
>> required, and ‘cuirass remote-worker’, ‘guix offload’, etc. would need
>> to honor that.)
>
> That sounds like a good idea, could this also include CPU ISA
> extensions, such as AVX2 and AVX-512?
That’d be great, yes. Don’t hold your breath though as I/we haven’t
scheduled work on this yet. If you’re interested in working on it, we
can discuss it of course.
> I think the issue is simply that elf-file? just checks the magic bytes
> and has-elf-header? checks for the entire header. If the former returns
> #t and the latter #f, an error is raised by parse-elf in guix/elf.scm.
> It seems some ROCm (or Tensile?) ELF files have another header format.
Uh, never came across such a situation. What’s so special about those
ELF files? How are they created?
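For illustration, the mismatch David describes can be sketched in
Python. This is a simplified, hypothetical model, not the actual Guile
code of elf-file? and has-elf-header? in Guix: the cheap check looks
only at the four magic bytes, while the strict one also validates the
rest of e_ident, so a file with the magic bytes but an unusual header
passes the first and fails the second.

```python
ELF_MAGIC = b"\x7fELF"

def looks_like_elf(data):
    """Cheap check: only the four magic bytes (cf. Guix's elf-file?)."""
    return data[:4] == ELF_MAGIC

def has_valid_header(data):
    """Stricter check: also validate EI_CLASS, EI_DATA and EI_VERSION
    (a simplified stand-in for Guix's has-elf-header?)."""
    if not looks_like_elf(data) or len(data) < 16:
        return False
    ei_class, ei_data, ei_version = data[4], data[5], data[6]
    return ei_class in (1, 2) and ei_data in (1, 2) and ei_version == 1

# A regular 64-bit little-endian ELF e_ident passes both checks...
good = b"\x7fELF" + bytes([2, 1, 1, 0]) + b"\x00" * 8
# ...while a file with the magic bytes but unexpected e_ident fields
# (as reported for some ROCm/Tensile files) passes only the first,
# which is the combination that makes parse-elf raise an error.
odd = b"\x7fELF" + bytes([0xFF, 0xFF, 0xFF, 0xFF]) + b"\x00" * 8

print(looks_like_elf(good), has_valid_header(good))  # True True
print(looks_like_elf(odd), has_valid_header(odd))    # True False
```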
>> Oh, just noticed your patch brings a lot of things beyond PyTorch itself!
>> I think there’s some overlap with
>> <https://gitlab.inria.fr/guix-hpc/guix-hpc/-/merge_requests/38>, we
>> should synchronize.
> Ah, I did not see this before; the overlap seems to be tensile,
> roctracer, and rocblas. For rocblas, I saw that they set
> "-DAMDGPU_TARGETS=gfx1030;gfx90a", probably for testing?
Could be, we’ll see.
Thanks,
Ludo’.