Message-ID: <e130a6a1-3736-4e47-853f-d11a62c9a0b4@gmail.com>
Date: Thu, 5 Sep 2024 19:25:24 +0200
From: Andrea <muradin80@...il.com>
To: john-users@...ts.openwall.com
Subject: Re: How to debug segmentation fault using OpenCL and ROCM
Hi Alexander,
On 02/09/2024 20:28, Solar Designer wrote:
> Hi Andrea,
>
> On Mon, Sep 02, 2024 at 08:01:09PM +0200, Andrea wrote:
>> I have just recently started to learn using JtR tool and I would like to
>> enable OpenCL capabilities.
>>
>> Unfortunately, running some tests, I'm always getting segmentation fault
>> error; I would like to understand the best way to debug the issue and
>> track it down to understand if the issue is related to either ROCM,
>> OpenCL or JtR (or a mix of them).
> In our experience, on AMD GPUs it is "normal" to have a few failing
> formats. So if you run tests of many/all formats and only a few fail,
> you just accept it's that way with your current combination of GPU and
> driver. AMD's software is just poor.
>
> We sometimes try and workaround individual bugs, but we've long since
> accepted that some formats will be failing on AMD. If you want all to
> work at once (with the same driver version, etc.), get NVIDIA. We do
> get to 100% passing tests on NVIDIA, but (as I recall) never on AMD.
>
> That said, it is also possible that in your case the problem is
> different. So if you determine that _all_ formats are failing like
> that, then it makes sense to look into the cause (e.g., incorrect driver
> or JtR installation) and fix that. From your testing of just one
> format, we can't tell.
>
>> Example of command and related error:
>>
>> john --test=0 --format=sha256crypt-opencl
>>
>> Device 1: gfx90c:xnack- [AMD Radeon Graphics]
>> Testing: sha256crypt-opencl, crypt(3) $5$ (rounds=5000) [SHA256
>> OpenCL]... Build log: warning: argument unused during compilation: '-I
>> opencl' [-Wunused-command-line-argument]
>> 1 warning generated.
>> Segmentation fault
> It is unusual that this specific format fails. It normally works:
>
> $ ./john --test=0 --format=sha256crypt-opencl
> Device 1: gfx900 [Radeon RX Vega]
> Testing: sha256crypt-opencl, crypt(3) $5$ (rounds=5000) [SHA256 OpenCL]... PASS
>
> So please try testing other formats.
Unfortunately, no test passes, whichever OpenCL format I select.
>
> Alexander
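To tell "a few formats fail" apart from "all formats fail", the same
self-test can be looped over several formats. The following is only a
sketch: the function name test_formats and the JOHN variable are made up
for illustration, and it assumes that a crashed child process is
reported by the shell as exit status 139 (128 + SIGSEGV), which is the
usual convention on Linux.

```shell
# Binary to invoke; assumed to be reachable as "john" unless JOHN is set.
JOHN="${JOHN:-john}"

# Run the self-test for every format name given as an argument and
# print one summary line per format.
test_formats() {
    local fmt status
    for fmt in "$@"; do
        # --test=0 is the same self-test invocation used in this thread
        "$JOHN" --test=0 --format="$fmt" >/dev/null 2>&1
        status=$?
        case "$status" in
            0)   echo "$fmt PASS" ;;
            139) echo "$fmt SEGFAULT" ;;      # 128 + 11 (SIGSEGV)
            *)   echo "$fmt FAIL($status)" ;;
        esac
    done
}
```

It could be called as "test_formats sha256crypt-opencl sha512crypt-opencl".
If every line ends in SEGFAULT, the problem is more likely in the driver
or installation than in an individual format.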
Getting into GPU computing on Linux really is a bit of a mess.
I did some more experiments and am sharing them here, hoping they are
useful to someone.
1. I didn't describe my setup: I'm experimenting on an Asus PN50 with an
integrated Ryzen 5 GPU (not the best in terms of performance, but it is
what I have at hand). Moreover, according to the ROCm documentation, my
detected GPU (gfx90c) is not supported, either by ROCm or by the
proprietary AMDGPU-PRO driver. Still, I wanted to check whether OpenCL
by itself, without going through JtR, works or reports any error.
2. I found a tool on the net named OpenCL-Benchmark for benchmarking
OpenCL: https://github.com/ProjectPhysX/OpenCL-Benchmark. Another
issue: it crashed because of the VRAM size.
3. This raised the question of how to increase the VRAM, which is
actually shared memory also used by the CPU. The previous BIOS version
had no option for it, so I updated the BIOS to the latest version
available on the Asus website and now have one more option in the BIOS
setup; with it I raised the VRAM from 512 MB to 2 GB.
4. I reran OpenCL-Benchmark (which was failing at point 2) with a good
result, nice! (See attachment; "good" just means that it works ...)
5. Now, how can I be sure the tool is not, by chance, using the CPU
instead of the GPU? I found another interesting tool to monitor AMD
GPUs, amdgpu_top: https://github.com/Umio-Yasuno/amdgpu_top. Normally
the GPU is almost idle, but when I started the benchmark the tool
reported GPU usage skyrocketing to 100%, nice!
6. Maybe I'm wrong here, but the chain as I understand it is
JtR -> OpenCL -> ROCm, so if OpenCL can be benchmarked, ROCm should be
working. Here things become interesting: I installed rocRAND,
https://github.com/ROCm/rocRAND, which lists the supported GPUs
(gfxNNN); gfx90c is not among them (ref. point 1). Indeed, the tool
didn't work on my setup at first, but I wanted to give it a try and,
googling a little, I found an interesting environment variable,
HSA_OVERRIDE_GFX_VERSION, which lets me override the detected GPU
version. Another surprise: selecting either gfx906 or gfx908, the tool
started to work (GPU usage monitored with amdgpu_top as in the previous
tests), nice! My explanation: the instructions sent to the GPU are
probably "similar enough" to work, a kind of common ground for these
GPUs.
7. And now, the final step: let's try the same environment variable
with JtR. So, instead of this:
john --test=0 --format=sha256crypt-opencl
Device 1: gfx90c:xnack- [AMD Radeon Graphics]
Testing: sha256crypt-opencl, crypt(3) $5$ (rounds=5000) [SHA256
OpenCL]... Segmentation fault
Let's try with this:
HSA_OVERRIDE_GFX_VERSION=9.0.8 john --test=0 --format=sha256crypt-opencl
Device 1: gfx908:sramecc-:xnack- [AMD Radeon Graphics]
Testing: sha256crypt-opencl, crypt(3) $5$ (rounds=5000) [SHA256
OpenCL]... Build log: warning: argument unused during compilation: '-I
opencl' [-Wunused-command-line-argument]
1 warning generated.
Segmentation fault
Oops ... no, this is not a good (enough) workaround: I can see that
gfx908 is forced instead of gfx90c, but JtR still crashes. The fact that
ROCm doesn't support gfx90c seems to be a blocking point here.
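For trying several override targets in a row, the dotted value can be
derived from the target name. This is a hedged sketch, not something
taken from the ROCm documentation: the helper names gfx_to_override and
with_gfx_override are hypothetical, and the conversion assumes that the
last two characters of the gfx name are hexadecimal minor and stepping
digits (gfx908 -> 9.0.8, as in the run above).

```shell
# Convert a target name such as "gfx908" or "gfx90c:xnack-" into the
# dotted major.minor.stepping form used with HSA_OVERRIDE_GFX_VERSION.
# Assumption: the last two characters are hex minor and stepping digits.
gfx_to_override() {
    local name major rest minor step
    name="${1#gfx}"          # strip the "gfx" prefix
    name="${name%%:*}"       # drop feature suffixes such as ":xnack-"
    major="${name%??}"       # everything except the last two characters
    rest="${name#"$major"}"  # the last two characters
    minor="${rest%?}"
    step="${rest#?}"
    printf '%d.%d.%d\n' "$major" "0x$minor" "0x$step"
}

# Run a command with the override set for that one command only.
with_gfx_override() {
    local version
    version=$(gfx_to_override "$1") || return 1
    shift
    env HSA_OVERRIDE_GFX_VERSION="$version" "$@"
}
```

For example, "with_gfx_override gfx906 john --test=0
--format=sha256crypt-opencl" would repeat the test above with gfx906
instead of gfx908. Whether any override helps on gfx90c is a separate
question; in the run above it did not.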
I hope I have at least provided some useful tool references for playing
with this stuff. I will try some more things and wait (or pray?) for
ROCm gfx90c support (even though I guess there is not much interest in
using the GPU in these integrated devices).
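As a quick cross-check of the VRAM size (point 3) and GPU load (point 5)
without extra tools, the amdgpu kernel driver exposes both through
sysfs. A minimal sketch, assuming the GPU shows up as card0 (it may be
card1 or another index on other systems); the function name gpu_status
is made up for illustration.

```shell
# Print VRAM size and current GPU load from the amdgpu sysfs files.
# The device directory defaults to card0, which is an assumption.
gpu_status() {
    local dev vram busy
    dev="${1:-/sys/class/drm/card0/device}"
    vram=$(cat "$dev/mem_info_vram_total") || return 1   # bytes
    busy=$(cat "$dev/gpu_busy_percent") || return 1      # 0..100
    echo "VRAM: $((vram / 1024 / 1024)) MiB, GPU busy: ${busy}%"
}
```

Running "gpu_status" while a benchmark is active should show the busy
percentage rise if the work really lands on the GPU.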
Andrea
View attachment "asus-pn50-2GB-VRAM.log" of type "text/x-log" (2480 bytes)