Message-ID: <e130a6a1-3736-4e47-853f-d11a62c9a0b4@gmail.com>
Date: Thu, 5 Sep 2024 19:25:24 +0200
From: Andrea <muradin80@...il.com>
To: john-users@...ts.openwall.com
Subject: Re: How to debug segmentation fault using OpenCL and
 ROCM

Hi Alexander,

On 02/09/2024 20:28, Solar Designer wrote:
> Hi Andrea,
>
> On Mon, Sep 02, 2024 at 08:01:09PM +0200, Andrea wrote:
>> I have just recently started to learn using JtR tool and I would like to
>> enable OpenCL capabilities.
>>
>> Unfortunately, running some tests, I'm always getting segmentation fault
>> error; I would like to understand the best way to debug the issue and
>> track it down to understand if the issue is related to either ROCM,
>> OpenCL or JtR (or a mix of them).
> In our experience, on AMD GPUs it is "normal" to have a few failing
> formats.  So if you run tests of many/all formats and only a few fail,
> you just accept it's that way with your current combination of GPU and
> driver.  AMD's software is just poor.
>
> We sometimes try and workaround individual bugs, but we've long since
> accepted that some formats will be failing on AMD.  If you want all to
> work at once (with the same driver version, etc.), get NVIDIA.  We do
> get to 100% passing tests on NVIDIA, but (as I recall) never on AMD.
>
> That said, it is also possible that in your case the problem is
> different.  So if you determine that _all_ formats are failing like
> that, then it makes sense to look into the cause (e.g., incorrect driver
> or JtR installation) and fix that.  From your testing of just one
> format, we can't tell.
>
>> Example of command and related error:
>>
>> john --test=0 --format=sha256crypt-opencl
>>
>> Device 1: gfx90c:xnack- [AMD Radeon Graphics]
>> Testing: sha256crypt-opencl, crypt(3) $5$ (rounds=5000) [SHA256
>> OpenCL]... Build log: warning: argument unused during compilation: '-I
>> opencl' [-Wunused-command-line-argument]
>> 1 warning generated.
>> Segmentation fault
> It is unusual that this specific format fails.  It normally works:
>
> $ ./john --test=0 --format=sha256crypt-opencl
> Device 1: gfx900 [Radeon RX Vega]
> Testing: sha256crypt-opencl, crypt(3) $5$ (rounds=5000) [SHA256 OpenCL]... PASS
>
> So please try testing other formats.
Unfortunately, no test passes, whichever OpenCL format I select.
>
> Alexander

Getting started with GPU computing on Linux is really a bit of a mess.

I ran some more experiments and am sharing them here in the hope that
they are useful to someone.

1. I hadn't described my setup: I'm experimenting on an Asus PN50 with
an integrated Ryzen 5 GPU (not the best in terms of performance, but it
is what I have on hand). Moreover, according to the ROCm documentation,
my detected GPU (gfx90c) is not supported, neither by ROCm nor by the
proprietary AMDGPU-PRO driver. Still, I wanted to check whether OpenCL
by itself, without going through JtR, works or reports any errors.

2. I found a tool online, OpenCL-Benchmark, for benchmarking OpenCL:
https://github.com/ProjectPhysX/OpenCL-Benchmark. Another problem: it
crashed because of the VRAM size.

3. This raised the question of how to increase the VRAM, which is
actually shared memory also used by the CPU. The old BIOS version had
no option for it, so I updated to the latest BIOS available on the Asus
website, which added one more option to the BIOS setup. With it, I
raised the VRAM from 512 MB to 2 GB.

4. I reran OpenCL-Benchmark (which had failed in point 2), this time
with a good result, nice! (See the attachment; "good" just means that
it works ...)

5. Now, how can I be sure the tool is not using the CPU instead of the
GPU by chance? I found another interesting tool for monitoring AMD
GPUs, amdgpu_top: https://github.com/Umio-Yasuno/amdgpu_top. Normally
the GPU is almost idle, but once I started the benchmark the tool
reported GPU usage skyrocketing to 100%, nice!

6. Maybe I'm wrong here, but as I see it the chain is JtR -> OpenCL ->
ROCm, so if OpenCL can be benchmarked, ROCm should be working. Here
things get interesting: I installed rocRAND
(https://github.com/ROCm/rocRAND), which lists the supported GPUs
(gfxNNN), and gfx90c is not among them (see point 1). Indeed, the tool
didn't work on my setup at first, but I wanted to give it a try and,
after some googling, I found an interesting environment variable,
HSA_OVERRIDE_GFX_VERSION, which lets me override the detected GPU
version. Another surprise: with either gfx906 or gfx908 selected, the
tool started to work (GPU usage monitored with amdgpu_top as in the
previous tests), nice! My explanation: the commands sent to the GPU are
probably "similar enough" to work, a kind of common ground for these
GPUs.

7. And now, the final step: let's try the HSA_OVERRIDE_GFX_VERSION
variable from point 6 with JtR. Instead of this:

john --test=0 --format=sha256crypt-opencl
Device 1: gfx90c:xnack- [AMD Radeon Graphics]
Testing: sha256crypt-opencl, crypt(3) $5$ (rounds=5000) [SHA256 
OpenCL]... Segmentation fault

Let's try with this:

HSA_OVERRIDE_GFX_VERSION=9.0.8 john --test=0 --format=sha256crypt-opencl
Device 1: gfx908:sramecc-:xnack- [AMD Radeon Graphics]
Testing: sha256crypt-opencl, crypt(3) $5$ (rounds=5000) [SHA256 
OpenCL]... Build log: warning: argument unused during compilation: '-I 
opencl' [-Wunused-command-line-argument]
1 warning generated.
Segmentation fault

Oops ... no, this is not a good (enough) workaround. I can see that
gfx908 is forced in place of gfx90c, but that is still not enough; the
lack of gfx90c support in ROCm seems to be the blocker here.
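
For anyone who wants to script the override, the mapping between the
gfx name and the value I passed above (gfx908 -> 9.0.8) can be sketched
in Python. This is only an illustration under assumptions: I assume the
target name is "gfx" followed by a decimal major version and one
hexadecimal digit each for minor and stepping, and the helper names
gfx_to_override and run_with_override are hypothetical.

```python
import os
import re
import subprocess


def gfx_to_override(device):
    """Convert e.g. 'gfx908:sramecc-:xnack-' to '9.0.8'.

    Assumption: the name is 'gfx' + decimal major + one hex digit for
    the minor version + one hex digit for the stepping.
    """
    target = device.split(":")[0]  # drop feature flags such as ':xnack-'
    m = re.fullmatch(r"gfx(\d+)([0-9a-f])([0-9a-f])", target)
    if m is None:
        raise ValueError("unrecognized target: %r" % device)
    major, minor, stepping = m.groups()
    return "%d.%d.%d" % (int(major), int(minor, 16), int(stepping, 16))


def run_with_override(target, cmd, run=subprocess.run):
    """Run cmd with HSA_OVERRIDE_GFX_VERSION set for the given target."""
    env = dict(os.environ,
               HSA_OVERRIDE_GFX_VERSION=gfx_to_override(target))
    return run(cmd, env=env)
```

For example, run_with_override("gfx908", ["john", "--test=0",
"--format=sha256crypt-opencl"]) corresponds to the command line shown
in point 7.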

I hope I have at least provided references to some good tools for
playing with this stuff. I will try some more things and wait (or
pray?) for gfx90c support in ROCm (even if I guess there is not much
interest in using the GPU in these integrated devices).

Andrea

View attachment "asus-pn50-2GB-VRAM.log" of type "text/x-log" (2480 bytes)
