Message-ID: <20111216231340.GA23495@openwall.com>
Date: Sat, 17 Dec 2011 03:13:40 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: 1.7.9's --external + OpenMP fails on Cygwin

On Sat, Dec 17, 2011 at 01:46:48AM +0400, Solar Designer wrote:
> src/winsup/cygwin/thread.cc:
>
> int
> pthread_mutex::init (pthread_mutex_t *mutex,
>                      const pthread_mutexattr_t *attr,
>                      const pthread_mutex_t initializer)
> {
>   if (attr && !pthread_mutexattr::is_good_object (attr))
>     return EINVAL;
>
>   mutex_initialization_lock.lock ();
>   if (initializer == NULL || pthread_mutex::is_initializer (mutex))
>
> Notice how the not yet initialized mutex is checked with
> "pthread_mutex::is_initializer (mutex)".  And yes, it catches faults:
...

This was close, but not quite it.  The same approach is used in other
parts of the Cygwin threads code, including in:

int
semaphore::init (sem_t *sem, int pshared, unsigned int value)
{
  /* We can't tell the difference between reinitialising an
     existing semaphore and initialising a semaphore who's
     contents happen to be a valid pointer
   */
  if (is_good_object (sem))
    {
      paranoid_printf ("potential attempt to reinitialise a semaphore");
    }

where:

inline bool
semaphore::is_good_object (sem_t const * sem)
{
  if (verifyable_object_isvalid (sem, SEM_MAGIC) != VALID_OBJECT)
    return false;
  return true;
}

While paranoid_printf() is probably not triggered, a fault is often
triggered (on invalid pointer inside the not-yet-initialized semaphore).
And apparently there's something wrong with the fault handling.

Since this stuff is not needed, I binary-patched it out of my copy of
cygwin1.dll.
As seen with "objdump -d" and "diff -u":

 610ecff6:  e8 75 d4 06 00          call   6115a470 <__Z11__set_errnoPKcii>
 610ecffb:  b8 ff ff ff ff          mov    $0xffffffff,%eax
 610ed000:  eb 42                   jmp    610ed044 <__ZN9semaphore4initEPPS_ij+0x194>
-610ed002:  8b 06                   mov    (%esi),%eax
-610ed004:  81 78 04 4c f0 0d df    cmpl   $0xdf0df04c,0x4(%eax)
+610ed002:  33 c0                   xor    %eax,%eax
+610ed004:  40                      inc    %eax
+610ed005:  90                      nop
+610ed006:  90                      nop
+610ed007:  90                      nop
+610ed008:  90                      nop
+610ed009:  90                      nop
+610ed00a:  90                      nop
 610ed00b:  0f 85 0f ff ff ff       jne    610ecf20 <__ZN9semaphore4initEPPS_ij+0x70>
 610ed011:  8b 95 14 ff ff ff       mov    -0xec(%ebp),%edx
 610ed017:  64 a1 04 00 00 00       mov    %fs:0x4,%eax

After this change, the problem went away.

Another workaround that worked was to add:

	free(calloc(1, 0x100));
	free(calloc(1, 0x1000));
	free(calloc(1, 0x10000));
	free(calloc(1, 0x100000));

right before one of the parallel regions where the problem was otherwise
triggered, but this obviously has performance impact.

Alexander
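
For illustration only, the calloc()/free() workaround above could be
wrapped in a small helper and called right before a parallel region.
This is a rough sketch, not code from John the Ripper: the names
prime_heap() and sum_squares() are hypothetical, and the loop is a
stand-in for a real parallel region.  The message does not say why the
allocations help, so the sketch only shows where they would go.  Build
with -fopenmp to get actual threads; without it the pragma is ignored.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: allocate and immediately free zeroed blocks of
 * the four sizes used in the workaround.  Returns how many of the
 * allocations succeeded.
 */
static int prime_heap(void)
{
    static const size_t sizes[] = { 0x100, 0x1000, 0x10000, 0x100000 };
    int ok = 0;
    size_t i;

    for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
        void *p = calloc(1, sizes[i]);
        if (p)
            ok++;
        free(p); /* free(NULL) is a no-op */
    }
    return ok;
}

/*
 * Stand-in for a function containing a parallel region: the helper is
 * called right before the region, so its cost is paid on every call.
 */
static long sum_squares(int n)
{
    long total = 0;
    int i;

    prime_heap();

#pragma omp parallel for reduction(+:total)
    for (i = 0; i < n; i++)
        total += (long)i * i;

    return total;
}
```

Calling the helper on every entry to the region is what causes the
performance impact; calling it less often is untested here.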