Message-ID: <bc3da1a4-4b99-737f-050e-54ef5844c402@gmail.com>
Date: Wed, 5 Oct 2016 15:11:05 +0200
From: Jakub Narębski <jnareb@...il.com>
To: musl@...ts.openwall.com, James B <jamesbond3142@...il.com>
Cc: Johannes Schindelin <Johannes.Schindelin@....de>,
 Jeff King <peff@...f.net>, git@...r.kernel.org
Subject: Re: Regression: git no longer works with musl libc's regex impl

On 05.10.2016 at 00:33, Rich Felker wrote:
> On Wed, Oct 05, 2016 at 09:06:25AM +1100, James B wrote:
>> On Tue, 4 Oct 2016 18:08:33 +0200 (CEST)
>> Johannes Schindelin <Johannes.Schindelin@....de> wrote:
>>>
>>> No, it is not. You quote POSIX, but the matter of the fact is that we use
>>> a subset of POSIX in order to be able to keep things running on Windows.
>>>
>>> And quite honestly, there are lots of reasons to keep things running on
>>> Windows, and even to favor Windows support over musl support. Over four
>>> million reasons: the Git for Windows users.
>>
>> Wow, I don't know that Windows is a git's first-tier platform now,
>> and Linux/POSIX second. Are we talking about the same git that was
>> originally written in Linus Torvalds, and is used to manage Linux
>> kernel? Are you by any chance employed by Redmond, directly or
>> indirectly?
>>
>> Sorry - can't help it.

Windows is one of the major platforms, yes.  I think there are many,
many more people using Git on Windows than using Git with musl.  More
users = more important.

Also, working with some inconvenience (requiring compilation with
NO_REGEX=1) is better than not working at all.

In CodingGuidelines we say:

 - Most importantly, we never say "It's in POSIX; we'll happily
   ignore your needs should your system not conform to it."
   We live in the real world.

 - However, we often say "Let's stay away from that construct,
   it's not even in POSIX".

 - In spite of the above two rules, we sometimes say "Although
   this is not in POSIX, it (is so convenient | makes the code
   much more readable | has other good characteristics) and
   practically all the platforms we care about support it, so
   let's use it".

REG_STARTEND falls under the 3rd point; the mmap shenanigans look like the 1st...

...on the other hand midipix <writeonce@...ipix.org> wrote in
http://public-inbox.org/git/20161004200057.dc30d64f61e5ec441c34ffd4f788e58e.efa66ead67.wbe@email15.godaddy.com/
that the proposed fix should work on all Windows versions we are
interested in (I think).  Test program included / attached.

The above-mentioned email also explains that the problem was
caught on MS Windows; it triggers if the end of the file falls on
an mmapped page boundary, which is more likely to happen with
4096 mod size on Windows than with 65536 mod size on Linux.
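
To make the trigger condition concrete: when a file's length is an
exact multiple of the page size, the mapping has no zero-filled tail
after the last byte, so code that expects a terminating NUL reads past
the end.  A minimal sketch of that check follows; the helper name
ends_on_page_boundary() is made up for illustration and is not taken
from git or from the linked email.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of git): returns 1 if a file of
 * file_size bytes ends exactly on a boundary of page_size bytes,
 * i.e. the last mapped page has no zero-filled slack after the data.
 */
static int ends_on_page_boundary(size_t file_size, size_t page_size)
{
	assert(page_size > 0);
	return file_size > 0 && file_size % page_size == 0;
}
```

With a modulus of 4096 every 4096-byte multiple qualifies, while with
65536 only every sixteenth of those does.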


On the other hand, while the proposed solution of "add padding so as
not to end at a page boundary, if necessary" doesn't have the
performance impact of "memcpy into NUL-terminated buffer" that
was originally proposed in the patch series, it is still extra code
to maintain.

> 
> I don't think the hostility and sarcasm are really needed here. But
> what this does speak to is that users don't like feeling like their
> platform is being treated as a second-class target, which is what it
> feels like when you have to manually flip a switch to make git build.

You are welcome to send a patch that adds detection of REG_STARTEND
support in the standard library to configure.ac (setting NO_REGEX if
needed), and/or adds uname-based defaults to the Makefile that set
NO_REGEX when compiling with musl.
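
As a rough idea of what such a detection could test, here is a sketch
of a probe in C.  The function name regex_supports_startend() is
hypothetical, and this is not the check that git's configure.ac
actually uses; it only assumes that a library without the extension
does not define the REG_STARTEND macro.

```c
#include <regex.h>

/*
 * Hypothetical probe (not git's actual configure check).
 * Returns 1 if <regex.h> defines REG_STARTEND and regexec() honors it,
 * 0 otherwise.
 */
static int regex_supports_startend(void)
{
#ifdef REG_STARTEND
	regex_t re;
	regmatch_t m;
	int honored;

	if (regcomp(&re, "b", 0) != 0)
		return 0;
	/* Restrict the search to the first byte of "ab", i.e. just "a". */
	m.rm_so = 0;
	m.rm_eo = 1;
	/* If the limit is honored, the "b" beyond it must not be found. */
	honored = regexec(&re, "ab", 1, &m, REG_STARTEND) == REG_NOMATCH;
	regfree(&re);
	return honored;
#else
	return 0;
#endif
}
```

A build system could set NO_REGEX whenever a probe like this
returns 0.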

> This is especially unfriendly when the semantics of the switch come
> across, at least to some users, as "your system regex is incomplete"
> rather than "git can't use it because git depends on nonstandard
> extensions".

Nonstandard but common extension.  As the commit message of 2f8952250a says:
https://github.com/git/git/commit/2f8952250a84313b74f96abb7b035874854cf202

  Happily, there is an extension to regexec() introduced by the NetBSD
  project and present in all major regex implementation including
  Linux', MacOSX' and the one Git includes in compat/regex/: [...]

  [...]

  Since support for REG_STARTEND is so widespread by now, let's just
  introduce a helper function that always uses it, and tell people
  on a platform whose regex library does not support it to use the
  one from our compat/regex/ directory.
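
For readers who have not used the extension: with REG_STARTEND the
caller passes the region to search in pmatch[0], so the buffer does
not need to be NUL-terminated.  Below is a minimal sketch of a helper
in the spirit of the one the commit message describes; the name
match_in_buffer() and the -1 fallback are my own illustration, not
git's code.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper (illustration only, not git's implementation).
 * Searches buf[0..size) without requiring a terminating NUL.
 * Returns 0 on match (offsets stored in *m), REG_NOMATCH if nothing
 * matched, or -1 if the C library does not define REG_STARTEND.
 */
static int match_in_buffer(const regex_t *preg, const char *buf,
			   size_t size, regmatch_t *m)
{
#ifdef REG_STARTEND
	assert(m != NULL);
	m->rm_so = 0;                 /* start of the searched region */
	m->rm_eo = (regoff_t)size;    /* one past its last byte */
	return regexec(preg, buf, 1, m, REG_STARTEND);
#else
	(void)preg; (void)buf; (void)size; (void)m;
	return -1;
#endif
}
```

On a library without the extension the sketch simply reports -1,
which corresponds to the case where one would build with NO_REGEX=1
and use compat/regex/ instead.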

Also, as Junio said, the description of the NO_REGEX option in the
Makefile now explicitly says:

    # Define NO_REGEX if your C library lacks regex support with REG_STARTEND
    # feature.

Best,
-- 
Jakub Narębski
