Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20171011202836.GM1627@brightrain.aerifal.cx>
Date: Wed, 11 Oct 2017 16:28:36 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: [PATCH v5] stdio: implement fopencookie(3)

On Wed, Oct 11, 2017 at 04:52:54AM +0000, William Pitcock wrote:
> +static size_t cookieread(FILE *f, unsigned char *buf, size_t len)
> +{
> +	struct fcookie *fc = f->cookie;
> +	size_t ret, remain = len, readlen = 0;
> +
> +	if (fc->iofuncs.read == NULL) return 0;

This should probably set F_EOF (per the man page). Also as a matter of
style, musl usually uses !p rather than p==0 or p==NULL, and avoids
NULL entirely. I wouldn't insist on changing this if there were
nothing else wrong with the code, but if you have to change it for EOF
anyway it'd be nice to make it consistent with style too.

> +	ret = fc->iofuncs.read(fc->cookie, (char *) buf, len - 1);

Except when called by fread, len is always 1, so you're making a call
to iofuncs.read with a length of 0 in all these cases. I doubt
iofuncs.read has accepting lenth 0 as part of its (undocumented)
contract; it seems like a really bad idea to do it even if it is
valid.

> +	if (ret == 0) goto bail_eof;

And in this case it's going to have to return 0, which makes goto
bail_eof clearly wrong. I don't think this code could have passed even
minimal testing. Have you only tried it with fread() and not other
functions like fgetc(), fgets(), etc.?

This can probably be fixed by making the above code (both the
iofuncs.read call and the goto check) conditional on len>1.

If len>1, you are servicing an fread() that wasn't satisfied fully by
existing buffer contents. Last night on IRC I suggested that it might
make sense to avoid filling the buffer at all in this case (since it's
a bit costly -- it requires 2 calls to iofuncs.read as you've done),
but I think I was wrong. The interesting case is a number of small
fread calls, each reading a length much smaller than the buffer size.
In that case, if you don't fill the buffer, you end up with lots of
calls to iofuncs.read. So I think you need to fill the buffer.

I still do think there might be good ways to make it so iofuncs.read
is always called just once, but making it work that way right now is
not essential as long as it just works, and might be better done as
part of a larger overhaul of how the stdio readv-like stuff works. As
noted before, the current way read/write backends for stdio FILEs have
to work is really klunky and designed around readv/writev for ordinary
files and around the needs of fread/fwrite without much consideration
for other functions. It could probably be improved considerably at
some point, but doing so will be delicate and probably should be done
alongside properly documenting all the internals.

> +	readlen += ret;
> +	remain -= ret;
> +
> +	f->rpos = f->buf;
> +	ret = fc->iofuncs.read(fc->cookie, (char *) f->rpos, f->buf_size);
> +	if (ret == 0) goto bail_eof;
> +	f->rend = f->rpos + ret;
> +
> +	if (remain > f->buf_size) remain = f->buf_size;
> +	memcpy(buf + readlen, f->rpos, remain);
> +	readlen += remain;
> +	f->rpos += remain;
> +
> +	return readlen;
> +
> +bail_eof:
> +	f->flags |= F_EOF;
> +	f->rpos = f->rend = f->buf;
> +	return readlen;
> +}

The rest of this looks like it mostly works, but seems to be missing
error handling. The man page documents that the read function can
return -1 on error, so I think you need to check for this and set the
stream error state.

> +static size_t cookiewrite(FILE *f, const unsigned char *buf, size_t len)
> +{
> +	struct fcookie *fc = f->cookie;
> +	size_t ret;
> +	size_t len2 = f->wpos - f->wbase;
> +	if (fc->iofuncs.write == NULL) return 0;
> +	if (len2) {
> +		f->wpos = f->wbase;
> +		if (cookiewrite(f, f->wpos, len2) < len2) return 0;
> +	}
> +	return fc->iofuncs.write(fc->cookie, (const char *) buf, len);
> +}

Likewise this is missing error handling. If iofuncs.write returns -1,
it will wrongly return SIZE_MAX, probably blowing up pointer
arithmetic in the caller. Instead it needs to return some value <len,
the actual amount of len written, on error, and must set the stream
error state if this happens. See __stdio_write.c and its handling of
cnt<0.

> +static off_t cookieseek(FILE *f, off_t off, int whence)
> +{
> +	struct fcookie *fc = f->cookie;
> +	if (fc->iofuncs.seek) return fc->iofuncs.seek(fc->cookie, &off, whence);
> +	return -1;
> +}
> +
> +static int cookieclose(FILE *f)
> +{
> +	struct fcookie *fc = f->cookie;
> +	if (fc->iofuncs.close) return fc->iofuncs.close(fc->cookie);
> +	return 0;
> +}
> +
> +FILE *fopencookie(void *cookie, const char *mode, cookie_io_functions_t iofuncs)
> +{
> +	FILE *f;
> +	struct fcookie *fc;
> +
> +	/* Check for valid initial mode character */
> +	if (!strchr("rwa", *mode)) {
> +		errno = EINVAL;
> +		return 0;
> +	}
> +
> +	/* Allocate FILE+fcookie+buffer or fail */
> +	if (!(f=malloc(sizeof *f + sizeof *fc + UNGET + BUFSIZ))) return 0;

In practice I think the pointer arithmetic works out okay, but it
might be better to use something like:

struct cookie_FILE {
	FILE f;
	struct fcookie fc;
	char buf[UNGET+BUFSIZ];
};

so that you don't have to rely on it being right.

> +	/* Zero-fill only the struct, not the buffer */
> +	memset(f, 0, sizeof *f);
> +
> +	/* Impose mode restrictions */
> +	if (!strchr(mode, '+')) f->flags = (*mode == 'r') ? F_NOWR : F_NORD;
> +
> +	/* Set up our fcookie */
> +	fc = (void *)(f + 1);
> +	fc->cookie = cookie;
> +	fc->iofuncs.read = iofuncs.read;
> +	fc->iofuncs.write = iofuncs.write;
> +	fc->iofuncs.seek = iofuncs.seek;
> +	fc->iofuncs.close = iofuncs.close;
> +
> +	f->fd = -1;
> +	f->cookie = fc;
> +	f->buf = (unsigned char *)f + sizeof *f + sizeof *fc + UNGET;
> +	f->buf_size = BUFSIZ;
> +	f->lbf = EOF;
> +	f->lock = 0;

The reason f->lock is 0 and not -1 might be worth a comment.

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.