* jk/http-backend-deadlock-2.2:
http-backend: spool ref negotiation requests to buffer
t5551: factor out tag creation
http-backend: fix die recursion with custom handler
When http-backend spawns "upload-pack" to do ref
negotiation, it streams the http request body to
upload-pack, who then streams the http response back to the
client as it reads. In theory, git can go full-duplex; the
client can consume our response while it is still sending
the request. In practice, however, HTTP is a half-duplex
protocol. Even if our client is ready to read and write
simultaneously, we may have other HTTP infrastructure in the
way, including the webserver that spawns our CGI, or any
intermediate proxies.
In at least one documented case[1], this leads to deadlock
when trying a fetch over http. What happens is basically:
1. Apache proxies the request to the CGI, http-backend.
2. http-backend gzip-inflates the data and sends
the result to upload-pack.
3. upload-pack acts on the data and generates output over
the pipe back to Apache. Apache isn't reading because
it's busy writing (step 1).
This works fine most of the time, because the upload-pack
output ends up in a system pipe buffer, and Apache reads
it as soon as it finishes writing. But if both the request
and the response exceed the system pipe buffer size, then we
deadlock (Apache blocks writing to http-backend,
http-backend blocks writing to upload-pack, and upload-pack
blocks writing to Apache).
We need to break the deadlock by spooling either the input
or the output. In this case, it's ideal to spool the input,
because Apache does not start reading either stdout _or_
stderr until we have consumed all of the input. So until we
do so, we cannot even get an error message out to the
client.
The solution is fairly straight-forward: we read the request
body into an in-memory buffer in http-backend, freeing up
Apache, and then feed the data ourselves to upload-pack. But
there are a few important things to note:
1. We limit the in-memory buffer to prevent an obvious
denial-of-service attack. This is a new hard limit on
requests, but it's unlikely to come into play. The
default value is 10MB, which covers even the ridiculous
100,000-ref negotation in the included test (that
actually caps out just over 5MB). But it's configurable
on the off chance that you don't mind spending some
extra memory to make even ridiculous requests work.
2. We must take care only to buffer when we have to. For
pushes, the incoming packfile may be of arbitrary
size, and we should connect the input directly to
receive-pack. There's no deadlock problem here, though,
because we do not produce any output until the whole
packfile has been read.
For upload-pack's initial ref advertisement, we
similarly do not need to buffer. Even though we may
generate a lot of output, there is no request body at
all (i.e., it is a GET, not a POST).
[1] http://article.gmane.org/gmane.comp.version-control.git/269020
Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
One of our tests in t5551 creates a large number of tags,
and jumps through some hoops to do it efficiently. Let's
factor that out into a function so we can make other similar
tests.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We create 50,000 tags to check that we don't overflow the
command-line of fetch-pack. But by using run_with_cmdline_limit,
we can get the same effect with a much smaller number of
tags. This makes the test fast enough that we can drop the
EXPENSIVE prereq, which means people will actually run it.
It was not documented to do so, but this test was also the
only test of a clone-over-http that requires multiple POSTs
during the conversation. We can continue to test that by
dropping http.postbuffer to its minimum size, and checking
that we get two POSTs.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* 'jc/test-lazy-prereq' (early part):
t3419: drop unnecessary NOT_EXPENSIVE pseudo-prerequisite
t3302: drop unnecessary NOT_EXPENSIVE pseudo-prerequisite
t3302: do not chdir around in the primary test process
t3302: coding style updates
test: turn USR_BIN_TIME into a lazy prerequisite
test: turn EXPENSIVE into a lazy prerequisite
Make clear which one is for dumb protocol, which one is for smart from
their file name.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>