core.fsyncmethod: add writeout-only mode

This commit introduces the `core.fsyncMethod` configuration
knob, which can currently be set to `fsync` or `writeout-only`.

The new writeout-only mode attempts to tell the operating system to
flush its in-memory page cache to the storage hardware without issuing a
CACHE_FLUSH command to the storage controller.

Writeout-only fsync is significantly faster than a vanilla fsync on
common hardware, since data is written to a disk-side cache rather than
all the way to a durable medium. Later changes in this patch series will
take advantage of this primitive to implement batching of hardware
flushes.

When git_fsync is called with FSYNC_WRITEOUT_ONLY, it may fail (for
example with ENOSYS on platforms that have no writeout-only primitive),
and the caller is expected to fall back to an ordinary fsync as needed.
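
The caller-side contract above can be sketched as follows. This is a minimal illustration, not code from this patch: `sketch_fsync`, `sketch_sync_with_fallback`, `sketch_demo`, and the `SKETCH_*` enum are hypothetical stand-ins for `git_fsync` and its actions, and the writeout-only branch is hard-wired to fail with ENOSYS to simulate a platform that lacks such a primitive.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for fsync() and fileno() */
#endif
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical mirror of the two actions named in the commit message. */
enum sketch_fsync_action {
	SKETCH_WRITEOUT_ONLY,
	SKETCH_HARDWARE_FLUSH
};

/*
 * Hypothetical stand-in for git_fsync(). The writeout-only branch always
 * fails with ENOSYS, as if the platform had no such primitive.
 */
static int sketch_fsync(int fd, enum sketch_fsync_action action)
{
	int err;

	if (action == SKETCH_WRITEOUT_ONLY) {
		errno = ENOSYS;
		return -1;
	}
	/* Ordinary fsync, retried on EINTR. */
	do {
		err = fsync(fd);
	} while (err < 0 && errno == EINTR);
	return err;
}

/* Try the cheap writeout first; fall back to a full flush if it fails. */
static int sketch_sync_with_fallback(int fd)
{
	assert(fd >= 0);
	if (sketch_fsync(fd, SKETCH_WRITEOUT_ONLY) >= 0)
		return 0;
	return sketch_fsync(fd, SKETCH_HARDWARE_FLUSH);
}

/* Usage demo: write to a temporary file and sync it via the fallback path. */
static int sketch_demo(void)
{
	FILE *f = tmpfile();
	int rc;

	if (!f)
		return -1;
	if (fputs("payload\n", f) == EOF || fflush(f) != 0) {
		fclose(f);
		return -1;
	}
	rc = sketch_sync_with_fallback(fileno(f));
	fclose(f);
	return rc;
}
```

In this sketch the fallback fires on any writeout-only failure. A real caller might prefer to fall back only on specific errors, which is a design choice the commit message leaves open ("as needed").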

On Apple platforms, the fsync system call does not issue a CACHE_FLUSH
directive to the storage controller. This change makes the hardware-flush
path call fcntl(F_FULLFSYNC) so that it is actually durable. To maintain
parity with existing behavior, the new core.fsyncMethod option defaults
to writeout-only on Apple platforms.
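
As a rough sketch of how the two accepted values and a platform-dependent default could fit together, consider the following. The `sketch_*` and `SKETCH_*` names are hypothetical and do not come from this patch, and the choice of writeout-only as the Apple default is an assumption drawn from the parity argument above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical enum for the two values core.fsyncMethod accepts. */
enum sketch_fsync_method {
	SKETCH_METHOD_INVALID = -1,
	SKETCH_METHOD_FSYNC,
	SKETCH_METHOD_WRITEOUT_ONLY
};

/*
 * Assumed default: writeout-only on Apple platforms (matching what plain
 * fsync already did there), fsync everywhere else.
 */
#ifdef __APPLE__
#define SKETCH_METHOD_DEFAULT SKETCH_METHOD_WRITEOUT_ONLY
#else
#define SKETCH_METHOD_DEFAULT SKETCH_METHOD_FSYNC
#endif

/*
 * Map a configured string to a method. The value must be non-NULL; an
 * unset option is represented by SKETCH_METHOD_DEFAULT at the call site.
 */
static enum sketch_fsync_method sketch_parse_fsync_method(const char *value)
{
	assert(value != NULL);
	if (!strcmp(value, "fsync"))
		return SKETCH_METHOD_FSYNC;
	if (!strcmp(value, "writeout-only"))
		return SKETCH_METHOD_WRITEOUT_ONLY;
	return SKETCH_METHOD_INVALID;
}
```

A caller would start from `SKETCH_METHOD_DEFAULT` and overwrite it only when the option is set to a value that parses successfully.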

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Neeraj Singh
2022-03-10 22:43:20 +00:00
committed by Junio C Hamano
parent 19d3f228c8
commit abf38abec2
13 changed files with 184 additions and 8 deletions

@@ -546,6 +546,70 @@ int xmkstemp_mode(char *filename_template, int mode)
	return fd;
}

/*
 * Some platforms return EINTR from fsync. Since fsync is invoked in some
 * cases by a wrapper that dies on failure, do not expose EINTR to callers.
 */
static int fsync_loop(int fd)
{
	int err;

	do {
		err = fsync(fd);
	} while (err < 0 && errno == EINTR);
	return err;
}

int git_fsync(int fd, enum fsync_action action)
{
	switch (action) {
	case FSYNC_WRITEOUT_ONLY:

#ifdef __APPLE__
		/*
		 * On macOS, fsync just causes filesystem cache writeback but
		 * does not flush hardware caches.
		 */
		return fsync_loop(fd);
#endif

#ifdef HAVE_SYNC_FILE_RANGE
		/*
		 * On linux 2.6.17 and above, sync_file_range is the way to
		 * issue a writeback without a hardware flush. An offset of
		 * 0 and size of 0 indicates writeout of the entire file and the
		 * wait flags ensure that all dirty data is written to the disk
		 * (potentially in a disk-side cache) before we continue.
		 */
		return sync_file_range(fd, 0, 0, SYNC_FILE_RANGE_WAIT_BEFORE |
						 SYNC_FILE_RANGE_WRITE |
						 SYNC_FILE_RANGE_WAIT_AFTER);
#endif

#ifdef fsync_no_flush
		return fsync_no_flush(fd);
#endif

		errno = ENOSYS;
		return -1;

	case FSYNC_HARDWARE_FLUSH:
		/*
		 * On macOS, a special fcntl is required to really flush the
		 * caches within the storage controller. As of this writing,
		 * this is a very expensive operation on Apple SSDs.
		 */
#ifdef __APPLE__
		return fcntl(fd, F_FULLFSYNC);
#else
		return fsync_loop(fd);
#endif
	default:
		BUG("unexpected git_fsync(%d) call", action);
	}
}

static int warn_if_unremovable(const char *op, const char *file, int rc)
{
	int err;