ShuriZma/xemu - xemu

Commit Graph

Author	SHA1	Message	Date
John Snow	5ccac6f186	blockjob: add block_job_start Instead of automatically starting jobs at creation time via backup_start et al, we'd like to return a job object pointer that can be started manually at later point in time. For now, add the block_job_start mechanism and start the jobs automatically as we have been doing, with conversions job-by-job coming in later patches. Of note: cancellation of unstarted jobs will perform all the normal cleanup as if the job had started, particularly abort and clean. The only difference is that we will not emit any events, because the job never actually started. Signed-off-by: John Snow <jsnow@redhat.com> Message-id: 1478587839-9834-5-git-send-email-jsnow@redhat.com Signed-off-by: Jeff Cody <jcody@redhat.com>	2016-11-14 22:47:34 -05:00
John Snow	a7815a764c	blockjob: add .start field Add an explicit start field to specify the entrypoint. We already have ownership of the coroutine itself AND managing the lifetime of the coroutine, let's take control of creation of the coroutine, too. This will allow us to delay creation of the actual coroutine until we know we'll actually start a BlockJob in block_job_start. This avoids the sticky question of how to "un-create" a Coroutine that hasn't been started yet. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-id: 1478587839-9834-4-git-send-email-jsnow@redhat.com Signed-off-by: Jeff Cody <jcody@redhat.com>	2016-11-14 22:47:34 -05:00
John Snow	e8a40bf71d	blockjob: add .clean property Cleaning up after we have deferred to the main thread but before the transaction has converged can be dangerous and result in deadlocks if the job cleanup invokes any BH polling loops. A job may attempt to begin cleaning up, but may induce another job to enter its cleanup routine. The second job, part of our same transaction, will block waiting for the first job to finish, so neither job may now make progress. To rectify this, allow jobs to register a cleanup operation that will always run regardless of if the job was in a transaction or not, and if the transaction job group completed successfully or not. Move sensitive cleanup to this callback instead which is guaranteed to be run only after the transaction has converged, which removes sensitive timing constraints from said cleanup. Furthermore, in future patches these cleanup operations will be performed regardless of whether or not we actually started the job. Therefore, cleanup callbacks should essentially confine themselves to undoing create operations, e.g. setup actions taken in what is now backup_start. Reported-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-id: 1478587839-9834-3-git-send-email-jsnow@redhat.com Signed-off-by: Jeff Cody <jcody@redhat.com>	2016-11-14 22:47:34 -05:00
Stefan Hajnoczi	199a5bde46	* NBD bugfix (Changlong) * NBD write zeroes support (Eric) * Memory backend fixes (Haozhong) * Atomics fix (Alex) * New AVX512 features (Luwei) * "make check" logging fix (Paolo) * Chardev refactoring fallout (Paolo) * Small checkpatch improvements (Paolo, Jeff) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQExBAABCAAbBQJYGaRPFBxwYm9uemluaUByZWRoYXQuY29tAAoJEL/70l94x66D XKgH/RgNtosBTqJsmphkS7wACFAFOf7Uq46ajoKfB66Pt1J/++pFQg4TApPYkb7j KlKeKmXa7hb6+Jg8325H4zGkGno4kn2dE+OnznaB1xPKwiZVAMQVzQsagsEVqpno k/5PBVRptIiuHQKyU29Go0CxbWJBTH0O14S7rDK4YDF0YMnuT280HQOI3jdu1igV G/Q+CMgfk+yXf6GWHE8Z9sNq7n0ha8qgruA/X3NC7+pAvEsUcAP065zwLp9weYuK W1MU68L7Ub4tRo0SVf1HFkDUNdMv4T4hg+wpGe1GwthJWexHu9x0YAQBy60ykJb6 NtHwjLwCUWtm7AiZD/btsOJPmjk= =+Dt/ -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging * NBD bugfix (Changlong) * NBD write zeroes support (Eric) * Memory backend fixes (Haozhong) * Atomics fix (Alex) * New AVX512 features (Luwei) * "make check" logging fix (Paolo) * Chardev refactoring fallout (Paolo) * Small checkpatch improvements (Paolo, Jeff) # gpg: Signature made Wed 02 Nov 2016 08:31:11 AM GMT # gpg: using RSA key 0xBFFBD25F78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: (30 commits) main-loop: Suppress I/O thread warning under qtest docs/rcu.txt: Fix minor typo vl: exit qemu on guest panic if -no-shutdown is not set checkpatch: allow spaces before parenthesis for 'coroutine_fn' x86: add AVX512_4VNNIW and AVX512_4FMAPS features slirp: fix CharDriver breakage qemu-char: do not forward events through the mux until QEMU has started nbd: Implement NBD_CMD_WRITE_ZEROES on client nbd: Implement NBD_CMD_WRITE_ZEROES on server nbd: Improve server handling of shutdown requests nbd: Refactor conversion to errno to silence checkpatch nbd: Support shorter handshake nbd: Less allocation during NBD_OPT_LIST nbd: Let client skip portions of server reply nbd: Let server know when client gives up negotiation nbd: Share common option-sending code in client nbd: Send message along with server NBD_REP_ERR errors nbd: Share common reply-sending code in server nbd: Rename struct nbd_request and nbd_reply nbd: Rename NbdClientSession to NBDClientSession ... Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-11-03 16:32:30 +00:00
Eric Blake	1f4d6d18ed	nbd: Implement NBD_CMD_WRITE_ZEROES on server Upstream NBD protocol recently added the ability to efficiently write zeroes without having to send the zeroes over the wire, along with a flag to control whether the client wants to allow a hole. Note that when it comes to requiring full allocation, vs. permitting optimizations, the NBD spec intentionally picked a different sense for the flag; the rules in qemu are: MAY_UNMAP == 0: must write zeroes MAY_UNMAP == 1: may use holes if reads will see zeroes while in NBD, the rules are: FLAG_NO_HOLE == 1: must write zeroes FLAG_NO_HOLE == 0: may use holes if reads will see zeroes In all cases, the 'may use holes' scenario is optional (the server need not use a hole, and must not use a hole if subsequent reads would not see zeroes). Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <1476469998-28592-16-git-send-email-eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-11-02 09:28:56 +01:00
Eric Blake	b6f5d3b573	nbd: Improve server handling of shutdown requests NBD commit 6d34500b clarified how clients and servers are supposed to behave before closing a connection. It added NBD_REP_ERR_SHUTDOWN (for the server to announce it is about to go away during option haggling, so the client should quit sending NBD_OPT_* other than NBD_OPT_ABORT) and ESHUTDOWN (for the server to announce it is about to go away during transmission, so the client should quit sending NBD_CMD_* other than NBD_CMD_DISC). It also clarified that NBD_OPT_ABORT gets a reply, while NBD_CMD_DISC does not. This patch merely adds the missing reply to NBD_OPT_ABORT and teaches the client to recognize server errors. Actually teaching the server to send NBD_REP_ERR_SHUTDOWN or ESHUTDOWN would require knowing that the server has been requested to shut down soon (maybe we could do that by installing a SIGINT handler in qemu-nbd, which transitions from RUNNING to a new state that waits for the client to react, rather than just out-right quitting - but that's a bigger task for another day). Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <1476469998-28592-15-git-send-email-eblake@redhat.com> [Move dummy ESHUTDOWN to include/qemu/osdep.h. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-11-02 09:28:56 +01:00
Eric Blake	c203c59ad9	nbd: Support shorter handshake The NBD Protocol allows the server and client to mutually agree on a shorter handshake (omit the 124 bytes of reserved 0), via the server advertising NBD_FLAG_NO_ZEROES and the client acknowledging with NBD_FLAG_C_NO_ZEROES (only possible in newstyle, whether or not it is fixed newstyle). It doesn't shave much off the wire, but we might as well implement it. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alex Bligh <alex@alex.org.uk> Message-Id: <1476469998-28592-13-git-send-email-eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-11-02 09:28:56 +01:00
Eric Blake	c8a3a1b6c4	nbd: Share common option-sending code in client Rather than open-coding each option request, it's easier to have common helper functions do the work. That in turn requires having convenient packed types for handling option requests and replies. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <1476469998-28592-9-git-send-email-eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-11-02 09:28:55 +01:00
Eric Blake	ed2dd91267	nbd: Rename struct nbd_request and nbd_reply Our coding convention prefers CamelCase names, and we already have other existing structs with NBDFoo naming. Let's be consistent, before later patches add even more structs. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <1476469998-28592-6-git-send-email-eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-11-02 09:28:55 +01:00
Eric Blake	b626b51a67	nbd: Treat flags vs. command type as separate fields Current upstream NBD documents that requests have a 16-bit flags, followed by a 16-bit type integer; although older versions mentioned only a 32-bit field with masking to find flags. Since the protocol is in network order (big-endian over the wire), the ABI is unchanged; but dealing with the flags as a separate field rather than masking will make it easier to add support for upcoming NBD extensions that increase the number of both flags and commands. Improve some comments in nbd.h based on the current upstream NBD protocol (https://github.com/yoe/nbd/blob/master/doc/proto.md), and touch some nearby code to keep checkpatch.pl happy. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <1476469998-28592-3-git-send-email-eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-11-02 09:28:55 +01:00
Eric Blake	b1a75b3348	nbd: Add qemu-nbd -D for human-readable description The NBD protocol allows servers to advertise a human-readable description alongside an export name during NBD_OPT_LIST. Add an option to pass through the user's string to the NBD client. Doing this also makes it easier to test commit `200650d4`, which is the client counterpart of receiving the description. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <1476469998-28592-2-git-send-email-eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-11-02 09:28:55 +01:00
John Snow	d899636810	blockjobs: fix documentation (Trivial) Fix wrong function names in documentation. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Message-id: 1477584421-1399-8-git-send-email-jsnow@redhat.com Signed-off-by: Jeff Cody <jcody@redhat.com>	2016-11-01 08:04:56 -04:00
John Snow	c87621ea68	blockjobs: split interface into public/private, Part 1 To make it a little more obvious which functions are intended to be public interface and which are intended to be for use only by jobs themselves, split the interface into "public" and "private" files. Convert blockjobs (e.g. block/backup) to using the private interface. Leave blockdev and others on the public interface. There are remaining uses of private state by qemu-img, and several cases in blockdev.c and block/io.c where we grab job->blk for the purposes of acquiring an AIOContext. These will be corrected in future patches. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Message-id: 1477584421-1399-7-git-send-email-jsnow@redhat.com Signed-off-by: Jeff Cody <jcody@redhat.com>	2016-11-01 08:04:56 -04:00
John Snow	0df4ba5863	Blockjobs: Internalize user_pause logic BlockJobs will begin hiding their state in preparation for some refactorings anyway, so let's internalize the user_pause mechanism instead of leaving it to callers to correctly manage. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Message-id: 1477584421-1399-6-git-send-email-jsnow@redhat.com Signed-off-by: Jeff Cody <jcody@redhat.com>	2016-11-01 07:55:57 -04:00
John Snow	8254b6d953	blockjob: centralize QMP event emissions There's no reason to leave this to blockdev; we can do it in blockjobs directly and get rid of an extra callback for most users. All non-internal events, even those created outside of QMP, will consistently emit events. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Message-id: 1477584421-1399-5-git-send-email-jsnow@redhat.com Signed-off-by: Jeff Cody <jcody@redhat.com>	2016-11-01 07:55:57 -04:00
John Snow	47970dfb0a	Replication/Blockjobs: Create replication jobs as internal Bubble up the internal interface to commit and backup jobs, then switch replication tasks over to using this methodology. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Message-id: 1477584421-1399-4-git-send-email-jsnow@redhat.com Signed-off-by: Jeff Cody <jcody@redhat.com>	2016-11-01 07:55:57 -04:00
John Snow	f81e0b4532	blockjobs: Allow creating internal jobs Add the ability to create jobs without an ID. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Message-id: 1477584421-1399-3-git-send-email-jsnow@redhat.com Signed-off-by: Jeff Cody <jcody@redhat.com>	2016-11-01 07:55:57 -04:00
John Snow	559b935f8c	blockjobs: hide internal jobs from management API If jobs are not created directly by the user, do not allow them to be seen by the user/management utility. At the moment, 'internal' jobs are those that do not have an ID. As of this patch it is impossible to create such jobs. Signed-off-by: John Snow <jsnow@redhat.com> Message-id: 1477584421-1399-2-git-send-email-jsnow@redhat.com Signed-off-by: Jeff Cody <jcody@redhat.com>	2016-11-01 07:55:57 -04:00
Alberto Garcia	23d402d42b	block: Add block_job_add_bdrv() When a block job is created on a certain BlockDriverState, operations are blocked there while the job exists. However, some block jobs may involve additional BDSs, which must be blocked separately when the job is created and unblocked manually afterwards. This patch adds block_job_add_bdrv(), that simplifies this process by keeping a list of BDSs that are involved in the specified block job. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-10-31 16:52:38 +01:00
Alberto Garcia	c0778f6693	block: Add bdrv_drain_all_{begin,end}() bdrv_drain_all() doesn't allow the caller to do anything after all pending requests have been completed but before block jobs are resumed. This patch splits bdrv_drain_all() into _begin() and _end() for that purpose. It also adds aio_{disable,enable}_external() calls to disable external clients in the meantime. An important restriction of this split is that no new block jobs or BlockDriverStates can be created between the bdrv_drain_all_begin() and bdrv_drain_all_end() calls. This is not a concern now because we'll only be using this in bdrv_reopen_multiple(), but it must be dealt with if we ever have other uses cases in the future. Signed-off-by: Alberto Garcia <berto@igalia.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-10-31 16:51:14 +01:00
Paolo Bonzini	3fe7122337	aio: convert from RFifoLock to QemuRecMutex It is simpler and a bit faster, and QEMU does not need the contention callbacks (and thus the fairness) anymore. Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1477565348-5458-21-git-send-email-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2016-10-28 21:50:18 +08:00
Paolo Bonzini	65c1b5b622	iothread: release AioContext around aio_poll This is the first step towards having fine-grained critical sections in dataplane threads, which will resolve lock ordering problems between address_space_* functions (which need the BQL when doing MMIO, even after we complete RCU-based dispatch) and the AioContext. Because AioContext does not use contention callbacks anymore, the unit test has to be changed. Previously applied as `a0710f7995` and then reverted. Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1477565348-5458-19-git-send-email-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2016-10-28 21:50:18 +08:00
Paolo Bonzini	c9d1a56174	block: only call aio_poll on the current thread's AioContext aio_poll is not thread safe; for example bdrv_drain can hang if the last in-flight I/O operation is completed in the I/O thread after the main thread has checked bs->in_flight. The bug remains latent as long as all of it is called within aio_context_acquire/aio_context_release, but this will change soon. To fix this, if bdrv_drain is called from outside the I/O thread, signal the main AioContext through a dummy bottom half. The event loop then only runs in the I/O thread. Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1477565348-5458-18-git-send-email-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2016-10-28 21:50:18 +08:00
Paolo Bonzini	720150f318	block: prepare bdrv_reopen_multiple to release AioContext After the next patch bdrv_drain_all will have to be called without holding any AioContext. Prepare to do this by adding an AioContext argument to bdrv_reopen_multiple. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1477565348-5458-15-git-send-email-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2016-10-28 21:50:18 +08:00
Paolo Bonzini	e437016511	aio: introduce qemu_get_current_aio_context This will be used by BDRV_POLL_WHILE (and thus by bdrv_drain) to choose how to wait for I/O completion. Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1477565348-5458-12-git-send-email-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2016-10-28 21:50:18 +08:00
Paolo Bonzini	88b062c203	block: introduce BDRV_POLL_WHILE We want the BDS event loop to run exclusively in the iothread that owns the BDS's AioContext. This macro will provide the synchronization between the two event loops; for now it just wraps the common idiom of a while loop around aio_poll. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-Id: <1477565348-5458-8-git-send-email-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2016-10-28 21:50:18 +08:00
Paolo Bonzini	9972354856	block: add BDS field to count in-flight requests Unlike tracked_requests, this field also counts throttled requests, and remains non-zero if an AIO operation needs a BH to be "really" completed. With this change, it is no longer necessary to have a dummy BdrvTrackedRequest for requests that are never serialising, and it is no longer necessary to poll the AioContext once after bdrv_requests_pending(bs) returns false. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-Id: <1477565348-5458-5-git-send-email-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2016-10-28 21:50:18 +08:00
Paolo Bonzini	bae8196d9f	blockjob: introduce .drain callback for jobs This is required to decouple block jobs from running in an AioContext. With multiqueue block devices, a BlockDriverState does not really belong to a single AioContext. The solution is to first wait until all I/O operations are complete; then loop in the main thread for the block job to complete entirely. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-Id: <1477565348-5458-3-git-send-email-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2016-10-28 21:50:18 +08:00
Kevin Wolf	cbc14ac9c3	block: Remove bdrv_aio_ioctl() It is unused now. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2016-10-27 19:05:23 +02:00
Kevin Wolf	16a389dc9e	block: Introduce .bdrv_co_ioctl() driver callback This allows drivers to implement ioctls in a coroutine-based way. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2016-10-27 19:05:23 +02:00
Kevin Wolf	61b2450414	block: Remove bdrv_ioctl() It is unused now. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2016-10-27 19:05:23 +02:00
Kevin Wolf	48af776a5b	block: Use blk_co_ioctl() for all BB level ioctls All read/write functions already have a single coroutine-based function on the BlockBackend level through which all requests go (no matter what API style the external caller used) and which passes the requests down to the block node level. This patch exports a bdrv_co_ioctl() function and uses it to extend this mode of operation to ioctls. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2016-10-27 19:05:22 +02:00
Kevin Wolf	7381e95cc2	block: Remove bdrv_aio_pdiscard() It is unused now. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2016-10-27 19:05:22 +02:00
Fam Zheng	6d3f4049ba	block: More operations for meta dirty bitmap Callers can create an iterator of meta bitmap with bdrv_dirty_meta_iter_new(), then use the bdrv_dirty_iter_* operations on it. Meta iterators are also counted by bitmap->active_iterators. Also add a couple of functions to retrieve granularity and count. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Message-id: 1476395910-8697-11-git-send-email-jsnow@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2016-10-24 17:56:07 +02:00
Vladimir Sementsov-Ogievskiy	882c36f590	block: BdrvDirtyBitmap serialization interface Several functions to provide necessary access to BdrvDirtyBitmap for block-migration.c Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> [Add the "finish" parameters. - Fam] Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Message-id: 1476395910-8697-9-git-send-email-jsnow@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2016-10-24 17:56:07 +02:00
Fam Zheng	15891fac7d	block: Add two dirty bitmap getters For dirty bitmap users to get the size and the name of a BdrvDirtyBitmap. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Message-id: 1476395910-8697-6-git-send-email-jsnow@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2016-10-24 17:56:07 +02:00
Fam Zheng	fb933437de	block: Support meta dirty bitmap The added group of operations enables tracking of the changed bits in the dirty bitmap. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Message-id: 1476395910-8697-5-git-send-email-jsnow@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2016-10-24 17:56:07 +02:00
Fam Zheng	dc162c8e4f	block: Hide HBitmap in block dirty bitmap interface HBitmap is an implementation detail of block dirty bitmap that should be hidden from users. Introduce a BdrvDirtyBitmapIter to encapsulate the underlying HBitmapIter. A small difference in the interface is, before, an HBitmapIter is initialized in place, now the new BdrvDirtyBitmapIter must be dynamically allocated because the structure definition is in block/dirty-bitmap.c. Two current users are converted too. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Message-id: 1476395910-8697-2-git-send-email-jsnow@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2016-10-24 17:56:07 +02:00
Paolo Bonzini	5b8bb3595a	async: add aio_bh_schedule_oneshot qemu_bh_delete is already clearing bh->scheduled at the same time as it's setting bh->deleted. Since it's not using any memory barriers, there is no synchronization going on for bh->deleted, and this makes the bh->deleted checks superfluous in aio_compute_timeout, aio_bh_poll and aio_ctx_check. Just remove them, and put the (bh->scheduled && bh->deleted) combo to work in a new function aio_bh_schedule_oneshot. The new function removes the need to save the QEMUBH pointer between the creation and the execution of the bottom half. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-10-07 13:34:07 +02:00
Kevin Wolf	818584a43a	block: Move 'discard' option to bdrv_open_common() This enables its use for nested child nodes. The compatibility between the 'discard' and 'detect-zeroes' setting is checked in bdrv_open_common() now as the former setting isn't available before calling bdrv_open() any more. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2016-09-29 14:13:39 +02:00
John Snow	4085f5c7a2	block: reintroduce bdrv_flush_all Commit `fe1a9cbc` moved the flush_all routine from the bdrv layer to the block-backend layer. In doing so, however, the semantics of the routine changed slightly such that flush_all now used blk_flush instead of bdrv_flush. blk_flush can fail if the attached device model reports that it is not "available," (i.e. the tray is open.) This changed the semantics of flush_all such that it can now fail for e.g. open CDROM drives. Reintroduce bdrv_flush_all to regain the old semantics without having to alter the behavior of blk_flush or blk_flush_all, which are already 'doing the right thing.' Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-09-29 14:13:13 +02:00
Yaowei Bai	e7e4f9f950	block: mirror: fix wrong comment of mirror_start Obviously, we should write to '@target'. Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com> Reviewed-by: Xiubo Li <lixiubo@cmss.chinamobile.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1473851019-7005-2-git-send-email-baiyaowei@cmss.chinamobile.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-09-28 11:21:46 +01:00
Alberto Garcia	f87a0e29a9	block: Add "read-only" to the options QDict This adds the "read-only" option to the QDict. One important effect of this change is that when a child inherits options from its parent, the existing "read-only" mode can be preserved if it was explicitly set previously. This addresses scenarios like this: [E] <- [D] <- [C] <- [B] <- [A] In this case, if we reopen [D] with read-only=off, and later reopen [B], then [D] will not inherit read-only=on from its parent during the bdrv_reopen_queue_child() stage. The BDRV_O_RDWR flag is not removed yet, but its keep in sync with the value of the "read-only" option. Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-09-23 13:36:10 +02:00
Alberto Garcia	38b5e4c3dc	block: Remove bdrv_is_snapshot This is unnecessary and has been unused since `5433c24f0f`. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-09-23 13:36:10 +02:00
Ladi Prosek	d4b84d564e	Remove unused function declarations Unused function declarations were found using a simple gcc plugin and manually verified by grepping the sources. Signed-off-by: Ladi Prosek <lprosek@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2016-09-15 15:32:22 +03:00
Wen Congyang	b49f7ead8d	mirror: auto complete active commit Auto complete mirror job in background to prevent from blocking synchronously Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com> Signed-off-by: Wang WeiWei <wangww.fnst@cn.fujitsu.com> Message-id: 1469602913-20979-7-git-send-email-xiecl.fnst@cn.fujitsu.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-09-13 11:00:56 +01:00
Changlong Xie	a8bbee0edf	Backup: export interfaces for extra serialization Normal backup(sync='none') workflow: step 1. NBD peformance I/O write from client to server qcow2_co_writev bdrv_co_writev ... bdrv_aligned_pwritev notifier_with_return_list_notify -> backup_do_cow bdrv_driver_pwritev // write new contents step 2. drive-backup sync=none backup_do_cow { wait_for_overlapping_requests cow_request_begin for(; start < end; start++) { bdrv_co_readv_no_serialising //read old contents from Secondary disk bdrv_co_writev // write old contents to hidden-disk } cow_request_end } step 3. Then roll back to "step 1" to write new contents to Secondary disk. And for replication, we must make sure that we only read the old contents from Secondary disk in order to keep contents consistent. 1) Replication workflow of Secondary virtio-blk ^ -------> 1 NBD \| \|\| server 3 replication \|\| ^ ^ \|\| \| backing backing \| \|\| Secondary disk 6<-------- hidden-disk 5 <-------- active-disk 4 \|\| \| ^ \|\| '-------------------------' \|\| drive-backup sync=none 2 Hence, we need these interfaces to implement coarse-grained serialization between COW of Secondary disk and the read operation of replication. Example codes about how to use them: #include "block/block_backup.h" static coroutine_fn int xxx_co_readv() { CowRequest req; BlockJob job = secondary_disk->bs->job; if (job) { backup_wait_for_overlapping_requests(job, start, end); backup_cow_request_begin(&req, job, start, end); ret = bdrv_co_readv(); backup_cow_request_end(&req); goto out; } ret = bdrv_co_readv(); out: return ret; } Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Wang WeiWei <wangww.fnst@cn.fujitsu.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1469602913-20979-4-git-send-email-xiecl.fnst@cn.fujitsu.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-09-13 11:00:56 +01:00
Wen Congyang	49d3e828f8	Backup: clear all bitmap when doing block checkpoint Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com> Signed-off-by: Wang WeiWei <wangww.fnst@cn.fujitsu.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Gonglei <arei.gonglei@huawei.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1469602913-20979-3-git-send-email-xiecl.fnst@cn.fujitsu.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-09-13 11:00:56 +01:00
Pavel Butsykin	13b9414b57	drive-backup: added support for data compression The idea is simple - backup is "written-once" data. It is written block by block and it is large enough. It would be nice to save storage space and compress it. The patch adds a flag to the qmp/hmp drive-backup command which enables block compression. Compression should be implemented in the format driver to enable this feature. There are some limitations of the format driver to allow compressed writes. We can write data only once. Though for backup this is perfectly fine. These limitations are maintained by the driver and the error will be reported if we are doing something wrong. Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Denis V. Lunev <den@openvz.org> CC: Jeff Cody <jcody@redhat.com> CC: Markus Armbruster <armbru@redhat.com> CC: Eric Blake <eblake@redhat.com> CC: John Snow <jsnow@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-09-05 19:06:48 +02:00
Pavel Butsykin	35fadca80e	block: remove BlockDriver.bdrv_write_compressed There are no block drivers left that implement the old .bdrv_write_compressed interface, so it can be removed. Also now we have no need to use the bdrv_pwrite_compressed function and we can remove it entirely. Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Denis V. Lunev <den@openvz.org> CC: Jeff Cody <jcody@redhat.com> CC: Markus Armbruster <armbru@redhat.com> CC: Eric Blake <eblake@redhat.com> CC: John Snow <jsnow@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-09-05 19:06:48 +02:00
Pavel Butsykin	29a298af9d	block/io: reuse bdrv_co_pwritev() for write compressed For bdrv_pwrite_compressed() it looks like most of the code creating coroutine is duplicated in bdrv_prwv_co(). So we can just add a flag (BDRV_REQ_WRITE_COMPRESSED) and use bdrv_prwv_co() as a generic one. In the end we get coroutine oriented function for write compressed by using bdrv_co_pwritev/blk_co_pwritev with BDRV_REQ_WRITE_COMPRESSED flag. Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Denis V. Lunev <den@openvz.org> CC: Jeff Cody <jcody@redhat.com> CC: Markus Armbruster <armbru@redhat.com> CC: Eric Blake <eblake@redhat.com> CC: John Snow <jsnow@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-09-05 19:06:48 +02:00
Pavel Butsykin	751e2f0698	block: Convert bdrv_pwrite_compressed() to BdrvChild Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Eric Blake <eblake@redhat.com> CC: Jeff Cody <jcody@redhat.com> CC: Markus Armbruster <armbru@redhat.com> CC: Eric Blake <eblake@redhat.com> CC: John Snow <jsnow@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-09-05 19:06:47 +02:00
Pavel Butsykin	fe5c1355e7	block: switch blk_write_compressed() to byte-based interface This is a preparatory patch, which continues the general trend of the transition to the byte-based interfaces. bdrv_check_request() and blk_check_request() are no longer used, thus we can remove them. Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Denis V. Lunev <den@openvz.org> CC: Jeff Cody <jcody@redhat.com> CC: Markus Armbruster <armbru@redhat.com> CC: Eric Blake <eblake@redhat.com> CC: John Snow <jsnow@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-09-05 19:06:47 +02:00
Kevin Wolf	cd7fca952c	nbd-server: Use a separate BlockBackend The builtin NBD server uses its own BlockBackend now instead of reusing the monitor/guest device one. This means that it has its own writethrough setting now. The builtin NBD server always uses writeback caching now regardless of whether the guest device has WCE enabled. qemu-nbd respects the cache mode given on the command line. We still need to keep a reference to the monitor BB because we put an eject notifier on it, but we don't use it for any I/O. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2016-09-05 19:06:47 +02:00
Evgeny Yakovlev	ce83ee57f6	block: fix deadlock in bdrv_co_flush The following commit commit `3ff2f67a7c` Author: Evgeny Yakovlev <eyakovlev@virtuozzo.com> Date: Mon Jul 18 22:39:52 2016 +0300 block: ignore flush requests when storage is clean has introduced a regression. There is a problem that it is still possible for 2 requests to execute in non sequential fashion and sometimes this results in a deadlock when bdrv_drain_one/all are called for BDS with such stalled requests. 1. Current flushed_gen and flush_started_gen is 1. 2. Request 1 enters bdrv_co_flush to with write_gen 1 (i.e. the same as flushed_gen). It gets past flushed_gen != flush_started_gen and sets flush_started_gen to 1 (again, the same it was before). 3. Request 1 yields somewhere before exiting bdrv_co_flush 4. Request 2 enters bdrv_co_flush with write_gen 2. It gets past flushed_gen != flush_started_gen and sets flush_started_gen to 2. 5. Request 2 runs to completion and sets flushed_gen to 2 6. Request 1 is resumed, runs to completion and sets flushed_gen to 1. However flush_started_gen is now 2. From here on out flushed_gen is always != to flush_started_gen and all further requests will wait on flush_queue. This change replaces flush_started_gen with an explicitly tracked active flush request. Signed-off-by: Evgeny Yakovlev <eyakovlev@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Message-id: 1471457214-3994-2-git-send-email-den@openvz.org CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Fam Zheng <famz@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-08-18 14:36:49 +01:00
Eric Blake	b8d0a9804d	block: Cater to iscsi with non-power-of-2 discard Dell Equallogic iSCSI SANs have a very unusual advertised geometry: $ iscsi-inq -e 1 -c $((0xb0)) iscsi://XXX/0 wsnz:0 maximum compare and write length:1 optimal transfer length granularity:0 maximum transfer length:0 optimal transfer length:0 maximum prefetch xdread xdwrite transfer length:0 maximum unmap lba count:30720 maximum unmap block descriptor count:2 optimal unmap granularity:30720 ugavalid:1 unmap granularity alignment:0 maximum write same length:30720 which says that both the maximum and the optimal discard size is 15M. It is not immediately apparent if the device allows discard requests not aligned to the optimal size, nor if it allows discards at a finer granularity than the optimal size. I tried to find details in the SCSI Commands Reference Manual Rev. A on what valid values of maximum and optimal sizes are permitted, but while that document mentions a "Block Limits VPD Page", I couldn't actually find documentation of that page or what values it would have, or if a SCSI device has an advertisement of its minimal unmap granularity. So it is not obvious to me whether the Dell Equallogic device is compliance with the SCSI specification. Fortunately, it is easy enough to support non-power-of-2 sizing, even if it means we are less efficient than truly possible when targetting that device (for example, it means that we refuse to unmap anything that is not a multiple of 15M and aligned to a 15M boundary, even if the device truly does support a smaller granularity where unmapping actually works). Reported-by: Peter Lieven <pl@kamp.de> Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <1469129688-22848-5-git-send-email-eblake@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-08-03 18:44:57 +02:00
Eric Blake	7423f41782	nbd: Limit nbdflags to 16 bits Rather than asserting that nbdflags is within range, just give it the correct type to begin with :) nbdflags corresponds to the per-export portion of NBD Protocol "transmission flags", which is 16 bits in response to NBD_OPT_EXPORT_NAME and NBD_OPT_GO. Furthermore, upstream NBD has never passed the global flags to the kernel via ioctl(NBD_SET_FLAGS) (the ioctl was first introduced in NBD 2.9.22; then a latent bug in NBD 3.1 actually tried to OR the global flags with the transmission flags, with the disaster that the addition of NBD_FLAG_NO_ZEROES in 3.9 caused all earlier NBD 3.x clients to treat every export as read-only; NBD 3.10 and later intentionally clip things to 16 bits to pass only transmission flags). Qemu should follow suit, since the current two global flags (NBD_FLAG_FIXED_NEWSTYLE and NBD_FLAG_NO_ZEROES) have no impact on the kernel's behavior during transmission. CC: qemu-stable@nongnu.org Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <1469129688-22848-3-git-send-email-eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-08-03 18:44:56 +02:00
Cao jin	54a16a63d0	AioContext: correct comments Correct comments of field notify_me Cc: Kevin Wolf <kwolf@redhat.com> Cc: Max Reitz <mreitz@redhat.com> Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com> Message-id: 1468575858-22975-1-git-send-email-caoj.fnst@cn.fujitsu.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2016-07-26 17:46:37 +02:00
Peter Maydell	61ead113ae	Pull request v2: * Resolved merge conflict with block/iscsi.c [Peter] -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJXj6TkAAoJEJykq7OBq3PI15oIAL24OFnMTFOAWyY2h3yfzGwe Gn1qzFOTHCXgbgVsolVuZnFU/SDn9WzMS9y6o7jcJykv++FdMSYUO7paPQAg9etX o9KOgyfUUDhskbaXEbTKxhqy7UvcyJsZYmjN+TD5tupfnrqGlmu6rkUPpSwGo++G u7CkAZewJc5SvsCdQRe3N10LbG4xU/j06cjCIHWDPRr+AV9qvsb9KkIOg3wkrhWN R83yj+yX6vhfQouHLoOlh3RYZ5HNo020JPum0jT1tGcoshKoFyu0prSRqNVVMqTY BepvWMhXbiq219HnFaQBtnysve9gWix4WhuCHtaAlaZDTBh945ZztsW7w7dubtA= =7ics -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging Pull request v2: * Resolved merge conflict with block/iscsi.c [Peter] # gpg: Signature made Wed 20 Jul 2016 17:20:52 BST # gpg: using RSA key 0x9CA4ABB381AB73C8 # gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" # gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" # Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8 * remotes/stefanha/tags/block-pull-request: (25 commits) raw_bsd: Convert to byte-based interface nbd: Convert to byte-based interface block: Kill .bdrv_co_discard() sheepdog: Switch .bdrv_co_discard() to byte-based raw_bsd: Switch .bdrv_co_discard() to byte-based qcow2: Switch .bdrv_co_discard() to byte-based nbd: Switch .bdrv_co_discard() to byte-based iscsi: Switch .bdrv_co_discard() to byte-based gluster: Switch .bdrv_co_discard() to byte-based blkreplay: Switch .bdrv_co_discard() to byte-based block: Add .bdrv_co_pdiscard() driver callback block: Convert .bdrv_aio_discard() to byte-based rbd: Switch rbd_start_aio() to byte-based raw-posix: Switch paio_submit() to byte-based block: Convert BB interface to byte-based discards block: Convert bdrv_aio_discard() to byte-based block: Switch BlockRequest to byte-based block: Convert bdrv_discard() to byte-based block: Convert bdrv_co_discard() to byte-based iscsi: Rely on block layer to break up large requests ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Conflicts: block/gluster.c	2016-07-21 11:00:36 +01:00
Eric Blake	70c4fb2648	nbd: Convert to byte-based interface The NBD protocol doesn't have any notion of sectors, so it is a fairly easy conversion to use byte-based read and write. Signed-off-by: Eric Blake <eblake@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1468624988-423-19-git-send-email-eblake@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-20 14:24:25 +01:00
Eric Blake	02aefe43cb	block: Kill .bdrv_co_discard() Now that all drivers have a byte-based .bdrv_co_pdiscard(), we no longer need to worry about the sector-based version. We can also relax our minimum alignment to 1 for drivers that support it. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1468624988-423-18-git-send-email-eblake@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-20 14:24:25 +01:00
Eric Blake	47a5486d59	block: Add .bdrv_co_pdiscard() driver callback There's enough drivers with a sector-based callback that it will be easier to switch one at a time. This patch adds a byte-based callback, and then after all drivers are swapped, we'll drop the sector-based callback. [checkpatch doesn't like the space after coroutine_fn in block_int.h, but it's consistent with the rest of the file] Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1468624988-423-10-git-send-email-eblake@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-20 14:11:55 +01:00
Eric Blake	4da444a0bb	block: Convert .bdrv_aio_discard() to byte-based Another step towards byte-based interfaces everywhere. Replace the sector-based driver callback .bdrv_aio_discard() with a new byte-based .bdrv_aio_pdiscard(). Only raw-posix and RBD drivers are affected, so it was not worth splitting into multiple patches. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1468624988-423-9-git-send-email-eblake@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-20 14:11:55 +01:00
Eric Blake	60ebac16bc	block: Convert bdrv_aio_discard() to byte-based Another step towards byte-based interfaces everywhere. Replace the sector-based bdrv_aio_discard() with a new byte-based bdrv_aio_pdiscard(), which silently ignores any unaligned head or tail. Driver callbacks will be converted in followup patches. Signed-off-by: Eric Blake <eblake@redhat.com> Message-id: 1468624988-423-5-git-send-email-eblake@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-20 14:11:55 +01:00
Eric Blake	0c51a893b6	block: Convert bdrv_discard() to byte-based Another step towards byte-based interfaces everywhere. Replace the sector-based bdrv_discard() with a new byte-based bdrv_pdiscard(), which silently ignores any unaligned head or tail. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1468624988-423-3-git-send-email-eblake@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-20 14:11:55 +01:00
Eric Blake	9f1963b3f7	block: Convert bdrv_co_discard() to byte-based Another step towards byte-based interfaces everywhere. Replace the sector-based bdrv_co_discard() with a new byte-based bdrv_co_pdiscard(), which silently ignores any unaligned head or tail. Driver callbacks will be converted in followup patches. By calculating the alignment outside of the loop, and clamping the max discard to an aligned value, we can simplify the actions done within the loop. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1468624988-423-2-git-send-email-eblake@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-20 14:11:54 +01:00
Eric Blake	1e2a77a851	nbd: Drop unused offset parameter Now that NBD relies on the block layer to fragment things, we no longer need to track an offset argument for which fragment of a request we are actually servicing. While at it, use true and false instead of 0 and 1 for a bool parameter. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1468607524-19021-6-git-send-email-eblake@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-20 14:11:54 +01:00
Denis V. Lunev	6d07859926	dirty-bitmap: operate with int64_t amount Underlying HBitmap operates even with uint64_t. Thus this change is safe. This would be useful f.e. to mark entire bitmap dirty in one call. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 1468503209-19498-2-git-send-email-den@openvz.org CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> CC: Jeff Cody <jcody@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com>	2016-07-19 16:54:46 -04:00
Peter Maydell	ad31cd4c69	-----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJXjV3bAAoJEH3vgQaq/DkOArAQAKc03QhwukqT2aZ5zs7kZ1Hv PCHrISL0dKGK3YfyiSitppoxr6eBoBR9UIEjaNlW0XwujqwWdWfJ7kIbVSGAyWqR YWmV+bA+TUWXz+tFbeDI0maMH9GNVCAuvQiqldmJxhZ303pf5cksZ9CqiALSylTY t5XUB6nhV7MPto63C2X2xjLkKxlsT9KOTsYxGVgVXwUzgW1lAuu8Lo6eNULXCgUa j+azgSFAiUOKwfKxcKD25kPOxgWlrxkGRc2LdFlopEzShENROhR2r9ut3okgAoM4 KPVIE3jSsLMhNr9bRQRUJw53vRSL/bxFvlCdzKiBFSo7wNMKWNA9RWF6+If1Jvoi Am+BzINCfNfoFmqlXppqWGlapk9ZtmDGbPwaUyT6NJ9axAASTQxcj8QOjGEX07UE ubvzIXx7D1Amo59/4RWRXVDpMb9+p3npqkuCL+DWZzq7EVB42ig8+fKhijhS4jUK 2DA7uL4orUjUIoTbJZsKciw7MfaWuP2/SnP1VRNRXSsiNg5N4qJDUB6Wo5AKksQB LWP4Ou4irPj/ZGvMhJBMMOQ5kl2maqj8beP9pVjltVjGVzAihAyuXHUqjPdqe1R9 PLQidHKCb1OaJWi3c53vayuzlINH9iY+adjgSuYxi1QOZ+uqEsjSs+3+m1gyTOeh sd8VU/TtJbiGBCh9VTTf =9N8+ -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/jnsnow/tags/ide-pull-request' into staging # gpg: Signature made Mon 18 Jul 2016 23:53:15 BST # gpg: using RSA key 0x7DEF8106AAFC390E # gpg: Good signature from "John Snow (John Huston) <jsnow@redhat.com>" # Primary key fingerprint: FAEB 9711 A12C F475 812F 18F2 88A9 064D 1835 61EB # Subkey fingerprint: F9B7 ABDB BCAC DF95 BE76 CBD0 7DEF 8106 AAFC 390E * remotes/jnsnow/tags/ide-pull-request: block: ignore flush requests when storage is clean tests: in IDE and AHCI tests perform DMA write before flushing ide: set retry_unit for PIO and FLUSH requests ide: refactor retry_unit set and clear into separate function Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2016-07-19 11:47:07 +01:00
Evgeny Yakovlev	3ff2f67a7c	block: ignore flush requests when storage is clean Some guests (win2008 server for example) do a lot of unnecessary flushing when underlying media has not changed. This adds additional overhead on host when calling fsync/fdatasync. This change introduces a write generation scheme in BlockDriverState. Current write generation is checked against last flushed generation to avoid unnessesary flushes. The problem with excessive flushing was found by a performance test which does parallel directory tree creation (from 2 processes). Results improved from 0.424 loops/sec to 0.432 loops/sec. Each loop creates 10^3 directories with 10 files in each. This affected some blkdebug testcases that were expecting error logs from failure-injected flushes which are now skipped entirely (tests 026 071 089). This also affects the performance of block jobs and thus BLOCK_JOB_READY events for driver-mirror and active block-commit commands now arrives faster, before QMP send successfully returns to caller (tests 141 144). Signed-off-by: Evgeny Yakovlev <eyakovlev@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1468870792-7411-5-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Fam Zheng <famz@redhat.com> CC: John Snow <jsnow@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com>	2016-07-18 18:19:01 -04:00
Cao jin	7e00346505	aio-posix: remove useless parameter Parameter **errp of aio_context_setup() is useless, remove it and clean up the related code. Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Fam Zheng <famz@redhat.com> Cc: Eric Blake <eblake@redhat.com> Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1468578524-23433-1-git-send-email-caoj.fnst@cn.fujitsu.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-18 15:10:52 +01:00
Paolo Bonzini	0187f5c9cb	linux-aio: share one LinuxAioState within an AioContext This has better performance because it executes fewer system calls and does not use a bottom half per disk. Originally proposed by Ming Lei. [Changed #include "raw-aio.h" to "block/raw-aio.h" in win32-aio.c to fix build error as reported by Peter Maydell <peter.maydell@linaro.org>. --Stefan] Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1467650000-51385-1-git-send-email-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> squash! linux-aio: share one LinuxAioState within an AioContext	2016-07-18 15:09:31 +01:00
Peter Maydell	190c93c982	* SCSI scanner support * fixes to qemu-char and net exit * FreeBSD fixes * Other small bugfixes -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJXhiZDAAoJEL/70l94x66DrGAH/10ZlIYugx6Ijn12qy3irmIC hbMY6HWjvPlk8ZpAcPa3UXNQvqhTwqhSMXRiwp9aNPlRUqrXnDXZapQunJveKSAn luLE8ISRKODz0W39qg6znyb4R1ipCGJWwjBCQmLWZuD7883JJ2DsykTATRx7yKQF qsq9r/DPBTfD3vnOCTbqp0GeB80UFleTNm+K7cct8M1+WzfiwKeVHk9CAKy0fkTH hS+YnV9UWYL6PR/w+uZ+2MfgH5er4X794+HaNbio0QJJbEZ2bsL4A3Prh7pUonN7 qJoCbT4W79scrnWQ40RbWRXOMfUk4J7gIMEZYar8z6NmqnamNZgxbWj3dv6pO+k= =sz/L -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging * SCSI scanner support * fixes to qemu-char and net exit * FreeBSD fixes * Other small bugfixes # gpg: Signature made Wed 13 Jul 2016 12:30:11 BST # gpg: using RSA key 0xBFFBD25F78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: hostmem: detect host backend memory is being used properly hostmem: fix QEMU crash by 'info memdev' char: do not use atexit cleanup handler net: do not use atexit for cleanup slirp: use exit notifier for slirp_smb_cleanup tap: use an exit notifier to call down_script util: Fix MIN_NON_ZERO qemu-sockets: use qapi_free_SocketAddress in cleanup disas: avoid including everything in headers compiled from C++ json-streamer: fix double-free on exiting during a parse main-loop: check return value before using pointer Use "-s" instead of "--quiet" to resolve non-fatal build error on FreeBSD. scsi-bus: Use longer sense buffer with scanners scsi-bus: Add SCSI scanner support Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2016-07-14 13:44:06 +01:00
Alberto Garcia	fd62c609ed	commit: Add 'job-id' parameter to 'block-commit' This patch adds a new optional 'job-id' parameter to 'block-commit', allowing the user to specify the ID of the block job to be created. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-13 13:26:02 +02:00
Alberto Garcia	2323322ed0	stream: Add 'job-id' parameter to 'block-stream' This patch adds a new optional 'job-id' parameter to 'block-stream', allowing the user to specify the ID of the block job to be created. The HMP 'block_stream' command remains unchanged. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-13 13:26:02 +02:00
Alberto Garcia	70559d499c	backup: Add 'job-id' parameter to 'blockdev-backup' and 'drive-backup' This patch adds a new optional 'job-id' parameter to 'blockdev-backup' and 'drive-backup', allowing the user to specify the ID of the block job to be created. The HMP 'drive_backup' command remains unchanged. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-13 13:26:02 +02:00
Alberto Garcia	71aa98678c	mirror: Add 'job-id' parameter to 'blockdev-mirror' and 'drive-mirror' This patch adds a new optional 'job-id' parameter to 'blockdev-mirror' and 'drive-mirror', allowing the user to specify the ID of the block job to be created. The HMP 'drive_mirror' command remains unchanged. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-13 13:26:02 +02:00
Alberto Garcia	7f0317cfc8	blockjob: Add 'job_id' parameter to block_job_create() When a new job is created, the job ID is taken from the device name of the BDS. This patch adds a new 'job_id' parameter to let the caller provide one instead. This patch also verifies that the ID is always unique and well-formed. This causes problems in a couple of places where no ID is being set, because the BDS does not have a device name. In the case of test_block_job_start() (from test-blockjob-txn.c) we can simply use this new 'job_id' parameter to set the missing ID. In the case of img_commit() (from qemu-img.c) we still don't have the API to make commit_active_start() set the job ID, so we solve it by setting a default value. We'll get rid of this as soon as we extend the API. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-13 13:26:02 +02:00
Alberto Garcia	ffb1f10cd1	blockjob: Add block_job_get() Currently the way to look for a specific block job is to iterate the list manually using block_job_next(). Since we want to be able to identify a job primarily by its ID it makes sense to have a function that does just that. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-13 13:26:02 +02:00
Alberto Garcia	9df229c3ca	blockjob: Update description of the 'id' field The 'id' field of the BlockJob structure will be able to hold any ID, not only a device name. This patch updates the description of that field and the error messages where it is being used. Soon we'll add the ability to set an arbitrary ID when creating a block job. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-13 13:26:02 +02:00
Alberto Garcia	29338003c9	stream: Fix prototype of stream_start() 'stream-start' has a parameter called 'backing-file', which is the string to be written to bs->backing when the job finishes. In the stream_start() implementation it is called 'backing_file_str', but it the prototype in the header file it is called 'base_id'. This patch fixes it so the name is the same in both cases and is consistent with other cases (like commit_start()). Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-13 13:26:02 +02:00
Jarkko Lavinen	297b044a7f	scsi-bus: Add SCSI scanner support Add support for missing scanner specific SCSI commands and their xfer lenghts as per ANSI spec section 15. Signed-off-by: Jarkko Lavinen <jarkko.lavinen@iki.fi> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-07-12 18:31:26 +02:00
Markus Armbruster	175de52487	Clean up decorations and whitespace around header guards Cleaned up with scripts/clean-header-guards.pl. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net>	2016-07-12 16:20:46 +02:00
Markus Armbruster	121d07125b	Clean up header guards that don't match their file name Header guard symbols should match their file name to make guard collisions less likely. Offenders found with scripts/clean-header-guards.pl -vn. Cleaned up with scripts/clean-header-guards.pl, followed by some renaming of new guard symbols picked by the script to better ones. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net>	2016-07-12 16:19:16 +02:00
Kevin Wolf	a03ef88f77	block: Convert bdrv_co_preadv/pwritev to BdrvChild This is the final patch for converting the common I/O path to take a BdrvChild parameter instead of BlockDriverState. The completion of this conversion means that all users that perform I/O on an image need to actually hold a reference (in the form of BdrvChild, possible as part of a BlockBackend) to that image. This also protects against inconsistent use of BlockBackend vs. BlockDriverState functions because direct use of a BlockDriverState isn't possible any more and blk->root is private for block-backends.c. In addition, we can now distinguish different users in the I/O path, and the future op blockers work is going to add assertions based on permissions stored in BdrvChild. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:27 +02:00
Kevin Wolf	720ff280e7	block: Convert bdrv_pwrite_zeroes() to BdrvChild Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:27 +02:00
Kevin Wolf	d9ca2ea2e2	block: Convert bdrv_pwrite(v/_sync) to BdrvChild Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:27 +02:00
Kevin Wolf	cf2ab8fc34	block: Convert bdrv_pread(v) to BdrvChild Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:27 +02:00
Kevin Wolf	18d51c4bac	block: Convert bdrv_write() to BdrvChild Signed-off-by: Kevin Wolf <kwolf@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:27 +02:00
Kevin Wolf	fbcbbf4e80	block: Convert bdrv_read() to BdrvChild Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:27 +02:00
Kevin Wolf	0d1049c7d1	block: Convert bdrv_aio_writev() to BdrvChild Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:26 +02:00
Kevin Wolf	ebb7af2173	block: Convert bdrv_aio_readv() to BdrvChild Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:26 +02:00
Kevin Wolf	25ec177d90	block: Convert bdrv_co_writev() to BdrvChild Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:26 +02:00
Kevin Wolf	28b04a8f65	block: Convert bdrv_co_readv() to BdrvChild Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:26 +02:00
Eric Blake	5411541270	block: Use bool as appropriate for BDS members Using int for values that are only used as booleans is confusing. While at it, rearrange a couple of members so that all the bools are contiguous. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:26 +02:00
Eric Blake	a5b8dd2ce8	block: Move request_alignment into BlockLimit It makes more sense to have ALL block size limit constraints in the same struct. Improve the documentation while at it. Simplify a couple of conditionals, now that we have audited and documented that request_alignment is always non-zero. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:26 +02:00
Eric Blake	b9f7855a50	block: Switch discard length bounds to byte-based Sector-based limits are awkward to think about; in our on-going quest to move to byte-based interfaces, convert max_discard and discard_alignment. Rename them, using 'pdiscard' as an aid to track which remaining discard interfaces need conversion, and so that the compiler will help us catch the change in semantics across any rebased code. The BlockLimits type is now completely byte-based; and in iscsi.c, sector_limits_lun2qemu() is no longer needed. pdiscard_alignment is made unsigned (we use power-of-2 alignments as bitmasks, where unsigned is easier to think about) while leaving max_pdiscard signed (since we still have an 'int' interface); this is comparable to what commit `cf081fc` did for write zeroes limits. We may later want to make everything an unsigned 64-bit limit - but that requires a bigger code audit. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:25 +02:00
Eric Blake	29cc6a6834	block: Wording tweaks to write zeroes limits Improve the documentation of the write zeroes limits, to mention additional constraints that drivers should observe. Worth squashing into commit `cf081fca`, if that hadn't been pushed already :) Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:25 +02:00
Eric Blake	5def6b80e1	block: Switch transfer length bounds to byte-based Sector-based limits are awkward to think about; in our on-going quest to move to byte-based interfaces, convert max_transfer_length and opt_transfer_length. Rename them (dropping the _length suffix) so that the compiler will help us catch the change in semantics across any rebased code, and improve the documentation. Use unsigned values, so that we don't have to worry about negative values and so that bit-twiddling is easier; however, we are still constrained by 2^31 of signed int in most APIs. When a value comes from an external source (iscsi and raw-posix), sanitize the results to ensure that opt_transfer is a power of 2. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:25 +02:00
Eric Blake	476b923c32	nbd: Allow larger requests The NBD layer was breaking up request at a limit of 2040 sectors (just under 1M) to cater to old qemu-nbd. But the server limit was raised to 32M in commit `2d8214885` to match the kernel, more than three years ago; and the upstream NBD Protocol is proposing documentation that without any explicit communication to state otherwise, a client should be able to safely assume that a 32M transaction will work. It is time to rely on the larger sizing, and any downstream distro that cares about maximum interoperability to older qemu-nbd servers can just tweak the value of #define NBD_MAX_SECTORS. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Cc: qemu-stable@nongnu.org Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:24 +02:00

1 2 3 4 5 ...

608 Commits