The Move to Vector Status and Control Register (mtvscr) instruction
uses VRB as the source register. Fix the code generator to correctly
decode the VRB field. That is, use "rB(ctx->opcode)" instead of
"rD(ctx->opcode)".
Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Memory slots have to be page aligned to get entered into KVM. There
is existing logic that tries to ensure that we pad memory slots that
are not page aligned to the biggest region that would still fit in the
alignment requirements.
Unfortunately, that logic is broken. It tries to calculate the start
offset based on the region size.
Fix up the logic to do the thing it was intended to do and document it
properly in the comment above it.
With this patch applied, I can successfully run an e500 guest with more
than 3GB RAM (at which point RAM starts overlapping subpage memory regions).
Cc: qemu-stable@nongnu.org
Signed-off-by: Alexander Graf <agraf@suse.de>
In the previous patch, the registers were added to init_proc_G2LE
instead of init_proc_e300.
Signed-off-by: Fabien Chouteau <chouteau@adacore.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
it may actually fix a case where autoconverge would break on a repeat
migration (and not just fix stats).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJUbeQZAAoJEB6aO1+FQIO2LVcP/0aTLt2uZFdbQC6wkQ4ubd8g
/J0+O7TiMh4DAEmfPrd3T+WaEhvduptMa/q4rSJATo8pUIDVJ1Ac3jP3n7ETBfx0
EWReCrPN1WAEOzSOud/KRss60+ryo3AmL0I5AQNGSi8Gctia3XFIJ3VU3i2owgMp
jhO4zILKEW9Mpd39YtiOOzQktkID9aJKHyLvqLgxwGT1NS8iIsHDjPedCKEAtcvg
5mqPfR4K15jarjPS0M+t2Rx3mjpy8FzMo14H6856dvGqoZHznFjw0uq2bKo2FqKn
/eUqjFISBzXG9pDXlyouqQhu4NKcdzCB++2Wf/7pbgltEV3P4pROWWBaLHm9fHYy
O+NBq+df6Dxykcaj46jw1FHg/YrUIuJ0u7NNrqRVoHZBR6l+OktiWnCraY2psDdn
nNCLrVKRW64qx64HEXKrLxyLPhadfQrmVOksGc5kCecWotANpqKGWZp1qnBYVE6K
spEEPDRAgYD6Eb3OHVH/hUIuBYpegD/hSHHAOpdBZN8kjq+Hj8Plwn/mqxkkxxf+
+FtznF4OkSpJCRr1n6ijK5A4e6U7pK18L7NNwDZW6H+UyXDjc2qRkklX/FGfZuKD
qXbqeAC9MlUl8VGVCUtmkmPHeVxCHDa9Lw/5nZIggBCrH+H45aWgkVxJhmQnQq3m
XYXJdIrS3/MGJnCLfhnJ
=wxby
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/amit-migration/tags/for-2.2-2' into staging
Fix from a while back that unfortunately got ignored. Dave Gilbert says
it may actually fix a case where autoconverge would break on a repeat
migration (and not just fix stats).
# gpg: Signature made Thu 20 Nov 2014 12:52:41 GMT using RSA key ID 854083B6
# gpg: Good signature from "Amit Shah <amit@amitshah.net>"
# gpg: aka "Amit Shah <amit@kernel.org>"
# gpg: aka "Amit Shah <amitshah@gmx.net>"
* remotes/amit-migration/tags/for-2.2-2:
migration: static variables will not be reset at second migration
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The static variables in migration_bitmap_sync will not be reset in
the case of a second attempted migration.
Signed-off-by: ChenLiang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
The other callers to blk_set_enable_write_cache() in this file
already check for s->blk == NULL.
Signed-off-by: Don Slutz <dslutz@verizon.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1416259239-13281-1-git-send-email-dslutz@verizon.com
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Version: GnuPG v1
iQEcBAABAgAGBQJUa2AVAAoJEJykq7OBq3PIYYMH/38g6vk60lp/OmIo2AHz9kjW
SOCdW/9aZ2R16THwiF6tuSaAD95UPr+D45wED47eKrlCKo3aZ5QY9VtM20j8aQvV
WGwLJ809to5EfsOaJY+vbutCM1uHdTAXFOv8FUjdLKkQF6mLKp1iEo/BLxxMRdPW
WKlsXC29s6yof6c8pKlarUwuGGv6VKWrkj73vQsWgaDgOJGqXST3zZynyqEf1ARI
XRWa13qQ76Pwy0UTzeTSu+W3dXuTbuw/PSuiavcrjvoi9Ewm5+6YEHob/0j8fEad
cAxcWaMVowbKdJSNV+UDgMaXje1Zt7EArYSTvRx0kf4rWBbeieA1K9jtlxShrYE=
=lVQH
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/stefanha/tags/net-pull-request' into staging
# gpg: Signature made Tue 18 Nov 2014 15:04:53 GMT using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>"
* remotes/stefanha/tags/net-pull-request:
net: The third parameter of getsockname should be initialized
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
simpletrace.py does not recognize the tcg option while reading trace-events file. In result simpletrace does not work on binary traces and tcg enabled events. Moved transformation of tcg enabled events to _read_events() which is used by simpletrace.
Signed-off-by: Christoph Seifert <christoph.seifert@posteo.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Fix the example trace configure option.
Update the text to say that multiple backends are allowed and what
happens when multiple backends are enabled.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-id: 1412691161-31785-1-git-send-email-dgilbert@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
migration by Michael S. Tsirkin.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJUaywUAAoJEB6aO1+FQIO2pOMP/0i+4/6QKJdzkd6BbW5ms6pt
0s8q3Bb3W78FQ53fuztlVuCTbwgS5eNpXj9zKhdqxtYf1ag+8PYT6n2pefQPm8un
cYA4UagABsq14Ae17MAswLwoEncUK/FVoUyh4qd54FFZGiqtwVoxfNu648RsZurf
0sRHtfUUG3vj3YJiFCT6QBuhl17J0Ks345nVv+2V8oQlyktrI88dtGAQn2axschv
IfIR4D1Z2KwV27V6f4/f1W1FmxgSeuaMx6KTU9SbdOmVxfaRhl3y8mCZXMVl8R0R
0IubeRgJvUit/+x8sDMzH+p2fBsQAzwdfz7sX6OZykSPOzd9jGdCPFGp7Tl/PZ1Q
QA0wTHoOzZyP/KYw1OSr3JuiZpGV5LqH+RZFjemyIQxRUeSWLHwDsecvWe4k1238
P9jnsH7aU9ZiRiKfhRUFyrODWVh7cE97qJlMJKruFgKWw2Jyj+oQ8flImRJrYQA7
670g+y8K66afA19Ci61yZqLhdyXU0FpFAomgWLru+HFqGimKXboMWehPzjAA6J1L
3038lQsTqOPTlJDx5MbHbc/gtrDyL8Rei/ywQw3p0c2Ct3PaQ1xcGkMlAhnf+wNw
TmxcunyOqNhSsedJyKmWn6GxRiHMM4jXpi21q3X3yVAgnINTSQMQ5ERn4TzYqmsC
MPDuWYOtsLVf5YrfLiTU
=xwMv
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/amit-migration/tags/for-2.2' into staging
Fix for CVE-2014-7840, avoiding arbitrary qemu memory overwrite for
migration by Michael S. Tsirkin.
# gpg: Signature made Tue 18 Nov 2014 11:23:00 GMT using RSA key ID 854083B6
# gpg: Good signature from "Amit Shah <amit@amitshah.net>"
# gpg: aka "Amit Shah <amit@kernel.org>"
# gpg: aka "Amit Shah <amitshah@gmx.net>"
* remotes/amit-migration/tags/for-2.2:
migration: fix parameter validation on ram load
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This updates the Linux header to version 3.18-rc5, adding support for
(among other things) read-only memslots on ARM and arm64.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Message-id: 1416248898-6302-1-git-send-email-ard.biesheuvel@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
During migration, the values read from migration stream during ram load
are not validated. Especially offset in host_from_stream_offset() and
also the length of the writes in the callers of said function.
To fix this, we need to make sure that the [offset, offset + length]
range fits into one of the allocated memory regions.
Validating addr < len should be sufficient since data seems to always be
managed in TARGET_PAGE_SIZE chunks.
Fixes: CVE-2014-7840
Note: follow-up patches add extra checks on each block->host access.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
fsync() may fail, and that case should be handled.
Reported-by: László Érsek <lersek@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The loop which filled the file with zeroes may have been left early due
to an error. In that case, the fsync() should be skipped.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
write() may write less bytes than requested; in this case, the number of
bytes written is returned. This is the byte count we should be
subtracting from the number of bytes still to be written, and not the
byte count we requested to write.
Reported-by: László Érsek <lersek@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The code in invalidate_and_set_dirty() needs to handle addr/length
combinations which cross guest physical page boundaries. This can happen,
for example, when disk I/O reads large blocks into guest RAM which previously
held code that we have cached translations for. Unfortunately we were only
checking the clean/dirty status of the first page in the range, and then
were calling a tb_invalidate function which only handles ranges that don't
cross page boundaries. Fix the function to deal with multipage ranges.
The symptoms of this bug were that guest code would misbehave (eg segfault),
in particular after a guest reboot but potentially any time the guest
reused a page of its physical RAM for new code.
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1416167061-13203-1-git-send-email-peter.maydell@linaro.org
* mreitz/block:
raw-posix: The SEEK_HOLE code is flawed, rewrite it
raw-posix: SEEK_HOLE suffices, get rid of FIEMAP
raw-posix: Fix comment for raw_co_get_block_status()
On systems where SEEK_HOLE in a trailing hole seeks to EOF (Solaris,
but not Linux), try_seek_hole() reports trailing data instead.
Additionally, unlikely lseek() failures are treated badly:
* When SEEK_HOLE fails, try_seek_hole() reports trailing data. For
-ENXIO, there's in fact a trailing hole. Can happen only when
something truncated the file since we opened it.
* When SEEK_HOLE succeeds, SEEK_DATA fails, and SEEK_END succeeds,
then try_seek_hole() reports a trailing hole. This is okay only
when SEEK_DATA failed with -ENXIO (which means the non-trailing hole
found by SEEK_HOLE has since become trailing somehow). For other
failures (unlikely), it's wrong.
* When SEEK_HOLE succeeds, SEEK_DATA fails, SEEK_END fails (unlikely),
then try_seek_hole() reports bogus data [-1,start), which its caller
raw_co_get_block_status() turns into zero sectors of data. Could
theoretically lead to infinite loops in code that attempts to scan
data vs. hole forward.
Rewrite from scratch, with very careful comments.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Commit 5500316 (May 2012) implemented raw_co_is_allocated() as
follows:
1. If defined(CONFIG_FIEMAP), use the FS_IOC_FIEMAP ioctl
2. Else if defined(SEEK_HOLE) && defined(SEEK_DATA), use lseek()
3. Else pretend there are no holes
Later on, raw_co_is_allocated() was generalized to
raw_co_get_block_status().
Commit 4f11aa8 (May 2014) changed it to try the three methods in order
until success, because "there may be implementations which support
[SEEK_HOLE/SEEK_DATA] but not [FIEMAP] (e.g., NFSv4.2) as well as vice
versa."
Unfortunately, we used FIEMAP incorrectly: we lacked FIEMAP_FLAG_SYNC.
Commit 38c4d0a (Sep 2014) added it. Because that's a significant
speed hit, the next commit 7c159037 put SEEK_HOLE/SEEK_DATA first.
As you see, the obvious use of FIEMAP is wrong, and the correct use is
slow. I guess this puts it somewhere between -7 "The obvious use is
wrong" and -10 "It's impossible to get right" on Rusty Russel's Hard
to Misuse scale[*].
"Fortunately", the FIEMAP code is used only when
* SEEK_HOLE/SEEK_DATA aren't defined, but CONFIG_FIEMAP is
Uncommon. SEEK_HOLE had no XFS implementation between 2011 (when it
was introduced for ext4 and btrfs) and 2012.
* SEEK_HOLE/SEEK_DATA and CONFIG_FIEMAP are defined, but lseek() fails
Unlikely.
Thus, the FIEMAP code executes rarely. Makes it a nice hidey-hole for
bugs. Worse, bugs hiding there can theoretically bite even on a host
that has SEEK_HOLE/SEEK_DATA.
I don't want to worry about this crap, not even theoretically. Get
rid of it.
[*] http://ozlabs.org/~rusty/index.cgi/tech/2008-04-01.html
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Missed in commit 705be72.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
The ARMv8 address translation system defines that a page table walk
starts at a level which depends on the translation granule size
and the number of bits of virtual address that need to be resolved.
Where the translation granule is 64KB and the guest sets the
TCR.TxSZ field to between 35 and 39, it's actually possible to
start at level 3 (the final level). QEMU's implementation failed
to handle this case, and so we would set level to 2 and behave
incorrectly (including invoking the C undefined behaviour of
shifting left by a negative number). Correct the code that
determines the starting level to deal with the start-at-3 case,
by replacing the if-else ladder with an expression derived from
the ARM ARM pseudocode version.
This error was detected by the Coverity scan, which spotted
the potential shift by a negative number.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1415890569-7454-1-git-send-email-peter.maydell@linaro.org
usb_ep_get and usb_handle_packet can deal with a NULL device, but we have
to avoid dereferencing NULL pointers when building the id.
Thanks to Gonglei for an initial stab at fixing this.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In function t_gen_mov_TN_preg and t_gen_mov_preg_TN, The begin check about the
validity of in-parameter 'r' is useless. We still access cpu_PR[r] in the
follow code if it is invalid. Which will be an out-of-bounds read error.
Fix it by using assert() to ensure it is valid before using it.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If 'i != index' for all acl->entries, variable
entry leaks the storage it points to.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Operands don't affect result (CONSTANT_EXPRESSION_RESULT)
((n->bar.aqa >> AQA_ASQS_SHIFT) & AQA_ASQS_MASK) > 4095
is always false regardless of the values of its operands.
This occurs as the logical second operand of '||'.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
lseek will return -1 on error, g_malloc0(size) and read(,,size)
paramenters cannot be negative. We should add a check for return
value of lseek().
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Function send_response(s, &qdict->base) returns a negative number
when any failures occured. But strerror()'s parameter cannot be
negative. Let's change the testing condition and pass '-ret' to
strerr().
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
May pass freed pointer filename as an argument to error_report.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In this false branch, fd will leak when it is zero.
Change the testing condition.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
[Fix net_l2tpv3_cleanup as well. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
freeaddrinfo(result) does not assign result = NULL, after frees it.
There will be a double free when it goes error case.
It is reported by covertiy.
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In function connect_to_qemu(), getaddrinfo() will allocate memory
that is stored into server, it should be freed by using freeaddrinfo()
before connect_to_qemu() return.
Cc: qemu-stable@nongnu.org
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch fixes two issues with persistent grants and the disk PV backend
(Qdisk):
- Keep track of memory regions where persistent grants have been mapped
since we need to unmap them as a whole. It is not possible to unmap a
single grant if it has been batch-mapped. A new check has also been added
to make sure persistent grants are only used if the whole mapped region
can be persistently mapped in the batch_maps case.
- Unmap persistent grants before switching to the closed state, so the
frontend can also free them.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reported-by: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
If user starts QEMU with "-machine pc,accel=xen", then
compat property in xenfv won't work and it would cause error:
"Unsupported bus. Bus doesn't have property 'acpi-pcihp-bsel' set"
when PCI device is added with -device on QEMU CLI.
From: Igor Mammedov <imammedo@redhat.com>
In case of Xen instead of using compat property, just use the fact
that xen doesn't use QEMU's fw_cfg/acpi tables to switch piix4_pm
into legacy PCI hotplug mode when Xen is enabled.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Li Liang <liang.z.li@intel.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
When extent types don't match, we return -ENOTSUP. In this case, be
polite to the caller and don't modify bdi.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1415938161-16217-1-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
In bdrv_rw_co we report -EINVAL for nb_sectors > INT_MAX /
BDRV_SECTOR_SIZE, so a caller shouldn't exceed it.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1415603264-21497-1-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
In order to make handle_cmd more readable at the macro level,
the details of how to decompose particular types of FIS packets
are left to helper functions.
In our case, the only type of FIS packet we currently expect to
see is a Register H2D FIS packet, but the gory details of its
decomposition are of no particular interest in handle_cmd.
This patch keeps the receipt of FIS packets and the decomposition
thereof separated to two different functions.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1415058979-16604-6-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Instead of checking for a known byte, inspect the
fields of this byte explicitly to produce more meaningful
error messages and improve the readability of this section.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1415058979-16604-5-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Error checking in ahci's handle_cmd is re-ordered so that we
initialize as few things as possible before we've done our
sanity checking. This simplifies returning from this call
in case of an error.
A check to make sure the DMA memory map succeeds with the
correct size is also added, and the debug print of the
command fis is cleaned up with its size corrected.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1415058979-16604-4-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This patch introduces a few changes to how FIS packets are
deciphered in the AHCI virtual device. The summary of
changes can be grouped into two pieces:
[A] Changes to how we apply a preliminary sieve to FISes,
[B] Changes in how we internalize a decomposed FIS.
== Changes to how we apply a preliminary sieve to FISes ==
(1) Packets may now either update the Control register or
the Command register, but not both. This is according
to the SATA 3.2 specification which states:
"...the device either initiates processing of the command
indicated in the Command register or initiates processing
of the control request indicated [...] depending on the
state of the C bit in the FIS."
See SATA 3.2 section 10.5.5.4, "Reception" in the 10.5.5
"Register Host to Device FIS" section.
This change accounts for the first two regions of change
within the diff. All other changes belong to the following
changes.
== Changes in how we internalize a decomposed FIS ==
(2) Instead of trying to extract the sector number out of the
FIS from bytes 4-10 and setting it with ide_set_sector,
we set the appropriate IDEState registers and trust that
ide_get_sector can retrieve the correct sector later.
By "constructing" the sector for use with ide_set_sector,
we are duplicating the mechanisms of ide_get_sector.
This change makes the FIS decomposition more obvious.
SATA 3.2 as a specification does not make the legacy
register mapping with respect to the D2H FIS obvious.
However, SATA 3.2 section 10.5.5.1 "Register Host to
Device FIS layout" describes all of the "cmd_fis"
bytes:
0 - FIS Type (0x27)
1 - Port Multiplier Port and Command Update flag
2 - ATA Command
3 - Features_Low
4 - LBA 7:0
5 - LBA 15:8
6 - LBA 23:16
7 - Device, AKA "Drive Select."
8 - LBA 31:24
9 - LBA 39:32
10 - LBA 47:40
11 - Features_High
12 - Count Low
13 - Count High
14 - ICC
15 - Control
16-19 - Auxiliary (for NCQ, defined per-command)
Most of these registers map to existing IDEState registers
in obvious ways, especially features, select, hob_features,
and nsector (count). ICC is reserved in older specifications
but is not supported in our implementation, and remains
unused here. The Control register is not valid for a command
that is trying to update the command register and is to be
considered reserved at this point.
What is not obvious is the LBA register mappings, but SATA 1.0
can help inform of us legacy device support, see SATA 1.0 section
8.5.2 "Register - Host to Device."
LBA 7:0 - Sector Number (sector)
LBA 15:8 - Cyl Low (lcyl)
LBA 23:16 - Cyl High (hcyl)
LBA 31:24 - Sector Num Exp. (hob_sector)
LBA 39:32 - Cyl Low Exp. (hob_lcyl)
LBA 47:40 - Cyl High Exp. (hob_hcyl)
These mappings help guide which registers the FIS should be decomposed
into/towards for CHS, LBA28 and LBA48 commands.
As a note: The prior confusion that can be seen in the documentation
arises from the fact that CHS and LBA28 commands use the low nybble
of the drive select register to store LBA 27:24, whereas LNA48 commands
use the hob_sector, hob_lcyl and hob_hcyl registers as explained above.
The decomposition as it stands now will correctly decompose CHS, LBA28
and LBA48 commands into their appropriate registers where the core
IDE/ATAPI layers can deal with them correctly.
See the below point for more information.
(3) We save cmd_fis[7] as ide_state->select, which informs
decisions about if we are using LBA or CHS.
This corrects a bug in AHCI wherein we attempt to set and/or
retrieve the sector number by using ide_set_sector and
ide_get_sector, which depend on the select register to
determine if we are using LBA or CHS.
Without this adjustment, LBA48 read/writes are currently
broken. Thanks to Eniac Zheng @ HP for pointing this out.
(4) Save cmd_fis[11] as ide_state->hob_feature, as defined in SATA 3.2.
(5) For several ATA commands, the sector count register set to 0
is a magic number that means 256 sectors. For LBA48 commands,
this means 65,536 sectors. We drop the magic sector correction
here, and trust the ide core layer to handle the conversion
appropriately, in ide_cmd_lba48_transform(). As it stands,
the current AHCI code is only compliant with LBA28 commands.
By simply removing the magic, it will work with LBA28 and LBA48.
(6) We expand FIS decomposition to include both ATAPI and IDE devices.
We leave the logic of determining if the fields are valid or not
to the respective layers.
This change intends to make it clearer that AHCI is only a
composition mechanism for the FIS packets: the meanings of
the registers is best left to the implementation layers for
those devices.
(7) Forcefully setting the feature, hcyl and lcyl registers for ATAPI
commands is removed.
- The hcyl and lcyl magic present here is valid at boot only,
and should not be overridden for every PACKET command.
- The feature register is defined as valid for the PACKET command,
so we should not suppress it. The ATAPI layer does not even
currently depend on or require 0x01 as mandatory.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1415058979-16604-3-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>