9341 Commits

Author SHA1 Message Date
Akash 6c521c36dd GSdx-TC: Remove some old hacks
Previously, we only calculated the width of a single output circuit, which led to missing a single pixel from the other output circuit, which in turn caused offset issues in Persona games. I have customized GetDisplayRect() to also calculate the dimensions of the merged rectangle when both output circuits are enabled through the PMODE register, so this hack is no longer needed. :)

TL;DR - The above commit of mine accurately handles the offset issues by calculating the union of the rects, removing this stupid hack. (Not insulting any other developers, this stupid hack was mine :)
2017-01-02 14:43:17 +05:30
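The merged-rectangle idea in the commit above can be sketched roughly as follows. This is only an illustration, not the actual GetDisplayRect() code: the `Rect` type, `rect_union`, `merged_display_rect`, and the `en1`/`en2` flags (standing in for the PMODE enable bits) are all hypothetical names.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical rectangle type: left/top inclusive, right/bottom exclusive.
struct Rect {
    int left, top, right, bottom;
    int width() const { return right - left; }
    int height() const { return bottom - top; }
};

// Smallest rectangle that contains both inputs.
Rect rect_union(const Rect& a, const Rect& b) {
    return Rect{std::min(a.left, b.left), std::min(a.top, b.top),
                std::max(a.right, b.right), std::max(a.bottom, b.bottom)};
}

// Display rectangle for the enabled circuits. en1/en2 are assumed stand-ins
// for the per-circuit enable bits of the PMODE register.
Rect merged_display_rect(const Rect& c1, bool en1, const Rect& c2, bool en2) {
    assert(en1 || en2); // at least one circuit must be enabled
    if (en1 && en2)
        return rect_union(c1, c2);
    return en1 ? c1 : c2;
}
```

With two 640-pixel-wide circuits offset by one pixel, the union is 641 pixels wide, which is the extra pixel a single-circuit calculation would miss.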
Akash b56ff3fce7 GSDX-TC: Pass merged output size for scaling
Passes the merged output circuit as the base size for the texture cache scaling code. Helps fix scaling issues where games use both of the output circuits for rendering.

Future Note: Alter the behavior of the IsEnabled() check, which always prefers the second output circuit for some weird reason. I plan on changing it to a better automatic output circuit selection mechanism, but that can probably be done some time in the future.
2017-01-02 14:42:32 +05:30
Gregory Hainaut 9d1b27cde8 miss a ;
I don't know what I compiled for my previous push!
2016-12-31 17:42:38 +01:00
Gregory Hainaut 1be3f48017 gsdx sw: minor fix on the thread management
* Upgrade the counter to signed 32 bits. 16 bits is too small to contain the 64K value.
* Read ThreadProc/m_count when the mutex is locked
* Use the old value returned by the fetch instead of reading back the new value
2016-12-31 16:59:38 +01:00
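The counter points in the commit above can be illustrated with a minimal sketch. It assumes a `std::atomic` counter; `WorkCounter`, `add`, and `done` are hypothetical names, not the GSdx thread code. The decision is made from the value returned by the atomic fetch, since a separate load afterwards could observe a change made by another thread.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical pending-work counter. 32 bits so that a value of 65536 (64K)
// fits; a signed 16-bit counter tops out at 32767.
struct WorkCounter {
    std::atomic<int32_t> count{0};

    // Adds n items. Returns true if the counter was zero before the add,
    // i.e. the consumer may need to be woken up.
    bool add(int32_t n) {
        assert(n > 0);
        int32_t old = count.fetch_add(n); // fetch_add returns the OLD value
        return old == 0;
    }

    // Marks n items as finished. Returns true if this call drained the counter.
    bool done(int32_t n) {
        assert(n > 0);
        int32_t old = count.fetch_sub(n); // decide from the old value, no re-read
        return old == n;
    }
};
```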
Gregory Hainaut 14a76a8499 cmake: don't use SSE2 suffix on libgsdx.so file
In debug builds, SIMD is disabled, so the suffix is dangerous when debugging (the wrong binary may be used)
2016-12-31 13:37:43 +01:00
Gregory Hainaut 761ce60a8e i10n: refresh translated based on latest string change 2016-12-31 11:40:46 +01:00
refractionpcsx2 7a61dc2c88 GSDX: CLUT temp old regression fix for the Romance of the Three Kingdoms games, until somebody who knows what they are doing fixes it properly :P 2016-12-30 22:00:54 +00:00
refractionpcsx2 8fecd3512c refractionpcsx2
GSdx Merge Circuit: Fix regression and issue
2016-12-27 12:08:18 +00:00
refractionpcsx2 c88cd1b065 Merge pull request #1720 from ssakash/rtc
PCSX2-Counters: Fix RTC counting in Progressive modes
2016-12-27 00:00:00 +00:00
refractionpcsx2 af3c1fc510 Gif MFIFO: Slight Optimisation for GIF MFIFO heavily used area.
Hopefully this translates well to slower systems :)

Tekken Tag:

Before: 79-81fps
After: 82-84fps

Front Mission 4 intro (as it pans over the roofs)
Before: 158-159fps
After: 165-166fps
2016-12-24 20:09:47 +00:00
Akash c92830b103 PCSX2-Counters: Fix RTC counting at certain cases
Previously, the seconds variable of the RTC was updated in progressive modes after every 50 Vsyncs, which was obviously wrong. The code has been adjusted to update the RTC according to the vertical frequency of each video mode.
2016-12-24 11:54:25 +05:30
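A minimal sketch of the idea in the commit above: advance the RTC seconds once per "vertical frequency" Vsyncs rather than after a fixed 50. `RtcTicker` and `vsyncs_per_second` are hypothetical names, and the per-mode rates used in the usage note are rounded assumptions, not values taken from the PCSX2 source.

```cpp
#include <cassert>

// Hypothetical RTC tick helper, driven once per Vsync.
struct RtcTicker {
    int vsync_count = 0;
    int seconds = 0;

    // vsyncs_per_second is an assumed parameter: the (rounded) vertical
    // frequency of the active video mode.
    void on_vsync(int vsyncs_per_second) {
        assert(vsyncs_per_second > 0);
        if (++vsync_count >= vsyncs_per_second) {
            vsync_count = 0;
            ++seconds;
        }
    }
};
```

For example, a caller would pass 50 for a 50 Hz mode and 60 for a mode that refreshes at roughly 60 Hz, so one RTC second corresponds to about one real second in both cases.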
refractionpcsx2 7aa554b8eb GameDB: Adding Hugo: Magic in the Trollwoods 2016-12-22 21:12:16 +00:00
Akash 8038ce1aa9 GSDX: Cleanup warnings on MSVC (#1694)
Explicitly cast some bitfields/local loop variables to uint8 as these functions have uint8 as the parameter datatype.
2016-12-21 23:21:07 +00:00
Jonathan Li 10eb88f6fe Merge pull request #1706 from PCSX2/greg/vif-hash
Greg/vif hash
2016-12-21 22:30:27 +00:00
FlatOutPS2 9b6c3bd106 GSdx Merge Circuit: Fix regression and issue
Avoids graphical issues in EA NASCAR games and a regression in Time Crisis 2/3 split screen mode.
2016-12-21 01:28:43 +01:00
Jonathan Li 5a63a62454 cdvdgigaherz: Fix read past the end of the buffer 2016-12-19 23:56:48 +00:00
Jonathan Li f2edc50675 cdvdgigaherz: Improve prefetch logic
Avoid reading past the end of the disk.
Avoid waiting when there are prefetches remaining.
Fix the maths so that the first prefetch after a request attempts to
read the next block of sectors and not the block of sectors that was
just read (which will just be skipped anyway because the data has just
been cached).
Avoid potential prefetch after disk is swapped (though disc swap doesn't
work properly if you just eject and insert a different disk).
Stop prefetching on disk read failure (Suikoden hits this case - 2048
byte reads are requested, but only 2352 byte reads will succeed).

Also reduce the read retry count to 2.
2016-12-19 23:56:48 +00:00
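The prefetch arithmetic described in the commit above might look roughly like this. It is a sketch under stated assumptions, not the cdvdgigaherz code: `kSectorsPerRead` is an assumed value, and `block_start`, `next_prefetch_block`, and `sectors_in_block` are hypothetical helpers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Assumed constant: sectors fetched per read (the real value is not given here).
constexpr uint32_t kSectorsPerRead = 16;

// First sector of the block that contains `sector`.
uint32_t block_start(uint32_t sector) {
    return sector - (sector % kSectorsPerRead);
}

// First sector of the block to prefetch after a request for `sector`:
// the NEXT block, not the one that was just read and cached. Returns
// disc_sectors when nothing is left to prefetch.
uint32_t next_prefetch_block(uint32_t sector, uint32_t disc_sectors) {
    assert(sector < disc_sectors);
    return std::min(block_start(sector) + kSectorsPerRead, disc_sectors);
}

// Number of sectors to read for a block starting at `start`, clamped so the
// read never goes past the end of the disc.
uint32_t sectors_in_block(uint32_t start, uint32_t disc_sectors) {
    assert(start <= disc_sectors);
    return std::min(kSectorsPerRead, disc_sectors - start);
}
```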
Jonathan Li c1160f40d0 cdvdgigaherz: Rename variables/parameters in cdvdDirectReadSector
s/sector/sector_block
s/first/sector
2016-12-19 23:56:48 +00:00
Jonathan Li 3f89f4bd32 cdvdgigaherz: Use constant for sectors per read 2016-12-19 23:56:48 +00:00
Gregory Hainaut 58e4076620 vif: update alignment constraint
16B alignment is now useless for nVifBlock (no more SSE).
However, update the alignment of the bucket to 64B. It will reduce the cache miss
probability in the find loop
2016-12-18 22:51:23 +01:00
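As an illustration of the alignment change above, here is a minimal sketch using `alignas`. The `Cell` and `Bucket` layouts and the 64-byte cache line size are assumptions made for the example; they are not the actual nVifBlock or HashBucket definitions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical 16-byte cell. Without SSE loads it only needs its natural
// 4-byte alignment.
struct Cell {
    uint32_t key0, key1, key2;
    uint32_t ptr;
};

// Four cells fill exactly one 64-byte line (64 bytes is an assumed cache
// line size), so scanning a short chain touches a single line.
struct alignas(64) Bucket {
    Cell cells[4];
};

static_assert(sizeof(Cell) == 16, "cell is expected to be 16 bytes");
static_assert(alignof(Bucket) == 64, "bucket starts on a 64-byte boundary");
static_assert(sizeof(Bucket) == 64, "bucket spans exactly one line");

// True when p sits at the start of a 64-byte line.
bool starts_on_cache_line(const void* p) {
    assert(p != nullptr);
    return reinterpret_cast<uintptr_t>(p) % 64 == 0;
}
```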
Gregory Hainaut d812222061 vif: use u32 code instead of u8/u16
It avoids memory stalls and greatly reduces the overhead of the dVifUnpack function

Here is a VTune summary of this branch (done on SotC init)

dVifUnpack<1> was 14.5% of effective VU thread time
dVifUnpack<1> is now 3.8% of effective VU thread time

I hope it will translate to better fps
2016-12-18 22:44:24 +01:00
Gregory Hainaut ef75b36013 vif: move back the cache search in the unpack function
Avoids the various moves needed to return the value (actually due to the pointer)
2016-12-18 22:44:22 +01:00
Gregory Hainaut e4c2c53b19 vif: inline dVifsetVUptr function
It avoids a double cmp/jmp in the dynarec/interpreter mode.
2016-12-18 22:44:01 +01:00
Gregory Hainaut 6ae082dab2 vif: compute the length during the compilation stage 2016-12-18 22:44:00 +01:00
Gregory Hainaut 7a33cda122 vif: replace sse cmp code with standard cmp
Standard instructions are faster to execute. Besides, the CPU can optimize the cmp/jne

SSE

  e0:	add    ecx,0x10
  e3:	cmp    eax,0x7
  e6:	jg     1b0 <void dVifUnpack<0>(unsigned char const*, bool)+0x1b0>
enter_loop:
  ec:	vpcmpeqd xmm0,xmm1,XMMWORD PTR [ecx]
  f0:	vmovmskps eax,xmm0
  f4:	cmp    eax,0x7
  f7:	jne    e0 <void dVifUnpack<0>(unsigned char const*, bool)+0xe0>

Standard cmp

  d8:	add    eax,0x10
  db:	mov    esi,DWORD PTR [eax+0xc]
  de:	test   esi,esi
  e0:	je     190 <void dVifUnpack<0>(unsigned char const*, bool)+0x190>
enter_loop:
  e6:	cmp    ecx,DWORD PTR [eax+0x4]
  e9:	jne    d8 <void dVifUnpack<0>(unsigned char const*, bool)+0xd8>
  eb:	cmp    DWORD PTR [eax+0x8],ebx
  ee:	jne    d8 <void dVifUnpack<0>(unsigned char const*, bool)+0xd8>

v2: use reference instead of a pointer for find parameter
2016-12-18 22:43:07 +01:00
Jonathan Li f441efd776 cdvd: Set the data ready flag after a finished transfer
Fixes a black screen loading issue in Street Fighter EX3 (NTSC-J).
2016-12-18 16:27:05 +00:00
Jonathan Li 5c53708f43 cdvd: cdvdRead08 is interrupt reason, not status
It seems there was a bad copy paste that caused PwOff to be changed to
Status in bc9e0b08ad.
2016-12-18 16:25:52 +00:00
Jonathan Li 0708d7c539 onepad: Fix variable type
Fixes a type limits warning on a 64-bit build.
2016-12-18 14:32:13 +00:00
Jonathan Li c974a0d888 pcsx2: Fix "ISO Selector" menu item removal memleak
Delete() deletes the menu item but keeps the sub menu. Remove() doesn't
delete the menu item.

Also use AppendSubMenu - using Append on a submenu is deprecated.
2016-12-18 14:31:27 +00:00
Gregory Hainaut 2320efeb55 vif: increase buckets number to 64K
It allows comparing only 8B in the lookup, so SSE can be replaced with general-purpose instructions

As a bonus, it allows computing the hash key with a mov rather than a modulo (which was an 'and')
2016-12-18 14:05:55 +01:00
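The "mov rather than modulo" remark above can be sketched as follows. This assumes the hash key is simply the low 16 bits of a 32-bit value; the function names are hypothetical and the real key derivation in the VIF code may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kBuckets = std::size_t{1} << 16; // 64K buckets

// With a power-of-two bucket count, the modulo reduces to an 'and' mask.
std::size_t bucket_by_mask(uint32_t key) {
    return key & (kBuckets - 1);
}

// With exactly 2^16 buckets, the index is the low 16 bits of the key, which
// a compiler can typically obtain with a single zero-extending 16-bit move.
std::size_t bucket_by_truncation(uint32_t key) {
    return static_cast<uint16_t>(key);
}

// Returns the bucket index and checks (in debug builds) that truncation,
// mask, and modulo all agree.
std::size_t bucket_index(uint32_t key) {
    std::size_t idx = bucket_by_truncation(key);
    assert(idx == bucket_by_mask(key) && idx == key % kBuckets);
    return idx;
}
```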
Gregory Hainaut 1a32062439 vif: repack nVifBlock struct
cl/wl can fit in a single byte. Add a 2B length field instead.
It will contain the precomputed length to reduce dVifsetVUptr overhead
2016-12-18 14:05:55 +01:00
Gregory Hainaut d34e99b38b vif: handle the special case 0 in the compilation stage (rather than lookup) 2016-12-18 14:05:55 +01:00
Gregory Hainaut 555c96a941 vif: reorganize dVifUnpack
Inline the execution part
Add a num parameter to dVifsetVUptr
Use a local variable for the nVifBlock instead of a global struct state

The goal is to ease future updates of the nVifBlock struct
2016-12-18 14:05:55 +01:00
Gregory Hainaut 10b3d429fe vif: new implementation of the hash bucket
The previous implementation saved both the chain pointer and the chain size.
Rationale: the size is useful to add a new element and to detect the end of the chain.
The Vif cache rarely misses, so 'add' is barely called and the end of a chain is
barely reached.

The new implementation adds a null cell at the end of the chain. As a
cell contains an x86 pointer, if it is null you can conclude that you have
reached the end of the chain.

The 'add' function traverses the chain to get the current size. It is
a cold path; besides, the chain is often short (< 4).

The 'find' function only needs to check the startPtr bytes to detect the end
of the loop.

Note: SizeChain was replaced with a std::array
2016-12-18 14:05:53 +01:00
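The sentinel-terminated chain described in the commit above can be sketched like this. It is a simplified illustration, not the HashBucket class: the `Cell` fields other than `startPtr`, the `Chain` class, and the use of `std::vector` for storage are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical cell: two key words plus a code pointer.
// startPtr == 0 marks the sentinel at the end of the chain.
struct Cell {
    uint32_t key0;
    uint32_t key1;
    uintptr_t startPtr;
};

class Chain {
public:
    Chain() : cells_(1, Cell{0, 0, 0}) {} // an empty chain is just the sentinel

    // Hot path: no stored size, the loop stops at the null startPtr.
    const Cell* find(uint32_t key0, uint32_t key1) const {
        for (const Cell* c = cells_.data(); c->startPtr != 0; ++c) {
            if (c->key0 == key0 && c->key1 == key1)
                return c;
        }
        return nullptr;
    }

    // Cold path: walk to the sentinel to learn the size, overwrite it with
    // the new cell, then append a fresh sentinel.
    void add(const Cell& cell) {
        assert(cell.startPtr != 0); // a null pointer would end the chain early
        std::size_t size = 0;
        while (cells_[size].startPtr != 0)
            ++size;
        cells_[size] = cell;
        cells_.push_back(Cell{0, 0, 0});
    }

private:
    std::vector<Cell> cells_;
};
```

The trade-off is that `add` pays for a traversal, which is acceptable when misses are rare and chains are short, while `find` saves the size bookkeeping.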
Gregory Hainaut c58b04979f vif: remove the type template of HashBucket
The class is designed and optimized for the layout of nVifBlock.
Besides, it will ease future improvements.
2016-12-18 13:41:14 +01:00
Gregory Hainaut c368618d09 vif: use intrinsic cast instead of ugly define 2016-12-18 13:41:14 +01:00
Gregory Hainaut 1acc81c25d vif: don't allocate vifblock hash on the heap
Avoid an extra indirection to access the hash bucket (Find function)
2016-12-18 13:41:14 +01:00
Gregory Hainaut 3dc7dc0cdc vif: improve block compilation management
Safety:
* check remaining space before compilation
* clear hash if recompiler is reset

Perf:
* don't re-search the hash after a miss
* reduce branching in Unpack/ExecuteUnpack

Note: a potential speed optimization for dVifsetVUptr
Precompute the length and store it in the cache. However, it needs 2B in the
nVifBlock struct. Maybe we can compact cl/wl. Or merge aligned with upkType
(if some bits are useless)
2016-12-18 13:41:13 +01:00
Gregory Hainaut b0b5c27fec vif: remove useless state from nVifStruct 2016-12-18 13:23:07 +01:00
Gregory Hainaut c2587abcea mVU: always call perf before leaving the compilation function
I missed some early returns in my first attempt. Now VTune properly shows me
the time spent in the VU recompiler.

Note: It seems some blocks overlap (likely due to the branching mess). But it is still way better than no data
2016-12-16 22:01:06 +01:00
Gregory Hainaut 632b4971de common: remove memset duplicates
Use standard memset instead of memset_8

Move memzero/memset8 into a common OS file.
2016-12-16 20:45:22 +01:00
Gregory Hainaut b3474b5a71 MTVU/gif: prebuilt the fake packet
GS_Packet constructor calls memset which is quite slow and useless as data is overwritten

Vtune overhead of Gif_Unit::Execute goes from 5.8% to 3.0% (EE thread)
2016-12-16 10:31:23 +01:00
ramapcsx2 29d229264d Merge pull request #1696 from FlatOutPS2/master
psxmode: Correct exe name for several PSX titles
2016-12-13 23:54:58 +01:00
FlatOutPS2 ff98dac104 psxmode: Correct exe name for several PSX titles
Several PSX titles lack a backslash in the elf path, which made the disc
serial contain 'cdrom:'. This caused savestate issues in those games.

Solves: https://github.com/PCSX2/pcsx2/issues/1692
2016-12-13 17:32:26 +01:00
Jonathan Li 61669d1f3f gsdx:png: Fix accidental resource leak
Oops.

Unfortunately it'll reintroduce the clobbering warning on gcc 4.9.
2016-12-12 23:08:30 +00:00
Jonathan Li b178423166 gsdx-replayer:cmake: Reduce build time/filesize
Avoid building GSdx twice if the replayer is being built.
2016-12-12 18:54:54 +00:00
Jonathan Li 2c3fd160c3 gsdx-replayer:linux: Fix strict-aliasing warnings
Use a reinterpret_cast instead of casting the function pointer address
to a void** and dereferencing it.

Also remove an unnecessary (void) and avoid including stdafx.h.
2016-12-12 18:14:38 +00:00
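The cast change described in the commit above can be illustrated with a small sketch. `symbol_cast`, `add_one`, and `lookup_add_one` are hypothetical; `lookup_add_one` merely stands in for a dlsym-style lookup that returns an untyped `void*`. Note that converting between `void*` and a function pointer with `reinterpret_cast` is conditionally supported in C++, though platforms that provide dlsym support it.

```cpp
#include <cassert>

// Converts an untyped symbol address into a typed function pointer.
// This replaces the pattern `*(void**)(&fn) = symbol;`, which accesses a
// function-pointer object through a void* lvalue and triggers
// strict-aliasing warnings.
template <typename Fn>
Fn symbol_cast(void* symbol) {
    assert(symbol != nullptr);
    return reinterpret_cast<Fn>(symbol);
}

// Stand-in for a function exported by a shared library.
int add_one(int x) { return x + 1; }

// Stand-in for a dlsym-like lookup returning an untyped address.
void* lookup_add_one() {
    return reinterpret_cast<void*>(&add_one);
}
```

A caller would then write `auto fn = symbol_cast<int (*)(int)>(lookup_add_one());` and call `fn` like any other function pointer.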
Jonathan Li d4a6e18c01 gsdx:png: Fix gcc clobber warnings
Don't adjust 'image' and just use an additional offset.
'success' was kinda unnecessary when true or false could just be
directly returned.
Move 'compression' clamping out to GSPng::Save instead.

And throw in a whole bunch of const for good measure.
2016-12-12 17:39:05 +00:00
Jonathan Li 415090d249 common: Avoid wchar_t in pxTextWrapper
wchar_t is 16 bits on Windows, which can't actually properly fit all
Unicode characters.

Use the wx3.0.x wxTextWrapper approach of using iterators that increment
by actual characters to fix the issue, and also switch to using the
std::string style functions in wxString.
2016-12-10 22:30:27 +00:00
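To illustrate why a 16-bit code unit is not always a whole character, here is a minimal sketch that walks a UTF-16 string by code points. It uses `std::u16string` as a portable stand-in for a 16-bit wchar_t string; the helper names are hypothetical and this is not the pxTextWrapper or wxString code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

inline bool is_high_surrogate(char16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
inline bool is_low_surrogate(char16_t u) { return u >= 0xDC00 && u <= 0xDFFF; }

// Combines a surrogate pair into the code point it encodes.
char32_t decode_pair(char16_t hi, char16_t lo) {
    assert(is_high_surrogate(hi) && is_low_surrogate(lo));
    return static_cast<char32_t>(0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00));
}

// Counts characters (code points) rather than 16-bit code units: a valid
// surrogate pair advances by two units but counts as one character.
std::size_t count_code_points(const std::u16string& s) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); ++i, ++n) {
        if (is_high_surrogate(s[i]) && i + 1 < s.size() && is_low_surrogate(s[i + 1]))
            ++i; // skip the low half of the pair
    }
    return n;
}
```

Text wrapping code that splits on code units could cut such a pair in half, which is the kind of issue that iterating by actual characters avoids.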
Jonathan Li afe86a5f66 cmake: Only use -fprofile-dir when PGO is used
It stops clang from warning that '-fprofile-dir' is not supported.
2016-12-10 21:51:21 +00:00