Commit Graph

9299 Commits

Author SHA1 Message Date
refractionpcsx2 98e8d93fa3 Merge pull request #1728 from ssakash/custom_regression
GSDX-TextureCache: Fix corner cases on custom resolution scaling
2017-01-04 16:21:18 +00:00
Gregory Hainaut ecd00d377a Merge pull request #1729 from np511/master
Extend LTO support and remove warnings
2017-01-04 17:03:38 +01:00
np511 f55f3b94a1 Removes LTO warnings and sets -flto=number of cores. 2017-01-03 15:45:24 -05:00
Jason Brown fce2814735 Added callbacks for OSD Log and Monitor. Added wrappers in PCSX2 main for callbacks. Added some basic info calls (e.g. Saving loading FPS) 2017-01-03 10:43:56 +01:00
Jason Brown 44e671bb0a Add an RGBA getter for ConsoleColors 2017-01-03 10:43:56 +01:00
Jason Brown 248ad0ddde Added config page to linux setting dialog 2017-01-03 10:43:56 +01:00
Jason Brown b8a84d170a Added OSD Manager which depends on FreeType2. Added functions into GSDeviceOGL to render OSD and a point shader. 2017-01-03 10:43:56 +01:00
Jason Brown 4c084391fc Changed the GSBufferOGL interface from map and upload to map and unmap. This allows rendering directly into the OGL buffer instead of having to do copy at some point. 2017-01-03 10:43:56 +01:00
Gregory Hainaut b87881c91b Merge pull request #1735 from FlatOutPS2/W
GSdx: Prevent FMV crash
2017-01-03 10:36:29 +01:00
FlatOutPS2 048b657c8f GSdx: Prevent FMV crash
Fixes FMV crashing PCSX2 in The Simpsons: Road Rage.
2017-01-03 00:46:38 +01:00
Akash 6c521c36dd GSdx-TC: Remove some old hacks
Previously, we only calculated the width of a single output circuit which lead to missing a single pixel from the other output circuit which in turn causes offset issues in Persona games, I have customized GetDisplayRect() to now also calculate the dimensions of the merged rectangle when both the output circuits are enabled through the PMODE register, so this hack is no longer needed. :)

TL;DR - The above commit of mine accurately handles the offset issues by calculating union of the rects, removing this stupid hack. (not insulting any other developers, this stupid hack was mine :)
2017-01-02 14:43:17 +05:30
Akash b56ff3fce7 GSDX-TC: Pass merged output size for scaling
Passes the merged output circuit as the base size for texture cache scaling code. Helps fixing scaling issues where games use both of the output circuits for rendering.

Future Note: Alter the behavior of IsEnabled() check always preferring the second output circuit for some weird reason. I plan on changing it to a better auto-output circuit selection mechanism but that could probably be done some time in the future.
2017-01-02 14:42:32 +05:30
Gregory Hainaut 9d1b27cde8 miss a ;
I don't know what I compiled for my previous push !
2016-12-31 17:42:38 +01:00
Gregory Hainaut 1be3f48017 gsdx sw: minor fix on the thread management
* Upgrade the counter to signed 32 bits. 16 bits is too small to contains the 64K value.
* Read ThreadProc/m_count when the mutex is locked
* Use old value of the fetch instead to read back the new value
2016-12-31 16:59:38 +01:00
Gregory Hainaut 14a76a8499 cmake: don't use SSE2 suffix on libgsdx.so file
In debug build, SIMD is disabled, so it is dangerous (use wrong binary) to debug
2016-12-31 13:37:43 +01:00
Gregory Hainaut 761ce60a8e i10n: refresh translated based on latest string change 2016-12-31 11:40:46 +01:00
refractionpcsx2 7a61dc2c88 GSDX: CLUT temp old regression fix for the Romance of the Three Kingdoms games, until somebody who knows what they are doing fixes it properly :P 2016-12-30 22:00:54 +00:00
refractionpcsx2 8fecd3512c refractionpcsx2
GSdx Merge Circuit: Fix regression and issue
2016-12-27 12:08:18 +00:00
refractionpcsx2 c88cd1b065 Merge pull request #1720 from ssakash/rtc
PCSX2-Counters: Fix RTC counting in Progressive modes
2016-12-27 00:00:00 +00:00
refractionpcsx2 af3c1fc510 Gif MFIFO: Slight Optimisation for GIF MFIFO heavily used area.
Hopefully this translates well to slower systems :)

Tekken Tag:

Before: 79-81fps
After: 82-84fps

Front Mission 4 intro (as it pans over the roofs)
Before: 158-159fps
After: 165-166fps
2016-12-24 20:09:47 +00:00
Akash c92830b103 PCSX2-Counters: Fix RTC counting at certain cases
Previously, the seconds variable of the RTC was updated on progressive modes after every 50 Vsyncs, which was obviously wrong. The code has been adjusted to update the RTC with respect to the vertical frequencies of various other video modes.
2016-12-24 11:54:25 +05:30
refractionpcsx2 7aa554b8eb GameDB: Adding Hugo: Magic in the Trollwoods 2016-12-22 21:12:16 +00:00
Akash 8038ce1aa9 GSDX: Cleanup warnings on MSVC (#1694)
Explicitly cast some bitfields/local loop variables to uint8 as these functions have uint8 as the parameter datatype.
2016-12-21 23:21:07 +00:00
Jonathan Li 10eb88f6fe Merge pull request #1706 from PCSX2/greg/vif-hash
Greg/vif hash
2016-12-21 22:30:27 +00:00
FlatOutPS2 9b6c3bd106 GSdx Merge Circuit: Fix regression and issue
Avoids graphical issues in EA NASCAR games and a regression in Time Crisis 2/3 split screen mode.
2016-12-21 01:28:43 +01:00
Jonathan Li 5a63a62454 cdvdgigaherz: Fix read past the end of the buffer 2016-12-19 23:56:48 +00:00
Jonathan Li f2edc50675 cdvdgigaherz: Improve prefetch logic
Avoid reading past the end of the disk.
Avoid waiting when there are prefetches remaining.
Fix the maths so that the first prefetch after a request attempts to
read the next block of sectors and not the block of sectors that was
just read (which will just be skipped anyway because the data has just
been cached).
Avoid potential prefetch after disk is swapped (though disc swap doesn't
work properly if you just eject and insert a different disk).
Stop prefetching on disk read failure (Suikoden hits this case - 2048
byte reads are requested, but only 2352 byte reads will succeed).

Also reduce the read retry count to 2.
2016-12-19 23:56:48 +00:00
Jonathan Li c1160f40d0 cdvdgigaherz: Rename variables/parameters in cdvdDirectReadSector
s/sector/sector_block
s/first/sector
2016-12-19 23:56:48 +00:00
Jonathan Li 3f89f4bd32 cdvdgigaherz: Use constant for sectors per read 2016-12-19 23:56:48 +00:00
Gregory Hainaut 58e4076620 vif: update alignment constraint
16B alignment is now useless for nVifBlock (no more SSE)
However update the alignment of bucket to 64B. It will reduce cache miss
probability in the find loop
2016-12-18 22:51:23 +01:00
Gregory Hainaut d812222061 vif: use u32 code instead of u8/u16
It avoids memory stalls and greatly reduces the overhead of the dVifUnpack function

Here a vtune summary of this branch (done on SotC init)

dVifUnpack<1> was 14.5% of effective VU thread time
dVifUnpack<1> is now 3.8% of effective VU thread time

I hope it will translate to better fps
2016-12-18 22:44:24 +01:00
Gregory Hainaut ef75b36013 vif: move back the cache seach in the unpack function
Avoid the various move to return the value (actually due to the pointer)
2016-12-18 22:44:22 +01:00
Gregory Hainaut e4c2c53b19 vif: inline dVifsetVUptr function
It avoid a double cmp/jmp on the dynarec/interpreter mode.
2016-12-18 22:44:01 +01:00
Gregory Hainaut 6ae082dab2 vif: compute the length during the compilation stage 2016-12-18 22:44:00 +01:00
Gregory Hainaut 7a33cda122 vif: replace sse cmp code with standard cmp
Standard instruction are faster to execute besides the CPU can optimize the cmp/jne

SSE

  e0:	add    ecx,0x10
  e3:	cmp    eax,0x7
  e6:	jg     1b0 <void dVifUnpack<0>(unsigned char const*, bool)+0x1b0>
enter_loop:
  ec:	vpcmpeqd xmm0,xmm1,XMMWORD PTR [ecx]
  f0:	vmovmskps eax,xmm0
  f4:	cmp    eax,0x7
  f7:	jne    e0 <void dVifUnpack<0>(unsigned char const*, bool)+0xe0>

Standard cmp

  d8:	add    eax,0x10
  db:	mov    esi,DWORD PTR [eax+0xc]
  de:	test   esi,esi
  e0:	je     190 <void dVifUnpack<0>(unsigned char const*, bool)+0x190>
enter_loop:
  e6:	cmp    ecx,DWORD PTR [eax+0x4]
  e9:	jne    d8 <void dVifUnpack<0>(unsigned char const*, bool)+0xd8>
  eb:	cmp    DWORD PTR [eax+0x8],ebx
  ee:	jne    d8 <void dVifUnpack<0>(unsigned char const*, bool)+0xd8>

v2: use reference instead of a pointer for find parameter
2016-12-18 22:43:07 +01:00
Jonathan Li 0708d7c539 onepad: Fix variable type
Fixes a type limits warning on a 64-bit build.
2016-12-18 14:32:13 +00:00
Jonathan Li c974a0d888 pcsx2: Fix "ISO Selector" menu item removal memleak
Delete() deletes the menu item but keeps the sub menu. Remove() doesn't
delete the menu item.

Also use AppendSubMenu - using Append on a submenu is deprecated.
2016-12-18 14:31:27 +00:00
Gregory Hainaut 2320efeb55 vif: increase buckets number to 64K
It allow to compare only 8B in the lookup so SSE could be replaced with general instruction

As a bonus, it allow to compute the hash key with a mov rather than modulo (which was an 'and')
2016-12-18 14:05:55 +01:00
Gregory Hainaut 1a32062439 vif: repack nVifBlock struct
cl/wl can fit in a single byte. Add a 2B length field instead.
It will contains the pre computed length to reduce dVifsetVUptr overhead
2016-12-18 14:05:55 +01:00
Gregory Hainaut d34e99b38b vif: handle the special case 0 in the compilation stage (rather than lookup) 2016-12-18 14:05:55 +01:00
Gregory Hainaut 555c96a941 vif: reorganize dVifUnpack
Inline the execution part
Add a num parameter to dVifsetVUptr
Use a local variable for the nVifBlock instead of a global struct state

The goal is to ease future update of the nVifBlock struct
2016-12-18 14:05:55 +01:00
Gregory Hainaut 10b3d429fe vif: new implementation of the hash bucket
Previous implementation saved the both the chain pointer and the chain size
Rational: size is useful to add new element and to detect the end of the chain
Vif cache is rarely miss. So 'add' is barely called and the end of a chain is
barely reached.

New implementation will add a null cell at the end of the chain. As a
cell contains a x86 pointer, if is null you could conclude that you
reach the end of the chain.

The 'add' function will traverse the chain to get the current size. It is
a cold path besides the chain is often short (< 4).

The 'find' function only need to check the startPtr bytes to detect the end
of the loop.

Note: SizeChain was replaced with a std::array
2016-12-18 14:05:53 +01:00
Gregory Hainaut c58b04979f vif: remove the type template of HashBucket
The class is designed and optimized for the layout of nVifBlock.
Besides it will ease future improvement.
2016-12-18 13:41:14 +01:00
Gregory Hainaut c368618d09 vif: use intrinsic cast instead of ugly define 2016-12-18 13:41:14 +01:00
Gregory Hainaut 1acc81c25d vif: don't allocate vifblock hash on the heap
Avoid an extra indirection to access the hash bucket (Find function)
2016-12-18 13:41:14 +01:00
Gregory Hainaut 3dc7dc0cdc vif: improve block compilation management
Safety:
* check remaining space before compilation
* clear hash if recompiler is reset

Perf:
* don't research the hash after a miss
* reduce branching in Unpack/ExecuteUnpack

Note: a potential speed optimization for dVifsetVUptr
Precompute the length and store in the cache. However it need 2B on the
nVifBlock struct. Maybe we can compact cl/wl. Or merge aligned with upkType
(if some bits are useless)
2016-12-18 13:41:13 +01:00
Gregory Hainaut b0b5c27fec vif: remove useless state from nVifStruct 2016-12-18 13:23:07 +01:00
Gregory Hainaut c2587abcea mVU: always call perf before leaving the compilation function
I misses some early return in my first tentative. Now VTune shows me
properly the time in VU recompiler.

Note: It seem some block overlap (likely due to the branching mess). But it is still way better than no data
2016-12-16 22:01:06 +01:00
Gregory Hainaut 632b4971de common: remove memset duplicates
Use standard memset instead of memset_8

Move memzero/memset8 in a common OS file.
2016-12-16 20:45:22 +01:00
Gregory Hainaut b3474b5a71 MTVU/gif: prebuilt the fake packet
GS_Packet constructor calls memset which is quite slow and useless as data is overwritten

Vtune overhead of Gif_Unit::Execute goes from 5.8% to 3.0% (EE thread)
2016-12-16 10:31:23 +01:00