Re add point sampler to OpenGL. Fixes graphical issues when
Allow 8-bit textures is enabled on AMD gpus.
Issue: https://forums.pcsx2.net/Thread-GSDX-Hardware-mode-Bug-Report-Allow-8-bit-Texture
Adjust the code to be easier to read, and execute the gl code only on amd - suggested by Gregory.
Remove useless ATI_SUCKS define in tfx shader.It wasn't used anywhere outside of the shader.
Enhance the skipdraw hack by allowing skipdraw to skip a range of draw
calls.
For example: When the broken effects are at frames 90-100, the default
skipdraw always skips 0-100, possibly skipping several functioning
effects as well. By enhancing the skipdraw feature, it is now possible
to skip just frames 90-100.
For the example given above set the first box to 90 and the second box
to 100 to skip frames 90-100.
coauthor:turtleli (Linux GUI + tidy/simplify Windows GUI code)
Decrease spacing between Hardware Settings and Texture Filtering
slightly. It was a bit too much before,
Adjusted spacing to be equal between options, some had incorrect
spacing.
Maybe Accurate Date and Blending Unit Accuracy can be swapped.
Makes all the tabs scrollable, which makes the dialog more usable (the
OK button should always be present unless the screen is absurdly small).
Only checked on GTK3.
Use hexpand instead of expand, and use margin-start instead of
margin-left (margin-left is deprecated in GTK+ 3.12). IMO it looks
better this way.
Also, set the properties using gtk_widget_set_* instead of g_object_set.
Add God of War Demo ntsc,
Add Burnout Takedown E3 Demo ntsc,
Adjust regions for Harry Potter ATCOS/ATPOA,
Add comment for crc 0x7ACF7E03 - multiloader.
Automatic mipmapping: Harry Potter (Chamber Of Secrets,
Prisoner of Azkaban, Order Of The Phoenix), The Incredible Hulk:
Ultimate Destruction.
Add some missing crc ids for GT4 demo discs.
Adjust Spartan crc hack: Combine/ease the hacks in to one.
Only remove the glow/yellow bloom effect and don't skip any other effects that shouldn't be skipped. UI and some other post processing effects work properly now.
Add crc id CrashBandicootWoC RU.
Add some missing RU crc ids: Onimusha3, ICO, TombRaiderUnderworld, SoulReaver2, LegacyOfKainDefiance.
Removed crc id 7ACF7E03 and mentions of that id and it's duplicates. The id is from a multiloader when packing images, and not an actual crc id from a game(s).
Add FinalFightStreetwise RU, SoulCalibur3 RU. Correcting the wrong id TenchuFS RU (the elf was modified widescreen cheat, sorry). Delite Kunoichi RU (the elf was modified widescreen cheat, the original coincides with the EU).
Move a hack that removed burning/hot air effect. The effect is rendered properly but causes slowdowns so it's best to move it to aggressive for now.
Add comments explaining what the crc hacks do.
Ease the crc hack and skip less effects, makes shadows and some other effects work properly.
The game experiences a bit more upscaling issues which can be resolved with Merge Sprite and Wild Arms offset HW hacks.
Purge Hw hack fixes for Spyro New Beginning and Eternal Night that fixed
HUD and menu display.
They will be replaced with EE patches in GameDB that work for both
software and hardware mode. A much better alternative and less GSdx
hacks.
Port from commit b0af54d3
Fixes shadows in Star Ocean 3.
Note: It works properly on native res only just like on GL.
Upscaling will cause some issues.
Only Direct3D10/11 supports it, D3D9 doesn't support integer operations
so we can't reuse the code.
Revert merge of Spyro Eternal Night / New Beginning hw hack.
Update Spyro New Beginning hack - fixes menu/hud flicker in HW mode.
SW mode still has issues with the menu/hud elements.
Improve #1490
The following games in the crc list are not used anywhere so
we can clean this list up. If some are needed in the future then
they can be re added.
List of removed games:
CaptainTsubasa,
Dororo,
HarvestMoon,
Jak1,
JamesBondEverythingOrNothing,
NamcoXCapcom,
SDGundamGGeneration,
SeintoSeiya,
SengokuBasara,
SilentHill2,
SilentHill3,
Siren,
TalesofDestiny,
VF4,
VF4EVO.
The crc hack broke graphics ingame, causing flickering/transparent
textures and other similar issues.
On a side note the game experiences upscaling issues that can be fixed
with Half Pixel offset hack.
Update it to the version found at
https://github.com/Microsoft/Windows-classic-samples , which is in an
MIT licensed repo, and add the LICENSE file (edited to remove the SIL
Open Font LICENSE part since that doesn't apply).
Some modifications have been made to reduce the diff/stop git
complaining (not including any file that wasn't in the previous version
and removing the related header includes in streams.h, and fixing some
but not all of the whitespace issues).
Move CRC hacks do DX level.
Hack that fixes shadows:
Shadows/glitchy black ground textures can be fixed with Preload Frame Data.
Hack that removed vertical stripes:
D3D10/11 correctly emulates texture shuffle but also needs depth support.
Don't skip draw calls on Jackie Chan Adv and SVC Chaos,
fixes regressions on Jackie Chan Adv and SVC Chaos.
Gregory: The correct fix would be to trace all textures writes to
be sure of the source. But it is a much bigger work.
Purge Sengoku Basara crc hacks. Texture shuffle is emulated correctly on
d3d11/ogl. d3d9 skips the bad draw call.
Move Eternal Poison crc hack to d3d level. Not needed on ogl since
the game is emulated correctly.
Move hack that removes texture shuffle for Demon Stoneback back to ogl level,
half screen bottom issue remains.
Extent CRC hack for The Getaway and The Getaway Black Monday to work on
EU regions.
Move CRC hack that removed shadows to Aggressive, shadows are misaligned
when upscaling - can be fixed with HPO. We can use the hack as a
speedhack for Aggressive state.
Add correct FBP code for Sly 2 E3 Demo. CRC hack should work properly now.
Purge God of War 1/2 CRC hacks that fixed/removed the vertical red lines.
No longer needed since the issue is fixed accross all renders.
Also useless to be kept for speedhacks.
Texture Shuffle changes:
Always Enable Texture shuffle on D3D10/11.
Previously Texture shuffle was enabled if CRC hack
level was below Full, this was kinda not good since
D3D also relies on CRC hacks on Full so you could either
stick with texture shuffle or crc hacks.
Texture shuffle is not supported on D3D9, however we can do a partial
port where instead of vertical lines with the effect we get the effect
on the entire screen. Better than nothing I suppose.
Ported some of the code from OpenGL to D3D
( just a copy - paste job :) ),
part of the code misses a dedicated shader but we can still
use it to fix various issues on many games.
List of affected games tested so far:
The Godfather, Final Fight Streetwise, The Suffering Ties that Bind,
Urban Chaos have their vertical lines issues fixed
(highly possible for other games as well), MGS and Stolen see an improvement
but they are still broken without crc hacks. Other games that suffered
similar issues are probably affected as well.
Channel Shuffle changes:
Update Channel Shuffle detection. A lot of games should see an improvement,
MGS, Urban Chaos, Stolen have their top left corner issues resolved.
Other games should be affected as well that use similar logic.
They still miss a shader so some effects are still broken/show glitches
but it's a nice improvement for D3D users.
Shared changes:
Texture Shuffle and Channel shuffle have been moved to their
own dedicated functions. Should make things a bit cleaner.
Move part of the code for Texture Shuffle to GSRendererHW to be shared
across all HW renderers, should aboid copy paste/duplicate code.
Replace/remove an old crc hack that was used to fix the red vertical
lines issue, the hack is no longer needed. The new hack removes depth
effects on D3D only.
Note: The game has another vertical lines issue that can be fixed with
texture shuffle on D3D.
Add comments to Simple2000Vol114 explaining what the hacks do.
Merge identical code for Spyro Games in to one to avoid duplicate code.
Rename hacks name for Jak series from OO_Jak to OO_JakGames since there
are multiple games added.
This follows PR #2330.
Add some missing regions: Tomb Raider Legend JP,
StarWars Force Unleashed EU, SuperMan Returns EU, Valkyrie Profile 2 FR.
Rename GT3/Concept titles, they were incorrect.
Adjust Harley Davidson region id from NoRegion to US.
Add some missing regions to automatic mipmapping:
FIFA 03 US, FIFA 04 EU, FIFA 05 EU.
Reformat a few comments.
Add "Default" nametag for default list options for the hacks: Sprite,
Round Sprite, GL Advanced settings, HPO.
Should help users know which options are the default ones with less
confusion.
Add CRC id for FFXII US, SFEX3 EU,
GT4 CH and GT4 Online Beta US.
Adjust BullyCC region from US to EU.
Add Missing regions for Ratchet & Clank,
Ace Combat, Destroy All Humans and Soul Reaver
series to Automatic Mipmapping.
Removes GT4/Tourist Trophy CRC hacks. The hack had already been moved to
aggressive due to VRAM spike issues, but is no longer necessary at all
due to the in-game brightness/contrast setting issue being moved to
behind the frame buffer conversion hack for Direct3D and being resolved for OpenGL.
Move hacks that disabled shadows to DX level since OpenGL renders
shadows properly with Depth Emulation.
Some other upscaling issues appear with the disabled hack like a small
black border on the bottom of the screen or some ui elements but those
can be fixed with TC X,Y Offset hack.
Previously, the calculation for the size of data to be loaded was done
based on the rendering target buffer size and scaling multiplier, which
was totally wrong. This led to different resolutions having different
load sizes while the size of the real GS memory is common regardless of
the scaling variancies.
Hence use the default rendering target buffer size for the load size
independent of the scaling values. I've also removed a buffer height saturation
code which seemed unreliable.
Note: The accurate version of the code can be enabled using the macro
provided in config.h (which is more intensive on resources), the current
code goes along with the approach of maintaining a decent performance
level along with a formidable accuracy.
Adjust region id for BT2.
Move the sky texture(depth) hack back to Partial level only for the EU
regions. Effect is still not rendered correctly and causes a half
screen bottom issue.
Add R&C3 EU to Automatic Mipmapping.
Previously the limit was 1000, now 10000 in the GUI. It should help in
some rare cases where a higher number is needed without the need of ini
editing and value reset issues caused by the GUI.
Previously if HW hacks were enabled Merge Sprite was active(if checked) on
native resolution even if the GUI option was disabled, which in result
caused glitches in games on native resolution.
This should address that issue.
Remove Aggressive CRC hacks for SSX 3. Was used to remove the red lines
on older versions but no longer needed since the issue has been fixed.
Offered 1fps or less speed bump but it's not worth keeping for such a
minimal increase.
Merge all FFX CRC hacks in to one. They share the same code so it's
better to have one to avoid duplicate code.
Move CRC hack for Bleach Blade Battlers to Aggressive. It removes the
character shading. It can be used as a speed hack since the gains are
quite good from it. Around 15-30%.
Add missing CRC ids for Soul Calibur 2 and 3.
Move CRC hacks to DX level. They are not needed anymore on OpenGL since
Depth Emulation fixes depth issues (shadows).
Move CRC hack to Partial that fix the half screen
bottom issue since the effect is not rendered correctly.
Move CRC hack to Aggressive. These hacks are only
needed when running upscaled resolution. They skip the
blur effect which cause ghosting and some other screen issues.
Side effect is they also remove the channel effect on OpenGL
which is emulated correctly so let's put them on Aggressive.
Comment out a hack, it's unknown what the hack does atm.
If there are new issues then it will be added back.
Added comments what the hacks do.
Partial port for channel shuffle effect to Direct3D for Tekken5.
The effect is skipped and not rendered but now the top left
screen glitch has been resolved.
Note: At least Minimum CRC level is required for this to work.
Add hidden option "UserHacks_DisableNVhack" to disable
the Nvidia hack on Direct3D which added black lines on the right
and bottom of the screen. Could be useful for Intel and AMD GPUs.
A better solution would be to add Vendor Id detection instead,
but this will do for now.
To disable the Nvidia hack add UserHacks_DisableNVhack=1 in GSdx.ini
Fixes glitchy water in Rogue Galaxy in Direct3D when the hack is
enabled.
Fixes Test Drive car reflection in Direct3D when the hack is enabled.
OTher games are affected as well.
Adds automatic HW mipmapping support.
It relies on CRC ids so if a game does
not have their CRC id but needs mipmapping
it will not work until the id is added.
Add GUI menu and tooltip for Automatic mipmap
"Automatic (Default)"
This option will be default option from now on.
Rename "Very Slow" text option to "Slow" for full mipmap
as it caused the text not to fit properly in the menu.
Credits also go to @RedPanda4552 and @ssakash for helping
with the code.
Rearrange the two columns of HW hacks, new arrangement
is done in alphabetical order on Windows and Linux.
Rename some hacks on Linux to match the windows version.
Some other minor tweaks as well.
The following patch grays out the configure button when there's no
configuration dialog available for the selected codec. What's the use in
clicking it when no dialog pops up? :P (I've been tricked by it lots of
times)
Add HW Hack that enables Framebuffer Conversion on the CPU instead of the GPU.
Can fix broken textures on games but at the cost of slower performance.
List of games: Harry Potter games, FIFA Street games.
Games like Call of Duty, Kung fu Panda might also be affected as well as others
especially on Direct3D.
Add HW Hack GUI option on Windows/Linux for 4-bit and 8-bit Framebuffer conversion hack
named "Frame Buffer Conversion".
Regression was introduced in #1954
GSdx caused the emulator to crash when the renderer was restarted.
It may have affected older gpus from nvidia/amd
with older OpenGL support as well.
The swap interval function must be called on the same thread that
rendering takes place on. This fixes an issue where the turbo speed and
frame limiter hotkeys fail to disable vsync when the OpenGL renderer is
used.
Using range loops where possible (correctly).
Using auto where possible (minimize code changes whenever it's decided to change back to a std container).
Use more efficient erase pattern (where possible).
Minor code tweaks.
PCSX2 sends a negative value (-1) to GSdx when adaptive mode is
specified for Vsync, this mode is exclusive to OpenGL at the moment
and is unimplemented on the D3D11 renderer. Also the present function
of swapchain only accepts values from 0 to 4 as parameter, hence
passing negative values to the function is undefined behavior.
So let's fallback to standard synchronization method on D3D11 when
PCSX2 requests for adaptive mode.
Fix CRC hacks on PAL version.
PAL version will no longer experience very high brightness/contrast
issues on stages in hw mode caused by an incorrect CRC hack.
Moved a CRC hack back to OpenGL mode only for the PAL version
because texture shuffling does not work properly on PAL games.
Forgot to replace `IDC_TEXT` with `IDC_VALUE` macros, due to this the
text containing the name of the options was being updated with the
current value of the option instead of updating the text designated for
holding the values.
DBY isn't an offset to the frame memory but rather an offset to read
output circuit inside the frame memory, hence the top offset should also
be calculated for the total height of the frame memory. Fixes software
mode regression in Beyond Good and Evil.
Also handle cases when GetFrameRect() is called without any paramerer to
avoid an illegal value access violation on the DISP register.
output 1 strip of 2 triangles instead of 2 strips of 1 triangle.
Potentially it would reduce the geometry shader overhead. And it
might avoid a middle line in sprite in some AMD GPU/driver/OS
bad combination
GSdx-ogl: Console messages v2
Follow up to
commit/ec63b04719fd9c05a6aeeacb55dc1c54f5ef145b
Add intel broken driver wiki link message in console (OpenGL).
Print intel / amd buggy driver message once in console (OpenGL).
Pring texture barrier and viewpoint array info once in console (OpenGL).
Enables character outlines to partially work on Full CRC.
DX9 has a small issue where a small black line at the bottom shakes when outlines are enabled.
You can either use Aggressive crc or x,y offset to fix the issue.
Removed unnecessary crc hack that caused shadows on stationary objects (trees) to move on Direct3D in a weird
motion blur type way when the player moved slightly.
* Cast return value of IsEof() to bool. (Avoids int -> bool performance
warning error)
* Cast field and index to the required parameter type of AppendRawData.
The sprite geometry shader was still being used even if the sprites were
converted on the CPUs.
Convert all sprites using the GPU - the fix isn't ideal, but it'll
likely have to do unless someone feels like porting over more of the
OpenGL changes to the D3D11 renderer.
Closes#1921.
Class member variables are initialised in order of declaration in the
class definition. Move native_buffer to the top of the class definition
to avoid initialising m_width and m_height to random values.
Move the custom resolution scaling code to a separate subroutine and
allow future RT buffer resize calls when the buffer size isn't enough.
(Example: when a game's CRTC/Framebuffer size changes. The older code
didn't consider such cases)
Added a more robust buffer size calculation mechanism for custom
resolutions. Improves performance in higher resolutions for games
which don't need a big buffer. There's a great boost in performance
at GS limited scenarios.
I don't even feel there's a need for the large framebuffer option right
now, For future - I plan on making the large framebuffer enabled version
as the default as the overhead is there only at situations when it's
necessary. Until then keeping the original code just to be on the safe
side in case any issue pops up.
An EOF only occurs after attempting to read past the end of the file.
Account for this correctly, which fixes a potential infinite loop when
reading back an xz compressed GS dump.
Add the bitfield structure of the undocumented SYNCV register,
potentially might be useful in proper height determination of the output
circuit for some weird games which still get it wrong but still haven't
figured out how it might be useful. Maybe some sort of black magic
formula with the vertical synchronization values?
The differential phase value seems to closely resemble the display
height value of the video modes (480 for NTSC, 576 for PAL) but after
some investigating into the differential phase, I have no clue on how
they might be even related. Hopefully the mystery will be unveiled in
the near future.
Reduce disk space. Easy to share.
It would be nice to port the code to Windows.
libzma code was taken from https://git.tukaani.org/xz.git
Note: only short dumps are supported so far. Big dump will freeze the interface during the compression.
Or will suck all the RAM.
Note2: a multithreaded encoder would badly impact the compression ratio
Thanks to Turtleli for all review comments
* replace gtk_table by gtk_grid
=> it still misses some paddings
* Use 3.22 monitor API to query screen size
=> need to be tested
* directly add scrolled windows into a container without bothering with
the viewport.
Code compile fine but wasn't tested.
v2: disable the code until I (or someone) get a chance to test and fix it.
Thanks to @colepcsx2 (https://github.com/PCSX2/pcsx2/pull/1896#commitcomment-21858717) for pointing it out!
I also updated the prefix in the inferior video mode detection of GSdx, I'm not even sure why we need the videomode info on the plugin side, might be useful someday.
The shadeboost options text (Contrast, Brightness, Saturation) were not
grayed out when shadeboost was disabled, it was sort of inconsistent
compared to the behavior of external shader, so added grayouts to them
when shadeboost is disabled.
Also changed "OpenGL Very Advanced Custom Settings" to "OpenGL Advanced
Settings", the verbosity didn't help much in my opinion.
Split code in 2 parts
* Base class (GSWndEGL) that implement the core EGL and GL context
* Derived class (GSWndEGL_X11/GSWndEGL_WL) that implement the backend to handle native resources
Note: Most backend code is only useful for GSopen1/PS1 mode. GSopen2 only requires
the AttachNativeWindow implementation
Code is based around EGL_EXT_platform extension that allow to select the platform at runtime.
Note: I think the extension was integrated in EGL 1.5
The X11 backend was mostly converted to XCB
The wayland backend is only a placeholder for future code
I don't know if MS windows is/could be supported with EGL_EXT_platform API
Code validated on Mesa. Proprietary drivers aren't yet tested.
Moved a CRC hack to Aggressive that can cause or fix VRAM and RAM spikes.
This way people can switch between each config if they experience problems with either.
Varies on userconfig , game version and maybe hardware.
Added missing CRC game version for GT4 pal.
Note: The issue might be the same on GT3 and GTConcept , the code might
need to be removed for those games as well.
Hack was used to remove garbage data rectangles from popping up on screen when objects and characters were added to or removed from the world.
This issue is now being handled by OI_DoubleHalfClear in GSRendererHW.cpp, so the hack is no longe necessary and has been removed.
Moves Resident Evil 4 hack to Aggressive level as it is no longer
required to fix any issues, but does offer a decent speed boost.
Moves The Getaway & The Getaway Black Monday CRC hack to Full level, as the issue can be resolved
when using the OpenGL Hardware renderer.
* Windows behavior must be checked
* remove glsl_source.h
v2: fix missing include
Big thanks to Turtleli
v3:
fix indentation in gsdx-res.xml
add dependency in cmake
remove old res/glsl/fxaa.fx symlink
add tfx.cl for OpenCL support on Linux
v4, v5
fix cmake indentation
It is done automatically on Linux. Strings are much
better with this NULL char ;)
All credits go to turtleli
v2: increase resize instead of push_back NULL char
add checkboxes for the 2 "new" hacks
Wrap gs memory & merge postprocessing sprite
add tooltip for OpenGL options
v2: based on turtleli feedback
use gtk_scrolled_window_set_propagate_natural_height on GTK 3.22+
use the nicer GTK_CHECK_VERSION macro
Move GL_ARB_copy_image to optional for OpenGL SW render.
It will allow Ivy Bridge to work with OpenGL SW as it's not required.
Sandy Bridge is not yet tested , would be nice if someone could test.
Clip Control is only used for the HW renderer.
It will help Nvidia DX10 GPU on Windows. Potentially old AMD GPU too.
Unfortunately Ivy bridge still misses texture copy
Note on Linux, you can use the free Mesa driver.
Otherwise, it is time to save money for a future upgrade :)
Adds merge sprite hack to GSDx hacks dialog
And ports merge sprite hack to Direct3D renderers.
Special thanks to my keyboards Ctrl, c and v buttons for all their hard
work in porting this hack.
Ports the "Unscale Point and Line" hack to the Direct3D11 Hardware renderer.
And enables the "Unscale Point and Line" hack for Custom Resolutions with Direct3D11 and OpenGL.
Adds Windows GUI elements of the split texture filtering options.
Bilinear Texture Filtering is moved to the top section of the main GSdx window,
and Trilinear Filtering is moved to Hacks.
Adds Texture Filtering Of Display option to the Shader dialog window Windows UI.
Updates the layouts of the Shader and OSD dialog windows to more closely resemble the Linux GUI.
Reorganizes Hacks dialog window.
Adds UI elements for the Memory Wrapping and HPO v2/Special commits
Adds advanced OpenGL functions "Geometry Shader" and "Image Load Store" to the Windows UI.
Renames "Configure Hacks" to "Advanced Settings and Hacks", to more closely resemble the Linux GUI.
Adds GS Memory Wrapping hack to Windows. Enabling the hack will fix cut-off cutscenes in Wallace & Gromit: The Curse of the Were-Rabbit and Thrillville.
Ace Combat 4 CRC hack removes clouds for a good speed boost, which removes both 3D clouds(invisible with Hardware renderers, but cause slowdown) and 2D background clouds.
Removes blur from player airplane.
This hack also removes rockets, shows explosions(invisible without CRC hack) as garbage data, causes flickering issues with the HUD, and in some (night) missions removes the HUD altogether.
The CRC hack has been moved to the aggressive level.
Aggressive is misspeled several times in the file, this has been adressed.
Changes to the dependencies of the generated logo files did not trigger
a rebuild of the files. Use add_custom_command instead of
execute_process so build dependencies can be specified.
Also prevent the generated files from polluting the source directory.
The pixdata format loader has been removed from recent versions of
gdk2-pixbuf, so the logo doesn't load. Avoid preprocessing the data and
leave the logo as an embedded bitmap file.
Allow the output circuit saturation to take place at cases where one of the output circuit is enabled with frame mode rendering, I'm not sure it would be safe to allow saturations when both of the output circuits are enabled with frame mode rendering. Unlike field mode rendering, frame mode doesn't use identical rectangles at same co-ordinates for output in two alternating fields and potentially they could use a much bigger output size when both of the output circuits are enabled and are separated without any intersection. So let's limit the saturation to only the cases where we detect a single output circuit for frame mode rendering.
Fixes a regression in Devil May Cry 3 and Sky Gunner.
If a user switches renderer they also have to remember to change the CRC
hack level for the best user experience with the selected renderer.
This commit adds a new automatic CRC level that autoselects the
recommended CRC level for the selected renderer, so that a user doesn't
have to make the change manually.
coauthor: turtleli
If OpenGL software is the saved ini renderer and F9 is pressed to toggle
to the hardware renderer, depth emulation will be disabled. This fixes
that issue.
GSdx ogl: SSO Workaround for AMD buggy drivers
All 2017 drivers are now blacklisted.
The BSOD/crash issue is still there so don't set Blending Accuracy to None!
Shortened the message in the console making it more appealing.
-1 is only returned when there is an encoding error, and the va_list
argument is indeterminate after being passed to vsnprintf.
Use the return value to determine the buffer length, and call va_end and
then va_start before vsnprintf is called again.
ICO uses a depth of field effect for the fog. Depth is extracted
into the alpha channel of a texture. And then used as blending factor.
You need a 1:1 texture/pixel mapping otherwise you will line at boundaries.
In order to extract the DoF, ICO moves the depth buffer around the GS
memory. Memory moves are implemented in the not-scaled world. It means
that we can't have the above 1:1 ratio. And we don't know anymore that
data are coming from the current depth buffer.
The solution: I reused an HLE channel shader to read the depth buffer directly.
This way I have the guarantee that pixel/depth are aligned.
Close#1816
Otherwise you have a write before read typical race condition. It works
most of the time because textures are stored in temporary buffers (aka
texture cache). So the race condition requires texture invalidation in the mix.
I hope the perf impact will be small enough.
Fix#1691
Blood Will Tell: gray scale effect description
Frame is renderer in 0x700
Sync 0x700 (RT will be used as input)
Foreach page of frame
// The missing Sync was this one. You can't copy new data to 0x2800
// until you finish the rendering that use 0x2800 as input texture
// (AKA end of this foreach loop)
Sync 0x2800 (not the first iteration, texture will be used as a RT)
Copy page from 0x700+offset to 0x2800
Sync 0x2800 (RT will be used as input)
Render Effect line1 from 0x2800 to 0x700
Alpha test should only be disabled when writes to all of the alpha bits in the Framebuffer are masked. Fixes a regression in Dragon Ball Z: Budokai 3 scouter image rendering.
It allow to do the division before the size multiplication
It avoid a float overflow if T is too big.
Old behavior: (T * size) / Q
New behavior: (T / Q) * size
Performance Note:
* Rcp was replaced by a slow division (more accurate)
* At least we avoid a 2nd loop on the vertex buffer
It helps on Pro Soccer Club and Galerians Ash rendering
Tric Note:
SPRITE must be handled differently because the 'q' of first vertex could
be invalid
Bilinear applies to all renderer
* Common code done in GSVertexTrace
* Extend it with forced but sprite (trade-off between linear/upscale glitches)
* Linux GUI option was moved at the top with the renderer selection
Trilinear is moved to OGL hack
close#1837
Thanks to Flatout for the review and feedback.
It will take care to update the Window GUI :)
Typical bug, missing/wrong texture on the SW renderer but working fine on the HW renderer
Debugged on ATV Quad Power Racing 2 but I suspect couple of game are impacted
Bug description:
GSdx flatten the Q value of sprite. So m_vt.m_eq.q is true when Q(2N+1) are the same.
Q(2N) values could be random. The fix replaces Q0 by Q1 for the uniform Q value.
Previously, the NTSC saturation was also applied for double scan mode (Interlaced and Frame) where the developers send double the height to the DISP registers, saturation shouldn't be performed at such cases as the developers could send a value of 780 while the real size of the output would be 390 due to double scan mode. Doing the saturation later after identifying the real size also seems a bit counter-intuitive as we haven't discovered any cases where double scan games require the NTSC saturation hack. So let's just apply the saturation only for Interlaced (Field) Mode and omit the saturation step for other modes.
Isolate all the hacks into a separate subroutine and properly document about them, should make it easier for people to understand the display rectangle setup code, the hacks were totally messing up the readability of the function earlier.
The buffer contains extra room to avoid a segmentation fault due to an overflow.
Unfortunately the end of the buffer wasn't initialized which can lead to unexpected behavior.
Based on issue #1806 it could impact Guilty Gear X2
Mesa AMD was updated :)
all drivers[1] that support GL_ARB_shader_image_load_store got GL_ARB_clear_texture
[1] Intel driver misses others extensions to run GSdx
Rendering will be corrupted (for advance effects) if the driver doesn't support it.
However it allow to run with Mesa software emulation (or inside a virtual machine)
Note: mesa still requires an override of the buffer storage extension
MESA_EXTENSION_OVERRIDE=GL_ARB_buffer_storage
Previously, the combobox will reach an indeterminate state whenever it's passed with a value out of range via ComboBoxInit(). To avoid such cases, let's initialize the current selection of the combobox with the front element of the settings vector whenever we detect an out of range value which is not declared in the vector.
To reproduce the issue, set "Renderer" to some sort of crazy value like 50 in the GSdx.ini file and it'll mess up the whole GSdx plugin dialog really bad. This patch prevents such undesirable behavior by simply selecting the front element in the vector when we read an unsupported value.
Previously, the auto output circuit selection of the GSdx wasn't good, it simply defaulted to the second output circuit even when the first output circuit is also enabled. The new algorithm for auto selecting returns the merged rectangle dimensions when both of the output circuits are enabled and if the condition for merge is not satisfied then it returns the bigger output circuit.
I didn't fix all the warnings (purpose was to realign code with "recent" update)
Linux note: only miss 2 major items
* res/tfx.cl loading
* device descriptor
* And various bug fixes ;)
Previously, when F8 was triggered multiple times in a single second, the latest captured image would replace the previous captured one as it has the same name as the previous image.
The following patch detects such cases and adds a number along with the filename when new image capture is requested under the same time as the previous capture.
This reverts commit
d6383e6c21
It created a regression in Everybody's Golf 4/Hot Shots Golf 4, breaking the renderering when depth emulation is disabled/when using a Direct3D Hardware renderer.
The code is now a mirror of the ::add. So 1 insert == 1 erase
This way it won't crash on future update. And it will support future GS
memory wrapping improvement.
ZoE2:
RemoveAt overhead plummet to 0.5%. It was 17% !
However insertion is a bit slower. Due to the begin() after the push_front
v2: use std:: for lists and arrays
Removes Alpine Racer 3 hack. Issue has been resolved.
Moves NanoBreaker hack. Issue has been resolved for OpenGL and hack has
been moved to DX only.
Moves Tri-Ace games hacks. Hacks are also necessary for OpenGL with "Partial" CRC Hack Level to prevent massive slowdown.
Move Tales Of Legendia hack back as it's also necessary for OpenGL with "Partial" CRC Hack Level to prevent graphical issues.
Close: https://github.com/PCSX2/pcsx2/issues/1698
Added PAL and NTSC-U CRC's for Ar tonelico II.
Unfortunately it requires at least GCC6. If a nice guy can check the generated code on GCC6.
I don't know clang status.
Here the only example, I have found on the web
https://developers.redhat.com/blog/2016/02/25/new-asm-flags-feature-for-x86-in-gcc-6/
Current generated code in GSTextureCache::SourceMap::Add
38b3: bsf eax,esi
38b6: add esp,0x10
38b9: test esi,esi
38bb: jne 387e <GSTextureCache::SourceMap::Add(GSTextureCache::Source*, GIFRegTEX0 const&, GSOffset*)+0x6e>
BSF already set the Z flag when input (esi) is 0. So it would be better
to not put a silly add before the jump and to skip the test operation.
OMG, Zone of Ender got a speed boost from 11 fps to 45 fps
Seriously, the goal is to allow benchmarking GSdx without too much overhead of the main renderer draw call
Note: unlike the null renderer, texture/vertex uploading, 2D draw, texture conversions are still done.
* move the post-processing frame into the OSD tab
* Rename Global Settings to Renderer Settings
* put monitor and indicator check box on the same line
At least we have a similar number of options by tab
This reverts commit f77c1900fa.
Conflicts:
plugins/GSdx/GSTextureCache.cpp
Another fix was done later for Jak cut scene (or FMV). One game got a regression (don't remember which)
It's only ever updated after the queue is updated, so its state will
always lag slightly behind it. It's sufficient to just use empty().
This seems to fix some caching issues that were noticeable on Skylake
CPUs (#998).
In the previous code, the worker thread would notify the MTGS thread
while the mutex is still locked, which could cause the MTGS thread to
wake up and immediately go back to sleep again since it can't lock the
mutex.
Use a separate mutex for waiting, which avoids the issue.
Some PSX games seem to store image data of the drawing results in an undeterminate area out of range from the current context buffer. At such cases, calculate the height of both the frame memory rectangles combined.
What happens on "Crash bash" -
* At first draw, scissoring is limited to SCAY0- 0 & SCAY1- 255
* At second draw, scissoring is limited to SCAY0- 255 & SCAY0-511
Previously, we limited the height to the value of one single output texture, so instead of that let's calculate the total height of both the two buffers combined to prevent such issues.
Previously, we only calculated the width of a single output circuit which lead to missing a single pixel from the other output circuit which in turn causes offset issues in Persona games, I have customized GetDisplayRect() to now also calculate the dimensions of the merged rectangle when both the output circuits are enabled through the PMODE register, so this hack is no longer needed. :)
TL;DR - The above commit of mine accurately handles the offset issues by calculating union of the rects, removing this stupid hack. (not insulting any other developers, this stupid hack was mine :)
Passes the merged output circuit as the base size for texture cache scaling code. Helps fixing scaling issues where games use both of the output circuits for rendering.
Future Note: Alter the behavior of IsEnabled() check always preferring the second output circuit for some weird reason. I plan on changing it to a better auto-output circuit selection mechanism but that could probably be done some time in the future.
* Upgrade the counter to signed 32 bits. 16 bits is too small to contains the 64K value.
* Read ThreadProc/m_count when the mutex is locked
* Use old value of the fetch instead to read back the new value
Use a reinterpret_cast instead of casting the function pointer address
to a void** and dereferencing it.
Also remove an unnecessary (void) and avoid including stdafx.h.
Don't adjust 'image' and just use an additional offset.
'success' was kinda unnecessary when true or false could just be
directly returned.
Move 'compression' clamping out to GSPng::Save instead.
And throw in a whole bunch of const for good measure.
JMMT uses a bigger display height on NTSC progressive scan mode, which is not really unusual hence adjust the saturation hack to only take effect on interlaced NTSC mode.
However, the whole double screen issue on FMV still exists. As a bit of information, this game has the second output disabled but seems to have some valid data inside of it, maybe the second output data is leaked into the first one? most likely a bug in the frambuffer data management rather than a CRTC issue (needs to be investigated)
Previously, the height of the frame offset was also considered for the total height of the texture which was obviously wrong as the portion before the offset value isn't part of the frame memory.
In the previous code, the threads were created and destroyed in the base
class constructor and destructor, so the threads could potentially be
active while the object is in a partially constructed or destroyed state.
The thread however, relies on a virtual function to process the queue
items, and the vtable might not be in the desired state when the object
is partially constructed or destroyed.
This probably only matters during object destruction - no items are in
the queue during object construction so the virtual function won't be
called, but items may still be queued up when the destructor is called,
so the virtual function can be called. It wasn't an issue because all
uses of the thread explicitly waited for the queues to be empty before
invoking the destructor.
Adjust the constructor to take a std::function parameter, which the
thread will use instead to process queue items, and avoid inheriting
from the GSJobQueue class. This will also eliminate the need to
explicitly wait for all jobs to finish (unless there are other external
factors, of course), which would probably make future code safer.
The previous implementation of HPO adds an offset on vertex position. It
doesn't always work beside it moves the rendering window.
The new implementation will add a texture offset so that instead to sample
the middle of the GS texel, we will sample the middle of the real texture texel.
It must be manually enabled with
* UserHacks_HalfPixelOffset_New = 1 (keep a small offset as intended by GS effect)
* UserHacks_HalfPixelOffset_New = 2 (no offset)
v2: always apply a 0.5 offset in case of float coordinates (Tales of Abyss)
Might break other games but few of them uses float coordinates to read
back the target
Doesn't fully work yet
* Unknown stack frame
* Outside any known module
Potential root cause:
* Nvidia driver
* VU code as ebp is required for emulation so likely no frame
* Use POD type to avoid SSE/AVX compilation dependency
* global object to reduce cache miss
* dynamically object so give a chance to allocate below 2GB (allow x64
optimization)
tex address is a3
vm address is a1
Could help to avoid REX prefix
Reduce prologue/epilogue register copy
Byte code size 41893 => 38912 (on my testcase)
* bin2hex.h is removed
* vptest/vpblendvb YMM support integrated upsteam
* better support of rip for 64 bits
* AVX512 support (only miss the CPU now)
Local change: add BSD3 clause
It will requires a generic (register naming) linear interpolation to use it properly
Gather instruction requires an extra mask register therefore all registers name will be shuffled
Perf wise, initial haswell implementation seems to be microcode emulated.
Based on Gabest's work.
* Miss mipmap
Note: dithering info
It is a bit tricky as a2 on linux was rdx register which overlap with fzm (dh/dl)
It might require dedicated windows code
The main FindMinMax methods is perf critical so instead I created a separate function
to ensure the constness of the depth
Fix letter regression on Xenosaga3
* Explicitly cast w_pages and h_pages into uint32.
* Prevent signed/unsigned comparison by converting lod into unsigned integer, honestly how coud a mipmapping level be negative?
Previously the dedicated custom resolution scaling equation was ignored for the second SetScale() call, generalizing the equations will also fix the DMC scaling issue on custom resolution. Also remove unnecessary checks for null on scale factors. The possibility for having a null scale factor value only exists on custom resolution and it will only happen on cases where the output circuit isn't ready yet. So the ideal way would be to handle all the required conditions of output circuit on "m_renderer->CanUpscale()" itself.
Cost ought to remain small. Worst case is 2 extra "and" operation by group of pixels in scanline renderer
I think PixelAddressN functions are mostly call in the init.
CloseThread is called in the GSJobQueue destructor, so don't call it
again in the GSThread destructor.
Fixes#392, which was caused by a use after free.
Also prevents pthread_join() from being called twice for each thread
on non-Windows operating systems, which is undefined behaviour.
Code can be enabled with "wrap_gs_mem = 1". Code only allow a single shared memory but
I don't think we need more anyway.
Linux only, Kernel panic expected with the HW renderer.
Fix FMV on Silent Hill 3 with the SW renderer
Hardcode location of interface to the location 0. If I understand the
spec correctly (unlikely), variable in interface will get successive
location.
Goal is to reduce driver work. Instead to compute some location based on
name matching approach (and silly validation), the driver can now use
static allocation.
Tests on future Mesa 13 are welcome
Just clear the buffer. The generic solution will be a copy from buffer A
to buffer B But it requires
1/ a big buffer A (otherwise it would overflow)
2/ a line width rescaling (+ the upscaling mess support)
* Add full PMODE register to replace slbg/mmod
* Add full EXTBUF register (will allow to emulate write feedback)
* Add a third source (which will actually be the destination of the
write feedback)
mipmap option 3. Actually maybe a separate tri-linear option will be better
m_mipmap == 2 => use manual PS2 trilinear/mipmap
Otherwise
m_filter == 3 => always use full automatic trilinear interpolation
m_filter == 4 => use automatic trilinear interpolation when PS2 uses mipmap
m_filter == 5 => like 4 but force bilinear interpolation inside layer
Fix Berserk #1526 Well done guys but we're more clever than you ;)
So instead to mask the color channels as any guy that RTFM, they decided to use the illegal 8H frame format
WMS/WMT 2 is the region clamping mode.
Hw unit can't emulate it right so it can give you bad filtering (Fix#1025)
Note: I only did the fix because I wanted to remove the TEXA hack. Otherwise
it is still recommended to use openGL
It must work fine without it now.
From the google code comments:
It would be nice to test those games
* Ar Tonelico 2 (line in sprite regression?)
* breath of fire dragon quarter (overlayed user interface in the game)
v2: update Dx code to use the good format
* As sw renderer, don't bother to bypass it when it is ATST_ALWAYS
* Don't update the ATE register value
=> It is a really bad idea. Next draw call will be wrong if TEST register isn't written.
The TryAlphaTest context could have been updated
GS really uses an invalid texture located at 0.
Improve the rounding for R&C. The idea is to avoid the corner case were only
the corner of the triangle touch the 7/16 edge.
* Always do +1 before the draw call
* Prefix texture name with i (as input) to keep them before the FB
Goal is to ensure that all renderers share the same draw call value.
Game: harley davidson
* write tex0 ctx0
* write tex0 ctx1
* draw ctx 0
Previous GSdx behavior will load the clut every write of TEX0. In the
above case the draw will take the wrong clut.
To be honest, it could be a wrong emulation on the EE core emulation.
The hardware likely got a single clut (1KB cache is quite expensive)
So clut loading must be skipped if the context is wrong.
Next draw will use the ctx1 clut so I apply TEX0 when the context is switched
Please test harley davidson :)
v2: detect context switch from UpdateContext function
V3: always set m_env.CTXT[i].offset.tex, avoid crash (Thanks to FlatOutPS2 that spot the issue)
V4: move bad psm correction code (rebase put it in the wrong place)
Free bt
3 0xe676d194 in ~Source ../plugins/GSdx/GSTextureCache.cpp:1526
4 0xe676d194 in GSTextureCache::SourceMap::RemoveAt(GSTextureCache::Source*) ../plugins/GSdx/GSTextureCache.cpp:1990
5 0xe676f0fe in GSTextureCache::IncAge() ../plugins/GSdx/GSTextureCache.cpp:1022
Use bt
0 0xe6772a83 in GSTextureCache::LookupSource(GIFRegTEX0 const&, GIFRegTEXA const&, GSVector4i const&) ../plugins/GSdx/GSTextureCache.cpp:204
1 0xe66b0c9f in GSRendererHW::Draw() ../plugins/GSdx/GSRendererHW.cpp:579
2 0xe66fb43e in GSState::FlushPrim() ../plugins/GSdx/GSState.cpp:1509
Hypothesis the m_map array of list contains an invalid pointer
It is populated GSTextureCache::SourceMap::Add based on the coverage. The coverage is based on the offset.
So offset is potentially wrong. As mipmap code hack the offset value. It would be a nice culprit.
This commit avoids a potential bad transition between MIPMAP (which
overwrite the "offset") and the base layer (which wrongly keep an old "offset")
Conclusion, pray for my soul as it is very hard to reproduce
Ratchet & Clank (the third) uses an address of 0 for invalid mipmap.
It would be very awkward to put the middle layer of texture in start of
memory. So let's use this information to correct the lod.
It make the game more robust on the lod rounding
* Use texraw for the unconverted texture (keep index fmt)
=> avoid bad filename order with the multiple texture layers
* add the real mipmap address
* Use a nice string format
gsdx ogl: only use geometry shader to convert big enough draw call
The purpose of geometry shader is to reduce bandwidth (72 bytes by sprite)
and CPU load.
Unfortunately it increases CPU load due to extra shader validations.
So geometry shader will only be enabled for draw call with more than
16 sprites (arbitrarily, smallest number before shadow hearts plummet)
v2: don't disable geometry shader in replayer.
It is easier to spot sprite rendering and to manually read vertex info.
Strangely the game uses large texture to handle texture buffer.
I think it plays with WMS/WMT. I'm not sure texture shuffling is 100%
correct here. But without it, it's completely broken.
* 0x9C712FF0, Jak1, EU
* 0x472E7699, Jak1, US
* 0x2479F4A9, Jak2, EU
* 0x12804727, Jak3, EU
* 0xDF659E77, JakX, EU
Please report me the CRC of the US version too so I can add them.
Please test the shadows rendering (openGL HW + accurate blending at least basic)
The game sets the framebuffer as an input texture. So I did the same for
openGL. Code is protected with a CRC. It is working because the game want to sample
pixels.
For the record, I tested it GTA too, it doesn't work as expected because
the game will resize the framebuffer to a smaller one. So you don't have
the guarantee that pixel will be read before a data write.
Note: it requires at least accurate blending set on basic
Note: I need CRC of all Jak games that suffers of this issue. Thanks you :)
Default copy-constructor is eight 32 bits move
GSRendererOGL::Lines2Sprites code shrinks from 510B to 398B
(loop of the function 296B => 181B). Hopefully it will reduce the cost
to convert line to sprite on the CPU (i.e. when geometry shader is disabled)
It impacts all renderers. It ought to fix issue in GTA radiosity,
Shadows in Jak series. (note shadows will suck in upscaling)
Implementation is really brutal. Expect a massive slow down, but at least we can test the effect
easily.
Normally perf impact will remain reasonable if the game doesn't use a Read-Write effect
Performances number are welcomes (my guess is really awful in HW mode, slow in SW mode).
You can enable it with "UserHacks_AutoFlush = 1"
GS memory is only 4MB but rendering is allowed to be 2048x2048
with 32 bits format (so 16MB). Technically the frame/depth buffer can start
at the end of the GS memory. Let's not waste too much memory.
Fix crash with BASARAX
(game draws a 2048x1664 32 bits area)
The hack only fix the HW renderer but not the SW renderer. However I'm not sure
the issue is from GSdx.
The hack will disable alpha test that used to generate empty draw call.
Manual gives all setup to upload a palette from the host. But nothing forbid to render
directly in the palette buffer. (GS rule nb 1, there is no rule ^^)
Fix Virtua Fighter 2 dark colors
However I'm not sure we can fix HW renderer. Rendering is done on the GPU but palette
handling is done on the CPU... So we need to read back data (ouch, and slow). A quick
test didn't get the expected results. Potentially there are others bugs (aka not gonna
happen on the HW renderer)
Fix motocross mania missing texture. Close#1319
As far as I understand, transfer is initialized in DIR. But the real
write only occured later so the blit buffer could have been overwritten
by a new value.
BLIT 0 13700
TREG 40 40
DIR 0 0
BLIT 0 13f00 <=== the bad guy
Write! ... => 0x3f00 W:1 F:C_32 (DIR 00), dPos(0 0) size(64 64)
v2: set a value in m_tr.m_blit for load state
It creates a regression on game that uses a small temporary target to
upload textures of various sizes. Inital code was done to handle direct
frame write (background, FMV) so big target
Purpose is to control the filtering when final image is displayed on the screen
Could improve the sharpness of the output in some games (ofc, it will be pixelated)
When depth primitive is constant and depth test is greater or equal, we can
execute the depth write after color (depth status will only depends on the initial
value)
New case for RGB_ONLY ate:
If the blending equation uses a fixed alpha or a source alpha. We can postpone the alpha write
in a 2nd pass.
If depth can also be postponed, we can guarantee the order of correctness of the value.
1st pass => do RGB
2nd pass => do Alpha & Depth
It fixed Stuntman letter rendering :) Remaining of the game is still broken :(
At higher resolutions it takes too much time to save a screenshot at the
maximum compression level. So let's allow the user to set the
compression level.
This re-uses the png_compression_level setting. The default compression
level is 1 for speed, but if the user wishes to increase the compression
level (without using an external tool) and doesn't mind if the
screenshot takes more time to save then they can increase the
compression level up to a maximum of 9 (which can take quite a while).
Fixes#1527.
PNG_LIBRARIES adds both libpng and zlib to the command line.
PNG_LIBRARY only adds libpng to the linker command line, and the cmake
documentation also suggests not to use it.
Value seems wrongly rounded and you can't distinguish 0xFFFF from 0xFFFE
Instead check that depth is constant for the draw call and the value from the vertex buffer
Fix recent regression on GTA (and likely various games)
In FB_ONLY mode the alpha test impacts (discard) only the depth value.
If there is no depth buffer, we don't care about depth write. So alpha
test is useless and we can do the draw with a single draw call and no program
switch
Extend GSVector to support float move
Initial code likely used integer move for performance reason. However due to
the nan correction, register is now in float domain.
CID 168626 (#1 of 1): Uninitialized scalar field
uninit_member: Non-static class member m_end_block is not initialized in this constructor nor in any functions that it calls.
Fix rendering issue on letters on Kengo/burnout 3/...
Default algo will execute the alpha test in 2 passes. However due to blending
you can't handle accurately the color.
Fortunately for us, the rendering uses an always pass depth test so you
can execute first all the color rendering (which doesn't depends on the alpha test)
And then the depth part which depends on the alpha test.
* Code was factorized a bit with the help of max_z
* Add an extra optimization if test is ZTST_GEQUAL and min z value is
the biggest value. Z test will always be pass.
Note: due to float rounding (23 bits mantissa vs 24 bits depth) the test
is done against 0xFF_FFFE and not 0xFF_FFFF. It is wrong but GPU will
also use float so impact will be null.
The solution files are unused and for ancient Visual Studio versions -
GSDumpGUI has its own solution file, and bin2cpp is included in the main
solution file.
The property sheets have either fallen out of use or were never used in
the first place.
Fixes a regression introduced by 46ba9aa117,
where the Linux GS replayer would always use the options in inis/GSdx.ini
(or use the default options if that doesn't exist) to replay the dump,
instead of using the GSdx.ini from the specified ini folder.
Update done on f712c5c6d0
Previous code use the size of the draw to compute latest block. I
don't know why I use .x/.y which are the origin offset so the start of the block.
Check the instruction set first in GPUinit, GPUconfigure and GPUtext
to prevent unsupported vector instructions from being executed.
Move the vector initialisation in GPUinit to a separate function - it
avoids a vzeroupper instruction.
vector push_back causes a SIGILL signal on a Nehalem (SSE4.2) QEMU VM
when compiled with GCC 6.1.1.
However, an empty constructor causes illegal instruction exceptions to be
generated on a Windows VM.
So here's an inbetween that looks stupid but works on what I've tested.