Commit Graph

3564 Commits

Author SHA1 Message Date
Stenzek 517a977444 ShaderCache: Fix several issues in background shader compiling
- In D3D, shaders could be compiled on the main thread, blocking
startup.
- Reduced the latency between a pipeline being requested and used in all
backends in hybrid ubershader mode, when no shader stages were present.
- Fixed a case where async compilation could cause the same UID to be
appended multiple times to the UID cache.
- Fix incorrect number of threads being used when immediately compile
shaders was enabled.
2018-03-15 01:50:47 +10:00
Stenzek 51e4014a35 DriverDetails: Disable primitive restart on Vulkan with Mali driver 2018-03-14 02:56:24 +10:00
Stenzek db810956ec Vulkan: Provide a more accurate method of detecting drivers/vendors
This is needed to differentiate between the open-source Mesa drivers and
their binary counterparts for Intel and AMD.
2018-03-14 02:48:53 +10:00
Henrik Rydgård 6cbb716f60 SSE-optimize UninitializeXFBMemory
Lowest hanging fruit I could find with a profiler.

Not sure this stuff actually needs to be done, but assuming it is, why
not do it quickly? 10x faster, goes from 1% CPU to 0.09%.
2018-03-13 11:55:28 +01:00
Stenzek 427aa188d4 ShaderCache: Use a version number for pipeline UID caches 2018-03-11 14:48:09 +10:00
Stenzek e31cc1f679 ShaderCache: Implement background shader compilation
This enables shaders to be compiled while the game is starting, instead
of blocking startup. If a shader is needed before it is compiled,
emulation will block.
2018-03-10 16:11:19 +10:00
Stenzek 9fa24700b6 VideoConfig: Collapse ubershader configuration fields to a single value 2018-03-10 15:56:45 +10:00
Stenzek 590307b94c ShaderCache: Use memcmp for comparing pipeline UIDs
As these are stored in a map, operator< will become a hot function when
doing lookups, which happen every frame. std::tie generated a rather
large function here with quite a few branches.
2018-03-10 15:56:42 +10:00
Stenzek 41296db083 Renderer: Remove now-redundant Set{Rasterization,Depth,Blending}State 2018-03-10 15:56:40 +10:00
Stenzek f9c829c7f7 OGL: Re-implement async shader compiling 2018-03-10 15:56:34 +10:00
Stenzek dec0c3bce8 Move shader caches to VideoCommon 2018-03-10 15:56:30 +10:00
Stenzek 4c24a69710 VideoCommon: Add support for Abstract Framebuffers 2018-03-02 20:20:48 +10:00
Stenzek 2a6d9e4713 AbstractTexture: Add support for depth textures/formats 2018-03-01 17:31:24 +10:00
Stenzek 6374a4c4a8 AbstractTexture: Support multisampled abstract texture 2018-03-01 17:31:24 +10:00
Stenzek 4316f5f56b AbstractTexture: Add property/attribute accessor helpers 2018-03-01 17:31:24 +10:00
Stenzek e125eaa237 VideoCommon: Drop references to AbstractRawTexture 2018-03-01 17:31:24 +10:00
Markus Wick 227db66e4f OGL: Use glBufferData on Mali.
tl;dr: This PR speedups dolphin on mobiles with the Mali GPU and ES 3.2
drivers by a factor of 10 by using the method with the biggest overhead.
Please keep care not to buy this shit!

The ARM driver team seems to care very well about their customers. But
bad luck, users and open source developers are *not* their customers. So
even device-independent feature requests are just ignored for *years*:

https://community.arm.com/graphics/f/discussions/4645/gl_ext_buffer_storage-support

The bad point, they neither implement any of the other common ways to
stream dynamic content in unextented GL:
- They just ignore the GL_MAP_UNSYNCHRONIZED_BIT flag
- They don't support on-device buffer updates and just stall with
glBufferSubData

It seems like no benchmark is using any dynamic content - and like no
customer cares about anything but benchmarks, or users...

We have a flag to disable the glBufferSubData way, this PR adds the flag
to also disable the unsychronized mapping way. The second one is
available since their ES 3.2 update, but slow as hell.

So how to continue? The last remaining technical way to stream dynamic
content at all is to alloc a new buffer per draw call with glBufferData.
This is very gross, but still a factor 10 speedup compared to stalling
the GPU. Small tests shows that you can expect another 3-5 times speedup
with EXT_buffer_data, so Mali would be on pair with Adreno here. So if
you have bought such a device unfortunately, please try to make noise on
your vendor forums/support and ask for this extension. If you are going
to buy a new mobile, I'd recormend to avoid *any* mobile with a Mali GPU
in it.
2018-02-25 17:12:36 +01:00
Stenzek fec6bb4d56 VideoBackends: Add AbstractShader and AbstractPipeline classes 2018-02-22 22:02:34 +10:00
Stenzek de632fc9c8 Renderer: Handle resize events on-demand instead of polling
We now differentiate between a resize event and surface change/destroyed
event, reducing the overhead for resizes in the Vulkan backend. It is
also now now safe to change the surface multiple times if the video thread
is lagging behind.
2018-02-20 01:15:55 +10:00
Stenzek c1b39ecc58 BPFunctions: Move upscaling of scissor rect to VideoCommon 2018-02-20 00:49:32 +10:00
Stenzek 5359396099 BPFunctions: Move GX viewport conversion to VideoCommon 2018-02-20 00:49:32 +10:00
Stenzek 340ee8fff8 PixelShaderGen: Implement table-based fog range as in software renderer 2018-02-15 22:19:21 +10:00
Stenzek 93fb0e1e1c TextureCache: Add an option to disable EFB copies to VRAM
The option is named DisableCopyToVRAM under the Hacks section in
GFX.ini. It is intentionally not exposed to the GUI, as users should not
need to use it under normal circumstances. The main use is debugging
issues in the EFB-to-RAM shaders.
2018-02-11 15:48:46 +10:00
Stenzek 84b990faa0 VideoConfig: Remove bForceCopyToRam field
It's the inverse of supports-copy-to-vram.
2018-02-11 15:29:37 +10:00
Anthony 096131c908
Merge pull request #6334 from stenzek/startup
Video Backend Initialization/Core Boot Improvements
2018-02-07 23:35:54 -08:00
Stenzek 260d5b7aa7 BPMemory: Handle fog configuration where both A and C are infinity/NaN
The console appears to behave against standard IEEE754 specification
here, in particular around how NaNs are handled. NaNs appear to have no
effect on the result, and are treated the same as positive or negative
infinity, based on the sign bit.

However, when the result would be NaN (inf - inf, or (-inf) - (-inf)),
this results in a completely fogged color, or unfogged color
respectively. We handle this by returning a constant zero for the A
varaible, and positive or negative infinity for C depending on the sign
bits of the A and C registers. This ensures that no NaN value is passed
to the GPU in the first place, and that the result of the fog
calculation cannot be NaN.
2018-02-01 17:40:39 +10:00
Stenzek fe5150cc31
Merge pull request #6303 from TraceBullet/auto-adjust-window-size
Fix Auto-Adjust Window Size option making the window too large
2018-01-29 17:28:44 +10:00
Stenzek c790077c13 VideoBackend: Remove PeekMessages method
The video thread and backend no longer create any windows, therefore
there will never be any messages dispatched to their thread.
2018-01-27 13:53:55 +10:00
Stenzek d96e8c9d76 VideoBackends: Combine Initialize/Prepare and Cleanup/Shutdown methods
Also allows the work previously done in Prepare to return a failure
status.
2018-01-27 13:53:55 +10:00
TraceBullet ab6f932347 Fix Auto-Adjust Window Size option making the window too large 2018-01-26 10:47:19 -05:00
Stenzek 3f197480ef Renderer: Fix crash on shutdown when frame dumping or taking screenshots 2018-01-26 12:12:00 +10:00
Stenzek 81ae88d2d5 AbstractTexture: Fix crash in Vulkan backend when freeing texture 2018-01-26 19:12:11 +10:00
Stenzek 38e0b6e2ab AbstractTexture: Move Bind() method to Renderer
This makes state tracking simpler, and enables easier porting to command
lists later on.
2018-01-22 13:22:09 +10:00
degasus a5a0599145 CustomTexture: Drop old texture format. 2018-01-20 17:08:47 +01:00
degasus 0b466249e0 CustomTextures: Drop format convertion. 2018-01-20 16:39:04 +01:00
JosJuice 2441fd28d5 AVIDump: Remove incorrect usage of s_ prefix 2018-01-17 22:19:14 +01:00
Markus Wick e02025b45e
Merge pull request #6307 from rukai/fix-frame-dump-path
Handle framedump path not existing
2018-01-17 22:02:51 +01:00
Lucas Kent 6c7e6016fb Handle framedump path not existing 2018-01-18 07:53:30 +11:00
Markus Wick cb7eede193 VideoCommon: Apply custom texture scale for arbitrary mipmaps.
We want to get the same mipmap level. And if the IR and the custom
texture upscaling fits, we don't need to modify the LOD bias.
2018-01-17 09:02:36 +01:00
Markus Wick 2a43f41ace
Merge pull request #6297 from JosJuice/custom-texture-arb-filename
Treat custom textures with "_arb" suffix as having arbitrary mipmaps
2018-01-15 09:58:30 +01:00
Markus Wick b93ae14272
Merge pull request #6300 from JonnyH/WIP/glsl-es-implicit-int-float-conversions-in-gpu-texture-decode
GLSL-ES doesn't allow implicit int/uint conversions
2018-01-11 22:22:05 +01:00
Jonathan Hamilton 46254a2cf2 Some more implicit uint/float conversions in the texture decode shaders 2018-01-11 11:15:40 -08:00
Jonathan Hamilton f23dd992dd GLSL-ES doesn't allow implicit int/uint conversions 2018-01-11 10:54:55 -08:00
JosJuice 226b65bd38 Clean up variable naming in HiresTextures::Update 2018-01-10 17:53:51 +01:00
JosJuice c25fffc9a0 Treat custom textures with "_arb" suffix as having arbitrary mipmaps
This is adapted from Bighead's code that was posted at
https://forums.dolphin-emu.org/Thread-dolphin-custom-texture-mipmaps?pid=460867#pid460867

In master, custom textures are never treated as having arbitrary mipmaps,
so we need either a change like this or a change that makes us apply the
arbitrary mipmap heuristic even when a custom texture is used.
2018-01-10 17:51:45 +01:00
Markus Wick 22f469697b
Merge pull request #6290 from JosJuice/invalid-aspect-ratio
Treat invalid aspect ratio setting values as Auto
2018-01-08 13:46:30 +01:00
JosJuice 1557e6ab05 Specify underlying types for enums that get casted from integers
Otherwise we might get UB if the value we cast is larger than the
max value of the underlying type that the compiled picked for the enum.

I haven't done any extensive check through Dolphin to find cases
of this, I'm just fixing the cases I already know of.
2018-01-08 12:14:18 +01:00
JosJuice a2404c42a1 Treat invalid aspect ratio setting values as Auto 2018-01-06 12:53:53 +01:00
Markus Wick 56d153f548 VideoCommon: Apply the yscale as upscaling of the XFB. 2018-01-06 10:36:33 +01:00
Jonathan Hamilton ceb1f8c8cb Enable shader_framebuffer_fetch blend path on ubershaders
Tested on a linux Intel Skylake integrated graphics with
blend_func_extended force-disabled, as it's the only platform I have
that doesn't crash with ubershaders and supports fb_fetch
2018-01-05 09:56:46 -08:00