Commit Graph

3592 Commits

Author SHA1 Message Date
Stenzek 2687c55cf6 Renderer: Only recreate frame dump texture if dimensions differ
This was a typo, been around for a while. == should be !=. May improve
frame dumping performnace slightly, but I doubt much if any.
2018-04-02 01:15:48 +10:00
JosJuice 01ea10824d Put a "last changed in PR" comment next to UID cache version
This will create a merge conflict if two PRs try to increment the
cache version at the same time, which makes it noticeable that the
PR that gets merged last needs to increment the cache version again.
We already use this for savestates and the game list cache.
2018-03-29 22:38:37 +02:00
Lioncash b818cc682c VideoCommon/Vulkan: Explicitly link in xxhash
Lessens the dependency on the LIBS variable (and also makes the required
libraries explicit).
2018-03-28 17:03:16 -04:00
Lioncash fd7ac0d4a3
VideoCommon/CMakeLists: Migrate off add_dolphin_library
Continues the migration work started in 3a4c3bbe01
2018-03-28 09:57:50 -04:00
Markus Wick fc59ec6f13
Merge pull request #6547 from lioncash/dead
PixelShaderGen: Remove dead code in WriteColor()
2018-03-28 10:43:15 +02:00
Lioncash f10198500e PixelShaderGen: Invert conditional in WriteColor()
Puts the true condition body first instead of the false one for better
reading.
2018-03-27 18:53:17 -04:00
Lioncash c9e0045881 PixelShaderGen: Remove dead code in WriteColor()
The outer conditional executes only whenever destination alpha is used.
2018-03-27 18:46:25 -04:00
Lioncash 2da8d98b2f
HiresTextures: Use std::minmax or std::minmax_element where applicable in GenBaseName()
Minimizes repetition.

std::minmax_element can be used for the 256 * 2 case, as it's only performing byte comparisons
and thus, there will always be an element smaller than 0xffff, so it doesn't need to be included
in the set of compared values.
2018-03-27 15:39:05 -04:00
Lioncash d5a1edba09
HiresTextures: Remove unnecessary pointer casts in GenBaseName()
swap16 has an overload that accepts a u8*, performing the same behavior
in a well-defined manner.
2018-03-27 15:19:02 -04:00
Lioncash 9feb18866b
BPStructs: Remove an unnecessary pointer cast in GetBPRegInfo
swap32 has an overload that accepts a u8*, performing the same behavior
in a well-defined manner.
2018-03-27 12:04:16 -04:00
Stenzek 2f1a7cbee1 Implement "Skip" ubershader mode
Skip ubershader mode works the same as hybrid ubershaders in that the
shaders are compiled asynchronously. However, instead of using the
ubershader to draw the object, it skips it entirely until the
specialized shader is made available.

This mode will likely result in broken effects where a game creates an
EFB copy, and does not redraw it every frame. Therefore, it is not a
recommended option, however, it may result in better performance on
low-end systems.
2018-03-26 01:57:41 +10:00
JosJuice 9c70036105
Merge pull request #6503 from lioncash/brace-warn
PixelShaderGen/UberShaderPixel: Silence -Wmissing-braces warnings
2018-03-23 15:29:39 +01:00
Lioncash a52cc8d52b
PixelShaderGen/UberShaderPixel: Silence -Wmissing-braces warnings 2018-03-23 10:06:27 -04:00
Lioncash 2ab29a40eb
AbstractFramebuffer: Silence a -Wlogical-op-parentheses warning in ValidateConfig() 2018-03-23 09:58:19 -04:00
Léo Lam f335790623
Merge pull request #6460 from lioncash/datareader
DataReader: Minor API changes
2018-03-19 15:02:50 +01:00
Markus Wick 98b4716902
Merge pull request #6442 from stenzek/async-compiler-priority
ShaderCache: Implement compile priority
2018-03-19 09:16:53 +01:00
Markus Wick 79b21e1381
Merge pull request #6459 from lioncash/enum
VertexShaderGen: Convert defines to an enum
2018-03-19 08:48:25 +01:00
Lioncash b21a00d290 DataReader: Provide a const qualified variant of GetPointer()
This wouldn't be much of a data reader if it can't access the
read-only data pointer in read-only contexts. Especially if it
can get a writable equivalent in contexts that aren't read-only.
2018-03-18 16:53:04 -04:00
Lioncash ce29c1c42f DataReader: Correct return type of operator=
It's questionable to not return a reference to the instance being
assigned to. It's also quite misleading in terms of expected behavior
relative to everything else. This fixes it to make it consistent with
other classes.
2018-03-18 16:50:59 -04:00
Lioncash ffade65c55 DataReader: Remove __forceinline from trivial functions 2018-03-18 16:49:50 -04:00
Lioncash 4c2ec39199 DataReader: In-class initialize member variables where applicable
Allows the default constructor to be defaulted and ensures the default
values are associated with the member variables directly.

Also corrects a prefixed underscore in the two parameter constructor.
2018-03-18 16:34:51 -04:00
Lioncash fd956f6c69 DataReader: Make size() and Peek() const member functions
These don't modify class state, so they can be const qualified.
2018-03-18 16:33:14 -04:00
Lioncash 2b3b1e8d09 VertexShaderGen: Convert defines to an enum
Unlike defines, these will actually obey namespacing (should one be
added), and also provide a symbol when debugging, as opposed to a magic
value.
2018-03-18 15:45:20 -04:00
Lioncash b68e8b872e VideoBackendBase: Migrate functions from MainBase.cpp to VideoBackendBase.cpp
Given that this only contains functions from the VideoBackendBase class,
it makes more sense to move these to the relevant cpp file to keep them
all together.
2018-03-18 15:33:59 -04:00
Lioncash 13ed716419 MainBase: Remove effectively unused s_FifoShuttingDown variable
This is only ever set and cleared, which is fine, however nothing else
ever actually reads its state to perform differing behavior.
2018-03-17 18:10:43 -04:00
Lioncash 826c11ba3b MainBase: Remove unused s_beginFieldArgs struct
This is only ever memset to zero and never used again.

This also gets rid of an instance of undefined behavior considering the
draft standard for C++17 (N4659) states at [dcl.type.cv] paragraph 5:

"
The semantics of an access through a volatile glvalue are implementation-defined.
If an attempt is made to access an object defined with a volatile-qualified type
through the use of a non-volatile glvalue, the behavior is undefined.
"
2018-03-17 18:07:08 -04:00
Stenzek 93865b327f ShaderCache: Implement compile priority
Currently, when immediately compile shaders is not enabled, the
ubershaders will be placed before any specialized shaders in the compile
queue in hybrid ubershaders mode. This means that Dolphin could
potentially use the ubershaders for a longer time than it would have if
we blocked startup until all shaders were compiled, leading to a drop in
performance.
2018-03-17 01:53:11 +10:00
Lioncash 50a476c371 Assert: Uppercase assertion macros
Macros should be all upper-cased. This is also kind of a wart that's
been sticking out for quite a while now (we avoid prefixing
underscores).
2018-03-14 22:03:12 -04:00
Stenzek 517a977444 ShaderCache: Fix several issues in background shader compiling
- In D3D, shaders could be compiled on the main thread, blocking
startup.
- Reduced the latency between a pipeline being requested and used in all
backends in hybrid ubershader mode, when no shader stages were present.
- Fixed a case where async compilation could cause the same UID to be
appended multiple times to the UID cache.
- Fix incorrect number of threads being used when immediately compile
shaders was enabled.
2018-03-15 01:50:47 +10:00
Stenzek 51e4014a35 DriverDetails: Disable primitive restart on Vulkan with Mali driver 2018-03-14 02:56:24 +10:00
Stenzek db810956ec Vulkan: Provide a more accurate method of detecting drivers/vendors
This is needed to differentiate between the open-source Mesa drivers and
their binary counterparts for Intel and AMD.
2018-03-14 02:48:53 +10:00
Henrik Rydgård 6cbb716f60 SSE-optimize UninitializeXFBMemory
Lowest hanging fruit I could find with a profiler.

Not sure this stuff actually needs to be done, but assuming it is, why
not do it quickly? 10x faster, goes from 1% CPU to 0.09%.
2018-03-13 11:55:28 +01:00
Stenzek 427aa188d4 ShaderCache: Use a version number for pipeline UID caches 2018-03-11 14:48:09 +10:00
Stenzek e31cc1f679 ShaderCache: Implement background shader compilation
This enables shaders to be compiled while the game is starting, instead
of blocking startup. If a shader is needed before it is compiled,
emulation will block.
2018-03-10 16:11:19 +10:00
Stenzek 9fa24700b6 VideoConfig: Collapse ubershader configuration fields to a single value 2018-03-10 15:56:45 +10:00
Stenzek 590307b94c ShaderCache: Use memcmp for comparing pipeline UIDs
As these are stored in a map, operator< will become a hot function when
doing lookups, which happen every frame. std::tie generated a rather
large function here with quite a few branches.
2018-03-10 15:56:42 +10:00
Stenzek 41296db083 Renderer: Remove now-redundant Set{Rasterization,Depth,Blending}State 2018-03-10 15:56:40 +10:00
Stenzek f9c829c7f7 OGL: Re-implement async shader compiling 2018-03-10 15:56:34 +10:00
Stenzek dec0c3bce8 Move shader caches to VideoCommon 2018-03-10 15:56:30 +10:00
Stenzek 4c24a69710 VideoCommon: Add support for Abstract Framebuffers 2018-03-02 20:20:48 +10:00
Stenzek 2a6d9e4713 AbstractTexture: Add support for depth textures/formats 2018-03-01 17:31:24 +10:00
Stenzek 6374a4c4a8 AbstractTexture: Support multisampled abstract texture 2018-03-01 17:31:24 +10:00
Stenzek 4316f5f56b AbstractTexture: Add property/attribute accessor helpers 2018-03-01 17:31:24 +10:00
Stenzek e125eaa237 VideoCommon: Drop references to AbstractRawTexture 2018-03-01 17:31:24 +10:00
Markus Wick 227db66e4f OGL: Use glBufferData on Mali.
tl;dr: This PR speedups dolphin on mobiles with the Mali GPU and ES 3.2
drivers by a factor of 10 by using the method with the biggest overhead.
Please keep care not to buy this shit!

The ARM driver team seems to care very well about their customers. But
bad luck, users and open source developers are *not* their customers. So
even device-independent feature requests are just ignored for *years*:

https://community.arm.com/graphics/f/discussions/4645/gl_ext_buffer_storage-support

The bad point, they neither implement any of the other common ways to
stream dynamic content in unextented GL:
- They just ignore the GL_MAP_UNSYNCHRONIZED_BIT flag
- They don't support on-device buffer updates and just stall with
glBufferSubData

It seems like no benchmark is using any dynamic content - and like no
customer cares about anything but benchmarks, or users...

We have a flag to disable the glBufferSubData way, this PR adds the flag
to also disable the unsychronized mapping way. The second one is
available since their ES 3.2 update, but slow as hell.

So how to continue? The last remaining technical way to stream dynamic
content at all is to alloc a new buffer per draw call with glBufferData.
This is very gross, but still a factor 10 speedup compared to stalling
the GPU. Small tests shows that you can expect another 3-5 times speedup
with EXT_buffer_data, so Mali would be on pair with Adreno here. So if
you have bought such a device unfortunately, please try to make noise on
your vendor forums/support and ask for this extension. If you are going
to buy a new mobile, I'd recormend to avoid *any* mobile with a Mali GPU
in it.
2018-02-25 17:12:36 +01:00
Stenzek fec6bb4d56 VideoBackends: Add AbstractShader and AbstractPipeline classes 2018-02-22 22:02:34 +10:00
Stenzek de632fc9c8 Renderer: Handle resize events on-demand instead of polling
We now differentiate between a resize event and surface change/destroyed
event, reducing the overhead for resizes in the Vulkan backend. It is
also now now safe to change the surface multiple times if the video thread
is lagging behind.
2018-02-20 01:15:55 +10:00
Stenzek c1b39ecc58 BPFunctions: Move upscaling of scissor rect to VideoCommon 2018-02-20 00:49:32 +10:00
Stenzek 5359396099 BPFunctions: Move GX viewport conversion to VideoCommon 2018-02-20 00:49:32 +10:00
Stenzek 340ee8fff8 PixelShaderGen: Implement table-based fog range as in software renderer 2018-02-15 22:19:21 +10:00