Commit Graph

39 Commits

Author SHA1 Message Date
Markus Wick 227db66e4f OGL: Use glBufferData on Mali.
tl;dr: This PR speedups dolphin on mobiles with the Mali GPU and ES 3.2
drivers by a factor of 10 by using the method with the biggest overhead.
Please keep care not to buy this shit!

The ARM driver team seems to care very well about their customers. But
bad luck, users and open source developers are *not* their customers. So
even device-independent feature requests are just ignored for *years*:

https://community.arm.com/graphics/f/discussions/4645/gl_ext_buffer_storage-support

The bad point, they neither implement any of the other common ways to
stream dynamic content in unextented GL:
- They just ignore the GL_MAP_UNSYNCHRONIZED_BIT flag
- They don't support on-device buffer updates and just stall with
glBufferSubData

It seems like no benchmark is using any dynamic content - and like no
customer cares about anything but benchmarks, or users...

We have a flag to disable the glBufferSubData way, this PR adds the flag
to also disable the unsychronized mapping way. The second one is
available since their ES 3.2 update, but slow as hell.

So how to continue? The last remaining technical way to stream dynamic
content at all is to alloc a new buffer per draw call with glBufferData.
This is very gross, but still a factor 10 speedup compared to stalling
the GPU. Small tests shows that you can expect another 3-5 times speedup
with EXT_buffer_data, so Mali would be on pair with Adreno here. So if
you have bought such a device unfortunately, please try to make noise on
your vendor forums/support and ask for this extension. If you are going
to buy a new mobile, I'd recormend to avoid *any* mobile with a Mali GPU
in it.
2018-02-25 17:12:36 +01:00
Stenzek 9e1c09e347 StreamBuffer: Don't wait on fences twice when reserve > commit
If we allocate a large amount of memory (A), commit a smaller amount,
then allocate memory smaller than allocation A, we will have already
waited for these fences in A, but not used the space. In this case,
don't set m_free_iterator to a position before that which we know is
safe to use, which would result in waiting on the same fence(s) next
time.
2017-09-09 13:26:30 +10:00
z0z0z 005e6796b8 Disable pinned memory for AMD mesa drivers 2017-02-26 10:49:28 -05:00
Lioncash 468f623d27 ShaderGenCommon: Remove unnecessary includes 2017-02-01 12:19:55 -05:00
Léo Lam 31ccfffd38 Common: Add alignment header
Gets rid of duplicated alignment code.
2016-12-06 20:33:53 +01:00
Jules Blok 086f839435 DriverDetails: Make the bug identifiers humanly readable. 2016-10-31 15:02:08 +01:00
Lioncash e01c143379 Common: namespace MemoryUtil 2016-08-07 13:03:07 -04:00
Pierre Bourdon 3570c7f03a Reformat all the things. Have fun with merge conflicts. 2016-06-24 10:43:46 +02:00
degasus bca0e06a95 OGL: Use coherent mapping on Qualcomm devices. 2016-05-11 23:55:28 +02:00
Lioncash d20ba76ab3 StreamBuffer: Make factory function return a std::unique_ptr 2015-12-21 10:21:38 -05:00
Lioncash ec71452706 StreamBuffer: Correct function casing 2015-12-21 10:09:03 -05:00
Lioncash 1eea95a5be StreamBuffer: Use std::array for fences 2015-12-21 10:07:56 -05:00
Scott Mansell 95f3c956a8 Move GL interface code out of the OpenGL video backend. 2015-09-22 00:36:45 +12:00
Tillmann Karras cefcb0ace9 Update license headers to GPLv2+ 2015-05-25 13:22:31 +02:00
Stevoisiak cb86db7b68 Minor consistency changes
Mostly small changes, like capitalization and spelling
2015-01-12 15:18:18 -05:00
Fiora 94c20db369 Rename Log2 and add IsPow2 to MathUtils for future use
Also remove unused pow2/pow2f functions.
2014-09-08 20:15:45 -07:00
Sonicadvance1 e32b2e1771 Work around Intel's failings with with buffer_storage 2014-09-04 19:03:49 -05:00
Lioncash 960b54670c OGL: Fix brace and body placements
Also got rid of void argument specifiers. These are a carryover from C.
2014-08-15 14:12:29 -04:00
Justin Chadwick 43dcbe0a73 Change the comments to be more detailed. 2014-07-04 08:00:49 -04:00
Justin Chadwick 30f93ab418 Place pinned memory as top priority 2014-07-03 20:35:13 -04:00
degasus d9eafd94a2 OGL-StreamBuffer: replace size_t with u32
Yes, this matters.
We align our buffer all the the time which needs a division. u64 divisions are just so slow.
2014-06-05 13:33:50 +02:00
degasus 606e46ba8d OGL-StreamBuffer: move alignment to caller
Only the caller know if alignment is needed at all, so it can be skipped now.
2014-06-05 13:32:13 +02:00
degasus 02a4e3d70f OGL-StreamBuffer: make the SLOT calculation much easier
The size of the buffer is now power of 2, so we can use a shift instead of a division.
This was at about 2% of the global CPU usage.
2014-06-05 13:32:13 +02:00
degasus d81d2e8915 OGL-StreamBuffer: allocate fences in StreamBuffer directly 2014-06-05 13:32:13 +02:00
degasus 0688cfdaef OGL-StreamBuffer: don't use coherent mapping
Coherent mapping seems to be much slower on fermi gpus.
2014-06-05 12:18:44 +02:00
magumagu 812ff4686b OpenGL backend: remove useless header Globals.h.
The header has no content, so it can can just be deleted.
2014-04-12 19:25:37 -07:00
Pierre Bourdon 664c8d30a0 Remove all trailing whitespaces from our codebase. 2014-03-29 11:05:44 +01:00
Matthew Parlane 31cfc73a09 Fixes spacing for "for", "while", "switch" and "if"
Also moved && and || to ends of lines instead of start.
Fixed misc vertical alignments and some { needed newlining.
2014-03-11 00:35:07 +13:00
Tillmann Karras d802d39281 clang-modernize -use-nullptr
and s/\bNULL\b/nullptr/g for *.cpp/h/mm files not compiled on my machine
2014-03-09 21:14:26 +01:00
Tillmann Karras f28116b7da clang-modernize -add-override 2014-03-09 21:12:01 +01:00
Tillmann Karras 315a8ba1c0 Various changes suggested by cppcheck
- remove unused variables
- reduce the scope where it makes sense
- correct limits (did you know that strcat()'s last parameter does not
  include the \0 that is always added?)
- set some free()'d pointers to NULL
2014-02-28 12:43:20 +01:00
Lioncash 2afe215271 Convert all includes to relative paths. 2014-02-18 02:19:10 -05:00
degasus d3fd0eddbb OSX: don't avoid unsync mapping on nvida gpus just because the windows driver doesn't like it
OSX has their own driver, so performance issues aren't shared with the nvidia driver (unlike the closed source linux and windows nvidia driver). So now they'll also use the MapAndSync backend like all other osx drivers.

fixes issue 6596

I've also cleaned up the if/else block selecting the best backend a bit.
2014-01-26 11:00:29 +01:00
degasus 128fcdac26 OpenGL: refactor all of our StreamBuffers
The old way was to use big switch/case statements based on a type of buffer.
The new one is to use inheritance.

This change prohibits us to change the buffer type while running, but I doubt we'll ever do so.
Performance should also be a bit better. Also a nice cleanup.

Added some comments about this different kind of buffers.
2014-01-23 15:12:31 +01:00
degasus be1fee6d74 OpenGL: change StreamBuffer in a streaming way
This is a bit slower on map_and_* because of flushing and _very_ much slower on buffer(sub)?data because of a new memcpy.
But this design allow us to decode directly into a gpu buffer, eg vertexloader will profit :)
2014-01-23 15:12:31 +01:00
Tony Wasserka f1adc56a56 Remove vertex streaming hack.
NV has buffer_storage, AMD has pinned memory.
Both are better than that hack which shouldn't ever have been introduced in the first place.
2014-01-16 00:11:12 +01:00
degasus 95aa977d81 OGL: remove masking from streambuffer
We used this to disable pinned memory for index buffer, but as the detection
was reworked completely, it's just unused code.
2014-01-09 18:52:05 +01:00
Ryan Houdek 1118226f27 Merge branch 'master' into buffer_storage 2013-12-31 19:18:30 -06:00
Jasper St. Pierre 34692ab826 Remove unnecessary Src/ folders 2013-12-31 14:03:19 -05:00