Commit Graph

19 Commits

Author SHA1 Message Date
comex f0131c2e09 Mechanical changes to move most CP state to a struct rather than separate globals.
The next commit will add a separate copy of the struct and the ability
for LoadCPReg to work on it.
2014-09-28 21:23:29 -04:00
comex 63c62b277d Some changes to VertexLoaderManager:
- Lazily create the native vertex format (which involves GL calls) from
RunVertices rather than RefreshLoader itself, freeing the latter to be
run from the CPU thread (hopefully).

- In order to avoid useless allocations while doing so, store the native
format inside the VertexLoader rather than using a cache entry.

- Wrap the s_vertex_loader_map in a lock, for similar reasons.
2014-09-28 21:23:28 -04:00
Rohit Nirmal 46057db37d Fix build failing when disabling precompiled headers. 2014-09-19 18:17:51 -04:00
degasus 8b84ddce9a VideoCommon: rewrite frame skipping code 2014-09-04 18:07:39 +02:00
comex 608f9bcd67 Refactor opcode decoding a bit to kill FifoCommandRunnable.
Separated out from my gpu-determinism branch by request.  It's not a big
commit; I just like to write long commit messages.

The main reason to kill it is hopefully a slight performance improvement
from avoiding the double switch (especially in single core mode);
however, this also improves cycle calculation, as described below.

- FifoCommandRunnable is removed; in its stead, Decode returns the
number of cycles (which only matters for "sync" GPU mode), or 0 if there
was not enough data, and is also responsible for unknown opcode alerts.

Decode and DecodeSemiNop are almost identical, so the latter is replaced
with a skipped_frame parameter to Decode.  Doesn't mean we can't improve
skipped_frame mode to do less work; if, at such a point, branching on it
has too much overhead (it certainly won't now), it can always be changed
to a template parameter.

- FifoCommandRunnable used a fixed, large cycle count for display lists,
regardless of the contents.  Presumably the actual hardware's processing
time is mostly the processing time of whatever commands are in the list,
and with this change InterpretDisplayList can just return the list's
cycle count to be added to the total.  (Since the calculation for this
is part of Decode, it didn't seem easy to split this change up.)

To facilitate this, Decode also gains an explicit 'end' parameter in
lieu of FifoCommandRunnable's call to GetVideoBufferEndPtr, which can
point to there or to the end of a display list (or elsewhere in
gpu-determinism, but that's another story).  Also, as a small
optimization, InterpretDisplayList now calls OpcodeDecoder_Run rather
than having its own Decode loop, to allow Decode to be inlined (haven't
checked whether this actually happens though).

skipped_frame mode still does not traverse display lists and uses the
old fake value of 45 cycles.  degasus has suggested that this hack is
not essential for performance and can be removed, but I want to separate
any potential performance impact of that from this commit.
2014-09-01 14:35:23 -04:00
degasus a64b0bf499 VertexLoader: cache NativeVertexFormat
This fix a performance regression of PR #672.
2014-08-16 12:58:52 +02:00
Pierre Bourdon 16f180524c VertexLoader: do not prepare for vertices if we need to skip them 2014-08-04 20:47:02 -07:00
Pierre Bourdon 6f715a1fbe VertexLoader: Remove more global state dependencies (this time IndexGenerator and VertexManager) 2014-08-02 09:34:39 -07:00
Pierre Bourdon 73f9a22e2e VertexLoader: Remove global state dependency on g_nativeVertexFmt 2014-07-26 01:35:09 +02:00
Pierre Bourdon 78c3a22060 VertexLoader: take the VAT object directly for RunVertices 2014-07-24 01:51:37 +02:00
Pierre Bourdon 20369743a4 VertexLoaderUID: remove global state dependency 2014-07-24 01:12:12 +02:00
degasus bb2fc8ecbb VideoCommon: Cache native vertex formats
We are used to have a 1:1 mapping of GX vertex formats and the native (OGL + D3D) ones, but there are by far more GX ones.
This new cache maps them directly so that we don't flush on GX vertex format changes as long as the native one doesn't change.

The idea is stolen from galop1n.
2014-07-04 14:39:27 +02:00
degasus 7bb44199fd remove unused and unexported function 2014-05-16 14:33:00 +02:00
Tillmann Karras d802d39281 clang-modernize -use-nullptr
and s/\bNULL\b/nullptr/g for *.cpp/h/mm files not compiled on my machine
2014-03-09 21:14:26 +01:00
Lioncash 2afe215271 Convert all includes to relative paths. 2014-02-18 02:19:10 -05:00
Tillmann Karras 404624bf0b Turn loops into range-based form
and some things suggested by cppcheck and compiler warnings.
2014-02-13 09:05:50 +01:00
degasus 3437c7f060 VideoCommon: small VertexLoader(Manager)? refactoring 2014-01-31 07:31:03 +01:00
degasus 010a0d481a VideoCommon: remove Cache Displaylist
This option was known to break every second game and only boost a bit.
It also seems to be broken because of streaming into pinned memory and buffer storage buffers.

v2: also remove dlc_desc
2014-01-31 07:30:55 +01:00
Jasper St. Pierre 34692ab826 Remove unnecessary Src/ folders 2013-12-31 14:03:19 -05:00