Commit graph

77 commits

Author SHA1 Message Date
Daniel R. c3703acb19 shader_recompiler/frontend: fix IMAGE_GATHER4_C_LZ format 2024-08-25 14:06:41 +02:00
Daniel R 95a2d61014 shader_recompiler/frontend: add information on instruction format assert 2024-08-25 13:17:59 +02:00
Daniel R 785cd7ed64 shader_recompiler/frontend: fix V_NOP instruction format 2024-08-25 13:17:24 +02:00
Daniel R. 510ad570d4 shader_recompiler/frontend: implement V_NOP 2024-08-24 23:18:04 +02:00
TheTurtle b52741b714 video_core: Bloodborne stabilization pt1 (#543)
* shader_recompiler: Writelane elimination pass + null image fix

* spirv: Implement image derivatives

* texture_cache: Reduce page bit size

* clang format

* slot_vector: Back to debug assert

* vk_graphics_pipeline: Handle null tsharp

* spirv: Revert some change

* vk_instance: Support primitive restart on list topology

* page_manager: Adjust windows exception handler

* clang format

* Remove subres tracking

* Will be done separately
2024-08-24 22:51:47 +03:00
Vinicius Rangel 1ca13870eb shader_recompiler: handle fetch shader address offsets (#538)
* shader_recompiler: handle fetch shader address offsets

parse index & offset sgpr from fetch shader and propagate them to vkBindVertexBuffers

* shader_recompiler: fix fetch_shader when offset is not present

* video_core: propagate index/offset SGPRs to vkCmdDraw instead of offsetting the buffer address

* video_core: add vertex_offset to non-indexed draw calls

renamed fetch offset fields
2024-08-24 17:36:40 +02:00
psucien b552373680 Merge pull request #497 from xezrunner/xezrunner/cfg-msb-fix
shader_recompiler: fix BranchTarget sign flip for sopp.simm
2024-08-24 11:39:10 +02:00
Xphalnos bb2a417598 Lot of small fixes 2024-08-22 18:01:30 +02:00
TheTurtle e088f0141b vk_pipeline_cache: Avoid recompiling new shaders on each new PL (#480)
* cfg: Add one more divergence case

* Seen in RDR shaders

* renderer_vulkan: Reduce number of compiled shaders

* vk_pipeline_cache: Remove some unnecessary checks
2024-08-21 02:00:24 +03:00
xezrunner 66b59c20f3 Fix control.sopp.simm flipping sign in CFG label generation
This used to cause a fatal crash that would prevent Amplitude [CUSA02480] from booting beyond initialization.

A conditional true label would get an address starting with 0xffff...., which wasn't realistic with the given shader.

The multiplication by 4 causes the value to have its MSB set due to the smaller type.
2024-08-20 22:48:28 +02:00
Lizardy 74d43d059f shader_recompiler: BUFFER_ATOMIC & DS_* Opcodes (#428)
* BUFFER_ATOMIC | DS_MINMAX_U32

- Emission of BufferAtomicU32
- Addition of Buffer opcodes to IR
- Translator for BUFFER_ATOMIC Opcode
- Translators for DS_MAXMIN_U32 Opcodes

* Clang Format & UNREACHABLE_MSG

* clang

* no crash on compile

* clang

* Shared Atomics

* reuse

* rm vscode

* resolve

* opcodes

* side effects

* attempt fix shader comp

* failed attempt to fix

* clang

* do correct vdata set (still fails)

* clang

* fixed BUFFER_ATOMIC_ADD, DS_ADD_U32 fails

* data share should work

* clang

* resource tracking for buffer atomic

* clang

* distinguish RTN opcodes

* clean IsBufferInstruction

---------

Co-authored-by: microsoftv <6063922+microsoftv@users.noreply.github.com>
2024-08-17 22:06:06 +03:00
TheTurtle 3b1e3b0a72 control_flow_graph: Initial divergence handling (#434)
* control_flow_graph: Initial divergence handling

* cfg: Handle additional case

* spirv: Handle tgid enable bits

* clang format

* spirv: Use proper format

* translator: Add more instructions
2024-08-16 20:05:37 +03:00
psucien 0a173b0392 shader_recompiler: basic implementation of BUFFER_STORE_FORMAT_ (#431)
* shader_recompiler: basic implementation of buffer store w\ fmt conversion

* added `Format16` dfmt
2024-08-15 00:15:07 +02:00
TheTurtle d5e7180c54 spirv: Simplify shared memory handling (#427)
* spirv: Simplify shared memory handling

* spirv: Ignore clip plane

* spirv: Fix image offsets

* ir_pass: Implement shared memory lowering pass

* NVIDIA doesn't like using shared mem in fragment shader and softlocks driver

* spirv: Add log for ignoring pos1
2024-08-14 19:01:17 +03:00
TheTurtle 705d1e29cf video_core: Various fixes (#423)
* video_core: Various fixes

* clang format
2024-08-13 20:05:10 +03:00
TheTurtle c243ef0c58 video_core: Crucial buffer cache fixes + proper GPU clears (#414)
* translator: Use templates for stronger type guarantees

* spirv: Define buffer offsets upfront

* Saves a lot of shader instructions

* buffer_cache: Use dynamic vertex input when available

* Fixes issues when games like dark souls rebind vertex buffers with different stride

* externals: Update boost

* spirv: Use runtime array for ssbos

* ssbos can be large and typically their size will vary, especially in generic copy/clear cs shaders

* fs: Lock when doing case insensitive search

* Dark Souls does fs lookups from different threads

* texture_cache: More precise invalidation from compute

* Fixes unrelated render targets being cleared

* texture_cache: Use hashes for protect gpu modified images from reupload

* translator: Treat V_CNDMASK as float

* Sometimes it can have input modifiers. Worst this will cause is some extra calls to uintBitsToFloat and opposite. But most often this is used as float anyway

* translator: Small optimization for V_SAD_U32

* Fix review

* clang format
2024-08-13 09:21:48 +03:00
Vinicius Rangel 5b589b8cc8 spirv: fix image sample lod/clamp/offset translation (#402)
* spirv: fix image sample lod/clamp translation

* spirv: fix image sample offsets

* fix ImageSample opcodes & offset emission
2024-08-13 09:12:38 +03:00
TheTurtle 8809e1c226 video_core: Implement guest buffer manager (#373)
* video_core: Introduce buffer cache

* video_core: Use multi level page table for caches

* renderer_vulkan: Remove unused stream buffer

* fix build

* oops forgot optimize off
2024-08-08 15:02:10 +03:00
TheTurtle e91da052c3 video_core: Minor fixes (#366)
* data_share: Fix DS instruction

* vk_graphics_pipeline: Fix unnecessary invalidate

* spirv: Remove subgroup id

* vector_alu: Simplify mbcnt pattern

* shader_recompiler: More instructions

* clang format

* kernel: Fix cond memory leak and reduce spam

* liverpool: Print error on exception

* build fix
2024-08-05 13:45:28 +03:00
TheTurtle bfc845324c shader_recompiler: Small instruction parsing refactor/bugfixes (#340)
* translator: Implemtn f32 to f16 convert

* shader_recompiler: Add bit instructions

* shader_recompiler: More data share instructions

* shader_recompiler: Remove exec contexts, fix S_MOV_B64

* shader_recompiler: Split instruction parsing into categories

* shader_recompiler: Better BFS search

* shader_recompiler: Constant propagation pass for cmp_class_f32

* shader_recompiler: Partial readfirstlane implementation

* shader_recompiler: Stub readlane/writelane only for non-compute

* hack: Fix swizzle on RDR

* Will properly fix this when merging this

* clang format

* address_space: Bump user area size to full

* shader_recompiler: V_INTERP_MOV_F32

* Should work the same as spirv will emit flat decoration on demand

* kernel: Add MAP_OP_MAP_FLEXIBLE

* image_view: Attempt to apply storage swizzle on format

* vk_scheduler: Barrier attachments on renderpass end

* clang format

* liverpool: cs state backup

* shader_recompiler: More instructions and formats

* vector_alu: Proper V_MBCNT_U32_B32

* shader_recompiler: Port some dark souls things

* file_system: Implement sceKernelRename

* more formats

* clang format

* resource_tracking_pass: Back to assert

* translate: Tracedata

* kernel: Remove tracy lock

* Solves random crashes in Dark Souls

* code: Review comments
2024-07-30 23:32:40 +02:00
Vinicius Rangel 9d8cbdc507 64 bits OP, impl V_ADDC_U32 & V_MAD_U64_U32 (#310)
* impl V_ADDC_U32 & V_MAD_U64_U32

* shader recompiler: add 64 bits version to get register / GetSrc

* fix V_ADDC_U32 carry

* shader recompiler: removed automatic conversion to force_flt in GetSRc

* shader recompiler: auto cast between u32 and u64 during ssa pass

* shader recompiler: fix SetVectorReg64 & standardize switches-case

* shader translate: fix overflow detection in V_ADD_I32

use vcc lo instead of vcc thread bit

* shader recompiler: more 64-bit work

- removed bit_size parameter from Get[Scalar/Vector]Register
- add BitwiseOr64
- add SetDst64 as a replacement for SetScalarReg64 & SetVectorReg64
- add GetSrc64 for 64-bit value

* shader recompiler: add V_MAD_U64_U32 vcc output

- add V_MAD_U64_U32 vcc output
- ILessThan for 64-bits

* shader recompiler: removed unnecessary changes & missing consts

* shader_recompiler: Add s64 type in constant propagation
2024-07-27 17:23:59 +03:00
DanielSvoboda 88cd3172ff BUFFER_STORE_DWORDX2 2024-07-26 00:25:29 -03:00
squidbus 38398a2175 Fix one-off bug with user data registers. 2024-07-21 22:36:12 +03:00
squidbus d42a32bbd8 Add initial macOS support. 2024-07-21 22:36:12 +03:00
IndecisiveTurtle 70e74160d7 shader_recompiler: Normal gathers 2024-07-17 16:49:45 +03:00
Vladislav Mikhalin f1d1af2dba Implemented load_buffer_format_* conversions (#295)
* Implemented load_buffer_format_* conversions

* clang-format insists on ugly things
2024-07-16 15:03:07 +03:00
IndecisiveTurtle 01f26998f8 ssa_rewrite_pass: Correct phi node type for thread bitmask 2024-07-15 13:34:34 +03:00
georgemoralis 8435322d6a Merge pull request #292 from shadps4-emu/games/00144
Missing graphics features for flOw & Flower
2024-07-14 23:07:46 +03:00
psucien b684893aa8 recompiler: added support for discard on export with masked EXEC 2024-07-13 14:57:01 +02:00
Daniel R ba8ae239f8 shader_recompiler/frontend: Implement opcodes (#289)
`S_ASHR_I32` and `BUFFER_LOAD_DWORD`.
2024-07-13 12:37:25 +03:00
psucien c068adda48 recompiler: proper VS inputs initialization 2024-07-13 01:00:24 +02:00
Vinicius Rangel 7dad151f98 impl V_CMP_CLASS_F32 common filter masks (#276) 2024-07-10 02:24:01 +03:00
DanielSvoboda 33c78854e2 add V_MAD_U32_U24 (#262)
* V_MAD_U32_U24

* adjust V_MAD_I32_I24 for bit extraction

* optional bit extraction parameter

* Update vector_alu.cpp

* clang-format

* Update src/shader_recompiler/frontend/translate/vector_alu.cpp

Co-authored-by: TheTurtle <47210458+raphaelthegreat@users.noreply.github.com>

* Update vector_alu.cpp

* Update translate.h

---------

Co-authored-by: TheTurtle <47210458+raphaelthegreat@users.noreply.github.com>
2024-07-09 01:35:01 +03:00
Stolas f9fd1e1dae Added Legacy Min/Max ops (#266)
* Forwarding V_MAX_LEGACY_F32 to V_MAX3_F32. Fixes Translation error in Geometry Wars 3.

* Forwarded to correct op

* Implemented Legacy Max/Min using NMax/NMin

* Added extra argument to Min/Max op codes

* Removed extra translator functions, replaced with bool

* Formatting
2024-07-08 12:24:12 +03:00
psucien 5836a31d76 recompiler: switch instance data to storage buffers 2024-07-07 13:08:39 +02:00
psucien 9d9ebe7a30 recompiler: fix for gather4 components return 2024-07-07 13:00:52 +02:00
psucien 317838122d renderer: added support for instance step rates 2024-07-06 18:03:43 +02:00
TheTurtle 60c63da3fd shader_recompiler: Check usage before enabling capabilities (#245)
* vk_instance: Better feature check

* shader_recompiler: Make most features optional

* vk_instance: Bump extension vector size

* resource_tracking_pass: Perform BFS for sharp tracking

* The Witness triggered this
2024-07-06 02:42:16 +03:00
TheTurtle d9873e30bc shader_recompiler: Implement most integer image atomics, workgroup barriers and shared memory load/store (#231)
* shader_recompiler: Add LDEXP

* shader_recompiler: Add most image integer atomic ops

* shader_recompiler: Implement shared memory load/store

* shader_recompiler: More image atomics

* externals: Update sirit

* clang format

* cmake: Add missing files

* shader_recompiler: Fix some atomic bugs

* shader_recompiler: Vs outputs

* shader_recompiler: Shared mem has side-effects, fix format component order

* shader_recompiler: Inline constant buffer impl

* video_core: Fix regressions

* Work

* Fixup a few things
2024-07-05 00:15:44 +03:00
georgemoralis ec0afbd63c clang format fix 2024-07-01 23:48:30 +03:00
IndecisiveTurtle b72115bed4 shader_recompiler: More instructions 2024-07-01 22:42:45 +03:00
IndecisiveTurtle 0a900115e8 video_core: Fix some regressions 2024-07-01 18:26:22 +03:00
IndecisiveTurtle 028e28e231 shader_recompiler: Apply buffer swizzle on vertex attribs 2024-07-01 13:56:14 +03:00
IndecisiveTurtle 69797f4d5d video_core: Track renderpass scopes properly 2024-07-01 13:56:14 +03:00
IndecisiveTurtle 73c2697ed1 video_core: Fix a few problems 2024-07-01 13:56:14 +03:00
IndecisiveTurtle 333e5ab85f spirv: Add fragdepth and implement image query 2024-07-01 13:56:14 +03:00
georgemoralis 0845d8f250 Stabilization8 (#218)
* disable configured flexible memory size (caused issues in some games)

* fixed case S_OR_B64 for blazing chrome

* submodules updates and fixes for latest SDL

* stubbed _sigprocmask (not handled and spams too much)

* added ReplaceOp case in Stencilop

* dummy ajm module added
2024-06-27 16:37:17 +03:00
IndecisiveTurtle c5f2368e52 kernel: Const correctness 2024-06-26 18:24:06 +03:00
IndecisiveTurtle 4595a4cfb2 translator: Merge ANDN2 with AND and impl ORN2 2024-06-26 18:16:01 +03:00
IndecisiveTurtle ee50cbdcb6 shader_recompiler: More instructions and fix for swords of ditto 2024-06-26 18:03:09 +03:00