Commit graph

28 commits

Author SHA1 Message Date
Daniel R. dcf245b814
shader_recompiler: Implement basic 64-bit floating point support ()
* shader_recompiler: Implement basic 64-bit floating point support

* Fix formatting
2024-09-15 22:53:08 +02:00
squidbus a49c7e9dcb
shader_recompiler: Add buffer offset calculation when swizzle is enabled. () 2024-09-12 22:59:52 +03:00
TheTurtle 13743b27fc
shader_recompiler: Implement data share append and consume operations ()
* shader_recompiler: Add more format swap modes

* texture_cache: Handle stencil texture reads

* emulator: Support loading font library

* readme: Add thanks section

* shader_recompiler: Constant buffers as integers

* shader_recompiler: Typed buffers as integers

* shader_recompiler: Separate thread bit scalars

* We can assume guest shader never mixes them with normal sgprs. This helps avoid errors where ssa could view an sgpr write dominating a thread bit read, due to how control flow is structurized, even though its not possible in actual control flow

* shader_recompiler: Implement data append/consume operations

* clang format

* buffer_cache: Simplify invalidation scheme

* video_core: Remove some invalidation remnants

* adjust
2024-09-07 00:14:51 +03:00
baggins183 bb29224daf
Implement V_MOVREL variants ()
* shader_recompiler: Implement V_MOVRELS_B32, V_MOVRELD_B32,
V_MOVRELSD_B32

Generates a ton of OpSelects to hardcode reading or writing from each
possible vgpr depending on the value of m0

Future work is to do range analysis to put an upper bound on m0 and
check fewer registers.

* fix runtime info after rebase
2024-09-06 23:47:47 +03:00
TheTurtle 66e96dd944
video_core: Account of runtime state changes when compiling shaders ()
* video_core: Compile shader permutations

* spirv: Only specific storage image format for atomics

* ir: Avoid cube coord patching for storage image

* spirv: Fix default attributes

* data_share: Add more instructions

* video_core: Query storage flag with runtime state

* kernel: Use std::list for semaphore

* video_core: Use texture buffers for untyped format load/store

* buffer_cache: Limit view usage

* vk_pipeline_cache: Fix invalid iterator

* image_view: Reduce log spam when alpha=1 in storage swizzle

* video_core: More features and proper spirv feature detection

* video_core: Attempt no2 for specialization

* spirv: Remove conflict

* vk_shader_cache: Small cleanup
2024-08-29 19:29:54 +03:00
0xsegf4ult 9f4e55a8e7
shader_recompiler: constant propagation bitwise operations + S_CMPK_EQ_U32 fix ()
* rebase on main branch impl of V_LSHL_B64

* remove V_LSHR_B64

* fix S_CMPK_EQ_u32

* fix conflicts

* fix broken merge

* remove duplicate cases

* remove duplicate declaration
2024-08-28 13:10:21 +03:00
Lizardy aae6e5be73
shader_recompiler: BUFFER_ATOMIC_SWAP Opcode ()
* shader_recompiler: BUFFER_ATOMIC_SWAP Opcode

* clang

* follow 32 convention

---------

Co-authored-by: microsoftv <6063922+microsoftv@users.noreply.github.com>
2024-08-26 15:21:20 +03:00
TheTurtle c79b10edc1
video_core: Bloodborne stabilization pt1 ()
* shader_recompiler: Writelane elimination pass + null image fix

* spirv: Implement image derivatives

* texture_cache: Reduce page bit size

* clang format

* slot_vector: Back to debug assert

* vk_graphics_pipeline: Handle null tsharp

* spirv: Revert some change

* vk_instance: Support primitive restart on list topology

* page_manager: Adjust windows exception handler

* clang format

* Remove subres tracking

* Will be done separately
2024-08-24 22:51:47 +03:00
Lizardy 63938ba8dd
shader_recompiler: BUFFER_ATOMIC & DS_* Opcodes ()
* BUFFER_ATOMIC | DS_MINMAX_U32

- Emission of BufferAtomicU32
- Addition of Buffer opcodes to IR
- Translator for BUFFER_ATOMIC Opcode
- Translators for DS_MAXMIN_U32 Opcodes

* Clang Format & UNREACHABLE_MSG

* clang

* no crash on compile

* clang

* Shared Atomics

* reuse

* rm vscode

* resolve

* opcodes

* side effects

* attempt fix shader comp

* failed attempt to fix

* clang

* do correct vdata set (still fails)

* clang

* fixed BUFFER_ATOMIC_ADD, DS_ADD_U32 fails

* data share should work

* clang

* resource tracking for buffer atomic

* clang

* distinguish RTN opcodes

* clean IsBufferInstruction

---------

Co-authored-by: microsoftv <6063922+microsoftv@users.noreply.github.com>
2024-08-17 22:06:06 +03:00
psucien 9adc638220
shader_recompiler: basic implementation of BUFFER_STORE_FORMAT_ ()
* shader_recompiler: basic implementation of buffer store w\ fmt conversion

* added `Format16` dfmt
2024-08-15 00:15:07 +02:00
TheTurtle d332a5e611
spirv: Simplify shared memory handling ()
* spirv: Simplify shared memory handling

* spirv: Ignore clip plane

* spirv: Fix image offsets

* ir_pass: Implement shared memory lowering pass

* NVIDIA doesn't like using shared mem in fragment shader and softlocks driver

* spirv: Add log for ignoring pos1
2024-08-14 19:01:17 +03:00
Vinicius Rangel dfcfd62d4f
spirv: fix image sample lod/clamp/offset translation ()
* spirv: fix image sample lod/clamp translation

* spirv: fix image sample offsets

* fix ImageSample opcodes & offset emission
2024-08-13 09:12:38 +03:00
TheTurtle a7c9bfa5c5
shader_recompiler: Small instruction parsing refactor/bugfixes ()
* translator: Implemtn f32 to f16 convert

* shader_recompiler: Add bit instructions

* shader_recompiler: More data share instructions

* shader_recompiler: Remove exec contexts, fix S_MOV_B64

* shader_recompiler: Split instruction parsing into categories

* shader_recompiler: Better BFS search

* shader_recompiler: Constant propagation pass for cmp_class_f32

* shader_recompiler: Partial readfirstlane implementation

* shader_recompiler: Stub readlane/writelane only for non-compute

* hack: Fix swizzle on RDR

* Will properly fix this when merging this

* clang format

* address_space: Bump user area size to full

* shader_recompiler: V_INTERP_MOV_F32

* Should work the same as spirv will emit flat decoration on demand

* kernel: Add MAP_OP_MAP_FLEXIBLE

* image_view: Attempt to apply storage swizzle on format

* vk_scheduler: Barrier attachments on renderpass end

* clang format

* liverpool: cs state backup

* shader_recompiler: More instructions and formats

* vector_alu: Proper V_MBCNT_U32_B32

* shader_recompiler: Port some dark souls things

* file_system: Implement sceKernelRename

* more formats

* clang format

* resource_tracking_pass: Back to assert

* translate: Tracedata

* kernel: Remove tracy lock

* Solves random crashes in Dark Souls

* code: Review comments
2024-07-30 23:32:40 +02:00
Vinicius Rangel 680192a0c4
64 bits OP, impl V_ADDC_U32 & V_MAD_U64_U32 ()
* impl V_ADDC_U32 & V_MAD_U64_U32

* shader recompiler: add 64 bits version to get register / GetSrc

* fix V_ADDC_U32 carry

* shader recompiler: removed automatic conversion to force_flt in GetSRc

* shader recompiler: auto cast between u32 and u64 during ssa pass

* shader recompiler: fix SetVectorReg64 & standardize switches-case

* shader translate: fix overflow detection in V_ADD_I32

use vcc lo instead of vcc thread bit

* shader recompiler: more 64-bit work

- removed bit_size parameter from Get[Scalar/Vector]Register
- add BitwiseOr64
- add SetDst64 as a replacement for SetScalarReg64 & SetVectorReg64
- add GetSrc64 for 64-bit value

* shader recompiler: add V_MAD_U64_U32 vcc output

- add V_MAD_U64_U32 vcc output
- ILessThan for 64-bits

* shader recompiler: removed unnecessary changes & missing consts

* shader_recompiler: Add s64 type in constant propagation
2024-07-27 17:23:59 +03:00
Vladislav Mikhalin f9e96793cc
Implemented load_buffer_format_* conversions ()
* Implemented load_buffer_format_* conversions

* clang-format insists on ugly things
2024-07-16 15:03:07 +03:00
psucien f041276b04 recompiler: added support for discard on export with masked EXEC 2024-07-13 14:57:01 +02:00
Stolas 2620919f0b
Added Legacy Min/Max ops ()
* Forwarding V_MAX_LEGACY_F32 to V_MAX3_F32. Fixes Translation error in Geometry Wars 3.

* Forwarded to correct op

* Implemented Legacy Max/Min using NMax/NMin

* Added extra argument to Min/Max op codes

* Removed extra translator functions, replaced with bool

* Formatting
2024-07-08 12:24:12 +03:00
TheTurtle 6ceab6dfac
shader_recompiler: Implement most integer image atomics, workgroup barriers and shared memory load/store ()
* shader_recompiler: Add LDEXP

* shader_recompiler: Add most image integer atomic ops

* shader_recompiler: Implement shared memory load/store

* shader_recompiler: More image atomics

* externals: Update sirit

* clang format

* cmake: Add missing files

* shader_recompiler: Fix some atomic bugs

* shader_recompiler: Vs outputs

* shader_recompiler: Shared mem has side-effects, fix format component order

* shader_recompiler: Inline constant buffer impl

* video_core: Fix regressions

* Work

* Fixup a few things
2024-07-05 00:15:44 +03:00
IndecisiveTurtle a603bc7d88 shader_recompiler: More instructions 2024-07-01 22:42:45 +03:00
psucien 54f8616d6a shader_recompiler: added MUL_HI VOP2 (896) 2024-06-16 20:39:53 +02:00
TheTurtle 7b1a317b09
video_core: Preliminary storage image support and more ()
* vk_rasterizer: Clear depth buffer when DB_RENDER_CONTROL says so

* video_core: Preliminary storage image support, more opcodes

* renderer_vulkan: a fix for vertex buffers merging

* renderer_vulkan: a heuristic for blend override when alpha out is masked

---------

Co-authored-by: psucien <bad_cast@protonmail.com>
2024-06-10 22:35:14 +03:00
raphaelthegreat ae7e6dafd5 shader_recompiler: Add more instructions and fix a few thinhs 2024-06-05 22:22:34 +03:00
raphaelthegreat 02a50265f8 shader_recompiler: Better branch detection + more opcodes 2024-06-02 03:05:40 +03:00
raphaelthegreat 58de7ff55a video_core: Implement basic compute shaders and more instructions 2024-05-30 01:39:24 +03:00
raphaelthegreat d59b102b6f video_core: Add image support 2024-05-27 18:25:45 +03:00
TheTurtle 8dfa5782b2
video_core: Add constant buffer support () 2024-05-26 15:51:35 +03:00
TheTurtle 3c90b8ac00
video_core: Bringup some basic functionality ()
* video_core: Remove hack in rasterizer

* The hack was to skip the first draw as the display buffer had not been created yet and the texture cache couldn't create one itself. With this patch it now can, using the color buffer parameters from registers

* shader_recompiler: Implement attribute loads/stores

* video_core: Add basic vertex, index buffer handling and pipeline caching

* externals: Make xxhash lowercase
2024-05-25 15:33:15 +03:00
TheTurtle 8730968385
video: Import new shader recompiler + display a triangle () 2024-05-22 01:35:12 +03:00