Commit graph

1225 commits

Author SHA1 Message Date
James Rowe cf9bfe0690
Merge pull request #3787 from wwylele/shader-jit-state
shader/jit: preserve integer & condition register across invocation
2018-06-09 18:38:05 -06:00
James Rowe 2dac1a9590
Merge pull request #3788 from wwylele/shader-jit-breakc
shader/jit: implement breakc
2018-06-09 18:36:46 -06:00
N00byKing 523c52c708 renderer_opengl: Add Universal 3D Layout Adaption 2018-06-01 18:24:26 +02:00
jmorriz124 8c0ede544f 3dtv botenable improved (#1)
* Fixed crash when right eye isn't available

* Enabled swap screens in stereo views.  Fixed window alignment in stereo
views to handle all screen aspect ratios.

* Minor code cleanup and clang fomat updates.

* Minor cleanup of swapped and aspect ratio code
2018-06-01 17:05:29 +02:00
N00byKing 2814bbc3da renderer_opengl: Allow usage of Stereoscopic 3D 2018-06-01 17:01:06 +02:00
wwylele 781912e854 gl_rasterize: implement shadow mapping using image load/store 2018-06-01 14:26:44 +03:00
Weiyi Wang 08b119153d
Merge pull request #3799 from wwylele/sigh
gl_rasterizer: reset texture state context after every draw
2018-06-01 14:24:28 +03:00
wwylele 9060e08e49 shader/jit: implement breakc 2018-06-01 13:04:39 +03:00
wwylele f0ee4c0595 gl_rasterizer: reset texture state context after every draw 2018-06-01 12:05:30 +03:00
James Rowe 7715fd2c19
Merge pull request #3750 from wwylele/cube-watcher-fix
gl_rasterizer_cache: add missing watcher invalidation
2018-05-31 23:11:39 -06:00
James Rowe f7f5a54bc3
Merge pull request #3751 from wwylele/shader-warning-shutup
gl_shader_gen: rearrange function definition to avoid suprious warnings
2018-05-31 23:10:42 -06:00
James Rowe e63c374ff0
Merge pull request #3714 from wwylele/primitive-restart-guard
video_core/command_processor: correctly handles 0xFFFF index as a normal index
2018-05-29 23:22:00 -06:00
Markus Wick caba02d42a gl_rasterizer: Don't flip the texture bindings.
The state object isn't used anywhere else, so there
is no need to revert the state.
And the comment is just wrong: It doesn't matter
which textures are bound on framebuffer binding, it
only matters at draw time. And we reset all bindings
before the draw call. So let's use gl_state as it is
designed to avoid flipping states.
2018-05-28 21:04:59 +02:00
wwylele 874cb42e70 shader/jit: preserve integer & condition register across invocation 2018-05-28 14:41:47 +03:00
wwylele 92a1252835 gl_shader_gen: rearrange function definition to avoid suprious warnings 2018-05-19 00:36:33 +03:00
wwylele 8b4e832c5f gl_rasterizer_cache: add missing watcher invalidation 2018-05-18 23:58:43 +03:00
Markus Wick 8e1e52cad9 gl_rasterizer_cache: Use clean state for glBlitFramebuffer.
Framebuffer blits depends on pixel tests:
Ownership (is fine)
Scissor (is broken on the last commit)
Masking (is broken on master for a while)

So let's be honest and start with a clean state in
those helper functions.
2018-05-18 21:13:56 +02:00
Markus Wick 301073334a gl_rasterizer: Remove redundant scissor state change.
There is no need to disable this state after the draw call,
gl_state will handle this for us if needed. This kind of
redundant state changes are bad for the driver overhead,
as flipping bits will invalidate the driver state.
2018-05-18 21:13:56 +02:00
wwylele 129b893509 gl_stream_buffer: update the information about the AMD hack 2018-05-18 14:08:12 +03:00
wwylele dd6252a676 gl_rasterizer: fallback to software shader path if buffer overflow happens on hardware shader path 2018-05-18 13:55:19 +03:00
wwylele 6985b13439 [HACK] AMD workaround 2018-05-14 10:17:36 +03:00
wwylele ede0d15fec video_core/command_processor: attempt accelerate draw in draw trigger 2018-05-14 10:17:36 +03:00
wwylele 9b448a0739 gl_rasterizer: implement AccelerateDrawBatch to emulate PICA shader on hardware 2018-05-14 10:17:36 +03:00
MerryMage 15d14be3cc primitive_assembly: Add getters for internal state 2018-05-14 10:17:35 +03:00
wwylele 06815ec905 video_core: receive hardware shader settings 2018-05-14 10:17:35 +03:00
wwylele 68b0a3e19e regs_pipeline: use proper unsigned type where applicable 2018-05-06 15:57:48 +03:00
Weiyi Wang f85e71c37c
Merge pull request #3715 from wwylele/hardware-vertex-vector
gl_rasterizer: Use GLvec* instead of C arrays
2018-05-06 07:19:06 +03:00
Weiyi Wang 0da3b75c9e
Merge pull request #3700 from wwylele/texcache-watcher
gl_rasterizer_cache: cache texture cube
2018-05-05 16:30:39 +03:00
Markus Wick 5960282303 gl_rasterizer: Use buffer_storage for uniform data.
This replaces the glBufferData logic with the shared stream buffer code.
The new code doesn't need a temporary staging buffer any more, so the
performance should imrpove quite a bit.
2018-05-05 09:22:02 +02:00
MerryMage d6cd1a8712 gl_rasterizer: Use GLvec* instead of C arrays 2018-05-05 04:37:04 +03:00
wwylele 08a38370b0 video_core/command_processor: correctly handles 0xFFFF index as a normal index 2018-05-05 04:24:31 +03:00
Weiyi Wang be5777f3de
Merge pull request #3686 from wwylele/glvtx-shader-gen
gl_shader_gen: generate programmable vs/gs and fixed gs
2018-05-01 21:27:48 +03:00
wwylele 1762ad2dcc gl_rasterizer_cache: cache texture cube 2018-05-01 21:26:43 +03:00
bunnei ed42b4b0d2
Merge pull request #3678 from wwylele/b15-fallback
gl_shader_decompiler: fallback to CPU shader on GS b15 access
2018-04-25 00:03:11 -04:00
wwylele 191b29e402 gl_shader_gen: generate programmable vs/gs and fixed gs 2018-04-24 20:39:10 +03:00
MerryMage 8186820d16 pica_to_gl: Add GLuvec{2,3,4} aliases
To allow for transfer for integers into shaders.
2018-04-23 20:21:24 +03:00
wwylele e56128683c gl_shader_decompiler: fallback to CPU shader on GS b15 access 2018-04-23 12:45:56 +03:00
Markus Wick c4010e3f93 renderer_opengl: Drop GLSync, unused. 2018-04-21 16:12:30 +02:00
Markus Wick 5d1dd205c4 renderer_opengl: Rewrite stream buffer. 2018-04-21 16:12:30 +02:00
wwylele d52ddd0ec4 shader: avoid recomputing hash for the same program 2018-04-17 09:47:59 +03:00
wwylele 3cc460ab34 shader_jit: change passing ShaderSetup to passing uniforms struct into the program
We are going to add private memebers to ShaderSetup, which forbids the usage of offsetof. The JIT program only use the uniform part of the setup, so we can just isolate it.
2018-04-17 09:35:43 +03:00
Weiyi Wang cb36f9fad2
Merge pull request #3645 from wwylele/shader-manager
renderer_opengl: refactor shader & program objects and add shader manager for rasterizer
2018-04-16 16:38:38 +03:00
Weiyi Wang bfd1d963ba
Merge pull request #3638 from ds84182/we-need-more-rounds
Round TEV outputs and the final fragment output in GLSL
2018-04-12 23:32:27 +03:00
Weiyi Wang 9772513141
Merge pull request #3639 from wwylele/texture-cude-fix
gl_rasterizer_cache: exit FillTextureCube when address is invalid
2018-04-12 22:54:14 +03:00
wwylele 8dc75598a4 gl_rasterizer: isolate shader management into its own class 2018-04-11 14:52:37 +03:00
wwylele 36bc92273b gl_shader_gen: accept an option to generate separable shaders 2018-04-11 14:52:37 +03:00
wwylele bdab18d2d9 gl_resource_manager: add OGLPipeline 2018-04-11 14:52:37 +03:00
wwylele 4f9b9c4b80 gl_state: add pipeline state 2018-04-11 14:41:43 +03:00
wwylele 48869c768f gl_resource_manager: separate OGLShader and OGLProgram 2018-04-11 14:41:43 +03:00
wwylele d2ee40dc45 gl_shader_util: separate shader object creation and program object creation 2018-04-11 14:41:43 +03:00
wwylele 4256641da4 gl_rasterizer/lighting: implement shadow attenuation 2018-04-10 20:26:55 +03:00
wwylele b5763cb952 pica/lighting: split FresnelSelector into bitfields
The FresnelSelector was already working like a bitfield, so just make it actual bitfield to reduce redundant code. Also, it is already confirmed that this field also affects shadow on alpha. Given that the only two source that can affect alpha components are both controlled by this field, this field should be renamed to a general alpha switch
2018-04-10 20:25:56 +03:00
wwylele 7e7de7d3ab gl_rasterizer_cache: exit FillTextureCube when address is invalid 2018-04-08 12:34:50 +03:00
Dwayne Slater 234161ba62 Make byteround less expensive (thanks hrydgard!) 2018-04-07 18:26:14 -04:00
Dwayne Slater 734279ff22 Round TEV outputs and the final fragment output in GLSL
Fixes water effect in SM3DL
2018-04-07 16:43:56 -04:00
Weiyi Wang 972db17247
Merge pull request #3497 from wwylele/texture-cube-new
gl_rasterizer: implement TextureCube
2018-04-06 12:41:40 +03:00
Weiyi Wang a9544ca015
Merge pull request #3580 from daniellimws/common-fmt
common: Migrate logging macros
2018-04-06 12:38:08 +03:00
Weiyi Wang e3d25bc6d0
Merge pull request #3567 from wwylele/pica-glsl
renderer_opengl: add PICA->GLSL shader decompiler
2018-04-05 14:39:27 +03:00
Weiyi Wang acb02d300c
Merge pull request #3518 from wwylele/hashable-struct
Common/Hash: abstract HashableStruct from GLShader::PicaShaderConfig
2018-04-05 14:39:12 +03:00
James Rowe 1fecead2ff
Merge pull request #3624 from wwylele/sync-uniform
gl_rasterizer: move shader uniform sync from SetShader() to ctor
2018-04-05 00:30:38 -06:00
wwylele 0d84c5a0b6 gl_rasterizer: move state syncing from ctor to its own function 2018-04-04 17:23:55 +03:00
wwylele c2719feda2 gl_rasterizer: move shader uniform sync from SetShader() to ctor 2018-04-03 09:27:23 +03:00
Valentin Vanelslande c9ab184ec7 pica_to_gl: Migrate logging macros (#3608) 2018-04-02 09:31:28 -06:00
wwylele 9ffd400685 gl_shader_decompiler: add missing headers/rename GetXXX to MoveXXX to reflect that they move the data 2018-04-02 17:34:54 +03:00
wwylele 11c2f11872 gl_shader_decompiler: return error on decompilation failure
Internally these errors are handled by exceptions. Only fallbackable errors (that can be handled by CPU shader emulation) is reported. Completely ill-formed shader is still ASSERTed. Code logic related stuff is DEBUG_ASSERTed
2018-04-02 17:34:54 +03:00
wwylele 4991b15ee5 gl_shader_decompiler: some small fixes
- remove unnecessary ";"
- use std::tie for lexicographical ordering
- simplify loop condition
    The offset always has step +1 on each iteration, so it would just hit one of the two boundary anyway
2018-04-02 17:34:54 +03:00
wwylele f8a292f920 renderer_opengl: add PICA->GLSL shader decompiler 2018-04-02 17:34:54 +03:00
James Rowe 384849232b
Merge pull request #3516 from wwylele/shadow-sw
SwRasterizer: Implement shadow mapping
2018-03-31 23:29:22 -06:00
Lioncash 7d331a469f pica_to_gl: Use std::array where applicable
Removes the need to use the ARRAY_SIZE macro
2018-03-31 00:58:49 -04:00
Tobias bb6251f35f video_core: Remove Unreachable for invalid BlendEquation modes (#3595)
* video_core: Remove Unreachable statement

* Lower log level to ERROR
2018-03-29 17:53:55 -06:00
Lioncash 27a3d44b16 gl_rasterizer: Fix incorrect comparison against src_surface in AccelerateTextureCopy()
This should actually be comparing the validity of the destination
surface.
2018-03-28 21:13:57 -04:00
Daniel Lim Wee Soong 98760336be video_core/shader/shader: Remove include cinttypes 2018-03-28 22:40:16 +08:00
Daniel Lim Wee Soong 968569aa61 Replace format specifiers for all usages of ASSERT_MSG 2018-03-27 23:28:42 +08:00
Weiyi Wang 9e4f670ea9
Merge pull request #3484 from wwylele/highlight-fix
pica/lighting: compute highlight clamp after one-/two-sided diffuse pass
2018-03-18 23:41:27 +02:00
Mat M 79d1bcf5ba
Merge pull request #3506 from MerryMage/mov-gl_resource_manager
gl_resource_manager: Use std::exchange in move assignment operators and constructors
2018-03-17 16:30:58 -04:00
Markus Wick ac92664aa7 OGL: Use stream buffer for vertex data. 2018-03-17 02:02:39 +01:00
Phantom 50598fbbf4 stream buffer 2018-03-17 02:02:39 +01:00
MerryMage e3f9bfd850 gl_resource_manager: Use std::exchange instead of std::swap in move assignment operators and constructors
Move assignment operators and move constructors should ideally leave the object moved from in a state where resources aren't accessable.
2018-03-16 23:47:49 +00:00
wwylele 30cc8c10cd
Common/Hash: abstract HashableStruct from GLShader::PicaShaderConfig 2018-03-14 00:12:40 +02:00
wwylele 9f8ff7b04e swrasterizer: implement shadow map rendering 2018-03-13 13:07:07 +02:00
wwylele ae75d3032f swrasterizer: implement shadow map sampling 2018-03-13 12:56:19 +02:00
wwylele ce2ad7436e swrasterizer/lighting: implement shadow attenuation 2018-03-13 12:56:19 +02:00
wwylele 889d8aaab3 gl_rasterizer/cache: only reallocate cubemap when size/format mismatch 2018-03-11 13:31:29 +02:00
wwylele 15e8664ef7 gl_rasterizer: implement texture cube 2018-03-10 01:15:06 +02:00
wwylele 92c7bb9d20 pica/gl_shader: optimize ternary operator 2018-03-10 01:14:05 +02:00
wwylele 0d6db4a0b3 lighting: compute highlight clamp after one-/two-sided diffuse pass 2018-03-10 01:14:05 +02:00
James Rowe f61141e86a Update the entire application to use the new clang format style 2018-03-09 10:54:43 -07:00
bunnei 3cda637cb1
Merge pull request #3478 from j-selby/libpng-switch
Remove PICA image dumping, burn libpng
2018-03-07 18:03:38 -05:00
Vamsi Krishna 04cc8fb537 Discard Gas mode renders (#3486)
* Discard gas_mode renders

This discards the gas_mode / fog effect from games that use it and allows the games to display without it.  Note that gas mode is still unimplemented and will LOG<CRITICAL>.
This bypasses #3287. (Doesn't fix it)

* fix clang
2018-03-07 18:02:36 -05:00
James 077a519338 Remove unused DUMP_TEXTURES definition 2018-03-07 09:13:24 +11:00
James 9829a84fc6 Remove PICA image dumping/libpng 2018-03-07 09:10:54 +11:00
Weiyi Wang 4befbddc34
Merge pull request #3281 from jroweboy/texcache-pt2
Texture Cache Rework
2018-03-05 11:57:25 +02:00
wwylele c2515ff39d clang-format fix 2018-03-05 11:09:20 +02:00
James Rowe 1d419bac1b Disable accelerated texture copy for Texture surfaces 2018-03-04 22:06:09 -07:00
James Rowe 18456ff9e6 Address Lioncash's comments 2018-02-05 20:31:50 -07:00
Phantom 9e16a3c449 ConvertD24S8toABGR: fix fb attachment 2018-01-31 08:55:39 -07:00
Phantom d813bc5eb5 D24S8 to RGBA8 conversion 2018-01-31 08:55:19 -07:00
Phantom db21154142 GetFramebufferSurfaces: Remove an assert that is no longer correct 2018-01-31 08:54:19 -07:00
James Rowe b002511df0
citra-qt: Add customizable speed limit target (#3353)
citra-qt: Add customizable speed limit target

* Update SDL config for the new frame_limit option
* Made max lag time a function of target speed percent.
* Added a checkbox to enable/disable frame limiter
* UI: Prevent frame_limit from under/overflowing
* UI: Hide target speed percent when frame limiter is off
* Disable frame limit spin box when framelimit isn't enabled
2018-01-25 22:24:40 -07:00
Phantom 88f6521511 AccelerateTextureCopy: Better support for contiguous copy 2018-01-20 18:39:27 -07:00
Yuri Kunde Schlesner d93ee65164 Common: Add convenience function for hashing a struct 2018-01-15 13:43:37 -08:00
Dwayne Slater 41929371dc Optimize AttributeBuffer to OutputVertex conversion (#3283)
Optimize AttributeBuffer to OutputVertex conversion

First I unrolled the inner loop, then I pushed semantics validation
outside of the hotloop.

I also added overflow slots to avoid conditional branches.

Super Mario 3D Land's intro runs at almost full speed when compiled with
Clang, and theres a noticible speed increase in MSVC. GCC hasn't been
tested but I'm confident in its ability to optimize this code.
2018-01-02 15:32:33 -08:00
Phantom 7f1aec8fbb Support for textures smaller than 8*8 2017-12-30 07:42:32 +01:00
Phantom be1d0cee1e Fix viewport to surface rect clamping 2017-12-29 17:07:01 +01:00
Phantom 19672cfee8 CachedSurface: Add microprofile scopes for UploadGLTexture and DownloadGLTexture 2017-12-29 17:01:37 +01:00
Phantom 1591fa8d3d Remove read_framebuffer_handle and draw_framebuffer_handle from CachedSurface 2017-12-29 17:00:09 +01:00
James Rowe 1c4d1d1ace Move trasnfer_framebuffer to a member of RasterCache. Address review comments 2017-12-23 16:10:32 -07:00
James Rowe 10fb9242ae Fix clang format 2017-12-23 16:10:32 -07:00
James Rowe 4e053220a8 When downloading from a surface into gl_buffer, ingore any x/y offsets in rect and use 0,0 as the origin 2017-12-23 16:10:31 -07:00
James Rowe 7e673af527 Remove the correct intervals from the surface when validating 2017-12-23 16:10:31 -07:00
James Rowe ac4c589ab5 Workaround for ICE on gcc5 2017-12-23 16:10:31 -07:00
Phantom 9a6a452857 Fix broken surface validation logic since removal of the reinterpret hack 2017-12-23 16:10:30 -07:00
Phantom f893daa4a2 Perform the same checks on TexCopy params that SW does 2017-12-23 16:10:30 -07:00
James Rowe 91fad7010b Fix compilation on mac and linux 2017-12-23 16:10:30 -07:00
James Rowe 34ff77f5f7 Revert "OpenGL Cache: Ignore format reinterpretation hack"
Testing found a few games that did some crazy things which breaks the
assumptions made in that commit.
2017-12-23 16:10:29 -07:00
James Rowe 72034b772d Minor style changes 2017-12-23 16:10:29 -07:00
James Rowe 0498d34d18 OpenGL Cache: Ignore format reinterpretation hack
Several games such as Smash will cause some regions that are cached on
the gpu to be revalidated, but (seemingly) we can just ignore these
cases. If the data is already found on the gpu in dirty_regions, then we
validate those, and skip flushing that region from cpu.

Its unknown if this breaks any games, but it does speed up many games.
Additionally, it removes outlines in the pokemon games.
2017-12-23 16:10:29 -07:00
James Rowe 5b872c41d8 OpenGL Cache: Reorder methods
The previous commits added the methods where they were located
originally to try to get an easy to read diff between changes. This
commit fixes compliation since the static methods are now declared
before they are used.
2017-12-23 16:10:28 -07:00
James Rowe 24e187891f OpenGL Rasterizer: Update to use the new cache 2017-12-23 16:10:28 -07:00
James Rowe e5adb6a26b OpenGL Cache: Add the rest of the Cache methods
Fills in the rasterizer cache methods using the helper methods added in
the previous commits.
2017-12-23 16:10:27 -07:00
James Rowe 81ea32d1e0 OpenGL Cache: Refactor Surface Cache interface
Changes the public interface of the surface cache to make it easier to
use. Reintroduces the cached page count cached pages that was removed in
an earlier commit.
2017-12-23 16:10:27 -07:00
James Rowe 3e1cbb7d14 OpenGL Cache: Split CachedSurface
Breaks CachedSurface into two classes, the parameters used to create or
find a cached surface, and the actual cached surface. This also adds a
few helper methods for getting surfaces from cache
2017-12-23 16:10:27 -07:00
James Rowe 0b98b768f5 OpenGL Cache: Add surface utility functions
Separates creating and filling surfaces into static functions that
can be reused from the different RasterizerCache methods.
2017-12-23 16:10:26 -07:00
James Rowe e9e2d444ef OpenGL Cache: Optimize Morton Copy to copy in tiles
Compiles two lookup arrays of functions for the different
configurations of Morton Copy.
2017-12-23 16:10:26 -07:00
James Rowe 160ac25527 OpenGL State: Change setters so they don't directly write to curstate 2017-12-23 16:10:25 -07:00
James Rowe 13606a6d0b Memory: Remove count of cached pages and add InvalidateRegion
In a future commit, the count of cached pages will be reintroduced in
the actual surface cache. Also adds an Invalidate only to the cache
which marks a region as invalid in order to try to avoid a costly flush
from 3ds memory
2017-12-23 16:10:25 -07:00
James Rowe c821c14908 Settings: Change resolution scaling to an integer instead of a float 2017-12-23 16:10:25 -07:00
Subv 3652809408 HLE: Convert GSP_GPU to ServiceFramework.
The only functional change is the error handling of GSP_GPU::ReadHWRegs function. We previously didn't return error codes (not even for success). The new returns were found by reverse engineering the GSP module.
2017-12-21 10:30:22 -05:00
Tillmann Karras fd3ec6be30 video_core: fix infinity and NaN conversions 2017-12-14 19:51:58 +00:00
Yuri Kunde Schlesner aecd2b85fe
Merge pull request #3261 from MerryMage/DPH
shader_jit_x64_compiler: Use haddps for horizontal summation
2017-12-13 09:09:42 -05:00
bunnei 4695f12a08
Merge pull request #3264 from lioncash/cmake-target
CMakeLists: Derive the source directory grouping from targets themselves
2017-12-12 14:34:51 -05:00
MerryMage 6c199e4699 fixup! shader_jit_x64_compiler: Use haddps for horizontal summation 2017-12-12 15:37:00 +00:00
Lioncash ab021d163e CMakeLists: Derive the source directory grouping from targets themselves
Removes the need to store to separate SRC and HEADER variables,
and then construct the target in most cases.
2017-12-11 21:11:52 -05:00
Yuri Kunde Schlesner ae7240a2cb
Merge pull request #3097 from ds84182/round-primary-color-swrast
Round primary color in swrast
2017-12-11 20:06:21 -05:00
MerryMage efec8fe513 shader_jit_x64_compiler: Use haddps for horizontal summation 2017-12-10 22:04:30 +00:00
Yuri Kunde Schlesner 230a7557f1 Shader: Store AttributeBuffers in GS output buffer
This also does the output masking early at EMIT time, instead of when a
triangle is sent to the vertex handler.
2017-12-09 20:33:59 -08:00
Yuri Kunde Schlesner 0184419814 Shader: Refactor output_mask copy loop to function 2017-12-09 20:31:24 -08:00
Tillmann Karras 1c2750d5bd video_core: optimize NaN check 2017-12-05 22:34:22 +00:00
MerryMage c1aef260af shader_jit_x64_compiler: Remove ABI overhead of LG2 and EX2
This involves reimplementing log2f and exp2f.
2017-11-30 18:17:35 +00:00
MerryMage 235a251d3c tests: Add tests for x64 shader jit
Tests LG2 and EX2 instructions
2017-11-30 18:17:35 +00:00
Dwayne Slater fcc141a327 Maintain the PICA's 8 bits of color precision when using the interpolated primary color
This matches the software renderer by using round.
The actual hardware rounds the results up instead of flooring.
2017-11-29 16:49:04 -05:00
Dwayne Slater 350082ab75 Fix logic ops not being enabled in the OpenGL renderer 2017-11-29 16:30:19 -05:00
Dwayne Slater dc48deaecc Round primary color inputs in software rasterizer
OpenGL version coming soon.
2017-11-29 16:30:18 -05:00
James Rowe 9d9693c13d Revert "Extracted the attribute setup and draw commands into their own functions"
This reverts commit b3b34a1e76. This
commit causes a performance regression for not enough benefits
2017-11-16 11:46:17 -07:00
wwylele 47c0c87c47 video_core: clean format warnings 2017-11-01 12:35:32 +02:00
Dragios 3e26b0dee5 swrasterizer folder minor edit 2017-10-27 09:44:45 +08:00
Dragios 9b3eb69973 Utilize vector function instead 2017-10-26 23:50:20 +08:00
Dragios 84054b7cd8 Get rid of narrowing conversion warning 2017-10-24 00:02:46 +08:00
Dragios 520929dd6d Fix typo for -Wunused-local-typedefs 2017-10-22 15:56:50 +08:00
Huw Pascoe b3b34a1e76 Extracted the attribute setup and draw commands into their own functions 2017-10-04 01:08:29 +01:00
Huw Pascoe a13ab958cb Fixed type conversion ambiguity 2017-09-30 09:34:35 +01:00
Subv a321bce378 Disable unary operator- on Math::Vec2/Vec3/Vec4 for unsigned types.
It is unlikely we will ever use this without first doing a Cast to a signed type.
Fixes 9 "unary minus operator applied to unsigned type, result still unsigned" warnings on MSVC2017.3
2017-09-27 09:06:41 -05:00
B3n30 dc6a365337 Merge pull request #2951 from huwpascoe/perf-4
Optimized Morton
2017-09-25 08:28:55 +02:00
Huw Pascoe 903906da3b Optimized Float<M,E> multiplication
Before:

ucomiss xmm1, xmm1
jp      .L9
pxor    xmm2, xmm2
mov     edx, 1
ucomiss xmm0, xmm2
setp    al
cmovne  eax, edx
test    al, al
jne     .L9
.L3:
movaps  xmm0, xmm2
ret
.L9:
ucomiss xmm0, xmm0
jp      .L10
pxor    xmm2, xmm2
mov     edx, 1
ucomiss xmm1, xmm2
setp    al
cmovne  eax, edx
test    al, al
je      .L3

After:

movaps  xmm2, xmm1
mulss   xmm2, xmm0
ucomiss xmm2, xmm2
jnp     .L3
ucomiss xmm1, xmm0
jnp     .L11
.L3:
movaps  xmm0, xmm2
ret
.L11:
pxor    xmm2, xmm2
jmp     .L3
2017-09-25 00:54:02 +01:00
Huw Pascoe 876aa82c29 Optimized Morton 2017-09-24 22:27:14 +01:00
James Rowe 93930a966f Merge pull request #2921 from jroweboy/batch-fix-2
GPU: Add draw for immediate and batch modes
2017-09-24 07:57:16 -06:00
James Rowe 19d41dcc6e Remove pipeline.gpu_mode and fix minor issues 2017-09-23 09:28:20 -06:00
Yuri Kunde Schlesner a7758b0b36 Merge pull request #2928 from huwpascoe/master
Fixed framebuffer warning
2017-09-22 04:06:38 +02:00
Huw Pascoe a234e4c200 Improved performance of FromAttributeBuffer
Ternary operator is optimized by the compiler
whereas std::min() is meant to return a value.

I've noticed a 5%-10% emulation speed increase.
2017-09-17 15:56:36 +01:00
Huw Pascoe 6a110ac5f5 Fixed framebuffer warning 2017-09-17 11:57:06 +01:00
Yuri Kunde Schlesner 699c920991 Merge pull request #2900 from wwylele/clip-2
PICA: implement custom clip plane
2017-09-16 10:23:00 +02:00
James Rowe ad0b57f407 GPU: Add draw for immediate and batch modes
PR #1461 introduced a regression where some games would change configuration
even while in the poorly named "drawing" mode, which broke the heuristic
citra was using to determine when to draw the batch. This change adds
back in a draw call for batching, and also adds in a draw call in
immediate mode each time it adds a triangle.
2017-09-11 09:21:43 -06:00
bunnei 11baa40d75 Merge pull request #2865 from wwylele/gs++
PICA: implemented geometry shader
2017-09-07 23:02:59 -04:00
bunnei ff4941fb3a Merge pull request #2914 from wwylele/fresnel-fix
pica/lighting: only apply Fresnel factor for the last light
2017-09-05 10:00:49 -04:00
wwylele 12fbc8c8df pica/lighting: only apply Fresnel factor for the last light 2017-09-03 08:22:03 +03:00
wwylele e2c41a5891 video_core: report telemetry for gas mode 2017-08-31 12:54:17 +03:00
bunnei f0e461bf6f Merge pull request #2891 from wwylele/sw-bump
SwRasterizer/Lighting: implement bump mapping
2017-08-30 21:07:30 -04:00
Weiyi Wang 647f017c6d Merge pull request #2892 from Subv/warnings2
Warnings: Fixed a few missing-return warnings in video_core.
2017-08-28 03:21:51 -05:00
Subv da88f3b8f0 Warnings: Fixed a few missing-return warnings in video_core. 2017-08-26 11:58:22 -05:00
wwylele 417cb45e3f SwRasterizer/Clipper: flip the sign convention to match PICA and OpenGL 2017-08-25 07:26:45 +03:00
wwylele addbcd5784 gl_rasterizer: implement custom clip plane 2017-08-25 07:26:45 +03:00
wwylele ea51a3af26 SwRasterizer: implement custom clip plane 2017-08-24 15:34:27 +03:00
wwylele 17c6104d2a gl_rasterizer/lighting: more accurate CP formula 2017-08-22 09:34:44 +03:00
wwylele b5aa570354 SwRasterizer/Lighting: implement LUT input CP 2017-08-22 09:34:44 +03:00
wwylele 3e478ca131 SwRasterizer/Lighting: implement bump mapping 2017-08-22 09:34:44 +03:00
wwylele 63b6e802cd swrasterizer: remove invalid TODO
This function is called in clipping, before the pespective divide, and is not used in later rasterization. Thus it doesn't need perspective correction.
2017-08-21 08:03:07 +03:00
wwylele 72b26ac32f swrasterizer/clipper: remove tested TODO
hwtested. Current implementation is the correct behavior
2017-08-21 08:03:07 +03:00
wwylele 5a4af616c6 gl_shader_gen: simplify and clarify the depth transformation between vertex shader and fragment shader 2017-08-21 08:03:07 +03:00
wwylele 1eca380886 gl_rasterizer: add clipping plane z<=0 defined in PICA 2017-08-21 08:03:07 +03:00
Yuri Kunde Schlesner 46d1ca768d Merge pull request #2872 from wwylele/sw-geo-factor
SwRasterizer/Lighting: implement geometric factor
2017-08-20 17:49:42 -07:00
James Rowe 8afa81ac1b Merge pull request #2871 from wwylele/sw-spotlight
SwRasterizer/Lighting: implement spot light
2017-08-19 20:10:24 -06:00
wwylele 0f35755572 pica/command_processor: build geometry pipeline and run geometry shader
The geometry pipeline manages data transfer between VS, GS and primitive assembler. It has known four modes:
 - no GS mode: sends VS output directly to the primitive assembler (what citra currently does)
 - GS mode 0: sends VS output to GS input registers, and sends GS output to primitive assembler
 - GS mode 1: sends VS output to GS uniform registers, and sends GS output to primitive assembler. It also takes an index from the index buffer at the beginning of each primitive for determine the primitive size.
 - GS mode 2: similar to mode 1, but doesn't take the index and uses a fixed primitive size.
hwtest shows that immediate mode also supports GS (at least for mode 0), so the geometry pipeline gets refactored into its own class for supporting both drawing mode.
In the immediate mode, some games don't set the pipeline registers to a valid value until the first attribute input, so a geometry pipeline reset flag is set in `pipeline.vs_default_attributes_setup.index` trigger, and the actual pipeline reconfigure is triggered in the first attribute input.
In the normal drawing mode with index buffer, the vertex cache is a little bit modified to support the geometry pipeline. Instead of OutputVertex, it now holds AttributeBuffer, which is the input to the geometry pipeline. The AttributeBuffer->OutputVertex conversion is done inside the pipeline vertex handler. The actual hardware vertex cache is believed to be implemented in a similar way (because this is the only way that makes sense).
Both geometry pipeline and GS unit rely on states preservation across drawing call, so they are put into the global state. In the future, the other three vertex shader units should be also placed in the global state, and a scheduler should be implemented on top of the four units. Note that the current gs_unit already allows running VS on it in the future.
2017-08-19 10:13:20 +03:00
wwylele 8285ca4ad8 pica/shader/jit: implement SETEMIT and EMIT 2017-08-19 10:13:20 +03:00
wwylele 36981a5aa6 pica/primitive_assembly: Handle winding for GS primitive
hwtest shows that, although GS always emit a group of three vertices as one primitive, it still respects to the topology type, as if the three vertices are input into the primitive assembler independently and sequentially. It is also shown that the winding flag in SETEMIT only takes effect for Shader topology type, which is believed to be the actual difference between List and Shader (hence removed the TODO). However, only Shader topology type is observed in official games when GS is in use, so the other mode seems to be just unintended usage.
2017-08-19 10:13:20 +03:00
wwylele bb63ae3052 correct constness 2017-08-19 10:13:20 +03:00
wwylele 28128348f2 pica/shader/interpreter: implement SETEMIT and EMIT 2017-08-19 10:13:20 +03:00
wwylele 46c6973d2b pica/shader: extend UnitState for GS
Among four shader units in pica, a special unit can be configured to run both VS and GS program. GSUnitState represents this unit, which extends UnitState (which represents the other three normal units) with extra state for primitive emitting. It uses lots of raw pointers to represent internal structure in order to keep it standard layout type for JIT to access.
This unit doesn't handle triangle winding (inverting) itself; instead, it calls a WindingSetter handler. This will be explained in the following commits
2017-08-19 10:13:20 +03:00
wwylele 686fb3e78c gl_shader_gen: don't call SampleTexture when bump map is not used 2017-08-11 18:35:00 +03:00
wwylele 945f9a1b04 SwRasterizer/Lighting: implement spot light 2017-08-11 01:19:10 +03:00
wwylele 14ee32c46a SwRasterizer/Lighting: implement geometric factor 2017-08-11 01:18:43 +03:00
wwylele 5d9d42f0d0 SwRasterizer/Lighting: use make_tuple instead of constructor
implicit tuple constructor is a c++17 thing, which is not supported by some not-so-old libraries. Play safe for now
2017-08-10 12:19:58 +03:00
wwylele db309b2423 pica/regs: layout geometry shader configuration regs
All the register meanings are derived from ctrulib (3dbrew is outdated for most of them)
2017-08-10 01:53:08 +03:00
Weiyi Wang 792dee47a7 Merge pull request #2822 from wwylele/sw_lighting-2
Implement fragment lighting in the sw renderer (take 2)
2017-08-09 18:54:29 +03:00
wwylele baa24f4ea9 pica: upload shared shader code to both unit 2017-08-07 10:30:05 +03:00
wwylele 2252a63f80 SwRasterizer/Lighting: shorten file name 2017-08-03 13:51:22 +03:00
wwylele eda28266fb SwRasterizer/Lighting: move to its own file 2017-08-02 22:20:40 +03:00
wwylele 48b4105871 SwRasterizer/Lighting: reduce confusion 2017-08-02 22:07:15 +03:00
wwylele c59ed47608 SwRasterizer/Lighting: move quaternion normalization to the caller 2017-08-02 22:05:53 +03:00
wwylele c89f804a01 pica/shader_interpreter: fix off-by-one in LOOP 2017-07-27 13:48:27 +03:00
Sebastian Valle c6a2e519ef Merge pull request #2816 from wwylele/proctex-lutlutlut
gl_rasterizer: use texture buffer for proctex LUT
2017-07-22 23:03:48 -05:00