vid_renderer

Truecolor ZDoom with extra features and some unofficial/beta GZDoom features.

Moderators: Rachael, dpJudas

dpJudas
Developer
Posts: 798
Joined: Sat Jul 23, 2016 7:53

Re: vid_renderer

Post by dpJudas »

Well, the performance of the 4col sprite drawers only really matters if a sprite is very close to the camera - otherwise it doesn't cover enough of the screen to make a difference. The wall and span drawers do most of the rendering in the majority of cases.

For more complex maps, e.g. kdizd, the performance problems are often due to the BSP part of the renderer not feeding the drawers with stuff to draw fast enough. GZDoom has that problem too: it is often CPU bound.
Rachael
Developer
Posts: 3640
Joined: Sat May 13, 2006 10:30

Re: vid_renderer

Post by Rachael »

I've a feeling you're gonna hate me soon. :)
Screenshot_Doom_20161013_090808.png (260.45 KiB) Viewed 4390 times
Screenshot_Doom_20161013_090800.png (259.86 KiB) Viewed 4390 times
r_columnmethod == false == 1col method. >_>
dpJudas
Developer
Posts: 798
Joined: Sat Jul 23, 2016 7:53

Re: vid_renderer

Post by dpJudas »

Lol at your lineup of former humans. :) I must admit I'm pretty surprised to see the 1col drawer not only match, but even clearly beat the optimized 4col version.

Bit hard to argue against evidence like that, although I'll still insist that, theoretically, 8-byte or 16-byte reads/writes should be faster. SSE registers are 16 bytes wide, which 4col matches perfectly (a cache line on modern Intel CPUs is 64 bytes). Maybe the setup costs, and rendering to a temp buffer first, are what make the 4col version lose out.
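
To make the 1col/4col comparison concrete, here is a minimal C++ sketch. It is not the actual QZDoom drawer code. It assumes 32-bit truecolor pixels, so four adjacent pixels are 16 bytes, and `Canvas`, `DrawColumn1` and `DrawColumns4` are hypothetical names made up for illustration. The 4col path stages its columns in a temp buffer first, which is the extra setup cost speculated about above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical canvas type for illustration; not taken from the QZDoom source.
struct Canvas {
    int width, height;
    std::vector<uint32_t> pixels; // 32-bit truecolor, row-major
};

// 1col: write one texture column straight into the canvas, one pixel per row.
void DrawColumn1(Canvas& c, int x, int y0, int count, const uint32_t* texcol) {
    assert(count >= 0 && x >= 0 && x < c.width && y0 >= 0 && y0 + count <= c.height);
    uint32_t* dest = c.pixels.data() + static_cast<size_t>(y0) * c.width + x;
    for (int i = 0; i < count; ++i)
        dest[static_cast<size_t>(i) * c.width] = texcol[i]; // 4-byte write per row
}

// 4col: stage four adjacent columns in a temp buffer (transposed into rows),
// then emit each row as a single 16-byte write (4 pixels x 4 bytes).
void DrawColumns4(Canvas& c, int x, int y0, int count, const uint32_t* const texcols[4]) {
    assert(count >= 0 && x >= 0 && x + 4 <= c.width && y0 >= 0 && y0 + count <= c.height);
    std::vector<uint32_t> temp(static_cast<size_t>(count) * 4);
    for (int col = 0; col < 4; ++col)
        for (int i = 0; i < count; ++i)
            temp[static_cast<size_t>(i) * 4 + col] = texcols[col][i];

    uint32_t* dest = c.pixels.data() + static_cast<size_t>(y0) * c.width + x;
    for (int i = 0; i < count; ++i)
        std::memcpy(dest + static_cast<size_t>(i) * c.width,
                    &temp[static_cast<size_t>(i) * 4], 16);
}
```

Both paths produce the same pixels. The difference is only in how many writes hit the canvas and how much staging work happens beforehand, so which one wins is something to measure rather than assume.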
Rachael
Developer
Posts: 3640
Joined: Sat May 13, 2006 10:30

Re: vid_renderer

Post by Rachael »

I won't argue against that. And if I had to guess, you're probably right about the setup part. Still, keep in mind that regardless of the drawer used, you're still threading them out. The results may well be different on an AMD CPU, since it lacks the hyperthreading that Intel has and handles cache and memory accesses differently.

EDIT: Just tried on an AMD CPU. The results are 61 FPS with 4col and 65 FPS with 1col, doing a similar test: spawning zombiemen until they more or less cover the screen.
dpJudas
Developer
Posts: 798
Joined: Sat Jul 23, 2016 7:53

Re: vid_renderer

Post by dpJudas »

I don't know enough about CPU architecture myself to say for sure. It might also be because I told LLVM to partially unroll the inner drawer loop, giving the 1col version the ability to grab several rows in one go and do the math with SSE registers. I suck at this, so I'm betting LLVM is better at it than I am. :D
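
As a rough illustration of what partial unrolling buys the 1col drawer, here is a hand-written C++ sketch of the same idea. It is not the LLVM-generated QZDoom code. `Shade`, `ShadeColumn` and `ShadeColumnUnrolled4` are hypothetical names, and the simple per-channel light multiply is an assumed stand-in for the real drawer math. With four rows handled per iteration, the four shading computations are independent of each other, which is what lets a compiler do them together in SSE registers.

```cpp
#include <cstddef>
#include <cstdint>

// Scale the R, G and B channels of a 0xAARRGGBB pixel by light/256
// (light in 0..256). Alpha is passed through unchanged.
inline uint32_t Shade(uint32_t p, uint32_t light) {
    uint32_t r = (((p >> 16) & 0xffu) * light) >> 8;
    uint32_t g = (((p >> 8) & 0xffu) * light) >> 8;
    uint32_t b = ((p & 0xffu) * light) >> 8;
    return (p & 0xff000000u) | (r << 16) | (g << 8) | b;
}

// Plain 1col loop: one row per iteration. pitch is the canvas width in pixels.
void ShadeColumn(uint32_t* dest, size_t pitch, const uint32_t* src,
                 size_t count, uint32_t light) {
    for (size_t i = 0; i < count; ++i)
        dest[i * pitch] = Shade(src[i], light);
}

// Partially unrolled by 4: grab four rows, shade them, then store them.
void ShadeColumnUnrolled4(uint32_t* dest, size_t pitch, const uint32_t* src,
                          size_t count, uint32_t light) {
    size_t i = 0;
    for (; i + 4 <= count; i += 4) {
        // Four independent computations; no result depends on another.
        uint32_t p0 = Shade(src[i + 0], light);
        uint32_t p1 = Shade(src[i + 1], light);
        uint32_t p2 = Shade(src[i + 2], light);
        uint32_t p3 = Shade(src[i + 3], light);
        dest[(i + 0) * pitch] = p0;
        dest[(i + 1) * pitch] = p1;
        dest[(i + 2) * pitch] = p2;
        dest[(i + 3) * pitch] = p3;
    }
    for (; i < count; ++i) // leftover rows when count is not a multiple of 4
        dest[i * pitch] = Shade(src[i], light);
}
```

Whether the unrolled form is actually faster depends on the compiler and CPU; the sketch only shows the shape of the transformation.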
Rachael
Developer
Posts: 3640
Joined: Sat May 13, 2006 10:30

Re: vid_renderer

Post by Rachael »

The multiple 1col draws are probably it - because I just tried it in regular ZDoom (palette mode) and the frame rate stayed exactly the same, even with zombiemen covering the screen.
Rachael
Developer
Posts: 3640
Joined: Sat May 13, 2006 10:30

Re: vid_renderer

Post by Rachael »

I think I'm going to default vid_used3d to "true" (on Windows only, in case the cvar is also implemented on Mac/Linux) until the live OpenGL switcher actually gets implemented. OpenGL is causing problems on enough graphics cards that having the Direct3D backend as the default seems saner.
dpJudas
Developer
Posts: 798
Joined: Sat Jul 23, 2016 7:53

Re: vid_renderer

Post by dpJudas »

I agree. The Direct3D target is a lot more reliable on old low-end hardware.