The sky...
Moderators: Rachael, dpJudas
- 
				Rachael  
- Developer 
- Posts: 3651
- Joined: Sat May 13, 2006 10:30
The sky...
Proves that the sky is truly the limit when there is problems.
I *finally* figured out what was causing problems with the dual layer skies. It was the multi-threaded rendering, which when I initially solved the problem was turned off. Now that it's back on - it's causing problems again.
I can't believe it was something so simple.
The compositing buffer is too small to handle a thousand or so threads accessing it at once. So while static returns are always working, compositing is not. For each thread (launched all at the same time, or nearly the same time) that is executed, the buffer is re-overwritten again and again, until the final compositing is done and the threads even get executed. And so, the threads execute on invalid compositing data because the compositing data was rewritten before they had a chance to even execute.
I don't know quite yet how I want to solve this problem, but it's a sure bet that the composite sky buffer for True-Color mode needs to be much bigger since the original algorithm was only designed to handle single threads and 4 columns at once...
			
			
									
						
										
						I *finally* figured out what was causing problems with the dual layer skies. It was the multi-threaded rendering, which when I initially solved the problem was turned off. Now that it's back on - it's causing problems again.
I can't believe it was something so simple.
The compositing buffer is too small to handle a thousand or so threads accessing it at once. So while static returns are always working, compositing is not. For each thread (launched all at the same time, or nearly the same time) that is executed, the buffer is re-overwritten again and again, until the final compositing is done and the threads even get executed. And so, the threads execute on invalid compositing data because the compositing data was rewritten before they had a chance to even execute.
I don't know quite yet how I want to solve this problem, but it's a sure bet that the composite sky buffer for True-Color mode needs to be much bigger since the original algorithm was only designed to handle single threads and 4 columns at once...
- 
				dpJudas
- Developer 
- Posts: 798
- Joined: Sat Jul 23, 2016 7:53
Re: The sky...
The way to solve this is to use thread specific buffers by adding the composite sky buffer to DrawerThread (like dc_temp_rgbabuff_rgba). That way each thread can do their own composition as they go. What makes this really hard is that by the time this reaches the drawers it is already too late. What the sky code needs is to stop using wallscan's drawers to draw something that has nothing to do with a conventional wall.
			
			
									
						
										
						- 
				Rachael  
- Developer 
- Posts: 3651
- Joined: Sat May 13, 2006 10:30
Re: The sky...
I don't even know how the drawer thread system works, and looking at the code just confused me, so an implementation like that is, at the very moment, way beyond my knowledge.
What I was thinking of doing was dynamically allocating columns "as needed" through a linked list, corresponding to one column on the screen. Potentially this could mean as many 2k-sized buffers as there are columns on the screen. Each time it is updated the game's time is stored in the thread, and on garbage collecting runs (and obviously map/mode changes) the columns are recursively purged depending on their age.
			
			
									
						
										
						What I was thinking of doing was dynamically allocating columns "as needed" through a linked list, corresponding to one column on the screen. Potentially this could mean as many 2k-sized buffers as there are columns on the screen. Each time it is updated the game's time is stored in the thread, and on garbage collecting runs (and obviously map/mode changes) the columns are recursively purged depending on their age.
- 
				dpJudas
- Developer 
- Posts: 798
- Joined: Sat Jul 23, 2016 7:53
Re: The sky...
You could probably do it that way, but then I would just allocate a single buffer large enough to hold all the columns. If the current buffer is for 4 columns, then I'd allocate SCREENWIDTH/4 * currentbuffersize big buffer and then pick the position in the buffer based on the X coordinate of where it is outputting to. That way you don't have to maintain a linked list or garbage collect.
About the drawer thread system, its outer interface is pretty simple. Calling DrawerCommandQueue::QueueCommand<Foobar>(x,y,z); causes it to 'command = new Foobar(x,y,z)' and then call 'command->Execute(thread)' on each thread, where 'thread' is the local thread's DrawerThread instance. That is all there is to it.
Here's a simple example of a command that draws a 300x300 box at (100,100) on the screen:
Hope that helps a bit. 
			
			
									
						
										
						About the drawer thread system, its outer interface is pretty simple. Calling DrawerCommandQueue::QueueCommand<Foobar>(x,y,z); causes it to 'command = new Foobar(x,y,z)' and then call 'command->Execute(thread)' on each thread, where 'thread' is the local thread's DrawerThread instance. That is all there is to it.
Here's a simple example of a command that draws a 300x300 box at (100,100) on the screen:
Code: Select all
class MyBoxCommand : public DrawerCommand
{
public:
    void Execute(DrawerThread *thread) override
    {
        int start_y = 100 + thread->skipped_by_thread(100);
        int count = thread->count_for_thread(100, 300);
        for (int y = start_y; y < start_y + count; y += thread->num_cores)
        {
            uint32_t *line = (uint32_t*)dc_destorg + y * dc_pitch;
            for (int x = 100; x < 400; x++)
                line[x] = 0xffffff00; // bgra(255,255,255,0)
        }
    }
    void FString DebugInfo() override { return "Oh no! The box command crashed!"; }
};
void DrawMyBox()
{
    DrawerCommandQueue::QueueCommand<MyBoxCommand>();
}

- 
				Rachael  
- Developer 
- Posts: 3651
- Joined: Sat May 13, 2006 10:30
Re: The sky...
If we're going to do that then we might as well just simplify it even more and simply do SCREENWIDTH * 512. Absolute X and scaled/panned Y. Simple, easy to manage, and even makes it easier to write in skybox code later on because I can use the same buffer.If the current buffer is for 4 columns, then I'd allocate SCREENWIDTH/4 * currentbuffersize big buffer and then pick the position in the buffer based on the X coordinate of where it is outputting to. That way you don't have to maintain a linked list or garbage collect.
- 
				dpJudas
- Developer 
- Posts: 798
- Joined: Sat Jul 23, 2016 7:53
Re: The sky...
Sounds good to me. That's 2 MB of memory for a 4K resolution, or 8 MB if its a 32 bit buffer. Small in a gigabyte memory age. 
			
			
									
						
										
						
- 
				Rachael  
- Developer 
- Posts: 3651
- Joined: Sat May 13, 2006 10:30
Re: The sky...
I am pushing a different change. I simply changed the buffer to 2048 3072 columns. That's probably more than we'll ever need, hopefully...
If it becomes a problem at higher resolutions, we can change things as needed. By the time that even becomes a problem, the sky code will probably have been replaced with something else, anyway.
At 3072 columns, that is 6mb of space for the buffer.
I tested it with a squashed RDP resolution of 3840x2400 and it worked fine.
			
			
									
						
										
						If it becomes a problem at higher resolutions, we can change things as needed. By the time that even becomes a problem, the sky code will probably have been replaced with something else, anyway.
At 3072 columns, that is 6mb of space for the buffer.
I tested it with a squashed RDP resolution of 3840x2400 and it worked fine.
- 
				dpJudas
- Developer 
- Posts: 798
- Joined: Sat Jul 23, 2016 7:53
Re: The sky...
There's a MAXHEIGHT define in the software renderer set to 3600. If you're going to use a fixed height I suggest using that define, as that's the general height limitation of the renderer anyway.
			
			
									
						
										
						- 
				Rachael  
- Developer 
- Posts: 3651
- Joined: Sat May 13, 2006 10:30
Re: The sky...
I am a bit confused by this. The maximum height that a sky can even be, at least before actual Skyboxes are used, is 512. That is according to the code (which doesn't make much sense to me, but it is what it is).
If we are wanting to support scaled skies, I think a lot more code is going to be necessary... and this is only supporting the dual-layer skies, anyway, which are rarely if ever used.
			
			
									
						
										
						If we are wanting to support scaled skies, I think a lot more code is going to be necessary... and this is only supporting the dual-layer skies, anyway, which are rarely if ever used.
- 
				dpJudas
- Developer 
- Posts: 798
- Joined: Sat Jul 23, 2016 7:53
Re: The sky...
To be honest, I kind of gave up trying to understand the current sky code. In my head this particular code is marked for rewrite. 
			
			
									
						
										
						
- 
				Rachael  
- Developer 
- Posts: 3651
- Joined: Sat May 13, 2006 10:30
Re: The sky...
Can we branch the rewrite until we get an official release?
I'd like to at least get something out, kind of make the port "official" and recognized, even with as few features as it has (because it already has some neat ones anyway, mostly thanks to you ), so that it starts appearing in wikis and sites other than forum.zdoom/drdteam.org.
), so that it starts appearing in wikis and sites other than forum.zdoom/drdteam.org. 
Speaking of which, I am thinking of temporarily disabling the 4col drawers in the first release (and then enabling them right back). Then the only other real issue is seeing if I can get ZDoom's Unreal-style skybox points working.
			
			
									
						
										
						I'd like to at least get something out, kind of make the port "official" and recognized, even with as few features as it has (because it already has some neat ones anyway, mostly thanks to you
 ), so that it starts appearing in wikis and sites other than forum.zdoom/drdteam.org.
), so that it starts appearing in wikis and sites other than forum.zdoom/drdteam.org. 
Speaking of which, I am thinking of temporarily disabling the 4col drawers in the first release (and then enabling them right back). Then the only other real issue is seeing if I can get ZDoom's Unreal-style skybox points working.
- 
				dpJudas
- Developer 
- Posts: 798
- Joined: Sat Jul 23, 2016 7:53
Re: The sky...
I'm not going to attempt any rewrite on the sky stuff until after the 8 bit drawers are also codegen'ed by LLVM. This is because I really want to get rid of certain globals used by wallscan that is blocked by 8 bit's assembly drawers. I'll do even that part on its own branch, no worries.
About 4col drawers, feel free to turn them off for sprites as you've proved in the other thread they are slower. I managed to fix the general crash I had with some of the drawers a bit earlier today, so its only if performance shows them to be slower they should be off IMO.
Getting the skybox rendering working would be really nice. The checkerboard texture is how a canvas texture looks like if nothing was rendered into it. That makes me think it somehow decides to never render the skybox canvas at all.
			
			
									
						
										
						About 4col drawers, feel free to turn them off for sprites as you've proved in the other thread they are slower. I managed to fix the general crash I had with some of the drawers a bit earlier today, so its only if performance shows them to be slower they should be off IMO.
Getting the skybox rendering working would be really nice. The checkerboard texture is how a canvas texture looks like if nothing was rendered into it. That makes me think it somehow decides to never render the skybox canvas at all.
- 
				Rachael  
- Developer 
- Posts: 3651
- Joined: Sat May 13, 2006 10:30
Re: The sky...
Well - if the crash is fixed I see no reason to disable them, then, except maybe by default. I was going to simply put in some code that would temporarily override r_columnmethod while "swtruecolor" is active.
I am a bit weary about porting the 8-bit drawers, because that means if Randi does any work on the original ZDoom code it will not only create harder to resolve merge conflicts but the code has to be then transferred to the rewrite, as well, and beyond the skybox points there are still a multitude of issues that have to be fixed. Still - if the code can be transferred back to ZDoom in some way, it would not be a problem. I'd hate to poke and prod Randi about this, and she's very comfortable with the 8-bit renderer being just the way it is, but I am fairly dissatisfied with the software renderer being what Graf calls "an unmaintainable mess" due to its use of globals, as you said.
			
			
									
						
										
						I am a bit weary about porting the 8-bit drawers, because that means if Randi does any work on the original ZDoom code it will not only create harder to resolve merge conflicts but the code has to be then transferred to the rewrite, as well, and beyond the skybox points there are still a multitude of issues that have to be fixed. Still - if the code can be transferred back to ZDoom in some way, it would not be a problem. I'd hate to poke and prod Randi about this, and she's very comfortable with the 8-bit renderer being just the way it is, but I am fairly dissatisfied with the software renderer being what Graf calls "an unmaintainable mess" due to its use of globals, as you said.
- 
				dpJudas
- Developer 
- Posts: 798
- Joined: Sat Jul 23, 2016 7:53
Re: The sky...
My primary rationale for wanting to replace the 8-bit drawers is that I cannot change the drawer interface when the assembly code relies on certain globals.
If we take the sky as an example, I want the drawer to just grab the right pixels out of the texture without this entire temp buffer. The reason it uses the temp buffer is because the current drawers cannot read from more than one texture at a time. If the 8-bit drawers must be kept, it means that the old temp buffer can never be removed.
Whenever you see a 'dc_source' value in the source code, you really should see a FTexture*, and whenever you see a dc_dest you should see a (x,y) variable, and when you see an 'iscale' or 'texturefrac' you should see a texture coordinate (u,v). But okay, I'll let the 8-bit drawers be for now and see how much I can get rid of dc_dest and dc_source without touching them.
			
			
									
						
										
						If we take the sky as an example, I want the drawer to just grab the right pixels out of the texture without this entire temp buffer. The reason it uses the temp buffer is because the current drawers cannot read from more than one texture at a time. If the 8-bit drawers must be kept, it means that the old temp buffer can never be removed.
Whenever you see a 'dc_source' value in the source code, you really should see a FTexture*, and whenever you see a dc_dest you should see a (x,y) variable, and when you see an 'iscale' or 'texturefrac' you should see a texture coordinate (u,v). But okay, I'll let the 8-bit drawers be for now and see how much I can get rid of dc_dest and dc_source without touching them.
- 
				Rachael  
- Developer 
- Posts: 3651
- Joined: Sat May 13, 2006 10:30
Re: The sky...
Well - let's see if we can get Randi on board with this - of course it may mean that ZDoom, itself, will start requiring LLVM, which she seemed clearly leery about. But if the benefits massively outweigh the costs, she'll probably go along with it. (I sent her a PM linking this thread)