Forum

What are you working on?

Discuss anything not covered by any of the other categories.

Moderator: InsideQC Admins

Re: What are you working on?

Postby mankrip » Sun Jan 08, 2012 1:21 am

The particles look great, and I bet the rest will too.

As for me, nothing. December was hell, and I won't have enough time for the next two months.
Ph'nglui mglw'nafh mankrip Hell's end wgah'nagl fhtagn.
==-=-=-=-=-=-=-=-=-=-==
Dev blog / Twitter / YouTube
mankrip
 
Posts: 915
Joined: Fri Jul 04, 2008 3:02 am

Re: What are you working on?

Postby mh » Sun Jan 08, 2012 4:39 am

taniwha wrote:MH: heh, you went world + particles first. I went sprites + alias models first (glsl rewrite).

Anyway, looks good.


I actually did the 2D gui stuff first, then particles (cos I was really wanting to try a geometry shader).
We had the power, we had the space, we had a sense of time and place
We knew the words, we knew the score, we knew what we were fighting for
mh
 
Posts: 2292
Joined: Sat Jan 12, 2008 1:38 am

Re: What are you working on?

Postby taniwha » Mon Jan 09, 2012 6:51 am

I meant for 3d. I too did 2d first. Console text, at that.

A geometry shader for particles? Pass in one point and it spits out 2 tris as a billboarded quad?

I want to try gl points with a nice fragment shader.
Leave others their otherness.
http://quakeforge.net/
taniwha
 
Posts: 399
Joined: Thu Jan 14, 2010 7:11 am

Re: What are you working on?

Postby mh » Mon Jan 09, 2012 11:44 am

taniwha wrote:I meant for 3d. I too did 2d first. Console text, at that.

Makes sense. :)

taniwha wrote:A geometry shader for particles? Pass in one point and it spits out 2 tris as a billboarded quad?

Yup. All the shader code you need is actually almost the very same as what's currently in r_part.c (although you can do nice things like operations on all components of a vector simultaneously in a shader).
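
For anyone following along, here is a rough sketch of that point-to-quad expansion, written as plain C rather than as an actual geometry shader. The names `vec3`, `vec3_ma`, `particle_quad` and `half_size` are illustrative only and not taken from any engine; only vup and vright come from the discussion.

```c
#include <assert.h>

typedef struct { float x, y, z; } vec3;

/* Scale-and-add helper: returns a + b * s. */
static vec3 vec3_ma(vec3 a, vec3 b, float s)
{
    vec3 r = { a.x + b.x * s, a.y + b.y * s, a.z + b.z * s };
    return r;
}

/*
 * Expand one particle origin into the 4 corners of a camera-facing quad.
 * vup and vright are the view's up and right unit vectors; half_size is a
 * hypothetical half-width for the particle. Corners are written in
 * triangle-strip order: bottom-left, bottom-right, top-left, top-right,
 * so the strip forms the 2 triangles of the billboard.
 */
static void particle_quad(vec3 origin, vec3 vup, vec3 vright,
                          float half_size, vec3 out[4])
{
    assert(half_size > 0.0f);

    vec3 bottom = vec3_ma(origin, vup, -half_size);
    vec3 top    = vec3_ma(origin, vup,  half_size);

    out[0] = vec3_ma(bottom, vright, -half_size);
    out[1] = vec3_ma(bottom, vright,  half_size);
    out[2] = vec3_ma(top,    vright, -half_size);
    out[3] = vec3_ma(top,    vright,  half_size);
}
```

A geometry shader would do the same arithmetic per input point and emit the four corners as one strip; the CPU version is what "writing all 4 verts of a real quad" amounts to.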

Adding a GS stage adds some overhead of its own, so for particles it's actually slower than writing all 4 verts of a real quad, but I just so badly wanted to do one that I don't really care about that. Another option is to use instancing - store most of your particle data in a static VBO and add position and colour as per-instance data. That works well but it's also slower. For DirectQ I put position and colour into the constants registers and read everything else from a static VBO, which is the fastest of them all (beware of limited constants register space - you're really only guaranteed 256 registers on common hardware, and you need 2 per particle, plus some more for your own stuff - vup, vright, vpn, etc - so you're limited to drawing ~120 particles per batch; still fastest of them all though (but don't even attempt to index into constants registers in a fragment shader on older hardware)).
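
The register budget above works out as simple integer arithmetic. A small sketch in C, assuming 16 registers are reserved for shared data (that figure is a guess chosen to land on the ~120 quoted above, not a number from DirectQ); the function names are hypothetical:

```c
#include <assert.h>

/*
 * Particles that fit in one batch when each particle occupies
 * regs_per_particle constant registers and `reserved` registers are kept
 * back for shared data (vup, vright, vpn, matrices, ...).
 */
static int particles_per_batch(int total_regs, int reserved,
                               int regs_per_particle)
{
    assert(regs_per_particle > 0);
    assert(reserved >= 0 && reserved <= total_regs);
    return (total_regs - reserved) / regs_per_particle;
}

/* Number of draw calls needed for `count` particles (rounds up). */
static int batches_needed(int count, int per_batch)
{
    assert(count >= 0 && per_batch > 0);
    return (count + per_batch - 1) / per_batch;
}
```

With 256 registers, 16 reserved and 2 per particle this gives 120 particles per batch, so 2048 live particles would take 18 draw calls.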
We had the power, we had the space, we had a sense of time and place
We knew the words, we knew the score, we knew what we were fighting for
mh
 
Posts: 2292
Joined: Sat Jan 12, 2008 1:38 am

Re: What are you working on?

Postby taniwha » Tue Jan 10, 2012 10:56 pm

mh wrote:Yup. All the shader code you need is actually almost the very same as what's currently in r_part.c (although you can do nice things like operations on all components of a vector simultaneously in a shader).

Yeah, I found that the software renderer is an excellent reference for my shaders, especially since I'm effectively rewriting it in glsl (with tweaks like console alpha and frame lerping). I'm even using sw's colormaps (colormap and palette lookup in my fragment shaders).

mh wrote:Adding a GS stage adds some overhead of its own, so for particles it's actually slower than writing all 4 verts of a real quad, but I just so badly wanted to do one that I don't really care about that. Another option is to use instancing - store most of your particle data in a static VBO and add position and colour as per-instance data. That works well but it's also slower. For DirectQ I put position and colour into the constants registers and read everything else from a static VBO, which is the fastest of them all (beware of limited constants register space - you're really only guaranteed 256 registers on common hardware, and you need 2 per particle, plus some more for your own stuff - vup, vright, vpn, etc - so you're limited to drawing ~120 particles per batch; still fastest of them all though (but don't even attempt to index into constants registers in a fragment shader on older hardware)).


Older hardware... I don't know how much of it is mesa, and how much of it is my eeepc's intel graphics chip, but I've found that doing texture+lightmap+colormap+palette lookups really hurts (0.5fps), though texture+colormap+palette seems to be ok.

As for vup, vright, vpn: you could use a quaternion and build the vectors in the shader (sure, it's slower, but frees up registers), or encode into the transform matrix (removes bulk drawing, though). That said, I found that expanding the alias normal index to 3x16bits in my vbo gave best results (I tried normal index->vertex shader texture lookup, but that didn't work on deek's hardware: too many vertex shader texture samplers :P).
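
A sketch of the quaternion idea in plain C (the same arithmetic would go in the shader). It assumes a unit quaternion and Quake's usual axis convention at zero angles (forward = +X, right = -Y, up = +Z); the `quat` and `vec3` types and the function name are made up for illustration.

```c
#include <assert.h>

typedef struct { float x, y, z; } vec3;
typedef struct { float w, x, y, z; } quat;

/*
 * Rebuild the three view vectors from one unit quaternion, so that a
 * single 4-component register replaces three.
 * vpn is the rotated +X axis, vright the rotated -Y axis and vup the
 * rotated +Z axis (columns of the standard quaternion rotation matrix).
 */
static void quat_to_view_vectors(quat q, vec3 *vpn, vec3 *vright, vec3 *vup)
{
    float len_sq = q.w * q.w + q.x * q.x + q.y * q.y + q.z * q.z;
    assert(len_sq > 0.999f && len_sq < 1.001f);   /* must be normalised */

    vpn->x = 1.0f - 2.0f * (q.y * q.y + q.z * q.z);
    vpn->y = 2.0f * (q.x * q.y + q.w * q.z);
    vpn->z = 2.0f * (q.x * q.z - q.w * q.y);

    vright->x = -2.0f * (q.x * q.y - q.w * q.z);
    vright->y = -(1.0f - 2.0f * (q.x * q.x + q.z * q.z));
    vright->z = -2.0f * (q.y * q.z + q.w * q.x);

    vup->x = 2.0f * (q.x * q.z + q.w * q.y);
    vup->y = 2.0f * (q.y * q.z - q.w * q.x);
    vup->z = 1.0f - 2.0f * (q.x * q.x + q.y * q.y);
}
```

The trade is as described: one register instead of three, at the cost of a couple of dozen extra multiply-adds wherever the vectors are rebuilt.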
Leave others their otherness.
http://quakeforge.net/
taniwha
 
Posts: 399
Joined: Thu Jan 14, 2010 7:11 am

Re: What are you working on?

Postby leileilol » Wed Jan 11, 2012 12:05 am

taniwha wrote:(colormap and palette lookup in my fragment shaders).

We need more of this
i should not be here
leileilol
 
Posts: 2783
Joined: Fri Oct 15, 2004 3:23 am

Re: What are you working on?

Postby mh » Wed Jan 11, 2012 2:27 am

taniwha wrote:Older hardware... I don't know how much of it is mesa, and how much of it is my eeepc's intel graphics chip, but I've found that doing texture+lightmap+colormap+palette lookups really hurts (0.5fps), though texture+colormap+palette seems to be ok.

It's probably the extra indirection that's pushing you over a hardware limit and dropping back to software emulation (D3D - for all its non-portability - has the nice feature of crashing horribly if this happens, meaning that you at least know you've done something unsupported in hardware).

taniwha wrote:As for vup, vright, vpn: you could use a quaternion and build the vectors in the shader (sure, it's slower, but frees up registers), or encode into the transform matrix (removes bulk drawing, though). That said, I found that expanding the alias normal index to 3x16bits in my vbo gave best results (I tried normal index->vertex shader texture lookup, but that didn't work on deek's hardware: too many vertex shader texture samplers :P).

I just use the normals directly and do the same lighting calculation in my shader as GLQuake originally used to build its anorm_dots table:
Code:
float4 MeshPS_NoLuma (VS_OUTPUT ps_in) : SV_TARGET
{
   float shadedot = dot (ps_in.Normal.xyz, shadevector);
   float4 light = shadelight;
   
   if (shadedot < 0)
      light *= (1.0f + shadedot * (13.0f / 44.0f));
   else light *= (1.0f + shadedot);

   return tex0.Sample (tex0Sampler, ps_in.Tex) * light * 2.0f;
}


Shadelight and shadevector are constants per-entity and the end result gives no quantization as you rotate; it's liquid-smooth. Targeting D3D11 and SM4+ really frees you up to do whatever you want here (you should see my surface lighting shaders!) but at the obvious tradeoff of having higher hardware requirements. Although the kind of branching I used in this one is completely painless even on SM2 hardware.
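
For reference, the same per-pixel calculation as a plain C sketch, which makes it easy to check on the CPU. The function names are hypothetical; the constants are the ones from the shader above.

```c
#include <assert.h>

/* Scale applied to shadelight for a given dot(normal, shadevector). */
static float shade_scale(float shadedot)
{
    /* small slack for rounding in the dot product of two unit vectors */
    assert(shadedot >= -1.001f && shadedot <= 1.001f);

    if (shadedot < 0.0f)
        return 1.0f + shadedot * (13.0f / 44.0f);   /* facing away: gentle falloff */
    return 1.0f + shadedot;                          /* facing the light: up to 2x */
}

/* Scalar light for one normal; both vectors are assumed to be unit length. */
static float mesh_light(const float normal[3], const float shadevector[3],
                        float shadelight)
{
    float shadedot = normal[0] * shadevector[0]
                   + normal[1] * shadevector[1]
                   + normal[2] * shadevector[2];
    return shadelight * shade_scale(shadedot);
}
```

Both branches give exactly 1.0 at shadedot = 0, so the result is continuous across the branch, and because the dot product is evaluated directly there is no table index to snap to.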

Hell, here's surface lighting:
Code:
// force use of specific registers so that ~SetShaderResources will behave itself
Texture2D tex0 : register(t0);
Texture2D tex1 : register(t1);
Texture2D tex2 : register(t2);
Texture2D tex3 : register(t3);
Texture2D tex4 : register(t4);

Texture1D vtex0 : register(t5);
Texture1D vtex1 : register(t6);
Texture1D vtex2 : register(t7);


struct VS_INPUT
{
   float4 Pos : POSITION;
   float2 Tex0 : TEXCOORD;
   float2 Tex1 : LMCOORD;
   float4 Styles : STYLES;
   float4 Plane : PLANE;
};


struct PS_SOLIDVERT
{
   float4 Pos : SV_POSITION;
   float2 Tex0 : TEXCOORD0;
   float2 Tex1 : TEXCOORD1;
   float4 Styles : TEXCOORD2;
   float3 LightPos : TEXCOORD3;
   nointerpolation int dlightbits : TEXCOORD4;
};


float4 SurfGetDynamicLighting (float3 LightPos, int dlightbits)
{
   float4 light = float4 (0, 0, 0, 0);

   if (numdlights.x > 0 && dlightbits > 0)
   {
      for (int i = 0; i < numdlights.x; i++)
      {
         if (!(dlightbits & (1 << i))) continue;

         float4 pos = vtex1.Load (int2 (i, 0));
         float3 lvec = LightPos - pos.xyz;
         float dist = dot (lvec, lvec);

         if (dist < pos.w)
         {
            dist = 1.0f - (dist / pos.w);
            light.xyz += (vtex2.Load (int2 (i, 0))).xyz * dist * dist;
         }
      }
   }

   return light;
}


// similar for luma and 1/2/3 styles versions (although remove some texture lookups from those)
float4 SurfPS4_NoLuma (PS_SOLIDVERT ps_in) : SV_TARGET
{
   float4 light = SurfGetDynamicLighting (ps_in.LightPos, ps_in.dlightbits);

   light += tex1.Sample (sampler1, ps_in.Tex1) * ps_in.Styles.x;
   light += tex2.Sample (sampler1, ps_in.Tex1) * ps_in.Styles.y;
   light += tex3.Sample (sampler1, ps_in.Tex1) * ps_in.Styles.z;
   light += tex4.Sample (sampler1, ps_in.Tex1) * ps_in.Styles.w;

   return tex0.Sample (sampler0, ps_in.Tex0) * light * 2.0f;
}


PS_SOLIDVERT SurfVS (VS_INPUT vs_in)
{
   PS_SOLIDVERT vs_out;

   vs_out.Pos = mul (vs_in.Pos, localMatrix);
   vs_out.Tex0 = vs_in.Tex0;
   vs_out.Tex1 = vs_in.Tex1;

   // Load returns a float4; take .x explicitly rather than relying on implicit truncation
   vs_out.Styles.x = vtex0.Load (int2 (vs_in.Styles.x, 0)).x;
   vs_out.Styles.y = vtex0.Load (int2 (vs_in.Styles.y, 0)).x;
   vs_out.Styles.z = vtex0.Load (int2 (vs_in.Styles.z, 0)).x;
   vs_out.Styles.w = vtex0.Load (int2 (vs_in.Styles.w, 0)).x;

   int dlightbits = 0;
   float4 LightPos = mul (vs_in.Pos, modelMatrix);
   float3 Normal = mul (vs_in.Plane.xyz, modelMatrix);

   for (int i = 0; i < numdlights.x; i++)
   {
      // using the same equation as the PS gives better performance but it misses some verts where a light position
      // is in the center of a large poly and its radius doesn't reach to the verts - which is quite a bummer
      float4 pos = vtex1.Load (int2 (i, 0));
      float dist = abs (dot (pos.xyz, Normal) - vs_in.Plane.w);

      if ((dist * dist) < pos.w)
      {
         dlightbits |= (1 << i);
      }
   }

   vs_out.dlightbits = dlightbits;
   vs_out.LightPos = LightPos.xyz;

   return vs_out;
}

You could use 3 textures per lightmap instead of 4 and reduce the texture memory overhead, as well as speed up the worst-case shader somewhat, but it's messier to encode a one or two style surface in them (style 1 goes into the R channel of each texture, style 2 in the G channel, etc, then it's a dot product of each texture lookup with the styles per-vertex input and add them all together to get the final result). The common case is faster with 4 and on the target hardware the extra texture memory overhead isn't even worth considering.
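
One way to read the 3-texture layout, sketched in C: one texture per colour component, with each texture's four channels holding that component for styles 1 to 4. This is an interpretation of the description above, not code from the engine, and the names are hypothetical.

```c
#include <assert.h>

/* 4-component dot product, the equivalent of HLSL dot() on two float4s. */
static float dot4(const float a[4], const float b[4])
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
}

/*
 * red_texel/green_texel/blue_texel: one RGBA sample from each of the three
 * lightmap textures, where channel N holds that colour component for style N.
 * styles: current brightness of the surface's 4 light styles.
 * out_rgb: combined lightmap colour.
 */
static void combine_styles(const float red_texel[4], const float green_texel[4],
                           const float blue_texel[4], const float styles[4],
                           float out_rgb[3])
{
    assert(out_rgb != 0);
    out_rgb[0] = dot4(red_texel, styles);
    out_rgb[1] = dot4(green_texel, styles);
    out_rgb[2] = dot4(blue_texel, styles);
}
```

That is three samples and three dot products regardless of how many styles a surface uses, against up to four samples and four multiply-adds in the 4-texture version.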

I'm going to move dynamic light positions and colours back to constants (constant buffers rock) rather than doing texture lookups as they're marginally faster, but even so - this (with all its craziness) runs much much faster than traditional Quake lighting.
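
The dynamic light falloff in SurfGetDynamicLighting above can be checked on the CPU with a direct C port. This sketch assumes pos.w holds the squared radius (it is compared against a squared distance in the shader); `dlight_t` and the function names are made up for the example.

```c
#include <assert.h>

typedef struct {
    float pos[3];
    float radius_sq;   /* assumed to correspond to pos.w in the shader */
    float colour[3];
} dlight_t;

/* Falloff for one light: (1 - d^2 / r^2)^2 inside the radius, 0 outside. */
static float dlight_falloff(const float surf_pos[3], const dlight_t *dl)
{
    assert(dl->radius_sq > 0.0f);

    float dx = surf_pos[0] - dl->pos[0];
    float dy = surf_pos[1] - dl->pos[1];
    float dz = surf_pos[2] - dl->pos[2];
    float dist_sq = dx * dx + dy * dy + dz * dz;

    if (dist_sq >= dl->radius_sq)
        return 0.0f;

    float t = 1.0f - dist_sq / dl->radius_sq;
    return t * t;
}

/* Sum the lights whose bit is set in dlightbits, as the pixel shader does. */
static void accumulate_dlights(const float surf_pos[3], const dlight_t *lights,
                               int num_lights, unsigned dlightbits,
                               float out_rgb[3])
{
    assert(num_lights >= 0 && num_lights <= 32);

    out_rgb[0] = out_rgb[1] = out_rgb[2] = 0.0f;

    for (int i = 0; i < num_lights; i++) {
        if (!(dlightbits & (1u << i)))
            continue;   /* vertex stage decided this light cannot reach */

        float f = dlight_falloff(surf_pos, &lights[i]);
        out_rgb[0] += lights[i].colour[0] * f;
        out_rgb[1] += lights[i].colour[1] * f;
        out_rgb[2] += lights[i].colour[2] * f;
    }
}
```

Working in squared distances avoids a square root per light per pixel, and the bitmask lets most pixels skip most lights entirely.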
We had the power, we had the space, we had a sense of time and place
We knew the words, we knew the score, we knew what we were fighting for
mh
 
Posts: 2292
Joined: Sat Jan 12, 2008 1:38 am

Re: What are you working on?

Postby taniwha » Wed Jan 11, 2012 10:15 am

mh wrote:It's probably the extra indirection that's pushing you over a hardware limit and dropping back to software emulation (D3D - for all its non-portability - has the nice feature of crashing horribly if this happens, meaning that you at least know you've done something unsupported in hardware).

I very much suspect that to be the case. I intend on delving into it further once I get the basics done (just sky and particles to go).

I'm sorry, most of your explanations have left my head spinning :(, as I'm very new to shaders. I'll go over them again later when I'm not so tired.

leileilol wrote:
taniwha wrote:(colormap and palette lookup in my fragment shaders).

We need more of this

I thought it might appeal to you :). In fact, reading some of your comments was part of what gave me the idea of trying to "port" the software renderer to glsl (there will be some tweaks: interpolation, water alpha, fog, skybox).
Leave others their otherness.
http://quakeforge.net/
taniwha
 
Posts: 399
Joined: Thu Jan 14, 2010 7:11 am

Re: What are you working on?

Postby mh » Sat Jan 14, 2012 12:15 pm

Another idea I had once was to use dynamic textures - take 2 or 3 large-ish ones - and rebuild software Quake's surface caching with them. You're 50% of the way towards megatexture with that.

The technology behind it is sufficiently intriguing that I might do it as a research project when time becomes free, but I'm not certain if it's something I'd put in a release engine.
We had the power, we had the space, we had a sense of time and place
We knew the words, we knew the score, we knew what we were fighting for
mh
 
Posts: 2292
Joined: Sat Jan 12, 2008 1:38 am

Re: What are you working on?

Postby taniwha » Sat Jan 14, 2012 12:59 pm

mh wrote:
taniwha wrote:Older hardware... I don't know how much of it is mesa, and how much of it is my eeepc's intel graphics chip, but I've found that doing texture+lightmap+colormap+palette lookups really hurts (0.5fps), though texture+colormap+palette seems to be ok.

It's probably the extra indirection that's pushing you over a hardware limit and dropping back to software emulation (D3D - for all its non-portability - has the nice feature of crashing horribly if this happens, meaning that you at least know you've done something unsupported in hardware).


That is exactly what the problem was (going by empirical results). I rewrote the color map code to load the colormap as a 2D palette rather than an index into the 1D palette. It gave me a jump from less than 5fps to around 20fps. I still have a 1D palette texture, but I now use it only for stuff that doesn't receive lighting (2D, sprites, water, skies).
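
A sketch of what "colormap as a 2D palette" could look like on the CPU side, in C: resolve the colormap's palette indices to RGB once at load time, so the fragment shader does a single 2D lookup (colour index, light level) instead of a colormap lookup followed by a palette lookup. The 64x256 layout is the usual Quake colormap size; the function name and buffer layout are assumptions, not QuakeForge code.

```c
#include <assert.h>
#include <stdint.h>

#define COLORMAP_LEVELS 64    /* light levels (rows) */
#define PALETTE_SIZE    256   /* colours per row */

/*
 * colormap: COLORMAP_LEVELS * PALETTE_SIZE palette indices.
 * palette:  PALETTE_SIZE RGB triplets.
 * out:      COLORMAP_LEVELS * PALETTE_SIZE RGB triplets, ready to upload
 *           as a 256x64 RGB texture.
 */
static void build_colormap_rgb(const uint8_t *colormap, const uint8_t *palette,
                               uint8_t *out)
{
    assert(colormap && palette && out);

    for (int level = 0; level < COLORMAP_LEVELS; level++) {
        for (int colour = 0; colour < PALETTE_SIZE; colour++) {
            int texel = level * PALETTE_SIZE + colour;
            const uint8_t *rgb = &palette[colormap[texel] * 3];

            out[texel * 3 + 0] = rgb[0];
            out[texel * 3 + 1] = rgb[1];
            out[texel * 3 + 2] = rgb[2];
        }
    }
}
```

The result costs 48 KB of texture memory and removes one level of dependent texture reads, which fits the indirection limit suggested earlier in the thread.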
Leave others their otherness.
http://quakeforge.net/
taniwha
 
Posts: 399
Joined: Thu Jan 14, 2010 7:11 am

Re: What are you working on?

Postby taniwha » Sat Jan 14, 2012 1:14 pm

mh wrote:Another idea I had once was to use dynamic textures - take 2 or 3 large-ish ones - and rebuild software Quake's surface caching with them. You're 50% of the way towards megatexture with that.


I think I do something like that with lightmaps: I load all lightmaps into the one texture (currently 2048x2048: 7% usage on start.bsp).
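
Packing many small lightmaps into one big texture is typically done with a simple "skyline" allocator, in the spirit of GLQuake's AllocBlock. The sketch below is a generic C version under that assumption; it is not QuakeForge's actual code and all names are hypothetical.

```c
#include <assert.h>

#define ATLAS_MAX 2048

typedef struct {
    int size;                 /* atlas is size x size texels */
    int skyline[ATLAS_MAX];   /* first free row in each column */
    int used_texels;
} atlas_t;

static void atlas_init(atlas_t *a, int size)
{
    assert(size > 0 && size <= ATLAS_MAX);
    a->size = size;
    a->used_texels = 0;
    for (int i = 0; i < size; i++)
        a->skyline[i] = 0;
}

/* Find the lowest spot where a w x h block fits. Returns 1 on success. */
static int atlas_alloc(atlas_t *a, int w, int h, int *x, int *y)
{
    assert(w > 0 && h > 0);
    int best = a->size, best_x = -1;

    for (int i = 0; i + w <= a->size; i++) {
        int top = 0, j;
        for (j = 0; j < w; j++) {
            if (a->skyline[i + j] >= best)
                break;                    /* no better than what we have */
            if (a->skyline[i + j] > top)
                top = a->skyline[i + j];
        }
        if (j == w) {                     /* whole span is lower than best */
            best = top;
            best_x = i;
        }
    }

    if (best_x < 0 || best + h > a->size)
        return 0;                         /* atlas is full for this block */

    for (int i = 0; i < w; i++)
        a->skyline[best_x + i] = best + h;

    a->used_texels += w * h;
    *x = best_x;
    *y = best;
    return 1;
}

/* Fraction of the atlas covered by allocated blocks (0.07 would be 7%). */
static float atlas_usage(const atlas_t *a)
{
    return (float)a->used_texels / ((float)a->size * (float)a->size);
}
```

The allocator never frees or compacts, which is fine for static lightmaps that are packed once per map load.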
Leave others their otherness.
http://quakeforge.net/
taniwha
 
Posts: 399
Joined: Thu Jan 14, 2010 7:11 am

Re: What are you working on?

Postby leileilol » Sat Jan 14, 2012 1:29 pm

i'm trying to reduce SDL's input delay in Windows

it sucks because i'm doing this to play one of my favorite closed source indie games better.
i should not be here
leileilol
 
Posts: 2783
Joined: Fri Oct 15, 2004 3:23 am

Re: What are you working on?

Postby mh » Sat Jan 14, 2012 11:01 pm

Adding DDS support to RMQ. Future demo downloads should be a LOT smaller.

This is also going to be more generally portable to other engines, and will have the ability to uncompress a DDS to RGBA data if the hardware doesn't support compressed textures.
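
For context, a minimal sketch of what uncompressing one 4x4 DXT1 block to RGBA involves, in C. This follows the generally documented block layout (two RGB565 endpoints plus sixteen 2-bit indices); it is not the RMQ code, the function names are made up, and the exact rounding of the interpolated colours varies between decoders.

```c
#include <assert.h>
#include <stdint.h>

/* Expand a 5:6:5 colour to 8 bits per channel by bit replication. */
static void rgb565_to_rgb8(uint16_t c, uint8_t rgb[3])
{
    unsigned r = (c >> 11) & 31, g = (c >> 5) & 63, b = c & 31;
    rgb[0] = (uint8_t)((r << 3) | (r >> 2));
    rgb[1] = (uint8_t)((g << 2) | (g >> 4));
    rgb[2] = (uint8_t)((b << 3) | (b >> 2));
}

/* Decode one 8-byte DXT1 block into 16 RGBA texels, row by row. */
static void dxt1_decode_block(const uint8_t block[8], uint8_t out[16][4])
{
    assert(block && out);

    uint16_t c0 = (uint16_t)(block[0] | (block[1] << 8));
    uint16_t c1 = (uint16_t)(block[2] | (block[3] << 8));
    uint8_t pal[4][4];

    rgb565_to_rgb8(c0, pal[0]); pal[0][3] = 255;
    rgb565_to_rgb8(c1, pal[1]); pal[1][3] = 255;

    for (int i = 0; i < 3; i++) {
        if (c0 > c1) {   /* 4-colour mode: two interpolated colours */
            pal[2][i] = (uint8_t)((2 * pal[0][i] + pal[1][i]) / 3);
            pal[3][i] = (uint8_t)((pal[0][i] + 2 * pal[1][i]) / 3);
        } else {         /* 3-colour mode: midpoint plus transparent black */
            pal[2][i] = (uint8_t)((pal[0][i] + pal[1][i]) / 2);
            pal[3][i] = 0;
        }
    }
    pal[2][3] = 255;
    pal[3][3] = (c0 > c1) ? 255 : 0;

    uint32_t bits = (uint32_t)block[4] | ((uint32_t)block[5] << 8)
                  | ((uint32_t)block[6] << 16) | ((uint32_t)block[7] << 24);

    for (int t = 0; t < 16; t++) {
        unsigned idx = (bits >> (2 * t)) & 3;   /* 2 bits per texel, LSB first */
        for (int i = 0; i < 4; i++)
            out[t][i] = pal[idx][i];
    }
}
```

A full fallback path would loop this over every block of every mip level and upload the result as ordinary RGBA.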
We had the power, we had the space, we had a sense of time and place
We knew the words, we knew the score, we knew what we were fighting for
mh
 
Posts: 2292
Joined: Sat Jan 12, 2008 1:38 am

Re: What are you working on?

Postby leileilol » Sat Jan 14, 2012 11:42 pm

mh wrote: and will have the ability to uncompress a DDS to RGBA data if the hardware doesn't support compressed textures.

Careful about approaching this - decoding DXT as it should be decoded is patented.

Darkplaces does have an alternate, patent-unencumbered method to decode a DXT texture, by the way.



Also, useless feature creep time, as if Quake wasn't colorful enough to meet nextgen(tm) standards:

Image
i should not be here
leileilol
 
Posts: 2783
Joined: Fri Oct 15, 2004 3:23 am

Re: What are you working on?

Postby mh » Sun Jan 15, 2012 1:37 am

leileilol wrote:
mh wrote: and will have the ability to uncompress a DDS to RGBA data if the hardware doesn't support compressed textures.

Careful about approaching this - decoding DXT as it should be decoded is patented.


Well that's a bummer; I've just got DXT1 working and all. Gonna look at the DP code...
We had the power, we had the space, we had a sense of time and place
We knew the words, we knew the score, we knew what we were fighting for
mh
 
Posts: 2292
Joined: Sat Jan 12, 2008 1:38 am
