I have a compute shader, and the C# script that goes with it, used to modify an array of vertices on the y axis. Simple enough to be clear.
But despite the fact that it runs fine, the shader seems to forget the first vertex of my shape (except when that shape is a closed volume?).
Here is the C# class:
Mesh m;
//public bool stopProcess = false; //Useless in this version of the example
MeshCollider coll;
public ComputeShader csFile; //the compute shader file, added the Unity way
Vector3[] arrayToProcess; //an array of vectors I'll use to store data
ComputeBuffer cbf; //the buffer CPU->GPU (an early version with exactly
                   //the same result had only this one)
ComputeBuffer cbfOut; //the buffer GPU->CPU
int vertexLength;

void Awake() { //assigning my stuff
    coll = gameObject.GetComponent<MeshCollider>();
    m = GetComponent<MeshFilter>().sharedMesh;
    vertexLength = m.vertices.Length;
    arrayToProcess = m.vertices; //setting the first version of the vertex array (copy of mesh)
}
void Start () {
    cbf = new ComputeBuffer(vertexLength,32); //buffer in
    cbfOut = new ComputeBuffer(vertexLength,32); //buffer out
    csFile.SetBuffer(0,"Board",cbf);
    csFile.SetBuffer(0,"BoardOut",cbfOut);
}
void Update () {
    csFile.SetFloat("time",Time.time);
    cbf.SetData(m.vertices);
    csFile.Dispatch(0,vertexLength,vertexLength,1); //dispatching (I think my mistake is here)
    cbfOut.GetData(arrayToProcess); //getting back my processed vertices
    m.vertices = arrayToProcess; //assigning them to the mesh
    //coll.sharedMesh = m; //collider stuff, useless in this demo
}
And my compute shader script:
#pragma kernel CSMain

RWStructuredBuffer<float3> Board : register(s[0]);
RWStructuredBuffer<float3> BoardOut : register(s[1]);
float time;

[numthreads(1,1,1)]
void CSMain (uint3 id : SV_DispatchThreadID)
{
    float valx = (sin((time*4)+Board[id.x].x));
    float valz = (cos((time*2)+Board[id.x].z));
    Board[id.x].y = (valx + valz)/5;
    BoardOut[id.x] = Board[id.x];
}
At the beginning I was reading and writing from the same buffer, but when I ran into this issue I tried separate buffers, with no success: I still have the same problem.
Maybe I misunderstood the way compute shaders are supposed to be used (and I know I could use a vertex shader, but I just want to try compute shaders with further improvements in mind).
To complete what I said, I suppose it is related to the way vertices are indexed in the Mesh.vertices array.
I tried a LOT of different blocks/threads configurations, but nothing seems to solve the issue. Combinations tried:

Block     Thread
60,60,1   1,1,1
1,1,1     60,60,3
10,10,3   3,1,1
and some others I do not remember. I think the best configuration should be something with a good balance, like:
Block: VertexCount,1,1   Thread: 3,1,1
About the closed volume: I'm not sure about that, because with a cube (8 vertices) everything seems to move accordingly, but with a shape that has an odd number of vertices, the first (or last, I have not checked yet) seems not to be processed.
I tried it with many different shapes, but subdivided planes make it most obvious: one corner is always not moving.
EDIT:
After further study I found out that it is simply the compute shader not computing the last (not the first, I checked) vertices of the mesh. It seems related to the buffer type; I still don't get why RWStructuredBuffer should be an issue, or how badly I'm using it. Is it reserved for streams? I can't understand the MSDN doc on this one.
EDIT: After resolution
The C# script:
using UnityEngine;
using System.Collections;

public class TreeObject : MonoBehaviour {

    Mesh m;
    public bool stopProcess = false;
    MeshCollider coll;
    public ComputeShader csFile;
    Vector3[] arrayToProcess;
    ComputeBuffer cbf;
    ComputeBuffer cbfOut;
    int vertexLength;

    // Use this for initialization
    void Awake() {
        coll = gameObject.GetComponent<MeshCollider>();
        m = GetComponent<MeshFilter>().mesh;
        vertexLength = m.vertices.Length+3; //I add 3 because apparently
                                            //the vertex count is odd
        //arrayToProcess = new Vector3[vertexLength];
        arrayToProcess = m.vertices;
    }

    void Start () {
        cbf = new ComputeBuffer(vertexLength,12);
        cbfOut = new ComputeBuffer(vertexLength,12);
        csFile.SetBuffer(0,"Board",cbf);
        csFile.SetBuffer(0,"BoardOut",cbfOut);
    }

    // Update is called once per frame
    void Update () {
        csFile.SetFloat("time",Time.time);
        cbf.SetData(m.vertices);
        csFile.Dispatch(0,vertexLength,1,1);
        cbfOut.GetData(arrayToProcess);
        m.vertices = arrayToProcess;
        coll.sharedMesh = m;
    }
}
I had already rolled back to
Blocks: VCount,1,1
before your answer, because it was logical that, with VCount*VCount, I was processing the vertices a squared number of times more often than needed.
To complete: you were absolutely right, the stride was obviously causing the issue. Could you complete your answer with a link to documentation about the stride parameter? (From anywhere, because the Unity docs are void, and MSDN did not help me understand why it should be 12 and not 32, as I thought 32 was the size of a float3.)
So: doc needed, please.
In the meantime I'll try to provide a flexible enough (generic?) version of this to make it more robust, and start adding some nice array-processing functions to my shader...
I'm familiar with compute shaders but have never touched Unity; however, having looked over the documentation for compute shaders in Unity, a couple of things stand out.
The cbf and cbfOut ComputeBuffers are created with a stride of 32 (bytes). Both your StructuredBuffers contain float3s, which have a stride of 12 bytes, not 32. Where has 32 come from?
When you dispatch your compute shader you're requesting a two-dimensional dispatch (vertexLength, vertexLength, 1), but you're operating on a 1D array of float3s. You will end up with a race condition where many different threads think they're responsible for updating each element of the array. Although awful for performance, if you want a thread group size of [numthreads(1,1,1)] then you should dispatch (vertexLength, 1, 1) thread groups when calling Dispatch (i.e. Dispatch(60, 1, 1) with numthreads(1,1,1)).
For better performance, the number of threads in your thread group should be a multiple of 64 for best efficiency on AMD hardware (the wavefront size). You then need only dispatch ceil(numVertices/64) groups, and simply insert some logic into the shader to ensure id.x is not out of bounds for any given thread, as sketched below.
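For example, an untested sketch of that layout; vertexCount here is a new shader constant you would set from C# with csFile.SetInt("vertexCount", vertexLength), and the dispatch becomes csFile.Dispatch(0, Mathf.CeilToInt(vertexLength / 64f), 1, 1):

int vertexCount; //total number of vertices, set from the C# side

[numthreads(64,1,1)]
void CSMain (uint3 id : SV_DispatchThreadID)
{
    if (id.x >= (uint)vertexCount)
        return; //this thread is past the end of the buffer
    //...process Board[id.x] and BoardOut[id.x] as before...
}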
EDIT:
The documentation for the ComputeBuffer constructor is here: Unity ComputeBuffer Documentation
While it doesn't explicitly say that "stride" is in bytes, it's the only reasonable assumption.
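Concretely, that means creating the buffers with the element size of a float3/Vector3, which you can spell out instead of hard-coding 12 (a sketch using the variable names from the question):

cbf = new ComputeBuffer(vertexLength, sizeof(float) * 3);    //3 floats * 4 bytes = 12
cbfOut = new ComputeBuffer(vertexLength, sizeof(float) * 3);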
Related
I've made boids in Unity, but when trying to render 1000 of them the performance is really bad. In my Update function I use Physics2D.OverlapCircleAll to check all surrounding boids. Is there any way to do this in a more optimized way? Here is my Update function:
void Update()
{
    Collider2D[] hitColliders = Physics2D.OverlapCircleAll(Position, radius, layerMask.value);
    List<Boid> boids = hitColliders.Select(o => o.GetComponent<Boid>()).ToList();
    boids.Remove(this);
    Flock(boids.ToArray());
}
Absolutely! Physics2D.OverlapCircleAll creates a lot of garbage every time it is called. What you're looking for is Physics2D.OverlapCircleNonAlloc, which will not create any garbage as it writes into a preallocated buffer:
Collider2D[] hitsBuffer = new Collider2D[30]; //limit the amount of possible boid interactions

void Update()
{
    int numHits = Physics2D.OverlapCircleNonAlloc(Position, radius, hitsBuffer, layerMask.value);
    Flock(hitsBuffer, numHits);
}

void Flock(Collider2D[] hitsBuffer, int numHits)
{
    for (int i = 0; i < numHits; i++)
    {
        var boid = hitsBuffer[i].GetComponent<Boid>();
        if (boid == this)
            continue;
        //flocking algorithm here
    }
}
Note how in the above code no additional arrays are created each frame; allocating them is quite expensive. (One caveat: if numHits ever equals the buffer length, the buffer was probably too small and some overlaps were dropped, so size it generously.) To check how much time is being spent where, check out the Profiler:
Orange is 'Physics', working out the overlaps.
Cyan is 'Scripts', calculations in code, i.e. the flocking algorithm.
Dark green is 'GarbageCollector', handling arrays created and destroyed each frame.
PS: If not already, ensure that the boids are using a CircleCollider2D; this is the easiest shape for Unity to calculate.
PPS: You may want to double-check that if (boid == this) actually gets hit. I thought that Physics2D.Overlap... ignores this collider.
I am trying to recreate the full range of a guitar using only 6 audio clips.
I was thinking there would be a way to set the frequency of an audio clip, but audio.frequency only returns the sample rate of the clip, which depends on the compression format, not the actual tone.
I know I can read GetSpectrumData, but that solution is fairly complex and would require some Fourier-transform analysis or something of the kind.
By affecting the pitch it is easy to alter the tone, so I can go up and down, but is there a way to figure out what steps to use?
void Update ()
{
    CheckAudio(KeyCode.Q, 1.0f);
    CheckAudio(KeyCode.W, 1.1f);
    CheckAudio(KeyCode.E, 1.2f);
    CheckAudio(KeyCode.R, 1.3f);
    CheckAudio(KeyCode.T, 1.4f);
}

void CheckAudio(KeyCode key, float pitch)
{
    if (Input.GetKeyDown (key))
    {
        audio.pitch = pitch;
        audio.Play ();
    }
}
I can hear it does not sound right.
Knowing the initial tone is E4 (329.63 Hz) at pitch 1, is there any equation such that, by affecting the pitch, I would get the next key, F4 (349.23 Hz), or close enough?
It also has to be considered that Unity's AudioSource limits the pitch to the -3/3 range (which I think is more than needed).
EDIT: Adding some personal research. It seems pitch 1 is the initial note, and setting it to 2 gives the same key one octave higher.
Since a chromatic scale (all the black and white notes on a piano) is 12 keys, I assumed that using 1/12 for each step should do it.
It sounds close, but I feel it is not quite right. Here is the new code:
[SerializeField] private AudioSource audio;
float step = 1f/12f;
KeyCode[] keys = new KeyCode[]{
    KeyCode.Q, KeyCode.W, KeyCode.E, KeyCode.R, KeyCode.T,
    KeyCode.Y, KeyCode.U, KeyCode.I, KeyCode.O, KeyCode.P,
    KeyCode.A, KeyCode.S, KeyCode.D
};

void Update ()
{
    float f = 0.0f;
    foreach (KeyCode key in keys)
    {
        CheckAudio(key, f);
        f += 1f;
    }
}

void CheckAudio(KeyCode key, float pitch)
{
    if (Input.GetKeyDown (key))
    {
        audio.pitch = 1f + pitch * step;
        audio.Play ();
    }
}
What you are trying to do will not work well by simply changing the pitch of the audio. By changing the pitch you will run into other problems, such as the sound finishing too fast or taking more time to finish, and the sound will not be good either.
The first solution is to make a plugin (a synthesizer) in C++ that reads the audio file from Unity and changes the frequency. It would also have to perform other actions to fix the speed issues. This is very complicated unless you are an audio engineer with some great math skills, and trying this on a mobile device is a whole different story. OnAudioFilterRead is the function you should use if you decide to go with this method.
The second and recommended solution is to make an audio file for each guitar key, then put them into an array of AudioClips, as sketched below. This solves every other problem. The downside is that you will have more files.
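A minimal sketch of that approach (the field names are illustrative; the clips would be assigned in the Inspector, one per key, in order):

[SerializeField] private AudioClip[] noteClips; //one recording per guitar key
[SerializeField] private AudioSource source;

void PlayNote(int noteIndex)
{
    source.PlayOneShot(noteClips[noteIndex]);
}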
EDIT:
If you don't care about it being perfect, you can use something like the snippet below from this nice guy on the internet.
void playSound() {
    float transpose = -4;
    float note = -1;
    if (Input.GetKeyDown("a")) note = 0;  // C
    if (Input.GetKeyDown("s")) note = 2;  // D
    if (Input.GetKeyDown("d")) note = 4;  // E
    if (Input.GetKeyDown("f")) note = 5;  // F
    if (Input.GetKeyDown("g")) note = 7;  // G
    if (Input.GetKeyDown("h")) note = 9;  // A
    if (Input.GetKeyDown("j")) note = 11; // B
    if (Input.GetKeyDown("k")) note = 12; // C
    if (Input.GetKeyDown("l")) note = 14; // D
    if (note >= 0) { // if some key was pressed...
        audio.pitch = Mathf.Pow(2f, (note + transpose) / 12f);
        audio.Play();
    }
}
EDIT: For those of you interested in why the Mathf.Pow equation is used and why it works, read the following: https://en.wikipedia.org/wiki/Twelfth_root_of_two
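As a worked example of that relation: starting from E4 at 329.63 Hz, one semitone up is 329.63 * 2^(1/12) ≈ 349.23 Hz, which is exactly the F4 asked about in the question, so audio.pitch = Mathf.Pow(2f, 1f / 12f) plays the clip one key higher.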
I have implemented a basic hardware model-instancing method in XNA by following this short tutorial:
http://www.float4x4.net/index.php/2011/07/hardware-instancing-for-pc-in-xna-4-with-textures/
I have created the needed shader (without a texture atlas though, single texture only) and I am trying to use this method to draw a simple tree I generated using 3ds Max 2013 and exported via the FBX format.
The results I'm seeing have left me without a clue as to what is going on.
Back when I was using no instancing methods, but simply calling Draw on a mesh (for every tree on a level), the whole tree was shown.
I have made absolutely sure that the Model contains only one Mesh and that the Mesh contains only one MeshPart.
I am using the vertex extraction method, calling the Model's vertex and index buffers' GetData<>() methods with the correct number of vertices and indices; hence the correct number of primitives is rendered. Correct texture coordinates and normals for lighting are also extracted, as is visible in the part of the tree that is being rendered.
The parts of the tree are also in their correct places.
They are simply missing some 1000 or so polygons for absolutely no reason whatsoever. I have breakpointed every step of the vertex extraction and shader parameter generation, and I cannot for the life of me figure out what I am doing wrong.
My Shader's Vertex Transformation function:
VertexShaderOutput VertexShaderFunction2(VertexShaderInput IN, float4x4 instanceTransform : TEXCOORD1)
{
    VertexShaderOutput output;
    float4 worldPosition = mul(IN.Position, transpose(instanceTransform));
    float4 viewPosition = mul(worldPosition, View);
    output.Position = mul(viewPosition, Projection);
    output.texCoord = IN.texCoord;
    output.Normal = IN.Normal;
    return output;
}
Vertex bindings and index buffer generation:
instanceBuffer = new VertexBuffer(Game1.graphics.GraphicsDevice, Core.VertexData.InstanceVertex.vertexDeclaration, counter, BufferUsage.WriteOnly);
instanceVertices = new Core.VertexData.InstanceVertex[counter];
for (int i = 0; i < counter; i++)
{
    instanceVertices[i] = new Core.VertexData.InstanceVertex(locations[i]);
}
instanceBuffer.SetData(instanceVertices);
bufferBinding[0] = new VertexBufferBinding(vBuffer, 0, 0);
bufferBinding[1] = new VertexBufferBinding(instanceBuffer, 0, 1);
The vertex extraction method used to get all the vertex info (this part I'm sure works correctly, as I have used it before to load test geometric shapes into levels, like boxes, spheres, etc., for testing various shaders, and to construct bounding boxes around them using the extracted vertex data, and it is all correct):
public void getVertexData(ModelMeshPart part)
{
    modelVertices = new VertexPositionNormalTexture[part.NumVertices];
    rawData = new Vector3[modelVertices.Length];
    modelIndices32 = new uint[rawData.Length];
    modelIndices16 = new ushort[rawData.Length];
    int stride = part.VertexBuffer.VertexDeclaration.VertexStride;
    VertexPositionNormalTexture[] vertexData = new VertexPositionNormalTexture[part.NumVertices];
    part.VertexBuffer.GetData(part.VertexOffset * stride, vertexData, 0, part.NumVertices, stride);
    if (part.IndexBuffer.IndexElementSize == IndexElementSize.ThirtyTwoBits)
        part.IndexBuffer.GetData<uint>(modelIndices32);
    if (part.IndexBuffer.IndexElementSize == IndexElementSize.SixteenBits)
        part.IndexBuffer.GetData<ushort>(modelIndices16);
    for (int i = 0; i < modelVertices.Length; i++)
    {
        rawData[i] = vertexData[i].Position;
        modelVertices[i].Position = rawData[i];
        modelVertices[i].TextureCoordinate = vertexData[i].TextureCoordinate;
        modelVertices[i].Normal = vertexData[i].Normal;
        counter++;
    }
}
This is the rendering code for the object batch (trees in this particular case):
public void RenderHW()
{
    Game1.graphics.GraphicsDevice.RasterizerState = rState;
    treeBatchShader.CurrentTechnique.Passes[0].Apply();
    Game1.graphics.GraphicsDevice.SetVertexBuffers(bufferBinding);
    Game1.graphics.GraphicsDevice.Indices = iBuffer;
    Game1.graphics.GraphicsDevice.DrawInstancedPrimitives(PrimitiveType.TriangleList, 0, 0, treeMesh.Length, 0, primitive, counter);
    Game1.graphics.GraphicsDevice.RasterizerState = rState2;
}
If anybody has any idea where to even start looking for errors, just post all the ideas that come to mind, as I'm completely stumped as to what's going on.
This even runs counter to all my previous experience, where if I messed something up in shader code or vertex generation I'd get an absolute mess on my screen: numerous graphical artifacts such as elongated triangles originating where the mesh should be but with one tip stretching back to (0,0,0), black textures, incorrect positioning (often outside the skybox or below the terrain), incorrect scaling...
This is something different; it almost works. The part of the tree that is visible is correct in every single aspect (location, rotation, scale, texture, shading), except that a part is missing. What makes it weirder for me is that the missing part is seemingly logically segmented: only the tree trunk's primitives and some leaves off the lowest branches are missing, leaving all other primitives correctly rendered with no artifacts. Basically, they're... correctly missing.
Solved. Of course it was the one part I was 100% sure was correct, while it was not.
modelIndices32 = new uint[rawData.Length];
modelIndices16 = new ushort[rawData.Length];
Change that into:
modelIndices32 = new uint[part.IndexBuffer.IndexCount];
modelIndices16 = new ushort[part.IndexBuffer.IndexCount];
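(The reason this matters: an indexed mesh has three indices per triangle, so IndexCount is generally much larger than NumVertices. Sizing the index arrays by vertex count meant GetData only fetched the first part of the index list, so the remaining triangles, the trunk and the lowest leaves, were never drawn.)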
Now I just have to figure out why 3 draw calls rendering 300 trees are slower than 300 draw calls rendering 1 tree each (i.e. why I wasted an entire afternoon creating a new problem).
I have a working implementation of NAudio's WASAPI loopback recording and an FFT of the data.
Most of the data I get is just as it should be, but every once in a while (at intervals of 10 seconds to minutes) it shows amplitude on almost all frequencies.
Basically the picture rolls from right to left over time, with frequencies on a logarithmic scale and the lowest frequencies at the bottom. The lines are the errors; as far as I can tell those are not supposed to be there.
I get the audio buffer and send the samples to an aggregator (which applies a Hamming window) that implements the NAudio FFT. I have checked the data (the FFT result) before I modify it in any way (the picture is not from the raw FFT output, but decibel-scaled), confirming that the FFT result contains those lines. I should also point out that the picture is drawn with LockBits, so at first I thought I had something wrong with the logic there; that's why I checked the FFT output data, which shows the same problem.
Well, I could be wrong, and the problem might be somewhere I said it isn't, but it really seems to originate from the FFT OR the buffer data (the data itself or the aggregation of samples). Somehow I doubt the buffer itself is corrupted like this.
If anyone has any idea what could cause this I would greatly appreciate it!
UPDATE
So I decided to draw the whole FFT result range rather than half of it, and it showed something strange. I'm not sure about the FFT, but I thought a Fourier transform of real input should give a result that is mirrored around the middle. That certainly is not the case here.
The picture is in linear scale, so the exact middle of the picture is the middle point of the FFT result. The bottom is the first bin and the top is the last.
I was playing a 10 kHz sine wave, which gives the two horizontal lines there, but the top part is beyond me. It also seems like the lines are mirrored around the bottom quarter of the picture, which seems strange to me as well.
UPDATE 2
So I increased the FFT size from 4096 to 8192 and tried again. This is the output with me changing the sine frequency.
It would seem the result is mirrored twice: once in the middle, and then again in the top and bottom halves. The huge lines are now gone, and it would seem the lines only appear in the bottom half now.
After some further testing with different FFT lengths, it seems the lines are completely random in that regard.
UPDATE 3
I have been testing many things. The latest thing I added was overlapping of samples, so that I reuse the last half of the sample array at the beginning of the next FFT. With the Hamming and Hann windows it gives me massive intensities (much like in the second picture I posted), but not with Blackman-Harris. Disabling overlapping removes the biggest errors with every window function. The smaller errors, like those in the top picture, still remain even with the BH window. I still have no idea why those lines appear.
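For reference, this is roughly what I mean by overlapping; a simplified sketch, where DoFft stands in for the windowing and the NAudio FFT call shown further below:

float[] fftInput = new float[fftLength];
int pos = fftLength / 2; //the first half is carried over from the previous frame

void AddSample(float sample)
{
    fftInput[pos++] = sample;
    if (pos >= fftLength)
    {
        DoFft(fftInput); //apply the window and run the FFT as usual
        //reuse the newest half of the samples as the start of the next frame
        Array.Copy(fftInput, fftLength / 2, fftInput, 0, fftLength / 2);
        pos = fftLength / 2;
    }
}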
My current form allows control over which window function to use (of the three previously mentioned), overlapping (on/off), and multiple different drawing options, which lets me compare the effects of all the involved parts as they change.
I shall investigate further (I am quite sure I have made a mistake at some point), but good suggestions are more than welcome!
The problem was in the way I handled the data arrays. It's working like a charm now.
Code (I removed the excess and might have added mistakes):
// Other inputs are also usable. Just look through the NAudio library.
// (Needs: using System; using System.Diagnostics; using NAudio.Wave;)
private IWaveIn waveIn;
private static int fftLength = 8192; // NAudio fft wants powers of two!

// There might be a sample aggregator in NAudio somewhere, but I made a variation for my needs
private SampleAggregator sampleAggregator = new SampleAggregator(fftLength);
public Main()
{
    sampleAggregator.FftCalculated += new EventHandler<FftEventArgs>(FftCalculated);
    sampleAggregator.PerformFFT = true;

    // Here you decide what you want to use as the waveIn.
    // There are many options in NAudio and you can use other streams/files.
    // Note that the code varies for each different source.
    waveIn = new WasapiLoopbackCapture();
    waveIn.DataAvailable += OnDataAvailable;
    waveIn.StartRecording();
}
void OnDataAvailable(object sender, WaveInEventArgs e)
{
    if (this.InvokeRequired)
    {
        this.BeginInvoke(new EventHandler<WaveInEventArgs>(OnDataAvailable), sender, e);
    }
    else
    {
        byte[] buffer = e.Buffer;
        int bytesRecorded = e.BytesRecorded;
        int bufferIncrement = waveIn.WaveFormat.BlockAlign;

        // Stepping by BlockAlign reads the first channel of each frame
        // (WASAPI loopback typically delivers 32-bit float samples).
        for (int index = 0; index < bytesRecorded; index += bufferIncrement)
        {
            float sample32 = BitConverter.ToSingle(buffer, index);
            sampleAggregator.Add(sample32);
        }
    }
}
void FftCalculated(object sender, FftEventArgs e)
{
    // Do something with e.Result!
}
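For the decibel scaling mentioned earlier, the body of FftCalculated could look something like this (a sketch, not my exact drawing code):

// Only the first half of the FFT result is meaningful for real-valued input.
for (int i = 0; i < e.Result.Length / 2; i++)
{
    var c = e.Result[i];
    double magnitude = Math.Sqrt(c.X * c.X + c.Y * c.Y);
    double db = 20.0 * Math.Log10(magnitude + 1e-12); //epsilon avoids log(0)
    //...plot db for bin i...
}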
And the Sample Aggregator class:
using NAudio.Dsp; // The Complex and FFT are here!

class SampleAggregator
{
    // FFT
    public event EventHandler<FftEventArgs> FftCalculated;
    public bool PerformFFT { get; set; }

    // This Complex is NAudio's own!
    private Complex[] fftBuffer;
    private FftEventArgs fftArgs;
    private int fftPos;
    private int fftLength;
    private int m;

    public SampleAggregator(int fftLength)
    {
        if (!IsPowerOfTwo(fftLength))
        {
            throw new ArgumentException("FFT Length must be a power of two");
        }
        this.m = (int)Math.Log(fftLength, 2.0);
        this.fftLength = fftLength;
        this.fftBuffer = new Complex[fftLength];
        this.fftArgs = new FftEventArgs(fftBuffer);
    }

    bool IsPowerOfTwo(int x)
    {
        return (x & (x - 1)) == 0;
    }

    public void Add(float value)
    {
        if (PerformFFT && FftCalculated != null)
        {
            // Remember the window function! There are many others as well.
            fftBuffer[fftPos].X = (float)(value * FastFourierTransform.HammingWindow(fftPos, fftLength));
            fftBuffer[fftPos].Y = 0; // This is always zero with audio.
            fftPos++;
            if (fftPos >= fftLength)
            {
                fftPos = 0;
                FastFourierTransform.FFT(true, m, fftBuffer);
                FftCalculated(this, fftArgs);
            }
        }
    }
}
public class FftEventArgs : EventArgs
{
    [DebuggerStepThrough]
    public FftEventArgs(Complex[] result)
    {
        this.Result = result;
    }
    public Complex[] Result { get; private set; }
}
And that is it, I think. I might have missed something, though.
Hope this helps!
I'm having trouble using Short2 for the (x,y) positions in my vertex data. This is my vertex structure:
struct VertexPositionShort : IVertexType
{
    private static VertexElement[] vertexElements = new VertexElement[]
    {
        new VertexElement(0, VertexElementFormat.Short2, VertexElementUsage.Position, 0),
    };

    private static VertexDeclaration vertexDeclaration = new VertexDeclaration(vertexElements);

    public Short2 Position;

    public static VertexDeclaration Declaration
    {
        get { return vertexDeclaration; }
    }

    VertexDeclaration IVertexType.VertexDeclaration
    {
        get { return vertexDeclaration; }
    }
}
Using the WP7 emulator, nothing is drawn if I use this structure: no artifacts, nothing! However, if I use an identical structure where the Short2 structs are replaced by Vector2, then it all works perfectly.
I've found a reference to this being an emulator-specific issue: "In the Windows Phone Emulator, the SkinnedEffect bone index channel must be specified as one of the integer vertex element formats - either Byte4, Short2, or Short4. This same set of integer data formats cannot be used for other shader input channels such as colors, positions, and texture coordinates on the emulator." (http://www.softpedia.com/progChangelog/Windows-Phone-Developer-Tools-Changelog-154611.html) However, this is from July 2010 and I'd have assumed this limitation had been fixed by now...? Unfortunately I don't have a device to test on.
Can anyone confirm that this is still an issue in the emulator, or point me at another reason why this is not working?
Solved, by Mr Shawn Hargreaves: "You can use Short2 in vertex data, but this is an integer type, so your vertex shader must be written to accept integer rather than float inputs. BasicEffect takes floats, so Short2 will not work with it. NormalizedShort2 might be a better choice?"
http://blogs.msdn.com/b/shawnhar/archive/2010/11/19/compressed-vertex-data.aspx
I can confirm that NormalizedShort2 does in fact work for position data, in both the WP7 emulator and on real devices.
Thanks, Shawn!
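For reference, the change amounts to swapping the element format and field type in the struct above (a sketch; NormalizedShort2 lives in Microsoft.Xna.Framework.Graphics.PackedVector, and the GPU expands its components to floats in the [-1, 1] range, so positions need rescaling, e.g. via the world matrix):

private static VertexElement[] vertexElements = new VertexElement[]
{
    new VertexElement(0, VertexElementFormat.NormalizedShort2, VertexElementUsage.Position, 0),
};
//...
public NormalizedShort2 Position;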