I spent about 8 hours today working on the particle system. Here's a short list of the changes I made:
Converted the entire system from doubles to floats, including editing the OpenGL calls to take floats. Hopefully that's not against the rules.
Added methods to Vect4D for things like += and *=.
Attempted to add Babylonian or Quake-style square roots for normalization, but couldn't get anything running faster. I'll try again tomorrow now that it's all running on floats.
Reordered the variables of Particle, ParticleEmitter, Vect4D and Matrix for caching purposes.
Changed a few functions to pass by reference.
Removed some unnecessary temp variables.
In total, I sped it up by roughly a factor of two. It's now running at ~11ms update and ~35ms, sitting around ~45-50ms total, after it began at ~90ms.
Seriously improved the update function, condensing and removing useless crap.
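For anyone curious what the += and *= additions and the pass-by-reference change look like in practice, here's a minimal sketch. The Vect4D below is a stand-in, not the real class: the member layout, the way w is treated like any other component, and the integrate() helper are all assumptions made up for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for Vect4D; the real class layout is not shown here.
struct Vect4D {
    float x, y, z, w;

    // Compound add: updates this vector in place, so no temporary is built.
    Vect4D& operator+=(const Vect4D& rhs) {
        x += rhs.x; y += rhs.y; z += rhs.z; w += rhs.w;
        return *this;
    }

    // Compound scale by a scalar, also in place.
    Vect4D& operator*=(float s) {
        x *= s; y *= s; z *= s; w *= s;
        return *this;
    }
};

// Hypothetical helper: one Euler step, position += velocity * dt.
// Both vectors are passed by reference so nothing is copied at the call site.
inline void integrate(Vect4D& position, const Vect4D& velocity, float dt) {
    assert(dt >= 0.0f);   // this sketch assumes time only moves forward
    Vect4D step = velocity;
    step *= dt;           // scale a local copy of the velocity
    position += step;     // accumulate into the caller's position
}
```

The idea is that something like `pos = pos + vel * dt` builds two temporaries per particle, while the compound versions touch the existing storage directly. Whether that shows up in the timings depends on how much the compiler was already optimizing away.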
Monday, March 15, 2010
Sunday, March 14, 2010
Breaking in the particle system.
I worked for a couple of hours today trying to convert everything from doubles to floats, adding a better square root method, fixing invert, and fixing some other trivial stuff. Something inside really doesn't like me using floats and dies when I try to change it. I'll return to that later.
Right now I'm doing minuscule stuff, but so far I've saved ~20ms, mostly just by fixing the invert function. Next I'm going to try to implement the SIMD stuff in the matrix and vect4d files.
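On the "better square root method": two usual candidates are the Babylonian iteration and the Quake-style inverse square root. Here's a rough sketch of both, working on floats. The function names and the iteration cap are my own choices for illustration, and neither is guaranteed to beat the hardware sqrt on a modern CPU.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Babylonian (Newton) iteration for sqrt(a): x_next = 0.5 * (x + a / x).
// Starts from a guess that is >= sqrt(a), so the estimate only decreases
// until it converges. Intended for finite, non-negative inputs.
inline float babylonianSqrt(float a) {
    if (!(a > 0.0f)) return 0.0f;          // also catches NaN
    float x = (a > 1.0f) ? a : 1.0f;       // crude initial guess
    for (int i = 0; i < 64; ++i) {         // cap is an arbitrary safety limit
        const float next = 0.5f * (x + a / x);
        if (next >= x) break;              // stopped improving: converged
        x = next;
    }
    return x;
}

// Quake-style approximation of 1 / sqrt(v): a bit-level initial guess
// followed by one Newton-Raphson step. Error is on the order of 0.2%.
inline float fastInvSqrt(float v) {
    assert(v > 0.0f);                      // only meaningful for positive input
    const float half = 0.5f * v;
    std::uint32_t bits;
    std::memcpy(&bits, &v, sizeof bits);   // reinterpret float bits safely
    bits = 0x5f3759dfu - (bits >> 1);      // magic-constant initial guess
    float y;
    std::memcpy(&y, &bits, sizeof y);
    return y * (1.5f - half * y * y);      // one refinement step
}
```

For normalization the inverse form is the handy one, since you can multiply each component by `fastInvSqrt(x*x + y*y + z*z)` and skip the divide. The Babylonian version is only as fast as its initial guess is good, which may be why it's hard to get ahead with it.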
Friday, March 12, 2010
Starting out the blog.
So here I am starting out the blog for the particle system. This format ought to be easy on the eyes. I've started reviewing the code and am currently working on speeding up my memory system so I can implement it into the particle system.
So far my output for the memory system is still slower than the mem lib.
For 10,000 iterations:
Release:
Mem Lib: stress test: 355.958700 ms
Custom Memory: stress test: 690.598190 ms
times faster: 0.515435
Debug:
Mem Lib: stress test: 1726.908207 ms
Custom Memory: stress test: 3269.583225 ms
times faster: 0.528174
For 50,000 iterations:
Release:
Mem Lib: stress test: 1742.919445 ms
Custom Memory: stress test: 3277.043343 ms
times faster: 0.531857
Debug:
Mem Lib: stress test: 4051.890373 ms
Custom Memory: stress test: 8225.121498 ms
times faster: 0.492624
If I reduce the stress test to just the mallocs, and remove the frees, my memory system does better.
For 10,000 iterations:
Release:
Mem Lib: stress test: 236.124858 ms
Custom Memory: stress test: 91.542751 ms
times faster: 2.579394
Debug:
Mem Lib: stress test: 778.461158 ms
Custom Memory: stress test: 357.236087 ms
times faster: 2.179122
For some reason, with 50,000 iterations of only the mallocs, it crashes in release, and in debug it never gets past the original inner loop. I'm guessing it's a memory issue?
Hopefully I'll fix this issue tonight and will be able to begin implementing it into the particle system.
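For reference, here's roughly how a stress test like this can be timed, including the "times faster" number (the mem lib time divided by the custom time, so anything under 1.0 means the custom system is slower). This is a sketch under assumptions, not my actual test: the Allocator struct, the fixed block size, and the alloc-everything-then-free-everything order are all hypothetical. It also checks every allocation for null, which is the first thing to look at when an allocation-only run falls over at higher iteration counts.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical allocator interface so two memory systems can share one test.
struct Allocator {
    void* (*alloc)(std::size_t);
    void  (*release)(void*);
};

// Thin wrappers around the C runtime, used as the baseline.
inline void* sysAlloc(std::size_t n) { return std::malloc(n); }
inline void  sysFree(void* p)        { std::free(p); }

// Times `iterations` allocations of `size` bytes. If timeFrees is true the
// frees are timed too; otherwise they happen after the clock has stopped.
// Returns elapsed milliseconds, or -1.0 if any allocation returned null.
inline double stressTestMs(const Allocator& a, std::size_t iterations,
                           std::size_t size, bool timeFrees) {
    using Clock = std::chrono::steady_clock;
    std::vector<void*> blocks;
    blocks.reserve(iterations);            // keep vector growth out of the timing
    bool failed = false;

    const auto start = Clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        void* p = a.alloc(size);
        if (p == nullptr) { failed = true; break; }   // out of memory
        blocks.push_back(p);
    }
    if (timeFrees) {
        for (void* p : blocks) a.release(p);
    }
    const auto stop = Clock::now();

    if (!timeFrees) {
        for (void* p : blocks) a.release(p);          // untimed cleanup
    }
    if (failed) return -1.0;
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// "times faster" as printed above: baseline time over candidate time.
inline double timesFaster(double baselineMs, double candidateMs) {
    assert(candidateMs > 0.0);
    return baselineMs / candidateMs;
}
```

Plugging in the first release run, `timesFaster(355.958700, 690.598190)` comes out to about 0.5154, matching the printed 0.515435. To compare two systems, build one Allocator per system and run both through stressTestMs with the same arguments.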