User talk:Masterzorag: Difference between revisions

From PS3 Developer wiki
Revision as of 16:06, 8 October 2012

SPU Problems on Linux > 3.2, OpenCL related


As far as I know, I'm the only one coding OpenCL on the Cell here. If someone wants to test something, be warned that it is stable only up to the 3.2 branch. This is due to some spufs changes that the ppc kernel devs are (maybe) trying to fix, so the latest 3.3/3.4/3.5 branches now hit 'possible circular locking dependency detected' and run slower. Even with lock debugging disabled the slowdown remains, just without the warnings, and it happens even with the OpenCL samples from IBM.
http://permalink.gmane.org/gmane.linux.ports.ppc.embedded/50547

Latest tested kernels:

  • 3.2.28 works fine
# ./perlin
OpenCL took 22.496168 seconds to compute 1000 frames. Pixel Rate = 46.611316 Mpixels/sec, Frame Rate = 44.452015 frames/sec
Host code took 12.620616 seconds to compute 10 frames. Pixel Rate = 0.830844 Mpixels/sec, Frame Rate = 0.792354 frames/sec
OpenCL provided a 56.101182 speedup
  • 3.3.3 hits 'possible circular locking dependency detected' and runs slower
  • 3.4.6 hits 'possible circular locking dependency detected' and runs slower
  • 3.5.3 hits 'possible circular locking dependency detected' and runs slower

Here is the slowdown effect:

# ./perlin
OpenCL took 93.280273 seconds to compute 1000 frames. Pixel Rate = 11.241133 Mpixels/sec, Frame Rate = 10.720380 frames/sec
Host code took 12.948244 seconds to compute 10 frames. Pixel Rate = 0.809821 Mpixels/sec, Frame Rate = 0.772305 frames/sec
OpenCL provided a 13.881010 speedup

In this specific case the OpenCL run takes about 4x as long to do the same thing (93.28 seconds versus 22.50 seconds for 1000 frames)!
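The roughly 4x figure can be checked directly from the two perlin logs above. This is only a small sketch: it assumes the "OpenCL took ... seconds to compute ... frames" line keeps the format shown in the output, and the function names are made up for illustration.

```python
import re

# Matches the "OpenCL took <seconds> seconds to compute <n> frames" line
# printed by ./perlin (format assumed from the logs on this page).
OPENCL_LINE = re.compile(r"OpenCL took ([0-9.]+) seconds to compute (\d+) frames")


def opencl_seconds_per_frame(log: str) -> float:
    """Return the OpenCL time per frame found in one perlin log."""
    match = OPENCL_LINE.search(log)
    if match is None:
        raise ValueError("no 'OpenCL took' line found in log")
    return float(match.group(1)) / int(match.group(2))


def slowdown(good_log: str, bad_log: str) -> float:
    """Ratio of per-frame OpenCL time: bad kernel over good kernel."""
    return opencl_seconds_per_frame(bad_log) / opencl_seconds_per_frame(good_log)


# The two OpenCL lines copied from the runs above.
LOG_3_2_28 = "OpenCL took 22.496168 seconds to compute 1000 frames."
LOG_SLOW = "OpenCL took 93.280273 seconds to compute 1000 frames."

if __name__ == "__main__":
    print(f"slowdown: {slowdown(LOG_3_2_28, LOG_SLOW):.2f}x")
```

Comparing per-frame time instead of total time keeps the ratio meaningful even if two runs use a different frame count.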
When the program runs, something weird is going on. For example, my program queries an OpenCL built-in function to tell me how many SPEs are available, and it replies 8.
With the spu_base.enum_shared=1 parameter it should reply 7, so it seems that the issue is OpenCL related.
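Before reading anything into the SPE count, it may help to confirm that spu_base.enum_shared=1 really reached the running kernel. A minimal sketch, assuming the parameter was passed on the kernel command line (so it shows up in /proc/cmdline); the helper names are hypothetical and the parser is naive about quoted values containing spaces.

```python
from pathlib import Path


def kernel_params(cmdline: str) -> dict:
    """Split a kernel command line into {name: value}; bare flags map to None.

    Naive on purpose: quoted values containing spaces are not handled.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def enum_shared_enabled(cmdline: str) -> bool:
    """True if spu_base.enum_shared=1 appears on the given command line."""
    return kernel_params(cmdline).get("spu_base.enum_shared") == "1"


if __name__ == "__main__":
    proc_cmdline = Path("/proc/cmdline")
    if proc_cmdline.exists():
        print("enum_shared set:", enum_shared_enabled(proc_cmdline.read_text()))
    else:
        print("/proc/cmdline not available on this system")
```

If this reports the parameter as set and the OpenCL query still answers 8, that would be consistent with the count coming from the OpenCL side and not from the kernel's enumeration.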