Performance: theoretical speedup not achieved with kernel separability
I am looking at how to reduce the time a convolution takes by using kernel separability. Below is a piece of code demonstrating this:
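    test = randn(3000);
    kx = [1 2 3 4 5 6 7 8 9];
    ky = kx';
    kernel = ky*kx;    % 9x9 separable kernel (outer product)

    % Full 2D convolution
    tic; b = conv2(test, kernel, 'same'); toc;

    % Two 1D passes: along the rows, then along the columns
    tic; bx = conv2(test, kx, 'same'); by = conv2(bx, ky, 'same'); toc;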
Running the above code yields these results:

    Elapsed time is 0.564579 seconds.
    Elapsed time is 0.333260 seconds.

As can be seen, the measured speedup is only about 1.7x, not the theoretical speedup I was expecting, which is supposed to be 81/18 = 4.5.
Can anyone explain why?
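For reference, the 4.5 figure comes from counting multiply-adds per output pixel. This is a rough sketch that assumes a direct (non-FFT) convolution with a k-by-k kernel; the symbol k is introduced here for illustration, and the count ignores everything except arithmetic:

```latex
\text{speedup}
  = \frac{\text{2D cost per pixel}}{\text{separable cost per pixel}}
  = \frac{k^2}{k + k}
  = \frac{k}{2},
\qquad
k = 9:\quad \frac{81}{18} = 4.5
```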
Your kernel is not big enough to see the gains. The improvement should become more apparent as you make the kernel larger:
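    test = randn(3000);
    kx = 1:100;
    ky = kx';
    kernel = ky*kx;    % 100x100 separable kernel (outer product)

    % Full 2D convolution
    tic; b = conv2(test, kernel, 'same'); toc;

    % Two 1D passes: along the rows, then along the columns
    tic; bx = conv2(test, kx, 'same'); by = conv2(bx, ky, 'same'); toc;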
When I run this with a 100x100 kernel, I see:

    Elapsed time is 6.961222 seconds.
    Elapsed time is 0.252186 seconds.

With a 200x200 kernel I get:

    Elapsed time is 28.894932 seconds.
    Elapsed time is 0.639125 seconds.

When I double the kernel size, the 2D kernel time increases by a factor of ~4.15, and the 1D time increases by a factor of ~2.5. That is not far off the theoretical increases of 4x and 2x, respectively.
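To see the trend across several kernel sizes in one run, something like the following could be used. It is only a sketch: the 1000x1000 image, the list of sizes, and the variable names t2d, tsep, and relErr are my own choices rather than anything from the question, and it uses timeit instead of tic/toc so that each measurement is repeated a few times. The k/2 column is the operation-count estimate, not a prediction of the measured time.

```matlab
% Sketch: measured vs. operation-count speedup for several kernel sizes.
test  = randn(1000);        % smaller image than in the question, to keep the run short
sizes = [9 25 50 100];      % kernel side lengths (illustrative choice)

for k = sizes
    kx = 1:k;               % 1-by-k row kernel
    ky = kx';               % k-by-1 column kernel
    kernel = ky*kx;         % k-by-k separable kernel (outer product)

    % Time the full 2D convolution and the two 1D passes.
    t2d  = timeit(@() conv2(test, kernel, 'same'));
    tsep = timeit(@() conv2(conv2(test, kx, 'same'), ky, 'same'));

    % Both approaches should agree up to floating-point rounding.
    b  = conv2(test, kernel, 'same');
    by = conv2(conv2(test, kx, 'same'), ky, 'same');
    relErr = max(abs(b(:) - by(:))) / max(abs(b(:)));

    fprintf('k = %3d: measured %.2fx, operation count %.2fx, rel. error %.1e\n', ...
            k, t2d/tsep, k/2, relErr);
end
```

conv2 also accepts the two vectors directly: conv2(ky, kx, test, 'same') convolves the columns with the first vector and then the rows with the second, so the intermediate bx does not need to be written out by hand.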