performance - Theoretical speedup not achieved - kernel separability


I am looking at how to improve the time it takes to do a convolution by using kernel separability. Below is a piece of code demonstrating this:

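test = randn(3000);
kx = [1 2 3 4 5 6 7 8 9];
ky = kx';
kernel = ky*kx;

% Full 2D convolution with the 9x9 kernel
tic; b = conv2(test, kernel, 'same'); toc;

% Two 1D passes: along rows with kx, then along columns with ky
tic; bx = conv2(test, kx, 'same'); by = conv2(bx, ky, 'same'); toc;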

Running the above code yields these results:

Elapsed time is 0.564579 seconds.
Elapsed time is 0.333260 seconds.

As can be seen, this is not the theoretical speedup I was expecting, which is supposed to be 81/18 = 4.5.

Can anyone explain why?
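Before comparing timings, it can help to confirm that both paths compute the same result. The following is a minimal sketch, assuming a smaller 500x500 input (chosen here only to keep the check quick) and the same 9-tap kernel; the names `bs`, `relDiffTwoPass` and `relDiffBuiltin` are introduced for illustration. It also uses the three-argument form `conv2(u,v,A)`, which convolves the columns of `A` with `u` and then the rows with `v`.

```matlab
% Sanity check: the separable path should match the direct 2D convolution
% up to floating-point rounding. The 500x500 size is an assumption.
test = randn(500);
kx = 1:9;          % row kernel
ky = kx';          % column kernel
kernel = ky*kx;    % rank-1 (separable) 9x9 kernel

b  = conv2(test, kernel, 'same');                  % direct 2D convolution
by = conv2(conv2(test, kx, 'same'), ky, 'same');   % two explicit 1D passes
bs = conv2(ky, kx, test, 'same');                  % built-in separable form

% Largest absolute difference, relative to the largest output magnitude
scale = max(abs(b(:)));
relDiffTwoPass = max(abs(b(:) - by(:))) / scale;
relDiffBuiltin = max(abs(b(:) - bs(:))) / scale;
fprintf('relative difference, two passes:   %g\n', relDiffTwoPass);
fprintf('relative difference, conv2(u,v,A): %g\n', relDiffBuiltin);
```

If both differences are at rounding level, any timing gap between the two approaches comes from how the work is done, not from a difference in what is computed.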

Your kernel is not big enough to see the gains. The improvement should become more apparent as you make the kernel larger:

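test = randn(3000);
kx = 1:100;
ky = kx';
kernel = ky*kx;

% Full 2D convolution with the 100x100 kernel
tic; b = conv2(test, kernel, 'same'); toc;

% Two 1D passes: along rows with kx, then along columns with ky
tic; bx = conv2(test, kx, 'same'); by = conv2(bx, ky, 'same'); toc;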

When I run this with a 100x100 kernel, I see:

Elapsed time is 6.961222 seconds.
Elapsed time is 0.252186 seconds.

With a 200x200 kernel I get:

Elapsed time is 28.894932 seconds.
Elapsed time is 0.639125 seconds.

When I double the kernel size, the 2D kernel time increases by a factor of ~4.15, and the 1D time increases by a factor of ~2.5. That is not far off the theoretical increases of 4x and 2x, respectively.
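The arithmetic behind these expectations can be written out. The sketch below assumes a simple cost model in which a k-by-k 2D kernel costs k^2 multiply-adds per output pixel and two 1D passes cost 2k, so the ideal speedup is k/2. It uses only the timings reported above; the variable names are mine, and the 9-tap timings come from the question, so they may have been measured on a different machine than the other two.

```matlab
% Cost model (an assumption): k^2 multiply-adds per pixel for the 2D
% kernel versus 2*k for two 1D passes, so the ideal speedup is k/2.
k = [9 100 200];
theoretical = (k.^2) ./ (2*k);     % 4.5, 50, 100

% Timings reported above, in seconds
t2d = [0.564579 6.961222 28.894932];   % direct 2D convolution
t1d = [0.333260 0.252186 0.639125];    % two 1D passes
measured = t2d ./ t1d;

for i = 1:numel(k)
    fprintf('k = %3d: theoretical %.1fx, measured %.1fx\n', ...
            k(i), theoretical(i), measured(i));
end

% Growth in run time when the kernel goes from 100 to 200 taps per side
growth2d = t2d(3) / t2d(2);
growth1d = t1d(3) / t1d(2);
fprintf('2D time grew %.2fx, 1D time grew %.2fx\n', growth2d, growth1d);
```

The measured speedups stay below k/2 in all three cases, which is what one would expect if the model leaves out fixed costs such as memory traffic and the intermediate array written between the two passes. That is a plausible reading, not something these timings prove.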

