python - Numpy multiplication of vectors of different size avoiding for loops -


i have matrix, say, p of size (x,y). also, have 2 matrices, say, kx , ky of size (m,n) both, matrix pk of size (m,n) , 2 vectors u , v of x , y respectively. example, can defined follows:

import numpy np p = np.zeros((x,y)); pk = np.random.rand(m,n); kx = np.random.rand(m,n); ky = np.random.rand(m,n); u = np.random.rand(x); v = np.random.rand(y); 

in actual code not random, of course, shall not matter example. question is, if there exists pure numpy equivalent following:

for m in range(0, m):     n in range(0, n):         in range(0,x):             j in range(0,y):                arg = kx[m,n]*u[i] + ky[m,n]*v[j];                p[i,j] += pk[m,n]*np.cos(arg); 

all m,n,x,y different, x , y can same if solution not exist otherwise.

a common strategy eliminating for-loops in numpy calculations work higher-dimensional arrays.

consider example, line

arg = kx[m,n]*u[i] + ky[m,n]*v[j] 

this line depends on indices m, n, i , j. so arg depends on m, n, i , j. means arg can thought of 4-dimensional array indexed m, n, i , j. can eliminate 4 for-loops -- far arg concerned -- computing

kxu = kx[:,:,np.newaxis]*u kyv = ky[:,:,np.newaxis]*v arg = kxu[:,:,:,np.newaxis] + kyv[:,:,np.newaxis,:] 

kx[:,:,np.newaxis] has shape (m, n, 1), , u has shape (x,). multiplying them uses numpy broadcasting create array of shape (m, n, x). thus, above, new axes used placeholders, arg ends 4 axes indexed m,n,i,j in order.

similarly, p can defined

p = (pk[:,:,np.newaxis,np.newaxis]*np.cos(arg)).sum(axis=0).sum(axis=0) 

the sum(axis=0) (called twice) sums along m , n axes, p ends being 2-dimensional array indexed i , j only.

by working these 4-dimensional arrays, apply numpy operations on whole numpy arrays. in contrast, when using 4 for-loops, had computations value-by-value on scalars. consider example np.cos(arg) doing when arg 4-dimensional array. off-loads computation of all cosines in 1 numpy function call underlying loop in compiled c code. much faster calling np.cos once each scalar. reason why working higher-dimensional arrays ends being faster for-loop-based code.


import numpy np  def orig(kx, ky, u, v, pk):     m, n = kx.shape     x = u.size     y = v.size     p = np.empty((x, y), dtype=pk.dtype)     m in range(0, m):         n in range(0, n):             in range(0,x):                 j in range(0,y):                    arg = kx[m,n]*u[i] + ky[m,n]*v[j]                    p[i,j] += pk[m,n]*np.cos(arg)     return p  def alt(kx, ky, u, v, pk):     kxu = kx[:,:,np.newaxis]*u     kyv = ky[:,:,np.newaxis]*v     arg = kxu[:,:,:,np.newaxis] + kyv[:,:,np.newaxis,:]     p = (pk[:,:,np.newaxis,np.newaxis]*np.cos(arg)).sum(axis=0).sum(axis=0)     return p  m, n = 10, 20 x, y = 5, 15 kx = np.random.random((m, n)) ky = np.random.random((m, n)) u = np.random.random(x) v = np.random.random(y) pk = np.random.random((m, n)) 

sanity check, (showing alt , orig return same result):

in [57]: p2 = alt(kx, ky, u, v, pk)  in [58]: p1 = orig(kx, ky, u, v, pk)  in [59]: np.allclose(p1, p2) out[59]: true 

a benchmark, showing alt faster orig:

in [60]: %timeit orig(kx, ky, u, v, pk) 10 loops, best of 3: 33.6 ms per loop  in [61]: %timeit alt(kx, ky, u, v, pk) 1000 loops, best of 3: 349 µs per loop 

Comments

Popular posts from this blog

jquery - How do you format the date used in the popover widget title of FullCalendar? -

Bubble Sort Manually a Linked List in Java -

asp.net mvc - SSO between MVCForum and Umbraco7 -