python - Efficiently multiply a dense matrix by a sparse vector -


i looking efficient way multiply dense matrix sparse vector, av, of size (m x n) , v (n x 1). vector v scipy.sparse.csc_matrix.

i have 2 methods use @ moment:

in method 1, pick off non-zero values in v, vi, , element-wise multiply vi corresponding column of a, sum these columns. if y = av, y = a[:, 0]*v0 + ... + a[:, n]*vn, non-zero i.

def dense_dot_sparse(dense_matrix, sparse_column):     prod = np.zeros((dense_matrix.shape[0]))     r, c = sparse_column.nonzero()     indices = zip(r, c)     ind in indices:         prod = prod + dense_matrix[:, ind[1]] * sparse_column[ind]     return prod 

in method 2, perform multiplication making sparse vector .todense() , use np.dot().

def dense_dot_sparse2(dense_matrix, sparse_column):     return np.dot(dense_matrix, sparse_column.todense()) 

the typical size of (512 x 2048) , sparsity of v varies between 1 200 non-zero entries. choose method employ based on sparsity of v. if sparsity of v ~ 200 non-zeros, method 1 takes ~45ms , method 2 takes ~5ms. when v sparse, ~1 non-zero, method 1 takes ~1ms whereas method 2 still takes 5ms. checking sparsity of v (.nnz) adds 0.2ms.

i have perform 1500 of these multiplications (after splitting data , multiprocessing), time adds up.

[edit: adding simple representative example

rows = 512 cols = 2048 sparsity = 0.001  # sparse: 0.001 ~ 1 non-zero, moderately sparse: 0.1 ~ 200 non-zero big_matrix = np.random.rand(rows, cols)  # use dense matrix col = np.random.rand(cols, 1) col = np.array([i[0] if < sparsity else 0.0 in col]) sparse_col = csc_matrix(col)  # use sparse vector print sparse_col.nnz  

end edit]

i looking single implementation fast both sparse , moderately sparse v.


Comments

Popular posts from this blog

asp.net mvc - SSO between MVCForum and Umbraco7 -

Python Tkinter keyboard using bind -

ubuntu - Selenium Node Not Connecting to Hub, Not Opening Port -