python - Efficiently multiply a dense matrix by a sparse vector -
i looking efficient way multiply dense matrix sparse vector, av
, of size (m x n) , v (n x 1). vector v scipy.sparse.csc_matrix.
i have 2 methods use @ moment:
in method 1, pick off non-zero values in v, vi, , element-wise multiply vi corresponding column of a, sum these columns. if y = av
, y = a[:, 0]*v0 + ... + a[:, n]*vn
, non-zero i.
def dense_dot_sparse(dense_matrix, sparse_column): prod = np.zeros((dense_matrix.shape[0])) r, c = sparse_column.nonzero() indices = zip(r, c) ind in indices: prod = prod + dense_matrix[:, ind[1]] * sparse_column[ind] return prod
in method 2, perform multiplication making sparse vector .todense()
, use np.dot()
.
def dense_dot_sparse2(dense_matrix, sparse_column): return np.dot(dense_matrix, sparse_column.todense())
the typical size of (512 x 2048) , sparsity of v varies between 1 200 non-zero entries. choose method employ based on sparsity of v. if sparsity of v ~ 200 non-zeros, method 1 takes ~45ms , method 2 takes ~5ms. when v sparse, ~1 non-zero, method 1 takes ~1ms whereas method 2 still takes 5ms. checking sparsity of v (.nnz
) adds 0.2ms.
i have perform 1500 of these multiplications (after splitting data , multiprocessing), time adds up.
[edit: adding simple representative example
rows = 512 cols = 2048 sparsity = 0.001 # sparse: 0.001 ~ 1 non-zero, moderately sparse: 0.1 ~ 200 non-zero big_matrix = np.random.rand(rows, cols) # use dense matrix col = np.random.rand(cols, 1) col = np.array([i[0] if < sparsity else 0.0 in col]) sparse_col = csc_matrix(col) # use sparse vector print sparse_col.nnz
end edit]
i looking single implementation fast both sparse , moderately sparse v.
Comments
Post a Comment