This repository was archived by the owner on Nov 17, 2023. It is now read-only.

Conversation

@haojin2 (Contributor) commented Mar 22, 2018

Description

Add a sparse operator on CPU that supports broadcast_mul/div(csr, dense) = csr operations.

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-117]
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on the test set, and a reference to the original paper if applicable
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the best of my knowledge, examples are either not affected by this change or have been fixed to be compatible with this change

Changes

  • Add support for broadcast_mul/div(csr, 1D dense) = csr
  • Add support for broadcast_mul/div(csr, 2D dense) = csr
  • Add test for broadcast_mul/div(csr, 1D dense) = csr
  • Add test for broadcast_mul/div(csr, 2D dense) = csr

Comments

Duplicate of PR #10150; opened as a new PR due to renaming of my branch.
Example of broadcast_mul/div(csr, 1D dense) = csr:

>>> import mxnet as mx
>>> a = mx.nd.array([[0,0,3],[0,2,0],[1,0,0]]).tostype('csr')
>>> b = mx.nd.array([1,2,3])
>>> mx.nd.broadcast_mul(a,b).asnumpy()
array([[ 0.,  0.,  3.],
       [ 0.,  4.,  0.],
       [ 3.,  0.,  0.]], dtype=float32)
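A corresponding sketch for the 2D dense case (illustrative values; a (3,1) column vector scales each row):

>>> c = mx.nd.array([[1],[2],[3]])
>>> mx.nd.broadcast_mul(a,c).asnumpy()
array([[ 0.,  0.,  3.],
       [ 0.,  4.,  0.],
       [ 3.,  0.,  0.]], dtype=float32)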

@haojin2 requested a review from cjolivier01 as a code owner March 22, 2018 18:43
int& out_stype = out_attrs->at(0);
bool dispatched = false;
// For GPU, directly fallback
if (dev_mask == mshadow::gpu::kDevMask) {
Member:

Should dispatch to fcompute for dense inputs/outputs

Contributor Author:

Done

@haojin2 requested a review from szha as a code owner March 23, 2018 22:42
@eric-haibin-lin (Member):

@marcoabreu the pylint result failed on lines not added by this PR. Is there anything not tested on CI?

@haojin2 force-pushed the broadcast_muldiv branch 3 times, most recently from 97d198d to f2ee977 on March 24, 2018 00:06
@marcoabreu (Contributor):

@eric-haibin-lin the pylint check passed in http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/PR-10208/7/pipeline. There was a short window between upgrading the instances and merging my PR with the necessary changes (pylint fixes included), during which pylint failures were possible. It should be resolved now.

@haojin2 (Contributor Author) commented Mar 26, 2018

@sergeykolychev Hi, my PR is failing some tests because I've changed some interfaces in Python, but I'm not familiar with Perl, so I'm unable to make the same changes there. Could you please give me some guidance on how to make the corresponding changes in Perl so that my builds can pass? Thanks!

@sergeykolychev (Contributor):

@haojin2 Hi Hao, certainly. When do you need this by, at the latest? I'm on vacation right now, but if it's pressing I can look at it sooner.

@haojin2 (Contributor Author) commented Mar 27, 2018

@sergeykolychev The end of this month would probably be a hard deadline for me, since it would be good to get this into the next release. I think all we need are some minor changes on the Perl side, similar to my changes to sparse.py in this PR, so any help at your earliest convenience is appreciated. Thanks for your response, and enjoy your vacation!

@sergeykolychev (Contributor):

@haojin2 OK, I'll fix this problem for you by the end of Thursday this week.
Don't worry and continue your development; I'll take care of the Perl side.

@haojin2 (Contributor Author) commented Mar 27, 2018

Sure, please take your time! Thanks a lot for your help!

@eric-haibin-lin self-assigned this Mar 28, 2018

@with_seed()
def test_sparse_broadcast_mul_div():
    from scipy.sparse import random, csr_matrix
Member:

I don't think scipy.sparse.csr_matrix is used here

Contributor Author:

Removed

assert_almost_equal(mx.nd.broadcast_div(mx_lhs, mx_rhs).asnumpy(), np.divide(np_lhs, np_rhs), atol=1e-4)
shape = (4,3)
np_lhs = random(shape[0], shape[1], density=0.25, dtype=np.float32).tocsr()
mx_lhs = mx.nd.sparse.csr_matrix((np_lhs.data, np_lhs.indices, np_lhs.indptr), shape=shape)
Member:

Why not just use test_utils.rand_ndarray(shape, stype, density)?

Contributor Author:

Done; I wasn't aware of that helper function.

_set_ndarray_class(_ndarray_cls)


def add(lhs, rhs):
Member:

Need to update the documentation to clarify that it's equivalent to nd.elemwise_add() when the shapes match. Same for mul, div, and sub.

Contributor Author:

Done
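For illustration, a minimal sketch of the clarified equivalence (shapes and values are illustrative):

>>> x = mx.nd.ones((2,3)).tostype('csr')
>>> y = mx.nd.ones((2,3)).tostype('csr')
>>> mx.nd.sparse.add(x, y).asnumpy()
array([[ 2.,  2.,  2.],
       [ 2.,  2.,  2.]], dtype=float32)
>>> mx.nd.elemwise_add(x, y).asnumpy()  # same result when shapes match
array([[ 2.,  2.,  2.],
       [ 2.,  2.,  2.]], dtype=float32)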

>>> mx.nd.sparse.zeros('row_sparse', (1,2), ctx=mx.cpu(), dtype='float16').asnumpy()
array([[ 0., 0.]], dtype=float16)
"""
# pylint: disable= no-member, protected-access
Member:

Is this added to every function? Maybe worth disabling it at the beginning of the file instead of disabling it per function.

Contributor Author:

It's not added to every function, so I'll keep it this way.

[ 1., 1., 1.]]
Supported sparse operations:
broadcast_mul(csr, dense(1D)) = csr
Member:

clarify that this is only supported on CPU

Contributor Author:

Done

Contributor Author:

Done

@sergeykolychev (Contributor):

@haojin2 haojin2#1 fixes the Perl tests. Thanks.

@haojin2 (Contributor Author) commented Mar 29, 2018

@sergeykolychev Thank you very much!

@haojin2 requested a review from sergeykolychev as a code owner March 29, 2018 22:40
'/=' => \&not_implemented;

method add(AI::MXNet::NDArray|Num $other, $reverse=)
{
Member:

I really appreciate your fast response, @sergeykolychev, thanks for unblocking us. I'm not a Perl user, but some documentation for the new functions would be great. Maybe worth adding next time.

@sergeykolychev (Contributor) commented Mar 30, 2018:

Yeah, I'll add the docs next time, with more substantial changes in April; I tried to unblock you quickly. Though I don't see anybody using the functions directly rather than via the overloaded operators.

@haojin2 force-pushed the broadcast_muldiv branch 2 times, most recently from aebfa17 to 2ce2bbb on April 2, 2018 17:26
Examples
--------
>>> x = mx.nd.ones((2,3))
Member:

I think we should replace the example with the supported case specifically for sparse:

>>> x = mx.nd.ones((2,3)).tostype('csr')
>>> y = mx.nd.ones((2,3)).tostype('csr')
>>> (x + y).asnumpy()
    array([[ 2.,  2.,  2.],
            [ 2.,  2.,  2.]], dtype=float32)
>>> (x + 2).asnumpy()
    array([[ 3.,  3.,  3.],
            [ 3.,  3.,  3.]], dtype=float32)
>>> mx.nd.sparse.add(x,y).asnumpy()
    array([[ 2.,  2.,  2.],
            [ 2.,  2.,  2.]], dtype=float32)

# pylint: enable= no-member, protected-access


def multiply(lhs, rhs):
Member:

Should these functions be added to __all__ in line 36?

Member:

Let's show a specific example for csr multiplied by 1D vector

Contributor Author:

Changed, please check
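For instance, a sketch of the kind of example requested (values are illustrative, using the sparse.multiply wrapper from this PR):

>>> x = mx.nd.ones((2,3)).tostype('csr')
>>> y = mx.nd.arange(3)  # 1D vector [0., 1., 2.] broadcast across each row
>>> mx.nd.sparse.multiply(x, y).asnumpy()
array([[ 0.,  1.,  2.],
       [ 0.,  1.,  2.]], dtype=float32)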

Examples
--------
>>> x = mx.nd.ones((2,3))
Member:

Same comment: use sparse ndarray in the example

>>> (x/y).asnumpy()
array([[ 3.,  3.,  3.],
       [ 3.,  3.,  3.]], dtype=float32)
>>> mx.nd.divide(x,y).asnumpy()
Member:

Let's show a specific example for csr divided by 1D vector
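For example, a sketch along those lines (illustrative values, assuming the sparse.divide wrapper from this PR):

>>> x = (mx.nd.ones((2,3)) * 6).tostype('csr')
>>> y = mx.nd.array([1., 2., 3.])
>>> mx.nd.sparse.divide(x, y).asnumpy()
array([[ 6.,  3.,  2.],
       [ 6.,  3.,  2.]], dtype=float32)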

broadcast_mul(x, y) = [[ 0.,  0.,  0.],
                       [ 1.,  1.,  1.]]

Contributor Author:

Done

@haojin2 force-pushed the broadcast_muldiv branch from 2ce2bbb to 0556ff3 on April 2, 2018 21:27
@haojin2 requested a review from thirdwing as a code owner April 2, 2018 21:27
@haojin2 force-pushed the broadcast_muldiv branch 2 times, most recently from 68d2cd4 to 6f9d8c2 on April 3, 2018 05:28
@haojin2 force-pushed the broadcast_muldiv branch from 6f9d8c2 to 09a60a1 on April 3, 2018 05:28
def check_broadcast_div(mx_lhs, mx_rhs, np_lhs, np_rhs, dtype):
    assert_almost_equal(mx.nd.sparse.divide(mx_lhs, mx_rhs).asnumpy(), np.divide(np_lhs, np_rhs), atol=1e-4)
stype = 'csr'
for num_rows in range(2, 6):
Member:

Personally I think this is too much. This will be tested repeatedly on CI; rand_shape_2d() should be sufficient.

Contributor Author:

Sure, done

for num_rows in range(2, 6):
    for num_cols in range(2, 6):
        shape = (num_rows, num_cols)
        density = random.uniform(0.15, 0.25)
Member:

Actually, what about densities of zero and a non-zero value? We want to explicitly test the case where density = 0, since that's a special case in your code.

Contributor Author:

Changed
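For reference, a sketch of the reduced test setup (a sketch only, assuming the mxnet.test_utils helpers rand_shape_2d and rand_ndarray):

from mxnet.test_utils import rand_shape_2d, rand_ndarray
shape = rand_shape_2d()
# explicitly cover the all-zero csr case alongside a non-zero density
for density in [0.0, 0.2]:
    mx_lhs = rand_ndarray(shape, 'csr', density)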

>>> (x+2).asnumpy()
array([[ 3.,  3.,  3.],
       [ 3.,  3.,  3.]], dtype=float32)
>>> (x+y).asnumpy()
Member:

I think csr + dense will fall back. It's not good to show an example that is inefficient.

>>> (x*y).asnumpy()
array([[ 0.,  0.,  0.],
       [ 1.,  1.,  1.]], dtype=float32)
>>> mx.nd.multiply(x, y).asnumpy()
Member:

This should be sparse.multiply, right?

Contributor Author:

Done

>>> (x/y).asnumpy()
array([[ 6.,  6.,  6.],
       [ 3.,  3.,  3.]], dtype=float32)
>>> mx.nd.divide(x,y).asnumpy()
Member:

sparse.divide?

Contributor Author:

Done

Examples
--------
>>> x = mx.nd.ones((2,3)).tostype('csr')
>>> y = mx.nd.arange(2).reshape((2,1))
Member:

csr - dense will fall back.

Contributor Author:

This is multiply; subtract was fixed above.

CHECK_EQ(inputs.size(), 2U);
CHECK_EQ(outputs.size(), 1U);
CHECK_EQ(req.size(), 1U);
CHECK_LE(inputs[1].shape().ndim(), 2U) << "input dense matrix should have less than 2 dimensions";
Member:

less than -> less than or equal?

Contributor Author:

Done

} else {
  if (req[0] != kNullOp) {
    // broadcast(CSR, Dense(1D)) = CSR
    if (lhs_stype == kCSRStorage && rhs_stype == kDefaultStorage && out_stype == kCSRStorage) {
Member:

What about rhs.shape = (1,1)? Will this work?

Contributor Author:

Made it work
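For reference, a sketch of that (1,1) rhs case (illustrative values):

>>> x = mx.nd.ones((2,3)).tostype('csr')
>>> y = mx.nd.array([[2.]])  # shape (1,1), broadcast to every element
>>> mx.nd.broadcast_mul(x, y).asnumpy()
array([[ 2.,  2.,  2.],
       [ 2.,  2.,  2.]], dtype=float32)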

// If the input is not a vector
if ((rhs.shape().ndim() != 1U) && (rhs.shape()[0] != 1) && (rhs.shape()[1] != 1)) {
  // Currently do not support elementwise_mul/div(csr, dense) = csr, log and exit
  LogUnimplementedOp(attrs, ctx, inputs, req, outputs);
Member:

Let's print a more informative error message.

Contributor Author:

Done

@haojin2 force-pushed the broadcast_muldiv branch 4 times, most recently from 3206737 to 1d3d43d on April 4, 2018 18:34
MSHADOW_IDX_TYPE_SWITCH(output.aux_type(kIdx), CType, {
  MSHADOW_IDX_TYPE_SWITCH(output.aux_type(kIndPtr), RType, {
    MXNET_ASSIGN_REQ_SWITCH(req, req_type, {
      if (dns.shape().ndim() > 1 && dns.shape()[0] == 1 && dns.shape()[1] == 1) {
Member:

would rhs = mx.nd.array([5]) work?

@haojin2 force-pushed the broadcast_muldiv branch from 1d3d43d to b56f975 on April 4, 2018 23:07
}

template<typename xpu, typename OP>
void BinaryBroadcastComputeCsrEx(const nnvm::NodeAttrs& attrs,
Member:

rename to BinaryBroadcastComputeEx

Contributor Author:

Done

if (lhs_stype == kCSRStorage && rhs_stype == kDefaultStorage && out_stype == kCSRStorage) {
  BinaryBroadcastCsrDnsCsrImpl<xpu, OP>(ctx, lhs, rhs, req[0], out);
} else {
  LogUnimplementedOp(attrs, ctx, inputs, req, outputs);
Member:

Suggest moving LogUnimplementedOp to line 317; otherwise it may not be called for some unimplemented cases.

Contributor Author:

Done

@haojin2 force-pushed the broadcast_muldiv branch from b56f975 to 99585fb on April 5, 2018 17:56
@eric-haibin-lin (Member) left a comment:

Nice work!

@eric-haibin-lin merged commit 73a0eb0 into apache:master Apr 5, 2018
@haojin2 deleted the broadcast_muldiv branch April 5, 2018 21:48
@haojin2 (Contributor Author) commented Apr 8, 2018

Update on 4/8:
A small benchmark for this operator, comparing the new implementation with the fallback implementation:

import mxnet as mx
import scipy.sparse
import numpy as np
import time

def measure_cost(repeat, f, *args, **kwargs):
    # time `repeat` calls of f; block on all outputs so MXNet's
    # asynchronous execution is included in the measured time
    start = time.time()
    results = []
    for i in range(repeat):
        results.append(f(*args, **kwargs))
    for result in results:
        result.wait_to_read()
    end = time.time()
    diff = end - start
    return diff / repeat

def main():
    shape_lhs = (256, 1000000)
    vec = np.random.uniform(size=(256, 1))
    mx_vec = mx.nd.array(vec)
    for density in [0.01, 0.005, 0.001]:
        csr = scipy.sparse.random(256, 1000000, density=density, format = 'csr', dtype=np.float32)
        mx_csr = mx.nd.sparse.csr_matrix((csr.data, csr.indices, csr.indptr), shape=shape_lhs, ctx=mx.cpu())
        mx_dns = mx_csr.tostype('default')
        sparse_cost = 0.0
        dns_cost = 0.0
        for i in range(10):
            sparse_cost += measure_cost(100, mx.nd.broadcast_mul, mx_csr, mx_vec)
            dns_cost += measure_cost(100, mx.nd.broadcast_mul, mx_dns, mx_vec)
        print("%.2f %%" % (density*100), dns_cost / sparse_cost)


if __name__ == "__main__":
    main()

Results on a p2.8xlarge instance with commit 2ae0bd5 (density, speedup over the fallback):
1.00%: 9.453656599452351
0.50%: 18.406541290116778
0.10%: 53.18159853487238

rahul003 pushed a commit to rahul003/mxnet that referenced this pull request Jun 4, 2018
…he#10208)

* support broadcast_mul/div(csr, 1Ddense) = csr

* address code reviews and support broadcast_mul/div(csr, 2Ddense) = csr

* add test for both 1D and 2D dense case

* address code review and fix test error

* address code review and fix test error

* added proper overrides for basic arithmetic functions for sparse tensors.

* fix broadcast dimension

* address code reviews
zheng-da pushed a commit to zheng-da/incubator-mxnet that referenced this pull request Jun 28, 2018
…he#10208)

* support broadcast_mul/div(csr, 1Ddense) = csr

* address code reviews and support broadcast_mul/div(csr, 2Ddense) = csr

* add test for both 1D and 2D dense case

* address code review and fix test error

* address code review and fix test error

* added proper overrides for basic arithmetic functions for sparse tensors.

* fix broadcast dimension

* address code reviews
@haojin2 added the Sparse label Aug 12, 2019