GroupMamba in Practice: Image Classification with GroupMamba

2000 files in total

png: 1197

identifier: 771

py: 13

Tags: computer vision, object detection

Uploaded 2024-07-31 18:29:29 · 761.5MB ZIP

GroupMamba in Practice: Image Classification with GroupMamba (2000 files)

selective_scan_nrow.cpp 15KB

selective_scan_oflex.cpp 15KB

selective_scan.cpp 15KB

selective_scan_ndstate.cpp 14KB

selective_scan_common.h 8KB

selective_scan.h 2KB

selective_scan_ndstate.h 2KB

static_switch.h 1KB

ffc02550b.pngZone.Identifier 75B

a6c251d63.pngZone.Identifier 75B

aa8778e2d.pngZone.Identifier 75B

19a44418c.pngZone.Identifier 75B

169afb6aa.pngZone.Identifier 75B

a9e03b3a1.pngZone.Identifier 75B

63ac8cb8b.pngZone.Identifier 75B

36a913120.pngZone.Identifier 75B

27d08b6f9.pngZone.Identifier 75B

61cb94bb2.pngZone.Identifier 75B

793d8f855.pngZone.Identifier 75B

a0c39c1dd.pngZone.Identifier 75B

20d3a67d3.pngZone.Identifier 75B

611fc426b.pngZone.Identifier 75B

46b9f0a87.pngZone.Identifier 75B

a38aa2204.pngZone.Identifier 75B

264e8b9b5.pngZone.Identifier 75B

983aed879.pngZone.Identifier 75B

47b316d8f.pngZone.Identifier 75B

ac85f848f.pngZone.Identifier 75B

289e929b2.pngZone.Identifier 75B

839fad8be.pngZone.Identifier 75B

fe801c9c0.pngZone.Identifier 75B

56f69db16.pngZone.Identifier 75B

326192149.pngZone.Identifier 75B

a96438dae.pngZone.Identifier 75B

21ace47d3.pngZone.Identifier 75B

fa468d955.pngZone.Identifier 75B

4426efc94.pngZone.Identifier 75B

50b0d5abf.pngZone.Identifier 75B

062f0fec6.pngZone.Identifier 75B

ab6338bd1.pngZone.Identifier 75B

27f0e13ae.pngZone.Identifier 75B

ff934fcc7.pngZone.Identifier 75B

a5b9d84a3.pngZone.Identifier 75B

024b144e3.pngZone.Identifier 75B

f41055895.pngZone.Identifier 75B

786df0a52.pngZone.Identifier 75B

137dad5ef.pngZone.Identifier 75B

ffdddcf4e.pngZone.Identifier 75B

763b0b8cd.pngZone.Identifier 75B

142c503e1.pngZone.Identifier 75B

16f17c7d1.pngZone.Identifier 75B

146feb316.pngZone.Identifier 75B

04526c399.pngZone.Identifier 75B

378a40743.pngZone.Identifier 75B

654f701ad.pngZone.Identifier 75B

a3e3b178c.pngZone.Identifier 75B

5324a9ab2.pngZone.Identifier 75B

410876711.pngZone.Identifier 75B

feb7699d0.pngZone.Identifier 75B

57c3c7b86.pngZone.Identifier 75B

92bd3b2b7.pngZone.Identifier 75B

98b18ed7a.pngZone.Identifier 75B

2071d617e.pngZone.Identifier 75B

a855cbc06.pngZone.Identifier 75B

774bf7020.pngZone.Identifier 75B

306e7dbd9.pngZone.Identifier 75B

2133c16c5.pngZone.Identifier 75B

54a3a899b.pngZone.Identifier 75B

fe7373785.pngZone.Identifier 75B

0118f1f70.pngZone.Identifier 75B

fe03224a0.pngZone.Identifier 75B

61a3a0f94.pngZone.Identifier 75B

97ab5baf0.pngZone.Identifier 75B

22e7c17b2.pngZone.Identifier 75B

35e31b2b5.pngZone.Identifier 75B

f47065e0a.pngZone.Identifier 75B

aa28f442c.pngZone.Identifier 75B

29cc438e4.pngZone.Identifier 75B

aba570b21.pngZone.Identifier 75B

a42ddba4f.pngZone.Identifier 75B

fefaeec6d.pngZone.Identifier 75B

440d51444.pngZone.Identifier 75B

29c8ca750.pngZone.Identifier 75B

fd08aae02.pngZone.Identifier 75B

125c2316a.pngZone.Identifier 75B

a8968f15a.pngZone.Identifier 75B

91d294b43.pngZone.Identifier 75B

974108721.pngZone.Identifier 75B

422cf9f7d.pngZone.Identifier 75B

709ff44b4.pngZone.Identifier 75B

67ea1b535.pngZone.Identifier 75B

5687df8c6.pngZone.Identifier 75B

646556430.pngZone.Identifier 75B

838c25c16.pngZone.Identifier 75B

072fc34f1.pngZone.Identifier 75B

37cea3ddd.pngZone.Identifier 75B

096eb593d.pngZone.Identifier 75B

29d790068.pngZone.Identifier 75B

38fb092f0.pngZone.Identifier 75B

16c5adff0.pngZone.Identifier 75B


# mamba-mini

An efficient single-file implementation of selective scan that runs on both CPU and GPU, with the corresponding mathematical derivation. It is probably the implementation closest to `selective_scan_cuda` in mamba.

### mathematical derivation

![image](../assets/derivation.png)

### code

```python
import torch


def selective_scan_easy(us, dts, As, Bs, Cs, Ds, delta_bias=None, delta_softplus=False, return_last_state=False, chunksize=64):
    """
    # B: batch_size, G: groups, D: dim, N: state dim, L: seqlen
    us: B, G * D, L
    dts: B, G * D, L
    As: G * D, N
    Bs: B, G, N, L
    Cs: B, G, N, L
    Ds: G * D
    delta_bias: G * D
    # chunksize can be any value you like. But as chunksize grows, hs may become NaN,
    # because exp(sum(delta) A) gets really small
    """
    def selective_scan_chunk(us, dts, As, Bs, Cs, hprefix):
        """
        partial(h) / partial(t) = Ah + Bu; y = Ch + Du;
        => partial(h*exp(-At)) / partial(t) = Bu*exp(-At);
        => h_t = exp(At)*h_0 + sum_{0}_{t}_{Bu*exp(A(t-v)) dv};
        => h_b = exp(A(dt_a + ... + dt_{b-1})) * (h_a + sum_{a}_{b-1}_{Bu*exp(-A(dt_a + ... + dt_i)) dt_i});
           y_i = C_i*h_i + D*u_i
        """
        """
        us, dts: (L, B, G, D) # L is chunk_size
        As: (G, D, N)
        Bs, Cs: (L, B, G, N)
        Ds: (G, D)
        hprefix: (B, G, D, N)
        """
        # Cumulative time within the chunk and the matching decay exp(A * t)
        ts = dts.cumsum(dim=0)
        Ats = torch.einsum("gdn,lbgd->lbgdn", As, ts).exp()
        # Rescale by the last (smallest) decay to keep the division below stable
        scale = Ats[-1].detach()
        rAts = Ats / scale
        duts = dts * us
        dtBus = torch.einsum("lbgd,lbgn->lbgdn", duts, Bs)
        # Contribution of inputs inside the chunk, then of the carried-in state
        hs_tmp = rAts * (dtBus / rAts).cumsum(dim=0)
        hs = hs_tmp + Ats * hprefix.unsqueeze(0)
        ys = torch.einsum("lbgn,lbgdn->lbgd", Cs, hs)
        return ys, hs

    inp_dtype = us.dtype
    has_D = Ds is not None

    dts = dts.float()
    if delta_bias is not None:
        dts = dts + delta_bias.view(1, -1, 1).float()
    if delta_softplus:
        dts = torch.nn.functional.softplus(dts)

    # Ungrouped B and C of shape (B, N, L) are treated as a single group
    if len(Bs.shape) == 3:
        Bs = Bs.unsqueeze(1)
    if len(Cs.shape) == 3:
        Cs = Cs.unsqueeze(1)
    B, G, N, L = Bs.shape
    # Move the sequence axis first: (L, B, G, D) and (L, B, G, N)
    us = us.view(B, G, -1, L).permute(3, 0, 1, 2).float()
    dts = dts.view(B, G, -1, L).permute(3, 0, 1, 2).float()
    As = As.view(G, -1, N).float()
    Bs = Bs.permute(3, 0, 1, 2).float()
    Cs = Cs.permute(3, 0, 1, 2).float()
    Ds = Ds.view(G, -1).float() if has_D else None
    D = As.shape[1]

    oys = []
    # ohs = []
    hprefix = us.new_zeros((B, G, D, N), dtype=torch.float)
    # range(0, L, ...) rather than range(0, L - 1, ...): the latter drops the
    # final step whenever L % chunksize == 1 (and every step when L == 1)
    for i in range(0, L, chunksize):
        ys, hs = selective_scan_chunk(
            us[i:i + chunksize], dts[i:i + chunksize],
            As, Bs[i:i + chunksize], Cs[i:i + chunksize], hprefix,
        )
        oys.append(ys)
        # ohs.append(hs)
        hprefix = hs[-1]

    oys = torch.cat(oys, dim=0)
    # ohs = torch.cat(ohs, dim=0)
    if has_D:
        oys = oys + Ds * us
    oys = oys.permute(1, 2, 3, 0).view(B, -1, L)
    oys = oys.to(inp_dtype)
    # hprefix = hprefix.to(inp_dtype)

    return oys if not return_last_state else (oys, hprefix.view(B, G * D, N))
```

### to test

```bash
pytest test_selective_scan.py
```
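
### sanity check of the chunked formula

The chunked update in `selective_scan_chunk` can be checked against the plain step-by-step recurrence `h_i = exp(A * dt_i) * h_{i-1} + dt_i * B_i * u_i`, `y_i = C_i * h_i + D * u_i`. The sketch below is not part of the repository: it is a toy version for a single channel with a scalar state, written with the standard library only, and the names `scan_sequential` and `scan_chunked` are hypothetical. It assumes the same discretization as the code above and omits the `scale` rescaling, which only matters for numerical range.

```python
import math


def scan_sequential(us, dts, A, Bs, Cs, D=0.0, h0=0.0):
    """Reference: apply the recurrence one step at a time."""
    h, ys = h0, []
    for u, dt, b, c in zip(us, dts, Bs, Cs):
        h = math.exp(A * dt) * h + dt * b * u
        ys.append(c * h + D * u)
    return ys, h


def scan_chunked(us, dts, A, Bs, Cs, D=0.0, h0=0.0, chunksize=4):
    """Same recurrence, evaluated per chunk with running sums."""
    ys, hprefix = [], h0
    for start in range(0, len(us), chunksize):
        t, acc, h = 0.0, 0.0, hprefix
        for i in range(start, min(start + chunksize, len(us))):
            t += dts[i]                            # ts = cumsum(dts) in the chunk
            decay = math.exp(A * t)                # Ats
            acc += dts[i] * Bs[i] * us[i] / decay  # cumsum(dtBus / Ats)
            h = decay * (hprefix + acc)            # hs
            ys.append(Cs[i] * h + D * us[i])
        hprefix = h                                # state carried to the next chunk
    return ys, hprefix


# Length 9 with chunksize 4 exercises a final chunk of length 1.
us = [1.0, -0.5, 0.25, 2.0, 0.0, 1.5, -1.0, 0.75, 0.5]
dts = [0.1, 0.2, 0.05, 0.1, 0.3, 0.1, 0.2, 0.15, 0.1]
Bs = [0.5, 1.0, -0.3, 0.8, 0.2, 1.1, 0.4, -0.6, 0.9]
Cs = [1.0, 0.7, 0.2, -0.4, 0.9, 0.3, 1.2, 0.5, -0.8]

ys_seq, h_seq = scan_sequential(us, dts, -0.5, Bs, Cs, D=0.3)
ys_chk, h_chk = scan_chunked(us, dts, -0.5, Bs, Cs, D=0.3, chunksize=4)
max_diff = max(abs(a - b) for a, b in zip(ys_seq, ys_chk))
print("max abs difference:", max_diff)
```

The two functions should agree up to floating-point rounding for any chunk size. Dividing by `decay` is also where large chunks become fragile: with a negative `A`, `exp(A * t)` shrinks as `t` accumulates, which is the reason the docstring warns about large `chunksize` values.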
