Collage: Seamless Integration of Deep Learning Backends with Automatic Placement

Jeon, Byungsoo; Park, Sunghyun; Liao, Peiyuan; Xu, Sheng; Chen, Tianqi; Jia, Zhihao

doi:10.1145/3559009.3569651

arXiv:2111.00655 (cs)

[Submitted on 1 Nov 2021 (v1), last revised 28 Oct 2022 (this version, v3)]

Abstract: The strong demand for efficient and performant deployment of Deep Learning (DL) applications has prompted the rapid development of a rich DL ecosystem. To keep up with this fast advancement, modern DL frameworks must efficiently integrate a variety of optimized tensor algebra libraries and runtimes as their backends and generate the fastest possible executable using these backends. However, current DL frameworks require significant manual effort and expertise to integrate every new backend, and they still fail to unleash its full potential. Given the fast-evolving nature of the DL ecosystem, this manual approach often slows down continuous innovation across different layers: hardware vendors cannot quickly deploy their cutting-edge libraries, DL framework developers must repeatedly adjust their hand-coded rules to accommodate new library versions, and machine learning practitioners must wait for new technologies to be integrated and often encounter unsatisfactory performance.
In this paper, we propose Collage, a DL framework that offers seamless integration of DL backends. Collage provides an expressive backend registration interface that allows users to precisely specify the capability of various backends. By leveraging the specifications of available backends, Collage automatically searches for an optimized backend placement strategy for a given workload and execution environment. Our evaluation shows that Collage outperforms the best existing framework for each hardware platform by $1.26\times$, $1.43\times$, and $1.40\times$ on average on NVIDIA's RTX 2070 GPU, V100 GPU, and Intel's Xeon 8259CL CPU, respectively. Collage has been open-sourced and deployed in Apache TVM.

Comments:	Published in PACT 22
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2111.00655 [cs.LG]
	(or arXiv:2111.00655v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2111.00655
Related DOI:	https://doi.org/10.1145/3559009.3569651

Submission history

From: Byungsoo Jeon
[v1] Mon, 1 Nov 2021 02:01:45 UTC (3,579 KB)
[v2] Wed, 12 Oct 2022 03:03:28 UTC (7,746 KB)
[v3] Fri, 28 Oct 2022 02:20:18 UTC (7,746 KB)
