news 2026/6/18 17:51:22

CANN/PTO Multiply-Add Instruction

张小明

Front-end development engineer

# TMULADDDST

pto-isa: Parallel Tile Operation (PTO) is a virtual instruction set architecture designed by Ascend CANN, focusing on tile-level operations. This repository offers high-performance, cross-platform tile operations across Ascend platforms. Project address: https://gitcode.com/cann/pto-isa

## Introduction

Elementwise multiply-add into the destination: `dst = src0 * src1 + dst`.

## Math Interpretation

For each element `(i, j)` in the valid region:

$$ \mathrm{dst}_{i,j} = \mathrm{src0}_{i,j} \cdot \mathrm{src1}_{i,j} + \mathrm{dst}_{i,j} $$
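The formula above can be sketched as a scalar host-side reference loop, which may be useful when checking results element by element. This is only an illustrative model: the function name `mul_add_dst_reference`, the flat row-major buffers, and the `stride` parameter are assumptions made for this sketch and are not part of the PTO library. The source does not say whether the hardware fuses the multiply and add into a single rounding step, so the sketch simply uses ordinary `float` arithmetic.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference model; NOT a PTO API.
// Tiles are modeled as flat row-major buffers with `stride` elements per row.
// Only elements inside the valid region (valid_rows x valid_cols) are updated.
void mul_add_dst_reference(std::vector<float> &dst,
                           const std::vector<float> &src0,
                           const std::vector<float> &src1,
                           std::size_t stride,
                           std::size_t valid_rows,
                           std::size_t valid_cols) {
    assert(valid_cols <= stride);
    assert(dst.size() >= valid_rows * stride);
    assert(src0.size() >= valid_rows * stride);
    assert(src1.size() >= valid_rows * stride);

    for (std::size_t i = 0; i < valid_rows; ++i) {
        for (std::size_t j = 0; j < valid_cols; ++j) {
            const std::size_t idx = i * stride + j;
            // dst(i, j) = src0(i, j) * src1(i, j) + dst(i, j)
            dst[idx] = src0[idx] * src1[idx] + dst[idx];
        }
    }
}
```

Elements outside the valid region keep their previous values in this model, which matches the statement that the operation is defined only for elements in the valid region.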

## Assembly Syntax

Synchronous form:

    %dst = tmuladddst %src0, %src1 : !pto.tile<...>

### AS Level 1 (SSA)

    %dst = pto.tmuladddst %src0, %src1 : (!pto.tile<...>, !pto.tile<...>) -> !pto.tile<...>

### AS Level 2 (DPS)

    pto.tmuladddst ins(%src0, %src1 : !pto.tile_buf<...>, !pto.tile_buf<...>) outs(%dst : !pto.tile_buf<...>)

## C++ Intrinsic

Declared in `include/pto/common/pto_instr.hpp`:

    template <typename TileDataDst, typename TileDataSrc0, typename TileDataSrc1, typename... WaitEvents>
    PTO_INST RecordEvent TMULADDDST(TileDataDst &dst, TileDataSrc0 &src0, TileDataSrc1 &src1, WaitEvents &...events);

## Constraints

  • Implementation checks:
    • `TileData::DType` must be one of: `float`, `half`.
    • Tile layout must be row-major (`TileData::isRowMajor`).
  • Valid region:
    • The op uses `dst.GetValidRow()` / `dst.GetValidCol()` as the iteration domain; `src0` / `src1` are assumed to be compatible (not validated by explicit runtime checks in this op).
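To make the constraints above concrete, here is a minimal sketch using a mock tile type. `MockTile`, its `at` accessor, and `MockMulAddDst` are hypothetical stand-ins invented for this illustration; only the member names `DType`, `isRowMajor`, `GetValidRow()`, and `GetValidCol()` come from the text above. The mock accepts `float` only, because standard C++ has no portable `half` type. It shows the two compile-time checks and the fact that the loop bounds come from `dst` alone.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for a PTO tile; NOT the real pto::Tile.
template <typename T, std::size_t Rows, std::size_t Cols, bool RowMajor = true>
struct MockTile {
    using DType = T;
    static constexpr bool isRowMajor = RowMajor;

    std::array<T, Rows * Cols> data{};
    std::size_t validRow = Rows;  // valid region defaults to the full tile
    std::size_t validCol = Cols;

    std::size_t GetValidRow() const { return validRow; }
    std::size_t GetValidCol() const { return validCol; }

    // Bounds assert is a safety net of this mock only.
    T &at(std::size_t i, std::size_t j) {
        assert(i < Rows && j < Cols);
        return data[i * Cols + j];
    }
    const T &at(std::size_t i, std::size_t j) const {
        assert(i < Rows && j < Cols);
        return data[i * Cols + j];
    }
};

// Hypothetical model of the checks and iteration domain described above.
template <typename Dst, typename Src0, typename Src1>
void MockMulAddDst(Dst &dst, const Src0 &src0, const Src1 &src1) {
    static_assert(std::is_same<typename Dst::DType, float>::value &&
                      std::is_same<typename Src0::DType, float>::value &&
                      std::is_same<typename Src1::DType, float>::value,
                  "this mock models float tiles only");
    static_assert(Dst::isRowMajor && Src0::isRowMajor && Src1::isRowMajor,
                  "tile layout must be row-major");

    // The iteration domain is taken from dst only; the valid regions of
    // src0 and src1 are not consulted, so the caller must keep them compatible.
    for (std::size_t i = 0; i < dst.GetValidRow(); ++i) {
        for (std::size_t j = 0; j < dst.GetValidCol(); ++j) {
            dst.at(i, j) = src0.at(i, j) * src1.at(i, j) + dst.at(i, j);
        }
    }
}
```

Shrinking `dst.validRow` or `dst.validCol` in this mock narrows the updated region, regardless of what the source tiles report as their own valid region.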

## Examples

    #include <pto/pto-inst.hpp>

    using namespace pto;

    void example() {
        using TileT = Tile<TileType::Vec, float, 16, 16>;
        TileT a, b, out;
        // out is both an input and the output: out = a * b + out.
        TMULADDDST(out, a, b);
    }
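Because the destination is read as well as written, repeated calls accumulate a running sum of products, so the destination needs a defined starting value. The sketch below models that pattern on plain host buffers. `mul_add_into` and `accumulate_products` are hypothetical helper names for this illustration and do not call the PTO library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side stand-in for one multiply-add-into-destination step.
void mul_add_into(std::vector<float> &acc,
                  const std::vector<float> &a,
                  const std::vector<float> &b) {
    assert(a.size() == acc.size() && b.size() == acc.size());
    for (std::size_t k = 0; k < acc.size(); ++k) {
        acc[k] = a[k] * b[k] + acc[k];
    }
}

// Accumulates the elementwise sum of as[s] * bs[s] over all pairs s.
std::vector<float> accumulate_products(const std::vector<std::vector<float>> &as,
                                       const std::vector<std::vector<float>> &bs,
                                       std::size_t n) {
    assert(as.size() == bs.size());
    // The accumulator starts at zero because each step reads it as the addend.
    std::vector<float> acc(n, 0.0f);
    for (std::size_t s = 0; s < as.size(); ++s) {
        mul_add_into(acc, as[s], bs[s]);
    }
    return acc;
}
```

With the real intrinsic, the same idea would mean loading or zero-filling the destination tile before the first `TMULADDDST` call; how that is done depends on the surrounding kernel and is not covered by this document.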

## ASM Form Examples

### Auto Mode

    # Auto mode: compiler/runtime-managed placement and scheduling.
    %dst = pto.tmuladddst %src0, %src1 : (!pto.tile<...>, !pto.tile<...>) -> !pto.tile<...>

### Manual Mode

    # Manual mode: resources must be bound explicitly before issuing the instruction.
    # Optional for tile operands:
    # pto.tassign %arg0, @tile(0x1000)
    # pto.tassign %arg1, @tile(0x2000)
    %dst = pto.tmuladddst %src0, %src1 : (!pto.tile<...>, !pto.tile<...>) -> !pto.tile<...>

### PTO Assembly Form

    %dst = tmuladddst %src0, %src1 : !pto.tile<...>
    # AS Level 2 (DPS)
    pto.tmuladddst ins(%src0, %src1 : !pto.tile_buf<...>, !pto.tile_buf<...>) outs(%dst : !pto.tile_buf<...>)
