# TMULADDDST
[Free download link] pto-isa: Parallel Tile Operation (PTO) is a virtual instruction set architecture designed by Ascend CANN, focusing on tile-level operations. This repository offers high-performance, cross-platform tile operations across Ascend platforms. Project address: https://gitcode.com/cann/pto-isa
## Introduction

Elementwise multiply-add that accumulates into the destination: `dst = src0 * src1 + dst`.

## Math Interpretation

For each element `(i, j)` in the valid region:

$$ \mathrm{dst}_{i,j} = \mathrm{src0}_{i,j} \cdot \mathrm{src1}_{i,j} + \mathrm{dst}_{i,j} $$
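The formula above can be sketched as a scalar host-side reference, which is handy for checking results element by element. This is a minimal sketch, not the PTO implementation: `RefTile` and `RefMulAddDst` are hypothetical names introduced here for illustration, and the sketch makes no claim about rounding or fusion behavior on hardware.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side tile model (NOT the PTO Tile type): row-major
// storage with a valid region that may be smaller than the allocated shape.
struct RefTile {
    std::size_t rows, cols;            // allocated shape
    std::size_t validRows, validCols;  // valid region
    std::vector<float> data;           // row-major, rows * cols elements

    RefTile(std::size_t r, std::size_t c, std::size_t vr, std::size_t vc, float fill)
        : rows(r), cols(c), validRows(vr), validCols(vc), data(r * c, fill) {
        assert(vr <= r && vc <= c);
    }

    float &at(std::size_t i, std::size_t j) { return data[i * cols + j]; }
    float at(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Scalar reference: dst(i, j) = src0(i, j) * src1(i, j) + dst(i, j) for every
// (i, j) in dst's valid region. Elements outside that region are not touched.
void RefMulAddDst(RefTile &dst, const RefTile &src0, const RefTile &src1) {
    // Assumption of this sketch: both sources must cover dst's valid region.
    assert(src0.rows >= dst.validRows && src0.cols >= dst.validCols);
    assert(src1.rows >= dst.validRows && src1.cols >= dst.validCols);

    for (std::size_t i = 0; i < dst.validRows; ++i) {
        for (std::size_t j = 0; j < dst.validCols; ++j) {
            dst.at(i, j) = src0.at(i, j) * src1.at(i, j) + dst.at(i, j);
        }
    }
}
```

Because `dst` is both an input and the output, calling the reference repeatedly accumulates one more product into each valid element per call.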
## Assembly Syntax

Synchronous form:

```text
%dst = tmuladddst %src0, %src1 : !pto.tile<...>
```

### AS Level 1 (SSA)

```text
%dst = pto.tmuladddst %src0, %src1 : (!pto.tile<...>, !pto.tile<...>) -> !pto.tile<...>
```

### AS Level 2 (DPS)

```text
pto.tmuladddst ins(%src0, %src1 : !pto.tile_buf<...>, !pto.tile_buf<...>) outs(%dst : !pto.tile_buf<...>)
```

## C++ Intrinsic

Declared in `include/pto/common/pto_instr.hpp`:

```cpp
template <typename TileDataDst, typename TileDataSrc0, typename TileDataSrc1, typename... WaitEvents>
PTO_INST RecordEvent TMULADDDST(TileDataDst &dst, TileDataSrc0 &src0, TileDataSrc1 &src1, WaitEvents &...events);
```

## Constraints
- Implementation checks:
  - `TileData::DType` must be one of: `float`, `half`.
  - Tile layout must be row-major (`TileData::isRowMajor`).
- Valid region:
  - The op iterates over `dst.GetValidRow()` / `dst.GetValidCol()` as its iteration domain; `src0` and `src1` are assumed to be compatible (this op performs no explicit runtime check).
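Since the op does not validate the source tiles at runtime, a caller may want to guard the call itself. The following is a minimal sketch under stated assumptions: `SketchTile`, `HalfStandIn`, `SatisfiesStaticConstraints`, and `SourcesCoverDst` are hypothetical names that only mimic the members mentioned above (`DType`, `isRowMajor`, `GetValidRow()`, `GetValidCol()`). It also reads "compatible" as "each source covers the valid region of `dst`", which is one plausible interpretation rather than a documented rule.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins; real code would use the types from the PTO headers.
struct HalfStandIn {};  // placeholder for the 16-bit `half` element type

template <typename T, int Rows, int Cols, bool RowMajor = true>
struct SketchTile {
    using DType = T;
    static constexpr bool isRowMajor = RowMajor;
    int validRow = Rows;
    int validCol = Cols;
    int GetValidRow() const { return validRow; }
    int GetValidCol() const { return validCol; }
};

// Compile-time constraints as listed above: element type is float or half,
// and the tile layout is row-major.
template <typename TileData>
constexpr bool SatisfiesStaticConstraints() {
    return (std::is_same<typename TileData::DType, float>::value ||
            std::is_same<typename TileData::DType, HalfStandIn>::value) &&
           TileData::isRowMajor;
}

// Caller-side guard (an assumption of this sketch): both sources must cover
// the valid region of dst, because the op iterates over dst's valid region.
template <typename Dst, typename Src0, typename Src1>
bool SourcesCoverDst(const Dst &dst, const Src0 &src0, const Src1 &src1) {
    return src0.GetValidRow() >= dst.GetValidRow() &&
           src0.GetValidCol() >= dst.GetValidCol() &&
           src1.GetValidRow() >= dst.GetValidRow() &&
           src1.GetValidCol() >= dst.GetValidCol();
}

// Example use: validate before issuing the op (the op call itself is omitted).
inline void GuardExample() {
    using TileT = SketchTile<float, 16, 16>;
    static_assert(SatisfiesStaticConstraints<TileT>(), "unsupported tile type");
    TileT dst, a, b;
    dst.validRow = 8;
    dst.validCol = 8;
    assert(SourcesCoverDst(dst, a, b));
}
```

Keeping the static checks separate from the runtime guard mirrors the split in the list above: type and layout are known at compile time, while valid-region sizes may only be known when the tile is populated.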
## Examples

```cpp
#include <pto/pto-inst.hpp>

using namespace pto;

void example() {
  using TileT = Tile<TileType::Vec, float, 16, 16>;
  TileT a, b, out;
  TMULADDDST(out, a, b);
}
```

## ASM Form Examples
### Auto Mode

```text
# Auto mode: compiler/runtime-managed placement and scheduling.
%dst = pto.tmuladddst %src0, %src1 : (!pto.tile<...>, !pto.tile<...>) -> !pto.tile<...>
```

### Manual Mode

```text
# Manual mode: resources must be bound explicitly before issuing the instruction.
# Optional for tile operands:
# pto.tassign %arg0, @tile(0x1000)
# pto.tassign %arg1, @tile(0x2000)
%dst = pto.tmuladddst %src0, %src1 : (!pto.tile<...>, !pto.tile<...>) -> !pto.tile<...>
```

### PTO Assembly Form

```text
%dst = tmuladddst %src0, %src1 : !pto.tile<...>
# AS Level 2 (DPS)
pto.tmuladddst ins(%src0, %src1 : !pto.tile_buf<...>, !pto.tile_buf<...>) outs(%dst : !pto.tile_buf<...>)
```