TRTorch v0.2.0

Released 26 Feb 2021 by @narendasan

Support for PyTorch 1.7.x, Multi Device APIs, Runtime Library, New Converters, Bug Fixes

This is the second beta release of TRTorch, targeting PyTorch 1.7.x, CUDA 11.0 (on x86_64), TensorRT 7.2 and cuDNN 8. TRTorch 0.2.0 for aarch64 targets JetPack 4.5.x. This release updates the to_backend integration to reflect changes in the PyTorch API and adds support for Python 3.9.

A new API has been added to disable TF32, the reduced-precision math mode introduced on Ampere GPUs, as TF32 is now the default format used for FP32 computation in TRTorch. APIs have been solidified for runtime configuration of the active CUDA device, letting users choose which device a program is deserialized on. This API will continue to change as we further define the serialization format and work with the PyTorch team to make runtime device configuration more ergonomic; you can follow this work here: #311. This release also formalizes DLA support in TRTorch, adding APIs and capabilities to target DLA on Jetson and DRIVE platforms.

v0.2.0 also includes a new shared library, libtrtorchrt.so, which contains only the runtime components of TRTorch and is suitable for use in situations where device footprint is extremely limited. libtrtorchrt.so can be linked into C++ applications or loaded into Python scripts; it registers all necessary TRTorch runtime components with the torch runtime, allowing users to run TRTorch-compiled programs without the full compiler.
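For example, a runtime-only deployment in Python might look like the following minimal sketch. The library path and module filename are placeholders, and the module is assumed to have been compiled with the full TRTorch compiler and saved ahead of time:

```python
import torch

# Loading the runtime library registers TRTorch's execution ops with the
# torch runtime; the full trtorch compiler package is not required.
torch.ops.load_library("libtrtorchrt.so")  # path is a placeholder

# "trt_module.ts" is a placeholder for a program previously compiled by
# TRTorch and saved with torch.jit.save.
trt_module = torch.jit.load("trt_module.ts")
out = trt_module(torch.randn(1, 3, 224, 224).cuda())
```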

Dependencies:

- Bazel 4.0.0
- Libtorch 1.7.1 (on x86_64), 1.7.0 (on aarch64)
- CUDA 11.0 (by default, newer CUDA 11 supported with compatible PyTorch build)
- cuDNN 8.0.5
- TensorRT 7.2.2

v0.2.0 (2021-02-25)

  • refactor!: Update bazel and trt versions (0618b6b)

Bug Fixes

  • //core/conversion/conversionctx: Fix memory leak in conversion (6f83b41)
  • //core/lowering: fix debug message for bn dim check removal pass (86bb5b7)
  • //py: Fix bounds for enum macros (6b942e5)
  • aten::expand: Fix compiler warning for unused out ITensor (5b0f584)
  • aten::expand: Fix compiler warnings in the expand converter (51b09d4)
  • aten::flatten: Fixing flatten converter to handle dynamic batch (00f2d78)
  • aten::max_pool2d: Suppressing error due to not filling in stride in (ed3c185)
  • aten::zeros: verify zeros produces a tensor correctly (00d2d0c)
  • remove_to: bug in remove_to.cpp, replace outputs()[0] with inputs()[0] (6c5118a)
  • setup.py: Broaden the supported pytorch versions to handle jetson (e94a040)
  • test_op_aliasing: Fix the renamed op (91c3c80)
  • tests: Fix broken elementwise tests (22ed944)

Features

  • support true_divide, floor_divide, max, min, rsub (a35fbf1); see the sketch after this list
  • //.github: Moving to python directly (ece114c)
  • //core/conversion: Adding a check to detect programs that will (a3d4144)
  • //core/lowering: Adding a new pass to handle new dim checks for (3d14cda)
  • //cpp/api/lib: New runtime only library (6644a9e)
  • //notebooks: Update notebooks container for 0.1.0 (a5851ff)
  • //py: [to_backend] adding device specification support for (6eeba1c), closes #286
  • aten::leaky_relu_: Adding alias for inplace leaky relu (bc53411)
  • aten::softmax: Adding support for any neg index (abc29a2)
  • aten::squeeze|aten::unsqueeze: adding BUILD files for new squeeze (9e0a1d7)
  • aten::sum: Allow for negative indices less than -1 (769bbc9)
  • aten::topk: Add a debug message noting that sorted is always true (81f1e9d)
  • aten::topk: Adding BUILD files for topk op (22e6a6b)
  • disable_tf32: Add a new API to disable TF32 (536983b)
  • interpolate: Adding support for .vec variants and overhauling test (0cda1cc)
  • interpolate: Addressing the linear, scale factor, align corners edge case (92e3818)
  • supportedops: Application to dump a list of supported operators (872d9a3)
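To illustrate the new elementwise converters (true_divide, floor_divide, max, min, rsub, referenced at the top of this list), a scripted module along these lines should now compile. This is a sketch assuming the v0.2.0 Python compile-spec format, with illustrative shapes:

```python
import torch
import trtorch

class Eltwise(torch.nn.Module):
    def forward(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        a = torch.true_divide(x, y)   # new true_divide support
        b = torch.floor_divide(x, y)  # aten::floor_divide
        c = torch.max(x, y)           # aten::max.other
        d = torch.min(x, y)           # aten::min.other
        e = 1.0 - x                   # lowers to aten::rsub.Scalar in TorchScript
        return a + b + c + d + e

mod = torch.jit.script(Eltwise()).eval().cuda()
# One shape entry per input; shapes here are illustrative.
trt_mod = trtorch.compile(mod, {"input_shapes": [[1, 3, 32, 32], [1, 3, 32, 32]]})
out = trt_mod(torch.randn(1, 3, 32, 32).cuda(), torch.randn(1, 3, 32, 32).cuda())
```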

BREAKING CHANGES

  • Version of bazel has been bumped to 4.0.0
  • Version of TensorRT has been bumped to 7.2.2.3


  • The device API has changed. Device settings are now configured via a device struct that encapsulates the selected device type and ids, as sketched below.
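A sketch of the new device struct and the TF32 toggle in the Python compile spec follows; the model file is a placeholder, and the key names reflect the v0.2.0 API as best understood, so verify them against your install:

```python
import torch
import trtorch

model = torch.jit.load("model.ts").eval().cuda()  # placeholder module

compile_spec = {
    "input_shapes": [[1, 3, 224, 224]],
    # Device settings are now grouped into a single device struct/dict.
    "device": {
        "device_type": trtorch.DeviceType.GPU,  # or trtorch.DeviceType.DLA
        "gpu_id": 0,
        "dla_core": 0,                # used when targeting DLA
        "allow_gpu_fallback": False,  # let unsupported DLA layers run on the GPU
    },
    # New in v0.2.0: opt out of TF32 for FP32 layers on Ampere.
    "disable_tf32": True,
}

trt_module = trtorch.compile(model, compile_spec)
```

On the runtime side, the multi-device work described above is exposed as a set_device call (trtorch.set_device(gpu_id) in Python) for selecting the active CUDA device before deserialization, though as noted that API is still evolving.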

Supported Operators in TRTorch v0.2.0

Operators Currently Supported Through Converters

  • aten::_convolution(Tensor input, Tensor weight, Tensor? bias, int[] stride, int[] padding, int[] dilation, bool transposed, int[] output_padding, int groups, bool benchmark, bool deterministic, bool cudnn_enabled, bool allow_tf32) -> (Tensor)
  • aten::_convolution.deprecated(Tensor input, Tensor weight, Tensor? bias, int[] stride, int[] padding, int[] dilation, bool transposed, int[] output_padding, int groups, bool benchmark, bool deterministic, bool cudnn_enabled) -> (Tensor)
  • aten::abs(Tensor self) -> (Tensor)
  • aten::acos(Tensor self) -> (Tensor)
  • aten::acosh(Tensor self) -> (Tensor)
  • aten::adaptive_avg_pool2d(Tensor self, int[2] output_size) -> (Tensor)
  • aten::add.Scalar(Tensor self, Scalar other, Scalar alpha=1) -> (Tensor)
  • aten::add.Tensor(Tensor self, Tensor other, Scalar alpha=1) -> (Tensor)
  • aten::add_.Tensor(Tensor(a!) self, Tensor other, *, Scalar alpha=1) -> (Tensor(a!))
  • aten::asin(Tensor self) -> (Tensor)
  • aten::asinh(Tensor self) -> (Tensor)
  • aten::atan(Tensor self) -> (Tensor)
  • aten::atanh(Tensor self) -> (Tensor)
  • aten::avg_pool1d(Tensor self, int[1] kernel_size, int[1] stride=[], int[1] padding=[0], bool ceil_mode=False, bool count_include_pad=True) -> (Tensor)
  • aten::avg_pool2d(Tensor self, int[2] kernel_size, int[2] stride=[], int[2] padding=[0, 0], bool ceil_mode=False, bool count_include_pad=True, int? divisor_override=None) -> (Tensor)
  • aten::avg_pool3d(Tensor self, int[3] kernel_size, int[3] stride=[], int[3] padding=[], bool ceil_mode=False, bool count_include_pad=True, int? divisor_override=None) -> (Tensor)
  • aten::batch_norm(Tensor input, Tensor? gamma, Tensor? beta, Tensor? mean, Tensor? var, bool training, float momentum, float eps, bool cudnn_enabled) -> (Tensor)
  • aten::cat(Tensor[] tensors, int dim=0) -> (Tensor)
  • aten::ceil(Tensor self) -> (Tensor)
  • aten::clamp(Tensor self, Scalar? min=None, Scalar? max=None) -> (Tensor)
  • aten::cos(Tensor self) -> (Tensor)
  • aten::cosh(Tensor self) -> (Tensor)
  • aten::div.Scalar(Tensor self, Scalar other) -> (Tensor)
  • aten::div.Tensor(Tensor self, Tensor other) -> (Tensor)
  • aten::div_.Scalar(Tensor(a!) self, Scalar other) -> (Tensor(a!))
  • aten::div_.Tensor(Tensor(a!) self, Tensor other) -> (Tensor(a!))
  • aten::elu(Tensor self, Scalar alpha=1, Scalar scale=1, Scalar input_scale=1) -> (Tensor)
  • aten::embedding(Tensor weight, Tensor indices, int padding_idx=-1, bool scale_grad_by_freq=False, bool sparse=False) -> (Tensor)
  • aten::eq.Scalar(Tensor self, Scalar other) -> (Tensor)
  • aten::eq.Tensor(Tensor self, Tensor other) -> (Tensor)
  • aten::erf(Tensor self) -> (Tensor)
  • aten::exp(Tensor self) -> (Tensor)
  • aten::expand(Tensor(a) self, int[] size, *, bool implicit=False) -> (Tensor(a))
  • aten::expand_as(Tensor(a) self, Tensor other) -> (Tensor(a))
  • aten::flatten.using_ints(Tensor self, int start_dim=0, int end_dim=-1) -> (Tensor)
  • aten::floor(Tensor self) -> (Tensor)
  • aten::floor_divide(Tensor self, Tensor other) -> (Tensor)
  • aten::floor_divide.Scalar(Tensor self, Scalar other) -> (Tensor)
  • aten::ge.Scalar(Tensor self, Scalar other) -> (Tensor)
  • aten::ge.Tensor(Tensor self, Tensor other) -> (Tensor)
  • aten::gt.Scalar(Tensor self, Scalar other) -> (Tensor)
  • aten::gt.Tensor(Tensor self, Tensor other) -> (Tensor)
  • aten::hardtanh(Tensor self, Scalar min_val=-1, Scalar max_val=1) -> (Tensor)
  • aten::hardtanh_(Tensor(a!) self, Scalar min_val=-1, Scalar max_val=1) -> (Tensor(a!))
  • aten::le.Scalar(Tensor self, Scalar other) -> (Tensor)
  • aten::le.Tensor(Tensor self, Tensor other) -> (Tensor)
  • aten::leaky_relu(Tensor self, Scalar negative_slope=0.01) -> (Tensor)
  • aten::leaky_relu_(Tensor(a!) self, Scalar negative_slope=0.01) -> (Tensor(a!))
  • aten::linear(Tensor input, Tensor weight, Tensor? bias=None) -> (Tensor)
  • aten::log(Tensor self) -> (Tensor)
  • aten::lstm_cell(Tensor input, Tensor[] hx, Tensor w_ih, Tensor w_hh, Tensor? b_ih=None, Tensor? b_hh=None) -> (Tensor, Tensor)
  • aten::lt.Scalar(Tensor self, Scalar other) -> (Tensor)
  • aten::lt.Tensor(Tensor self, Tensor other) -> (Tensor)
  • aten::matmul(Tensor self, Tensor other) -> (Tensor)
  • aten::max(Tensor self) -> (Tensor)
  • aten::max.other(Tensor self, Tensor other) -> (Tensor)
  • aten::max_pool1d(Tensor self, int[1] kernel_size, int[1] stride=[], int[1] padding=[], int[1] dilation=[], bool ceil_mode=False) -> (Tensor)
  • aten::max_pool2d(Tensor self, int[2] kernel_size, int[2] stride=[], int[2] padding=[0, 0], int[2] dilation=[1, 1], bool ceil_mode=False) -> (Tensor)
  • aten::max_pool3d(Tensor self, int[3] kernel_size, int[3] stride=[], int[3] padding=[], int[3] dilation=[], bool ceil_mode=False) -> (Tensor)
  • aten::mean(Tensor self, *, int? dtype=None) -> (Tensor)
  • aten::mean.dim(Tensor self, int[] dim, bool keepdim=False, *, int? dtype=None) -> (Tensor)
  • aten::min(Tensor self) -> (Tensor)
  • aten::min.other(Tensor self, Tensor other) -> (Tensor)
  • aten::mul.Scalar(Tensor self, Scalar other) -> (Tensor)
  • aten::mul.Tensor(Tensor self, Tensor other) -> (Tensor)
  • aten::mul_.Tensor(Tensor(a!) self, Tensor other) -> (Tensor(a!))
  • aten::narrow(Tensor(a) self, int dim, int start, int length) -> (Tensor(a))
  • aten::narrow.Tensor(Tensor(a) self, int dim, Tensor start, int length) -> (Tensor(a))
  • aten::ne.Scalar(Tensor self, Scalar other) -> (Tensor)
  • aten::ne.Tensor(Tensor self, Tensor other) -> (Tensor)
  • aten::neg(Tensor self) -> (Tensor)
  • aten::permute(Tensor(a) self, int[] dims) -> (Tensor(a))
  • aten::pow.Tensor_Scalar(Tensor self, Scalar exponent) -> (Tensor)
  • aten::pow.Tensor_Tensor(Tensor self, Tensor exponent) -> (Tensor)
  • aten::prelu(Tensor self, Tensor weight) -> (Tensor)
  • aten::prod(Tensor self, *, int? dtype=None) -> (Tensor)
  • aten::prod.dim_int(Tensor self, int dim, bool keepdim=False, *, int? dtype=None) -> (Tensor)
  • aten::reciprocal(Tensor self) -> (Tensor)
  • aten::relu(Tensor input) -> (Tensor)
  • aten::relu_(Tensor(a!) self) -> (Tensor(a!))
  • aten::repeat(Tensor self, int[] repeats) -> (Tensor)
  • aten::reshape(Tensor self, int[] shape) -> (Tensor)
  • aten::rsub.Scalar(Tensor self, Scalar other, Scalar alpha=1) -> (Tensor)
  • aten::rsub.Tensor(Tensor self, Tensor other, Scalar alpha=1) -> (Tensor)
  • aten::select.int(Tensor(a) self, int dim, int index) -> (Tensor(a))
  • aten::sigmoid(Tensor input) -> (Tensor)
  • aten::sigmoid_(Tensor(a!) self) -> (Tensor(a!))
  • aten::sin(Tensor self) -> (Tensor)
  • aten::sinh(Tensor self) -> (Tensor)
  • aten::slice.Tensor(Tensor(a) self, int dim=0, int start=0, int end=9223372036854775807, int step=1) -> (Tensor(a))
  • aten::softmax.int(Tensor self, int dim, int? dtype=None) -> (Tensor)
  • aten::split(Tensor self, int[] split_sizes, int dim=0) -> (Tensor[])
  • aten::split.Tensor(Tensor(a) self, int split_size, int dim=0) -> (Tensor[])
  • aten::split_with_sizes(Tensor(a) self, int[] split_sizes, int dim=0) -> (Tensor[])
  • aten::sqrt(Tensor self) -> (Tensor)
  • aten::squeeze.dim(Tensor(a) self, int dim) -> (Tensor(a))
  • aten::stack(Tensor[] tensors, int dim=0) -> (Tensor)
  • aten::sub.Tensor(Tensor self, Tensor other, Scalar alpha=1) -> (Tensor)
  • aten::sub_.Tensor(Tensor(a!) self, Tensor other, *, Scalar alpha=1) -> (Tensor(a!))
  • aten::sum(Tensor self, *, int? dtype=None) -> (Tensor)
  • aten::sum.dim_IntList(Tensor self, int[1] dim, bool keepdim=False, *, int? dtype=None) -> (Tensor)
  • aten::tan(Tensor self) -> (Tensor)
  • aten::tanh(Tensor input) -> (Tensor)
  • aten::tanh_(Tensor(a!) self) -> (Tensor(a!))
  • aten::topk(Tensor self, int k, int dim=-1, bool largest=True, bool sorted=True) -> (Tensor values, Tensor indices)
  • aten::transpose.int(Tensor(a) self, int dim0, int dim1) -> (Tensor(a))
  • aten::unsqueeze(Tensor(a) self, int dim) -> (Tensor(a))
  • aten::upsample_bilinear2d(Tensor self, int[2] output_size, bool align_corners, float? scales_h=None, float? scales_w=None) -> (Tensor)
  • aten::upsample_bilinear2d.vec(Tensor input, int[]? output_size, bool align_corners, float[]? scale_factors) -> (Tensor)
  • aten::upsample_linear1d(Tensor self, int[1] output_size, bool align_corners, float? scales=None) -> (Tensor)
  • aten::upsample_linear1d.vec(Tensor input, int[]? output_size, bool align_corners, float[]? scale_factors) -> (Tensor)
  • aten::upsample_nearest1d(Tensor self, int[1] output_size, float? scales=None) -> (Tensor)
  • aten::upsample_nearest1d.vec(Tensor input, int[]? output_size, float[]? scale_factors) -> (Tensor)
  • aten::upsample_nearest2d(Tensor self, int[2] output_size, float? scales_h=None, float? scales_w=None) -> (Tensor)
  • aten::upsample_nearest2d.vec(Tensor input, int[]? output_size, float[]? scale_factors) -> (Tensor)
  • aten::upsample_nearest3d(Tensor self, int[3] output_size, float? scales_d=None, float? scales_h=None, float? scales_w=None) -> (Tensor)
  • aten::upsample_nearest3d.vec(Tensor input, int[]? output_size, float[]? scale_factors) -> (Tensor)
  • aten::upsample_trilinear3d(Tensor self, int[3] output_size, bool align_corners, float? scales_d=None, float? scales_h=None, float? scales_w=None) -> (Tensor)
  • aten::upsample_trilinear3d.vec(Tensor input, int[]? output_size, bool align_corners, float[]? scale_factors) -> (Tensor)
  • aten::view(Tensor(a) self, int[] size) -> (Tensor(a))
  • trt::const(Tensor self) -> (Tensor)

Operators Currently Supported Through Evaluators

  • aten::Bool.float(float b) -> (bool)
  • aten::Bool.int(int a) -> (bool)
  • aten::Float.Scalar(Scalar a) -> float
  • aten::Float.bool(bool a) -> float
  • aten::Float.int(int a) -> float
  • aten::__and__(int a, int b) -> (bool)
  • aten::__getitem__.t(t[] list, int idx) -> (t(*))
  • aten::__is__(t1 self, t2 obj) -> bool
  • aten::__isnot__(t1 self, t2 obj) -> bool
  • aten::__not__(bool self) -> bool
  • aten::__or__(int a, int b) -> (bool)
  • aten::__round_to_zero_floordiv(int a, int b) -> (int)
  • aten::__xor__(int a, int b) -> (bool)
  • aten::add.float(float a, float b) -> (float)
  • aten::add.int(int a, int b) -> (int)
  • aten::add_.t(t self, t[] b) -> (t[])
  • aten::append.t(t self, t(c -> *) el) -> (t)
  • aten::dim(Tensor self) -> int
  • aten::div.float(float a, float b) -> (float)
  • aten::div.int(int a, int b) -> (float)
  • aten::eq.bool(bool a, bool b) -> (bool)
  • aten::eq.float(float a, float b) -> (bool)
  • aten::eq.float_int(float a, int b) -> (bool)
  • aten::eq.int(int a, int b) -> (bool)
  • aten::eq.int_float(int a, float b) -> (bool)
  • aten::floor.float(float a) -> (int)
  • aten::floordiv.float(float a, float b) -> (float)
  • aten::floordiv.int(int a, int b) -> (int)
  • aten::ge.bool(bool a, bool b) -> (bool)
  • aten::ge.float(float a, float b) -> (bool)
  • aten::ge.float_int(float a, int b) -> (bool)
  • aten::ge.int(int a, int b) -> (bool)
  • aten::ge.int_float(int a, float b) -> (bool)
  • aten::gt.bool(bool a, bool b) -> (bool)
  • aten::gt.float(float a, float b) -> (bool)
  • aten::gt.float_int(float a, int b) -> (bool)
  • aten::gt.int(int a, int b) -> (bool)
  • aten::gt.int_float(int a, float b) -> (bool)
  • aten::le.bool(bool a, bool b) -> (bool)
  • aten::le.float(float a, float b) -> (bool)
  • aten::le.float_int(float a, int b) -> (bool)
  • aten::le.int(int a, int b) -> (bool)
  • aten::le.int_float(int a, float b) -> (bool)
  • aten::len.t(t[] a) -> (int)
  • aten::lt.bool(bool a, bool b) -> (bool)
  • aten::lt.float(float a, float b) -> (bool)
  • aten::lt.float_int(float a, int b) -> (bool)
  • aten::lt.int(int a, int b) -> (bool)
  • aten::lt.int_float(int a, float b) -> (bool)
  • aten::mul.float(float a, float b) -> (float)
  • aten::mul.int(int a, int b) -> (int)
  • aten::ne.bool(bool a, bool b) -> (bool)
  • aten::ne.float(float a, float b) -> (bool)
  • aten::ne.float_int(float a, int b) -> (bool)
  • aten::ne.int(int a, int b) -> (bool)
  • aten::ne.int_float(int a, float b) -> (bool)
  • aten::neg.int(int a) -> (int)
  • aten::numel(Tensor self) -> int
  • aten::size(Tensor self) -> (int[])
  • aten::size.int(Tensor self, int dim) -> (int)
  • aten::slice.t(t[] l, int start, int end=9223372036854775807, int step=1) -> (t[])
  • aten::sub.float(float a, float b) -> (float)
  • aten::sub.int(int a, int b) -> (int)
  • prim::max.bool(bool a, bool b) -> (bool)
  • prim::max.float(float a, float b) -> (float)
  • prim::max.float_int(float a, int b) -> (float)
  • prim::max.int(int a, int b) -> (int)
  • prim::max.int_float(int a, float b) -> (float)
  • prim::max.self_int(int[] self) -> (int)
  • prim::min.bool(bool a, bool b) -> (bool)
  • prim::min.float(float a, float b) -> (float)
  • prim::min.float_int(float a, int b) -> (float)
  • prim::min.int(int a, int b) -> (int)
  • prim::min.int_float(int a, float b) -> (float)
  • prim::min.self_int(int[] self) -> (int)
  • prim::shape(Tensor a) -> (int[])
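Whether a given TorchScript method uses only the operators listed above can also be checked programmatically. A minimal sketch, assuming the trtorch.check_method_op_support API (the model path is a placeholder):

```python
import torch
import trtorch

mod = torch.jit.load("model.ts")  # placeholder module
# Returns True if every op in the method has a converter or evaluator.
print(trtorch.check_method_op_support(mod, "forward"))
```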