
3.0.0.rc1

Pre-release
@nudles nudles released this 08 Apr 17:24
f3f6fe0

This release includes the following changes:

  • Code quality is improved by adding lint checks to CI and an automatic code formatter.
    For linting, cpplint and pylint are used and configured to comply with the
    Google coding styles; details are in tool/linting/.
    Similarly, the formatting tools clang-format and yapf, configured with the Google coding styles,
    are recommended for developers to clean up code before submitting changes;
    details are in tool/code-format/. LGTM is enabled on GitHub for
    code quality checks; a license check is also enabled.

  • New Tensor APIs are added for naming consistency and feature enhancement:

    • size(), mem_size(), get_value(), to_proto(), l1(), l2(): added for naming consistency
    • AsType(): convert the data type between float and int
    • ceil(): perform an element-wise ceiling of the input
    • concat(): concatenate two tensors
    • index selector: e.g. tensor1[:,:,1:,1:]
    • softmax(in, axis): perform softmax along an axis of a multi-dimensional tensor
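
To illustrate what softmax along an axis means, here is a minimal pure-Python sketch for a 2-D input stored as a list of lists. It does not use the SINGA Tensor API; the helper name `softmax_2d` is hypothetical and only shows the arithmetic, under the assumption that axis=1 normalizes each row and axis=0 normalizes each column.

```python
import math


def softmax_2d(matrix, axis):
    """Softmax over rows (axis=1) or columns (axis=0) of a list-of-lists matrix."""
    if axis not in (0, 1):
        raise ValueError("axis must be 0 or 1 for a 2-D input")
    # Work row by row; for axis=0, transpose first and transpose back at the end.
    rows = matrix if axis == 1 else [list(col) for col in zip(*matrix)]
    result = []
    for row in rows:
        peak = max(row)  # subtract the max for numerical stability
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result if axis == 1 else [list(col) for col in zip(*result)]


x = [[1.0, 2.0, 3.0], [1.0, 1.0, 1.0]]
by_row = softmax_2d(x, axis=1)  # each row sums to 1
by_col = softmax_2d(x, axis=0)  # each column sums to 1
```
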
  • 14 new operators are added into the autograd module: Gemm, GlobalAveragePool, ConstantOfShape,
    Dropout, ReduceSum, ReduceMean, Slice, Ceil, Split, Gather, Tile, NonZero, Cast, OneHot.
    Their unit tests are added as well.

  • 14 new operators are added to the sonnx module for both backend and frontend:
    Gemm,
    GlobalAveragePool,
    ConstantOfShape,
    Dropout,
    ReduceSum,
    ReduceMean,
    Slice,
    Ceil,
    Split,
    Gather,
    Tile,
    NonZero,
    Cast,
    OneHot.
    Their tests are added as well.

  • Some ONNX models are imported into SINGA, including
    Bert-squad,
    Arcface,
    FER+ Emotion,
    MobileNet,
    ResNet18,
    Tiny Yolov2,
    Vgg16, and Mnist.

  • Some operators now support multidirectional broadcasting,
    including Add, Sub, Mul, Div, Pow, PRelu, and Gemm.
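
As a rough illustration of multidirectional broadcasting, the sketch below computes the result shape of two operands using the usual NumPy/ONNX-style rule: shapes are aligned from the trailing dimension, and a dimension of size 1 stretches to match the other operand. The helper `broadcast_shape` is hypothetical and is not part of SINGA; it only models the shape logic.

```python
from itertools import zip_longest


def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or raise ValueError if incompatible."""
    out = []
    # Align from the trailing dimension; missing leading dimensions count as 1.
    for a, b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if a == b or b == 1:
            out.append(a)
        elif a == 1:
            out.append(b)
        else:
            raise ValueError(f"cannot broadcast {shape_a} with {shape_b}")
    return tuple(reversed(out))


vector_case = broadcast_shape((2, 3, 4, 5), (5,))
two_way_case = broadcast_shape((4, 1), (1, 5))
```

In the second call both operands are stretched, which is what distinguishes multidirectional from unidirectional broadcasting.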

  • Distributed training with communication optimization. DistOpt
    implements multiple optimization techniques, including gradient sparsification,
    chunk transmission, and gradient compression.
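
The notes do not describe how DistOpt selects which gradient entries to send. As one common form of gradient sparsification, the sketch below keeps only the k largest-magnitude entries and transmits them as (index, value) pairs. The top-k rule and the names `sparsify_top_k` and `densify` are assumptions for illustration, not the DistOpt implementation.

```python
def sparsify_top_k(gradient, k):
    """Keep the k largest-magnitude entries; return (indices, values)."""
    if not 0 < k <= len(gradient):
        raise ValueError("k must be between 1 and len(gradient)")
    # Rank positions by absolute value, largest first.
    order = sorted(range(len(gradient)), key=lambda i: abs(gradient[i]), reverse=True)
    kept = sorted(order[:k])
    return kept, [gradient[i] for i in kept]


def densify(indices, values, length):
    """Rebuild a dense gradient on the receiver; dropped entries become 0."""
    dense = [0.0] * length
    for i, v in zip(indices, values):
        dense[i] = v
    return dense


grad = [0.01, -0.9, 0.05, 0.4, -0.02, 0.0]
idx, vals = sparsify_top_k(grad, k=2)
restored = densify(idx, vals, len(grad))
```

Only two of the six entries cross the network here, at the cost of zeroing the small ones on the receiving side.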

  • Computational graph construction at the C++ level. Operations submitted to the Device are buffered.
    After their dependencies are analyzed, the computational graph is created and then further analyzed for
    speed and memory optimization. To enable this feature, use the Module API.
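
The buffer-then-analyze idea can be sketched in a few lines of Python. This is a toy model, not SINGA's C++ scheduler: the class `BufferedGraph`, its methods, and the block names are hypothetical, and only read-after-write dependencies are tracked.

```python
from collections import deque


class BufferedGraph:
    """Buffers submitted operations, then derives dependencies and an order."""

    def __init__(self):
        self.ops = []  # each entry: (name, blocks read, blocks written)

    def submit(self, name, reads, writes):
        # Nothing executes here; the operation is only recorded.
        self.ops.append((name, tuple(reads), tuple(writes)))

    def dependencies(self):
        """Map each op index to the earlier ops that produced its inputs."""
        last_writer, deps = {}, {}
        for i, (_, reads, writes) in enumerate(self.ops):
            deps[i] = {last_writer[b] for b in reads if b in last_writer}
            for b in writes:
                last_writer[b] = i
        return deps

    def schedule(self):
        """Topological order of op names (Kahn's algorithm)."""
        remaining = {i: set(d) for i, d in self.dependencies().items()}
        ready = deque(i for i, d in remaining.items() if not d)
        order = []
        while ready:
            i = ready.popleft()
            order.append(self.ops[i][0])
            for j, d in remaining.items():
                if i in d:
                    d.remove(i)
                    if not d:
                        ready.append(j)
        return order


g = BufferedGraph()
g.submit("matmul", reads=["x", "w"], writes=["h"])
g.submit("load_labels", reads=[], writes=["y"])
g.submit("relu", reads=["h"], writes=["a"])
g.submit("loss", reads=["a", "y"], writes=["l"])
deps = g.dependencies()
order = g.schedule()
```

Once the dependency map exists, independent operations (here "matmul" and "load_labels") are visible as candidates for reordering or overlap, which is the kind of analysis the buffering makes possible.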

  • New website based on Docusaurus. The documentation files are moved to a separate repo, singa-doc (https://github.com/apache/singa-doc).
    The static website files are stored at singa-site.

  • DNNL (Deep Neural Network Library), powered by Intel,
    is integrated into model/operations/[batchnorm|pooling|convolution];
    the change is transparent to end users. The current version is dnnl v1.1,
    which replaces the previous integration of mkl-dnn v0.18. The library can
    boost the performance of deep learning operations when executing on CPU. The dnnl dependency
    is installed through conda.

  • Some Tensor APIs are marked as deprecated because they can be replaced by broadcasting,
    which has better support for multi-dimensional operations. These APIs are
    add_column(), add_row(), div_column(), div_row(), mult_column(), and mult_row().

  • Conv and Pooling are enhanced to support fine-grained padding such as (2,3,2,3),
    the SAME_UPPER and SAME_LOWER pad modes, and shape checking.
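
For reference, the difference between SAME_UPPER and SAME_LOWER can be shown with a small calculation along one spatial dimension. The sketch follows the ONNX auto_pad definition (output size is ceil(input / stride), and any odd leftover pad goes at the end for SAME_UPPER or at the beginning for SAME_LOWER). It assumes dilation 1; the function `same_padding` is a hypothetical helper, not a SINGA API.

```python
import math


def same_padding(input_size, kernel_size, stride=1, mode="SAME_UPPER"):
    """Return (pad_begin, pad_end) for one spatial dimension, dilation 1."""
    output_size = math.ceil(input_size / stride)
    total = max((output_size - 1) * stride + kernel_size - input_size, 0)
    small, large = total // 2, total - total // 2
    if mode == "SAME_UPPER":
        return (small, large)  # extra padding at the end
    if mode == "SAME_LOWER":
        return (large, small)  # extra padding at the beginning
    raise ValueError(f"unknown pad mode: {mode}")


upper = same_padding(5, kernel_size=2, stride=1, mode="SAME_UPPER")
lower = same_padding(5, kernel_size=2, stride=1, mode="SAME_LOWER")
even = same_padding(7, kernel_size=3, stride=2)
```

The two modes differ only when the total padding is odd, as in the first two calls.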

  • The sonnx module is restructured:

    • Support two types of weight value (Initializer and Constant Node);
    • For some operators (BatchNorm, Reshape, Clip, Slice, Gather, Tile, OneHot),
      move some inputs to their attributes;
    • Define and implement the type conversion map.