Class CuDNNFunctionOptimizations

    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      static class  CuDNNFunctionOptimizations.CudnnConv2dNCHWtoNHWCConversion
      See https://docs.nvidia.com/deeplearning/sdk/dl-performance-guide/index.html#tensor-layout. For Tensor Cores we want the NHWC layout. Section 7.3.1: "Layout choice has an effect on performance, as convolutions implemented for Tensor Cores require NHWC layout and are fastest when input tensors are laid out in NHWC." "To maximize performance, we recommend using NHWC tensor layout." As for the weights format, the cuDNN docs are vague, but TensorFlow uses NCHW+OIHW or NHWC+OHWI.
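
      To make the layout change described above concrete, the following is a minimal sketch of an NCHW to NHWC reordering on a flat, row-major float array. The class name NchwToNhwcSketch and its method are hypothetical and exist only for illustration; this is not the ND4J API and not necessarily how CudnnConv2dNCHWtoNHWCConversion performs the conversion.

```java
/**
 * Illustrative only: hypothetical helper, not part of ND4J or cuDNN.
 * Reorders a dense, row-major activation buffer from NCHW to NHWC.
 */
final class NchwToNhwcSketch {

    private NchwToNhwcSketch() { }

    /**
     * @param src activations in NCHW order, length n * c * h * w
     * @param n   batch size
     * @param c   number of channels
     * @param h   height
     * @param w   width
     * @return a new array holding the same values in NHWC order
     */
    static float[] nchwToNhwc(float[] src, int n, int c, int h, int w) {
        if (src.length != n * c * h * w) {
            throw new IllegalArgumentException("src length does not match n*c*h*w");
        }
        float[] dst = new float[src.length];
        for (int in = 0; in < n; in++) {
            for (int ic = 0; ic < c; ic++) {
                for (int ih = 0; ih < h; ih++) {
                    for (int iw = 0; iw < w; iw++) {
                        // Row-major offset in NCHW: channel varies slower than height and width.
                        int srcIdx = ((in * c + ic) * h + ih) * w + iw;
                        // Row-major offset in NHWC: channel is the fastest-varying dimension.
                        int dstIdx = ((in * h + ih) * w + iw) * c + ic;
                        dst[dstIdx] = src[srcIdx];
                    }
                }
            }
        }
        return dst;
    }
}
```

      The same index permutation (0, 2, 3, 1) maps OIHW weights to OHWI when the four sizes passed are O, I, H, W, which is the weights pairing mentioned for NHWC above.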
    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected static boolean isCudaBackend  
    • Field Detail

      • isCudaBackend

        protected static final boolean isCudaBackend
    • Constructor Detail

      • CuDNNFunctionOptimizations

        public CuDNNFunctionOptimizations()