Why does my GTX Titan Black GPU underperform in double precision calculations in MATLAB R2015a?
I experience unexpectedly slow performance of the GPU in double precision benchmarks.
I have a fast PC (Intel i7-4790 3.6GHz, 16GB of 1600MHz memory, Windows 7 64-bit) with an NVIDIA GeForce GTX Titan Black GPU card in a PCIe 3.0 x16 slot and an 850W power supply. I have downloaded the video drivers and the CUDA toolkit and installed the MATLAB Parallel Computing Toolbox:
>> gpuDevice
ans =
  CUDADevice with properties:
                      Name: 'GeForce GTX TITAN Black'
                     Index: 1
         ComputeCapability: '3.5'
            SupportsDouble: 1
             DriverVersion: 7
            ToolkitVersion: 6.5000
        MaxThreadsPerBlock: 1024
          MaxShmemPerBlock: 49152
        MaxThreadBlockSize: [1024 1024 64]
               MaxGridSize: [2.1475e+09 65535 65535]
                 SIMDWidth: 32
               TotalMemory: 6.4425e+09
           AvailableMemory: 6.2105e+09
       MultiprocessorCount: 15
              ClockRateKHz: 980000
               ComputeMode: 'Default'
      GPUOverlapsTransfers: 1
    KernelExecutionTimeout: 1
          CanMapHostMemory: 1
           DeviceSupported: 1
            DeviceSelected: 1
I then downloaded the GPU benchmarking tool by the MathWorks Parallel Computing Toolbox Team (version updated 05 Jan 2015) from http://www.mathworks.com/matlabcentral/fileexchange/34080-gpubench and ran "gpuBench".
The results show that my GPU performs similarly to the Quadro K6000 in the single precision benchmarks, with deviations of up to 40%. This is as expected: both cards have the same number of CUDA cores, but the memory bandwidth is higher for my Titan Black and the amount of memory is higher for the K6000.
However, the GeForce GTX Titan Black is 4 times (!) slower than the Quadro K6000 in the double precision benchmarks! This is unexpected for several reasons.

A) Both cards are fairly similar:

Specification       K6000       Titan Black
CUDA cores          2880        2880
Clock               902 MHz     889 MHz
Memory clock        6 Gbps      7 Gbps
Memory bandwidth    288 GB/s    336 GB/s

B) There are benchmark results from the MathWorks Parallel Computing Toolbox Team in the attached file "Older benchmarks for GPUs". In those results, a GPU very similar to mine, the GeForce GTX Titan (an older GPU with 2688 CUDA cores, 837MHz clock, 6Gbps memory clock and 288GB/s memory bandwidth), scores very close to the Quadro K6000:

                       DOUBLE                        SINGLE
Card          MTimes  Backslash  FFT        MTimes  Backslash  FFT
K6000           1092        421  160          3017        831  334
GTX Titan       1106        352  150          2933        582  298
My GPU           252        163  110          4221        994  409
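
To put numbers on the gap, the K6000-to-my-GPU ratios can be computed directly from the table above. This is only a minimal sketch: the variable names are illustrative and are not part of gpuBench, and the figures are simply copied from the table in whatever units gpuBench reports.

```matlab
% Benchmark figures copied from the table above.
% Columns: MTimes, Backslash, FFT.
k6000Double = [1092 421 160];
myGpuDouble = [ 252 163 110];
k6000Single = [3017 831 334];
myGpuSingle = [4221 994 409];

% A ratio above 1 means the K6000 is faster by that factor;
% a ratio below 1 means my GPU is faster.
doubleRatio = k6000Double ./ myGpuDouble;
singleRatio = k6000Single ./ myGpuSingle;

names = {'MTimes', 'Backslash', 'FFT'};
for k = 1:numel(names)
    fprintf('%-10s double: %.2fx   single: %.2fx\n', ...
        names{k}, doubleRatio(k), singleRatio(k));
end
```

With these figures the double precision gap is about 4.3x for MTimes, 2.6x for Backslash and 1.5x for FFT, while my card is ahead of the K6000 in all three single precision tests.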

These results indicate that my GeForce GTX Titan Black should be similar to or faster than the Quadro K6000. However, its double precision performance is terrible (about 4x slower in MTimes).