author Justin Lebar <jlebar@google.com> 2016-09-07 20:09:46 +0000
committer Justin Lebar <jlebar@google.com> 2016-09-07 20:09:46 +0000
commit 78e95faf4b44ef6178211b2405271412ae5bea8c
tree 6d7678b846dee214788be7b0af4732bbabb2dbf7 /docs/CompileCudaWithLLVM.rst
parent a804c5a9aac31c25a5345a240919ee0d70671ce6
[CUDA] Expand upon --cuda-gpu-arch flag in CompileCudaWithLLVM doc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280848 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'docs/CompileCudaWithLLVM.rst')
-rw-r--r-- docs/CompileCudaWithLLVM.rst 7
1 files changed, 7 insertions, 0 deletions
diff --git a/docs/CompileCudaWithLLVM.rst b/docs/CompileCudaWithLLVM.rst
index f57839cec96..85aab5dda0f 100644
--- a/docs/CompileCudaWithLLVM.rst
+++ b/docs/CompileCudaWithLLVM.rst
@@ -119,6 +119,13 @@ your GPU <https://developer.nvidia.com/cuda-gpus>`_. For example, if you want
to run your program on a GPU with compute capability of 3.5, you should specify
``--cuda-gpu-arch=sm_35``.
+Note: You cannot pass ``compute_XX`` as an argument to ``--cuda-gpu-arch``;
+only ``sm_XX`` is currently supported. However, clang always includes PTX in
+its binaries, so e.g. a binary compiled with ``--cuda-gpu-arch=sm_30`` would be
+forwards-compatible with e.g. ``sm_35`` GPUs.
+
+You can pass ``--cuda-gpu-arch`` multiple times to compile for multiple archs.
+
Detecting clang vs NVCC
=======================
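
Illustration of the ``--cuda-gpu-arch`` note added in this patch (not part of the commit): a minimal Python sketch of a build-script helper that applies the two rules stated above, namely that only ``sm_XX`` values are accepted and that the flag may be repeated to target several architectures. The helper names ``cuda_gpu_arch_flags`` and ``clang_cuda_command`` are hypothetical and are not part of clang; the ``sm_`` followed by digits pattern is an assumption based on the examples in the patch, and the linker flags a real CUDA build needs are deliberately left out.

```python
import re

# Hypothetical helpers for a build script; nothing here is part of clang itself.
# Assumption: a valid architecture looks like "sm_" followed by digits,
# matching the sm_30 / sm_35 examples in the patch.
_SM_ARCH = re.compile(r"sm_\d+")


def cuda_gpu_arch_flags(archs):
    """Return one --cuda-gpu-arch flag per requested GPU architecture."""
    flags = []
    for arch in archs:
        if not _SM_ARCH.fullmatch(arch):
            # The patch states that compute_XX cannot be passed; only sm_XX.
            raise ValueError(f"unsupported GPU arch {arch!r}; use sm_XX")
        # The flag is simply repeated, once per architecture.
        flags.append(f"--cuda-gpu-arch={arch}")
    return flags


def clang_cuda_command(source, archs, output="a.out"):
    """Build an argument list for clang++ (CUDA library link flags omitted)."""
    return ["clang++", source, "-o", output, *cuda_gpu_arch_flags(archs)]


if __name__ == "__main__":
    # Compile for two architectures by passing the flag twice.
    print(" ".join(clang_cuda_command("axpy.cu", ["sm_30", "sm_35"], "axpy")))
```

Passing ``["compute_35"]`` to the helper raises ``ValueError``, mirroring the restriction described in the note. Because clang embeds PTX, listing only the oldest architecture you need may already be enough for newer GPUs, at the cost of compiling the PTX at load time.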