author Matthias Vogelgesang <matthias.vogelgesang@kit.edu> 2015-07-07 17:01:37 +0200
committer Matthias Vogelgesang <matthias.vogelgesang@kit.edu> 2015-07-07 17:01:37 +0200
commit 9a99cd39053ba9646e618663bdf3cc72c7890fa7
tree b72df267be9562d7f309ea3c2b2309c9cd1024d5 /docs
parent ed71be36a5694ed2583e4bdf2061aad10b648e1a
Document NVIDIA multi-GPU performance issue
Diffstat (limited to 'docs')
-rw-r--r-- docs/manual/using/cluster.rst | 11
1 files changed, 11 insertions, 0 deletions
diff --git a/docs/manual/using/cluster.rst b/docs/manual/using/cluster.rst
index 7496b25..e266f2c 100644
--- a/docs/manual/using/cluster.rst
+++ b/docs/manual/using/cluster.rst
@@ -39,3 +39,14 @@ mode, you have to prepare the scheduler::
     sched = Ufo.Scheduler(remotes=remotes)
     sched.set_remote_mode(Ufo.RemoteMode.REPLICATE)
     sched.run(graph)
+
+
+Improving small kernel launches
+===============================
+
+UFO manages multiple GPUs transparently through a single OpenCL context. For
+applications and plugins that issue many small kernel launches, multi-GPU
+performance suffers on NVIDIA systems because kernel launch time scales
+poorly. To improve performance on machines with multiple GPUs, it is strongly
+advised to run multiple ``ufod`` services, each with a different set of GPUs
+and a different port.
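
The advice above can be sketched as a small planning helper that assigns one
``ufod`` service per GPU and derives the matching ``remotes`` list for the
scheduler. This is only an illustration under stated assumptions, not part of
the commit: it assumes that ``ufod`` accepts a ``--listen`` address and that
the ``UFO_DEVICES`` environment variable restricts a process to the given
device indices, so check both against the installed version. The function
name, host and base port are hypothetical.

```python
def plan_ufod_services(num_gpus, host="127.0.0.1", base_port=5555):
    """Return one launch plan per GPU: environment, command line and address.

    Assumptions (not confirmed by the commit): ``ufod`` takes ``--listen``
    and ``UFO_DEVICES`` selects the GPU indices visible to that process.
    """
    plans = []
    for gpu in range(num_gpus):
        # Each service gets its own port so that it can be addressed separately.
        address = "tcp://{}:{}".format(host, base_port + gpu)
        plans.append({
            "env": {"UFO_DEVICES": str(gpu)},       # one GPU per service
            "argv": ["ufod", "--listen", address],  # assumed command line
            "remote": address,
        })
    return plans


if __name__ == "__main__":
    plans = plan_ufod_services(2)
    for plan in plans:
        # Print shell-style commands instead of starting processes.
        env = " ".join("{}={}".format(k, v) for k, v in plan["env"].items())
        print(env, " ".join(plan["argv"]))
    # The addresses can then be passed on as Ufo.Scheduler(remotes=remotes).
    remotes = [plan["remote"] for plan in plans]
    print(remotes)
```

The helper only builds the commands, so it can be inspected before anything
is started. Launching the services, for example with ``subprocess.Popen`` and
the planned environment merged into ``os.environ``, is left to the reader.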