author     Matthias Vogelgesang <matthias.vogelgesang@kit.edu>    2017-01-13 15:43:08 +0100
committer  Matthias Vogelgesang <matthias.vogelgesang@kit.edu>    2017-01-13 15:43:08 +0100
commit     ab6b0e404dbe78483acea19e3c5226ef934da434 (patch)
tree       22ac17b839eaf22f3c1f3e11531d0a667d4af96c /docs
parent     bc5d8d0f2b762cf56e34fefba2d530c1c546ee83 (diff)
Restructure docs and add section about broadcasts
Diffstat (limited to 'docs')
-rw-r--r--    docs/manual/using/background.rst     40
-rw-r--r--    docs/manual/using/cluster.rst        52
-rw-r--r--    docs/manual/using/execution.rst     127
-rw-r--r--    docs/manual/using/index.rst           3
4 files changed, 128 insertions, 94 deletions
diff --git a/docs/manual/using/background.rst b/docs/manual/using/background.rst
deleted file mode 100644
index 027244d..0000000
--- a/docs/manual/using/background.rst
+++ /dev/null
@@ -1,40 +0,0 @@
-.. _using-objects:
-
-====================
-Technical Background
-====================
-
-Relationship between graph and scheduler
-========================================
-
-A ``Ufo.Graph`` represents a network of interconnected filter nodes. New nodes
-can be added and existing node relationships be queried. Also, the graph can be
-serialized as a JSON structure with ``ufo_graph_save_to_json`` and read back
-again with ``ufo_graph_read_from_json``.
-
-The ``Ufo.Scheduler``, on the other hand, implements a strategy for *how* to
-execute the filters contained in a graph. Therefore, the scheduler is passed a
-graph object on execution.
-
-
-Profiling
-=========
-
-By default, the scheduler measures the run-time from initial setup until
-processing of the last data item has finished. You can read the elapsed time in
-seconds via the ``time`` property ::
-
- g = Ufo.TaskGraph()
- scheduler = Ufo.Scheduler()
- scheduler.run(g)
- print("Time spent: {}s".format(scheduler.time))
-
-To get more fine-grained insight into the execution, you can enable tracing ::
-
- scheduler.props.enable_tracing = True
- scheduler.run(g)
-
-and analyse the generated traces for OpenCL events (saved in ``opencl.PID.json``)
-and general events (saved in ``trace.PID.json``). To visualize the trace events,
-either use the bundled ``ufo-prof`` tool or open chrome://tracing in Google
-Chrome or Chromium and load the JSON files.
diff --git a/docs/manual/using/cluster.rst b/docs/manual/using/cluster.rst
deleted file mode 100644
index e266f2c..0000000
--- a/docs/manual/using/cluster.rst
+++ /dev/null
@@ -1,52 +0,0 @@
-.. _using-cluster:
-
-==========================
-Running tasks in a cluster
-==========================
-
-The UFO framework comes with built-in cluster capabilities based on ZeroMQ 3.2.
-Contrary to bulk cluster approaches (e.g. solving large linear systems), UFO
-tries to distribute `streamed` data across a set of machines. On each
-remote slave, ``ufod`` must be started. By default, the server binds to port
-5555 on any available network adapter. To change this, use the ``-l/--listen``
-option::
-
- $ ufod --listen tcp://ib0:5555
-
-This lets ``ufod`` listen on ``ib0``, the first Infiniband-over-IP interface.
-
-On the master host, you pass the remote slave addresses to the scheduler object.
-In Python this would look like this::
-
- sched = Ufo.Scheduler(remotes=['tcp://foo.bar.org:5555'])
-
-Addresses are written in `ZeroMQ <http://api.zeromq.org/3-2:zmq-tcp>`_ notation.
-
-
-Streaming vs. replication
-=========================
-
-Work can be executed in two ways: `streaming`, in which data is transferred
-from the master machine to all slaves and returned to the master after the
-computation has finished, and `replication`, in which each slave works on its
-own subset of the initial input data. The former must be used if the length of
-the stream is unknown before execution, because otherwise the stream could not
-be split into equal partitions.
-
-Initially, the scheduler is set to streaming mode. To switch to replication
-mode, you have to prepare the scheduler::
-
- sched = Ufo.Scheduler(remotes=remotes)
- sched.set_remote_mode(Ufo.RemoteMode.REPLICATE)
- sched.run(graph)
-
-
-Improving small kernel launches
-===============================
-
-UFO uses a single OpenCL context to manage multiple GPUs in a transparent way.
-For applications and plugins that require many small kernel launches, multi-GPU
-performance suffers on NVIDIA systems due to poor scaling of the kernel launch
-time. To improve performance on machines with multiple GPUs, it is strongly
-advised to run multiple ``ufod`` services, each bound to a different port and
-assigned a different subset of the GPUs.
diff --git a/docs/manual/using/execution.rst b/docs/manual/using/execution.rst
new file mode 100644
index 0000000..47d200f
--- /dev/null
+++ b/docs/manual/using/execution.rst
@@ -0,0 +1,127 @@
+==============
+Task execution
+==============
+
+This section provides a deeper look into the technical background concerning
+scheduling and task execution. The execution model of the UFO framework is based
+on the ``Ufo.TaskGraph`` that represents a network of interconnected task
+nodes and the ``Ufo.BaseScheduler`` that runs these tasks according to a
+pre-defined strategy. The ``Ufo.Scheduler`` is a concrete implementation and is
+the default choice because it is able to instantiate tasks in a multi-GPU
+environment. For greater flexibility, the ``Ufo.FixedScheduler`` can be used to
+define arbitrary GPU mappings.
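+
+As a quick orientation, here is a minimal sketch that assembles a trivial
+pipeline and runs it with the default ``Ufo.Scheduler``. The ``read`` and
+``write`` plugin names and the output filename pattern are taken from the
+examples below and are only placeholders for your own tasks::
+
+ from gi.repository import Ufo
+
+ pm = Ufo.PluginManager()
+ graph = Ufo.TaskGraph()
+ scheduler = Ufo.Scheduler()
+
+ # Placeholder tasks: read input data and write it back out unchanged.
+ read = pm.get_task('read')
+ write = pm.get_task('write')
+ write.set_properties(filename='out-%05i.tif')
+
+ graph.connect_nodes(read, write)
+ scheduler.run(graph)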
+
+
+Profiling execution
+===================
+
+By default, the scheduler measures the run-time from initial setup until
+processing of the last data item has finished. You can read the elapsed time in
+seconds via the ``time`` property ::
+
+ g = Ufo.TaskGraph()
+ scheduler = Ufo.Scheduler()
+ scheduler.run(g)
+ print("Time spent: {}s".format(scheduler.time))
+
+To get more fine-grained insight into the execution, you can enable tracing ::
+
+ scheduler.props.enable_tracing = True
+ scheduler.run(g)
+
+and analyse the generated traces for OpenCL events (saved in ``opencl.PID.json``)
+and general events (saved in ``trace.PID.json``). To visualize the trace events,
+either use the bundled ``ufo-prof`` tool or open chrome://tracing in Google
+Chrome or Chromium and load the JSON files.
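+
+If you want to pick up the generated trace files programmatically, a minimal
+sketch could look like the following; it assumes the files are written to the
+current working directory and are named after the running process::
+
+ import os
+
+ pid = os.getpid()
+ candidates = ['opencl.{}.json'.format(pid), 'trace.{}.json'.format(pid)]
+
+ # Keep only the traces that were actually written.
+ traces = [name for name in candidates if os.path.exists(name)]
+ print(traces)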
+
+
+Broadcasting results
+====================
+
+.. highlight:: c
+
+Connecting a task output to multiple consumers will in most cases cause
+undefined results because some data items are processed differently than
+others. A certain class of problems can be solved by inserting explicit
+``Ufo.CopyTask`` nodes and executing the graph with a ``Ufo.FixedScheduler``. In
+the following example, we want to write the same data twice with different
+filename prefixes::
+
+ from gi.repository import Ufo
+
+ pm = Ufo.PluginManager()
+ sched = Ufo.FixedScheduler()
+ graph = Ufo.TaskGraph()
+ copy = Ufo.CopyTask()
+
+ data = pm.get_task('read')
+
+ write1 = pm.get_task('write')
+ write1.set_properties(filename='w1-%05i.tif')
+
+ write2 = pm.get_task('write')
+ write2.set_properties(filename='w2-%05i.tif')
+
+ graph.connect_nodes(data, copy)
+ graph.connect_nodes(copy, write1)
+ graph.connect_nodes(copy, write2)
+
+ sched.run(graph)
+
+.. note::
+
+ The copy task node is not a regular plugin but part of the core API and
+ thus cannot be used with tools like ``ufo-runjson`` or ``ufo-launch``.
+
+
+
+Running tasks in a cluster
+==========================
+
+.. highlight:: bash
+
+The UFO framework comes with built-in cluster capabilities based on ZeroMQ 3.2.
+Contrary to bulk cluster approaches (e.g. solving large linear systems), UFO
+tries to distribute `streamed` data across a set of machines. On each
+remote slave, ``ufod`` must be started. By default, the server binds to port
+5555 on any available network adapter. To change this, use the ``-l/--listen``
+option::
+
+ $ ufod --listen tcp://ib0:5555
+
+This lets ``ufod`` listen on ``ib0``, the first Infiniband-over-IP interface.
+
+On the master host, you pass the remote slave addresses to the scheduler object.
+In Python this would look like this::
+
+ sched = Ufo.Scheduler(remotes=['tcp://foo.bar.org:5555'])
+
+Addresses are written in `ZeroMQ <http://api.zeromq.org/3-2:zmq-tcp>`_ notation.
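+
+For example, assuming ``graph`` is an already assembled ``Ufo.TaskGraph`` and
+the host names are placeholders, a run on two slaves could look like this::
+
+ remotes = ['tcp://node1:5555', 'tcp://node2:5555']
+ sched = Ufo.Scheduler(remotes=remotes)
+ sched.run(graph)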
+
+
+Streaming vs. replication
+-------------------------
+
+Work can be executed in two ways: `streaming`, in which data is transferred
+from the master machine to all slaves and returned to the master after the
+computation has finished, and `replication`, in which each slave works on its
+own subset of the initial input data. The former must be used if the length of
+the stream is unknown before execution, because otherwise the stream could not
+be split into equal partitions.
+
+Initially, the scheduler is set to streaming mode. To switch to replication
+mode, you have to prepare the scheduler::
+
+ sched = Ufo.Scheduler(remotes=remotes)
+ sched.set_remote_mode(Ufo.RemoteMode.REPLICATE)
+ sched.run(graph)
+
+
+Improving small kernel launches
+-------------------------------
+
+UFO uses a single OpenCL context to manage multiple GPUs in a transparent way.
+For applications and plugins that require many small kernel launches, multi-GPU
+performance suffers on NVIDIA systems due to poor scaling of the kernel launch
+time. To improve performance on machines with multiple GPUs, it is strongly
+advised to run multiple ``ufod`` services, each bound to a different port and
+assigned a different subset of the GPUs.
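+
+The master then simply addresses all local services as remotes. A minimal
+sketch, assuming two ``ufod`` instances were started on ports 5555 and 5556 of
+the same machine and ``graph`` is an already assembled ``Ufo.TaskGraph``::
+
+ remotes = ['tcp://localhost:5555', 'tcp://localhost:5556']
+ sched = Ufo.Scheduler(remotes=remotes)
+ sched.run(graph)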
diff --git a/docs/manual/using/index.rst b/docs/manual/using/index.rst
index 3da6812..dd291ad 100644
--- a/docs/manual/using/index.rst
+++ b/docs/manual/using/index.rst
@@ -25,6 +25,5 @@ own image processing pipeline or implement a new filter.
quickstart.rst
env.rst
- background.rst
- cluster.rst
+ execution.rst
json.rst