Internal Architecture Overview
==============================

External API
************


Configs
~~~~~~~

At the highest level, we have OCIO::Configs. A Config represents the entirety of
the current color "universe".  Configs are serialized as .ocio files, read at
runtime, and are often used in a 'read-only' context.

Configs are loaded at runtime to allow for customized color handling in a show-
dependent manner.

Example Configs:

* ACES (the Academy's standard color workflow)
* spi-vfx (used on some Imageworks VFX shows such as Spider-Man, etc.)
* and others


ColorSpaces
~~~~~~~~~~~

The meat of an OCIO::Config is a list of named ColorSpaces. ColorSpaces often
correspond to input image states, output image states, or image states used for
internal processing.

Example ColorSpaces (from ACES configuration):

* aces (HDR, scene-linear)
* adx10 (log-like density encoding space)
* slogf35 (sony F35 slog camera encoding)
* rrt_srgb (baked in display transform, suitable for srgb display)
* rrt_p3dci (baked in display transform, suitable for dcip3 display)



Transforms
~~~~~~~~~~

ColorSpaces contain an ordered list of transforms, which define the conversion
to and from the Config's "reference" space.

Transforms are the atomic units available to the designer in order to specify a
color conversion.

Examples of OCIO::Transforms are:

* File-based transforms (1d lut, 3d lut, mtx... anything, really.)
* Math functions (gamma, log, mtx)
* The 'meta' GroupTransform, which itself contains an ordered list of transforms
* The 'meta' LookTransform, which itself contains an ordered list of transforms

For example, the adx10 ColorSpace (in one particular ACES configuration)
defines its transform FROM adx10 TO our reference ColorSpace as:

#. Apply FileTransform adx_adx10_to_cdd.spimtx
#. Apply FileTransform adx_cdd_to_cid.spimtx
#. Apply FileTransform adx_cid_to_rle.spi1d
#. Apply LogTransform base 10 (inverse)
#. Apply FileTransform adx_exp_to_aces.spimtx

If we have an image in the (unnamed) reference ColorSpace, we can convert TO
adx by applying each transform in the inverse direction, in reverse order:

#. Apply FileTransform adx_exp_to_aces.spimtx (inverse)
#. Apply LogTransform base 10 (forward)
#. Apply FileTransform adx_cid_to_rle.spi1d (inverse)
#. Apply FileTransform adx_cdd_to_cid.spimtx (inverse)
#. Apply FileTransform adx_adx10_to_cdd.spimtx (inverse)

Note that this isn't possible in all cases (what if a lut or matrix is not 
invertible?), but conceptually it's a simple way to think about the design.
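
The forward/inverse pairing can be sketched with toy scalar transforms
(hypothetical types for illustration only; not the OCIO API):

.. code-block:: cpp

   #include <cassert>
   #include <cmath>
   #include <memory>
   #include <vector>

   // Toy stand-in for a transform: knows its forward and inverse direction.
   struct ToyTransform
   {
       virtual ~ToyTransform() {}
       virtual float forward(float v) const = 0;
       virtual float inverse(float v) const = 0;
   };

   struct Gain : ToyTransform
   {
       float g;
       explicit Gain(float g_) : g(g_) {}
       float forward(float v) const override { return v * g; }
       float inverse(float v) const override { return v / g; }
   };

   struct Offset : ToyTransform
   {
       float o;
       explicit Offset(float o_) : o(o_) {}
       float forward(float v) const override { return v + o; }
       float inverse(float v) const override { return v - o; }
   };

   int main()
   {
       // "FROM adx10 to reference": apply each transform forward, in order.
       std::vector<std::unique_ptr<ToyTransform>> chain;
       chain.emplace_back(new Gain(2.0f));
       chain.emplace_back(new Offset(0.5f));

       float v = 0.25f;
       for (const auto & t : chain) v = t->forward(v);

       // "TO adx10": apply each transform inverted, in reverse order.
       for (auto it = chain.rbegin(); it != chain.rend(); ++it)
           v = (*it)->inverse(v);

       assert(std::fabs(v - 0.25f) < 1e-6f);  // round-trip recovers the input
       return 0;
   }

Reversing the order as well as the direction is what makes the round trip an
identity, which is exactly the structure of the two adx10 lists above.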



Summary
~~~~~~~

Configs and ColorSpaces are just bookkeeping devices used to obtain an ordered
list of Transforms corresponding to an image color transformation.

Transforms are visible to the person AUTHORING the OCIO config, but are
NOT visible to the client applications. Client apps need only concern themselves
with Configs and Processors.




OCIO::Processors
~~~~~~~~~~~~~~~~

A processor corresponds to a 'baked' color transformation. You specify two arguments
when querying a processor: the :ref:`colorspace_section` you are coming from,
and the :ref:`colorspace_section` you are going to.

Once you have the processor, you can apply the color transformation using the
"apply" function.  For the CPU version, first wrap your image in an
ImageDesc class, and then call apply to process in place.

Example:

.. code-block:: cpp

   #include <OpenColorIO/OpenColorIO.h>
   namespace OCIO = OCIO_NAMESPACE;
   
   try
   {
      // Get the global OpenColorIO config
      // This will auto-initialize (using $OCIO) on first use
      OCIO::ConstConfigRcPtr config = OCIO::GetCurrentConfig();
      
      // Get the processor corresponding to this transform.
      // These strings, in this example, are specific to the above
      // example. ColorSpace names should NEVER be hard-coded into client
      // software, but should be dynamically queried at runtime from the library
      OCIO::ConstProcessorRcPtr processor = config->getProcessor("adx10", "aces");
      
      // Wrap the image in a light-weight ImageDescription
      OCIO::PackedImageDesc img(imageData, w, h, 4);
      
      // Apply the color transformation (in place)
      processor->apply(img);
   }
   catch(OCIO::Exception & exception)
   {
      std::cerr << "OpenColorIO Error: " << exception.what() << std::endl;
   }


The GPU code path is similar.  You get the processor from the config, and then
query the shaderText and the lut3d.  The client loads these to the GPU itself,
and then makes the appropriate calls to the newly defined function.

See `src/apps/ociodisplay` for an example.



Internal API
************


The Op Abstraction
~~~~~~~~~~~~~~~~~~

It is a useful abstraction, both for code reuse and optimization, to not rely
on the transforms to do pixel processing themselves.

Consider that the FileTransform represents a wide range of image processing
operations (basically all of them), many of which are really complex.  For example,
a single houdini lut file may contain a log convert, a 1d lut, and
then a 3d lut, all of which need to be applied in a row!  If we don't want the
FileTransform to know how to process all possible pixel operations, it's much
simpler to make light-weight processing operations, which the transforms can
create to do the dirty work as needed.

All image processing operations (ops) are classes that present the same simple
interface (src/core/Op.h):

.. code-block:: cpp

   virtual void apply(float* rgbaBuffer, long numPixels)

Basically, given a packed float array with the specified number of pixels, process them.

Examples of ops include Lut1DOp, Lut3DOp, MtxOffsetOp, LogOp, etc.
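
As a sketch, a toy op honoring that interface might look like this (GainOp is
hypothetical; only the apply signature mirrors src/core/Op.h):

.. code-block:: cpp

   #include <cassert>

   // Minimal stand-in for the Op interface in src/core/Op.h.
   struct Op
   {
       virtual ~Op() {}
       // Process numPixels packed RGBA pixels in place.
       virtual void apply(float * rgbaBuffer, long numPixels) const = 0;
   };

   // A toy op: scale R, G, B by a gain; leave alpha untouched.
   struct GainOp : Op
   {
       float gain;
       explicit GainOp(float g) : gain(g) {}
       void apply(float * rgbaBuffer, long numPixels) const override
       {
           for (long i = 0; i < numPixels; ++i)
           {
               rgbaBuffer[4*i + 0] *= gain;
               rgbaBuffer[4*i + 1] *= gain;
               rgbaBuffer[4*i + 2] *= gain;
               // alpha (4*i + 3) passes through
           }
       }
   };

   int main()
   {
       float pixels[8] = { 0.5f, 0.25f, 1.0f, 1.0f,
                           0.1f, 0.2f,  0.3f, 0.5f };
       GainOp op(2.0f);
       op.apply(pixels, 2);
       assert(pixels[0] == 1.0f);   // RGB scaled
       assert(pixels[3] == 1.0f);   // alpha untouched
       return 0;
   }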

Thus, the job of a transform becomes much simpler: it is only responsible
for converting itself to a list of ops.  A simple FileTransform that only has
a single 1D lut internally may generate a single Lut1DOp, but a
FileTransform that references a more complex format (such as the houdini lut case
referenced above) may generate a few ops:

.. code-block:: cpp

   void FileFormatHDL::BuildFileOps(OpRcPtrVec & ops,
                                    const Config & /*config*/,
                                    const ConstContextRcPtr & /*context*/,
                                    CachedFileRcPtr untypedCachedFile,
                                    const FileTransform & fileTransform,
                                    TransformDirection dir) const
   {
      // Code omitted which casts untypedCachedFile to the typed cachedFile,
      // and loads the lut data into the file cache...

      CreateLut1DOp(ops, cachedFile->lut1D,
                    fileTransform.getInterpolation(), dir);
      CreateLut3DOp(ops, cachedFile->lut3D,
                    fileTransform.getInterpolation(), dir);
   }

See (src/core/*Ops.h) for the available ops.

Note that while compositors often have complex, branching trees of image processing
operations, we just have a linear list of ops, lending itself very well to
optimization.

Before the ops are run, they are optimized. (Collapsed with appropriate neighbors, etc).
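
The collapsing idea can be sketched with a toy affine op (a hypothetical pass;
OCIO's real optimizer operates on its own op types):

.. code-block:: cpp

   #include <cassert>
   #include <cmath>

   // Toy "matrix" op on a single channel: v -> scale*v + offset.
   struct AffineOp
   {
       float scale, offset;
       float apply(float v) const { return scale * v + offset; }
   };

   // Collapse b(a(v)) into one equivalent op:
   //   b.scale*(a.scale*v + a.offset) + b.offset
   AffineOp collapse(const AffineOp & a, const AffineOp & b)
   {
       AffineOp c;
       c.scale  = b.scale * a.scale;
       c.offset = b.scale * a.offset + b.offset;
       return c;
   }

   int main()
   {
       AffineOp a{2.0f, 0.1f};
       AffineOp b{0.5f, -0.3f};
       AffineOp c = collapse(a, b);

       float v = 0.7f;
       float twoOps = b.apply(a.apply(v));  // run both ops in sequence
       float oneOp  = c.apply(v);           // run the collapsed op
       assert(std::fabs(twoOps - oneOp) < 1e-6f);
       return 0;
   }

Because the op list is linear, any two adjacent collapsible neighbors can be
merged this way without worrying about branches.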


An Example
~~~~~~~~~~

Let us consider the internal steps when getProcessor() is called to convert
from ColorSpace 'adx10' to ColorSpace 'aces'.

* The first step is to turn this ColorSpace conversion into an ordered list of
  transforms.  We do this by concatenating the transforms from 'adx10' to
  reference with the transforms required to go from reference to 'aces'.

* The Transform list is then converted into a list of ops.  It is during this
  stage that luts are loaded, etc.



CPU CODE PATH
~~~~~~~~~~~~~

The master list of ops is then optimized, and stored internally in the processor.

.. code-block:: cpp

   FinalizeOpVec(m_cpuOps);

During Processor::apply(...), a block of pixels from the image is formatted into
a sequential rgba buffer.  (The block size is chosen for computational (SSE)
simplicity and performance, and is typically similar in size to an image
scanline.)

.. code-block:: cpp

   float * rgbaBuffer = 0;
   long numPixels = 0;
   while(true) {
      scanlineHelper.prepRGBAScanline(&rgbaBuffer, &numPixels);
      ...

Then for each op, op->apply is called in-place.

.. code-block:: cpp

   for(OpRcPtrVec::size_type i=0, size = m_cpuOps.size(); i<size; ++i)
   {
      m_cpuOps[i]->apply(rgbaBuffer, numPixels);
   }         

After all ops have been applied, the results are copied back to the source image.

.. code-block:: cpp

   scanlineHelper.finishRGBAScanline();
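
Putting the three steps together, the CPU path amounts to the following loop
(a self-contained toy; ToyOp and the explicit block loop stand in for the real
Op classes and ScanlineHelper):

.. code-block:: cpp

   #include <algorithm>
   #include <cassert>
   #include <cmath>
   #include <vector>

   // Toy op: v -> v * gain on every float in the buffer.
   struct ToyOp
   {
       float gain;
       void apply(float * rgbaBuffer, long numPixels) const
       {
           for (long i = 0; i < 4 * numPixels; ++i)
               rgbaBuffer[i] *= gain;
       }
   };

   int main()
   {
       const long totalPixels = 10;
       const long blockSize = 4;   // stand-in for the scanline-sized block
       std::vector<float> image(4 * totalPixels, 0.5f);
       std::vector<ToyOp> ops = { {2.0f}, {0.5f}, {3.0f} };

       // Process the image a block at a time, running every op on each block.
       for (long start = 0; start < totalPixels; start += blockSize)
       {
           long n = std::min(blockSize, totalPixels - start);
           float * block = &image[4 * start];
           for (const ToyOp & op : ops)
               op.apply(block, n);
       }

       // 0.5 * 2 * 0.5 * 3 = 1.5 for every component
       for (float v : image) assert(std::fabs(v - 1.5f) < 1e-6f);
       return 0;
   }

Working block-by-block keeps the intermediate rgba buffer small and cache-hot
while each op still sees a simple packed float array.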



GPU CODE PATH
~~~~~~~~~~~~~

#. The master list of ops is partitioned into 3 ordered lists:

   - As many ops as possible from the BEGINNING of the op-list that can be done
     analytically in shader text (called gpu-preops).
   - As many ops as possible from the END of the op-list that can be done
     analytically in shader text (called gpu-postops).
   - The left-over ops in the middle that cannot support shader text, and thus
     will be baked into a 3d lut (called gpu-lattice).

#. Between the first and the second lists (the gpu-preops and the gpu-lattice
   ops), we analyze the op-stream metadata and determine the appropriate
   allocation to use (to minimize clamping, quantization, etc).  This is
   accounted for by inserting a forward allocation at the end of the pre-ops,
   and the inverse allocation at the start of the lattice ops.

See https://github.com/imageworks/OpenColorIO/blob/master/src/core/NoOps.cpp#L183
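
As a sketch of why the allocation pair leaves the overall result unchanged,
assuming an lg2-style allocation whose allocation vars are log2-space bounds
(the constants below are hypothetical, not from any shipping config):

.. code-block:: cpp

   #include <cassert>
   #include <cmath>

   // Toy lg2 allocation, used to bring scene-linear (HDR) values into the
   // [0,1] range a lut lookup needs.  kMin/kMax are assumed log2-space
   // bounds, e.g. [-8, 8] covers 2^-8 .. 2^8.
   const float kMin = -8.0f, kMax = 8.0f;

   float forwardAllocation(float v)   // linear -> [0,1]
   {
       return (std::log2(v) - kMin) / (kMax - kMin);
   }

   float inverseAllocation(float u)   // [0,1] -> linear
   {
       return std::exp2(u * (kMax - kMin) + kMin);
   }

   int main()
   {
       // Forward allocation at the end of the pre-ops, inverse allocation at
       // the start of the lattice ops: the pair round-trips, so the overall
       // transform is unchanged, but the lut is indexed in a well-sampled
       // [0,1] space instead of raw HDR values.
       float v = 18.0f;   // an HDR, scene-linear value
       float u = forwardAllocation(v);
       assert(u > 0.0f && u < 1.0f);
       assert(std::fabs(inverseAllocation(u) - v) < 1e-3f);
       return 0;
   }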

#. The 3 lists of ops are then optimized individually, and stored on the
   processor.  The lut3d is computed by applying the gpu-lattice ops, on the
   CPU, to a lut3d image.

The shader text is computed by calculating the shader for the gpu-preops, adding
a sampling function of the 3d lut, and then calculating the shader for the
gpu-postops.
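
The lut sampling function can be emulated on the CPU; here is a toy trilinear
3d-lut sampler (illustrative only, with an assumed r-fastest packing; the real
sampling happens in the generated shader on the GPU):

.. code-block:: cpp

   #include <algorithm>
   #include <cassert>
   #include <cmath>
   #include <vector>

   // Toy trilinear 3d-lut sampler.  Not OCIO code.
   struct Lut3D
   {
       int N;                     // edge length
       std::vector<float> data;   // 3 * N*N*N floats, r-fastest packing

       const float * at(int r, int g, int b) const
       {
           return &data[3 * ((b * N + g) * N + r)];
       }

       void sample(const float in[3], float out[3]) const
       {
           int i0[3], i1[3];
           float f[3];
           for (int c = 0; c < 3; ++c)
           {
               float v = std::min(std::max(in[c], 0.0f), 1.0f) * (N - 1);
               i0[c] = std::min(static_cast<int>(v), N - 2);
               i1[c] = i0[c] + 1;
               f[c]  = v - i0[c];
           }
           // Fetch the 8 surrounding lattice points, then lerp r, g, b.
           const float * p000 = at(i0[0], i0[1], i0[2]);
           const float * p100 = at(i1[0], i0[1], i0[2]);
           const float * p010 = at(i0[0], i1[1], i0[2]);
           const float * p110 = at(i1[0], i1[1], i0[2]);
           const float * p001 = at(i0[0], i0[1], i1[2]);
           const float * p101 = at(i1[0], i0[1], i1[2]);
           const float * p011 = at(i0[0], i1[1], i1[2]);
           const float * p111 = at(i1[0], i1[1], i1[2]);
           for (int c = 0; c < 3; ++c)
           {
               float c00 = p000[c] + f[0] * (p100[c] - p000[c]);
               float c10 = p010[c] + f[0] * (p110[c] - p010[c]);
               float c01 = p001[c] + f[0] * (p101[c] - p001[c]);
               float c11 = p011[c] + f[0] * (p111[c] - p011[c]);
               float c0  = c00 + f[1] * (c10 - c00);
               float c1  = c01 + f[1] * (c11 - c01);
               out[c] = c0 + f[2] * (c1 - c0);
           }
       }
   };

   int main()
   {
       // Build an identity lut: sampling it should return the input unchanged.
       Lut3D lut;
       lut.N = 4;
       lut.data.resize(3 * 4 * 4 * 4);
       for (int b = 0; b < 4; ++b)
           for (int g = 0; g < 4; ++g)
               for (int r = 0; r < 4; ++r)
               {
                   float * p = &lut.data[3 * ((b * 4 + g) * 4 + r)];
                   p[0] = r / 3.0f; p[1] = g / 3.0f; p[2] = b / 3.0f;
               }

       float in[3] = { 0.3f, 0.6f, 0.9f }, out[3];
       lut.sample(in, out);
       for (int c = 0; c < 3; ++c)
           assert(std::fabs(out[c] - in[c]) < 1e-5f);
       return 0;
   }

Baking the gpu-lattice ops into such a lut trades exactness for a fixed,
shader-friendly cost, which is why the preops/postops are kept analytic where
possible.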