Zero Copy Arrays
12 Feb 2015With this blog, I start a new series covering some of the new features in VTK's data model. Please note that a lot of these improvements are very new and are likely to change as we address various kinks we encounter as we apply them to various problems.
In these blogs, I will depend heavily on the excellent work done by David Lonie, which is summarized here and here. I recommend taking a close look at the Wiki page before digging too deep into these blogs as I will not provide the same level of detail. In summary, we made a number of changes to VTK to allow easy creation of datasets and data arrays that use memory layouts different than VTK's defaults. The main objective of these changes was to enable tight coupling of VTK with other codes without the need to deep copy data structures back and forth. Our main application for this is in situ analysis and visualization where the other code is a simulation, usually running on a supercomputer. However, as I will demonstrate, these changes open the door for a lot more. Let's get down to it.
Constant Array
The simplest example that I could think of is a VTK data array that has one unique value. Normally, such an array would be created and populated as follows.
vtkNew<vtkFloatArray> constArray;
constArray->SetNumberOfTuples(nvalues);
for(vtkIdType i=0; i<nvales; i++)
{
constArray->SetValue(i, constValue);
}
Obviously, this array uses nvalues * sizeof(float)
bytes of memory in addition
to whatever memory an array requires internally. Pretty wasteful. Wouldn't it be
great if we could create an array that uses only sizeof(float)
bytes (+ whatever
the array uses internally)? With the zero copy infrastructure, we can. Take a quick
look at the header and the implementation.
I shamelessly copied and pasted much of this from vtkCPExodusIIResultsArrayTemplate
as one can probably guess from the comments I left. This class looks fairly long but
I actually had to change only a few lines of code, mainly:
template <class Scalar> Scalar& vtkConstantArray<Scalar>
::GetValueReference(vtkIdType idx)
{
return this->Value;
}
Note that this code ignores the value index, idx
, which is expected for a
constant array. We can exercise this code with the following.
vtkNew<vtkConstantArray<double> > testScalars;
testScalars->InitializeArray(10, 1000, 1);
testScalars->SetName("scalars");
vtkNew<vtkImageData> image;
image->SetDimensions(10, 10, 10);
image->GetPointData()->SetScalars(testScalars.GetPointer());
vtkNew<vtkLineSource> line;
line->SetPoint1(0, 0, 0);
line->SetPoint2(9, 9, 9);
line->SetResolution(50);
vtkNew<vtkProbeFilter> probe;
probe->SetSourceData(image.GetPointer());
probe->SetInputConnection(line->GetOutputPort());
probe->Update();
vtkDataArray* outScalars =
probe->GetOutput()->GetPointData()->GetArray("scalars");
for (int i=0; i<50; i++)
{
cout << i << " : " << outScalars->GetTuple1(i) << endl;
}
This prints 10 for each point as expected.
Struct of Arrays
Let's do something a bit more involved. As you probably know, VTK stores
vectors in an interleaved way. When linearly traversing a vector array, the
component index increases faster than the tuple index. This is commonly referred
to as array of structs (AOS, see this link for example).
Many simulation codes store their vectors as struct of arrays (SOA).
Dave developed an array that
can handle AOS vectors. See the full code here and
here.
This class is very similar to vtkConstantArray
, which is no surprise given
that I copied from it for my implementation. The meat of this array is the following
piece of code.
template <class Scalar> Scalar& vtkCPExodusIIResultsArrayTemplate<Scalar>
::GetValueReference(vtkIdType idx)
{
const vtkIdType tuple = idx / this->NumberOfComponents;
const vtkIdType comp = idx % this->NumberOfComponents;
return this->Arrays[comp][tuple];
}
Note how this method maps a value reference to the right array and index in an SOA data structure. The equivalent of this for VTK's AOS data structure is the following.
template <class Scalar> Scalar& vtkCPExodusIIResultsArrayTemplate<Scalar>
::GetValueReference(vtkIdType idx)
{
return this->Array[idx];
}
Implicit Points Array
This requires explanation. In VTK, there are two types of datasets. In one type,
subclasses of vtkPointSet
, the point coordinates are stored explicitly as a
vtkDataArray
(inside a vtkPoints
). In the other type, the points coordinates
are stored implicitly (vtkImageData
and vtkRectilinearGrid
). There are occasions
where we may want an implementation that is in between these two types. For example,
when we threshold an image, the filter produces a vtkUnstructuredGrid
, which
stores its points explicitly. This causes a significant increase in memory usage.
Wouldn't it be nice if the threshold filter could create an unstructured grid but
still refer to the points implicitly? In a future blog, I will describe how
we can implement a threshold filter that does this and more. Here, let's look
at a data array that can used to develop such a filter. See here
and here for the implementation.
The meat of this class is as follows.
template <class Scalar> Scalar& vtkImagePointsArray<Scalar>
::GetValueReference(vtkIdType idx)
{
assert(this->Points);
const vtkIdType tuple = idx / 3;
const vtkIdType comp = idx % 3;
return this->Points->GetPoint(tuple)[comp];
}
Note that Points
is actually a vtkImageData
, which calculates point coordinates
implicitly with the following code.
void vtkImageData::GetPoint(vtkIdType ptId, double x[3])
{
int i, loc[3];
const double *origin = this->Origin;
const double *spacing = this->Spacing;
const int* extent = this->Extent;
vtkIdType dims[3];
dims[0] = extent[1] - extent[0] + 1;
dims[1] = extent[3] - extent[2] + 1;
dims[2] = extent[5] - extent[4] + 1;
loc[0] = ptId % dims[0];
loc[1] = (ptId / dims[0]) % dims[1];
loc[2] = ptId / (dims[0]*dims[1]);
for (i=0; i<3; i++)
{
x[i] = origin[i] + (loc[i]+extent[i*2]) * spacing[i];
}
}
I left out some of the code for simplicity. We can exercise this class with the following.
vtkNew<vtkRTAnalyticSource> source;
source->SetWholeExtent(-50, 50, -50, 50, -50, 50);
source->Update();
vtkImageData* wavelet = source->GetOutput();
vtkNew<vtkImageData> img;
img->CopyStructure(wavelet);
vtkNew<vtkImagePointsArray<double> > testPts;
testPts->InitializeArray(img.GetPointer());
testPts->SetName("pts");
vtkNew<vtkPoints> points;
points->SetData(testPts.GetPointer());
// This creates an unstructured grid from the input
// image data.
vtkNew<vtkCleanUnstructuredGrid> toUGrid;
toUGrid->SetInputData(wavelet);
toUGrid->Update();
vtkUnstructuredGrid* ugrid = toUGrid->GetOutput();
// Here we substitute the implicit points.
ugrid->SetPoints(points.GetPointer());
vtkNew<vtkContourFilter> contour2;
contour2->SetInputData(ugrid);
contour2->SetValue(0, 200);
contour2->Update();
The full code which can be found here verifies that the output of the contour filter matches the output that is generated from an unstructured grid with explicit points.
We covered a few examples of data array subclasses that use memory layouts different than VTK's default. Most filters should work out-of-box when they encounter these arrays. However, some filters, especially those that directly access raw pointers, need to be changed. In my next blog, I will demonstrate how such filters can be easily updated. Until then, happy zero copying.