Semantic-segmentation loss calculation: fix buffer usage on multi-GPU training (#1717)

* Semantic-segmentation loss calculation: fix buffer usage on multi-GPU training
* Review fix: make the work buffer live longer

Commit ccd8b64f authored Apr 07, 2019 by Juha Reunanen Committed by Davis E. King Apr 07, 2019

Showing with 5 additions and 2 deletions

cuda_dlib.h dlib/cuda/cuda_dlib.h +5 -2

--- a/dlib/cuda/cuda_dlib.h
+++ b/dlib/cuda/cuda_dlib.h
@@ -423,7 +423,6 @@ namespace dlib
            compute_loss_multiclass_log_per_pixel(
            )
            {
-                work = device_global_buffer();
            }
            template <
@@ -439,6 +438,10 @@ namespace dlib
                const size_t bytes_per_plane = subnetwork_output.nr()*subnetwork_output.nc()*sizeof(uint16_t);
                // Allocate a cuda buffer to store all the truth images and also one float
                // for the scalar loss output.
+                if (!work)
+                {
+                    work = device_global_buffer();
+                }
                cuda_data_void_ptr buf = work->get(subnetwork_output.num_samples()*bytes_per_plane + sizeof(float));
                cuda_data_void_ptr loss_buf = buf;
@@ -467,7 +470,7 @@ namespace dlib
                double& loss
            );
-            std::shared_ptr<resizable_cuda_buffer> work;
+            mutable std::shared_ptr<resizable_cuda_buffer> work;
        };
    // ------------------------------------------------------------------------------------
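The diff moves the acquisition of `work` out of the constructor and into the first call that needs it, and marks the member `mutable` so that a `const` member function can assign it. A minimal sketch of why this matters is given below. It assumes that the buffer returned depends on which device is selected at the moment of the call, and that the loss object may be constructed (and copied) while a different device is selected than the one it later runs on. All names here (`buffer`, `current_device`, `buffer_for_current_device`, `eager_loss`, `lazy_loss`, `device_seen_after_switch`) are hypothetical stand-ins and not dlib's actual types or API.

```cpp
#include <cassert>
#include <map>
#include <memory>

// Hypothetical stand-in for a per-device work buffer (not dlib's type).
struct buffer { int device_id; };

// Hypothetical stand-in for "the currently selected CUDA device".
thread_local int current_device = 0;

// Returns one shared buffer per (thread, device) pair. This only mimics the
// assumed behaviour of a device-global buffer lookup.
std::shared_ptr<buffer> buffer_for_current_device()
{
    static thread_local std::map<int, std::shared_ptr<buffer>> buffers;
    auto& b = buffers[current_device];
    if (!b) b = std::make_shared<buffer>(buffer{current_device});
    return b;
}

// Before the fix: the buffer is bound at construction time, so every copy
// shares the buffer of whichever device was selected back then.
class eager_loss
{
public:
    eager_loss() : work(buffer_for_current_device()) {}
    int device_used() const { return work->device_id; }
private:
    std::shared_ptr<buffer> work;
};

// After the fix: the buffer is bound on first use. `mutable` lets the const
// member function assign it, and the pointer then stays alive for later calls.
class lazy_loss
{
public:
    int device_used() const
    {
        if (!work)
        {
            work = buffer_for_current_device();
        }
        return work->device_id;
    }
private:
    mutable std::shared_ptr<buffer> work;
};

// Constructs a loss object with device 0 selected, copies it, then uses the
// copy with device 1 selected. Returns the device whose buffer was used.
template <class Loss>
int device_seen_after_switch()
{
    current_device = 0;
    Loss prototype;
    Loss copy = prototype;
    current_device = 1;
    return copy.device_used();
}
```

In this sketch, `device_seen_after_switch<eager_loss>()` reports device 0 even though device 1 is selected at the time of use, while `device_seen_after_switch<lazy_loss>()` reports device 1. Note that the lazy version still binds only once: after the first call, the object keeps that buffer, which matches the `if (!work)` guard in the diff.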