
[C#] Enable copying of GPU OrtValue to CPU #21244

Open
guigzzz opened this issue Jul 3, 2024 · 2 comments
Labels
api:CSharp (issues related to the C# API)

Comments


guigzzz commented Jul 3, 2024

Describe the issue

Hey guys,

I can't seem to figure out an easy way to copy an OrtValue that's been allocated on the GPU back to the CPU.

OrtValue has a really convenient GetTensorDataAsSpan API, but it seems to just wrap the raw pointer in a span, which obviously won't work when the pointer refers to memory on the GPU.

The Python API has a convenient copy_outputs_to_cpu function, which is exactly what I need.

Can we have the same thing added to the .NET API?
Either the GetTensorDataAsSpan API could be updated to do the copying automatically, or a new CopyOutputsToCpu API could be added to the IOBinding class, similar to Python.
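
For illustration, a rough sketch of how the second option might look from the caller's side, using the binding from the repro below; CopyOutputsToCpu is a hypothetical method that does not exist in the current C# package:

// Hypothetical API sketch only: CopyOutputsToCpu is not part of the current
// C# package; it would mirror Python's io_binding.copy_outputs_to_cpu().
using var cpuOutputs = binding.CopyOutputsToCpu();
foreach (var cpuValue in cpuOutputs)
{
    // After the device-to-host copy the data lives in host memory,
    // so reading it through a span is safe.
    var span = cpuValue.GetTensorDataAsSpan<float>();
    Console.Out.WriteLine($"Got {span[0]}");
}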

To reproduce

using Microsoft.ML.OnnxRuntime;
using System.Linq;

var session = new InferenceSession("model.onnx", SessionOptions.MakeSessionOptionWithCudaProvider());

var binding = session.CreateIoBinding();

var alloc = new OrtMemoryInfo(OrtMemoryInfo.allocatorCUDA,
    OrtAllocatorType.DeviceAllocator, 0, OrtMemType.Default);
// var alloc = OrtMemoryInfo.DefaultInstance;

binding.BindOutputToDevice("output", alloc);

var input = new float[4];
var inputValue = OrtValue.CreateTensorValueFromMemory(
    OrtMemoryInfo.DefaultInstance, input.AsMemory(), new long[] { 4 });
binding.BindInput("input", inputValue);

session.RunWithBinding(new RunOptions(), binding);

var output = binding.GetOutputValues().ToArray().First();

// Crashes here: the OrtValue's data pointer refers to CUDA device memory,
// so wrapping it in a span and reading it from the CPU faults.
var outputSpan = output.GetTensorDataAsSpan<float>();
Console.Out.WriteLine($"Got {outputSpan[0]}");

Fails with:

Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at Program.<Main>$(System.String[])

Urgency

Not urgent, feature request.

Platform

Linux

OS Version

Ubuntu 20.04

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.18.0

ONNX Runtime API

C#

Architecture

X64

Execution Provider

CUDA

Execution Provider Library Version

CUDA 12.2

github-actions bot added the ep:CUDA (issues related to the CUDA execution provider) label on Jul 3, 2024
guigzzz changed the title to [C#] Enable copying of GPU OrtValue to CPU on Jul 3, 2024
xadupre added the api:CSharp (issues related to the C# API) label on Jul 5, 2024
yuslepukhin (Member) commented

IOBinding is deprecated.

Unless otherwise instructed, the output OrtValues are created and copied to CPU memory at the end of inferencing.

https://onnxruntime.ai/docs/tutorials/csharp/basic_csharp.html
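
For reference, a minimal sketch of that default path, assuming the same model and tensor names as the repro above; with the OrtValue-based Run overload the outputs are materialised in CPU memory, so reading them through a span works:

using Microsoft.ML.OnnxRuntime;
using System.Linq;

using var session = new InferenceSession(
    "model.onnx", SessionOptions.MakeSessionOptionWithCudaProvider());

var input = new float[4];
using var inputValue = OrtValue.CreateTensorValueFromMemory(
    OrtMemoryInfo.DefaultInstance, input.AsMemory(), new long[] { 4 });

using var runOptions = new RunOptions();
// No IOBinding: the output OrtValues returned by Run live in host memory.
using var outputs = session.Run(
    runOptions,
    new[] { "input" }, new[] { inputValue },
    new[] { "output" });

// Safe here, because the output tensor is host-resident.
var span = outputs.First().GetTensorDataAsSpan<float>();
Console.Out.WriteLine($"Got {span[0]}");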

guigzzz (Author) commented Jul 10, 2024

Can you elaborate on 'IOBinding is deprecated'? This is news to me.
How else are we supposed to efficiently reuse output OrtValues?
If it truly is deprecated, then the documentation should be updated to reflect that.

The 'unless otherwise instructed' part is the crucial bit here. My output tensors are allocated on the GPU and I only sometimes want to copy them back to the host (it's a time-series model, so the outputs feed back into the inputs, and this is more efficient if everything stays on the GPU), but currently I can't. The pattern I'm after is roughly the loop sketched below.
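
A rough sketch of that loop, reusing only the APIs from the repro above and assuming the model's "output" has the same shape and type as its "input" (disposal of intermediate values omitted for brevity):

// Rough sketch of the GPU-resident feedback loop: nothing crosses to the
// host between iterations. The missing piece is copying a chosen output
// back to the CPU on demand, which is what this issue asks for.
using Microsoft.ML.OnnxRuntime;
using System.Linq;

using var session = new InferenceSession(
    "model.onnx", SessionOptions.MakeSessionOptionWithCudaProvider());
using var binding = session.CreateIoBinding();
using var runOptions = new RunOptions();

var cudaMemInfo = new OrtMemoryInfo(OrtMemoryInfo.allocatorCUDA,
    OrtAllocatorType.DeviceAllocator, 0, OrtMemType.Default);

// Step 0: host-side seed input, device-side output.
var seed = new float[4];
using var seedValue = OrtValue.CreateTensorValueFromMemory(
    OrtMemoryInfo.DefaultInstance, seed.AsMemory(), new long[] { 4 });
binding.BindInput("input", seedValue);
binding.BindOutputToDevice("output", cudaMemInfo);

for (var step = 0; step < 10; step++)
{
    session.RunWithBinding(runOptions, binding);

    // The output OrtValue stays on the GPU; rebind it as the next input.
    var deviceOutput = binding.GetOutputValues().First();
    binding.BindInput("input", deviceOutput);
}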

sophies927 removed the ep:CUDA (issues related to the CUDA execution provider) label on Jul 11, 2024