
[Web] executionProviders chain for webnn fallback does not work on init error #20729

Open
sansmoraxz opened this issue May 19, 2024 · 28 comments
Labels
ep:WebNN (WebNN execution provider) · platform:web (issues related to ONNX Runtime web; typically submitted using template) · platform:windows (issues related to the Windows platform)

Comments

@sansmoraxz

sansmoraxz commented May 19, 2024

Describe the issue

If any error occurs while initializing the WebNN execution provider, session creation fails outright instead of falling back to the next provider in the chain. For example, a failed DirectML initialization on Windows 10 with the GPU device type does not fall back to the other providers.

This only occurs if the WebNN API is enabled in the browser flags.

To reproduce

On Windows 10, with a browser that has the WebNN API enabled, try the following code fragment:

const mySession = await ort.InferenceSession.create("./model.onnx", {
  executionProviders: [
    {
      name: "webnn",
      deviceType: "gpu",
    },
    {
      name: "webnn",
      deviceType: "cpu",
    },
    "wasm",
    "cpu",
  ],
});

Results in error:

Failed to execute 'createContext' on 'ML': DirectML: The DirectML feature level on this platform is lower than the minimum required one.

The fallback works if the WebNN API is not available.
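
One possible application-side workaround until this is fixed is to emulate the fallback manually: create the session with one execution provider at a time and catch the rejection. A minimal, untested sketch (not part of ONNX Runtime; the helper name and provider list are illustrative):

```js
// Hypothetical workaround: try each EP configuration in order ourselves,
// falling through to the next candidate whenever session creation rejects.
async function createSessionWithFallback(modelPath, providerChain) {
  let lastError;
  for (const ep of providerChain) {
    try {
      // A single-EP list per attempt, so a WebNN init failure
      // does not block the remaining candidates.
      return await ort.InferenceSession.create(modelPath, { executionProviders: [ep] });
    } catch (e) {
      lastError = e; // remember the failure and move on
    }
  }
  throw lastError;
}

const mySession = await createSessionWithFallback("./model.onnx", [
  { name: "webnn", deviceType: "gpu" },
  { name: "webnn", deviceType: "cpu" },
  "wasm",
]);
```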

Urgency

NA. Just doing some PoCs.

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.18.0

Execution Provider

Other / Unknown

@sansmoraxz sansmoraxz added the platform:web label May 19, 2024
@github-actions github-actions bot added the ep:DML and platform:windows labels May 19, 2024
@fdwr fdwr added the ep:WebNN label and removed the ep:DML label May 20, 2024
@fdwr
Contributor

fdwr commented May 20, 2024

The fallback works if the WebNN API is not available.

@Honry Given that the fallback to the next EP works when the WebNN API is not available at all in the browser, would it be consistent to treat this error case (of not being able to initialize the WebNN backend) as a fallback too?

@huningxin

@sansmoraxz

On Windows 10, with a browser that has the WebNN API enabled, try the following code fragment:

At the current stage, you can try the Edge Dev channel, which supports WebNN on Windows 10.

@Honry
Contributor

Honry commented May 20, 2024

The fallback works if the WebNN API is not available.

@Honry Given that the fallback to the next EP works when the WebNN API is not available at all in the browser, would it be consistent to treat this error case (of not being able to initialize the WebNN backend) as a fallback too?

@fdwr, the following is the code snippet where the availability of the WebNN API is checked. I drafted a PR that attempts to check the creation of the WebNN MLContext at the same place, but it requires an additional WebNNExecutionProviderOption parameter, which makes this parameter special in its parent methods (all of which relate to EP initialization).

    if (epName === 'webnn') {
      // perform WebNN availability check
      if (typeof navigator === 'undefined' || !(navigator as unknown as {ml: unknown}).ml) {
        throw new Error('WebNN is not supported in current environment');
      }

      await initJsep('webnn', getInstance(), env);
    }
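
For illustration, the idea of probing MLContext creation at the same place could look roughly like this in plain JavaScript (a sketch only; `checkWebnnBackend` and the `webnnOptions` parameter are hypothetical, not the current signature):

```js
// Hypothetical early probe: create the MLContext with the requested options
// up front, so an unsupported device rejects here, where the caller can still
// fall back to the next EP in the chain instead of failing session creation.
async function checkWebnnBackend(webnnOptions) {
  if (typeof navigator === 'undefined' || !navigator.ml) {
    throw new Error('WebNN is not supported in current environment');
  }
  // The extra options parameter is what makes this EP's init path "special".
  await navigator.ml.createContext({ deviceType: webnnOptions?.deviceType });
}
```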

@fs-eire, could you please take a look at the PR? Do you have any other ideas? Thanks!

@fs-eire
Contributor

fs-eire commented May 20, 2024

Considering recent issues/PRs about WebNN, here is a summary of requirements:

  • User may want to create an instance of MLContext using different combinations of options when trying to initialize WebNN.
  • User may need to access the MLContext in order to use IO binding

It is not straightforward to implement those features, mainly because the existing design was based on old assumptions that no longer hold for the new scenarios.

How about we let ORT simply accept an instance of MLContext and leave its creation to users? This is also what we did for WebGPU when integrating with Transformers.js: we allow users to set an adapter that they create themselves (https://onnxruntime.ai/docs/api/js/interfaces/Env.WebGpuFlags.html#adapter). This should solve the recent issues; the only concern is backward compatibility. @Honry

@sansmoraxz
Author

How about we let ORT simply accept an instance of MLContext and leave its creation to users?

I for one would very much like that. Otherwise the checks would be redundant (one throwaway init just to probe the available devices). To be honest, I don't think this approach is ideal either, but at least it gives us some control over the init process.

Maybe listing the supported device types could become part of the core WebNN API.

@Honry
Contributor

Honry commented May 21, 2024

How about we let ORT simply accept an instance of MLContext and leave its creation to users? This is also what we did for WebGPU when integrating with Transformers.js: we allow users to set an adapter that they create themselves (https://onnxruntime.ai/docs/api/js/interfaces/Env.WebGpuFlags.html#adapter). This should solve the recent issues; the only concern is backward compatibility. @Honry

Thanks @fs-eire, good proposal. That way we can get a global MLContext from ort.env.webnn.context for a WebNN session, and also make it a shareable context for I/O binding.

A few more concerns, though: users would have to learn how to create a WebNN MLContext, and if they create different sessions with different WebNN MLContexts, they would have to reset env.WebNNFlags.context each time, which requires additional configuration for creating the WebNN EP.

@huningxin, @egalli, what's your opinion?

@fs-eire
Contributor

fs-eire commented May 21, 2024

There may be a difference between the proposed ort.env.webnn.context and ort.env.webgpu.adapter: in WebGPU, the adapter is used as a globally unique object (a singleton), while in WebNN it is possible that multiple sessions use different MLContexts. I don't know if this is a valid scenario, so please correct me if I'm wrong. But if multiple MLContext instances may be in use, ort.env.webnn.context may not be a good idea.

users would have to learn how to create a WebNN MLContext

This is true, no matter whether we use ort.env.webnn.context or allow users to pass the context in session options. However, it's just one line of code, right? It shouldn't be a problem for anyone:

navigator.ml.createContext(options)

@huningxin

while in WebNN it is possible that multiple sessions use different MLContexts

This is a valid scenario. For example, developers may want to run an encoder on a GPU MLContext and a decoder on an NPU MLContext.
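
For illustration, that scenario could look roughly like this, assuming the proposed per-session `context` option (model paths are placeholders, and `deviceType` is still passed alongside the context, as discussed later in this thread):

```js
// Hypothetical usage: one MLContext per device, one session per model part.
const gpuContext = await navigator.ml.createContext({ deviceType: 'gpu' });
const npuContext = await navigator.ml.createContext({ deviceType: 'npu' });

const encoder = await ort.InferenceSession.create('./encoder.onnx', {
  executionProviders: [{ name: 'webnn', context: gpuContext, deviceType: 'gpu' }],
});
const decoder = await ort.InferenceSession.create('./decoder.onnx', {
  executionProviders: [{ name: 'webnn', context: npuContext, deviceType: 'npu' }],
});
```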

@fs-eire
Contributor

fs-eire commented May 21, 2024

while in WebNN it is possible that multiple sessions use different MLContexts

This is a valid scenario. For example, developers may want to run an encoder on a GPU MLContext and a decoder on an NPU MLContext.

Then I would prefer to just let users pass their MLContext instance in the session options. By doing this we don't need to offer a way to get the context, because users created it and should hold the reference themselves.

@Honry
Contributor

Honry commented May 21, 2024

while in WebNN it is possible that multiple sessions use different MLContexts

This is a valid scenario. For example, developers may want to run an encoder on a GPU MLContext and a decoder on an NPU MLContext.

Then I would prefer to just let users pass their MLContext instance in the session options. By doing this we don't need to offer a way to get the context, because users created it and should hold the reference themselves.

Is there a way to pass an MLContext instance through the session options to the WebNN EP (C++)? It looks like we need additional Wasm Module support for exposing the MLContext JS instance to the WebNN EP.

@egalli

egalli commented May 21, 2024

If we pass the MLContext through the session options, my understanding is that we'll need a way to store the MLContexts in JS, pass an id to ORT, and have the WebNN EP ask for the MLContext using that id (since OrtAddSessionConfigEntry only supports strings). Does this sound correct?
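
A minimal sketch of that registry idea on the JS side (all names here are hypothetical):

```js
// Hypothetical JS-side registry: MLContext objects cannot cross the Wasm
// boundary directly, so keep them here and hand a string id to ORT via a
// session config entry; the WebNN EP later asks for the context by that id.
const mlContextRegistry = new Map();
let nextMlContextId = 0;

function registerMlContext(context) {
  const id = String(nextMlContextId++);
  mlContextRegistry.set(id, context);
  return id; // this string is what would go through OrtAddSessionConfigEntry
}

function getMlContextById(id) {
  return mlContextRegistry.get(id); // called from the Wasm side on behalf of the EP
}
```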

@huningxin

huningxin commented May 21, 2024

However, it's just one line of code, right? It shouldn't be a problem for anyone

That's true. However, I suppose the concern is that this would be a breaking change.

Could we make MLContext an option for advanced usages (I/O binding, WebGPU interop, etc.) while keeping the existing options? If the MLContext option is present, the other options are ignored.

@fs-eire
Contributor

fs-eire commented May 21, 2024

However, it's just one line of code, right? It shouldn't be a problem for anyone

That's true. However, I suppose the concern is that this would be a breaking change.

Could we make MLContext an option for advanced usages (I/O binding, WebGPU interop, etc.) while keeping the existing options? If the MLContext option is present, the other options are ignored.

Yes, I think this is a good idea. It allows us to preserve backward compatibility.

@fs-eire
Contributor

fs-eire commented May 21, 2024

If we pass the MLContext through the session options, my understanding is that we'll need a way to store the MLContexts in JS, pass an id to ORT, and have the WebNN EP ask for the MLContext using that id (since OrtAddSessionConfigEntry only supports strings). Does this sound correct?

That sounds correct to me. I can think of only one simpler way to do it: if we restrict initialization so that only one session is being initialized at a time, we can just use a "currentContext" slot instead of a map. All other parts should be similar (using the Module object and embind).
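
A sketch of that simpler "currentContext" variant (internal names such as `createSessionInWasm` are hypothetical; `getInstance()` stands in for the existing Wasm Module accessor mentioned above):

```js
// Hypothetical flow: because only one session is initialized at a time, a
// single slot on the Wasm Module object is enough; the WebNN EP reads it via
// embind during its own initialization, and the slot is cleared afterwards.
async function initWebnnSession(modelBuffer, webnnOptions) {
  const wasmModule = getInstance();          // existing accessor for the Wasm Module
  wasmModule.currentContext = webnnOptions.context
      ?? await navigator.ml.createContext(webnnOptions);
  try {
    return await createSessionInWasm(modelBuffer, webnnOptions); // hypothetical internal call
  } finally {
    wasmModule.currentContext = undefined;   // release the slot for the next session
  }
}
```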

@Honry
Contributor

Honry commented May 21, 2024

Thanks @fs-eire, @huningxin, good idea. @egalli can continue his PR then.

Now only one issue remains: my concern is that passing WebNN-specific options to backend initialization does not make much sense.

#20735: backend initialization (an abstraction) depends on the WebNN options and does not support the example in this issue (yes, there are 2 "webnn" entries in the config).

@fs-eire
Contributor

fs-eire commented May 21, 2024

Thanks @fs-eire, @huningxin, good idea. @egalli can continue his PR then.

Now only one issue remains: my concern is that passing WebNN-specific options to backend initialization does not make much sense.

#20735: backend initialization (an abstraction) depends on the WebNN options and does not support the example in this issue (yes, there are 2 "webnn" entries in the config).

If we need to pass the MLContext from JS to C++ anyway, there is no need to pass the options to C++.

@egalli

egalli commented May 21, 2024

Even if we pass the MLContext, we would still need to pass the deviceType to the C++ code as it is used to select between NCHW and NHWC.

if (webnn_device_flags.compare("cpu") == 0) {
  preferred_layout_ = DataLayout::NHWC;
}

It would be nice if WebNN provided a way to query the preferred layout from the MLContext.

@fs-eire
Contributor

fs-eire commented May 21, 2024

Even if we pass the MLContext, we would still need to pass the deviceType to the C++ code as it is used to select between NCHW and NHWC.

if (webnn_device_flags.compare("cpu") == 0) {
  preferred_layout_ = DataLayout::NHWC;
}

It would be nice if WebNN provided a way to query the preferred layout from the MLContext.

So currently there is no way to do that? Does that mean that if we want to let users pass the MLContext, we still need to pass the device type as well?

@huningxin

Even if we pass the MLContext, we would still need to pass the deviceType to the C++ code as it is used to select between NCHW and NHWC.

The WebNN spec supports both layouts. This workaround exists because of a limitation of the previous Chromium WebNN XNNPACK implementation for CPU. Chromium now uses a TFLite implementation for CPU that supports NCHW, so this workaround will no longer be necessary once that is fixed.

It would be nice if WebNN provided a way to query the preferred layout from the MLContext.

There is an ongoing discussion in the Working Group about "Allow checking whether operators/types are supported for a backend before creating a graph". I think querying the preferred layout is a good use case. Feel free to chime in and share your input.

@Honry
Contributor

Honry commented May 22, 2024

Thanks @fs-eire, @huningxin, good idea. @egalli can continue his PR then.
Now only one issue remains: my concern is that passing WebNN-specific options to backend initialization does not make much sense.

#20735: backend initialization (an abstraction) depends on the WebNN options and does not support the example in this issue (yes, there are 2 "webnn" entries in the config).

If we need to pass the MLContext from JS to C++ anyway, there is no need to pass the options to C++.

As we discussed, we want to keep the current options and add an additional option for the MLContext. If the user uses the current options, this is still an issue.

E.g. with the current options below, if creating the WebNN GPU context fails, it throws from the WebNN EP (C++) and does not fall back to the webgpu EP. If we want to check this early, we have to pass the WebNN options to backend initialization.

const mySession = await ort.InferenceSession.create("./model.onnx", {
  executionProviders: [
    {
      name: "webnn",
      deviceType: "gpu",
    },
    "webgpu",
  ],
});

@fs-eire
Contributor

fs-eire commented May 22, 2024

Thanks @fs-eire, @huningxin, good idea. @egalli can continue his PR then.
Now only one issue remains: my concern is that passing WebNN-specific options to backend initialization does not make much sense.

#20735: backend initialization (an abstraction) depends on the WebNN options and does not support the example in this issue (yes, there are 2 "webnn" entries in the config).

If we need to pass the MLContext from JS to C++ anyway, there is no need to pass the options to C++.

As we discussed, we want to keep the current options and add an additional option for the MLContext. If the user uses the current options, this is still an issue.

E.g. with the current options below, if creating the WebNN GPU context fails, it throws from the WebNN EP (C++) and does not fall back to the webgpu EP. If we want to check this early, we have to pass the WebNN options to backend initialization.

const mySession = await ort.InferenceSession.create("./model.onnx", {
  executionProviders: [
    {
      name: "webnn",
      deviceType: "gpu",
    },
    "webgpu",
  ],
});

we want to keep the current options and add an additional option for the MLContext

Yes, but the implementation can be different. Actually, we can do this in JS:

  • if there is no MLContext in the session options, try to create an MLContext in JS;
  • always pass the MLContext from JS to C++.

This should avoid the C++ exception.

EDIT: I had a comment in #20600 suggesting creating the MLContext in C++, but that was before this issue was created. Now I think it is better to create it in JS (or let the user create it, if using the new interface) so that we can avoid this problem.
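
In code, the two bullets above amount to something like this (a sketch; the helper name is hypothetical):

```js
// Hypothetical resolution step run in JS before the EP is initialized:
// reuse a user-provided MLContext if present, otherwise create one from the
// WebNN session options, and always hand the resulting context to C++.
async function resolveMlContext(webnnOptions) {
  if (webnnOptions.context) {
    return webnnOptions.context;             // user-supplied context
  }
  // Creation failures surface here, in JS, instead of throwing inside the
  // WebNN EP (C++) after session construction has already started.
  return navigator.ml.createContext({
    deviceType: webnnOptions.deviceType,
    powerPreference: webnnOptions.powerPreference,
  });
}
```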

@Honry
Contributor

Honry commented May 22, 2024

Yes, but the implementation can be different. Actually, we can do this in JS:

if there is no MLContext in the session options, try to create an MLContext in JS;
always pass the MLContext from JS to C++.
This should avoid the C++ exception.

EDIT: I had a comment in #20600 suggesting creating the MLContext in C++, but that was before this issue was created. Now I think it is better to create it in JS (or let the user create it, if using the new interface) so that we can avoid this problem.

Thus we still need to pass the WebNN options to backend initialization.

And this depends on the WebNN spec exposing the device type on the corresponding MLContext, because inside the WebNN EP we need the device type to filter the supported ops, data types, layout, etc., whose support status currently differs among device types.

@fs-eire
Contributor

fs-eire commented May 22, 2024

Yes, but the implementation can be different. Actually, we can do this in JS:
if there is no MLContext in the session options, try to create an MLContext in JS;
always pass the MLContext from JS to C++.
This should avoid the C++ exception.

EDIT: I had a comment in #20600 suggesting creating the MLContext in C++, but that was before this issue was created. Now I think it is better to create it in JS (or let the user create it, if using the new interface) so that we can avoid this problem.

Thus we still need to pass the WebNN options to backend initialization.

And this depends on the WebNN spec exposing the device type on the corresponding MLContext, because inside the WebNN EP we need the device type to filter the supported ops, data types, layout, etc., whose support status currently differs among device types.

we need the device type to filter the supported ops, data types, layout, etc.

I understand the part about the C++ code needing the WebNN options.

The one thing I don't understand is: if an MLContext instance has already been created, is it still possible for the WebNN EP to fail EP initialization?

@Honry
Contributor

Honry commented May 22, 2024

The one thing I don't understand is: if an MLContext instance has already been created, is it still possible for the WebNN EP to fail EP initialization?

No, it isn't. But it throws at the point where creating the MLContext with the current WebNN options fails, and then the session creation is rejected, just as @sansmoraxz encountered in his use case.

Checking context creation early in initEp() would resolve his issue.

    if (epName === 'webnn') {
      // perform WebNN availability check
      if (typeof navigator === 'undefined' || !(navigator as unknown as {ml: unknown}).ml) {
        throw new Error('WebNN is not supported in current environment');
      }

      await initJsep('webnn', getInstance(), env);
    }

@fs-eire
Contributor

fs-eire commented May 22, 2024

I think that until WebNN allows getting the options back from an MLContext object, we need users to pass both deviceType and powerPreference in the session options.

If an MLContext is not specified, they are used to create the MLContext (or to fail if it is not available).
If an MLContext is specified, they are still needed in C++ for setting the preferred layout, etc.

@fs-eire
Contributor

fs-eire commented May 22, 2024

I don't think this is a clean solution. A better solution would be to expose properties for getting that metadata from the MLContext object, which may require a spec review process.

The reason is not only that the options are redundant information (the process of creating the MLContext already implicitly includes it), but also that they may cause inconsistency: a user might create the MLContext with 'cpu' but then pass that MLContext in the session options together with deviceType: 'gpu'. We can only assume that users will not make this kind of mistake.
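
For example, nothing in this shape of the options prevents a mismatch like the following (illustrative only):

```js
// Hypothetical mismatch: the context was created for CPU, but the session
// options claim GPU; ORT has no reliable way to detect the inconsistency today.
const cpuContext = await navigator.ml.createContext({ deviceType: 'cpu' });
const webnnOptions = {
  name: 'webnn',
  context: cpuContext,
  deviceType: 'gpu', // contradicts how cpuContext was actually created
};
```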

For now, we can ask users to pass this information in the session options and eventually deprecate those options in favor of passing only the MLContext, which may take several releases after the spec allows it (if that ever happens).

@fs-eire
Contributor

fs-eire commented May 24, 2024

Following the discussion above, I created #20816, which attempts an API update for executionProviders for WebNN. Please help review the change.

@huningxin

I don't think this is a clean solution. A better solution would be to expose properties for getting that metadata from the MLContext object, which may require a spec review process.

Agreed, I think MLContext should expose preferredLayout. I added this input to the corresponding spec issue for WG discussion: webmachinelearning/webnn#463 (comment)

fs-eire added a commit that referenced this issue May 31, 2024
### Description

This PR is an API-only change to address the requirements being
discussed in #20729.

There are multiple ways that users may create an ORT session by
specifying the session options differently.

All the code snippets below use the variable `webnnOptions` like this:
```js
const myWebnnSession = await ort.InferenceSession.create('./model.onnx', {
   executionProviders: [
     webnnOptions
   ]
});
```

### The old way (backward-compatibility)

```js
// all-default, name only
const webnnOptions_0 = 'webnn';

// all-default, properties omitted
const webnnOptions_1 = { name: 'webnn' };

// partial
const webnnOptions_2 = {
  name: 'webnn',
  deviceType: 'cpu'
};

// full
const webnnOptions_3 = {
  name: 'webnn',
  deviceType: 'gpu',
  numThreads: 1,
  powerPreference: 'high-performance'
};
```

### The new way (specify with MLContext)

```js
// options to create MLContext
const options = {
  deviceType: 'gpu',
  powerPreference: 'high-performance'
};

const myMlContext = await navigator.ml.createContext(options);

// options for session options
const webnnOptions = {
  name: 'webnn',
  context: myMlContext,
  ...options
};
```

This should throw (because no deviceType is specified):
```js
const myMlContext = await navigator.ml.createContext({ ... });
const webnnOptions = {
  name: 'webnn',
  context: myMlContext
};
```

### Interop with WebGPU
```js
// get WebGPU adapter and device
const adapter = await navigator.gpu.requestAdapter({ ... });
const device = await adapter.requestDevice({ ... });

// set WebGPU adapter and device
ort.env.webgpu.adapter = adapter;
ort.env.webgpu.device = device;

const myMlContext = await navigator.ml.createContext(device);
const webnnOptions = {
  name: 'webnn',
  context: myMlContext,
  gpuDevice: device
};
```

This should throw (because the GPU device and the MLContext option cannot both be specified at the same time):
```js
const webnnOptions = {
  name: 'webnn',
  context: myMlContext,
  gpuDevice: device,
  deviceType: 'gpu'
};
```