Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LLVM IR generation bug when adding metadata for excinfo-related store instruction #9550

Open
dlee992 opened this issue Apr 29, 2024 · 8 comments · May be fixed by #9551
Open

LLVM IR generation bug when adding metadata for excinfo-related store instruction #9550

dlee992 opened this issue Apr 29, 2024 · 8 comments · May be fixed by #9551
Labels
bug - incorrect behavior Bugs: incorrect behavior

Comments

@dlee992
Copy link
Contributor

dlee992 commented Apr 29, 2024

I noticed in a certain case, numba will emit LLVM IR like this:

B10.if:                                           ; preds = %B10.if.loopexit, %B10.if.loopexit10
  %excinfo.2.0 = phi { i8*, i32, i8*, i8*, i32 }* [ @.const.picklebuf.23377073394240, %B10.if.loopexit10 ], [ @.const.picklebuf.23377075691904, %B10.if.loopexit ]
  %2 = bitcast i32* %.9.i to i8*
  %3 = bitcast i32* %.7.i to i8*
  call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %3)
  call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %2)
  store { i8*, i32, i8*, i8*, i32 }* %excinfo.2.0, { i8*, i32, i8*, i8*, i32 }** %excinfo, align 8
  br label %common.ret

I think this branch is used for storing excinfo and return. But the store instr doesn't have the metadata of numba_exception_output, which makes the ref_prune pass in llvmlite can't remove some ref pairs.
The source code in llvmlite:

    bool isRaising(const BasicBlock *bb) {
        for (auto &instruction : *bb) {
            if (instruction.getOpcode() == Instruction::Store) {
                auto name = dyn_cast<StoreInst>(&instruction)
                                ->getPointerOperand()
                                ->getName();
                if (name.equals("excinfo") &&
                    instruction.hasMetadata("numba_exception_output")) {
                    return true;
                }
            }
        }
        return false;
    }

Willing to provide a python test case later. Trying to find a fix meanwhile.

@dlee992
Copy link
Contributor Author

dlee992 commented Apr 29, 2024

Yeah, in numba/core/callconv.py, more than one location can generate excinfo-related store instr, but only one location adds the metadata for it.

Can push a PR soon.

@dlee992
Copy link
Contributor Author

dlee992 commented Apr 29, 2024

could be related to #9186, where I observed this excinfo change hurts the performance.

@guilhermeleobas
Copy link
Collaborator

@dlee992, I wasn't aware of the numba_exception_output metadata. On callconv.py::set_dynamic_user_exc, one should set the metadata for the last store instruction as well.

zero = int32_t(0)
exc_fields = (builder.extract_value(struct_gv, PICKLE_BUF_IDX),
builder.extract_value(struct_gv, PICKLE_BUFSZ_IDX),
builder.bitcast(st_ptr, GENERIC_POINTER),
builder.bitcast(unwrap_fn, GENERIC_POINTER),
int32_t(len(struct_type)))
for idx, arg in enumerate(exc_fields):
builder.store(arg, builder.gep(excinfo_p, [zero, int32_t(idx)]))
builder.store(excinfo_p, excinfo_pp)

Can you try that?

@sklam
Copy link
Member

sklam commented Apr 30, 2024

The LLVM IR is from post-optimization and LLVM is allowed to move/remove metadata as it wish. Perhaps, we should stop using metadata but instead detect the store going to %exeinfo which must be one of the argument on the LLVM function.

So, in isRaising, it should detect that the store target is the LLVM argument for the exception pointer.

@guilhermeleobas
Copy link
Collaborator

@sklam, should we go with adding the metadata for the missing case? Or just try to fix the isRaising in llvmlite?

@dlee992
Copy link
Contributor Author

dlee992 commented Apr 30, 2024

@guilhermeleobas, yes, I tried to add more metadata in #9551 . And my internal test case's performance is better with that. ex: w/o this PR, calls of NRT_decref are 3000 times; w/ this PR, the calls go down to 1800 times.

@sklam, yes, changing the detection way in llvmlite side is also a solution.

Copy link

This issue is marked as stale as it has had no activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with any updates and confirm that this issue still needs to be addressed.

@github-actions github-actions bot added the stale Marker label for stale issues. label May 31, 2024
@gmarkall gmarkall added bug - incorrect behavior Bugs: incorrect behavior and removed needtriage stale Marker label for stale issues. labels May 31, 2024
@gmarkall
Copy link
Member

The discussion seems like this is an actual issue, though I'm not clear about the exact nature of it, so I've labelled as "incorrect behaviour" and removed "needtriage" so it won't go stale again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug - incorrect behavior Bugs: incorrect behavior
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants