[WIP] Optimal intercept initialization for simple objectives #10298

david-cortes · 2024-05-18T20:43:16Z

This PR changes the intercept initialization for simple objectives (logistic, poisson, gamma, tweedie) to use their closed-form optimal solution (that is, the constant that minimizes the objective function) instead of a single Newton step, which is generally not optimal.

For these objectives, the optimal intercept is simply the link function applied to the mean of the response variable. Since base_score already undergoes this transformation, this PR only changes the computed value to the mean of the response variable in those cases.
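As a minimal sketch of the claim above (not the code in this PR), the following pure-Python snippet compares the closed-form intercept with a single Newton step for the logistic objective. The helper names, the toy labels, and the choice of a raw score of zero as the Newton starting point are all assumptions made for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def logistic_loss(y, b):
    """Mean negative log-likelihood when every row gets the same raw score b."""
    p = sigmoid(b)
    return -sum(yi * math.log(p) + (1 - yi) * math.log(1 - p) for yi in y) / len(y)


def closed_form_intercept(y):
    """Link function (logit) applied to the mean of the response."""
    m = sum(y) / len(y)
    return math.log(m / (1.0 - m))


def one_step_newton_intercept(y, start=0.0):
    """One Newton step on the summed logistic loss (start=0.0 is an assumption)."""
    p = sigmoid(start)
    grad = sum(p - yi for yi in y)    # first derivative w.r.t. the raw score
    hess = len(y) * p * (1.0 - p)     # second derivative w.r.t. the raw score
    return start - grad / hess


y = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]    # toy labels with mean 0.1
b_opt = closed_form_intercept(y)      # logit(0.1)
b_newton = one_step_newton_intercept(y)

print(b_opt, logistic_loss(y, b_opt))
print(b_newton, logistic_loss(y, b_newton))
```

On imbalanced labels like these, the single Newton step stops short of the minimizer, so its loss is higher than the loss at the closed-form intercept.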

For multi-target versions of these objectives, it sets the intercept to zero instead, since applying a common intercept might not make much sense for the given problem.

Note that there's still room for improvements:

- Custom user-defined objectives would most likely be better served by a default score of zero or by a 1D Newton estimation. I wasn't sure where in the code to detect that a user-defined objective was passed, though.
- Other objectives would likely benefit from using more than one Newton step for the intercept estimation.
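To illustrate the point about multiple Newton steps, here is a hedged sketch of a generic 1D Newton iteration for the intercept, driven by a per-row gradient/hessian callback. The function names are hypothetical, and the Poisson derivatives below are the textbook log-link ones, which are not necessarily identical to what XGBoost computes internally.

```python
import math


def newton_intercept(grad_hess, y, n_steps=1, start=0.0):
    """Run n_steps of 1D Newton on the intercept b, starting from `start`.

    grad_hess(b, yi) must return (gradient, hessian) of the per-row loss
    with respect to the raw score b.
    """
    b = start
    for _ in range(n_steps):
        g = 0.0
        h = 0.0
        for yi in y:
            gi, hi = grad_hess(b, yi)
            g += gi
            h += hi
        if h <= 0.0:  # no usable curvature, stop rather than divide
            break
        b -= g / h
    return b


def poisson_grad_hess(b, yi):
    """Derivatives of the loss exp(b) - yi * b (Poisson with log link)."""
    mu = math.exp(b)
    return mu - yi, mu


y = [2, 3, 4, 7]  # toy counts with mean 4
one_step = newton_intercept(poisson_grad_hess, y, n_steps=1)
many_steps = newton_intercept(poisson_grad_hess, y, n_steps=20)

print(one_step, many_steps, math.log(sum(y) / len(y)))
```

With this toy data, one step from zero overshoots the minimizer, while repeated steps converge to the log of the mean, which matches the closed-form solution for this objective.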

Note1: I wasn't sure how to calculate a weighted sample mean here (I'm not familiar with GPU computing and the 'devices' logic). It would be helpful to have a WeightedMean function under stats, if possible, to use when there are sample weights.
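For reference, the quantity being asked for is just the weighted sample mean. The sketch below is a CPU-only, pure-Python illustration. `weighted_mean` is a hypothetical helper, not the `WeightedMean` function requested above and not an existing XGBoost API.

```python
def weighted_mean(values, weights=None):
    """Return sum(w_i * v_i) / sum(w_i); fall back to the plain mean without weights."""
    if weights is None:
        return sum(values) / len(values)
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("sum of weights must be positive")
    return sum(v * w for v, w in zip(values, weights)) / total


labels = [1, 0, 0]
sample_weight = [2.0, 1.0, 1.0]

print(weighted_mean(labels))                 # unweighted mean
print(weighted_mean(labels, sample_weight))  # weighted mean
```

When sample weights are present, this weighted mean would take the place of the plain mean before the link function is applied.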

Note2: The compiler checks here don't like turning a linalg::Tensor<T, 2> into a linalg::Tensor<T, 1> with reinterpret_cast. I'm also not sure what the right way to do it without a data copy would be.

Note3: I wasn't sure where to add tests for the changes here. For example, it would be ideal to test that binary:logistic and binary:logitraw produce the same raw scores, but I'm not sure where the right place to add such a test is.

trivialfis · 2024-05-20T04:17:09Z

Thank you for working on this! I will look into the changes.

david-cortes added 2 commits May 18, 2024 22:40

optimal intercept initialization ref dmlc#9899

91bdb07

Merge branch 'master' into intercepts

e1eb802

david-cortes changed the title ~~Optimal intercept initialization for simple objectives~~ [WIP] Optimal intercept initialization for simple objectives May 18, 2024

david-cortes added 5 commits May 18, 2024 22:49

linter

0ad3448

use default value for multi-targets

7b88455

add cpu-only weighted mean

174e5eb

initialize to zero for multitarget

41d5608

linter

08cbe85
