data: implement resultspecs dataclass compatibility #7611

tdesveaux · 2024-05-14T14:47:44Z

As mentioned in #7610

resultspecs were only really working with dict data. This fixes them when used with dataclasses.

The change to includeFields implements field filtering, which converts a dataclass to a dict.
I think this should just not work, but implementation was trivial so I'm still looking for feedback.

Contributor Checklist:

I have updated the unit tests
[n/a?] I have created a file in the newsfragments directory (and read the README.txt in that directory)
I have updated the appropriate documentation

p12tic · 2024-05-15T19:31:17Z

master/buildbot/test/unit/data/test_resultspec.py

+
+class ResultSpecMKListMixin:
+    @staticmethod
+    def mkdata(fld: Sequence[str] | str, *values):


Maybe just have a free function? It doesn't use self, so a mixin is possibly overkill.

Ah ok, it's used below for inheritance.

p12tic · 2024-05-15T19:34:56Z

I think this should just not work, but implementation was trivial so I'm still looking for feedback

I don't understand what the downsides are with the current approach and implementation. What future problems could we have if this PR is merged?

From a practical standpoint I think the current approach is perfectly fine.

p12tic · 2024-05-15T19:35:07Z

The PR needs rebase due to conflicts.

tdesveaux · 2024-05-15T19:47:32Z

I think this should just not work, but implementation was trivial so I'm still looking for feedback

I don't understand what the downsides are with the current approach and implementation. What future problems could we have if this PR is merged?

From a practical standpoint I think the current approach is perfectly fine.

When ResultSpec.apply() is called with data that is a dataclass and the ResultSpec has fields, it silently converts the dataclass to a dict. This seems pretty risky.

I see two options:

always convert the dataclass to a dict in apply
raise an exception if apply is called with a dataclass

I was mostly worried about when a ResultSpec was used in the DB layer, but this codepath will ignore the ResultSpec.fields, so it won't change DB API return types.
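
A minimal sketch of the concern above, not Buildbot's actual code: `BuildModel`, `include_fields` and `include_fields_strict` are hypothetical names made up for illustration. The first helper shows how field filtering can silently turn a dataclass into a dict; the second shows the "raise an exception" option.

```python
from dataclasses import asdict, dataclass, is_dataclass


@dataclass
class BuildModel:
    # Hypothetical DB model, for illustration only.
    id: int
    number: int
    state: str


def include_fields(item, fields):
    """Keep only `fields`. A dataclass input silently comes back as a dict."""
    if is_dataclass(item):
        item = asdict(item)
    return {k: v for k, v in item.items() if k in fields}


def include_fields_strict(item, fields):
    """Alternative: refuse dataclass input instead of converting it."""
    if is_dataclass(item):
        raise TypeError("field filtering is not supported on dataclass rows")
    return {k: v for k, v in item.items() if k in fields}


build = BuildModel(id=1, number=7, state="done")
filtered = include_fields(build, ["id", "state"])
# The caller passed a BuildModel but gets a plain dict back.
print(type(build).__name__, "->", type(filtered).__name__)
```

With the first helper, code that later does `filtered.state` fails with an AttributeError far from the place where the type changed, which is the kind of surprise the strict variant would surface immediately.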

p12tic · 2024-05-15T22:36:05Z

More conflicts again.

Regarding the risks, resultspecs are used both in DB and data layers, so we need to support both.

p12tic · 2024-05-15T22:50:16Z

I think that we should have 2 apply functions in ResultSpec: one (e.g. apply_db) for DB layer, another (e.g. apply_data) for data layer. The current code is a mess that sort of works everywhere, but is very brittle because things in DB layer are actually quite different from things in data layer. We then have code like isinstance(data, base.ListResult) checks which are only relevant for data layer. The introduction of data classes made this defect more visible.

I think for the database layer we can simply ignore fields. Most queries return full rows anyway. In the cases where an incomplete row is returned, we can either change the code to request the full row or adjust the dataclass to allow null fields.

This will not have any user impact because resultspec filtering is only relevant to REST queries and they do resultspec filtering again. See

# post-process any remaining parts of the resultspec
data = rspec.apply(data)

in master/buildbot/www/rest.py.
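
A rough sketch of what such a split could look like, under stated assumptions: `MiniResultSpec` and `Row` are hypothetical stand-ins, not the real ResultSpec, and only `apply_db` / `apply_data` are names taken from the suggestion above. The DB variant ignores fields and preserves row types; the data variant converts to dicts and applies field filtering.

```python
from dataclasses import asdict, dataclass, is_dataclass
from typing import Optional, Sequence


@dataclass
class Row:
    # Hypothetical DB row model.
    id: int
    name: str


@dataclass
class MiniResultSpec:
    # Hypothetical, heavily reduced stand-in for a result spec.
    fields: Optional[Sequence[str]] = None
    order: Optional[str] = None  # one field name; "-" prefix means descending
    offset: int = 0
    limit: Optional[int] = None

    def _sort_and_slice(self, rows, getter):
        rows = list(rows)
        if self.order is not None:
            reverse = self.order.startswith("-")
            key = self.order[1:] if reverse else self.order
            rows.sort(key=lambda r: getter(r, key), reverse=reverse)
        end = None if self.limit is None else self.offset + self.limit
        return rows[self.offset:end]

    def apply_db(self, rows):
        """DB layer: order/offset/limit only; `fields` is ignored, types kept."""
        return self._sort_and_slice(rows, getattr)

    def apply_data(self, rows):
        """Data layer: work on dicts and apply field filtering last."""
        dicts = [asdict(r) if is_dataclass(r) else dict(r) for r in rows]
        dicts = self._sort_and_slice(dicts, lambda d, k: d[k])
        if self.fields is not None:
            dicts = [{k: v for k, v in d.items() if k in self.fields} for d in dicts]
        return dicts


rows = [Row(2, "b"), Row(1, "a"), Row(3, "c")]
spec = MiniResultSpec(fields=["name"], order="-id", limit=2)
db_rows = spec.apply_db(rows)      # still Row instances, all attributes present
data_rows = spec.apply_data(rows)  # plain dicts containing only "name"
```

Keeping the two entry points separate means neither needs isinstance checks on its input to guess which layer it is running in.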

tdesveaux · 2024-05-16T20:20:29Z

Marking as Draft, as I need a chunk of time to take a good look at this and I won't be able to until next week.

tdesveaux · 2024-05-21T10:15:52Z

I think that we should have 2 apply functions in ResultSpec: one (e.g. apply_db) for DB layer, another (e.g. apply_data) for data layer. The current code is a mess that sort of works everywhere, but is very brittle because things in DB layer are actually quite different from things in data layer. We then have code like isinstance(data, base.ListResult) checks which are only relevant for data layer. The introduction of data classes made this defect more visible.

I think for the database layer we can simply ignore fields. Most queries return full rows anyway. In the cases where an incomplete row is returned, we can either change the code to request the full row or adjust the dataclass to allow null fields.

This will not have any user impact because resultspec filtering is only relevant to REST queries and they do resultspec filtering again. See
# post-process any remaining parts of the resultspec
data = rspec.apply(data)
in master/buildbot/www/rest.py.

@p12tic from what I saw, ResultSpec.apply is never used in the DB layer, only ResultSpec.thd_execute.
apply is used by FakeDB in test contexts, which led to the change I made to make it work, but it's unnecessary.

A better solution would be to change FakeDB to use ResultSpec without apply for now.

Longer term, I think it's a bit dangerous that some DB functions take a ResultSpec as an argument and we trust that they will only call thd_execute.
From the caller's perspective, it can lead to weird cases where the fields and properties are ignored.

So a second change would be to remove the result_spec arg from the DB functions and replace it with an instance of a new class that only allows adding order, limit, offset and where directives to the query.
My opinion is that trying to factor the Data and DB usages of ResultSpec into one class would result in a complex implementation with too many edge cases.
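
To make the idea concrete, here is a hedged sketch of such a restricted class. `DbQueryOptions` and `to_sql` are hypothetical names, and the plain SQL string is only for illustration; it is not how Buildbot builds its queries. The point is that the class has no fields or properties attribute, so a caller cannot pass something the DB layer would silently ignore.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DbQueryOptions:
    # Hypothetical: only directives a DB query can actually honor.
    where: tuple = ()   # (column, value) equality pairs
    order: tuple = ()   # column names; "-" prefix means descending
    limit: Optional[int] = None
    offset: Optional[int] = None

    def to_sql(self, table, allowed_columns):
        """Build a parameterized SELECT; column names are checked, values bound."""
        sql = f"SELECT * FROM {table}"
        params = []
        if self.where:
            clauses = []
            for col, value in self.where:
                if col not in allowed_columns:
                    raise ValueError(f"unknown column: {col}")
                clauses.append(f"{col} = ?")
                params.append(value)
            sql += " WHERE " + " AND ".join(clauses)
        if self.order:
            parts = []
            for entry in self.order:
                desc = entry.startswith("-")
                col = entry[1:] if desc else entry
                if col not in allowed_columns:
                    raise ValueError(f"unknown column: {col}")
                parts.append(f"{col} DESC" if desc else f"{col} ASC")
            sql += " ORDER BY " + ", ".join(parts)
        if self.limit is not None:
            sql += " LIMIT ?"
            params.append(self.limit)
            # In this sketch, offset is only emitted together with a limit.
            if self.offset:
                sql += " OFFSET ?"
                params.append(self.offset)
        return sql, params


opts = DbQueryOptions(where=(("builderid", 1),), order=("-number",), limit=2)
sql, params = opts.to_sql("builds", {"builderid", "number"})
```

A DB function that accepts only this type documents in its signature which directives it supports, instead of relying on callers to know that part of a ResultSpec is ignored.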

So to summarize, I'll now:

update FakeDB implementations to not use ResultSpec.apply

If you confirm my understanding and agree with my proposal above:

Create an issue to split ResultSpec DB implementation into another class.

Depending on my availability, I could take care of the split once I'm done with the move to dataclasses for DB models.

p12tic reviewed May 15, 2024

tdesveaux force-pushed the data/resultspec/dataclass-compat branch from d65e0f7 to 9c4d61d Compare May 15, 2024 19:50

tdesveaux marked this pull request as draft May 16, 2024 20:20

data: implement resultspecs dataclass compatibility

e32974a

tdesveaux force-pushed the data/resultspec/dataclass-compat branch from 9c4d61d to e32974a Compare May 21, 2024 09:20
