[Feature] Support pk index major compaction in cloud native table #41737

Merged: 11 commits, May 21, 2024

@TszKitLo40 TszKitLo40 commented Feb 27, 2024

Why I'm doing:

PK index major compaction is not supported in cloud native tables, which causes write amplification.

What I'm doing:

Support pk index major compaction in cloud native table.
Without pk index major compaction

| Test case | publish cost (ms) | MAX IO (MB/s) |
| --- | --- | --- |
| 1G + insert + 1/100 | 121442 | 54 |
| 1G + upsert + 1/100 | 291326 | 73 |
| 1G + upsert + 1/10000 | 338894 | 46.7 |
| 2.5G + insert + 1/100 | 710491 | 121 |
| 2.5G + upsert + 1/100 | 1419680 | 121 |
| 2.5G + upsert + 1/10000 | 307463 | 121 |

With pk index major compaction

| Test case | publish cost (ms) | MAX IO (MB/s) |
| --- | --- | --- |
| 1G + insert + 1/100 | 52866 | 64.1 |
| 1G + upsert + 1/100 | 125956 | 80.8 |
| 1G + upsert + 1/10000 | 405559 | 46.7 |
| 2.5G + insert + 1/100 | 191436 | 117 |
| 2.5G + upsert + 1/100 | 389931 | 108 |
| 2.5G + upsert + 1/10000 | 223999 | 114 |
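
To make the comparison easier to read, here is a small sketch that computes the per-case ratio of publish cost without and with major compaction. The numbers are copied from the two tables above; the helper name `speedups` and the dictionary layout are illustrative only and not part of this PR.

```python
# Publish cost in milliseconds, copied from the two tables above.
without_compaction = {
    "1G + insert + 1/100": 121442,
    "1G + upsert + 1/100": 291326,
    "1G + upsert + 1/10000": 338894,
    "2.5G + insert + 1/100": 710491,
    "2.5G + upsert + 1/100": 1419680,
    "2.5G + upsert + 1/10000": 307463,
}
with_compaction = {
    "1G + insert + 1/100": 52866,
    "1G + upsert + 1/100": 125956,
    "1G + upsert + 1/10000": 405559,
    "2.5G + insert + 1/100": 191436,
    "2.5G + upsert + 1/100": 389931,
    "2.5G + upsert + 1/10000": 223999,
}


def speedups(before, after):
    """Return before/after publish-cost ratio per case (hypothetical helper).

    A ratio above 1 means the case published faster with major compaction.
    """
    return {case: before[case] / after[case] for case in before}


ratios = speedups(without_compaction, with_compaction)
for case, ratio in ratios.items():
    print(f"{case}: {ratio:.2f}x")
```

In these runs, five of the six cases publish faster with major compaction; the `1G + upsert + 1/10000` case is the exception (405559 ms versus 338894 ms).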

In the trace log, we can see that primary_index_commit_latency_us is much lower than without major compaction (909008 us versus 10342835 us in the two publish traces below), because merge compaction is not needed in commit.

I0301 13:51:28.410473  3076 lake_service.cpp:237] Published txns=5050. tablets=10163 cost=1869193us, trace: {"child_traces":[["PublishTablet",{"base_version":93,"deletes":0,"do_update_latency_us":924748,"new_del":0,"primary_index_commit_latency_us":909008,"primary_index_load_latency_us":0,"queuing_latency_us":20,"rewrite_segment_latency_us":19,"rowsetid":93,"state_bytes":9205820,"tablet_id":10163,"total_del":0,"update_index_latency_us":924788,"upsert_rows":151201,"upserts":1}]]}
I0301 13:52:56.940850  4938 lake_service.cpp:237] Published txns=5031. tablets=10139 cost=11237925us, trace: {"child_traces":[["PublishTablet",{"base_version":102,"deletes":0,"do_update_latency_us":870584,"new_del":0,"primary_index_commit_latency_us":10342835,"primary_index_load_latency_us":0,"queuing_latency_us":16,"rewrite_segment_latency_us":14,"rowsetid":102,"state_bytes":10569270,"tablet_id":10139,"total_del":0,"update_index_latency_us":870614,"upsert_rows":151124,"upserts":1}]]}
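
For anyone who wants to pull the same field out of their own logs, a minimal sketch follows. It assumes the `trace:` payload at the end of a "Published txns" line is valid JSON shaped like the two lines above; the function name `commit_latencies` is illustrative and not part of StarRocks.

```python
import json
import re

# The two log lines quoted above, verbatim.
line_a = 'I0301 13:51:28.410473  3076 lake_service.cpp:237] Published txns=5050. tablets=10163 cost=1869193us, trace: {"child_traces":[["PublishTablet",{"base_version":93,"deletes":0,"do_update_latency_us":924748,"new_del":0,"primary_index_commit_latency_us":909008,"primary_index_load_latency_us":0,"queuing_latency_us":20,"rewrite_segment_latency_us":19,"rowsetid":93,"state_bytes":9205820,"tablet_id":10163,"total_del":0,"update_index_latency_us":924788,"upsert_rows":151201,"upserts":1}]]}'
line_b = 'I0301 13:52:56.940850  4938 lake_service.cpp:237] Published txns=5031. tablets=10139 cost=11237925us, trace: {"child_traces":[["PublishTablet",{"base_version":102,"deletes":0,"do_update_latency_us":870584,"new_del":0,"primary_index_commit_latency_us":10342835,"primary_index_load_latency_us":0,"queuing_latency_us":16,"rewrite_segment_latency_us":14,"rowsetid":102,"state_bytes":10569270,"tablet_id":10139,"total_del":0,"update_index_latency_us":870614,"upsert_rows":151124,"upserts":1}]]}'


def commit_latencies(log_line):
    """Return (tablet_id, primary_index_commit_latency_us) pairs from one log line.

    Hypothetical helper: returns an empty list if the line has no trace payload.
    """
    match = re.search(r"trace: (\{.*\})\s*$", log_line)
    if match is None:
        return []
    trace = json.loads(match.group(1))
    # Each child trace is a [name, fields] pair; keep only PublishTablet entries.
    return [
        (fields["tablet_id"], fields["primary_index_commit_latency_us"])
        for name, fields in trace.get("child_traces", [])
        if name == "PublishTablet"
    ]


for line in (line_a, line_b):
    for tablet_id, latency_us in commit_latencies(line):
        print(f"tablet {tablet_id}: commit latency {latency_us} us")
```

For these two traces, the commit latency differs by a factor of about 11, while do_update_latency_us is similar in both.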

Fix #45740

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
  • This is a backport pr

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 3.3
    • 3.2
    • 3.1
    • 3.0
    • 2.5

@TszKitLo40 TszKitLo40 force-pushed the pk-index-major-compaction branch 6 times, most recently from 58c73e7 to e864f96 Compare February 28, 2024 13:24
@TszKitLo40 TszKitLo40 marked this pull request as ready for review February 29, 2024 08:41
@TszKitLo40 TszKitLo40 requested review from a team as code owners February 29, 2024 08:41
@TszKitLo40 TszKitLo40 requested a review from a team as a code owner March 7, 2024 11:47
@wanpengfei-git wanpengfei-git requested a review from a team March 29, 2024 07:28
@TszKitLo40 TszKitLo40 force-pushed the pk-index-major-compaction branch 4 times, most recently from 52bd94e to 0891f92 Compare March 29, 2024 12:15
be/src/util/dynamic_cache.h (review thread: outdated, resolved)
luohaha previously approved these changes Apr 7, 2024
Signed-off-by: Zijie Lu <wslzj40@gmail.com>

[FE Incremental Coverage Report]

pass : 0 / 0 (0%)


[BE Incremental Coverage Report]

pass : 112 / 132 (84.85%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
| --- | --- | --- | --- | --- |
| 🔵 be/src/storage/tablet_manager.cpp | 0 | 1 | 00.00% | [769] |
| 🔵 be/src/storage/lake/local_pk_index_manager.cpp | 44 | 61 | 72.13% | [210, 211, 216, 217, 221, 222, 232, 233, 234, 247, 248, 261, 262, 263, 267, 275, 283] |
| 🔵 be/src/storage/persistent_index_compaction_manager.cpp | 16 | 18 | 88.89% | [52, 81] |
| 🔵 be/src/storage/lake/local_pk_index_manager.h | 1 | 1 | 100.00% | [] |
| 🔵 be/src/storage/olap_server.cpp | 4 | 4 | 100.00% | [] |
| 🔵 be/src/storage/lake/lake_primary_index.h | 4 | 4 | 100.00% | [] |
| 🔵 be/src/storage/persistent_index.cpp | 1 | 1 | 100.00% | [] |
| 🔵 be/src/storage/lake/lake_primary_index.cpp | 13 | 13 | 100.00% | [] |
| 🔵 be/src/storage/lake/update_manager.cpp | 13 | 13 | 100.00% | [] |
| 🔵 be/src/storage/lake/txn_log_applier.cpp | 1 | 1 | 100.00% | [] |
| 🔵 be/src/storage/tablet_updates.h | 1 | 1 | 100.00% | [] |
| 🔵 be/src/storage/storage_engine.cpp | 2 | 2 | 100.00% | [] |
| 🔵 be/src/storage/persistent_index_tablet_loader.cpp | 3 | 3 | 100.00% | [] |
| 🔵 be/src/storage/lake/lake_local_persistent_index.cpp | 1 | 1 | 100.00% | [] |
| 🔵 be/src/storage/lake/lake_local_persistent_index.h | 2 | 2 | 100.00% | [] |
| 🔵 be/src/storage/lake/lake_local_persistent_index_tablet_loader.h | 2 | 2 | 100.00% | [] |
| 🔵 be/src/storage/lake/lake_local_persistent_index_tablet_loader.cpp | 4 | 4 | 100.00% | [] |

@wyb wyb merged commit a154a9a into StarRocks:main May 21, 2024
53 checks passed

@Mergifyio backport branch-3.2

@github-actions github-actions bot added version:3.4 and removed 3.2 labels May 21, 2024

mergify bot commented May 21, 2024

backport branch-3.2

✅ Backports have been created

mergify bot pushed a commit that referenced this pull request May 21, 2024
…1737)

Signed-off-by: Zijie Lu <wslzj40@gmail.com>
(cherry picked from commit a154a9a)

# Conflicts:
#	be/src/storage/lake/lake_primary_index.cpp
#	be/src/storage/lake/lake_primary_index.h
#	be/src/storage/lake/update_manager.cpp
#	be/src/storage/lake/update_manager.h
#	be/src/storage/storage_engine.cpp
@TszKitLo40

@Mergifyio backport branch-3.3


mergify bot commented May 21, 2024

backport branch-3.3

✅ Backports have been created

mergify bot pushed a commit that referenced this pull request May 21, 2024
…1737)

Signed-off-by: Zijie Lu <wslzj40@gmail.com>
(cherry picked from commit a154a9a)
wanpengfei-git pushed a commit that referenced this pull request May 21, 2024
…ckport #41737) (#46002)

Co-authored-by: Zijie Lu <wslzj40@gmail.com>