[Refactor] Introduce ColumnId to support Column renaming (part4) #45842

gengjun-git · 2024-05-17T14:09:21Z

Why I'm doing:

We need a unique ID to identify each Column, and every place that references a Column should use that ID. Changing a Column attribute such as its name then only requires updating the Column object itself. Column already has a uniqueId, but it is only available in newly created tables. For compatibility reasons, we need to introduce another ID: ColumnId. For historical tables, the ColumnId is the column name, because names were previously immutable.

What I'm doing:

There are currently three ways to reference a Column: (1) a direct copy of the Column object, (2) a reference by Column name, and (3) a reference inside a SQL expression.
This PR changes SQL expression references to use ColumnId.
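
As a rough illustration of the idea (not the code in this PR), the sketch below stores an expression by column id and resolves the current name only when rendering SQL. The classes `ColumnId`, `Column`, and `StoredExpr` here are simplified, hypothetical stand-ins, and `to_date` is just a placeholder function name.

```java
import java.util.HashMap;
import java.util.Map;

class ColumnIdSketch {
    /** Hypothetical stand-in for ColumnId: an immutable identifier usable as a map key. */
    static final class ColumnId {
        private final String id;

        ColumnId(String id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof ColumnId && ((ColumnId) o).id.equals(id);
        }

        @Override
        public int hashCode() {
            return id.hashCode();
        }
    }

    /** Hypothetical stand-in for Column: the id is fixed, the name may change. */
    static final class Column {
        final ColumnId columnId;
        String name;

        Column(ColumnId columnId, String name) {
            this.columnId = columnId;
            this.name = name;
        }
    }

    /** A stored expression of the form function(column) that holds the column by id. */
    static final class StoredExpr {
        final String function;
        final ColumnId columnId;

        StoredExpr(String function, ColumnId columnId) {
            this.function = function;
            this.columnId = columnId;
        }

        /** Resolves the id to the column's current name at render time. */
        String toSql(Map<ColumnId, Column> idToColumn) {
            return function + "(" + idToColumn.get(columnId).name + ")";
        }
    }

    static String[] renderBeforeAndAfterRename() {
        // Historical table: the ColumnId equals the original column name.
        ColumnId id = new ColumnId("dt");
        Column column = new Column(id, "dt");
        Map<ColumnId, Column> idToColumn = new HashMap<>();
        idToColumn.put(id, column);

        StoredExpr expr = new StoredExpr("to_date", id);
        String before = expr.toSql(idToColumn);
        column.name = "event_date"; // the rename touches only the Column object
        String after = expr.toSql(idToColumn);
        return new String[] {before, after};
    }

    public static void main(String[] args) {
        String[] rendered = renderBeforeAndAfterRename();
        System.out.println(rendered[0] + " -> " + rendered[1]);
    }
}
```

In this sketch the stored expression is never rewritten on rename; only the lookup through the id-to-column map changes what is rendered.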

What type of PR is this:

Does this PR entail a change in behavior?

Yes, this PR will result in a change in behavior.
No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

Interface/UI changes: syntax, type conversion, expression evaluation, display information
Parameter changes: default values, similar parameters but with different default values
Policy changes: use new policy to replace old one, functionality automatically enabled
Feature removed
Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

I have added test cases for my bug fix or my new feature
This pr needs user documentation (for new or modified features or behaviors)
- I have added documentation for my new feature or new function
This is a backport pr

Bugfix cherry-pick branch check:

starrocks-cr · 2024-05-17T14:09:35Z

fe/fe-core/src/main/java/com/starrocks/catalog/Column.java

@@ -800,7 +816,7 @@ public void gsonPostProcess() throws IOException {
    @Override
    public void gsonPreProcess() throws IOException {
        if (generatedColumnExpr != null) {
-            generatedColumnExprSerialized = new GsonUtils.ExpressionSerializedObject(generatedColumnExpr.toSql());
+            generatedColumnExprSerialized = ExpressionSerializedObject.create(generatedColumnExpr);
        }
    }



The most risky bug in this code is:
Inconsistent handling of null return value leading to potential NullPointerException

You can modify the code like this:

    public List<SlotRef> getGeneratedColumnRef(Map<ColumnId, Column> idToColumn) {
        List<SlotRef> slots = new ArrayList<>();
        if (generatedColumnExpr == null) {
            return slots; // Return an empty list instead of null to avoid potential NullPointerException.
        } else {
            generatedColumnExpr.convertToColumnNameExpr(idToColumn).collect(SlotRef.class, slots);
            return slots;
        }
    }

This change ensures that getGeneratedColumnRef method returns an empty list instead of null, reducing the risk of a NullPointerException in contexts where the returned list is immediately used or iterated over.
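
A minimal, self-contained sketch of the same pattern, independent of the StarRocks classes: `refsOf` and `countRefs` are hypothetical helpers, and a comma-separated string stands in for the expression.

```java
import java.util.ArrayList;
import java.util.List;

class EmptyListSketch {
    /**
     * Hypothetical helper: splits a comma-separated list of column references.
     * A null input stands for "no expression" and yields an empty list, never null.
     */
    static List<String> refsOf(String commaSeparatedRefs) {
        List<String> refs = new ArrayList<>();
        if (commaSeparatedRefs == null) {
            return refs;
        }
        for (String ref : commaSeparatedRefs.split(",")) {
            refs.add(ref.trim());
        }
        return refs;
    }

    /** Callers can iterate the result directly, with no null check. */
    static int countRefs(String commaSeparatedRefs) {
        int count = 0;
        for (String ignored : refsOf(commaSeparatedRefs)) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countRefs(null));
        System.out.println(countRefs("k1, k2"));
    }
}
```

The design choice is that the "nothing to return" case and the "some results" case share one code path at every call site.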

starrocks-cr · 2024-05-17T14:09:35Z

fe/fe-core/src/main/java/com/starrocks/catalog/MaterializedView.java

@@ -1746,7 +1747,7 @@ public Status doAfterRestore(MvRestoreContext mvRestoreContext) throws DdlExcept
        // change ExpressionRangePartitionInfo because mv's db may be changed.
        if (partitionInfo instanceof ExpressionRangePartitionInfo) {
            ExpressionRangePartitionInfo expressionRangePartitionInfo = (ExpressionRangePartitionInfo) partitionInfo;
-            Preconditions.checkState(expressionRangePartitionInfo.getPartitionExprs().size() == 1);
+            Preconditions.checkState(expressionRangePartitionInfo.getPartitionExprsSize() == 1);
            expressionRangePartitionInfo.renameTableName(db.getFullName(), this.name);
        }



The most risky bug in this code is:
Incorrect handling of ExpressionRangePartitionInfo.getPartitionExprs() method calls without considering the change to require idToColumn as a parameter.

You can modify the code like this:

-            partitionRefTableExprs.get(0).setType(expressionRangePartitionInfo.getPartitionExprs().get(0).getType());
+            partitionRefTableExprs.get(0).setType(expressionRangePartitionInfo.getPartitionExprs(idToColumn).get(0).getType());
-            Expr partitionExpr = expressionRangePartitionInfo.getPartitionExprs().get(0);
+            Expr partitionExpr = expressionRangePartitionInfo.getPartitionExprs(idToColumn).get(0);
-            Preconditions.checkState(expressionRangePartitionInfo.getPartitionExprs().size() == 1);
+            Preconditions.checkState(expressionRangePartitionInfo.getPartitionExprsSize() == 1);

This adjustment ensures that callers use the updated method signatures, which now require an idToColumn map to fetch partition expressions. It also replaces .getPartitionExprs().size() with .getPartitionExprsSize(), which reflects the intended behavior and avoids potential NullPointerExceptions or incorrect behavior caused by missing arguments.

starrocks-cr · 2024-05-17T14:09:44Z

fe/fe-core/src/main/java/com/starrocks/catalog/ExpressionRangePartitionInfo.java

-        for (Expr expr : partitionExprs) {
-            expr.accept(renameVisitor, null);
+        for (ColumnIdExpr expr : partitionExprs) {
+            expr.getExpr().accept(renameVisitor, null);
        }
    }



The most risky bug in this code is:
Incorrect handling and potential loss of partition expressions during serialization/deserialization

You can modify the code like this:

    public static PartitionInfo read(DataInput in) throws IOException {
        // Instead of returning null, properly implement the deserialization logic if applicable.
        // The provided code snippet does not include the complete implementation for read and write methods,
        // which are critical for correct serialization and deserialization of the object.
        // Assuming ExpressionSerializedObject and ColumnIdExpr have proper serialize and deserialize methods implemented.
        ExpressionRangePartitionInfo info = new ExpressionRangePartitionInfo();
        info.readFields(in); // Assuming this reads the basic fields required except partition expressions

        // Properly handle reading the serialized form of partition expressions.
        String json = Text.readString(in);
        List<ExpressionSerializedObject> expressionSerializedObjects =
                GsonUtils.GSON.fromJson(json, new TypeToken<List<ExpressionSerializedObject>>() {}.getType());
        List<ColumnIdExpr> partitionExprs = new ArrayList<>();
        for (ExpressionSerializedObject expressionSerializedObject : expressionSerializedObjects) {
            if (expressionSerializedObject != null) {
                ColumnIdExpr columnIdExpr = expressionSerializedObject.deserialize(); // Assuming an appropriate deserialize method exists
                partitionExprs.add(columnIdExpr);
            }
        }
        info.setPartitionExprs(partitionExprs); // Ensure this method properly updates the object's state
        return info;
    }

    @Override
    public void write(DataOutput out) throws IOException {
        super.write(out);
        // Serialize the partition expressions into JSON string
        List<ExpressionSerializedObject> serializedPartitionExprs = new ArrayList<>();
        for (ColumnIdExpr columnIdExpr : partitionExprs) {
            // Assuming an appropriate method to serialize a ColumnIdExpr to ExpressionSerializedObject
            serializedPartitionExprs.add(ExpressionSerializedObject.create(columnIdExpr));
        }
        // Convert the list of serialized objects to JSON string and write it
        String json = GsonUtils.GSON.toJson(serializedPartitionExprs);
        Text.writeString(out, json);
    }

This rectifies the potential issue by ensuring that the read and write methods for serialization/deserialization are appropriately implemented, thus avoiding data loss or inconsistencies.
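
To make the read/write symmetry concrete without depending on Gson or the StarRocks classes, here is a small sketch that swaps in the JDK's `DataOutputStream`/`DataInputStream` and treats each partition expression as an already-serialized string. The class name, method names, and the sample expressions are assumptions for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PartitionExprRoundTrip {
    /** Writes the expression count followed by each serialized expression. */
    static byte[] write(List<String> serializedExprs) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(serializedExprs.size());
            for (String expr : serializedExprs) {
                out.writeUTF(expr);
            }
        }
        return bytes.toByteArray();
    }

    /** Reads the fields back in exactly the order write() emitted them. */
    static List<String> read(byte[] data) throws IOException {
        List<String> exprs = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                exprs.add(in.readUTF());
            }
        }
        return exprs;
    }

    /** Returns true when reading the written bytes reproduces the input list. */
    static boolean roundTrips(List<String> serializedExprs) {
        try {
            return read(write(serializedExprs)).equals(serializedExprs);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Sample expression strings, made up for this sketch.
        List<String> exprs = Arrays.asList("date_trunc('day', dt)", "str2date(ds, '%Y%m%d')");
        System.out.println(roundTrips(exprs));
    }
}
```

A round-trip check like `roundTrips` is the kind of unit test that would catch partition expressions being dropped on either side of the serialization boundary.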

nshangyiming · 2024-05-27T08:50:04Z

fe/fe-core/src/main/java/com/starrocks/analysis/SlotRef.java

@@ -62,7 +63,8 @@

 public class SlotRef extends Expr {
    private TableName tblName;
-    private String col;
+    private String colName;


This is the logical name, right?

nshangyiming · 2024-05-27T08:52:55Z

fe/fe-core/src/main/java/com/starrocks/analysis/ColumnIdExpr.java

+        }
+    }
+
+    private static void setColumnIdToColumnName(Expr expr) {


Suggested change

-    private static void setColumnIdToColumnName(Expr expr) {
+    private static void setColumnIdByColumnName(Expr expr) {

better?

nshangyiming · 2024-05-27T08:57:41Z

fe/fe-core/src/main/java/com/starrocks/analysis/ColumnIdExpr.java

+        @Override
+        public String visitSlot(SlotRef node, Void context) {
+            if (node.getTblNameWithoutAnalyzed() != null) {
+                return node.getTblNameWithoutAnalyzed().toString() + "." + node.getColumnId().getId();


Why use the column id here? Shouldn't this use the column name?

nshangyiming · 2024-05-27T08:58:11Z

fe/fe-core/src/main/java/com/starrocks/sql/common/MetaUtils.java.rej

@@ -0,0 +1,120 @@
+diff a/fe/fe-core/src/main/java/com/starrocks/sql/common/MetaUtils.java b/fe/fe-core/src/main/java/com/starrocks/sql/common/MetaUtils.java	(rejected hunks)


remove this file?

nshangyiming · 2024-05-27T09:04:11Z

fe/fe-core/src/main/java/com/starrocks/analysis/ColumnIdExpr.java

+        return expr;
+    }
+
+    public String serialize() {


Why call it serialize/deserialize? How about toSQL/fromSQL?

nshangyiming · 2024-05-27T09:08:00Z

fe/fe-core/src/main/java/com/starrocks/catalog/Column.java

@@ -252,7 +254,7 @@ public ColumnDef toColumnDef() {
            }
        }
        ColumnDef col = new ColumnDef(name, new TypeDef(type), null, isKey, aggregationType, isAllowNull,
-                defaultValueDef, isAutoIncrement, generatedColumnExpr, comment);
+                defaultValueDef, isAutoIncrement, generatedColumnExpr.convertToColumnNameExpr(table.getIdToColumn()), comment);


Why use the column name here?

nshangyiming · 2024-05-27T09:26:08Z

fe/fe-core/src/main/java/com/starrocks/catalog/ExpressionRangePartitionInfo.java

-        sb.append(Joiner.on(", ").join(partitionExprs.stream().map(Expr::toSql).collect(toList())));
+        sb.append(Joiner.on(", ").join(partitionExprs
+                .stream()
+                .map(physicalExpr -> physicalExpr.convertToColumnNameExpr(table.getIdToColumn()).toSql())


Suggested change

-                .map(physicalExpr -> physicalExpr.convertToColumnNameExpr(table.getIdToColumn()).toSql())
+                .map(columnIdExpr -> columnIdExpr.convertToColumnNameExpr(table.getIdToColumn()).toSql())

sonarcloud · 2024-05-31T03:19:45Z

Quality Gate failed

Failed conditions
C Reliability Rating on New Code (required ≥ A)

See analysis details on SonarCloud

github-actions · 2024-05-31T09:33:30Z

[BE Incremental Coverage Report]

✅ pass : 0 / 0 (0%)

github-actions · 2024-05-31T09:34:20Z

[FE Incremental Coverage Report]

❌ fail : 183 / 301 (60.80%)

file detail

	path	covered_line	new_line	coverage	not_covered_line_detail
🔵	com/starrocks/sql/optimizer/dump/DesensitizedSQLBuilder.java	0	1	00.00%	[768]
🔵	com/starrocks/load/Load.java	0	2	00.00%	[252, 709]
🔵	com/starrocks/sql/analyzer/UpdateAnalyzer.java	0	1	00.00%	[150]
🔵	com/starrocks/sql/InsertPlanner.java	0	3	00.00%	[164, 221, 651]
🔵	com/starrocks/alter/SchemaChangeJobV2.java	0	4	00.00%	[547, 598, 599, 602]
🔵	com/starrocks/sql/optimizer/rule/transformation/materialization/MvPartitionCompensator.java	0	1	00.00%	[668]
🔵	com/starrocks/sql/common/MetaUtils.java	10	68	14.71%	[290, 291, 292, 293, 294, 295, 296, 297, 299, 300, 301, 303, 307, 308, 309, 310, 311, 313, 314, 315, 319, 320, 321, 322, 323, 325, 326, 327, 331, 332, 333, 334, 335, 339, 340, 341, 342, 343, 345, 346, 347, 354, 355, 361, 362, 366, 367, 368, 369, 371, 375, 376, 377, 378, 379, 381, 382, 383]
🔵	com/starrocks/catalog/Column.java	9	21	42.86%	[528, 552, 583, 584, 586, 588, 707, 708, 710, 712, 765, 809]
🔵	com/starrocks/catalog/Table.java	1	2	50.00%	[454]
🔵	com/starrocks/planner/OlapTableSink.java	3	6	50.00%	[490, 507, 508]
🔵	com/starrocks/load/DeleteMgr.java	1	2	50.00%	[382]
🔵	com/starrocks/sql/optimizer/operator/ColumnFilterConverter.java	3	6	50.00%	[189, 190, 191]
🔵	com/starrocks/alter/SchemaChangeHandler.java	1	2	50.00%	[461]
🔵	com/starrocks/sql/analyzer/AlterTableClauseVisitor.java	5	8	62.50%	[536, 689, 775]
🔵	com/starrocks/catalog/MaterializedView.java	9	14	64.29%	[682, 683, 1677, 1694, 1845]
🔵	com/starrocks/sql/ast/ColumnDef.java	3	4	75.00%	[523]
🔵	com/starrocks/catalog/ExpressionRangePartitionInfoV2.java	12	16	75.00%	[133, 134, 159, 160]
🔵	com/starrocks/analysis/ColumnIdExpr.java	44	53	83.02%	[22, 23, 66, 81, 104, 105, 108, 109, 112]
🔵	com/starrocks/catalog/ExpressionRangePartitionInfo.java	34	38	89.47%	[100, 101, 107, 233]
🔵	com/starrocks/analysis/SlotRef.java	21	22	95.45%	[274]
🔵	com/starrocks/sql/analyzer/CreateTableAnalyzer.java	2	2	100.00%	[]
🔵	com/starrocks/connector/iceberg/IcebergAlterTableExecutor.java	3	3	100.00%	[]
🔵	com/starrocks/catalog/OlapTable.java	2	2	100.00%	[]
🔵	com/starrocks/sql/analyzer/AstToStringBuilder.java	3	3	100.00%	[]
🔵	com/starrocks/sql/analyzer/QueryAnalyzer.java	2	2	100.00%	[]
🔵	com/starrocks/server/LocalMetastore.java	1	1	100.00%	[]
🔵	com/starrocks/sql/analyzer/AnalyzerUtils.java	2	2	100.00%	[]
🔵	com/starrocks/sql/ast/ExpressionPartitionDesc.java	4	4	100.00%	[]
🔵	com/starrocks/sql/optimizer/rewrite/OptOlapPartitionPruner.java	1	1	100.00%	[]
🔵	com/starrocks/sql/analyzer/ExpressionAnalyzer.java	1	1	100.00%	[]
🔵	com/starrocks/persist/ExpressionSerializedObject.java	6	6	100.00%	[]

gengjun-git requested review from a team as code owners May 17, 2024 14:09

starrocks-cr bot reviewed May 17, 2024

wanpengfei-git added the META-REVIEW label May 17, 2024

starrocks-cr bot reviewed May 17, 2024

wanpengfei-git requested a review from a team May 17, 2024 14:09

mergify bot assigned gengjun-git May 17, 2024

nshangyiming self-assigned this May 21, 2024

nshangyiming reviewed May 27, 2024

gengjun-git force-pushed the rename_col_sql_expr branch from b70efd3 to ddd16e3 Compare May 30, 2024 11:53

gengjun-git added 3 commits May 31, 2024 11:11

add

c3a51b5

Signed-off-by: gengjun-git <gengjun@starrocks.com>

fix

0f8b129

Signed-off-by: gengjun-git <gengjun@starrocks.com>

fix ut

f648759

Signed-off-by: gengjun-git <gengjun@starrocks.com>

gengjun-git force-pushed the rename_col_sql_expr branch from 4f4c67c to f648759 Compare May 31, 2024 03:11

fix ut

22875e6

Signed-off-by: gengjun-git <gengjun@starrocks.com>
