Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Refactor] Introduce ColumnId to support Column renaming (part4) #45842

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

gengjun-git
Copy link
Contributor

Why I'm doing:

We need a unique ID to identify the Column. This ID is used in all places that reference the Column. In this way, to change the attributes of the Column, such as name, we only need to change the attributes in the Column object. There is a uniqueId in the current Column, but it is only available in newly created tables. For compatibility reasons, we need to introduce another Id: ColumnID. The ColumnID of the historical table is the name of the column, because the name was previously immutable.

What I'm doing:

There are currently three ways to reference Column: 1: direct copy of Column object, 2: reference to Column name, 3: sql expression reference.
This PR changes the sql expression reference to use ColumnId reference.

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
  • This is a backport pr

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 3.3
    • 3.2
    • 3.1
    • 3.0
    • 2.5

@gengjun-git gengjun-git requested review from a team as code owners May 17, 2024 14:09
@@ -800,7 +816,7 @@ public void gsonPostProcess() throws IOException {
@Override
public void gsonPreProcess() throws IOException {
if (generatedColumnExpr != null) {
generatedColumnExprSerialized = new GsonUtils.ExpressionSerializedObject(generatedColumnExpr.toSql());
generatedColumnExprSerialized = ExpressionSerializedObject.create(generatedColumnExpr);
}
}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The most risky bug in this code is:
Inconsistent handling of null return value leading to potential NullPointerException

You can modify the code like this:

public List<SlotRef> getGeneratedColumnRef(Map<ColumnId, Column> idToColumn) {
    List<SlotRef> slots = new ArrayList<>();
    if (generatedColumnExpr == null) {
        return slots; // Return an empty list instead of null to avoid potential NullPointerException.
    } else {
        generatedColumnExpr.convertToColumnNameExpr(idToColumn).collect(SlotRef.class, slots);
        return slots;
    }
}

This change ensures that getGeneratedColumnRef method returns an empty list instead of null, reducing the risk of a NullPointerException in contexts where the returned list is immediately used or iterated over.

@@ -1746,7 +1747,7 @@ public Status doAfterRestore(MvRestoreContext mvRestoreContext) throws DdlExcept
// change ExpressionRangePartitionInfo because mv's db may be changed.
if (partitionInfo instanceof ExpressionRangePartitionInfo) {
ExpressionRangePartitionInfo expressionRangePartitionInfo = (ExpressionRangePartitionInfo) partitionInfo;
Preconditions.checkState(expressionRangePartitionInfo.getPartitionExprs().size() == 1);
Preconditions.checkState(expressionRangePartitionInfo.getPartitionExprsSize() == 1);
expressionRangePartitionInfo.renameTableName(db.getFullName(), this.name);
}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The most risky bug in this code is:
Incorrect handling of ExpressionRangePartitionInfo.getPartitionExprs() method calls without considering the change to require idToColumn as a parameter.

You can modify the code like this:

-            partitionRefTableExprs.get(0).setType(expressionRangePartitionInfo.getPartitionExprs().get(0).getType());
+            partitionRefTableExprs.get(0).setType(expressionRangePartitionInfo.getPartitionExprs(idToColumn).get(0).getType());

-            Expr partitionExpr = expressionRangePartitionInfo.getPartitionExprs().get(0);
+            Expr partitionExpr = expressionRangePartitionInfo.getPartitionExprs(idToColumn).get(0);

-        Preconditions.checkState(expressionRangePartitionInfo.getPartitionExprs().size() == 1);
+        Preconditions.checkState(expressionRangePartitionInfo.getPartitionExprsSize() == 1);

This adjustment ensures the updated method signatures that now require an idToColumn map for fetching partition expressions are used correctly, adhering to the changes introduced. Additionally, it corrects the call from .getPartitionExprs().size() to .getPartitionExprsSize() reflecting the intended code behavior while avoiding potential NullPointerExceptions or incorrect behavior due to missing arguments.

for (Expr expr : partitionExprs) {
expr.accept(renameVisitor, null);
for (ColumnIdExpr expr : partitionExprs) {
expr.getExpr().accept(renameVisitor, null);
}
}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The most risky bug in this code is:
Incorrect handling and potential loss of partition expressions during serialization/deserialization

You can modify the code like this:

public static PartitionInfo read(DataInput in) throws IOException {
    // Instead of returning null, properly implement the deserialization logic if applicable.
    // The provided code snippet does not include the complete implementation for read and write methods,
    // which are critical for correct serialization and deserialization of the object.

    // Assuming ExpressionSerializedObject and ColumnIdExpr have proper serialize and deserialize methods implemented.
    ExpressionRangePartitionInfo info = new ExpressionRangePartitionInfo();
    info.readFields(in); // Assuming this reads the basic fields required except partition expressions

    // Properly handle reading the serialized form of partition expressions.
    String json = Text.readString(in);
    List<ExpressionSerializedObject> expressionSerializedObjects = GsonUtils.GSON.fromJson(json,
            new TypeToken<List<ExpressionSerializedObject>>() {}.getType());

    List<ColumnIdExpr> partitionExprs = new ArrayList<>();
    for (ExpressionSerializedObject expressionSerializedObject : expressionSerializedObjects) {
        if (expressionSerializedObject != null) {
            ColumnIdExpr columnIdExpr = expressionSerializedObject.deserialize(); // Assuming an appropriate deserialize method exists
            partitionExprs.add(columnIdExpr);
        }
    }

    info.setPartitionExprs(partitionExprs); // Ensure this method properly updates the object's state
    return info;
}

@Override
public void write(DataOutput out) throws IOException {
    super.write(out);

    // Serialize the partition expressions into JSON string
    List<ExpressionSerializedObject> serializedPartitionExprs = new ArrayList<>();
    for (ColumnIdExpr columnIdExpr : partitionExprs) {
        // Assuming an appropriate method to serialize a ColumnIdExpr to ExpressionSerializedObject
        serializedPartitionExprs.add(ExpressionSerializedObject.create(columnIdExpr));
    }
    
    // Convert the list of serialized objects to JSON string and write it
    String json = GsonUtils.GSON.toJson(serializedPartitionExprs);
    Text.writeString(out, json);
}

This rectifies the potential issue by ensuring that the read and write methods for serialization/deserialization are appropriately implemented, thus avoiding data loss or inconsistencies.

@wanpengfei-git wanpengfei-git requested a review from a team May 17, 2024 14:09
@nshangyiming nshangyiming self-assigned this May 21, 2024
@@ -62,7 +63,8 @@

public class SlotRef extends Expr {
private TableName tblName;
private String col;
private String colName;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the logical name , right?

}
}

private static void setColumnIdToColumnName(Expr expr) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private static void setColumnIdToColumnName(Expr expr) {
private static void setColumnIdByColumnName(Expr expr) {

better?

@Override
public String visitSlot(SlotRef node, Void context) {
if (node.getTblNameWithoutAnalyzed() != null) {
return node.getTblNameWithoutAnalyzed().toString() + "." + node.getColumnId().getId();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why use column id here, should use column name?

@@ -0,0 +1,120 @@
diff a/fe/fe-core/src/main/java/com/starrocks/sql/common/MetaUtils.java b/fe/fe-core/src/main/java/com/starrocks/sql/common/MetaUtils.java (rejected hunks)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this file?

return expr;
}

public String serialize() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why call it serialize/deserialize? how about toSQL/fromSQL?

@@ -252,7 +254,7 @@ public ColumnDef toColumnDef() {
}
}
ColumnDef col = new ColumnDef(name, new TypeDef(type), null, isKey, aggregationType, isAllowNull,
defaultValueDef, isAutoIncrement, generatedColumnExpr, comment);
defaultValueDef, isAutoIncrement, generatedColumnExpr.convertToColumnNameExpr(table.getIdToColumn()), comment);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why using column name here?

sb.append(Joiner.on(", ").join(partitionExprs.stream().map(Expr::toSql).collect(toList())));
sb.append(Joiner.on(", ").join(partitionExprs
.stream()
.map(physicalExpr -> physicalExpr.convertToColumnNameExpr(table.getIdToColumn()).toSql())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.map(physicalExpr -> physicalExpr.convertToColumnNameExpr(table.getIdToColumn()).toSql())
.map(columnIdExpr -> columnIdExpr.convertToColumnNameExpr(table.getIdToColumn()).toSql())

Signed-off-by: gengjun-git <gengjun@starrocks.com>
Signed-off-by: gengjun-git <gengjun@starrocks.com>
Signed-off-by: gengjun-git <gengjun@starrocks.com>
Copy link

sonarcloud bot commented May 31, 2024

Quality Gate Failed Quality Gate failed

Failed conditions
C Reliability Rating on New Code (required ≥ A)

See analysis details on SonarCloud

Catch issues before they fail your Quality Gate with our IDE extension SonarLint

Signed-off-by: gengjun-git <gengjun@starrocks.com>
Copy link

[BE Incremental Coverage Report]

pass : 0 / 0 (0%)

Copy link

[FE Incremental Coverage Report]

fail : 183 / 301 (60.80%)

file detail

path covered_line new_line coverage not_covered_line_detail
🔵 com/starrocks/sql/optimizer/dump/DesensitizedSQLBuilder.java 0 1 00.00% [768]
🔵 com/starrocks/load/Load.java 0 2 00.00% [252, 709]
🔵 com/starrocks/sql/analyzer/UpdateAnalyzer.java 0 1 00.00% [150]
🔵 com/starrocks/sql/InsertPlanner.java 0 3 00.00% [164, 221, 651]
🔵 com/starrocks/alter/SchemaChangeJobV2.java 0 4 00.00% [547, 598, 599, 602]
🔵 com/starrocks/sql/optimizer/rule/transformation/materialization/MvPartitionCompensator.java 0 1 00.00% [668]
🔵 com/starrocks/sql/common/MetaUtils.java 10 68 14.71% [290, 291, 292, 293, 294, 295, 296, 297, 299, 300, 301, 303, 307, 308, 309, 310, 311, 313, 314, 315, 319, 320, 321, 322, 323, 325, 326, 327, 331, 332, 333, 334, 335, 339, 340, 341, 342, 343, 345, 346, 347, 354, 355, 361, 362, 366, 367, 368, 369, 371, 375, 376, 377, 378, 379, 381, 382, 383]
🔵 com/starrocks/catalog/Column.java 9 21 42.86% [528, 552, 583, 584, 586, 588, 707, 708, 710, 712, 765, 809]
🔵 com/starrocks/catalog/Table.java 1 2 50.00% [454]
🔵 com/starrocks/planner/OlapTableSink.java 3 6 50.00% [490, 507, 508]
🔵 com/starrocks/load/DeleteMgr.java 1 2 50.00% [382]
🔵 com/starrocks/sql/optimizer/operator/ColumnFilterConverter.java 3 6 50.00% [189, 190, 191]
🔵 com/starrocks/alter/SchemaChangeHandler.java 1 2 50.00% [461]
🔵 com/starrocks/sql/analyzer/AlterTableClauseVisitor.java 5 8 62.50% [536, 689, 775]
🔵 com/starrocks/catalog/MaterializedView.java 9 14 64.29% [682, 683, 1677, 1694, 1845]
🔵 com/starrocks/sql/ast/ColumnDef.java 3 4 75.00% [523]
🔵 com/starrocks/catalog/ExpressionRangePartitionInfoV2.java 12 16 75.00% [133, 134, 159, 160]
🔵 com/starrocks/analysis/ColumnIdExpr.java 44 53 83.02% [22, 23, 66, 81, 104, 105, 108, 109, 112]
🔵 com/starrocks/catalog/ExpressionRangePartitionInfo.java 34 38 89.47% [100, 101, 107, 233]
🔵 com/starrocks/analysis/SlotRef.java 21 22 95.45% [274]
🔵 com/starrocks/sql/analyzer/CreateTableAnalyzer.java 2 2 100.00% []
🔵 com/starrocks/connector/iceberg/IcebergAlterTableExecutor.java 3 3 100.00% []
🔵 com/starrocks/catalog/OlapTable.java 2 2 100.00% []
🔵 com/starrocks/sql/analyzer/AstToStringBuilder.java 3 3 100.00% []
🔵 com/starrocks/sql/analyzer/QueryAnalyzer.java 2 2 100.00% []
🔵 com/starrocks/server/LocalMetastore.java 1 1 100.00% []
🔵 com/starrocks/sql/analyzer/AnalyzerUtils.java 2 2 100.00% []
🔵 com/starrocks/sql/ast/ExpressionPartitionDesc.java 4 4 100.00% []
🔵 com/starrocks/sql/optimizer/rewrite/OptOlapPartitionPruner.java 1 1 100.00% []
🔵 com/starrocks/sql/analyzer/ExpressionAnalyzer.java 1 1 100.00% []
🔵 com/starrocks/persist/ExpressionSerializedObject.java 6 6 100.00% []

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants