Regression in reading Iceberg v2 table since v3.2.2 #45848

Open
megri opened this issue May 17, 2024 · 1 comment
Labels
type/bug Something isn't working

Comments

megri commented May 17, 2024

Querying a table in an external Iceberg/unified catalog in v3.2.6 fails with a nondescript NullPointerException (see the stack trace below).

I tried tracking down the issue by diffing v3.2.6 against v3.2.2 and I think this change is what causes it:
9ab128d#diff-62955f4651c70b99d1f502bc07103c278a9d80ef7bfb9eba6fa90cafb8d943fbL155-R158

Note that running SHOW CREATE TABLE hive_catalog.db.table in StarRocks gives:

CREATE TABLE `table` (
  `id` int(11) DEFAULT NULL,
  `countrycode` varchar(1048576) DEFAULT NULL,
  `department` varchar(1048576) DEFAULT NULL,
  `name` varchar(1048576) DEFAULT NULL,
  `postbox` varchar(1048576) DEFAULT NULL,
  `postcode` varchar(1048576) DEFAULT NULL,
  `searchstring` varchar(1048576) DEFAULT NULL,
  `street` varchar(1048576) DEFAULT NULL,
  `town` varchar(1048576) DEFAULT NULL,
  `latitude` double DEFAULT NULL,
  `longitude` double DEFAULT NULL,
  `street2` varchar(1048576) DEFAULT NULL,
  `province` varchar(1048576) DEFAULT NULL,
  `_cdc` struct<op varchar(1048576), ts datetime, offset bigint(20), source varchar(1048576), target varchar(1048576), key struct<id int(11)>> DEFAULT NULL
)
PARTITION BY (  )
PROPERTIES ("location" = "s3a://hive_catalog/db/table");

Running SHOW CREATE TABLE on the same table in Trino gives:

CREATE TABLE hive_catalog.db.table (
   id integer NOT NULL,
   countrycode varchar NOT NULL,
   department varchar,
   name varchar NOT NULL,
   postbox varchar,
   postcode varchar,
   searchstring varchar NOT NULL,
   street varchar,
   town varchar,
   latitude double,
   longitude double,
   street2 varchar,
   province varchar,
   _cdc ROW(op varchar, ts timestamp(6) with time zone, offset bigint, source varchar, target varchar, key ROW(id integer)) NOT NULL
)
WITH (
   format = 'PARQUET',
   format_version = 2,
   location = 's3a://hive_catalog/db/table',
   partitioning = ARRAY['day("_cdc.ts")']
)

Note in particular the empty PARTITION BY (  ) in the StarRocks output, whereas Trino reports partitioning = ARRAY['day("_cdc.ts")']. I think this is the issue.

Real behavior (Required)

An error:

2024-05-17 21:39:52.730Z WARN (starrocks-mysql-nio-pool-9|10602) [StmtExecutor.execute():708] execute Exception, sql /* ApplicationName=DBeaver 24.0.4 - SQLEditor  */ select id from hive_catalog.db.table
LIMIT 0, 200
java.lang.NullPointerException: null
	at com.starrocks.sql.optimizer.operator.logical.LogicalIcebergScanOperator.lambda$new$0(LogicalIcebergScanOperator.java:49) ~[starrocks-fe.jar:?]
	at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195) ~[?:?]
	at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655) ~[?:?]
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484) ~[?:?]
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474) ~[?:?]
	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913) ~[?:?]
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:?]
	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578) ~[?:?]
	at com.starrocks.sql.optimizer.operator.logical.LogicalIcebergScanOperator.<init>(LogicalIcebergScanOperator.java:49) ~[starrocks-fe.jar:?]
	at com.starrocks.sql.optimizer.transformer.RelationTransformer.visitTable(RelationTransformer.java:562) ~[starrocks-fe.jar:?]
	at com.starrocks.sql.optimizer.transformer.RelationTransformer.visitTable(RelationTransformer.java:144) ~[starrocks-fe.jar:?]
	at com.starrocks.sql.ast.TableRelation.accept(TableRelation.java:182) ~[starrocks-fe.jar:?]
	at com.starrocks.sql.ast.AstVisitor.visit(AstVisitor.java:68) ~[starrocks-fe.jar:?]
	at com.starrocks.sql.ast.AstVisitor.visit(AstVisitor.java:64) ~[starrocks-fe.jar:?]
	at com.starrocks.sql.optimizer.transformer.QueryTransformer.planFrom(QueryTransformer.java:172) ~[starrocks-fe.jar:?]
	at com.starrocks.sql.optimizer.transformer.QueryTransformer.plan(QueryTransformer.java:87) ~[starrocks-fe.jar:?]
	at com.starrocks.sql.optimizer.transformer.RelationTransformer.visitSelect(RelationTransformer.java:263) ~[starrocks-fe.jar:?]
	at com.starrocks.sql.optimizer.transformer.RelationTransformer.visitSelect(RelationTransformer.java:144) ~[starrocks-fe.jar:?]
	at com.starrocks.sql.ast.SelectRelation.accept(SelectRelation.java:242) ~[starrocks-fe.jar:?]
	at com.starrocks.sql.ast.AstVisitor.visit(AstVisitor.java:68) ~[starrocks-fe.jar:?]
	at com.starrocks.sql.ast.AstVisitor.visit(AstVisitor.java:64) ~[starrocks-fe.jar:?]
	at com.starrocks.sql.optimizer.transformer.RelationTransformer.transform(RelationTransformer.java:213) ~[starrocks-fe.jar:?]
	at com.starrocks.sql.optimizer.transformer.RelationTransformer.transformWithSelectLimit(RelationTransformer.java:181) ~[starrocks-fe.jar:?]
	at com.starrocks.sql.StatementPlanner.createQueryPlan(StatementPlanner.java:194) ~[starrocks-fe.jar:?]
	at com.starrocks.sql.StatementPlanner.plan(StatementPlanner.java:134) ~[starrocks-fe.jar:?]
	at com.starrocks.sql.StatementPlanner.plan(StatementPlanner.java:91) ~[starrocks-fe.jar:?]
	at com.starrocks.qe.StmtExecutor.execute(StmtExecutor.java:520) ~[starrocks-fe.jar:?]
	at com.starrocks.qe.ConnectProcessor.handleQuery(ConnectProcessor.java:413) ~[starrocks-fe.jar:?]
	at com.starrocks.qe.ConnectProcessor.dispatch(ConnectProcessor.java:607) ~[starrocks-fe.jar:?]
	at com.starrocks.qe.ConnectProcessor.processOnce(ConnectProcessor.java:901) ~[starrocks-fe.jar:?]
	at com.starrocks.mysql.nio.ReadListener.lambda$handleEvent$0(ReadListener.java:69) ~[starrocks-fe.jar:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
	at java.lang.Thread.run(Thread.java:829) ~[?:?]

StarRocks version (Required)

3.2.6

megri added the type/bug label on May 17, 2024

megri commented May 20, 2024

Further investigation suggests this may be caused by our Iceberg tables being partitioned on a field of a nested (ROW) column.
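
To illustrate the suspected failure mode, here is a minimal, self-contained sketch. It is not StarRocks code: the class NestedPartitionLookup, the toy Column type, and every method name are hypothetical, and the real logic at LogicalIcebergScanOperator.java:49 may differ. The sketch only assumes that partition source columns are looked up by name among top-level columns inside a stream lambda, so a dotted path such as _cdc.ts yields null and the next dereference throws a NullPointerException.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical model of the suspected bug; none of these names come from StarRocks. */
public class NestedPartitionLookup {

    /** Toy column: a name plus child fields (empty for non-struct columns). */
    static final class Column {
        final String name;
        final Map<String, Column> fields = new HashMap<>();

        Column(String name, Column... children) {
            this.name = name;
            for (Column child : children) {
                fields.put(child.name, child);
            }
        }
    }

    /** Top-level-only lookup: returns null for a dotted path such as "_cdc.ts". */
    static Column topLevel(Map<String, Column> schema, String name) {
        return schema.get(name);
    }

    /** Walks a dotted path through nested fields; returns null if any segment is missing. */
    static Column resolvePath(Map<String, Column> schema, String path) {
        Map<String, Column> scope = schema;
        Column current = null;
        for (String segment : path.split("\\.")) {
            current = scope.get(segment);
            if (current == null) {
                return null;
            }
            scope = current.fields;
        }
        return current;
    }

    /** Fragile variant: dereferences the lookup result inside a stream lambda. */
    static List<String> fragileNames(Map<String, Column> schema, List<String> partitionPaths) {
        return partitionPaths.stream()
                .map(p -> topLevel(schema, p).name) // NPE when the lookup returns null
                .collect(Collectors.toList());
    }

    /** Path-aware variant: resolves nested fields and fails with a descriptive message. */
    static List<String> resolvedNames(Map<String, Column> schema, List<String> partitionPaths) {
        return partitionPaths.stream()
                .map(p -> {
                    Column c = resolvePath(schema, p);
                    if (c == null) {
                        throw new IllegalArgumentException("Unknown partition source column: " + p);
                    }
                    return c.name;
                })
                .collect(Collectors.toList());
    }

    /** Two-column schema loosely modelled on the table above: id and the struct _cdc. */
    static Map<String, Column> sampleSchema() {
        Map<String, Column> schema = new HashMap<>();
        schema.put("id", new Column("id"));
        schema.put("_cdc", new Column("_cdc", new Column("op"), new Column("ts")));
        return schema;
    }

    public static void main(String[] args) {
        Map<String, Column> schema = sampleSchema();
        List<String> partitionPaths = Arrays.asList("_cdc.ts");
        try {
            fragileNames(schema, partitionPaths);
        } catch (NullPointerException e) {
            System.out.println("top-level lookup failed with NullPointerException");
        }
        System.out.println("resolved: " + resolvedNames(schema, partitionPaths));
    }
}
```

In this toy model, the fragile variant fails inside the stream pipeline, which would be consistent with the ReferencePipeline frames in the stack trace above, while the path-aware variant either resolves the nested field or reports which partition source column it could not find.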
