
[client] Move limit scan code to client's table api from flink. #2794

Merged
wuchong merged 3 commits into apache:main from loserwang1024:fluss-table-scanner on Mar 10, 2026

@loserwang1024

Contributor

Purpose

Linked issue: close #2793

Brief change log

Tests

API and Format

Documentation

Copilot AI left a comment

Contributor

Pull request overview

This PR adds a table-level batch scanning API to the Fluss client (to avoid Flink-side bucket iteration) and updates Flink limit pushdown to use the new client capability.

Changes:

  • Introduce Scan#createBatchScanner() and implement it in TableScan by building bucket scanners and combining them via CompositeBatchScanner.
  • Update Flink PushdownUtils.limitScan to use the new table-level batch scanner API.
  • Add unit/integration tests for CompositeBatchScanner and table-level limit scans.
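
The composition described in the first bullet can be sketched as follows. This is only an illustration under stated assumptions, not the code merged in this PR: `MiniBatchScanner`, `ListScanner`, and `MiniCompositeScanner` are hypothetical, simplified stand-ins (rows are plain integers, and `pollBatch()` takes no timeout). The sketch rotates through per-bucket scanners, closes each one once it is exhausted, and truncates the final batch so the total never exceeds the limit, which is the behavior the review comments ask for.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

/** Hypothetical, simplified stand-in for a per-bucket batch scanner. */
interface MiniBatchScanner extends AutoCloseable {
    /** Returns the next batch, or null once this scanner is exhausted. */
    List<Integer> pollBatch();

    @Override
    void close();
}

/** Test double that serves a fixed list of batches, one per poll. */
final class ListScanner implements MiniBatchScanner {
    private final Iterator<List<Integer>> batches;
    boolean closed = false;

    ListScanner(List<List<Integer>> batches) {
        this.batches = batches.iterator();
    }

    @Override
    public List<Integer> pollBatch() {
        return batches.hasNext() ? batches.next() : null;
    }

    @Override
    public void close() {
        closed = true;
    }
}

/** Combines several scanners behind one and caps the total row count. */
final class MiniCompositeScanner implements MiniBatchScanner {
    private final Deque<MiniBatchScanner> queue;
    private final int limit; // a negative value means "no limit"
    private int emitted = 0;

    MiniCompositeScanner(List<? extends MiniBatchScanner> scanners, int limit) {
        this.queue = new ArrayDeque<>(scanners);
        this.limit = limit;
    }

    @Override
    public List<Integer> pollBatch() {
        while (!queue.isEmpty()) {
            if (limit >= 0 && emitted >= limit) {
                close(); // limit reached: release the remaining scanners
                return null;
            }
            MiniBatchScanner head = queue.poll();
            List<Integer> batch = head.pollBatch();
            if (batch == null) {
                head.close(); // exhausted: drop it and try the next scanner
                continue;
            }
            queue.add(head); // rotate so every bucket gets a turn
            int take = limit < 0 ? batch.size() : Math.min(batch.size(), limit - emitted);
            emitted += take;
            return new ArrayList<>(batch.subList(0, take));
        }
        return null; // every underlying scanner is exhausted
    }

    @Override
    public void close() {
        queue.forEach(MiniBatchScanner::close);
        queue.clear();
    }
}
```

With truncation in place, a caller that drains the scanner gets exactly `min(limit, totalRows)` rows, so a test can assert equality instead of a range.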

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 9 comments.

Show a summary per file
| File | Description |
| --- | --- |
| fluss-flink/fluss-flink-common/src/main/java/org/apache/fluss/flink/utils/PushdownUtils.java | Switch Flink limit scan implementation to use client table-level createBatchScanner() API. |
| fluss-client/src/main/java/org/apache/fluss/client/table/scanner/Scan.java | Add new table-level createBatchScanner() API to the public scan interface. |
| fluss-client/src/main/java/org/apache/fluss/client/table/scanner/TableScan.java | Implement table-level batch scanner creation by enumerating buckets/partitions and composing scanners. |
| fluss-client/src/main/java/org/apache/fluss/client/table/scanner/batch/CompositeBatchScanner.java | New composite batch scanner to unify multiple per-bucket scanners behind one BatchScanner. |
| fluss-client/src/test/java/org/apache/fluss/client/table/scanner/batch/CompositeBatchScannerTest.java | New unit tests for composite scanner behavior (no-limit, limit, close). |
| fluss-client/src/test/java/org/apache/fluss/client/table/scanner/batch/BatchScannerITCase.java | Rename IT case and add integration test for table-level scan with limit. |

Comments suppressed due to low confidence (1)

fluss-client/src/test/java/org/apache/fluss/client/table/scanner/batch/BatchScannerITCase.java:343

  • This test claims to verify “respects limit” but asserts actual.size() >= limit. From an API perspective, Scan.limit(N) should return at most N rows; allowing more will surprise callers (especially since BatchScanUtils.collectRows returns all rows from the scanner). Consider updating the implementation to cap results at limit and tighten this assertion to <= limit (and keep the existing <= 9 bound).

Comment thread fluss-client/src/main/java/org/apache/fluss/client/table/scanner/TableScan.java Outdated
Comment thread fluss-client/src/main/java/org/apache/fluss/client/table/scanner/Scan.java Outdated
Comment thread fluss-client/src/main/java/org/apache/fluss/client/table/scanner/TableScan.java Outdated
@loserwang1024 loserwang1024 requested a review from wuchong March 5, 2026 11:49

@wuchong wuchong left a comment

Member

Thanks @loserwang1024, I left some minor comments.

Comment on lines +87 to +90
// Ensure all scanners are closed on failure to avoid resource leaks
IOUtils.closeQuietly(scanner);
scannerQueue.forEach(IOUtils::closeQuietly);
scannerQueue.clear();

Member

Should this logic be moved to the close() method? If a fatal exception occurs, the scanner owner is responsible for manually invoking close(). Placing it here might be problematic if the exception is transient and eligible for retry.
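
A minimal sketch of the alternative suggested here, assuming hypothetical `FlakySource` and `OwnerClosedScanner` classes that are not part of Fluss: the poll path lets the exception propagate without closing anything, so a transient failure can be retried, and all cleanup lives in an idempotent `close()` that the owner calls.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

/** Hypothetical resource whose first read fails with a transient error. */
final class FlakySource implements AutoCloseable {
    private boolean failedOnce = false;
    boolean closed = false;

    int read() {
        if (closed) {
            throw new IllegalStateException("source already closed");
        }
        if (!failedOnce) {
            failedOnce = true;
            throw new RuntimeException("transient failure");
        }
        return 42;
    }

    @Override
    public void close() {
        closed = true;
    }
}

/** Scanner that leaves all resource cleanup to close(). */
final class OwnerClosedScanner implements AutoCloseable {
    private final Deque<FlakySource> sources;

    OwnerClosedScanner(List<FlakySource> sources) {
        this.sources = new ArrayDeque<>(sources);
    }

    /** Reads from the head source; on failure nothing is closed, so the caller may retry. */
    int poll() {
        FlakySource head = sources.peek();
        if (head == null) {
            throw new IllegalStateException("no sources left");
        }
        return head.read();
    }

    /** Closes every remaining source; calling it again is a no-op. */
    @Override
    public void close() {
        FlakySource source;
        while ((source = sources.poll()) != null) {
            source.close();
        }
    }
}
```

The trade-off is that the owner must always call `close()`, for example through try-with-resources, because a failed poll no longer releases anything on its own.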

Comment on lines +341 to +342
assertThat(actual.size()).isGreaterThanOrEqualTo(limit);
assertThat(actual.size()).isLessThanOrEqualTo(9);

Member

Should this just assert that the size is equal to the limit? The current assertion makes it look like the returned result is non-deterministic.

while (batch.hasNext()) {
    values.add(batch.next().getInt(0));
}
assertThat(values.size()).isGreaterThanOrEqualTo(3);

Member

Should this be equal to 3?

@wuchong wuchong merged commit 199af0b into apache:main Mar 10, 2026
6 checks passed
hemanthsavasere pushed a commit to hemanthsavasere/fluss that referenced this pull request Mar 14, 2026
wxplovecc pushed a commit to tongcheng-elong/fluss that referenced this pull request Apr 17, 2026
wxplovecc pushed a commit to tongcheng-elong/fluss that referenced this pull request Apr 20, 2026
Ugbot pushed a commit to Ugbot/fluss that referenced this pull request Apr 26, 2026

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add a batch scanner that can be used directly for the whole table

3 participants