
orca: implement intra-segment parallel table scan support #1398

Merged
yjhjstz merged 3 commits into apache:main from yjhjstz:yjhjstz/orca_parallel
May 13, 2026

@yjhjstz yjhjstz commented Oct 16, 2025

Member

Add comprehensive parallel table scan capability to the GPORCA optimizer, enabling worker-level parallelism within segments to improve query performance on large table scans.

Key components:

  • New CPhysicalParallelTableScan operator and CDistributionSpecWorkerRandom distribution specification for worker-level data distribution
  • CXformGet2ParallelTableScan transformation with parallel safety checks (excludes CTEs, dynamic scans, foreign tables, replicated tables, etc.)
  • Cost model integration with parallel_setup_cost and efficiency degradation scaling (logarithmic based on worker count)
  • DXL serialization/deserialization for CDXLPhysicalParallelTableScan
  • Plan translation to PostgreSQL SeqScan nodes with parallel_aware=true
  • Rewindability constraints (parallel scans are non-rewindable)
  • GUC integration: max_parallel_workers_per_gather controls worker count
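To make the interaction between the safety checks and the cost model concrete, here is a minimal sketch. It is not the code in CXformGet2ParallelTableScan.cpp or CCostModelGPDB.cpp: the struct, the function names, the degradation constant, and the exact formula are all assumptions chosen for illustration. It only shows the general shape described above, where a fixed setup cost is added and the per-worker speedup shrinks logarithmically as workers are added.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical description of a scan candidate. Field names are illustrative
// and do not mirror GPORCA's classes.
struct ScanCandidate {
    bool is_cte = false;
    bool is_dynamic_scan = false;
    bool is_foreign_table = false;
    bool is_replicated = false;
};

// Mirrors the exclusions named in the PR description. That list ends with
// "etc.", so this check is not exhaustive.
bool IsParallelSafe(const ScanCandidate &c) {
    return !(c.is_cte || c.is_dynamic_scan || c.is_foreign_table || c.is_replicated);
}

// Assumed efficiency curve: 1 / (1 + k * log2(workers)).
// The constant k = 0.1 is made up for this sketch.
double ParallelEfficiency(int workers) {
    if (workers <= 1) return 1.0;
    const double kDegradation = 0.1;
    return 1.0 / (1.0 + kDegradation * std::log2(static_cast<double>(workers)));
}

// Assumed cost shape: setup cost plus the serial cost divided by the
// effective (efficiency-scaled) number of workers.
double ParallelScanCost(double serial_scan_cost, int workers, double parallel_setup_cost) {
    assert(workers >= 0);
    if (workers <= 1) return serial_scan_cost;  // no parallelism, no setup cost
    const double effective_workers = workers * ParallelEfficiency(workers);
    return parallel_setup_cost + serial_scan_cost / effective_workers;
}
```

Under these assumed constants, a scan with serial cost 1000 and 4 workers costs 400 with a setup cost of 100, while a scan with serial cost 50 costs 115, so the serial plan would still win for small tables.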

Implements #1316.

TPC-H improved by 15%.

What does this PR do?

Type of Change

  • Bug fix (non-breaking change)
  • New feature (non-breaking change)
  • Breaking change (fix or feature with breaking changes)
  • Documentation update

Breaking Changes

Test Plan

  • Unit tests added/updated
  • Integration tests added/updated
  • Passed make installcheck
  • Passed make -C src/test installcheck-cbdb-parallel

Impact

Performance:

User-facing changes:

Dependencies:

Checklist

Additional Context

CI Skip Instructions


@yjhjstz yjhjstz marked this pull request as ready for review October 17, 2025 16:41
@my-ship-it my-ship-it force-pushed the yjhjstz/orca_parallel branch from cde0dec to 05d9edf Compare October 20, 2025 06:54
@my-ship-it my-ship-it self-requested a review October 20, 2025 07:50
@my-ship-it

Contributor

Please add more test cases

Comment thread src/backend/gpopt/gpdbwrappers.cpp
Comment thread src/backend/gpopt/gpdbwrappers.cpp
Comment thread src/backend/gpopt/translate/CTranslatorDXLToPlStmt.cpp Outdated
Comment thread src/backend/gpopt/translate/CTranslatorDXLToPlStmt.cpp Outdated
Comment thread src/backend/gpopt/translate/CTranslatorDXLToPlStmt.cpp
Comment thread src/backend/gpopt/translate/CTranslatorRelcacheToDXL.cpp
Comment thread src/backend/gporca/libgpdbcost/src/CCostModelGPDB.cpp
Comment thread src/backend/gporca/libgpopt/include/gpopt/base/CRewindabilitySpec.h Outdated
Comment thread src/backend/gporca/libgpopt/src/search/CGroup.cpp
@yjhjstz

yjhjstz commented Oct 21, 2025

Member Author

Please add more test cases

see src/test/regress:installcheck-orca-parallel

@yjhjstz yjhjstz force-pushed the yjhjstz/orca_parallel branch from 05d9edf to 8a5bc1e Compare October 21, 2025 15:24
Comment thread src/test/regress/GNUmakefile
Comment thread src/test/regress/excluded_tests.conf

@my-ship-it my-ship-it left a comment

Contributor

LGTM

@avamingli

Contributor

Add some cases to test the plan?

@yjhjstz

yjhjstz commented Oct 24, 2025

Member Author

Add some cases to test the plan?

Maybe after parallel hash join is implemented.

@yjhjstz yjhjstz closed this Jan 16, 2026
@yjhjstz yjhjstz reopened this May 8, 2026
@yjhjstz yjhjstz force-pushed the yjhjstz/orca_parallel branch 2 times, most recently from 319b9f1 to 6db920d Compare May 8, 2026 17:44
Comment thread src/backend/gporca/libgpdbcost/src/CCostModelGPDB.cpp Outdated
Comment thread src/backend/gporca/libgpdbcost/src/CCostModelGPDB.cpp
Comment thread src/backend/gporca/libgpopt/src/base/CCostContext.cpp Outdated
Comment thread src/backend/gporca/libgpopt/src/xforms/CXformGet2ParallelTableScan.cpp Outdated
@zhangwenchao-123

Contributor

LGTM.

@yjhjstz yjhjstz force-pushed the yjhjstz/orca_parallel branch 2 times, most recently from 84a13dc to 3d959c8 Compare May 13, 2026 12:52
yjhjstz added 3 commits May 14, 2026 01:39
Add comprehensive parallel table scan capability to GPORCA optimizer,
enabling worker-level parallelism within segments for improved query
performance on large table scans.

Key components:
- New CPhysicalParallelTableScan operator and CDistributionSpecWorkerRandom
distribution specification for worker-level data distribution
- CXformGet2ParallelTableScan transformation with parallel safety checks
(excludes CTEs, dynamic scans, foreign tables, replicated tables, etc.)
- Cost model integration with parallel_setup_cost and efficiency degradation
scaling (logarithmic based on worker count)
- DXL serialization/deserialization for CDXLPhysicalParallelTableScan
- Plan translation to PostgreSQL SeqScan nodes with parallel_aware=true
- Rewindability constraints (parallel scans are non-rewindable)
- GUC integration: max_parallel_workers_per_gather controls worker count

In Cloudberry's MPP architecture, segment stats are delivered
asynchronously to the coordinator. The seq_scan counter can be
registered before seq_tup_read arrives from segments, causing
wait_for_stats() to exit prematurely and the subsequent assertion
to fail intermittently in the pax-ic-good-opt-off CI job.

Add an explicit wait condition (updated6) for seq_tup_read reaching
the expected value, and update the comment to reflect Cloudberry's
segment-level async stats delivery rather than parallel workers.
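
The race described in this commit message can be illustrated with a small simulation. This is a sketch only: the actual fix is in the regression test, and the struct, function names, and delivery schedule below are hypothetical. It assumes seq_scan becomes visible on the first stats refresh and seq_tup_read only on the third, and compares a wait that checks seq_scan alone with one that also checks seq_tup_read.

```cpp
#include <functional>

// Hypothetical view of the counters as seen on the coordinator.
struct TableStats {
    long seq_scan = 0;
    long seq_tup_read = 0;
};

// Polls `ready` up to max_polls times, calling `refresh` between polls to
// simulate another batch of stats arriving from the segments.
bool WaitForStats(const std::function<bool()> &ready,
                  const std::function<void()> &refresh,
                  int max_polls) {
    for (int i = 0; i < max_polls; ++i) {
        if (ready()) return true;
        refresh();
    }
    return ready();
}

// Returns the counters observed at the moment the wait exits.
// Assumed delivery schedule: seq_scan on refresh 1, seq_tup_read on refresh 3.
TableStats ObserveAfterWait(bool also_wait_for_tuples, long expected_tuples) {
    TableStats stats;
    int step = 0;
    auto refresh = [&]() {
        ++step;
        if (step >= 1) stats.seq_scan = 1;
        if (step >= 3) stats.seq_tup_read = expected_tuples;
    };
    auto ready = [&]() {
        const bool scan_seen = stats.seq_scan >= 1;
        const bool tuples_seen = stats.seq_tup_read >= expected_tuples;
        return also_wait_for_tuples ? (scan_seen && tuples_seen) : scan_seen;
    };
    WaitForStats(ready, refresh, 10);
    return stats;
}
```

In this simulation, waiting on seq_scan alone exits while seq_tup_read is still 0, which is the situation in which a later assertion on seq_tup_read would fail. Adding the second condition makes the wait exit only once both counters have arrived.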
@yjhjstz yjhjstz force-pushed the yjhjstz/orca_parallel branch from 3d959c8 to 3e8c25d Compare May 13, 2026 17:45
@yjhjstz yjhjstz merged commit 9052b7a into apache:main May 13, 2026
68 of 69 checks passed
@yjhjstz yjhjstz deleted the yjhjstz/orca_parallel branch May 13, 2026 19:57
4 participants