【OpenGauss源码学习 —— 列存储（ColumnTableSample）】

执行算子（ColumnTableSample）概述ColumnTableSample 类ColumnTableSample::ColumnTableSample 构造函数ColumnTableSample::~ColumnTableSample 析构函数ExecCStoreScan 函数ColumnTableSample::scanVecSample 函数ColumnTableSample::getMaxOffset 函数ColumnTableSample::scanBatch 函数ColumnTableSample::getBatchBySamples 函数ColumnTableSample::resetVecSampleScan 函数BaseTableSample::system_nextsampletuple 函数案例

声明：本文的部分内容参考了他人的文章。在编写过程中，我们尊重他人的知识产权和学术成果，力求遵循合理使用原则，并在适用的情况下注明引用来源。
本文主要参考了 OpenGauss1.1.0 的开源代码和《OpenGauss数据库源码解析》一书以及相关学习资料。

概述

在先前的学习中，我们在【 OpenGauss源码学习 —— 列存储（CopyTo）】一文中也学习过 CStore::CStoreScan成员函数，该函数实现了列式存储表的扫描过程，通过加载压缩单元描述符（CUDesc）、进行粗略检查（RoughCheck）、填充数据批量（FillVecBatch）等步骤完成扫描操作。那么CStore::CStoreScan和ExecCStoreScan函数之间有什么联系吗？
CStore::CStoreScan和ExecCStoreScan函数的区别主要在于它们所属的上下文和调用方式。这两个函数都用于执行列式存储表的扫描，但可能在不同的软件层次上。CStore::CStoreScan：属于CStore类，表明它可能是类的成员函数；而ExecCStoreScan函数属于执行器（Executor）或执行计划节点，表明它可能在查询执行计划的上下文中调用。总的来说，CStore::CStoreScan可能更关联于存储引擎的具体实现，而ExecCStoreScan则更可能属于查询执行计划的执行器的一部分。

ColumnTableSample 类

ColumnTableSample类是用于在列式存储表中执行采样扫描的实现。其功能包括维护偏移标识、当前列存储单元标识以及元组标识向量等状态信息，提供方法用于重置采样扫描状态、获取采样数据的批量行数，执行批量和向量化的采样扫描操作，并获取采样中的最大偏移值。该类的设计旨在有效地支持列式存储表的采样查询和分析需求。函数源码如下所示：（路径：src/include/executor/nodeSamplescan.h）

class ColumnTableSample : public BaseTableSample { private: uint16* offsetIds; // 存储列的偏移标识数组 uint32 currentCuId; // 当前的列单元（CU）标识 int batchRowCount; // 批次中的行数 VectorBatch* tids; // 存储样本行的标识符 public: ColumnTableSample(CStoreScanState* scanstate); // 构造函数，接受 CStoreScanState 参数进行初始化 virtual ~ColumnTableSample(); // 虚析构函数，用于释放资源 void resetVecSampleScan(); // 重置抽样扫描的矢量 void getBatchBySamples(VectorBatch* vbout); // 根据样本获取批次数据 ScanValid scanBatch(VectorBatch* batch); // 扫描并填充批次数据 void scanVecSample(VectorBatch* batch); // 扫描抽样矢量数据 void getMaxOffset(); // 获取最大的偏移值 }; ColumnTableSample::ColumnTableSample 构造函数

ColumnTableSample::ColumnTableSample函数是ColumnTableSample类的构造函数，用于初始化样本扫描参数。在函数中进行了以下操作：

为offsetIds分配了内存，并将其初始化为0。创建了一个新的VectorBatch对象，用于构造tids以获取样本的VectorBatch。

函数源码如下所示：（路径：src/gausskernel/runtime/executor/nodeSamplescan.cpp）

/* * 描述: 初始化 CStoreScanState 的样本扫描参数。 * * 参数: * @in scanstate: CStoreScanState 信息 * * 返回: void */ ColumnTableSample::ColumnTableSample(CStoreScanState* scanstate) : BaseTableSample(scanstate), currentCuId(0), batchRowCount(0) { // 为 offsetIds 分配内存，并初始化为0 offsetIds = (uint16*)palloc0(sizeof(uint16) * BatchMaxSize); errno_t rc = memset_s(offsetIds, sizeof(uint16) * BatchMaxSize, 0, sizeof(uint16) * BatchMaxSize); securec_check(rc,"",""); TupleDesc tupdesc = CreateTemplateTupleDesc(1, false); TupleDescInitEntry(tupdesc, (AttrNumber)1,"tids", INT8OID, -1, 0); tids = New(CurrentMemoryContext) VectorBatch(CurrentMemoryContext, tupdesc); } ColumnTableSample::~ColumnTableSample 析构函数

ColumnTableSample::~ColumnTableSample是ColumnTableSample类的析构函数，用于释放在构造函数中分配的资源。在函数中进行了以下操作：

释放offsetIds分配的内存。删除tids对象，释放相关资源。

函数源码如下所示：（路径：src/gausskernel/runtime/executor/nodeSamplescan.cpp）

ColumnTableSample::~ColumnTableSample() { // 释放 offsetIds 分配的内存 if (offsetIds) { pfree_ext(offsetIds); offsetIds = NULL; } // 释放 tids 对象 if (tids) { delete tids; tids = NULL; } }

下面，我将按照函数的调用关系来依次介绍ColumnTableSample类相关成员函数，函数调用关系如下图所示：

【OpenGauss源码学习 —— 列存储（ColumnTableSample）】

ExecCStoreScan 函数

ExecCStoreScan函数用于执行列存储表的扫描操作，返回下一个符合条件的向量批次。函数包括了设置运行时键、处理函数返回集合、运行列存储扫描或样本扫描、检查扫描是否结束等关键步骤。函数的目的是执行列存储表的顺序扫描，返回下一个符合条件的向量批次。函数源码如下所示：（路径：src/gausskernel/runtime/executor/nodeSamplescan.cpp）

/* * Description: 执行列存储表的扫描操作，返回下一个符合条件的向量批次。 * * Parameters: * @in node: CStoreScanState 信息 * * Returns: VectorBatch 包含下一个符合条件的元组。 */ VectorBatch* ExecCStoreScan(CStoreScanState* node) { VectorBatch* p_out_batch = NULL; // 输出向量批次 VectorBatch* p_scan_batch = NULL; // 扫描到的向量批次 // 如果有运行时键且尚未设置，进行设置 if (node->m_ScanRunTimeKeysNum && !node->m_ScanRunTimeKeysReady) { ExecCStoreScanEvalRuntimeKeys(node->ps.ps_ExprContext, node->m_pScanRunTimeKeys, node->m_ScanRunTimeKeysNum); node->m_ScanRunTimeKeysReady = true; } p_out_batch = node->m_pCurrentBatch; p_scan_batch = node->m_pScanBatch; // 更新列存储扫描计时标志 node->m_CStore->SetTiming(node); ExprDoneCond done = ExprSingleResult; // 处理函数返回集合的情况 if (node->ps.ps_TupFromTlist) { Assert(node->ps.ps_ProjInfo); p_out_batch = ExecVecProject(node->ps.ps_ProjInfo, true, &done); if (p_out_batch->m_rows > 0) { return p_out_batch; } node->ps.ps_TupFromTlist = false; } restart: // 重置向量批次和表达式上下文 p_scan_batch->Reset(true); p_out_batch->Reset(true); node->ps.ps_ProjInfo->pi_exprContext->current_row = 0; // 运行列存储扫描或样本扫描 if (!node->isSampleScan) { node->m_CStore->RunScan(node, p_scan_batch); } else { ((ColumnTableSample*)node->sampleScanInfo.tsm_state)->scanVecSample(p_scan_batch); } // 检查扫描是否结束且批次为空 if (node->m_CStore->IsEndScan() && p_scan_batch->m_rows == 0) { // 如果有数据，扫描增量存储表 ScanDeltaStore(node, p_scan_batch, NULL); if (p_scan_batch->m_rows == 0) return p_out_batch; } // 修复扫描到的向量批次的行数 p_scan_batch->FixRowCount(); // 应用条件和投影操作 p_out_batch = ApplyProjectionAndFilter(node, p_scan_batch, &done); if (done != ExprEndResult) { // 如果表达式计算结果不是结束状态 node->ps.ps_TupFromTlist = (done == ExprMultipleResult); } // 处理停止查询标志 if (unlikely(executorEarlyStop())) return NULL; // 如果输出批次为空，重新开始扫描 if (BatchIsNull(p_out_batch)) { CHECK_FOR_INTERRUPTS(); goto restart; } return p_out_batch; }

其中，可以看到，在代码ExecCStoreScan中调用了CStore::RunScan函数：

// 运行列存储扫描或样本扫描 if (!node->isSampleScan) { node->m_CStore->RunScan(node, p_scan_batch); } else { ((ColumnTableSample*)node->sampleScanInfo.tsm_state)->scanVecSample(p_scan_batch); } ColumnTableSample::scanVecSample 函数

ColumnTableSample::scanVecSample函数是一个用于获取列存储表样本的函数。函数通过状态机的方式，依次执行获取最大块数（GETMAXBLOCK）、获取块号（GETBLOCKNO）、获取最大偏移量（GETMAXOFFSET）、获取偏移量（GETOFFSET）、获取数据（GETDATA）的操作。在每个阶段，根据不同的条件和状态执行相应的操作，最终将样本数据存储在输出的VectorBatch中。函数源码如下所示：（路径：src/gausskernel/runtime/executor/nodeSamplescan.cpp）

/* * Description: 获取列存储表的样本 VectoBatch。 * * Parameters: * @in pOutBatch: 返回 VectorBatch 的数值 * * Returns: void */ void ColumnTableSample::scanVecSample(VectorBatch* pOutBatch) { if ((finished == true) || (vecsampleScanState->sampleScanInfo.sampleType == BERNOULLI_SAMPLE && percent[0] == 0) || (vecsampleScanState->sampleScanInfo.sampleType == SYSTEM_SAMPLE && percent[0] == 0) || (vecsampleScanState->sampleScanInfo.sampleType == HYBRID_SAMPLE && percent[BERNOULLI_SAMPLE] == 0 && percent[SYSTEM_SAMPLE] == 0)) { return; } for (;;) { CHECK_FOR_INTERRUPTS(); switch (runState) { case GETMAXBLOCK: { totalBlockNum = CStoreRelGetCUNumByNow((CStoreScanDesc)vecsampleScanState); runState = GETBLOCKNO; elog(DEBUG2,"获取关系：%s 在 %s 上的 %u 个CUs。", NameStr(vecsampleScanState->ss_currentRelation->rd_rel->relname), g_instance.attr.attr_common.PGXCNodeName, totalBlockNum); break; } case GETBLOCKNO: { (this->*nextSampleBlock_function)(); if (!BlockNumberIsValid(currentBlock)) { finished = true; return; } currentCuId = currentBlock + FirstCUID + 1; runState = GETMAXOFFSET; break; } case GETMAXOFFSET: { getMaxOffset(); if (InvalidOffsetNumber == curBlockMaxoffset) { runState = GETBLOCKNO; } else { runState = GETOFFSET; } elog(DEBUG2,"获取关系：%s 在 %s 上的 CUNo: %u 中的 %d 个元组。", NameStr(vecsampleScanState->ss_currentRelation->rd_rel->relname), g_instance.attr.attr_common.PGXCNodeName, currentBlock, curBlockMaxoffset); break; } case GETOFFSET: { (this->*nextSampleTuple_function)(); runState = GETDATA; break; } case GETDATA: { // 调用 scanBatch 函数获取有效的批次 ScanValid scanState = scanBatch(pOutBatch); switch (scanState) { // 如果存在有效数据，继续获取下一行数据 case VALIDDATA: { runState = GETOFFSET; return; } // 如果没有更多数据，转到获取下一个块的状态 case NEXTDATA: { runState = GETOFFSET; break; } // 如果块中的数据已经全部扫描完毕，需要获取下一个块 case INVALIDOFFSET: { runState = GETBLOCKNO; // 如果上一个批次已经填满，返回上一个批次并获取新的块和批次 if (batchRowCount > 0) { batchRowCount = 0; return; } break; } // 处理其他情况 default: { break; } } break; } default: { break; } } } } ColumnTableSample::getMaxOffset 函数

ColumnTableSample::getMaxOffset函数是用于获取当前块的最大偏移量的函数。在列存储数据库系统中，数据通常以列为单位进行组织，一个列可以被分成多个块，每个块包含一定数量的行。偏移量是指在一个块中某一行的相对位置。函数首先根据当前块号和列标识符获取列存储描述符CUDesc。然后，通过检查快照规则和元组状态，确定当前块的最大偏移量。函数源码如下所示：（路径：src/gausskernel/runtime/executor/nodeSamplescan.cpp）

/* * Description: 获取当前块的最大偏移量。 * * Parameters: 无 * * Returns: void */ void ColumnTableSample::getMaxOffset() { CUDesc cu_desc; int fstColIdx = 0; Assert(BlockNumberIsValid(currentBlock)); curBlockMaxoffset = InvalidOffsetNumber; if (vecsampleScanState->ss_currentRelation->rd_att->attrs[0]->attisdropped) { fstColIdx = CStoreGetfstColIdx(vecsampleScanState->ss_currentRelation); } /* * 根据 currentCuId 获取列的 CUDesc。 */ if (vecsampleScanState->m_CStore->GetCUDesc(fstColIdx, currentCuId, &cu_desc, GetActiveSnapshot()) != true) { return; } /* * 我们尽力保持对行关系获取元组的规则： * 1). 忽略已死亡的元组 * 2). 忽略最近死亡的元组 * 3). 忽略其他事务中正在插入中的元组 * 4). 忽略我们事务中正在删除中的元组 * 5). 忽略其他事务中正在删除中的元组 * SnapshotNow 可以满足规则 1) 2) 3) 4)，因此在这里使用它。 */ vecsampleScanState->m_CStore->GetCUDeleteMaskIfNeed(currentCuId, GetActiveSnapshot()); if (vecsampleScanState->m_CStore->IsTheWholeCuDeleted(cu_desc.row_count)) { return; } curBlockMaxoffset = cu_desc.row_count; } ColumnTableSample::scanBatch 函数

ColumnTableSample::scanBatch函数是用于通过元组 ID扫描每个偏移量，获取样本的 VectorBatch的函数。函数会检查当前块的偏移量是否为无效值，如果是，则判断是否有剩余的批次需要处理，如果有，则调用getBatchBySamples处理批次。接着，函数检查当前偏移量对应的行是否为无效行，如果不是，则将其添加到当前批次的offsetIds数组中。当批次的元组数量达到BatchMaxSize时，调用getBatchBySamples处理批次，然后清空 offsetIds 数组。最后，函数根据处理的结果返回相应的标志。函数源码如下所示：（路径：src/gausskernel/runtime/executor/nodeSamplescan.cpp）

/* * Description: 扫描每个偏移量，并通过元组ID获取样本的 VectorBatch。 * * Parameters: * @in pOutBatch: 返回 VectorBatch 的数值 * * Returns: ScanValid（用于标识元组是否有效的标志） */ ScanValid ColumnTableSample::scanBatch(VectorBatch* pOutBatch) { Assert(BlockNumberIsValid(currentBlock)); if (currentOffset == InvalidOffsetNumber) { if (batchRowCount > 0) { /* * 如果到达这里，意味着我们已经用尽了这个 CU 上的元组， * 现在是时候移到下一个 CU。 */ getBatchBySamples(pOutBatch); errno_t rc = memset_s(offsetIds, sizeof(uint16) * BatchMaxSize, 0, sizeof(uint16) * BatchMaxSize); securec_check(rc,"",""); } return INVALIDOFFSET; } if (!vecsampleScanState->m_CStore->IsDeadRow(currentCuId, (uint32)currentOffset)) { elog(DEBUG2,"获取一个元组 [currentCuId: %u, currentOffset: %u] for 关系: %s 在 %s 上.", currentCuId, currentOffset, NameStr(vecsampleScanState->ss_currentRelation->rd_rel->relname), g_instance.attr.attr_common.PGXCNodeName); offsetIds[batchRowCount++] = currentOffset; if (batchRowCount >= BatchMaxSize) { getBatchBySamples(pOutBatch); batchRowCount = 0; errno_t rc = memset_s(offsetIds, sizeof(uint16) * BatchMaxSize, 0, sizeof(uint16) * BatchMaxSize); securec_check(rc,"",""); return VALIDDATA; } } return NEXTDATA; } ColumnTableSample::getBatchBySamples 函数

ColumnTableSample::getBatchBySamples函数是根据tids（CuId+offsetId）获取样本的VectorBatch的函数。函数首先重置了 tids 的状态，然后通过当前 CU 的 CuId 和 offsetIds构建了tids 的 VectorBatch。接着，函数通过tids扫描了VectorBatch，并将结果存储在输出参数vbout中。函数源码如下所示：（路径：src/gausskernel/runtime/executor/nodeSamplescan.cpp）

/* * Description: 根据 tids（CuId+offsetId）获取样本的 VectorBatch。 * * Parameters: * @in state: CStoreScanState 信息 * @in cuId: 当前 CU 的 CuId * @in maxOffset: 当前 CU 的最大 Offset * @in offsetIds: 当前 CU 的随机 offsetIds * @in tids: 通过 cuId 和 offsetIds 构建 tids 的 VectorBatch * @in vbout: 返回 VectorBatch 的数值 * * Returns: void */ void ColumnTableSample::getBatchBySamples(VectorBatch* vbout) { ScalarVector* vec = tids->m_arr; tids->Reset(); for (int j = 0; j < batchRowCount; j++) { vec->m_vals[j] = 0; ItemPointer itemPtr = (ItemPointer)&vec->m_vals[j]; ItemPointerSet(itemPtr, currentCuId, offsetIds[j]); } vec->m_rows = batchRowCount; tids->m_rows = vec->m_rows; if (!BatchIsNull(tids)) { CStoreIndexScanState* indexScanState = makeNode(CStoreIndexScanState); indexScanState->m_indexOutAttrNo = 0; vecsampleScanState->m_CStore->ScanByTids(indexScanState, tids, vbout); vecsampleScanState->m_CStore->ResetLateRead(); } } ColumnTableSample::resetVecSampleScan 函数

ColumnTableSample::resetVecSampleScan函数是用于重置 VectoBatch 样本扫描参数的函数。在函数中，将currentCuId和batchRowCount设置为初始值，然后调用resetSampleScan函数重置表样本的通用参数。接着，如果存在 tids 对象，将其重置；同时，如果存在 offsetIds 数组，使用memset_s函数将其清零。这个函数主要用于准备进行下一轮VectoBatch样本扫描时的初始状态。函数源码如下所示：（路径：src/gausskernel/runtime/executor/nodeSamplescan.cpp）

/* * Description: 重置 VectoBatch 样本扫描参数。 * * Parameters: 无 * * Returns: void */ void ColumnTableSample::resetVecSampleScan() { currentCuId = 0; batchRowCount = 0; (((ColumnTableSample*)vecsampleScanState->sampleScanInfo.tsm_state)->resetSampleScan)(); if (tids) { tids->Reset(); } if (offsetIds) { errno_t rc = memset_s(offsetIds, sizeof(uint16) * BatchMaxSize, 0, sizeof(uint16) * BatchMaxSize); securec_check(rc,"",""); } } BaseTableSample::system_nextsampletuple 函数

BaseTableSample::system_nextsampletuple函数是用于获取下一个顺序偏移量的函数。函数首先记录当前偏移量，然后将其递增到页面上的下一个可能的偏移量。如果当前偏移量为无效值，则将其设置为第一个偏移量。接着，如果递增后的偏移量超过了当前块的最大偏移量，将其重新设置为无效值。最后，更新对象的当前偏移量。这个函数通常在对数据进行顺序扫描时使用，确保按顺序逐个获取数据行的偏移量。函数源码如下所示：（路径：src/gausskernel/runtime/executor/nodeSamplescan.cpp）

/* * Description: 获取顺序下一个偏移量。 * Parameters: 无 * Returns: void */ void BaseTableSample::system_nextsampletuple() { // 记录当前偏移量 OffsetNumber tupoffset = currentOffset; if (tupoffset == InvalidOffsetNumber) { tupoffset = FirstOffsetNumber; } else { tupoffset++; } // 如果偏移量超过当前块的最大偏移量，则将其设置为无效值 if (tupoffset > curBlockMaxoffset) { tupoffset = InvalidOffsetNumber; } // 更新当前偏移量 currentOffset = tupoffset; } 案例

下面我们还是以一个案例来调试一下代码吧，首先执行以下sql语句：

-- 创建表 CREATE TABLE column_store_table ( id INT, name VARCHAR(50), age INT, salary DECIMAL(10, 2), email VARCHAR(100) )WITH (ORIENTATION = COLUMN); -- 插入数据 INSERT INTO column_store_table VALUES (1, 'John', 30, 50000.00, 'john@example.com'), (2, 'Alice', 28, 60000.50, NULL), (3, 'Bob', NULL, NULL, 'bob@example.com'); -- 执行列存查询操作 select * from column_store_table where id >1;

1.在 ExecCStoreScan 函数中打上断点。

函数调用关系如下所示：

#0 ExecCStoreScan (node=0x7f15ae082060) at veccstore.cpp:314 #1 0x000000000173425f in VectorEngine (node=0x7f15ae082060) at vecexecutor.cpp:171 #2 0x0000000001687fd5 in ExecVecToRow (state=0x7f15adeea060) at vectortorow.cpp:149 #3 0x000000000159a439 in ExecProcNodeByType (node=0x7f15adeea060) at execProcnode.cpp:677 #4 0x000000000159a8dd in ExecProcNode (node=0x7f15adeea060) at execProcnode.cpp:769 #5 0x0000000001595232 in ExecutePlan (estate=0x7f15b335c060, planstate=0x7f15adeea060, operation=CMD_SELECT, sendTuples=true, numberTuples=0, direction=ForwardScanDirection, dest=0x7f15b3355d60) at execMain.cpp:2124 #6 0x0000000001591d6a in standard_ExecutorRun (queryDesc=0x7f15b3368c60, direction=ForwardScanDirection, count=0) at execMain.cpp:608 #7 0x000000000139a5d4 in explain_ExecutorRun (queryDesc=0x7f15b3368c60, direction=ForwardScanDirection, count=0) at auto_explain.cpp:116 #8 0x000000000159188f in ExecutorRun (queryDesc=0x7f15b3368c60, direction=ForwardScanDirection, count=0) at execMain.cpp:484 #9 0x000000000147298f in PortalRunSelect (portal=0x7f15adedc060, forward=true, count=0, dest=0x7f15b3355d60) at pquery.cpp:1396 #10 0x0000000001471b5c in PortalRun (portal=0x7f15adedc060, count=9223372036854775807, isTopLevel=true, dest=0x7f15b3355d60, altdest=0x7f15b3355d60, completionTag=0x7f15abf27f90"") at pquery.cpp:1134 ---Type to continue, or q to quit---

相关调试信息如下所示：

(gdb) p *node $1 = { = {ps = {type = T_CStoreScanState, plan = 0x7f15b9b1a0a0, state = 0x7f15b335c060, instrument = 0x0, targetlist = 0x7f15b9b242d0, qual = 0x7f15b9b25530, lefttree = 0x0, righttree = 0x0, initPlan = 0x0, subPlan = 0x0, chgParam = 0x0, hbktScanSlot = {currSlot = 0}, ps_ResultTupleSlot = 0x7f15b9b24b98, ps_ExprContext = 0x7f15ae082288, ps_ProjInfo = 0x7f15b9b27e50, ps_TupFromTlist = false, vectorized = true, nodeContext = 0x7f15adee0060, earlyFreed = false, stubType = 0 '\000', jitted_vectarget = 0x0, plan_issues = 0x0, recursive_reset = false, qual_is_inited = false, ps_rownum = 0}, ss_currentRelation = 0x7f15adff8390, ss_currentScanDesc = 0x0, ss_ScanTupleSlot = 0x7f15b9b24d08, ss_ReScan = false, ss_currentPartition = 0x0, isPartTbl = false, currentSlot = 0, partScanDirection = NoMovementScanDirection, partitions = 0x0, lockMode = 0, runTimeParamPredicates = 0x0, runTimePredicatesReady = false, is_scan_end = false, ss_scanaccessor = 0x0, part_id = 0, startPartitionId = 0, endPartitionId = 0, rangeScanInRedis = {isRangeScanInRedis = 0 '\000', sliceTotal = 0 '\000', sliceIndex = 0 '\000'}, isSampleScan = false, sampleScanInfo = {args = 0x0, repeatable = 0x0, sampleType = SYSTEM_SAMPLE, tsm_state = 0x0}, ScanNextMtd = 0x0}, ss_currentDeltaRelation = 0x7f15adffb050, ss_partition_parent = 0x0, ss_currentDeltaScanDesc = 0x7f15b8289060, ss_deltaScan = false, ss_deltaScanEnd = false, m_pScanBatch = 0x7f15b305b3d0, m_pCurrentBatch = 0x7f15b9b25b20, m_pScanRunTimeKeys = 0x0, m_ScanRunTimeKeysNum = 0, m_ScanRunTimeKeysReady = false, m_CStore = 0x7f15b31f54f0, csss_ScanKeys = 0x7f15b31f50c8, csss_NumScanKeys = 1, m_fSimpleMap = true, m_fUseColumnRef = false, jitted_vecqual = 0x0, m_isReplicaTable = false} (gdb) p *p_out_batch $2 = { = {}, m_rows = 0, m_cols = 5, m_checkSel = false, m_sel = 0x7f15b9b25b88, m_arr = 0x7f15b2ddbdb0, m_sysColumns = 0x0, m_pCompressBuf = 0x0} (gdb) p *p_scan_batch $3 = { = {}, m_rows = 0, m_cols = 5, m_checkSel = false, m_sel = 0x7f15b305b438, m_arr = 0x7f15b9b27c30, m_sysColumns = 0x0, m_pCompressBuf = 0x0}

在这里，执行 select * from column_store_table where id > 1; 后 ExecCStoreScan函数会调用CStore::RunScan对列存表进行扫描。这里，我们可以使用如下SQL 来限制表扫描的范围：SELECT * FROM column_store_table TABLESAMPLE SYSTEM (10);

2.执行样本表扫描。

执行 SELECT * FROM column_store_table TABLESAMPLE SYSTEM (10); 后，可以看到ExecCStoreScan函数会调用ColumnTableSample::scanVecSample函数进行样本表扫描。

│354 if (!node->isSampleScan) { │ │355 node->m_CStore->RunScan(node, p_scan_batch); │ │356 } else { │ │357 /* │ │358 * Sample scan for column table. │ │359 */ │ >│360 (((ColumnTableSample*)node->sampleScanInfo.tsm_state)->scanVecSample)(p_scan_batch); │ │361 } │

相关调试信息如下所示：

(gdb) p * pOutBatch $1 = { = {}, m_rows = 0, m_cols = 5, m_checkSel = false, m_sel = 0x7f15b305c1c8, m_arr = 0x7f15b2dd3db0, m_sysColumns = 0x0, m_pCompressBuf = 0x0} (gdb) p totalBlockNum $2 = 2 (gdb) p blockindex $3 = 2 (gdb) p p_scan_batch->m_rows $4 = 0

这里由于 TABLESAMPLE SYSTEM (10) 表示从表中随机抽取约 10% 的行，所以当执行到如下判断时就会直接 return。

if (!BlockNumberIsValid(currentBlock)) { finished = true; return; }

为了调试ColumnTableSample::getMaxOffset函数，我们这里修改SQL 语句如下：SELECT * FROM column_store_table TABLESAMPLE SYSTEM (60);

3.步入 getMaxOffset 函数。

相关调试信息如下所示：

(gdb) p curBlockMaxoffset $1 = 0 (gdb) p cu_desc.row_count $2 = 1 (gdb) p curBlockMaxoffset $3 = 1

ColumnTableSample::getMaxOffset函数的目的是获取当前块（Column Unit，CU）的最大偏移量。在列存储数据库系统中，数据通常以列为单位进行组织，一个列可以被分成多个块，每个块包含一定数量的行。偏移量是指在一个块中某一行的相对位置。
举个具体例子：假设有一个列存储表sample_table包含以下数据：

id | name | value ---|-------|------- 1 | John | 10 2 | Alice | 20 3 | Bob | 15

假设该表按照value 列进行列存储，每个Column Unit（CU）包含2 行数据。现在，我们来模拟一下ColumnTableSample::getMaxOffset函数的执行：

获取列存储描述符：假设当前块的CuId为1，即第一个 CU。通过CuId获取列存储描述符，该描述符包含有关该列的元数据信息，例如每个 CU 中的行数。检查行是否被删除：假设当前块的第一行数据被删除，但其他行有效。函数检查列存储描述符，并确定第一行已被删除。确定最大偏移量：由于第一行已被删除，最大偏移量将是第二行，因此最大偏移量为 2。

这样，在进行后续的列存储扫描时，系统将从第二行开始扫描，忽略已被删除的第一行，从而避免不必要的数据读取和处理。这种方式有助于提高查询性能，特别是当表中包含大量被删除或不需要的数据时。

4.进入 GETDATA 状态。

相关调试信息如下所示：

(gdb) p scanState $1 = NEXTDATA

返回NEXTDATA 状态，这通常表示当前块中的数据已经全部扫描完毕，需要获取下一个块的数据。使用 (this->*nextSampleTuple_function)() 调用 BaseTableSample::system_nextsampletuple函数获取下一个顺序偏移量的函数。
相关调试信息如下所示：

(gdb) p currentOffset $1 = 1 (gdb) p curBlockMaxoffset $2 = 1 (gdb) p tupoffset $30 = 2 # 执行 if (tupoffset > curBlockMaxoffset) 后 (gdb) p tupoffset $4 = 0 # 返回下一个符合条件的向量批次 (gdb) p *p_out_batch $5 = { = {}, m_rows = 1, m_cols = 5, m_checkSel = false, m_sel = 0x7f15b30505a0, m_arr = 0x7f15b9b27d58, m_sysColumns = 0x0, m_pCompressBuf = 0x0}

【OpenGauss源码学习 —— 列存储（ColumnTableSample）】

计算机视觉的应用20-图像生成模型(Stable Diffusion)的 ...

2023最新AI创作系统ChatGPT网站源码+Midjourney绘画 ...

re:Invent 构建未来：云计算&生成式 AI 诞生科技新局 ...

初识Dockerfile

计算机视觉的应用20-图像生成模型(Stable Diffusion)的原理详解与相关项目介绍

2023最新AI创作系统ChatGPT网站源码+Midjourney绘画+支持GPT-4-Turbo模型+即将支持TSS语音对话功能模块

re:Invent 构建未来：云计算&生成式 AI 诞生科技新局面

初识Dockerfile

OpenCV中的一些图像方法记录

精选内容

顶流网红Speed中国行，外交部回应：展现一个没有滤镜的真实中国

仁迷太熟了每体赞弗里克：现在巴萨就算丢球也让人感觉能逆转

名记：若绿军无法卫冕将不得不交易波津&霍勒迪最可能成牺牲品

【OpenGauss源码学习 —— 列存储（ColumnTableSample）】

相关信息

你可能还喜欢

热门推荐信息