PostgreSQL Source Code Walkthrough (233) - Query #126 (NOT IN Implementation #4)

This section briefly explains why a PostgreSQL NOT IN query can be fast on one run and slow on the next.

The test data is as follows:

[local]:5432 pg12@testdb=# select count(*) from tbl;
 count 
-------
     1
(1 row)
Time: 6.009 ms
[local]:5432 pg12@testdb=# select count(*) from t_big_null;
  count   
----------
 10000001
(1 row)
Time: 633.248 ms
[local]:5432 pg12@testdb=# \d tbl
                Table "public.tbl"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 id     | integer |           | not null | 
 value  | integer |           | not null | 
Indexes:
    "tbl_pkey" PRIMARY KEY, btree (id)
Rules:
    rule_tbl_update AS
    ON INSERT TO tbl
   WHERE (EXISTS ( SELECT tbl_1.id,
            tbl_1.value
           FROM tbl tbl_1
          WHERE tbl_1.id = new.id)) DO INSTEAD  UPDATE tbl SET value = tbl.value + 1
  WHERE tbl.id = new.id
[local]:5432 pg12@testdb=# \d t_big_null
             Table "public.t_big_null"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 id     | integer |           |          | 
[local]:5432 pg12@testdb=#

Note that tbl holds a single row (id = 1), while in t_big_null the row with id = 1 was deliberately inserted last:

truncate table t_big_null;
insert into t_big_null select generate_series(2,10000000);
insert into t_big_null values(1);

1. Data Structures

SubPlanState
Runtime state of a subplan.


/* ----------------
 *        SubPlanState node
 * ----------------
 */
typedef struct SubPlanState
{
    NodeTag        type;
    SubPlan    *subplan;        /* expression plan node */
    struct PlanState *planstate;    /* subselect plan's state tree */
    struct PlanState *parent;    /* parent plan node's state tree */
    ExprState  *testexpr;        /* state of combining expression */
    List       *args;            /* states of argument expression(s) */
    HeapTuple    curTuple;        /* copy of most recent tuple from subplan */
    Datum        curArray;        /* most recent array from ARRAY() subplan */
    /* these are used when hashing the subselect's output: */
    TupleDesc    descRight;        /* subselect desc after projection */
    ProjectionInfo *projLeft;    /* for projecting lefthand exprs */
    ProjectionInfo *projRight;    /* for projecting subselect output */
    TupleHashTable hashtable;    /* hash table for no-nulls subselect rows */
    TupleHashTable hashnulls;    /* hash table for rows with null(s) */
    bool        havehashrows;    /* true if hashtable is not empty */
    bool        havenullrows;    /* true if hashnulls is not empty */
    MemoryContext hashtablecxt; /* memory context containing hash tables */
    MemoryContext hashtempcxt;    /* temp memory context for hash tables */
    ExprContext *innerecontext; /* econtext for computing inner tuples */
    AttrNumber *keyColIdx;        /* control data for hash tables */
    Oid           *tab_eq_funcoids;    /* equality func oids for table
                                     * datatype(s) */
    Oid           *tab_collations; /* collations for hash and comparison */
    FmgrInfo   *tab_hash_funcs; /* hash functions for table datatype(s) */
    FmgrInfo   *tab_eq_funcs;    /* equality functions for table datatype(s) */
    FmgrInfo   *lhs_hash_funcs; /* hash functions for lefthand datatype(s) */
    FmgrInfo   *cur_eq_funcs;    /* equality functions for LHS vs. table */
    ExprState  *cur_eq_comp;    /* equality comparator for LHS vs. table */
} SubPlanState;

SubPlan
The plan node for a subquery.


/*
 * SubPlan - executable expression node for a subplan (sub-SELECT)
 *
 * The planner replaces SubLink nodes in expression trees with SubPlan
 * nodes after it has finished planning the subquery.  SubPlan references
 * a sub-plantree stored in the subplans list of the toplevel PlannedStmt.
 * (We avoid a direct link to make it easier to copy expression trees
 * without causing multiple processing of the subplan.)
 *
 * In an ordinary subplan, testexpr points to an executable expression
 * (OpExpr, an AND/OR tree of OpExprs, or RowCompareExpr) for the combining
 * operator(s); the left-hand arguments are the original lefthand expressions,
 * and the right-hand arguments are PARAM_EXEC Param nodes representing the
 * outputs of the sub-select.  (NOTE: runtime coercion functions may be
 * inserted as well.)  This is just the same expression tree as testexpr in
 * the original SubLink node, but the PARAM_SUBLINK nodes are replaced by
 * suitably numbered PARAM_EXEC nodes.
 *
 * If the sub-select becomes an initplan rather than a subplan, the executable
 * expression is part of the outer plan's expression tree (and the SubPlan
 * node itself is not, but rather is found in the outer plan's initPlan
 * list).  In this case testexpr is NULL to avoid duplication.
 *
 * The planner also derives lists of the values that need to be passed into
 * and out of the subplan.  Input values are represented as a list "args" of
 * expressions to be evaluated in the outer-query context (currently these
 * args are always just Vars, but in principle they could be any expression).
 * The values are assigned to the global PARAM_EXEC params indexed by parParam
 * (the parParam and args lists must have the same ordering).  setParam is a
 * list of the PARAM_EXEC params that are computed by the sub-select, if it
 * is an initplan; they are listed in order by sub-select output column
 * position.  (parParam and setParam are integer Lists, not Bitmapsets,
 * because their ordering is significant.)
 *
 * Also, the planner computes startup and per-call costs for use of the
 * SubPlan.  Note that these include the cost of the subquery proper,
 * evaluation of the testexpr if any, and any hashtable management overhead.
 */
typedef struct SubPlan
{
    Expr        xpr;
    /* Fields copied from original SubLink: */
    SubLinkType subLinkType;    /* see above */
    /* The combining operators, transformed to an executable expression: */
    Node       *testexpr;        /* OpExpr or RowCompareExpr expression tree */
    List       *paramIds;        /* IDs of Params embedded in the above */
    /* Identification of the Plan tree to use: */
    int            plan_id;        /* Index (from 1) in PlannedStmt.subplans */
    /* Identification of the SubPlan for EXPLAIN and debugging purposes: */
    char       *plan_name;        /* A name assigned during planning */
    /* Extra data useful for determining subplan's output type: */
    Oid            firstColType;    /* Type of first column of subplan result */
    int32        firstColTypmod; /* Typmod of first column of subplan result */
    Oid            firstColCollation;    /* Collation of first column of subplan
                                     * result */
    /* Information about execution strategy: */
    bool        useHashTable;    /* true to store subselect output in a hash
                                 * table (implies we are doing "IN") */
    bool        unknownEqFalse; /* true if it's okay to return FALSE when the
                                 * spec result is UNKNOWN; this allows much
                                 * simpler handling of null values */
    bool        parallel_safe;    /* is the subplan parallel-safe? */
    /* Note: parallel_safe does not consider contents of testexpr or args */
    /* Information for passing params into and out of the subselect: */
    /* setParam and parParam are lists of integers (param IDs) */
    List       *setParam;        /* initplan subqueries have to set these
                                 * Params for parent plan */
    List       *parParam;        /* indices of input Params from parent plan */
    List       *args;            /* exprs to pass as parParam values */
    /* Estimated execution costs: */
    Cost        startup_cost;    /* one-time setup cost */
    Cost        per_call_cost;    /* cost for each subplan evaluation */
} SubPlan;

SubLinkType
The kinds of SubLink.


/*
 * SubLink
 *
 * A SubLink represents a subselect appearing in an expression, and in some
 * cases also the combining operator(s) just above it.  The subLinkType
 * indicates the form of the expression represented:
 *    EXISTS_SUBLINK        EXISTS(SELECT ...)
 *    ALL_SUBLINK            (lefthand) op ALL (SELECT ...)
 *    ANY_SUBLINK            (lefthand) op ANY (SELECT ...)
 *    ROWCOMPARE_SUBLINK    (lefthand) op (SELECT ...)
 *    EXPR_SUBLINK        (SELECT with single targetlist item ...)
 *    MULTIEXPR_SUBLINK    (SELECT with multiple targetlist items ...)
 *    ARRAY_SUBLINK        ARRAY(SELECT with single targetlist item ...)
 *    CTE_SUBLINK            WITH query (never actually part of an expression)
 *
 * For ALL, ANY, and ROWCOMPARE, the lefthand is a list of expressions of the
 * same length as the subselect's targetlist.  ROWCOMPARE will *always* have
 * a list with more than one entry; if the subselect has just one target
 * then the parser will create an EXPR_SUBLINK instead (and any operator
 * above the subselect will be represented separately).
 * ROWCOMPARE, EXPR, and MULTIEXPR require the subselect to deliver at most
 * one row (if it returns no rows, the result is NULL).
 * ALL, ANY, and ROWCOMPARE require the combining operators to deliver boolean
 * results.  ALL and ANY combine the per-row results using AND and OR
 * semantics respectively.
 * ARRAY requires just one target column, and creates an array of the target
 * column's type using any number of rows resulting from the subselect.
 *
 * SubLink is classed as an Expr node, but it is not actually executable;
 * it must be replaced in the expression tree by a SubPlan node during
 * planning.
 *
 * NOTE: in the raw output of gram.y, testexpr contains just the raw form
 * of the lefthand expression (if any), and operName is the String name of
 * the combining operator.  Also, subselect is a raw parsetree.  During parse
 * analysis, the parser transforms testexpr into a complete boolean expression
 * that compares the lefthand value(s) to PARAM_SUBLINK nodes representing the
 * output columns of the subselect.  And subselect is transformed to a Query.
 * This is the representation seen in saved rules and in the rewriter.
 *
 * In EXISTS, EXPR, MULTIEXPR, and ARRAY SubLinks, testexpr and operName
 * are unused and are always null.
 *
 * subLinkId is currently used only for MULTIEXPR SubLinks, and is zero in
 * other SubLinks.  This number identifies different multiple-assignment
 * subqueries within an UPDATE statement's SET list.  It is unique only
 * within a particular targetlist.  The output column(s) of the MULTIEXPR
 * are referenced by PARAM_MULTIEXPR Params appearing elsewhere in the tlist.
 *
 * The CTE_SUBLINK case never occurs in actual SubLink nodes, but it is used
 * in SubPlans generated for WITH subqueries.
 */
typedef enum SubLinkType
{
    EXISTS_SUBLINK,
    ALL_SUBLINK,
    ANY_SUBLINK,
    ROWCOMPARE_SUBLINK,
    EXPR_SUBLINK,
    MULTIEXPR_SUBLINK,
    ARRAY_SUBLINK,
    CTE_SUBLINK                    /* for SubPlans only */
} SubLinkType;

SubLink
The SubLink struct.


typedef struct SubLink
{
    Expr        xpr;
    SubLinkType subLinkType;    /* see above */
    int            subLinkId;        /* ID (1..n); 0 if not MULTIEXPR */
    Node       *testexpr;        /* outer-query test for ALL/ANY/ROWCOMPARE */
    List       *operName;        /* originally specified operator name */
    Node       *subselect;        /* subselect as Query* or raw parsetree */
    int            location;        /* token location, or -1 if unknown */
} SubLink;

MaterialState
Runtime state of a Material node.

/* ----------------
 *     MaterialState information
 *
 *        materialize nodes are used to materialize the results
 *        of a subplan into a temporary file.
 *
 *        ss.ss_ScanTupleSlot refers to output of underlying plan.
 * ----------------
 */
typedef struct MaterialState
{
    ScanState    ss;                /* its first field is NodeTag */
    int            eflags;            /* capability flags to pass to tuplestore */
    bool        eof_underlying; /* reached end of underlying plan? */
    Tuplestorestate *tuplestorestate;
} MaterialState;

Tuplestorestate
Private state of a Tuplestore operation.


/*
 * Possible states of a Tuplestore object.  These denote the states that
 * persist between calls of Tuplestore routines.
 */
typedef enum
{
    TSS_INMEM,                    /* Tuples still fit in memory */
    TSS_WRITEFILE,                /* Writing to temp file */
    TSS_READFILE                /* Reading from temp file */
} TupStoreStatus;
/*
 * Private state of a Tuplestore operation.
 */
struct Tuplestorestate
{
    TupStoreStatus status;        /* enumerated value as shown above */
    int            eflags;            /* capability flags (OR of pointers' flags) */
    bool        backward;        /* store extra length words in file? */
    bool        interXact;        /* keep open through transactions? */
    bool        truncated;        /* tuplestore_trim has removed tuples? */
    int64        availMem;        /* remaining memory available, in bytes */
    int64        allowedMem;        /* total memory allowed, in bytes */
    int64        tuples;            /* number of tuples added */
    BufFile    *myfile;            /* underlying file, or NULL if none */
    MemoryContext context;        /* memory context for holding tuples */
    ResourceOwner resowner;        /* resowner for holding temp files */
    /*
     * These function pointers decouple the routines that must know what kind
     * of tuple we are handling from the routines that don't need to know it.
     * They are set up by the tuplestore_begin_xxx routines.
     *
     * (Although tuplestore.c currently only supports heap tuples, I've copied
     * this part of tuplesort.c so that extension to other kinds of objects
     * will be easy if it's ever needed.)
     *
     * Function to copy a supplied input tuple into palloc'd space. (NB: we
     * assume that a single pfree() is enough to release the tuple later, so
     * the representation must be "flat" in one palloc chunk.) state->availMem
     * must be decreased by the amount of space used.
     */
    void       *(*copytup) (Tuplestorestate *state, void *tup);
    /*
     * Function to write a stored tuple onto tape.  The representation of the
     * tuple on tape need not be the same as it is in memory; requirements on
     * the tape representation are given below.  After writing the tuple,
     * pfree() it, and increase state->availMem by the amount of memory space
     * thereby released.
     */
    void        (*writetup) (Tuplestorestate *state, void *tup);
    /*
     * Function to read a stored tuple from tape back into memory. 'len' is
     * the already-read length of the stored tuple.  Create and return a
     * palloc'd copy, and decrease state->availMem by the amount of memory
     * space consumed.
     */
    void       *(*readtup) (Tuplestorestate *state, unsigned int len);
    /*
     * This array holds pointers to tuples in memory if we are in state INMEM.
     * In states WRITEFILE and READFILE it's not used.
     *
     * When memtupdeleted > 0, the first memtupdeleted pointers are already
     * released due to a tuplestore_trim() operation, but we haven't expended
     * the effort to slide the remaining pointers down.  These unused pointers
     * are set to NULL to catch any invalid accesses.  Note that memtupcount
     * includes the deleted pointers.
     */
    void      **memtuples;        /* array of pointers to palloc'd tuples */
    int            memtupdeleted;    /* the first N slots are currently unused */
    int            memtupcount;    /* number of tuples currently present */
    int            memtupsize;        /* allocated length of memtuples array */
    bool        growmemtuples;    /* memtuples' growth still underway? */
    /*
     * These variables are used to keep track of the current positions.
     *
     * In state WRITEFILE, the current file seek position is the write point;
     * in state READFILE, the write position is remembered in writepos_xxx.
     * (The write position is the same as EOF, but since BufFileSeek doesn't
     * currently implement SEEK_END, we have to remember it explicitly.)
     */
    TSReadPointer *readptrs;    /* array of read pointers */
    int            activeptr;        /* index of the active read pointer */
    int            readptrcount;    /* number of pointers currently valid */
    int            readptrsize;    /* allocated length of readptrs array */
    int            writepos_file;    /* file# (valid if READFILE state) */
    off_t        writepos_offset;    /* offset (valid if READFILE state) */
};
#define COPYTUP(state,tup)    ((*(state)->copytup) (state, tup))
#define WRITETUP(state,tup) ((*(state)->writetup) (state, tup))
#define READTUP(state,len)    ((*(state)->readtup) (state, len))
#define LACKMEM(state)        ((state)->availMem < 0)
#define USEMEM(state,amt)    ((state)->availMem -= (amt))
#define FREEMEM(state,amt)    ((state)->availMem += (amt))
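
The memory accounting behind LACKMEM and USEMEM can be illustrated with a small standalone model. This is only a sketch under simplifying assumptions: SketchStore, sketch_init and sketch_put are hypothetical names that do not exist in PostgreSQL, and the real tuplestore also tracks the memtuples array and writes the buffered tuples to a temp file when it leaves the in-memory state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for Tuplestorestate: only the fields
 * needed to show how the memory budget drives the state change. */
typedef enum { SK_INMEM, SK_WRITEFILE } SketchStatus;

typedef struct SketchStore
{
    SketchStatus status;    /* current state */
    int64_t      availMem;  /* remaining budget in bytes; may go negative */
    int64_t      tuples;    /* number of tuples added so far */
} SketchStore;

/* Same shape as the LACKMEM/USEMEM macros shown above. */
#define SK_LACKMEM(s)      ((s)->availMem < 0)
#define SK_USEMEM(s, amt)  ((s)->availMem -= (amt))

/* Start in memory with the whole budget available. */
void sketch_init(SketchStore *s, int64_t allowed_bytes)
{
    assert(allowed_bytes > 0);
    s->status = SK_INMEM;
    s->availMem = allowed_bytes;
    s->tuples = 0;
}

/* Add one tuple of the given size; returns true while still in memory. */
bool sketch_put(SketchStore *s, int64_t tuple_bytes)
{
    s->tuples++;
    if (s->status == SK_INMEM)
    {
        SK_USEMEM(s, tuple_bytes);
        if (SK_LACKMEM(s))
            s->status = SK_WRITEFILE;   /* real code would dump tuples to a temp file here */
    }
    return s->status == SK_INMEM;
}
```

With a 100-byte budget, a first 60-byte tuple stays in memory and a second one pushes the model into the file-backed state, which mirrors why a large subplan result ends up in a temp file.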

TSReadPointer
A tuplestore read pointer.


/*
 * State for a single read pointer.  If we are in state INMEM then all the
 * read pointers' "current" fields denote the read positions.  In state
 * WRITEFILE, the file/offset fields denote the read positions.  In state
 * READFILE, inactive read pointers have valid file/offset, but the active
 * read pointer implicitly has position equal to the temp file's seek position.
 *
 * Special case: if eof_reached is true, then the pointer's read position is
 * implicitly equal to the write position, and current/file/offset aren't
 * maintained.  This way we need not update all the read pointers each time
 * we write.
 */
typedef struct
{
    int            eflags;            /* capability flags */
    bool        eof_reached;    /* read has reached EOF */
    int            current;        /* next array index to read */
    int            file;            /* temp file# */
    off_t        offset;            /* byte offset in file */
} TSReadPointer;

2. Source Code Walkthrough

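The parser turns "x NOT IN (subquery)" into NOT (x = ANY (subquery)), so at execution time it is evaluated as an ANY_SUBLINK with a NOT on top. How fast it runs depends on when a matching row turns up, because the scan returns as soon as one is found. Execution time is therefore closely tied to the order in which the table is scanned: the earlier the matching row appears, the sooner the loop exits and the less time the query takes.
The relevant code is: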

...
        /* evaluate the combining expression for this subplan row */
        rowresult = ExecEvalExprSwitchContext(node->testexpr, econtext,
                                              &rownull);
        if (subLinkType == ANY_SUBLINK)
        {
            /* combine across rows per OR semantics */
            if (rownull)
                *isNull = true;
            else if (DatumGetBool(rowresult))
            {
                result = BoolGetDatum(true);
                *isNull = false;
                break;            /* needn't look at any more rows */
            }
        }

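As the code above shows, while the full table scan is being materialized, the loop exits as soon as testexpr evaluates to true for one row. If that row happens to sit at the very start of the scan, the query finishes very quickly (a few data blocks read versus a full table scan).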
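
The effect of row position on the amount of work can be sketched with a tiny standalone model. This is an illustration only, not PostgreSQL code: rows_examined_any_eq is a hypothetical helper, the subquery output is assumed to be a plain int array without NULLs, and the scan is assumed to run from the first element to the last.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: evaluate "lhs = ANY (rows)" with the same early exit
 * as the ANY_SUBLINK branch, and report how many rows were examined.
 */
size_t rows_examined_any_eq(int lhs, const int *rows, size_t nrows, bool *matched)
{
    size_t examined = 0;

    assert(matched != NULL);
    *matched = false;
    for (size_t i = 0; i < nrows; i++)
    {
        examined++;
        if (rows[i] == lhs)
        {
            *matched = true;
            break;              /* needn't look at any more rows */
        }
    }
    return examined;
}
```

For rows laid out as {2, 3, 4, 5, 1}, looking up 2 examines one row while looking up 1 examines all five. Scaled up to the ten-million-row t_big_null, that is the difference between reading a few blocks and reading the whole table.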

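Create a new table, tbl3, and insert two rows (id = 1 and id = 2). In t_big_null, the row with id = 2 sits in the lowest-numbered block and the row with id = 1 in the highest-numbered block. Repeated runs of the same query are then sometimes fast and sometimes slow, with timings about three orders of magnitude apart: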

[local]:5432 pg12@testdb=# create table tbl3(id int);
CREATE TABLE
Time: 1.852 ms
[local]:5432 pg12@testdb=# insert into tbl3 values(1);
INSERT 0 1
Time: 1.276 ms
[local]:5432 pg12@testdb=# insert into tbl3 values(2);
INSERT 0 1
Time: 1.089 ms
[local]:5432 pg12@testdb=# select * from tbl3 where id not in (select b.id from t_big_null b);
 id 
----
(0 rows)
Time: 3.676 ms
[local]:5432 pg12@testdb=# select * from tbl3 where id not in (select b.id from t_big_null b);
 id 
----
(0 rows)
Time: 4925.893 ms (00:04.926)
[local]:5432 pg12@testdb=# select * from tbl3 where id not in (select b.id from t_big_null b);
 id 
----
(0 rows)
Time: 2.858 ms
[local]:5432 pg12@testdb=# select * from tbl3 where id not in (select b.id from t_big_null b);
 id 
----
(0 rows)
Time: 4588.436 ms (00:04.588)
[local]:5432 pg12@testdb=# select * from tbl3 where id not in (select b.id from t_big_null b);
 id 
----
(0 rows)
Time: 1.896 ms
[local]:5432 pg12@testdb=# select * from tbl3 where id not in (select b.id from t_big_null b);
 id 
----
(0 rows)
Time: 4653.525 ms (00:04.654)
[local]:5432 pg12@testdb=#

ExecScanSubPlan


/*
 * ExecScanSubPlan: default case where we have to rescan subplan each time
 */
static Datum
ExecScanSubPlan(SubPlanState *node,
                ExprContext *econtext,
                bool *isNull)
{
    SubPlan    *subplan = node->subplan;    /* the subplan node */
    PlanState  *planstate = node->planstate;    /* runtime state of the subplan */
    SubLinkType subLinkType = subplan->subLinkType;    /* sublink type */
    MemoryContext oldcontext;    /* caller's memory context */
    TupleTableSlot *slot;        /* current tuple slot */
    Datum        result;            /* result value */
    bool        found = false;    /* true if got at least one subplan tuple */
    ListCell   *pvar;            /* loop variable */
    ListCell   *l;                /* loop variable */
    ArrayBuildStateAny *astate = NULL;    /* array accumulator for ARRAY_SUBLINK */
    /*
     * MULTIEXPR subplans, when "executed", just return NULL; but first we
     * mark the subplan's output parameters as needing recalculation.  (This
     * is a bit of a hack: it relies on the subplan appearing later in its
     * targetlist than any of the referencing Params, so that all the Params
     * have been evaluated before we re-mark them for the next evaluation
     * cycle.  But in general resjunk tlist items appear after non-resjunk
     * ones, so this should be safe.)  Unlike ExecReScanSetParamPlan, we do
     * *not* set bits in the parent plan node's chgParam, because we don't
     * want to cause a rescan of the parent.
     *
     * The branch below handles MULTIEXPR sublinks.
     */
    if (subLinkType == MULTIEXPR_SUBLINK)
    {
        EState       *estate = node->parent->state;
        foreach(l, subplan->setParam)
        {
            int            paramid = lfirst_int(l);
            ParamExecData *prm = &(estate->es_param_exec_vals[paramid]);
            prm->execPlan = node;
        }
        *isNull = true;
        return (Datum) 0;
    }
    /* Initialize ArrayBuildStateAny in caller's context, if needed */
    if (subLinkType == ARRAY_SUBLINK)
        astate = initArrayResultAny(subplan->firstColType,
                                    CurrentMemoryContext, true);
    /*
     * We are probably in a short-lived expression-evaluation context. Switch
     * to the per-query context for manipulating the child plan's chgParam,
     * calling ExecProcNode on it, etc.
     */
    oldcontext = MemoryContextSwitchTo(econtext->ecxt_per_query_memory);
    /*
     * Set Params of this plan from parent plan correlation values. (Any
     * calculation we have to do is done in the parent econtext, since the
     * Param values don't need to have per-query lifetime.)
     */
    Assert(list_length(subplan->parParam) == list_length(node->args));
    forboth(l, subplan->parParam, pvar, node->args)
    {
        int            paramid = lfirst_int(l);
        ParamExecData *prm = &(econtext->ecxt_param_exec_vals[paramid]);
        prm->value = ExecEvalExprSwitchContext((ExprState *) lfirst(pvar),
                                               econtext,
                                               &(prm->isnull));
        planstate->chgParam = bms_add_member(planstate->chgParam, paramid);
    }
    /*
     * Now that we've set up its parameters, we can reset the subplan.
     */
    /* ExecReScan resets a plan node so that its output can be re-scanned. */
    ExecReScan(planstate);
    /*
     * For all sublink types except EXPR_SUBLINK and ARRAY_SUBLINK, the result
     * is boolean as are the results of the combining operators. We combine
     * results across tuples (if the subplan produces more than one) using OR
     * semantics for ANY_SUBLINK or AND semantics for ALL_SUBLINK.
     * (ROWCOMPARE_SUBLINK doesn't allow multiple tuples from the subplan.)
     * NULL results from the combining operators are handled according to the
     * usual SQL semantics for OR and AND.  The result for no input tuples is
     * FALSE for ANY_SUBLINK, TRUE for ALL_SUBLINK, NULL for
     * ROWCOMPARE_SUBLINK.
     *
     * For EXPR_SUBLINK we require the subplan to produce no more than one
     * tuple, else an error is raised.  If zero tuples are produced, we return
     * NULL.  Assuming we get a tuple, we just use its first column (there can
     * be only one non-junk column in this case).
     *
     * For ARRAY_SUBLINK we allow the subplan to produce any number of tuples,
     * and form an array of the first column's values.  Note in particular
     * that we produce a zero-element array if no tuples are produced (this is
     * a change from pre-8.3 behavior of returning NULL).
     */
    result = BoolGetDatum(subLinkType == ALL_SUBLINK);    /* TRUE for ALL, otherwise FALSE */
    *isNull = false;
    for (slot = ExecProcNode(planstate);
         !TupIsNull(slot);
         slot = ExecProcNode(planstate))    /* fetch tuples until the subplan is exhausted */
    {
        TupleDesc    tdesc = slot->tts_tupleDescriptor;    /* tuple descriptor */
        Datum        rowresult;    /* per-row result of testexpr */
        bool        rownull;    /* is the per-row result NULL? */
        int            col;        /* column counter */
        ListCell   *plst;        /* loop variable */
        if (subLinkType == EXISTS_SUBLINK)
        {
            found = true;
            result = BoolGetDatum(true);
            break;
        }
        if (subLinkType == EXPR_SUBLINK)
        {
            /* cannot allow multiple input tuples for EXPR sublink */
            if (found)
                ereport(ERROR,
                        (errcode(ERRCODE_CARDINALITY_VIOLATION),
                         errmsg("more than one row returned by a subquery used as an expression")));
            found = true;
            /*
             * We need to copy the subplan's tuple in case the result is of
             * pass-by-ref type --- our return value will point into this
             * copied tuple!  Can't use the subplan's instance of the tuple
             * since it won't still be valid after next ExecProcNode() call.
             * node->curTuple keeps track of the copied tuple for eventual
             * freeing.
             */
            if (node->curTuple)
                heap_freetuple(node->curTuple);
            node->curTuple = ExecCopySlotHeapTuple(slot);
            result = heap_getattr(node->curTuple, 1, tdesc, isNull);
            /* keep scanning subplan to make sure there's only one tuple */
            continue;
        }
        if (subLinkType == ARRAY_SUBLINK)
        {
            Datum        dvalue;
            bool        disnull;
            found = true;
            /* stash away current value */
            Assert(subplan->firstColType == TupleDescAttr(tdesc, 0)->atttypid);
            dvalue = slot_getattr(slot, 1, &disnull);
            astate = accumArrayResultAny(astate, dvalue, disnull,
                                         subplan->firstColType, oldcontext);
            /* keep scanning subplan to collect all values */
            continue;
        }
        /* cannot allow multiple input tuples for ROWCOMPARE sublink either */
        if (subLinkType == ROWCOMPARE_SUBLINK && found)
            ereport(ERROR,
                    (errcode(ERRCODE_CARDINALITY_VIOLATION),
                     errmsg("more than one row returned by a subquery used as an expression")));
        found = true;    /* we have now seen at least one subplan tuple */
        /*
         * For ALL, ANY, and ROWCOMPARE sublinks, load up the Params
         * representing the columns of the sub-select, and then evaluate the
         * combining expression.
         */
        col = 1;    /* columns are numbered from 1 */
        foreach(plst, subplan->paramIds)    /* one Param per subquery output column */
        {
            int            paramid = lfirst_int(plst);
            ParamExecData *prmdata;
            prmdata = &(econtext->ecxt_param_exec_vals[paramid]);
            Assert(prmdata->execPlan == NULL);
            /* fetch the column value into the Param */
            prmdata->value = slot_getattr(slot, col, &(prmdata->isnull));
            col++;
        }
        /* evaluate the combining expression for this row */
        rowresult = ExecEvalExprSwitchContext(node->testexpr, econtext,
                                              &rownull);
        if (subLinkType == ANY_SUBLINK)
        {
            /* combine across rows per OR semantics */
            /* combine across rows per OR semantics */
            if (rownull)
                *isNull = true;
            else if (DatumGetBool(rowresult))
            {
                result = BoolGetDatum(true);
                *isNull = false;
                break;            /* needn't look at any more rows */
            }
        }
        else if (subLinkType == ALL_SUBLINK)
        {
            /* ALL: combine across rows per AND semantics */
            if (rownull)
                *isNull = true;
            else if (!DatumGetBool(rowresult))
            {
                result = BoolGetDatum(false);
                *isNull = false;
                break;            /* needn't look at any more rows */
            }
        }
        else
        {
            /* must be ROWCOMPARE_SUBLINK */
            result = rowresult;
            *isNull = rownull;
        }
    }
    MemoryContextSwitchTo(oldcontext);
    if (subLinkType == ARRAY_SUBLINK)
    {
        /* We return the result in the caller's context */
        result = makeArrayResultAny(astate, oldcontext, true);
    }
    else if (!found)
    {
        /*
         * deal with empty subplan result (no rows returned).  result/isNull
         * were previously initialized correctly for all sublink types except
         * EXPR and ROWCOMPARE; for those, return NULL.
         */
        if (subLinkType == EXPR_SUBLINK ||
            subLinkType == ROWCOMPARE_SUBLINK)
        {
            result = (Datum) 0;
            *isNull = true;
        }
    }
    // return the result
    return result;
}
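
The ANY branch above can be condensed into a small standalone sketch. It is not PostgreSQL code: the names `TriBool`, `NullableInt`, `any_equals` and `not_in` are hypothetical and exist only for this illustration, and integer equality stands in for the combining expression. It assumes, as in the listing, that a true comparison ends the scan, a NULL comparison is only remembered, and NOT IN is the negation of `= ANY`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Three-valued SQL boolean (hypothetical type, for this sketch only). */
typedef enum { TV_FALSE, TV_TRUE, TV_NULL } TriBool;

/* One nullable integer, standing in for a row returned by the subplan. */
typedef struct
{
    int  value;
    bool isnull;
} NullableInt;

/*
 * Evaluate "lhs = ANY (rows)" with OR semantics, mirroring the
 * ANY_SUBLINK branch: a match stops the scan, a NULL comparison is
 * remembered, and a false comparison changes nothing.
 */
static TriBool
any_equals(NullableInt lhs, const NullableInt *rows, size_t nrows)
{
    TriBool result = TV_FALSE;    /* an empty subplan yields false for ANY */
    size_t  i;

    assert(rows != NULL || nrows == 0);
    for (i = 0; i < nrows; i++)
    {
        if (lhs.isnull || rows[i].isnull)
            result = TV_NULL;     /* strict "=" returns NULL; keep scanning */
        else if (lhs.value == rows[i].value)
            return TV_TRUE;       /* needn't look at any more rows */
    }
    return result;
}

/* "lhs NOT IN (rows)" is NOT (lhs = ANY (rows)); NOT NULL stays NULL. */
static TriBool
not_in(NullableInt lhs, const NullableInt *rows, size_t nrows)
{
    TriBool r = any_equals(lhs, rows, nrows);

    if (r == TV_NULL)
        return TV_NULL;
    return (r == TV_TRUE) ? TV_FALSE : TV_TRUE;
}
```

In this sketch the only early exit is a match. When no row matches, or when only NULLs are encountered, every row has to be read before the answer is known.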

III. Trace Analysis

Run the following SQL:

[local]:5432 pg12@testdb=# select * from tbl a where a.id not in (select b.id from t_big_null b);

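Start gdb and set a breakpoint. The breakpoint is hit 1760 times before execution finishes, so we tell gdb to ignore the first 1758 hits and trace only the last two.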

(gdb) info b
Num     Type           Disp Enb Address            What
13      breakpoint     keep y   0x0000000000721126 in ExecMaterial at nodeMaterial.c:150
    breakpoint already hit 1760 times
    ignore next 3360 hits
...
(gdb) b nodeSubplan.c:328
Breakpoint 17 at 0x7303b9: file nodeSubplan.c, line 328.
(gdb) del 16
(gdb) info b
Num     Type           Disp Enb Address            What
17      breakpoint     keep y   0x00000000007303b9 in ExecScanSubPlan at nodeSubplan.c:328
(gdb) ignore 17 1758
Will ignore next 1758 crossings of breakpoint 17.
(gdb) c
Continuing.

Tracing begins at the 1759th hit. At this point the row fetched from the SubPlan has id = 10000000.

Breakpoint 17, ExecScanSubPlan (node=0x3069268, econtext=0x3068aa0, isNull=0x3068dbd)
    at nodeSubplan.c:328
328            TupleDesc    tdesc = slot->tts_tupleDescriptor;
(gdb) n
334            if (subLinkType == EXISTS_SUBLINK)
(gdb) 
341            if (subLinkType == EXPR_SUBLINK)
(gdb) 
367            if (subLinkType == ARRAY_SUBLINK)
(gdb) 
383            if (subLinkType == ROWCOMPARE_SUBLINK && found)
(gdb) 
388            found = true;
(gdb) 
395            col = 1;
(gdb) 
396            foreach(plst, subplan->paramIds)
(gdb) 
398                int            paramid = lfirst_int(plst);
(gdb) 
401                prmdata = &(econtext->ecxt_param_exec_vals[paramid]);
(gdb) 
402                Assert(prmdata->execPlan == NULL);
(gdb) 
403                prmdata->value = slot_getattr(slot, col, &(prmdata->isnull));
(gdb) p *prmdata
$109 = {execPlan = 0x0, value = 9999999, isnull = false}
(gdb) n
404                col++;
(gdb) p *prmdata
$110 = {execPlan = 0x0, value = 10000000, isnull = false}
(gdb) n
396            foreach(plst, subplan->paramIds)
(gdb)

Evaluate the expression:

407            rowresult = ExecEvalExprSwitchContext(node->testexpr, econtext,
(gdb) step
ExecEvalExprSwitchContext (state=0x3069380, econtext=0x3068aa0, isNull=0x7ffd184750ef)
    at ../../../src/include/executor/executor.h:306
306        oldContext = MemoryContextSwitchTo(econtext->ecxt_per_tuple_memory);
(gdb) n
307        retDatum = state->evalfunc(state, econtext, isNull);
(gdb) step
ExecInterpExpr (state=0x3069380, econtext=0x3068aa0, isnull=0x7ffd184750ef)
    at execExprInterp.c:404
404        if (unlikely(state == NULL))
(gdb) n
411        op = state->steps;
(gdb) p *state
$111 = {tag = {type = T_ExprState}, flags = 6 '\006', resnull = false, resvalue = 0, 
  resultslot = 0x0, steps = 0x3069418, evalfunc = 0x6e2d4d, 
  expr = 0x30917a8, evalfunc_private = 0x6e2d4d, steps_len = 5, 
  steps_alloc = 16, parent = 0x3068988, ext_params = 0x0, innermost_caseval = 0x0, 
  innermost_casenull = 0x0, innermost_domainval = 0x0, innermost_domainnull = 0x0}
(gdb) n
412        resultslot = state->resultslot;
(gdb) 
413        innerslot = econtext->ecxt_innertuple;
(gdb) 
414        outerslot = econtext->ecxt_outertuple;
(gdb) 
415        scanslot = econtext->ecxt_scantuple;
(gdb) p *innerslot
Cannot access memory at address 0x0
(gdb) p *outerslot
Cannot access memory at address 0x0
(gdb) n
418        EEO_DISPATCH();
(gdb) p *scanslot
$112 = {type = T_TupleTableSlot, tts_flags = 16, tts_nvalid = 1, 
  tts_ops = 0xc3e780, tts_tupleDescriptor = 0x7fab449c99f0, 
  tts_values = 0x3068bd0, tts_isnull = 0x3068be0, tts_mcxt = 0x3067da0, tts_tid = {
    ip_blkid = {bi_hi = 0, bi_lo = 0}, ip_posid = 2}, tts_tableOid = 40960}
(gdb) p *scanslot->tts_values
$113 = 1
(gdb) n
448                CheckOpSlotCompatibility(op, scanslot);
(gdb) n
450                slot_getsomeattrs(scanslot, op->d.fetch.last_var);
(gdb) 
452                EEO_NEXT();
(gdb) 
487                int            attnum = op->d.var.attnum;
(gdb) 
491                Assert(attnum >= 0 && attnum < scanslot->tts_nvalid);
(gdb) 
492                *op->resvalue = scanslot->tts_values[attnum];
(gdb) 
493                *op->resnull = scanslot->tts_isnull[attnum];
(gdb) 
495                EEO_NEXT();
(gdb) p *op->resvalue
$114 = 1
(gdb) n
962                ExecEvalParamExec(state, op, econtext);
(gdb) 
964                EEO_NEXT();
(gdb) p *op
$115 = {opcode = 7224136, resvalue = 0x30698b8, resnull = 0x30698c0, d = {fetch = {
      last_var = 0, fixed = 23, known_desc = 0x0, kind = 0x0}, var = {attnum = 0, 
      vartype = 23}, wholerow = {var = 0x1700000000, first = false, slow = false, 
      tupdesc = 0x0, junkFilter = 0x0}, assign_var = {resultnum = 0, attnum = 23}, 
    assign_tmp = {resultnum = 0}, constval = {value = 98784247808, isnull = false}, 
    func = {finfo = 0x1700000000, fcinfo_data = 0x0, fn_addr = 0x0, nargs = 0}, 
    boolexpr = {anynull = 0x1700000000, jumpdone = 0}, qualexpr = {jumpdone = 0}, jump = {
      jumpdone = 0}, nulltest_row = {argdesc = 0x1700000000}, param = {paramid = 0, 
      paramtype = 23}, cparam = {paramfunc = 0x1700000000, paramarg = 0x0, paramid = 0, 
      paramtype = 0}, casetest = {value = 0x1700000000, isnull = 0x0}, make_readonly = {
      value = 0x1700000000, isnull = 0x0}, iocoerce = {finfo_out = 0x1700000000, 
      fcinfo_data_out = 0x0, finfo_in = 0x0, fcinfo_data_in = 0x0}, sqlvaluefunction = {
      svf = 0x1700000000}, nextvalueexpr = {seqid = 0, seqtypid = 23}, arrayexpr = {
      elemvalues = 0x1700000000, elemnulls = 0x0, nelems = 0, elemtype = 0, 
      elemlength = 0, elembyval = false, elemalign = 0 '\000', multidims = false}, 
    arraycoerce = {elemexprstate = 0x1700000000, resultelemtype = 0, amstate = 0x0}, 
    row = {tupdesc = 0x1700000000, elemvalues = 0x0, elemnulls = 0x0}, rowcompare_step = {
      finfo = 0x1700000000, fcinfo_data = 0x0, fn_addr = 0x0, jumpnull = 0, 
      jumpdone = 0}, rowcompare_final = {rctype = 0}, minmax = {values = 0x1700000000, 
      nulls = 0x0, nelems = 0, op = IS_GREATEST, finfo = 0x0, fcinfo_data = 0x0}, 
    fieldselect = {fieldnum = 0, resulttype = 23, argdesc = 0x0}, fieldstore = {
      fstore = 0x1700000000, argdesc = 0x0, values = 0x0, nulls = 0x0, ncolumns = 0}, 
    sbsref_subscript = {state = 0x1700000000, off = 0, isupper = false, jumpdone = 0}, 
    sbsref = {state = 0x1700000000}, domaincheck = {
      constraintname = 0x1700000000, 
      checkvalue = 0x0, checknull = 0x0, resulttype = 0}, convert_rowtype = {
      convert = 0x1700000000, indesc = 0x0, outdesc = 0x0, map = 0x0, 
      initialized = false}, scalararrayop = {element_type = 0, useOr = 23, typlen = 0, 
      typbyval = false, typalign = 0 '\000', finfo = 0x0, fcinfo_data = 0x0, 
      fn_addr = 0x0}, xmlexpr = {xexpr = 0x1700000000, named_argvalue = 0x0, 
      named_argnull = 0x0, argvalue = 0x0, argnull = 0x0}, aggref = {
      astate = 0x1700000000}, grouping_func = {parent = 0x1700000000, clauses = 0x0}, 
    window_func = {wfstate = 0x1700000000}, subplan = {sstate = 0x1700000000}, 
    alternative_subplan = {asstate = 0x1700000000}, agg_deserialize = {
      aggstate = 0x1700000000, fcinfo_data = 0x0, jumpnull = 0}, 
    agg_strict_input_check = {args = 0x1700000000, nulls = 0x0, nargs = 0, jumpnull = 0}, 
    agg_init_trans = {aggstate = 0x1700000000, pertrans = 0x0, aggcontext = 0x0, 
      setno = 0, transno = 0, setoff = 0, jumpnull = 0}, agg_strict_trans_check = {
      aggstate = 0x1700000000, setno = 0, transno = 0, setoff = 0, jumpnull = 0}, 
    agg_trans = {aggstate = 0x1700000000, pertrans = 0x0, aggcontext = 0x0, setno = 0, 
      transno = 0, setoff = 0}}}
(gdb) p *state
$116 = {tag = {type = T_ExprState}, flags = 6 '\006', resnull = false, resvalue = 0, 
  resultslot = 0x0, steps = 0x3069418, evalfunc = 0x6e2d4d, 
  expr = 0x30917a8, evalfunc_private = 0x6e2d4d, steps_len = 5, 
  steps_alloc = 16, parent = 0x3068988, ext_params = 0x0, innermost_caseval = 0x0, 
  innermost_casenull = 0x0, innermost_domainval = 0x0, innermost_domainnull = 0x0}
(gdb) n
634                FunctionCallInfo fcinfo = op->d.func.fcinfo_data;
(gdb) 
635                NullableDatum *args = fcinfo->args;
(gdb) p *fcinfo
$117 = {flinfo = 0x3069830, context = 0x0, resultinfo = 0x0, fncollation = 0, 
  isnull = false, nargs = 2, args = 0x30698a8}
(gdb) p *fcinfo->args
$118 = {value = 1, isnull = false}
(gdb) n
640                for (argno = 0; argno < op->d.func.nargs; argno++)
(gdb) p op->d.func.nargs
$119 = 2
(gdb) p *op
$120 = {opcode = 7222440, resvalue = 0x3069388, resnull = 0x3069385, d = {fetch = {
      last_var = 50763824, fixed = false, known_desc = 0x3069888, 
      kind = 0x96c2b2}, var = {attnum = 50763824, vartype = 0}, wholerow = {
      var = 0x3069830, first = 136, slow = 152, tupdesc = 0x96c2b2, 
      junkFilter = 0x2}, assign_var = {resultnum = 50763824, attnum = 0}, assign_tmp = {
      resultnum = 50763824}, constval = {value = 50763824, isnull = 136}, func = {
      finfo = 0x3069830, fcinfo_data = 0x3069888, fn_addr = 0x96c2b2, 
      nargs = 2}, boolexpr = {anynull = 0x3069830, jumpdone = 50763912}, qualexpr = {
      jumpdone = 50763824}, jump = {jumpdone = 50763824}, nulltest_row = {
      argdesc = 0x3069830}, param = {paramid = 50763824, paramtype = 0}, cparam = {
      paramfunc = 0x3069830, paramarg = 0x3069888, paramid = 9880242, paramtype = 0}, 
    casetest = {value = 0x3069830, isnull = 0x3069888}, make_readonly = {
      value = 0x3069830, isnull = 0x3069888}, iocoerce = {finfo_out = 0x3069830, 
      fcinfo_data_out = 0x3069888, finfo_in = 0x96c2b2, fcinfo_data_in = 0x2}, 
    sqlvaluefunction = {svf = 0x3069830}, nextvalueexpr = {seqid = 50763824, 
      seqtypid = 0}, arrayexpr = {elemvalues = 0x3069830, elemnulls = 0x3069888, 
      nelems = 9880242, elemtype = 0, elemlength = 2, elembyval = false, 
      elemalign = 0 '\000', multidims = false}, arraycoerce = {elemexprstate = 0x3069830, 
      resultelemtype = 50763912, amstate = 0x96c2b2}, row = {
      tupdesc = 0x3069830, elemvalues = 0x3069888, elemnulls = 0x96c2b2}, 
    rowcompare_step = {finfo = 0x3069830, fcinfo_data = 0x3069888, 
      fn_addr = 0x96c2b2, jumpnull = 2, jumpdone = 0}, rowcompare_final = {
      rctype = 50763824}, minmax = {values = 0x3069830, nulls = 0x3069888, 
      nelems = 9880242, op = IS_GREATEST, finfo = 0x2, fcinfo_data = 0x0}, fieldselect = {
      fieldnum = -26576, resulttype = 0, argdesc = 0x3069888}, fieldstore = {
      fstore = 0x3069830, argdesc = 0x3069888, values = 0x96c2b2, nulls = 0x2, 
      ncolumns = 0}, sbsref_subscript = {state = 0x3069830, off = 50763912, 
      isupper = false, jumpdone = 9880242}, sbsref = {state = 0x3069830}, domaincheck = {
      constraintname = 0x3069830 "\262\302\226", checkvalue = 0x3069888, 
      checknull = 0x96c2b2, resulttype = 2}, convert_rowtype = {
      convert = 0x3069830, indesc = 0x3069888, outdesc = 0x96c2b2, map = 0x2, 
      initialized = false}, scalararrayop = {element_type = 50763824, useOr = false, 
      typlen = 0, typbyval = 136, typalign = -104 '\230', finfo = 0x96c2b2, 
      fcinfo_data = 0x2, fn_addr = 0x0}, xmlexpr = {xexpr = 0x3069830, 
      named_argvalue = 0x3069888, named_argnull = 0x96c2b2, argvalue = 0x2, 
      argnull = 0x0}, aggref = {astate = 0x3069830}, grouping_func = {parent = 0x3069830, 
      clauses = 0x3069888}, window_func = {wfstate = 0x3069830}, subplan = {
      sstate = 0x3069830}, alternative_subplan = {asstate = 0x3069830}, 
    agg_deserialize = {aggstate = 0x3069830, fcinfo_data = 0x3069888, 
      jumpnull = 9880242}, agg_strict_input_check = {args = 0x3069830, nulls = 0x3069888, 
      nargs = 9880242, jumpnull = 0}, agg_init_trans = {aggstate = 0x3069830, 
      pertrans = 0x3069888, aggcontext = 0x96c2b2, setno = 2, transno = 0, 
      setoff = 0, jumpnull = 0}, agg_strict_trans_check = {aggstate = 0x3069830, 
      setno = 50763912, transno = 0, setoff = 9880242, jumpnull = 0}, agg_trans = {
      aggstate = 0x3069830, pertrans = 0x3069888, aggcontext = 0x96c2b2, 
      setno = 2, transno = 0, setoff = 0}}}
(gdb) p op->d->func
$121 = {finfo = 0x3069830, fcinfo_data = 0x3069888, fn_addr = 0x96c2b2, 
  nargs = 2}
(gdb) p op->d->func->finfo
$122 = (FmgrInfo *) 0x3069830
(gdb) p *op->d->func->finfo
$123 = {fn_addr = 0x96c2b2, fn_oid = 65, fn_nargs = 2, fn_strict = true, 
  fn_retset = false, fn_stats = 2 '\002', fn_extra = 0x0, fn_mcxt = 0x3067da0, 
  fn_expr = 0x30917a8}
(gdb) p *op->d->func->fcinfo_data
$124 = {flinfo = 0x3069830, context = 0x0, resultinfo = 0x0, fncollation = 0, 
  isnull = false, nargs = 2, args = 0x30698a8}
(gdb) p *op->d->func->fcinfo_data->flinfo
$125 = {fn_addr = 0x96c2b2, fn_oid = 65, fn_nargs = 2, fn_strict = true, 
  fn_retset = false, fn_stats = 2 '\002', fn_extra = 0x0, fn_mcxt = 0x3067da0, 
  fn_expr = 0x30917a8}
(gdb) p *op->d->func->fcinfo_data->args
$126 = {value = 1, isnull = false}
(gdb) n
642                    if (args[argno].isnull)
(gdb) 
640                for (argno = 0; argno < op->d.func.nargs; argno++)
(gdb) 
642                    if (args[argno].isnull)
(gdb) 
640                for (argno = 0; argno < op->d.func.nargs; argno++)
(gdb) 
648                fcinfo->isnull = false;
(gdb) p *args
$127 = {value = 1, isnull = false}
(gdb) n
649                d = op->d.func.fn_addr(fcinfo);
(gdb) 
650                *op->resvalue = d;
(gdb) p d
$128 = 0
(gdb) n
651                *op->resnull = fcinfo->isnull;
(gdb) 
654                EEO_NEXT();
(gdb) 
425                goto out;
(gdb) n
1747        *isnull = state->resnull;
(gdb) 
1748        return state->resvalue;
(gdb) p *state
$129 = {tag = {type = T_ExprState}, flags = 6 '\006', resnull = false, resvalue = 0, 
  resultslot = 0x0, steps = 0x3069418, evalfunc = 0x6e2d4d, 
  expr = 0x30917a8, evalfunc_private = 0x6e2d4d, steps_len = 5, 
  steps_alloc = 16, parent = 0x3068988, ext_params = 0x0, innermost_caseval = 0x0, 
  innermost_casenull = 0x0, innermost_domainval = 0x0, innermost_domainnull = 0x0}
(gdb) n
1749    }
(gdb) 
ExecEvalExprSwitchContext (state=0x3069380, econtext=0x3068aa0, isNull=0x7ffd184750ef)
    at ../../../src/include/executor/executor.h:308
308        MemoryContextSwitchTo(oldContext);
(gdb) 
309        return retDatum;
(gdb) p retDatum
$130 = 0
(gdb) n
310    }
(gdb)
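
The session above steps through a short program of expression steps: read the outer column, read the executor parameter, call a strict two-argument equality function, then finish. The following is a much simplified sketch of that step-driven style of evaluation, not PostgreSQL's implementation. The names `SketchDatum`, `Step`, `EvalContext`, `eval_steps` and `eval_var_equals_param` are hypothetical, the slot-fetch step is left out, and plain integer equality stands in for the function invoked through `fn_addr`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified value-plus-null-flag pair (hypothetical, for this sketch). */
typedef struct
{
    long value;
    bool isnull;
} SketchDatum;

typedef enum { OP_SCAN_VAR, OP_PARAM_EXEC, OP_FUNC_STRICT, OP_DONE } OpCode;

typedef struct
{
    OpCode       opcode;
    int          index;     /* column number or parameter id */
    SketchDatum *result;    /* where this step stores its output */
} Step;

typedef struct
{
    const SketchDatum *scan_values;    /* columns of the outer (scan) tuple */
    const SketchDatum *params;         /* parameters filled in by the subplan loop */
} EvalContext;

/* Run the steps in order until OP_DONE; args holds the function's two inputs. */
static SketchDatum
eval_steps(const Step *steps, const EvalContext *ctx, SketchDatum *args)
{
    const Step *op;

    assert(steps != NULL && ctx != NULL && args != NULL);
    for (op = steps;; op++)
    {
        switch (op->opcode)
        {
            case OP_SCAN_VAR:
                *op->result = ctx->scan_values[op->index];
                break;
            case OP_PARAM_EXEC:
                *op->result = ctx->params[op->index];
                break;
            case OP_FUNC_STRICT:
                /* strict: any NULL argument gives NULL without calling the function */
                if (args[0].isnull || args[1].isnull)
                {
                    op->result->value = 0;
                    op->result->isnull = true;
                }
                else
                {
                    op->result->value = (args[0].value == args[1].value);
                    op->result->isnull = false;
                }
                break;
            case OP_DONE:
                return *op->result;
        }
    }
}

/* Build and run the step program for "scan_col = param". */
static SketchDatum
eval_var_equals_param(SketchDatum scan_col, SketchDatum param)
{
    SketchDatum args[2];
    SketchDatum final = {0, true};
    EvalContext ctx = {&scan_col, &param};
    Step        steps[4] = {
        {OP_SCAN_VAR, 0, &args[0]},
        {OP_PARAM_EXEC, 0, &args[1]},
        {OP_FUNC_STRICT, 0, &final},
        {OP_DONE, 0, &final},
    };

    return eval_steps(steps, &ctx, args);
}
```

With the values seen in the session (outer column 1, parameter 10000000), the sketch yields a non-null 0, which corresponds to `retDatum = 0` above.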

This is the 1760th hit:

ExecScanSubPlan (node=0x3069268, econtext=0x3068aa0, isNull=0x3068dbd)
    at nodeSubplan.c:410
410            if (subLinkType == ANY_SUBLINK)
(gdb) 
413                if (rownull)
(gdb) 
415                else if (DatumGetBool(rowresult))
(gdb) p rowresult
$131 = 0
(gdb) n
326             slot = ExecProcNode(planstate))
(gdb) 
324        for (slot = ExecProcNode(planstate);
(gdb) 
325             !TupIsNull(slot);
(gdb) 
Breakpoint 17, ExecScanSubPlan (node=0x3069268, econtext=0x3068aa0, isNull=0x3068dbd)
    at nodeSubplan.c:328
328            TupleDesc    tdesc = slot->tts_tupleDescriptor;
(gdb) 
334            if (subLinkType == EXISTS_SUBLINK)
(gdb) 
341            if (subLinkType == EXPR_SUBLINK)
(gdb) 
367            if (subLinkType == ARRAY_SUBLINK)
(gdb) 
383            if (subLinkType == ROWCOMPARE_SUBLINK && found)
(gdb) 
388            found = true;
(gdb) 
395            col = 1;
(gdb) p *slot->tts_values
$132 = 10000000 --> value from the previous row
(gdb) n
396            foreach(plst, subplan->paramIds)
(gdb) 
398                int            paramid = lfirst_int(plst);
(gdb) 
401                prmdata = &(econtext->ecxt_param_exec_vals[paramid]);
(gdb) 
402                Assert(prmdata->execPlan == NULL);
(gdb) p *prmdata
$133 = {execPlan = 0x0, value = 10000000, isnull = false}
(gdb) n
403                prmdata->value = slot_getattr(slot, col, &(prmdata->isnull));
(gdb) 
404                col++;
(gdb) p *prmdata
$134 = {execPlan = 0x0, value = 1, isnull = false} --> current row, value is 1
(gdb) info b
Num     Type           Disp Enb Address            What
17      breakpoint     keep y   0x00000000007303b9 in ExecScanSubPlan at nodeSubplan.c:328
    breakpoint already hit 1760 times
(gdb) n
396            foreach(plst, subplan->paramIds)
(gdb) 
407            rowresult = ExecEvalExprSwitchContext(node->testexpr, econtext,
(gdb) 
410            if (subLinkType == ANY_SUBLINK)
(gdb) 
413                if (rownull)
(gdb) 
415                else if (DatumGetBool(rowresult))
(gdb) 
417                    result = BoolGetDatum(true);
(gdb) 
418                    *isNull = false;
(gdb) 
419                    break;            /* needn't look at any more rows */
(gdb) 
442        MemoryContextSwitchTo(oldcontext);
(gdb) 
444        if (subLinkType == ARRAY_SUBLINK)
(gdb) 
449        else if (!found)
(gdb) 
464        return result;
(gdb) 
(gdb) p result
$135 = 1 --> condition satisfied (a matching row was found)

DONE
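
The trace shows the scan ending as soon as the matching value arrives from the subplan, which is why the position of that value decides how much work is done. A minimal sketch of that effect follows. The helper names `rows_fetched`, `fill_match_last` and `fill_match_first` are hypothetical, a plain int array stands in for the subplan's output, and the row count is kept small.

```c
#include <assert.h>
#include <stddef.h>

/* Count how many rows are read before "key = ANY (ids)" can stop. */
static size_t
rows_fetched(const int *ids, size_t nrows, int key)
{
    size_t i;

    assert(ids != NULL || nrows == 0);
    for (i = 0; i < nrows; i++)
    {
        if (ids[i] == key)
            return i + 1;    /* match found: the scan ends here */
    }
    return nrows;            /* no match: every row was read */
}

/* Fill ids with 2..nrows followed by 1, so the match arrives last. */
static void
fill_match_last(int *ids, size_t nrows)
{
    size_t i;

    assert(ids != NULL && nrows >= 1);
    for (i = 0; i + 1 < nrows; i++)
        ids[i] = (int) i + 2;
    ids[nrows - 1] = 1;
}

/* Fill ids with 1..nrows, so the match arrives first. */
static void
fill_match_first(int *ids, size_t nrows)
{
    size_t i;

    assert(ids != NULL && nrows >= 1);
    for (i = 0; i < nrows; i++)
        ids[i] = (int) i + 1;
}
```

The same set of values therefore costs either one row or all rows, depending only on where the value 1 sits in the subplan's output order.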

IV. References

N/A

