![]() |
Bitcoin Core 30.99.0
P2P Digital Currency
|
Class to represent the internal state of the spanning-forest linearization (SFL) algorithm. More...
#include <cluster_linearize.h>
Classes | |
| struct | TxData |
| Structure with information about a single transaction. More... | |
Public Member Functions | |
| SpanningForestState (const DepGraph< SetType > &depgraph LIFETIMEBOUND, uint64_t rng_seed, const CostModel &cost=CostModel{}) noexcept | |
| Construct a spanning forest for the given DepGraph, with every transaction in its own chunk (not topological). More... | |
| void | LoadLinearization (std::span< const DepGraphIndex > old_linearization) noexcept |
| Load an existing linearization. More... | |
| void | MakeTopological () noexcept |
| Make state topological. More... | |
| void | StartOptimizing () noexcept |
| Initialize the data structure for optimization. More... | |
| bool | OptimizeStep () noexcept |
| Try to improve the forest. More... | |
| void | StartMinimizing () noexcept |
| Initialize data structure for minimizing the chunks. More... | |
| bool | MinimizeStep () noexcept |
| Try to reduce a chunk's size. More... | |
| std::vector< DepGraphIndex > | GetLinearization (const StrongComparator< DepGraphIndex > auto &fallback_order) noexcept |
| Construct a topologically-valid linearization from the current forest state. More... | |
| std::vector< FeeFrac > | GetDiagram () const noexcept |
| Get the diagram for the current state, which must be topological. More... | |
| uint64_t | GetCost () const noexcept |
| Determine how much work was performed so far. More... | |
| void | SanityCheck () const |
| Verify internal consistency of the data structure. More... | |
Private Types | |
| using | TxIdx = DepGraphIndex |
| Data type to represent indexing into m_tx_data. More... | |
| using | SetIdx = std::conditional_t<(SetType::Size()<=0xff), uint8_t, std::conditional_t<(SetType::Size()<=0xffff), uint16_t, uint32_t > > |
| Data type to represent indexing into m_set_info. More... | |
Private Member Functions | |
| TxIdx | PickRandomTx (const SetType &tx_idxs) noexcept |
| Pick a random transaction within a set (which must be non-empty). More... | |
| std::pair< SetType, SetType > | GetReachable (const SetType &tx_idxs) const noexcept |
| Find the set of out-of-chunk transactions reachable from tx_idxs, both in upwards and downwards direction. More... | |
| SetIdx | Activate (TxIdx parent_idx, TxIdx child_idx) noexcept |
| Make the inactive dependency from child to parent, which must not be in the same chunk already, active. More... | |
| std::pair< SetIdx, SetIdx > | Deactivate (TxIdx parent_idx, TxIdx child_idx) noexcept |
| Make a specified active dependency inactive. More... | |
| SetIdx | MergeChunks (SetIdx top_idx, SetIdx bottom_idx) noexcept |
| Activate a dependency from the bottom set to the top set, which must exist. More... | |
| template<bool DownWard> | |
| SetIdx | MergeChunksDirected (SetIdx chunk_idx, SetIdx merge_chunk_idx) noexcept |
| Activate a dependency from chunk_idx to merge_chunk_idx (if !DownWard), or a dependency from merge_chunk_idx to chunk_idx (if DownWard). More... | |
| template<bool DownWard> | |
| SetIdx | PickMergeCandidate (SetIdx chunk_idx) noexcept |
| Determine which chunk to merge chunk_idx with, or INVALID_SET_IDX if none. More... | |
| template<bool DownWard> | |
| SetIdx | MergeStep (SetIdx chunk_idx) noexcept |
| Perform an upward or downward merge step, on the specified chunk. More... | |
| template<bool DownWard> | |
| void | MergeSequence (SetIdx chunk_idx) noexcept |
| Perform an upward or downward merge sequence on the specified chunk. More... | |
| void | Improve (TxIdx parent_idx, TxIdx child_idx) noexcept |
| Split a chunk, and then merge the resulting two chunks to make the graph topological again. More... | |
| SetIdx | PickChunkToOptimize () noexcept |
| Determine the next chunk to optimize, or INVALID_SET_IDX if none. More... | |
| std::pair< TxIdx, TxIdx > | PickDependencyToSplit (SetIdx chunk_idx) noexcept |
| Find a (parent, child) dependency to deactivate in chunk_idx, or (-1, -1) if none. More... | |
Private Attributes | |
| InsecureRandomContext | m_rng |
| Internal RNG. More... | |
| SetType | m_transaction_idxs |
| The set of all TxIdx's of transactions in the cluster indexing into m_tx_data. More... | |
| SetType | m_chunk_idxs |
| The set of all chunk SetIdx's. More... | |
| SetType | m_suboptimal_idxs |
| The set of all SetIdx's that appear in m_suboptimal_chunks. More... | |
| std::vector< TxData > | m_tx_data |
| Information about each transaction (and chunks). More... | |
| std::vector< SetInfo< SetType > > | m_set_info |
| Information about each set (chunk, or active dependency top set). More... | |
| std::vector< std::pair< SetType, SetType > > | m_reachable |
| For each chunk, indexed by SetIdx, the set of out-of-chunk reachable transactions, in the upwards (.first) and downwards (.second) direction. More... | |
| VecDeque< SetIdx > | m_suboptimal_chunks |
| A FIFO of chunk SetIdxs for chunks that may be improved still. More... | |
| VecDeque< std::tuple< SetIdx, TxIdx, unsigned > > | m_nonminimal_chunks |
| A FIFO of chunk indexes with a pivot transaction in them, and a flag to indicate their status: More... | |
| const DepGraph< SetType > & | m_depgraph |
| The DepGraph we are trying to linearize. More... | |
| CostModel | m_cost |
| Accounting for the cost of this computation. More... | |
Static Private Attributes | |
| static constexpr SetIdx | INVALID_SET_IDX = SetIdx(-1) |
| An invalid SetIdx. More... | |
Class to represent the internal state of the spanning-forest linearization (SFL) algorithm.
At all times, each dependency is marked as either "active" or "inactive". The subset of active dependencies is the state of the SFL algorithm. The implementation maintains several other values to speed up operations, but everything is ultimately a function of what that subset of active dependencies is.
Given such a subset, define a chunk as the set of transactions that are connected through active dependencies (ignoring their parent/child direction). Thus, every state implies a particular partitioning of the graph into chunks (including potential singletons). In the extreme, each transaction may be in its own chunk, or in the other extreme all transactions may form a single chunk. A chunk's feerate is its total fee divided by its total size.
The algorithm consists of switching dependencies between active and inactive. The final linearization that is produced at the end consists of these chunks, sorted from high to low feerate, each individually sorted in an arbitrary but topological (= no child before parent) way.
We define four quality properties the state can have:
acyclic: The state is acyclic whenever no cycle of active dependencies exists within the graph, ignoring the parent/child direction. This is equivalent to saying that within each chunk the set of active dependencies form a tree, and thus the overall set of active dependencies in the graph form a spanning forest, giving the algorithm its name. Being acyclic is also equivalent to every chunk of N transactions having exactly N-1 active dependencies.
For example in a diamond graph, D->{B,C}->A, the 4 dependencies cannot be simultaneously active. If at least one is inactive, the state is acyclic.
The algorithm maintains an acyclic state at all times as an invariant. This implies that activating a dependency always corresponds to merging two chunks, and that deactivating one always corresponds to splitting two chunks.
topological: We say the state is topological whenever it is acyclic and no inactive dependency exists between two distinct chunks such that the child chunk has higher or equal feerate than the parent chunk.
The relevance is that whenever the state is topological, the produced output linearization will be topological too (i.e., not have children before parents). Note that the "or equal" part of the definition matters: if not, one can end up in a situation with mutually-dependent equal-feerate chunks that cannot be linearized. For example C->{A,B} and D->{A,B}, with C->A and D->B active. The AC chunk depends on DB through C->B, and the BD chunk depends on AC through D->A. Merging them into a single ABCD chunk fixes this.
The algorithm attempts to keep the state topological as much as possible, so it can be interrupted to produce an output whenever, but will sometimes need to temporarily deviate from it when improving the state.
optimal: For every active dependency, define its top and bottom set as the set of transactions in the chunks that would result if the dependency were deactivated; the top being the one with the dependency's parent, and the bottom being the one with the child. Note that due to acyclicity, every deactivation splits a chunk exactly in two.
We say the state is optimal whenever it is topological and it has no active dependency whose top feerate is strictly higher than its bottom feerate. The relevance is that it can be proven that whenever the state is optimal, the produced linearization will also be optimal (in the convexified feerate diagram sense). It can also be proven that for every graph at least one optimal state exists.
Note that it is possible for the SFL state to not be optimal, but the produced linearization to still be optimal. This happens when the chunks of a state are identical to those of an optimal state, but the exact set of active dependencies within a chunk differ in such a way that the state optimality condition is not satisfied. Thus, the state being optimal is more a "the eventual output is *known* to be optimal".
minimal: We say the state is minimal when it is:
A minimal state effectively corresponds to an optimal state, where every chunk has been split into its minimal equal-feerate components.
The algorithm terminates whenever a minimal state is reached.
This leads to the following high-level algorithm:
When merging, we always either:
Using these strategies in the improvement loop above guarantees that the output linearization after a deactivate + merge step is never worse or incomparable (in the convexified feerate diagram sense) than the output linearization that would be produced before the step. With that, we can refine the high-level algorithm to:
Instead of performing merges arbitrarily to make the initial state topological, it is possible to do so guided by an existing linearization. This has the advantage that the state's would-be output linearization is immediately as good as the existing linearization it was based on:
After reaching an optimal state, it can be transformed into a minimal state by attempting to split chunks further into equal-feerate parts. To do so, pick a specific transaction in each chunk (the pivot), and rerun the above split-then-merge procedure again:
What remains to be specified are a number of heuristics:
Definition at line 718 of file cluster_linearize.h.
|
private |
Data type to represent indexing into m_set_info.
Use the smallest type possible to improve cache locality.
Definition at line 728 of file cluster_linearize.h.
|
private |
Data type to represent indexing into m_tx_data.
Definition at line 725 of file cluster_linearize.h.
|
inlineexplicitnoexcept |
Construct a spanning forest for the given DepGraph, with every transaction in its own chunk (not topological).
Definition at line 1172 of file cluster_linearize.h.
|
inlineprivatenoexcept |
Make the inactive dependency from child to parent, which must not be in the same chunk already, active.
Returns the merged chunk idx.
Definition at line 813 of file cluster_linearize.h.
|
inlineprivatenoexcept |
Make a specified active dependency inactive.
Returns the created parent and child chunk indexes.
Definition at line 886 of file cluster_linearize.h.
|
inlinenoexcept |
Determine how much work was performed so far.
Definition at line 1626 of file cluster_linearize.h.
|
inlinenoexcept |
Get the diagram for the current state, which must be topological.
Test-only.
The linearization produced by GetLinearization() is always at least as good (in the CompareChunks() sense) as this diagram, but may be better.
After an OptimizeStep(), the diagram will always be at least as good as before. Once OptimizeStep() returns false, the diagram will be equivalent to that produced by GetLinearization(), and optimal.
After a MinimizeStep(), the diagram cannot change anymore (in the CompareChunks() sense), but its number of segments can increase still. Once MinimizeStep() returns false, the number of chunks of the produced linearization will match the number of segments in the diagram.
Definition at line 1615 of file cluster_linearize.h.
|
inlinenoexcept |
Construct a topologically-valid linearization from the current forest state.
Must be topological. fallback_order is a comparator that defines a strong order for DepGraphIndexes in this cluster, used to order equal-feerate transactions and chunks.
Specifically, the resulting order consists of:
The output linearization.
A heap with all chunks (by set index) that can currently be included, sorted by chunk feerate (high to low), chunk size (small to large), and by least maximum element according to the fallback order (which is the second pair element).
For every chunk, indexed by SetIdx, the number of unmet dependencies the chunk has on other chunks (not including dependencies within the chunk itself).
For every transaction, indexed by TxIdx, the number of unmet dependencies the transaction has.
A heap with all transactions within the current chunk that can be included, sorted by tx feerate (high to low), tx size (small to large), and fallback order.
Function to compute the highest element of a chunk, by fallback_order.
Comparison function for the transaction heap. Note that it is a max-heap, so tx_cmp_fn(a, b) == true means "a appears after b in the linearization".
Comparison function for the chunk heap. Note that it is a max-heap, so chunk_cmp_fn(a, b) == true means "a appears after b in the linearization".
Definition at line 1460 of file cluster_linearize.h.
|
inlineprivatenoexcept |
Find the set of out-of-chunk transactions reachable from tx_idxs, both in upwards and downwards direction.
Only used by SanityCheck to verify the precomputed reachable sets in m_reachable that are maintained by Activate/Deactivate.
Definition at line 800 of file cluster_linearize.h.
|
inlineprivatenoexcept |
Split a chunk, and then merge the resulting two chunks to make the graph topological again.
Definition at line 1077 of file cluster_linearize.h.
|
inlinenoexcept |
Load an existing linearization.
Must be called immediately after constructor. The result is topological if the linearization is valid. Otherwise, MakeTopological still needs to be called.
Definition at line 1210 of file cluster_linearize.h.
|
inlinenoexcept |
Make state topological.
Can be called after constructing, or after LoadLinearization.
What direction to initially merge chunks in; one of the two directions is enough. This is sufficient because if a non-topological inactive dependency exists between two chunks, at least one of the two chunks will eventually be processed in a direction that discovers it - either the lower chunk tries upward, or the upper chunk tries downward. Chunks that are the result of the merging are always tried in both directions.
Which chunks are the result of merging, and thus need merge attempts in both directions.
What direction(s) to attempt merging in. 1=up, 2=down, 3=both.
Definition at line 1224 of file cluster_linearize.h.
|
inlineprivatenoexcept |
Activate a dependency from the bottom set to the top set, which must exist.
Return the index of the merged chunk.
Definition at line 945 of file cluster_linearize.h.
|
inlineprivatenoexcept |
Activate a dependency from chunk_idx to merge_chunk_idx (if !DownWard), or a dependency from merge_chunk_idx to chunk_idx (if DownWard).
Return the index of the merged chunk.
Definition at line 988 of file cluster_linearize.h.
|
inlineprivatenoexcept |
Perform an upward or downward merge sequence on the specified chunk.
Definition at line 1060 of file cluster_linearize.h.
|
inlineprivatenoexcept |
Perform an upward or downward merge step, on the specified chunk.
Returns the merged chunk, or INVALID_SET_IDX if no merge took place.
Definition at line 1049 of file cluster_linearize.h.
|
inlinenoexcept |
Try to reduce a chunk's size.
Returns false if all chunks are minimal, true otherwise.
Whether to move the pivot down rather than up.
Whether this is already the second stage.
Definition at line 1352 of file cluster_linearize.h.
|
inlinenoexcept |
Try to improve the forest.
Returns false if it is optimal, true otherwise.
Definition at line 1312 of file cluster_linearize.h.
|
inlineprivatenoexcept |
Determine the next chunk to optimize, or INVALID_SET_IDX if none.
Definition at line 1112 of file cluster_linearize.h.
|
inlineprivatenoexcept |
Find a (parent, child) dependency to deactivate in chunk_idx, or (-1, -1) if none.
Definition at line 1136 of file cluster_linearize.h.
|
inlineprivatenoexcept |
Determine which chunk to merge chunk_idx with, or INVALID_SET_IDX if none.
Information about the chunk.
The minimum feerate (if downward) or maximum feerate (if upward) to consider when looking for candidate chunks to merge with. Initially, this is the original chunk's feerate, but is updated to be the current best candidate whenever one is found.
The chunk index for the best candidate chunk to merge with. INVALID_SET_IDX if none.
We generate random tiebreak values to pick between equal-feerate candidate chunks. This variable stores the tiebreak of the current best candidate.
Which parent/child transactions we still need to process the chunks for.
Definition at line 999 of file cluster_linearize.h.
|
inlineprivatenoexcept |
Pick a random transaction within a set (which must be non-empty).
Definition at line 785 of file cluster_linearize.h.
|
inline |
Verify internal consistency of the data structure.
Definition at line 1629 of file cluster_linearize.h.
|
inlinenoexcept |
Initialize data structure for minimizing the chunks.
Can only be called if state is known to be optimal. OptimizeStep() cannot be called anymore afterwards.
Definition at line 1332 of file cluster_linearize.h.
|
inlinenoexcept |
Initialize the data structure for optimization.
It must be topological already.
Definition at line 1294 of file cluster_linearize.h.
|
staticconstexprprivate |
An invalid SetIdx.
Definition at line 734 of file cluster_linearize.h.
|
private |
The set of all chunk SetIdx's.
This excludes the SetIdxs that refer to active dependencies' tops.
Definition at line 755 of file cluster_linearize.h.
|
private |
Accounting for the cost of this computation.
Definition at line 782 of file cluster_linearize.h.
|
private |
The DepGraph we are trying to linearize.
Definition at line 779 of file cluster_linearize.h.
|
private |
A FIFO of chunk indexes with a pivot transaction in them, and a flag to indicate their status:
Definition at line 776 of file cluster_linearize.h.
|
private |
For each chunk, indexed by SetIdx, the set of out-of-chunk reachable transactions, in the upwards (.first) and downwards (.second) direction.
Definition at line 767 of file cluster_linearize.h.
|
private |
Internal RNG.
Definition at line 722 of file cluster_linearize.h.
|
private |
Information about each set (chunk, or active dependency top set).
Indexed by SetIdx.
Definition at line 764 of file cluster_linearize.h.
|
private |
A FIFO of chunk SetIdxs for chunks that may be improved still.
Definition at line 769 of file cluster_linearize.h.
|
private |
The set of all SetIdx's that appear in m_suboptimal_chunks.
Note that they do not need to be chunks: some of these sets may have been converted to a dependency's top set since being added to m_suboptimal_chunks.
Definition at line 759 of file cluster_linearize.h.
|
private |
The set of all TxIdx's of transactions in the cluster indexing into m_tx_data.
Definition at line 752 of file cluster_linearize.h.
|
private |
Information about each transaction (and chunks).
Keeps the "holes" from DepGraph during construction. Indexed by TxIdx.
Definition at line 762 of file cluster_linearize.h.