Scaled Fenwick Trees
This is the author's version which has not been fully edited and
content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2023.3299352
Date of publication xxxx 00, 0000, date of current version xxxx 00, 0000.
Digital Object Identifier 10.1109/[Link]
ABSTRACT A novel data structure that enables the storage and retrieval of linear array numeric
data with logarithmic time complexity updates, range sums, and rescaling is introduced and studied.
Computing sums of ranges of arrays of numbers is a common computational problem encountered in
data compression, coding, machine learning, computational vision, and finance, among other fields.
Efficient data structures enabling O(log n) updates of the underlying data (including range updates),
queries of sums over ranges, and searches for ranges with a given sum have been extensively
studied (n being the length of the array). Two solutions to this problem are well-known: Fenwick
trees (also known as Binary Indexed Trees) and Segment Trees. The new data structure extends the
capabilities for the first time to further enable multiplying (rescaling) ranges of the underlying data
by a scalar, also in O(log n) time. Scaling by 0 can be enabled, with the effect that subsequent updates
may take O((log n)^2) time. The new data structure introduced here consists of a pair of interacting
Fenwick tree-like structures, one of which holds the unscaled values and the other of which holds
the scalars. Experimental results demonstrating performance improvements for the multiplication
operation on arrays from a few dozen to over 30 million data points are discussed. This research was
done as part of Ajna Labs in the course of developing a decentralized finance protocol. It enables
an efficient on-chain encoding and processing of an order book-like data structure used to manage
lending, interest, and collateral.
INDEX TERMS Cumulative Sums, Fenwick Trees, Partial Sums, Prefix Sums, Segment Trees
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see [Link]
This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and
content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2023.3299352
product of entries encountered in the scale factor array as you traverse from the node towards the root in a particular manner, is equal to the sum of scaled values in a particular range. Scaling by zero can be enabled as well, at the cost of the update operation becoming $O((\log n)^2)$.

This research was done as part of Ajna Labs in the course of developing a decentralized finance protocol. In this protocol, lenders deposit tokens in an order-book-like structure indexed by price. Computing the amount of deposit above a given price, or finding the price above which a given amount of deposit sits, are both key problems. Furthermore, deposits earn interest, but only if they are priced above a certain level, which was the motivation for the rescaling operation.

II. SYMBOLS AND ABBREVIATIONS
For convenience, Table 2 contains a reference list of commonly used symbols and abbreviations in the text. In all cases, they are also defined or described when introduced.

TABLE 2. Table of Terms and Definitions

| Term | Definition |
|---|---|
| $n$ | Array length |
| $a$ | Sequence of underlying data to be stored, processed and queried |
| $\lambda_i$ | The place of the leftmost nonzero bit of integer $i$ |
| $\rho_i$ | The place of the rightmost nonzero bit of integer $i$ |
| FR(i) | The range of values included in the sum stored in index $i$ of a Fenwick tree: $i - 2^{\rho_i} + 1,\ i - 2^{\rho_i} + 2,\ \ldots,\ i - 1,\ i$ |
| V[i] | The values array of a Scaled Fenwick Tree |
| S[i] | The scaling array of a Scaled Fenwick Tree |
| upd(j) | The next index to visit when updating a classic Fenwick Tree; $\mathrm{upd}(j) = j + 2^{\rho_j}$ |
| int(j) | The next index to visit when querying a classic Fenwick Tree; $\mathrm{int}(j) = j - 2^{\rho_j}$ |
| Upd(j) | The set of iterates of upd applied to $j$ |
| scale(i) | The total scaling applied to the value at index $i$ of a Scaled Fenwick Tree, equal to $\prod_{j \in \mathrm{Upd}(i)} S[j]$ |

III. REVIEW OF FENWICK TREES
The following discussion is influenced by Section 4 of [13], which has a detailed discussion of the arithmetic relationships between indices that form the basis for Fenwick trees. As Marchini and Vigna discuss, the term Fenwick “tree” is a misnomer, as there is no single tree-like structure relating the indices to one another. Instead, there are three distinct iteration patterns that are used to increment, query, and search through a Fenwick tree. Below is an overview of Fenwick trees to fix the notation.

All arrays and sequences begin with index 1. This is standard in the Fenwick tree literature, as the index calculations become simpler to express in standard bit arithmetic.

For an integer $i$, define $\lambda_i$ to be the place of the leftmost (most significant) 1 in the binary expansion of $i$, and let $\rho_i$ be the place of its rightmost (least significant) 1, with places counted from 0 at the least significant bit. For example, 44 is 101100 in binary, so $\lambda_{44} = 5$ and $\rho_{44} = 2$.

Let a sequence $a_i$, the underlying data to be stored, be encoded as a Fenwick tree V[i]. The principle of the Fenwick tree is to store at index $i$ in the array the sum of the values from just after the index obtained by clearing the least significant bit of $i$, up to and including index $i$. Define FR(i) (the Fenwick Range of $i$) to be these indices:

$$\mathrm{FR}(i) = \{\, i - 2^{\rho_i} + 1,\ i - 2^{\rho_i} + 2,\ \ldots,\ i - 1,\ i \,\} \tag{1}$$

$$V[i] = a_{i - 2^{\rho_i} + 1} + a_{i - 2^{\rho_i} + 2} + \cdots + a_i \tag{2}$$

For example, if $i$ is odd, $\mathrm{FR}(i) = \{i\}$ and $V[i] = a_i$. If $i$ is a power of 2, then $\mathrm{FR}(i) = \{1, 2, \ldots, i\}$ and V[i] is the entire prefix sum up to and including $a_i$. If $i \equiv 2 \pmod 4$ then $\mathrm{FR}(i) = \{i - 1, i\}$ and $V[i] = a_{i-1} + a_i$. The following facts are easily verified and are the key observations explaining how Fenwick trees work:

FT.1 $i \in \mathrm{FR}(i)$.
FT.2 For all $i, j$, $i \neq j$ implies $\mathrm{FR}(i) \neq \mathrm{FR}(j)$. Also, either $\mathrm{FR}(i) \subset \mathrm{FR}(j)$, or $\mathrm{FR}(j) \subset \mathrm{FR}(i)$, or $\mathrm{FR}(i) \cap \mathrm{FR}(j) = \emptyset$.
FT.3 $j \in \mathrm{FR}(i)$ if and only if $i$ can be obtained from $j$ by iterating the update function $\mathrm{upd}(j) := j + 2^{\rho_j}$.
FT.4 Let the interrogation function int(j) be the integer obtained by clearing the least significant bit of $j$'s binary expansion: $\mathrm{int}(j) := j - 2^{\rho_j}$. The set of positive integers up to and including $i$ is partitioned into the sets FR(j) where $j$ is obtained by iterating int starting at $i$ and ceasing once obtaining 0.

The functions upd and int were introduced in [13] to streamline the discussion of the procedures to update and interrogate Fenwick trees. In order to increment an underlying value $a_i$ stored in a Fenwick tree (an “update” call) while preserving the invariant (2), one can use property FT.3. Increment the value V[j] stored in location $j$ of the Fenwick tree itself, for $j$ being any iterate of the update function upd starting at $i$. Let Upd(i) be the set of these indices obtained by iterating upd on $i$. There are only $O(\log n)$ such indices not exceeding $n$, hence the iteration finishes in logarithmic time. Figure 1 illustrates an example of this. The indices are listed in the bottom row of boxes, and the raw underlying data $a_i$ in the row of boxes above that. The Fenwick tree data itself is shown above, with single solid arrows showing the upd function and dashed arrows the int function. The double solid arrows show the path that the update algorithm would traverse in order to increment the value stored at index 5.

Similarly, using FT.4, one obtains the prefix sum of the underlying array by summing V[j] for $j$ being any iterate of int applied to $i$. There are $O(\log i)$ such nonzero iterates. An example is given in Figure 2.
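The notation of this section can be made concrete with a short Python sketch of a classic (unscaled) Fenwick tree. It is an illustration only, not the reference implementation of [14]: the helper names `lam`, `rho`, `interrogate` (standing in for int, which is a Python builtin), `fenwick_range`, `build`, `update` and `prefix_sum` are chosen here for readability, and index 0 of each list is left unused so that indexing starts at 1 as in the text.

```python
def lam(i):
    """Place of the most significant set bit of i, counted from 0."""
    return i.bit_length() - 1

def rho(i):
    """Place of the least significant set bit of i, counted from 0 (i > 0)."""
    return (i & -i).bit_length() - 1

def upd(j):
    """Next index to visit when updating: j + 2**rho(j)."""
    return j + (1 << rho(j))

def interrogate(j):
    """Next index to visit when querying: j - 2**rho(j)."""
    return j - (1 << rho(j))

def fenwick_range(i):
    """FR(i): the indices whose values are summed into V[i], as in (1)."""
    return list(range(i - (1 << rho(i)) + 1, i + 1))

def update(V, i, delta):
    """Add delta to a[i] by incrementing V[j] for every j in Upd(i), i included."""
    n = len(V) - 1
    while i <= n:
        V[i] += delta
        i = upd(i)

def prefix_sum(V, i):
    """Sum a[1..i] by adding V[j] over the nonzero iterates of int from i."""
    total = 0
    while i > 0:
        total += V[i]
        i = interrogate(i)
    return total

def build(a):
    """Build V from a (a[0] unused) with n updates, O(n log n) overall."""
    V = [0] * len(a)
    for i in range(1, len(a)):
        update(V, i, a[i])
    return V

a = [0, 3, 1, 4, 1, 5, 9, 2, 6]   # a[0] is a placeholder
V = build(a)
update(V, 5, 10)                   # visits indices 5, 6, 8
a[5] += 10
print(lam(44), rho(44))            # 5 2
print(fenwick_range(6))            # [5, 6]
print(prefix_sum(V, 6) == sum(a[1:7]))
```

Here `update` visits exactly the indices in Upd(i) and `prefix_sum` visits exactly the partition given by FT.4, so each touches a logarithmic number of entries.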
IV. SCALING FENWICK TREES: NONZERO SCALARS
We now move on to the main result of this paper: enabling efficient rescaling of ranges of the underlying data as well. For example, suppose the elements in the underlying array $a_i$ correspond to some statistical observations that fall in particular buckets indexed by $i$. In order to translate the observations into a probability distribution, one would need to rescale the array by the reciprocal of the sum of the entire array to ensure that the sum of values is 1.

The most naive algorithm to do this for data represented in a Fenwick tree would be of order $n \log n$, as follows. Let $f$ be the factor by which the user wants to rescale every element $a_i$. They could iterate through the array, adding $(f - 1) \cdot a_i$ to the element $a_i$ for $i = 1, \ldots, n$. Since the query and update operations are each $O(\log n)$ and they would need to do this $n$ times, the entire process would be $O(n \log n)$. An obvious improvement would be to just iterate directly on the underlying tree itself, scaling each element V[i] by the factor $f$, in $O(n)$ time.

Starting with any data structure for representing arrays $a_i$ that supports range sums and updates, one can augment the data structure with a global scalar $s$, which is interpreted as “all elements of array $a_i$ are scaled by $s$”. This would enable global rescaling even more simply and efficiently in $O(1)$ time. To compute the sum over any desired range, one simply scales the sum by $s$ (relying on the distributive property). To increment or update $a_i$ by a value $z$, call the increment/update function on the underlying data structure with $s^{-1} \cdot z$, which works fine
increment the value V[i] by is unknown, because it has been scaled by scale(index), which is the product of all the S[j] as j traverses the path from index to the root along the update path. One could compute this explicitly at the outset, but this would require a redundant traversal. Instead, reversing direction and traversing downwards from the root to index avoids this redundancy. Accumulate the scale factors in runningScale along the traversal. Because the value v is added to the value at index i, which is included in the partial sum scaled by runningScale at location ii + j in the code below, when incrementing the value array, it is necessary to divide by runningScale first.

    def increment(values, scales, index, v):
        # start at the root and walk down towards index
        j = 1 << maxNumberOfBitsInIndex
        ii = 0
        runningScale = 1
        while j > 0:
            if (index - 1) & j:
                # node ii+j does not cover index: step past it
                ii += j
            else:
                # node ii+j covers index: fold in its scale, then add the unscaled increment
                runningScale *= scales[ii + j]
                values[ii + j] += v / runningScale
            j = j >> 1
        return (values, scales)

Listing 2. Increment

Now consider the algorithm to scale a prefix range of values itself. To scale every entry up to index by a number factor, one could partition {1, . . . , index} into subranges as in FT.4. Each one of these subranges can be implicitly scaled by applying the scaling factor to the appropriate entry in the scaling array S[j]. This works well for maintaining invariant (4) for index itself, but alone would cause the resulting data to violate the same invariant (4) for other indices that overlap but aren't contained in {1, . . . , index}. For example, rescaling the values up to index 5 by only changing the scale factor stored in entries 4 and 5 is insufficient: subsequent queries for the sum up to 6 would still reflect the unscaled value at index 5, as this is stored as part of the sum in index 6. The correct algorithm needs to adjust the values stored in these overlapping indices as well.

This can be done by not only traversing upwards through the indices by flipping successive least significant bits to 0 in the binary expansion of index as done in increment, but by also including intermediate indices that have a single 0 flipped to a 1. The code mult below does this by starting with j as the least significant bit of index and iteratively shifting it left. The variable runningSum stores the total increase in sum below index in the tree at each loop. If index has the same bit set to 1, execute the “if” part (lines 6-8), which scales the subtree below index and accumulates in runningSum how much it was incremented. Also flip the bit of index to 0 as in increment. If the corresponding bit in index is set to 0, the index is in the overlapping interval case similar to index 6 in the example above. Then increment the corresponding value array element by runningSum, and update runningSum itself by the corresponding scale factor so that it remains accurate further up the tree.

Figure 4 shows an example of this operating on the SFT presented earlier. The red boxes and blue boxes are the nodes visited when multiplying the 9th entry by 3. Red boxes correspond to the “if” clause, while blue boxes correspond to the “else” clause.

    1 def mult(values, scales, i, factor):
    2     runningSum=0
    3     j=i&(-i)
    4     while j<=maxIndex:
    5         if(i&j):
    6             runningSum+=(factor-1)*scales[i]
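The two traversals described in this section can be exercised end to end with a small self-contained Python sketch. Several parts are assumptions rather than the author's code: `increment` follows Listing 2, but the bodies of both branches in `scale_prefix` are filled in from the prose description of mult, and `prefix_sum` is a hypothetical query written by analogy with that traversal rather than taken from the text. The sketch also assumes the convention implied by Listing 2, namely that scale(k) times V[k] equals the sum of the underlying data over FR(k). The class name, the method names and the power-of-two size are illustrative choices, and only nonzero factors are supported.

```python
class ScaledFenwick:
    """Sketch of a Scaled Fenwick Tree on indices 1..2**bits (nonzero factors only)."""

    def __init__(self, bits):
        self.size = 1 << bits                   # index of the root node
        self.values = [0.0] * (self.size + 1)   # V[1..size]; slot 0 unused
        self.scales = [1.0] * (self.size + 1)   # S[1..size]; slot 0 unused

    def increment(self, index, v):
        """Add v to a[index], descending from the root as in Listing 2."""
        j, ii, running_scale = self.size, 0, 1.0
        while j > 0:
            if (index - 1) & j:
                ii += j                          # node ii+j ends before index: skip it
            else:
                running_scale *= self.scales[ii + j]
                self.values[ii + j] += v / running_scale
            j >>= 1

    def scale_prefix(self, i, factor):
        """Multiply a[1..i] by factor (assumed completion of mult)."""
        running_sum = 0.0
        j = i & (-i)
        while 0 < j <= self.size:
            if i & j:
                # node i lies inside the prefix: record its increase, then rescale it
                running_sum += (factor - 1) * self.scales[i] * self.values[i]
                self.scales[i] *= factor
                i -= j
            else:
                # node i+j overlaps the prefix: absorb the increase, re-express it one level up
                self.values[i + j] += running_sum
                running_sum *= self.scales[i + j]
            j <<= 1

    def prefix_sum(self, i):
        """Sum a[1..i]; hypothetical query using the same upward traversal."""
        total = 0.0
        j = i & (-i)
        while 0 < j <= self.size:
            if i & j:
                total += self.scales[i] * self.values[i]
                i -= j
            else:
                total *= self.scales[i + j]
            j <<= 1
        return total


# Usage: compare against a plain list holding the same data (naive[0] unused).
naive = [0.0, 3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
tree = ScaledFenwick(3)
for k in range(1, 9):
    tree.increment(k, naive[k])

tree.scale_prefix(5, 2.0)            # a[1..5] *= 2
for k in range(1, 6):
    naive[k] *= 2.0
tree.increment(3, 10.0)              # a[3] += 10
naive[3] += 10.0

for i in range(9):
    assert abs(tree.prefix_sum(i) - sum(naive[1:i + 1])) < 1e-9
```

Each of the three methods visits one node per bit of the index, so all run in logarithmic time. Scaling the prefix up to index 5 touches the scale entries at 5 and 4 and the value entries at 6 and 8, matching the overlapping-interval discussion above.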
which will preserve criterion (5).

Below is pseudo-code for the obliterate operation. Lines 2-6 below compute the difference between the left-hand side and right-hand side of (5). Lines 7-11 then apply this difference to the left-hand side of (5) and propagate the difference up the tree.

    1 def obliterate(values, scales, i):
    2     j=1
    3     runningSum=-values[i]
    4     while j&i==0:
    5         runningSum+=scales[i-j]*values[i-j]

The function getScale computes scale(i) by accumulating the product of entries in S traversing the int tree from i to the root.

Similar tricks are at play for mult. Recall from the discussion of mult above that, in addition to applying the new scale factor to various elements of the scaling array S, specific overlapping values of the values array V also need to be updated. Line 4 below computes that delta, and the while loop starting on line 9 applies it consistently to the overlapping intervals.
TABLE 3. Mean Execution Time Comparison for Naive Versus Baseline Fenwick Versus Scaled Fenwick
TABLE 4. Maximum Execution Time Comparison for Naive Versus Baseline Fenwick Versus Scaled Fenwick
FIGURE 5. Plot of Log Average Execution Time (ms) Versus log2 (n)
FIGURE 6. Plot of Log Maximum Execution Time (ms) Versus log2 (n)
particular problem in the management of a database of loans and lenders for a blockchain-based decentralized finance application. Similarly structured problems present themselves in coding and compression, data analysis, filtering and sorting, and other areas, however, so this research may find application well beyond its original motivation. Experimental results show that this algorithm enables sub-second updates, range sums and range rescalings for linear numerical array data of tens of millions of data points on common desktop consumer hardware.

Scaled Fenwick Trees do come with some drawbacks. There is space redundancy in the form of an additional scaling array, so that twice the memory usage is necessary to hold the same number of data points as compared to either a straightforward naive array or classical Fenwick tree. As with classical Fenwick Trees or Segment Trees, updating an array value requires time logarithmic in the array length, rather than the constant time of a plain array. Finally, compared to a classical Fenwick tree, both updates and range sums require additional computation to incorporate the values in the scaling array, so that while both methods are of logarithmic time complexity, the constants are worse for the Scaled Fenwick Tree.

Overall, for efficient implementation of all three operations (updates, range sums and range rescaling of linear array data), Scaled Fenwick Trees offer significant advantages with reasonable offsetting disadvantages. In applications that require frequent rescalings in particular, Scaled Fenwick Trees can be a good choice of data structure to store and process data.

VIII. ACKNOWLEDGEMENT
The author would like to acknowledge valuable conversations with Shiva Chaudhuri, Mike Hatheway, George Niculae, Ed Noepel, and Sebastiano Vigna.

REFERENCES
[1] Peter M. Fenwick. A new data structure for cumulative frequency tables. Software: Practice and Experience, 24(3):327–336, 1994.
[2] Boris Ryabko. A fast on-line code. Soviet Math. Dokl., 39(3):533–537, 1989.
[3] B. Y. Ryabko. A fast on-line adaptive code. IEEE Transactions on Information Theory, 38(4):1400–1404, 1992.
[4] Guy E. Blelloch. Prefix sums and their applications. In J. H. Reif, editor, Synthesis of Parallel Algorithms, 1990.
[5] Philip Bille, Anders Roy Christiansen, Patrick Hagge Cording, Inge Li Gørtz, Frederik Rye Skjoldjensen, Hjalte Wedel Vildhøj, and Søren Vind. Dynamic relative compression, dynamic partial sums, and substring concatenation. Algorithmica, 80(11):3207–3224, 2018.
[6] Mihai Pătraşcu and Erik D. Demaine. Lower bounds for dynamic connectivity. In Proceedings of the Thirty-Sixth Annual ACM Symposium on Theory of Computing, STOC 2004, pages 546–553, New York, NY, USA, 2004. Association for Computing Machinery.
[7] Ernst W. Mayr, Gunther Schmidt, and Gottfried Tinhofer, editors. Prefix graphs and their applications, Berlin, Heidelberg, 1995. Springer Berlin Heidelberg.
[8] Giulio Ermanno Pibiri and Rossano Venturini. Practical trade-offs for the prefix-sum problem. Software: Practice and Experience, 51, 10 2020.
[9] Christian Reinbold and Rüdiger Westermann. Parameterized splitting of summed volume tables. Computer Graphics Forum, 40(3):123–134, 2021.
[10] Jens Schneider and Peter Rautek. A versatile and efficient GPU data structure for spatial indexing. IEEE Transactions on Visualization and Computer Graphics, 23(1):911–920, 2017.
[11] Yining Wang, Yi Wu, and Simon S. Du. Near-linear time local polynomial nonparametric estimation with box kernels. INFORMS Journal on Computing, 33(4):1339–1353, 2021.
[12] Pushkar Mishra. On updating and querying sub-arrays of multidimensional arrays. CoRR, abs/1311.6093, 2013.
[13] Stefano Marchini and Sebastiano Vigna. Compact Fenwick trees for dynamic ranking and selection. Software: Practice and Experience, 50, 01 2020.
[14] Matt Cushman. Scaled Fenwick tree reference implementation, validation, time benchmarks and comparisons. [Link] 4 2023.

MATTHEW CUSHMAN received the B.S. degree in mathematics and logic and computation, and the M.S. degree in mathematics, from Carnegie Mellon University in Pittsburgh, PA, and the Ph.D. degree in mathematics from the University of Chicago. He was Managing Director at Knight Capital Group from 2002 to 2011, Senior Managing Director at Citadel Securities from 2011 to 2013, and a co-founder of Engineers Gate in 2014. He left Engineers Gate in 2017 to found Etale, Inc., a trading software firm that was acquired by NYDIG in 2020. From January 2022 to the present, he has been a co-founder of Ajna Labs, where he works on smart contract protocol design and implementation.