Geometry Nodes: How to Read Last Digit of a Big Integer?

Question

I have made the following setup:

As you can see, the assertion fails. The Custom Group is what you would expect:

(╯°□°)╯︵ ┻━┻

Some approaches I tried and failed

I figured you can calculate the biggest possible error you got upon conversion to float, by using an expression:

$$\operatorname{floor}\left(2^{\log_{2}n-24}\right)$$

floor(2**(log(number, 2) - 24))

or just much simpler 😜

$$\operatorname{floor}\left({n \over 2^{24}}\right)$$

n // 2**24

And at the end of the integer range $2147483647$ that error is no bigger than $128$. For simplicity I just assumed this maximum regardless of the input, because a loop of 128 iterations is no challenge for a computer (sure, should be optimized when dealing with many numbers).

So the idea is: just after converting the integer to float (add zero to the int) I can convert it back to an integer and see if it's smaller or bigger than the original integer (or equal if I'm lucky, but this case is simple). Then I can add/subtract the calculated error (or assumed $128$) and end up with two float values, one lower than the original integer, one bigger.

First approach was to spawn a curve with two points, and capture on them an integer attribute: the lower bound one the first point (start of curve), and the upper bound on the second point (end of curve). I captured it in the integer domain. Now upon resampling the curve to 129 points, I would expect the integer attribute to lerp with $1$ increments. This is where the idea fails, because apparently lerping is done using float32, even if the datatype is integer. Moreover, even assigning a single integer value to both endpoints before resampling, for a big integer, upon resampling produces a corrupted result (different integer, a negative one). If it actually worked properly, I could use my answer of a repeat zone, simply starting with a modulo $10$ on the lower bound, and iterating through the points, incrementing (mod 10) my digit only if the captured attribute on currently evaluated point is lower than the original integer.
An even more desperate attempt was to increment an integer using a "Random Value" node. I could take the lower and upper bounds, use integer comparisons and hopefully, using a ridiculous (but not 4 billion!) number of iterations emerge at the integer that is just 1 higher. This way I could iterate over 128 integers one by one, like I planned in the first solution. Moreover, I discovered a random integer just randomizes an integer in range from zero to the difference of max and min, and then adds min. So I can actually find a seed that for a random integer in range $[0, 128]$ gives me $1$ and then I can use the same seed for a bigger range to the same effect. Except, again, at big integers the node breaks:

I tried to simplify the StefLAncien's answer (the first one that actually worked!), by simply spawning the input number of points (at first I disregarded this option thinking it would take too much memory, and went for a repeat zone that shouldn't take any memory and just take long time - turn out I was wrong), detect if the integer converts to a float by going down or up, in the second case just add 128 more points (or less, as I argued earlier; hardly matters for performance) separate the points with indices larger than than the float, take modulo of the separated domain size (subtract 128 if you added it earlier), take modulo of the float, add together, take modulo again. But I discovered yet another integer related bug!

The above turned out to be a tooltip bug - thanks scurest!

At this point I'm thinking... Maybe we should consider big integers just not supported? Maybe we shouldn't limit integer inputs to the big range of $[-2147483648, 2147483647]$? It clearly just doesn't work, we should limit the range to $[-2^{24}, +2^{24}] = [-16777216, +16777216]$

As explained StefLAncien, I think this is effectively hopeless. As soon as you're going through float math (including value to string, for instance) you loose the last digits. And there is no integer math nodes. — lemon, Commented Feb 5 at 10:06
If there is a trick (but we don't know the use case), that could be before that, for instance splitting the integer in two or several parts before, using a driver (Python). — lemon, Commented Feb 5 at 10:12
For the tooltip issue, see projects.blender.org/blender/blender/issues/118073. — scurest, Commented Feb 10 at 13:13

scurest · Accepted Answer · 2024-02-09 15:26:42Z

Explanation

I'll assume the integer X is non-negative.

The first number where the fmod doesn't work is 16777217. But if we remove the first digit by subtracting 10 million

   16777217
 - 10000000
-----------
    6777217

the last digit doesn't change, but the number is now back in the range where fmod works!

You can do the same thing for any large number by subtracting the right multiple of 10 million to get a number with at most seven digits

   123456789       2147483647
 - 120000000     - 2140000000
------------     ------------
     3456789          7483647

How do we know what multiple of 10 million to use? We can just try all of them. If X is our integer, we make the list

X
X -   10000000
X -   20000000
X -   30000000
...
X - 2140000000

At some point in this list, there's a number that's seven digits or fewer, so fmod will work on it. All the numbers after it are negative, and all the numbers before it have more than seven digits, so it is the smallest non-negative number in this list.

The reason this is useful is this list can be computed with the Accumulate Field node. Accumulate Field calculates the running sum of a field, ie.

In	Out
a	a
b	a+b
c	a+b+c
...	...

and it does this with proper integer math on integer fields, no float conversions. So we can make our list by feeding it this

In	Out
X	X
-10000000	X-10000000
-10000000	X-20000000
-10000000	X-30000000
...	...
-10000000	X-2140000000

The input list is easy to make as a point-domain attribute on a cloud of 215 points with (if index == 0 then X else -10000000).

We can extract the smallest value from the accumulated field with the Attribute Statistic node, but we want the smallest non-negative value, so we need to select the non-negative values first. The smallest value is now our desired number with the same last digit as X and no more than seven digits, so all that's left is to fmod it.

HAhahahah I don't understand this at all, I just tested it, and it works and is very fast. Amazing answer! I'll definitely analyze it thoroughly as soon as I have the time to do that, but regardless, I think it would benefit from some ELI5 explanation. — Markus von Broady, Commented Feb 9 at 13:03
I used the Selection socket on the Attribute Statistic to simplify the graph a bit. — scurest, Commented Feb 9 at 15:22
+1 for identifying and sharing that "Switch" and "Accumulate Field" nodes are preserving int32 algebra. — StefLAncien, Commented Feb 11 at 11:03

StefLAncien · Accepted Answer · 2024-02-11 10:36:24Z

(Using Blender 3.6.5)

Taking advantage that Blender internal algebra of indexes is preserving int32 and that $2^n$ is not truncated in binary32 format (at least not for $n \le 127$), the following approach relies on deleting points of a point cloud with indexes lower than $2^n$, with $24 \le n \le 30$, such that the remaining count is lower than or equal to $2^{23}$. From this value, not truncated in binary32 format, and keeping track of the successively subtracted last digit of $2^n$, the last digit of a int29 can be computed, at a "huge" RAM cost.

0. Preliminary information:
0.1. $2^{23}=8.388.608$ is the largest int32 not a power of $2$ stored without truncation using binary32.
0.1. $2^{31}-1=2.147.483.647$ is the largest signed int32.
1. Because Floored Modulo operator is not available in Blender 3.6, the last digit is recovered by firstly shifting it after the decimal point, by dividing par 10. Then the decimal part is extracted with a Fraction operator. Finally, the last digit is brought back before the decimal point, by multiplying by 10.
2. For step 1 process to be accurate, the treated value $v$ must not be truncated, i.e. it is mandatory that $v \leq 2^{23}$. So numbers such that $2^{23} \lt v$ must be reduced; others are unmodified.
3. A point cloud is initialized with as many points as the number to reduce.
4. $2^n$ points are successively deleted. Values between $2^{30}$ and $2^{24}$ must be subtracted from the number to reduce. So this deletion process is called at most 7 times. Because Repeat Zone is not available in Blender 3.6, the Group Node call is duplicated.
5. After reduction, the subtracted last digits are added to the remaining points count to recover the original last digit number. This value is initialized at 0.
6. At each iteration of step 4, the remaining points count $N$ is compared to $2^{23}$. If it is such that $2^{23} \ge N$, this step is skipped and inputs are just passed as outputs.
7. Let $n$ and $K$ be defined such that $N=2^n+K$ with $K \lt 2^n$. $$ \begin{array}{rrcl} \mbox{} & N & = & 2^n \left( 1 + \frac{K}{2^n}\right) \\ \Rightarrow & \ln{(N)} & = & n \ln{(2)} + \ln{\left( 1 + \frac{K}{2^n}\right)} \\ \Rightarrow & \frac{\ln{(N)}}{\ln{(2)}} & = & n + \frac{1}{\ln{(2)}} \ln{\left( 1 + \frac{K}{2^n}\right)} \end{array} $$ As $1 \le 1 + \frac{K}{2^n} \lt 2$, the second addend is positive and lower than 1, so $n$ is computed as the integer part of $\log_2{(N)}$. It is to notice that the truncation from the Point Count socket of the Domain Size node, to the Value socket of the Logarithm node, is affecting the accuracy of $K$, not of $n$.
8. Points with Index lower than $2^n$ are deleted, reducing the points count. Even with $n \gt 23$, $2^n$ is accurately passed as Binary32 by the Power math node. (NB: because indexes start at 0, comparison operator is "$\lt$", not "$\le$").
9. A record of the last digit of the "Subtracted" number of points must be kept, to recover the original, before reduction, value. It is noticed that this last digit value is circling between $2, 4, 8, 6$ as the value of $n$ increases. Let $i$ and $j$ be integers defined as $n=4i+j$ with $5 \le i$ and $1 \le j \le 4$, keeping in mind that $n \gt 23$. $$ 2^n = 2^{(4i+j)} = (2^4)^i \times 2^j = 16^i \times 2^j $$ As the last digit of $16^i$ is always $6 \ \forall i$, the last digit of $2^n$ is the same as $2^j$.
9.1. The value of $j$ is computed from $n$ with a Modulo(4) operator.
9.2. The returned value of $0$ must be changed to $4$ to comply to the definition of $j$.
9.3. The largest $2^j$ value being $16$, and step 4 being repeated at most 7 times, the largest accumulated value is $112$. So the values manipulated by the Add operator are without truncation.
10. Final remarks:
10.1. To extend this approach to negative numbers, an Absolute operator is to be linked right after the int29 socket, before step 2. By just affecting the sign bit, it does not induce truncation of the input value.

Resources:

With my 10 years old, 8Gb of RAM MacBook, I succeed to reach 812^3 = 535.387.328. At 813^3 = 537.367.797, it just frizz and reboot... — StefLAncien, Commented Feb 8 at 22:55
NB: 2^29 = 536.870.912. Still a factor of 4 before the last signed int32: 2^31-1. — StefLAncien, Commented Feb 8 at 23:01
I just started analyzing it, and it's late so I'm going to bed, but I just tested it for integers 123456780 to 123456789 AND IT WORKS!! - great job! — Markus von Broady, Commented Feb 8 at 23:09
Oh, you just spawn as many points as the integer AHAHAHAHA I was thinking about that, but figured it can't possibly work, too many points, too much memory, I figured a repeat zone should be much more lightweight - and yet your approach ended up working LOL, it's also quite fast, much faster than say the python equivalent of the loop I was trying (which took somewhere between 10 and 20 seconds for me) — Markus von Broady, Commented Feb 8 at 23:13
I guess the reason why it works so fast is that you don't do something obvious that I would try - iterating over points, but instead remove points - as I said, I haven't yet thoroughly analyzed it, but I think I understand it in principle, it's a smart solution and takes advantage of some of geonodes optimizations... — Markus von Broady, Commented Feb 8 at 23:15

Markus von Broady · Accepted Answer · 2024-02-02 21:35:40Z

3

I figured this solution:

However, Blender crashes when I try it. I'll have to report a bug on the Issue Tracker…

answered Feb 2 at 21:35

Markus von Broady

38.2k3 gold badges34 silver badges105 bronze badges

$\begingroup$ I think the repeat zone is not meant to be run hundreds of millions of times on update, you're surely hitting a limit there. Maybe you can do something really dumb and take a string as input, and compare the number of output vertices (assuming they're all different for each digit) . Something like that i.sstatic.net/ptmf4.png $\endgroup$
– Gorgious
Commented Feb 5 at 11:37
$\begingroup$ @Gorgious string as input works, but if you start with an integer, how do you obtain the string... You can't. So if you read some kind of integer data, be it imported from csv or generated in a random value node etc. you're in a pickle. As for the limit - I was joking with the bug report, but on the other hand, that would be the first programming language I ever used with such a limit. It's rather sad that Blender crashes on something this simple - I think to both of us, the number isn't as scary to think a modern CPU would never finish the job. Less than 20 s for Python actually. $\endgroup$
– Markus von Broady
Commented Feb 5 at 11:54
$\begingroup$ Sure, but converting a column in a csv to a string or generating a string of arbitrary length with random digits should be relatively straightforward, do you really HAVE to deal with integers in the first place ? $\endgroup$
– Gorgious
Commented Feb 5 at 15:51
$\begingroup$ @Gorgious what inspired this question was the simplicity of a single integer input and the limitation of 7 digits here which could be increased by just 2 more digits and the workaround is not terrible - multiple integer inputs. Of course knowing Python I can make a workaround to any situation offloading the work to Python (and even then if Python doesn't work I can code a binary in some other language and execute it from Python). Still, I'm quite surprised it's so hard within geonodes. $\endgroup$
– Markus von Broady
Commented Feb 5 at 16:03

Add a comment |

StefLAncien · Accepted Answer · 2024-02-03 14:17:46Z

3

(Using Blender 3.6.5)

The Binary32 standard might prevent this query. It states that :

All integers with 7 or fewer decimal digits, and any $2^n$ for a whole number $−149 \leq n \leq 127$, can be converted exactly into an IEEE 754 single-precision floating-point value.

As $123456789$ is 9 decimal digits long, the last two might be lost in the float32 format used by all Math nodes.

This figure illustrates that as a string, the largest positive int32 (i.e. 2.147.483.647) can be displayed. However, it shows that integers from 123.456.780 included to 123.456.788 included, after conversion to float (the format of the Value socket) are displayed as 123.456.784. Even the last 8 is questionable as 123.456.789 is displayed as 123.456.792.
These results are indicating that Blender Geometry Nodes behaviour is conforming to the Binary32 standard. Pure int32 algebra or double-precision floating point data type are required to capture accurately digits from the $8^\mathrm{th}$.

answered Feb 3 at 14:17

StefLAncien

4,9974 silver badges19 bronze badges

$\begingroup$ This is all correct, but question is, can a hack be used to obtain that digit anyway. For example my answer does just that, it just requires ridiculous processing power to do such a simple task. Imagine if "Value to String" had an integer input, it would then be possible to succeed by analyzing the string. Maybe I'm missing something that is possible... $\endgroup$
– Markus von Broady
Commented Feb 3 at 14:52
$\begingroup$ I am sharing the same hope for a hack. After failure of the classical x-floor(x/10)*10 to get the last digit, I tracked down the round-off error and found that as soon as an integer socket is connected to a value/float socket, the last digits were lost. So the idea to use strings. But the "Value to String" node suffers the same limitation ! My conclusion is that we can rely only on nodes manipulating integers without conversion to float (green noodles should remain green...). I have to check yet, but I think there are not so many... $\endgroup$
– StefLAncien
Commented Feb 3 at 16:03
$\begingroup$ Exactly. But it gets worse. See even if you stay green, the internal implementation often uses floats or otherwise just fails. Try generating a random integer in range of MAX-2...MAX. You will get a negative integer. Try setting an integer attribute on curve, on both points set it to MAX. Resample to 3 points: the attribute becomes negative. BTW MAX = 2147483647 i.imgur.com/VvrDdTJ.gif $\endgroup$
– Markus von Broady
Commented Feb 3 at 16:07
$\begingroup$ A "brute force" approach might be to use hexadecimal representation (for example) of integers as string, and to develop new "Math " GN manipulating strings. Inside such nodes, an int32 might be decomposed as two float32. The internal algebra could be coded using standard Math GN. Then the result could be encoded back as a string. A challenge ? (side note: instead of HEX, chucks of 7 digits, as 1234567_8901234_5... ?) $\endgroup$
– StefLAncien
Commented Feb 3 at 16:09
$\begingroup$ But how would you convert the input integer to a hex representation? I don't see any option to do that. $\endgroup$
– Markus von Broady
Commented Feb 3 at 16:38

| Show 2 more comments

StefLAncien · Accepted Answer · 2024-03-03 17:29:45Z

(Using Blender 3.6.5)

This post is an extension of scurest's proposal.
It deals with negative values and it should run faster, not relying on an Attribute Statistic node coupled to a selection mask.

Objective

To split an int32 into two parts not affected by truncation, to possibly develop a fully compliant int32 algebra.

Approach

Let $n$ be an int32, and $X$ and $Y$ be two float32 such that $n=X+Y \times 10^7$ with $|X| \lt 10^7$. If $n \lt 0$, then $X \lt 0$ and $Y \lt 0$. The highest value of $|Y|$ is 214.
Following scurest's proposal, $10^7$ is chosen as base because it reads 0x989680 in hexadecimal. So it is 6x4 bits long, and as the significant precision of the Binary32 standard is 24 bits, $10^7$ and all lower integers (i.e. $X$) do not undergo truncation.

$Y$ is computed as the truncation of $n\ /\ 10^7$ using Math nodes.
$X$ is computed as $n + \varepsilon.10^7 \times |Y|$ with $\varepsilon=-\text{sgn}(n)$, using an Accumulated Field node which is compliant to int32 algebra.

Geometry Node Graph

1. $X$ and $Y$ are encoded in a Vector, with $Z=0$.
2. $Y$ is computed as $\text{trunc}(n \times 10^{-7}, 0)$.
3. To compute $n + \varepsilon.10^7 \times |Y| = n + \sum_{i=1}^{|Y|} \varepsilon.10^7$, a Point Cloud of $|Y|+1$ elements is created.
4. At the point of Index 0 is associated the field value $n$.
5. For subsequent points of Index between 1 and $|Y|$ is associated the field value $\varepsilon.10^7$. It is to notice that the Power and Multiply Math nodes do not truncate this number.
6. The value of $X$ is recovered as the accumulated field of the point with the last, i.e. $|Y|$, index.

NB: It is to notice that if $|n| \lt 10^7$, $Y=0$. So a single point is created with the 0 index, and the Accumulated Field node is returning $n$ as sole value.

Resources

Blender file:

Harry McKenzie · Accepted Answer · 2024-04-02 15:26:27Z

0

The integer's capacity just does not fit your purposes.

Though as a workaround you could use strings.

Attaching my file with the node setup: (made with blender ver. 4.0.2)

edited Apr 2 at 15:26

Harry McKenzie♦

13.1k9 gold badges29 silver badges60 bronze badges

answered Feb 9 at 9:26

Yaroslav

5433 silver badges13 bronze badges

$\begingroup$ there is a "Compare" node with a "string" mode. Yes, you're right, strings can be used, but as you can read in the comments somewhere here, you can't convert an integer to a string, so if you have your data as a string for some reason (some data loaded from an external source, or some random number or some ID etc....) you just can't read the last digit. BTW, your picture of strings doesn't show the part that actually makes them strings - the string, which is on the other side, and unlike in other types of panties, doesn't cover the buttocks. $\endgroup$
– Markus von Broady
Commented Feb 9 at 10:07

Add a comment |

Stack Exchange Network

Geometry Nodes: How to Read Last Digit of a Big Integer?

Some approaches I tried and failed

6 Answers 6

Explanation

Objective

Approach

Geometry Node Graph

Resources

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged
geometry-nodes
.

Linked

Hot Network Questions

Geometry Nodes: How to Read Last Digit of a Big Integer?

Some approaches I tried and failed

6 Answers 6

Explanation

Objective

Approach

Geometry Node Graph

Resources

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged geometry-nodes.

Linked

Related

Hot Network Questions

Not the answer you're looking for? Browse other questions tagged
geometry-nodes
.