Viewed   71 times

We have some integer arithmetic which for historical reasons has to work the same on PHP as it does in a few statically typed languages. Since we last upgraded PHP the behavior for overflowing integers has changed. Basically we are using following formula:

function f($x1, $x2, $x3, $x4)
{
   return (($x1 + $x2) ^ $x3) + $x4;
}

However, even with conversions:

function f($x1, $x2, $x3, $x4)
{
   return intval(intval(intval($x1 + $x2) ^ $x3) + $x4);
}

I am still ending up with the completely wrong number...

For example, with $x1 = -1580033017, $x2 = -2072974554, $x3 = -1170476976) and $x4 = -1007518822, I end up with -30512150 in PHP and 1617621783 in C#.

Just adding together $x1 and $x2 I cannot get the right answer:

In C# I get

(-1580033017 + -2072974554) = 641959725

In PHP:

intval(intval(-1580033017) + intval(-2072974554)) = -2147483648

which is the same as:

intval(-1580033017 + -2072974554) = -2147483648

I don't mind writing a "IntegerOverflowAdd" function or something, but I can't quite figure out how (-1580033017 + -2072974554) equals 641959725. (I do recognize that it is -2147483648 + (2 * 2^31), but -2147483648 + 2^31 is -1505523923 which is greater than Int.Min so why is do you add 2*2^31 and not 2^31?)

Any help would be appreciated...

 Answers

4

So I solved the problem, and discovered a lot about PHP (at least in the way it handles Integer overflow).

1) It completely depended on a cross between which platform the machine was running on, which version of PHP, whether or not it had Suhosin Hardened PHP running, and how many bits it was compiled for (32 or 64). 6 machines behaved the way I expected (which was actually wrong, at least wrong according to their documentation) and 3 machines behaved in a way I still can't explain, and 3 machines behaved according to what the intval command says it does in the documentation.

2) Intval is supposed to return PHP_INT_MAX when int > PHP_INT_MAX (not int & 0xffffffff), but this only happens on some versions of PHP4 and PHP5. Different versions of PHP return different values when int > PHP_INT_MAX.

3) The following code can return 3 different results (see 1):

<?php
echo "Php max int: ".PHP_INT_MAX."n";
echo "The Val: ".(-1580033017 + -2072974554)."n";
echo "Intval of the val: ".intval(-3653007571)."n";
echo "And 0xffffffff of the val: ".(-3653007571 & 0xffffffff)."n";
?>

It can return (which appears to be right for Intval but wrong for & 0xffffff)

Php max int: 2147483647
The Val: -3653007571
Intval of the val: -2147483648
And of the val: -2147483648

And it can return (which contradicts the PHP documentation for intval):

Php max int: 2147483647
The Val: -3653007571
Intval of the val: -641959725
And of the val: -641959725

And on 64 Bit machines it returns (which is correct):

Php max int: 2147483647
The Val: -3653007571
Intval of the val: -3653007571
And of the val: -641959725

Solution

Anyhow, I needed a solution that would work on all these platforms, and not be dependent upon quirks of a particular version of PHP compiled with a particular Max int. Thus I cam up with the following cross-PHP thirtyTwoBitIntval function:

function thirtyTwoBitIntval($value)
{
    if ($value < -2147483648)
    {
        return -(-($value) & 0xffffffff);
    }
    elseif ($value > 2147483647)
    {
        return ($value & 0xffffffff);
    }
    return $value;
}

Comment

I do think the designers of PHP should have said an Int is a 32 Bit Int no matter whether it is running on a 32 or 64 or 128 bit machine (like the DotNet CLR for example), and didn't randomly upconvert it to a float depending on the number of Bits that PHP is compiler under.

Thursday, November 10, 2022
4

The historical reason is that most C implementations (compilers) just used whatever overflow behaviour was easiest to implement with the integer representation it used. C implementations usually used the same representation used by the CPU - so the overflow behavior followed from the integer representation used by the CPU.

In practice, it is only the representations for signed values that may differ according to the implementation: one's complement, two's complement, sign-magnitude. For an unsigned type there is no reason for the standard to allow variation because there is only one obvious binary representation (the standard only allows binary representation).

Relevant quotes:

C99 6.2.6.1:3:

Values stored in unsigned bit-fields and objects of type unsigned char shall be represented using a pure binary notation.

C99 6.2.6.2:2:

If the sign bit is one, the value shall be modified in one of the following ways:

— the corresponding value with sign bit 0 is negated (sign and magnitude);

— the sign bit has the value ?(2N) (two’s complement);

— the sign bit has the value ?(2N ? 1) (one’s complement).


Nowadays, all processors use two's complement representation, but signed arithmetic overflow remains undefined and compiler makers want it to remain undefined because they use this undefinedness to help with optimization. See for instance this blog post by Ian Lance Taylor or this complaint by Agner Fog, and the answers to his bug report.

Saturday, August 13, 2022
 
pooria
 
4

Math.addExact throws exception on overflow

Since Java 8 there is a set of methods in the Math class:

  • toIntExact(long)
  • addExact(int,int)
  • subtractExact(int,int)
  • multiplyExact(int,int)

…and versions for long as well.

Each of these methods throws ArithmeticException if overflow happens. Otherwise they return the proper result if it fits within the range.

Example of addition:

int x = 2_000_000_000;
int y = 1_000_000_000;
try {
    int result = Math.addExact(x, y);
    System.out.println("The proper result is " + result);
} catch(ArithmeticException e) {
    System.out.println("Sorry, " + e);
}

See this code run live at IdeOne.com.

Sorry, java.lang.ArithmeticException: integer overflow

Sunday, December 11, 2022
 
3
unsigned long long x;
unsigned int y, z;

x = y*z;

The evaluation of the expression y*z is not affected by the context in which it appears. It multiplies two unsigned int values, yielding an unsigned int result. If the mathematical result cannot be represented as an unsigned int value, the result will wrap around. The assignment then implicitly converts the (possibly truncated) result from unsigned int to unsigned long long.

If you want a multiplication that yields an unsigned long long result, you need to explicitly convert one or both of the operands:

x = (unsigned long long)y * z;

or, to be more explicit:

x = (unsigned long long)y * (unsigned long long)z;

C's * multiplication operator applies only to two operands of the same type. Because of this, when you give it operands of different types, they're converted to some common type before the multiplication is performed. The rules can be a bit complex when you're mixing signed and unsigned types, but in this case if you multiply an unsigned long long by an unsigned int, the unsigned int operand is promoted to unsigned long long.

If unsigned long long is at least twice as wide as unsigned int, as it is on most systems, then the result will neither overflow nor wrap around, because, for example, a 64-bit unsigned long long can hold the result of multiplying any two 32-bit unsigned int values. But if you're on a system where, for example, int and long long are both 64 bits wide, you can still have overflow wraparound, giving you a result in x that's not equal to the mathematical product of y and z.

Sunday, November 6, 2022
 
grains
 
3

I would question whether it makes sense to hash UTF-16LE ("Unicode") this way. It might make a lot more sense to convert your VB String to UTF-8 and then hash that.

While I can't find any test vectors for djb2 to validate my own implementation it seems to run quite quickly:

Private Type CURRENCY_CURRENCY
    Value As Currency
End Type

Private Type CURRENCY_2LONGS
    ValueLo As Long
    ValueHi As Long
End Type

Public Function djb2(ByVal StringToHash As String) As Long
    Dim C2L As CURRENCY_2LONGS
    Dim CC As CURRENCY_CURRENCY
    Dim I As Long

    C2L.ValueLo = 5381
    For I = 1 To Len(StringToHash)
        LSet CC = C2L
        CC.Value = CC.Value * 33@ + CCur(AscW(Mid$(StringToHash, I, 1))) / 10000@
        LSet C2L = CC
        C2L.ValueHi = 0
    Next I

    djb2 = C2L.ValueLo
End Function
Sunday, November 20, 2022
 
amergin
 
Only authorized users can answer the search term. Please sign in first, or register a free account.
Not the answer you're looking for? Browse other questions tagged :