In the image above, you can see a coordinate grid containing some green points. Each point has a pseudorandom 1/10 chance of being green. I am looking for clusters of these green points within a radius of ~8 (ignore the inner mask shown). Said another way, I am looking for statistically unlikely high-density areas of green points.
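One way to make "statistically unlikely" concrete is sketched below. It rests on an assumption the hash may not satisfy: that the points behave like independent draws with probability $p = 1/10$. The symbols $n$ (number of grid points in the search window), $X$ (number of green points among them) and $t$ (a count threshold) are introduced here only for illustration.

$$X \sim \mathrm{Binomial}(n, p), \qquad \mathbb{E}[X] = np, \qquad P(X \ge t) = \sum_{j=t}^{n} \binom{n}{j} p^{j} (1-p)^{n-j}$$

For example, a full Euclidean disc of radius exactly 8 covers $n = 197$ grid points, so under this idealization the expected count would be about $19.7$, and a cluster is an outlier when $t$ sits far enough above that for the tail probability to be tiny.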
At the core of this problem is the Java RNG found in java.util.Random (source here). The code for determining whether a point is green boils down to this hash function. The inputs are a constant, $k$, and the coordinates of the point, $x$ and $y$.
    // Scramble the input seed the same way the java.util.Random constructor does.
    long seed = ((k
            + (long) (x * x * 4987142)
            + (long) (x * 5947611)
            + (long) (y * y) * 4392871L
            + (long) (y * 389711) ^ 987234911L)
            ^ 0x5DEECE66DL) & ((1L << 48) - 1);
    int bits, val;
    // Equivalent of nextInt(10): draw 31 bits and reject draws that would bias the result.
    do {
        seed = (seed * 0x5DEECE66DL + 0xBL) & ((1L << 48) - 1);
        bits = (int) (seed >>> 17);
        val = bits % 10;
    } while (bits - val + 9 < 0);
    return val == 0;
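For comparison, here is a minimal sketch of the same check written against java.util.Random directly, next to the unrolled version. It assumes $x$ and $y$ are `int` (the snippet above does not say), and the class and method names (`GreenCheck`, `seedFor`, `isGreen`, `isGreenManual`) are made up for illustration. The unrolled loop is what the constructor plus `nextInt(10)` do internally, so the two methods should agree on every input.

```java
import java.util.Random;

final class GreenCheck {
    // Input seed for a point; constants copied from the hash above.
    // x and y are assumed to be int, so the products overflow as ints before widening.
    static long seedFor(long k, int x, int y) {
        return k
                + (long) (x * x * 4987142)
                + (long) (x * 5947611)
                + (long) (y * y) * 4392871L
                + (long) (y * 389711) ^ 987234911L;
    }

    // Library version: let java.util.Random scramble the seed and draw the value.
    static boolean isGreen(long k, int x, int y) {
        return new Random(seedFor(k, x, y)).nextInt(10) == 0;
    }

    // Unrolled version: the same arithmetic without allocating a Random object.
    static boolean isGreenManual(long k, int x, int y) {
        final long mask = (1L << 48) - 1;
        long seed = (seedFor(k, x, y) ^ 0x5DEECE66DL) & mask;
        int bits, val;
        do {
            seed = (seed * 0x5DEECE66DL + 0xBL) & mask;
            bits = (int) (seed >>> 17);
            val = bits % 10;
        } while (bits - val + 9 < 0);
        return val == 0;
    }

    public static void main(String[] args) {
        long k = 12345L; // arbitrary example constant
        System.out.println(isGreen(k, 3, -7) == isGreenManual(k, 3, -7));
    }
}
```

The unrolled form is mainly useful in a tight search loop, where avoiding one object allocation per point can matter.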
There has been some prior research on this problem, but I am not knowledgeable enough to contribute further. It found that potential clusters of small sizes, 2x2 and 3x3, form a pattern when compared across different $k$ values.
This may provide clues as to which coordinates a search should focus more compute on for a given $k$, but I am not convinced.
As an example, here is a heatmap of cluster sizes for a particular $k$. You can find more information about how these images were derived here.
As of now, I am just brute-force checking the cluster count at each coordinate, and skipping the next coordinate whenever the current count is too low for its neighbor to reach a cluster of sufficient size. That applies to most coordinates, because I am looking for statistical outliers.
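A rough sketch of one way this brute-force pass could be organized is below. It is not my actual code, and it makes several assumptions: the window is a Euclidean disc of radius exactly 8, the inner mask is ignored, and all names (`DiscCount`, `rowPrefix`, `discCount`, `safeStep`) are hypothetical. The idea is to hash each point of a tile once, keep per-row prefix sums, and then count a disc as one subtraction per row. The skip relies on the fact that moving the centre one step sideways brings at most one new point per row into the disc, so the count can grow by at most $2r + 1$ per step.

```java
import java.util.Random;

final class DiscCount {
    static boolean isGreen(long k, int x, int y) {
        long s = k + (long) (x * x * 4987142) + (long) (x * 5947611)
                + (long) (y * y) * 4392871L + (long) (y * 389711) ^ 987234911L;
        return new Random(s).nextInt(10) == 0;
    }

    // Half-width of the disc for each row offset dy in [-r, r].
    static int[] halfWidths(int r) {
        int[] w = new int[2 * r + 1];
        for (int dy = -r; dy <= r; dy++) {
            int h = 0;
            while ((h + 1) * (h + 1) + dy * dy <= r * r) h++;
            w[dy + r] = h;
        }
        return w;
    }

    // p[j][i] = number of green points in tile row j, tile columns 0 .. i-1.
    static int[][] rowPrefix(long k, int x0, int y0, int width, int height) {
        int[][] p = new int[height][width + 1];
        for (int j = 0; j < height; j++)
            for (int i = 0; i < width; i++)
                p[j][i + 1] = p[j][i] + (isGreen(k, x0 + i, y0 + j) ? 1 : 0);
        return p;
    }

    // Green count in the disc centred at tile index (ci, cj).
    // The centre must be at least r away from every tile edge.
    static int discCount(int[][] p, int[] w, int r, int ci, int cj) {
        int total = 0;
        for (int dy = -r; dy <= r; dy++) {
            int h = w[dy + r];
            total += p[cj + dy][ci + h + 1] - p[cj + dy][ci - h];
        }
        return total;
    }

    // Reference implementation: test every point of the disc individually.
    static int discCountNaive(long k, int r, int cx, int cy) {
        int total = 0;
        for (int dy = -r; dy <= r; dy++)
            for (int dx = -r; dx <= r; dx++)
                if (dx * dx + dy * dy <= r * r && isGreen(k, cx + dx, cy + dy)) total++;
        return total;
    }

    // Number of columns that can be advanced without missing a centre whose
    // count could reach the threshold (gain is at most 2r + 1 per step).
    static int safeStep(int count, int threshold, int r) {
        int rows = 2 * r + 1;
        return Math.max(1, (threshold - count + rows - 1) / rows);
    }

    public static void main(String[] args) {
        int r = 8, size = 64;
        long k = 12345L; // arbitrary example constant
        int[] w = halfWidths(r);
        int[][] p = rowPrefix(k, 0, 0, size, size);
        System.out.println(discCount(p, w, r, 20, 20));
    }
}
```

With this layout each centre costs $2r + 1$ subtractions instead of one hash per point in the disc, and `safeStep` generalizes the "skip the next one over" rule to jumps of more than one column when the count is far below the threshold.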
What I am hoping is that there is some exploitable pattern in this algorithm, that it is practical to reverse this hash in some way, or that there are major optimizations to be had in my current method.
One possible avenue forward would be to see whether the lattice pattern persists for larger and larger clusters, but another image in that post seems to indicate that it would just get lost in the noise.