I'd suggest taking a look at the Truestamp Observable Entropy project. We recently created this to address a need for the type of randomness you're asking about.
Observable Entropy automatically collects randomness from publicly verifiable sources every five minutes, storing and hashing the contents of the retrieved data. The hashes of the individual data source files are then concatenated and hashed deterministically, resulting in a final, signed SHA-256 hash that represents the totality of the collected entropy as a new, verifiable, public random value (a rough sketch of this combination step follows the source list below).
The system currently collects data from the following public sources at each interval:
- Bitcoin blockchain latest block header
- Stellar blockchain latest block header
- Ethereum blockchain latest block header
- Drand Random Beacon
- NIST Randomness Beacon
- Hacker News, top 10 stories and content links
- Timestamp UTC
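As a rough illustration of the combination step (this is not the project's actual script; the file names, contents, and sorted ordering here are placeholder assumptions), combining the per-source hashes might look like this:

```python
# Minimal sketch: hash each collected source file, concatenate the per-source
# hashes in a fixed order, and hash the result into one public value.
# File names and contents are placeholders, not the repo's real layout.
import hashlib

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

sources = {
    "bitcoin.json": b"...latest block header...",
    "drand.json": b"...signed beacon round...",
    "nist-beacon.json": b"...beacon pulse...",
    "timestamp.json": b"2021-08-25T00:00:00Z",
}

# A deterministic ordering (here, sorted file names) ensures anyone re-running
# the combination over the same stored files arrives at the same final hash.
per_source_hashes = [sha256_hex(sources[name]) for name in sorted(sources)]
entropy_hash = sha256_hex("".join(per_source_hashes).encode("utf-8"))
print(entropy_hash)
```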
Each of these, with the exception of the timestamp (which is included for convenience), is considered a strong source of randomness. For the truly paranoid, you can also contribute your own entropy to the mix in the form of 32-byte hex strings representing, e.g., a hash or random bytes; this way you don't need to fully trust any of the other sources.
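A contribution could be generated along these lines (a sketch only, using Python's standard library; how you actually submit it is documented in the repository):

```python
# Two ways to produce a 32-byte hex string to contribute as your own entropy:
# fresh random bytes, or the SHA-256 hash of secret material (only the hash
# is ever submitted or published).
import hashlib
import secrets

random_contribution = secrets.token_hex(32)  # 64 hex characters = 32 bytes
hashed_contribution = hashlib.sha256(b"my secret seed material").hexdigest()
print(random_contribution)
print(hashed_contribution)
```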
All of the data collected at each interval, as well as the resulting hashes, are committed to a public GitHub repository. If you clone the repo, scripts are provided that allow you to recreate the entropy hash from the stored data yourself. You can also verify the public key signature on the entropy to ensure that it was signed by Truestamp. Instructions are provided for verifying each public source of entropy, and the signature, yourself.
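Conceptually, that verification looks like the following sketch (the directory layout, file names, and combination order here are hypothetical; the repo's own scripts define the real ones):

```python
# Sketch: re-hash each stored data file from a local clone, recombine the
# hashes, and compare the result to the published entropy hash.
# The paths and ordering below are assumptions, not the repo's actual layout.
import hashlib
from pathlib import Path

def sha256_file(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

data_dir = Path("observable-entropy/data")  # hypothetical clone path
per_file_hashes = [sha256_file(p) for p in sorted(data_dir.glob("*.json"))]
recomputed = hashlib.sha256("".join(per_file_hashes).encode("utf-8")).hexdigest()

published = "..."  # the entropy hash value stored alongside the data
print("match" if recomputed == published else "mismatch")
```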
It's easy for almost anyone to maintain and sync their own clone of this repo, ensuring that everyone has the same data and can reference the entropy at a given point in time. For example, you and a group of others can agree to use the latest value at a specific time in the future; at that time you will all be looking at the data stored in the same Git commit, which should normally be no more than five minutes old. Storing the data in Git also gives you the integrity properties Git offers: commit hashes are chained, and existing clones would visibly break if someone pushed an entirely new data set to GitHub, which would be a clear indicator of a bad actor.
In addition, the historical and latest entropy values are available via a public API (which is simply a proxy that reads the latest raw data from the GitHub repo and has no cache or data store of its own). You can see a sample of the current output at https://entropy.truestamp.com/latest.
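Fetching the latest value programmatically is straightforward (a sketch; the exact JSON structure is whatever the API returns):

```python
# Read the latest entropy value from the public API, which proxies the
# raw data committed to the GitHub repository.
import json
import urllib.request

with urllib.request.urlopen("https://entropy.truestamp.com/latest") as resp:
    latest = json.load(resp)

print(json.dumps(latest, indent=2))
```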
I believe this meets each of your original requirements:
- no one party controls it (or has access to it)
- it can't be reliably predicted
- it must also be the same for each person that accesses it
At Truestamp we'll be using this source of verifiable randomness to help prove, in an independently verifiable way, that data committed to our system was created after a specific point in time.
We'd love to get your feedback, as this is a new and somewhat experimental service which has been operating reliably now for about two months.
UPDATED 08/25/2021 to address comment by @fgrieu:
Thanks for the comment @fgrieu. I’ll try to address your concerns, or ask for more clarification on the potential issues you feel may be present.
> (1) insiders can know the outcome before others
In this implementation, there are no “insiders” who have access to the final data that is hashed to form the final entropy. The collection of the data is done using automation provided by GitHub Actions workflows. The data collected, like the signed Drand beacon, is not known to anyone publicly until it is collected (everyone gets a new signed random beacon). In this case the only insider would be someone at GitHub who has access to their infrastructure and can extract the contents of this script's memory after the point at which it runs its 500,000 rounds of SHA-256. Even in that unlikely scenario they would see the final entropy value only several milliseconds before it is committed and made publicly available to all in the GitHub repository. This project, which collects entropy every five minutes, is likely not sensitive to that level of time granularity. There may be scenarios where an early (~1 s) preview of the final value would be useful enough to an attacker to justify compromising GitHub; I don't think the use cases this project is intended for are affected by this, though.
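For illustration, the iterated-hashing step mentioned above looks roughly like this (a sketch only; the exact construction, inputs, and encoding are defined by the project's own scripts):

```python
# Repeatedly feed the digest back into SHA-256 for 500,000 rounds.
import hashlib

def iterate_sha256(seed: bytes, rounds: int = 500_000) -> str:
    digest = seed
    for _ in range(rounds):
        digest = hashlib.sha256(digest).digest()
    return digest.hex()

# The seed here is a placeholder for the concatenated per-source hashes.
print(iterate_sha256(b"concatenated per-source hashes"))
```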
In practice, no one has access to the final output until everyone does, when a public commit is made. At that point it is a race to see who can fetch the commit first among those seeking an advantage. If your entropy needs are millisecond-sensitive with respect to when the output is revealed, then this is not the project for you.
> (2) insiders can know the inputs before others, and use that to submit a last extra input that manipulates the outcome; e.g. make the low-order 16 bits any desired value.
I addressed the concept of “insiders” above. There are sources that Observable Entropy collects that can be known to outsiders who want to try to manipulate the final output by sending in their own attack data. However, this will fail because the attacker cannot know the Drand beacon value the script will collect, or the timestamp at which the script runs; both are known only to the script itself at execution time.
I’m not aware of any attacks on multiple rounds of SHA-256 that would allow an attacker to predict which bits of the output would be modified to suit their desired outcome.
The ability to accept, and publish, user-provided entropy, even up to the last moments before the collection script starts, is a benefit. It allows anyone to submit entropy that will be included, and it doesn’t require the consumer to fully trust any of the sources of entropy that feed the final output; they only need to trust their own data. They can verify that their data was included by running the script locally and confirming the same entropy output.
Page 31 of this presentation (and the related academic paper) provides more information on the concepts and benefits of including open public input.
> (3) the diffusion of the outcome is not instantaneous thus even non-insiders can have a lead; and network manipulations can artificially grow that
If I understand what you are trying to express, the “diffusion of the outcome” is in fact instantaneous. The outcome can be known only to the generation script at the moment it has collected all the data sources, some of which are non-public but verifiable after the fact. Without the totality of this information it is infeasible to arrive at the final output hash. The only window of opportunity is between the moment when, e.g., the Drand beacon value is collected and the moment the hash concatenation and final hashing rounds begin.
The original poster’s needs could likely be fulfilled by using only the NIST Randomness Beacon or the Drand beacon alone, but that requires putting a certain degree of trust in each one's infrastructure and its owners. This solution diffuses that trust across multiple public and verifiable sources (including yourself), with all sources verifiable after the fact. This is a key differentiator from systems that, for example, observe natural phenomena that others cannot verify after the observation.
I’d be happy to hear more feedback on any remaining perceived weakness in my arguments or the system, and suggestions for improvement.