Score:6

A source of randomness that anyone can independently, conveniently and robustly access?

in flag

Does there exist a source of randomness that anyone in the world can independently, conveniently and robustly access? For example, the 10th decimal place of the temperature in Mexico City is sufficiently random. But it's inconvenient for Bob to access independently, and it can't be measured robustly anyways.

The source of randomness must also be secure, in that no one party controls it (or access to it), and it can't be reliably predicted. It must also be the same for each person that accesses it.

The application I'd like this for is something like the following problem: 10 million people access the same cryptographically secure random value without having to all sync up with each other (so Diffie-Hellman won't work).

in flag
By "(or access to it)" I meant that no one party can control access to it. So for example, relying on a particular website won't work because that single website controls access and could take it away. That being said, no one should have early access to it either.
Paul Uszak avatar
cn flag
Like a national public lottery, perchance?
Maarten Bodewes avatar
in flag
I've removed the "why" public randomness entirely as it led to unwanted and uninformative debate. Please assume that the public random values are required.
Score:4
cn flag

A possibility is to rely on jointly observable, tamper-proof and apparently random physical processes. There is one such phenomenon which is very commonly mentioned in the cryptographic community, though more as an indication of feasibility than as a concrete and well-fleshed-out proposal: extracting common randomness from the dark spots of the Sun. Doing so has all kinds of nice properties, but at least one (severe) issue: there is a good probability that the common randomness extracted won't be exactly the same between all participants, but will instead only be very close. In some scenarios, this can be shown to suffice; see e.g. this paper on the issue.
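To see why "very close" is a real problem, here is a minimal sketch (with made-up readings) showing that hashing two nearly identical observations yields completely unrelated values, so naive hashing cannot extract common randomness from such measurements:

```python
import hashlib

# Hypothetical sunspot-area readings from two observers; the values are
# made up and differ only by measurement noise in the last digit.
reading_alice = "1523.000412"
reading_bob   = "1523.000413"

print(hashlib.sha256(reading_alice.encode()).hexdigest())
print(hashlib.sha256(reading_bob.encode()).hexdigest())
# The two digests share nothing: reconciling nearly-equal observations
# needs dedicated machinery, as in the paper linked above.
```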

Paul Uszak avatar
cn flag
Not one for Icelanders then?
Geoffroy Couteau avatar
cn flag
Yup, or at least not via direct observation from home. But they already have a beautiful country and rank very high on life quality: they can't have it all!
Score:2
cn flag

The exact official trade price of a publicly traded stock at a given time is very expensive to control: throwing a lot of money at it will definitely let you influence it, but not control all the digits. Average over multiple stocks and you get a value that's public and effectively impossible to fully control or predict. Of course there's significant bias in the digits, so as with any source of randomness, hash the input to obtain a good distribution.

The opening value of a stock exchange index gives you one pseudorandom value every business day. An example of a well-defined standard with readily available implementations based on this principle is the geohashing algorithm which is based on the opening price of the Dow Jones index, hashed with MD5. (MD5 is unsuitable as a cryptographic hash, but here it's used as a pseudorandom function, which is fine.)
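As a rough illustration of how geohashing turns the opening price into usable pseudorandomness, here is a sketch in Python. The date/price string format follows the published algorithm; the opening price below is just an illustrative placeholder, and MD5 is used purely as a pseudorandom function:

```python
import hashlib

def geohash_fractions(date: str, djia_open: str) -> tuple[float, float]:
    """Derive two pseudorandom fractions in [0, 1) from the date and the
    Dow Jones opening price, in the style of the geohashing algorithm."""
    digest = hashlib.md5(f"{date}-{djia_open}".encode()).hexdigest()
    # Split the 32 hex digits in half and read each half as a fraction.
    first = int(digest[:16], 16) / 16**16
    second = int(digest[16:], 16) / 16**16
    return first, second

# Illustrative placeholder price, not a real quote:
print(geohash_fractions("2021-08-25", "35000.00"))
```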

fgrieu avatar
ng flag
Nice, but this won't be available to all at exactly the same time. There are even movies/series about knowing stock prices before others, and a lot of truth behind the reality of the problem. And a wide hash where the value of a single stock is the only variable will be distinguishable from random with a large advantage, hours before.
Gilles 'SO- stop being evil' avatar
cn flag
@fgrieu Indeed it depends on the application. If you're going to do short-term trading, a few milliseconds of latency is a big deal. For preparing an expedition within a hundred kilometer radius, having an extra few seconds of headstart doesn't matter.
in flag
I also considered using the stock market. But the thing is: would you consider it, or access to it, controlled by a (relative) few parties? In most cases it's probably fine, but it does mean you're relying on the US and Wall Street institutions for access. I'd like something more universal, like the number of flashes emitted by a celestial body, that anyone can individually access without reliance on any institution or party.
kodlu avatar
sa flag
IMHO, you also need to consider the technological sophistication and equipment required to access "number of flashes emitted by a celestial body" reliably and universally.
Gilles 'SO- stop being evil' avatar
cn flag
@chausies Anybody can manipulate the stock market by trading shares. But to control a stock index with enough precision to reduce the set of values to a small selection, you'd have to manipulate _all_ of it, or almost. For Wall Street institutions to merely DoS the Dow Jones would seriously hurt the US economy. For them to control the Dow Jones would be even more expensive. It's just not feasible. Keep in mind that you would need to control all the prices of all the stocks up to the last decimal. This is radically different from manipulating a few stocks for financial gain.
in flag
@Gilles'SO-stopbeingevil' I agree controlling the stock market to any significant precision is basically impossible. My main concern is that **access** to the value of American stock prices is controlled by a relative few parties, and not as freely available as say observing celestial bodies.
Score:2
cn flag

I'd suggest taking a look at the Truestamp Observable Entropy project. We recently created this to address a need for the type of randomness you're asking about.

Observable Entropy automatically collects randomness from publicly verifiable sources every five minutes, storing and hashing the contents of the retrieved data. The hashes of the individual data source files are then combined and hashed deterministically, resulting in a final signed SHA2-256 hash representing the totality of the collected entropy as a new verifiable public random value.
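Roughly, the construction described above has the following shape. This is only an illustrative sketch under my own assumptions (SHA-256 per file, digests combined in sorted order); the exact ordering, encoding, and signing are defined by the project's own scripts:

```python
import hashlib
from pathlib import Path

def combined_entropy(source_files: list[Path]) -> str:
    """Hash each collected source file, then hash the concatenation of the
    per-file digests in a fixed (sorted) order to obtain one deterministic
    public random value. Illustration only, not Truestamp's actual code."""
    per_source = sorted(hashlib.sha256(p.read_bytes()).hexdigest()
                        for p in source_files)
    return hashlib.sha256("".join(per_source).encode()).hexdigest()
```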

The system currently collects data from the following public sources at each interval:

  • Bitcoin blockchain latest block header
  • Stellar Blockchain latest block header
  • Ethereum blockchain latest block header
  • Drand Random Beacon
  • NIST Randomness Beacon
  • Hacker News, top 10 stories and content links
  • Timestamp UTC

Each of these, with the exception of the timestamp, which is there for convenience, is considered a strong source of randomness. For the truly paranoid, you can provide your own entropy to add to the mix, in the form of 32-byte hex strings representing e.g. a hash or random bytes. That way you don't need to fully trust any of the other sources.
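For example, a contribution in the expected 32-byte hex form could be generated like this (just a sketch; how it is submitted is whatever mechanism the project documents):

```python
import secrets

# 64 hex characters = 32 bytes of locally generated randomness.
my_entropy = secrets.token_hex(32)
print(my_entropy)
```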

All of the data collected at each interval, as well as the resultant hashes, are committed to a public Github repository. If you clone the repo, scripts are provided that allow you to recreate the entropy hash from the stored data yourself. You can also verify the public key signature on the entropy to ensure that it was signed by Truestamp. Instructions are also provided for how to verify each public source of entropy or the signature yourself.

It's easy for almost anyone to maintain and sync their own clone of this repo, ensuring that everyone has the same data, or can reference the entropy at a point in time. For example, you and a group of others can agree to use the latest value at a specific point in time in the future. At that time you will all be looking at the data stored in the same Git commit, which should normally be no more than five minutes old. By storing it in Git you also get all of the integrity properties that Git offers, such as chained commit hashes; if someone pushed an entirely new data set to Github, existing clones would break, which would be a clear indicator of a bad actor.

In addition, the historical, and latest entropy values are available via a public API (which is simply a proxy designed to read the latest raw data from the Github Repo and has no cache or data store of its own). Here's a sample of the output from https://entropy.truestamp.com/latest.

I believe this meets each of your original requirements:

  • no one party controls it (or access to it)
  • it can't be reliably predicted
  • it must also be the same for each person that accesses it

At Truestamp we'll be using this source of verifiable randomness to help prove, in an independently verifiable way, that data committed to our system was created after a specific point in time.

We'd love to get your feedback, as this is a new and somewhat experimental service which has been operating reliably now for about two months.

UPDATED 08/25/2021 to address comment by @fgrieu:

Thanks for the comment @fgrieu. I’ll try to address your concerns, or ask for more clarification on the potential issues you feel may be present.

(1) insiders can know the outcome before others

In this implementation, there are no “insiders” who have access to the final data that is hashed to form the final entropy. The collection of the data is done using automation provided by Github Actions workflows. The data collected, like the signed Drand beacon, is not known to anyone publicly until it is collected (everyone gets a new signed random beacon). In this case the only insider would be someone at Github who has access to their infrastructure and can extract the contents of memory for this script after the point when it runs 500,000 rounds of SHA-256. Even in this unlikely scenario they would have access to the final entropy value only several milliseconds before it is committed and made publicly available to all in the Github repository. This project, which collects entropy every five minutes, is likely not sensitive to this level of time granularity. There may be scenarios where an early (~1s) preview of the final value would be useful enough to an attacker to force a compromise of Github. I don’t think the use cases this project is intended for are affected by this though.

In practice, no one has access to the final output until everyone does, when a public commit is made. At that point it would be a race to see who can fetch the commit first, for those seeking an advantage. I would suggest that if your entropy needs are millisecond-sensitive with respect to when the output is revealed, then this is not the project for you.

(2) insiders can know the inputs before others, and use that to submit a last extra input that manipulates the outcome; e.g. make the low-order 16 bits any desired value.

I addressed the concept of ‘insiders’ above. There are sources that Observable Entropy collects that can be known to outsiders, who might try to manipulate the final output by sending in their own attack data. However, this will fail, because the attacker cannot know the script-collected values of the Drand beacon or the timestamp when the script runs, both of which are known only to the script itself at execution time.

I’m not aware of any attacks on multiple rounds of SHA2-256 that would allow an attacker to predict which bits of the output would be modified to suit their desired outcome.

The ability to accept, and publish, user-provided entropy, even up to the last moments before the collection script starts, is a benefit. This allows anyone to submit entropy that will be included and doesn’t require the consumer to have full trust in any of the sources of entropy that feed the final output. They only need to trust their own data. They can verify that their data was included by running the script locally to confirm the same entropy output.

Page 31 of this presentation (and the related academic paper) provides more information on the concepts and benefits of including open public input.

(3) the diffusion of the outcome is not instantaneous thus even non-insiders can have a lead; and network manipulations can artificially grow that

If I understand what you are trying to express, the “diffusion of the outcome” is in fact instantaneous. The outcome can be known only to the generation script at the moment it has collected all the data sources, some of which are non-public but verifiable after the fact. Without the totality of this information it is infeasible to arrive at the final output hash. The window of opportunity is between the time when e.g. the Drand beacon value is collected, and the hash concatenation and final hash rounds have begun.

The original poster’s needs could likely be fulfilled by using only the NIST Randomness Beacon or the Drand beacon alone. But this requires putting a certain degree of trust in the infrastructure and its owners for each. This solution diffuses that trust across multiple public and verifiable sources (including yourself), with all sources verifiable after the fact. This is a key differentiator from systems that, for example, observe natural phenomena which do not allow others to verify after observation.
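For reference, fetching a single beacon directly is already trivial. The sketch below assumes the League of Entropy's public drand HTTP endpoint at https://api.drand.sh/public/latest, which (to my knowledge) returns a JSON object containing at least a round number and a hex-encoded randomness value:

```python
import json
import urllib.request

# Assumption: https://api.drand.sh/public/latest returns JSON with
# "round" and "randomness" fields for the latest drand beacon.
with urllib.request.urlopen("https://api.drand.sh/public/latest") as resp:
    beacon = json.load(resp)

print(beacon["round"], beacon["randomness"])
```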

I’d be happy to hear more feedback on any remaining perceived weakness in my arguments or the system, and suggestions for improvement.

fgrieu avatar
ng flag
There are three potential weaknesses: (1) insiders can know the outcome before others (2) insiders can know the inputs before others, and use that to submit a last extra input that manipulates the outcome; e.g. make the low-order 16 bits any desired value. (3) the diffusion of the outcome is not instantaneous thus even non-insiders can have a lead; and network manipulations can artificially grow that.
cn flag
Thanks for the comment @fgrieu, I've added a rather lengthy response to my original response.
Paul Uszak avatar
cn flag
Okay, this is nuanced and edging into real world stuff beyond this forum's scope, rather than playing with kiddy math. GitHub is a wholly owned subsidiary of µsoft. µsoft is subject to the Cloud Act and the Patriot Act, and much other legislation to guarantee patriotism. And they're happy with that given their US contracts. It's like operating in China. Therefore it falls flat. No downvote as it's very interesting technically, but a non-starter politically/practically.
cn flag
I would point out: 1) OE doesn't rely on, or put trust in, Github. It is a useful vessel for scheduled execution and for making the output widely available. Compromise of GH could offer nothing more than a few seconds' head start; it cannot corrupt the data, which comes from verifiable external sources. 2) The entire logic for creation/verification is baked into a single Deno TypeScript source file, `cli.ts`. 3) You can fork and run that command in the environment you choose and expose the output through any method of choice. Deno is cross-platform and Git runs fine peer to peer. Private Tailscale net? Sure.
Score:1
br flag

I know you said that "without having to all sync up with each other" but other people mentioned the blockchain so I wanted to mention a cool paper I read:

https://jbonneau.com/doc/BGB17-IEEESB-proof_of_delay_ethereum.pdf

This paper solves problems with using block hashes as randomness. If you simply use the block hash as your randomness source, you invite miners to affect your randomness (in a lottery, they could simulate the lottery to check if they've won and discard the block if they hadn't). Using VDFs (and another protocol to ensure efficient verification) you can ensure that no miner could simulate how you're going to use the randomness and thus they cannot know which blocks to discard and cannot affect your randomness.

Or at least that's my understanding of it.
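As a toy illustration of the intuition (not of the paper's actual construction), the idea is to make the step from block hash to random output deliberately slow and sequential, so a miner cannot evaluate it before deciding whether to publish or discard a freshly mined block. A real VDF also makes the result cheap to verify, which plain iterated hashing does not:

```python
import hashlib

def slow_beacon(block_hash: bytes, iterations: int = 10_000_000) -> bytes:
    """Toy stand-in for a delay function: derive the public random value by
    hashing the block hash sequentially many times. Unlike a real VDF, this
    gives no efficient way for others to verify the result."""
    out = block_hash
    for _ in range(iterations):
        out = hashlib.sha256(out).digest()
    return out

# A miner wanting to bias the outcome would have to finish this sequential
# computation within the block interval to learn whether discarding the
# block is worthwhile.
```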
