DAOs Are Not Scary, Part 1: Self-Enforcing Contracts And Factum Law

Many of the concepts that we promote over in Ethereum land may seem incredibly futuristic, and perhaps even frightening, at times. We talk about so-called “smart contracts” that execute themselves without any need, or any opportunity, for human intervention or involvement, people forming Skynet-like “decentralized autonomous organizations” that live entirely on the cloud and yet control powerful financial resources and can incentivize people to do very real things in the physical world, decentralized “math-based law”, and a seemingly utopian quest to create some kind of fully trust-free society. To the uninformed user, and especially to those who have not even heard of plain old Bitcoin, it can be hard to see how these kinds of things are possible, and, if they are, why they could possibly be desirable. The purpose of this series will be to dissect these ideas in detail, and show exactly what we mean by each one, discussing its properties, advantages and limitations.

The first installment of the series will talk about so-called “smart contracts”. Smart contracts are an idea that has been around for several decades, but was given its current name and first substantially brought to the (cryptography-inclined) public’s attention by Nick Szabo in 2005. In essence, the definition of a smart contract is simple: a smart contract is a contract that enforces itself. That is to say, whereas a regular contract is a piece of paper (or more recently PDF document) containing text which implicitly asks for a judge to order a party to send money (or other property) to another party under certain conditions, a smart contract is a computer program that can be run on hardware which automatically executes those conditions. Nick Szabo uses the example of a vending machine:

A canonical real-life example, which we might consider to be the primitive ancestor of smart contracts, is the humble vending machine. Within a limited amount of potential loss (the amount in the till should be less than the cost of breaching the mechanism), the machine takes in coins, and via a simple mechanism, which makes a freshman computer science problem in design with finite automata, dispense change and product according to the displayed price. The vending machine is a contract with bearer: anybody with coins can participate in an exchange with the vendor. The lockbox and other security mechanisms protect the stored coins and contents from attackers, sufficiently to allow profitable deployment of vending machines in a wide variety of areas.

Smart contracts are the application of this concept to, well, lots of things. We can have smart financial contracts that automatically shuffle money around based on certain formulas and conditions, smart domain name sale orders that give the domain to whoever first sends in $200, perhaps even smart insurance contracts that control bank accounts and automatically pay out based on some trusted source (or combination of sources) supplying data about real-world events.

Smart Property

At this point, however, one obvious question arises: how are these contracts going to be enforced? Just like traditional contracts, which are not worth the paper they’re written on unless there’s an actual judge backed by legal power enforcing them, smart contracts need to be “plugged in” to some system in order to actually have the power to do anything. The most obvious, and oldest, solution is hardware, an idea that also goes by the name “smart property”. Nick Szabo’s vending machine is the canonical example here. Inside the vending machine, there is a sort of proto-smart-contract, containing a set of computer code that looks something like this:

if button_pressed == "Coca Cola" and money_inserted >= 1.75:
    release("Coca Cola")
    return_change(money_inserted - 1.75)

elif button_pressed == "Aquafina Water" and money_inserted >= 1.25:
    release("Aquafina Water")
    return_change(money_inserted - 1.25)

elif ...

The contract has four “hooks” into the outside world: the button_pressed and money_inserted variables as input, and the release and return_change commands as output. All four of these depend on hardware, although we focus on the last three because human input is generally considered to be a trivial problem. If the contract were running on an Android phone, it would be useless; the Android phone has no way of knowing how much money was inserted into a slot, and certainly cannot release Coca Cola bottles or return change. On a vending machine, on the other hand, the contract carries some “force”, backed by the vending machine’s internal Coca Cola holdings and its physical security preventing people from just taking the Coca Cola without following the rules of the contract.

Another, more futuristic, application of smart property is rental cars: imagine a world where everyone has their own private key on a smartphone, and there is a car such that when you pay $100 to a certain address the car automatically starts responding to commands signed by your private key for a day. The same principle can also be applied to houses. If that sounds far-fetched, keep in mind that office buildings are largely smart property already: access is controlled by access cards, and the question of which (if any) doors each card is valid for is determined by a piece of code linked to a database. And if the company has an HR system that automatically processes employment contracts and activates new employees’ access cards, then that employment contract is, to a slight extent, a smart contract.

Smart Money and Factum Society

However, physical property is very limited in what it can do. Physical property has a limited amount of security, so you cannot practically do anything interesting with more than a few tens of thousands of dollars with a smart-property setup. And ultimately, the most interesting contracts involve transferring money. But how can we actually make that work? Right now, we basically can’t. We can, theoretically, give contracts the login details to our bank accounts, and then have the contract send money under some conditions, but the problem is that this kind of contract is not really “self-enforcing”. The party making the contract can always simply turn the contract off just before payment is due, or drain their bank account, or even simply change the password to the account. Ultimately, no matter how the contract is integrated into the system, someone has the ability to shut it off.

How can we solve the problem? Ultimately, the answer is one that is radical in the context of our wider society, but already very much old news in the world of Bitcoin: we need a new kind of money. So far, the evolution of money has followed three stages: commodity money, commodity-backed money and fiat money. Commodity money is simple: it’s money that is valuable because it is also simultaneously a commodity that has some “intrinsic” use value. Silver and gold are perfect examples, and in more traditional societies we also have tea, salt (etymology note: this is where the word “salary” comes from), seashells and the like. Next came commodity-backed money – banks issuing certificates that are valuable because they are redeemable for gold. Finally, we have fiat money. The “fiat” in “fiat money” is just like in “fiat lux”, except instead of God saying “let there be light” it’s the federal government saying “let there be money”. The money has value largely because the government issuing it accepts that money, and only that money, as payment for taxes and fees, alongside several other legal privileges.

With Bitcoin, however, we have a new kind of money: factum money. The difference between fiat money and factum money is this: whereas fiat money is put into existence, and maintained, by a government (or, theoretically, some other kind of agency) producing it, factum money just is. Factum money is simply a balance sheet, with a few rules on how that balance sheet can be updated, and that money is valid among that set of users which decides to accept it. Bitcoin is the first example, but there are more. For example, one can have an alternative rule, which states that only bitcoins coming out of a certain “genesis transaction” count as part of the balance sheet; this is called “colored coins”, and is also a kind of factum money (unless those colored coins are fiat or commodity-backed).

The main promise of factum money, in fact, is precisely that it meshes so well with smart contracts. The main problem with smart contracts is enforcement: if a contract says to send $200 to Bob if X happens, and X does happen, how do we ensure that $200 actually gets sent to Bob? The solution with factum money is incredibly elegant: the definition of the money, or more precisely the definition of the current balance sheet, is the result of executing all of the contracts. Thus, if X does happen, then everyone will agree that Bob has the extra $200, and if X does not happen then everyone will agree that Bob has whatever Bob had before.
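To make this concrete, here is a minimal Python sketch (purely illustrative – the names compute_balance_sheet, bob_contract and the event flag are mine, not part of any real protocol) of a balance sheet defined as nothing more than the result of executing all of the contracts:

def compute_balance_sheet(initial_balances, contracts, events):
    # The "money" is nothing but the output of replaying every contract
    balances = dict(initial_balances)
    for contract in contracts:
        contract(balances, events)
    return balances

def bob_contract(balances, events):
    # Encodes "send $200 to Bob if X happens"
    if events.get("X"):
        balances["Alice"] = balances.get("Alice", 0) - 200
        balances["Bob"] = balances.get("Bob", 0) + 200

sheet = compute_balance_sheet({"Alice": 500}, [bob_contract], {"X": True})
# sheet == {"Alice": 300, "Bob": 200}

Anyone who replays the same contracts against the same history computes the same balance sheet, which is exactly why no external enforcement mechanism is needed.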

This is actually a much more revolutionary development than you might think at first; with factum money, we have created a way for contracts, and perhaps even law in general, to work, and be effective, without relying on any kind of mechanism whatsoever to enforce it. Want a $100 fine for littering? Then define a currency so that you have 100 units less if you litter, and convince people to accept it. Now, that particular example is very far-fetched, and likely impractical without a few major caveats which we will discuss below, but it shows the general principle, and there are many more moderate examples of this kind of principle that definitely can be put to work.

Just How Smart Are Smart Contracts?

Smart contracts are obviously very effective for any kind of financial applications, or more generally any kind of swaps between two different factum assets. One example is a domain name sale; a domain, like google.com, is a factum asset, since it’s backed by a database on a server that only carries any weight because we accept it, and money can obviously be factum as well. Right now, selling a domain is a complicated process that often requires specialized services; in the future, you may be able to package up a sale offer into a smart contract and put it on the blockchain, and if anyone takes it both sides of the trade will happen automatically – no possibility of fraud involved. Going back to the world of currencies, decentralized exchange is another example, and we can also do financial contracts such as hedging and leverage trading.

However, there are places where smart contracts are not so good. Consider, for example, the case of an employment contract: A agrees to do a certain task for B in exchange for payment of X units of currency C. The payment part is easy to smart-contract-ify. However, there is a part that is not so easy: verifying that the work actually took place. If the work is in the physical world, this is pretty much impossible, since blockchains don’t have any way of accessing the physical world. Even if it’s a website, there is still the question of assessing quality, and although computer programs can use machine learning algorithms to judge such characteristics quite effectively in certain cases, it is incredibly hard to do so in a public contract without opening the door for employees “gaming the system”. Sometimes, a society ruled by algorithms is just not quite good enough.

Fortunately, there is a moderate solution that can capture the best of both worlds: judges. A judge in a regular court has essentially unlimited power to do what they want, and the process of judging does not have a particularly good interface; people need to file a suit, wait a significant length of time for a trial, and the judge eventually makes a decision which is enforced by the legal system – itself not a paragon of lightning-quick efficiency. Private arbitration often manages to be cheaper and faster than courts, but even there the problems are still the same. Judges in a factum world, on the other hand, are very much different. A smart contract for employment might look like this:

if says(B,"A did the job") or says(J,"A did the job"):
    send(200, A)

elif says(A,"A did not do the job") or says(J,"A did not do the job"):
    send(200, B)

says is a signature verification algorithm; says(P,T) basically checks if someone has submitted a message with text T and a digital signature that verifies using P’s public key. So how does this contract work? First, the employer would send 200 currency units into the contract, where they would sit in escrow. In most cases, the employer and employee are honest, so either A quits and releases the funds back to B by signing a message saying “A did not do the job” or A does the job, B verifies that A did the job, and the contract releases the funds to A. However, if A does the job, and B disagrees, then it’s up to judge J to say that either A did the job or A did not do the job.
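For illustration, here is roughly what says(P,T) might look like as ordinary Python, using the ecdsa library (a hedged sketch: in a real system this check would run inside the blockchain’s scripting environment rather than call an external library, and submitted_messages stands in for the signed messages sent into the contract):

import ecdsa  # pip install ecdsa

def says(pubkey, text, submitted_messages):
    # Has anyone submitted message `text` with a valid signature under `pubkey`?
    for message, signature in submitted_messages:
        if message != text:
            continue
        try:
            pubkey.verify(signature, message.encode())
            return True
        except ecdsa.BadSignatureError:
            pass  # wrong key or forged signature; keep looking
    return False

# Example: judge J signs a verdict
sk = ecdsa.SigningKey.generate(curve=ecdsa.SECP256k1)
verdict = "A did the job"
submitted = [(verdict, sk.sign(verdict.encode()))]
assert says(sk.get_verifying_key(), verdict, submitted)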

Note that J’s power is very carefully delineated; all that J has the right to do is say that either A did the job or A did not do the job. A more sophisticated contract might also give J the right to grant judgements within the range between the two extremes. J does not have the right to say that A actually deserves 600 currency units, or that by the way the entire relationship is illegal and J should get the 200 units, or anything else outside of the clearly defined boundaries. And J’s power is enforced by factum – the contract contains J’s public key, and thus the funds automatically go to A or B within those boundaries. The contract can even require messages from 2 out of 3 judges, or it can have separate judges judge separate aspects of the work and have the contract automatically assign A’s work a quality score based on those ratings. Any contract can simply plug in any judge in exactly the way that they want, whether to judge the truth or falsehood of a specific fact, provide a measurement of some variable, or be one of the parties facilitating the arrangement.

How will this be better than the current system? In short, what this introduces is “judges as a service”. Now, in order to become a “judge” you need to get hired at a private arbitration firm or a government court or start your own. In a cryptographically enabled factum law system, being a judge simply requires having a public key and a computer with internet access. As counterintuitive as it sounds, not all judges need to be well-versed in law. Some judges can specialize in, for example, determining whether or not a product was shipped correctly (ideally, the postal system would do this). Other judges can verify the completion of employment contracts. Others would appraise damages for insurance contracts. It would be up to the contract writer to plug in judges of each type in the appropriate places in the contract, and the part of the contract that can be defined purely in computer code will be.

And that’s all there is to it.

The next part of this series will talk about the concept of trust, and what cryptographers and Bitcoin advocates really mean when they talk about building a “trust-free” society.

Ethereum Scalability and Decentralization Updates

Scalability is now at the forefront of the technical discussion in the cryptocurrency scene. The Bitcoin blockchain is currently over 12 GB in size, requiring a period of several days for a new bitcoind node to fully synchronize, the UTXO set that must be stored in RAM is approaching 500 MB, and continued improvements to the source code are simply not enough to alleviate the trend. With every passing year, it becomes more and more difficult for an ordinary user to locally run a fully functional Bitcoin node on their own desktop, and even as the price, merchant acceptance and popularity of Bitcoin have skyrocketed, the number of full nodes in the network has essentially stayed the same since 2011. The 1 MB block size limit currently puts a theoretical cap on this growth, but at a high cost: the Bitcoin network cannot process more than 7 transactions per second. If the popularity of Bitcoin jumps up tenfold yet again, then the limit will force the transaction fee up to nearly a dollar, making Bitcoin less useful than Paypal. If there is one problem that an effective implementation of cryptocurrency 2.0 needs to solve, it is this.

The reason why we in the cryptocurrency space are having these problems, and are making so little headway toward coming up with a solution, is that there is one fundamental issue with all cryptocurrency designs that needs to be addressed. Out of all of the various proof of work, proof of stake and reputational consensus-based blockchain designs that have been proposed, not a single one has managed to overcome the same core problem: that every single full node must process every single transaction. Having nodes that can process every transaction, even up to a level of thousands of transactions per second, is possible; centralized systems like Paypal, Mastercard and banking servers do it just fine. However, the problem is that it takes a large quantity of resources to set up such a server, and so there is no incentive for anyone except a few large businesses to do it. Once that happens, then those few nodes are potentially vulnerable to profit motive and regulatory pressure, and may start making theoretically unauthorized changes to the state, like giving themselves free money, and all other users, who are dependent on those centralized nodes for security, would have no way of proving that the block is invalid since they do not have the resources to process the entire block.

In Ethereum, as of this point, we have no fundamental improvements over the principle that every full node must process every transaction. There have been ingenious ideas proposed by various Bitcoin developers involving multiple merge-mined chains with a protocol for moving funds from one chain to another, and these will be a large part of our cryptocurrency research effort, but at this point research into how to implement this optimally is not yet mature. However, with the introduction of Block Protocol 2.0 (BP2), we have a protocol that, while not getting past the fundamental blockchain scalability flaw, does get us partway there: as long as at least one honest full node exists (and, for anti-spam reasons, has at least 0.01% mining power or ether ownership), “light clients” that only download a small amount of data from the blockchain can retain the same level of security as full nodes.

What Is A Light Client?

The basic idea behind a light client is that, thanks to a data structure present in Bitcoin (and, in a modified form, Ethereum) called a Merkle tree, it is possible to construct a proof that a certain transaction is in a block, such that the proof is much smaller than the block itself. Right now, a Bitcoin block is about 150 KB in size; a Merkle proof of a transaction is about half a kilobyte. If Bitcoin blocks become 2 GB in size, the proofs might expand to a whole kilobyte. To construct a proof, one simply needs to follow the “branch” of the tree all the way up from the transaction to the root, and provide the nodes on the side every step of the way. Using this mechanism, light clients can be assured that transactions sent to them (or from them) actually made it into a block.
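As a rough illustration, here is a minimal Merkle branch verifier in Python (a simplified sketch: Bitcoin’s actual tree uses double-SHA256 over a particular serialization and has some extra rules, but the shape of the proof is the same):

import hashlib

def h(data):
    # Bitcoin-style double SHA256
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def verify_branch(tx_hash, proof, root):
    # `proof` is the list of sibling hashes on the path from the
    # transaction up to the root, with a flag for which side each is on
    current = tx_hash
    for sibling, sibling_is_left in proof:
        if sibling_is_left:
            current = h(sibling + current)
        else:
            current = h(current + sibling)
    return current == root

The proof length grows logarithmically with the number of transactions in the block, which is why even multi-gigabyte blocks would only need proofs of a kilobyte or so.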

This makes it substantially harder for malicious miners to trick light clients. If, in a hypothetical world where running a full node was completely impractical for ordinary users, a user falsely claimed to have sent 10 BTC to a merchant who lacks the resources to download the entire block, the merchant would not be helpless; they would ask for a proof that a transaction sending 10 BTC to them is actually in the block. If the attacker is a miner, they can potentially be more sophisticated and actually put such a transaction into a block, but have it spend funds (ie. UTXO) that do not actually exist. However, even here there is a defense: the light client can ask for a second Merkle tree proof showing that the funds that the 10 BTC transaction is spending also exist, and so on down to some safe block depth. From the point of view of a miner using a light client, this morphs into a challenge-response protocol: full nodes verifying transactions, upon detecting that a transaction spent an output that does not exist, can publish a “challenge” to the network, and other nodes (likely the miner of that block) would need to publish a “response” consisting of a Merkle tree proof showing that the outputs in question do actually exist in some previous block. However, there is one weakness in this protocol in Bitcoin: transaction fees. A malicious miner can publish a block giving themselves a 1000 BTC reward, and other miners running light clients would have no way of knowing that this block is invalid without adding up all of the fees from all of the transactions themselves; for all they know, someone else could have been crazy enough to actually add 975 BTC worth of fees.

BP2

With the previous Block Protocol 1.0, Ethereum was even worse; there was no way for a light client to even verify that the state tree of a block was a valid consequence of the parent state and the transaction list. In fact, the only way to get any assurances at all was for a node to run through every transaction and sequentially apply them to the parent state themselves. BP2, however, adds some stronger assurances. With BP2, every block now has three trees: a state tree, a transaction tree, and a stack trace tree providing the intermediate root of the state tree and the transaction tree after each step. This allows for a challenge-response protocol that, in simplified form, works as follows:

1. Miner M publishes block B. Perhaps the miner is malicious, in which case the block updates the state incorrectly at some point.
2. Light node L receives block B, and does basic proof of work and structural validity checks on the header. If these checks pass, then L starts off treating the block as legitimate, though unconfirmed.
3. Full node F receives block B, and starts doing a full verification process, applying each transaction to the parent state, and making sure that each intermediate state matches the intermediate state provided by the miner. Suppose that F finds an inconsistency at point k. Then, F broadcasts a “challenge” to the network consisting of the hash of B and the value k.
4. L receives the challenge, and temporarily flags B as untrustworthy.
5. If F’s claim is false, and the block is valid at that point, then M can produce a proof of localized consistency by showing a Merkle tree proof of point k in the stack trace, point k+1 in the stack trace, and the subset of Merkle tree nodes in the state and transaction tree that were modified during the process of updating from k to k+1. L can then verify the proof by taking M’s word on the validity of the block up to point k, manually running the update from k to k+1 (this consists of processing a single transaction), and making sure the root hashes match what M provided at the end. L would, of course, also check that the Merkle tree proof for the values at state k and k+1 is valid (a sketch of this check follows the list).
6. If F’s claim is true, then M would not be able to come up with a response, and after some period of time L would discard B outright.
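In Python-flavored pseudocode, step 5 from L’s side might look roughly like this (all helper names – merkle_verify, apply_transaction, the response fields – are illustrative assumptions, not a finalized API):

def check_response(stack_trace_root, k, response):
    # Verify the Merkle proofs for the claimed intermediate roots at k and k+1
    if not merkle_verify(stack_trace_root, k, response.roots_at_k):
        return False
    if not merkle_verify(stack_trace_root, k + 1, response.roots_at_k_plus_1):
        return False
    # Re-run the single transaction between the two checkpoints, using the
    # subset of tree nodes that M supplied
    recomputed = apply_transaction(response.roots_at_k, response.transaction,
                                   response.touched_nodes)
    # Accept M's response only if it reproduces the claimed roots at k+1
    return recomputed == response.roots_at_k_plus_1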

Note that currently the model is for transaction fees to be burned, not distributed to miners, so the weakness in Bitcoin’s light client protocol does not apply. However, even if we decided to change this, the protocol can easily be adapted to handle it; the stack trace would simply also keep a running counter of transaction fees alongside the state and transaction list. As an anti-spam measure, in order for F’s challenge to be valid, F needs to have either mined one of the last 10000 blocks or have held 0.01% of the total supply of ether for at least some period of time. If a full node sends a false challenge, meaning that a miner successfully responds to it, light nodes can blacklist the node’s public key.

Altogether, what this means is that, unlike Bitcoin, Ethereum will likely still be fully secure, including against fraudulent issuance attacks, even if only a small number of full nodes exist; as long as at least one full node is honest, verifying blocks and publishing challenges where appropriate, light clients can rely on it to point out which blocks are flawed. Note that there is one weakness in this protocol: you now need to know all transactions before processing a block, and adding new transactions requires substantial effort to recalculate the intermediate stack trace values, so the process of producing a block will be less efficient. However, it is likely possible to patch the protocol to get around this, and if it is possible then BP2.1 will have such a fix.

Blockchain-based Mining

We have not finalized the details of this, but Ethereum will likely use something similar to the following for its mining algorithm:

1. Let H[i] = sha3(sha3(block header without nonce) ++ nonce ++ i) for i in [0...15]
2. Let N be the number of transactions in the block.
3. Let T[i] be the (H[i] mod N)th transaction in the block.
4. Let S be the parent block state.
5. Apply T[0] ... T[15] to S, and let the resulting state be S'.
6. Let x = sha3(S'.root)
7. The block is valid if x * difficulty <= 2^256
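In Python, and with sha3, apply_transaction and the block fields treated as assumed helpers rather than a finalized API, the validity check might look like this:

def is_valid_proof_of_work(block, nonce, difficulty):
    S = block.parent_state
    N = len(block.transactions)
    for i in range(16):  # T[0] ... T[15]
        h_i = sha3(sha3(block.header_without_nonce) + nonce + bytes([i]))
        tx = block.transactions[int.from_bytes(h_i, "big") % N]
        S = apply_transaction(S, tx)  # requires access to the full state
    x = int.from_bytes(sha3(S.root), "big")
    return x * difficulty <= 2**256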

This has the following properties:

1. This is extremely memory-hard, even more so than Dagger, since mining effectively requires access to the entire blockchain. However, it is parallelizable with shared disk space, so it will likely be GPU-dominated, not CPU-dominated as Dagger was originally intended to be.
2. It is memory-easy to verify, since a proof of validity consists of only the relatively small subset of Patricia nodes that are used while processing T[0] ... T[15]
3. All miners essentially have to be full nodes; asking the network for block data for every nonce is prohibitively slow. Thus there will be a larger number of full nodes in Ethereum than in Bitcoin.
4. As a result of (3), one of the major motivations to use centralized mining pools, the fact that they allow miners to operate without downloading the entire blockchain, is nullified. The other main reason to use mining pools, the fact that they even out the payout rate, can be accomplished just as easily with the decentralized p2pool (which we will likely end up supporting with development resources)
5. ASICs for this mining algorithm are simultaneously ASICs for transaction processing, so Ethereum ASICs will help solve the scalability problem.

From here, there is only really one optimization that can be made: figuring out some way to get past the obstacle that every full node must process every transaction. This is a hard problem; a truly scalable and effective solution will take a while to develop. However, this is a strong start, and may even end up as one of the key ingredients to a final solution.

Important Statement regarding the Ether pre-sale

The Ethereum Project has had the incredible privilege to launch its PoC testnet and engage the crypto-currency community over the past two months. During our experiences, we’ve encountered a lot of passionate support and wonderful questions that have helped us refine our thoughts and goals, including the process we will eventually use to sell ether. That said, we have not finalized the structure and format for the ether presale, and thus we do not recommend, encourage, or endorse any attempt to sell, trade, or acquire ether.

We have recently become aware of peercover.com announcing a fundraising structure based in some way upon ether – they are in no way associated with the Ethereum project, do not speak for it, and are, in our opinion, doing a disservice to the Ethereum community by possibly leading their own clients into a situation that they don’t understand. Offering to sell ether that does not yet exist, in a way that misleads purchasers, can only be considered irresponsible at this point. Buyer beware.

We request that peercover.com cease offering ether forwards until more information is released on the Ethereum project and the potential value of the ether cryptofuel, and until lawyers clarify what the securities and regulatory issues might be in selling ether to the public in various countries.

Why Not Just Use X? An Instructive Example from Bitcoin

Bitcoin developer Gregory Maxwell writes the following on Reddit:

There is a design flaw in the Bitcoin protocol where its possible for a third party to take a valid transaction of yours and mutate it in a way which leaves it valid and functionally identical but with a different transaction ID. This greatly complicates writing correct wallet software, and it can be used abusively to invalidate long chains of unconfirmed transactions that depend on the non-mutant transaction (since transactions refer to each other by txid).

This issue arises from several sources, one of them being OpenSSL’s willingness to accept and make sense of signatures with invalid encodings. A normal ECDSA signature encodes two large integers, the encoding isn’t constant length— if there are leading zeros you are supposed to drop them.

It’s easy to write software that assumes the signature will be a constant length and then leave extra leading zeros in them.

This is a very interesting cautionary tale, and is particularly important because situations like these are part of the reason why we have made certain design decisions in our development philosophy. Specifically, the issue is this: many people continue to bring up the point that we are in many places unnecessarily reinventing the wheel, creating our own serialization format, RLP, instead of using the existing protobuf, and building an application-specific scripting language instead of “just using Lua”. This is a very valid concern; not-invented-here syndrome is a commonly-used pejorative, so doing such in-house development does require justification.

And the cautionary tale quoted above provides precisely the perfect example of that justification. External technologies, whether protobuf, Lua or OpenSSL, are very good, and have years of development behind them, but in many cases they were never designed with the perfect consensus, determinism and cryptographic integrity in mind that cryptocurrencies require. The OpenSSL situation above is the perfect example; aside from cryptocurrencies, there really are no other situations where the fact that you can take a valid signature and turn it into another valid signature with a different hash is a significant problem, and yet here it’s fatal. One of our core principles in Ethereum is simplicity; the protocol should be as simple as possible, and the protocol should not contain any black boxes. Every single feature of every single sub-protocol should be precisely 100% documented in the whitepaper or wiki, and implemented using that as a specification (ie. test-driven development). Doing this for an existing software package is arguably almost as hard as building an entirely new package from scratch; in fact, it may even be harder, since existing software packages often have more complexity than they need in order to be feature-complete, whereas our alternatives do not – read the protobuf spec and compare it to the RLP spec to understand what I mean.
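For reference, the core of RLP is small enough to sketch in a few lines of Python (simplified to the short-length cases; the full spec adds prefixes for payloads longer than 55 bytes):

def rlp_encode(item):
    if isinstance(item, bytes):
        if len(item) == 1 and item[0] < 0x80:
            return item  # a single byte below 0x80 encodes as itself
        return bytes([0x80 + len(item)]) + item  # short string: 0x80+len prefix
    # a list: concatenate the item encodings and prefix with 0xc0+len
    payload = b"".join(rlp_encode(x) for x in item)
    return bytes([0xc0 + len(payload)]) + payload

assert rlp_encode(b"dog") == b"\x83dog"
assert rlp_encode([b"cat", b"dog"]) == b"\xc8\x83cat\x83dog"

There are no optional fields, no varints and no schema registry – nothing that two implementations could plausibly interpret differently.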

Note that the above principle has its limits. For example, we are certainly not foolish enough to start inventing our own hash algorithms, instead using the universally acclaimed and well-vetted SHA3, and for signatures we’re using the same old secp256k1 as Bitcoin, although we’re using RLP to store the v,r,s triple (the v is an extra two bits for public key recovery purposes) instead of the OpenSSL buffer protocol. These kinds of situations are the ones where “just using X” is precisely the right thing to do, because X has a clean and well-understood interface and there are no subtle differences between different implementations. The SHA3 of the empty string is c5d2460186...a470 in C++, in Python, and in Javascript; there’s no debate about it. In between these two extremes, it’s basically a matter of finding the right balance.
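For instance, using pycryptodome’s keccak module (one of several Python packages exposing the original Keccak-256, which is what Ethereum calls “SHA3”):

from Crypto.Hash import keccak  # pip install pycryptodome

k = keccak.new(digest_bits=256)
k.update(b"")
print(k.hexdigest())
# c5d2460186f7233c927e7db2dcc703c0e500b653ca82273b7bfad8045d85a470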

Cryptographic Code Obfuscation: Decentralized Autonomous Organizations Are About to Take a Huge Leap Forward

There have been a number of very interesting developments in cryptography in the past few years. Satoshi’s blockchain notwithstanding, perhaps the first major breakthrough after blinding and zero-knowledge proofs is fully homomorphic encryption, a technology which allows you to upload your data onto a server in an encrypted form so that the server can then perform calculations on it and send you back the results, all without having any idea what the data is. In 2013, we saw the beginnings of succinct computational integrity and privacy (SCIP), a toolkit pioneered by Eli Ben-Sasson in Israel that lets you cryptographically prove that you carried out some computation and got a certain output. On the more mundane side, we now have sponge functions, an innovation that substantially simplifies the previous mess of hash functions, stream ciphers and pseudorandom number generators into a beautiful, single construction. Most recently of all, however, there has been another major development in the cryptographic scene, and one whose applications are potentially very far-reaching both in the cryptocurrency space and for software as a whole: obfuscation.

The idea behind obfuscation is an old one, and cryptographers have been trying to crack the problem for years. The problem is this: is it possible to somehow encrypt a program to produce another program that does the same thing, but which is completely opaque so there is no way to understand what is going on inside? The most obvious use case is proprietary software – if you have a program that incorporates advanced algorithms, and want to let users use the program on specific inputs without being able to reverse-engineer the algorithm, the only way to do such a thing is to obfuscate the code. Proprietary software is for obvious reasons unpopular among the tech community, so the idea has not seen a lot of enthusiasm, a problem compounded by the fact that each and every time a company would try to put an obfuscation scheme into practice it would quickly get broken. In 2001, researchers put what might perhaps seem to be a final nail in the coffin: a mathematical proof, using arguments vaguely similar to those used to show the unsolvability of the halting problem, that a general purpose obfuscator that converts any program into a “black box” is impossible.

At the same time, however, the cryptography community began to follow a different path. Understanding that the “black box” ideal of perfect obfuscation will never be achieved, researchers set out to instead aim for a weaker target: indistinguishability obfuscation. The definition of an indistinguishability obfuscator is this: given two programs A and B that compute the same function, if an effective indistinguishability obfuscator O computes two new programs X=O(A) and Y=O(B), given X and Y there is no (computationally feasible) way to determine which of X and Y came from A and which came from B. In theory, this is the best that anyone can do; if there is a better obfuscator, P, then if you put A and P(A) through the indistinguishability obfuscator O, there would be no way to tell O(A) and O(P(A)) apart, meaning that the extra step of adding P could not hide any information about the inner workings of the program that O does not hide as well. Creating such an obfuscator is the problem which many cryptographers have occupied themselves with for the last five years. And in 2013, UCLA cryptographer Amit Sahai, homomorphic encryption pioneer Craig Gentry and several other researchers figured out how to do it.
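Stated slightly more formally (my paraphrase of the standard definition, not a quote from the paper): for every pair of equal-size programs A and B that compute the same function, and every probabilistic polynomial-time distinguisher D,

|Pr[D(O(A)) = 1] − Pr[D(O(B)) = 1]| ≤ negl(λ)

where negl(λ) denotes a function negligible in the security parameter λ.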

Does the indistinguishability obfuscator actually hide private data inside the program? To see what the answer is, consider the following. Suppose your secret password is bobalot_13048, and the SHA256 of the password starts with 00b9bbe6345de82f. Now, construct two programs. A just outputs 00b9bbe6345de82f, whereas B actually stores bobalot_13048 inside, and when you run it it computes SHA256(bobalot_13048) and returns the first 16 hex digits of the output. According to the indistinguishability property, O(A) and O(B) are indistinguishable. If there were some way to extract bobalot_13048 from O(B), the same method would therefore have to extract bobalot_13048 from O(A) – but A contains no information about the password at all, so this essentially implies that you can break SHA256 (or, by extension, any other hash function used in its place). By standard assumptions, this is impossible, so the obfuscator must also make it impossible to uncover bobalot_13048 from O(B). Thus, we can be pretty sure that Sahai’s obfuscator does actually obfuscate.
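In code form, the two programs are trivial (the password and the hash prefix are the article’s illustrative values; nothing is asserted here about the real SHA256 of this string):

import hashlib

CLAIMED_PREFIX = "00b9bbe6345de82f"  # illustrative value from the text

def program_a():
    # A stores only the hash prefix itself
    return CLAIMED_PREFIX

def program_b():
    # B stores the actual password and recomputes the prefix
    password = "bobalot_13048"
    return hashlib.sha256(password.encode()).hexdigest()[:16]

# If O(program_a) and O(program_b) are indistinguishable, any procedure that
# extracts the password from O(program_b) must do equally well on
# O(program_a), which contains no password at all -- i.e. it would be
# inverting SHA256 from 16 hex digits of output.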

So What’s The Point?

In many ways, code obfuscation is one of the holy grails of cryptography. To understand why, consider just how easily nearly every other primitive can be implemented with it. Want public key encryption? Take any symmetric-key encryption scheme, and construct an encryptor with your secret key built in. Obfuscate it, and publish that on the web; the obfuscated program is now your public key, since anyone can use it to encrypt messages that only you, the holder of the secret key, can decrypt. Want a signature scheme? Public key encryption provides that for you as an easy corollary. Want fully homomorphic encryption? Construct a program which takes two encrypted numbers as input, decrypts them, adds the results, and encrypts the sum, and obfuscate the program. Do the same for multiplication, send both programs to the server, and the server will swap your adder and multiplier into its code and perform your computation.
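As a sketch of the first construction (pure pseudocode: obfuscate, symmetric_encrypt and random_bytes are assumed primitives, and no practical obfuscation library exists today):

secret_key = random_bytes(32)

def encryptor(message, randomness):
    # The symmetric key is baked into the program...
    return symmetric_encrypt(secret_key, message, randomness)

public_key = obfuscate(encryptor)  # ...but hidden after obfuscation

# Publish `public_key`: anyone can run it to encrypt messages, while only
# the holder of secret_key can decrypt them.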

However, aside from that, obfuscation is powerful in another key way, and one which has profound consequences particularly in the field of cryptocurrencies and decentralized autonomous organizations: publicly running contracts can now contain private data. On top of second-generation blockchains like Ethereum, it will be possible to run so-called “autonomous agents” (or, when the agents primarily serve as a voting system between human actors, “decentralized autonomous organizations”) whose code gets executed entirely on the blockchain, and which have the power to maintain a currency balance and send transactions inside the Ethereum system. For example, one might have a contract for a non-profit organization that contains a currency balance, with a rule that the funds can be withdrawn or spent if 67% of the organization’s members agree on the amount and destination to send.

Unlike Bitcoin’s vaguely similar multisig functionality, the rules can be extremely flexible, for example allowing a maximum of 1% per day to be withdrawn with only 33% consent, or making the organization a for-profit company whose shares are tradable and whose shareholders automatically receive dividends. Up until now it has been thought that such contracts are fundamentally limited – they can only have an effect inside the Ethereum network, and perhaps other systems which deliberately set themselves up to listen to the Ethereum network. With obfuscation, however, there are new possibilities.

Consider the simplest case: an obfuscated Ethereum contract can contain a private key to an address inside the Bitcoin network, and use that private key to sign Bitcoin transactions when the contract’s conditions are met. Thus, as long as the Ethereum blockchain exists, one can effectively use Ethereum as a sort of controller for money that exists inside of Bitcoin. From there, however, things only get more interesting. Suppose now that you want a decentralized organization to have control of a bank account. With an obfuscated contract, you can have the contract hold the login details to the website of a bank account, and have the contract carry out an entire HTTPS session with the bank, logging in and then authorizing certain transfers. You would need some user to act as an intermediary sending packets between the bank and the contract, but this would be a completely trust-free role, like an internet service provider, and anyone could trivially do it and even receive a reward for the task. Autonomous agents can now also have social networking accounts, accounts on virtual private servers to carry out more heavy-duty computations than what can be done on a blockchain, and pretty much anything that a normal human or proprietary server can.

Looking Forward

Thus, we can see that in the next few years decentralized autonomous organizations are potentially going to become much more powerful than they are today. But what are the consequences going to be? In the developed world, the hope is that there will be a massive reduction in the cost of setting up a new business, organization or partnership, and a tool for creating organizations that are much more difficult to corrupt. Much of the time, organizations are bound by rules which are really little more than gentlemen’s agreements in practice, and once some of the organization’s members gain a certain measure of power they gain the ability to twist every interpretation in their favor.

Up until now, the only partial solution was codifying certain rules into contracts and laws – a solution which has its strengths, but which also has its weaknesses, as laws are numerous and very complicated to navigate without the help of an (often very expensive) professional. With DAOs, there is now also another alternative: making an organization whose organizational bylaws are 100% crystal clear, embedded in mathematical code. Of course, there are many things with definitions that are simply too fuzzy to be mathematically defined; in those cases, we will still need some arbitrators, but their role will be reduced to a limited commodity-like function circumscribed by the contract, rather than having potentially full control over everything.

In the developing world, however, things will be much more drastic. The developed world has access to a legal system that is at times semi-corrupt, but whose main problems are otherwise simply that it’s too biased toward lawyers and too outdated, bureaucratic and inefficient. The developing world, on the other hand, is plagued by legal systems that are fully corrupt at best, and actively conspiring to pillage their subjects at worst. There, nearly all businesses are gentlemen’s agreements, and opportunities for people to betray each other exist at every step. The mathematically encoded organizational bylaws that DAOs can have are not just an alternative; they may potentially be the first legal system that people have that is actually there to help them. Arbitrators can build up their reputations online, as can organizations themselves. Ultimately, perhaps on-blockchain voting, like that being pioneered by BitCongress, may even form a basis for new experimental governments. If Africa can leapfrog straight from word of mouth communications to mobile phones, why not go from tribal legal systems with the interference of local governments straight to DAOs?

Many will of course be concerned that having uncontrollable entities moving money around is dangerous, as there are considerable possibilities for criminal activity with these kinds of powers. To that, however, one can make two simple rebuttals. First, although these decentralized autonomous organizations will be impossible to shut down, they will certainly be very easy to monitor and track every step of the way. It will be possible to detect when one of these entities makes a transaction, it will be easy to see what its balance and relationships are, and it will be possible to glean a lot of information about its organizational structure if voting is done on the blockchain. Much like Bitcoin, DAOs are likely far too transparent to be practical for much of the underworld; as FinCEN director Jennifer Shasky Calvery recently said, “cash is probably still the best medium for laundering money”. Second, ultimately DAOs cannot do anything normal organizations cannot do; all they are is a set of voting rules for a group of humans or other human-controlled agents to manage ownership of digital assets. Even if a DAO cannot be shut down, its members certainly can be, just as if they were running a plain old normal organization offline.

Whatever the dominant applications of this new technology turn out to be, one thing is looking more and more certain: cryptography and distributed consensus are about to make the world a whole lot more interesting.

More Thoughts on Scripting and Future-Compatibility

My previous post introducing Ethereum Script 2.0 was met with a number of responses, some highly supportive, others suggesting that we switch to their own preferred stack-based / assembly-based / functional paradigm, and offering various specific criticisms that we are looking hard at. Perhaps the strongest criticism this time came from Sergio Damian Lerner, Bitcoin security researcher, developer of QixCoin, and someone to whom we are grateful for his analysis of Dagger. Sergio particularly criticizes two aspects of the change: the fee system, which is changed from a simple one-variable design, where everything is a fixed multiple of the BASEFEE, to a more fluid one, and the loss of the crypto opcodes.

The crypto opcodes are the more important part of Sergio’s argument, and I will handle that issue first. In Ethereum Script 1.0, the opcode set had a collection of opcodes that are specialized around certain cryptographic functions – for example, there was an opcode SHA3, which would take a length and a starting memory index off the stack and then push the SHA3 of the string taken from the desired number of blocks in memory starting from the starting index. There were similar opcodes for SHA256 and RIPEMD160, and there were also crypto opcodes oriented around secp256k1 elliptic curve operations. In ES2, those opcodes are gone. Instead, they are replaced by a fluid system where people will need to write SHA256 in ES manually (in practice, we would offer a commission or bounty for this), and then later on smart interpreters can seamlessly replace the SHA256 ES script with a plain old machine-code (or even hardware) version of SHA256 of the sort that you use when you call SHA256 in C++. From an outside view, ES SHA256 and machine code SHA256 are indistinguishable; they both compute the same function and therefore make the same transformations to the stack, the only difference being that the latter is hundreds of times faster, giving us the same efficiency as if SHA256 were an opcode. A flexible fee system can then also be implemented to make SHA256 cheaper to accommodate its reduced computation time, ideally making it as cheap as an opcode is now.

Sergio, however, prefers a different approach: coming with lots of crypto opcodes out of the box, and using hard-forking protocol changes to add new ones if necessary further down the line. He writes:

First, after 3 years of watching Bitcoin closely I came to understand that a cryptocurrency is not a protocol, nor a contract, nor a computer-network. A cryptocurrency is a community. With the exception of a very few set of constants, such as the money supply function and the global balance, anything can be changed in the future, as long as the change is announced in advance. Bitcoin protocol worked well until now, but we know that in the long term it will face scalability issues and it will need to change accordingly. Short term benefits, such as the simplicity of the protocol and the code base, helped the Bitcoin get worldwide acceptance and network effect. Is the reference code of Bitcoin version 0.8 as simple as the 0.3 version? not at all. Now there are caches and optimizations everywhere to achieve maximum performance and higher DoS security, but no one cares about this (and nobody should). A cryptocurrency is bootstrapped by starting with a simple value proposition that works in the short/mid term.

This is a point that is often brought up with regard to Bitcoin. However, the more I look at what is actually going on in Bitcoin development, the more I become firmly set in my position that, with the exception of very early-stage cryptographic protocols that are in their infancy and seeing very low practical usage, the argument is absolutely false. There are currently many flaws in Bitcoin that can be changed if only we had the collective will to. To take a few examples:

  1. The 1 MB block size limit. Currently, there is a hard limit that a Bitcoin block cannot have more than 1 MB of transactions in it – a cap of about seven transactions per second. We are already starting to brush against this limit, with about 250 KB in each block, and it is putting pressure on transaction fees. In most of Bitcoin’s history, fees were around $0.01, and every time the price rose the default BTC-denominated fee that miners accept was adjusted down. Now, however, the fee is stuck at $0.08, and the developers are not adjusting it down, arguably because adjusting the fee back down to $0.01 would cause the number of transactions to brush against the 1 MB limit. Removing this limit, or at the very least setting it to a more appropriate value like 32 MB, is a trivial change; it is only a single number in the source code, and it would clearly do a lot of good in making sure that Bitcoin continues to be used in the medium term. And yet, Bitcoin developers have completely failed to do it.
  2. The OP_CHECKMULTISIG bug. There is a well-known bug in the OP_CHECKMULTISIG operator, used to implement multisig transactions in Bitcoin, where it requires an additional dummy zero as an argument which is simply popped off the stack and not used. This is highly non-intuitive, and confusing; when I personally was working on implementing multisig for pybitcointools, I was stuck for days trying to figure out whether the dummy zero was supposed to be at the front or take the place of the missing public key in a 2-of-3 multisig, and whether there are supposed to be two dummy zeroes in a 1-of-3 multisig. Eventually, I figured it out, but I would have figured it out much faster had the operation of the OP_CHECKMULTISIG operator been more intuitive. And yet, the bug has not been fixed.
  3. The bitcoind client. The bitcoind client is well-known for being a very unwieldy and non-modular contraption; in fact, the problem is so serious that everyone looking to build a bitcoind alternative that is more scalable and enterprise-friendly is not using bitcoind at all, instead starting from scratch. This is not a core protocol issue, and theoretically changing the bitcoind client need not involve any hard-forking changes at all, but the needed reforms are still not being done.

All of these problems are not there because the Bitcoin developers are incompetent. They are not; in fact, they are very skilled programmers with deep knowledge of cryptography and the database and networking issues inherent in cryptocurrency client design. The problems are there because the Bitcoin developers very well realize that Bitcoin is a 10-billion-dollar train hurtling along at 400 kilometers per hour, and if they try to change the engine midway through and even the tiniest bolt comes loose the whole thing could come crashing to a halt. A change as simple as swapping the database back in March 2013 almost did. This is why in my opinion it is irresponsible to leave a poorly designed, non-future-proof protocol in place, and simply say that the protocol can be updated in due time. On the contrary, the protocol must be designed to have an appropriate degree of flexibility from the start, so that changes can be made by consensus automatically, without needing to update any software.

Now, to address Sergio’s second issue, his main qualm with modifiable fees: if fees can go up and down, it becomes very difficult for contracts to set their own fees, and if a fee goes up unexpectedly then that may open up a vulnerability through which an attacker may even be able to force a contract to go bankrupt. I must thank Sergio for making this point; it is something that I had not yet sufficiently considered, and we will need to think carefully about when making our design. However, his solution, manual protocol updates, is arguably no better; protocol updates that change fee structures can expose new economic vulnerabilities in contracts as well, and they are arguably even harder to compensate for because there are absolutely no restrictions on what content manual protocol updates can contain.

So what can we do? First of all, there are many intermediate solutions between Sergio’s approach – coming with a limited fixed set of opcodes that can be added to only with a hard-forking protocol change – and the idea I provided in the ES2 blogpost of having miners vote on fluidly changing fees for every script. One approach might be to make the voting system more discrete, so that there would be a hard line between a script having to pay 100% fees and a script being “promoted” to being an opcode that only needs to pay a 20x CRYPTOFEE. This could be done via some combination of usage counting, miner voting, ether holder voting or other mechanisms. This is essentially a built-in mechanism for doing hardforks that does not technically require any source code updates to apply, making it much more fluid and non-disruptive than a manual hardfork approach. Second, it is important to point out once again that the ability to efficiently do strong crypto is not gone, even from the genesis block; when we launch Ethereum, we will create a SHA256 contract, a SHA3 contract, etc and “premine” them into pseudo-opcode status right from the start. So Ethereum will come with batteries included; the difference is that the batteries will be included in a way that seamlessly allows for the inclusion of more batteries in the future.

But it is important to note that I consider this ability to add in efficient optimized crypto ops in the future to be mandatory. Theoretically, it is possible to have a “Zerocoin” contract inside of Ethereum, or a contract using cryptographic proofs of computation (SCIP) and fully homomorphic encryption so you can actually use Ethereum as the “decentralized Amazon EC2 instance” for cloud computing that many people now incorrectly believe it to be. Once quantum computing comes out, we might need to move to contracts that rely on NTRU; once SHA4 or SHA5 comes out, we might need to move to contracts that rely on them. Once obfuscation technology matures, contracts will want to rely on that to store private data. But in order for all of that to be possible with anything less than a $30 fee per transaction, the underlying cryptography would need to be implemented in C++ or machine code, and there would need to be a fee structure that reduces the fee for the operations appropriately once the optimizations have been made. This is a challenge to which I do not see any easy answers, and comments and suggestions are very much welcome.

Introducing Ethereum Script 2.0

This post will provide the groundwork for a major rework of the Ethereum scripting language, which will substantially modify the way ES works although still keeping many of the core components working in the exact same way. The rework is necessary as a result of multiple concerns which have been raised about the way the language is currently designed, primarily in the areas of simplicity, optimization, efficiency and future-compatibility, although it does also have some side-benefits such as improved function support. This is not the last iteration of ES2; there will likely be many incremental structural improvements that can be made to the spec, but it does serve as a strong starting point.

As an important clarification, this rework will have little effect on the Ethereum CLL, the stripped-down-Python-like language in which you can write Namecoin in five lines of code. The CLL will still stay the same as it is now. We will need to make updates to the compiler (an alpha version of which is now available in Python at http://github.com/ethereum/compiler or as a friendly web interface at http://162.218.208.138:3000) in order to make sure the CLL continues to compile to new versions of ES, but you as an Ethereum contract developer working in E-CLL should not need to see any changes at all.

Problems with ES1

Over the last month of working with ES1, several problems with the language’s design have become apparent. In no particular order, they are as follows:

  • Too many opcodes – looking at the specification as it appears today, ES1 now has exactly 50 opcodes – fewer than the 80 opcodes found in Bitcoin Script, but still far more than the theoretically minimal 4-7 opcodes needed to have a functional Turing-complete scripting language. Some of those opcodes are necessary because we want the scripting language to have access to a lot of data – for example, the transaction value, the transaction source, the transaction data, the previous block hash, etc; like it or not, there needs to be a certain degree of complexity in the language definition to provide all of these hooks. Other opcodes, however, are excessive and complex; as an example, consider the current definition of SHA256 or ECVERIFY. With the way the language is designed right now, that is necessary for efficiency; otherwise, one would have to write SHA256 in Ethereum script by hand, which might take many thousands of BASEFEEs. But ideally, there should be some way of eliminating much of the bloat.
  • Not future-compatible – the existence of the special crypto opcodes does make ES1 much more efficient for certain specialized applications; thanks to them, computing SHA3 takes only 40x BASEFEE instead of the many thousands of BASEFEEs that it would take if SHA3 were implemented in ES directly; the same goes for SHA256, RIPEMD160 and secp256k1 elliptic curve operations. However, it is absolutely not future-compatible. Even though these existing crypto operations will only take 40x BASEFEE, SHA4 will take several thousand BASEFEEs, as will ed25519 signatures, the quantum-proof NTRU, SCIP and Zerocoin math, and any other constructs that will appear over the coming years. There should be some natural mechanism for folding such innovations in over time.
  • Not deduplication-friendly – the Ethereum blockchain is likely to become extremely bloated over time, especially with every contract writing its own code even when the bulk of the code will likely be thousands of people trying to do the exact same thing. Ideally, all instances where code is written twice should pass through some process of deduplication, where the code is stored only once and each occurrence is replaced with a pointer to it. In theory, Ethereum’s Patricia trees do this already. In practice, however, code needs to be in exactly the same place in order for this to happen, and the existence of jumps means that it is often difficult to arbitrarily copy/paste code without making appropriate modifications. Furthermore, there is no incentivization mechanism to convince people to reuse existing code.
  • Not optimization-friendly – this is a very similar criterion to future-compatibility and deduplication-friendliness in some ways. However, here optimization refers to a more automatic process of detecting bits of code that are reused many times, and replacing them with memoized or compiled machine code versions.

Beginnings of a Solution: Deduplication

The first issue that we can handle is that of deduplication. As described above, Ethereum Patricia trees provide deduplication already, but the problem is that achieving the full benefits of the deduplication requires the code to be formatted in a very special way. For example, if the code in contract A from index 0 to index 15 is the same as the code in contract B from index 48 to index 63, then deduplication happens. However, if the code in contract B is offset at all modulo 16 (eg. from index 49 to index 64), then no deduplication takes place at all. In order to remedy this, there is one relatively simple solution: move from a dumb hexary Patricia tree to a more semantically oriented data structure. That is, the tree represented in the database should mirror the abstract syntax tree of the code.

To understand what I am saying here, consider some existing ES1 code:

TXVALUE PUSH 25 PUSH 10 PUSH 18 EXP MUL LT NOT PUSH 14 JMPI STOP PUSH 0 TXDATA SLOAD NOT PUSH 0 TXDATA PUSH 1000 LT NOT MUL NOT NOT PUSH 32 JMPI STOP PUSH 1 TXDATA PUSH 0 TXDATA SSTORE

In the Patricia tree, it looks like this:

(
    (TXVALUE PUSH 25 PUSH 10 PUSH 18 EXP MUL LT NOT PUSH 14 JMPI STOP PUSH)
    (0 TXDATA SLOAD NOT PUSH 0 TXDATA PUSH 1000 LT NOT MUL NOT NOT PUSH 32)
    (JMPI STOP PUSH 1 TXDATA PUSH 0 TXDATA SSTORE)
)

And here is what the code looks like structurally. This is easiest to show by simply giving the E-CLL it was compiled from:

if tx.value < 25 * 10^18:
    stop
if contract.storage[tx.data[0]] or tx.data[0] < 1000:
    stop
contract.storage[tx.data[0]] = tx.data[1]

No relation at all. Thus, if another contract wanted to use some semantic sub-component of this code, it would almost certainly have to re-implement the whole thing. However, if the tree structure looked somewhat more like this:

(
    ( 
        IF 
        (TXVALUE PUSH 25 PUSH 10 PUSH 18 EXP MUL LT NOT) 
        (STOP) 
    )
    ( 
        IF 
        (PUSH 0 TXDATA SLOAD NOT PUSH 0 TXDATA PUSH 1000 LT NOT MUL NOT)
        (STOP) 
    )
    ( PUSH 1 TXDATA PUSH 0 TXDATA SSTORE )
)

Then if someone wanted to reuse some particular piece of code they easily could. Note that this is just an illustrative example; in this particular case it probably does not make sense to deduplicate since pointers need to be at least 20 bytes long to be cryptographically secure, but in the case of larger scripts where an inner clause might contain a few thousand opcodes it makes perfect sense.

Immutability and Purely Functional Code

Another modification is that code should be immutable, and thus separate from data; if multiple contracts rely on the same code, the contract that originally controls that code should not have the ability to sneak in changes later on. The pointer determining which code a running contract starts with, however, should be mutable.

A third common optimization-friendly technique is to make the programming language purely functional, so that functions cannot have any side effects outside of themselves, with the exception of return values. For example, the following is a pure function:

def factorial(n):
    prod = 1
    for i in range(1,n+1):
        prod *= i
    return prod

However, this is not:

x = 0
def next_integer():
    global x  # mutates global state, so this function is not pure
    x += 1
    return x

And this most certainly is not:

import os
def happy_fluffy_function():
    # side effect: reads the bitcoind wallet balance...
    bal = float(os.popen('bitcoind getbalance').read())
    # ...side effect: sends the entire balance (minus a fee) elsewhere...
    os.popen('bitcoind sendtoaddress 1JwSSubhmg6iPtRjtyqhUYYH7bZg3Lfy1T %.8f' % (bal - 0.0001))
    # ...side effect: deletes the user's home directory
    os.popen('rm -rf ~')

Ethereum cannot be purely functional, since Ethereum contracts do necessarily have state – a contract can modify its long-term storage and it can send transactions. However, Ethereum script is a unique situation because Ethereum is not just a scripting environment – it is an incentivized scripting environment. Thus, we can allow operations like modifying storage and sending transactions, but discourage them with fees, ensuring that most script components are purely functional simply to cut costs, while still allowing non-purity in those situations where it makes sense.

What is interesting is that these two changes work together. The immutability of code also makes it easier to construct a restricted subset of the scripting language which is functional, and then such functional code could be deduplicated and optimized at will.

Ethereum Script 2.0

So, what’s going to change? First of all, the basic stack-machine concept is going to roughly stay the same. The main data structure of the system will continue to be the stack, and most of your beloved opcodes will not change significantly. The only differences in the stack machine are the following:

  1. Crypto opcodes are removed. Instead, we will have to have someone write SHA256, RIPEMD160, SHA3 and ECC in ES as a formality, and we can have our interpreters include an optimization replacing them with good old-fashioned machine-code hashes and sigs right from the start.
  2. Memory is removed. Instead, we are bringing back DUPN (which grabs the next value in the code, say N, and pushes a copy of the item N items down the stack to the top of the stack) and SWAPN (which swaps the top item and the Nth item); see the sketch after this list.
  3. JMP and JMPI are removed.
  4. RUN, IF, WHILE and SETROOT are added (see below for further definition).
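
To make the semantics of those two restored opcodes concrete, here is a minimal Python sketch; the exact indexing convention (whether the top of the stack counts as item 1) is my assumption, not something the spec pins down:

def dupn(stack, n):
    # push a copy of the item n items down the stack onto the top
    stack.append(stack[-n])

def swapn(stack, n):
    # swap the top item with the item n items down the stack
    stack[-1], stack[-n] = stack[-n], stack[-1]

stack = [5, 6, 7]
dupn(stack, 3)    # stack is now [5, 6, 7, 5]
swapn(stack, 3)   # stack is now [5, 5, 7, 6]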

Another change is in how transactions are serialized. Now, transactions appear as follows:

  • SEND: [ 0, nonce, to, value, [ data0 … datan ], v, r, s ]
  • MKCODE: [ 1, nonce, [ data0 … datan ], v, r, s ]
  • MKCONTRACT: [ 2, nonce, coderoot, v, r, s ]

The address of a contract is defined by the last 20 bytes of the hash of the transaction that produced it, as before. Additionally, the nonce no longer needs to be equal to the nonce stored in the account balance representation; it only needs to be equal to or greater than that value.

Now, suppose that you wanted to make a simple contract that just keeps track of how much ether it received from various addresses. In E-CLL that’s:

contract.storage[tx.sender] = tx.value

In ES2, instantiating this contract now takes two transactions:

[ 1, 0, [ TXVALUE TXSENDER SSTORE ], v, r, s]

[ 2, 1, 761fd7f977e42780e893ea44484c4b64492d8383, v, r, s ]

What happens here is that the first transaction instantiates a code node in the Patricia tree. The hash sha3(rlp.encode([ TXVALUE TXSENDER SSTORE ]))[12:] is 761fd7f977e42780e893ea44484c4b64492d8383, so that is the “address” where the code node is stored. The second transaction basically says to initialize a contract whose code is located at that code node. Thus, when a transaction gets sent to the contract, that is the code that will run.
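
For concreteness, here is a sketch of that address computation in Python; the byte-level serialization of the opcodes is an assumption on my part (the spec does not fix it here), with Keccak-256 from pycryptodome standing in for sha3 and the rlp package handling RLP encoding:

import rlp                      # pip install rlp
from Crypto.Hash import keccak  # pip install pycryptodome

def code_address(opcodes):
    # RLP-encode the opcode list, hash it, and keep the last 20 bytes
    encoded = rlp.encode([op.encode() for op in opcodes])
    return keccak.new(digest_bits=256, data=encoded).digest()[12:]

print(code_address(['TXVALUE', 'TXSENDER', 'SSTORE']).hex())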

Now, we come to the interesting part: the definitions of IF and RUN. The explanation is simple: IF loads the next two values in the code, then pops the top item from the stack. If the top item is nonzero, then it runs the code item at the first code value. Otherwise, it runs the code item at the second code value. WHILE is similar, but instead loads only one code value and keeps running the code while the top item on the stack is nonzero. Finally, RUN just takes one code value and runs the code without asking for anything. And that’s all you need to know. Here is one way to do a Namecoin contract in new Ethereum script:

A: [ TXVALUE PUSH 25 PUSH 10 PUSH 18 EXP MUL LT ]
B: [ PUSH 0 TXDATA SLOAD NOT PUSH 0 TXDATA PUSH 100 LT NOT MUL NOT ]
Z: [ STOP ]
Y: [ ]
C: [ PUSH 1 TXDATA PUSH 0 TXDATA SSTORE ]
M: [ RUN A IF Z Y RUN B IF Z Y RUN C ]

The contract would then have its root be M. But wait, you might say, this makes the interpreter recursive. As it turns out, however, it does not – you can simulate the recursion using a data structure called a “continuation stack”. Here’s what the full stack trace of that code might look like, assuming the transaction is [ X, Y ] sending V where X > 100, V > 10^18 * 25 and contract.storage[X] is not set:

{ stack: [], cstack: [[M, 0]], op: RUN }
{ stack: [], cstack: [[M, 2], [A, 0]], op: TXVALUE }
{ stack: [V], cstack: [[M, 2], [A, 1]], op: PUSH }
{ stack: [V, 25], cstack: [[M, 2], [A, 3]], op: PUSH }
{ stack: [V, 25, 10], cstack: [[M, 2], [A, 5]], op: PUSH }
{ stack: [V, 25, 10, 18], cstack: [[M, 2], [A, 7]], op: EXP }
{ stack: [V, 25, 10^18], cstack: [[M, 2], [A, 8]], op: MUL }
{ stack: [V, 25*10^18], cstack: [[M, 2], [A, 9]], op: LT }
{ stack: [0], cstack: [[M, 2], [A, 10]], op: NULL }
{ stack: [0], cstack: [[M, 2]], op: IF }
{ stack: [], cstack: [[M, 5], [Y, 0]], op: NULL }

{ stack: [], cstack: [[M, 5]], op: RUN }
{ stack: [], cstack: [[M, 7], [B, 0]], op: PUSH }
{ stack: [0], cstack: [[M, 7], [B, 2]], op: TXDATA }
{ stack: [X], cstack: [[M, 7], [B, 3]], op: SLOAD }
{ stack: [0], cstack: [[M, 7], [B, 4]], op: NOT }
{ stack: [1], cstack: [[M, 7], [B, 5]], op: PUSH }
{ stack: [1, 0], cstack: [[M, 7], [B, 7]], op: TXDATA }
{ stack: [1, X], cstack: [[M, 7], [B, 8]], op: PUSH }
{ stack: [1, X, 100], cstack: [[M, 7], [B, 10]], op: LT }
{ stack: [1, 0], cstack: [[M, 7], [B, 11]], op: NOT }
{ stack: [1, 1], cstack: [[M, 7], [B, 12]], op: MUL }
{ stack: [1], cstack: [[M, 7], [B, 13]], op: NOT }
{ stack: [0], cstack: [[M, 7], [B, 14]], op: NULL }
{ stack: [0], cstack: [[M, 7]], op: IF }
{ stack: [], cstack: [[M, 10], [Y, 0]], op: NULL }

{ stack: [], cstack: [[M, 10]], op: RUN }
{ stack: [], cstack: [[M, 12], [C, 0]], op: PUSH }
{ stack: [1], cstack: [[M, 12], [C, 2]], op: TXDATA }
{ stack: [Y], cstack: [[M, 12], [C, 3]], op: PUSH }
{ stack: [Y,0], cstack: [[M, 12], [C, 5]], op: TXDATA }
{ stack: [Y,X], cstack: [[M, 12], [C, 6]], op: SSTORE }
{ stack: [], cstack: [[M, 12], [C, 7]], op: NULL }
{ stack: [], cstack: [[M, 12]], op: NULL }
{ stack: [], cstack: [], op: NULL }

And that’s all there is to it. Cumbersome to read, but actually quite easy to implement in any statically or dynamically typed programming language, or perhaps even ultimately in an ASIC.
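
To make the continuation-stack idea concrete, here is a minimal Python sketch of the main interpreter loop. The representation of code nodes as lists in a dict keyed by name, and the handling of only a few opcodes, are simplifying assumptions; a real interpreter would resolve 20-byte hashes against the Patricia tree and meter fees:

def run(nodes, root, stack=None):
    stack = stack if stack is not None else []
    cstack = [[root, 0]]  # continuation stack of [code node, position] pairs
    while cstack:
        name, pos = cstack[-1]
        code = nodes[name]
        if pos >= len(code):       # NULL: fell off the end of this node
            cstack.pop()
            continue
        op = code[pos]
        if op == 'RUN':            # unconditionally descend into a code node
            cstack[-1][1] = pos + 2
            cstack.append([code[pos + 1], 0])
        elif op == 'IF':           # pop; first node if nonzero, else second
            branch = code[pos + 1] if stack.pop() else code[pos + 2]
            cstack[-1][1] = pos + 3
            cstack.append([branch, 0])
        elif op == 'WHILE':        # re-run the body while the top is nonzero
            if stack and stack[-1]:
                cstack.append([code[pos + 1], 0])
            else:
                cstack[-1][1] = pos + 2
        elif op == 'PUSH':
            stack.append(code[pos + 1])
            cstack[-1][1] = pos + 2
        elif op == 'STOP':         # halt the whole computation
            return stack
        else:
            raise NotImplementedError(op)  # TXVALUE, TXDATA, SSTORE, math...
    return stack

Filling in the remaining opcodes and feeding this loop the M node above, with A, B, C, Y and Z in nodes, would reproduce the trace step for step.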

Optimizations

In the above design, there is still one major area where optimizations can be made: making the references compact. What the clean and simple style of the above contract hides is that those pointers to A, B, C, M and Z aren’t just compact single letters; they are 20-byte hashes. From an efficiency standpoint, what we just did is thus actually substantially worse than what we had before, at least in those cases where code is not duplicated many times over. Also, there is still no incentive for people writing contracts to write their code in such a way that other programmers later on can optimize; if I wanted to code the above in a way that would minimize fees, I would just put A, B and C into the contract directly rather than separating them out into functions. There are two possible solutions:

  1. Instead of using H(x) = SHA3(rlp.encode(x))[12:], use H(x) = SHA3(rlp.encode(x))[12:] if len(rlp.encode(x)) >= 20 else x. To summarize, if something is less than 20 bytes long, we include it directly (see the sketch after this list).
  2. A concept of “libraries”. The idea behind libraries is that a group of a few scripts can be published together, in a format [ [ ... code ... ], [ ... code ... ], ... ], and these scripts can internally refer to each other with their indices in the list alone. This completely alleviates the problem, but at some cost of harming deduplication, since sub-codes may need to be stored twice. Some intelligent thought into exactly how to improve on this concept to provide both deduplication and reference efficiency will be required; perhaps one solution would be for the library to store a list of hashes, and then for the continuation stack to store [ lib, libIndex, codeIndex ] instead of [ hash, index ].
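
As a sketch of the first option, using the same assumed serialization and Keccak-256 stand-in as the earlier address example:

import rlp
from Crypto.Hash import keccak

def H(x):
    encoded = rlp.encode(x)
    if len(encoded) >= 20:
        # long values are stored once and referred to by their hash
        return keccak.new(digest_bits=256, data=encoded).digest()[12:]
    return x  # anything under 20 bytes is included directly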

Other optimizations are likely possible. For example, one important weakness of the design described above is that it does not support recursion, offering only while loops to provide Turing-completeness. It might seem to, since you can call any function, but if you actually try to implement recursion in ES2 as described above you soon notice that it would require finding the fixed point of an iterated hash (ie. finding x such that H(a + H( c + ... H(x) ... + d) + b) = x), a problem which is generally assumed to be cryptographically impossible. The “library” concept described above does actually fix this, at least internally to one library; ideally, a cleaner solution would exist, although it is not necessary. Finally, some research should go into the question of making functions first-class; this basically means changing the IF and RUN opcodes to pull the destination from the stack rather than from fixed code. This may be a major usability improvement, since you can then code higher-order functions that take functions as arguments, like map, but it may also be harmful from an optimization standpoint, since code becomes harder to analyze and determine whether or not a given computation is purely functional.

Fees

Finally, there is one last question to be resolved. The primary purposes of ES2 as described above are twofold: deduplication and optimization. However, optimizations by themselves are not enough; in order for people to actually benefit from the optimizations, and to be incentivized to code in patterns that are optimization-friendly, we need to have a fee structure that supports this. From a deduplication perspective, we already have this; if you are the second person to create a Namecoin-like contract, and you want to use A, you can just link to A without paying the fee to instantiate it yourself. However, from an optimization perspective, we are far from done. If we create SHA3 in ES, and then have the interpreter intelligently replace it with optimized machine code, then the interpreter does get much faster, but the person using SHA3 still needs to pay thousands of BASEFEEs. Thus, we need a mechanism for reducing the fee of specific computations that have been heavily optimized.

Our current strategy with fees is to have miners or ether holders vote on the basefee, and in theory this system can easily be expanded to include the option to vote on reduced fees for specific scripts. However, this does need to be done intelligently. For example, EXP can be replaced with a contract of the following form:

 PUSH 1 SWAPN 3 SWAP WHILE ( DUP PUSH 2 MOD IF ( DUPN 2 ) ( PUSH 1 ) DUPN 4 MUL SWAPN 4 POP 2 DIV SWAP DUP MUL SWAP ) POP
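
In Python terms, that routine is roughly the following square-and-multiply loop (a paraphrase rather than a literal transliteration of the opcodes above):

def exp_by_squaring(base, exponent):
    result = 1
    while exponent:
        if exponent % 2:       # if the lowest bit is set...
            result *= base     # ...multiply it into the result
        base *= base           # square the base
        exponent //= 2         # shift the exponent down one bit
    return result

The while loop runs once per bit of the exponent, which is exactly the problem described next.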

However, the runtime of this contract depends on the exponent – with an exponent in the range [4,7] the while loop runs three times, in the range [1024, 2047] the while loop runs eleven times, and in the range [2^255, 2^256-1] it runs 256 times. Thus, it would be highly dangerous to have a mechanism which can be used to simply set a fixed fee for any contract, since that can be exploited to, say, impose a fixed fee for a contract computing the Ackermann function (a function notorious in the world of mathematics because the cost of computing or writing down its output grows so fast that with inputs as low as 5 it becomes larger than the size of the universe). Thus, a percentage discount system, where some contracts can enjoy half as large a basefee, may make more sense. Ultimately, however, a contract cannot be optimized down to below the cost of calling the optimized code, so we may want to have a fixed fee component. A compromise approach might be to have a discount system, but combined with a rule that no contract can have its fee reduced below 20x the BASEFEE.

So how would fee voting work? One approach would be to store the discount of a code item alongside that code item’s code, as a number from 1 to 2^32, where 2^32 represents no discount at all and 1 represents the highest discounting level of 4294967296x (it may be prudent to set the maximum at 65536x instead for safety). Miners would be authorized to make special “discount transactions” changing the discounting number of any code item by a maximum of 1/65536 of its previous value. With such a system, it would take about 40000 blocks or about one month to halve the fee of any given script, a sufficient level of friction to prevent mining attacks and give everyone a chance to upgrade to new clients with more advanced optimizers, while still making it possible to update fees as required to ensure future-compatibility.
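
As a quick sanity check of that friction number, assuming a miner applies the maximum adjustment every single block:

# each block can shrink the discount number by at most 1/65536 of
# its current value, so after 40000 blocks the best case is roughly:
print((1 - 1.0 / 65536) ** 40000)   # ≈ 0.543, i.e. about a halving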

Note that the above description is not clean, and is still very much not fleshed out; a lot of care will need to be taken in making it maximally elegant and easy to implement. An important point is that optimizers will likely end up replacing entire swaths of ES2 code blocks with more efficient machine code, but under the system described above will still need to pay attention to ES2 code blocks in order to determine what the fee is. One solution is to have a miner policy offering discounts only to contracts which incur exactly the same fee when run, regardless of their input; perhaps other solutions exist as well. However, one thing is clear: the problem is not an easy one.

On Transaction Fees, And The Fallacy of Market-Based Solutions

Of all the parts of the Ethereum protocol, aside from the mining function the fee structure is perhaps the least set in stone. The current values, with one crypto operation taking 20 base fees, a new transaction taking 100 base fees, etc, are little more than semi-educated guesses, and harder data on exactly how much computational power a database read, an arithmetic operation and a hash actually take will certainly give us much better estimates on what exactly the ratios between the different computational fees should be. The other part of the question, that of exactly how much the base fee should be, is even more difficult to figure out; we have still not decided whether we want to target a certain block size, a certain USD-denominated level, or some combination of these factors, and it is very difficult to say whether a base fee of $0.00001 or a base fee of $0.001 would be more appropriate. Ultimately, what is becoming more and more clear to us is that some kind of flexible fee system, allowing consensus-based human intervention after the fact, would be best for the project.

When many people coming from Bitcoin see this problem, however, they wonder why we are having such a hard time with this issue when Bitcoin already has a ready-made solution: make the fees voluntary and market-based. In the Bitcoin protocol, there are no mandatory transaction fees; even an extremely large and computationally arduous transaction can get in with a zero fee, and it is up to the miners to determine what fees they require. The lower a transaction’s fee, the longer it takes for the transaction to find a miner that will let it in, and those who want faster confirmations can pay more. At some point, an equilibrium should be reached. Problem solved. So why not here?

The reality, however, is that in Bitcoin the transaction fee problem is very far from “solved”. The system as described above already has a serious vulnerability: miners pay no fees, so a miner can choke the entire network with an extremely large block. In fact, this problem is so serious that Satoshi chose to fix it with the ugliest possible patch: a maximum block size limit of 1 MB, or about 7 transactions per second. Now, without the immensely hard-fought and politically laden debate that necessarily accompanies any “hard-forking” protocol change, Bitcoin simply cannot organically adapt to handle anything more than the 7 tx/sec limit that Satoshi originally placed.
