Learning Ethereum: Hash Function


Unless you’re part of the Ethereum Foundation, it’s safe to say that none of us are experts, but that doesn’t mean we can’t learn! Following the tutorials from ETH.BUILD, we can take a look at some of the foundational concepts that power Ethereum and how we can use them to build some pretty cool things.

What is a Hash Function?

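Austin Griffith starts the series by exploring Hash Functions. Simply put, a Hash is like a digital fingerprint: just as a physical fingerprint identifies a person, a Hash identifies a piece of data. Two different inputs could in principle share a Hash, but with a good cryptographic Hash Function, finding such a pair is practically impossible, so in practice each Hash can be treated as unique.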

Specifically, a cryptographic hash function (CHF) is a mathematical algorithm that maps data of arbitrary size to a bit array of a fixed size. A Hash Function is one-way, meaning it is practically infeasible to recover the input from the output.

Let’s look at this a little closer. Whatever we put into the TEXT field, and however long it is, the output is always a 64-character hexadecimal string (256 bits, since each hexadecimal character encodes 4 bits).
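The fixed-length behavior is easy to try outside ETH.BUILD. Here is a minimal sketch, assuming Python and its standard hashlib module. Ethereum itself uses Keccak-256, which hashlib does not provide, so SHA-256 stands in here because it also produces 256 bits. The hash_text helper is a name made up for this example, and the digests it prints are not expected to match any Hash shown in this article.

```python
import hashlib


def hash_text(text: str) -> str:
    """Return the SHA-256 digest of `text` as a hexadecimal string.

    Assumption: SHA-256 is a stand-in for the Keccak-256 used by Ethereum.
    """
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# One short input and one very long input (900,000 characters).
short_digest = hash_text("Unvetica")
long_digest = hash_text("Unvetica " * 100_000)

# Both digests have the same length, regardless of input size.
print(len(short_digest), len(long_digest))
```

Both lengths come out as 64, because a 256-bit digest always takes 64 hexadecimal characters to write down.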

To make this easier to see, we can borrow HEX color codes, which you may already be familiar with: each one is a six-digit hexadecimal number. Let’s add a HEX output to visually represent what’s happening to the data.

We’ve taken the first six hexadecimal digits of the Hash and run them through the COLOR module, which displays them as a color. Now when we put ‘Unvetica’ in the TEXT field, we get our Hash along with a nice green color.
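To see what the COLOR step amounts to, here is a rough sketch in Python. The hash_to_rgb function is a hypothetical helper written for this example, not ETH.BUILD’s own code: it simply reads the first six hexadecimal digits as a red, green, and blue byte. The digest used is the one this article reports for ‘Unvetica’.

```python
def hash_to_rgb(hex_digest: str) -> tuple:
    """Interpret the first six hex digits of a digest as (red, green, blue)."""
    color = hex_digest[:6]
    # Each pair of hex digits is one byte, i.e. one color channel (0-255).
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


# The Hash this article reports for the input 'Unvetica'.
article_hash = "58904b5e0ebc7eb1cd25673fd868c5023008d86469f2ed7a674c380796a1c7bd"

print("#" + article_hash[:6])     # the HEX color code
print(hash_to_rgb(article_hash))  # (88, 144, 75)
```

The green channel (144) is the largest of the three, which fits the green color described above.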

What is worth noting is that we can put any arbitrary text in the TEXT module and the output always has the same form: a 64-character hexadecimal string. Whether you enter a single word like ‘Unvetica’ or an entire 500-page text document, the length of the Hash does not change.

The Hash is fully determined by the input to the TEXT module: every time I input ‘Unvetica’, I get exactly the same Hash, which makes that output a fingerprint of the word ‘Unvetica’.
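Determinism can be checked in a few lines. This is a small sketch, again assuming Python’s hashlib with SHA-256 as a stand-in for Ethereum’s Keccak-256; hash_text is a helper name invented for the example.

```python
import hashlib


def hash_text(text: str) -> str:
    """Return the SHA-256 digest of `text` as a hexadecimal string (stand-in hash)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


first = hash_text("Unvetica")
second = hash_text("Unvetica")
other = hash_text("unvetica")  # same word, lowercase 'u'

print(first == second)  # True: same input, same Hash
print(first == other)   # False: a different input gives a different Hash
```

Note that the comparison is case-sensitive: ‘unvetica’ and ‘Unvetica’ are different inputs as far as the Hash Function is concerned.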

Why is a Hash important?

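Now that we have a sense of how a Hash can be generated, let’s explore how it can be used and why it’s important in relation to cryptography. A Hash is one-directional: it is easy to compute the Hash from the input, but practically infeasible to work back from the Hash to the input.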

Let’s say, for example, that Unvetica runs a contest: we create a Hash from the TEXT module, post the Hash, and declare that anyone who can guess the TEXT that generated it wins $1,000.

Let’s look at our Hash and see why this contest would be practically impossible to win, as long as the secret TEXT is hard to guess. The TEXT module input of ‘Unvetica’ produces the Hash “58904b5e0ebc7eb1cd25673fd868c5023008d86469f2ed7a674c380796a1c7bd”. To win, someone would have to guess the exact input that was provided in the TEXT module, because the Hash itself gives no hint about it.

Compare this with a physical combination lock, where the same logic does not hold. With a lock, the more combinations you try, the closer you get to the answer. The same is not true of a Hash: getting close to the original TEXT module input does not produce a Hash that is any closer to the original Hash.

We can see proof of this by guessing part of the initial input string: the resulting Hash does not even resemble our true Hash. The answer is either exactly correct or it isn’t.
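The contest can be simulated to show that near misses earn nothing. The sketch below assumes Python’s hashlib with SHA-256 standing in for Ethereum’s Keccak-256, so the digests will differ from the Hash quoted in this article. The names check_guess and matching_positions are hypothetical helpers made up for the illustration.

```python
import hashlib


def hash_text(text: str) -> str:
    """Return the SHA-256 digest of `text` as a hexadecimal string (stand-in hash)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def check_guess(guess: str, published_hash: str) -> bool:
    """A guess wins only if its Hash equals the published Hash exactly."""
    return hash_text(guess) == published_hash


def matching_positions(a: str, b: str) -> int:
    """Count the positions at which two hex strings hold the same character."""
    return sum(x == y for x, y in zip(a, b))


published = hash_text("Unvetica")  # the Hash the contest organizer posts

for guess in ["Unvetic", "unvetica", "Unvetica!", "Unvetica"]:
    digest = hash_text(guess)
    print(guess, check_guess(guess, published), matching_positions(digest, published))
```

Only the exact input wins, and only its digest matches in all 64 positions. For the near misses, the count of matching characters is about what you would expect from two unrelated strings, so a wrong guess tells you nothing about how close you were.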

This is an incredibly powerful tool in cryptography. We can see an example of this idea being expanded on in a Merkle Tree, which builds a Hash out of other Hashes. If we wanted to make a super-secret password (or Hash), we could make a compounded Hash.

We have three independent Hash Functions, and their combined output creates a unique Hash. That means in order to guess the super-secret password in this example, you’d have to somehow guess 4 arbitrary strings.
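One way to picture a compounded Hash is to hash each secret string on its own, join the resulting Hashes, and hash the result once more. The sketch below is a simplification under stated assumptions, not the exact module layout from ETH.BUILD and not a full Merkle Tree (which combines Hashes in pairs, level by level). It uses SHA-256 from Python’s hashlib as a stand-in for Keccak-256, and compound_hash and the example strings are invented for the illustration.

```python
import hashlib


def hash_text(text: str) -> str:
    """Return the SHA-256 digest of `text` as a hexadecimal string (stand-in hash)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def compound_hash(parts: list) -> str:
    """Hash every part independently, then hash the concatenated digests."""
    part_digests = [hash_text(part) for part in parts]
    return hash_text("".join(part_digests))


# Placeholder secrets, chosen only for this example.
secrets = ["correct", "horse", "battery"]
combined = compound_hash(secrets)

print(combined)
# Changing a single character in a single part changes the final Hash.
print(compound_hash(["correct", "horse", "batterY"]) == combined)  # False
```

To reproduce the combined Hash, a guesser needs every one of the input strings exactly right, and a correct guess for one part gives no feedback on its own.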

What did we learn?

  • Any data of any size can be fed into a Hash Function, and the output is always exactly 64 hexadecimal characters: a “fingerprint” of the data
  • A Hash is deterministic
  • A Hash is one-directional