Lnation

Learning Perl - Hashes

In the last post we covered the basics of arrays. Today we will look at hashes in more detail.

What is a hash?

A hash is a collection of key-value pairs. Each key is unique and maps to a value, which does not need to be unique. Hashes are also known as associative arrays or dictionaries in other programming languages.

In Perl we define a hash using the '%' symbol. We separate each key from its value with what Perl calls a fat comma '=>'. The fat comma behaves like a normal comma, but it also stringifies the key on its left when that key is a simple word (letters, digits and underscores), so we do not need to explicitly quote it. This is a common practice in Perl to make the code cleaner and more readable. Here is an example of how to define a hash with a fat comma:

my %fruit = (
    apple  => 'red',
    banana => 'yellow',
    grape  => 'purple',
);

Here is an example of defining the same hash using just a normal comma.

my %fruit = (
    "apple", "red",
    "banana", "yellow",
    "grape", "purple",
);
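As a quick check, here is a minimal sketch that builds the hash both ways and confirms the contents are the same. The names '%with_fat_comma' and '%with_comma' are only illustrative, and the comparison relies on the 'Sortkeys' setting of 'Data::Dumper' so both dumps list their keys in the same order.

```perl
use strict;
use warnings;
use Data::Dumper;

# Dump hash keys in sorted order so the two dumps are directly comparable.
$Data::Dumper::Sortkeys = 1;

# Illustrative names: the same data, once with '=>' and once with ','.
my %with_fat_comma = (apple => 'red', banana => 'yellow', grape => 'purple');
my %with_comma     = ('apple', 'red', 'banana', 'yellow', 'grape', 'purple');

# Identical dumps mean identical keys and values.
my $same = Dumper(\%with_fat_comma) eq Dumper(\%with_comma) ? 'yes' : 'no';
print "Same contents? $same\n";  # prints 'Same contents? yes'
```

The fat comma is purely a readability aid here: it makes it obvious which item is the key and which is the value.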

We access a single value in a hash by its key: we use the '$' symbol followed by the hash name and the key in curly braces, as shown below.

print $fruit{apple}, "\n";  # prints 'red'
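The key inside the curly braces can be written in a few ways. The sketch below shows a bare word, a quoted string and a variable all being used as keys. The variable name '$name' is just an assumption for the example.

```perl
use strict;
use warnings;

my %fruit = (
    apple  => 'red',
    banana => 'yellow',
    grape  => 'purple',
);

my $name = 'banana';  # illustrative variable holding a key

my $by_bareword = $fruit{apple};    # a simple word is quoted for us
my $by_string   = $fruit{'grape'};  # an explicitly quoted key
my $by_variable = $fruit{$name};    # a key stored in a variable

print "$by_bareword\n";  # prints 'red'
print "$by_string\n";    # prints 'purple'
print "$by_variable\n";  # prints 'yellow'
```

Using a variable as the key is what makes hashes so useful for lookups, because the key can come from user input or from another data structure.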

We can also add new key-value pairs to a hash or update existing ones like this:

$fruit{orange} = 'orange';  # adds a new key-value pair
$fruit{banana} = 'green';   # updates the value for the key 'banana'
print $fruit{banana};  # prints 'green'

Like arrays, hashes can also be initialised empty and populated later:

my %empty_hash;  # creates an empty hash
$empty_hash{key1} = 'value1';  # adds a key-value pair
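To tie adding and updating together, here is a small sketch that starts with an empty hash and then changes a value based on what is already stored. The '%stock' hash and its numbers are made up for the example.

```perl
use strict;
use warnings;

my %stock;             # hypothetical hash, starts empty

$stock{apple}  = 3;    # adds the key 'apple'
$stock{banana} = 5;    # adds the key 'banana'

# Read the current value, add to it, and store the result back.
$stock{apple} = $stock{apple} + 2;

print "apple: $stock{apple}\n";    # prints 'apple: 5'
print "banana: $stock{banana}\n";  # prints 'banana: 5'
```

Assigning to a key never creates a duplicate: if the key is already there, its value is simply replaced.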

Along with basic access and modification, Perl provides several useful built-in functions for working with hashes. Here are some common operations that we will go through today:

Operator/Function	Description	Example
keys %hash	Returns a list of all keys in the hash	'@keys = keys %fruit;'
values %hash	Returns a list of all values in the hash	'@vals = values %fruit;'
each %hash	Iterates over key-value pairs in the hash	'while (my ($k, $v) = each %fruit) { ... }'
exists $hash{key}	Checks if a key exists in the hash	'exists $fruit{apple}'
delete $hash{key}	Removes a key-value pair from the hash	'delete $fruit{banana};'
scalar %hash	Returns the number of key-value pairs (Perl 5.26 and later; on older versions use 'scalar keys %hash')	'my $count = scalar %fruit;'

Let's look at some examples of these operations. First create a new file 'hashes.pl' and add the following code:

use strict;
use warnings;
use Data::Dumper;
my %fruit = (
    apple  => 'red',
    banana => 'yellow',
    grape  => 'purple',
);
print "Original hash: ", Dumper(\%fruit);

You should be familiar with this code. Once again it uses the 'Data::Dumper' module to print the contents of a variable, this time a hash called '%fruit' whose key-value pairs map fruit names to their colours. If we run this we should see output like the following (the order of the pairs may differ):

Original hash: $VAR1 = {
          'banana' => 'yellow',
          'grape' => 'purple',
          'apple' => 'red'
        };

Let's start with the 'keys' function, which returns a list of all keys in the hash. Add the following code to 'hashes.pl':

my @keys = keys %fruit;
print "Keys in the hash: ", Dumper(\@keys);

This will return a list of all keys in the hash, which in this case will be 'apple', 'banana', and 'grape'. The output will look like this:

Keys in the hash: $VAR1 = [
          'banana',
          'grape',
          'apple'
        ];

It is important to understand that Perl hashes are unordered, so the order of the keys may change each time you run the code. This is because hashes, unlike arrays, do not maintain the order of insertion. If you would like a predictable order, you can use the 'sort' function to sort the list returned by the 'keys' function. For example:

my @sorted_keys = sort keys %fruit;
print "Sorted keys in the hash: ", Dumper(\@sorted_keys);

will return the keys in alphabetical order:

Sorted keys in the hash: $VAR1 = [
          'apple',
          'banana',
          'grape'
        ];
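Alphabetical order is only the default. As a rough sketch of two other orderings, the code below reverses the sorted keys and also sorts the keys by the colour they map to. The variable names are illustrative, and the comparison block uses the special '$a' and '$b' variables that 'sort' provides.

```perl
use strict;
use warnings;

my %fruit = (
    apple  => 'red',
    banana => 'yellow',
    grape  => 'purple',
);

# Keys in reverse alphabetical order.
my @reverse_keys = reverse sort keys %fruit;

# Keys ordered by their values: purple, red, yellow.
my @keys_by_colour = sort { $fruit{$a} cmp $fruit{$b} } keys %fruit;

print "Reverse order: @reverse_keys\n";        # prints 'Reverse order: grape banana apple'
print "Ordered by colour: @keys_by_colour\n";  # prints 'Ordered by colour: grape apple banana'
```

Because the ordering comes from 'sort' and not from the hash itself, the output stays the same on every run.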

Next, let's look at the 'values' function, which returns a list of all values in the hash. Add the following code to 'hashes.pl':

my @values = values %fruit;
print "Values in the hash: ", Dumper(\@values);

This will return a list of all values in the hash, which in this case will be 'red', 'yellow', and 'purple'. The output will look like this:

Values in the hash: $VAR1 = [
          'yellow',
          'purple',
          'red'
        ];

Again, the order of the values may change each time you run the code, as hashes do not maintain order. To fix this you can 'sort' the values like we sorted the keys, or you can 'sort' the keys, iterate over them using 'map', and look up each value, like the following:

my @sorted_values = map { $fruit{$_} } sort keys %fruit;
print "Sorted values in the hash: ", Dumper(\@sorted_values);

This will return the values in the same order as the sorted keys:

Sorted values in the hash: $VAR1 = [
          'red',
          'yellow',
          'purple'
        ];
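The two approaches give different orderings, which is easy to miss. Here is a short sketch comparing them side by side, with illustrative variable names.

```perl
use strict;
use warnings;

my %fruit = (
    apple  => 'red',
    banana => 'yellow',
    grape  => 'purple',
);

# Sorting the values themselves: alphabetical by colour.
my @alphabetical_values = sort values %fruit;

# Sorting the keys, then looking up each value: ordered by fruit name.
my @values_by_key = map { $fruit{$_} } sort keys %fruit;

print "Sorted values: @alphabetical_values\n";  # prints 'Sorted values: purple red yellow'
print "Values by key: @values_by_key\n";        # prints 'Values by key: red yellow purple'
```

Which one you want depends on whether the values need to line up with a sorted list of keys or simply be in order themselves.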

With that working, let's now look at the 'each' function, which iterates over key-value pairs in the hash. Add the following code to 'hashes.pl':

while (my ($key, $value) = each %fruit) {
    print "Key: $key, Value: $value\n";
}

This will print each key-value pair in the hash. The output will look like this:

Key: banana, Value: yellow
Key: grape, Value: purple
Key: apple, Value: red

The 'each' function is useful for iterating over hashes when you need both the key and value. We have also just implemented our first loop in Perl, which is a 'while' loop. This will continue to iterate over the hash until there are no more key-value pairs to process. We will cover loops in more detail in a future post.
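Like 'keys' and 'values', 'each' returns the pairs in no particular order. If you need the same order on every run, one option is to loop over the sorted keys and look each value up, as in this sketch. It uses a 'for' loop, which is an alternative to the 'while' loop above and not a replacement for 'each'.

```perl
use strict;
use warnings;

my %fruit = (
    apple  => 'red',
    banana => 'yellow',
    grape  => 'purple',
);

# Visit the keys alphabetically and look up the value for each one.
for my $key (sort keys %fruit) {
    print "Key: $key, Value: $fruit{$key}\n";
}
# Prints:
# Key: apple, Value: red
# Key: banana, Value: yellow
# Key: grape, Value: purple
```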

Next, let's look at the 'exists' function, which checks if a key exists in the hash. Add the following code to 'hashes.pl':

my $apple_exists = exists $fruit{apple};
print "Does 'apple' exist in the hash? ", Dumper($apple_exists);

When you run this code, it will print the additional output:

Does 'apple' exist in the hash? $VAR1 = !!1;

This indicates that the key 'apple' exists in the hash. If you check for a key that does not exist, such as 'orange', it will return '!!0', which is false. Do not be confused by the double exclamation marks: this is how recent versions of Data::Dumper print Perl's boolean values. On older Perl versions you may instead see '1' for true and an empty string for false.
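One detail worth knowing is that 'exists' checks for the key, not for a usable value. In the sketch below the key 'kiwi' is stored with an undefined value, so it exists but is not defined. The keys 'kiwi' and 'plum' are made up for this example, and the '? :' operator simply picks between two strings.

```perl
use strict;
use warnings;

my %fruit = (
    apple => 'red',
    kiwi  => undef,   # key is present, but it has no value yet
);

my $kiwi_exists  = exists  $fruit{kiwi} ? 'yes' : 'no';
my $kiwi_defined = defined $fruit{kiwi} ? 'yes' : 'no';
my $plum_exists  = exists  $fruit{plum} ? 'yes' : 'no';

print "kiwi exists?  $kiwi_exists\n";   # prints 'kiwi exists?  yes'
print "kiwi defined? $kiwi_defined\n";  # prints 'kiwi defined? no'
print "plum exists?  $plum_exists\n";   # prints 'plum exists?  no'
```

Use 'exists' when you care whether the key is in the hash at all, and 'defined' when you care whether it holds a value.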

Next, let's look at the 'delete' function, which removes a key-value pair from the hash. Add the following code to 'hashes.pl':

delete $fruit{banana};
print "After deleting 'banana': ", Dumper(\%fruit);

When you run this code, it will print the hash after removing the key 'banana':

After deleting 'banana': $VAR1 = {
          'grape' => 'purple',
          'apple' => 'red'
        };

This shows that the key 'banana' has been removed from the hash. If you try to access a deleted key, it will return 'undef', which is Perl's way of saying "no value".
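'delete' also returns the value it removed, which can be handy if you want to use it afterwards. Here is a minimal sketch. The key 'mango' is not in the hash and is only there to show that deleting a missing key is harmless.

```perl
use strict;
use warnings;

my %fruit = (
    apple  => 'red',
    banana => 'yellow',
    grape  => 'purple',
);

my $removed = delete $fruit{banana};  # returns the deleted value
my $missing = delete $fruit{mango};   # key not present: returns undef, no error

my $still_there = exists $fruit{banana} ? 'yes' : 'no';

print "Removed colour: $removed\n";              # prints 'Removed colour: yellow'
print "Is banana still there? $still_there\n";   # prints 'Is banana still there? no'
print "Deleting 'mango' returned undef\n" unless defined $missing;
```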

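Finally, let's look at the 'scalar' function, which evaluates the hash in scalar context. Since Perl 5.26 this returns the number of key-value pairs in the hash. On older versions it returns a string describing the hash's internal bucket usage (such as '2/8'), so 'scalar keys %fruit' is the portable way to count entries. Add the following code to 'hashes.pl':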

my $count = scalar %fruit;
print "Number of key-value pairs in the hash: $count\n";

When you run this code, it will print the number of key-value pairs in the hash:

Number of key-value pairs in the hash: 2

This shows that there are currently two key-value pairs in the hash after we deleted 'banana'.
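If your script might run on an older Perl, counting the keys gives the same answer on every version. A small sketch, starting from the hash as it looks after the delete:

```perl
use strict;
use warnings;

my %fruit = (
    apple => 'red',
    grape => 'purple',
);

# 'keys' in scalar context returns how many keys the hash has.
my $count = scalar keys %fruit;

# Assigning to a scalar variable already gives scalar context,
# so the 'scalar' keyword is optional here.
my $also_count = keys %fruit;

print "Number of key-value pairs in the hash: $count\n";  # prints '... hash: 2'
print "Counted again without 'scalar': $also_count\n";    # prints '... scalar': 2'
```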

Remember that hashes are unordered collections of key-value pairs, so the order of keys and values may change each time you run the code. In the next post, we will look at conditional statements.