Dictionaries are extremely useful data objects. A dictionary is an associative store, basically a set of key-value pairs. A dictionary takes a key and returns the associated value.
Python has dictionaries and the Python engineers have put a lot of effort into making them efficient in both space and time. The PyFL interpreter makes extensive use of dictionaries and they are responsible for major improvements in performance. (PyFL is my experimental functional language.)
The obvious conclusion is that PyFL, too, should have dictionaries. A PyFL dictionary could simply be a Python dictionary wrapped in a functional interface.
In fact this is not so simple as it seems. The Python dictionary interface is imperative: the basic primitive allows a given value to be assigned to a given key. The dictionary is altered in place.
However, in a functional language data objects are immutable. Instead, we have a function which takes a dictionary and a key and a value and returns a new dictionary that differs from the original (at most) in that the given value is associated with the given key.
We also need a primitive to return the value associated with a given key, and a predicate to check that a given key has a value associated with it.
The basic primitives
It all begins with an empty dictionary. In our case, with dict0("student","grade"). This dictionary is empty but it has student and grade as the component names of any future entries.
Adding entries is done with the update function. Thus
d1 = update(dict0,'Tom',83)
defines d1 to be (denote) a dictionary with one key value pair, with 83 associated with the string (denoted by) 'Tom'.
If in addition to we have
d2 = update(d1,'Dick',94)
d3 = update(d2,'Harry',75)
marks = update(d3,'Sally',91)
we can think of marks as denoting a marks table for a small class.
Individual marks are retrieved using the infix operator atkey. Thus marks atkey 'Dick' evaluates to 94 and marks atkey 'Sally' evaluates to 91.
We can check if a dictionary has a given key using the haskey infix operator. To check if 'Tom' is a key, we evaluate marks haskey 'Tom', which yields true. On the other hand, marks haskey 'John' gives false.
We don’t have to build up a dictionary by adding entries one at a time, We can use the extend operator, whose second argument is a whole list of entries,
We represent individual entries as structs, with in this case student and grade components. For example,
<< student:'Tom' grade: 83 >>
The second argument of extend is a list of such entries. We could have defined marks as
marks = dict0("student","grade") extend
[%
<< student: 'Tom' grade: 83 >>,
<< student: 'Dick' grade: 91 >>,
<< student: 'Harry' grade: 75 >>,
<< student: 'Sally' grade: 94 >>
%]
The dictionary produced has all the new entries. If the original dictionary had entries for the given keys, they are overwritten.
The operations map and filter
There are also two important operations, map and filter, which allow us to avoid inefficient one-entry-at-a-time computations.
Suppose, for example, that we need to increase each class mark by 5. We could laboriously extract each mark, add 5, then update the dictionary. The first step would be
tommark = marks atkey 'Tom'
marks1 = update(marks,'Tom',tommark+5)
and this would be repeated three more times, creating three more dictionaries in the process.
Instead, we can use the map operator. It takes a dictionary and a function and returns the dictionary which results if we apply the function to each of the original dictionary’s values. Thus
newmarks = marks map lambda (m) m+5 end
or equivalently
marks map upgrade
where
upgrade(m) = m+5
end
Now suppose we want to look at only those who got A+ by having a mark at least 90. Once again we could laboriously extract individual entries. Instead we use the filter operator. It takes a dictionary and a predicate and returns the dictionary formed by taking those entries in the original dictionary which satisfy the predicate. Then
apmarks = marks filter lambda (e) e.grade >= 90 end
or equivalently
marks filter aplus
where
aplus(e) = e.grade >= 90
end
(Here the variable e stands for an entry, i.e. a struct with component names student and grade.)
VBOs
There are many more calculations that map and filter can’t perform. For example, suppose we want to compute the average mark. The operations map and filter are of no use. We do know how to compute the average component of a list L, we use the avg vbo: avg x for x in L end. Our idea is to use the same approach for dictionaries.
We do this by allowing dictionaries to be ranges of VBO expressions. The following expression
avg e.grade for e in dict marks end
computes the average mark. Here the variable e ranges over the entries of marks.
If we want the average of the A+ students we just add a condition:
avg e.grade for e in dict marks:e.grade >= 90 end
This gives us the set of names of the A+ students:
collect e.student for e in dict marks: e.grade >= 90 end
and it evaluates to the set denoted by {'Dick','Sally'}.
To calculate the top mark and the names of those who earned it
those e.student : e.grade == topgrade for e in dict marks end
where
topgrade = max e.grade for e in marks
end
This evaluates to the set denoted by the constant {'Sally'}.
Fessing up
Now a confession: most of this is vaporware, I’ve only implemented the basic primitives.
But the basic primitive don’t make for much of a blog post, so I faked the rest of the features.
I intend to implement them but not real soon (now). I’m currently working on a comprehensive PyFL manual (with Vaios Michalakis). When it’s done I’ll post it here.