For instance, given the following investments:
If we want to maximize earnings over 10 years, should we purchase the first investment ten times, the second twice, the third three times plus the first once, the second once plus the third once plus the first twice, or some other combination?
Before we go into programming, let’s do some basic math/algorithms. This is all simple math, so don’t worry. You can also skip to programming if you prefer.
Theorem 1: The final value to withdraw from any investment i such as the ones exemplified can be written as the product of the initial value and a factor defined by the investment i: finalValue = v * factorInvestment(i).
Proof: Given some initial value v, a compound yearly interest rate r, an investment time t and a tax rate on profits taxes, the amount withdrawn is the initial value plus the profit after taxes: finalValue = v + (v*(1+r)^t - v)*(1 - taxes) = v * (1 + ((1+r)^t - 1)*(1 - taxes)). The previous line proves the existence of the factor. Now, since the factor 1 + ((1+r)^t - 1)*(1 - taxes) depends only on r, t and taxes, it is determined by the investment alone.
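Here is a quick Python sketch of this computation, under the assumption (consistent with the proof sketch above) that profits are taxed once at withdrawal; the rates match the example investments used later in the article:

```python
def factor(rate, taxes, years):
    # final value = v + (v*(1+rate)^years - v) * (1 - taxes)
    #             = v * (1 + ((1+rate)^years - 1) * (1 - taxes))
    return 1 + ((1 + rate) ** years - 1) * (1 - taxes)

# 9% yearly for 1 year, 25% tax on profits
print(factor(0.09, 0.25, 1))  # approximately 1.0675
# 7% yearly for 3 years, tax-free
print(factor(0.07, 0.0, 3))
```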
Theorem 2: Given a set of possible investments and a deadline n, if S is the best investment strategy for n, then any sub-strategy S' contained in S is the best strategy for the sum of the times of the investments contained in it.
Proof: Without loss of generality, let us consider S = S1 ++ S2, with total times t1 and t2, and assume the contrary: S1 is not the best strategy for time t1, but S is the best strategy for time t1 + t2. So there must be some strategy S1' for time t1 with factor(S1') > factor(S1), and that would mean factor(S1' ++ S2) > factor(S1 ++ S2) = factor(S), which contradicts S being the best strategy. Therefore, there can be no such S1' and S1 is optimal.
Maybe you skipped the last part, but don’t worry. I’ll just roll out the recursive solution to the problem. How do we describe the list of investments to be made that maximizes earnings after some time n? It is the strategy with the highest factor among the single investments that take exactly n years and the combinations best(i) ++ best(n - i) of best strategies for shorter times.
This basically means we test every possible combination, which is not very smart, of course. The advantage of finding a recursive solution is that we can compute and store calculations for use later on. This is what we call memoization.
Also, we only need to check splits up to n/2 (rounded down), since every other check is redundant by symmetry.
How do we implement this? In a procedural language we could use an array of size n and fill it up with all solutions from 0 to n. In Haskell we generally don’t want to use mutable data structures nor do we want to specify the order of evaluation of things, so we must find another tool in the toolbox to do this; it turns out that lazy evaluation is just that!
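As a sketch of that procedural approach in Python (the (years, factor) pairs are hypothetical stand-ins for the investments), filling the table from 0 to n looks like this:

```python
# Hypothetical (years, factor) pairs standing in for the investments
investments = [(1, 1.0675), (5, 1.3933), (3, 1.225)]

def best_factor(n):
    best = [1.0] * (n + 1)  # best[t]: best factor achievable in t years
    for t in range(1, n + 1):
        # a single investment taking exactly t years...
        for years, f in investments:
            if years == t:
                best[t] = max(best[t], f)
        # ...or the best split into two sub-strategies (Theorem 2)
        for i in range(1, t // 2 + 1):
            best[t] = max(best[t], best[i] * best[t - i])
    return best[n]

print(best_factor(10))
```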
Lazy evaluation is, roughly speaking, a mechanism by which a value is only computed when required by some function. This means that we can define a data structure in terms of itself, and that it can even be infinite. Take the following example:
repeat :: a -> [a]
repeat x = x : repeat x
What happens here is that the function repeat takes an object of some type a and returns a possibly infinite list of a. The list will grow in size as more elements of it are demanded by evaluation. Let us take this idea to implement our bestStrategyFunctionBad:
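For readers more at home in a strict language, roughly the same on-demand behaviour can be sketched with a Python generator (this is an analogy, not a translation; Haskell's lazy list is a data structure, a generator is not):

```python
from itertools import islice

def repeat(x):
    # produces x forever; each element exists only when demanded
    while True:
        yield x

# demand just three elements of the "infinite" list
print(list(islice(repeat(7), 3)))  # [7, 7, 7]
```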
-- We use the "investment" below to make sure the algorithm always returns some strategy, even if it means leaving your money in the bank
investmentLeaveInTheBank :: Investment
investmentLeaveInTheBank = Investment { name = "Leave it in the bank account", rate = 0, taxes = 0.0, time = 1 }
withMax :: Ord a => (b -> a) -> [b] -> Maybe b
withMax f xs = fmap snd maybeRes
  where maybeRes = foldl' (\acc el -> case acc of
                             Just (maxVal, _) -> let cmp = f el in if cmp > maxVal then Just (cmp, el) else acc
                             Nothing -> Just (f el, el)) Nothing xs
withMax1 :: Ord a => (b -> a) -> b -> [b] -> b
withMax1 f firstEl xs = snd $ foldl' (\acc@(maxVal, _) el -> let cmp = f el in if cmp > maxVal then (cmp, el) else acc) (f firstEl, firstEl) xs
bestStrategyBad :: Int -> [Investment] -> [Investment]
bestStrategyBad timeInYears invs' = go !! timeInYears
  where invs = investmentLeaveInTheBank : invs'
        factorStrategyBad is = product $ fmap factorInvestment is
        bestStrat desiredTime = withMax1 factorStrategyBad (maybeToList (bestInvestmentWithTime desiredTime)) (allCombinations desiredTime)
        bestInvestmentWithTime desiredTime = withMax factorInvestment $ filter (\i -> time i == desiredTime) invs
        -- For desiredTime=7 "allCombinations" returns strategies e1 ++ e6, e2 ++ e5 and e3 ++ e4
        allCombinations desiredTime = let halfTheTime = floor (fromIntegral desiredTime / 2)
                                      in fmap (\i -> go !! i ++ go !! (desiredTime - i)) [1..halfTheTime]
        go :: [[Investment]]
        go = [] : fmap bestStrat [1..]
There is nothing magical about the code above. When demanding go !! 20, for instance, the function bestStrat will be called with the value 20, which will demand all possible strategy combinations (as defined by our equations). Demanding all combinations will once again require go !! 1 through go !! 19, which will repeat the process for a smaller n (the fact that they are smaller is crucial for our recursion to converge).
What is different from recursion in imperative languages is that go is not a function: it is a list whose values are lazily calculated. As values are demanded from it, they are calculated only once, so you don’t have to worry about what order to evaluate things in. In C# this is sort of like a List<Lazy<Investment[]>>.
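The "computed at most once, on first demand" behaviour of go can be imitated in a strict language with memoization. A small Python sketch (not the article's code) that counts how many times each value is actually computed:

```python
from functools import lru_cache

computed = []

@lru_cache(maxsize=None)
def best(n):
    # stand-in for bestStrat: each index is computed at most once,
    # no matter how many times it is demanded
    computed.append(n)
    if n < 2:
        return n
    return best(n - 1) + best(n - 2)

best(10)
print(len(computed))  # 11 distinct values computed, despite many lookups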
This is nice! Still, there are two bad things about this solution: the factor of each candidate strategy is recomputed with a full product every time strategies are compared, and the same list concatenations are redone over and over. We can fix this by caching the factor alongside the strategy:
data StrategyCalc = StrategyCalc [Investment] Double

factorStrategyGood (StrategyCalc _ x) = x

combine :: StrategyCalc -> StrategyCalc -> StrategyCalc
combine (StrategyCalc s1 f1) (StrategyCalc s2 f2) = StrategyCalc (s1 ++ s2) (f1 * f2)
bestStrategyGood :: Int -> [Investment] -> [Investment]
bestStrategyGood timeInYears invs' = let StrategyCalc res _ = go !! timeInYears in res
  where invs = investmentLeaveInTheBank : invs'
        bestStrat desiredTime = withMax1 factorStrategyGood (bestInvestmentWithTimeOr1 desiredTime) (allCombinations desiredTime)
        bestInvestmentWithTimeOr1 desiredTime = case withMax factorInvestment $ filter (\i -> time i == desiredTime) invs of
          Nothing -> StrategyCalc [] 1
          Just i -> StrategyCalc [i] (factorInvestment i)
        -- For desiredTime=7 "allCombinations" returns strategies e1 ++ e6, e2 ++ e5 and e3 ++ e4
        allCombinations desiredTime = let halfTheTime = floor (fromIntegral desiredTime / 2)
                                      in fmap (\i -> combine (go !! i) (go !! (desiredTime - i))) [1..halfTheTime]
        go :: [StrategyCalc]
        go = StrategyCalc [] 1 : fmap bestStrat [1..]
Take your time to digest this: the list of investments in each StrategyCalc will only be evaluated when the caller needs it to be evaluated. However, the combine function creates a StrategyCalc whose factor is calculated in constant time when combining two strategies. In fact, you could even obtain the final factor of the optimal strategy without ever constructing a non-empty list. Nice!
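The caching trick is independent of Haskell; here is a Python sketch of the same idea (the names mirror the article's, the values are made up):

```python
class StrategyCalc:
    """A strategy paired with its pre-computed factor."""
    def __init__(self, investments, factor):
        self.investments = investments
        self.factor = factor

def combine(a, b):
    # O(1) on the factor: multiply the cached factors instead of
    # recomputing a product over the concatenated list
    return StrategyCalc(a.investments + b.investments, a.factor * b.factor)

s = combine(StrategyCalc(["Investment 3"], 1.225),
            StrategyCalc(["Investment 1"], 1.0675))
print(s.investments, s.factor)
```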
I thought a nice touch to finish this article would be to introduce an abstraction: the Monoid.
A Monoid is just a fancy name for a binary operation that is associative together with a value that is an identity for this operation. The Int type with the sum function (+) and the value 0 forms an instance of Monoid, for instance, since x + 0 == 0 + x == x for any x of type Int and (+) is associative.
The same thing happens with investment strategies! So we can replace the combine function by the Monoidal append:
instance Monoid StrategyCalc where
  mempty = StrategyCalc [] 1
  mappend (StrategyCalc i1 f1) (StrategyCalc i2 f2) = StrategyCalc (i1 ++ i2) (f1 * f2)
Don’t forget that <> is an infix alias for mappend!
When taking the code for bestStrategyGood and the three investments from the beginning of the article, let us devise the best strategy to maximize gains over the next 11 years:
$ ghci
ghci> :l Investments.hs
ghci> let availableInvestments = [ Investment { name = "Investment 1", rate = 0.09, taxes = 0.25, time = 1 }, Investment { name = "Investment 2", rate = 0.08, taxes = 0.15, time = 5 }, Investment { name = "Investment 3", rate = 0.07, taxes = 0, time = 3 } ]
ghci> fmap name $ bestStrategyGood 11 availableInvestments
["Investment 3","Investment 3","Investment 2"]
So it seems that buying investment 3, rebuying it and then buying investment 2 is the best strategy in this case.
That’s it! I hope you liked it, and if it helps, do know that this problem is still solvable with the same algorithm if the tax of each investment is a function of the amount of time since the investment title was purchased and if the time until withdrawal is either an exact time or a minimum time. It is also possible to include inflation-correcting investments if you pass around some estimated inflation; all of this with only minor modifications. Also, feel free to change the time unit to months and get something much more precise for your investments!
Functional programming is getting a lot of attention nowadays. Every mainstream language now supports a functional style: Java, C++, C# and so on. Basically, it is a paradigm of writing code where we write less and do more.
Why do we really care about it?
Around 2003, when it became clear that single-core processors were not enough, manufacturers started developing multi-core processors. Writing multi-threaded applications was already hard. On a single processor, a multi-threaded application behaves more like multi-tasking, but on a multi-core system things are quite different: an application that is subtly broken may appear to work correctly on a single core and then visibly break on multiple cores. One reason is that on a multi-core system threads truly run simultaneously, so they can access shared data much more rapidly. As a result, writing programs that are both correct and concurrent has become a huge problem for us as application developers. Another problem arises from mutable state: it is extremely difficult to work with mutability, especially when multiple threads start sharing it. The more mutability there is, the more error-prone the code becomes, whereas functional programming is inherently thread-safe.
What is the problem with mutable state?
Functional programming prefers a declarative style of coding over an imperative one. Declarative style is less verbose, focuses on assignment-less programming, and relies on pure functions.
What is a pure function?
A pure function gives exactly the same result no matter how many times you call it, as long as the input is exactly the same, and it has no side effects.
For example, We will see the same code in both style of coding:
def factorial(n: Int): Int = {
var result = 1
var i = 1
while (i <= n) {
result = result * i
i = i + 1
}
result
}
def factorial(num: Int): Int = {
  if (num <= 1) 1
  else num * factorial(num - 1)
}
Look at the two approaches to the same factorial problem above. The first uses an imperative style of coding: there is a lot of mutation, it is verbose, and we have to spell out what to do and how to do it at every step.
The second approach is declarative: it is assignment-less, concise, more expressive, thread-safe, free of explicit mutation, and we do not dictate how each step is carried out.
When we talk or hear about functional programming, people usually think that immutability and higher-order functions are the extremely important parts. But the really charming features of functional programming are function composition and lazy evaluation.
Functional Composition makes the code more expressive, concise, easier to understand.
For example: find all even numbers greater than 5 and double their values.
scala> val list = (1 to 10).toList
list: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
scala> list.filter(_ > 5).filter(_ % 2 == 0).map(_ *2)
res20: List[Int] = List(12, 16, 20)
The code is very expressive, easy to understand at every step, highly cohesive, and there is no explicit mutation.
But one thing to notice: it creates lots of intermediate collections while computing the final result. Lazy evaluation avoids these unnecessary computations and adds efficiency to your code.
Laziness leads to efficiency: we are no longer making copy after copy of the collection. You can defer execution until a much later time and have it run only when needed; that is one of the biggest benefits you get out of this.
For example,
scala> list.view.filter(_ > 5).filter(_ % 2 == 0).map(_ *2).toList
res22: List[Int] = List(12, 16, 20)
Now it will be evaluated lazily. It will never create intermediate collections; it only starts evaluating when we actually need the result. In our case, when we call .toList, execution starts. So no performance is wasted, and it is very hard to be this efficient without laziness.
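The same single-pass behaviour that view gives in Scala can be sketched with Python generators, where each element flows through the whole pipeline before the next one is produced:

```python
nums = range(1, 11)
gt5 = (x for x in nums if x > 5)        # nothing runs yet
even = (x for x in gt5 if x % 2 == 0)   # still nothing
doubled = (x * 2 for x in even)         # still nothing

# only list() actually pulls elements through the pipeline
result = list(doubled)
print(result)  # [12, 16, 20]
```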
So the biggest benefit of functional programming is code clarity: you don’t need to waste time figuring out what is happening at every step, there is no mutation, and the code can easily be made concurrent. When functional composition meets lazy evaluation, reactive programming begins.
The main aim of this blog is to understand the essence of functional programming. I have used Scala for the examples. In our next blog, we will go deeper into lazy evaluation.
Please feel free to suggest or comment!
References:
1) Thinking and Programming in Functional Style
In this chapter Okasaki works around the problem of doing amortized analysis with persistent data structures: the classical amortized analysis assumes in-place modification, while for persistent data structures (partial) copies are made. The intuition is that lazy evaluation, which comes with memoization and avoids recomputation, solves this problem.
He adapts the Banker’s and Physicist’s methods to work with lazily evaluated operations and applies them to a few structures including Binomial Heaps, Queues and Lazy Pairing Heaps. In this post we’ll only cover the examples of the Queues using both methods.
We’ll first introduce some concepts and terminology, then we’ll present a queue implementation using lazy evaluation that allows us to analyze it under persistence. Following that, we’ll explain the Banker’s and Physicist’s methods and prove that the implementation of push/pop has an efficient amortized cost.
An execution trace is a DAG where nodes represent operations (e.g. updates to a data structure) and an edge from node u to node v indicates that the operation corresponding to v uses the output of the one corresponding to u.
For example, if we have this set of operations:
let a = push 0 newEmpty
let b = push 1 a
let c = pop b
let d = push 2 b
let e = append c d
let f = pop c
let g = push 3 d
The corresponding execution graph is:
The logical history of an operation v is the set of all operations it depends on (directly or indirectly, and including itself). Equivalently, in terms of the DAG, it’s the set of nodes that have a directed path to v.
A logical future of an operation v is any directed path from v to a terminal node.
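These definitions are easy to operationalize. Here is a Python sketch over the example DAG above, with the dependency edges recorded as "uses the output of":

```python
# deps[v] = operations whose output v uses (the reversed edges of the DAG)
deps = {
    "a": [], "b": ["a"], "c": ["b"], "d": ["b"],
    "e": ["c", "d"], "f": ["c"], "g": ["d"],
}

def logical_history(op):
    # every operation op depends on, directly or indirectly, including itself
    seen, stack = set(), [op]
    while stack:
        cur = stack.pop()
        if cur not in seen:
            seen.add(cur)
            stack.extend(deps[cur])
    return seen

print(sorted(logical_history("e")))  # ['a', 'b', 'c', 'd', 'e']
```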
We need to introduce a few more concepts and terminology. Note: we’ll use suspension and lazy operation interchangeably, which can be evaluated or forced.
The unshared cost of an operation is the time it would take to execute it if it had already been performed and memoized before, so if the operation involves any expression that is lazy, that expression would be O(1).
The shared cost of an operation is the time it would take to execute (force) all the suspensions created (but not evaluated) by the operation.
The complete cost is the sum of the shared and unshared costs. Alternatively, the complete cost of an operation is the time it would take to execute the operation if lazy evaluation were replaced by strict evaluation. To see why, first note that the unshared costs have to be paid regardless of laziness. Since we’re assuming no laziness, the operation has to pay the cost associated with the suspensions it creates, which corresponds to the shared costs. Note that under this assumption we don’t need to account for the cost of forcing suspensions created by previous operations, because in theory they have already been evaluated.
When talking about a sequence of operations, we can break down the shared costs into two types: realized and unrealized costs. The realized costs are the shared costs of suspensions that were actually forced by some operation in the sequence. Example: say that operations A and B are in the sequence, A creates a suspension, and B later forces it. The cost for B to force it is included in the realized cost. The unrealized costs are the shared costs of suspensions that were created but never evaluated within the sequence. The total actual cost of a sequence of operations is the sum of the realized costs and the unshared costs.
Throughout a set of operations, we keep track of the accumulated debt, which starts at 0 at the beginning of the sequence. Whenever an operation is performed, we add its shared cost to it. For each operation, we can decide how much of this debt we want to pay. When the debt of a suspension is paid off, we can force it. The amortized cost of an operation is its unshared cost plus the amount of debt it paid (note that it does not include the realized cost). Note that as long as we always pay the cost of a suspension before it’s forced, the amortized cost will be an upper bound on the actual cost.
This framework simplifies the analysis for the case when a suspension is used more than once by assuming that its debt was paid off within the logical history of when it was forced, so we can always analyze a sequence of operations and not worry about branching. This might cause the debt to be paid multiple times, but it simplifies the analysis.
The author uses the term discharge debit as synonym of pay off debt. I find the latter term easier to grasp, so I’ll stick with it throughout this post.
Let’s introduce an example first and then proceed with the explanation of the Physicist’s method and the corresponding analysis of the example.
To allow efficient operations on a queue in the presence of persistence, we can make some of the operations lazy. Recall in a previous post we defined a queue using two lists. To avoid immediate computation, a natural replacement for lists is using its lazy version, the stream data structure, which we also talked about in a previous post.
For the list-based queue, the invariant was that if the front list is empty, then the rear list must be empty as well. For the stream queue, we have a tighter constraint: the front stream must always be at least as long as the rear stream (|front| >= |rear|). This constraint is necessary for the analysis.
The definition of the stream queue is the following:
We store the lengths of the streams explicitly for efficiency.
We’ll be using the Stream developed in the previous chapter, so we’ll refer to the module Stream2 to avoid ambiguity with the standard Stream module.
Inserting an element at the end of the queue is straightforward, since the rear stream represents the end of the queue and is reversed:
The problem is that inserting at rear can cause the invariant of the queue to be violated. check() changes the structure to conform to the invariant by potentially reversing rear and concatenating it with front:
Removing an element from the queue requires us to evaluate the first element of the front stream. Again, the invariant can be violated in this case, so we need to invoke check() again:
The complete code for the stream queue is on Github.
The idea of the Banker’s Method is basically to define an invariant for the accumulated debt and a strategy for paying it off (that is, to decide how much debt each operation pays off). Then we show that whenever we need to force a suspension, the invariant guarantees that the accumulated debt has been paid off. One property of the Banker’s method is that it allows associating the debt with specific locations of the data structure. This is particularly interesting for streams, because they contain multiple (nested) suspensions, so we might force parts of the structure before we have paid the debt associated with the entire structure.
By inspection, we can see that the unshared cost of both push and pop is O(1). It’s obvious in the case of push; in the case of pop, in theory check could take O(m), where m is the size of the queue, but since Stream2.concat() and Stream2.reverse() are both lazy, and hence memoized, they are not included in the unshared costs.
To show that the amortized cost of both operations is O(1), we can show that paying off O(1) debt at each operation is enough to pay for the suspension before it is forced. For the queue, we also need to associate the debt with parts of the data structure, so that we could force the suspension of only some parts of it (for example, on the stream we can evaluate only the head, not necessarily the entire structure).
We now define an invariant that should be respected by all operations. Let d(i) be the debt at the i-th node of the front stream, and D(i) = d(0) + d(1) + ... + d(i) the accumulated debt up to node i. The invariant is:

D(i) <= min(2i, |f| - |r|)

This constraint allows us to evaluate the head at any time, because D(0) <= 0, which means its debt has been paid off. The second term in min(), |f| - |r|, guarantees that if |f| = |r| the entire stream can be evaluated, because D(i) = 0 for all i.
The author then proves that paying off one unit of debt in push() and two units in pop() is enough to keep the debt within the constraint.
Because the Physicist’s method cannot assign costs to specific parts of the data structure, it doesn’t matter whether the structure can be partially forced (like streams) or whether it’s monolithic. With that in mind, we can come up with a simpler implementation of the queue by working with suspended lists instead of streams. Only the front list has to be suspended, because the cost we want to avoid, the reversal of the back list and concatenation to the front list, happens on the front list.
On the other hand, we don’t want to evaluate the front list when we perform a peek or pop, so we also keep an evaluated version of the front list.
The signature of the structure is as follows:
As mentioned in the code above, the invariants we want to enforce are that the front list is never smaller than the rear list and that the evaluated version of the front list is never empty while the lazy version still has some elements.
The push and pop operations are similar to the other versions of queue, but since we mutate the structure, we might need to adjust it to conform to the invariants:
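Since the original Gist listings are not reproduced here, the following is my own Python sketch of such a suspended queue, with zero-argument lambdas playing the role of suspensions (the field names are mine, not Okasaki's):

```python
class SuspendedQueue:
    """Physicist's-style queue: the front is a suspended (thunked) list."""
    def __init__(self, forced_front, lazy_front, lenf, rear, lenr):
        self.forced_front = forced_front  # evaluated prefix of the front
        self.lazy_front = lazy_front      # thunk producing the full front list
        self.lenf = lenf
        self.rear = rear                  # back of the queue, newest first
        self.lenr = lenr

    @staticmethod
    def empty():
        return SuspendedQueue([], lambda: [], 0, [], 0)

    def _check(self):
        # restore the invariants: |front| >= |rear|, and forced_front
        # non-empty whenever the front is non-empty
        if self.lenr > self.lenf:
            front = self.lazy_front() + list(reversed(self.rear))
            return SuspendedQueue(front, lambda: front,
                                  self.lenf + self.lenr, [], 0)
        if not self.forced_front and self.lenf > 0:
            return SuspendedQueue(self.lazy_front(), self.lazy_front,
                                  self.lenf, self.rear, self.lenr)
        return self

    def push(self, x):
        return SuspendedQueue(self.forced_front, self.lazy_front,
                              self.lenf, [x] + self.rear,
                              self.lenr + 1)._check()

    def peek(self):
        return self.forced_front[0]

    def pop(self):
        lf = self.lazy_front
        rest = SuspendedQueue(self.forced_front[1:], lambda: lf()[1:],
                              self.lenf - 1, self.rear, self.lenr)
        return self.forced_front[0], rest._check()

q = SuspendedQueue.empty().push("a").push("b")
v, q = q.pop()
print(v, q.peek())  # a b
```

The sketch is persistent in spirit (every operation returns a new queue), but unlike the OCaml version the thunks here are not memoized, so it illustrates the structure rather than the cost bounds.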
Finally, because of our second invariant, peeking at the queue is straightforward:
The complete code for the suspended queue is on Github.
We’ve seen the Physicist’s Method in a previous post, where we ignored the persistence of the data structures. We now adapt the method to work with debits instead of credits. To avoid confusion, we’ll use Φ to represent the potential function; Φ(i) represents the accumulated debt of the structure after step i. At each operation we may decide to pay off some debt, which will then be included in the amortized cost. We have that Φ(i) - Φ(i-1) is the increase in debt after operation i. Remember that the shared cost of an operation corresponds to the increase in debt if we don’t pay any of the debt. Thus, we can find out how much debt was paid off by s(i) - (Φ(i) - Φ(i-1)), where s(i) is the shared cost of operation i. Let u(i) and c(i) = u(i) + s(i) be the unshared and complete costs of operation i. Given that, by definition, the amortized cost is the unshared cost plus the debt paid off, we can express it as:

a(i) = u(i) + s(i) - (Φ(i) - Φ(i-1)) = c(i) - (Φ(i) - Φ(i-1))
To analyze the suspended queue we need to assign values to the potentials such that by the time we need to evaluate a suspension, the potential on the structure is 0 (that is, the debt has been paid off). For the suspended queues we’ll use the following potential function:

Φ = min(2|w|, |f| - |r|)

where w is the forcedFront, f is lazyFront and r is rear.
We now claim that the amortized cost of push is at most 2. If we push an element that doesn’t cause a rotation (i.e. doesn’t violate |f| >= |r|), then |r| increases by 1, and the potential decreases by 1. No shared cost is incurred and the unshared cost, inserting an element at the beginning of rear, is 1, hence the amortized cost for this case is 1 - (-1) = 2. If it does cause a rotation, then it must be that after the insertion |r| = m + 1 and |f| = m. After the rotation we have |f'| = 2m + 1 and |r'| = 0, but w hasn’t changed and cannot be larger than the original f, so the potential function is at most 2m. The reversal of r costs m + 1 and concatenating it to a list of size m costs m (discussed previously), plus the cost 1 of initially appending the element to rear, so the unshared cost is 2m + 2. No suspensions were created, so the amortized cost is given by 2m + 2 - 2m = 2.
Our next claim is that the amortized cost of pop is at most 4. Again, if pop doesn’t cause a rotation, |w| decreases by 1, so the potential is reduced by at most 2. The unshared cost is 1, removing an element from w, and the shared cost of 1 comes from the suspension that lazily removes the head of lazyFront. The amortized cost is 2 - (-2) = 4. Note that if w becomes empty, we’ll evaluate f, but the idea is that its debt has been paid off already. If the pop operation causes a rotation, then the analysis is similar to the push case, except that the complete cost must also account for the shared cost of lazily removing the head of lazyFront, so it’s 2m + 3, for an amortized cost of 3.
Note that when the suspensions are evaluated, the potential is 0: either w is empty (so 2|w| = 0), or a rotation is being triggered (just before it, |f| = |r|, so |f| - |r| = 0).
In this post we covered a simple data structure, the queue, modified it to be lazily evaluated, and showed, with theory, that it allows for efficient amortized costs. The math for proving the costs is not complicated. The hardest part for me was grasping the intuition behind the analysis. The analogy with debt is very useful.
Meta: Since the WordPress code plugin doesn’t support syntax highlighting for OCaml, I’m experimenting with the Gist plugin. Other advantages are that Gists allow comments and have version control!
var array = [1, 2, 3, 4, 5];
var square = array.map(function(e){ return e * e; });
var even = square.filter(function(e){ return e % 2 === 0; });
var total = even.reduce(function(a, b){ return a + b; }, 0);
If we look closely, each operation (square, even, total) iterates over every element of the array. Can these operations be optimized? Yes: we can compute the final total with a single iteration. For that, we need a technique called lazy evaluation.
With lazy evaluation, the operations are arranged so that the predicate (function) in each operation is only executed when it is needed.
if (sequence.Any()) ...
and it is as efficient as can be, but when you rely on the Count method:

if (sequence.Count() > 0) ...

you may fall into a trap of extensive computation even though you just need confirmation of whether the sequence is empty or not. Well, this disparity annoyed me for a long time, and all I could think of was somehow passing a predicate to Count. That is as fishy as it sounds, and it annoyed me even more.
Until the perspective changed: who said Count should count anything? Why should Count jump right away to the exact answer? Count should be lazy; it should simply return an instance of some numeric-like type, say SequenceCount. That type should keep a reference to the sequence, have an implicit conversion to Int and, crucially, overload the comparison operators. With all those goodies you could finally write:
// Skila syntax
if (1...).count() > 0 then
  stdOut.writeLine("not empty");
end
to see a printout after a few CPU cycles.
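To illustrate the idea outside Skila, here is a Python sketch of such a lazy count (the names are mine; a comparison consumes only as many elements as it needs, so it terminates even on an infinite sequence):

```python
class SequenceCount:
    """Lazy count: comparisons pull only as much of the sequence as needed."""
    def __init__(self, iterable):
        self._it = iter(iterable)
        self._seen = 0

    def _advance_to(self, n):
        # pull elements until we have seen n of them or the sequence ends
        while self._seen < n:
            try:
                next(self._it)
            except StopIteration:
                return False
            self._seen += 1
        return True

    def __gt__(self, n):
        return self._advance_to(n + 1)

def naturals():
    i = 1
    while True:
        yield i
        i += 1

# terminates after examining a single element of an infinite sequence
print(SequenceCount(naturals()) > 0)  # True
```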
Now that I have solved this issue, I can think about a general lazy evaluation design for Skila…
Given a transition table, construct the NFA and, using subset construction, generate the DFA.
The task is to convert the NFA N into a DFA D such that L(D) = L(N): both the DFA and the NFA accept the same language.
Since the NFA has 4 states, its power set will contain 2^4 = 16 states. Omitting the empty set, there will be 15 states.
If Q is the set of states of the NFA, then 2^Q, the power set of Q, contains the possible states of the DFA D.
Each set in the power set can be renamed to something else to make it easier to follow. Each of the sets in the power set represents a state in the DFA.
The NFA can be converted to DFA using complete subset construction or by lazy evaluation of states. I will show both methods.
The set of final states of the DFA is the set of all subsets of NFA states that include at least one accepting state; that is, the final states of the DFA are the subsets S of NFA states such that S ∩ F ≠ ∅, where F is the set of accepting states of the NFA.
For each set S in the power set and for each input symbol a in the alphabet Σ,

δ_D(S, a) = the union of δ_N(s, a) over all s in S

This is the formula used to fill the subset table, although when filling it by hand the pattern is clearly visible.
This will be the transition table for the DFA, although not all of the states will be needed. The first thing to do is to mark every set that contains a final state of the NFA with a star; {p}, the set containing the NFA’s start state p, will be the start state of the DFA.
The table can be filled using a shortcut: first copy the NFA table for the first four (singleton) states, since they are the same as in the NFA. Then, for each element of a set, copy its corresponding row and union all the resulting sets.
This is the same thing the formula above does: to get the transition from a state (one of the sets in the power set) on an input symbol, take each element of the set, look up the set of states it produces on that input symbol, and union them all.
For example,
Another example,
Here I just used the pre-computed value in the subset table: since {p,q,r} was computed beforehand, I can take its value from the table.
Each set can be renamed to something else; here I name the sets from 1 to 15. It becomes a little confusing with the input symbols, but changing things at this point would take a lot of work.
After renaming the states in the subset table:
Only the reachable states of the power set of Q, i.e. the reachable states of the DFA D, actually need to be constructed.
The method above is very slow and time-consuming. Since not all states are reachable in every DFA, constructing states lazily speeds up the process by avoiding extra work.
Start with the start state, construct only the states that are reachable from it, and follow the same procedure for each newly discovered state.
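This lazy construction can be sketched in Python. The NFA below is a hypothetical 4-state example over the alphabet {0, 1} (not necessarily the one from the tables above); only DFA states reachable from {p} are ever built:

```python
from collections import deque

# Hypothetical NFA transition table: (state, symbol) -> set of states
nfa = {
    ("p", "0"): {"p", "q"}, ("p", "1"): {"p"},
    ("q", "0"): {"r"},      ("q", "1"): {"r"},
    ("r", "0"): {"s"},      ("r", "1"): set(),
    ("s", "0"): {"s"},      ("s", "1"): {"s"},
}
start, alphabet = "p", ["0", "1"]

def lazy_subset_construction():
    dfa = {}
    queue = deque([frozenset({start})])
    while queue:
        state = queue.popleft()
        if state in dfa:
            continue  # this subset was already constructed
        dfa[state] = {}
        for a in alphabet:
            # delta_D(S, a) = union of delta_N(s, a) for s in S
            target = frozenset().union(*(nfa.get((s, a), set()) for s in state))
            dfa[state][a] = target
            queue.append(target)  # only follow states we can actually reach
    return dfa

dfa = lazy_subset_construction()
print(len(dfa))  # reachable DFA states, out of 15 possible non-empty subsets
```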
As can be seen, both DFAs are the same.
Prime.genPrime(20)                  // Generate the prime numbers up to 20
     .filter(e -> e < 15)           // Select the prime numbers less than 15
     .map(e -> e*2)                 // Double those prime numbers
     .forEach(System.out::println); // Output the result
The class Prime represents prime numbers and has a static nested class named PrimeIterator, which iterates over the prime numbers lazily. The abstract class Prime inherits from Pipeline, which has a method called opWrapSink, so it must implement this method when a concrete Prime instance is created in its static method genPrime(). Pipeline behaves like a linked list recording the piled-up stages (filter/map/forEach, etc.). The method opWrapSink returns a Sink which holds the action of each stage; the actions are passed in as lambda expressions here. Since forEach is a terminal operation that doesn’t produce a stream result, a dedicated terminal Sink called ForEachOp was made. The class ChainedInt is used for the other, intermediate operations such as filter/map, which produce a new stream.
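The sink-wrapping mechanics can be condensed into a few lines of Python (the names here are mine; this mirrors the structure of the Java code below, not its API):

```python
def for_each(source, stages, action):
    # wrap the terminal action in each stage's sink, outermost stage last,
    # then feed elements through one at a time
    sink = action
    for stage in reversed(stages):
        sink = stage(sink)
    for item in source:
        sink(item)

def filter_stage(pred):
    return lambda downstream: (lambda t: downstream(t) if pred(t) else None)

def map_stage(f):
    return lambda downstream: (lambda t: downstream(f(t)))

out = []
for_each([2, 3, 5, 7, 11, 13, 17, 19],          # primes up to 20
         [filter_stage(lambda e: e < 15), map_stage(lambda e: e * 2)],
         out.append)
print(out)  # [4, 6, 10, 14, 22, 26]
```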
import java.util.Iterator;
import java.util.function.Predicate;
import java.util.function.Consumer;
import java.util.function.IntUnaryOperator;

abstract class Prime extends Pipeline<Integer> {
    private static PrimeIterator<Integer> piterator;

    Prime() {}
    Prime(Prime prime) { super(prime); }

    public static Prime genPrime() {
        piterator = new PrimeIterator<>();
        return new Prime() {
            @Override
            Sink<Integer> opWrapSink(Sink<Integer> sink) {
                throw new UnsupportedOperationException();
            }
        };
    }

    public static Prime genPrime(int upperLimit) {
        piterator = new PrimeIterator<>(upperLimit);
        return new Prime() {
            @Override
            Sink<Integer> opWrapSink(Sink<Integer> sink) {
                throw new UnsupportedOperationException();
            }
        };
    }

    // Judge if the number is a prime number
    public static boolean isPrime(int number) {
        if (number <= 2) return number == 2;
        if (number % 2 == 0) return false;
        for (int i = 3; i <= (int) Math.sqrt(number); i += 2) {
            if (number % i == 0) return false;
        }
        return true;
    }

    // Return the next prime immediately after the passed number
    public static Integer nextPrime(int lastPrime) {
        lastPrime++;
        while (!isPrime(lastPrime)) lastPrime++;
        return lastPrime;
    }

    // Iterator for iterating over prime numbers lazily
    static class PrimeIterator<T> implements Iterator<Integer> {
        private int lastPrime = 1;
        int MAX_VALUE;

        PrimeIterator() { this.MAX_VALUE = Integer.MAX_VALUE; }
        PrimeIterator(int num) { this.MAX_VALUE = num; }

        @Override
        public boolean hasNext() { return Prime.nextPrime(lastPrime) <= this.MAX_VALUE; }

        @Override
        public Integer next() { return lastPrime = Prime.nextPrime(lastPrime); }

        @Override
        public void remove() { throw new UnsupportedOperationException(); }
    }

    // filter
    public Prime filter(Predicate<Integer> predicate) {
        return new Prime(this) {
            @Override
            Sink<Integer> opWrapSink(Sink<Integer> sink) {
                return new Sink.ChainedInt<Integer>(sink) {
                    @Override
                    public void accept(Integer t) {
                        if (predicate.test(t)) {
                            downstream.accept(t);
                        }
                    }
                };
            }
        };
    }

    // map
    public Prime map(IntUnaryOperator mapper) {
        return new Prime(this) {
            @Override
            Sink<Integer> opWrapSink(Sink<Integer> sink) {
                return new Sink.ChainedInt<Integer>(sink) {
                    @Override
                    public void accept(Integer t) {
                        downstream.accept(mapper.applyAsInt(t));
                    }
                };
            }
        };
    }

    // forEach: wrap the terminal sink in each stage, then drive the iterator
    public void forEach(Consumer<Integer> action) {
        Pipeline<Integer> p = this;
        Sink<Integer> sink = new Sink.ForEachOp(action);
        for (int i = p.depth; i > 0; ) {
            sink = p.opWrapSink(sink);
            p = p.previousStage;
            i = p.depth;
        }
        while (piterator.hasNext()) {
            sink.accept(piterator.next());
        }
    }

    public static void main(String[] args) {
        Prime.genPrime(20)
             .filter(e -> e < 15)
             .map(e -> e * 2)
             .forEach(System.out::println);
    }
}

abstract class Pipeline<T> {
    Pipeline<T> previousStage;
    int depth;

    Pipeline() { this.previousStage = null; this.depth = 0; }
    Pipeline(Pipeline<T> previousStage) {
        this.previousStage = previousStage;
        this.depth = previousStage.depth + 1;
    }

    abstract Sink<T> opWrapSink(Sink<T> sink);
}

interface Sink<T> extends Consumer<T> {
    abstract static class ChainedInt<Integer> implements Sink<Integer> {
        protected final Sink<Integer> downstream;
        public ChainedInt(Sink<Integer> downstream) { this.downstream = downstream; }
    }

    // Terminal sink
    static class ForEachOp implements Sink<Integer> {
        Consumer<Integer> action;
        ForEachOp(Consumer<Integer> a) { this.action = a; }
        @Override
        public void accept(Integer i) { action.accept(i); }
    }
}
The output of the above program is:
$ java Prime
4
6
10
14
22
26
Here’s a simple program to illustrate what I mean.
add <- function(x) { function(y) x + y }
adders <- lapply(1:10, add)
What would you expect to be the answer for the following two queries?
adders[[1]](10)
adders[[10]](10)
It turns out both evaluate to 20, when one would expect 11 and 20!
Contrary to what one would expect from the lambda calculus, the x in the closures of both adders[[1]] and adders[[10]] is bound to 10, the last value of the vector 1:10, as seen below:
as.list(environment(adders[[1]]))
$x
[1] 10
as.list(environment(adders[[10]]))
$x
[1] 10
There is further discussion of this issue at http://stackoverflow.com/questions/29084193/how-to-not-fall-into-rs-lazy-evaluation-trap
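For comparison, Python's closures fall into a very similar trap, though for a different reason: a closure captures the variable, not its value, and name lookup is deferred to call time. A small sketch of mine:

```python
def add(x):
    return lambda y: x + y

# Calling add(i) eagerly gives each closure its own x, as expected:
adders = [add(i) for i in range(1, 11)]
print(adders[0](10))    # 11

# The trap appears when the lambda captures the loop variable directly:
trapped = [lambda y: i + y for i in range(1, 11)]
print(trapped[0](10))   # 20: every lambda looks up the final i, which is 10

# A common fix is binding the current value as a default argument:
fixed = [lambda y, i=i: i + y for i in range(1, 11)]
print(fixed[0](10))     # 11
```

The default-argument trick works because default values are evaluated once, at the time the lambda is created.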
function Module(name) { this.name = name; console.log("Creating " + this.name); } Module.prototype.start = function() { console.log("Starting " + this.name); }; Module.prototype.stop = function() { console.log("Stopping " + this.name); };
We want to create a couple of instances with the names “a”, “b” and “c”. At the beginning of the program we want to start each module, and at the end of the program we want to stop each module. For the creation of the instances we use a map() function call on the names array:
var names = ["a", "b", "c"]; var modules = names.map(function(name) { return new Module(name); }); modules.forEach(function(module) { module.start(); }); // do something modules.forEach(function(module) { module.stop(); });
The output is as intended:
Creating a
Creating b
Creating c
Starting a
Starting b
Starting c
Stopping a
Stopping b
Stopping c
Now we want to port this code to C#. The definition of the class is straightforward:
class Module { private readonly String name; public Module(string name) { this.name = name; Console.WriteLine("Creating " + name); } public void Start() { Console.WriteLine("Starting " + name); } public void Stop() { Console.WriteLine("Stopping " + name); } }
The map() function is called Select() in .NET:
var names = new List<string>{"a", "b", "c"}; var modules = names.Select( name => new Module(name)); foreach (var module in modules) { module.Start(); } foreach (var module in modules) { module.Stop(); }
But when we run this program, we get a completely different output:
Creating a
Starting a
Creating b
Starting b
Creating c
Starting c
Creating a
Stopping a
Creating b
Stopping b
Creating c
Stopping c
Each module is created twice, and the creation calls are interleaved with the start() and stop() calls.
What has happened?
The answer is that .NET’s Select() method does lazy evaluation. It does not return a new list with the mapped elements. It returns an IEnumerable instead, which evaluates each mapping operation only when needed. This is a very useful concept. It allows for the chaining of multiple operations without creating an intermediate list each time. It also allows for operations on infinite sequences.
But in our case it’s not what we want. The stopped instances are not the same as the started instances.
How can we fix it?
By appending a .ToList() call after the .Select() call:
var modules = names.Select( name => new Module(name)).ToList();
Now the IEnumerable gets evaluated and collected into a list before the assignment to the modules variable.
So be aware of whether your programming language or framework uses lazy or eager evaluation for functional collection operations, to avoid running into subtle bugs. Other tools built on the concept of lazy evaluation include the Java Stream API and the Haskell programming language. Some languages support both; Ruby, for example, since version 2.0:
range.collect { |x| x*x }
range.lazy.collect { |x| x*x }
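The same class of bug shows up in Python with generator expressions, which are not only lazy but single-pass, so the failure mode is slightly different: the second loop silently does nothing. A minimal sketch (the names are my own, mirroring the example above):

```python
log = []

class Module:
    def __init__(self, name):
        self.name = name
        log.append("Creating " + name)
    def start(self):
        log.append("Starting " + self.name)
    def stop(self):
        log.append("Stopping " + self.name)

names = ["a", "b", "c"]
modules = (Module(n) for n in names)   # lazy: nothing is created yet

for m in modules:
    m.start()      # creation happens here, interleaved with starting
for m in modules:
    m.stop()       # generator already exhausted: no module is ever stopped!

# The fix, like .ToList() in C#, is to materialize the collection eagerly:
modules = [Module(n) for n in names]
```

With the list comprehension, the three instances exist before either loop runs, and the started instances are the same objects as the stopped ones.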
Non-Lazy Version
scala> val number = { println("Printing!"); 1 + 2 } Printing! number: Int = 3 scala> number number: Int = 3 scala> number number: Int = 3
Lazy Version: def
scala> def number = { println("Printing!"); 1 + 2 } number: Int scala> number Printing! number: Int = 3 scala> number Printing! number: Int = 3
So what’s the difference? The lazy version using def essentially defines a function that takes no arguments; each time it is referenced, it is evaluated, which is why we see println executed at each invocation. The non-lazy version using val evaluates the expression only once, when the variable is defined, which is why we see println only once, at the definition.
An unwanted bug can happen if a variable is defined as val instead of def. Say the variable requires a framework-specific context that is only provided at runtime, when the framework starts. Unit tests usually run in a controlled runtime that differs from the production runtime. For example, suppose we need to add to an existing class a new variable that reads from server-side configuration; something like play.api.current.configuration in the Play! Framework. If this configuration is defined as a val at global scope, instantiating the class containing it requires a running server. However, the configuration is not needed at the unit-test level, and we definitely do not want to start a server to run unit tests. Simply changing the variable definition to def prevents the existing tests from failing.
Lazy Version: lazy val
scala> lazy val number = { println("Printing!"); 1 + 2 } number: Int = (lazy) scala> number Printing! number: Int = 3 scala> number number: Int = 3
You can also use the lazy keyword. In this case the variable is evaluated only once, the first time it is needed. This is very useful when the evaluation is expensive and not always needed.
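The three Scala flavors can be mimicked in Python, if only as a rough sketch of mine: a plain assignment for val, a zero-argument function for def, and functools.lru_cache for lazy val.

```python
import functools

evaluations = []

# val: evaluated once, eagerly, at definition time
number_val = (evaluations.append("val"), 1 + 2)[1]

# def: re-evaluated on every reference
def number_def():
    evaluations.append("def")
    return 1 + 2

# lazy val: evaluated once, on first use, then cached
@functools.lru_cache(maxsize=None)
def number_lazy():
    evaluations.append("lazy")
    return 1 + 2

number_def(); number_def()       # records "def" twice
number_lazy(); number_lazy()     # records "lazy" only once
print(evaluations)               # ['val', 'def', 'def', 'lazy']
```

The cache gives the memoize-on-first-use behavior; what it does not give you for free is the thread-safe initialization that Scala's lazy val guarantees.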
Now let’s say you have a case class below. It is a simple class which contains an integer. Every time it is instantiated we see “Instantiating the Context.”.
case class Context(x: Int) { println("Instantiating the Context.") }
Lazy Evaluation (Call by Name)
def callByName(context: => Context): Unit = { println(s"The x is ${context.x}") println(s"The x is ${context.x}") } scala> callByName(Context(1)) Instantiating the Context. The x is 1 Instantiating the Context. The x is 1
The only difference here is that the parameter is declared with =>, which makes it a by-name parameter: effectively a function with no arguments that is re-evaluated on each reference. As you can see from the invocation, every time the argument is referenced we see the object being constructed.
Non-Lazy Evaluation (Call by Value)
def callByValue(context: Context): Unit = { println(s"The x is ${context.x}") println(s"The x is ${context.x}") } scala> callByValue(Context(1)) Instantiating the Context. The x is 1 The x is 1
As you can see, construction of the argument happens only once: the expression is evaluated exactly once, at the moment it is passed into the function, and every reference to x afterwards reuses the same instance of Context.
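In Python terms, a by-name parameter is just a thunk: a zero-argument callable that the callee invokes on each reference, whereas a by-value parameter is an already-evaluated object. A sketch with a hypothetical Context class of my own:

```python
constructions = []

class Context:
    def __init__(self, x):
        constructions.append("Instantiating the Context.")
        self.x = x

def call_by_name(context):        # context is a thunk: () -> Context
    print("The x is", context().x)   # constructs a Context
    print("The x is", context().x)   # ...and constructs another one

def call_by_value(context):       # context is an already-built Context
    print("The x is", context.x)
    print("The x is", context.x)

call_by_name(lambda: Context(1))  # two constructions
call_by_value(Context(1))         # one construction
```

This makes the cost model explicit: each call of the thunk is a fresh construction, exactly like each reference to a Scala by-name argument.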
The example above is trivial; we are just constructing an object that holds an integer. At first, lazy evaluation seems better, since you only evaluate the argument if and when you need it. However, this can be catastrophic if you’re not careful. Imagine passing a large object that requires recursive, expensive construction; in the worst case, construction makes external service calls. Can you see the impact? If we reference the argument more than once within the code block, we construct those objects multiple times, and make those service calls multiple times. Sooner or later we will see a huge increase in GC activity, and possibly out-of-memory situations.
When to use Call by Name?
So now the question is if the impact is catastrophic, do we ever want to use it? We need to consciously decide exactly when to use call by name vs call by value.
Let’s tweak above example a little to see how to refactor the code above for practical use case.
First, we should only use call by name if there is a path for which we won’t need to evaluate the argument.
def callByName(context: () => Context, evaluate: Boolean): Unit = { if (evaluate) { val c = context() println(s"The x is ${c.x}") println(s"The x is ${c.x}") } else { println(s"Nothing to evaluate.") } }
As you can see, we do not want to evaluate context when evaluate is false. There is clearly a path on which the argument is never needed, so it makes sense to use call by name here.
Note that we changed the parameter to an explicit () => function type to indicate that the argument is side-effecting; in this example, construction happens when the argument is invoked. Remember that any side-effecting function taking no arguments should keep an empty argument list to mark that effect.
Finally, context is evaluated into a val exactly once, and only when it is needed.
Happy Coding ^_^
class Foo { lazy val lz = 5 }
If we compile this program and then decompile it with the JD tool, we get the following code:
public class Foo { private int lz; private volatile boolean bitmap$0; private int lz$lzycompute() { synchronized (this) { if (!this.bitmap$0) { this.lz = 5; this.bitmap$0 = true;} return this.lz; } } public int lz() { return this.bitmap$0 ? this.lz : lz$lzycompute(); } }
Looking at the decompiled output, a method lz() has been generated for the field lz, and all access to lz goes through this method. If the assignment to lz has already been computed, the method simply returns the stored result; otherwise it calls lz$lzycompute() to perform the computation and the assignment.
In other words, Scala’s lazy val plays two roles: it defers the computation until the value is first needed, and it caches the result, guarded by the volatile flag and the synchronized block so the computation runs exactly once even under concurrent access.
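The decompiled pattern, a flag plus a synchronized compute-once path, can be sketched in Python with a lock and a double-checked flag (my own translation, not generated code):

```python
import threading

class Foo:
    # mirrors the decompiled Scala: bitmap$0 flag + synchronized lz$lzycompute()
    def __init__(self):
        self._lz = None
        self._computed = False          # plays the role of bitmap$0
        self._lock = threading.Lock()

    def _lz_compute(self):              # like lz$lzycompute()
        with self._lock:                # like the synchronized block
            if not self._computed:
                self._lz = 5            # the actual initializer
                self._computed = True
            return self._lz

    @property
    def lz(self):                       # like the generated lz() accessor
        return self._lz if self._computed else self._lz_compute()

f = Foo()
print(f.lz)   # 5, computed on the first access only
```

The unsynchronized fast path reads the flag first, so after initialization no lock is ever taken again; only racing first accesses serialize on the lock.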
Suppose we’d like to write a nice, simple function to generate Fibonacci numbers. Furthermore, suppose we don’t know how many we’ll want when we call the function, so we might as well return “all of them”. We might be tempted to write a function like this:
def fib_bad(): # protip: don't run me.
a,b = 0,1
L = []
while True:
a,b = b,a+b
L.append(a)
return L
This is a Bad Idea™. Why? The list would necessarily be infinitely long, so even worse than returning fewer numbers than we want, this function will never return at all! (Side note: if you ever see a "while True:" loop with no return, break or yield statements, it’s almost certainly a Bad Idea™.)
So how *do* we return an arbitrarily-to-infinitely-long sequence of integers for a specifically non-infinite length of time? See below:
def fib_gen():
a,b = 0,1
while True:
a,b = b,a+b
yield a
Here is our generator. This puppy will return Fibonacci number after Fibonacci number until the end of time, but only so long as we ask it to! Once we stop asking fib_gen() for numbers, it stops, holding on to its state until the next value is needed. This is the core functionality of "lazy evaluation": waiting until the very last second to compute and yield what is requested. What’s more, the total memory footprint of this function pretty much comes to the cost of remembering just two integers and nothing else (external lists of accumulated results notwithstanding).
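Taking just a prefix of the infinite stream is one line with itertools.islice, which itself consumes the generator lazily (fib_gen is repeated here so the snippet stands alone):

```python
from itertools import islice

def fib_gen():
    a, b = 0, 1
    while True:
        a, b = b, a + b
        yield a

# islice pulls exactly ten values from the infinite generator, no more
print(list(islice(fib_gen(), 10)))  # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```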
I should point out that the "Fibonacci generator" is a very overused example of the benefits of lazy evaluation. Perhaps a better example would be one in which the cost of the function is measured in time as well as space. Consider a very slow function that needs to be called an unknown number of times (such as a complex math function, or a database lookup), but for which receiving batches of results would be unhelpful. Perhaps each result measures in the hundreds of megabytes, or perhaps analyzing the results takes time and requires memory that we’d prefer not to have taken up by a backlog of work.
Perhaps we just want to terminate our code as soon as we see a “positive result”!
The benefit of using generators (or generator-implementing objects) in these cases is that they allow the programmer to receive what data he/she needs to process — and nothing more — at will. Hooray for laziness!
On a similar note to the previous section, perhaps we’d like to parse a giant file a little at a time to minimize our memory usage. Generators are a wonderful choice for this! See below:
def file_gen(filename, blocksize=1024):
    with open(filename, 'r') as f:
        data = f.read(blocksize)
        while data:
            yield data
            data = f.read(blocksize)
The function above will open the given file and yield blocks of data of the specified size (defaulting to 1KB), one block at a time and only when asked to, until the file is empty; at that point the generator simply ends, raising StopIteration implicitly, as every exhausted generator does. (An explicit "raise StopIteration" inside a generator is unnecessary, and since PEP 479 it is turned into a RuntimeError, so we leave it out.)
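A quick way to see the chunking in action, using a temporary file (file_gen is repeated so the snippet is self-contained):

```python
import os
import tempfile

def file_gen(filename, blocksize=1024):
    # yield the file's contents one block at a time
    with open(filename, 'r') as f:
        data = f.read(blocksize)
        while data:
            yield data
            data = f.read(blocksize)

# write a small file, then stream it back in 4-character blocks
fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, 'w') as f:
    f.write("abcdefghij")
blocks = list(file_gen(path, 4))
print(blocks)            # ['abcd', 'efgh', 'ij']
os.remove(path)
```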
As the old adage goes: How does one move a proverbial mountain of data?
One byte at a time.
I should point out that generators are so useful in minimizing memory footprints that they are, by now, fundamental to the Python standard library of functions. Below are a few built-in functions that use generators:
file.readline() essentially yields one line at a time; it is the generator equivalent of file.readlines().
xrange(...) is a quasi-generator that yields the sequence of numbers in the specified range. Its non-generator equivalent is range(...), which returns a full list of numbers.
enumerate(...) is a generator that wraps any iterable you provide and yields 2-tuples (pairs) of values of the form (i, r), where i is the index number of the current element (counting from 0) and r is the next value in the provided iterable.
It is very common to see xrange
used for counting/iterating a specific number of times, in the form for i in xrange(...): ...
This type of iteration probably accounts for 99% of the usage of range
, too — but it shouldn’t. If you ever see range
being used to iterate, it is almost definitely a mistake, since range calculates an entire list, wasting (potentially huge amounts of) memory.
The only scenario in which using range
is appropriate is when you actually need a list to modify at some point, and not just to iterate over. For all other uses, you should use xrange
.
(Note that the above only applies to Python 2.7 and below. In Python 3, xrange is gone: range itself is now lazy, returning a range object instead of a list. When you actually need a list of values in Python 3, say so explicitly with list(range(...)).)
While there’s nothing wrong with taking a generator ‘g’ and calling g.next() over and over in a loop… there is a better way. Remember how we discussed duck typing, and how generators act as lists, and thus can/should be treated like lists? The better way to interact with generators is to use them in a for loop, as such:
g = some_gen(...)
for val in g:
...
There are several benefits to the above. Firstly, we don’t have to keep track of how many elements ‘g’ has left — when it runs out, the loop simply stops looping. Secondly, if we later choose to replace the value of ‘g’ with a list, we have literally no other code to change, since we were treating ‘g’ like a list anyway. And thirdly, this code is much more implementation-independent: if the implementation of Python’s generators ever changes so that the next() method no longer exists or behaves differently (as has already happened…), you won’t have to come back to dust off your 10-year-old code and change a line that should have been abstracted/duck-typed in the first place!
Yes, yes, “So how do we iterate over infinite generators with duck-typing??”. I hear you.
As mentioned above, there is an oft-forgotten (by myself) and very handy built-in Python function called enumerate(...) that pairs every iterated element with an index. As a refresher of how we use it: list(enumerate('xyz')) returns [(0, 'x'), (1, 'y'), (2, 'z')] (strings are iterable, remember!).
This handy tool, combined with any infinite generator, allows us to very easily keep track of our iteration progress. See below:
g = infinite_gen()
for (i,whatever) in enumerate(g):
if i > 20:
break
elif whatever == "some value we want":
break
else:
print whatever
This way, we can easily stop our iteration whenever we see fit (such as at the 11235th Fibonacci number).
I hope to make this presentation self-contained. (However, look up this page: there are links to online tutorials, as well as many existing posts on the general subjects, which you may discover either by clicking on the tag cloud at left, or by searching by keywords in this open notebook.)
_________________________________________________________
This series of posts may be used as a longer, more detailed version of sections
from the article M. Buliga, L.H. Kauffman, Chemlambda, universality and self-multiplication, arXiv:1403.8046 [cs.AI], presented by Louis Kauffman in the ALIFE 14 conference, 7/30 to 8/2 – 2014 – Javits Center / SUNY Global Center – New York. Here is a link to the published article, free, at MIT Press.
_________________________________________________________
Tags. I shall use the name "tag" instead of "actor" or "type", because it is more generic (and because in future developments we shall talk more about actors and types, continuing from the post Actors as types in the beta move, tentative).
Every port of a graphical element (see part II) and the graphical element itself can have tags, denoted by :tagname.
There is a null tag “null” which can be omitted in the g-patterns.
As an example, we may see, in the most ornate way, graphical elements like this one:
L[x:a,y:b,z:c]:d
where of course
L[x:null,y:null,z:null]:null means L[x,y,z]
The port names are tags; in particular "in", "out", "middle", "left" and "right" are tags.
Any concatenation of tags is a tag. Concatenation of tags is denoted by a dot, for example “left.right.null.left.in”. By the use of “null” we have
a.null –concat–> a
null.a –concat–> a
I shall not regard concat as a move in itself (maybe I should, but that is for later).
Further in this post I shall not use tags for nodes.
Moves with tags. We can use tags in the moves, according to a predefined convention. I shall take several examples.
1. The FAN-IN move with tags. If the tags a and b are different then
FI[x:a, y:b, z:c] FO[z:c,u:b, v:a]
–FAN-IN–>
Arrow[x:a,v:a] Arrow[y:b,u:b]
Remark that the move is not reversible.
It means that you can do FAN-IN only if the right tags are there.
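To make the convention concrete, here is a toy Python sketch (my own encoding, not the author's notation): a graphical element as a (node, [(port, tag), ...]) pair, and the tagged FAN-IN move as a conditional rewrite that fires only when the tags differ and match up as required.

```python
def fan_in(fi, fo):
    """Apply the tagged FAN-IN move to FI[x:a, y:b, z:c] FO[z:c, u:b, v:a].

    Returns the replacement [Arrow[x:a, v:a], Arrow[y:b, u:b]],
    or None when the tags do not permit the move.
    """
    (x, a), (y, b), (z, c) = fi[1]
    (z2, c2), (u, b2), (v, a2) = fo[1]
    # the move needs distinct tags a != b, matching middle ports, and the
    # FO out-port tags swapped relative to the FI in-port tags
    if fi[0] == "FI" and fo[0] == "FO" and a != b \
            and (z, c) == (z2, c2) and a2 == a and b2 == b:
        return [("Arrow", [(x, a), (v, a)]), ("Arrow", [(y, b), (u, b)])]
    return None

fi = ("FI", [("x", "a"), ("y", "b"), ("z", "c")])
fo = ("FO", [("z", "c"), ("u", "b"), ("v", "a")])
print(fan_in(fi, fo))
# [('Arrow', [('x', 'a'), ('v', 'a')]), ('Arrow', [('y', 'b'), ('u', 'b')])]
```

Returning None when the guard fails captures the irreversibility remarked on above: without the right tags, the move simply cannot be performed.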
2. COMB with tags.
L[x:a, y:b, z:c] Arrow[y:b, u:d]
–COMB–>
L[x:a, u:d,z:c]
and so on for all the comb moves which involve two graphical elements.
3. DIST with tags. There are two DIST moves, here with tags.
A[x:a,y:b,z:c] FO[z:c,u:d,v:e]
–DIST–>
FO[x:a, w:left.d, p:right.e] FO[y:b, s:left.d, t:right.e]
A[w:left.d, s:left.d, u:d] A[p:right.e, t:right.e, v:e]
In graphical version
and the DIST move for the L node:
L[y:b, x:a, z:c] FO[z:c, u:d, v:e]
–DIST–>
FI[p:right, w:left, x:a] FO[y:b, s:left, t:right]
L[s:left, w:left,u:d] L[t:right, p:right, v:e]
In graphical version:
4. SHUFFLE. This move replaces CO-ASSOC, CO-COMM. (It can be done as a sequence of CO-COMM and CO-ASSOC; conversely, CO-COMM and CO-ASSOC can be done by SHUFFLE and LOC PRUNING, explanations another time.)
FO[x:a, y:b, z:c] FO[y:b, w:left, p:right] FO[z:c, s:left, t:right]
–SHUFFLE–>
FO[x:a, y:left, z:right] FO[y:left, w, s] FO[z:right, p, t]
In graphical version:
____________________________________________________________
first_n, which returned a sequence of the first n elements of a sequence.
But this isn’t great if what you passed in was a random-access indexed collection. Where possible, it’s nice to retain random access on the output of your lazy function. The lazy map
does this, for example. If you pass an array into lazy
then call map
, the collection you get back is a LazyRandomAccessView
, and you can index randomly into it just like you could into an array.
To do this for first_n
, we need a struct that implements Collection
, takes a Collection
as an initializer plus a count n
, and layers a view over the top of that collection that only exposes the first n
elements.
Or, in the more general case, a view that takes a collection and a sub-range within that collection. Here it is:
struct SubrangeCollectionView<Base: Collection>: Collection {
    private let _base: Base
    private let _range: Range<Base.IndexType>

    init(_ base: Base, subrange: Range<Base.IndexType>) {
        _base = base
        _range = subrange
    }

    var startIndex: Base.IndexType { return _range.startIndex }
    var endIndex: Base.IndexType { return _range.endIndex }

    subscript(idx: Base.IndexType) -> Base.GeneratorType.Element {
        return _base[idx]
    }

    typealias GeneratorType = IndexingGenerator<SubrangeCollectionView>
    func generate() -> GeneratorType {
        return IndexingGenerator(self)
    }
}
This sub-range collection view is so useful that I was sure it was somewhere in the Swift standard library. That’s partly why I wrote this article on collection and sequence helpers – as a side-effect of going line by line looking for it. I’m still expecting someone to reply to this article saying “no you fool, you just use X”. If you are that person, call me a fool here.
The Sliceable protocol seems designed to provide this function. It requires a collection to typealias a SliceType and implement a subscript(Range) -> SliceType method. But that requires the collection to implement Sliceable, which seems overly restrictive, whereas the view above works for any collection.
SubrangeCollectionView
enables you to pass in a sub-range of any collection to any algorithm that takes a collection or sequence. For example:
let r = 1...10 let halfway = r.startIndex.advancedBy(5) let top_half = SubrangeCollectionView(r, subrange: halfway..<r.endIndex) reduce(top_half,0,+) // returns 6+7+8+9+10
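For comparison, the same idea in Python (hypothetical names of my own): a no-copy view exposing a sub-range of any indexable sequence, which iteration and sum() treat just like a list.

```python
class SubrangeView:
    """A read-only view over base[start:stop] that copies nothing."""
    def __init__(self, base, start, stop):
        self.base, self.start, self.stop = base, start, stop
    def __len__(self):
        return self.stop - self.start
    def __getitem__(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)        # also terminates plain iteration
        return self.base[self.start + i]

r = list(range(1, 11))
top_half = SubrangeView(r, 5, 10)      # a view over elements 6..10
print(sum(top_half))   # 6 + 7 + 8 + 9 + 10 = 40
```

As with the Swift view, indices offset into the view map straight onto the base sequence, so no elements are copied and the base stays fully accessible.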
A nice feature is that if you pass a subrange collection into an algorithm that returns an index, such as find
, the index returned can also be used on the base collection. So in the example above, if you ran if let idx = find(top_half,8)
, you could then use idx
on not just top_half[idx]
, but also r[idx]
.
Operating on subranges is a capability that C++ developers, used to the STL, will be missing from Swift collection algorithms. In the STL, operations on containers are performed not by passing the container itself to the algorithm, but instead by passing two iterators on that container defining a range.
For example, here is the definition of the find
algorithm in the C++ STL compared to the Swift version:
// C++ std::find template <class InputIterator, class T> InputIterator find (InputIterator first, InputIterator last, const T& val); // Swift.find func find <C: Collection where C.GeneratorType.Element: Equatable> (domain: C, value: C.GeneratorType.Element) -> C.IndexType?
While Swift.find
takes a collection as its first argument, std::find
takes a first
and last
iterator.
STL container iterators are similar to Swift collection indices, in that they can be forward, bidirectional or random, and are used to move up and down the collection. Like Swift indices, the end iterator points to one past the last element – the equivalent of Swift.find
returning nil
for not found would be std::find
returning end
.
But unlike Swift indices, STL iterators model C pointer-like behaviour, so they are dereferencable. To get the value the iterator points to, you don’t have to have access to the underlying container, using something like Swift’s subscript. To get the value at first
, you call *first
. This is why std::find
doesn’t need to take a container in it’s input. It just increments first
until either it equals last
, or *first
equals val
.
There are lots of reasons why the STL models iterators like this: because it’s like pointers into C arrays; because it dodges memory management issues (which won’t apply to Swift) etc. But also because, with this model, every algorithm will work just as well on a sub-range of a container as on the whole container. You don’t have to pass the starting or ending iterator for the container into find
, you can pass in any two arbitrary points.^{1}
Swift indices, on the other hand, are pretty dumb beasts. They don’t have to know about what value they point to or what collection they index. Heck, the index for Array
is just an Int
.^{2}
For all the advantages of Swift’s just passing in collections as arguments has (it looks a lot cleaner, doesn’t confuse beginners),^{3} it lacks this ability to apply algorithms to subranges. SubrangeCollectionView
gives you that ability back.
Anyway, back to the original problem, which was to enhance LazyRandomAccessView
with a version of first_n
that returns another LazyRandomAccessView
. With the help of SubrangeCollectionView
, here it is:
extension LazyRandomAccessCollection { func first_n (n: LazyRandomAccessCollection.IndexType.DistanceType) -> LazyRandomAccessCollection<SubrangeCollectionView<LazyRandomAccessCollection>> { let start = self.startIndex let end = min(self.endIndex, start.advancedBy(n)) let range = start..<end let perm = SubrangeCollectionView(self, subrange: range) return lazy(perm) } } let a = [1, 2, 3, 4, 5, 6, 7] let first3 = lazy(a).first_n(3) first3[1] // returns 2
One final interesting question is whether to do the same for forward and bidirectional lazy views as well. The problem here is computing the endIndex for the view. Unlike a random-access index, you can’t just advance it by n in constant time; it takes O(n), because the index has to be walked up one by one. For this reason, I’d stick with returning sequences for these.
first
than to last
, at which point, kaboom! A risk which passing in the container itself eliminates. ↩
*
operator for it to dereference itself. ↩
_________________________________________________________
In this post I take a simple example which contains beta reduction and self-multiplication.
Maybe self-multiplication is too long a word. A shorter one would be "dup"; any tacit programming language has it. However, chemlambda only superficially resembles tacit programming (and it’s not a language, arguably, but a GRS, never mind).
Or “self-dup” because chemlambda has no “dup”, but a mechanism of self-multiplication, as explained in part VI.
Enough with the problem of the right denomination, because
“A rose by any other name would smell as sweet”
as somebody wrote, clearly not believing that the limit of his world is the limit of his language.
Let’s consider the lambda term (Lx.xx)(Ly.yz). In lambda calculus there is the following string of reductions:
(Lx.xx)(Ly.yz) -beta-> (Ly.yz) (Lu.uz) -beta-> (Lu.uz) z -beta-> zz
What do we see? Let’s take it slower. Denote C = xx and B = Ly.yz. Then:
(Lx.C)B -beta-> C[x:=B] = (xx)[x:=B] = (x)[x:=B] (x)[x:=B] = BB = (Ly.yz) B -beta-> (yz)[y:=B] = (y)[y:=B] (z)[y:=B] = Bz = (Lu.uz)z -beta-> (uz)[u:=z] = (u)[u:=z] (z)[u:=z] = zz
Now, with chemlambda and its moves performed only from LEFT to RIGHT.
The g-pattern which represents (Lx.xx)(Ly.yz) is
L[a1,x,a] FO[x,u,v] A[u,v,a1] A[a,c,b] L[w,y,c] A[y,z,w]
We can only do a beta move:
L[a1,x,a] FO[x,u,v] A[u,v,a1] A[a,c,b] L[w,y,c] A[y,z,w]
<–beta–>
Arrow[a1,b] Arrow[c,x] FO[x,u,v] A[u,v,a1] L[w,y,c] A[y,z,w]
We can do two COMB moves
Arrow[a1,b] Arrow[c,x] FO[x,u,v] A[u,v,a1] L[w,y,c] A[y,z,w]
2 <–COMB–>
FO[c,u,v] A[u,v,b] L[w,y,c] A[y,z,w]
Now look: this is not a representation of a lambda term, because FO[c,u,v] sits "in the middle", i.e. the middle.in port of FO[c,u,v] is the out port of B (the right.out port of the lambda node L[w,y,c]), while at the same time the out ports of FO[c,u,v] are the in ports of A[u,v,b].
The only move which can be performed is DIST, which starts the self-dup or self-multiplication of B = L[w,y,c] A[y,z,w] :
FO[c,u,v] A[u,v,b] L[w,y,c] A[y,z,w]
<–DIST–>
FI[e,f,y] FO[w,g,h] L[h,e,v] L[g,f,u] A[u,v,b] A[y,z,w]
This is still not a representation of a lambda term. Notice also that the g-pattern which represents B has not yet finished self-multiplying. However, we can already perform a beta move on L[g,f,u] A[u,v,b], and we get (after 2 COMB moves as well):
FI[e,f,y] FO[w,g,h] L[h,e,v] L[g,f,u] A[u,v,b] A[y,z,w]
<–beta–>
FI[e,f,y] FO[w,g,h] L[h,e,v] Arrow[g,b] Arrow[v,f] A[y,z,w]
2 <–COMB–>
FI[e,f,y] FO[w,b,h] L[h,e,f] A[y,z,w]
This looks like a weird g-pattern. It is clearly not a g-pattern coming from a lambda term, because it contains the fanin node FI[e,f,y]. Let’s write the g-pattern again as
L[h,e,f] FI[e,f,y] A[y,z,w] FO[w,b,h]
(for our own pleasure, the order of the elements in the g-pattern does not matter) and remark that A[y,z,w] is “conjugated” by the FI[e,f,y] and FO[w,b,h].
We can apply another DIST move
L[h,e,f] FI[e,f,y] A[y,z,w] FO[w,b,h]
<–DIST–>
A[i,k,b] A[j,l,h] FO[y,i,j] FO[z,k,l] FI[e,f,y] L[h,e,f]
and now there is only one move which can be done, namely a FAN-IN:
A[i,k,b] A[j,l,h] FO[y,i,j] FO[z,k,l] FI[e,f,y] L[h,e,f]
<–FAN-IN–>
Arrow[e,j] Arrow[f,i] A[i,k,b] A[j,l,h] FO[z,k,l] L[h,e,f]
which gives after 2 COMB moves:
Arrow[e,j] Arrow[f,i] A[i,k,b] A[j,l,h] FO[z,k,l] L[h,e,f]
2 <–COMB–>
A[f,k,b] A[e,l,h] FO[z,k,l] L[h,e,f]
The g-pattern
A[f,k,b] A[e,l,h] FO[z,k,l] L[h,e,f]
is a representation of a lambda term, finally: the representation of (Le.ez)z. Great!
From here, though, we can apply only a beta move at the g-pattern A[f,k,b] L[h,e,f]
A[f,k,b] A[e,l,h] FO[z,k,l] L[h,e,f]
<–beta–>
Arrow[h,b] Arrow[k,e] A[e,l,h] FO[z,k,l]
2 <–COMB–>
FO[z,k,l] A[k,l,b]
which represents zz.
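The beta move that drives this whole reduction, L[a1,x,a] A[a,c,b] rewriting to Arrow[a1,b] Arrow[c,x], is mechanical enough to sketch in a few lines of Python (my own encoding of g-pattern elements as (node, ports) pairs):

```python
def beta(lam, app):
    """L[a1, x, a] A[a, c, b] -> Arrow[a1, b] Arrow[c, x], firing when the
    right.out port of the lambda node feeds the left.in port of the
    application node."""
    a1, x, a = lam[1]
    a_in, c, b = app[1]
    if lam[0] == "L" and app[0] == "A" and a == a_in:
        return [("Arrow", [a1, b]), ("Arrow", [c, x])]
    return None

# the very first step of the reduction above:
print(beta(("L", ["a1", "x", "a"]), ("A", ["a", "c", "b"])))
# [('Arrow', ['a1', 'b']), ('Arrow', ['c', 'x'])]
```

The move is purely local: it only inspects the two nodes involved, which is exactly what makes these graph rewrites suitable for asynchronous, decentralized application.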
_____________________________________________________
Here’s how you could add it to LazyRandomAccess:
extension LazyRandomAccessCollection { func first_n (n: LazyRandomAccessCollection.IndexType.DistanceType) -> LazySequence< PermutationGenerator< LazyRandomAccessCollection, Range<LazyRandomAccessCollection.IndexType> > > { let start = self.startIndex let end = min(self.endIndex, start.advancedBy(n)) let range = start..<end let perm = PermutationGenerator(elements: self, indices: range) return lazy(perm) } }
This uses PermutationGenerator
, which is a lazy view class that returns a (sub-)sequence of values from a collection in the order given by another collection of indices into that collection. In first_n
’s case, the indices are a range of the first n indices (or the full collection if there are fewer than n elements).
The return type of first_n is a bit crackers, so I’ve broken it up over multiple lines. It returns a lazy sequence of a permutation generator of a lazy random access collection, permuted by a range of lazy random access collection indices. Phew.^{1} This is why the lazy function and Swift’s type inference are more than just nice to have; they’re essential. The actual types you are using in a simple function can quickly get to a point where there’s no practical way you could declare them by hand.
The practice of returning another lazy object that wraps the results is copied from the lazy map and filter members, which do the same. Why wrap PermutationGenerator in another LazySequence instead of returning it directly? Chaining, mostly, I think.^{2} Without doing that, you’d have to re-lazy the result if you wanted to run more lazy filters on it:
let r = 1...10

// with lazy wrapper
let l = lazy(r).first_n(5).filter(isOdd)

// without lazy wrapper
let l = lazy(lazy(r).first_n(5)).filter(isOdd)
That works for a collection,^{3} but what about a sequence? How can we generate a new sequence that only returns the first n values?
Instead of declaring that in one shot, we’ll follow the pattern of map and filter and first declare a view:
struct FirstNSequenceView<Base: Sequence> : Sequence {
    private let _n: Int
    private let _base: Base

    init(_ base: Base, _ n: Int) {
        self._base = base
        self._n = n
    }

    func generate() -> GeneratorOf<Base.GeneratorType.Element> {
        var i: Int = 0
        var g = _base.generate()
        return GeneratorOf { ++i <= self._n ? g.next() : nil }
    }
}
This uses GeneratorOf, which takes for its constructor a closure that serves up values for each call to next(). This helps avoid having to manually write a companion generator class when you implement your own sequence (IndexingGenerator is a similar helper that can be used to conform to Sequence when you implement your own Collection). In this case, generate() creates a closure that captures a counter that will count up, returning elements from the sequence until it hits n. It then returns a GeneratorOf that will call that closure each time next() is called on it. Since each time generate() is called, a new closure with a fresh i will be created, FirstNSequenceView conforms to the requirement of Sequence that it can be walked multiple times with multiple independent generators.
Now that we have this view, it’s easy to declare first_n for all the lazy views:
extension LazySequence {
    func first_n(n: Int) -> LazySequence<FirstNSequenceView<LazySequence>> {
        return lazy(FirstNSequenceView(self, n))
    }
}
extension LazyForwardCollection {
    func first_n(n: Int) -> LazySequence<FirstNSequenceView<LazyForwardCollection>> {
        return lazy(FirstNSequenceView(self, n))
    }
}
extension LazyBidirectionalCollection {
    func first_n(n: Int) -> LazySequence<FirstNSequenceView<LazyBidirectionalCollection>> {
        return lazy(FirstNSequenceView(self, n))
    }
}
extension LazyRandomAccessCollection {
    func first_n(n: Int) -> LazySequence<FirstNSequenceView<LazyRandomAccessCollection>> {
        return lazy(FirstNSequenceView(self, n))
    }
}
It’s a bit of a hassle having to implement each one separately, but that’s the price you pay for structs that aren’t in a hierarchy. All the more reason to factor the hard part out into a separate view object.
Does first_n belong as a lazy view member? Unlike map and filter, it doesn’t take a closure, so the laziness is less likely to bite you. Maybe take_while, a filter that returns members of an array until a supplied predicate returns false, would have been a better example. But first_n could take a lazy sequence that itself depended on a closure, so the laziness of first_n could still come as a surprise. Also, the chaining is nice. Does this mean other lazy sequences like enumerate should migrate into being lazy class members? I dunno, maybe.
Without the chaining, you have to wrap the results in a function call. But the function call goes on the left, while chained calls go on the right, which results in a confusing ping-pong effect:
let l = lazy(first_n(lazy(r).filter(isOdd),5)).map(...)
The ordering of this is not obvious at first glance: it’s filter, then first_n, then map. But then where does it end? Should everything get piled into the lazy classes just to benefit from chaining? Nah. The better solution is to implement a pipe-forward operator, of which more some other time.
map and filter. But then the compiler doesn’t completely faithfully represent the generic types in the declarations you see in Xcode (which is why there is a lot of T == T in what we see of the Swift standard library). Let me know if you can figure out a better way. ↩
map and filter returned were objects that would evaluate and return results lazily on demand, so when you called them, no mapping or filtering took place right away. Instead it happened later, when you accessed the elements of a MapFilterView and its kin.
Well, turns out Apple decided that cleverly not doing what people might expect isn’t necessarily the best move, so as of beta 4, map and filter return an Array. They still take in collections and sequences of any kind, but an array is what they spit out.^{2}
This is probably for the best. If you didn’t realize map computed lazily, you could be surprised when the results changed each time you iterated over a map using a closure like this:
let r = 1...4
var i = 0
let m = map(r) { $0 * ++i }

for e in m {
    // loops over 1, 4, 9, 16
}
for e in m {
    // loops over 5, 12, 21, 32
}
Even the Swift dev team weren’t immune to the unexpected consequences of laziness. There were some bugs in using FilterCollectionView to populate an Array, as it took two passes: one to determine the array size needed and another to populate the array. A predicate that returned fewer results on the first pass than the second would result in a buffer overrun.
Now, with beta 4, there’s no excuse for getting surprised by lazy evaluation. If you still want to be lazy, you first need to pass your sequence or collection into a call to lazy(), which will give you back a lazy view class. What you get back depends on what you pass in: if you pass in a sequence, you’ll get back a LazySequence. If you pass in a collection, you’ll get back one of the lazy collection structs – either LazyForwardCollection, LazyBidirectionalCollection, or LazyRandomAccessCollection.
These views get progressively more features depending on the capabilities of their base. LazySequence has lazy map and filter methods that work like the old lazy map and filter functions, by returning another LazySequence object.
It also has an array property for crystallizing the lazy results into an array. Should you decide at a later point you want to iterate over the collection more than once, you should use this. If you don’t, duplicate iterations will wastefully re-run the mapping or filtering function over and over.
This also means that you can now write this:
var i = 0
let lf = lazy(1...5).filter { _ in ++i % 2 == 0 }
let a1 = lf.array  // a1 is [2, 4]
let a2 = lf.array  // a2 is [1, 3, 5]
let a3 = lf.array  // a3 is [2, 4]
LazyForwardCollection only adds subscript, since forward-indexable collections can’t do much more than sequences.
Note, filter still returns a sequence, even when called on the lazy collections, to avoid the heartache described above where other collection constructors assumed they could rely on a collection’s length being consistent. The results of map can be a collection, because it returns a value for every element in the base no matter what. That collection inherits the index properties of the base.
LazyBidirectionalCollection and LazyRandomAccessCollection add the ability to reverse lazily. So if you wanted to filter just the first few items starting at the back of a collection, you could call lazy(col).reverse().filter { ... }.
The collection returned by reverse can be used wherever you use a regular collection. If you’re a C++ programmer and you liked the benefits of rbegin/rend, this might be what you’re looking for:
let s = "The cat in the hat"
let rs = lazy(s).reverse()
if let idx = find(rs, "h") {
    // idx points to the h of hat,
    // not the h of The
}
How you get the best lazy view class is pretty cool. lazy is actually 4 overloaded generic functions:
func lazy<S: SequenceType>(s: S) -> LazySequence<S>
func lazy<S: CollectionType where S.Index: ForwardIndexType>(s: S) -> LazyForwardCollection<S>
func lazy<S: CollectionType where S.Index: BidirectionalIndexType>(s: S) -> LazyBidirectionalCollection<S>
func lazy<S: CollectionType where S.Index: RandomAccessIndexType>(s: S) -> LazyRandomAccessCollection<S>
When you call lazy, the Swift compiler picks the most specific overload possible, preferring more specialized inherited protocols over base ones. So CollectionType beats SequenceType, because CollectionType inherits from SequenceType. CollectionType where S.Index: RandomAccessIndexType beats CollectionType where S.Index: BidirectionalIndexType, because RandomAccessIndexType inherits from BidirectionalIndexType.^{3} What is returned is an instance of another generic class that implements a lazy view on the specific collection or sequence.
I don’t know if there’s an official term for this, but I call it a generic factory. It’s similar to the abstract factory design pattern, in that you call a function to get back one of a range of possible concrete types. But in this case, the type determination happens at compile time, and what you get back is not an implementation of an abstract interface, but the actual appropriate concrete type.
This all feels transparent to the caller because of Swift’s type inference capabilities. You call lazy, passing in your base object, assign the result to a variable, and then merrily start using it. But you aren’t constrained to an interface exposing only the common features of the possible concrete classes, like you would be with an interface and abstract factory set-up. If you passed in a collection that’s capable of it, you get a reverse method.
Other than helping to pick the best container type, lazy doesn’t do much. There’s nothing stopping you from declaring the lazy views directly:
let r = 1...500
let l = LazyBidirectionalCollection(r)
let evens = l.filter { $0 % 2 == 0 }
But if you were implementing your own set of classes generated by this generic factory pattern, you could also put common set-up code in your generic factory method (or even have a generic factory class if needed).
By the way, the new stride function in beta 4 follows a similar pattern of returning different types at compile time from an overloaded function. But in its case, the overloading isn’t done by what you pass in. It isn’t done by types at all.
func stride<T: Strideable>(from start: T, to end: T, by stride: T.Stride) -> StrideTo<T>
func stride<T: Strideable>(from start: T, through end: T, by stride: T.Stride) -> StrideThrough<T>
These two functions differ only by the name of their middle parameter. I don’t know about you, but that this was possible was an eye opener. Score one for the Objective-C named arguments enthusiasts.
So what if you have your own idea for a lazily evaluated filter to apply to sequences or collections? Well, you could extend the lazy classes to support it. We’ll look at that in the next article. Follow @airspeedswift to catch it.
Array members map and filter, since they’re now duplicative. They could still be special-cased for performance purposes. ↩
I hope to make this presentation self-contained. (However, look up this page, there are links to online tutorials, as well as already many posts on the general subjects, which you may discover either by clicking on the tag cloud at left, or by searching by keywords in this open notebook.)
_________________________________________________________
This series of posts may be used as a longer, more detailed version of sections from the article M. Buliga, L.H. Kauffman, Chemlambda, universality and self-multiplication, arXiv:1403.8046 [cs.AI], which was accepted at the ALIFE 14 conference, 7/30 to 8/2, 2014, Javits Center / SUNY Global Center, New York (go see the presentation of Louis Kauffman if you are near the event). Here is a link to the published article, free, at MIT Press.
_________________________________________________________
In this post I want to concentrate on the mechanism of self-multiplication for g-patterns coming from lambda terms (see part IV where the algorithm of translation from lambda terms to g-patterns is explained).
Before that, please notice that there is an important problem, with a lot to talk about, which shall be described later in detail. But here it is now, so we can keep an eye on it.
Chemlambda in itself is only a graph rewriting system. In part V it is explained that the beta reduction from lambda calculus needs an evaluation strategy in order to be used. We noticed that in chemlambda self-multiplication is needed in order to prove that one can do beta reduction as the beta move.
We go towards the obvious conclusion that in chemlambda, reduction (i.e. the beta move) and self-multiplication are just names used for parts of the computation. Indeed, a computation done with chemlambda has some parts where we use the beta move (possibly with some COMB, CO-ASSOC, CO-COMM, LOC PRUNING moves) and other parts where we use DIST and FAN-IN (again possibly with some COMB, CO-ASSOC, CO-COMM, LOC PRUNING moves). These two parts are named reduction and self-multiplication respectively, but in the big computation they mix into a whole. There are only moves: graph rewrites applied to a molecule.
Which brings the problem: chemlambda in itself is not sufficient for having a model of computation. We need to specify how, where and when the reductions apply to molecules.
There may be many variants, roughly described as: sequential, parallel, concurrent, decentralized, random, based on chemical reaction network models, etc.
Each model of computation (which can be made compatible with chemlambda) gives a different whole when used with chemlambda. Until now, in this series there has been no mention of a model of computation.
There is another aspect of this. It is obvious that chemlambda graphs form a larger class than lambda terms, and also that the graph rewrites apply to more general situations than beta reduction (and eventually an evaluation strategy). It means that the important problem of defining a model of computation over chemlambda will have influences over the way chemlambda molecules “compute” in general.
The model of computation which I prefer is not based on chemical reaction networks, nor on process calculi, but on a new model, inspired by the Actor Model, called distributed GLC. I shall explain why I believe that Hewitt’s Actor Model is superior to those mentioned previously (with respect to decentralized, asynchronous computation in the real Internet, and also in the real world), I shall explain my understanding of that model, and eventually the distributed GLC proposal by me and Louis Kauffman will be exposed in all its details.
4. Self-multiplication of a g-pattern coming from a lambda term.
For the moment we concentrate on the self-multiplication phenomenon for g-patterns which represent lambda terms. The following is a departure from the ALIFE 14 article: I shall not use the path which consists in going to combinator patterns, nor shall I discuss in this post why the self-multiplication phenomenon is not confined to the world of g-patterns coming from lambda terms. This is for a future post.
In this post I want to give an image of how these g-patterns self-multiply, in the sense that most of the self-multiplication process can be explained independently of the computing model. Later on we shall come back to this, look outside lambda calculus as well, and also explore the combinator molecules.
OK, let’s start. In part V it was noticed that after an application of the beta rule to the g-pattern
L[a,x,b] A[b,c,d] C[c] FOTREE[x,a1,…,aN] B[a1,…,aN, a]
we obtain (via COMB moves)
C[x] FOTREE[x,a1,…,aN] B[a1,…,aN,d]
and the problem is that we have a g-pattern which is not coming from a lambda term, because it has a FOTREE in the middle of it. It looks like this (recall that FOTREEs are figured in yellow and the syntactic trees are figured in light blue)
The question is: what can happen next? Let’s simplify the setting by taking the FOTREE in the middle as a single fanout node, then we ask what moves can be applied further to the g-pattern
C[x] FO[x,a,b]
Clearly we can apply DIST moves. There are two DIST moves, one for the application node, the other for the lambda node.
There is a chain of propagation of DIST moves through the syntactic tree of C, which is independent of the model of computation chosen (i.e. of the rules about which moves are used, and when and where), because the syntactic tree is a tree.
Look what happens. We have the propagation of DIST moves (for the application nodes, say) first, which produces two copies of a part of the syntactic tree which contains the root.
At some point we arrive at a pattern which allows the application of a DIST move for a lambda node. We apply the rule:
We see that fanins appear! … and then the propagation of DIST moves through the syntactic tree continues until eventually we get this:
So the syntactic tree has self-multiplied, but the two copies are still connected by FOTREEs which connect to the left.out ports of the lambda nodes which are part of the syntactic tree (only one is figured in the previous image).
Notice that now (or even earlier; it does not matter, and exactly why will be explained rigorously when we talk about the computing model, since for the moment we only want to see whether it is possible) we are in a position to apply the FAN-IN move. Also, it is clear that by using CO-COMM and CO-ASSOC moves we can shuffle the arrows of the FOTREE, which is “conjugated” with a fanin at the root and with fanouts at the leaves, so that eventually we get this.
The self-multiplication is achieved! It looks strikingly like anaphase [source] followed by telophase [source].
____________________________________________________
2. Lambda calculus terms as seen in chemlambda continued.
Let’s look at the structure of a molecule coming from the process of translation of a lambda term described in part IV.
Then I shall make some comments which should be obvious after the fact, but useful later when we discuss the relation between the graphic beta move (i.e. the beta rule for g-patterns), beta reduction and evaluation strategies.
That will be a central point in the exposition, it is very important to understand it!
So, a molecule (i.e. a pattern with the free ports names erased, see part II for the denominations) which represents a lambda term looks like this:
In light blue is the part of the molecule which is essentially the syntactic tree of the lambda term. The only peculiarity is in the orientation of the arrows of lambda nodes.
Practically, this part of the molecule is a tree which has as nodes the lambda and application ones, but no fanouts or fanins.
The arrows are directed towards the top of the figure. There is no need to draw it like this, i.e. there is no global rule for edge orientations, contrary to the ZX calculus, where the edge orientations are deduced from the global down-to-up orientation.
We see a lambda node figured, which is part of the syntactic tree. It has the right.out port connecting to the rest of the syntactic tree and the left.out port connecting to the yellow part of the figure.
The yellow part of the figure is a FOTREE (fanout tree). There might be many FOTREEs, in the figure appears only one. By looking at the algorithm of conversion of a lambda term into a g-pattern, we notice that in the g-patterns which represent lambda terms the FOTREEs may appear in two places:
As a consequence of this observation, here are two configurations of nodes which NEVER appear in a molecule which represents a lambda term:
Notice that these two patterns are EXACTLY those which appear as the LEFT side of the moves DIST! More about this later.
Remark also the position of the insertion points of the FOTREE which comes out of the left.out port of the figured lambda node: the out ports of the FOTREE connect with the syntactic tree somewhere lower than where the lambda node is. This is typical for molecules which represent lambda terms. For example the following molecule, which can be described as the g-pattern L[a,b,c] A[c,b,d] (but with the port variables deleted), cannot appear in a molecule which corresponds to a lambda term.
Let’s go back to the first image and continue with “TERMINATION NODE (1)”. Recall that termination nodes are used to cap the left.out port of a lambda node which corresponds to a term Lx.A with x not occurring in A.
Finally, “FREE IN PORTS (2)” represents free in ports which correspond to the free variables of the lambda term. As observed earlier, but not figured in the picture, we MAY have free in ports as ports of a FANOUT tree.
I collect here some obvious, in retrospect, facts:
_______________________________________________________
3. The beta move. Reduction and evaluation.
I explain now in what sense the graphic beta move, or beta rule from chemlambda, corresponds to the beta reduction in the case of molecules which correspond to lambda terms.
Recall from part III the definition of the beta move:
”
L[a1,a2,x] A[x,a4,a3] <–beta–> Arrow[a1,a3] Arrow[a4,a2]
or graphically
If we use the visual trick from the pedantic rant, we may depict the beta move as:
i.e. we use as free port variables the relative positions of the ports in the doodle. Of course, there is no node at the intersection of the two arrows, because there is no intersection of arrows at the graphical level. The chemlambda graphs are not planar graphs.”
The beta reduction in lambda calculus looks like this:
(Lx.B) C –beta reduction–> B[x:=C]
Here B and C are lambda terms and B[x:=C] denotes the term which is obtained from B after we replace all the occurrences of x in B by the term C.
I want to make clear what is the relation between the beta move and the beta reduction. Several things deserve to be mentioned.
It is of course expected that if we translate (Lx.B)C and B[x:=C] into g-patterns, then the beta move transforms the g-pattern of (Lx.B)C into the g-pattern of B[x:=C]. This is not exactly true, in fact it is true in a more detailed and interesting sense.
Before that it is worth mentioning that the beta move applies even for patterns which don’t correspond to lambda terms. Hence the beta move has a range of application greater than the beta reduction!
Indeed, look at the third figure from this post, which can’t be a pattern coming from a lambda term. Written as a g-pattern this is L[a,b,c] A[c,b,d]. We can apply the beta move and it gives:
L[a,b,c] A[c,b,d] <-beta-> Arrow[a,d] Arrow[b,b]
which can be followed by a COMB move
Arrow[a,d] Arrow[b,b] <-comb-> Arrow[a,d] loop
Graphically it looks like that.
In particular this explains the need to have the loop and Arrow graphical elements.
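The locality of the beta move can also be illustrated with a small Python sketch (my own encoding, not from the article), where a g-pattern is just a list of named nodes with port lists and the rewrite touches exactly one L node and one A node:

```python
# A g-pattern as a list of (node_name, ports) pairs.
def beta_move(gp):
    """Find one L[a1,a2,x] A[x,a4,a3] pair sharing the port x and rewrite
    it to Arrow[a1,a3] Arrow[a4,a2]; leave every other node untouched."""
    for i, (n1, p1) in enumerate(gp):
        if n1 != 'L':
            continue
        a1, a2, x = p1
        for j, (n2, p2) in enumerate(gp):
            if n2 == 'A' and p2[0] == x:
                _, a4, a3 = p2
                rest = [e for k, e in enumerate(gp) if k not in (i, j)]
                return [('Arrow', [a1, a3]), ('Arrow', [a4, a2])] + rest
    return gp  # no redex found

gp = [('L', ['a', 'b', 'c']), ('A', ['c', 'b', 'd'])]
print(beta_move(gp))
# -> [('Arrow', ['a', 'd']), ('Arrow', ['b', 'b'])]
```

Applied to L[a,b,c] A[c,b,d] this yields Arrow[a,d] Arrow[b,b], matching the reduction above; a COMB move would then turn Arrow[b,b] into a loop.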
In chemlambda we make no effort to stay inside the collection of graphs which represent lambda terms. This is very important!
Another reason for this is related to the fact that we can’t check whether a pattern comes from a lambda term in a local way, in the sense that there is no local criterion (i.e. one involving an a priori bound on the number of graphical elements used) which describes the patterns coming from lambda terms. This is obvious from the previous observation that FOTREEs connect to the syntactic tree lower than their roots.
On the other hand, chemlambda is a purely local graph rewrite system, in the sense that there is a bound on the number of graphical elements involved in any move.
This has as a consequence: there is no notion of a correct graph in chemlambda. Hence there is no correctness enforcement in the formalism. In this respect chemlambda differs from any other graph rewriting system which is used in relation to lambda calculus or, more generally, to functional programming.
Let’s go back to the beta reduction
(Lx.B) C –beta reduction–> B[x:=C]
Translated into g-patterns the term from the LEFT looks like this:
L[a,x,b] A[b,c,d] C[c] FOTREE[x,a1,…,aN] B[a1,…,aN, a]
where
The beta move does not need all this context, but we need it in order to explain in what sense the beta move does what the beta reduction does.
The beta move needs only the piece L[a,x,b] A[b,c,d]. It is a local move!
Look how the beta move acts:
L[a,x,b] A[b,c,d] C[c] FOTREE[x,a1,…,aN] B[a1,…,aN, a]
<-beta->
Arrow[a,d] Arrow[c,x] C[c] FOTREE[x,a1,…,aN] B[a1,…,aN, a]
and then 2 comb moves:
Arrow[a,d] Arrow[c,x] C[c] FOTREE[x,a1,…,aN] B[a1,…,aN, a]
<-2 comb->
C[x] FOTREE[x,a1,…,aN] B[a1,…,aN,d]
Graphically this is:
The graphic beta move, as it looks on syntactic trees of lambda terms, has been discovered in
Wadsworth, Christopher P. (1971). Semantics and Pragmatics of the Lambda Calculus. PhD thesis, Oxford University
This work is the origin of the lazy, or call-by-need evaluation in lambda calculus!
Indeed, the result of the beta move is not B[x:=C], because in the reduction step no substitution x:=C is performed.
In the lambda calculus world, as it is well known, one has to supplement the lambda calculus with an evaluation strategy. The call-by-need evaluation explains how to do in an optimized way the substitution x:=C in B.
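For readers who want a concrete picture of call-by-need, here is a minimal Python sketch of a memoized thunk (my own illustration, not part of chemlambda): the argument C is wrapped in a delayed computation that is evaluated at most once, no matter how many times B uses x:

```python
class Thunk:
    """Delay a computation and cache its result after the first force."""
    def __init__(self, compute):
        self._compute = compute
        self._done = False
        self._value = None

    def force(self):
        if not self._done:
            self._value = self._compute()
            self._done = True
        return self._value

calls = []
t = Thunk(lambda: calls.append('eval') or 42)
# However many times the body uses its argument, it is computed once:
print(t.force() + t.force())  # 84
print(len(calls))             # 1
```

This "evaluate once, share the result" behaviour is what the FOTREE sharing in the g-pattern expresses graphically.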
From the chemlambda point of view on lambda calculus, a very interesting thing happens. The g-pattern obtained after the beta move (and obvious comb moves) is
C[x] FOTREE[x,a1,…,aN] B[a1,…,aN,d]
or graphically
As you can see this is not a g-pattern which corresponds to a lambda term. That is because it has a FOTREE in the middle of it!
Thus the beta move applied to a g-pattern which represents a lambda term gives a g-patterns which can’t represent a lambda term.
The g-pattern which represents the lambda term B[x:=C] is
C[a1] …. C[aN] B[a1,…,aN,d]
or graphically
In graphic lambda calculus, or GLC, which is the parent of chemlambda, we pass from the graph which corresponds to the g-pattern
C[x] FOTREE[x,a1,…,aN] B[a1,…,aN,d]
to the g-pattern of B[x:=C]
C[a1] …. C[aN] B[a1,…,aN,d]
by a GLOBAL FAN-OUT move, i.e. a graph rewrite which looks like this:
if C[x] is a g-pattern with no other free ports than “x” then
C[x] FOTREE[x, a1, …, aN]
<-GLOBAL FAN-OUT->
C[a1] …. C[aN]
As you can see this is not a local move, because there is no a priori bound on the number of graphical elements involved in the move.
That is why I invented chemlambda, which has only local moves!
The evaluation strategy needed in lambda calculus to know when and how to do the substitution x:=C in B is replaced in chemlambda by SELF-MULTIPLICATION.
Indeed, this is because the g-pattern
C[x] FOTREE[x,a1,…,aN] B[a1,…,aN,d]
surely has places where we can apply DIST moves (and perhaps later FAN-IN moves).
That is for the next post.
___________________________________________________
The Swift non-member function map acts much like the Array.map member function, except you can pass in any kind of sequence or collection (click here for an article on how to write an algorithm like this).
But if you look at what map returns, it isn’t an array of results. It’s an object (a MapCollectionView or a MapSequenceView).
That’s because map evaluates its results lazily. When you call map, no mapping takes place. Instead, the input and mapping function are stored for later, and values are only mapped as and when they are needed.
To see this happening, run the following code:
let r = 1...3
let mapped = map(r) { (i: Int) -> Int in
    println("mapping \(i)")
    return i*2
}
println("crickets...")
for i in mapped {
    println("mapped \(i)")
}
println("tumbleweed...")
println("index 2 = \(mapped[2])")

// prints out
crickets...
mapping 1
mapped 2
mapping 2
mapped 4
mapping 3
mapped 6
tumbleweed...
mapping 2
index 2 = 4
The returned object from map mimics the properties of the object passed in. If you pass in a sequence, then you get back a sequence. If the collection you passed in only supports a bidirectional index (such as a Dictionary or a String, which can’t be indexed randomly), then you won’t be able to subscript arbitrarily into the mapped collection; you’ll have to walk through it.
map is not the only lazy function. filter and reverse also return these “view” classes. Combining these different lazily evaluated classes means you can build up quite complicated expressions without sacrificing efficiency.
Want to search backwards through an array or string? find(reverse(myarray), value) won’t actually reverse the array to search through it; it just iterates over the ReverseView class, which serves up the array in reverse order.
Want the first 3 odd numbers that are the minimum of the two arrays at each position? first_n(filter(map(Zip2(a1, a2), min)) { $0%2 == 1 }, 3) won’t do any more zipping or minning or modding than necessary.^{1}^{2}^{3}
There are also several lazily generated sequences you can use:
enumerate numbers each item in a collection; it returns an EnumerateGenerator that serves up those numbered elements on demand. This is useful in a for (count, val) in enumerate(seq) loop where you want to know both the value and how many values so far.
PermutationGenerator is initialized with a collection and a collection of indices into that collection, and serves up each element at each index in that order (thus allowing you to lazily reorder any collection; for example, PermutationGenerator(elements: a, indices: reverse(a.startIndex..<a.endIndex)) would be the reverse of a).^{4}
GeneratorOf just takes a closure and runs it each time to generate the next value. So var i = 0; var naturals = GeneratorOf({ ++i }) sets naturals to a sequence that counts up from 1 to infinity (or rather, to overflow).
These infinite virtual sequences can be pretty useful. For example, to do the same thing as enumerate, you could write Zip2(naturals, somecollection). Or you could pass your infinite sequence into map or filter to generate further infinite sequences.
(At the opposite end, if you have a single value but what a function needs is a Sequence, you can use GeneratorOfOne(value) to create a quick sequence that will serve up just your one value.)
It's not all rainbows and unicorn giggles though. Imagine you did the following:
let r = 1...10_000
let mapped = map(r) { megaExpensiveOp($0) }
let a1 = Array(mapped)
let a2 = Array(mapped)
Here the lazy evaluation will work against you. megaExpensiveOp will be run not ten but twenty thousand times.
“Shouldn't map cache the data?” you ask. Well, that leads to the next complication. Take this code:
var i = 0
let mapped = map(1...5) {
    i += 1
    return $0 + i
}
Every time you iterate over mapped now, you'll get different results.^{5}
This behaviour might be put to very good use (say you wanted to perturb a data set with small random values). But if you weren't expecting it, that could be one nasty bug to track down.
So it's good to know, when you're using these functions, what they are actually doing under the hood. Just keep in mind these caveats and an exciting life of lazy evaluation awaits you in the off-world colonies.
first_n isn't a Swift standard library function. But it should be! ↩
reverse(a) would work better, but a more useful example is tough to fit on one line… ↩
Enter, generators.
In simplest terms, Python generators are objects that act like lists (i.e., are “duck typed” to act as lists) since they implement a next() method. The distinction that generators hold over lists is that generators are “lazily evaluated”, which is to say they only yield values when they need to. Let’s see an example below:
def my_gen():
yield 'a'
yield 'b'
yield 'c'
print my_gen()
for i in my_gen():
print i
yields
<generator object my_gen at 0x10ea04f50> a b c
We see here that calling the function my_gen() is what actually returns the generator object. We also see that we’re using a special keyword yield here (as opposed to return). This indicates to Python that the function should return a generator, as opposed to the value indicated. Yields and returns don’t mix, as we can observe below:
def f():
yield 'x'
return 'y'
yields
SyntaxError: 'return' with argument inside generator (<pyshell#104>, line 3)
But let’s dissect how my_gen() is actually operating. How does the for-loop know when the generator is exhausted? Well, since we know the generator implements the next() method, let’s invoke it manually below:
def my_gen():
yield 'a'
yield 'b'
yield 'c'
g = my_gen()
print g.next()
print g.next()
print g.next()
print g.next() # oops
yields
a b c Traceback (most recent call last): File "/Users/Matt/PYTHON_FILES/COLOR/CODE/source.py", line 11, in <module> print g.next() # oops StopIteration
Ah, of course. Generators are duck-typed to be like lists. So, of course, they indicate that it’s time to stop iterating with a StopIteration exception.
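Since generators only produce values on demand, they compose cheaply even when the underlying stream is infinite. Here is a small follow-on example using the standard itertools.islice:

```python
import itertools

def naturals():
    """An infinite, lazily evaluated stream: 1, 2, 3, ..."""
    n = 1
    while True:
        yield n
        n += 1

# Only as many values are produced as we actually consume:
evens = (n for n in naturals() if n % 2 == 0)
print(list(itertools.islice(evens, 5)))  # [2, 4, 6, 8, 10]
```

The generator expression never runs ahead of the consumer, so filtering an infinite stream terminates as soon as five matches have been taken.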