
Power Query (or rather M) is a functional language. This simply means that functions can be passed around just as any other variable—or stored in lists, records and tables just like any other value. As a consequence, a function can take other functions as arguments, or itself return a new function.
I want to delve a bit deeper into this at some point but this post will give a very tangible example how applying functional techniques can make your code both easier to write and easier to read.
Untangling the nest
A colleague recently reminded me of this post I had made on the Power BI forums years ago. It offers a, rather roundabout, way of generating hashes in Power Query. This comes in handy primarily when there’s a need of anonymising data.
The solution looks like this (see post for details):
CalculateHash = (x as text) as text => Binary.ToText(
Binary.FromList(
List.FirstN(
List.LastN(
Binary.ToList(
Binary.Compress(
Text.ToBinary(x, BinaryEncoding.Base64),
Compression.GZip
)
),
8),
4)
),
BinaryEncoding.Hex)
Nested code like this is just awful to read, with the first operation on the function input hiding somewhere in the middle of f8(f7(f6(f5(f4(f3(f2(f1(x))))))))
.
We can write this in a more idiomatic Power Query fashion, binding each step to a self-documenting identifier:
CalculateHash = (x as text) as text => let
#"Convert text to binary" = Text.ToBinary(x, BinaryEncoding.Base64),
#"Compress binary data" = Binary.Compress(#"Convert text to binary", Compression.GZip),
#"Convert binary to list" = Binary.ToList(#"Compress binary data"),
#"Extract footer" = List.LastN(#"Convert binary to list", 8),
#"Extract checksum" = List.FirstN(#"Extract footer", 4),
#"Convert checksum to binary" = Binary.FromList(#"Extract checksum"),
#"Convert checksum to hexadecimal" = Binary.ToText(#"Convert checksum to binary", BinaryEncoding.Hex)
in #"Convert checksum to hexadecimal"
This is a good approach if you want to see each applied step in the Power Query editor (or if you want others to) but it’s very hard to read the code itself.
Pipes
Working with R (or other more or less strictly functional languages such as F#) there is often the notion of a pipe operator.
Say we want to take a list of numbers, filter out the odd ones, double the remaining even numbers and then reverse the list. Working with R and the Margittr package it looks something like:
library(magrittr)
x <- seq(1,9) %>%
Filter(function(x) x %% 2 == 0, .) %>%
(function(x) x * 2) %>%
rev
F# is much cleaner looking—the pipe operator is built in to the language rather than reliant on an external package like R—but the principle remains the same.
let x =
seq {1..9}
|> Seq.filter(fun x -> x % 2 = 0)
|> Seq.map(fun x -> x * 2)
|> Seq.rev
The code is written in the order in which the operations take place, the F# code has the almost magical quality of making sense even if you’ve never come across the language itself before.
Can we make Power Query behave in the same way? Yes we can (kind of)!
Pipes in Power Query
We don’t have the ability to implement new infix operators in Power Query/M (that is operators that sit in between its inputs, such as +
, -
or and
) but as the language is functional we can come up with a pretty slick alternative.
The corresponding M syntax will look like this:
let x = Pipe({1..9})(
each List.Select(_, Number.IsEven),
each List.Transform(_, each _ * 2),
List.Reverse
)
And, we can write the hash generator in the original example like this:
let Hash = Pipe(1234)(
Text.From,
each Text.ToBinary(_, BinaryEncoding.Base64),
each Binary.Compress(_, Compression.GZip),
Binary.ToList,
each List.LastN(_, 8),
each List.FirstN(_, 4),
Binary.FromList,
each Binary.ToText(_, BinaryEncoding.Hex)
)
The each
operator you see dotted around here is syntactic sugar (a shortcut) for a one argument lambda such as (_) => _ + 1
.
This means that each List.Select(_, Number.IsEven)
is just a slightly shorter version of (x) => List.Select(x, Number.IsEven)
. The syntax should look familiar as you will see these anonymous functions being generated automatically by the Power Query user interface all the time.
So, how does it work? Here’s how I went about it…
Building a simple M pipe syntax
Let’s start with a value x
, and to that value let’s apply a function f1
, to the result of that function let’s apply f2
then f3
and so on. It then seems a good idea to work with an arbitrary length list
of function
and a variable of type any
.
We want to apply these functions in turn and then return a single value, which is a pefect task for the built-in List.Accumulate
function.
The function syntax looks like this:
List.Accumulate(list as list, seed as any, accumulator as function) as any
As input it takes a list (this would be our list of functions), a seed (this will be our initial value x
) and an accumulator function which we’ll get to in a second.
A simple use case forList.Accumulate
would be this, an alternative implementation of List.Sum
:
let List.Sum2 = (list as list) =>
List.Accumulate(
list,
0,
(state, current) => state + current
)
The accumulator function is a two argument lambda, the first argument captures the state of the operation while the second argument captures the current value in the list. With the seed
argument set to 0
the very first calculation step when iterating over a list of numbers 1-9
will be (0, 1) => 0 + 1
, then (1, 2) => 1 + 2
and so on.
If the current value instead is a function we want to apply to the state (remember functions are just values) we simply define the accumulator as (x, f) => f(x)
.
And the heart of our final function then is this:
List.Accumulate(
functions,
object,
(x, f) => f(x)
)
We need to bind the object
variable (i.e. the initial state) so we wrap it all up in the following function definition:
Pipe = (object as any) as function =>
let RunPipe = (functions as list) as any =>
List.Accumulate(
functions,
object,
(x, f) => f(x)
)
in Function.From(type function (f as function) as any, RunPipe)
The Pipe
function actually return a function, hence the Pipe(...)(...)
syntax when we invoke it. We could implement this as a two argument function that takes an object and a list but I personally find the Pipe(..., {...})
option a bit less easy on the eye.
This technique is called a closure, which means we can also apply any transformation in stages. For instance, let’s define let p = Pipe({1..9})
and then base multiple calculations based off of this, e.g. sumIsEven = p(List.Sum, Number.IsEven)
and hasEvenNumberOfElements = p(List.Length, Number.IsEven)
.
The very last part of our pipe functions Function.From(type function (f as function) as any, RunPipe)
is what allows the returned function to take an arbitrary number of arguments. Function.From
returns a function which converts its arguments into a list and applies it to the function specified in the second argument.
And that’s all there is to it, a few lines of code to save us writing many, many more.