Recently, I've been in a mood to play with ORMs in Haskell. The most recognizable one is Persistent, developed as part of a Yesod project. It is quite pleasant to work with, I must say, but some assumptions it makes are... not suitable for me. So I've looked for something different and I found out about Groundhog, yet another ORM in Haskell. Unfortunately, it is almost identical to Persistent, but maybe instead of searching for the perfect library, I will be able to adapt it to my needs?

The point is - I don't want fancy ORMs with custom DSLs in pure Haskell that will allow me to express every query (well, sort of). Of course, I don't want to write simple `SELECT`

s with some conditions by hand as that may be expressed in EDSL well, but advanced queries are just PITA to write in something different than SQL (and even in SQL it's not that easy). But being able to write raw SQL and then map the result to a custom type is tempting.

My goal is simple - force Groundhog to decode results of raw SQL queries into Haskell objects. This means using `queryRaw`

with correct decoder. Sounds simple. But it isn't. Converting a list of values into a real object, using different code for every type is painful and I've *really* wanted to avoid it. To my luck, I was able to leverage `PersistEntity`

, something that every type usable by Groundhog must instantiate. So, let's start to code!

## The code

Assume we have this model:

data Author = Author { authorName :: Text , authorEmail :: Text } data Category = Category { categoryName :: Text } data Post = Post { postTitle :: Text , postContent :: Text , postCategory :: DefaultKey Category , postAuthor :: DefaultKey Author } -- This is a triple (Post, Category, Author), but using PCA c-tor is simpler data PCA = PCA { pcaPost :: Post , pcaCategory :: Category , pcaAuthor :: Author } mkPersist defaultCodegenConfig [groundhog| - entity: Author - entity: Category - entity: Post |]

And we want to get a `Post`

with its `Category`

and `Author`

(`PCA`

). It's just a `SELECT`

with two `JOIN`

s, so I'll skip it. How to do so? Simple - `queryRaw False queryString [PersistInt64 i] decode`

. The problem is - how to write the `decode`

function? ;)

Let's face a less complex problem first. How can we decode whole `Post`

? It instantiates `PersistEntity`

, so we can use `fromEntityPersistValues`

. And it has a signature that is almost identical to desired - `[PersistValue] -> m (v, [PersistValue])`

vs `m (Maybe [PersistValue]) -> m v`

(and the constraints on `m`

and `v`

match, more or less)! We just need to extract a list from a `Maybe`

, pass it to this function and select the first element of the result:

decode v = do Just v' <- v fst <$> fromEntityPersistValues v'

or even

decode = fmap fromJust >=> fromEntityPersistValues >=> return . fst

This was easy. There is one gotcha here - default implementation of `PersistEntity`

generated by `groundhog-th`

uses a magic value, encoded as a first column, to determine which constructor should be used (in sum types). Sadly, it does so for EVERY type, even for simple product types like `Post`

. We need to take care of it ourselves either by manually `SELECT`

ing 0 as first column, or prepending `PersistInt64 0`

to the `v`

.

Let's take a closer look at `fromEntityPersistValues`

. What does it do? It tries to decode an object from a list of `PersistValue`

and returns the remaining values or fails with an error. If there are more values than are needed, it will consume just the required ones and let us decide what to do with the rest. This allows us to construct the inner objects first and then build the outer, like this:

-- To simplify, I assume that `v` is `[PersistValue]` not `Maybe [PersistValue]`. decode v = do (post, v1) <- fromEntityPersistValues v (cat, v2) <- fromEntityPersistValues v1 (auth, _) <- fromEntityPersistValues v2 return $ PCA post cat auth

Can you spot the pattern? Decode an object, save it and, if we need more, (sort of) *recursively* decode, otherwise construct the final object. But do we really need to construct the object at the end? Can't we just do it in the meantime? Of course we can! ;) `PCA`

(or any other constructor) is just a function and we may leverage partial application to keep track of the partially constructed object (or, to be exact, a partially applied constructor).

Additionally, `PCA`

has a type `Post -> Category -> Author -> PCA`

and we want to decode an object that may be constructed with a function of type `(PersistEntity a, PersistEntity b, ...) => a -> b -> ... -> Object`

or, more generally, a function of type `PersistEntity a => a -> b`

where `b`

may be of the same form or is a fully constructed object. Having this, we can develop a function that tries to decode a single object (thus consumes some of the `PersistValue`

s), applies it to the partially applied function and returns it along with the rest of values.

decodePart :: (PersistBackend m, PersistEntity a) => (a -> b, [PersistValue]) -> m (b, [PersistValue]) decodePart (p, v) = do (o, v') <- fromEntityPersistValues v return (p o, v')

This way, we can just bind some calls to `decodePart`

(for `PCA`

- 3) and we have fully constructed object:

decode v = decodePart (PCA, l) >>= decodePart >>= decodePart >>= return . fst -- or with TupleSections extension we can make it poit-free decode = (decodePart >=> decodePart >=> decodePart >=> return . fst) . (PCA,)

The problem is - we must put as many calls to `decodePart`

as parameters to the constructor. This makes it quite unpleasant to work with. Fortunately, we can overcome this with a little bit of `TemplateHaskell`

.

As I said earlier, the constructor's type is always of the form `PersistEntity => a -> b`

(where b may be of the same form). This means that we could call `decodePart`

recursively, but unfortunately I don't know what to do to make it stop calling with only a function (as we can't examine type in code that easily). But! We can make a typeclass that will be able to do exactly this.

class PartialDecode m a where type PartialResult a decodePart :: (a, [PersistValue]) -> m (PartialResult a, [PersistValue])

We need the associated type `PartialResult`

instead of an additional parameter to the typeclass and functional dependencies as they would make the following definition a nightmare to write (or maybe even not possible).

And the key - an instance of `PartialDecode`

for `PersistEntity => a -> b`

(assumes that the `0`

, which denotes the constructor, is prepended in SQL):

instance (PersistBackend m, PersistEntity a, PartialDecode m b) => PartialDecode m (a -> b) where type PartialResult (a -> b) = PartialResult b decodePart (p, v) = do (a, v') <- fromEntityPersistValues v decodePart (p a, v')

Call to this `decodePart`

for functions like `PCA`

, will construct a real object, but will not compile as this sole instance requires the type to be infinite (as we always require a function to return another function). We have to have a trivial instance of `PartialDecode`

for every type we want to decode (because Haskell does not allow to write a general form) that will just `return`

for `decodePart`

.

-- For PCA instance (Monad m) => PartialDecode m PCA where type PartialResult PCA = PCA decodePart = return

Writing this boilerplate code is tiresome and that's where Template Haskell helps. With a little helper function we can collapse this definition to a single line (or, with a little bit of tweaking, we can force Groundhog to do this for us):

mkPartialDecode :: Name -> Q [Dec] mkPartialDecode n = [d| instance (PersistBackend m) => PartialDecode m $(conT n) where type PartialResult $(conT n) = $(conT n) decodePart = return|]

`mkPartialDecode ''PCA`

and the instance is generated. A little bit of `fmap`

ing, `mapM`

ing and `concat`

ing and we can make it accept a list of types instead of only one. We also need to have a way of selecting appropriate constructor to pass to the first call to `decodePart`

. Sadly, this is another obstacle that requires Template Haskell and a bunch of assumptions (type has only one constructor). Introduce another type class for this purpose:

class DecodeObject m a where decodeObject :: [PersistValue] -> m a instance (PersistBackend m) => DecodeObject m PCA where decodeObject v = fst <$> decodePart PCA

With a bit of TH, writing it may be automated:

mkPartialDecode :: Name -> Q [Dec] mkPartialDecode n = do TyConI (DataD _ _ _ (c1:_) _) <- reify n [d| instance (PersistBackend m) => DecodeObject m $(conT n) where decodeObject v = fst <$> decodePart ($(conE (getName c1)), v) instance (PersistBackend m) => PartialDecode m $(conT n) where type PartialResult $(conT n) = $(conT n) decodePart = return|] where getName (NormalC n _) = n getName (RecC n _) = n getName (InfixC _ n _) = n getName (ForallC _ _ n) = getName n

One final bit - real `decode`

function:

decode :: (PersistBackend m, PartialDecodeConstructor m a) => RowPopper m -> m a decode m = do Just l <- m decode l

## Summary

OK, I must admit - this post is way too long for the solution. It was fun to write (although I changed it too many times), forced me to learn a little bit of Template Haskell and opened my eyes for what I want from ORMs in Haskell. I think that the solution will not be of much use as it is not that elegant, but maybe the overall idea will be useful?

Groundhog is great, but having to be in a monad just to convert some `PersistValue`

s is not. I know that this has its application (sum types), but for me it's mostly overengineering and not something valuable, at least in a data layer. Persistent has a little different way of executing raw SQL queries (and has its own drawbacks), but everything presented here may be replicated there as well. Maybe in my next post I will show this? Who knows... ;)

I know now that I had not really knew what I wanted from ORMs. I think I still don't know, but at least I'm a bit wiser. ;)