Re: [ledgersmb-devel] RFC: Formalize templates interface and split out onto CPAN

31 Aug 2017

      Ok. What is the problem with the render_to_psgi() functions when you throw
...
...
away the success/fail indicator and convert that to a 'die()' when the
value isn't equal to HTTP_OK ? I mean, that amounts to the same thing as
the pair you're proposing? The body is one of two things: an array of
strings or a filehandle. That should be easy to convert to a string (as you
want it?).
That's one too many options for the body in my view.  It means unneeded
complexity given that different formats will probably return differently
here and you always have to handle both.
Yes. That's exactly the point: you *have* to handle both. In any reasonable
context, you can't know what the size of the returned document is and there
should not be any assumptions as to whether the document can or needs to be
loaded into memory in its entirety.
...
I would like to see, ideally, one response format only.  A file handle
would be good if we standardised on that.
But from a performance perspective, it's not always desirable to wrap
content in a wrapper object. And from a DDoS perspective, it's not always
desirable to load the entire response in memory. To that extent I think
that the PSGI interface has been designed pretty well.

LaTeX->PDF documents are already on disk and can be returned using a file
handle. HTML documents are (usually) already in-memory and can be returned
using scalar references.

Here's a case in point:  Suppose I want a cron job which emails a
...
...
...
spreadsheet of overdue invoices (ageing detail report).  I think this
should be a fairly easy thing to do even with the global state
requirements, but on 1.4 it goes something like "fork our reporting engine"
and the problem still is not fully fixed in master (worse, it is not
*obvious* that the problem is not fully fixed in master because of
assumptions in the templating engine).
I'm afraid I'm missing the point. In my mind you'd need to instantiate a
LedgerSMB::Report::Aging, call its "render_to_psgi()" method and take the
body out of the returned triple. Then you do with that body whatever you
want.
I'm not sure I see what the problem is that isn't completely fixed in
master.
Quick:  does the csv and OOO adaptors return bodies with the same
interface?  Or do I have to find out what kind of reference the body is and
handle it?
You always have to do the same thing, which means: Use Plack::Util to grab
the returned body.
...
And why does the template library have special knowledge of HTTP status?
Because we never settled on a good error handling model. Some parts of the
code are returning undef, others are dying in combination with Try::Tiny
(try/catch). Until we decide about a fitting error handling model, I don't
see other options than to simply be pragmatic and apply the examples from
the library we're using (e.g. Plack/PSGI in this case).
...
(Note that I'm avoiding to say that we have resolved the problem that none
...
...
...
of our modules behave as "good libraries" and make lots of assumptions
about the context in which they're being called, like accessing and/or
assuming global state having been set up.)
Agreed.  One of the point of trying to formalise things that can be
broken off is to make good libraries and a smaller footprint.  But this is
about two things.  The first is general interface and having an interface
which avoids assuming that every component is being accessed from the whole
framework.
Yes. We completely agree there.
WAIT! While going over the response above, I'm suddenly realising your
use-case is with the aging report!
Then it's my time now to say that you're confusing one problem for
another. Our LedgerSMB::Template is conflating two completely unrelated
(although both "template" related) problems: LedgerSMB::Template generates
our UI *and* LedgerSMB::Template generates our templatized documents.
Yep.  And the UI portion will probably eventually go away.
If you mean "go away from LedgerSMB::Template (into LedgerSMB::UI)", then
I can follow. However, I'm not so optimistic about LedgerSMB::UI going
away. I'm thinking it might switch template processors, for speed purposes,
but I'm not (unfortunately) seeing it be replaced by Dojo completely.
...
While doing the latter, it even mistakenly sends them as "UI responses"
...
(HTTP responses).
I've long wanted to separate the "UI templates processor" from the
"document templates processor". Having that separation resolves 99% of the
cases where people may want to run reports from outside the context of a
web request.
Exactly what I am trying to get to.
Ok.
...
I'm not convinced that separating "render_to_psgi()" out from the "UI
...
templates processor" actually brings us anything: Dancer's "template()"
function is very tightly nit to the response being sent too.
Right but Dancer's template engine assumes you are going to use it with
Dancer, to generate web pages with HTML and jquery.  I guess I am
questioning that assumption for an accounting package.
Ah. But you're looking at it wrong. We're using LedgerSMB::Template as part
of the "LedgerSMB web glue layer". It has nothing to do with the accounting
package LedgerSMB, but much more with LedgerSMB's infrastructure. We've
been writing all our components into the single application because these
frameworks didn't exist yet. And while we're moving to Plack quickly, we're
not completely rewriting the application to use Dancer (yet). So, we still
need our custom glue layer for a while.
...
I think these two are closely connected and global state is a part of
...
...
the second.
...
The case that again immediately occurred to me here was the fact that
LedgerSMB::Report->render sends the result in CGI format, despite the fact
that this is a module introduced in 1.4.
We didn't "do" PSGI-native-responses until 1.6, so, both 1.4 and 1.5
return CGI responses which are being converted to PSGI responses on the
request-dispatch boundary.
...
And even in master, you have a choice of PSGI or CGI only (because for
historical reasons we output to cgi by default).  Yes, we can fix it in the
reporting engine by adding a line of code but I guess I see this as a
single responsibility problem because the templating engine takes on the
responsibility of output which I think is a problem.
The problem is that we have a single templating engine to generate our UI
output as well as our "documents" (invoices, reports,...) output. (And
because they were conflated, we stored the report templates in the UI/
tree, instead of in the templates/ tree (!), with the huge downside that
they're not translatable or per-locale modifiable.)
Right.  So suggesting we a shim for old code and a clean separation of
concerns.
I'd like to avoid using shims: they obfuscate the execution flow
(generally) and tend to stick around way longer than planned and desirable.
Can this same effect be achieved by *not* using a shim and simply coding
"the right thing"?
...
(But if it doesn't take on the responsibility to output then I don't think
...
...
it should take on PSGI responsibility either.)
There is a related question here which is where should the HTTP status
messages come from but I think that's an additional discussion. I think
they should only come from the workflow scripts or from PSGI.pm but open to
other ideas.
Now we're getting back to the beginning of the discussion: if the UI
templating engine doesn't take care of generating the actual output, then
what role *does* it have? I mean, expanding the template is a function
already performed by Template Toolkit...
I guess that's the question, isn't it?
Here's what I would like to see (and I think this goes well with the
efforts to move to a more service-oriented front-end):  I would like to see
the template infrastructure be a report and document generator which
abstracts generation of documents in a variety of formats for the
programmer.  That's one responsibility.  It is a narrow one.  We can build
something that then works for the future by ignoring what happens after we
generate the document.
Ok. I like that. As long as the definition of "document" here excludes the
HTML being generated as part of the HTTP response generation. Because
that's an entirely different league of "documents" than the documents I
think the engine should focus on: the user-visible, possibly translatable
or adjustable envelopes, invoices, and loads of reports.
...
Put another way, what I see the module doing is providing an abstraction
around "I want a report" and "btw, I want it in pdf format" or "I want it
as a csv."  Or "I want an invoice" and "Please make it a PDF."
Ok. If you want it maximally flexible for the future, I think we want "I
want an invoice. Please use <this> template. Please make it a PDF" (because
then we can maintain documents which are PDF output from LaTeX while others
can be PDF from HTML).
...
Now to do this it has to do two things:
1.  Maintain knowledge of available formats on the system (and possibly be
able to do so on a per-template basis, something we are currently missing),
and
2.  It has to provide a stable interface for format providers to use to
communicate with it and to the program as a whole.
This means that what we have now for Template.pm in my proposal is
basically a point of dependency injection, which maintains the
responsibility of being aware of what dependencies have used it (PDF, HTML,
etc).  The Template::LaTeX, Template::HTML, etc. then take on the
responsibility of plugging in to allow document generation in a singularly
consistent API.
Ok. But I found Template Toolkit to be extremely flexible (we implemented
it to read from the database instead of from a filesystem, completely
plugin-based!). So, I'm still wondering if this can't be achieved with
Template Toolkit and a few wrappers and plugins.
...
That sounds small and narrow and it is.  But think about what that enables
as well:
1.  Dynamic format detection (What formats do I know about?  PDF, HTML,
PS, OOO, and Gnumeric*)
2.  Dynamic template detection (What formats can give me an invoice with
my current template list?)
* hypothetical new format we just installed
Note we have to provide a shim for old code.
Why? I mean, old code (as in code in old/) typically invokes the template
processor to generate a true document (the HTML responses are "composed"
inline). The difficulty might be existing code in lib/. Although there you
could probably search for 'render_to_psgi' and try to establish if there's
a document or a UI being generated in that specific invocation. (All
'render()' invocations probably are real documents being generated.)
...
But I think the stabilisation and generalisation here would be helpful.
More tangibly, it would allow someone to write an HTML -> PDF template
engine third party, and just have it plugged in if that was still desired
by anyone, or for someone who wants to be able to generate OOO doc invoices
to be able to do so without demanding that everyone else support the
functionality.
At any rate that's what I had in mind.
Thanks. That's much more detailed and as such better understood (by me) as
to where you're headed. So, to reiterate: I think splitting UI and
"document" template engine wrappers into two categories sounds like a
major, good step forward. I'd like us to try really hard to stay away from
shims as something that's more than just a proof-of-concept; in the sense
that it can be a way of hooking something in, but I'd say that the
refactoring is only done when the shim is gone.

Regards,

-- 
Bye,

Erik.

http://efficito.com -- Hosted accounting and ERP.
Robust and Flexible. No vendor lock-in.

Re: [ledgersmb-devel] RFC: Formalize templates interface and split out onto CPAN

Erik Huelsmann