As a couple of notes, I would add the following expansions to this.
"Allow the widest possible version range on any module, disallowing
individual malfunctioning versions to further extend the range"
I think we are better off documenting bugs in misbehaving versions of
modules and perhaps offering a module check utility rather than disallowing
certain versions of modules.
How do you envisage the disallowing taking place? I was actually not
thinking of actively disallowing specific versions, but rather document the
fact that the module doesn't work with LedgerSMB by putting a
version-exclusion in our cpanfile (the canonical location to declare
Right and my point is that this is under-inclusive since it is merely an
I don't understand. You want to replace an under-inclusive mechanism with
an over-inclusive one? Why? I mean: if a user uses our under-inclusive
mechanism, he or she gets a working install (presumably). With an
over-inclusive mechanism there's much more room for problems.
and moreover lacks any forward knowledge of later
Well, when I say "known broken" then in fact it doesn't, because I don't
claim that the list is complete either way (forward or backward). What I
*do* claim is that the knowledge we do have is reflected in cpanfile. We
could put it there in prose, but that doesn't help much as the rest of the
file is intended to be machine-readable. Writing out knowledge in README.md
doesn't make it machine-readable either and feels obfuscated too.
Also I can see possibilities where a package distributed under Debian or
Gentoo might have a backported fix and therefore it might also be
over-inclusive since there might be patched versions with the same version
number which lack the misbehaviour.
Sure. But as you pointed out, the check is install-time only. And then even
only when using a CPAN-tool (cpan, cpanm, cpan<whatever>). Package managers
(dpkg, yum, ...) will not test these dependencies. So for a Debian or
Ubuntu package, nothing happens when they have backported the fix and use
the "declared unsupported" version-with-the-patch-which-makes-it-supported.
But if we do not document known misbehaviour, how are packagers to find
out about these incompatible versions?
I say this for two reasons:
1. Managing this list is annoying and deciding when to block a specific
version of a module is going to be an annoying and political decision, and
Well, so far I've explicitly excluded versions which were known to be
buggy. E.g. there's a version of PGObject-Type-Bytestring which is known
not to work with LedgerSMB, because I've fixed the bug that was exposed
with you. My plan wasn't to go "hunt" for non-working versions.
What I would prefer would be a range of versions where the software was
intended to work together and not make bugs in dependencies the
responsibility of the LedgerSMB cpanfile at all.
Well, I don't think bugs in dependencies are the responsibility of
LedgerSMB at all. What I do think though is that if we want the software to
be used by people, we need to offer them a setup that works out of the box.
Which means we need to do whatever we can to avoid people getting into our
chat channel or mailing list saying "Your software doesn't work: I just
spent 2 weeks getting it to work and there's just no way", where the answer
is "Oh! But if you're using *that* version of module so-and-so, we could
have told you in advance it wouldn't work...".
The reason again being that the bugs in the
conceivably be addressed in a number of different ways (patching the
dependencies, possibly by a packager, upgrading the dependencies, etc) and
we have no knowledge of that.
Correct. And as soon as people don't use the vanilla versions, they're on
their own. We could add documentation to the cpanfile why we exclude
certain versions, though. That could help packagers decide whether or not
the misbehaviour has been fixed in the version in their distribution.
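
As an illustration of such in-file documentation, a minimal sketch of a
commented exclusion follows. It reuses the PGObject::Simple range and the
"breaks our file uploads" note that are already in the cpanfile; the
comment layout is only a suggestion, and no reason is given for 3.0.0
because none is stated in this thread. Note that cpanm reportedly
mishandles this form of declaration, per the comment in the cpanfile.

```perl
# PGObject::Simple: known-bad releases are excluded from the range.
#   3.0.1  breaks our file uploads
#   3.0.0  excluded as well (reason not recorded here)
# Packagers: a distro build carrying a backported fix under one of
# these version numbers may be fine; check it against the notes above.
requires 'PGObject::Simple', '>= 2.0.0, != 3.0.0, != 3.0.1';
```
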
What I still don't understand is if your opinion is that we should just let
people struggle and if not what other mechanism you think we should use to
"encode" our knowledge.
2. A block on a module version will not prevent someone who has an older
version of the module from upgrading to a blocked version since it is an
install-time check and another piece of software could require a newer
version and thus break the install-time check.
Yup. But it does help packagers and the cpanm installer when doing their
How does this affect back-porting of patches to otherwise stable
Suppose for example there is a bug with some version of a common central
dependency (say, something central like the Plack/CGI adaptors) that causes
us grief, but Debian maintainers backport a fix for it. Do we disallow
the patched version?
No, we disallow the original. If the person applying the patch does not
change the version number, then there's no way we can know the difference,
can we? But more importantly, the packager of *our* package can inspect the
dependent packages and decide *not* to follow our dependency advice -
assuming we include sufficient documentation as to why specific versions
have been excluded.
Do we say "you cannot install from our source in this case?" I don't
know what the answer is here. I don't even know if we want to make it our
problem aside from support.
Maybe we could say that we only blacklist module versions in our direct
control (like the PGObject stuff?)
Why is that any better than the "general module case"? Do we control
packages in Debian not getting patched for PGObject?
So I would suggest deciding required versions by
and providing a tool or just documentation for checking for other
problems in the meantime.
To what extent would that tool be different than 'cpanfile' and our
dependencies test "xt/01.2-deps.t"? (It's in xt/ which could be a reason
to develop an additional test for the actually installed versions...)
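
One shape such an additional check could take, sketched under
assumptions: a small script that reads cpanfile and compares the declared
ranges against whatever is actually installed. It assumes
Module::CPANfile and Module::Metadata are available and only looks at
runtime 'requires' entries; the script and its output format are
hypothetical, not an existing LedgerSMB tool.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use Module::CPANfile;
use Module::Metadata;

# Path to the cpanfile; defaults to the one in the current directory.
my $file = shift // 'cpanfile';

# Collect the runtime 'requires' declarations as one requirements object.
my $reqs = Module::CPANfile->load($file)
    ->prereqs
    ->merged_requirements(['runtime'], ['requires']);

my $failures = 0;
for my $module (sort $reqs->required_modules) {
    next if $module eq 'perl';    # the interpreter itself is not a module

    # Locate the module in @INC without loading it.
    my $meta = Module::Metadata->new_from_module($module);
    if (!$meta) {
        print "MISSING  $module\n";
        $failures++;
        next;
    }

    # Modules without a $VERSION are treated as version 0.
    my $installed = $meta->version // 0;
    if ($reqs->accepts_module($module, $installed)) {
        print "ok       $module $installed\n";
    }
    else {
        my $wanted = $reqs->requirements_for_module($module);
        print "REJECTED $module $installed (declared: $wanted)\n";
        $failures++;
    }
}

exit($failures ? 1 : 0);
```

Unlike the install-time check, this can be rerun at any point after
installation, so it would also notice a later upgrade into an excluded
version. A packager who has backported a fix would simply ignore the
corresponding REJECTED line.
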
Secondly I would like to suggest a focus on moving things to CPAN
and breaking the application into components for better testability.
I'm all for breaking it up into components. However, I'm thinking that
moving those components to CPAN is really only useful when the component
has a function outside of LedgerSMB (i.e. is generic enough).
Certainly our reporting engine would be.
The formats, you mean, I take it? Or do you mean a broader scope? If it's
the formats, I really wonder what they add on CPAN: as an example, after my
refactoring on master, the LaTeX format is no more than 30 lines (less the
escaping routine). But there's another thread waiting for that discussion.
I'll respond in that thread.
Also a lot of tooling around LedgerSMB (such as what goes in the setup.pl)
might be something we might want to break apart and have as separate
management apps supporting many stable versions. Our templating and mailer
modules might be. And then the question becomes whether something like
contact management is general enough to be modularised and spun off.
But my larger point here is that this generates some work for downstream
packagers and we should also think about how to make this manageable
there. I think we also want to think about the issues of spinning things
off as a part of dependency management because at that point we are
generating our own dependencies.
Yes, we have spun off our own dependency with PGObject as well. So, yes, I
think that's a good thing to think about. My approach was that we think
about it when we spin off the code that can be spun off; possibly we need
to change the dependency declaration "rules" when we do. Currently though
we don't have one and we're not spinning off more code today (are we?). So,
my idea was to solve the problem that we're having today and when we want
to make things more complex tomorrow, then we solve the added complexity in
its full breadth (meaning including dependency management).
"Require as few as possible modules in the expanded dependency tree
modules as direct dependencies which are already depended on
Additionally I think that when we spin things off we should shoot for
modules which are simple in interface, within a year or two will likely be
pretty close to bug free and unlikely to require dramatic modifications,
and therefore keep spin-offs throttled so they do not cause undue chaos for
The reason to put this point into the policy is actually not mainly for
packagers (although, as you note, it does help). The main reason to put it
there is to support those installing from source. The current dependency
list for LedgerSMB is huge. This doesn't mean it shouldn't grow new
dependencies, but my point here is that we can - sometimes even
significantly - reduce the dependency tree by being careful about which
immediate dependencies we choose.
This actually brings me to a reason we may want to be thinking about
pushing as much off (eventually) into external dependencies as we can.
Right now we have a huge direct dependency tree. This is a problem for a
number of reasons. It makes it hard for people to install from source. It
also creates a whole lot of extra work for packagers.
Is our dependency tree in general a problem? Or the fact that we have a
huge *direct* dependency tree?
For example when creating the Gentoo overlay I am expecting the vast
majority of my time to be managing and testing our dependency tree.
Packaging the dependencies is no problem. Making sure I have all of them
and that they all work as expected. That's the problem.
But doesn't that problem remain when you make the dependency tree a little
bit more indirect by splitting off the mailer? I mean, the mailer still has
the original dependencies, we now depend on the new mailer, so, all in all
we added 1 more dependency to manage, right?
But now, imagine that we could take the reporting and
and spin them off. How many dependencies could we push to indirect by
doing so? How much easier would it be to test installation of optional
capabilities? By my count we could reduce direct dependencies by at least
2 and optional dependencies by 4 or 6 depending on how we do it. If we
spin off the reporting engine as a whole we could perhaps reduce
dependencies long-term by at least 10 (4 standard and 6 optional).
I'm less optimistic there, because I'm assuming that at least for a number
of years to come, we'll be the only ones depending on those intermediate
dependencies, which means we basically just added complexity to the mix.
" Not depend directly on modules which have
Big caveat here is that some modules that are used *because* they are
overlapping might be in a different position. Moo/Moose is a good example
and both are widely used enough that there is no harm in using Moo instead
of Moose in spun-off modules. However we still have some old paradigms in
the code that need to be cleaned up and we should also focus on this.
Well, the point about "depend directly" is that our tree either depends
on Moose *or* on Moo, but not both. Whether or not our dependencies
themselves depend on the other of the two, is up to our dependencies, of
If we incubate modules we are spinning off, we may want to depend on Moo
and Moose temporarily because the justification for adding Moose to a
spun-off dependency is going to be a bit higher than Moo.
Fair enough. While the policy in my opinion should be taken seriously, I
also think that we should still allow ourselves to be reasonable. This is a
case where I think the policy should probably be suspended for a certain
period of time. I'm thinking that we'd need to discuss that here on a
case-by-case basis in order not to become too hand-wavy with the policy.
"Not require modules which only provide
Also, while I am happy to see optional modules used to provide
nice-to-have functionality it is worth noting that there are two things
that need to be guarded against here. The first is extra dependencies.
The second is extra code complexity that comes with adding such options. I
would say that both need to be justified before we should allow them and
where we do allow them we should think about how to set up proper
interfaces so we don't have lots of conditional module handling everywhere.
Ok. This is a broader point than our dependency management. I think this
is a point to be put in the category "what should we consider when thinking
about adding new features/functionalities?"
"Group feature dependencies into their respective features as much as
possible (so as to not require them for a more basic installation)"
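
In cpanfile terms this grouping maps onto 'feature' blocks. A minimal
sketch, in which the feature identifier, its description and the module
inside it are purely illustrative assumptions rather than LedgerSMB's
actual declarations:

```perl
# Core dependency: needed for every installation.
requires 'PGObject::Simple', '3.0.2';

# Optional feature: only pulled in when explicitly requested.
# 'latex-pdf' and LaTeX::Driver are hypothetical examples.
feature 'latex-pdf', 'PDF output through LaTeX' => sub {
    requires 'LaTeX::Driver';
};
```

With cpanm, something like "cpanm --installdeps --with-feature=latex-pdf ."
would then request the optional set, while a plain "--installdeps ."
leaves it out.
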
A corollary here might be that long-run a lot of the application today
should be relegated to optional features as we go forward.
If someone doesn't need inventory tracking, why install it? If one doesn't
need more than GL, customers, vendors, and basic reporting, why install
more than that?
Yup. That does work. Personally I think that installing inventory
tracking (if it doesn't add many new dependencies), shouldn't be a big
problem; we're in the terabyte era and installing one such feature is just
a few (hundred) kilobytes. Also, having the feature installed might actually
reduce code complexity. However, being able *not* to install the feature
would seem to require some infrastructure to create, install and register
components. Which might work out great for (third-party) components which
are yet to be written.
Of course we aren't there yet but I would suggest that in a
forward-thinking view of dependencies we might want to think about it not
only regarding the dependency problems we encounter with maintaining and
working on the current code but also in the question of how modularised and
loosely coupled we may want the code to be in the future.
My approach here has been that the problem of dependency declaration in
a system with loosely coupled components is to be part of a design of such
a module system; that is, for now, I haven't been looking too far into the
future and trying to make the best out of the current situation for current
package managers and installers-from-source. The immediate trigger for my
previous mail was a discussion on #ledgersmb with Yves and David about the
fact that Yves declared the minimum version of PGObject to be 2.0.2. I'm
against requiring such a new version as the bare minimum since the code
base supports anything at or newer than 1.403.2. Hence my idea about
requiring a policy with a solid reasoning of why we do things the way we
do. Then we also have a good foundation to evaluate the minimum PGObject
Agreed. I am not even sure we should blacklist versions just because they
are broken on CPAN because there could be patched versions packaged in
distros. Mere warnings in logs are not enough to bump a version.
Right. I'm not saying we should be checking the version on each request or
even on each Starman start-up. I'm restricting it completely to
installation time and package-time documentation (but machine-readable).
Looking at the cpanfile, it appears to be the common
because apparently blacklisting individual versions doesn't work. See the
# cpanm doesn't handle our true dependency declaration correctly:
# PGObject::Simple 3.0.1 breaks our file uploads
#requires 'PGObject::Simple', '>=2.0.0, !=3.0.0, !=3.0.1';
#requires 'PGObject::Simple::Role', '1.13.2';
# so we use:
requires 'PGObject::Simple', '3.0.2';
requires 'PGObject::Simple::Role', '2.0.0';
I think we would do better to require PGObject::Simple 2.0.0 and up and
leave it to packagers to deal with the 3.0.0/3.0.1 breakage for example.
The reason that's there the way it is, is because we test with Travis CI
and we let Travis CI handle the dependency tree setup completely by way of
cpanm. If we were to force-install 3.0.2 before running the rest of the
dependency installation, or if we managed the complete dependency tree
ourselves - as a packager would - then we could return to the commented-out
require statements. In other words, it's the fact that cpanfile is
overloaded in the CI process which causes the commented lines to be
commented. Ideally, we manage the installation of dependencies on Travis CI
ourselves (at least partially); then I can remove the work-around lines and
uncomment the original lines.
Thanks for your response! I think it's good to have this discussion in
order to have a common understanding of what our respective expectations
and goals are.