...

  • disk usage could go up
    • See explanation, below.
  • if you're not using Linux, you need a virtualization layer such as VirtualBox
  • running systemctl or systemd units inside a container seems impossible or at least very difficult
  • following best practices means a lot of extra infrastructure
    • E.g., the suggested basic Magnolia setup (1 author, 2 publics, each with its own database) would require 6 Docker containers: 3 Magnolia instances plus 3 databases. Those 6 containers would have to be strung together with Docker Compose, and Docker Compose does not seem to work across multiple hosts, so you'd also need to set up a Docker swarm.
  • to fully utilize CI/CD, might need an enterprise-level Docker license
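
The container count in the Magnolia example above can be sketched as follows. This is only an illustration of the arithmetic, assuming one application container and one database container per Magnolia instance; the service names (`author`, `public1`, `public2`, and the `-db` suffix) are hypothetical and not taken from any official Magnolia or Docker setup.

```python
# Hypothetical instance names, chosen for illustration only.
MAGNOLIA_INSTANCES = ["author", "public1", "public2"]


def build_services(instances):
    """Return a Compose-like mapping: one app container plus one
    database container for every Magnolia instance."""
    services = {}
    for name in instances:
        db_name = f"{name}-db"
        # Each instance gets its own database, never a shared one.
        services[db_name] = {"role": "database"}
        services[name] = {"role": "magnolia", "depends_on": [db_name]}
    return services


services = build_services(MAGNOLIA_INSTANCES)
for name, spec in sorted(services.items()):
    print(name, spec)
print(len(services), "containers")
```

Every additional public instance adds two containers under this assumption, which is where the orchestration overhead comes from.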


Info
From a discussion with Nicolas B. in Pre-Sales room:

...

[9:48 AM] Nicolas Barbé: In short, there are two issues: 1. Magnolia is stateful 2. You cannot run two Magnolia instances on the same DB. Because of that, we can't leverage advanced cloud features which are provided out of the box by k8s or AWS, such as auto-scaling or B/G deployment. Our customers have to implement a lot of glue code to make it work, which kills the argument for using such platforms. With container-based orchestration, the situation is even worse, because the orchestrator can dynamically reallocate containers to different hosts. This is how failover is implemented and how the whole cluster scales. Magnolia instances must be declared as "fixed" instances, which cannot be moved to a different host. Which again kills all the advantages of having k8s (or another container-based orchestrator).

Our customers ask for k8s because they have invested in this technology to leverage these otb features and to have a consistent way of managing "services".

[9:50 AM] Nicolas Barbé: To be more complete: JCR clustering is broken, and a k8s StatefulSet is not something you want to use with Magnolia

[10:13 AM] Nicolas Barbé: Sure, it's written in the Jackrabbit documentation itself https://wiki.apache.org/jackrabbit/Clustering
[10:15 AM] Nicolas Barbé: In short, JCR clustering uses a log. This log is used to spin up new instances and sync them (index) relatively quickly. The log grows quickly; to clean up the log, you need to activate the janitor. If you do so, you can't spin up new instances since part of the history will be missing
[10:16 AM] Nicolas Barbé: Plus, even with this mechanism, creating the new instance is not straightforward.
[10:17 AM] Nicolas Barbé: Good news is that things are different with Oak, I don't know if other JCR implementations have the same issues

[10:18 AM] Jan Haderka:

    |  you can't spin up new instances since part of the history will be missing

[10:18 AM] Jan Haderka: that's just partially true.

[10:19 AM] Jan Haderka: you can still spin up new instances, but from a snapshot taken from a synced cluster node after the last janitor run

[10:20 AM] Jan Haderka: and there's other workarounds.

[10:22 AM] Nicolas Barbé: yes true, that's what i meant with "glue code" earlier

[10:27 AM] Nicolas Barbé: k8s and similar tools come with a cost, customers expect to get an ROI out of that, mainly by not writing glue code anymore. Actually in k8s there is no good way to trigger the glue code (unless it has changed)

[10:28 AM] Nicolas Barbé: Or they have to deal with the low-level k8s API, which limits their ability to upgrade the cluster.

...

Explanation

How Docker works: There are many prebuilt Docker images available online (and you may build and contribute your own). For any task or service you want to run (say, mysql or tomcat), there's probably already an image available. When you start a container from an existing image, Docker uses a strategy called 'copy-on-write', which is essentially lazy copying: the container just points at the image's read-only layers and gets a thin writable layer of its own, and a file is copied into that writable layer only when the container actually modifies it. This makes startup pretty fast.
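
The lazy copying described above can be illustrated with a small toy model. This is a simplified sketch and not how Docker's storage drivers are implemented: the `Image` and `Container` classes and the file path are made up for illustration, and real images consist of several stacked layers rather than one dictionary.

```python
class Image:
    """Read-only set of files, shared by every container started from it."""

    def __init__(self, files):
        self._files = dict(files)

    def has(self, path):
        return path in self._files

    def read(self, path):
        return self._files[path]


class Container:
    """Holds a reference to an image plus a private writable layer."""

    def __init__(self, image):
        self.image = image   # just a pointer; no image data is copied here
        self.writable = {}   # starts empty, which is why startup is cheap

    def read(self, path):
        # The private layer wins; otherwise fall through to the shared image.
        if path in self.writable:
            return self.writable[path]
        return self.image.read(path)

    def append(self, path, text):
        if path not in self.writable:
            # First modification: copy the file up into the private layer.
            original = self.image.read(path) if self.image.has(path) else ""
            self.writable[path] = original
        self.writable[path] += text


base = Image({"/etc/motd": "hello\n"})
a = Container(base)
b = Container(base)
a.append("/etc/motd", "from a\n")

print(repr(a.read("/etc/motd")))  # a sees its own modified copy
print(repr(b.read("/etc/motd")))  # b still reads the shared original
print(len(a.writable), len(b.writable))
```

Only container `a` ends up holding a private copy of the file; `b` and the image itself are untouched, so extra storage is used only for what a container actually changes.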

...