____Review and audit guidelines

This review and audit guideline is intended for Solution Architects who will be performing the review.

If you want to learn more about the guidelines and how to go through it, contact Andrew Warinner

Description

This is a guide for reviewing a customer's Magnolia project sources. It is based on this Project Audit Matrix but is more focused on the code and files of a Magnolia project than the deployment and performance of the project.

IMPORTANT: Feel free to add or modify any of the checks or review items. Your expertise is needed!

Goal

The point of the review is to give the customer feedback on their Magnolia project. Some of that feedback may be stylistic or opinionated, and the customer may ignore some or all of it. The point of the review identify things that vary from what we consider as best practice, gauge their importance or severity, and recommend fixes or alternative approaches.

Each review item is classified:

Best practice: Magnolia recommended - organization and approaches corresponding to the Magnolia roadmap and avoidance of common Magnolia problems.

Best practice: clean code - general guidelines for writing clean code (usually Java), not necessarily Magnolia specific but helpful to maintaining your Magnolia project.

Each review item is evaluated by severity or significance:

Problem: immediate correction recommended - a problem that should be corrected as soon as possible, may cause significant headaches if left as is.

Problem: correct soon - a problem that should be addressed in the near future, may cause difficulties in using a new Magnolia release or not be in alignment with the Magnolia roadmap.

Problem: minor concern - a stylistic or cosmetic problem that can affect the readability of code

The review covers the following areas:

Magnolia project setup
Magnolia module organization
Magnolia light development
Magnolia themes
Magnolia Freemarker templates
Magnolia Java code
Magnolia legacy
Magnolia operation (optional)

Review: Magnolia project setup

Check	Why	What	Notes
Project uses the most recent version of the Magnolia major release.	It is important to keep up with bug fixes and minor improvements in your chosen Magnolia release.	Best practice: Magnolia recommended
Project uses Maven to build deployable web app.	While it is possible to use other build automation tools, Magnolia uses Maven and ensures that dependencies are assembled correctly into a runnable web app.	Best practice: Magnolia recommended
Maven dependencies are clearly organized and documented in Maven pom.xml files.	Clear organization and comments help in understanding dependencies and their use.	Best practice: Magnolia recommended
Versions for Maven dependencies should be parameterized and collected in a central location, ideally in the root pom.xml.	Parameterizing module versions and collecting into one location helps identify Magnolia module dependencies.	Best practice: Magnolia recommended	Regex to find literal versions: <version>[^\$]
Module definition should use placeholders for module versions, versions should be defined in root pom.xml.	Parameterizing module versions and collecting into one location helps identify Magnolia module dependencies.	Best practice: Magnolia recommended
Module definition should not use hardcoded versions.	Parameterizing module versions and collecting into one location helps identify Magnolia module dependencies.	Best practice: Magnolia recommended

Review: Magnolia module organization

Check	Why	What	Notes
A version handler is defined for each Magnolia module.	Version handlers should handle any configuration updates to an already installed module.	Best practice: Magnolia recommended
Bootstrapped content should be local to module, not dependent on content added by other modules.	It’s a good idea to build self-contained Magnolia modules that have their own configuration to localise or isolate each module so changes do not affect multiple modules.	Best practice: Magnolia recommended	Use taskcount.pl to find and count tasks used (see below)
A version handler should not bootstrap an entire file.	Version handlers should use the same bootstrap files as used in the first install of the Magnolia module and use BootstrapSingleModuleResource with a subpath to update specific configuration. Using BootstrapSingleModuleResource without specifying a subpath to update content in a version handler can cause problems during a fresh installation of a module.	Best practice: Magnolia recommended	Use taskcount.pl to find and count tasks used (see below)
Bootstrap tasks in a version handler should not specify an import UUID policy.	Version handlers should use the same bootstrap files as used in the first install of the Magnolia module and use BootstrapSingleModuleResource with a subpath to update specific configuration. Using BootstrapSingleModuleResource without specifying a subpath to update content in a version handler can cause problems during a fresh installation of a module.	Best practice: Magnolia recommended
Avoid using PartialBootstapTask.	PartialBootstrapTask was deprecated in Magnolia 5.4.4 and will cause problems in Magnolia 6.2+. Uses of PartialBootstrapTask should be replaced with BootstrapSingleModuleResource(String name, String description, String resourceName, String subPath).	Best practice: Magnolia recommended	Use taskcount.pl to find and count tasks used (see below)
Node and property tasks are preferred to bootstrap tasks in a version handler.	Using node and property tasks can make small configuration changes and avoid bootstrap problems.	Best practice: Magnolia recommended
Version handlers should avoid bootstrapping content into other modules.	Bootstrapping content into other modules by the version handler introduces dependencies between modules. While Magnolia can handle dependencies between modules at first install and at update, bootstrapping can fail without notable errors if module dependencies are not correctly defined.	Best practice: Magnolia recommended
Use decoration rather than bootstrapping to modify the configuration of other modules.	Decoration is less invasive than bootstrapping when modifying the configuration of other modules. Decoration modules easier to install and remove.	Best practice: Magnolia recommended
Version handlers should use node and properties predicates to check configuration before making modifications.	Version handlers can change anything in Magnolia’s configuration and should check for changed configuration before making their changes to avoid overwriting existing configuration changes.	Best practice: Magnolia recommended	Use taskcount.pl to find and count tasks used (see below)
Version handlers should use version specific tasks to make changes.	Version specific tasks predicates indicate what changes were made with each version and give a running history of what was changed in specific versions.	Best practice: Magnolia recommended	Use taskcount.pl to find and count tasks used (see below)
The Magnolia module descriptor contains dependencies corresponding to bootstrap content contained in the module.	If it is necessary to bootstrap content into other modules, there should be module dependencies defined in the module descriptor to ensure the correct first installation of the module.	Best practice: Magnolia recommended

Review: Magnolia light development

Check	Why	What	Notes
Templates defined through Light Development are preferred.	Using Light Development is easier to create, modify and uninstall templates than definition through JCR configuration.	Best practice: Magnolia recommended
Dialogs defined through Light Development are preferred.	Using Light Development is easier to create, modify and uninstall templates than definition through JCR configuration.	Best practice: Magnolia recommended
Avoid excessive includes in YAML definitions.	While using includes in YAML allows common content to be shared, too many includes can become confusing and make the YAML definition difficult to understand. Changing included YAML may lead to unexpected changes in other YAML definitions.	Best practice: clean code	use inccheck.pl (see below)
Avoid recursive includes in YAML definitions.	Recursive includes - included YAML files including other YAML files) - can be difficult to troubleshoot and may obscure dependencies between YAML definitions.	Best practice: clean code	use rinccheck.pl to check for recursive includes (see below)
Follow a naming convention between templates, dialogs and YAML definitions.	Using a naming convention, such as using the same name for a template, dialog and YAML definition, helps in organizing and understanding your light content.	Best practice: clean code	use templatecheck.pl (see below)
Avoid use of label properties in YAML definitions.	Use auto-generated keys for internationalization instead of labels to reduce the size of your YAML definitions. cf. https://documentation.magnolia-cms.com/display/DOCS/Generic+i18n+keys+and+their+list#Generici18nkeysandtheirlist-Dialog	Best practice: clean code	use ldcheck.pl (see below) use labelxref.pl to find i18n label usage.
Avoid hardcoded text in label and description properties.	Use I18N keys or auto-generated keys for internationalization.	Best practice: clean code	use ldcheck.pl (see below)
The Definitions app should report no problems with a module.	The Definitions app will note any problems Magnolia finds in a light module.	Best practice: Magnolia recommended

Review: Magnolia themes

Check	Why	What
A Magnolia theme is contained within a single module.	A theme should be contained by itself in the module to isolate it and future changes to the theme.	Best practice: Magnolia recommended
Theme resources are used in one common central script.	Theme resources should be used in the prototype's script or a central script for all templates of the same site.	Best practice: Magnolia recommended
Avoid legacy resource processing.	Resource processing was removed from the standard Resources module. Even though Processed Resources app module brings back this functionality, it's not part of preconfigured Magnolia bundles and it is recommended to move the business logic to page/components model classes or templating functions whenever possible and make use of tools like grunt/sass/etc for static resource processing.	Best practice: Magnolia recommended
A defined theme is associated with a site.	A theme should be associated with a site.	Best practice: clean code

Review: Magnolia Freemarker templates

Check	Why	What
Freemarker templates should contain HTML only.	Templates should not define Javascript or non-HTML content. Javascript should be managed through resources.	Best practice: Magnolia recommended
Freemarker templates make use of i18n labels if multilanguage is supported.	If multi language is being used, templates should use the i18n framework for displaying labels and the i18n properties files should be in the same module as the Freemarker templates.	Best practice: Magnolia recommended
Freemarker templates make use of includes functionality when possible.	As with any programming language, duplicated code should be avoided. In Freemarker, the includes functionality helps you avoid code duplication.	Best practice: clean code

Review: Magnolia Java code

Check	Why	What	Notes
No use of deprecated classes or methods from Magnolia.	Use of deprecated classes or methods can cause future problems in your Magnolia project. It is best to keep up to date with the Magnolia API as it evolves.	Best practice: Magnolia recommended
Java code should conform to general clean code conventions.	Java coding conventions vary widely, however, some general guidelines for code readability, structure and maintainability should be followed.	Best practice: clean code	See notes on automated code review tools below.
Filter implementations should be efficient and performant.	A non-optimized integration done in a filter can end up in a performance problem. Also, the order in which the filters are placed can cause performance problems.	Best practice: Magnolia recommended

Review: Magnolia legacy

This section should be expanded!

Check	Why	What	Notes
Magnolia STK is not used.	The Magnolia STK reached end of life on December 31, 2018 and is no longer supported.	Best practice: Magnolia recommended

Review: Magnolia operation (optional)

Note: this section is being developed and will be expanded in the future. Customers don't usually provide access to their development, test and production environments so this part of the review may need to be done in discussion with the customer.

Check	Why	What
Magnolia public instance is regularly backed up.	Magnolia backups can be used to restore content on a failed Magnolia public instance or new Magnolia instance.	Best practice: Magnolia recommended
Backup WAR, filesystem and DB at the same time	Magnolia uses three different locations to store its contents/configs/files: WAR for the compiled Java code, templates, definitions and resources filesystem for JCR configs, indexes and binary contents Database for most of the contents This means that all of three have to be backed up at the same time. A snapshot of a virtual machine holding all is recommended.	Best practice: Magnolia recommended
Avoid using modules and bootstrapping to populate content.	Creating content with bootstrapping and Magnolia modules can be error prone and doesn't work well in a multi-environment (development, test, and production).	Best practice: Magnolia recommended
A restore or recovery procedure, preferably automated, is in place.	Recovering from a major malfunction of a Magnolia environment is necessary for high availability operations. Implementing a disaster recovery strategy and taking regular backups goes a long way towards ensuring availability. Recovery process is necessary is deployments go wrong. Every time a deployment is done, a backup should be available and be possible to be recovered in case of failure.	Best practice: Magnolia recommended
Public instances should set `magnolia.develop=false`.	Setting magnolia.develop=false will improve performance on public instances.	Best practice: Magnolia recommended
The `magnolia.resources.dir` property should not include the tmp directory.	The directory specified by magnolia.resources.dir should point to a directory containing light module content. Changes to this directory will be detected by Magnolia, including the tmp directory used by publication and activation will consume additional resources. If the tmp directory must be placed under magnolia.resources.dir, use the Magnolia property `magnolia.resources.filesystem.observation.excludedDirectories` to exclude it from observation.	Best practice: Magnolia recommended
Production deployments should use the Magnolia certified stack	The environments are using the Certified stack for operating system, java version, application server and database.	Best practice: Magnolia recommended
Magnolia instances should have sufficient heap settings.	The JVM of each instance has at least 2GB assigned.	Best practice: Magnolia recommended
Production Magnolia instances use relational databases for the JCR repository.	In-memory and file system based databases should only be used on environments other than production.	Best practice: Magnolia recommended
The network transmission time from Magnolia to the database hosting the JCR repository should be minimized.	To ensure fast transactions between Magnolia and the content store the database should be on the same physical machine as Magnolia.	Best practice: Magnolia recommended
Traffic to Magnolia public instances should be distributed by a load balancer.	A distributed instance setup allows you to respond to high availability requirements and sudden increases in traffic. There should be at least 2 publics.	Best practice: Magnolia recommended
Avoid Magnolia author clustering.	While clustering the authoring environment can increase availability for hundreds of concurrently working editors, it has little benefit for disaster recovery since all nodes in the cluster share one single repository and any sort of corruption to the repository will bring down all the cluster nodes.	Best practice: Magnolia recommended
Stage and QA Magnolia environments are used.	The QA environment is intended for user acceptance testing prior to a release. The staging environment is used to ensure stability and non-regression for a release.	Best practice: Magnolia recommended
Git or another version control system is used for Magnolia WAR sources and light development sources.	All files needed to build a WAR file and/or install light modules is managed by version control system like Git.	Best practice: Magnolia recommended
Jenkins/Bamboo or another build/continuous integration platform is used to build the Magnolia web app.	Magnolias deployment on different environments is managed by a platform like Jenkins/Bamboo with a release based approach.	Best practice: Magnolia recommended
Magnolia author instance and public instances use the same built WAR web app.	The WAR file contains the compiled Java code, templates, definitions and resources necessary for the project to function correctly within Magnolia, or to adapt the existing configuration to changes introduced with a new release. Building different versions of the WAR web app for author and public instances can introduce discrepancies and incompatibilities between author and public instances.	Best practice: Magnolia recommended
Light modules should be deployed along with the Magnolia web app.	You can use WAR for Java and bootstraps, while you can use light modules for front-end development. This combination can be deployed together with an additional trigger in the orchestration service to trigger light module deployment via Git update on deployment of a new WAR.	Best practice: Magnolia recommended
Light modules reside on a shared file system.	Git repository on a shared mount or alternatively, that each environment to be configured with access rights to any kind of Git repository you might be using and having a clone of such repository locally.	Best practice: Magnolia recommended
Integration testing of Magnolia rendered web pages is recommended.	Integration testing can be performed using Magnolia’s Selenium test packages: UI tests	Best practice: Magnolia recommended
Use blue / green deployment for Magnolia public instances in production environments.	Passive author and public instances are used when updating QA and production environments in order to reduce down time when swapping old releases for new ones.	Best practice: Magnolia recommended
Disable publication while updating or deploying public instances in production environments.	Magnolia’s workflow is flexible and configurable enough to allow the implementation of a verification process that will deactivate the publication process during the time that the passive author instance gets updated.	Best practice: Magnolia recommended
Sync of passive and active instances while updating	Even if there were an activation or publication lock set on the active author, as described previously, there may still be items queued for publication or changes made to content by authors on the active author instance.	Best practice: Magnolia recommended
Cache everything but dynamic content.	Cache improves performance greatly by using a filesystem based cache to serve previously rendered contents. Ensure that cache is enabled for all static pages.	Best practice: Magnolia recommended
Magnolia caches should only be flushed when necessary.	Flushing policy can impact the performance of the cache greatly and hence Magnolia's ability to serve content. The default policy will flush all contents registered, which would generally mean all pages and assets. Ensure that flushing only happens when pages/assets or contents used in pages are activated.	Best practice: Magnolia recommended

Review tools

Checking lots of source files by eyeball is boring and difficult to do consistently and accurately. While not all of the above checks can be automated, some checks be automated.

You can find all the latest versions of the review scripts mentioned above here: GIT repo: audit-scripts

inccheck.pl - a perl script to count includes in YAML definitions.

Pass the paths of YAML files to this script and it will report files with excessive numbers of includes.

Command line options:

inccheck.pl [-t <include threshold>] [-a] [-v] [-V] [-r] [-p <path prefix] <YAML file...>

-t <include threshold>

Sets the threshold for the number of includes that will be reported, the default is 0, meaning any inclusion will be reported (probably not what you want).

-a

Reports all inclusions in the input files, overriding the setting of the include threshold (-t).

-r

Check for recursive includes (e.g. including a definition that contains inclusions).

-p

Remove the path prefix from the YAML file path. Since the YAML file path may not be the same as the final resource path. Removing the path prefix should be used to transform a file path to the resource path of the YAML file so the check for recursive inclusions (-r) will be correct.

-v

Generate debugging output.

-V

Generate much more debugging output.

rinccheck.pl - a perl script to check for recursive includes in YAML

Print out an include tree for YAML files. For each file, any included YAML will be listed with the indentation and each included file will show its inclusions, ad nauseum.

Command line options:

rinccheck.pl [-v] [-V] [-p <path prefix] <YAML file...>

-v

Generate debugging output.

-V

Generate much more debugging output.

-p <path prefix>

Remove the path specified from the output.

templatecheck.pl - a perl script to check the naming of templates and definitions

Pass the paths of YAML definitions to this script and it will check the naming of definitions of templates, definitions and dialogs.

Will report:

if the template definition name is different from the Freemarker template script name
if the template script in the YAML definition does not exist
if the dialog definition has been defined
if the dialog definition does not exist

You can perform individual checks using the -N, -F, -D options or use the -A option to check for everything.

Command line options:

templatecheck.pl [-A] [-N] [-F] [-D] [-p <template path prefix>] [-d <dialog path prefix>] [-v] <YAML template definitions...>

-A
Check all (dialog existence, Freemarker existence and Freemarker/template definition naming. Note: is equivalent to -N -F -D.

-N

Check if the naming of the Freemarker template matches the naming of the template definition (e.g. mytemplate.yaml → mytemplate.ftl)

-F

Check if the Freemarker template in the template definition exists.

-D

Check if the dialog is defined in the template definition and the dialog YAML file exists.
-p <template path prefix>

Path prefix used to locate the corresponding Freemarker template for the YAML template definition.

-d <dialog path prefix>

Path prefix used to locate the corresponding dialog definition for the YAML template definition

-v

Generate copious debugging output

Example execution

perl ~/audit-scripts/templatecheck.pl -A \
-p ~/magnolia-audits/client/project/light-modules/ \
-d ~/magnolia-audits/client/project/light-modules/global/dialogs/ \
templates/components/landingText/landingText.yaml \
templates/components/articleCallout/articleCallout.yaml \
templates/components/video/video.yaml \
templates/components/card/card.yaml

Note: The template path is the root of the common light module directory. Template script references are literal. ie. /name-of-module/path/within/module

The dialog path assumes all dialogs are in the same module as the template definition. Dialog references use a registry and begin with a module name. ie. name-of-module:path/within/dialogs/folder

ldcheck.pl - a perl script to check for various Light Development best practices, including:

use of "renderType: freemarker" properties (not necessary, could use default)
hardcoded values for label and description properties (should use auto generated I18N keys)
empty label properties
label and description values using I18N keys that do not correspond to auto generated I18N keys (auto generated I18N keys preferred)
label and description values using auto generated I18N keys (unnecessary)

Command line options

ldcheck.pl [-v] <YAML template definitions...>

-v

Generate copious debugging output

bsanalysis.pl - a perl script to check for module bootstrapping

Checks for modules bootstrapping content into other modules.

Command line options

bsanalysis.pl [-v] [-s] [-r <project root directory>]<directory>

-v

Generate lotsa debugging output

-s

Check the bootstrap samples directories as well. If not specified, the bootstrap samples directories will be ignored.

-r

Specify the project root directory. If not specified, the current working directory will be used.

taskcount.pl - count tasks used in version handlers

Counts each task used in version handlers and prints out a listing of tasks plus how many times they were used.

Command line options

taskcount.pl [-v] [-t] <version handler source file>

-v

Generates debugging output

-t

Print out the tasks found and their counts across all version handlers supplied.

labelxref.pl - a perl script to cross reference usage of i18n labels in yaml definitions

Searches for "label:" references in input files and outputs the label values and the files it is used in.

Command line options

labelxref.pl -v <yaml definition files>

-v

Generate beaucoup debugging output.

Automated Java code review tools

There are many automated code review tools out there. I chose PMD:

https://pmd.github.io

PMD is not necessarily better or worse than any other tool out there, but it is useful because:

it's open source (versus a commercial product requiring a licence)
flexible, you can choose which rule sets you use to analyze Java sources
can be run from the command line

The goal is use the automated code review tools is locate sources that should be inspected by you (the human).

Automated code review tools can produce tons of output about nit picky details, so it is best to use rules/standards for obvious problems or poor style.

The following PMD rule sets were the best for finding problems / poor style without generating too output:

Basic: The Basic ruleset contains a collection of good practices which should be followed.
Code Size: The Code Size ruleset contains rules that find problems related to code size or complexity.
Import Statements: These rules deal with different problems that can occur with import statements.
J2EE: Rules specific to the use of J2EE implementations.
Security Code Guidelines: These rules check the security guidelines from Sun, published at http://java.sun.com/security/seccodeguide.html#gcg
Unnecessary: The Unnecessary Ruleset contains a collection of rules for unnecessary code.
Unused Code: The Unused Code ruleset contains rules that find unused or ineffective code.

NOTE: pmd v5.* included the above rule sets. However, pmd v6.* deprecated these rule sets. You can run them but pmd will complain about using deprecated rules.

Here is a custom rule set definition that includes all rules from the above rule sets and can be executed without pmd complaints:

Bechtle review (XML)

Review templates

Project review template (Pages)

An Apple Pages document containing a boilerplate description of the review checks described above.

A good starting point for generating a review document.

Page tree

____Review and audit guidelines

Description

Goal

Review: Magnolia project setup

Review: Magnolia module organization

Review: Magnolia light development

Review: Magnolia themes

Review: Magnolia Freemarker templates

Review: Magnolia Java code

Review: Magnolia legacy

Review: Magnolia operation (optional)

Review tools

inccheck.pl - a perl script to count includes in YAML definitions.

rinccheck.pl - a perl script to check for recursive includes in YAML

templatecheck.pl - a perl script to check the naming of templates and definitions

ldcheck.pl - a perl script to check for various Light Development best practices, including:

bsanalysis.pl - a perl script to check for module bootstrapping

taskcount.pl - count tasks used in version handlers

labelxref.pl - a perl script to cross reference usage of i18n labels in yaml definitions

Automated Java code review tools

Review templates