Introduction

Customers of Magnolia CMS quite often run into repository inconsistency, and sometimes it blocks a Magnolia instance from starting up because the inconsistency affects a core configuration workspace. By applying the repository splitting technique, we not only split the 'sensitive' workspaces out of the common repository, we can also tailor each repository's configuration to our real needs.

Furthermore, when it comes to production deployment, backup / restore and security are not the only concerns. The resilience of the system under a high error load, the ability to shut down part of the system for maintenance, and the separation of concerns from a data point of view are also major considerations for the system you develop.

In this article I will explain in more detail how Magnolia CMS supports splitting a big 'global' repository into separate repos, and which optimizations we can apply to each of the resulting repositories based on its characteristics and our expectations. This guideline takes you from the use case analysis through a step-by-step implementation to testing and verification of the result.

Overview

We will now split our big 'global' repository into 3 repositories (let's use "repo" for short). By doing so we can have a separate, optimized Jackrabbit bundle configuration for each of them.

  1. First of all, we still maintain the "magnolia" repository as the default one, because it is mandatory for the system to run and to register workspaces on the fly. However, you can also tweak this default repo based on a detailed analysis of your data. For example, from checking most of the blog posts and examples in our Travel demo content, I found that most of our text content fields (including the main content of a blog post) are no bigger than 32 KB. Our "minRecordLength" is currently 1024 (1 KB), which pushes text fields bigger than 1 KB into the DataStore (refer to the Jackrabbit DataStore definition for more information). The thing is, the DataStore is not where we expect our text content to be saved. We need detailed full-text search, highlighting, indexing, analytics, etc. running on that text content; we also need to modify it, and sometimes to store it purely in the database backend (not in a file store). So by studying and analysing your data you can work out an optimized value, and based on your system requirements you can decide how to configure your default repository named "magnolia".
  2. A "magnolia_system" repository to store our "config", "users", "userroles" and "usergroups" workspaces. This is the key point that lets us separate the system configuration and sensitive data from the sea of customers' data. For this repository, I suggest that you follow me in using in-memory storage for the Lucene search index. This would:
    1. Speed up access to these indexes.
    2. Eliminate index-related repository inconsistency in the system workspaces, so that you do not end up with a stalled instance because of it. This works because Jackrabbit automatically rebuilds its indexes on startup if the index folder does not exist, and with the RAM store the indexes are always gone after a shutdown. Also, because this repository should be small compared to the others (just 5 MB for my demo instance), rebuilding the index takes only a few seconds. The value it brings is quite high.
    3. Also for this special repository, I disabled the Jackrabbit DataStore to make sure that my content is handled only by my configured PersistenceManager. So if I am using Oracle as my back-end storage, all my system configuration, system users, and configured groups and roles are managed by my RDBMS. I place a fair amount of trust in the security and consistency of the Oracle RDBMS.
  3. And the third one is "magnolia_dam", which will mostly contain our big binary data items. It includes the "dam" and "imaging" workspaces. We also give this repo some specific configuration so that it best matches this kind of data.
    1. Let's start with the minRecordLength configuration. This repo is used to store binary objects plus a bit of metadata and perhaps a short description of each. Pick a 300-word text file at random and you will see that it is no larger than 2 KB, so let's configure 2048 bytes for this FileDataStore. That way most of our text content is stored in the PersistenceManager backend, while most binary content goes to the DataStore, per the Jackrabbit recommendation.
    2. We will increase the bundleCacheSize of the PersistenceManager from the default (10 MB per the Jackrabbit implementation) to 128 MB. I have some RAM available and I want my binary objects to be read and written quickly (well, reasonably quickly) through the Jackrabbit cache, so I use 128 MB for this purpose.
    3. You could also tweak a few more things in the Lucene SearchIndex configuration section, such as changing mergeFactor, which affects the number of inodes used, or changing bufferSize and cacheSize to improve your index query speed.
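
The sizing reasoning behind minRecordLength can be sketched as a small calculation. This is only an illustration, not part of Magnolia or Jackrabbit: the function name `share_sent_to_datastore` and the sample sizes are hypothetical, and it assumes you have already collected the byte sizes of your stored values by some other means (for example from an export). It also assumes that a value of at least minRecordLength bytes is moved to the DataStore.

```python
def share_sent_to_datastore(value_sizes, min_record_length):
    """Return the fraction of values that would be moved to the DataStore.

    Assumption: a value is moved when its size in bytes is greater than
    or equal to min_record_length.
    """
    if not value_sizes:
        return 0.0
    moved = sum(1 for size in value_sizes if size >= min_record_length)
    return moved / len(value_sizes)


# Hypothetical value sizes in bytes, for illustration only.
sample_sizes = [300, 900, 1500, 4000, 12000, 30000, 50000]

for threshold in (1024, 2048, 32768):
    share = share_sent_to_datastore(sample_sizes, threshold)
    print(f"minRecordLength={threshold}: {share:.0%} of values leave the database")
```

Running the same calculation against sizes taken from your own content should show which threshold keeps the bulk of your text in the PersistenceManager backend.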

A note on H2 connection

We currently configure the H2 PersistenceManager as below:

    <PersistenceManager class="org.apache.jackrabbit.core.persistence.pool.H2PersistenceManager">
      <param name="url" value="jdbc:h2:${wsp.home}/db;AUTO_SERVER=FALSE;LOG=0;CACHE_SIZE=131072;LOCK_MODE=0;UNDO_LOG=0" />
      <param name="schemaObjectPrefix" value="pm_${wsp.name}_" />
      <param name="bundleCacheSize" value="128" /> <!-- in Megabytes -->
    </PersistenceManager>

AUTO_SERVER option: If you enable it, you will be able to connect to the database with the H2 Console while the system is running. Whether to turn this on or off depends on your use case; for the DAM and Imaging workspaces, however, I recommend turning it off. There is not much useful information there when you connect at runtime, just UUIDs and their encoded data.

CACHE_SIZE: The database keeps the most frequently used data in main memory. The amount of memory used for caching can be changed with the CACHE_SIZE setting, which is given in KB (131072 KB = 128 MB).

LOG, LOCK_MODE & UNDO_LOG: because this is a demo and I do not need any DB log or rollback function, I disabled them by setting each to "0" to speed up my instance. Depending on your actual use case, set them according to your needs.
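
To double-check what a connection URL like the one above actually sets, a small parser can help. This is a minimal sketch: `parse_h2_url` is a hypothetical helper name, not an H2 or Jackrabbit API, and it only relies on the fact that H2 settings are appended to the URL as `;KEY=VALUE` pairs and that CACHE_SIZE is expressed in KB.

```python
def parse_h2_url(url):
    """Split an H2 JDBC URL into its base part and a dict of settings."""
    base, *settings = url.split(";")
    options = {}
    for setting in settings:
        if setting:
            key, _, value = setting.partition("=")
            options[key.upper()] = value
    return base, options


url = ("jdbc:h2:${wsp.home}/db;AUTO_SERVER=FALSE;LOG=0;"
       "CACHE_SIZE=131072;LOCK_MODE=0;UNDO_LOG=0")
base, options = parse_h2_url(url)

# CACHE_SIZE is given in KB, so divide by 1024 to get MB.
cache_mb = int(options["CACHE_SIZE"]) / 1024
print(base, options, cache_mb)
```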

Step by step guide

Magnolia supported Repositories configuration

The configuration file is located at "src/main/webapp/WEB-INF/config/default/repositories.xml" in my development environment. It should be located at "TOMCAT/webapps/your_magnolia_webapp/WEB-INF/config/default/repositories.xml" on your deployed server.

Our updated one for repository splitting is below:

repositories.xml
<!--
    $Id$
-->
<!DOCTYPE JCR [
<!ELEMENT Map (#PCDATA)>
<!ATTLIST Map
    name CDATA #REQUIRED
    repositoryName CDATA #REQUIRED
    workspaceName CDATA #REQUIRED>
<!ELEMENT JCR (RepositoryMapping|Repository)*>
<!ELEMENT param (#PCDATA)>
<!ATTLIST param
    name CDATA #REQUIRED
    value CDATA #REQUIRED>
<!ELEMENT Repository (param|workspace)*>
<!ATTLIST Repository
    loadOnStartup CDATA #REQUIRED
    name CDATA #REQUIRED
    provider CDATA #REQUIRED>
<!ELEMENT workspace (#PCDATA)>
<!ATTLIST workspace
    name CDATA #REQUIRED>
<!ELEMENT RepositoryMapping (Map)*>
]><JCR>
    <RepositoryMapping>
        <Map name="website" repositoryName="magnolia" workspaceName="website" />
        <!-- below workspaces are moved to magnolia_system repository 
        <Map name="config" repositoryName="magnolia" workspaceName="config" />
        <Map name="users" repositoryName="magnolia" workspaceName="users" />
        <Map name="userroles" repositoryName="magnolia" workspaceName="userroles" />
        <Map name="usergroups" repositoryName="magnolia" workspaceName="usergroups" />
         -->
        <Map name="config" repositoryName="magnolia_system" workspaceName="config" />
        <Map name="users" repositoryName="magnolia_system" workspaceName="users" />
        <Map name="userroles" repositoryName="magnolia_system" workspaceName="userroles" />
        <Map name="usergroups" repositoryName="magnolia_system" workspaceName="usergroups" />
        <!-- the workspaces below are listed here to explicitly tell the system that they belong to the magnolia_dam repository -->
        <Map name="dam" repositoryName="magnolia_dam" workspaceName="dam" />
        <Map name="imaging" repositoryName="magnolia_dam" workspaceName="imaging" />
    </RepositoryMapping>

    <!-- magnolia default repository -->
    <Repository name="magnolia" provider="info.magnolia.jackrabbit.ProviderImpl" loadOnStartup="true">
        <param name="configFile" value="${magnolia.repositories.jackrabbit.config}" />
        <param name="repositoryHome" value="${magnolia.repositories.home}/magnolia" />
        <!-- the default node types are loaded automatically
            <param name="customNodeTypes" value="WEB-INF/config/repo-conf/nodetypes/magnolia_nodetypes.xml" />
        -->
        <param name="contextFactoryClass" value="org.apache.jackrabbit.core.jndi.provider.DummyInitialContextFactory" />
        <param name="providerURL" value="localhost" />
        <param name="bindName" value="${magnolia.webapp}" />
        <workspace name="website" />
        <!-- 
        <workspace name="config" />
        <workspace name="users" />
        <workspace name="userroles" />
        <workspace name="usergroups" />
         -->
    </Repository>
    
    <Repository name="magnolia_system" provider="info.magnolia.jackrabbit.ProviderImpl" loadOnStartup="true">
        <param name="configFile" value="${magnolia.repositories.jackrabbit.config.system}" />
        <param name="repositoryHome" value="${magnolia.repositories.home}/magnolia_system" />
        <param name="contextFactoryClass" value="org.apache.jackrabbit.core.jndi.provider.DummyInitialContextFactory" />
        <param name="providerURL" value="localhost" />
        <param name="bindName" value="${magnolia.webapp}_system" />
        <workspace name="config" />
        <workspace name="users" />
        <workspace name="userroles" />
        <workspace name="usergroups" />
    </Repository>
    
    <Repository name="magnolia_dam" provider="info.magnolia.jackrabbit.ProviderImpl" loadOnStartup="true">
        <param name="configFile" value="${magnolia.repositories.jackrabbit.config.dam}" />
        <param name="repositoryHome" value="${magnolia.repositories.home}/magnolia_dam" />
        <param name="contextFactoryClass" value="org.apache.jackrabbit.core.jndi.provider.DummyInitialContextFactory" />
        <param name="providerURL" value="localhost" />
        <param name="bindName" value="${magnolia.webapp}_dam" />
        <workspace name="dam" />
        <workspace name="imaging" />
    </Repository>
</JCR>


Compare this to the default one (from which I omitted some parent nodes and comments):

repositories default config
    <RepositoryMapping>
        <Map name="website" repositoryName="magnolia" workspaceName="website" />
        <Map name="config" repositoryName="magnolia" workspaceName="config" />
        <Map name="users" repositoryName="magnolia" workspaceName="users" />
        <Map name="userroles" repositoryName="magnolia" workspaceName="userroles" />
        <Map name="usergroups" repositoryName="magnolia" workspaceName="usergroups" />
    </RepositoryMapping>

    <!-- magnolia default repository -->
    <Repository name="magnolia" provider="info.magnolia.jackrabbit.ProviderImpl" loadOnStartup="true">
        <param name="configFile" value="${magnolia.repositories.jackrabbit.config}" />
        <param name="repositoryHome" value="${magnolia.repositories.home}/magnolia" />
        <!-- the default node types are loaded automatically
            <param name="customNodeTypes" value="WEB-INF/config/repo-conf/nodetypes/magnolia_nodetypes.xml" />
        -->
        <param name="contextFactoryClass" value="org.apache.jackrabbit.core.jndi.provider.DummyInitialContextFactory" />
        <param name="providerURL" value="localhost" />
        <param name="bindName" value="${magnolia.webapp}" />
        <workspace name="website" />
        <workspace name="config" />
        <workspace name="users" />
        <workspace name="userroles" />
        <workspace name="usergroups" />
    </Repository>

You can see that we now have 3 repositories configured instead of only the default "magnolia" one. We also have some new mappings and workspaces defined. These specify that those workspaces should be created and managed by our custom repositories.
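
Because the mappings and the workspace declarations live in two different places of "repositories.xml", it is easy to point a mapping at a repository that does not declare the workspace. The sketch below is one way to check for that. It is an assumption-laden illustration: `inconsistent_mappings` is a hypothetical helper, the embedded sample is a shortened stand-in for the real file, and the check reflects my own convention rather than a rule enforced by Magnolia.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical sample; magnolia_dam is deliberately left undeclared.
SAMPLE = """<JCR>
  <RepositoryMapping>
    <Map name="website" repositoryName="magnolia" workspaceName="website" />
    <Map name="config" repositoryName="magnolia_system" workspaceName="config" />
    <Map name="dam" repositoryName="magnolia_dam" workspaceName="dam" />
  </RepositoryMapping>
  <Repository name="magnolia" provider="p" loadOnStartup="true">
    <workspace name="website" />
  </Repository>
  <Repository name="magnolia_system" provider="p" loadOnStartup="true">
    <workspace name="config" />
  </Repository>
</JCR>"""


def inconsistent_mappings(xml_text):
    """Return (name, repository, workspace) for each Map whose target
    repository does not declare the mapped workspace."""
    root = ET.fromstring(xml_text)
    declared = {
        repo.get("name"): {ws.get("name") for ws in repo.findall("workspace")}
        for repo in root.findall("Repository")
    }
    problems = []
    for mapping in root.findall("RepositoryMapping/Map"):
        repo = mapping.get("repositoryName")
        workspace = mapping.get("workspaceName")
        if workspace not in declared.get(repo, set()):
            problems.append((mapping.get("name"), repo, workspace))
    return problems


print(inconsistent_mappings(SAMPLE))
```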

Add the three Jackrabbit bundle configuration files below to your "WEB-INF/config/repo-conf" folder, then add these two parameters to your "WEB-INF/config/default/magnolia.properties" (the default repository keeps using the existing "magnolia.repositories.jackrabbit.config" property):

magnolia.repositories.jackrabbit.config.dam=WEB-INF/config/repo-conf/jackrabbit-bundle-h2-dam.xml
magnolia.repositories.jackrabbit.config.system=WEB-INF/config/repo-conf/jackrabbit-bundle-h2-system.xml

Note how the property names match the placeholders used in "repositories.xml" (for example ${magnolia.repositories.jackrabbit.config.dam}), and how the values match the folder and file names of your configuration files.
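
The link between the two files is a plain placeholder lookup. The following sketch imitates it so you can see which file a given "configFile" value ends up pointing at. It is not Magnolia's actual implementation: `resolve` is a hypothetical helper, and leaving unknown placeholders untouched is my own choice for the illustration.

```python
import re

PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def resolve(value, properties):
    """Replace ${key} placeholders with values from properties.

    Unknown keys are left as they are so that a missing property is easy to spot.
    """
    return PLACEHOLDER.sub(
        lambda match: properties.get(match.group(1), match.group(0)), value
    )


# The two properties added to magnolia.properties.
properties = {
    "magnolia.repositories.jackrabbit.config.dam":
        "WEB-INF/config/repo-conf/jackrabbit-bundle-h2-dam.xml",
    "magnolia.repositories.jackrabbit.config.system":
        "WEB-INF/config/repo-conf/jackrabbit-bundle-h2-system.xml",
}

print(resolve("${magnolia.repositories.jackrabbit.config.dam}", properties))
print(resolve("${magnolia.repositories.jackrabbit.config.typo}", properties))
```

If the second call prints the placeholder unchanged, the property name in "repositories.xml" and the one in "magnolia.properties" do not match.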

Jackrabbit configuration for default workspace

Here is our configuration:

jackrabbit-bundle-h2-search.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Repository PUBLIC "-//The Apache Software Foundation//DTD Jackrabbit 2.0//EN" "http://jackrabbit.apache.org/dtd/repository-2.0.dtd">
<Repository>
  <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
    <param name="path" value="${rep.home}/repository" />
  </FileSystem>
  <Security appName="magnolia">
    <SecurityManager class="org.apache.jackrabbit.core.DefaultSecurityManager"/>
    <AccessManager class="org.apache.jackrabbit.core.security.DefaultAccessManager">
    </AccessManager>
    <!-- login module defined here is used by the repo to authenticate every request. not by the webapp to authenticate user against the webapp context (this one has to be passed before thing here gets invoked -->
    <LoginModule class="info.magnolia.jaas.sp.jcr.JackrabbitAuthenticationModule">
    </LoginModule>
  </Security>
  <DataStore class="org.apache.jackrabbit.core.data.FileDataStore">
    <param name="path" value="${rep.home}/repository/datastore"/>
    <param name="minRecordLength" value="32768"/>
  </DataStore>
  <Workspaces rootPath="${rep.home}/workspaces" defaultWorkspace="default" />
  <Workspace name="default">
    <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
      <param name="path" value="${wsp.home}/default" />
    </FileSystem>
    <PersistenceManager class="org.apache.jackrabbit.core.persistence.bundle.H2PersistenceManager">
      <param name="url" value="jdbc:h2:${wsp.home}/db;AUTO_SERVER=TRUE" />
      <param name="schemaObjectPrefix" value="pm_${wsp.name}_" />
    </PersistenceManager>
    <SearchIndex class="info.magnolia.jackrabbit.lucene.SearchIndex">
      <param name="path" value="${wsp.home}/index" />
      <!-- SearchIndex will get the indexing configuration from the classpath, if not found in the workspace home -->
      <param name="indexingConfiguration" value="/info/magnolia/jackrabbit/indexing_configuration_${wsp.name}.xml"/>
      <param name="useCompoundFile" value="true" />
      <param name="minMergeDocs" value="100" />
      <param name="volatileIdleTime" value="3" />
      <param name="maxMergeDocs" value="100000" />
      <param name="mergeFactor" value="10" />
      <param name="maxFieldLength" value="10000" />
      <param name="bufferSize" value="1000" />
      <param name="cacheSize" value="10000" />
      <param name="forceConsistencyCheck" value="false" />
      <param name="autoRepair" value="false" />
      <param name="queryClass" value="org.apache.jackrabbit.core.query.QueryImpl" />
      <param name="respectDocumentOrder" value="true" />
      <param name="resultFetchSize" value="100" />
      <param name="extractorPoolSize" value="3" />
      <param name="extractorTimeout" value="100" />
      <param name="extractorBackLogSize" value="100" />
      <!-- needed to highlight the searched term -->
      <param name="supportHighlighting" value="true"/>
      <!-- custom provider for getting an HTML excerpt in a query result with rep:excerpt() -->
      <param name="excerptProviderClass" value="info.magnolia.jackrabbit.lucene.SearchHTMLExcerpt"/>
    </SearchIndex>
    <WorkspaceSecurity>
      <AccessControlProvider class="info.magnolia.cms.core.MagnoliaAccessProvider" />
    </WorkspaceSecurity>
  </Workspace>
  <Versioning rootPath="${rep.home}/version">
    <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
      <param name="path" value="${rep.home}/workspaces/version" />
    </FileSystem>
    <PersistenceManager class="org.apache.jackrabbit.core.persistence.bundle.H2PersistenceManager">
      <param name="url" value="jdbc:h2:${rep.home}/version/db;AUTO_SERVER=TRUE" />
      <param name="schemaObjectPrefix" value="version_" />
    </PersistenceManager>
  </Versioning>
</Repository>


You can go through the configuration file in detail to see how we implemented the changes from the Overview section, such as setting the minRecordLength of the FileDataStore to 32 KB.

Jackrabbit configuration for system workspace

Similarly here is our system workspace configuration:

jackrabbit-bundle-h2-system.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Repository PUBLIC "-//The Apache Software Foundation//DTD Jackrabbit 2.0//EN" "http://jackrabbit.apache.org/dtd/repository-2.0.dtd">
<Repository>
  <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
    <param name="path" value="${rep.home}/repository" />
  </FileSystem>
  <Security appName="magnolia">
    <SecurityManager class="org.apache.jackrabbit.core.DefaultSecurityManager"/>
    <AccessManager class="org.apache.jackrabbit.core.security.DefaultAccessManager" />
    <!-- login module defined here is used by the repo to authenticate every request. not by the webapp to authenticate user against the webapp context (this one has to be passed before thing here gets invoked -->
    <LoginModule class="info.magnolia.jaas.sp.jcr.JackrabbitAuthenticationModule" />
  </Security>
  <!-- <DataStore class="org.apache.jackrabbit.core.data.FileDataStore">
    <param name="path" value="${rep.home}/repository/datastore"/>
    <param name="minRecordLength" value="1048575"/> in bytes 1048576 = 1Mb
  </DataStore> -->
  <Workspaces rootPath="${rep.home}/workspaces" defaultWorkspace="default" />
  <Workspace name="default">
    <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
      <param name="path" value="${wsp.home}/default" />
    </FileSystem>
    <PersistenceManager class="org.apache.jackrabbit.core.persistence.pool.H2PersistenceManager">
      <!-- <param name="url" value="jdbc:h2:${wsp.home}/db;AUTO_SERVER=TRUE" /> -->
      <param name="url" value="jdbc:h2:${wsp.home}/db;AUTO_SERVER=TRUE;LOG=0;CACHE_SIZE=16192;LOCK_MODE=0;UNDO_LOG=0" />
      <param name="schemaObjectPrefix" value="pm_${wsp.name}_" />
      <param name="bundleCacheSize" value="16" /> <!-- 16Mb-->
    </PersistenceManager>
    <SearchIndex class="info.magnolia.jackrabbit.lucene.SearchIndex">
      <param name="path" value="${wsp.home}/index" />
      <!-- SearchIndex will get the indexing configuration from the classpath, if not found in the workspace home -->
      <!-- <param name="indexingConfiguration" value="/info/magnolia/jackrabbit/indexing_configuration_${wsp.name}.xml"/> -->
      <param name="indexingConfiguration" value="/info/magnolia/jackrabbit/indexing_configuration_default.xml"/>
      <param name="useCompoundFile" value="true" />
      <param name="minMergeDocs" value="1000" />
      <param name="volatileIdleTime" value="3" />
      <param name="maxMergeDocs" value="100000" />
      <param name="mergeFactor" value="10" />
      <param name="maxFieldLength" value="10000" />
      <param name="bufferSize" value="100" />
      <param name="cacheSize" value="10000" />
      <param name="forceConsistencyCheck" value="true" />
      <param name="autoRepair" value="true" />
      <param name="queryClass" value="org.apache.jackrabbit.core.query.QueryImpl" />
      <param name="respectDocumentOrder" value="true" />
      <param name="resultFetchSize" value="100" />
      <param name="extractorPoolSize" value="3" />
      <param name="extractorTimeout" value="100" />
      <param name="extractorBackLogSize" value="100" />
      <!-- needed to highlight the searched term -->
      <param name="supportHighlighting" value="false"/>
      <!-- custom provider for getting an HTML excerpt in a query result with rep:excerpt() -->
      <!-- <param name="excerptProviderClass" value="info.magnolia.jackrabbit.lucene.SearchHTMLExcerpt"/> -->
      
      <param name="useSimpleFSDirectory" value="false" />
      <param name="directoryManagerClass" value="org.apache.jackrabbit.core.query.lucene.directory.RAMDirectoryManager" />
      
    </SearchIndex>
    <WorkspaceSecurity>
      <AccessControlProvider class="info.magnolia.cms.core.MagnoliaAccessProvider" />
    </WorkspaceSecurity>
  </Workspace>
  <Versioning rootPath="${rep.home}/version">
    <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
      <param name="path" value="${rep.home}/workspaces/version" />
    </FileSystem>
    <PersistenceManager class="org.apache.jackrabbit.core.persistence.bundle.H2PersistenceManager">
      <param name="url" value="jdbc:h2:${rep.home}/version/db;AUTO_SERVER=TRUE" />
      <param name="schemaObjectPrefix" value="version_" />
    </PersistenceManager>
  </Versioning>
</Repository>


Things you can find in the file are:

  • The DataStore is disabled because it is very unlikely to be used here.
  • This also makes it easier to keep all data in the database, just by changing H2 to MySQL, Oracle, or another RDBMS.
  • supportHighlighting is disabled for the Lucene search because we don't need it on our system workspaces.
  • The Lucene search index uses in-memory storage.
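
The points above can also be checked mechanically. The sketch below reads a Jackrabbit repository configuration and reports the settings discussed in this article. It is a rough illustration: `summarize_repository_config` and the shortened sample are hypothetical, and for a real file you would read the text from disk first.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical stand-in for jackrabbit-bundle-h2-system.xml.
SYSTEM_SAMPLE = """<Repository>
  <!-- DataStore intentionally left out -->
  <Workspace name="default">
    <PersistenceManager class="org.apache.jackrabbit.core.persistence.pool.H2PersistenceManager">
      <param name="bundleCacheSize" value="16" />
    </PersistenceManager>
    <SearchIndex class="info.magnolia.jackrabbit.lucene.SearchIndex">
      <param name="supportHighlighting" value="false" />
      <param name="directoryManagerClass"
             value="org.apache.jackrabbit.core.query.lucene.directory.RAMDirectoryManager" />
    </SearchIndex>
  </Workspace>
</Repository>"""


def _param(parent, name):
    """Return the value of the named <param> under parent, or None."""
    if parent is None:
        return None
    for param in parent.findall("param"):
        if param.get("name") == name:
            return param.get("value")
    return None


def summarize_repository_config(xml_text):
    """Collect the DataStore, cache and search index settings of a config."""
    root = ET.fromstring(xml_text)
    datastore = root.find("DataStore")
    persistence = root.find("Workspace/PersistenceManager")
    index = root.find("Workspace/SearchIndex")
    return {
        "has_datastore": datastore is not None,
        "min_record_length": _param(datastore, "minRecordLength"),
        "bundle_cache_mb": _param(persistence, "bundleCacheSize"),
        "support_highlighting": _param(index, "supportHighlighting"),
        "directory_manager": _param(index, "directoryManagerClass"),
    }


print(summarize_repository_config(SYSTEM_SAMPLE))
```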

Jackrabbit configuration for DAM and Imaging workspace

The configuration is below:

jackrabbit-bundle-h2-dam.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Repository PUBLIC "-//The Apache Software Foundation//DTD Jackrabbit 2.0//EN" "http://jackrabbit.apache.org/dtd/repository-2.0.dtd">
<Repository>
  <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
    <param name="path" value="${rep.home}/repository" />
  </FileSystem>
  <Security appName="magnolia">
    <SecurityManager class="org.apache.jackrabbit.core.DefaultSecurityManager"/>
    <AccessManager class="org.apache.jackrabbit.core.security.DefaultAccessManager" />
    <!-- login module defined here is used by the repo to authenticate every request. not by the webapp to authenticate user against the webapp context (this one has to be passed before thing here gets invoked -->
    <LoginModule class="info.magnolia.jaas.sp.jcr.JackrabbitAuthenticationModule" />
  </Security>
  <DataStore class="org.apache.jackrabbit.core.data.FileDataStore">
    <param name="path" value="${rep.home}/repository/datastore"/>
    <param name="minRecordLength" value="2048"/> <!-- in bytes; 2048 = 2 KB -->
  </DataStore>
  <Workspaces rootPath="${rep.home}/workspaces" defaultWorkspace="default" />
  <Workspace name="default">
    <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
      <param name="path" value="${wsp.home}/default" />
    </FileSystem>
    <PersistenceManager class="org.apache.jackrabbit.core.persistence.pool.H2PersistenceManager">
      <!-- <param name="url" value="jdbc:h2:${wsp.home}/db;AUTO_SERVER=TRUE" /> -->
      <param name="url" value="jdbc:h2:${wsp.home}/db;AUTO_SERVER=FALSE;LOG=0;CACHE_SIZE=131072;LOCK_MODE=0;UNDO_LOG=0" />
      <param name="schemaObjectPrefix" value="pm_${wsp.name}_" />
      <param name="bundleCacheSize" value="128" /> <!-- 128MB -->
    </PersistenceManager>
    <SearchIndex class="info.magnolia.jackrabbit.lucene.SearchIndex">
      <param name="path" value="${wsp.home}/index" />
      <!-- SearchIndex will get the indexing configuration from the classpath, if not found in the workspace home -->
      <!-- <param name="indexingConfiguration" value="/info/magnolia/jackrabbit/indexing_configuration_${wsp.name}.xml"/> -->
      <param name="indexingConfiguration" value="/info/magnolia/jackrabbit/indexing_configuration_dam.xml"/>
      <param name="useCompoundFile" value="true" />
      <param name="minMergeDocs" value="100" />
      <param name="volatileIdleTime" value="3" />
      <param name="maxMergeDocs" value="100100" />
      <param name="mergeFactor" value="10" />
      <param name="maxFieldLength" value="10000" />
      <param name="bufferSize" value="100" />
      <param name="cacheSize" value="10000" />
      <param name="forceConsistencyCheck" value="true" />
      <param name="autoRepair" value="true" />
      <param name="queryClass" value="org.apache.jackrabbit.core.query.QueryImpl" />
      <param name="respectDocumentOrder" value="true" />
      <param name="resultFetchSize" value="100" />
      <param name="extractorPoolSize" value="3" />
      <param name="extractorTimeout" value="100" />
      <param name="extractorBackLogSize" value="100" />
      <!-- needed to highlight the searched term -->
      <param name="supportHighlighting" value="true"/>
      <!-- custom provider for getting an HTML excerpt in a query result with rep:excerpt() -->
      <param name="excerptProviderClass" value="info.magnolia.jackrabbit.lucene.SearchHTMLExcerpt"/>
    </SearchIndex>
    <WorkspaceSecurity>
      <AccessControlProvider class="info.magnolia.cms.core.MagnoliaAccessProvider" />
    </WorkspaceSecurity>
  </Workspace>
  <Versioning rootPath="${rep.home}/version">
    <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
      <param name="path" value="${rep.home}/workspaces/version" />
    </FileSystem>
    <PersistenceManager class="org.apache.jackrabbit.core.persistence.bundle.H2PersistenceManager">
      <param name="url" value="jdbc:h2:${rep.home}/version/db;AUTO_SERVER=TRUE" />
      <param name="schemaObjectPrefix" value="version_" />
    </PersistenceManager>
  </Versioning>
</Repository>

Things that you can verify from this configuration:

  • minRecordLength of org.apache.jackrabbit.core.data.FileDataStore is changed from 1024 to 2048.
  • bundleCacheSize of the PersistenceManager is increased to 128 MB.

Testing and verification

Scenario

This test verifies the performance impact of splitting the 'magnolia' repository into 'magnolia' (the default one), 'magnolia_system', and 'magnolia_dam' as described above.

Environment setup

Both the Magnolia instance and the JMeter testing software run on the same machine.

Software

  • Oracle JDK 10
  • JMeter 5.1
  • Magnolia CE Travel demo 5.7.2
  • Tomcat 9.0.8
  • Vaadin 8.4.2

Hardware

  • 16GB RAM
  • 4 CPUs (8 virtual CPUs)
  • macOS High Sierra

Test case

  • 200 concurrent users with a 10-second ramp-up, from one MacBook Pro 2017
  • Each user logs in to AdminCentral using the 'superuser' account, then browses a few pages under http://host:port/contextPath/travel

Test result

Non-splitting repository setup - 103s

Viets-MacBook-Pro:jmeter5 vietnguyen$ bin/jmeter -n -t /viet/support/jmeter/mgnl_author_suffing.jmx -l mgnl_author_suffing_aio.jtl -e -o mgnl_author_suffing_aio_report
WARNING: package sun.awt.X11 not in java.desktop
Creating summariser <summary>
Created the tree successfully using /viet/support/jmeter/mgnl_author_suffing.jmx
Starting the test @ Thu Mar 21 18:00:52 ICT 2019 (1553166052850)
Waiting for possible Shutdown/StopTestNow/HeapDump/ThreadDump message on port 4445
summary +    137 in 00:00:07 =   21.0/s Avg:   220 Min:     2 Max:  3397 Err:   131 (95.62%) Active: 131 Started: 131 Finished: 0
summary +   1096 in 00:00:30 =   36.5/s Avg:  3302 Min:     2 Max: 19518 Err:    69 (6.30%) Active: 200 Started: 200 Finished: 0
summary =   1233 in 00:00:37 =   33.7/s Avg:  2959 Min:     2 Max: 19518 Err:   200 (16.22%)
summary +   1358 in 00:00:30 =   45.4/s Avg:  6297 Min:  1525 Max: 20036 Err:     0 (0.00%) Active: 200 Started: 200 Finished: 0
summary =   2591 in 00:01:07 =   39.0/s Avg:  4708 Min:     2 Max: 20036 Err:   200 (7.72%)
summary +   1723 in 00:00:30 =   57.4/s Avg:  3257 Min:   378 Max:  7694 Err:     0 (0.00%) Active: 200 Started: 200 Finished: 0
summary =   4314 in 00:01:37 =   44.7/s Avg:  4129 Min:     2 Max: 20036 Err:   200 (4.64%)
summary +    686 in 00:00:05 =  136.4/s Avg:  1805 Min:     6 Max:  6864 Err:   200 (29.15%) Active: 0 Started: 200 Finished: 200
summary =   5000 in 00:01:42 =   49.2/s Avg:  3810 Min:     2 Max: 20036 Err:   400 (8.00%)
Tidying up ...    @ Thu Mar 21 18:02:35 ICT 2019 (1553166155048)
... end of run
 
Wall-clock time from the start to the end of the test: 8 + 60 + 35 = 103s = 1m43s

Split repository setup - 96s

Viets-MacBook-Pro:jmeter5 vietnguyen$ bin/jmeter -n -t /viet/support/jmeter/mgnl_author_suffing.jmx -l mgnl_author_suffing_split.jtl -e -o mgnl_author_suffing_split_report
WARNING: package sun.awt.X11 not in java.desktop
Creating summariser <summary>
Created the tree successfully using /viet/support/jmeter/mgnl_author_suffing.jmx
Starting the test @ Thu Mar 21 17:37:09 ICT 2019 (1553164629770)
Waiting for possible Shutdown/StopTestNow/HeapDump/ThreadDump message on port 4445
summary + 1010 in 00:00:20 = 50.7/s Avg: 2301 Min: 2 Max: 13708 Err: 200 (19.80%) Active: 200 Started: 200 Finished: 0
summary + 1014 in 00:00:30 = 33.8/s Avg: 5248 Min: 2 Max: 29121 Err: 0 (0.00%) Active: 200 Started: 200 Finished: 0
summary = 2024 in 00:00:50 = 40.5/s Avg: 3777 Min: 2 Max: 29121 Err: 200 (9.88%)
summary + 1714 in 00:00:30 = 57.1/s Avg: 4031 Min: 13 Max: 31831 Err: 33 (1.93%) Active: 167 Started: 200 Finished: 33
summary = 3738 in 00:01:20 = 46.8/s Avg: 3894 Min: 2 Max: 31831 Err: 233 (6.23%)
summary + 1262 in 00:00:15 = 84.0/s Avg: 1903 Min: 4 Max: 5607 Err: 167 (13.23%) Active: 0 Started: 200 Finished: 200
summary = 5000 in 00:01:35 = 52.7/s Avg: 3391 Min: 2 Max: 31831 Err: 400 (8.00%)
Tidying up ... @ Thu Mar 21 17:38:45 ICT 2019 (1553164725024)
... end of run
 
Wall-clock time from the start to the end of the test: 51 + 45 = 96s = 1m36s
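
The durations can also be derived from the millisecond timestamps that JMeter prints in parentheses on the "Starting the test" and "Tidying up" lines, which avoids adding up clock times by hand. A minimal sketch, where `elapsed_seconds` is just a helper name chosen for this example:

```python
def elapsed_seconds(start_ms, end_ms):
    """Wall-clock duration in seconds between two epoch-millisecond stamps."""
    return (end_ms - start_ms) / 1000


# (start, end) epoch milliseconds copied from the two JMeter logs above.
runs = {
    "non-split": (1553166052850, 1553166155048),
    "split": (1553164629770, 1553164725024),
}

for name, (start_ms, end_ms) in runs.items():
    print(f"{name}: {elapsed_seconds(start_ms, end_ms):.1f}s")
```

This gives roughly 102.2s and 95.3s. The hand calculation works on whole seconds of the clock times, which is why it arrives at 103s and 96s.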


Conclusion

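Based on the facts and figures above, I recommend splitting your repository into several smaller repos as described. In this run the split setup finished in 96s versus 103s for the non-split one, with the same overall error rate (8.00%). Note, however, that in some runs the split setup comes out slightly slower in this performance test.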

A small drawback:

  • Jackrabbit creates quite a lot of threads to serve each repository, sized as shown below. On my machine (8 virtual CPUs) that is 16 threads per repository, so the two additional repositories increase the number of running threads by 32. However, this does not affect the overall result.

org.apache.jackrabbit.core.JackrabbitThreadPool.size=Runtime.getRuntime().availableProcessors() * 2;

Changing a running system with its existing data

Scenario

Customers who have a running Magnolia instance with an 'all in one' repository may want to split it following this guideline.

This section gives you the steps to move the "config", "users", "userroles", and "usergroups" workspaces into a new "magnolia_system" repository, for an existing system that uses the H2 persistence manager and the file system DataStore.

Create and move physical data

Create and move your existing workspaces data

Create a new folder named "magnolia_system" for your new repository under your configured "repositories" location. Check the "magnolia.repositories.home" property in your "WEB-INF/config/default/magnolia.properties" to find this physical folder.

Create a sub-folder named "workspaces" under the "magnolia_system" folder.

Copy your "magnolia/workspaces/config" to the new location "magnolia_system/workspaces/config", and do the same for the "users", "userroles", and "usergroups" workspaces.

The new folder structure should look like below:

repositories
├── magnolia
│   ├── repository
│   │   ├── datastore
│   │   ├── meta
│   │   ├── namespaces
│   │   ├── nodetypes
│   │   └── privileges
│   └── workspaces
│       ├── category
│       ├── config
│       ├── contacts
│       ├── dam
│       ├── default
│       ├── googleSitemaps
│       ├── imaging
│       ├── keystore
│       ├── messages
│       ├── mgnlSystem
│       ├── mgnlVersion
│       ├── observation
│       ├── profiles
│       ├── resources
│       ├── scripts
│       ├── tasks
│       ├── tours
│       ├── usergroups
│       ├── userroles
│       ├── users
│       └── website
└── magnolia_system
    └── workspaces
        ├── config
        ├── usergroups
        ├── userroles
        └── users
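
If you prefer to script the folder creation and copy steps, a minimal sketch could look like the following. The function name is hypothetical, and the sketch assumes that you pass the folder that "magnolia.repositories.home" points to and that the instance is stopped while files are copied (my assumption, to avoid copying database files that are still being written).

```python
import shutil
from pathlib import Path

# Workspaces that this guideline moves into the new repository.
SYSTEM_WORKSPACES = ("config", "users", "userroles", "usergroups")


def copy_system_workspaces(repositories_home, workspaces=SYSTEM_WORKSPACES):
    """Copy the given workspace folders from magnolia/ to magnolia_system/.

    repositories_home is the folder configured as "magnolia.repositories.home".
    Returns the list of created target folders.
    """
    home = Path(repositories_home)
    source_root = home / "magnolia" / "workspaces"
    target_root = home / "magnolia_system" / "workspaces"
    # Creates both "magnolia_system" and its "workspaces" sub-folder.
    target_root.mkdir(parents=True, exist_ok=True)

    copied = []
    for name in workspaces:
        source = source_root / name
        if not source.is_dir():
            raise FileNotFoundError(f"workspace folder not found: {source}")
        target = target_root / name
        # copytree refuses to overwrite an existing target, which protects
        # an earlier copy from being clobbered by accident.
        shutil.copytree(source, target)
        copied.append(target)
    return copied
```

The original folders under "magnolia/workspaces" are left untouched, so you can still go back if something goes wrong.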


Remove the "index" folder under each workspace in "magnolia_system"
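
This clean-up can be scripted as well. Below is a minimal sketch with a hypothetical function name; it expects the path of one repository's "workspaces" folder and only touches an "index" folder that sits directly under a workspace folder.

```python
import shutil
from pathlib import Path


def remove_index_folders(workspaces_root):
    """Delete the "index" folder directly under each workspace folder.

    Returns the list of removed folders so the caller can log them.
    """
    removed = []
    for workspace in sorted(Path(workspaces_root).iterdir()):
        index_dir = workspace / "index"
        if workspace.is_dir() and index_dir.is_dir():
            shutil.rmtree(index_dir)
            removed.append(index_dir)
    return removed
```

Other files in the workspace folder, such as the database file and "workspace.xml", are not touched.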

Edit each "workspace.xml" configuration file under each workspace:

Replace this

<!-- needed to highlight the searched term -->
<param name="supportHighlighting" value="true"/>
<!-- custom provider for getting an HTML excerpt in a query result with rep:excerpt() -->
<param name="excerptProviderClass" value="info.magnolia.jackrabbit.lucene.SearchHTMLExcerpt"/>

With this

<!-- needed to highlight the searched term -->
<param name="supportHighlighting" value="false"/>
<!-- custom provider for getting an HTML excerpt in a query result with rep:excerpt() -->
<!-- <param name="excerptProviderClass" value="info.magnolia.jackrabbit.lucene.SearchHTMLExcerpt"/> -->
<param name="useSimpleFSDirectory" value="false"/>
<param name="directoryManagerClass" value="org.apache.jackrabbit.core.query.lucene.directory.RAMDirectoryManager"/>
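
Editing several "workspace.xml" files by hand is error-prone, so here is a minimal sketch that applies the same replacement to the text of one file. The function name is hypothetical. It works on plain text rather than on a parsed XML tree so that the rest of the file stays byte-for-byte the same, and it assumes that the "excerptProviderClass" param sits on its own line as shown above.

```python
import re

RAM_DIRECTORY_MANAGER = (
    "org.apache.jackrabbit.core.query.lucene.directory.RAMDirectoryManager"
)

# An active (not yet commented out) excerptProviderClass param on its own line.
EXCERPT_PARAM = re.compile(
    r'^([ \t]*)(<param name="excerptProviderClass" value="[^"]*"\s*/>)',
    re.MULTILINE,
)


def switch_to_ram_index(workspace_xml):
    """Return the workspace.xml text with the replacement above applied."""
    if "RAMDirectoryManager" in workspace_xml:
        return workspace_xml  # already converted, keep the edit idempotent

    # Turn highlighting off.
    text = re.sub(
        r'<param name="supportHighlighting" value="true"\s*/>',
        '<param name="supportHighlighting" value="false"/>',
        workspace_xml,
    )

    def comment_out(match):
        indent, param = match.group(1), match.group(2)
        # Comment out the excerpt provider and add the two new params
        # right after it, with the same indentation.
        return (
            f"{indent}<!-- {param} -->\n"
            f'{indent}<param name="useSimpleFSDirectory" value="false"/>\n'
            f'{indent}<param name="directoryManagerClass" '
            f'value="{RAM_DIRECTORY_MANAGER}"/>'
        )

    return EXCERPT_PARAM.sub(comment_out, text, count=1)
```

If a file has no active "excerptProviderClass" param, this sketch only switches highlighting off and does not add the two new params, so please review each file after running it.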

Update configuration

New repositories.xml file

(file location: "WEB-INF/config/default/repositories.xml")

<!--
    $Id$
-->
<!DOCTYPE JCR [
<!ELEMENT Map (#PCDATA)>
<!ATTLIST Map
    name CDATA #REQUIRED
    repositoryName CDATA #REQUIRED
    workspaceName CDATA #REQUIRED>
<!ELEMENT JCR (RepositoryMapping|Repository)*>
<!ELEMENT param (#PCDATA)>
<!ATTLIST param
    name CDATA #REQUIRED
    value CDATA #REQUIRED>
<!ELEMENT Repository (param|workspace)*>
<!ATTLIST Repository
    loadOnStartup CDATA #REQUIRED
    name CDATA #REQUIRED
    provider CDATA #REQUIRED>
<!ELEMENT workspace (#PCDATA)>
<!ATTLIST workspace
    name CDATA #REQUIRED>
<!ELEMENT RepositoryMapping (Map)*>
]><JCR>
    
    <RepositoryMapping>
        <Map name="website" repositoryName="magnolia" workspaceName="website" />
        <Map name="config" repositoryName="magnolia_system" workspaceName="config" />
        <Map name="users" repositoryName="magnolia_system" workspaceName="users" />
        <Map name="userroles" repositoryName="magnolia_system" workspaceName="userroles" />
        <Map name="usergroups" repositoryName="magnolia_system" workspaceName="usergroups" />
    </RepositoryMapping>
    
    <Repository name="magnolia" provider="info.magnolia.jackrabbit.ProviderImpl" loadOnStartup="true">
        <param name="configFile" value="${magnolia.repositories.jackrabbit.config.default}" />
        <param name="repositoryHome" value="${magnolia.repositories.home}/magnolia" />
        <param name="contextFactoryClass" value="org.apache.jackrabbit.core.jndi.provider.DummyInitialContextFactory" />
        <param name="providerURL" value="localhost" />
        <param name="bindName" value="${magnolia.webapp}" />
        <workspace name="website" />
    </Repository>

    <Repository name="magnolia_system" provider="info.magnolia.jackrabbit.ProviderImpl" loadOnStartup="true">
        <param name="configFile" value="${magnolia.repositories.jackrabbit.config.system}" />
        <param name="repositoryHome" value="${magnolia.repositories.home}/magnolia_system" />
        <param name="contextFactoryClass" value="org.apache.jackrabbit.core.jndi.provider.DummyInitialContextFactory" />
        <param name="providerURL" value="localhost" />
        <param name="bindName" value="${magnolia.webapp}_system" />
        <workspace name="config" />
        <workspace name="users" />
        <workspace name="userroles" />
        <workspace name="usergroups" />
    </Repository>
</JCR>
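
A typo in a repository or workspace name in this file is easy to miss. The following minimal sketch, with a hypothetical function name, reads the content of "repositories.xml" and reports every Map entry that points to an unknown repository or to a workspace that is not listed under that repository. It is only a sanity check for a file like the one above and not a rule that Magnolia itself enforces.

```python
import xml.etree.ElementTree as ET


def mapping_problems(repositories_xml):
    """Return (map name, reason) pairs for Map entries that do not line up
    with the Repository and workspace elements of the same file."""
    root = ET.fromstring(repositories_xml)

    # Repository name -> set of workspace names listed under it.
    declared = {
        repo.get("name"): {ws.get("name") for ws in repo.findall("workspace")}
        for repo in root.findall("Repository")
    }

    problems = []
    for mapping in root.findall("RepositoryMapping/Map"):
        name = mapping.get("name")
        repo_name = mapping.get("repositoryName")
        ws_name = mapping.get("workspaceName")
        if repo_name not in declared:
            problems.append((name, f"unknown repository {repo_name}"))
        elif ws_name not in declared[repo_name]:
            problems.append(
                (name, f"workspace {ws_name} is not listed in {repo_name}")
            )
    return problems
```

An empty result means that every mapping refers to a repository and workspace declared in the same file.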

Your "jackrabbit-bundle-h2-system.xml" file should look like this:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Repository PUBLIC "-//The Apache Software Foundation//DTD Jackrabbit 2.0//EN" "http://jackrabbit.apache.org/dtd/repository-2.0.dtd">
<Repository>
  <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
    <param name="path" value="${rep.home}/repository" />
  </FileSystem>
  <Security appName="magnolia">
    <SecurityManager class="org.apache.jackrabbit.core.DefaultSecurityManager"/>
    <AccessManager class="org.apache.jackrabbit.core.security.DefaultAccessManager" />
    <!-- login module defined here is used by the repo to authenticate every request. not by the webapp to authenticate user against the webapp context (this one has to be passed before thing here gets invoked -->
    <LoginModule class="info.magnolia.jaas.sp.jcr.JackrabbitAuthenticationModule" />
  </Security>
  <Workspaces rootPath="${rep.home}/workspaces" defaultWorkspace="default" />
  <Workspace name="default">
    <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
      <param name="path" value="${wsp.home}/default" />
    </FileSystem>
    <PersistenceManager class="org.apache.jackrabbit.core.persistence.pool.H2PersistenceManager">
      <!-- <param name="url" value="jdbc:h2:${wsp.home}/db;AUTO_SERVER=TRUE" /> -->
      <param name="url" value="jdbc:h2:${wsp.home}/db;AUTO_SERVER=TRUE;LOG=0;CACHE_SIZE=16192;LOCK_MODE=0;UNDO_LOG=0" />
      <param name="schemaObjectPrefix" value="pm_${wsp.name}_" />
      <param name="bundleCacheSize" value="16" /> <!-- 16MB -->
    </PersistenceManager>
    <SearchIndex class="info.magnolia.jackrabbit.lucene.SearchIndex">
      <param name="path" value="${wsp.home}/index" />
      <!-- SearchIndex will get the indexing configuration from the classpath, if not found in the workspace home -->
      <!-- <param name="indexingConfiguration" value="/info/magnolia/jackrabbit/indexing_configuration_${wsp.name}.xml"/> -->
      <param name="indexingConfiguration" value="/info/magnolia/jackrabbit/indexing_configuration_default.xml"/>
      <param name="useCompoundFile" value="true" />
      <param name="minMergeDocs" value="1000" />
      <param name="volatileIdleTime" value="3" />
      <param name="maxMergeDocs" value="100000" />
      <param name="mergeFactor" value="10" />
      <param name="maxFieldLength" value="10000" />
      <param name="bufferSize" value="100" />
      <param name="cacheSize" value="10000" />
      <param name="forceConsistencyCheck" value="true" />
      <param name="autoRepair" value="true" />
      <param name="queryClass" value="org.apache.jackrabbit.core.query.QueryImpl" />
      <param name="respectDocumentOrder" value="true" />
      <param name="resultFetchSize" value="100" />
      <param name="extractorPoolSize" value="3" />
      <param name="extractorTimeout" value="100" />
      <param name="extractorBackLogSize" value="100" />
      <!-- needed to highlight the searched term -->
      <param name="supportHighlighting" value="false"/>
      <!-- custom provider for getting an HTML excerpt in a query result with rep:excerpt() -->
      <!-- <param name="excerptProviderClass" value="info.magnolia.jackrabbit.lucene.SearchHTMLExcerpt"/> -->
      
      <param name="useSimpleFSDirectory" value="false" />
      <param name="directoryManagerClass" value="org.apache.jackrabbit.core.query.lucene.directory.RAMDirectoryManager" />
      
    </SearchIndex>
    <WorkspaceSecurity>
      <AccessControlProvider class="info.magnolia.cms.core.MagnoliaAccessProvider" />
    </WorkspaceSecurity>
  </Workspace>
  <Versioning rootPath="${rep.home}/version">
    <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
      <param name="path" value="${rep.home}/workspaces/version" />
    </FileSystem>
    <PersistenceManager class="org.apache.jackrabbit.core.persistence.bundle.H2PersistenceManager">
      <param name="url" value="jdbc:h2:${rep.home}/version/db;AUTO_SERVER=TRUE" />
      <param name="schemaObjectPrefix" value="version_" />
    </PersistenceManager>
  </Versioning>
</Repository>

Your "WEB-INF/config/default/magnolia.properties" file should contain:

magnolia.repositories.jackrabbit.config.default=WEB-INF/config/repo-conf/jackrabbit-bundle-h2-default.xml
magnolia.repositories.jackrabbit.config.system=WEB-INF/config/repo-conf/jackrabbit-bundle-h2-system.xml
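
Before starting the instance, it can be worth checking that both properties are defined and point to existing files. Here is a minimal sketch with hypothetical function names. It uses a very simple key=value parser (no line continuations or escapes) and assumes that the configured paths are relative to the webapp root folder.

```python
from pathlib import Path

REQUIRED_KEYS = (
    "magnolia.repositories.jackrabbit.config.default",
    "magnolia.repositories.jackrabbit.config.system",
)


def parse_properties(text):
    """Parse simple key=value lines; comments and blank lines are skipped."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def missing_jackrabbit_configs(properties_text, webapp_root):
    """Return the required keys that are undefined or point to a missing file.

    Assumption: the property values are paths relative to webapp_root.
    """
    props = parse_properties(properties_text)
    missing = []
    for key in REQUIRED_KEYS:
        value = props.get(key)
        if value is None or not (Path(webapp_root) / value).is_file():
            missing.append(key)
    return missing
```

An empty list means that both configuration files were found.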

Get it up and running

Known error on the first startup

2019-04-08 15:31:12,434 INFO  org.apache.jackrabbit.core.RepositoryImpl         : workspace 'contacts' initialized
2019-04-08 15:31:14,516 ERROR info.magnolia.cms.security.JCRSessionOp           : Failed to execute  load repository [config] path [/server/security]. session operation with Unable to provision, see the following errors:

1) Error injecting constructor, java.lang.IllegalStateException: This is a proxy used to support circular references. The object we're proxying is not constructed yet. Please wait until after injection has completed to use this object.
  at info.magnolia.cms.core.version.VersionManager.<init>(VersionManager.java:72)
  at info.magnolia.objectfactory.guice.GuiceComponentConfigurationModule.bindImplementation(GuiceComponentConfigurationModule.java:155) (via modules: com.google.inject.util.Modules$OverrideModule -> com.google.inject.util.Modules$OverrideModule -> info.magnolia.objectfactory.guice.GuiceComponentProviderBuilder$1 -> info.magnolia.objectfactory.guice.GuiceComponentConfigurationModule)
  while locating info.magnolia.cms.core.version.VersionManager

1 error
com.google.inject.ProvisionException: Unable to provision, see the following errors:

1) Error injecting constructor, java.lang.IllegalStateException: This is a proxy used to support circular references. The object we're proxying is not constructed yet. Please wait until after injection has completed to use this object.
  at info.magnolia.cms.core.version.VersionManager.<init>(VersionManager.java:72)
  at info.magnolia.objectfactory.guice.GuiceComponentConfigurationModule.bindImplementation(GuiceComponentConfigurationModule.java:155) (via modules: com.google.inject.util.Modules$OverrideModule -> com.google.inject.util.Modules$OverrideModule -> info.magnolia.objectfactory.guice.GuiceComponentProviderBuilder$1 -> info.magnolia.objectfactory.guice.GuiceComponentConfigurationModule)
  while locating info.magnolia.cms.core.version.VersionManager

1 error
	at com.google.inject.internal.InternalProvisionException.toProvisionException(InternalProvisionException.java:226) ~[guice-4.2.0.jar:?]
	at com.google.inject.internal.InjectorImpl$1.get(InjectorImpl.java:1053) ~[guice-4.2.0.jar:?]
	at com.google.inject.internal.InjectorImpl.getInstance(InjectorImpl.java:1086) ~[guice-4.2.0.jar:?]
	at info.magnolia.objectfactory.guice.GuiceComponentProvider.getComponent(GuiceComponentProvider.java:109) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.objectfactory.Components.getComponent(Components.java:107) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.cms.core.version.MgnlVersioningNodeWrapper.<init>(MgnlVersioningNodeWrapper.java:69) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.cms.core.version.MgnlVersioningContentDecorator.wrapNode(MgnlVersioningContentDecorator.java:49) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.jcr.decoration.ContentDecoratorSessionWrapper.wrapNode(ContentDecoratorSessionWrapper.java:177) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.jcr.decoration.ContentDecoratorSessionWrapper.getNode(ContentDecoratorSessionWrapper.java:124) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.jcr.wrapper.DelegateSessionWrapper.getNode(DelegateSessionWrapper.java:177) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.jcr.decoration.ContentDecoratorSessionWrapper.nodeExists(ContentDecoratorSessionWrapper.java:115) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.jcr.wrapper.DelegateSessionWrapper.nodeExists(DelegateSessionWrapper.java:272) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.jcr.decoration.ContentDecoratorSessionWrapper.nodeExists(ContentDecoratorSessionWrapper.java:115) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.jcr.wrapper.DelegateSessionWrapper.nodeExists(DelegateSessionWrapper.java:272) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.jcr.decoration.ContentDecoratorSessionWrapper.nodeExists(ContentDecoratorSessionWrapper.java:115) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.objectfactory.ObservedComponentFactory$2.doExec(ObservedComponentFactory.java:143) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.objectfactory.ObservedComponentFactory$2.doExec(ObservedComponentFactory.java:138) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.cms.security.SilentSessionOp.exec(SilentSessionOp.java:70) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.context.MgnlContext.doInSystemContext(MgnlContext.java:378) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.context.MgnlContext.doInSystemContext(MgnlContext.java:356) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.objectfactory.ObservedComponentFactory.load(ObservedComponentFactory.java:138) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.objectfactory.ObservedComponentFactory.<init>(ObservedComponentFactory.java:94) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.objectfactory.ObservedComponentFactory.<init>(ObservedComponentFactory.java:86) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.cms.security.SecuritySupportObservedComponentFactory.<init>(SecuritySupportObservedComponentFactory.java:55) ~[magnolia-core-5.7.2.jar:?]
	at jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:?]
	at jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:?]
	at jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:?]
	at java.lang.reflect.Constructor.newInstance(Constructor.java:488) ~[?:?]
	at java.lang.Class.newInstance(Class.java:560) ~[?:?]
	at info.magnolia.objectfactory.ComponentFactoryUtil.createFactory(ComponentFactoryUtil.java:54) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.objectfactory.guice.GuiceUtils$2.get(GuiceUtils.java:82) ~[magnolia-core-5.7.2.jar:?]
	at com.google.inject.util.Providers$GuicifiedProvider.get(Providers.java:121) ~[guice-4.2.0.jar:?]
	at com.google.inject.internal.ProviderInternalFactory.provision(ProviderInternalFactory.java:85) ~[guice-4.2.0.jar:?]
	at com.google.inject.internal.InternalFactoryToInitializableAdapter.provision(InternalFactoryToInitializableAdapter.java:57) ~[guice-4.2.0.jar:?]
	at com.google.inject.internal.ProviderInternalFactory.circularGet(ProviderInternalFactory.java:59) ~[guice-4.2.0.jar:?]
	at com.google.inject.internal.InternalFactoryToInitializableAdapter.get(InternalFactoryToInitializableAdapter.java:47) ~[guice-4.2.0.jar:?]
	at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40) ~[guice-4.2.0.jar:?]
	at com.google.inject.internal.SingletonScope$1.get(SingletonScope.java:148) ~[guice-4.2.0.jar:?]
	at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:39) ~[guice-4.2.0.jar:?]
	at com.google.inject.internal.SingleParameterInjector.inject(SingleParameterInjector.java:42) ~[guice-4.2.0.jar:?]
	at com.google.inject.internal.SingleParameterInjector.getAll(SingleParameterInjector.java:65) ~[guice-4.2.0.jar:?]
	at com.google.inject.internal.ConstructorInjector.provision(ConstructorInjector.java:113) ~[guice-4.2.0.jar:?]
	at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:91) ~[guice-4.2.0.jar:?]
	at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:306) ~[guice-4.2.0.jar:?]
	at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40) ~[guice-4.2.0.jar:?]
	at com.google.inject.internal.SingletonScope$1.get(SingletonScope.java:148) ~[guice-4.2.0.jar:?]
	at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:39) ~[guice-4.2.0.jar:?]
	at com.google.inject.internal.FactoryProxy.get(FactoryProxy.java:62) ~[guice-4.2.0.jar:?]
	at com.google.inject.internal.InternalInjectorCreator.loadEagerSingletons(InternalInjectorCreator.java:211) ~[guice-4.2.0.jar:?]
	at com.google.inject.internal.InternalInjectorCreator.injectDynamically(InternalInjectorCreator.java:182) ~[guice-4.2.0.jar:?]
	at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:109) ~[guice-4.2.0.jar:?]
	at com.google.inject.Guice.createInjector(Guice.java:87) ~[guice-4.2.0.jar:?]
	at com.google.inject.Guice.createInjector(Guice.java:78) ~[guice-4.2.0.jar:?]
	at info.magnolia.objectfactory.guice.GuiceComponentProviderBuilder.build(GuiceComponentProviderBuilder.java:149) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.objectfactory.guice.GuiceComponentProviderBuilder.build(GuiceComponentProviderBuilder.java:196) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.cms.beans.config.ConfigLoader.load(ConfigLoader.java:142) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.init.MagnoliaServletContextListener$1.doExec(MagnoliaServletContextListener.java:259) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.context.MgnlContext$VoidOp.exec(MgnlContext.java:407) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.context.MgnlContext$VoidOp.exec(MgnlContext.java:404) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.context.MgnlContext.doInSystemContext(MgnlContext.java:378) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.init.MagnoliaServletContextListener.startServer(MagnoliaServletContextListener.java:256) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.init.MagnoliaServletContextListener.contextInitialized(MagnoliaServletContextListener.java:182) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.init.MagnoliaServletContextListener.contextInitialized(MagnoliaServletContextListener.java:128) ~[magnolia-core-5.7.2.jar:?]
	at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:4627) ~[catalina.jar:9.0.8]
	at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5091) ~[catalina.jar:9.0.8]
	at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183) ~[catalina.jar:9.0.8]
	at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1427) ~[catalina.jar:9.0.8]
	at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1417) ~[catalina.jar:9.0.8]
	at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
	at org.apache.tomcat.util.threads.InlineExecutorService.execute(InlineExecutorService.java:75) ~[tomcat-util.jar:9.0.8]
	at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:140) ~[?:?]
	at org.apache.catalina.core.ContainerBase.startInternal(ContainerBase.java:943) ~[catalina.jar:9.0.8]
	at org.apache.catalina.core.StandardHost.startInternal(StandardHost.java:839) ~[catalina.jar:9.0.8]
	at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183) ~[catalina.jar:9.0.8]
	at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1427) ~[catalina.jar:9.0.8]
	at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1417) ~[catalina.jar:9.0.8]
	at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
	at org.apache.tomcat.util.threads.InlineExecutorService.execute(InlineExecutorService.java:75) ~[tomcat-util.jar:9.0.8]
	at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:140) ~[?:?]
	at org.apache.catalina.core.ContainerBase.startInternal(ContainerBase.java:943) ~[catalina.jar:9.0.8]
	at org.apache.catalina.core.StandardEngine.startInternal(StandardEngine.java:258) ~[catalina.jar:9.0.8]
	at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183) ~[catalina.jar:9.0.8]
	at org.apache.catalina.core.StandardService.startInternal(StandardService.java:422) ~[catalina.jar:9.0.8]
	at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183) ~[catalina.jar:9.0.8]
	at org.apache.catalina.core.StandardServer.startInternal(StandardServer.java:770) ~[catalina.jar:9.0.8]
	at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183) ~[catalina.jar:9.0.8]
	at org.apache.catalina.startup.Catalina.start(Catalina.java:682) ~[catalina.jar:9.0.8]
	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
	at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
	at java.lang.reflect.Method.invoke(Method.java:564) ~[?:?]
	at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:350) ~[bootstrap.jar:9.0.8]
	at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:492) ~[bootstrap.jar:9.0.8]
Caused by: java.lang.IllegalStateException: This is a proxy used to support circular references. The object we're proxying is not constructed yet. Please wait until after injection has completed to use this object.
	at com.google.common.base.Preconditions.checkState(Preconditions.java:501) ~[guava-23.1-jre.jar:?]
	at com.google.inject.internal.DelegatingInvocationHandler.invoke(DelegatingInvocationHandler.java:34) ~[guice-4.2.0.jar:?]
	at com.sun.proxy.$Proxy35.getUserManager(Unknown Source) ~[?:?]
	at info.magnolia.cms.security.Security.getSystemUser(Security.java:81) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.context.AbstractContext.getUser(AbstractContext.java:64) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.context.MgnlContext.getUser(MgnlContext.java:91) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.jcr.wrapper.MgnlPropertySettingContentDecorator.getCurrentUserName(MgnlPropertySettingContentDecorator.java:707) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.jcr.wrapper.MgnlPropertySettingContentDecorator.setCreatedDate(MgnlPropertySettingContentDecorator.java:652) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.jcr.wrapper.MgnlPropertySettingContentDecorator.setCreatedDate(MgnlPropertySettingContentDecorator.java:621) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.jcr.wrapper.MgnlPropertySettingNodeWrapper.setCreatedProperty(MgnlPropertySettingNodeWrapper.java:282) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.jcr.wrapper.MgnlPropertySettingNodeWrapper.addNode(MgnlPropertySettingNodeWrapper.java:216) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.jcr.wrapper.DelegateNodeWrapper.addNode(DelegateNodeWrapper.java:129) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.jcr.decoration.ContentDecoratorNodeWrapper.addNode(ContentDecoratorNodeWrapper.java:131) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.audit.MgnlAuditLoggingContentDecoratorNodeWrapper.addNode(MgnlAuditLoggingContentDecoratorNodeWrapper.java:84) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.jcr.wrapper.DelegateNodeWrapper.addNode(DelegateNodeWrapper.java:129) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.jcr.decoration.ContentDecoratorNodeWrapper.addNode(ContentDecoratorNodeWrapper.java:131) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.cms.core.version.BaseVersionManager.createInitialStructure(BaseVersionManager.java:176) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.cms.core.version.VersionManager.<init>(VersionManager.java:74) ~[magnolia-core-5.7.2.jar:?]
	at info.magnolia.cms.core.version.VersionManager$$FastClassByGuice$$d6682ab7.newInstance(<generated>) ~[magnolia-core-5.7.2.jar:?]
	at com.google.inject.internal.DefaultConstructionProxyFactory$FastClassProxy.newInstance(DefaultConstructionProxyFactory.java:89) ~[guice-4.2.0.jar:?]
	at com.google.inject.internal.ConstructorInjector.provision(ConstructorInjector.java:114) ~[guice-4.2.0.jar:?]
	at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:91) ~[guice-4.2.0.jar:?]
	at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:306) ~[guice-4.2.0.jar:?]
	at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40) ~[guice-4.2.0.jar:?]
	at com.google.inject.internal.SingletonScope$1.get(SingletonScope.java:148) ~[guice-4.2.0.jar:?]
	at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:39) ~[guice-4.2.0.jar:?]
	at com.google.inject.internal.InjectorImpl$1.get(InjectorImpl.java:1050) ~[guice-4.2.0.jar:?]
	... 91 more
2019-04-08 15:31:14,539 INFO  info.magnolia.context.LifeTimeJCRSessionUtil      : Will handle lifetime sessions because the system context is of type interface info.magnolia.context.ThreadDependentSystemContext
...

Shut it down, then start it again

After this step, your repository should be split into two separate repositories with the following physical directory structure:

repositories
├── magnolia
│   ├── repository
│   │   ├── datastore
│   │   │   ├── 05
│   │   │   ├── 06
│   │   │   ├── 08
│   │   │   ├── 09
│   │   │   ├── 0a
│   │   ├── meta
│   │   │   └── rootUUID
│   │   ├── namespaces
│   │   │   ├── ns_idx.properties
│   │   │   └── ns_reg.properties
│   │   ├── nodetypes
│   │   │   └── custom_nodetypes.xml
│   │   └── privileges
│   └── workspaces
│       ├── category
│       │   ├── db.mv.db
│       │   ├── default
│       │   ├── index
│       │   └── workspace.xml
│       ├── contacts
│       │   ├── db.mv.db
│       │   ├── default
│       │   ├── index
│       │   └── workspace.xml
│       ├── dam
│       │   ├── db.mv.db
│       │   ├── default
│       │   ├── index
│       │   └── workspace.xml
│       ├── default
│       │   ├── db.mv.db
│       │   ├── default
│       │   ├── index
│       │   └── workspace.xml
│       ├── googleSitemaps
│       │   ├── db.mv.db
│       │   ├── default
│       │   ├── index
│       │   └── workspace.xml
│       ├── imaging
│       │   ├── db.mv.db
│       │   ├── default
│       │   ├── index
│       │   └── workspace.xml
│       ├── keystore
│       │   ├── db.mv.db
│       │   ├── default
│       │   ├── index
│       │   └── workspace.xml
│       ├── messages
│       │   ├── db.mv.db
│       │   ├── default
│       │   ├── index
│       │   └── workspace.xml
│       ├── mgnlSystem
│       │   ├── db.mv.db
│       │   ├── default
│       │   ├── index
│       │   └── workspace.xml
│       ├── mgnlVersion
│       │   ├── db.mv.db
│       │   ├── default
│       │   ├── index
│       │   └── workspace.xml
│       ├── observation
│       │   ├── db.mv.db
│       │   ├── default
│       │   ├── index
│       │   └── workspace.xml
│       ├── profiles
│       │   ├── db.mv.db
│       │   ├── default
│       │   ├── index
│       │   └── workspace.xml
│       ├── resources
│       │   ├── db.mv.db
│       │   ├── default
│       │   ├── index
│       │   └── workspace.xml
│       ├── scripts
│       │   ├── db.mv.db
│       │   ├── default
│       │   ├── index
│       │   └── workspace.xml
│       ├── tasks
│       │   ├── db.mv.db
│       │   ├── default
│       │   ├── index
│       │   └── workspace.xml
│       ├── tours
│       │   ├── db.mv.db
│       │   ├── default
│       │   ├── index
│       │   └── workspace.xml
│       └── website
│           ├── db.mv.db
│           ├── default
│           ├── index
│           └── workspace.xml
└── magnolia_system
    ├── repository
    │   ├── meta
    │   │   └── rootUUID
    │   ├── namespaces
    │   │   ├── ns_idx.properties
    │   │   └── ns_reg.properties
    │   ├── nodetypes
    │   │   └── custom_nodetypes.xml
    │   └── privileges
    └── workspaces
        ├── config
        │   ├── db.mv.db
        │   ├── default
        │   └── workspace.xml
        ├── default
        │   ├── db.mv.db
        │   ├── default
        │   └── workspace.xml
        ├── mgnlSystem
        │   ├── db.mv.db
        │   ├── default
        │   └── workspace.xml
        ├── mgnlVersion
        │   ├── db.mv.db
        │   ├── default
        │   └── workspace.xml
        ├── usergroups
        │   ├── db.mv.db
        │   ├── default
        │   └── workspace.xml
        ├── userroles
        │   ├── db.mv.db
        │   ├── default
        │   └── workspace.xml
        └── users
            ├── db.mv.db
            ├── default
            └── workspace.xml
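
To verify the result without reading the whole tree by eye, you could use a minimal sketch like the one below. The function name is hypothetical, and the expected file names ("db.mv.db" and "workspace.xml") are taken from the listing above, so they may differ with another persistence manager or H2 version.

```python
from pathlib import Path

# File names as they appear in the directory listing above.
EXPECTED_FILES = ("db.mv.db", "workspace.xml")


def incomplete_workspaces(
    repositories_home,
    repository="magnolia_system",
    workspaces=("config", "users", "userroles", "usergroups"),
):
    """Return {workspace: [missing files]} for workspaces lacking expected files."""
    root = Path(repositories_home) / repository / "workspaces"
    report = {}
    for name in workspaces:
        missing = [f for f in EXPECTED_FILES if not (root / name / f).exists()]
        if missing:
            report[name] = missing
    return report
```

An empty dictionary means that every moved workspace has its database file and its "workspace.xml" in the new repository.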

A note on Magnolia Repository

Magnolia automatically creates a new repository structure under the "repositoryHome" folder configured in the "WEB-INF/config/default/repositories.xml" file, as follows:

magnolia_system
    └── repository
        ├── meta
        │   └── rootUUID
        ├── namespaces
        │   ├── ns_idx.properties
        │   └── ns_reg.properties
        ├── nodetypes
        │   └── custom_nodetypes.xml
        └── privileges

Within "rootUUID" file you will see its default root UUID as "cafebabe-cafe-babe-cafe-babecafebabe".

The "ns_idx.properties" and "ns_reg.properties" files contain the names and hash codes of the repository's registered namespaces.

The "custom_nodetypes.xml" file contains all your node type definitions. Refer to the Jackrabbit NodeType definition and Jackrabbit NodeType Visualization for more information.

Hope this helps and have a good day!



2 Comments

  1. Hi there, great manual. I'm curious the values you use for the bundleCacheSize parameter in the persistence manager config are huge. Are the values not in MB there (https://jackrabbit.apache.org/api/2.8/org/apache/jackrabbit/core/persistence/bundle/AbstractBundlePersistenceManager.html#setBundleCacheSize(java.lang.String)? Or I'm wrong? Nicole


    1. Thank you Nicole Stutz, you are right, the number is in Megabytes, I've updated the documentation as your suggestion. Thank you for using and helping us.