Importing and exporting JCR data

Importing and exporting content and data is useful when migrating content from one system to another, taking a backup or bringing in a custom taxonomy. The default file format for content exports is JCR System View XML. The tools available range from a quick right-click export to scripting and writing custom import handlers depending on the frequency of use.

Export file formats

JCR System View XML

Magnolia exports JCR data to JCR System View XML format by default. Imported XML files must adhere to the same format. The exported XML reflects the hierarchy of the data in the repository. The file name matches the path of the data in the repository, such as website.travel.tour.xml. When you export a node, its children are also exported.
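The naming convention can be sketched in a few lines of Java. This is a minimal illustration, assuming the file name is the workspace name followed by the node path with slashes turned into dots, as in the website.travel.tour.xml example. The class and method names are hypothetical and not part of Magnolia's API, and the handling of the root path is an assumption.

```java
// Hypothetical helper for illustration; not part of Magnolia's API.
final class ExportFileName {

    // Builds a name such as "website.travel.tour.xml" from a workspace and a node path.
    static String of(String workspace, String nodePath) {
        // Drop leading and trailing slashes, then turn the remaining slashes into dots.
        String trimmed = nodePath.replaceAll("^/+|/+$", "");
        String dotted = trimmed.replace('/', '.');
        // Assumption: an export of the root node is named after the workspace only.
        return dotted.isEmpty() ? workspace + ".xml" : workspace + "." + dotted + ".xml";
    }

    public static void main(String[] args) {
        System.out.println(of("website", "/travel/tour")); // prints website.travel.tour.xml
    }
}
```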

YAML

You can export JCR data from the config workspace to YAML. This is useful for moving from repository-based configuration to file-based configuration.

YAML export is only available for definition items. The Export to YAML action is disabled for other items.

When using YAML files originating from export, you might have to change the file name if you want to use it within a module; for instance the name of the YAML file defining an app must reflect the app name with the pattern <app-name>.yaml.

Tools

Import and export actions

You can import and export nodes from most workspaces with the Import and Export actions. They are available in the action bar and in the context menu (right-click).

The actions use the import and export commands to import and export XML. When you import a file, the imported nodes become children of the selected parent node. The commands are configured in the ui-admincentral module and implemented in the ImportCommand and ExportCommand classes.

To export XML:

Select a node to export the node and its children.
Click Export.

To import XML:

Select a parent node under which you want to import the nodes.
Click Import.

If a node in the incoming file has an ID (UUID) that is already used in the repository, the imported UUID will be changed.

Zip files

You can upload a ZIP file in the Asset app. See the Export tool on how to export content to a ZIP file.

To import a ZIP file:

Click Upload ZIP archive.
Browse to the file.
In Encoding, select UTF-8 (Windows) or CP437 (Mac) depending on what system the ZIP file was created on.
Click Save.
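The Encoding setting matters because entry names inside a ZIP archive are stored as bytes and have to be decoded with the right character set. The following sketch uses only the standard java.util.zip classes and is independent of how the upload is implemented in Magnolia. The class name ZipNames is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper for illustration; not part of Magnolia's API.
final class ZipNames {

    // Lists the entry names of a ZIP archive, decoding them with the given charset.
    static List<String> list(byte[] zipBytes, Charset encoding) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(zipBytes), encoding)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }
}
```

If file names with accented or non-Latin characters come out garbled after the upload, the archive was most likely read with the wrong encoding.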

Import and export tools

In Magnolia 5.4.6+ the JCR Tools app replaces the earlier legacy apps Export app, Import app and Query app.

The import and export tools allow you to operate on data in all Magnolia workspaces, including those where the Export and Import actions are not available in the workspace-specific app, for example in the Forums and Security apps.

To export:

Select the Workspace where the content resides.
In Base path, type the path to the node to export.
Select Format XML if you want the exported XML to be formatted for readability.
Select the type of Compression: XML (no compression), ZIP or GZIP.
Click Execute.
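The compression options differ only in how the exported XML is wrapped. As a rough illustration, assuming the GZIP option produces a standard GZIP stream around the XML, the sketch below compresses and restores an XML string with the JDK's java.util.zip classes. It does not show Magnolia's implementation, and the class name GzipExport is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for illustration; not part of Magnolia's API.
final class GzipExport {

    // Wraps the XML text in a GZIP stream.
    static byte[] compress(String xml) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(xml.getBytes(StandardCharsets.UTF_8));
        }
        return buffer.toByteArray();
    }

    // Restores the original XML text from a GZIP stream (requires Java 9 or later).
    static String decompress(byte[] gzipped) throws IOException {
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(gzipped))) {
            return new String(gzip.readAllBytes(), StandardCharsets.UTF_8);
        }
    }
}
```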

To import:

Select the Workspace into which the content should be imported.
In Base path, type the path into which content should be imported.
Browse to the file to import.
Select how to handle conflicting UUIDs. These options only apply when an identical UUID already exists in the repository.
- Generate a new id for imported nodes: a new UUID is generated for the nodes being imported.
- Only import if no existing node (Magnolia 5.4.6+): the node is imported, with a newly generated UUID, only if it does not already exist.
- Remove existing nodes with the same id: existing nodes with the same UUID as an imported node are deleted before the import.
- Replace existing nodes with the same id: existing nodes with the same UUID as an imported node are replaced with the imported nodes.
Click Execute.
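The difference between the UUID options is easier to see in a toy model. The sketch below is not Magnolia or JCR code: it keeps a map from UUID to a simplified node and mimics three of the behaviors, assuming the semantics usually described for javax.jcr.ImportUUIDBehavior (a removed node is re-created at the import location, a replaced node keeps its original position). The "Only import if no existing node" option is left out, and all class and method names are hypothetical.

```java
import java.util.Map;
import java.util.UUID;

// Toy in-memory model for illustration only; not Magnolia or JCR API.
final class UuidConflictModel {

    enum Behavior { CREATE_NEW, REMOVE_EXISTING, REPLACE_EXISTING }

    static final class Node {
        final String path;
        final String title;

        Node(String path, String title) {
            this.path = path;
            this.title = title;
        }
    }

    // Imports one node under parentPath and returns the UUID it is stored under.
    static String importNode(Map<String, Node> repository, String parentPath, String uuid,
                             String name, String title, Behavior behavior) {
        Node incoming = new Node(parentPath + "/" + name, title);
        Node existing = repository.get(uuid);
        if (existing == null) {
            // No conflict: the behavior setting is irrelevant.
            repository.put(uuid, incoming);
            return uuid;
        }
        switch (behavior) {
            case CREATE_NEW: {
                // Keep the existing node and store the incoming one under a fresh UUID.
                String freshUuid = UUID.randomUUID().toString();
                repository.put(freshUuid, incoming);
                return freshUuid;
            }
            case REMOVE_EXISTING:
                // Delete the existing node, then import at the new location.
                repository.remove(uuid);
                repository.put(uuid, incoming);
                return uuid;
            default:
                // REPLACE_EXISTING: the incoming content takes over the existing node's position.
                repository.put(uuid, new Node(existing.path, title));
                return uuid;
        }
    }
}
```

In this model, only the first option leaves the existing node untouched. The other two keep the UUID but discard the old content.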

Content Translation app

In the Content Translation app you can import and export page content in Excel, CSV, ZIP and Google Spreadsheet formats. When you open the exported Google Spreadsheet in Google Drive, it is machine-translated automatically.

Scripting

You can export content from Magnolia using a Groovy script. This example exports the about page and its children. The script is equivalent to selecting the about page and using the Export command.

import info.magnolia.importexport.DataTransporter

// Look up the page to export in the website workspace.
hm = ctx.getHierarchyManager('website')
aboutRoot = hm.getContent('/travel/about')

// withStream closes the output stream even if the export fails.
new FileOutputStream('C:/test/export/about.xml').withStream { xmlFileOutput ->
  DataTransporter.executeExport(xmlFileOutput, false, false,
    hm.getWorkspace().getSession(), aboutRoot.getHandle(), 'website',
    DataTransporter.XML)
}

Similarly, you can import content with a Groovy script. This example imports the XML for the careers page and its children. The script is equivalent to selecting the parent page about and using the Import command.

import info.magnolia.importexport.DataTransporter
import javax.jcr.ImportUUIDBehavior

// The imported nodes become children of this page.
hm = ctx.getHierarchyManager('website')
aboutRoot = hm.getContent('/travel/about')
xmlFile = new File('C:/test/export/about.xml')

// Import under /travel/about, generating new UUIDs for the imported nodes.
DataTransporter.importFile(xmlFile, 'website', aboutRoot.getHandle(), false,
  ImportUUIDBehavior.IMPORT_UUID_CREATE_NEW, true, true)

For more information on Groovy, see Groovy module. The DataTransporter utility class documents the parameters of the executeExport and importFile methods.

Programmatically

The import and export functionality is implemented in the info.magnolia.importexport package. This implementation is mostly contained in the DataTransporter and PropertiesImportExport classes. You can invoke methods in these classes from your own class.

Here is an example of calling the executeExport method:

File xmlFile = new File(folder.getAbsoluteFile(), xmlName);
FileOutputStream fos = new FileOutputStream(xmlFile);
try {
  DataTransporter.executeExport(fos, false, false,
    MgnlContext.getHierarchyManager(repository).getWorkspace().getSession(),
    node.getHandle(), repository, DataTransporter.XML);
} finally {
  // Close the stream whether or not the export succeeded.
  IOUtils.closeQuietly(fos);
}

These classes will not complete the import for any UUIDs that are identical to existing UUIDs.

Use cases

Here are cases when importing and exporting is useful.

Site migration

You can accomplish site migration in a number of ways.

For smaller sites (fewer than 300 pages), you can simply copy the page content and paste it into the editor.
For larger sites, scripting is better than copying and pasting. The script examples above export from one site and import to another. The script can also add Magnolia-specific metadata such as whether a page should be visible in navigation.
Import non-Magnolia content. Store the content in XML files that adhere to the JCR System View XML Mapping format. If the XML file does not adhere to this format, convert it first. You can do that with a conversion script. The conversion script should identify content types in the file and transform them into the format that Magnolia can import.
Import data. Create a content app to manage structured data that is independent from page content, such as addresses, employees and client references.
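A conversion step for non-Magnolia content can be sketched as follows. This is a minimal example, assuming each source record becomes one node with string properties. It writes the sv:node, sv:property and sv:value elements of the JCR System View format. The class name is hypothetical, the node type is supplied by the caller, and a real import may need further properties that your content types require.

```java
import java.util.Map;

// Hypothetical converter for illustration; not part of Magnolia's API.
final class SystemViewWriter {

    static final String SV_NS = "http://www.jcp.org/jcr/sv/1.0";

    // Renders one node with its primary type and string properties as System View XML.
    static String toSystemView(String nodeName, String primaryType, Map<String, String> stringProps) {
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        xml.append("<sv:node sv:name=\"").append(escape(nodeName))
           .append("\" xmlns:sv=\"").append(SV_NS).append("\">\n");
        appendProperty(xml, "jcr:primaryType", "Name", primaryType);
        for (Map.Entry<String, String> property : stringProps.entrySet()) {
            appendProperty(xml, property.getKey(), "String", property.getValue());
        }
        xml.append("</sv:node>\n");
        return xml.toString();
    }

    private static void appendProperty(StringBuilder xml, String name, String type, String value) {
        xml.append("  <sv:property sv:name=\"").append(escape(name))
           .append("\" sv:type=\"").append(type).append("\">\n")
           .append("    <sv:value>").append(escape(value)).append("</sv:value>\n")
           .append("  </sv:property>\n");
    }

    // Escapes the characters that are not allowed verbatim in XML text and attributes.
    private static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;")
                   .replace(">", "&gt;").replace("\"", "&quot;");
    }
}
```

Child nodes would be written as nested sv:node elements inside the parent, which is how the format reflects the repository hierarchy.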

Backup

You can back up content by exporting it to XML and storing the files in a disaster recovery system. The file name is the path of the exported data, which makes the files easy to identify.

The Backup module is an alternative to file system and database backup solutions. With the module you can take manual and scheduled backups.

Importing a taxonomy

How to import tags depends on the size and format of the taxonomy. It also depends on whether you need to do it once or repeatedly. If the taxonomy does not need to be added repeatedly and its size is reasonable, create the tags manually in the Categories app. If the taxonomy is large, import the tags as mgnl:category content type into the category workspace and use the Categories app to manage them, or create your own content app and content types.

Taxonomy size	Import frequency	Recommendation
Small	Once	Create the taxonomy by hand. Use the existing `mgnl:category` content type in the Categories app. If that content type does not work for you, register a new content type in your module descriptor. While you are at it, register a new workspace too.
Large	Once	Write a Groovy script.
Large	Repeatedly	Write a Groovy script and create a command that executes it, so that editors can run the process at will or you can schedule it.

Copying production data to a test environment

Copying production data to a test or development environment is a task you may need to do regularly. You should test new templates and features with realistic content before releasing them to production. Here are strategies for prod-to-test exporting.

Option 1: Clone the production instance

Transfer the data and the JCR Datastore (all binaries in the file system) to the test instance.

In production:

Dump the SQL database.
Copy the JCR Datastore folder.
If needed, copy the repository folder to preserve the Apache Lucene search index. This is generally not required because the index and the repository folder are recreated on startup, but it saves time.

In test:

Load the database dump file.
Copy the JCR Datastore folder to the configured place.
Replace the repository folder.
Start the instance.

Pros	Cons
1:1 copy of all data. Everything is identical to production. Very fast. A SQL dump is much faster than JCR export. No running instance is needed as the data is loaded directly into the SQL database.	It is not possible to get only a part of the data. Data could be too big for test. All data is usually needed in a staging environment but not in test. Configuration is also an identical copy so it needs to be changed after startup. For example, subscribers still point to the production instance. This is usually solved with an additional Magnolia module that is specific to the test environment. Add such a module .jar to the WEB-INF/lib directory or even the Tomcat lib folder. The module jar changes all configurations needed for the test environment.

Option 2: Use the backup and restore JSP scripts

Use the Backup and restore JSP scripts. This option is a quick win as you can use it without any development and without having to restart the system. Export only the data you need in test. Define the exported content in the JSP, for example:

backup.jsp

public void run() {
   MgnlContext.setInstance(MgnlContext.getSystemContext());
   try {
      backupChildren(ContentRepository.WEBSITE, "/travel");
      backupChildren("dms", "/");
      backupChildren("data", "/products");
      backupChildren(ContentRepository.USERS, "/admin");
      backupChildren(ContentRepository.USERS, "/system");
      backupChildren(ContentRepository.USER_GROUPS, "/");
      backupChildren(ContentRepository.USER_ROLES, "/");
      // backupChildren(ContentRepository.CONFIG, "/modules");
      // backupChildren(ContentRepository.CONFIG, "/server");
   } catch (Exception e) {
      logMsg("can't backup", e);
   } finally {
      // nothing to do here
   }
   logMsg("backup completed");
}

In production:

Place the backup JSP script in, for example, docroot.
Execute the script by requesting its URL. The script creates JCR XML exports in the webapp/backup folder.
Use, for example, a shell script to copy the exports from the production server to the test server.

In test:

Place the restore JSP script in docroot.
Execute the script by requesting its URL. The script will import all XMLs automatically.

Pros	Cons
Can easily be configured to export only parts of the content or only certain workspaces. Is a very stable option as the script can call the garbage collector explicitly and import smaller export files. Easy to use since the JSP can be executed by its URL. Easy to extend for more specific needs. Creates order files for importing the content in the right order again.	Needs a running system. Much slower than a DB dump. It is a JSP script.

Option 3: Use Magnolia's export command

Magnolia's own export command is more flexible but still very comparable to the Jackrabbit JCR import/export tool. You can also use the tool in combination with the Scheduler module or trigger it from any other Java process.

In production:

Trigger the export command regularly with the scheduler and export into the file system. Extend the command so that it transfers the exported data to the test machine.

In test:

Trigger the ImportCommand, which imports the transferred XML files.

Pros	Cons
The command is available out of the box and well tested since we use it extensively in many places. Can be used in various combinations such as with scheduler or with workflow. More flexible than the backup JSP as it is fully integrated and aware of the Magnolia system.	Slower than a database dump. Needs development.
