Drupal 8 - Migrate Content from D6


I would like to share my experience from the last time I implemented the Migration API on Drupal 8. A heads-up before jumping in to write code: keep calm. This topic might look overwhelming at first sight, so I highly recommend that you take a step back, relax, and prepare yourself to read in order to understand how it works.

Migration API is described as a process which extracts data from a source, then transforms the given data to load it into a destination:

[Figure: d8-migration-process]
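
As a rough mental model only (this is plain PHP, not the Migrate API itself, and every name in it is made up for illustration), the extract, transform, load cycle described above could be sketched like this:

```php
<?php

/**
 * Toy extract-transform-load loop. All names here are hypothetical;
 * the real Migrate API uses source, process and destination plugins.
 */
function run_toy_migration(array $source_rows, array $process) {
  $destination = [];
  foreach ($source_rows as $row) {
    $record = [];
    // Each destination field gets a callable that reads the source row.
    foreach ($process as $field => $transform) {
      $record[$field] = $transform($row);
    }
    $destination[] = $record;
  }
  return $destination;
}

// Extract: rows as they might come out of the old database.
$rows = [['title' => ' Hello ', 'status' => '1']];

// Transform: one rule per destination field.
$process = [
  'title' => function (array $row) { return trim($row['title']); },
  'status' => function (array $row) { return (int) $row['status']; },
];

// Load: here the "destination" is just an array.
$migrated = run_toy_migration($rows, $process);
```

In the real API the same three roles are declared in YML rather than written as loops, which is what the rest of this post walks through.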

The task I was given was to migrate all content of a particular content type from a Drupal 6 instance into a fresh Drupal 8 instance. A fun fact is that both instances were using PostgreSQL as their database. Drupal supports it out of the box, so I already had one step ready. It is not only about the database driver, though: it also meant I was able to restore the Drupal 6 database onto the same server where the current Drupal 8 database lives.

Adding credentials to the settings file:

After I created a new database and restored the backup from the Drupal 6 instance, I added a new database key to the settings file to expose those new credentials.

<?php
$databases['drupal6']['default'] = array (
  'database' => 'my_drupal6_database',
  'username' => 'drupal6_user',
  'password' => 's3cr3t',
  'prefix' => '',
  'host' => 'localhost',
  'port' => '5432',
  'namespace' => 'Drupal\\Core\\Database\\Driver\\pgsql',
  'driver' => 'pgsql',
);

It isn't complex: it looks just like your default credentials, but it points to the restored Drupal 6 database.
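
Before running any migration, it can help to confirm that the restored database is actually reachable with those credentials. The following is a minimal sketch in plain PHP, outside Drupal; `drupal6_build_dsn()` is a hypothetical helper (not a Drupal function), and the connection lines are commented out because they assume the pdo_pgsql extension and a running server:

```php
<?php

/**
 * Builds a PDO DSN from a Drupal-style connection array.
 * Hypothetical helper, not part of Drupal.
 */
function drupal6_build_dsn(array $info) {
  return sprintf(
    '%s:host=%s;port=%s;dbname=%s',
    $info['driver'],
    $info['host'],
    $info['port'],
    $info['database']
  );
}

// Same values as the settings entry above.
$info = [
  'driver' => 'pgsql',
  'host' => 'localhost',
  'port' => '5432',
  'database' => 'my_drupal6_database',
  'username' => 'drupal6_user',
  'password' => 's3cr3t',
];

$dsn = drupal6_build_dsn($info);

// Uncomment to actually connect; in Drupal 6 the `node` table holds
// the content, so a row count is a quick sanity check.
// $pdo = new PDO($dsn, $info['username'], $info['password']);
// echo $pdo->query('SELECT COUNT(*) FROM node')->fetchColumn();
```

Inside a bootstrapped Drupal 8 site, the equivalent check would go through `\Drupal\Core\Database\Database::getConnection('default', 'drupal6')`, which uses the key defined in the settings file.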

Custom module

Let's add some custom code to make the magic happen. The module definition has a very important section, the dependencies, since those modules provide powerful tools that make the migration easier and faster.

Setup Info YML (/drupal6_migrate.info.yml):

name: 'Drupal6 Migrate'
type: module
description: 'D6 Migration'
core: 8.x
package: 'custom'
dependencies:
  - migrate
  - migrate_plus
  - migrate_tools
  - migrate_drupal
  - migrate_file

We are now almost ready to start, but here comes a big question I asked myself: where should I put the migration files? It can be confusing. Some websites suggest using the configuration folder (/config/install), which means re-installing your module each time you make a change. Others suggest using the `migrations` folder, but then some "drush" commands won't work, for example those provided by Migrate Tools.

According to the official Migration configuration documentation, the correct place to store migration files is the `migrations` folder in the root of your module. However, at the time I wrote this article, groups still lived in the configuration folder, and those "drush" commands weren't able to recognize YML files within the `migrations` folder. Perhaps that was because I was using migration groups, but I am not sure at this point.

So, to meet our requirements, I made a decision: I chose to put all YML files in the configuration folder. Let's see what a migration group YML file looks like:

Group (/config/install/migrate_plus.migration_group.d6group.yml):

# The machine name of the group.
id: d6group

# A human-friendly label for the group.
label: Drupal6 Content

# A brief description about the group.
description: Shared configuration to migrate content from Drupal 6

# Description of the type of source (Drupal 6, WordPress, etc).
source_type: Drupal 6

# Data that will be shared among all migrations in the group.
shared_configuration:

  # Configuration will be merged into 'source' configuration of each migration.
  source:

    # External database connection added previously in settings.php
    key: drupal6

# migration_group configuration will be removed on module uninstall.
dependencies:
  enforced:
    module:
    - drupal6_migrate

The next thing I did was to create the migration file itself, where I referenced the previous migration group in order to merge its shared configuration, since the Drupal 6 database credentials were defined there. This migration moves the article content from Drupal 6 to article nodes on Drupal 8.

Migration (/config/install/migrate_plus.migration.content_node.yml):

# Migration ID for this migration.
id: content_node

# Label of current migration
label: Content node

# Define which migration group it belongs to.
migration_group: d6group

# Migration tags of current migration.
migration_tags:
  - Drupal 6
  - Content

# Source definition
source:

  # Here we use the plugin that migrates nodes from Drupal 6.
  plugin: d6_node

  # It defines which content type to look up in Drupal 6.
  node_type: article

# Destination definition
destination:

  # Plugin that defines where the content will be stored.
  plugin: entity:node

  # The default bundle sets the content type to create.
  default_bundle: article

# Process is where data from the source is transformed before it is stored in the destination.
process:

  # On the left is the machine name of the destination field; on the right, the source field to read from.
  nid: tnid

  # Matching revision ID.
  vid: vid

  # Grabbing title value.
  title: title

  # The user ID is set to the admin user in all cases.
  uid:
    plugin: default_value
    default_value: 1

  # Retrieve status field.
  status: status

  # Creation and update dates are also migrated.
  created: created
  changed: changed

  # Comment, Promote, Sticky flags are migrated as well.
  comment: comment
  promote: promote
  sticky: sticky

  # Extract the body value into the `value` key of the Drupal 8 body field.
  'body/value': body

  # The body summary takes its value from a specific field on Drupal 6.
  'body/summary': 'field_summary_details/0/value'

  # The body format is set to `full HTML`.
  'body/format':
    plugin: default_value
    default_value: full_html

  # Copy a value from one field to another in the simplest way; since it's plain text, no special parsing is needed.
  field_plain_text_note: field_note_info

  # A text list is a little more complex, since the values on Drupal 6 contain spaces.
  field_article_type:
    # The first plugin turns the values into machine names to avoid spaces and special characters.
    -
      plugin: machine_name
      source: 'field_type/0/value'

    # Then the second plugin maps the clean values to the new Drupal 8 values.
    -
      plugin: static_map
      map:
        preloaded_case: 'prepopulated_value'
        quantity_case: 'amount_value'

  # This is a date field, but the migration stores the values properly on its own.
  field_external_date: field_date

  # The boolean field is mapped to text list values.
  field_allowed_denied:
    plugin: static_map
    source: 'field_article_status/0/value'
    map:
      0: 'Denied'
      1: 'Allowed'

  # Here the unlimited field is mapped using the sub_process plugin, which iterates over each value.
  field_common_files_related:
    plugin: sub_process
    source: field_pdf_files

    # It defines which process is executed on each record.
    process:
    
      # It imports the file first, then creates the file entity using the `file_import` plugin.
      target_id:
        plugin: file_import
        
        # The source values `realpath`, `newpath` and `superuser` are added dynamically by the event subscriber,
        # which is explained later in this post.
        # The source tells the plugin where to look for the file itself.
        source: realpath

        # It defines where to store the new file.
        destination: newpath

        # It sets the user ID using the superuser source value.
        uid: superuser

        # This configuration avoids duplicates and returns only the entity ID, which is then stored in target_id.
        reuse: true
        skip_on_missing_source: true
        id_only: true
      
      # It maps the display destination to `list` from the source.
      display: list
      
      # It migrates the description in the simplest way.
      description: details

# Even though there are no required dependencies, the key is defined as empty.
migration_dependencies:
  required: { }

# This dependency forces the removal of this migration from configuration when this module is uninstalled.
dependencies:
  enforced:
    module:
    - drupal6_migrate
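
To make the two-step pipeline on `field_article_type` more concrete, here is a simplified stand-in written in plain PHP. It is not the plugins' real code: the function names are hypothetical, the real `machine_name` plugin also transliterates non-ASCII characters, and the input value 'Preloaded Case' is only an assumed example of what Drupal 6 might hold.

```php
<?php

/**
 * Rough stand-in for the machine_name plugin (the real plugin also
 * transliterates non-ASCII characters first).
 */
function toy_machine_name($value) {
  $value = strtolower($value);
  // Replace anything that is not a-z, 0-9 or underscore.
  $value = preg_replace('/[^a-z0-9_]+/', '_', $value);
  // Collapse repeated underscores.
  return preg_replace('/_+/', '_', $value);
}

/**
 * Rough stand-in for static_map: look the value up, fail when missing.
 */
function toy_static_map($value, array $map) {
  if (!array_key_exists($value, $map)) {
    throw new InvalidArgumentException("No mapping for '$value'");
  }
  return $map[$value];
}

// Same map as in the migration YML above.
$map = [
  'preloaded_case' => 'prepopulated_value',
  'quantity_case' => 'amount_value',
];

// 'Preloaded Case' is an assumed Drupal 6 value, used only as an example.
$result = toy_static_map(toy_machine_name('Preloaded Case'), $map);
```

Chaining the plugins this way means the map only has to list clean keys, whatever spacing or capitalization the old values carry. As far as I know, the real `static_map` skips the row on a missing key unless a `default_value` or `bypass` option is configured, so treat the exception here as a simplification.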

It definitely looks super complex at first sight, but I have a couple of tricks to share that make it a little easier to work with. I recommend the following steps:

  • Breathe and grab your favorite drink (a cup of coffee in my case).
  • Read the official Migration process overview documentation.
  • Divide and conquer; don't try to make everything work on the first try.
  • You can use a "drush" command to import only one item at a time:
    drush migrate-import content_node --limit=1
  • Then you can debug whatever field, process or plugin you have doubts about. Since it is a "drush" command, a simple var_dump() call will print to your terminal, or you can go further with the Migrate Devel module.
  • Of course, you can check the status of your migrations by checking the whole migration group:
    drush migrate-status --group=d6group
  • If that did the trick, you can move along; otherwise you can roll back your migration and start over:
    drush migrate-rollback content_node
  • If you change your migration YML file, you need to re-install your module. The Devel module does it faster, or you can follow these official recommendations (I chose to re-install the module using "drush"; it was faster for me).

To migrate an entity's files, it is recommended to use a separate migration, so that you can keep track of the entities if you need to roll back. Since I was migrating file fields with unlimited cardinality, it was really tricky to keep track of the references and match the values, so I decided to do the whole process in a single migration.

To import files, I saw that the Migrate Files (extended) module provides a plugin that can avoid duplicate file entities, so I went for it. But I realized that Drupal 6 file fields only return the `fid`, so I created an event subscriber to expose the extra information I needed to import them properly. Let's see how the service was defined:

Service (/drupal6_migrate.services.yml):

services:
  drupal6_migrate.event_subscriber:
    class: Drupal\drupal6_migrate\EventSubscriber\D6MigrateSubscriber
    arguments: []
    tags:
      - { name: event_subscriber }

Here is the tricky part. To import the files, I created a folder named `d6files` in the root of the Drupal 8 instance and copied all the files from the Drupal 6 instance into it; after the migration had run, I deleted that folder. The event subscriber dynamically injects extra values into the source file fields, which the migration process then uses to create the file entities successfully. It looks like this:

EventSubscriber (/src/EventSubscriber/D6MigrateSubscriber.php):

<?php

namespace Drupal\drupal6_migrate\EventSubscriber;

use Drupal\Component\Utility\NestedArray;
use Drupal\migrate_plus\Event\MigrateEvents;
use Drupal\migrate_plus\Event\MigratePrepareRowEvent;
use Symfony\Component\EventDispatcher\EventSubscriberInterface;

/**
 * Class D6MigrateSubscriber
 *
 * @package Drupal\drupal6_migrate\EventSubscriber
 */
class D6MigrateSubscriber implements EventSubscriberInterface {

  /**
   * {@inheritdoc}
   */
  public static function getSubscribedEvents() {
    $events[MigrateEvents::PREPARE_ROW] = ['prepareRow'];
    return $events;
  }

  /**
   * This method is called whenever the migrate_plus.prepare_row event is
   * dispatched.
   *
   * @param MigratePrepareRowEvent $event
   *
   * @throws \Exception
   */
  public function prepareRow(MigratePrepareRowEvent $event) {
    // Run only when `content_node` is being executed.
    if ('content_node' === $event->getMigration()->id()) {
      /** @var array $fieldPdfFiles */
      $fieldPdfFiles = $event->getRow()
          ->getSourceProperty('field_pdf_files');

      // Nothing to do when the node has no files attached.
      if (!is_array($fieldPdfFiles)) {
        return;
      }

      /** @var \Drupal\node\Plugin\migrate\source\d6\Node $source */
      $source = $event->getSource();

      // Walk-through all files found.
      foreach ($fieldPdfFiles as &$file) {
        /** @var int $fid */
        $fid = NestedArray::getValue($file, ['fid']);

        /** @var object|false $result */
        $result = $source->getDatabase()
          ->select('files', 'f')
          ->fields('f', ['filepath'])
          ->condition('fid', $fid)
          ->execute()
          ->fetch();

        // When there is not result, go to next file.
        if (empty($result)) {
          continue;
        }

        // Inject file's real path value.
        $file['realpath'] = DRUPAL_ROOT . '/d6files/' . $result->filepath;

        // Init extra variables.
        $file['details'] = '';
        $file['superuser'] = 1;
        $file['newpath'] = 'public://import/articles/';

        // Unserialize the data value to read the file description.
        $data = (isset($file['data']) && is_string($file['data']))
          ? unserialize($file['data'])
          : [];
        if (is_array($data) && isset($data['description'])) {
          $file['details'] = $data['description'];
        }
      }
      // Break the reference left by the foreach loop above.
      unset($file);

      // Set back source property value.
      $event->getRow()->setSourceProperty(
          'field_pdf_files',
          $fieldPdfFiles
       );
    }
  }
}
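
The per-file logic of the subscriber can also be pictured as a plain function, which is handy for checking the expected shape of each item without a database or a Drupal bootstrap. This is only a sketch: `enrich_file_item()` is a hypothetical helper, and the sample item and paths are made-up data shaped like a Drupal 6 file field value.

```php
<?php

/**
 * Adds the extra keys that the migration YML reads for one file item.
 * Hypothetical helper mirroring the body of the subscriber's loop.
 */
function enrich_file_item(array $file, $filepath, $root) {
  $file['realpath'] = $root . '/d6files/' . $filepath;
  $file['superuser'] = 1;
  $file['newpath'] = 'public://import/articles/';

  // The description lives inside the serialized `data` value.
  $data = (isset($file['data']) && is_string($file['data']))
    ? unserialize($file['data'])
    : [];
  $file['details'] = (is_array($data) && isset($data['description']))
    ? $data['description']
    : '';

  return $file;
}

// Sample item shaped like a Drupal 6 file field value (made-up data).
$item = [
  'fid' => 7,
  'list' => 1,
  'data' => serialize(['description' => 'Annual report']),
];

$enriched = enrich_file_item($item, 'sites/default/files/report.pdf', '/var/www');
```

The keys `realpath`, `newpath`, `superuser` and `details` are the ones the `sub_process` section of the migration YML reads for each file.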

Those are all the steps I took to achieve what I was asked for: migrating article content with its files from a Drupal 6 instance to a fresh Drupal 8 instance, where both use the PostgreSQL database engine. I am aware that each case might be different, based on business logic, but I really hope this article helps shed some light on how the Migration API works.

Happy coding!