Drupal 8 - Migrate Content from D6


I would like to share my experience from the last time I implemented the Migration API on Drupal 8. A heads-up before jumping into code: keep calm. This topic might look overwhelming at first sight, so I highly recommend taking one step back, relaxing, and preparing yourself to read in order to understand how it works.

The Migration API is described as a process that extracts data from a source, transforms that data, and loads it into a destination:

[Image: d8-migration-process]

When I ended up with this task, I was told to migrate all content of a particular content type from a Drupal 6 instance into a fresh Drupal 8 instance. A fun fact was that both instances were using PostgreSQL as their database. Since Drupal supports it out of the box, I had one step ready. It is not just about the database driver, though: it also meant I was able to restore the Drupal 6 database onto the same server where the current Drupal 8 database lives.

 

Add credentials to settings file:

After I created a new database and restored the backup from the Drupal 6 instance, I added a new database key to the settings file to expose the new credentials:

<?php
$databases['drupal6']['default'] = array (
  'database' => 'my_drupal6_database',
  'username' => 'drupal6_user',
  'password' => 's3cr3t',
  'prefix' => '',
  'host' => 'localhost',
  'port' => '5432',
  'namespace' => 'Drupal\\Core\\Database\\Driver\\pgsql',
  'driver' => 'pgsql',
);

There is nothing complex here. It is similar to your default credentials, but it points to the restored Drupal 6 database.

 

Custom module

Let's include some custom code to make the magic happen. The module definition has a very important section, the dependencies, since those modules provide powerful tools that make the migration easier and faster.

Setup Info YML (/drupal6_migrate.info.yml):

name: 'Drupal6 Migrate'
type: module
description: 'D6 Migration'
core: 8.x
package: 'custom'
dependencies:
  - migrate
  - migrate_plus
  - migrate_tools
  - migrate_drupal
  - migrate_file

We are now almost ready to start, but here comes a big question I had: where should I put the migration files? It can be confusing. Some websites suggest using the configuration folder (/config/install), but then you have to re-install your module each time you make a change. Others suggest using the `migrations` folder, but then some drush commands won't work, for example those provided by Migrate Tools.

If you read the official Migration configuration documentation, the correct place to store migration files is the `migrations` folder at the root of your module. However, at the time I wrote this article, groups still lived in the configuration folder, and somehow those drush commands weren't able to recognize YML files within the `migrations` folder. Perhaps that was because I was using migration groups, but I am not sure at this point.

So, to meet our requirements, I decided to put all YML files into the configuration folder. Let's see what a migration group YML file looks like:

Group (/config/install/migrate_plus.migration_group.d6group.yml):

# The machine name of the group.
id: d6group

# A human-friendly label for the group.
label: Drupal6 Content

# A brief description about the group.
description: Shared configuration to migrate content from Drupal 6

# Description of the type of source (Drupal 6, WordPress, etc).
source_type: Drupal 6

# Data that will be shared among all migrations in the group.
shared_configuration:

  # Configuration will be merged into 'source' configuration of each migration.
  source:

    # External database connection added previously in settings.php
    key: drupal6

# migration_group configuration will be removed on module uninstall.
dependencies:
  enforced:
    module:
    - drupal6_migrate

The next thing I did was create the migration file itself, where I referenced the previous migration group in order to merge in its shared configuration, since the Drupal 6 database credentials were defined there. This migration moves article content from Drupal 6 into article nodes on Drupal 8:

Migration (/config/install/migrate_plus.migration.content_node.yml):

# Migration ID for this migration.
id: content_node

# Label of current migration
label: Content node

# Define which migration group it belongs to.
migration_group: d6group

# Migration tags of current migration.
migration_tags:
  - Drupal 6
  - Content

# Source definition
source:

  # Use the plugin that migrates nodes from Drupal 6.
  plugin: d6_node

  # Defines which content type to look up in Drupal 6.
  node_type: article

# Destination definition
destination:

  # Plugin that defines where content will be stored.
  plugin: entity:node

  # The default bundle sets the content type.
  default_bundle: article

# Process is where source data is transformed before being stored in the destination.
process:

  # On the left is the machine name of the destination field; on the right is the source field to read from.
  nid: tnid

  # Matching version ID.
  vid: vid

  # Grabbing title value.
  title: title

  # All content is assigned to the admin user.
  uid:
    plugin: default_value
    default_value: 1

  # Retrieve status field.
  status: status

  # Creation and update dates are migrated as well.
  created: created
  changed: changed

  # Comment, Promote, Sticky flags are migrated as well.
  comment: comment
  promote: promote
  sticky: sticky

  # Copy the body value into the `value` key of the Drupal 8 body field.
  'body/value': body

  # The body summary takes its value from a specific field on Drupal 6.
  'body/summary': 'field_summary_details/0/value'

  # The body format is set to `full HTML`.
  'body/format':
    plugin: default_value
    default_value: full_html

  # Copy the value from one field to another the simplest way; since it is plain text, no special parsing is needed.
  field_plain_text_note: field_note_info

  # A text list is a little more complex since the values on Drupal 6 contain spaces.
  field_article_type:
    # The first plugin converts the values to machine names to avoid spaces and special characters.
    -
      plugin: machine_name
      source: 'field_type/0/value'

    # The second plugin then maps the clean values to the new values in Drupal 8.
    -
      plugin: static_map
      map:
        preloaded_case: 'prepopulated_value'
        quantity_case: 'amount_value'

  # This is a date field, but the migration does its magic to store the values properly.
  field_external_date: field_date

  # The boolean field is mapped to text list values.
  field_allowed_denied:
    plugin: static_map
    source: 'field_article_status/0/value'
    map:
      0: 'Denied'
      1: 'Allowed'

  # Here an unlimited-cardinality field is mapped using the sub_process plugin to iterate over each value.
  field_common_files_related:
    plugin: sub_process
    source: field_pdf_files

    # Defines the process executed on each record.
    process:

      # First import the file with the `file_import` plugin, which creates the file entity.
      target_id:
        plugin: file_import

        # The source values `realpath`, `newpath` and `superuser` are added dynamically by the event subscriber,
        # which is explained later in this post.
        # The source tells the plugin where to look for the file itself.
        source: realpath

        # Defines where to store the new file.
        destination: newpath

        # Sets the user ID from the superuser source value.
        uid: superuser

        # These settings avoid duplicates and return only the entity ID, which is then stored in target_id.
        reuse: true
        skip_on_missing_source: true
        id_only: true

      # Maps the destination display value from the source list value.
      display: list

      # Migrates the description the simplest way.
      description: details

# Even though there are no required dependencies, the key is defined as empty.
migration_dependencies:
  required: { }

# This dependency forces the migration to be removed from configuration when the module is uninstalled.
dependencies:
  enforced:
    module:
    - drupal6_migrate
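
To make the `field_article_type` pipeline above more concrete, here is a minimal plain-PHP sketch of what chaining `machine_name` and `static_map` does to a single value. The two functions are hypothetical stand-ins written for this post, not the real plugins: the real `machine_name` also transliterates non-ASCII characters first, and the real `static_map` skips the row for an unmapped value unless `bypass` or `default_value` is configured. The sample input is made up.

```php
<?php

/**
 * Simplified stand-in for the `machine_name` process plugin.
 * Assumption: the real plugin also transliterates non-ASCII characters first.
 */
function sketch_machine_name(string $value): string {
  $value = strtolower($value);
  // Replace every run of characters outside a-z, 0-9 and _ with one underscore.
  $value = preg_replace('/[^a-z0-9_]+/', '_', $value);
  // Collapse repeated underscores.
  return preg_replace('/_+/', '_', $value);
}

/**
 * Simplified stand-in for `static_map`: returns NULL for unmapped values.
 * The real plugin skips the row instead, unless `bypass` or `default_value` is set.
 */
function sketch_static_map(string $value, array $map): ?string {
  return $map[$value] ?? NULL;
}

// Same map as in the migration above.
$map = [
  'preloaded_case' => 'prepopulated_value',
  'quantity_case' => 'amount_value',
];

// 'Preloaded case' is a hypothetical Drupal 6 value containing a space.
$clean = sketch_machine_name('Preloaded case');
$mapped = sketch_static_map($clean, $map);
echo $clean . ' -> ' . $mapped . PHP_EOL;
```

The order matters: the map keys have to match the cleaned machine names, not the raw Drupal 6 values.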

The migration file definitely looks super complex at first sight, so I have a couple of tricks I would like to share to make it a little easier to work with. I recommend the following steps:

  • Breathe and grab your favorite drink (a cup of coffee in my case)
  • Read the official Migration process overview documentation
  • Divide and conquer; do not try to make everything work on the first try
  • You can use a drush command to import only one item at a time:
    drush migrate-import content_node --limit=1
  • Then you can debug whatever field, process, or plugin you have doubts about. Since it is a drush command, you can use a simple var_dump() and see the output in your terminal, or go further by using the Migrate Devel module.
  • Of course, you can check the status of your migrations by checking the whole migration group:
    drush migrate-status --group=d6group
  • If it did the trick you were expecting, you can move on; otherwise you can roll back your migration and start over:
    drush migrate-rollback content_node
  • If you made changes to your migration YML file definition, you need to re-install your module. The Devel module does it faster, or you can follow the official recommendations (I chose to re-install the module with drush, it was faster for me).

To migrate file entities, it is recommended to use a separate migration so you can keep track of the entities when you execute a rollback. Since I was migrating file fields whose cardinality is unlimited, it was really tricky to keep track of the references in order to match values, so I decided to do the whole process in a single migration.

To import files, I saw that the Migrate Files (extended) module provides a plugin that let me avoid duplicate file entities, so I went for it. However, I found that Drupal 6 file fields only return the `fid`, so I created an event subscriber to expose the extra information I needed to import them properly. Let's see how the service was defined:

Service (/drupal6_migrate.services.yml):

services:
  drupal6_migrate.event_subscriber:
    class: Drupal\drupal6_migrate\EventSubscriber\D6MigrateSubscriber
    arguments: []
    tags:
      - { name: event_subscriber }

Here is the tricky part. To import files, I created a folder named `d6files` inside the root of the Drupal 8 instance and copied all files from the Drupal 6 instance into it in order to migrate them. After the migration was executed, I deleted that folder. The event subscriber dynamically injects extra values into the source file fields, which the migration process uses to create the file entities successfully. It looks like this:

EventSubscriber (/src/EventSubscriber/D6MigrateSubscriber.php):

<?php

namespace Drupal\drupal6_migrate\EventSubscriber;

use Drupal\Component\Utility\NestedArray;
use Drupal\migrate_plus\Event\MigrateEvents;
use Drupal\migrate_plus\Event\MigratePrepareRowEvent;
use Symfony\Component\EventDispatcher\EventSubscriberInterface;


/**
 * Class D6MigrateSubscriber.
 */
class D6MigrateSubscriber implements EventSubscriberInterface {

  /**
   * {@inheritdoc}
   */
  public static function getSubscribedEvents() {
    $events[MigrateEvents::PREPARE_ROW] = ['prepareRow'];

    return $events;
  }

  /**
   * This method is called whenever the migrate_plus.prepare_row event is
   * dispatched.
   *
   * @param \Drupal\migrate_plus\Event\MigratePrepareRowEvent $event
   *
   * @throws \Exception
   */
  public function prepareRow(MigratePrepareRowEvent $event) {
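    // Run only when `content_node` is being executed.
    if ('content_node' === $event->getMigration()->id()) {
      /** @var array|null $fieldPdfFiles */
      $fieldPdfFiles = $event->getRow()
        ->getSourceProperty('field_pdf_files');

      // Nodes without attached files have nothing to prepare.
      if (empty($fieldPdfFiles) || !is_array($fieldPdfFiles)) {
        return;
      }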

      /** @var \Drupal\node\Plugin\migrate\source\d6\Node $source */
      $source = $event->getSource();

      // Walk-through all files found.
      foreach ($fieldPdfFiles as &$file) {
        /** @var int $fid */
        $fid = NestedArray::getValue($file, ['fid']);

        /** @var object|false $result */
        $result = $source->getDatabase()
          ->select('files', 'f')
          ->fields('f', ['filepath'])
          ->condition('fid', $fid)
          ->execute()
          ->fetch();

        // When there is no result, go to the next file.
        if (empty($result)) {
          continue;
        }

        // Inject file's real path value.
        $file['realpath'] = DRUPAL_ROOT . '/d6files/' . $result->filepath;

        // Init extra variables.
        $file['details'] = '';
        $file['superuser'] = 1;
        $file['newpath'] = 'public://import/articles/';

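        // Unserialize the data value, refusing to instantiate objects.
        $data = is_string($file['data'] ?? NULL)
          ? unserialize($file['data'], ['allowed_classes' => FALSE])
          : FALSE;
        if (is_array($data)) {
          $file['details'] = NestedArray::getValue($data, ['description']) ?? '';
        }
      }
      // Break the reference left behind by the by-reference foreach.
      unset($file);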

      // Set back source property value.
      $event->getRow()->setSourceProperty(
        'field_pdf_files',
        $fieldPdfFiles
      );
    }
  }
}
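
To see what the subscriber hands over to the `sub_process` plugin, here is a framework-free sketch of the per-file step. The function name, the sample item, the file path and the root path are all hypothetical and only illustrate the shape of the data; the real subscriber looks up `filepath` in the Drupal 6 `files` table and uses `DRUPAL_ROOT`.

```php
<?php

/**
 * Hypothetical, framework-free version of the per-file step in prepareRow().
 *
 * $file is one item of `field_pdf_files`; $filepath is the `filepath` column
 * looked up in the Drupal 6 `files` table (NULL when no record exists).
 */
function sketch_prepare_file(array $file, ?string $filepath, string $root): array {
  if ($filepath === NULL) {
    // No record found: leave the item untouched, as the subscriber does.
    return $file;
  }
  $file['realpath'] = $root . '/d6files/' . $filepath;
  $file['details'] = '';
  $file['superuser'] = 1;
  $file['newpath'] = 'public://import/articles/';

  // Only unserialize strings, and never instantiate objects.
  $data = is_string($file['data'] ?? NULL)
    ? @unserialize($file['data'], ['allowed_classes' => FALSE])
    : FALSE;
  if (is_array($data) && is_string($data['description'] ?? NULL)) {
    $file['details'] = $data['description'];
  }
  return $file;
}

// All sample values below are made up for illustration.
$item = [
  'fid' => 12,
  'list' => 1,
  'data' => serialize(['description' => 'Annual report']),
];
$prepared = sketch_prepare_file($item, 'sites/default/files/report.pdf', '/var/www/drupal8');
print_r($prepared);
```

After this step each item carries `realpath`, `newpath`, `superuser`, `details` and the original `list` value, which are exactly the source keys the `sub_process` section of the migration reads.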

Those are all the steps I walked through to achieve what I was asked for: migrating article content with its files from a Drupal 6 instance to a fresh Drupal 8 instance, where both use the PostgreSQL database engine. I am aware that each case might be different based on business logic, but I really hope this article helps clarify how the Migration API works.

Happy coding!