Score:0

Media migration results in numerous unexpected file usage records (CSV source): How to do this correctly?

I'm migrating lots of records from a non-Drupal site into Drupal 9. The records have file 'attachments,' and I want these to be media in the Drupal site.

I have all the files already on the server where they need to permanently live and have all the paths/URI's stored with all the other site data in CSV files.

Successful so far: I can migrate the URIs into file entities without errors, using the migrate_source_csv module (along with migrate_plus and migration_tools)

The media items have some metadata, so I need to migrate them separately. This is where the results have been interesting.

  • I can migrate the media in from the same source CSV file and use the migration_lookup plugin to create the media entities and relate them to their files.

  • I can also migrate the media in using entity_lookup plugin with very similar results.

In both cases, everything seems normal except that many thousands of file uses of generic.png are generated. Generic.png is not in my source data. I think it comes from the media module.

I can remove these uses from the file_usage table without any apparent negative consequences, but I'm uneasy doing that. This is a partial migration for test purposes, so... 12,552 items created. I hesitate to do the whole thing and move on to migrating in the 'nodes' until I understand what's going on here and potential consequences.

At this stage, I'm using a custom media type called 'document'

The file and media migration configuration files (with a dependency and migration_lookup) look like this:

The files file...

uuid: 1bcec3e7-0a49-4473-87a2-6dca09b91abjan-docf
id: docfiles_import
label: "Import files for doc media type"
migration_group: docfilesmedia
source:
  plugin: 'csv'
  path: '/srv/imports/docmedia122.tab'
  delimiter: "\t"
  enclosure: '"'
  header_offset: null
  ids: [filename]
# not using most of these fields in the file import
# but including because maybe needed for grouping
# and migration_lookup in the media import
  fields:
    0:
      name: filename
      label: 'Unique filename'
    1:
      name: title
      label: 'description'
    2:
      name: doctype
      label: 'document type'
    3:
      name: formflag
      label: 'FormYN'
    4:
      name: newpath
      label: 'path'

process:
  uid:
    plugin: default_value
    default_value: 2
  uri: newpath
  status:
    plugin: default_value
    # 1 equals 'permanent'
    default_value: 1
destination:
  plugin: entity:file

The media file...

uuid: 1bcec3e7-0a49-4473-87a2-6dca09b91abjan-docmed
id: docmedia_import
label: Import media of document type
migration_group: docfilesmedia

source:
  plugin: 'csv'
  path: '/srv/imports/docmedia122.tab'
  delimiter: "\t"
  enclosure: '"'
  header_offset: null
  ids: [filename]
  fields:
    0:
      name: filename
      label: 'Unique filename'
    1:
      name: title
      label: 'description'
    2:
      name: doctype
      label: 'document type'
    3:
      name: formflag
      label: 'FormYN'
    4:
      name: newpath
      label: 'path'

process:
  name: title
  uid:
    plugin: default_value
    default_value: 179
  field_media_document/target_id:
  # above is name of the file entity ref field
    plugin: migration_lookup
    migration: docfiles_import
    source: filename
    # Unclear how this works. I think it means
    # 'create target ids, using filename field to determine
    # which row to use in the csv file'
  field_doctype_select:
    plugin: entity_generate
    source: doctype
    value_key: name
    bundle: document_type
    entity_type: taxonomy_term
    ignore_case: true
  field_formcheckbox: formflag

migration_dependencies:
  required:
    - docfiles_import
  optional: []

destination:
  plugin: entity:media
  default_bundle: document

The files & media migration examples I've been able to find all have assumptions that differ from my situation (not csv, files need to be moved or created or copied, Drupal to Drupal, etc.). So I'm sure I've just made errors in adapting the guidance.

Any ideas on how to do it right, or on what's wrong? If no way of preventing the phantom file-usage items can be found, am I probably safe removing them from the file_usage table and acting like it never happened?

I'm also seeing this error intermittently. I don't know if it's related. Since MySQL itself seems to work fine, I've been assuming it has something to do with using a CSV source rather than a database source.

[error] Message: Failed to connect to your database server. The server reports the following message: /No database connection configured for source plugin variable/. * Is the database server running? * Does the database exist, and have you entered the correct database name? * Have you entered the correct username and password? * Have you entered the correct database hostname?

A clue about where generic.png comes from, though I don't yet know whether it helps: https://www.drupal.org/project/drupal/issues/3060509
Score:0

I have come to believe this is normal behavior. The media system uses thumbnails, and the thumbnails count as file uses. The result is that if the imported media are documents, a generic document thumbnail is used once for each imported item.

If the imported media are images, the generic thumbnail isn't used because the image file is used. So the uses column in the admin/content/files list will show 2 uses for each imported image (assuming no other content is yet using them).

Removing the uses from the file_usage table (in MySQL) doesn't seem to hurt anything so far. But leaving them there doesn't seem to hurt anything either.
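
On the question in the media migration's comments about how migration_lookup works: as I understand it, the plugin takes the value named in `source` (here `filename`), looks it up among the source IDs that `docfiles_import` recorded in its map table, and returns the destination ID (the file's fid) that was created for that row. Below is a commented sketch of the same process entry. The `no_stub` line is an optional addition and an assumption on my part; it is not in the original configuration.

```yaml
process:
  field_media_document/target_id:
    plugin: migration_lookup
    # The migration whose map table is searched for a match.
    migration: docfiles_import
    # Source value to look up. It has to correspond to the 'ids'
    # declared by docfiles_import (here: filename).
    source: filename
    # Optional (assumption, not in the original config): when no match
    # is found, return no value instead of creating a stub file entity.
    no_stub: true
```

Because both migrations read the same file and declare `ids: [filename]`, each media row should resolve to the file entity imported from the same row.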
