I'm migrating from a non-Drupal site (so far, only for test purposes).
I have all the data in CSV files.
Since I'm brand new to migrating into Drupal 9 from outside Drupal, I'm learning this in small, somewhat simple phases.
The source data includes a bunch of records, many of which have attached files.
Some files are attached to more than one record.
Some records have multiple files attached.
"Attached" here means a URL is stored, along with a bit of metadata, like a short description and a type. In the source database, this is a join table that relates files to records.
What I want to achieve eventually in Drupal:
All these records migrated in as nodes, the files joined to media entities, and those joined to the correct nodes by entity reference. The created media entities should have the old metadata (description, type) in custom fields.
I gather that the way this should be done is:
- Migrate the files into file entities
- Using a migration group (migrate_plus, migrate_tools, and migrate_source_csv modules), use the same data source and migration_lookup to migrate the media entities
- Migrate the nodes in and use the process plugin entity_generate and a value_key of target ID or something to relate the nodes to the right media entities.
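
To make the second step concrete, here is a rough sketch of what I imagine the media migration might look like. It is untested and rests on assumptions: the migration id `mediaimptest`, the `document` media bundle with its `field_media_document` source field (as shipped with the standard install profile), and the custom field names `field_description` and `field_doc_type` are all hypothetical. It reuses the CSV source of my file import (id `fileimptest`, shown further down) and looks up each file by the same `aid`.

```yaml
# Untested sketch. Names marked "hypothetical" are assumptions, not working config.
id: mediaimptest                      # hypothetical migration id
label: Test media import
migration_group: default
source:
  # Same source definition as the file import, so both share the aid key.
  plugin: csv
  path: /srv/imports/filetest1.tab
  delimiter: "\t"
  enclosure: '"'
  header_offset: null
  ids: [aid]
  fields:
    0:
      name: aid
    1:
      name: title
    2:
      name: formflag
    3:
      name: newpath
    4:
      name: docnum
    5:
      name: doctype
process:
  bundle:
    plugin: default_value
    default_value: document           # assumes a "document" media type exists
  name: title
  uid:
    plugin: default_value
    default_value: 179
  # Resolve the file entity created by the file import for this row.
  field_media_document/target_id:
    plugin: migration_lookup
    migration: fileimptest
    source: aid
  field_description: title            # hypothetical custom field
  field_doc_type: doctype             # hypothetical custom field
destination:
  plugin: entity:media
migration_dependencies:
  required:
    - fileimptest
```

The idea is that `migration_lookup` maps each source `aid` to the file ID created earlier, so the media entity ends up pointing at the right file without the path being processed a second time.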
The files are already where they need to be on the server, and the paths/URIs are stored in a csv file along with the description and type, a unique ID field and the ID of each related record.
As a starting point, I attempted to import 30 files as a standalone import. The migrate_files module didn't seem like a good fit, mostly because I can't figure out how to adapt it to a situation where the media entities are going to pull field data from a CSV file, and the file URIs are stored in a CSV file too.
So I thought I'd try it with a mostly standard setup.
This was my YAML:
    uuid: 1bcec3e7-0a49-4473-87a2-6dca09b91abjan-test1
    id: fileimptest
    label: Test file import
    migration_group: default
    source:
      plugin: 'csv'
      path: '/srv/imports/filetest1.tab'
      delimiter: "\t"
      enclosure: '"'
      header_offset: null
      ids: [aid]
      # not using most of these fields in the file import
      # but including because maybe needed for grouping
      # and migrate_lookup in the media import?
      fields:
        0:
          name: aid
          label: 'Unique Id'
        1:
          name: title
          label: 'description'
        2:
          name: formflag
          label: 'FormYN'
        3:
          name: newpath
          label: 'path'
        4:
          name: docnum
          label: 'doc number'
        5:
          name: doctype
          label: 'document type'
    process:
      uid:
        plugin: default_value
        default_value: 179
      uri: path
    destination:
      plugin: entity:file
The result was that my 30 test files appeared in the file list under admin/content with a status of temporary. The links look correct, but clicking them results in 403 access denied (folder permissions are 777 and the folder is owned by the webserver). (I am using the private file system and have several files uploaded through normal field widgets; these are listed with status 'permanent'. Their links look the same, other than the subdirectories, but open normally when clicked.)
So questions are:
- What am I doing wrong so far?
- Is there a better way? (I'm pretty sure there is, but what?)
(Detail: uid 179 is just a user I created named "importer")
I should note that I've read this and this, and lots of examples in the related modules. Together, they have informed what I've come up with so far, to the degree I understand them.
Edit: "temporary status" just means there are no uses yet, so not important at this point.
The only thing that seems wrong with this test import is the access denied issue. Is the migrate process missing something that the private file system needs in order to function fully?
Maybe private files can only be viewed if they are 'used' on another entity? I haven't found info on this or come up with a way to test it yet.
Edit2: per the comments and answer below, the file status ('temporary' or 'permanent') can be set during the import, and the access denied response is normal under these conditions: when the imported file is both a) not used anywhere and b) clicked by a user other than the uid on the file.
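
For anyone following along, this is roughly how I understand the status can be set in the migration itself. It is a minimal sketch of an addition to the `process` section of the file import above; on file entities a `status` of 1 means permanent and 0 means temporary.

```yaml
process:
  # Added alongside the existing uid and uri mappings of fileimptest.
  status:
    plugin: default_value
    default_value: 1    # 1 = permanent, 0 = temporary
```

This only changes how the file is flagged. It does not by itself change the access check on private files described above.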