This is possible with custom search_api processors.
First, I created an abstract class to hold the shared functionality, i.e. a method that indexes arbitrary entity data alongside a piece of content.
namespace Drupal\my_module\Plugin\search_api\processor;

use Drupal\Core\Entity\ContentEntityInterface;
use Drupal\search_api\Datasource\DatasourceInterface;
use Drupal\search_api\Item\ItemInterface;
use Drupal\search_api\Processor\EntityProcessorProperty;
use Drupal\search_api\Processor\ProcessorPluginBase;
use Drupal\search_api\Utility\Utility;

/**
 * Base plugin class for indexing arbitrarily related entity data.
 *
 * This can be helpful to index properties of entities referencing an entity or
 * entities related in some other arbitrary way.
 *
 * @package Drupal\my_module\Plugin\search_api\processor
 */
abstract class RelatedEntityBase extends ProcessorPluginBase {

  /**
   * {@inheritdoc}
   */
  public function getPropertyDefinitions(DatasourceInterface $datasource = NULL) {
    $plugin_definition = $this->getPluginDefinition();
    $properties = [];
    if (!$datasource || $datasource->getEntityTypeId() !== $this->getIndexedEntityTypeId()) {
      return $properties;
    }
    $definition = [
      'label' => $plugin_definition['label'],
      'description' => $plugin_definition['description'],
      'type' => 'entity:' . $this->getRelatedEntityTypeId(),
      'processor_id' => $this->getPluginId(),
      'is_list' => TRUE,
    ];
    $property = new EntityProcessorProperty($definition);
    $property->setEntityTypeId($this->getRelatedEntityTypeId());
    $properties[$this->getPluginId()] = $property;
    return $properties;
  }

  /**
   * {@inheritdoc}
   */
  public function addFieldValues(ItemInterface $item) {
    /** @var \Drupal\Core\Entity\ContentEntityInterface $entity */
    $entity = $item->getOriginalObject()->getValue();
    $to_extract = [];
    foreach ($item->getFields() as $field) {
      $datasource = $field->getDatasource();
      $property_path = $field->getPropertyPath();
      [$direct, $nested] = Utility::splitPropertyPath($property_path, FALSE);
      if ($datasource && $datasource->getEntityTypeId() === $entity->getEntityTypeId() && $direct === $this->getPluginId()) {
        $to_extract[$nested][] = $field;
      }
    }
    foreach ($this->getRelatedEntities($entity) as $relation) {
      $this->getFieldsHelper()
        ->extractFields($relation->getTypedData(), $to_extract, $item->getLanguage());
    }
  }

  /**
   * Get an array of related entities.
   *
   * This should return an array of fully loaded entities that relate to the
   * $entity being indexed.
   *
   * @param \Drupal\Core\Entity\ContentEntityInterface $entity
   *   The entity being indexed.
   *
   * @return array
   *   An array of entities related to $entity.
   */
  abstract protected function getRelatedEntities(ContentEntityInterface $entity): array;

  /**
   * Get the entity type id of the entity being indexed.
   *
   * This is the entity type of the $entity passed to
   * $this->getRelatedEntities().
   *
   * @return string
   *   An entity type id string, e.g. 'node', 'media', or 'taxonomy_term'.
   */
  abstract protected function getIndexedEntityTypeId(): string;

  /**
   * Get the entity type id of the related entities.
   *
   * This is the entity type of the items returned from
   * $this->getRelatedEntities().
   *
   * @return string
   *   An entity type id string, e.g. 'node', 'media', or 'taxonomy_term'.
   */
  abstract protected function getRelatedEntityTypeId(): string;

}
Next, I created plugin classes that extended my abstract class for each case (Collection's Authors, Article's Collections, Author's Collections). For example, to index data from an Article's Collections as part of the Article's indexed data:
namespace Drupal\my_module\Plugin\search_api\processor;

use Drupal\Core\Entity\ContentEntityInterface;
use Drupal\my_module\Plugin\search_api\processor\RelatedEntityBase;

/**
 * Index properties from Collections referencing an Article.
 *
 * @SearchApiProcessor(
 *   id = "my_module_article_collections",
 *   label = @Translation("Article's Collections"),
 *   description = @Translation("Index properties from Collections referencing this Article."),
 *   stages = {
 *     "add_properties" = 0,
 *   },
 * )
 */
class ArticleCollections extends RelatedEntityBase {

  /**
   * {@inheritdoc}
   */
  protected function getRelatedEntities(ContentEntityInterface $entity): array {
    return my_function_to_get_article_collections($entity);
  }

  /**
   * {@inheritdoc}
   */
  protected function getIndexedEntityTypeId(): string {
    return 'node';
  }

  /**
   * {@inheritdoc}
   */
  protected function getRelatedEntityTypeId(): string {
    return 'node';
  }

}
This allowed me to index data from a Collection as part of an Article's data, for example the Article's Collection IDs (i.e. the IDs of Collections referencing the Article). I can index any field from the Collection by selecting it in the UI, just as if the Article had an entity reference field referencing the Collection. (Note: before you can index any fields with the custom processor, you must first enable it on the Processors tab for your index.)
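The helper my_function_to_get_article_collections() is not shown in this answer. As a rough sketch only, assuming Collections are nodes of a 'collection' bundle that reference Articles through an entity reference field named field_articles (both names are hypothetical), it might look something like this:

```php
<?php

use Drupal\Core\Entity\ContentEntityInterface;

/**
 * Hypothetical helper: loads the Collections that reference an Article.
 *
 * Assumes a 'collection' node bundle with an entity reference field named
 * 'field_articles'. Both names are illustrative assumptions.
 */
function my_function_to_get_article_collections(ContentEntityInterface $entity): array {
  $storage = \Drupal::entityTypeManager()->getStorage('node');
  $ids = $storage->getQuery()
    // Indexing should not depend on the current user's access.
    ->accessCheck(FALSE)
    ->condition('type', 'collection')
    ->condition('field_articles', $entity->id())
    ->execute();
  // loadMultiple() returns fully loaded entities keyed by node ID.
  return $ids ? $storage->loadMultiple($ids) : [];
}
```

Any lookup works here, as long as it returns fully loaded entities of the type reported by getRelatedEntityTypeId().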
This all worked well, but my indexed data did not stay in sync with reality. For example, if I added a new Article to a Collection, the indexed data for that Article would not be updated with information from the Collection. In other words, the Article was not re-indexed when a Collection referencing it was updated. I resolved this with a hook_ENTITY_TYPE_update() implementation that marks dependent Articles to be re-indexed when a Collection is updated.
use Drupal\node\NodeInterface;

/**
 * Implements hook_ENTITY_TYPE_update().
 */
function my_module_node_update(NodeInterface $node) {
  if ($node->bundle() === 'collection') {
    // Gather all Articles that this Collection references.
    $articles = my_function_to_get_collection_articles($node);
    // Also gather any Articles that were referenced before this save, but are
    // no longer referenced. Note that += only adds missing keys, so this
    // relies on the helper returning Articles keyed by node ID.
    $original_node = $node->original ?? NULL;
    if ($original_node instanceof NodeInterface) {
      $articles += my_function_to_get_collection_articles($original_node);
    }
    // Mark the articles to be re-indexed.
    /** @var \Drupal\search_api\Plugin\search_api\datasource\ContentEntityTrackingManager $search_api_tracking_manager */
    $search_api_tracking_manager = \Drupal::service('search_api.entity_datasource.tracking_manager');
    foreach ($articles as $article) {
      $indexes = $search_api_tracking_manager->getIndexesForEntity($article);
      if (!empty($indexes)) {
        $item_ids = [];
        foreach ($article->getTranslationLanguages() as $langcode => $language) {
          $item_ids[] = $article->id() . ':' . $langcode;
        }
        foreach ($indexes as $index) {
          $index->trackItemsUpdated('entity:node', $item_ids);
        }
      }
    }
  }
}
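One detail worth checking in the hook is the `$articles +=` merge. PHP's array union only adds entries whose keys are missing from the left-hand array, so it behaves as intended only if the helper returns Articles keyed by node ID. The sketch below uses plain strings as stand-ins for loaded entities (an assumption made so it runs outside Drupal) to show the difference:

```php
<?php

// Stand-ins for loaded Article entities, keyed by node ID in the same way
// entity storage loadMultiple() keys its results.
$current = [12 => 'Article 12', 15 => 'Article 15'];
$previous = [12 => 'Article 12', 9 => 'Article 9'];

// Keyed by node ID: the union keeps each distinct Article once
// (keys 12, 15 and 9).
$keyed = $current + $previous;

// Re-indexed lists both use keys 0 and 1, so they collide and the
// previously referenced Article 9 is silently dropped.
$listed = array_values($current) + array_values($previous);
```

If the helper returns a plain list instead, building the combined array by node ID first (for example with a loop that assigns `$articles[$article->id()] = $article`) avoids losing the Articles that were just removed from the Collection.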
After all of this, I can safely index data from arbitrarily related entities.