We have a site that, after a few years, has built up 10 million records in the paragraphs_item table. This is the result of many data migrations involving paragraphs, and most of those records are now orphans (probably 90% of the total).
Entity Reference Revisions now ships with a Drush command to 'purge' orphaned composite entities. It does appear to work, but with this many records to scan it looks like it will take a very long time to get through all of them. Note that these are not orphans caused by the host entity being deleted: because of how data is migrated in, the migration definition rebuilds and replaces the paragraphs on every run. We cannot change this until there is some sort of hash on the source data to indicate whether a row should be imported or skipped.
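For reference, this is the command we are running (the exact name and arguments may vary by module version, so check `drush list` on your install):

```shell
drush entity-reference-revisions:purge paragraph
```

As far as I can tell it deletes each orphan through the entity API, which is safe but slow at this scale.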
I suspect this is also why the query behind paragraph edit forms has been getting slower over time, to the point that it can no longer execute (MariaDB stops responding). Even though all fields are indexed per the Paragraphs schema, query efficiency is degrading.
Purging seems to support this theory. If I let the purge delete 200,000 items, for example, and then run OPTIMIZE on the paragraphs_item table, the first edit takes 8 seconds to load a paragraph, and after that it's almost instant. So deleting all unused paragraphs is what I think we should do. I realize that on InnoDB, OPTIMIZE rebuilds the table and its indexes; my concern is that at this size the rebuild (for example, during a database version upgrade) becomes impractical past a certain threshold.
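For completeness, this is the statement I ran after the purge. On InnoDB, OPTIMIZE TABLE is implemented as a table recreate plus analyze, which is why it reclaims space and tightens the indexes:

```sql
OPTIMIZE TABLE paragraphs_item;
```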
Is there a faster way to query and delete orphaned paragraphs than the purge command?
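To make the question concrete, what I am imagining is raw SQL along these lines, run in batches. The field table and column names here are assumptions (they depend on which host fields actually reference paragraphs on the site), and I am aware this bypasses the entity API entirely, so the corresponding rows in paragraphs_item_field_data, paragraphs_item_revision, and the paragraph__* field tables would need the same treatment:

```sql
-- Hypothetical batch delete. node__field_paragraphs stands in for
-- whichever host field table(s) actually reference paragraphs here.
-- The derived table works around MariaDB's restriction on selecting
-- from the table being deleted from.
DELETE FROM paragraphs_item
WHERE id IN (
  SELECT id FROM (
    SELECT p.id
    FROM paragraphs_item p
    LEFT JOIN node__field_paragraphs f
      ON f.field_paragraphs_target_id = p.id
    WHERE f.field_paragraphs_target_id IS NULL
    LIMIT 100000  -- batch size, to keep transactions and locks small
  ) AS batch
);
```

Is something along these lines safe, or is there a better-supported approach?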