Score:1

Views Data Export: Plaintext field contains HTML entities

I have a CSV export generated by the "views_data_export" module. When exporting, a plain text field that contains a double quote ends up with a &quot; HTML entity in the CSV file. This doesn't make sense, and it causes problems when I import the file.

I'm actually working with Drupal 9.5.10 and different entities, but was able to reproduce this with a fresh Drupal 10.1 installation. I hope I didn't forget any important step in the description.

Steps to reproduce

  • Install Drupal 10.1 (10.1.3-dev)
  • Install the modules "drupal/views" and "drupal/views_data_export"
  • Create an Article node and set the title as >>Test "Hello" Test<<
  • Create a view "article export" and set the following:
    • Add a display of type "data export"
    • Set the content type filter to match Article nodes
    • Under Format > Settings choose "csv". Expand the "CSV Settings" section and uncheck "Strip HTML".
    • Make sure the "title" field is selected and the formatter is set to "Plain text". Uncheck "Link to the content".
    • Set the path and the filename to "all_articles.csv"
    • Save the view and call the path

Expected

A CSV file with this content:

title
"Test ""Hello"" Test"

Actual

A CSV file with this content:

title
"Test &quot;Hello&quot; Test"
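The difference between the two outputs can be reproduced outside Drupal. The following is a minimal Python sketch, not the module's code: it assumes the exporter quotes every field, and it uses `html.escape` as a stand-in for whatever escaping step runs before the CSV encoder. The helper name `to_csv` is made up for the illustration.

```python
import csv
import html
import io


def to_csv(rows):
    """Serialize rows to CSV text, quoting every field (an assumption)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


title = 'Test "Hello" Test'

# CSV quoting alone doubles the embedded double quotes.
expected = to_csv([[title]])

# If the value is HTML-escaped first, the quotes are already entities,
# so the CSV encoder has nothing left to double.
actual = to_csv([[html.escape(title)]])

print(expected, end="")  # "Test ""Hello"" Test"
print(actual, end="")    # "Test &quot;Hello&quot; Test"
```

This suggests the entity is introduced before CSV encoding, when the field value is rendered, rather than by the CSV encoder itself.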

Any ideas on how to get the HTML entities out of there? Simply activating "Strip HTML" is not good enough, since the real-world example also contains fields which are supposed to contain HTML. The export is coupled with an import, so that users can modify the CSV file and reimport it. So the HTML fields need to contain HTML and the other fields should not contain HTML.
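If the export itself cannot be changed, one possible workaround is to decode entities only in the plain-text columns before reimporting. The sketch below is a standalone Python post-processing step, not part of Drupal or views_data_export. The function name `decode_plain_columns` and the column names `title` and `body` are hypothetical, and it assumes each plain-text value was escaped exactly once.

```python
import csv
import html
import io


def decode_plain_columns(csv_text, plain_columns):
    """Decode HTML entities only in the named plain-text columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        for name in plain_columns:
            # Skip cells that are missing in short rows.
            if row.get(name) is not None:
                row[name] = html.unescape(row[name])
        writer.writerow(row)
    return out.getvalue()


# Hypothetical export: "title" is plain text, "body" is meant to hold HTML.
exported = 'title,body\n"Test &quot;Hello&quot; Test","<p>Tom &amp; Jerry</p>"\n'
cleaned = decode_plain_columns(exported, ["title"])
print(cleaned, end="")
```

Only `title` is decoded here, so the `body` column keeps its markup and its `&amp;` entity untouched.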

Score:0

In your view, go to Format and click "Settings" for your Data export.

Then, uncheck "Strip HTML"

That should give you full HTML output in your export.

This is the return: [screenshot of the export output]

Thanks for your reply, but I'm not sure you read all details in my question. I already have "Strip HTML" deactivated. With that, I get HTML in all fields, not just in those fields that actually contain HTML. For plain text fields, I don't want quotes being turned into HTML entities.
I edited the answer to show you the return is what you're actually looking for. I'm unsure why you're getting the results you are. You could also try changing your Enclosure to a single quote. I'm running this on Drupal 10.
Sorry for taking some time to get back to you. I think the reason is the formatter, which needs to be set for each column in the export view. I selected "Plain text" for the title column and did not change the formatter settings. The "Plain text" text format is configured at /admin/config/content/formats/manage/plain_text, where filters like "Display any HTML as plain text" and "Convert line breaks into HTML (i.e. <br> and <p>)" are enabled.
So... There is a text format "Plain text", whose escaping is done by the filter class "FilterHtmlEscape", which converts some special characters to HTML entities. That sounds like what is happening here. But what can be selected on the view is actually a formatter class. In my case Drupal\Core\Field\Plugin\Field\FieldFormatter\StringFormatter is selected. I don't even see how this relates to the text format?!