Score:-2

Get the HTML content from the database

I have a copy of a Drupal 9 database from which I need to download all the pages.
I loaded the database in MySQL Workbench CE and connected to it via Python. There are many tables, but no views or stored procedures. I guess that some of those tables hold the content, but I have no idea how to pull them together to extract the webpages.

block_content_body looks promising, but then what?

I assume it is a standard Drupal database. Is there a standard schema?

This is what I have tried.

#connect to drupal db - done...
#copy to directory assigned.
#othercode config....

try:
    # Define the tables and columns to extract HTML content from
    tables_columns = [
        ('node_field_data', 'body_value'),
        ('block_content_field_data', 'body_value'),
        ('field_data_body', 'body_value'),
        ('field_data_[custom_field_name]', '[custom_field_column]'),
        ('paragraph__field_[custom_field_name]', '[custom_field_column]'),
        ('views_view', 'display_options')
        # Add additional tables and columns as needed
    ]

    # Iterate over the tables and columns and export HTML content to separate files
    for table, column in tables_columns:
        output_file = os.path.join(output_dir, f'{table}_{column}.html')
        query = f"SELECT {column} FROM {table}"
        cursor.execute(query)
        rows = cursor.fetchall()

        with open(output_file, 'w', encoding='utf-8') as file:
            for row in rows:
                html_content = row[0]
                # Skip NULL values, which cannot be concatenated with a string
                if html_content is not None:
                    file.write(html_content + '\n')

        print(f"HTML content from {table}.{column} exported to: {output_file}")

except mysql.connector.Error as error:
    print(f"Error retrieving data from MySQL: {error}")

finally:
    # Close the database connection
    if connection.is_connected():
        cursor.close()
        connection.close()


I still have to correct the field names, but I guess these are the HTML pieces that are somehow joined together into pages?

Do you have suggestions, or is this just crazy talk?
Jaypan
Pages are not built that way in Drupal. You'll need to scrape the output to get a copy of all the pages.
Kevin
There are no stored procedures or table views used by Drupal core. The text content of every field is stored in its own tables, and other tables hold the entities those fields belong to. You will either need to scrape the site's HTML output, or construct a migration to a new platform using that platform's migration tools.
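To make the entity/field split concrete, here is a minimal sketch that joins the node entity table to the body field table. It assumes a stock Drupal 8/9 schema with no table prefix, where `node_field_data` holds the node rows and `node__body` holds the body field. The function name `fetch_node_bodies` and the query constant are made up for illustration. Note that this only returns the raw stored markup, not the rendered page, so it does not replace scraping.

```python
# Assumed Drupal 8/9 core layout: node_field_data has one row per node
# translation, node__body has the body field values for those nodes.
NODE_BODY_QUERY = """
SELECT n.nid, n.title, b.body_format, b.body_value
FROM node_field_data AS n
JOIN node__body AS b
  ON b.entity_id = n.nid AND b.langcode = n.langcode
WHERE n.status = 1 AND b.deleted = 0
ORDER BY n.nid
"""


def fetch_node_bodies(cursor):
    """Return published nodes with their raw (unfiltered) body markup.

    `cursor` is any DB-API cursor that yields plain tuples, for example
    the default cursor from mysql.connector.
    """
    cursor.execute(NODE_BODY_QUERY)
    return [
        {"nid": nid, "title": title, "format": fmt, "body": body or ""}
        for nid, title, fmt, body in cursor.fetchall()
    ]
```

Other fields follow the same pattern (a table named after the entity type and the field), so custom fields would need their own joins. Blocks, menus, views and theme templates are not covered at all, which is why the stored markup alone will not give you complete pages.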
Score:1

Drupal pages are generated in a complex way. Even the content stored in the database is usually run through the input filters of its text format before it is rendered. So you will need to use Drupal to generate each page you want a copy of, and save that output.

But... that's a pain, so there are some automated ways to do it. The downside is that it takes some work to set them up.

If you want to do this repeatedly, you can try the Tome module. This allows you to generate a static site from Drupal 8+ on demand.

Alternately, you could look into setting up Gatsby with Drupal.

Score:0

You can use Tome for that.

Tome is a static site generator, and a static storage system for content.

When Tome is enabled, any changes to config, content, or files will be automatically synced to your local filesystem. These exports can be used to fully rebuild the site from scratch, which removes the need for a persistent SQL database or filesystem. When you're ready to push to production, you can use Tome to generate a static HTML version of your site.

Long story short, you can use Drupal in the same way you would use other static site generators like Jekyll or Hugo - everything lives in one repository, and Drupal only runs on your local machine.

Score:0

The appropriate and documented way to do this is to run the Drupal website on a web server and then create a static archive of the website.
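As a rough illustration of the "static archive" step, here is a small same-host crawler using only the Python standard library. It assumes the restored site is already running and reachable (the `http://localhost/` URL in the usage note is a placeholder). The names `LinkCollector`, `internal_links` and `crawl` are invented for this sketch. It has no error handling and does not fetch CSS, JavaScript or images, so an existing mirroring tool is likely a better choice for a real archive.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def internal_links(html, page_url):
    """Return absolute same-host http(s) links in html, without fragments."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(page_url).netloc
    found = set()
    for href in collector.hrefs:
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        parts = urlparse(absolute)
        if parts.scheme in ("http", "https") and parts.netloc == host:
            found.add(absolute)
    return found


def crawl(start_url, fetch=None, limit=500):
    """Breadth-first crawl from start_url; returns a dict of {url: html}."""
    if fetch is None:
        def fetch(url):
            # Default fetcher: plain HTTP GET, decoded as UTF-8.
            with urlopen(url) as response:
                return response.read().decode("utf-8", errors="replace")
    pages, queue = {}, [start_url]
    while queue and len(pages) < limit:
        url = queue.pop(0)
        if url in pages:
            continue
        pages[url] = fetch(url)
        # Queue newly discovered same-host pages that are not saved yet.
        queue.extend(sorted(internal_links(pages[url], url) - set(pages)))
    return pages
```

Calling `crawl("http://localhost/")` would return the rendered HTML of every page reachable by links from the front page, which you could then write to disk. Pages that are not linked from anywhere will be missed.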
