Score:0

PostgreSQL 13 - Speed up pg_dump to 5 minutes instead of 70 minutes

cn flag

We use pg_dump nightly to make a snapshot of our database. For a long time we did this with a simple command:

pg_dump -Fc database_name

This takes about an hour and produces a file of more than 30 GB.

How can we speed up things?

Score:4
cn flag

Our biggest table contains raw image data in the form of a bytea column.

When we ran a simple COPY from psql on this table to stdout, it was quite fast (1 or 2 minutes), but with pg_dump it was very slow: it took ~60 minutes.

Investigating this, I stumbled across this discussion. It seems that pg_dump's compression is rather slow if the data is already compressed, as it is with image data in a bytea column, and it is better to compress outside of pg_dump (-Z0).

Additionally, we found that we can make use of our multi-core CPU (-j 10 for pg_dump and pigz for compression). So now we are doing it like this:

$ pg_dump -Z0 -j 10 -Fd database_name -f dumpdir
$ tar -cf - dumpdir | pigz > dumpdir.tar.gz
$ rm -r dumpdir

The time has dropped from ~70 minutes to ~5 minutes. Quite amazing.

You can restore it like this:

$ mkdir dumpdir
$ pigz -dc dumpdir.tar.gz | tar -C dumpdir --strip-components 1 -xf -
$ pg_restore -j 10 -Fd -O -d database_name dumpdir
Melih avatar
gh flag
We had a similar problem. I spent 2 days on pg_dump and canceled it, but I still need to take the dump. We have a 3 TB database and more than 1.5 TB of it is a bytea image table. Would the time drop that much for big tables too?
Janning avatar
cn flag
Yes, I guess it will help a lot (i.e. save hours).
plancys avatar
sd flag
How do you restore the db from this? `pg_restore -d root -j 10 -Fd dump_2022-03-24_17_47_40.tar.gz` gives me `pg_restore: error: could not open input file "dump_2022-03-24_17_47_40.tar.gz/toc.dat": Not a directory`
Melih avatar
gh flag
@Janning It took 2 days before. Now with -j 2 it takes 6.5 hours. I hope it will take even less with -j 10... thanks
Janning avatar
cn flag
The key point is not -j; what matters is -Z0 to disable compression and -Fd to dump into a directory. -j just tells pg_dump how many parallel jobs (CPUs) it should use. If you have plenty, use them all.
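For example, to use every available core (a minimal sketch; `nproc` reports the core count on Linux, and the database and directory names are the ones from the answer):

$ pg_dump -Z0 -j "$(nproc)" -Fd database_name -f dumpdir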
Gabriel Linassi avatar
at flag
@Janning The pg_dump tip helps a lot. However, I'm dumping from a remote server, so the compression doesn't help me: it runs locally only after pg_dump is done. How can I run the pigz compression on the remote server and then download only the compressed file?
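One way to do this (a sketch, assuming ssh access to the database host, pigz installed there, and that pg_dump can connect locally on that host; user, host, and paths are placeholders):

$ ssh user@dbhost 'pg_dump -Z0 -j 10 -Fd database_name -f /tmp/dumpdir'
$ ssh user@dbhost 'tar -C /tmp -cf - dumpdir | pigz && rm -r /tmp/dumpdir' > dumpdir.tar.gz

Only the pigz output crosses the network; the uncompressed dump directory never leaves the server.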
Jose Paez avatar
tl flag
I tried this approach and it **didn't work** for me. The backed-up DB was a mess, so the speed gain wasn't really worth it. I wish I could remove my upvote; I lost a lot of time on this approach.
Janning avatar
cn flag
It has been working for me for years without any problems. I guess your backup script is wrong. Just post a new question on Stack Overflow with your script and link it here.
Markus avatar
nr flag
@JosePaez Maybe the problem was the -j option. I read in the docs: _pg_dump -j uses multiple database connections; [..]. Without the synchronized snapshot feature, the different worker jobs wouldn't be guaranteed to see the same data in each connection, which could lead to an **inconsistent backup**._
Alexandr Tovmach avatar
md flag
Works like a charm, tested on ~30GB database
pkExec avatar
bt flag
Great answer, thanks. For completeness' sake, I'd add the "-p10" parameter to pigz, where 10 is the number of cores you want to limit it to. By default, it uses all cores, which may or may not be what you want.
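For example, the compression step from the answer capped at 10 threads (a sketch; adjust the number to your machine):

$ tar -cf - dumpdir | pigz -p10 > dumpdir.tar.gz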