Score:0

Optimal automated Neo4j backup strategy

I'm trying to figure out the best way to run automated daily backups of my Neo4j database, which is larger than 2 TB. I need to take a copy of the database and upload it to S3. I have a Neo4j cluster with 1 primary and 2 secondaries. On all machines, the Neo4j data directory (/var/lib/neo4j/data) is mounted on a separate device. An offline dump of the database takes around 1 to 1.5 hours; an online dump takes much longer. What is the best approach here?

  • Do an online backup on one of the secondaries and upload it to some storage (such as S3)?
  • Do an online backup and use differential backups to speed up the process? (This approach relies on transaction logs, though, which may be rotated because of their size or age.)
  • Do an offline backup on one of the secondaries after taking it offline?
  • Take a snapshot of the whole storage device mounted at /var/lib/neo4j/data?
  • Something else?
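
To make the second option concrete, here is a minimal sketch of what a daily job might look like. It assumes Neo4j 5 Enterprise, where online backup is run with `neo4j-admin database backup` and `--type=AUTO` takes a differential backup when a usable backup chain already exists in the target directory and a full one otherwise. It also assumes the AWS CLI is installed and configured. The host name, backup directory, bucket name, and S3 prefix are hypothetical placeholders, and 6362 is only the default backup port. This is not a tested production script.

```python
import subprocess


def build_backup_cmd(secondary_host, backup_dir, database="neo4j", port=6362):
    """Build an online backup command that pulls from a secondary.

    Assumes the Neo4j 5 Enterprise CLI; the host and directory are placeholders.
    """
    return [
        "neo4j-admin", "database", "backup",
        f"--from={secondary_host}:{port}",
        f"--to-path={backup_dir}",
        "--type=AUTO",  # differential if a chain exists, otherwise full
        database,
    ]


def build_upload_cmd(backup_dir, bucket, prefix="neo4j-backups"):
    """Build an upload command; `aws s3 sync` only copies new or changed files."""
    return ["aws", "s3", "sync", backup_dir, f"s3://{bucket}/{prefix}/"]


def run_daily_backup(secondary_host, backup_dir, bucket, dry_run=True):
    """Run the backup, then the upload. With dry_run=True, only return the commands."""
    commands = [
        build_backup_cmd(secondary_host, backup_dir),
        build_upload_cmd(backup_dir, bucket),
    ]
    if not dry_run:
        for cmd in commands:
            # check=True stops the job if the backup fails, so a broken
            # backup is never followed by an upload.
            subprocess.run(cmd, check=True)
    return commands
```

Calling `run_daily_backup("secondary-1.internal", "/mnt/backups", "my-bucket")` from a cron job or systemd timer with `dry_run=False` would run both steps. Because differential backups form a chain, the sketch syncs the whole backup directory to one prefix rather than uploading a single file per day.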