Lustre glitch: latency of minutes


On an HPC Lustre filesystem, we occasionally see episodes where even opening a terminal and typing "ls" takes minutes to return. That is, any process that touches the filesystem suffers random, massive latency (but generally produces no actual errors), while processes that do not touch the filesystem (such as dragging windows around in an X Window session) remain responsive.
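To put numbers on these stalls, something like the following probe could be run periodically and its output compared against job schedules or server logs. This is only a minimal sketch: the function names `time_metadata_ops` and `probe` are made up for illustration, and it assumes that timing a directory listing plus a `stat` of each entry is a fair stand-in for what "ls" does.

```python
import os
import time


def time_metadata_ops(path):
    """Time a directory listing and a stat of every entry in `path`.

    Returns (listdir_seconds, stat_seconds). Entries that vanish or
    cannot be stat'ed are skipped rather than treated as errors.
    """
    t0 = time.perf_counter()
    entries = os.listdir(path)
    t1 = time.perf_counter()
    for name in entries:
        try:
            os.stat(os.path.join(path, name))
        except OSError:
            pass
    t2 = time.perf_counter()
    return t1 - t0, t2 - t1


def probe(path, samples=3, interval=0.0):
    """Take `samples` measurements, sleeping `interval` seconds between them.

    Returns a list of (unix_timestamp, listdir_seconds, stat_seconds).
    """
    rows = []
    for i in range(samples):
        listdir_s, stat_s = time_metadata_ops(path)
        rows.append((time.time(), listdir_s, stat_s))
        if i < samples - 1:
            time.sleep(interval)
    return rows
```

For example, `probe("/path/on/lustre", samples=60, interval=60)` (with a placeholder path) would record one measurement per minute for an hour. Running the same probe against a local disk at the same time would help confirm that the delay is specific to the Lustre mount.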

What can cause Lustre to intermittently exhibit excessive latency? (Is it necessarily a hardware failure, or could it be a misconfiguration, a nearly full filesystem, or just a nasty usage pattern from some distributed parallel job that day?)

