Version: Ubuntu 22.04.3
GCC: 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04)
Hi,
I'm trying to create a program with a very large shared memory object. This is being done on an AWS EC2 instance with 130GB of RAM. shm_open(), ftruncate(), and mmap() give no errors when a shared memory object of up to 120GB is created. However, when each memory location in the shared memory object is stepped through and read, a bus error occurs. I created a small test program, included below, which produces exactly repeatable results.
Note that shmmax = 18446744073692774399, shmall = 18446744073692774399, and shmmni = 8092.
When starting at the bottom of the shared memory object and reading up, the bus error occurs at 66,936,954,880. When starting at the top (79,999,999,999) and reading down, it has a bus error after 13,063,041,023 reads, so at 66,936,958,976 from the bottom. So there is a one-page (4096 bytes) gap between the two places where the bus error occurs.
Any idea what might be happening?
Thanks,
Gene
Below is a very simple C/C++ test program that shows the issue. The shared memory object size is simply hard-coded to 80GB. Swap the active and commented-out for loops to make it increment or decrement through the shared memory object.
// g++ -std=c++20 -O3 test2.cpp -W -Wall -Wextra -pedantic -pthread -o test2
#include <cstdint>   // uint_fast64_t
#include <cstdlib>   // exit, EXIT_FAILURE
#include <iostream>
#include <string>
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
int main() {
    uint_fast64_t mem_amt = 80000000000;
    std::cout << "mem_amt = " << mem_amt << "\n";
    int fd;
    std::string shmpath = "/foo";
    // Remove any existing shared memory object.
    shm_unlink(shmpath.c_str());
    // Create the shared memory object with read-write access.
    fd = shm_open(shmpath.c_str(), O_CREAT | O_EXCL | O_RDWR, S_IRUSR | S_IWUSR);
    if (fd == -1) {
        std::cerr << "\nshm_open shmbuf failure. Exiting program.\n\n";
        exit(EXIT_FAILURE);
    }
    // Truncate (set) the size.
    if (ftruncate64(fd, mem_amt) == -1) {
        std::cerr << "\nftruncate shmbuf failure. Exiting program.\n\n";
        exit(EXIT_FAILURE);
    }
    // Map the shared memory object.
    char* pool = (char*)mmap(NULL, mem_amt, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (pool == MAP_FAILED) {
        std::cerr << "\nmmap pool failure. Exiting program.\n\n";
        exit(EXIT_FAILURE);
    }
    std::cout << "pool = " << (uint_fast64_t)pool << "\n";
    char temp;
    // Read every byte, bottom to top.
    for (uint_fast64_t i = 0; i < mem_amt; i++) {
    // Alternative: read every byte, top to bottom (including offset 0).
    // for (uint_fast64_t i = mem_amt; i-- > 0; ) {
        temp = pool[i];
        if (i % 5000000000 == 0) {
            std::cout << "i = " << i << "\n";
        }
    }
    std::cout << "temp = " << temp << "\n";
}
gdb output of the core files from incrementing and decrementing, respectively:
Core was generated by `./test2'.
Program terminated with signal SIGBUS, Bus error.
#0 0x00005570b7fd1373 in main () at test2.cpp:47
47 temp = pool[i];
(gdb) bt full
#0 0x00005570b7fd1373 in main () at test2.cpp:47
i = 66936954880
mem_amt = 80000000000
fd = <optimized out>
shmpath = "/foo"
pool = 0x7fa09da0e000 ""
temp = <optimized out>
(gdb)
Core was generated by `./test2'.
Program terminated with signal SIGBUS, Bus error.
#0 0x000055e242fdc379 in main () at test2.cpp:47
47 temp = pool[i];
(gdb) bt full
#0 0x000055e242fdc379 in main () at test2.cpp:47
i = 13063041023
mem_amt = 80000000000
fd = <optimized out>
shmpath = "/foo"
pool = 0x7f7366a0e000 ""
temp = <optimized out>
(gdb)