I am writing some gross middleware - basically, I have some old code that needs to open 100,000 files for reading only, expecting them all to be in one folder. It never writes. It is multiprocess, so it can try to open ~30 files at the same time. The old way, I would have to actually copy the files into that folder (or use links, NFS, etc.). Worth noting I have no ability to change this old code - it's just a binary.
I have some new, fancy code that can retrieve a file almost instantly. I want to tie these things together, so when the old code tries to open the file, it is actually, in real time, running the new code.
So I thought of mkfifo and inotifywait. Instead of a folder of 100,000 files, I can make a folder of 100,000 named pipes. So far so good. The legacy code goes to open the files, not knowing that they are indeed named pipes. The problem is, I don't know what order the legacy code is going to open the files (nice, right?). So I would like to TRIGGER the named pipe WRITE (from my fancy new code) when the legacy code goes in for the read. I can't spawn 100,000 writes and have them all block. So I thought hey - inotifywait makes sense. Every time the legacy goes to open the pipe, it triggers a read event, which can then be used to spawn the pipe writer in the background. The problem is.. inotifywait doesn't trigger the read event until AFTER the writer has been spawned!
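To make the blocking behaviour concrete, here is a minimal sketch of a single FIFO with one background writer. The temporary directory, the name demo.png and the payload text are placeholders for illustration only; nothing here is part of the legacy setup.

```shell
#!/usr/bin/env bash
# Sketch: a writer's open() on a FIFO blocks until a reader opens the other end.
dir=$(mktemp -d)
fifo="$dir/demo.png"          # illustrative name; it is a FIFO, not a real PNG
mkfifo "$fifo"

# Background writer: the '>' redirection blocks in open() until a reader arrives.
( printf 'payload\n' > "$fifo"; touch "$dir/writer_done" ) &
writer_pid=$!

# Give the writer time to run; without a reader it cannot get past open().
sleep 0.5
if [ -e "$dir/writer_done" ]; then before="finished"; else before="blocked"; fi

# Now act as the legacy reader: opening the FIFO releases the writer.
result=$(cat "$fifo")
wait "$writer_pid"

echo "writer before reader: $before"
echo "reader got: $result"
rm -rf "$dir"
```

This prints "writer before reader: blocked" and then "reader got: payload". Each pending writer is a whole process parked in open(), which is why pre-spawning one per pipe for 100,000 pipes is not practical.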
Any ideas of how to solve this? Basically - I want to intercept a file open, block for a couple hundred ms while I retrieve the contents of the file, then return those contents. Ideally I don't have to create a custom FUSE filesystem to do this.. it's just a read-only file open. The problem is this needs to run fast and in parallel.. and I don't know which files are going to be opened in what order. Gotta be a quick and dirty way!
Thanks in advance for everyone's time.
EDIT - for some more details. Basically I have some legacy code that wants to load a folder full of PNG files. I want those PNG files to actually come from a web server that returns DICOM files. This requires some ugly conversion, etc. The legacy PNG loading code is very inflexible.. it expects these things to be files. So basically, I want to intercept the fopen of the PNG loading code and run the following four lines of bash pseudocode first. The $URL_FOR_DICOM below can be derived from the $LAZY_LOADED.png filename.
wget -q -O "$LAZY_LOADED.dcm" "$URL_FOR_DICOM"
dcmj2pnm --write-png "$LAZY_LOADED.dcm" "$LAZY_LOADED.png"
rm "$LAZY_LOADED.dcm"
convert "$LAZY_LOADED.png" -resize '1024x1024^' -gravity center -extent 1024x1024 "$LAZY_LOADED.png"
So when the PNG loader tries to load $LAZY_LOADED.png (which is actually a FIFO), it would get populated using the above, ideally triggered by inotify. I can't do this in advance because the dataset is massive - like close to 0.5PB.. so I can't have a second copy around, I need it to be loaded on the fly from the web server.
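One wrinkle with pointing those four lines at a FIFO: the last step reads and rewrites the same PNG path in place, which a pipe cannot support. A hedged sketch of the same pipeline staged through regular temp files and only streamed into the pipe at the end might look like this. The function name fill_fifo, its two arguments and the temp file names are hypothetical; the wget, dcmj2pnm and convert invocations are the ones from the pseudocode above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: fill_fifo FIFO_PATH DICOM_URL
# Runs the fetch/convert steps on regular temp files, then streams the
# finished PNG into the FIFO in one go.
fill_fifo() {
    local fifo=$1 url=$2 tmp status
    tmp=$(mktemp -d) || return 1

    # Same steps as the pseudocode, but never touching the FIFO until the end.
    wget -q -O "$tmp/in.dcm" "$url" &&
    dcmj2pnm --write-png "$tmp/in.dcm" "$tmp/raw.png" &&
    convert "$tmp/raw.png" -resize '1024x1024^' -gravity center \
            -extent 1024x1024 "$tmp/out.png" &&
    cat "$tmp/out.png" > "$fifo"    # blocks until the reader opens the FIFO
    status=$?

    rm -rf "$tmp"
    return "$status"
}
```

It would be called as `fill_fifo "$LAZY_LOADED.png" "$URL_FOR_DICOM" &` for a given pipe. This does not solve the triggering problem; it only shows how the writer side could be shaped once something decides which pipe to fill.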
EDIT 2-
When trying inotifywait on a named pipe, it blocks ANY events (including open, access, read.. etc) until the named pipe is open for writing AND reading... (i.e. no way to detect that the reader is ready)... ideas?
Another user had a similar problem here with no solution : (