Could somebody explain how companies like Facebook and Instagram (or literally anyone who stores user data) technically handle the storage of EU citizens' data in Europe?
I am building a website that will allow users to upload their pictures and videos and share them with friends. As I understand the GDPR, all data that identifies EU citizens should be stored and processed in the EU. So, for example, if I upload a picture and my face is in it, that data is supposed to be stored in the EU.
Does this technically mean that we have to use multiple clusters of a distributed file system (for example, GlusterFS) for this task? For example:
- first cluster - non EU
- second cluster in EU
Then we mount the filesystems under the following directories:
and when the data is saved, it goes either to one server or the other. I find it hard to believe that this is the solution to the problem. Imagine a US Google Drive user moves to the EU: now Google has to migrate xx gigabytes of data to Europe. Given that all files have metadata associated with them somewhere in a DB, that DB also has to be updated at the same time as the data is migrated. Goodbye, ACID principles.
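To make the concern concrete, here is a minimal Python sketch of the application-level routing and migration I have in mind. Everything in it is hypothetical: the `RegionalStore` class, the region labels, and the in-memory `index` dict that stands in for the metadata DB are only illustrations, and the root directories are whatever paths the two clusters happen to be mounted under.

```python
import shutil
from pathlib import Path


class RegionalStore:
    """Routes each file to a per-region root directory (hypothetical design)."""

    def __init__(self, roots):
        # roots maps a region label to the directory where that
        # region's cluster is mounted (paths are assumptions).
        self.roots = {region: Path(root) for region, root in roots.items()}
        # file_id -> region; stands in for the metadata database.
        self.index = {}

    def path_for(self, file_id):
        # Resolve the current location from the metadata.
        return self.roots[self.index[file_id]] / file_id

    def save(self, file_id, data, region):
        # Write the bytes first, then record where they live.
        (self.roots[region] / file_id).write_bytes(data)
        self.index[file_id] = region

    def migrate(self, file_id, new_region):
        old_region = self.index[file_id]
        if old_region == new_region:
            return
        src = self.roots[old_region] / file_id
        dst = self.roots[new_region] / file_id
        shutil.copy2(src, dst)            # 1. copy; old copy is still readable
        self.index[file_id] = new_region  # 2. flip the metadata pointer
        src.unlink()                      # 3. only then delete the old copy
```

The file move and the metadata update are not one atomic transaction here. The copy, flip, delete ordering only ensures that the metadata always points at a copy that exists; a crash between steps 2 and 3 leaves an orphaned file in the old region that something would have to clean up later.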
Is there perhaps some kind of labeling mechanism that allows distributed file systems to automatically move data depending on the filename?
For example, if the file is named `us_myimage.png`, it goes to the server (or server group) with the `us` label, and if I later rename the file to `eu_myimage.png`, the file is rebalanced to the server with the `eu` label?
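As an illustration of the behavior I am asking about, here is a rough Python sketch that emulates it at the application level. It is not a feature of GlusterFS or any other file system that I know of. The helper names `label_of` and `rename_with_label`, the `roots` mapping, and the fallback `default` label are all made up for this example.

```python
import shutil
from pathlib import Path


def label_of(filename, roots, default):
    """Return the label encoded as a filename prefix, e.g. 'eu' for 'eu_x.png'."""
    prefix, sep, _ = filename.partition("_")
    # Fall back to the default when there is no prefix or it is not a known label.
    return prefix if sep and prefix in roots else default


def rename_with_label(roots, old_name, new_name, default):
    """Rename a file and relocate it if the new name carries a different label."""
    src = Path(roots[label_of(old_name, roots, default)]) / old_name
    dst = Path(roots[label_of(new_name, roots, default)]) / new_name
    # shutil.move renames in place on the same filesystem and falls back
    # to copy-then-delete when source and destination are on different ones.
    shutil.move(str(src), str(dst))
    return dst
```

With `roots = {"us": ..., "eu": ...}`, calling `rename_with_label(roots, "us_myimage.png", "eu_myimage.png", default="us")` would move the file from the `us` root to the `eu` root. What I am looking for is a file system that does this kind of placement by itself.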
I have looked into some existing solutions like GlusterFS and MinIO but could not find any leads. The closest solution to this problem is implemented by MongoDB through GridFS, where you can move data between clusters based on labels. In my case, I am interested in a file system storage solution.