Implement scheduling that considers the required filesystems for jobs, ensuring jobs are only scheduled in clusters with the necessary filesystems available.
Something like adding filesystem paths as constraints in the job launcher file:

```
...
required_filesystems:
  - /path/to/dir1
  - /path/to/dir2
```

so the job only gets scheduled in clusters where the required filesystems are present.

Extension: support different required filesystems depending on the cluster:

```
...
required_filesystems:
  cluster 1:
    - /path/to/dir1
  cluster 2:
    - /path/to/dir2
```

The goal is to handle cases where the same dataset lives under different locations in different clusters:

```
cluster 1: /data/MNIST/...
cluster 2: /home/$user/data/MNIST/...
```
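To make the proposal concrete, here is a rough sketch of how the cluster filter could work. Everything in it is an assumption for illustration: `eligible_clusters`, the `Required` type, and the `available` mapping are hypothetical names, not part of any existing scheduler API. It also assumes the scheduler already knows which paths exist on each cluster, and that in the per-cluster form a cluster with no entry is not eligible.

```python
from typing import Dict, List, Set, Union

# Hypothetical type: either one list for all clusters, or one list per cluster.
Required = Union[List[str], Dict[str, List[str]]]


def eligible_clusters(required: Required, available: Dict[str, Set[str]]) -> List[str]:
    """Return the clusters whose known filesystems cover the job's requirements."""
    eligible = []
    for cluster, paths in available.items():
        if isinstance(required, dict):
            # Per-cluster form: skip clusters the job does not list (assumption).
            if cluster not in required:
                continue
            needed = required[cluster]
        else:
            # Flat form: the same paths are required everywhere.
            needed = required
        if all(path in paths for path in needed):
            eligible.append(cluster)
    return eligible


# Example data only; the user name "alice" is made up.
available = {
    "cluster 1": {"/data/MNIST"},
    "cluster 2": {"/home/alice/data/MNIST"},
}

flat = eligible_clusters(["/data/MNIST"], available)
per_cluster = eligible_clusters(
    {"cluster 1": ["/data/MNIST"], "cluster 2": ["/home/alice/data/MNIST"]},
    available,
)
```

With the flat form only `cluster 1` qualifies, while the per-cluster form lets both clusters qualify because each is checked against its own path. How the `available` mapping gets populated (static config, a probe on each cluster, etc.) is left open here.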