1. 27 Feb, 2018 4 commits
  2. 16 Feb, 2018 4 commits
  3. 14 Feb, 2018 1 commit
    • inotify: Extend ioctl to allow to request id of new watch descriptor · e1603b6e
      Committed by Kirill Tkhai
      A watch descriptor is the id of a watch created by inotify_add_watch().
      It is allocated in inotify_add_to_idr() and takes numbers
      starting from 1. Every new inotify watch obtains the next available
      number (usually old + 1), as served by idr_alloc_cyclic().
      
      The CRIU (Checkpoint/Restore In Userspace) project supports inotify
      files and restores watch descriptors with the same numbers
      they had before the dump. Since there was no kernel support, we
      had to use a loop to add a watch with a specific descriptor id:
      
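      	while (1) {
      		int wd;
      
      		/* Each call hands out the next free id (usually old + 1). */
      		wd = inotify_add_watch(inotify_fd, path, mask);
      		if (wd < 0) {
      			break;		/* real failure: give up */
      		} else if (wd == desired_wd_id) {
      			ret = 0;	/* got the id we wanted */
      			break;
      		}
      
      		/* Wrong id: drop the watch and try the next one. */
      		inotify_rm_watch(inotify_fd, wd);
      	}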
      
      (You may find the actual code at the link below:
       https://github.com/checkpoint-restore/criu/blob/v3.7/criu/fsnotify.c#L577)
      
      The loop is suboptimal and very expensive, but since there was no better
      kernel support, it was the only way to restore that. Fortunately, we had
      mostly met descriptors with small ids, and this approach had worked somehow.
      
      But recently, containers using inotify with large watch descriptors
      have begun to appear, and this approach stopped working at all. When the
      descriptor id is something like 0x34d71d6, the restoring process spins in
      a busy loop for a long time, the restore hangs, and the delay of migration
      from node to node can easily be observed.
      
      This patch aims to solve this problem. It introduces a new ioctl,
      INOTIFY_IOC_SETNEXTWD, which allows userspace to request the number of the
      next created watch descriptor. It simply calls the idr_set_cursor()
      primitive to populate idr::idr_next, so that the next idr_alloc_cyclic()
      allocation will return this id if it is not occupied. This is the same
      approach used to restore some other resources from userspace. For example,
      /proc/sys/kernel/ns_last_pid works the same way for task pids.
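
A minimal userspace sketch of how a restorer might use the new ioctl in place of the loop. The helper name `add_watch_with_id` is hypothetical and not taken from CRIU. The fallback `#define` is an assumption that mirrors what the kernel's `linux/inotify.h` is expected to provide, for build hosts whose headers predate this patch. The sketch also assumes `path` is not already watched on `inotify_fd`, since `inotify_add_watch()` would then return the existing descriptor.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdint.h>
#include <sys/inotify.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Assumption: same encoding as the kernel uapi header; only used when the
 * libc header does not already define it. */
#ifndef INOTIFY_IOC_SETNEXTWD
#define INOTIFY_IOC_SETNEXTWD _IOW('I', 0, int32_t)
#endif

/*
 * Hypothetical helper: ask the kernel to hand out desired_wd next, then add
 * the watch once. Returns desired_wd on success and -1 (with errno set) on
 * failure, including kernels built without CONFIG_CHECKPOINT_RESTORE.
 */
int add_watch_with_id(int inotify_fd, const char *path, uint32_t mask,
		      int desired_wd)
{
	int wd;

	/* Watch descriptors start from 1. */
	if (desired_wd < 1) {
		errno = EINVAL;
		return -1;
	}

	/* The id is passed by value, not through a pointer. */
	if (ioctl(inotify_fd, INOTIFY_IOC_SETNEXTWD,
		  (unsigned long)desired_wd) < 0)
		return -1;

	wd = inotify_add_watch(inotify_fd, path, mask);
	if (wd < 0)
		return -1;

	if (wd != desired_wd) {
		/* The requested id was occupied: undo and report failure. */
		inotify_rm_watch(inotify_fd, wd);
		errno = EBUSY;
		return -1;
	}

	return wd;
}
```

A caller would invoke it once per watch to restore, for example `add_watch_with_id(fd, path, mask, 0x34d71d6)`, and fall back to the old loop if it fails because the ioctl is unsupported.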
      
      The new code is under the CONFIG_CHECKPOINT_RESTORE #define, so small
      systems may exclude it.
      
      v2: Use INT_MAX instead of a custom definition of max id,
      as the IDR subsystem guarantees the id is between 0 and INT_MAX.
      
      CC: Jan Kara <jack@suse.cz>
      CC: Matthew Wilcox <willy@infradead.org>
      CC: Andrew Morton <akpm@linux-foundation.org>
      CC: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
      Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Jan Kara <jack@suse.cz>
  4. 12 Feb, 2018 1 commit
    • vfs: do bulk POLL* -> EPOLL* replacement · a9a08845
      Committed by Linus Torvalds
      This is the mindless scripted replacement of kernel use of POLL*
      variables as described by Al, done by this script:
      
          for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
              L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
              for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
          done
      
      with de-mangling cleanups yet to come.
      
      NOTE! On almost all architectures, the EPOLL* constants have the same
      values as the POLL* constants do.  But the keyword here is "almost".
      For various bad reasons they aren't the same, and epoll() doesn't
      actually work quite correctly in some cases due to this on Sparc et al.
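
The "almost" can be checked from userspace with a small sketch like the one below. It compares the libc definitions in `<poll.h>` and `<sys/epoll.h>` on the build architecture, which is only an approximation of the kernel-internal constants the patch touches. The names `flag_pairs` and `count_poll_epoll_mismatches` are made up for this illustration, and only a subset of the flags from the script is covered.

```c
#define _GNU_SOURCE
#include <poll.h>
#include <stdio.h>
#include <sys/epoll.h>

struct flag_pair {
	const char *name;
	unsigned int poll_val;
	unsigned int epoll_val;
};

/* Builds one table row from a suffix, e.g. PAIR(IN) -> POLLIN / EPOLLIN. */
#define PAIR(V) { #V, (unsigned int)POLL##V, (unsigned int)EPOLL##V }

static const struct flag_pair flag_pairs[] = {
	PAIR(IN), PAIR(OUT), PAIR(PRI), PAIR(ERR), PAIR(HUP),
	PAIR(RDNORM), PAIR(RDBAND), PAIR(WRNORM), PAIR(WRBAND), PAIR(RDHUP),
};

/* Returns how many POLL* / EPOLL* pairs differ on this architecture and
 * prints each differing pair. */
int count_poll_epoll_mismatches(void)
{
	int mismatches = 0;
	size_t i;

	for (i = 0; i < sizeof(flag_pairs) / sizeof(flag_pairs[0]); i++) {
		const struct flag_pair *p = &flag_pairs[i];

		if (p->poll_val != p->epoll_val) {
			printf("POLL%s=0x%x differs from EPOLL%s=0x%x\n",
			       p->name, p->poll_val, p->name, p->epoll_val);
			mismatches++;
		}
	}

	return mismatches;
}
```

On an architecture where the two sets agree the function returns 0. A nonzero result points at the kind of difference the note above warns about.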
      
      The next patch from Al will sort out the final differences, and we
      should be all done.
      Scripted-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  5. 09 Feb, 2018 5 commits
  6. 07 Feb, 2018 25 commits