1. 18 12月, 2014 1 次提交
  2. 17 12月, 2014 1 次提交
    • L
      microblaze: Fix mmap for cache coherent memory · 3a8e3265
      Lars-Peter Clausen 提交于
      When running in non-cache coherent configuration the memory that was
      allocated with dma_alloc_coherent() has a custom mapping and so there is no
      1-to-1 relationship between the kernel virtual address and the PFN. This
      means that virt_to_pfn() will not work correctly for those addresses and the
      default mmap implementation in the form of dma_common_mmap() will map some
      random, but not the requested, memory area.
      
      Fix this by providing a custom mmap implementation that looks up the PFN
      from the page table rather than using virt_to_pfn.
      Signed-off-by: NLars-Peter Clausen <lars@metafoo.de>
      Signed-off-by: NMichal Simek <michal.simek@xilinx.com>
      3a8e3265
  3. 16 12月, 2014 5 次提交
  4. 14 12月, 2014 7 次提交
  5. 13 12月, 2014 1 次提交
  6. 12 12月, 2014 6 次提交
  7. 11 12月, 2014 19 次提交
    • M
      arm64: mm: dump: don't skip final region · fb59d007
      Mark Rutland 提交于
      If the final page table entry we walk is a valid mapping, the page table
      dumping code will not log the region this entry is part of, as the final
      note_page call in ptdump_show will trigger an early return. Luckily this
      isn't seen on contemporary systems as they typically don't have enough
      RAM to extend the linear mapping right to the end of the address space.
      
      In note_page, we log a region  when we reach its end (i.e. we hit an
      entry immediately afterwards which has different prot bits or is
      invalid). The final entry has no subsequent entry, so we will not log
      this immediately. We try to cater for this with a subsequent call to
      note_page in ptdump_show, but this returns early as 0 < LOWEST_ADDR, and
      hence we will skip a valid mapping if it spans to the final entry we
      note.
      
      Unlike 32-bit ARM, the pgd with the kernel mapping is never shared with
      user mappings, so we do not need the check to ensure we don't log user
      page tables. Due to the way addr is constructed in the walk_* functions,
      it can never be less than LOWEST_ADDR when walking the page tables, so
      it is not necessary to avoid dereferencing invalid table addresses. The
      existing checks for st->current_prot and st->marker[1].start_address are
      sufficient to ensure we will not print and/or dereference garbage when
      trying to log information.
      
      This patch removes the unnecessary check against LOWEST_ADDR, ensuring
      we log all regions in the kernel page table, including those which span
      right to the end of the address space.
      
      Cc: Kees Cook <keescook@chromium.org>
      Acked-by: NLaura Abbott <lauraa@codeaurora.org>
      Acked-by: NSteve Capper <steve.capper@linaro.org>
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      fb59d007
    • M
      arm64: mm: dump: fix shift warning · 35545f0c
      Mark Rutland 提交于
      When building with 48-bit VAs, it's possible to get the following
      warning when building the arm64 page table dumping code:
      
      arch/arm64/mm/dump.c: In function ‘walk_pgd’:
      arch/arm64/mm/dump.c:266:2: warning: right shift count >= width of type
        pgd_t *pgd = pgd_offset(mm, 0);
        ^
      
      As pgd_offset is a macro and the second argument is not cast to any
      particular type, the zero will be given integer type by the compiler.
      As pgd_offset passes the pargument to pgd_index, we then try to shift
      the 32-bit integer by at least 39 bits (for 4k pages).
      
      Elsewhere the pgd_offset is passed a second argument of unsigned long
      type, so let's do the same here by passing '0UL' rather than '0'.
      
      Cc: Kees Cook <keescook@chromium.org>
      Acked-by: NLaura Abbott <lauraa@codeaurora.org>
      Acked-by: NSteve Capper <steve.capper@arm.com>
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      35545f0c
    • K
      arm64: psci: Fix build breakage without PM_SLEEP · e5e62d47
      Krzysztof Kozlowski 提交于
      Fix build failure of defconfig when PM_SLEEP is disabled (e.g. by
      disabling SUSPEND) and CPU_IDLE enabled:
      
      arch/arm64/kernel/psci.c:543:2: error: unknown field ‘cpu_suspend’ specified in initializer
        .cpu_suspend = cpu_psci_cpu_suspend,
        ^
      arch/arm64/kernel/psci.c:543:2: warning: initialization from incompatible pointer type [enabled by default]
      arch/arm64/kernel/psci.c:543:2: warning: (near initialization for ‘cpu_psci_ops.cpu_prepare’) [enabled by default]
      make[1]: *** [arch/arm64/kernel/psci.o] Error 1
      
      The cpu_operations.cpu_suspend field exists only if ARM64_CPU_SUSPEND is
      defined, not CPU_IDLE.
      Signed-off-by: NKrzysztof Kozlowski <k.kozlowski@samsung.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      e5e62d47
    • J
      xen: switch to post-init routines in xen mmu.c earlier · cdfa0bad
      Juergen Gross 提交于
      With the virtual mapped linear p2m list the post-init mmu operations
      must be used for setting up the p2m mappings, as in case of
      CONFIG_FLATMEM the init routines may trigger BUGs.
      
      paging_init() sets up all infrastructure needed to switch to the
      post-init mmu ops done by xen_post_allocator_init(). With the virtual
      mapped linear p2m list we need some mmu ops during setup of this list,
      so we have to switch to the correct mmu ops as soon as possible.
      
      The p2m list is usable from the beginning, just expansion requires to
      have established the new linear mapping. So the call of
      xen_remap_memory() had to be introduced, but this is not due to the
      mmu ops requiring this.
      
      Summing it up: calling xen_post_allocator_init() not directly after
      paging_init() was conceptually wrong in the beginning, it just didn't
      matter up to now as no functions used between the two calls needed
      some critical mmu ops (e.g. alloc_pte). This has changed now, so I
      corrected it.
      Reported-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: NJuergen Gross <jgross@suse.com>
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      cdfa0bad
    • B
      x86/asm: Unify segment selector defines · be9d1738
      Borislav Petkov 提交于
      Those are identical on 32- and 64-bit, unify them. No functional
      change.
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1418127959-29902-1-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      be9d1738
    • B
      x86/asm: Guard against building the 32/64-bit versions of the asm-offsets*.c file directly · 5de2b61a
      Borislav Petkov 提交于
      Sometimes it is helpful to build a kernel compilation unit
      directly, i.e.:
      
        make .../<filename>.i
      
      in order to look at compiler output.
      
      Since asm-offsets_{32,64}.c are included by asm-offsets.c and
      building them directly doesn't evaluate the macros used (thus
      making the preprocessor output not very useful), error out when
      an attempt is made to build them. Issue a hint for the user to
      build asm-offsets.c instead.
      Suggested-by: NMichael Matz <matz@suse.de>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Michal Marek <mmarek@suse.cz>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1418139917-12722-1-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      5de2b61a
    • A
      x86_64, switch_to(): Load TLS descriptors before switching DS and ES · f647d7c1
      Andy Lutomirski 提交于
      Otherwise, if buggy user code points DS or ES into the TLS
      array, they would be corrupted after a context switch.
      
      This also significantly improves the comments and documents some
      gotchas in the code.
      
      Before this patch, the both tests below failed.  With this
      patch, the es test passes, although the gsbase test still fails.
      
       ----- begin es test -----
      
      /*
       * Copyright (c) 2014 Andy Lutomirski
       * GPL v2
       */
      
      static unsigned short GDT3(int idx)
      {
      	return (idx << 3) | 3;
      }
      
      static int create_tls(int idx, unsigned int base)
      {
      	struct user_desc desc = {
      		.entry_number    = idx,
      		.base_addr       = base,
      		.limit           = 0xfffff,
      		.seg_32bit       = 1,
      		.contents        = 0, /* Data, grow-up */
      		.read_exec_only  = 0,
      		.limit_in_pages  = 1,
      		.seg_not_present = 0,
      		.useable         = 0,
      	};
      
      	if (syscall(SYS_set_thread_area, &desc) != 0)
      		err(1, "set_thread_area");
      
      	return desc.entry_number;
      }
      
      int main()
      {
      	int idx = create_tls(-1, 0);
      	printf("Allocated GDT index %d\n", idx);
      
      	unsigned short orig_es;
      	asm volatile ("mov %%es,%0" : "=rm" (orig_es));
      
      	int errors = 0;
      	int total = 1000;
      	for (int i = 0; i < total; i++) {
      		asm volatile ("mov %0,%%es" : : "rm" (GDT3(idx)));
      		usleep(100);
      
      		unsigned short es;
      		asm volatile ("mov %%es,%0" : "=rm" (es));
      		asm volatile ("mov %0,%%es" : : "rm" (orig_es));
      		if (es != GDT3(idx)) {
      			if (errors == 0)
      				printf("[FAIL]\tES changed from 0x%hx to 0x%hx\n",
      				       GDT3(idx), es);
      			errors++;
      		}
      	}
      
      	if (errors) {
      		printf("[FAIL]\tES was corrupted %d/%d times\n", errors, total);
      		return 1;
      	} else {
      		printf("[OK]\tES was preserved\n");
      		return 0;
      	}
      }
      
       ----- end es test -----
      
       ----- begin gsbase test -----
      
      /*
       * gsbase.c, a gsbase test
       * Copyright (c) 2014 Andy Lutomirski
       * GPL v2
       */
      
      static unsigned char *testptr, *testptr2;
      
      static unsigned char read_gs_testvals(void)
      {
      	unsigned char ret;
      	asm volatile ("movb %%gs:%1, %0" : "=r" (ret) : "m" (*testptr));
      	return ret;
      }
      
      int main()
      {
      	int errors = 0;
      
      	testptr = mmap((void *)0x200000000UL, 1, PROT_READ | PROT_WRITE,
      		       MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS, -1, 0);
      	if (testptr == MAP_FAILED)
      		err(1, "mmap");
      
      	testptr2 = mmap((void *)0x300000000UL, 1, PROT_READ | PROT_WRITE,
      		       MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS, -1, 0);
      	if (testptr2 == MAP_FAILED)
      		err(1, "mmap");
      
      	*testptr = 0;
      	*testptr2 = 1;
      
      	if (syscall(SYS_arch_prctl, ARCH_SET_GS,
      		    (unsigned long)testptr2 - (unsigned long)testptr) != 0)
      		err(1, "ARCH_SET_GS");
      
      	usleep(100);
      
      	if (read_gs_testvals() == 1) {
      		printf("[OK]\tARCH_SET_GS worked\n");
      	} else {
      		printf("[FAIL]\tARCH_SET_GS failed\n");
      		errors++;
      	}
      
      	asm volatile ("mov %0,%%gs" : : "r" (0));
      
      	if (read_gs_testvals() == 0) {
      		printf("[OK]\tWriting 0 to gs worked\n");
      	} else {
      		printf("[FAIL]\tWriting 0 to gs failed\n");
      		errors++;
      	}
      
      	usleep(100);
      
      	if (read_gs_testvals() == 0) {
      		printf("[OK]\tgsbase is still zero\n");
      	} else {
      		printf("[FAIL]\tgsbase was corrupted\n");
      		errors++;
      	}
      
      	return errors == 0 ? 0 : 1;
      }
      
       ----- end gsbase test -----
      Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
      Cc: <stable@vger.kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/509d27c9fec78217691c3dad91cec87e1006b34a.1418075657.git.luto@amacapital.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
      f647d7c1
    • X
      x86/mm: Use min() instead of min_t() in the e820 printout code · 29258cf4
      Xishi Qiu 提交于
      The type of "MAX_DMA_PFN" and "xXx_pfn" are both unsigned long
      now, so use min() instead of min_t().
      Suggested-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NXishi Qiu <qiuxishi@huawei.com>
      Cc: Linux MM <linux-mm@kvack.org>
      Cc: <dave@sr71.net>
      Cc: Rik van Riel <riel@redhat.com>
      Link: http://lkml.kernel.org/r/5487AB3F.7050807@huawei.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      29258cf4
    • X
      x86/mm: Fix zone ranges boot printout · c072b90c
      Xishi Qiu 提交于
      This is the usual physical memory layout boot printout:
      	...
      	[    0.000000] Zone ranges:
      	[    0.000000]   DMA      [mem 0x00001000-0x00ffffff]
      	[    0.000000]   DMA32    [mem 0x01000000-0xffffffff]
      	[    0.000000]   Normal   [mem 0x100000000-0xc3fffffff]
      	[    0.000000] Movable zone start for each node
      	[    0.000000] Early memory node ranges
      	[    0.000000]   node   0: [mem 0x00001000-0x00099fff]
      	[    0.000000]   node   0: [mem 0x00100000-0xbf78ffff]
      	[    0.000000]   node   0: [mem 0x100000000-0x63fffffff]
      	[    0.000000]   node   1: [mem 0x640000000-0xc3fffffff]
      	...
      
      This is the log when we set "mem=2G" on the boot cmdline:
      	...
      	[    0.000000] Zone ranges:
      	[    0.000000]   DMA      [mem 0x00001000-0x00ffffff]
      	[    0.000000]   DMA32    [mem 0x01000000-0xffffffff]  // should be 0x7fffffff, right?
      	[    0.000000]   Normal   empty
      	[    0.000000] Movable zone start for each node
      	[    0.000000] Early memory node ranges
      	[    0.000000]   node   0: [mem 0x00001000-0x00099fff]
      	[    0.000000]   node   0: [mem 0x00100000-0x7fffffff]
      	...
      
      This patch fixes the printout, the following log shows the right
      ranges:
      	...
      	[    0.000000] Zone ranges:
      	[    0.000000]   DMA      [mem 0x00001000-0x00ffffff]
      	[    0.000000]   DMA32    [mem 0x01000000-0x7fffffff]
      	[    0.000000]   Normal   empty
      	[    0.000000] Movable zone start for each node
      	[    0.000000] Early memory node ranges
      	[    0.000000]   node   0: [mem 0x00001000-0x00099fff]
      	[    0.000000]   node   0: [mem 0x00100000-0x7fffffff]
      	...
      Suggested-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NXishi Qiu <qiuxishi@huawei.com>
      Cc: Linux MM <linux-mm@kvack.org>
      Cc: <dave@sr71.net>
      Cc: Rik van Riel <riel@redhat.com>
      Link: http://lkml.kernel.org/r/5487AB3D.6070306@huawei.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      c072b90c
    • H
      arm: omap3: twl: remove usb phy init data · 81e1dadf
      Heikki Krogerus 提交于
      commit dbc98635 ("phy: remove the old lookup method") removes
      struct phy_consumer but twl-common.c still uses the "phy_consumer"
      structure resulting in the following compilation warning.
      
      arch/arm/mach-omap2/twl-common.c:94:21: error: array type has
      incomplete element type
       struct phy_consumer consumers[] = {
      
      Removed using phy_consumer since twl4030 uses the new lookup
      method.
      Signed-off-by: NHeikki Krogerus <heikki.krogerus@linux.intel.com>
      Signed-off-by: NKishon Vijay Abraham I <kishon@ti.com>
      Reported-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NGreg Kroah-Hartman <greg@kroah.com>
      81e1dadf
    • A
      make default ->i_fop have ->open() fail with ENXIO · bd9b51e7
      Al Viro 提交于
      As it is, default ->i_fop has NULL ->open() (along with all other methods).
      The only case where it matters is reopening (via procfs symlink) a file that
      didn't get its ->f_op from ->i_fop - anything else will have ->i_fop assigned
      to something sane (default would fail on read/write/ioctl/etc.).
      
      	Unfortunately, such case exists - alloc_file() users, especially
      anon_get_file() ones.  There we have tons of opened files of very different
      kinds sharing the same inode.  As the result, attempt to reopen those via
      procfs succeeds and you get a descriptor you can't do anything with.
      
      	Moreover, in case of sockets we set ->i_fop that will only be used
      on such reopen attempts - and put a failing ->open() into it to make sure
      those do not succeed.
      
      	It would be simpler to put such ->open() into default ->i_fop and leave
      it unchanged both for anon inode (as we do anyway) and for socket ones.  Result:
      	* everything going through do_dentry_open() works as it used to
      	* sock_no_open() kludge is gone
      	* attempts to reopen anon-inode files fail as they really ought to
      	* ditto for aio_private_file()
      	* ditto for perfmon - this one actually tried to imitate sock_no_open()
      trick, but failed to set ->i_fop, so in the current tree reopens succeed and
      yield completely useless descriptor.  Intent clearly had been to fail with
      -ENXIO on such reopens; now it actually does.
      	* everything else that used alloc_file() keeps working - it has ->i_fop
      set for its inodes anyway
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      bd9b51e7
    • J
      rtc: omap: drop vendor-prefix from power-controller dt property · 094d3ee3
      Johan Hovold 提交于
      Drop the vendor-prefix from the "ti,system-power-controller" device-tree
      property name.
      
      It has been agreed to make "system-power-controller" a standard property
      and to drop the vendor-prefix that is currently used by several drivers.
      
      Note that drivers that have used "<vendor>,system-power-controller" in a
      released kernel will need to support both versions.
      Signed-off-by: NJohan Hovold <johan@kernel.org>
      Cc: Tony Lindgren <tony@atomide.com>
      Cc: Benot Cousson <bcousson@baylibre.com>
      Cc: Alessandro Zummo <a.zummo@towertech.it>
      Cc: Felipe Balbi <balbi@ti.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      094d3ee3
    • J
      ARM: dts: am335x-boneblack: enable power off and rtc wake up · 672e2b14
      Johan Hovold 提交于
      Configure the RTC as system-power controller, which allows the system to
      be powered off as well as woken up again on subsequent RTC alarms.
      
      Note that the PMIC needs to be put in SLEEP (rather than OFF) mode to
      maintain RTC power.  Specifically, this means that the PMIC
      ti,pmic-shutdown-controller property must be left unset in order to be
      able to wake up on RTC alarms.
      
      Tested on BeagleBone Black (rev A5).
      Signed-off-by: NJohan Hovold <johan@kernel.org>
      Reviewed-by: NFelipe Balbi <balbi@ti.com>
      Tested-by: NFelipe Balbi <balbi@ti.com>
      Cc: Alessandro Zummo <a.zummo@towertech.it>
      Cc: Tony Lindgren <tony@atomide.com>
      Cc: Benot Cousson <bcousson@baylibre.com>
      Cc: Lokesh Vutla <lokeshvutla@ti.com>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Cc: Sekhar Nori <nsekhar@ti.com>
      Cc: Tero Kristo <t-kristo@ti.com>
      Cc: Keerthy J <j-keerthy@ti.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      672e2b14
    • J
      ARM: dts: am33xx: update rtc-node compatible property · 6ac7b4a2
      Johan Hovold 提交于
      Enable am33xx specific RTC features (e.g. PMIC control) by adding
      "ti,am3352-rtc" to the compatible property of the rtc node.
      Signed-off-by: NJohan Hovold <johan@kernel.org>
      Reviewed-by: NFelipe Balbi <balbi@ti.com>
      Tested-by: NFelipe Balbi <balbi@ti.com>
      Cc: Alessandro Zummo <a.zummo@towertech.it>
      Cc: Tony Lindgren <tony@atomide.com>
      Cc: Benot Cousson <bcousson@baylibre.com>
      Cc: Lokesh Vutla <lokeshvutla@ti.com>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Cc: Sekhar Nori <nsekhar@ti.com>
      Cc: Tero Kristo <t-kristo@ti.com>
      Cc: Keerthy J <j-keerthy@ti.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6ac7b4a2
    • J
      tile: use pr_warn instead of pr_warning · f80e6968
      Joe Perches 提交于
      Use the more common pr_warn.
      
      Coalesce formats, realign arguments.
      Signed-off-by: NJoe Perches <joe@perches.com>
      Acked-by: NChris Metcalf <cmetcalf@tilera.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f80e6968
    • J
      tile: neaten early_printk uses · 4b3ed614
      Joe Perches 提交于
      Coalesce the formats and align arguments.
      Signed-off-by: NJoe Perches <joe@perches.com>
      Acked-by: NChris Metcalf <cmetcalf@tilera.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4b3ed614
    • J
      printk: remove used-once early_vprintk · 1dc6244b
      Joe Perches 提交于
      Eliminate the unlikely possibility of message interleaving for
      early_printk/early_vprintk use.
      
      early_vprintk can be done via the %pV extension so remove this
      unnecessary function and change early_printk to have the equivalent
      vprintk code.
      
      All uses of early_printk already end with a newline so also remove the
      unnecessary newline from the early_printk function.
      Signed-off-by: NJoe Perches <joe@perches.com>
      Acked-by: NChris Metcalf <cmetcalf@tilera.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1dc6244b
    • Y
      ppc/cell: replace get_unused_fd() with get_unused_fd_flags(0) · 6b9cdf39
      Yann Droneaud 提交于
      This patch replaces calls to get_unused_fd() with equivalent call to
      get_unused_fd_flags(0) to preserve current behavor for existing code.
      
      In a further patch, get_unused_fd() will be removed so that new code start
      using get_unused_fd_flags(), with the hope O_CLOEXEC could be used, either
      by default or choosen by userspace.
      Signed-off-by: NYann Droneaud <ydroneaud@opteya.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6b9cdf39
    • Y
      ia64: replace get_unused_fd() with get_unused_fd_flags(0) · aeb682dd
      Yann Droneaud 提交于
      This patch replaces calls to get_unused_fd() with equivalent call to
      get_unused_fd_flags(0) to preserve current behavor for existing code.
      
      In a further patch, get_unused_fd() will be removed so that new code start
      using get_unused_fd_flags(), with the hope O_CLOEXEC could be used, either
      by default or choosen by userspace.
      Signed-off-by: NYann Droneaud <ydroneaud@opteya.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      aeb682dd