1. 23 11月, 2016 2 次提交
  2. 12 11月, 2016 3 次提交
    • J
      mm: kmemleak: scan .data.ro_after_init · d7c19b06
      Jakub Kicinski 提交于
      Limit the number of kmemleak false positives by including
      .data.ro_after_init in memory scanning.  To achieve this we need to add
      symbols for start and end of the section to the linker scripts.
      
      The problem was been uncovered by commit 56989f6d ("genetlink: mark
      families as __ro_after_init").
      
      Link: http://lkml.kernel.org/r/1478274173-15218-1-git-send-email-jakub.kicinski@netronome.comReviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Johannes Berg <johannes@sipsolutions.net>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d7c19b06
    • H
      Revert "console: don't prefer first registered if DT specifies stdout-path" · c6c7d83b
      Hans de Goede 提交于
      This reverts commit 05fd007e ("console: don't prefer first
      registered if DT specifies stdout-path").
      
      The reverted commit changes existing behavior on which many ARM boards
      rely.  Many ARM small-board-computers, like e.g.  the Raspberry Pi have
      both a video output and a serial console.  Depending on whether the user
      is using the device as a more regular computer; or as a headless device
      we need to have the console on either one or the other.
      
      Many users rely on the kernel behavior of the console being present on
      both outputs, before the reverted commit the console setup with no
      console= kernel arguments on an ARM board which sets stdout-path in dt
      would look like this:
      
        [root@localhost ~]# cat /proc/consoles
        ttyS0                -W- (EC p a)    4:64
        tty0                 -WU (E  p  )    4:1
      
      Where as after the reverted commit, it looks like this:
      
        [root@localhost ~]# cat /proc/consoles
        ttyS0                -W- (EC p a)    4:64
      
      This commit reverts commit 05fd007e ("console: don't prefer first
      registered if DT specifies stdout-path") restoring the original
      behavior.
      
      Fixes: 05fd007e ("console: don't prefer first registered if DT specifies stdout-path")
      Link: http://lkml.kernel.org/r/20161104121135.4780-2-hdegoede@redhat.comSigned-off-by: NHans de Goede <hdegoede@redhat.com>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Frank Rowand <frowand.list@gmail.com>
      Cc: Thorsten Leemhuis <regressions@leemhuis.info>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c6c7d83b
    • V
      mm, frontswap: make sure allocated frontswap map is assigned · 5e322bee
      Vlastimil Babka 提交于
      Christian Borntraeger reports:
      
      With commit 8ea1d2a1 ("mm, frontswap: convert frontswap_enabled to
      static key") kmemleak complains about a memory leak in swapon
      
          unreferenced object 0x3e09ba56000 (size 32112640):
            comm "swapon", pid 7852, jiffies 4294968787 (age 1490.770s)
            hex dump (first 32 bytes):
              00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
              00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
            backtrace:
               __vmalloc_node_range+0x194/0x2d8
               vzalloc+0x58/0x68
               SyS_swapon+0xd60/0x12f8
               system_call+0xd6/0x270
      
      Turns out kmemleak is right.  We now allocate the frontswap map
      depending on the kernel config (and no longer on the enablement)
      
        swapfile.c:
        [...]
            if (IS_ENABLED(CONFIG_FRONTSWAP))
                      frontswap_map = vzalloc(BITS_TO_LONGS(maxpages) * sizeof(long));
      
      but later on this is passed along
        --> enable_swap_info(p, prio, swap_map, cluster_info, frontswap_map);
      
      and ignored if frontswap is disabled
        --> frontswap_init(p->type, frontswap_map);
      
        static inline void frontswap_init(unsigned type, unsigned long *map)
        {
              if (frontswap_enabled())
                      __frontswap_init(type, map);
        }
      
      Thing is, that frontswap map is never freed.
      
      The leakage is relatively not that bad, because swapon is an infrequent
      and privileged operation.  However, if the first frontswap backend is
      registered after a swap type has been already enabled, it will WARN_ON
      in frontswap_register_ops() and frontswap will not be available for the
      swap type.
      
      Fix this by making sure the map is assigned by frontswap_init() as long
      as CONFIG_FRONTSWAP is enabled.
      
      Fixes: 8ea1d2a1 ("mm, frontswap: convert frontswap_enabled to static key")
      Link: http://lkml.kernel.org/r/20161026134220.2566-1-vbabka@suse.czSigned-off-by: NVlastimil Babka <vbabka@suse.cz>
      Reported-by: NChristian Borntraeger <borntraeger@de.ibm.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5e322bee
  3. 11 11月, 2016 1 次提交
  4. 10 11月, 2016 1 次提交
  5. 08 11月, 2016 1 次提交
  6. 05 11月, 2016 1 次提交
  7. 31 10月, 2016 2 次提交
  8. 30 10月, 2016 6 次提交
  9. 29 10月, 2016 1 次提交
  10. 28 10月, 2016 6 次提交
    • J
      perf/powerpc: Don't call perf_event_disable() from atomic context · 5aab90ce
      Jiri Olsa 提交于
      The trinity syscall fuzzer triggered following WARN() on powerpc:
      
        WARNING: CPU: 9 PID: 2998 at arch/powerpc/kernel/hw_breakpoint.c:278
        ...
        NIP [c00000000093aedc] .hw_breakpoint_handler+0x28c/0x2b0
        LR [c00000000093aed8] .hw_breakpoint_handler+0x288/0x2b0
        Call Trace:
        [c0000002f7933580] [c00000000093aed8] .hw_breakpoint_handler+0x288/0x2b0 (unreliable)
        [c0000002f7933630] [c0000000000f671c] .notifier_call_chain+0x7c/0xf0
        [c0000002f79336d0] [c0000000000f6abc] .__atomic_notifier_call_chain+0xbc/0x1c0
        [c0000002f7933780] [c0000000000f6c40] .notify_die+0x70/0xd0
        [c0000002f7933820] [c00000000001a74c] .do_break+0x4c/0x100
        [c0000002f7933920] [c0000000000089fc] handle_dabr_fault+0x14/0x48
      
      Followed by a lockdep warning:
      
        ===============================
        [ INFO: suspicious RCU usage. ]
        4.8.0-rc5+ #7 Tainted: G        W
        -------------------------------
        ./include/linux/rcupdate.h:556 Illegal context switch in RCU read-side critical section!
      
        other info that might help us debug this:
      
        rcu_scheduler_active = 1, debug_locks = 0
        2 locks held by ls/2998:
         #0:  (rcu_read_lock){......}, at: [<c0000000000f6a00>] .__atomic_notifier_call_chain+0x0/0x1c0
         #1:  (rcu_read_lock){......}, at: [<c00000000093ac50>] .hw_breakpoint_handler+0x0/0x2b0
      
        stack backtrace:
        CPU: 9 PID: 2998 Comm: ls Tainted: G        W       4.8.0-rc5+ #7
        Call Trace:
        [c0000002f7933150] [c00000000094b1f8] .dump_stack+0xe0/0x14c (unreliable)
        [c0000002f79331e0] [c00000000013c468] .lockdep_rcu_suspicious+0x138/0x180
        [c0000002f7933270] [c0000000001005d8] .___might_sleep+0x278/0x2e0
        [c0000002f7933300] [c000000000935584] .mutex_lock_nested+0x64/0x5a0
        [c0000002f7933410] [c00000000023084c] .perf_event_ctx_lock_nested+0x16c/0x380
        [c0000002f7933500] [c000000000230a80] .perf_event_disable+0x20/0x60
        [c0000002f7933580] [c00000000093aeec] .hw_breakpoint_handler+0x29c/0x2b0
        [c0000002f7933630] [c0000000000f671c] .notifier_call_chain+0x7c/0xf0
        [c0000002f79336d0] [c0000000000f6abc] .__atomic_notifier_call_chain+0xbc/0x1c0
        [c0000002f7933780] [c0000000000f6c40] .notify_die+0x70/0xd0
        [c0000002f7933820] [c00000000001a74c] .do_break+0x4c/0x100
        [c0000002f7933920] [c0000000000089fc] handle_dabr_fault+0x14/0x48
      
      While it looks like the first WARN() is probably valid, the other one is
      triggered by disabling event via perf_event_disable() from atomic context.
      
      The event is disabled here in case we were not able to emulate
      the instruction that hit the breakpoint. By disabling the event
      we unschedule the event and make sure it's not scheduled back.
      
      But we can't call perf_event_disable() from atomic context, instead
      we need to use the event's pending_disable irq_work method to disable it.
      Reported-by: NJan Stancek <jstancek@redhat.com>
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michael Neuling <mikey@neuling.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20161026094824.GA21397@kravaSigned-off-by: NIngo Molnar <mingo@kernel.org>
      5aab90ce
    • B
      mtd: nand: Fix data interface configuration logic · 73f907fd
      Boris Brezillon 提交于
      When changing from one data interface setting to another, one has to
      ensure a specific sequence which is described in the ONFI spec.
      
      One of these constraints is that the CE line has go high after a reset
      before a command can be sent with the new data interface setting, which
      is not guaranteed by the current implementation.
      
      Rework the nand_reset() function and all the call sites to make sure the
      CE line is asserted and released when required.
      
      Also make sure to actually apply the new data interface setting on the
      first die.
      Signed-off-by: NBoris Brezillon <boris.brezillon@free-electrons.com>
      Fixes: d8e725dd ("mtd: nand: automate NAND timings selection")
      Reviewed-by: NSascha Hauer <s.hauer@pengutronix.de>
      Tested-by: NMarc Gonzalez <marc_gonzalez@sigmadesigns.com>
      73f907fd
    • M
      kconfig.h: remove config_enabled() macro · c0a0aba8
      Masahiro Yamada 提交于
      The use of config_enabled() is ambiguous.  For config options,
      IS_ENABLED(), IS_REACHABLE(), etc.  will make intention clearer.
      Sometimes config_enabled() has been used for non-config options because
      it is useful to check whether the given symbol is defined or not.
      
      I have been tackling on deprecating config_enabled(), and now is the
      time to finish this work.
      
      Some new users have appeared for v4.9-rc1, but it is trivial to replace
      them:
      
       - arch/x86/mm/kaslr.c
        replace config_enabled() with IS_ENABLED() because
        CONFIG_X86_ESPFIX64 and CONFIG_EFI are boolean.
      
       - include/asm-generic/export.h
        replace config_enabled() with __is_defined().
      
      Then, config_enabled() can be removed now.
      
      Going forward, please use IS_ENABLED(), IS_REACHABLE(), etc. for config
      options, and __is_defined() for non-config symbols.
      
      Link: http://lkml.kernel.org/r/1476616078-32252-1-git-send-email-yamada.masahiro@socionext.comSigned-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com>
      Acked-by: NIngo Molnar <mingo@kernel.org>
      Acked-by: NNicolas Pitre <nicolas.pitre@linaro.org>
      Cc: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Michal Marek <mmarek@suse.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Garnier <thgarnie@google.com>
      Cc: Paul Bolle <pebolle@tiscali.nl>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c0a0aba8
    • D
      net: ipv6: Do not consider link state for nexthop validation · d5d32e4b
      David Ahern 提交于
      Similar to IPv4, do not consider link state when validating next hops.
      
      Currently, if the link is down default routes can fail to insert:
       $ ip -6 ro add vrf blue default via 2100:2::64 dev eth2
       RTNETLINK answers: No route to host
      
      With this patch the command succeeds.
      
      Fixes: 8c14586f ("net: ipv6: Use passed in table for nexthop lookups")
      Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d5d32e4b
    • D
      net: ipv6: Fix processing of RAs in presence of VRF · 830218c1
      David Ahern 提交于
      rt6_add_route_info and rt6_add_dflt_router were updated to pull the FIB
      table from the device index, but the corresponding rt6_get_route_info
      and rt6_get_dflt_router functions were not leading to the failure to
      process RA's:
      
          ICMPv6: RA: ndisc_router_discovery failed to add default route
      
      Fix the 'get' functions by using the table id associated with the
      device when applicable.
      
      Also, now that default routes can be added to tables other than the
      default table, rt6_purge_dflt_routers needs to be updated as well to
      look at all tables. To handle that efficiently, add a flag to the table
      denoting if it is has a default route via RA.
      
      Fixes: ca254490 ("net: Add VRF support to IPv6 stack")
      Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      830218c1
    • L
      mm: remove per-zone hashtable of bitlock waitqueues · 9dcb8b68
      Linus Torvalds 提交于
      The per-zone waitqueues exist because of a scalability issue with the
      page waitqueues on some NUMA machines, but it turns out that they hurt
      normal loads, and now with the vmalloced stacks they also end up
      breaking gfs2 that uses a bit_wait on a stack object:
      
           wait_on_bit(&gh->gh_iflags, HIF_WAIT, TASK_UNINTERRUPTIBLE)
      
      where 'gh' can be a reference to the local variable 'mount_gh' on the
      stack of fill_super().
      
      The reason the per-zone hash table breaks for this case is that there is
      no "zone" for virtual allocations, and trying to look up the physical
      page to get at it will fail (with a BUG_ON()).
      
      It turns out that I actually complained to the mm people about the
      per-zone hash table for another reason just a month ago: the zone lookup
      also hurts the regular use of "unlock_page()" a lot, because the zone
      lookup ends up forcing several unnecessary cache misses and generates
      horrible code.
      
      As part of that earlier discussion, we had a much better solution for
      the NUMA scalability issue - by just making the page lock have a
      separate contention bit, the waitqueue doesn't even have to be looked at
      for the normal case.
      
      Peter Zijlstra already has a patch for that, but let's see if anybody
      even notices.  In the meantime, let's fix the actual gfs2 breakage by
      simplifying the bitlock waitqueues and removing the per-zone issue.
      Reported-by: NAndreas Gruenbacher <agruenba@redhat.com>
      Tested-by: NBob Peterson <rpeterso@redhat.com>
      Acked-by: NMel Gorman <mgorman@techsingularity.net>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9dcb8b68
  11. 27 10月, 2016 3 次提交
  12. 26 10月, 2016 2 次提交
    • J
      mac80211: fix some sphinx warnings · b4f7f4ad
      Jani Nikula 提交于
      Signed-off-by: NJani Nikula <jani.nikula@intel.com>
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      b4f7f4ad
    • D
      x86/io: add interface to reserve io memtype for a resource range. (v1.1) · 8ef42276
      Dave Airlie 提交于
      A recent change to the mm code in:
      87744ab3 mm: fix cache mode tracking in vm_insert_mixed()
      
      started enforcing checking the memory type against the registered list for
      amixed pfn insertion mappings. It happens that the drm drivers for a number
      of gpus relied on this being broken. Currently the driver only inserted
      VRAM mappings into the tracking table when they came from the kernel,
      and userspace mappings never landed in the table. This led to a regression
      where all the mapping end up as UC instead of WC now.
      
      I've considered a number of solutions but since this needs to be fixed
      in fixes and not next, and some of the solutions were going to introduce
      overhead that hadn't been there before I didn't consider them viable at
      this stage. These mainly concerned hooking into the TTM io reserve APIs,
      but these API have a bunch of fast paths I didn't want to unwind to add
      this to.
      
      The solution I've decided on is to add a new API like the arch_phys_wc
      APIs (these would have worked but wc_del didn't take a range), and
      use them from the drivers to add a WC compatible mapping to the table
      for all VRAM on those GPUs. This means we can then create userspace
      mapping that won't get degraded to UC.
      
      v1.1: use CONFIG_X86_PAT + add some comments in io.h
      
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: x86@kernel.org
      Cc: mcgrof@suse.com
      Cc: Dan Williams <dan.j.williams@intel.com>
      Acked-by: NIngo Molnar <mingo@kernel.org>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NDave Airlie <airlied@redhat.com>
      8ef42276
  13. 25 10月, 2016 1 次提交
    • L
      mm: unexport __get_user_pages() · 0d731759
      Lorenzo Stoakes 提交于
      This patch unexports the low-level __get_user_pages() function.
      
      Recent refactoring of the get_user_pages* functions allow flags to be
      passed through get_user_pages() which eliminates the need for access to
      this function from its one user, kvm.
      
      We can see that the two calls to get_user_pages() which replace
      __get_user_pages() in kvm_main.c are equivalent by examining their call
      stacks:
      
        get_user_page_nowait():
          get_user_pages(start, 1, flags, page, NULL)
          __get_user_pages_locked(current, current->mm, start, 1, page, NULL, NULL,
      			    false, flags | FOLL_TOUCH)
          __get_user_pages(current, current->mm, start, 1,
      		     flags | FOLL_TOUCH | FOLL_GET, page, NULL, NULL)
      
        check_user_page_hwpoison():
          get_user_pages(addr, 1, flags, NULL, NULL)
          __get_user_pages_locked(current, current->mm, addr, 1, NULL, NULL, NULL,
      			    false, flags | FOLL_TOUCH)
          __get_user_pages(current, current->mm, addr, 1, flags | FOLL_TOUCH, NULL,
      		     NULL, NULL)
      Signed-off-by: NLorenzo Stoakes <lstoakes@gmail.com>
      Acked-by: NPaolo Bonzini <pbonzini@redhat.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0d731759
  14. 24 10月, 2016 1 次提交
  15. 23 10月, 2016 1 次提交
  16. 21 10月, 2016 4 次提交
    • W
      ipv6: fix a potential deadlock in do_ipv6_setsockopt() · 8651be8f
      WANG Cong 提交于
      Baozeng reported this deadlock case:
      
             CPU0                    CPU1
             ----                    ----
        lock([  165.136033] sk_lock-AF_INET6);
                                     lock([  165.136033] rtnl_mutex);
                                     lock([  165.136033] sk_lock-AF_INET6);
        lock([  165.136033] rtnl_mutex);
      
      Similar to commit 87e9f031
      ("ipv4: fix a potential deadlock in mcast getsockopt() path")
      this is due to we still have a case, ipv6_sock_mc_close(),
      where we acquire sk_lock before rtnl_lock. Close this deadlock
      with the similar solution, that is always acquire rtnl lock first.
      
      Fixes: baf606d9 ("ipv4,ipv6: grab rtnl before locking the socket")
      Reported-by: NBaozeng Ding <sploving1@gmail.com>
      Tested-by: NBaozeng Ding <sploving1@gmail.com>
      Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Reviewed-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8651be8f
    • E
      udp: must lock the socket in udp_disconnect() · 286c72de
      Eric Dumazet 提交于
      Baozeng Ding reported KASAN traces showing uses after free in
      udp_lib_get_port() and other related UDP functions.
      
      A CONFIG_DEBUG_PAGEALLOC=y kernel would eventually crash.
      
      I could write a reproducer with two threads doing :
      
      static int sock_fd;
      static void *thr1(void *arg)
      {
      	for (;;) {
      		connect(sock_fd, (const struct sockaddr *)arg,
      			sizeof(struct sockaddr_in));
      	}
      }
      
      static void *thr2(void *arg)
      {
      	struct sockaddr_in unspec;
      
      	for (;;) {
      		memset(&unspec, 0, sizeof(unspec));
      	        connect(sock_fd, (const struct sockaddr *)&unspec,
      			sizeof(unspec));
              }
      }
      
      Problem is that udp_disconnect() could run without holding socket lock,
      and this was causing list corruptions.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: NBaozeng Ding <sploving1@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      286c72de
    • S
      net: add recursion limit to GRO · fcd91dd4
      Sabrina Dubroca 提交于
      Currently, GRO can do unlimited recursion through the gro_receive
      handlers.  This was fixed for tunneling protocols by limiting tunnel GRO
      to one level with encap_mark, but both VLAN and TEB still have this
      problem.  Thus, the kernel is vulnerable to a stack overflow, if we
      receive a packet composed entirely of VLAN headers.
      
      This patch adds a recursion counter to the GRO layer to prevent stack
      overflow.  When a gro_receive function hits the recursion limit, GRO is
      aborted for this skb and it is processed normally.  This recursion
      counter is put in the GRO CB, but could be turned into a percpu counter
      if we run out of space in the CB.
      
      Thanks to Vladimír Beneš <vbenes@redhat.com> for the initial bug report.
      
      Fixes: CVE-2016-7039
      Fixes: 9b174d88 ("net: Add Transparent Ethernet Bridging GRO support.")
      Fixes: 66e5133f ("vlan: Add GRO support for non hardware accelerated vlan")
      Signed-off-by: NSabrina Dubroca <sd@queasysnail.net>
      Reviewed-by: NJiri Benc <jbenc@redhat.com>
      Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Acked-by: NTom Herbert <tom@herbertland.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fcd91dd4
    • R
      clocksource: Add J-Core timer/clocksource driver · 9995f4f1
      Rich Felker 提交于
      At the hardware level, the J-Core PIT is integrated with the interrupt
      controller, but it is represented as its own device and has an
      independent programming interface. It provides a 12-bit countdown
      timer, which is not presently used, and a periodic timer. The interval
      length for the latter is programmable via a 32-bit throttle register
      whose units are determined by a bus-period register. The periodic
      timer is used to implement both periodic and oneshot clock event
      modes; in oneshot mode the interrupt handler simply disables the timer
      as soon as it fires.
      
      Despite its device tree node representing an interrupt for the PIT,
      the actual irq generated is programmable, not hard-wired. The driver
      is responsible for programming the PIT to generate the hardware irq
      number that the DT assigns to it.
      
      On SMP configurations, J-Core provides cpu-local instances of the PIT;
      no broadcast timer is needed. This driver supports the creation of the
      necessary per-cpu clock_event_device instances.
      
      A nanosecond-resolution clocksource is provided using the J-Core "RTC"
      registers, which give a 64-bit seconds count and 32-bit nanoseconds
      that wrap every second. The driver converts these to a full-range
      32-bit nanoseconds count.
      Signed-off-by: NRich Felker <dalias@libc.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: devicetree@vger.kernel.org
      Cc: linux-sh@vger.kernel.org
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Link: http://lkml.kernel.org/r/b591ff12cc5ebf63d1edc98da26046f95a233814.1476393790.git.dalias@libc.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      9995f4f1
  17. 20 10月, 2016 4 次提交
    • S
      cpufreq: fix overflow in cpufreq_table_find_index_dl() · c6fe46a7
      Sergey Senozhatsky 提交于
      'best' is always less or equals to 'pos', so `best - pos' returns
      a negative value which is then getting casted to `unsigned int'
      and passed to __cpufreq_driver_target()->acpi_cpufreq_target()
      for policy->freq_table selection. This results in
      
       BUG: unable to handle kernel paging request at ffff881019b469f8
       IP: [<ffffffffa00356c1>] acpi_cpufreq_target+0x4f/0x190 [acpi_cpufreq]
       PGD 267f067
       PUD 0
      
       Oops: 0000 [#1] PREEMPT SMP
       CPU: 6 PID: 70 Comm: kworker/6:1 Not tainted 4.9.0-rc1-next-20161017-dbg-dirty
       Workqueue: events dbs_work_handler
       task: ffff88041b808000 task.stack: ffff88041b810000
       RIP: 0010:[<ffffffffa00356c1>]  [<ffffffffa00356c1>] acpi_cpufreq_target+0x4f/0x190 [acpi_cpufreq]
       RSP: 0018:ffff88041b813c60  EFLAGS: 00010282
       RAX: ffff880419b46a00 RBX: ffff88041b848400 RCX: ffff880419b20f80
       RDX: 00000000001dff38 RSI: 00000000ffffffff RDI: ffff88041b848400
       RBP: ffff88041b813cb0 R08: 0000000000000006 R09: 0000000000000040
       R10: ffffffff8207f9e0 R11: ffffffff8173595b R12: 0000000000000000
       R13: ffff88041f1dff38 R14: 0000000000262900 R15: 0000000bfffffff4
       FS:  0000000000000000(0000) GS:ffff88041f000000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: ffff881019b469f8 CR3: 000000041a2d3000 CR4: 00000000001406e0
       Stack:
        ffff88041b813cb0 ffffffff813347f9 ffff88041b813ca0 ffffffff81334663
        ffff88041f1d4bc0 ffff88041b848400 0000000000000000 0000000000000000
        0000000000262900 0000000000000000 ffff88041b813d00 ffffffff813355dc
       Call Trace:
        [<ffffffff813347f9>] ? cpufreq_freq_transition_begin+0xf1/0xfc
        [<ffffffff81334663>] ? get_cpu_idle_time+0x97/0xa6
        [<ffffffff813355dc>] __cpufreq_driver_target+0x3b6/0x44e
        [<ffffffff81336ca3>] cs_dbs_timer+0x11a/0x135
        [<ffffffff81336fda>] dbs_work_handler+0x39/0x62
        [<ffffffff81057823>] process_one_work+0x280/0x4a5
        [<ffffffff81058719>] worker_thread+0x24f/0x397
        [<ffffffff810584ca>] ? rescuer_thread+0x30b/0x30b
        [<ffffffff81418380>] ? nl80211_get_key+0x29/0x36a
        [<ffffffff8105d2b7>] kthread+0xfc/0x104
        [<ffffffff8107ceea>] ? put_lock_stats.isra.9+0xe/0x20
        [<ffffffff8105d1bb>] ? kthread_create_on_node+0x3f/0x3f
        [<ffffffff814b2092>] ret_from_fork+0x22/0x30
       Code: 56 4d 6b ff 0c 41 55 41 54 53 48 83 ec 28 48 8b 15 ad 1e 00 00 44 8b 41
       08 48 8b 87 c8 00 00 00 49 89 d5 4e 03 2c c5 80 b2 78 81 <46> 8b 74 38 04 45
       3b 75 00 75 11 31 c0 83 39 00 0f 84 1c 01 00
       RIP  [<ffffffffa00356c1>] acpi_cpufreq_target+0x4f/0x190 [acpi_cpufreq]
        RSP <ffff88041b813c60>
       CR2: ffff881019b469f8
       ---[ end trace 16d9fc7a17897d37 ]---
      
      [ rjw: In some cases this bug may also cause incorrect frequencies to
        be selected by cpufreq governors. ]
      
      Fixes: 899bb664 (cpufreq: skip invalid entries when searching the frequency)
      Link: http://marc.info/?l=linux-kernel&m=147672030714331&w=2Reported-and-tested-by: NSedat Dilek <sedat.dilek@gmail.com>
      Reported-and-tested-by: NJörg Otte <jrg.otte@gmail.com>
      Signed-off-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Acked-by: NViresh Kumar <viresh.kumar@linaro.org>
      Cc: 4.8+ <stable@vger.kernel.org> # 4.8+
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      c6fe46a7
    • H
      sched/core, x86: Make struct thread_info arch specific again · c8061485
      Heiko Carstens 提交于
      The following commit:
      
        c65eacbe ("sched/core: Allow putting thread_info into task_struct")
      
      ... made 'struct thread_info' a generic struct with only a
      single ::flags member, if CONFIG_THREAD_INFO_IN_TASK_STRUCT=y is
      selected.
      
      This change however seems to be quite x86 centric, since at least the
      generic preemption code (asm-generic/preempt.h) assumes that struct
      thread_info also has a preempt_count member, which apparently was not
      true for x86.
      
      We could add a bit more #ifdefs to solve this problem too, but it seems
      to be much simpler to make struct thread_info arch specific
      again. This also makes the conversion to THREAD_INFO_IN_TASK_STRUCT a
      bit easier for architectures that have a couple of arch specific stuff
      in their thread_info definition.
      
      The arch specific stuff _could_ be moved to thread_struct. However
      keeping them in thread_info makes it easier: accessing thread_info
      members is simple, since it is at the beginning of the task_struct,
      while the thread_struct is at the end. At least on s390 the offsets
      needed to access members of the thread_struct (with task_struct as
      base) are too large for various asm instructions.  This is not a
      problem when keeping these members within thread_info.
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: keescook@chromium.org
      Cc: linux-arch@vger.kernel.org
      Link: http://lkml.kernel.org/r/1476901693-8492-2-git-send-email-mark.rutland@arm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      c8061485
    • A
      mm: Change vm_is_stack_for_task() to vm_is_stack_for_current() · d17af505
      Andy Lutomirski 提交于
      Asking for a non-current task's stack can't be done without races
      unless the task is frozen in kernel mode.  As far as I know,
      vm_is_stack_for_task() never had a safe non-current use case.
      
      The __unused annotation is because some KSTK_ESP implementations
      ignore their parameter, which IMO is further justification for this
      patch.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Jann Horn <jann@thejh.net>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Linux API <linux-api@vger.kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tycho Andersen <tycho.andersen@canonical.com>
      Link: http://lkml.kernel.org/r/4c3f68f426e6c061ca98b4fc7ef85ffbb0a25b0c.1475257877.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      d17af505
    • C
      iomap: add IOMAP_REPORT · d33fd776
      Christoph Hellwig 提交于
      This allows the file system to tell a FIEMAP from a read operation, and thus
      avoids the need to report flags that aren't actually used in the read path.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      
      d33fd776