1. 07 Aug, 2014 7 commits
    • J
      fanotify: fix double free of pending permission events · 5838d444
      Jan Kara committed
      Commit 85816794 ("fanotify: Fix use after free for permission
      events") introduced a double free issue for permission events which are
      pending in group's notification queue while group is being destroyed.
      These events are freed from fanotify_handle_event() but they are not
      removed from the group's notification queue and thus they get freed again
      from fsnotify_flush_notify().
      
      Fix the problem by removing permission events from notification queue
      before freeing them if we skip processing access response.  Also expand
      comments in fanotify_release() to explain group shutdown in detail.
      
      Fixes: 85816794
      Signed-off-by: Jan Kara <jack@suse.cz>
      Reported-by: Douglas Leeder <douglas.leeder@sophos.com>
      Tested-by: Douglas Leeder <douglas.leeder@sophos.com>
      Reported-by: Heinrich Schuchard <xypron.glpk@gmx.de>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      5838d444
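The ordering described in the fanotify fix (unlink an event from the queue before freeing it, so a later flush cannot free it a second time) can be sketched in userspace C. This is only an illustration under stated assumptions: `struct event`, `struct queue` and every function below are hypothetical stand-ins, not the fsnotify or fanotify API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures; not the fsnotify API. */
struct event {
    struct event *prev, *next;   /* links in the group's queue */
    int is_permission;
};

struct queue {
    struct event head;           /* sentinel node of a circular list */
    size_t frees;                /* counts frees, for demonstration only */
};

static void queue_init(struct queue *q)
{
    q->head.prev = q->head.next = &q->head;
    q->frees = 0;
}

static void queue_add(struct queue *q, struct event *e)
{
    e->prev = q->head.prev;
    e->next = &q->head;
    q->head.prev->next = e;
    q->head.prev = e;
}

/* Unlink first so a later flush cannot reach the event, then free it. */
static void queue_remove_and_free(struct queue *q, struct event *e)
{
    e->prev->next = e->next;
    e->next->prev = e->prev;
    free(e);
    q->frees++;
}

/* Frees whatever is still queued, as a flush at group shutdown would. */
static void queue_flush(struct queue *q)
{
    while (q->head.next != &q->head)
        queue_remove_and_free(q, q->head.next);
}

/* Queues two events, drops one early, flushes; returns the number of frees. */
static size_t demo_shutdown(void)
{
    struct queue q;
    struct event *a = calloc(1, sizeof(*a));
    struct event *b = calloc(1, sizeof(*b));

    assert(a && b);
    queue_init(&q);
    a->is_permission = 1;
    queue_add(&q, a);
    queue_add(&q, b);

    queue_remove_and_free(&q, a);   /* skipped permission event */
    queue_flush(&q);                /* only b is left to free */
    return q.frees;
}
```

Each event is freed exactly once because the early-release path and the flush path both go through the same unlink-then-free helper.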
    • J
      fsnotify: rename event handling functions · 8ba8fa91
      Jan Kara committed
      Rename fsnotify_add_notify_event() to fsnotify_add_event() since the
      "notify" part is redundant.  Rename fsnotify_remove_notify_event() and
      fsnotify_peek_notify_event() to fsnotify_remove_first_event() and
      fsnotify_peek_first_event() respectively since the "notify" part is
      redundant and they really look at the first event in the queue.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Jan Kara <jack@suse.cz>
      Cc: Eric Paris <eparis@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      8ba8fa91
    • F
      fs/fscache: make ctl_table static · 3e584064
      Fabian Frederick committed
      fscache_sysctls and fscache_sysctls_root are only used in main.c
      Signed-off-by: Fabian Frederick <fabf@skynet.be>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Joe Perches <joe@perches.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3e584064
    • F
      kernel/auditfilter.c: replace count*size kmalloc by kcalloc · bab5e2d6
      Fabian Frederick committed
      kcalloc() checks the count * size multiplication for overflow.
      Signed-off-by: Fabian Frederick <fabf@skynet.be>
      Cc: Eric Paris <eparis@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      bab5e2d6
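The motivation for replacing an open-coded count * size allocation can be sketched in userspace C. This is a minimal illustration, assuming a hypothetical helper named `checked_calloc()`; it is not the kernel's kcalloc() implementation, only the same idea of rejecting a product that would wrap around.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical userspace helper, not the kernel's kcalloc(): returns
 * zeroed memory for `count` elements of `size` bytes, or NULL when
 * count * size would not fit in size_t.
 */
static void *checked_calloc(size_t count, size_t size)
{
    void *p;

    if (size != 0 && count > SIZE_MAX / size)
        return NULL;            /* the product would wrap around */

    p = malloc(count * size);
    if (p)
        memset(p, 0, count * size);
    return p;
}
```

With a plain `malloc(count * size)`, an oversized `count` could silently wrap to a small allocation; the explicit division check turns that case into an ordinary allocation failure.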
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide · 85417aef
      Linus Torvalds committed
      Pull IDE cleanup from David Miller:
       "Just one minor cleanup"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide:
        ide: use module_platform_driver()
      85417aef
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next · 049711bf
      Linus Torvalds committed
      Pull sparc updates from David Miller:
      
       1) Add sparc RAM output to /proc/iomem, from Bob Picco.
      
       2) Allow seeks on /dev/mdesc, from Khalid Aziz.
      
       3) Cleanup sparc64 I/O accessors, from Sam Ravnborg.
      
       4) If update_mmu_cache{,_pmd}() is called with a non-valid mapping, do
          not insert it into the TLB miss hash tables otherwise we'll
          livelock.  Based upon work by Christopher Alexander Tobias Schulze.
      
       5) Fix BREAK detection in sunsab driver when no actual characters are
          pending, from Christopher Alexander Tobias Schulze.
      
       6) Because we have modules --> openfirmware --> vmalloc ordering of
          virtual memory, the lazy VMAP TLB flusher can cons up an invocation
          of flush_tlb_kernel_range() that covers the openfirmware address
          range.  Unfortunately this will flush out the firmware's locked TLB
          mapping which causes all kinds of trouble.  Just split up the flush
          request if this happens, but in the long term the lazy VMAP flusher
          should probably be made a little bit smarter.
      
          Based upon work by Christopher Alexander Tobias Schulze.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next:
        sparc64: Fix up merge thinko.
        sparc: Add "install" target
        arch/sparc/math-emu/math_32.c: drop stray break operator
        sparc64: ldc_connect() should not return EINVAL when handshake is in progress.
        sparc64: Guard against flushing openfirmware mappings.
        sunsab: Fix detection of BREAK on sunsab serial console
        bbc-i2c: Fix BBC I2C envctrl on SunBlade 2000
        sparc64: Do not insert non-valid PTEs into the TSB hash table.
        sparc64: avoid code duplication in io_64.h
        sparc64: reorder functions in io_64.h
        sparc64: drop unused SLOW_DOWN_IO definitions
        sparc64: remove macro indirection in io_64.h
        sparc64: update IO access functions in PeeCeeI
        sparcspkr: use sbus_*() primitives for IO
        sparc: Add support for seek and shorter read to /dev/mdesc
        sparc: use %s for unaligned panic
        drivers/sbus/char: Micro-optimization in display7seg.c
        display7seg: Introduce the use of the managed version of kzalloc
        sparc64 - add mem to iomem resource
      049711bf
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next · ae045e24
      Linus Torvalds committed
      Pull networking updates from David Miller:
       "Highlights:
      
         1) Steady transitioning of the BPF infrastructure to a generic spot so
            all kernel subsystems can make use of it, from Alexei Starovoitov.
      
         2) SFC driver supports busy polling, from Alexandre Rames.
      
         3) Take advantage of hash table in UDP multicast delivery, from David
            Held.
      
         4) Lighten locking, in particular by getting rid of the LRU lists, in
            inet frag handling.  From Florian Westphal.
      
         5) Add support for various RFC6458 control messages in SCTP, from
            Geir Ola Vaagland.
      
         6) Allow to filter bridge forwarding database dumps by device, from
            Jamal Hadi Salim.
      
         7) virtio-net also now supports busy polling, from Jason Wang.
      
         8) Some low level optimization tweaks in pktgen from Jesper Dangaard
            Brouer.
      
         9) Add support for ipv6 address generation modes, so that userland
            can have some input into the process.  From Jiri Pirko.
      
        10) Consolidate common TCP connection request code in ipv4 and ipv6,
            from Octavian Purdila.
      
        11) New ARP packet logger in netfilter, from Pablo Neira Ayuso.
      
        12) Generic resizable RCU hash table, with initial users in netlink and
            nftables.  From Thomas Graf.
      
        13) Maintain a name assignment type so that userspace can see where a
            network device name came from (enumerated by kernel, assigned
            explicitly by userspace, etc.) From Tom Gundersen.
      
        14) Automatic flow label generation on transmit in ipv6, from Tom
            Herbert.
      
        15) New packet timestamping facilities from Willem de Bruijn, meant to
            assist in measuring latencies going into/out-of the packet
            scheduler, latency from TCP data transmission to ACK, etc"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1536 commits)
        cxgb4 : Disable recursive mailbox commands when enabling vi
        net: reduce USB network driver config options.
        tg3: Modify tg3_tso_bug() to handle multiple TX rings
        amd-xgbe: Perform phy connect/disconnect at dev open/stop
        amd-xgbe: Use dma_set_mask_and_coherent to set DMA mask
        net: sun4i-emac: fix memory leak on bad packet
        sctp: fix possible seqlock seadlock in sctp_packet_transmit()
        Revert "net: phy: Set the driver when registering an MDIO bus device"
        cxgb4vf: Turn off SGE RX/TX Callback Timers and interrupts in PCI shutdown routine
        team: Simplify return path of team_newlink
        bridge: Update outdated comment on promiscuous mode
        net-timestamp: ACK timestamp for bytestreams
        net-timestamp: TCP timestamping
        net-timestamp: SCHED timestamp on entering packet scheduler
        net-timestamp: add key to disambiguate concurrent datagrams
        net-timestamp: move timestamp flags out of sk_flags
        net-timestamp: extend SCM_TIMESTAMPING ancillary data struct
        cxgb4i : Move stray CPL definitions to cxgb4 driver
        tcp: reduce spurious retransmits due to transient SACK reneging
        qlcnic: Initialize dcbnl_ops before register_netdev
        ...
      ae045e24
  2. 06 Aug, 2014 33 commits
    • L
      Merge tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random · f4f142ed
      Linus Torvalds committed
      Pull randomness updates from Ted Ts'o:
       "Cleanups and bug fixes to /dev/random, add a new getrandom(2) system
        call, which is a superset of OpenBSD's getentropy(2) call, for use
        with userspace crypto libraries such as LibreSSL.
      
        Also add the ability to have a kernel thread to pull entropy from
        hardware rng devices into /dev/random"
      
      * tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
        hwrng: Pass entropy to add_hwgenerator_randomness() in bits, not bytes
        random: limit the contribution of the hw rng to at most half
        random: introduce getrandom(2) system call
        hw_random: fix sparse warning (NULL vs 0 for pointer)
        random: use registers from interrupted code for CPU's w/o a cycle counter
        hwrng: add per-device entropy derating
        hwrng: create filler thread
        random: add_hwgenerator_randomness() for feeding entropy from devices
        random: use an improved fast_mix() function
        random: clean up interrupt entropy accounting for archs w/o cycle counters
        random: only update the last_pulled time if we actually transferred entropy
        random: remove unneeded hash of a portion of the entropy pool
        random: always update the entropy pool under the spinlock
      f4f142ed
    • L
      Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security · bb2cbf5e
      Linus Torvalds committed
      Pull security subsystem updates from James Morris:
       "In this release:
      
         - PKCS#7 parser for the key management subsystem from David Howells
         - appoint Kees Cook as seccomp maintainer
         - bugfixes and general maintenance across the subsystem"
      
      * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (94 commits)
        X.509: Need to export x509_request_asymmetric_key()
        netlabel: shorter names for the NetLabel catmap funcs/structs
        netlabel: fix the catmap walking functions
        netlabel: fix the horribly broken catmap functions
        netlabel: fix a problem when setting bits below the previously lowest bit
        PKCS#7: X.509 certificate issuer and subject are mandatory fields in the ASN.1
        tpm: simplify code by using %*phN specifier
        tpm: Provide a generic means to override the chip returned timeouts
        tpm: missing tpm_chip_put in tpm_get_random()
        tpm: Properly clean sysfs entries in error path
        tpm: Add missing tpm_do_selftest to ST33 I2C driver
        PKCS#7: Use x509_request_asymmetric_key()
        Revert "selinux: fix the default socket labeling in sock_graft()"
        X.509: x509_request_asymmetric_keys() doesn't need string length arguments
        PKCS#7: fix sparse non static symbol warning
        KEYS: revert encrypted key change
        ima: add support for measuring and appraising firmware
        firmware_class: perform new LSM checks
        security: introduce kernel_fw_from_file hook
        PKCS#7: Missing inclusion of linux/err.h
        ...
      bb2cbf5e
    • C
      ide: use module_platform_driver() · a53dae49
      Christoph Jaeger committed
      Eliminate boilerplate code by using module_platform_driver().
      Signed-off-by: Christoph Jaeger <christophjaeger@linux.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a53dae49
    • D
      sparc64: Fix up merge thinko. · 5b6ff9df
      David S. Miller committed
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5b6ff9df
    • D
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · e9011d08
      David S. Miller committed
      Conflicts:
      	arch/sparc/mm/init_64.c
      
      Conflict was simple non-overlapping additions.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e9011d08
    • D
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · d247b6ab
      David S. Miller committed
      Conflicts:
      	drivers/net/Makefile
      	net/ipv6/sysctl_net_ipv6.c
      
      Two ipv6_table_template[] additions overlap, so the index
      of the ipv6_table[x] assignments needed to be adjusted.
      
      In the drivers/net/Makefile case, we've gotten rid of the
      garbage whereby we had to list every single USB networking
      driver in the top-level Makefile, there is just one
      "USB_NETWORKING" that guards everything.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d247b6ab
    • L
      Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e7fda6c4
      Linus Torvalds committed
      Pull timer and time updates from Thomas Gleixner:
       "A rather large update of timers, timekeeping & co
      
         - Core timekeeping code is year-2038 safe now for 32bit machines.
           Now we just need to fix all in kernel users and the gazillion of
           user space interfaces which rely on timespec/timeval :)
      
         - Better cache layout for the timekeeping internal data structures.
      
         - Proper nanosecond based interfaces for in kernel users.
      
         - Tree wide cleanup of code which wants nanoseconds but does hoops
           and loops to convert back and forth from timespecs.  Some of it
           definitely belongs into the ugly code museum.
      
         - Consolidation of the timekeeping interface zoo.
      
         - A fast NMI safe accessor to clock monotonic for tracing.  This is a
           long standing request to support correlated user/kernel space
           traces.  With proper NTP frequency correction it's also suitable
           for correlation of traces across separate machines.
      
         - Checkpoint/restart support for timerfd.
      
         - A few NOHZ[_FULL] improvements in the [hr]timer code.
      
         - Code move from kernel to kernel/time of all time* related code.
      
         - New clocksource/event drivers from the ARM universe.  I'm really
           impressed that despite an architected timer in the newer chips SoC
           manufacturers insist on inventing new and differently broken SoC
           specific timers.
      
      [ Ed. "Impressed"? I don't think that word means what you think it means ]
      
         - Another round of code move from arch to drivers.  Looks like most
           of the legacy mess in ARM regarding timers is sorted out except for
           a few obnoxious strongholds.
      
         - The usual updates and fixlets all over the place"
      
      * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (114 commits)
        timekeeping: Fixup typo in update_vsyscall_old definition
        clocksource: document some basic timekeeping concepts
        timekeeping: Use cached ntp_tick_length when accumulating error
        timekeeping: Rework frequency adjustments to work better w/ nohz
        timekeeping: Minor fixup for timespec64->timespec assignment
        ftrace: Provide trace clocks monotonic
        timekeeping: Provide fast and NMI safe access to CLOCK_MONOTONIC
        seqcount: Add raw_write_seqcount_latch()
        seqcount: Provide raw_read_seqcount()
        timekeeping: Use tk_read_base as argument for timekeeping_get_ns()
        timekeeping: Create struct tk_read_base and use it in struct timekeeper
        timekeeping: Restructure the timekeeper some more
        clocksource: Get rid of cycle_last
        clocksource: Move cycle_last validation to core code
        clocksource: Make delta calculation a function
        wireless: ath9k: Get rid of timespec conversions
        drm: vmwgfx: Use nsec based interfaces
        drm: i915: Use nsec based interfaces
        timekeeping: Provide ktime_get_raw()
        hangcheck-timer: Use ktime_get_ns()
        ...
      e7fda6c4
    • L
      Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 08d69a25
      Linus Torvalds committed
      Pull irq updates from Thomas Gleixner:
       "Nothing spectacular from the irq department this time:
         - overhaul of the crossbar chip driver
         - overhaul of the spear shirq chip driver
         - support for the atmel-aic chip
         - code move from arch to drivers
         - the usual tiny fixlets
         - two reverts worth mentioning, which undo the overly simple attempt at
           supporting wakeup interrupts on shared interrupt lines"
      
      * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits)
        Revert "irq: Warn when shared interrupts do not match on NO_SUSPEND"
        Revert "PM / sleep / irq: Do not suspend wakeup interrupts"
        irq: Warn when shared interrupts do not match on NO_SUSPEND
        irqchip: atmel-aic: Define irq fixups for atmel SoCs
        irqchip: atmel-aic: Implement RTC irq fixup
        irqchip: atmel-aic: Add irq fixup infrastructure
        irqchip: atmel-aic: Add atmel AIC/AIC5 drivers
        irqchip: atmel-aic: Move binding doc to interrupt-controller directory
        genirq: generic chip: Export irq_map_generic_chip function
        PM / sleep / irq: Do not suspend wakeup interrupts
        irqchip: or1k-pic: Migrate from arch/openrisc/
        irqchip: crossbar: Allow for quirky hardware with direct hardwiring of GIC
        documentation: dt: omap: crossbar: Add description for interrupt consumer
        irqchip: crossbar: Introduce centralized check for crossbar write
        irqchip: crossbar: Introduce ti, max-crossbar-sources to identify valid crossbar mapping
        irqchip: crossbar: Add kerneldoc for crossbar_domain_unmap callback
        irqchip: crossbar: Set cb pointer to null in case of error
        irqchip: crossbar: Change the goto naming
        irqchip: crossbar: Return proper error value
        irqchip: crossbar: Fix kerneldoc warning
        ...
      08d69a25
    • T
      x86: MCE: Add raw_lock conversion again · ed5c41d3
      Thomas Gleixner committed
      Commit ea431643 ("x86/mce: Fix CMCI preemption bugs") breaks RT by
      the completely unrelated conversion of the cmci_discover_lock to a
      regular (non raw) spinlock.  This lock was annotated in commit
      59d958d2 ("locking, x86: mce: Annotate cmci_discover_lock as raw")
      with a proper explanation why.
      
      The argument for converting the lock back to a regular spinlock was:
      
       - it does percpu ops without disabling preemption. Preemption is not
         disabled due to the mistaken use of a raw spinlock.
      
      Which is complete nonsense.  The raw_spinlock is disabling preemption in
      the same way as a regular spinlock.  In mainline spinlock maps to
      raw_spinlock, in RT spinlock becomes a "sleeping" lock.
      
      raw_spinlock has on RT exactly the same semantics as in mainline.  And
      because this lock is taken in non preemptible context it must be raw on
      RT.
      
      Undo the locking brainfart.
      Reported-by: Clark Williams <williams@redhat.com>
      Reported-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ed5c41d3
    • A
      cxgb4 : Disable recursive mailbox commands when enabling vi · 30f00847
      Anish Bhatt committed
      Enabling a Virtual Interface can result in an interrupt during the processing
      of the VI Enable command and, in some paths, result in an attempt to issue
      another command in the interrupt context, eventually crashing the system. Thus,
      we disable interrupts during the course of the VI Enable command and ensure
      that the enable path doesn't sleep.
      Signed-off-by: Anish Bhatt <anish@chelsio.com>
      Signed-off-by: Casey Leedom <leedom@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      30f00847
    • F
      net: reduce USB network driver config options. · 1bb5a356
      Francois Romieu committed
      USB network drivers are already handled in drivers/net/usb/Kconfig.
      Let's save the maintenance burden of dependencies in drivers/net/Makefile.
      
      The newly introduced USB_NET_DRIVERS umbrella config option defaults
      to 'y' so as to minimize changes in behavior.
      Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1bb5a356
    • P
      tg3: Modify tg3_tso_bug() to handle multiple TX rings · 4d8fdc95
      Prashant Sreedharan committed
      tg3_tso_bug() was originally designed to handle only HW TX ring 0. Commit
      d3f6f3a1 ("tg3: Prevent page allocation failure during TSO workaround")
      changed the driver logic to use tg3_tso_bug() for all HW TX rings that are
      enabled. This patch fixes the regression by modifying tg3_tso_bug() to
      handle multiple HW TX rings.
      Signed-off-by: Prashant Sreedharan <prashant@broadcom.com>
      Signed-off-by: Michael Chan <mchan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4d8fdc95
    • D
      Merge branch 'amd-xgbe' · 0ca58d62
      David S. Miller committed
      Tom Lendacky says:
      
      ====================
      amd-xgbe: AMD XGBE driver update 2014-08-05
      
      The following series of patches includes fixes/updates to the driver.
      
      - Use dma_set_mask_and_coherent to set the DMA mask
      - Move the phy connect/disconnect logic to allow for module unloading
      
      Changes in V2:
      - Check the return value of the dma_set_mask_and_coherent call
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0ca58d62
    • L
      amd-xgbe: Perform phy connect/disconnect at dev open/stop · 88131a81
      Lendacky, Thomas committed
      A change added to the mdiobus/phy api added a module_get/module_put
      during phy connect/disconnect processing. Currently, the driver
      performs a phy connect during module probe and a phy disconnect during
      module remove. With the addition of the module_get during phy connect
      the amd-xgbe module use count is incremented and can no longer be
      unloaded.
      
      Move the phy connect/disconnect from the driver probe/remove functions
      to the net_device_ops ndo_open/ndo_stop functions.  This allows the
      module use count to be decremented when the device(s) are brought down
      and allows the module to be unloaded.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      88131a81
    • L
      amd-xgbe: Use dma_set_mask_and_coherent to set DMA mask · f3d0e78d
      Lendacky, Thomas committed
      Use the dma_set_mask_and_coherent function to set the DMA mask rather
      than setting the DMA mask fields directly.  This was originally done
      to work around a bug in the arm64 DMA support when RAM started above
      the 4GB boundary which has since been fixed.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f3d0e78d
    • M
      net: sun4i-emac: fix memory leak on bad packet · 2670cc69
      Marc Zyngier committed
      Upon reception of a new frame, the emac driver checks for a number
      of error conditions, and flags the packet as "bad" if any of these
      are present. It then allocates a skb unconditionally, but only uses
      it if the packet is "good". On the error path, the skb is just forgotten,
      and the system leaks memory.
      
      The piece of junk I have on my desk seems to encounter such errors
      frequently enough so that the box goes OOM after a couple of days,
      which makes me grumpy.
      
      Fix this by moving the allocation on the "good_packet" path (and
      convert it to netdev_alloc_skb while we're at it).
      
      Tested on a random Allwinner A20 board.
      
      Cc: Stefan Roese <sr@denx.de>
      Cc: Maxime Ripard <maxime.ripard@free-electrons.com>
      Cc: <stable@vger.kernel.org> # 3.11+
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2670cc69
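The shape of the sun4i-emac fix (allocate the buffer only on the good-packet path, so the error path has nothing to leak) can be sketched in userspace C. The names `rx_frame()`, `rx_release()` and `live_buffers` are hypothetical and chosen for illustration; this is not the driver code and it uses malloc() rather than netdev_alloc_skb().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

static size_t live_buffers;     /* outstanding allocations, for the demo */

/*
 * Hypothetical receive path, not the sun4i-emac code: a buffer is
 * allocated only once the frame is known to be good, so the error path
 * has nothing to release. Returns NULL for bad frames or on allocation
 * failure; the caller owns and frees a non-NULL result.
 */
static unsigned char *rx_frame(bool good_packet, size_t len)
{
    unsigned char *buf;

    if (!good_packet)
        return NULL;            /* bad frame: nothing was allocated */

    buf = malloc(len);
    if (buf)
        live_buffers++;
    return buf;
}

/* Releases a buffer handed out by rx_frame(); NULL is ignored. */
static void rx_release(unsigned char *buf)
{
    if (buf) {
        free(buf);
        live_buffers--;
    }
}
```

Because the allocation sits after the validity check, a stream of bad frames leaves `live_buffers` at zero instead of growing without bound.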
    • E
      sctp: fix possible seqlock seadlock in sctp_packet_transmit() · 757efd32
      Eric Dumazet committed
      Dave reported following splat, caused by improper use of
      IP_INC_STATS_BH() in process context.
      
      BUG: using __this_cpu_add() in preemptible [00000000] code: trinity-c117/14551
      caller is __this_cpu_preempt_check+0x13/0x20
      CPU: 3 PID: 14551 Comm: trinity-c117 Not tainted 3.16.0+ #33
       ffffffff9ec898f0 0000000047ea7e23 ffff88022d32f7f0 ffffffff9e7ee207
       0000000000000003 ffff88022d32f818 ffffffff9e397eaa ffff88023ee70b40
       ffff88022d32f970 ffff8801c026d580 ffff88022d32f828 ffffffff9e397ee3
      Call Trace:
       [<ffffffff9e7ee207>] dump_stack+0x4e/0x7a
       [<ffffffff9e397eaa>] check_preemption_disabled+0xfa/0x100
       [<ffffffff9e397ee3>] __this_cpu_preempt_check+0x13/0x20
       [<ffffffffc0839872>] sctp_packet_transmit+0x692/0x710 [sctp]
       [<ffffffffc082a7f2>] sctp_outq_flush+0x2a2/0xc30 [sctp]
       [<ffffffff9e0d985c>] ? mark_held_locks+0x7c/0xb0
       [<ffffffff9e7f8c6d>] ? _raw_spin_unlock_irqrestore+0x5d/0x80
       [<ffffffffc082b99a>] sctp_outq_uncork+0x1a/0x20 [sctp]
       [<ffffffffc081e112>] sctp_cmd_interpreter.isra.23+0x1142/0x13f0 [sctp]
       [<ffffffffc081c86b>] sctp_do_sm+0xdb/0x330 [sctp]
       [<ffffffff9e0b8f1b>] ? preempt_count_sub+0xab/0x100
       [<ffffffffc083b350>] ? sctp_cname+0x70/0x70 [sctp]
       [<ffffffffc08389ca>] sctp_primitive_ASSOCIATE+0x3a/0x50 [sctp]
       [<ffffffffc083358f>] sctp_sendmsg+0x88f/0xe30 [sctp]
       [<ffffffff9e0d673a>] ? lock_release_holdtime.part.28+0x9a/0x160
       [<ffffffff9e0d62ce>] ? put_lock_stats.isra.27+0xe/0x30
       [<ffffffff9e73b624>] inet_sendmsg+0x104/0x220
       [<ffffffff9e73b525>] ? inet_sendmsg+0x5/0x220
       [<ffffffff9e68ac4e>] sock_sendmsg+0x9e/0xe0
       [<ffffffff9e1c0c09>] ? might_fault+0xb9/0xc0
       [<ffffffff9e1c0bae>] ? might_fault+0x5e/0xc0
       [<ffffffff9e68b234>] SYSC_sendto+0x124/0x1c0
       [<ffffffff9e0136b0>] ? syscall_trace_enter+0x250/0x330
       [<ffffffff9e68c3ce>] SyS_sendto+0xe/0x10
       [<ffffffff9e7f9be4>] tracesys+0xdd/0xe2
      
      This is a followup of commits f1d8cba6 ("inet: fix possible
      seqlock deadlocks") and 7f88c6b2 ("ipv6: fix possible seqlock
      deadlock in ip6_finish_output2")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Reported-by: Dave Jones <davej@redhat.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      757efd32
    • F
      Revert "net: phy: Set the driver when registering an MDIO bus device" · ce7991e8
      Fabio Estevam committed
      Commit a71e3c37 ("net: phy: Set the driver when registering an MDIO bus
      device") caused the following regression on the fec driver:
      
      root@imx6qsabresd:~# echo mem > /sys/power/state
      PM: Syncing filesystems ... done.
      Freezing user space processes ... (elapsed 0.003 seconds) done.
      Freezing remaining freezable tasks ... (elapsed 0.002 seconds) done.
      Unable to handle kernel NULL pointer dereference at virtual address 0000002c
      pgd = bcd14000
      [0000002c] *pgd=4d9e0831, *pte=00000000, *ppte=00000000
      Internal error: Oops: 17 [#1] SMP ARM
      Modules linked in:
      CPU: 0 PID: 617 Comm: sh Not tainted 3.16.0 #17
      task: bc0c4e00 ti: bceb6000 task.ti: bceb6000
      PC is at fec_suspend+0x10/0x70
      LR is at dpm_run_callback.isra.7+0x34/0x6c
      pc : [<803f8a98>]    lr : [<80361f44>]    psr: 600f0013
      sp : bceb7d70  ip : bceb7d88  fp : bceb7d84
      r10: 8091523c  r9 : 00000000  r8 : bd88f478
      r7 : 803f8a88  r6 : 81165988  r5 : 00000000  r4 : 00000000
      r3 : 00000000  r2 : 00000000  r1 : bd88f478  r0 : bd88f478
      Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
      Control: 10c5387d  Table: 4cd1404a  DAC: 00000015
      Process sh (pid: 617, stack limit = 0xbceb6240)
      Stack: (0xbceb7d70 to 0xbceb8000)
      ....
      
      The problem with the original commit is explained by Russell King:
      
      "It has the effect (as can be seen from the oops) of attaching the MDIO bus
      device (itself is a bus-less device) to the platform driver, which means
      that if the platform driver supports power management, it will be called
      to power manage the MDIO bus device.
      
      Moreover, drivers do not expect to be called for power management
      operations for devices which they haven't probed, and certainly not for
      devices which aren't part of the same bus that the driver is registered
      against."
      
      This reverts commit a71e3c37.
      
      Cc: <stable@vger.kernel.org> #3.16
      Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ce7991e8
    • H
      cxgb4vf: Turn off SGE RX/TX Callback Timers and interrupts in PCI shutdown routine · c2a19856
      Hariprasad Shenai committed
      Turn off the SGE RX/TX Callback Timers and interrupts in the cxgb4vf PCI
      Shutdown routine in order to prevent crashes during reboot/poweroff when
      traffic is running.
      Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c2a19856
    • D
      Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge · 6ff4e36f
      David S. Miller committed
      Antonio Quartulli says:
      
      ====================
      pull request: batman-adv 2014-08-05
      
      this is a pull request intended for net-next/linux-3.17 (yeah..it's really
      late).
      
      Patches 1, 2 and 4 are really minor changes:
      - kmalloc is replaced with kmalloc_array when possible (as suggested by
        checkpatch);
      - net_ratelimited() is now used properly and the "suppressed" message is not
        printed anymore if not needed;
      - the internal version number has been increased to reflect our current version.
      
      Patch 3 instead is introducing a change in the metric computation function
      by changing the penalty applied at each mesh hop from 15/255 (~6%) to
      30/255 (~11%). This change is introduced by Simon Wunderlich after having
      observed a performance improvement in several networks when using the new value.
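
      A minimal sketch of what the penalty change means for a route metric,
      assuming the per-hop rule is tq * (255 - penalty) / 255 with integer
      division. The helper names are hypothetical and this is not the
      batman-adv code itself; only the 15/255 and 30/255 values come from the
      text above.

```c
#include <assert.h>
#include <stdint.h>

/* Maximum transmit-quality (TQ) value; the 255 denominator is from the text above. */
#define TQ_MAX 255u

/* Scale a TQ value across one mesh hop: tq * (255 - penalty) / 255.
 * Hypothetical helper for illustration only. */
uint8_t apply_hop_penalty(uint8_t tq, unsigned int penalty)
{
    assert(penalty <= TQ_MAX);
    return (uint8_t)((tq * (TQ_MAX - penalty)) / TQ_MAX);
}

/* TQ remaining after `hops` consecutive hops, starting from a perfect link. */
uint8_t tq_after_hops(unsigned int hops, unsigned int penalty)
{
    uint8_t tq = TQ_MAX;

    while (hops--)
        tq = apply_hop_penalty(tq, penalty);
    return tq;
}
```

      Under this assumption a perfect link drops to 240 after one hop with
      the old penalty and to 225 with the new one, so longer paths are
      discounted roughly twice as fast.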
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6ff4e36f
    • T
      team: Simplify return path of team_newlink · ff204cce
      Toshiaki Makita committed
      The variable "err" is not necessary.
      Return register_netdevice() directly.
      Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Acked-by: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ff204cce
    • T
      bridge: Update outdated comment on promiscuous mode · fdb0a662
      Toshiaki Makita committed
      Now that bridge ports can be non-promiscuous, vlan_vid_add() is no longer an
      unnecessary operation.
      Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fdb0a662
    • L
      Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media · f4d33337
      Linus Torvalds committed
      Pull media updates from Mauro Carvalho Chehab:
       - removal of sn9c102.  This device driver was replaced a long time ago
         by gspca
       - solo6x10 and go7007 webcam drivers moved from staging into
         mainstream.  They were waiting for an API to allow setting the image
         detection matrix
       - SDR drivers moved from staging into mainstream: sdr-msi3101 (renamed
         as msi2500) and rtl2832
       - added SDR driver for airspy
       - added demux driver: si2165
       - rework of several parts of the RC subsystem, merging the RC-5 SZ variant
         into the standard RC5 decoder
       - added decoder for the XMP IR protocol
       - tuner driver moved from staging into mainstream: msi3101 (renamed as
         msi001)
       - added documentation for some additional SDR pixfmt
       - some device tree bindings documented
       - added support for exynos3250 at s5p-jpeg
       - remove the obsolete, unmaintained and broken mx1_camera driver
       - added support for remote controllers at au0828 driver
       - added a RC driver: sunxi-cir
       - several driver fixes, enhancements and cleanups.
      
      * 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (455 commits)
        [media] cx23885: fix UNSET/TUNER_ABSENT confusion
        [media] coda: fix build error by making reset control optional
        [media] radio-miropcm20: fix sparse NULL pointer warning
        [media] MAINTAINERS: Update go7007 pattern
        [media] MAINTAINERS: Update solo6x10 patterns
        [media] media: atmel-isi: add primary DT support
        [media] media: atmel-isi: convert the pdata from pointer to structure
        [media] media: atmel-isi: add v4l2 async probe support
        [media] rcar_vin: add devicetree support
        [media] media: pxa_camera device-tree support
        [media] media: mt9m111: add device-tree suppport
        [media] soc_camera: add support for dt binding soc_camera drivers
        [media] media: soc_camera: pxa_camera documentation device-tree support
        [media] media: mt9m111: add device-tree documentation
        [media] s5p-mfc: remove unnecessary calling to function video_devdata()
        [media] s5p-jpeg: add chroma subsampling adjustment for Exynos3250
        [media] s5p-jpeg: Prevent erroneous downscaling for Exynos3250 SoC
        [media] s5p-jpeg: Assure proper crop rectangle initialization
        [media] s5p-jpeg: fix g_selection op
        [media] s5p-jpeg: Adjust jpeg_bound_align_image to Exynos3250 needs
        ...
      f4d33337
    • D
      Merge branch 'net-timestamp-next' · 618896e6
      David S. Miller committed
      Willem de Bruijn says:
      
      ====================
      net-timestamp: new tx tstamps and tcp
      
      Extend socket tx timestamping:
      - allow multiple types of software timestamps aside from send (1)
      - add software timestamp on enter packet scheduling (4)
      - add software timestamp for TCP (5)
      - add software timestamp for TCP on ACK (6)
      
      The sk_flags option space is nearly exhausted. Also move the
      many timestamp options to a new sk->sk_tsflags (2).
      
      To disambiguate data when tstamps may arrive out of order,
      optionally return a sequential ID assigned at send (3).
      
      Extend Linux tx timestamping to monitoring of latency
      incurred within the kernel stack and to protocols embedded in TCP.
      Complex kernel setups may have multiple layers of queueing, including
      multiple instances of packet scheduling, and many classes per layer.
      Many applications embed discrete payloads into TCP bytestreams for
      reliability, flow control, etcetera. Detecting application tail
      latency in such scenarios relies on identifying the exact queue
      responsible if on the host, or the network latency if otherwise.
      
      Changelog:
      v4->v5
        - define SCM_TSTAMP_SND == 0, for legacy behavior
        - add TCP tstamps without changing the generated byte stream
          - modify GSO and ACK to find offset: slightly more complex
            than previous invariant that it is the last byte
        - consistent naming of packet scheduling
          - rename SCM_TSTAMP_ENQ to SCM_TSTAMP_SCHED
        - add unique key in ee_data
        - add id field in ee_info to disambiguate tstamps
          - optional, only on new flag SOF_TIMESTAMPING_OPT_ID
          - for bytestream, in bytes
      
      v3->v4
        - (v3 review comment) removed skb->mark packet identification (*A)
        - (v3 review comment) fixed indentation
        - tcp: fixed poll() to return POLLERR on non-zero queue
        - rebased to work without syststamp
        - comments: removed all traces of MSG_TSTAMP_.. (*B)
      
      v2->v3
        - extend the SO_TIMESTAMPING API, instead of defining a new one.
        - add protocol independent support to correlate tstamps with data,
          based on returning skb->mark.
        - removed no-payload optimization and documentation (for now):
      
          I have a follow-on patch that reintroduces MSG_TSTAMP along with a
          new socket option SOF_TIMESTAMPING_OPT_ONFLAG. This is equivalent
          to sequence setsockopt(<enable>); send(..); setsockopt(<disable>),
          but avoids the need to define a MSG_TSTAMP_<TYPE> for each type.
      
          I will leave these three patches as follow-on, as this patchset is
          large enough as is.
      
      v1->v2
        - expand timestamping (existing and new) to SOCK_RAW and ping sockets
        - rename sock_errqueue_timestamping to scm_timestamping
        - change timestamp data format: do not add fields to scm_timestamping.
            Doing so could break legacy applications. Instead, communicate
            through an existing, but unused, field in the error message.
        - rename SOF_.._OPT_TX_NO_PAYLOAD to shorter SOF_.._OPT_TSONLY
        - move msg_tstamp test app out of patchset and to github
            git://github.com/wdebruij/kerneltools.git
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      618896e6
    • W
      net-timestamp: ACK timestamp for bytestreams · e1c8a607
      Willem de Bruijn committed
      Add SOF_TIMESTAMPING_TX_ACK, a request for a tstamp when the last byte
      in the send() call is acknowledged. It implements the feature for TCP.
      
      The timestamp is generated when the TCP socket cumulative ACK is moved
      beyond the tracked seqno for the first time. The feature ignores SACK
      and FACK, because those acknowledge the specific byte, but not
      necessarily the entire contents of the buffer up to that byte.
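
      The "moved beyond the tracked seqno for the first time" condition can
      be sketched as a wraparound-safe sequence comparison. This is an
      illustration under stated assumptions, not the kernel implementation:
      the function names are hypothetical, and snd_una is taken to be the
      first byte not yet cumulatively acknowledged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if sequence number a comes strictly before b, modulo 2^32. */
bool seq_before(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

/* The tracked byte is cumulatively acknowledged once it lies before snd_una. */
bool cumulative_ack_covers(uint32_t snd_una, uint32_t tracked_seq)
{
    return seq_before(tracked_seq, snd_una);
}

/* Report a timestamp only on the ACK that first moves snd_una past the
 * tracked byte, so each send() yields at most one ACK timestamp. */
bool ack_tstamp_due(uint32_t prior_snd_una, uint32_t snd_una,
                    uint32_t tracked_seq)
{
    /* A cumulative ACK never moves backwards. */
    assert(!seq_before(snd_una, prior_snd_una));
    return !cumulative_ack_covers(prior_snd_una, tracked_seq) &&
           cumulative_ack_covers(snd_una, tracked_seq);
}
```

      SACK blocks never enter this check: only the cumulative ACK point is
      compared, which matches the reasoning above that a selectively
      acknowledged byte says nothing about the bytes before it.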
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e1c8a607
    • W
      net-timestamp: TCP timestamping · 4ed2d765
      Willem de Bruijn committed
      TCP timestamping extends SO_TIMESTAMPING to bytestreams.
      
      Bytestreams do not have a 1:1 relationship between send() buffers and
      network packets. The feature interprets a send call on a bytestream as
      a request for a timestamp for the last byte in that send() buffer.
      
      The choice corresponds to a request for a timestamp when all bytes in
      the buffer have been sent. That assumption depends on in-order kernel
      transmission. This is the common case. That said, it is possible to
      construct a traffic shaping tree that would result in reordering.
      The guarantee is strong, then, but not ironclad.
      
      This implementation supports send and sendpages (splice). GSO replaces
      one large packet with multiple smaller packets. This patch also copies
      the option into the correct smaller packet.
      
      This patch does not yet support timestamping on data in an initial TCP
      Fast Open SYN, because that takes a very different data path.
      
      If ID generation in ee_data is enabled, bytestream timestamps return a
      byte offset, instead of the packet counter for datagrams.
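
      A sketch of how an application might predict the byte-offset ID it
      should see for each send() on a bytestream. It assumes the offset is
      zero-based, counts from the moment the ID option is enabled with no
      data in flight, and refers to the last byte of the buffer, as described
      above. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Application-side bookkeeping; not a kernel structure. */
struct stream_key_tracker {
    uint32_t bytes_sent; /* bytes passed to send() since the ID option was enabled */
};

/* Record a send() of len bytes and return the ID expected in ee_data for
 * it: the offset of the last byte of that buffer. */
uint32_t expected_stream_key(struct stream_key_tracker *t, size_t len)
{
    assert(t != NULL && len > 0);
    t->bytes_sent += (uint32_t)len;
    return t->bytes_sent - 1;
}
```

      Comparing the IDs read from the error queue against these expected
      values is one way to notice a timestamp request that was silently
      replaced by a later one.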
      
      The implementation supports a single timestamp per packet. It silently
      replaces requests for previous timestamps. To avoid missing tstamps,
      flush the tcp queue by disabling Nagle, cork and autocork. Missing
      tstamps can be detected by offset when the ee_data ID is enabled.
      
      Implementation details:
      
      - On GSO, the timestamping code can be included in the main loop. I
      moved it into its own loop to reduce the impact on the common case
      to a single branch.
      
      - To avoid leaking the absolute seqno to userspace, the offset
      returned in ee_data must always be relative. It is an offset between
      an skb and sk field. The first is always set (also for GSO & ACK).
      The second must also never be uninitialized. Only allow the ID
      option on sockets in the ESTABLISHED state, for which the seqno
      is available. Never reset it to zero (instead, move it to the
      current seqno when reenabling the option).
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4ed2d765
    • W
      net-timestamp: SCHED timestamp on entering packet scheduler · e7fd2885
      Willem de Bruijn committed
      Kernel transmit latency is often incurred in the packet scheduler.
      Introduce a new timestamp on transmission just before entering the
      scheduler. When data travels through multiple devices (bonding,
      tunneling, ...) each device will export an individual timestamp.
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e7fd2885
    • W
      net-timestamp: add key to disambiguate concurrent datagrams · 09c2d251
      Willem de Bruijn committed
      Datagrams timestamped on transmission can coexist in the kernel stack
      and be reordered in packet scheduling. When reading looped datagrams
      from the socket error queue it is not always possible to uniquely
      correlate looped data with the original send() call (for application
      level retransmits). Even if possible, it may be expensive and complex,
      requiring packet inspection.
      
      Introduce a data-independent ID mechanism to associate timestamps with
      send calls. Pass an ID alongside the timestamp in field ee_data of
      sock_extended_err.
      
      The ID is a simple 32 bit unsigned int that is associated with the
      socket and incremented on each send() call for which software tx
      timestamp generation is enabled.
      
      The feature is enabled only if SOF_TIMESTAMPING_OPT_ID is set, to
      avoid changing ee_data for existing applications that expect it 0.
      The counter is reset each time the flag is reenabled. Reenabling
      does not change the ID of already submitted data. It is possible
      to receive out of order IDs if the timestamp stream is not quiesced
      first.
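
      A minimal userspace sketch of requesting software tx timestamps with
      IDs. The SOF_TIMESTAMPING_* flags and SO_TIMESTAMPING are the real
      Linux names from <linux/net_tstamp.h> and <sys/socket.h>; the two
      helper functions are hypothetical, and the code assumes kernel headers
      recent enough to define SOF_TIMESTAMPING_OPT_ID.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <linux/net_tstamp.h>

/* Build the SO_TIMESTAMPING flag mask: always generate and report software
 * tx timestamps, optionally add scheduler and ACK timestamps and the ID. */
unsigned int tx_tstamp_flags(bool want_sched, bool want_ack, bool want_id)
{
    unsigned int flags = SOF_TIMESTAMPING_TX_SOFTWARE |
                         SOF_TIMESTAMPING_SOFTWARE;

    if (want_sched)
        flags |= SOF_TIMESTAMPING_TX_SCHED;
    if (want_ack)
        flags |= SOF_TIMESTAMPING_TX_ACK;
    if (want_id)
        flags |= SOF_TIMESTAMPING_OPT_ID;
    return flags;
}

/* Apply the mask to a socket; returns 0 on success, -1 with errno set. */
int enable_tx_tstamps(int fd, unsigned int flags)
{
    assert(fd >= 0);
    return setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &flags, sizeof(flags));
}
```

      Leaving SOF_TIMESTAMPING_OPT_ID out of the mask keeps ee_data at the
      value legacy applications expect, which is the compatibility concern
      raised above.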
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      09c2d251
    • W
      net-timestamp: move timestamp flags out of sk_flags · b9f40e21
      Willem de Bruijn committed
      sk_flags is reaching its limit. New timestamping options will not fit.
      Move all of them into a new field sk->sk_tsflags.
      
      Added benefit is that this removes boilerplate code to convert between
      SOF_TIMESTAMPING_.. and SOCK_TIMESTAMPING_.. in getsockopt/setsockopt.
      
      SOCK_TIMESTAMPING_RX_SOFTWARE is also used to toggle the receive
      timestamp logic (netstamp_needed). That can be simplified and this
      last key removed, but will leave that for a separate patch.
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      
      ----
      
      The u16 in sock can be moved into a 16-bit hole below sk_gso_max_segs,
      though that scatters tstamp fields throughout the struct.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b9f40e21
    • W
      net-timestamp: extend SCM_TIMESTAMPING ancillary data struct · f24b9be5
      Willem de Bruijn committed
      Applications that request kernel tx timestamps with SO_TIMESTAMPING
      read timestamps as recvmsg() ancillary data. The response is defined
      implicitly as timespec[3].
      
      1) define struct scm_timestamping explicitly and
      
      2) add support for new tstamp types. On tx, scm_timestamping always
         accompanies a sock_extended_err. Define previously unused field
         ee_info to signal the type of ts[0]. Introduce SCM_TSTAMP_SND to
         define the existing behavior.
      
      The reception path is not modified. On rx, no struct similar to
      sock_extended_err is passed along with SCM_TIMESTAMPING.
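
      A sketch of how a reader of the error queue might use ee_info to tell
      which kind of tx timestamp ts[0] holds. struct sock_extended_err,
      SO_EE_ORIGIN_TIMESTAMPING and the SCM_TSTAMP_* values come from
      <linux/errqueue.h>; the helper name is hypothetical, and the check on
      ee_errno == ENOMSG is an assumption about how timestamp entries are
      marked.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <time.h>
#include <linux/errqueue.h>

/* Return the SCM_TSTAMP_* type carried in ee_info, or -1 if this extended
 * error is not a tx timestamp notification at all. */
int tx_tstamp_type(const struct sock_extended_err *serr)
{
    assert(serr != NULL);

    /* Assumption: timestamp entries use ENOMSG with the timestamping origin. */
    if (serr->ee_errno != ENOMSG ||
        serr->ee_origin != SO_EE_ORIGIN_TIMESTAMPING)
        return -1;

    switch (serr->ee_info) {
    case SCM_TSTAMP_SND:   /* existing behavior: timestamp at send */
    case SCM_TSTAMP_SCHED: /* timestamp on entering the packet scheduler */
    case SCM_TSTAMP_ACK:   /* timestamp when the data was acknowledged */
        return (int)serr->ee_info;
    default:
        return -1;
    }
}
```

      Because SCM_TSTAMP_SND is the zero value, an application that never
      looked at ee_info keeps seeing what it saw before.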
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f24b9be5
    • A
      cxgb4i : Move stray CPL definitions to cxgb4 driver · a2b81b35
      Anish Bhatt committed
      These belong in the t4 msg header; moving them there ensures there is no
      accidental code duplication in the future.
      Signed-off-by: Anish Bhatt <anish@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a2b81b35
    • N
      tcp: reduce spurious retransmits due to transient SACK reneging · 5ae344c9
      Neal Cardwell committed
      This commit reduces spurious retransmits due to apparent SACK reneging
      by only reacting to SACK reneging that persists for a short delay.
      
      When a sequence space hole at snd_una is filled, some TCP receivers
      send a series of ACKs as they apparently scan their out-of-order queue
      and cumulatively ACK all the packets that have now been consecutively
      received. This is essentially misbehavior B in "Misbehaviors in TCP
      SACK generation" ACM SIGCOMM Computer Communication Review, April
      2011, so we suspect that this is from several common OSes (Windows
      2000, Windows Server 2003, Windows XP). However, this issue has also
      been seen in other cases, e.g. the netdev thread "TCP being hoodwinked
      into spurious retransmissions by lack of timestamps?" from March 2014,
      where the receiver was thought to be a BSD box.
      
      Since snd_una would temporarily be adjacent to a previously SACKed
      range in these scenarios, this receiver behavior triggered the Linux
      SACK reneging code path in the sender. This led the sender to clear
      the SACK scoreboard, enter CA_Loss, and spuriously retransmit
      (potentially) every packet from the entire write queue at line rate
      just a few milliseconds before the ACK for each packet arrives at the
      sender.
      
      To avoid such situations, now when a sender sees apparent reneging it
      does not yet retransmit, but rather adjusts the RTO timer to give the
      receiver a little time (max(RTT/2, 10ms)) to send us some more ACKs
      that will restore sanity to the SACK scoreboard. If the reneging
      persists until this RTO then, as before, we clear the SACK scoreboard
      and enter CA_Loss.
      
      A 10ms delay tolerates a receiver sending such a stream of ACKs at
      56Kbit/sec. And to allow for receivers with slower or more congested
      paths, we wait for at least RTT/2.
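
      The delay rule above can be sketched as follows. This is a
      stand-alone illustration, not the kernel code: the function names are
      hypothetical, times are in microseconds, and the 40-byte ACK used in
      the usage note is an assumed size for a bare IPv4 + TCP header.

```c
#include <assert.h>
#include <stdint.h>

#define RENEGING_MIN_DELAY_US 10000u /* 10 ms floor from the text above */

/* Grace period before acting on apparent SACK reneging: max(RTT/2, 10ms). */
uint32_t reneging_delay_us(uint32_t rtt_us)
{
    uint32_t half_rtt = rtt_us / 2;

    return half_rtt > RENEGING_MIN_DELAY_US ? half_rtt : RENEGING_MIN_DELAY_US;
}

/* Time to put `bytes` on a link of `bits_per_sec`, in microseconds. */
uint32_t serialization_time_us(uint32_t bytes, uint32_t bits_per_sec)
{
    assert(bits_per_sec > 0);
    return (uint32_t)(((uint64_t)bytes * 8u * 1000000u) / bits_per_sec);
}
```

      For example, serialization_time_us(40, 56000) is about 5.7 ms, so under
      that size assumption successive ACKs on a 56Kbit/sec link arrive
      within the 10 ms floor.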
      
      We validated the resulting max(RTT/2, 10ms) delay formula with a mix
      of North American and South American Google web server traffic, and
      found that for ACKs displaying transient reneging:
      
       (1) 90% of inter-ACK delays were less than 10ms
       (2) 99% of inter-ACK delays were less than RTT/2
      
      In tests on Google web servers this commit reduced reneging events by
      75%-90% (as measured by the TcpExtTCPSACKReneging counter), without
      any measurable impact on latency for user HTTP and SPDY requests.
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5ae344c9
    • D
      Merge branch 'qlcnic' · ff91a550
      David S. Miller committed
      Rajesh Borundia says:
      
      ====================
      qlcnic: Bug fixes
      
      The patch series contains following bug fixes.
      
      * Aggregating tx stats in adapter variable was resulting
        in increase of stats when user runs ifconfig command
        and no traffic is running. Instead aggregate tx stats
        in local variable and then assign it to adapter struct
        variable.
      * Set_driver_version was called after registering netdev
        which was resulting in a race between FLR in open
        handler and set_driver_version command as open handler
        can be called simultaneously on another cpu even if probe
        is not complete. So call this command before registering
        netdev.
      * dcbnl_ops should be initialized before registering netdev
        as they are referenced in open handler.
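
      The first fix in the list can be illustrated with a small stand-alone
      sketch. All struct and function names here are hypothetical and much
      simpler than the qlcnic driver; the point is only the difference
      between accumulating into the adapter field and summing into a local
      first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TX_RINGS 4

struct tx_ring {
    uint64_t packets; /* per-ring counter, updated on the transmit path */
};

struct adapter {
    struct tx_ring rings[MAX_TX_RINGS];
    size_t nrings;
    uint64_t tx_packets; /* aggregate reported to userspace */
};

/* Buggy pattern: every stats query adds the ring counters on top of the
 * previous aggregate, so the total grows even with no traffic. */
void update_tx_stats_buggy(struct adapter *a)
{
    assert(a != NULL && a->nrings <= MAX_TX_RINGS);
    for (size_t i = 0; i < a->nrings; i++)
        a->tx_packets += a->rings[i].packets;
}

/* Fixed pattern: sum into a local variable, then assign once. */
void update_tx_stats(struct adapter *a)
{
    uint64_t total = 0;

    assert(a != NULL && a->nrings <= MAX_TX_RINGS);
    for (size_t i = 0; i < a->nrings; i++)
        total += a->rings[i].packets;
    a->tx_packets = total;
}
```

      Calling update_tx_stats() repeatedly on an idle adapter leaves
      tx_packets unchanged, whereas the buggy variant inflates it on every
      call.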
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ff91a550