1. 11 Jul, 2013 5 commits
  2. 10 Jul, 2013 35 commits
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next · 496322bc
      Linus Torvalds committed
      Pull networking updates from David Miller:
       "This is a re-do of the net-next pull request for the current merge
        window.  The only difference from the one I made the other day is that
        this has Eliezer's interface renames and the timeout handling changes
        made based upon your feedback, as well as a few bug fixes that have
        trickled in.
      
        Highlights:
      
         1) Low latency device polling, eliminating the cost of interrupt
            handling and context switches.  Allows direct polling of a network
            device from socket operations, such as recvmsg() and poll().
      
            Currently ixgbe, mlx4, and bnx2x support this feature.
      
            Full high level description, performance numbers, and design in
            commit 0a4db187 ("Merge branch 'll_poll'")
      
            From Eliezer Tamir.
      
         2) With the routing cache removed, ip_check_mc_rcu() gets exercised
            more than ever before in the case where we have lots of multicast
            addresses.  Use a hash table instead of a simple linked list, from
            Eric Dumazet.
      
         3) Add driver for Atheros QCA98xx 802.11ac wireless devices, from
            Bartosz Markowski, Janusz Dziedzic, Kalle Valo, Marek Kwaczynski,
            Marek Puzyniak, Michal Kazior, and Sujith Manoharan.
      
         4) Support reporting the TUN device persist flag to userspace, from
            Pavel Emelyanov.
      
         5) Allow controlling network device VF link state using netlink, from
            Rony Efraim.
      
         6) Support GRE tunneling in openvswitch, from Pravin B Shelar.
      
         7) Adjust SOCK_MIN_RCVBUF and SOCK_MIN_SNDBUF for modern times, from
            Daniel Borkmann and Eric Dumazet.
      
         8) Allow controlling of TCP quickack behavior on a per-route basis,
            from Cong Wang.
      
         9) Several bug fixes and improvements to vxlan from Stephen
            Hemminger, Pravin B Shelar, and Mike Rapoport.  In particular,
            support receiving on multiple UDP ports.
      
        10) Major cleanups, particularly in the area of debugging and cookie
            lifetime handling, to the SCTP protocol code.  From Daniel
            Borkmann.
      
        11) Allow packets to cross network namespaces when traversing tunnel
            devices.  From Nicolas Dichtel.
      
        12) Allow monitoring netlink traffic via AF_PACKET sockets, in a
            manner akin to how we monitor real network traffic via ptype_all.
            From Daniel Borkmann.
      
        13) Several bug fixes and improvements for the new alx device driver,
            from Johannes Berg.
      
        14) Fix scalability issues in the netem packet scheduler's time queue,
            by using an rbtree.  From Eric Dumazet.
      
        15) Several bug fixes in TCP loss recovery handling, from Yuchung
            Cheng.
      
        16) Add support for GSO segmentation of MPLS packets, from Simon
            Horman.
      
        17) Make network notifiers have a real data type for the opaque
            pointer that's passed into them.  Use this to properly handle
            network device flag changes in arp_netdev_event().  From Jiri
            Pirko and Timo Teräs.
      
        18) Convert several drivers over to module_pci_driver(), from Peter
            Huewe.
      
        19) tcp_fixup_rcvbuf() can loop 500 times over loopback, just use a
            O(1) calculation instead.  From Eric Dumazet.
      
        20) Support setting of explicit tunnel peer addresses in ipv6, just
            like ipv4.  From Nicolas Dichtel.
      
        21) Protect x86 BPF JIT against spraying attacks, from Eric Dumazet.
      
        22) Prevent a single high rate flow from overrunning an individual cpu
            during RX packet processing via selective flow shedding.  From
            Willem de Bruijn.
      
        23) Don't use spinlocks in TCP md5 signing fast paths, from Eric
            Dumazet.
      
        24) Don't just drop GSO packets which are above the TBF scheduler's
            burst limit, chop them up so they are in-bounds instead.  Also
            from Eric Dumazet.
      
        25) VLAN offloads are missed when configured on top of a bridge, fix
            from Vlad Yasevich.
      
        26) Support IPV6 in ping sockets.  From Lorenzo Colitti.
      
        27) Receive flow steering targets should be updated at poll() time
            too, from David Majnemer.
      
        28) Fix several corner case regressions in PMTU/redirect handling due
            to the routing cache removal, from Timo Teräs.
      
        29) We have to be mindful of ipv4 mapped ipv6 sockets in
            udp_v6_push_pending_frames().  From Hannes Frederic Sowa.
      
        30) Fix L2TP sequence number handling bugs, from James Chapman."
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1214 commits)
        drivers/net: caif: fix wrong rtnl_is_locked() usage
        drivers/net: enic: release rtnl_lock on error-path
        vhost-net: fix use-after-free in vhost_net_flush
        net: mv643xx_eth: do not use port number as platform device id
        net: sctp: confirm route during forward progress
        virtio_net: fix race in RX VQ processing
        virtio: support unlocked queue poll
        net/cadence/macb: fix bug/typo in extracting gem_irq_read_clear bit
        Documentation: Fix references to defunct linux-net@vger.kernel.org
        net/fs: change busy poll time accounting
        net: rename low latency sockets functions to busy poll
        bridge: fix some kernel warning in multicast timer
        sfc: Fix memory leak when discarding scattered packets
        sit: fix tunnel update via netlink
        dt:net:stmmac: Add dt specific phy reset callback support.
        dt:net:stmmac: Add support to dwmac version 3.610 and 3.710
        dt:net:stmmac: Allocate platform data only if its NULL.
        net:stmmac: fix memleak in the open method
        ipv6: rt6_check_neigh should successfully verify neigh if no NUD information are available
        net: ipv6: fix wrong ping_v6_sendmsg return value
        ...
      496322bc
    • L
      Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux · 2e17c5a9
      Linus Torvalds committed
      Pull drm updates from Dave Airlie:
       "Okay this is the big one, I was stalled on the fbdev pull req as I
        stupidly let fbdev guys merge a patch I required to fix a warning with
        some patches I had, they ended up merging the patch from the wrong
        place, but the warning should be fixed.  In future I'll just take the
        patch myself!
      
        Outside drm:
      
        There are some snd changes for the HDMI audio interactions on haswell,
        they've been acked for inclusion via my tree.  This relies on the
        wound/wait tree from Ingo which is already merged.
      
        Major changes:
      
        AMD finally released the dynamic power management code for all their
        GPUs from r600->present day, this is great, off by default for now but
        also a huge amount of code, in fact it is most of this pull request.
      
        Since it landed there has been a lot of community testing and Alex has
        sent a lot of fixes for any bugs found so far.  I suspect radeon might
        now be the biggest kernel driver ever :-P p.s.  radeon.dpm=1 to enable
        dynamic power management for anyone.
      
        New drivers:
      
        Renesas r-car display unit.
      
        Other highlights:
      
         - core: GEM CMA prime support, use new w/w mutexes for TTM
           reservations, cursor hotspot, doc updates
         - dvo chips: chrontel 7010B support
         - i915: Haswell (fbc, ips, vecs, watermarks, audio powerwell),
           Valleyview (enabled by default, rc6), lots of pll reworking, 30bpp
           support (this time for sure)
         - nouveau: async buffer object deletion, context/register init
           updates, kernel vp2 engine support, GF117 support, GK110 accel
           support (with external nvidia ucode), context cleanups.
         - exynos: memory leak fixes, Add S3C64XX SoC series support, device
           tree updates, common clock framework support,
         - qxl: cursor hotspot support, multi-monitor support, suspend/resume
           support
         - mgag200: hw cursor support, g200 mode limiting
         - shmobile: prime support
         - tegra: fixes mostly
      
        I've been banging on this quite a lot due to the size of it, and it
        seems to be okay on everything I've tested it on."
      
      * 'drm-next' of git://people.freedesktop.org/~airlied/linux: (811 commits)
        drm/radeon/dpm: implement vblank_too_short callback for si
        drm/radeon/dpm: implement vblank_too_short callback for cayman
        drm/radeon/dpm: implement vblank_too_short callback for btc
        drm/radeon/dpm: implement vblank_too_short callback for evergreen
        drm/radeon/dpm: implement vblank_too_short callback for 7xx
        drm/radeon/dpm: add checks against vblank time
        drm/radeon/dpm: add helper to calculate vblank time
        drm/radeon: remove stray line in old pm code
        drm/radeon/dpm: fix display_gap programming on rv7xx
        drm/nvc0/gr: fix gpc firmware regression
        drm/nouveau: fix minor thinko causing bo moves to not be async on kepler
        drm/radeon/dpm: implement force performance level for TN
        drm/radeon/dpm: implement force performance level for ON/LN
        drm/radeon/dpm: implement force performance level for SI
        drm/radeon/dpm: implement force performance level for cayman
        drm/radeon/dpm: implement force performance levels for 7xx/eg/btc
        drm/radeon/dpm: add infrastructure to force performance levels
        drm/radeon: fix surface setup on r1xx
        drm/radeon: add support for 3d perf states on older asics
        drm/radeon: set default clocks for SI when DPM is disabled
        ...
      2e17c5a9
    • L
      Merge tag 'fbdev-for-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/plagnioj/linux-fbdev · 5f097cd2
      Linus Torvalds committed
      Pull fbdev update from Jean-Christophe PLAGNIOL-VILLARD:
       "Various fbdev changes for 3.11
         - xilinxfb updates
         - Small cleanups and fixes to multiple drivers
         - OMAP display subsystem bug updates
         - imxfb dt support"
      
      * tag 'fbdev-for-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/plagnioj/linux-fbdev: (95 commits)
        video: imxfb: Add DT support
        video: i740fb: Make i740fb_init static
        fb: make fp_get_options name argument const
        video: mmp: fix graphics/video layer enable/mask swap issue
        video: mmp: fix memcpy wrong size for mmp_addr issue
        radeon: use pdev->pm_cap instead of pci_find_capability(..,PCI_CAP_ID_PM)
        aty128fb: use pdev->pm_cap instead of pci_find_capability(..,PCI_CAP_ID_PM)
        video: of_display_timing.h: Declare 'display_timing'
        fbdev: bfin-lq035q1-fb: Use dev_pm_ops
        fbmem: return -EFAULT on copy_to_user() failure
        OMAPDSS: DPI: Fix wrong pixel clock limit
        video: replace strict_strtoul() with kstrtoul()
        uvesafb: Correct/simplify warning message
        fb: fix atyfb unused data warnings
        fb: fix atyfb build warning
        video: imxfb: Make local symbols static
        video: udlfb: Make local symbol static
        video: udlfb: Use NULL instead of 0
        video: smscufx: Use NULL instead of 0
        video: remove unnecessary platform_set_drvdata()
        ...
      5f097cd2
    • L
      Merge branch 'akpm' (updates from Andrew Morton) · a82a729f
      Linus Torvalds committed
      Merge second patch-bomb from Andrew Morton:
       - misc fixes
       - audit stuff
       - fanotify/inotify/dnotify things
       - most of the rest of MM.  The new cache shrinker code from Glauber and
         Dave Chinner probably isn't quite stabilized yet.
       - ptrace
       - ipc
       - partitions
       - reboot cleanups
       - add LZ4 decompressor, use it for kernel compression
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (118 commits)
        lib/scatterlist: error handling in __sg_alloc_table()
        scsi_debug: fix do_device_access() with wrap around range
        crypto: talitos: use sg_pcopy_to_buffer()
        lib/scatterlist: introduce sg_pcopy_from_buffer() and sg_pcopy_to_buffer()
        lib/scatterlist: factor out sg_miter_get_next_page() from sg_miter_next()
        crypto: add lz4 Cryptographic API
        lib: add lz4 compressor module
        arm: add support for LZ4-compressed kernel
        lib: add support for LZ4-compressed kernel
        decompressor: add LZ4 decompressor module
        lib: add weak clz/ctz functions
        reboot: move arch/x86 reboot= handling to generic kernel
        reboot: arm: change reboot_mode to use enum reboot_mode
        reboot: arm: prepare reboot_mode for moving to generic kernel code
        reboot: arm: remove unused restart_mode fields from some arm subarchs
        reboot: unicore32: prepare reboot_mode for moving to generic kernel code
        reboot: x86: prepare reboot_mode for moving to generic kernel code
        reboot: checkpatch.pl the new kernel/reboot.c file
        reboot: move shutdown/reboot related functions to kernel/reboot.c
        reboot: remove -stable friendly PF_THREAD_BOUND define
        ...
      a82a729f
    • H
      parisc: Fix gcc miscompilation in pa_memcpy() · 5b879d78
      Helge Deller committed
      When running the LTP testsuite one may hit this kernel BUG() with the
      write06 testcase:
      
      kernel BUG at mm/filemap.c:2023!
      CPU: 1 PID: 8614 Comm: writev01 Not tainted 3.10.0-rc7-64bit-c3000+ #6
      IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000401e6e84 00000000401e6e88
       IIR: 03ffe01f    ISR: 0000000010340000  IOR: 000001fbe0380820
       CPU:        1   CR30: 00000000bef80000 CR31: ffffffffffffffff
       ORIG_R28: 00000000bdc192c0
       IAOQ[0]: iov_iter_advance+0x3c/0xc0
       IAOQ[1]: iov_iter_advance+0x40/0xc0
       RP(r2): generic_file_buffered_write+0x204/0x3f0
      Backtrace:
       [<00000000401e764c>] generic_file_buffered_write+0x204/0x3f0
       [<00000000401eab24>] __generic_file_aio_write+0x244/0x448
       [<00000000401eadc0>] generic_file_aio_write+0x98/0x150
       [<000000004024f460>] do_sync_readv_writev+0xc0/0x130
       [<000000004025037c>] compat_do_readv_writev+0x12c/0x340
       [<00000000402505f8>] compat_writev+0x68/0xa0
       [<0000000040251d88>] compat_SyS_writev+0x98/0xf8
      
      Reason for this crash is a gcc miscompilation in the fault handlers of
      pa_memcpy() which return the fault address instead of the copied bytes.
      Since this seems to be a generic problem with gcc-4.7.x (and below), it's
      better to simplify the fault handlers in pa_memcpy to avoid this problem.
      
      Here is a simple reproducer for the problem:
      
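      #include <errno.h>
      #include <fcntl.h>
      #include <stdio.h>
      #include <string.h>
      #include <sys/uio.h>
      
      /* Placeholder path (an assumption): the original does not define DATA_FILE. */
      #define DATA_FILE "writev_data_file"
      
      int main(int argc, char **argv)
      {
      	int fd, nbytes;
      	struct iovec wr_iovec[] = {
      		{ "TEST STRING                     ",32},
      		{ (char*)0x40005000,32} }; // random memory.
      	fd = open(DATA_FILE, O_RDWR | O_CREAT, 0666);
      	nbytes = writev(fd, wr_iovec, 2);
      	printf("return value = %d, errno %d (%s)\n",
      		nbytes, errno, strerror(errno));
      	return 0;
      }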
      
      In addition, John David Anglin wrote:
      There is no gcc PR as pa_memcpy is not legitimate C code. There is an
      implicit assumption that certain variables will contain correct values
      when an exception occurs and the code randomly jumps to one of the
      exception blocks.  There is no guarantee of this.  If a PR was filed, it
      would likely be marked as invalid.
      Signed-off-by: Helge Deller <deller@gmx.de>
      Signed-off-by: John David Anglin <dave.anglin@bell.net>
      Cc: <stable@vger.kernel.org> # 3.8+
      Signed-off-by: Helge Deller <deller@gmx.de>
      5b879d78
    • J
      parisc: Ensure volatile space register %sr1 is not clobbered · e8d8fc21
      John David Anglin committed
      I still see the occasional random segv on rp3440.  Looking at one of
      these (a code 15), it appeared the problem must be with the cache
      handling of anonymous pages.  Reviewing this, I noticed that the space
      register %sr1 might be being clobbered when we flush an anonymous page.
      
      Register %sr1 is used for TLB purges in a couple of places.  These
      purges are needed on PA8800 and PA8900 processors to ensure cache
      consistency of flushed cache lines.
      
      The solution here is simply to move the %sr1 load into the TLB lock
      region needed to ensure that one purge executes at a time on SMP
      systems.  This was already the case for one use.  After a few days of
      operation, I haven't had a random segv on my rp3440.
      Signed-off-by: John David Anglin <dave.anglin@bell.net>
      Cc: <stable@vger.kernel.org> # 3.10
      Signed-off-by: Helge Deller <deller@gmx.de>
      e8d8fc21
    • H
      parisc: optimize mtsp(0,sr) inline assembly · 92b59929
      Helge Deller committed
      If the value which should be moved into a space register is zero, we can
      optimize the inline assembly to become "mtsp %r0,%srX".
      Signed-off-by: Helge Deller <deller@gmx.de>
      Cc: <stable@vger.kernel.org> # 3.10
      92b59929
    • H
      parisc: switch to gzip-compressed vmlinuz kernel · 594174d8
      Helge Deller committed
      The latest PA-RISC Boot Loader (palo) allows loading of gzip compressed
      vmlinuz kernels. So let's now switch to build a vmlinuz file when we
      build a palo boot image.
      
      PALO version 1.9 (or higher) is required for this which is available at
      git://git.kernel.org/pub/scm/linux/kernel/git/deller/palo.git
      Signed-off-by: Helge Deller <deller@gmx.de>
      Cc: <stable@vger.kernel.org> # 3.10
      594174d8
    • H
      parisc: document the shadow registers · a83f58bc
      Helge Deller committed
      Signed-off-by: Helge Deller <deller@gmx.de>
      Cc: <stable@vger.kernel.org> # 3.10
      a83f58bc
    • H
      parisc: more capabilities info in /proc/cpuinfo · 30a9f0b2
      Helge Deller committed
      Signed-off-by: Helge Deller <deller@gmx.de>
      Cc: <stable@vger.kernel.org> # 3.10
      30a9f0b2
    • H
      parisc: fix LMMIO mismatch between PAT length and MASK register · dac76f1b
      Helge Deller committed
      The LMMIO length reported by PAT and the length given by the LBA MASK
      register are not consistent. This leads e.g. to a not-working ATI FireGL
      card with the radeon DRM driver since the memory can't be mapped.
      
      Fix this by correctly adjusting the resource sizes.
      Signed-off-by: Helge Deller <deller@gmx.de>
      Cc: <stable@vger.kernel.org> # 3.10
      dac76f1b
    • K
      drivers/net: caif: fix wrong rtnl_is_locked() usage · 56e0ef52
      Konstantin Khlebnikov committed
      rtnl_is_locked() doesn't check who holds this lock; it just tells that it's
      locked right now. If caif::ldisc_close really can be called under rtnl_lock
      then it should release the net device in another context, because there is
      no way to grab rtnl_lock without deadlock.
      
      This patch adds work which releases these devices. Also this patch fixes calling
      dev_close/unregister_netdevice without rtnl_lock from caif_ser_exit().
      Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
      Cc: Dmitry Tarnyagin <dmitry.tarnyagin@lockless.no>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      56e0ef52
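The distinction drawn in the caif commit message above can be illustrated outside the kernel. The following is a user-space toy, not the caif or rtnl code: `toy_lock`, `toy_is_locked()` and `toy_held_by()` are hypothetical names invented for this sketch. It only shows why "somebody holds the lock" cannot stand in for "I hold the lock".

```c
#include <stdbool.h>

/* Toy lock that records its owner. All names here are hypothetical. */
struct toy_lock {
	bool locked;
	int owner;	/* id of the holder, meaningful only while locked */
};

/* Take the lock for caller `self` if it is free. */
bool toy_try_lock(struct toy_lock *l, int self)
{
	if (l->locked)
		return false;
	l->locked = true;
	l->owner = self;
	return true;
}

void toy_unlock(struct toy_lock *l)
{
	l->locked = false;
}

/* Analogue of an "is locked" query: true if anyone holds the lock. */
bool toy_is_locked(const struct toy_lock *l)
{
	return l->locked;
}

/* What the caller actually needs to know: do *I* hold the lock? */
bool toy_held_by(const struct toy_lock *l, int self)
{
	return l->locked && l->owner == self;
}
```

With caller 1 holding the lock, `toy_is_locked()` is true for every caller, while `toy_held_by()` is true only for caller 1. Code that skips locking because `toy_is_locked()` returned true would be wrong whenever a different caller is the holder.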
    • K
      drivers/net: enic: release rtnl_lock on error-path · e057590b
      Konstantin Khlebnikov committed
      enic_change_mtu_work() must call rtnl_unlock() on all exiting paths.
      Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
      Cc: Christian Benvenuti <benve@cisco.com>
      Cc: Roopa Prabhu <roprabhu@cisco.com>
      Cc: Neel Patel <neepatel@cisco.com>
      Cc: Nishank Trivedi <nistrive@cisco.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e057590b
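The rule stated in the enic commit above (unlock on every exit path) is commonly enforced in C with a single exit label. The sketch below is a generic user-space illustration, not the enic driver code: `demo_lock`, `change_setting()` and the two boolean "steps" are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a lock; depth counts outstanding acquisitions. */
struct demo_lock {
	int depth;
};

void demo_lock_acquire(struct demo_lock *l) { l->depth++; }
void demo_lock_release(struct demo_lock *l) { l->depth--; }

/*
 * Returns 0 on success, -1 on failure. Every path taken after the lock
 * is acquired leaves through the "out" label, so the lock is always
 * released exactly once.
 */
int change_setting(struct demo_lock *l, bool step1_ok, bool step2_ok)
{
	int err = 0;

	demo_lock_acquire(l);

	if (!step1_ok) {
		err = -1;
		goto out;	/* a bare "return err;" here would leak the lock */
	}
	if (!step2_ok) {
		err = -1;
		goto out;
	}

out:
	demo_lock_release(l);
	return err;
}
```

After any call to `change_setting()`, successful or not, `depth` is back to its previous value, which is the property the commit restores for the error path.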
    • M
      vhost-net: fix use-after-free in vhost_net_flush · dd7633ec
      Michael S. Tsirkin committed
      vhost_net_ubuf_put_and_wait has a confusing name:
      it will actually also free its argument.
      Thus since commit 1280c27f
          "vhost-net: flush outstanding DMAs on memory change"
      vhost_net_flush tries to use the argument after passing it
      to vhost_net_ubuf_put_and_wait, this results
      in use after free.
      To fix, don't free the argument in vhost_net_ubuf_put_and_wait,
      add a new API for callers that want to free ubufs.
      Acked-by: Asias He <asias@redhat.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      dd7633ec
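The shape of the vhost-net fix above (stop freeing inside the "put and wait" helper, and give callers that want the free a separate entry point) can be sketched in user space. This is not the vhost code: `ubuf_ref` and all function names below are hypothetical, the wait is reduced to a counter update, and the re-arm step in `ubuf_flush()` is an assumption made for illustration.

```c
#include <stdlib.h>

/* Hypothetical reference-counted object standing in for the ubuf state. */
struct ubuf_ref {
	int refcount;
	int flushed;	/* number of completed waits, for demonstration */
};

struct ubuf_ref *ubuf_ref_alloc(void)
{
	struct ubuf_ref *u = calloc(1, sizeof(*u));

	if (u)
		u->refcount = 1;
	return u;
}

/* Drops a reference and "waits". After the fix it does NOT free u. */
void ubuf_put_and_wait(struct ubuf_ref *u)
{
	u->refcount--;
	/* real code would sleep here until all references are gone */
	u->flushed++;
}

/* Separate entry point for callers that are finished with the object. */
void ubuf_put_wait_and_free(struct ubuf_ref *u)
{
	ubuf_put_and_wait(u);
	free(u);
}

/* Flush path: needs u to remain valid after the wait so it can be reused. */
void ubuf_flush(struct ubuf_ref *u)
{
	ubuf_put_and_wait(u);
	u->refcount = 1;	/* safe only because put_and_wait did not free u */
}
```

If `ubuf_put_and_wait()` also called `free(u)`, the assignment in `ubuf_flush()` would write to freed memory, which is the use-after-free pattern the commit describes.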
    • L
      Merge tag 'for-linus-3.11-merge-window-part-1' of... · 899dd388
      Linus Torvalds committed
      Merge tag 'for-linus-3.11-merge-window-part-1' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs
      
      Pull 9p update from Eric Van Hensbergen:
       "Grab bag of little fixes and enhancements:
        - optional security enhancements
        - fix path coverage in MAINTAINERS
        - switch to using most used protocol and transport as default
        - clean up buffer dumps in trace code
      
        Held off on RDMA patches as they need to be cleaned up a bit, but will
        try to get them cleaned, checked, and pushed by mid-week"
      
      * tag 'for-linus-3.11-merge-window-part-1' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
        9p: Add rest of 9p files to MAINTAINERS entry
        9p: trace: use %*ph to dump buffer
        net/9p: Handle error in zero copy request correctly for 9p2000.u
        net/9p: Use virtio transpart as the default transport
        net/9p: Make 9P2000.L the default protocol for 9p file system
      899dd388
    • J
      net: mv643xx_eth: do not use port number as platform device id · 785bf6f7
      Jonas Gorski committed
      The port number is only local to the ethernet block, not global, so
      there can be two ethernet blocks both using the same port, like
      kirkwood with both using port 0.
      
      Fix this by using the array index offset for the allocated platform
      devices as the id.
      Signed-off-by: Jonas Gorski <jogo@openwrt.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      785bf6f7
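The id collision described in the mv643xx_eth commit above is easy to reproduce in miniature. The sketch below is a user-space illustration with hypothetical types and function names, not the driver code: it derives ids once from the block-local port number and once from the position in the array of allocated devices.

```c
#include <stddef.h>

/* Hypothetical description of one port: which block it is in, and its
 * port number, which is only unique within that block. */
struct port_desc {
	int block;
	int port_number;
};

/* Returns 1 if any two entries of ids[] are equal, 0 otherwise. */
int has_duplicate(const int *ids, size_t n)
{
	for (size_t i = 0; i < n; i++)
		for (size_t j = i + 1; j < n; j++)
			if (ids[i] == ids[j])
				return 1;
	return 0;
}

/* Old scheme: use the block-local port number as the device id. */
void ids_from_port_number(const struct port_desc *p, size_t n, int *ids)
{
	for (size_t i = 0; i < n; i++)
		ids[i] = p[i].port_number;
}

/* New scheme: use the index in the array of allocated devices. */
void ids_from_index(size_t n, int *ids)
{
	for (size_t i = 0; i < n; i++)
		ids[i] = (int)i;
}
```

For two blocks that both use port 0, the first scheme yields the ids `{0, 0}` and the second `{0, 1}`.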
    • D
      net: sctp: confirm route during forward progress · 8c2f414a
      Daniel Borkmann committed
      This fix has been proposed originally by Vlad Yasevich. He says:
      
        When SCTP makes forward progress (receives a SACK that acks new chunks,
        renegs, or answers 0-window probes) or when HB-ACK arrives, mark
        the route as confirmed so we don't unnecessarily send NUD probes.
      
      Having a simple SCTP client/server that exchange data chunks every 1sec,
      without this patch ARP requests are sent periodically every 40-60sec.
      With this fix applied, an ARP request is only done once right at the
      "session" beginning. Also, when clearing the related ARP cache entry
      manually during the session, a new request is correctly done. I have
      only "backported" this to net-next and tested that it works, so full
      credit goes to Vlad.
      Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8c2f414a
    • D
      virtio_net: fix race in RX VQ processing · e1d6fbc3
      David S. Miller committed
      Michael S. Tsirkin says:
      
      ====================
      Jason Wang reported a race in RX VQ processing:
      virtqueue_enable_cb is called outside napi lock,
      violating virtio serialization rules.
      The race has been there from day 1, but it got especially nasty in 3.0
      when commit a5c262c5
      "virtio_ring: support event idx feature"
      added more dependency on vq state.
      
      Please review, and consider for 3.11 and stable.
      
      Changes from v1:
      	- Added Jason's Tested-by tag
      	- minor coding style fix
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e1d6fbc3
    • M
      virtio_net: fix race in RX VQ processing · cbdadbbf
      Michael S. Tsirkin committed
      virtio net called virtqueue_enable_cb on RX path after napi_complete, so
      with NAPI_STATE_SCHED clear - outside the implicit napi lock.
      This violates the requirement to synchronize virtqueue_enable_cb wrt
      virtqueue_add_buf.  In particular, used event can move backwards,
      causing us to lose interrupts.
      In a debug build, this can trigger panic within START_USE.
      
      Jason Wang reports that he can trigger the races artificially,
      by adding udelay() in virtqueue_enable_cb() after virtio_mb().
      
      However, we must call napi_complete to clear NAPI_STATE_SCHED before
      polling the virtqueue for used buffers, otherwise napi_schedule_prep in
      a callback will fail, causing us to lose RX events.
      
      To fix, call virtqueue_enable_cb_prepare with NAPI_STATE_SCHED
      set (under napi lock), later call virtqueue_poll with
      NAPI_STATE_SCHED clear (outside the lock).
      Reported-by: Jason Wang <jasowang@redhat.com>
      Tested-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cbdadbbf
    • M
      virtio: support unlocked queue poll · cc229884
      Michael S. Tsirkin committed
      This adds a way to check ring empty state after enable_cb outside any
      locks. Will be used by virtio_net.
      
      Note: there's room for more optimization: caller is likely to have a
      memory barrier already, which means we might be able to get rid of a
      barrier here.  Deferring this optimization until we do some
      benchmarking.
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cc229884
    • G
      Documentation: Fix references to defunct linux-net@vger.kernel.org · b78ba72c
      Geert Uytterhoeven committed
      linux-net@vger.kernel.org was replaced by netdev@oss.sgi.com, which was
      in turn replaced by netdev@vger.kernel.org.
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b78ba72c
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client · 9a5889ae
      Linus Torvalds committed
      Pull Ceph updates from Sage Weil:
       "There is some follow-on RBD cleanup after the last window's code drop,
        a series from Yan fixing multi-mds behavior in cephfs, and then a
        sprinkling of bug fixes all around.  Some warnings, sleeping while
        atomic, a null dereference, and cleanups"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (36 commits)
        libceph: fix invalid unsigned->signed conversion for timespec encoding
        libceph: call r_unsafe_callback when unsafe reply is received
        ceph: fix race between cap issue and revoke
        ceph: fix cap revoke race
        ceph: fix pending vmtruncate race
        ceph: avoid accessing invalid memory
        libceph: Fix NULL pointer dereference in auth client code
        ceph: Reconstruct the func ceph_reserve_caps.
        ceph: Free mdsc if alloc mdsc->mdsmap failed.
        ceph: remove sb_start/end_write in ceph_aio_write.
        ceph: avoid meaningless calling ceph_caps_revoking if sync_mode == WB_SYNC_ALL.
        ceph: fix sleeping function called from invalid context.
        ceph: move inode to proper flushing list when auth MDS changes
        rbd: fix a couple warnings
        ceph: clear migrate seq when MDS restarts
        ceph: check migrate seq before changing auth cap
        ceph: fix race between page writeback and truncate
        ceph: reset iov_len when discarding cap release messages
        ceph: fix cap release race
        libceph: fix truncate size calculation
        ...
      9a5889ae
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs · e3a0dd98
      Linus Torvalds committed
      Pull btrfs update from Chris Mason:
       "These are the usual mixture of bugs, cleanups and performance fixes.
        Miao has some really nice tuning of our crc code as well as our
        transaction commits.
      
        Josef is peeling off more and more problems related to early enospc,
        and has a number of important bug fixes in here too"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (81 commits)
        Btrfs: wait ordered range before doing direct io
        Btrfs: only do the tree_mod_log_free_eb if this is our last ref
        Btrfs: hold the tree mod lock in __tree_mod_log_rewind
        Btrfs: make backref walking code handle skinny metadata
        Btrfs: fix crash regarding to ulist_add_merge
        Btrfs: fix several potential problems in copy_nocow_pages_for_inode
        Btrfs: cleanup the code of copy_nocow_pages_for_inode()
        Btrfs: fix oops when recovering the file data by scrub function
        Btrfs: make the chunk allocator completely tree lockless
        Btrfs: cleanup orphaned root orphan item
        Btrfs: fix wrong mirror number tuning
        Btrfs: cleanup redundant code in btrfs_submit_direct()
        Btrfs: remove btrfs_sector_sum structure
        Btrfs: check if we can nocow if we don't have data space
        Btrfs: stop using try_to_writeback_inodes_sb_nr to flush delalloc
        Btrfs: use a percpu to keep track of possibly pinned bytes
        Btrfs: check for actual acls rather than just xattrs when caching no acl
        Btrfs: move btrfs_truncate_page to btrfs_cont_expand instead of btrfs_truncate
        Btrfs: optimize reada_for_balance
        Btrfs: optimize read_block_for_search
        ...
      e3a0dd98
    • E
      net/fs: change busy poll time accounting · 76b1e9b9
      Eliezer Tamir committed
      Suggested by Linus:
      Changed time accounting for busy-poll:
      - Make it microsecond based.
      - Use unsigned longs.
      - Revert back to use time_after instead of time_in_range.
      Reorder poll/select busy loop conditions:
      - Clear busy_flag after one time we can't busy-poll.
      - Only init busy_end if we actually are going to busy-poll.
      Added one more missing need_resched() test.
      Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      76b1e9b9
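The busy-poll commit above switches to microsecond timestamps held in unsigned longs and compared with time_after. A minimal user-space sketch of that style of comparison follows. `ts_after()` mirrors the idea behind the kernel's `time_after()` macro, while `busy_loop_timed_out()` and its parameter names are hypothetical and not taken from the patch.

```c
#include <stdbool.h>
#include <limits.h>

/*
 * Wraparound-safe "a is after b" for unsigned long timestamps.
 * The unsigned subtraction wraps modulo 2^N, and the result is then
 * read as a signed value. Like the kernel, this assumes the usual
 * two's complement conversion, and it is only meaningful when the two
 * timestamps are less than half the counter range apart.
 */
bool ts_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Hypothetical deadline check for a busy-poll loop, in microseconds. */
bool busy_loop_timed_out(unsigned long now_us, unsigned long end_us)
{
	return ts_after(now_us, end_us);
}
```

A plain `now_us > end_us` comparison gives the wrong answer when the deadline has wrapped past zero but the current time has not. The subtraction-based form handles that case, which is one reason a single deadline plus time_after suffices for this kind of check.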
    • L
      Merge tag 'for-linus-v3.11-rc1' of git://oss.sgi.com/xfs/xfs · da89bd21
      Linus Torvalds committed
      Pull xfs update from Ben Myers:
       "This includes several bugfixes, part of the work for project quotas
        and group quotas to be used together, performance improvements for
        inode creation/deletion, buffer readahead, and bulkstat,
        implementation of the inode change count, an inode create transaction,
        and the removal of a bunch of dead code.
      
        There are also some duplicate commits that you already have from the
        3.10-rc series.
      
         - part of the work to allow project quotas and group quotas to be
           used together
         - inode change count
         - inode create transaction
         - block queue plugging in buffer readahead and bulkstat
         - ordered log vector support
         - removal of dead code in and around xfs_sync_inode_grab,
           xfs_ialloc_get_rec, XFS_MOUNT_RETERR, XFS_ALLOCFREE_LOG_RES,
           XFS_DIROP_LOG_RES, xfs_chash, ctl_table, and
           xfs_growfs_data_private
         - don't keep silent if sunit/swidth can not be changed via mount
         - fix a leak of remote symlink blocks into the filesystem when xattrs
           are used on symlinks
         - fix for fiemap to return FIEMAP_EXTENT_UNKNOWN flag on delay extents
         - part of a fix for xfs_fsr
         - disable speculative preallocation with small files
         - performance improvements for inode creates and deletes"
      
      * tag 'for-linus-v3.11-rc1' of git://oss.sgi.com/xfs/xfs: (61 commits)
        xfs: Remove incore use of XFS_OQUOTA_ENFD and XFS_OQUOTA_CHKD
        xfs: Change xfs_dquot_acct to be a 2-dimensional array
        xfs: Code cleanup and removal of some typedef usage
        xfs: Replace macro XFS_DQ_TO_QIP with a function
        xfs: Replace macro XFS_DQUOT_TREE with a function
        xfs: Define a new function xfs_is_quota_inode()
        xfs: implement inode change count
        xfs: Use inode create transaction
        xfs: Inode create item recovery
        xfs: Inode create transaction reservations
        xfs: Inode create log items
        xfs: Introduce an ordered buffer item
        xfs: Introduce ordered log vector support
        xfs: xfs_ifree doesn't need to modify the inode buffer
        xfs: don't do IO when creating an new inode
        xfs: don't use speculative prealloc for small files
        xfs: plug directory buffer readahead
        xfs: add pluging for bulkstat readahead
        xfs: Remove dead function prototype xfs_sync_inode_grab()
        xfs: Remove the left function variable from xfs_ialloc_get_rec()
        ...
      da89bd21
    • J
      libceph: fix invalid unsigned->signed conversion for timespec encoding · 8b8cf891
      Josh Durgin committed
      __kernel_time_t is a long, which cannot hold a U32_MAX on 32-bit
      architectures.  Just drop this check as it has limited value.
      
      This fixes a crash like:
      
      [  957.905812] kernel BUG at /srv/autobuild-ceph/gitbuilder.git/build/include/linux/ceph/decode.h:164!
      [  957.914849] Internal error: Oops - BUG: 0 [#1] SMP ARM
      [  957.919978] Modules linked in: rbd libceph libcrc32c ipmi_devintf ipmi_si ipmi_msghandler nfsd nfs_acl auth_rpcgss nfs fscache lockd sunrpc
      [  957.932547] CPU: 1    Tainted: G        W     (3.9.0-ceph-19bb6a83-highbank #1)
      [  957.939881] PC is at ceph_osdc_build_request+0x8c/0x4f8 [libceph]
      [  957.945967] LR is at 0xec520904
      [  957.949103] pc : [<bf13e76c>]    lr : [<ec520904>]    psr: 20000153
      [  957.949103] sp : ec753df8  ip : 00000001  fp : ec53e100
      [  957.960571] r10: ebef25c0  r9 : ec5fa400  r8 : ecbcc000
      [  957.965788] r7 : 00000000  r6 : 00000000  r5 : ffffffff  r4 : 00000020
      [  957.972307] r3 : 51cc8143  r2 : ec520900  r1 : ec753e58  r0 : ec520908
      [  957.978827] Flags: nzCv  IRQs on  FIQs off  Mode SVC_32  ISA ARM  Segment user
      [  957.986039] Control: 10c5387d  Table: 2c59c04a  DAC: 00000015
      [  957.991777] Process rbd (pid: 2138, stack limit = 0xec752238)
      [  957.997514] Stack: (0xec753df8 to 0xec754000)
      [  958.001864] 3de0:                                                       00000001 00000001
      [  958.010032] 3e00: 00000001 bf139744 ecbcc000 ec55a0a0 00000024 00000000 ebef25c0 fffffffe
      [  958.018204] 3e20: ffffffff 00000000 00000000 00000001 ec5fa400 ebef25c0 ec53e100 bf166b68
      [  958.026377] 3e40: 00000000 0000220f fffffffe ffffffff ec753e58 bf13ff24 51cc8143 05b25ed2
      [  958.034548] 3e60: 00000001 00000000 00000000 bf1688d4 00000001 00000000 00000000 00000000
      [  958.042720] 3e80: 00000001 00000060 ec5fa400 ed53d200 ed439600 ed439300 00000001 00000060
      [  958.050888] 3ea0: ec5fa400 ed53d200 00000000 bf16a320 00000000 ec53e100 00000040 ec753eb8
      [  958.059059] 3ec0: ec51df00 ed53d7c0 ed53d200 ed53d7c0 00000000 ed53d7c0 ec5fa400 bf16ed70
      [  958.067230] 3ee0: 00000000 00000060 00000002 ed53d200 00000000 bf16acf4 ed53d7c0 ec752000
      [  958.075402] 3f00: ed980e50 e954f5d8 00000000 00000060 ed53d240 ed53d258 ec753f80 c04f44a8
      [  958.083574] 3f20: edb7910c ec664700 01ade920 c02e4c44 00000060 c016b3dc ec51de40 01adfb84
      [  958.091745] 3f40: 00000060 ec752000 ec753f80 ec752000 00000060 c0108444 00000007 ec51de48
      [  958.099914] 3f60: ed0eb8c0 00000000 00000000 ec51de40 01adfb84 00000001 00000060 c0108858
      [  958.108085] 3f80: 00000000 00000000 51cc8143 00000060 01adfb84 00000007 00000004 c000dd68
      [  958.116257] 3fa0: 00000000 c000dbc0 00000060 01adfb84 00000007 01adfb84 00000060 01adfb80
      [  958.124429] 3fc0: 00000060 01adfb84 00000007 00000004 beded1a8 00000000 01adf2f0 01ade920
      [  958.132599] 3fe0: 00000000 beded180 b6811324 b6811334 800f0010 00000007 2e7f5821 2e7f5c21
      [  958.140815] [<bf13e76c>] (ceph_osdc_build_request+0x8c/0x4f8 [libceph]) from [<bf166b68>] (rbd_osd_req_format_write+0x50/0x7c [rbd])
      [  958.152739] [<bf166b68>] (rbd_osd_req_format_write+0x50/0x7c [rbd]) from [<bf1688d4>] (rbd_dev_header_watch_sync+0xe0/0x204 [rbd])
      [  958.164486] [<bf1688d4>] (rbd_dev_header_watch_sync+0xe0/0x204 [rbd]) from [<bf16a320>] (rbd_dev_image_probe+0x23c/0x850 [rbd])
      [  958.175967] [<bf16a320>] (rbd_dev_image_probe+0x23c/0x850 [rbd]) from [<bf16acf4>] (rbd_add+0x3c0/0x918 [rbd])
      [  958.185975] [<bf16acf4>] (rbd_add+0x3c0/0x918 [rbd]) from [<c02e4c44>] (bus_attr_store+0x20/0x2c)
      [  958.194850] [<c02e4c44>] (bus_attr_store+0x20/0x2c) from [<c016b3dc>] (sysfs_write_file+0x168/0x198)
      [  958.203984] [<c016b3dc>] (sysfs_write_file+0x168/0x198) from [<c0108444>] (vfs_write+0x9c/0x170)
      [  958.212768] [<c0108444>] (vfs_write+0x9c/0x170) from [<c0108858>] (sys_write+0x3c/0x70)
      [  958.220768] [<c0108858>] (sys_write+0x3c/0x70) from [<c000dbc0>] (ret_fast_syscall+0x0/0x30)
      [  958.229199] Code: e59d1058 e5913000 e3530000 ba000114 (e7f001f2)
      
      CC: stable@vger.kernel.org  # 3.4+
      Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
      Reviewed-by: Sage Weil <sage@inktank.com>
      8b8cf891
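The conversion problem named in this commit can be reproduced in userspace. The sketch below assumes the removed check compared the seconds value against U32_MAX after converting the limit to the signed time type; that form, the type `model_time32_t`, and all function names are illustrative assumptions, not the libceph source. `int32_t` stands in for a 32-bit `long`.

```c
#include <assert.h>
#include <stdint.h>

/* Models a 32-bit architecture, where long (and therefore
 * __kernel_time_t) is 32 bits wide. */
typedef int32_t model_time32_t;

/* Hypothetical upper-bound check: the limit is converted to the signed
 * type first. On two's-complement targets UINT32_MAX becomes -1, so the
 * check fires for every non-negative timestamp. */
static int exceeds_u32_max_signed(model_time32_t sec)
{
    model_time32_t limit = (model_time32_t)UINT32_MAX;
    return sec > limit;
}

/* The same check with a 64-bit time type behaves as intended. */
static int exceeds_u32_max_wide(int64_t sec)
{
    return sec > (int64_t)UINT32_MAX;
}

/* Hypothetical encoder: stores non-negative seconds in an unsigned
 * 32-bit wire field. */
static uint32_t encode_seconds(model_time32_t sec)
{
    assert(sec >= 0);
    return (uint32_t)sec;
}
```

With a 32-bit signed time type, no value can exceed U32_MAX in the first place, which is consistent with the commit's remark that the check has limited value.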
    • L
      Merge tag 'nfs-for-3.11-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · be0c5d8c
      Linus Torvalds committed
      Pull NFS client updates from Trond Myklebust:
       "Feature highlights include:
         - Add basic client support for NFSv4.2
         - Add basic client support for Labeled NFS (selinux for NFSv4.2)
         - Fix the use of credentials in NFSv4.1 stateful operations, and add
           support for NFSv4.1 state protection.
      
        Bugfix highlights:
         - Fix another NFSv4 open state recovery race
         - Fix an NFSv4.1 back channel session regression
         - Various rpc_pipefs races
         - Fix another issue with NFSv3 auth negotiation
      
        Please note that Labeled NFS does require some additional support from
        the security subsystem.  The relevant changesets have all been
        reviewed and acked by James Morris."
      
      * tag 'nfs-for-3.11-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (54 commits)
        NFS: Set NFS_CS_MIGRATION for NFSv4 mounts
        NFSv4.1 Refactor nfs4_init_session and nfs4_init_channel_attrs
        nfs: have NFSv3 try server-specified auth flavors in turn
        nfs: have nfs_mount fake up a auth_flavs list when the server didn't provide it
        nfs: move server_authlist into nfs_try_mount_request
        nfs: refactor "need_mount" code out of nfs_try_mount
        SUNRPC: PipeFS MOUNT notification optimization for dying clients
        SUNRPC: split client creation routine into setup and registration
        SUNRPC: fix races on PipeFS UMOUNT notifications
        SUNRPC: fix races on PipeFS MOUNT notifications
        NFSv4.1 use pnfs_device maxcount for the objectlayout gdia_maxcount
        NFSv4.1 use pnfs_device maxcount for the blocklayout gdia_maxcount
        NFSv4.1 Fix gdia_maxcount calculation to fit in ca_maxresponsesize
        NFS: Improve legacy idmapping fallback
        NFSv4.1 end back channel session draining
        NFS: Apply v4.1 capabilities to v4.2
        NFSv4.1: Clean up layout segment comparison helper names
        NFSv4.1: layout segment comparison helpers should take 'const' parameters
        NFSv4: Move the DNS resolver into the NFSv4 module
        rpc_pipefs: only set rpc_dentry_ops if d_op isn't already set
        ...
      be0c5d8c
    • L
      Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs · 1f792dd1
      Linus Torvalds committed
      Pull ext3 fix and quota cleanup from Jan Kara:
       "A fix of ext3 error reporting from fsync and a quota cleanup"
      
      * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
        quota: Convert use of typedef ctl_table to struct ctl_table
        ext3: Fix fsync error handling after filesystem abort.
      1f792dd1
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · c75e2475
      Linus Torvalds committed
      Pull third set of VFS updates from Al Viro:
       "Misc stuff all over the place.  There will be one more pile in a
        couple of days"
      
      This is an "evil merge" that also uses the new d_count helper in
      fs/configfs/dir.c, missed by commit 84d08fa8 ("helper for reading
      ->d_count")
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        ncpfs: fix error return code in ncp_parse_options()
        locks: move file_lock_list to a set of percpu hlist_heads and convert file_lock_lock to an lglock
        seq_file: add seq_list_*_percpu helpers
        f2fs: fix readdir incorrectness
        mode_t whack-a-mole...
        lustre: kill the pointless wrapper
        helper for reading ->d_count
      c75e2475
    • D
      lib/scatterlist: error handling in __sg_alloc_table() · 27daabd9
      Dan Carpenter committed
      I was reviewing code which I suspected might allocate a zero-size SG
      table.  That would cause memory corruption.  Also, we can't return before
      doing the memset, or we could end up using uninitialized memory in the
      cleanup path.
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Cc: Akinobu Mita <akinobu.mita@gmail.com>
      Cc: Imre Deak <imre.deak@intel.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Maxim Levitsky <maximlevitsky@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      27daabd9
    • A
      scsi_debug: fix do_device_access() with wrap around range · a4517511
      Akinobu Mita committed
      do_device_access() is a function that abstracts copying SG list from/to
      ramdisk storage (fake_storep).
      
      It must deal with ranges exceeding the actual fake_storep size, because
      such ranges are valid if virtual_gb is set greater than zero, and they
      should be treated as if fake_storep were repeatedly mirrored up to the
      virtual size.
      
      Unfortunately, it can't deal with the range which wraps around the end of
      fake_storep.  A wrap around range is copied by two
      sg_copy_{from,to}_buffer() calls, but sg_copy_{from,to}_buffer() can't
      copy from/to in the middle of SG list, therefore the second call can't
      copy correctly.
      
      This fixes it by using sg_pcopy_{from,to}_buffer() that can copy from/to
      the middle of SG list.
      
      This also simplifies the assignment of sdb->resid in
      fill_from_dev_buffer().  Because fill_from_dev_buffer() is now called only
      once per command execution cycle, it is no longer necessary to take care
      to decrease sdb->resid when fill_from_dev_buffer() is called more than once.
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
      Cc: Douglas Gilbert <dgilbert@interlog.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Horia Geanta <horia.geanta@freescale.com>
      Cc: Imre Deak <imre.deak@intel.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a4517511
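The wrap-around case described in this commit can be modelled in userspace. This is a simplified sketch, not the scsi_debug code: `struct seg` stands in for a scatterlist entry, `model_pcopy_from_buffer()` imitates a copy into the list that starts a given number of bytes in, and `model_wrapped_read()` is a hypothetical helper. The point it shows is that the second of the two copies has to start in the middle of the segment list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for one scatterlist entry. */
struct seg {
    unsigned char *data;
    size_t len;
};

/* Copy buflen bytes from buf into the segment list, starting `skip`
 * bytes into the list. Returns the number of bytes copied. */
static size_t model_pcopy_from_buffer(struct seg *segs, size_t nsegs,
                                      const void *buf, size_t buflen,
                                      size_t skip)
{
    const unsigned char *in = buf;
    size_t copied = 0;

    for (size_t i = 0; i < nsegs && copied < buflen; i++) {
        size_t off, n;

        if (skip >= segs[i].len) {      /* segment lies entirely in the skipped part */
            skip -= segs[i].len;
            continue;
        }
        off = skip;                     /* partial skip inside this segment */
        skip = 0;
        n = segs[i].len - off;
        if (n > buflen - copied)
            n = buflen - copied;
        memcpy(segs[i].data + off, in + copied, n);
        copied += n;
    }
    return copied;
}

/* Read `len` bytes starting at `start` from a store that is treated as
 * repeating every store_len bytes. */
static size_t model_wrapped_read(struct seg *segs, size_t nsegs,
                                 const unsigned char *store, size_t store_len,
                                 size_t start, size_t len)
{
    size_t first, done;

    assert(store_len > 0 && len <= store_len);
    start %= store_len;
    first = store_len - start;          /* bytes available before the end of the store */
    if (first > len)
        first = len;
    done = model_pcopy_from_buffer(segs, nsegs, store + start, first, 0);
    if (len > first)                    /* wrapped part continues mid-list */
        done += model_pcopy_from_buffer(segs, nsegs, store, len - first, first);
    return done;
}
```

Without the `skip` argument, the second call would overwrite the start of the segment list instead of continuing where the first call stopped.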
    • A
      crypto: talitos: use sg_pcopy_to_buffer() · d0525723
      Akinobu Mita committed
      Use sg_pcopy_to_buffer(), which is better than the function previously
      used because it doesn't do kmap/kunmap for skipped pages.
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
      Cc: Douglas Gilbert <dgilbert@interlog.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Horia Geanta <horia.geanta@freescale.com>
      Cc: Imre Deak <imre.deak@intel.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d0525723
    • A
      lib/scatterlist: introduce sg_pcopy_from_buffer() and sg_pcopy_to_buffer() · df642cea
      Akinobu Mita committed
      The only difference between sg_pcopy_{from,to}_buffer() and
      sg_copy_{from,to}_buffer() is an additional argument that specifies the
      number of bytes to skip in the SG list before copying.
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
      Cc: Douglas Gilbert <dgilbert@interlog.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Horia Geanta <horia.geanta@freescale.com>
      Cc: Imre Deak <imre.deak@intel.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      df642cea
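The skip semantics described in this commit can be sketched with a userspace model. This is an illustration under assumptions, not the lib/scatterlist implementation: `struct seg` is a simplified stand-in for a scatterlist entry, and `model_pcopy_to_buffer()` is a hypothetical name whose parameters follow the description in the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for one scatterlist entry: a pointer and a length. */
struct seg {
    unsigned char *data;
    size_t len;
};

/* Copy up to buflen bytes from the segment list into buf, after skipping
 * the first `skip` bytes of the list. Returns the number of bytes copied. */
static size_t model_pcopy_to_buffer(const struct seg *segs, size_t nsegs,
                                    void *buf, size_t buflen, size_t skip)
{
    unsigned char *out = buf;
    size_t copied = 0;

    assert(segs != NULL && buf != NULL);
    for (size_t i = 0; i < nsegs && copied < buflen; i++) {
        size_t off, n;

        if (skip >= segs[i].len) {      /* whole segment is skipped, never touched */
            skip -= segs[i].len;
            continue;
        }
        off = skip;                     /* partial skip inside this segment */
        skip = 0;
        n = segs[i].len - off;
        if (n > buflen - copied)
            n = buflen - copied;
        memcpy(out + copied, segs[i].data + off, n);
        copied += n;
    }
    return copied;
}
```

Calling it with `skip` set to 0 gives the behaviour of the plain copy; a non-zero `skip` starts the copy partway through the list, and segments that fall entirely inside the skipped range are passed over without being read.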
    • A
      lib/scatterlist: factor out sg_miter_get_next_page() from sg_miter_next() · 11052004
      Akinobu Mita committed
      This patchset introduces sg_pcopy_from_buffer() and sg_pcopy_to_buffer(),
      which copy data between a linear buffer and an SG list.
      
      The only difference between sg_pcopy_{from,to}_buffer() and
      sg_copy_{from,to}_buffer() is an additional argument that specifies the
      number of bytes to skip in the SG list before copying.
      
      The main reason for introducing these functions is to fix a problem in
      the scsi_debug module.  There is also a local function in the
      crypto/talitos module that can be replaced by sg_pcopy_to_buffer().
      
      This patch:
      
      sg_miter_get_next_page() advances the page iterator to the next page
      if necessary, and will be used to implement the variants of
      sg_copy_{from,to}_buffer() later.
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Imre Deak <imre.deak@intel.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
      Cc: Douglas Gilbert <dgilbert@interlog.com>
      Cc: Horia Geanta <horia.geanta@freescale.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      11052004