1. 20 4月, 2013 2 次提交
  2. 17 4月, 2013 1 次提交
  3. 16 4月, 2013 1 次提交
    • V
      net: add dev_uc_sync_multiple() and dev_mc_sync_multiple() api · 4cd729b0
      Vlad Yasevich 提交于
      The current implementation of dev_uc_sync/unsync() assumes that there is
      a strict 1-to-1 relationship between the source and destination of the sync.
      In other words, once an address has been synced to a destination device, it
      will not be synced to any other device through the sync API.
      However, there are some virtual devices that aggreate a number of lower
      devices and need to sync addresses to all of them.  The current
      API falls short there.
      
      This patch introduces a new dev_uc_sync_multiple() api that can be called
      in the above circumstances and allows sync to work for every invocation.
      
      CC: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: NVlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4cd729b0
  4. 12 4月, 2013 2 次提交
    • M
      usbnet: handle link change · 4b49f58f
      Ming Lei 提交于
      The link change is detected via the interrupt pipe, and bulk
      pipes are responsible for transfering packets, so it is reasonable
      to stop bulk transfer after link is reported as off.
      
      Two adavantages may be obtained with stopping bulk transfer
      after link becomes off:
      
      - USB bus bandwidth is saved(USB bus is shared bus except for
      USB3.0), for example, lots of 'IN' token packets and 'NYET'
      handshake packets is transfered on 2.0 bus.
      
      - probabaly power might be saved for usb host controller since
      cancelling bulk transfer may disable the asynchronous schedule of
      host controller.
      
      With this patch, when link becomes off, about ~10% performance
      boost can be found on bulk transfer of anther usb device which
      is attached to same bus with the usbnet device, see below
      test on next-20130410:
      
      - read from usb mass storage(Sandisk Extreme USB 3.0) on pandaboard
      with below command after unplugging ethernet cable:
      
      	dd if=/dev/sda iflag=direct of=/dev/null bs=1M count=800
      
      - without the patch
      1, 838860800 bytes (839 MB) copied, 36.2216 s, 23.2 MB/s
      2, 838860800 bytes (839 MB) copied, 35.8368 s, 23.4 MB/s
      3, 838860800 bytes (839 MB) copied, 35.823 s, 23.4 MB/s
      4, 838860800 bytes (839 MB) copied, 35.937 s, 23.3 MB/s
      5, 838860800 bytes (839 MB) copied, 35.7365 s, 23.5 MB/s
      average: 23.6MB/s
      
      - with the patch
      1, 838860800 bytes (839 MB) copied, 32.3817 s, 25.9 MB/s
      2, 838860800 bytes (839 MB) copied, 31.7389 s, 26.4 MB/s
      3, 838860800 bytes (839 MB) copied, 32.438 s, 25.9 MB/s
      4, 838860800 bytes (839 MB) copied, 32.5492 s, 25.8 MB/s
      5, 838860800 bytes (839 MB) copied, 31.6178 s, 26.5 MB/s
      average: 26.1MB/s
      Signed-off-by: NMing Lei <ming.lei@canonical.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4b49f58f
    • M
      usbnet: introduce usbnet_link_change API · ac64995d
      Ming Lei 提交于
      This patch introduces the API of usbnet_link_change, so that
      usbnet can handle link change centrally, which may help to
      implement killing traffic URBs for saving USB bus bandwidth
      and host controller power.
      Signed-off-by: NMing Lei <ming.lei@canonical.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ac64995d
  5. 10 4月, 2013 1 次提交
    • D
      net: sctp: introduce uapi header for sctp · 1b866434
      Daniel Borkmann 提交于
      This patch introduces an UAPI header for the SCTP protocol,
      so that we can facilitate the maintenance and development of
      user land applications or libraries, in particular in terms
      of header synchronization.
      
      To not break compatibility, some fragments from lksctp-tools'
      netinet/sctp.h have been carefully included, while taking care
      that neither kernel nor user land breaks, so both compile fine
      with this change (for lksctp-tools I tested with the old
      netinet/sctp.h header and with a newly adapted one that includes
      the uapi sctp header). lksctp-tools smoke test run through
      successfully as well in both cases.
      Suggested-by: NNeil Horman <nhorman@tuxdriver.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1b866434
  6. 08 4月, 2013 1 次提交
  7. 06 4月, 2013 2 次提交
    • P
      netfilter: don't reset nf_trace in nf_reset() · 124dff01
      Patrick McHardy 提交于
      Commit 130549fe ("netfilter: reset nf_trace in nf_reset") added code
      to reset nf_trace in nf_reset(). This is wrong and unnecessary.
      
      nf_reset() is used in the following cases:
      
      - when passing packets up the the socket layer, at which point we want to
        release all netfilter references that might keep modules pinned while
        the packet is queued. nf_trace doesn't matter anymore at this point.
      
      - when encapsulating or decapsulating IPsec packets. We want to continue
        tracing these packets after IPsec processing.
      
      - when passing packets through virtual network devices. Only devices on
        that encapsulate in IPv4/v6 matter since otherwise nf_trace is not
        used anymore. Its not entirely clear whether those packets should
        be traced after that, however we've always done that.
      
      - when passing packets through virtual network devices that make the
        packet cross network namespace boundaries. This is the only cases
        where we clearly want to reset nf_trace and is also what the
        original patch intended to fix.
      
      Add a new function nf_reset_trace() and use it in dev_forward_skb() to
      fix this properly.
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      124dff01
    • P
      netfilter: remove unneeded variable proc_net_netfilter · 12202fa7
      Pablo Neira Ayuso 提交于
      Now that this supports net namespace for nflog and nfqueue,
      we can remove the global proc_net_netfilter which has no
      clients anymore.
      
      Based on patch from Gao feng.
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      12202fa7
  8. 05 4月, 2013 1 次提交
    • V
      net: count hw_addr syncs so that unsync works properly. · 4543fbef
      Vlad Yasevich 提交于
      A few drivers use dev_uc_sync/unsync to synchronize the
      address lists from master down to slave/lower devices.  In
      some cases (bond/team) a single address list is synched down
      to multiple devices.  At the time of unsync, we have a leak
      in these lower devices, because "synced" is treated as a
      boolean and the address will not be unsynced for anything after
      the first device/call.
      
      Treat "synced" as a count (same as refcount) and allow all
      unsync calls to work.
      Signed-off-by: NVlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4543fbef
  9. 03 4月, 2013 1 次提交
    • G
      of_net.h: Provide empty functions if OF_NET is not configured · 65b3841b
      Guenter Roeck 提交于
      of_get_mac_address() and of_get_phy_mode() are only provided if OF_NET
      is configured. While most callers check for the define, not all do, and those
      who do require #ifdef around the code. For those who don't, the missing check
      can result in errors such as
      
      arch/powerpc/sysdev/tsi108_dev.c:107:3: error: implicit declaration of
      	function 'of_get_mac_address' [-Werror=implicit-function-declaration]
      arch/powerpc/sysdev/mv64x60_dev.c:253:2: error: implicit declaration of
      	function 'of_get_mac_address' [-Werror=implicit-function-declaration]
      
      Provide empty functions if OF_NET is not configured.
      This is safe because all callers do check the return values.
      
      Cc: David Daney <david.daney@cavium.com>
      Signed-off-by: NGuenter Roeck <linux@roeck-us.net>
      Acked-by: NRob Herring <rob.herring@calxeda.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      65b3841b
  10. 02 4月, 2013 2 次提交
  11. 01 4月, 2013 1 次提交
    • P
      Revert "lockdep: check that no locks held at freeze time" · dbf520a9
      Paul Walmsley 提交于
      This reverts commit 6aa97070.
      
      Commit 6aa97070 ("lockdep: check that no locks held at freeze time")
      causes problems with NFS root filesystems.  The failures were noticed on
      OMAP2 and 3 boards during kernel init:
      
        [ BUG: swapper/0/1 still has locks held! ]
        3.9.0-rc3-00344-ga937536b #1 Not tainted
        -------------------------------------
        1 lock held by swapper/0/1:
         #0:  (&type->s_umount_key#13/1){+.+.+.}, at: [<c011e84c>] sget+0x248/0x574
      
        stack backtrace:
          rpc_wait_bit_killable
          __wait_on_bit
          out_of_line_wait_on_bit
          __rpc_execute
          rpc_run_task
          rpc_call_sync
          nfs_proc_get_root
          nfs_get_root
          nfs_fs_mount_common
          nfs_try_mount
          nfs_fs_mount
          mount_fs
          vfs_kern_mount
          do_mount
          sys_mount
          do_mount_root
          mount_root
          prepare_namespace
          kernel_init_freeable
          kernel_init
      
      Although the rootfs mounts, the system is unstable.  Here's a transcript
      from a PM test:
      
        http://www.pwsan.com/omap/testlogs/test_v3.9-rc3/20130317194234/pm/37xxevm/37xxevm_log.txt
      
      Here's what the test log should look like:
      
        http://www.pwsan.com/omap/testlogs/test_v3.8/20130218214403/pm/37xxevm/37xxevm_log.txt
      
      Mailing list discussion is here:
      
        http://lkml.org/lkml/2013/3/4/221
      
      Deal with this for v3.9 by reverting the problem commit, until folks can
      figure out the right long-term course of action.
      Signed-off-by: NPaul Walmsley <paul@pwsan.com>
      Cc: Mandeep Singh Baines <msb@chromium.org>
      Cc: Jeff Layton <jlayton@redhat.com>
      Cc: Shawn Guo <shawn.guo@linaro.org>
      Cc: <maciej.rutecki@gmail.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Ben Chan <benchan@chromium.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      dbf520a9
  12. 31 3月, 2013 1 次提交
    • E
      net: reorder some fields of net_device · 4c3d5e7b
      Eric Dumazet 提交于
      As time passed, some fields were added in net_device, and not
      at sensible offsets.
      
      Lets reorder some fields to reduce number of cache lines in RX path.
      Fields not used in data path should be moved out of this critical cache
      line.
      
      In particular, move broadcast[] to the end of the rx section,
      as it is less used, and ethernet uses only the beginning of the 32bytes
      field.
      
      Before patch :
      
      offsetof(struct net_device,dev_addr)=0x258
      offsetof(struct net_device,rx_handler)=0x2b8
      offsetof(struct net_device,ingress_queue)=0x2c8
      offsetof(struct net_device,broadcast)=0x278
      
      After :
      
      offsetof(struct net_device,dev_addr)=0x280
      offsetof(struct net_device,rx_handler)=0x298
      offsetof(struct net_device,ingress_queue)=0x2a8
      offsetof(struct net_device,broadcast)=0x2b0
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4c3d5e7b
  13. 30 3月, 2013 4 次提交
  14. 29 3月, 2013 1 次提交
  15. 28 3月, 2013 7 次提交
  16. 27 3月, 2013 4 次提交
    • E
      userns: Restrict when proc and sysfs can be mounted · 87a8ebd6
      Eric W. Biederman 提交于
      Only allow unprivileged mounts of proc and sysfs if they are already
      mounted when the user namespace is created.
      
      proc and sysfs are interesting because they have content that is
      per namespace, and so fresh mounts are needed when new namespaces
      are created while at the same time proc and sysfs have content that
      is shared between every instance.
      
      Respect the policy of who may see the shared content of proc and sysfs
      by only allowing new mounts if there was an existing mount at the time
      the user namespace was created.
      
      In practice there are only two interesting cases: proc and sysfs are
      mounted at their usual places, proc and sysfs are not mounted at all
      (some form of mount namespace jail).
      
      Cc: stable@vger.kernel.org
      Acked-by: NSerge Hallyn <serge.hallyn@canonical.com>
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      87a8ebd6
    • E
      vfs: Add a mount flag to lock read only bind mounts · 90563b19
      Eric W. Biederman 提交于
      When a read-only bind mount is copied from mount namespace in a higher
      privileged user namespace to a mount namespace in a lesser privileged
      user namespace, it should not be possible to remove the the read-only
      restriction.
      
      Add a MNT_LOCK_READONLY mount flag to indicate that a mount must
      remain read-only.
      
      CC: stable@vger.kernel.org
      Acked-by: NSerge Hallyn <serge.hallyn@canonical.com>
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      90563b19
    • E
      userns: Don't allow creation if the user is chrooted · 3151527e
      Eric W. Biederman 提交于
      Guarantee that the policy of which files may be access that is
      established by setting the root directory will not be violated
      by user namespaces by verifying that the root directory points
      to the root of the mount namespace at the time of user namespace
      creation.
      
      Changing the root is a privileged operation, and as a matter of policy
      it serves to limit unprivileged processes to files below the current
      root directory.
      
      For reasons of simplicity and comprehensibility the privilege to
      change the root directory is gated solely on the CAP_SYS_CHROOT
      capability in the user namespace.  Therefore when creating a user
      namespace we must ensure that the policy of which files may be access
      can not be violated by changing the root directory.
      
      Anyone who runs a processes in a chroot and would like to use user
      namespace can setup the same view of filesystems with a mount
      namespace instead.  With this result that this is not a practical
      limitation for using user namespaces.
      
      Cc: stable@vger.kernel.org
      Acked-by: NSerge Hallyn <serge.hallyn@canonical.com>
      Reported-by: NAndy Lutomirski <luto@amacapital.net>
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      3151527e
    • Y
      firewire net, ipv4 arp: Extend hardware address and remove driver-level packet inspection. · 6752c8db
      YOSHIFUJI Hideaki / 吉藤英明 提交于
      Inspection of upper layer protocol is considered harmful, especially
      if it is about ARP or other stateful upper layer protocol; driver
      cannot (and should not) have full state of them.
      
      IPv4 over Firewire module used to inspect ARP (both in sending path
      and in receiving path), and record peer's GUID, max packet size, max
      speed and fifo address.  This patch removes such inspection by extending
      our "hardware address" definition to include other information as well:
      max packet size, max speed and fifo.  By doing this, The neighbour
      module in networking subsystem can cache them.
      
      Note: As we have started ignoring sspd and max_rec in ARP/NDP, those
            information will not be used in the driver when sending.
      
      When a packet is being sent, the IP layer fills our pseudo header with
      the extended "hardware address", including GUID and fifo.  The driver
      can look-up node-id (the real but rather volatile low-level address)
      by GUID, and then the module can send the packet to the wire using
      parameters provided in the extendedn hardware address.
      
      This approach is realistic because IP over IEEE1394 (RFC2734) and IPv6
      over IEEE1394 (RFC3146) share same "hardware address" format
      in their address resolution protocols.
      
      Here, extended "hardware address" is defined as follows:
      
      union fwnet_hwaddr {
      	u8 u[16];
      	struct {
      		__be64 uniq_id;		/* EUI-64			*/
      		u8 max_rec;		/* max packet size		*/
      		u8 sspd;		/* max speed			*/
      		__be16 fifo_hi;		/* hi 16bits of FIFO addr	*/
      		__be32 fifo_lo;		/* lo 32bits of FIFO addr	*/
      	} __packed uc;
      };
      
      Note that Hardware address is declared as union, so that we can map full
      IP address into this, when implementing MCAP (Multicast Cannel Allocation
      Protocol) for IPv6, but IP and ARP subsystem do not need to know this
      format in detail.
      
      One difference between original ARP (RFC826) and 1394 ARP (RFC2734)
      is that 1394 ARP Request/Reply do not contain the target hardware address
      field (aka ar$tha).  This difference is handled in the ARP subsystem.
      
      CC: Stephan Gatzka <stephan.gatzka@gmail.com>
      Signed-off-by: NYOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6752c8db
  17. 26 3月, 2013 4 次提交
  18. 25 3月, 2013 1 次提交
  19. 23 3月, 2013 3 次提交
    • R
      mm: zone_end_pfn is too small · f9228b20
      Russ Anderson 提交于
      Booting with 32 TBytes memory hits BUG at mm/page_alloc.c:552! (output
      below).
      
      The key hint is "page 4294967296 outside zone".
      4294967296 = 0x100000000 (bit 32 is set).
      
      The problem is in include/linux/mmzone.h:
      
        530 static inline unsigned zone_end_pfn(const struct zone *zone)
        531 {
        532         return zone->zone_start_pfn + zone->spanned_pages;
        533 }
      
      zone_end_pfn is "unsigned" (32 bits).  Changing it to "unsigned long"
      (64 bits) fixes the problem.
      
      zone_end_pfn() was added recently in commit 108bcc96 ("mm: add & use
      zone_end_pfn() and zone_spans_pfn()")
      
      Output from the failure.
      
        No AGP bridge found
        page 4294967296 outside zone [ 4294967296 - 4327469056 ]
        ------------[ cut here ]------------
        kernel BUG at mm/page_alloc.c:552!
        invalid opcode: 0000 [#1] SMP
        Modules linked in:
        CPU 0
        Pid: 0, comm: swapper Not tainted 3.9.0-rc2.dtp+ #10
        RIP: free_one_page+0x382/0x430
        Process swapper (pid: 0, threadinfo ffffffff81942000, task ffffffff81955420)
        Call Trace:
          __free_pages_ok+0x96/0xb0
          __free_pages+0x25/0x50
          __free_pages_bootmem+0x8a/0x8c
          __free_memory_core+0xea/0x131
          free_low_memory_core_early+0x4a/0x98
          free_all_bootmem+0x45/0x47
          mem_init+0x7b/0x14c
          start_kernel+0x216/0x433
          x86_64_start_reservations+0x2a/0x2c
          x86_64_start_kernel+0x144/0x153
        Code: 89 f1 ba 01 00 00 00 31 f6 d3 e2 4c 89 ef e8 66 a4 01 00 e9 2c fe ff ff 0f 0b eb fe 0f 0b 66 66 2e 0f 1f 84 00 00 00 00 00 eb f3 <0f> 0b eb fe 0f 0b 0f 1f 84 00 00 00 00 00 eb f6 0f 0b eb fe 49
      Signed-off-by: NRuss Anderson <rja@sgi.com>
      Reported-by: NGeorge Beshers <gbeshers@sgi.com>
      Acked-by: NHedi Berriche <hedi@sgi.com>
      Cc: Cody P Schafer <cody@linux.vnet.ibm.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f9228b20
    • F
      printk: Provide a wake_up_klogd() off-case · dc72c32e
      Frederic Weisbecker 提交于
      wake_up_klogd() is useless when CONFIG_PRINTK=n because neither printk()
      nor printk_sched() are in use and there are actually no waiter on
      log_wait waitqueue.  It should be a stub in this case for users like
      bust_spinlocks().
      
      Otherwise this results in this warning when CONFIG_PRINTK=n and
      CONFIG_IRQ_WORK=n:
      
      	kernel/built-in.o In function `wake_up_klogd':
      	(.text.wake_up_klogd+0xb4): undefined reference to `irq_work_queue'
      
      To fix this, provide an off-case for wake_up_klogd() when
      CONFIG_PRINTK=n.
      
      There is much more from console_unlock() and other console related code
      in printk.c that should be moved under CONFIG_PRINTK.  But for now,
      focus on a minimal fix as we passed the merged window already.
      
      [akpm@linux-foundation.org: include printk.h in bust_spinlocks.c]
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Reported-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      dc72c32e
    • J
      irq_work.h: fix warning when CONFIG_IRQ_WORK=n · fe8d5261
      James Hogan 提交于
      A randconfig caught repeated compiler warnings when CONFIG_IRQ_WORK=n
      due to the definition of a non-inline static function in
      <linux/irq_work.h>:
      
        include/linux/irq_work.h +40 : warning: 'irq_work_needs_cpu' defined but not used
      
      Make it inline to supress the warning.  This is caused commit
      00b42959 ("irq_work: Don't stop the tick with pending works") merged
      in v3.9-rc1.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      fe8d5261