1. 23 Oct, 2015 2 commits
  2. 22 Oct, 2015 12 commits
    • D
      net: ipv6: Dont add RT6_LOOKUP_F_IFACE flag if saddr set · d46a9d67
      David Ahern committed
      741a11d9 ("net: ipv6: Add RT6_LOOKUP_F_IFACE flag if oif is set")
      adds the RT6_LOOKUP_F_IFACE flag to make device index mismatch fatal if
      oif is given. Hajime reported that this change breaks the Mobile IPv6
      use case that wants to force the message through one interface yet use
      the source address from another interface. Handle this case by only
      adding the flag if oif is set and saddr is not set.
      
      Fixes: 741a11d9 ("net: ipv6: Add RT6_LOOKUP_F_IFACE flag if oif is set")
      Cc: Hajime Tazaki <thehajime@gmail.com>
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d46a9d67
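The rule in this commit message (add the strict-interface flag only when an output interface is given and no source address is set) can be sketched in plain userspace C. This is an illustration under assumptions, not the kernel patch: `struct flow6_sketch`, `lookup_flags()`, and the flag value `SKETCH_LOOKUP_F_IFACE` are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for RT6_LOOKUP_F_IFACE; the value is illustrative. */
#define SKETCH_LOOKUP_F_IFACE 0x1

/* Hypothetical, simplified flow key. */
struct flow6_sketch {
	int oif;                 /* 0 means no output interface was requested */
	unsigned char saddr[16]; /* all zeroes means no source address is set */
};

/* True when the source address is the unspecified (all-zero) address. */
static bool saddr_is_any(const struct flow6_sketch *fl)
{
	static const unsigned char zero[16];

	return memcmp(fl->saddr, zero, sizeof(zero)) == 0;
}

/* Request a strict interface match only if oif is set and saddr is not. */
static int lookup_flags(const struct flow6_sketch *fl)
{
	int flags = 0;

	assert(fl != NULL);
	if (fl->oif && saddr_is_any(fl))
		flags |= SKETCH_LOOKUP_F_IFACE;
	return flags;
}
```

With this shape, a Mobile IPv6 style caller that sets both an output interface and a source address from another interface gets no strict-interface flag, so a device index mismatch is not fatal for it.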
    • J
      VSOCK: sock_put wasn't safe to call in interrupt context · 4ef7ea91
      Jorgen Hansen committed
      In the vsock vmci_transport driver, sock_put wasn't safe to call
      in interrupt context, since that may call the vsock destructor
      which in turn calls several functions that should only be called
      from process context. This change defers the calling of these
      functions to a worker thread. All these functions deallocate
      resources related to the transport itself.
      
      Furthermore, an unused callback was removed to simplify the
      cleanup.
      
      Multiple customers have been hitting this issue when using
      VMware tools on vSphere 2015.
      
      Also added a version to the vmci transport module (starting from
      1.0.2.0-k since up until now it appears that this module was
      sharing version with vsock that is currently at 1.0.1.0-k).
      Reviewed-by: Aditya Asarwade <asarwade@vmware.com>
      Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
      Signed-off-by: Jorgen Hansen <jhansen@vmware.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4ef7ea91
    • D
      netlink: fix locking around NETLINK_LIST_MEMBERSHIPS · 47191d65
      David Herrmann committed
      Currently, NETLINK_LIST_MEMBERSHIPS grabs the netlink table while copying
      the membership state to user-space. However, grabbing the netlink table is
      effectively a write_lock_irq(), and as such we should not be triggering
      page-faults in the critical section.
      
      This can be easily reproduced by the following snippet:
          int s = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
          void *p = mmap(0, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANON, -1, 0);
          int r = getsockopt(s, 0x10e, 9, p, (void*)((char*)p + 4092));
      
      This should work just fine, but currently triggers EFAULT and a possible
      WARN_ON below handle_mm_fault().
      
      Fix this by reducing locking of NETLINK_LIST_MEMBERSHIPS to a read-side
      lock. The write-lock was overkill in the first place, and the read-lock
      allows page-faults just fine.
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      47191d65
    • J
      openvswitch: Serialize nested ct actions if provided · e754ec69
      Joe Stringer committed
      If userspace provides a ct action with no nested mark or label, then the
      storage for these fields is zeroed. Later when actions are requested,
      such zeroed fields are serialized even though userspace didn't
      originally specify them. Fix the behaviour by ensuring that no action is
      serialized in this case, and reject actions where userspace attempts to
      set these fields with mask=0. This should make netlink marshalling
      consistent across deserialization/reserialization.
      Reported-by: Jarno Rajahalme <jrajahalme@nicira.com>
      Signed-off-by: Joe Stringer <joestringer@nicira.com>
      Acked-by: Pravin B Shelar <pshelar@nicira.com>
      Acked-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e754ec69
    • J
      openvswitch: Mark connections new when not confirmed. · 4f0909ee
      Joe Stringer committed
      New, related connections are marked as such as part of ovs_ct_lookup(),
      but they are not marked as "new" if the commit flag is used. Make this
      consistent by setting the "new" flag whenever !nf_ct_is_confirmed(ct).
      Reported-by: Jarno Rajahalme <jrajahalme@nicira.com>
      Signed-off-by: Joe Stringer <joestringer@nicira.com>
      Acked-by: Thomas Graf <tgraf@suug.ch>
      Acked-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4f0909ee
    • J
      openvswitch: Reject ct_state masks for unknown bits · 9e384715
      Joe Stringer committed
      Currently, 0-bits are generated in ct_state where the bit position is
      undefined, and matches are accepted on these bit-positions. If userspace
      requests to match the 0-value for this bit then it may expect only a
      subset of traffic to match this value, whereas currently all packets
      will have this bit set to 0. Fix this by rejecting such masks.
      Signed-off-by: Joe Stringer <joestringer@nicira.com>
      Acked-by: Pravin B Shelar <pshelar@nicira.com>
      Acked-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9e384715
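Rejecting masks that cover undefined bit positions comes down to a single bitwise test. The sketch below is a userspace illustration only: the bit layout, the `SKETCH_` names, and both helper functions are hypothetical and do not reflect the actual openvswitch definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical ct_state bit layout, for illustration only. */
#define SKETCH_CT_STATE_NEW         0x01u
#define SKETCH_CT_STATE_ESTABLISHED 0x02u
#define SKETCH_CT_STATE_RELATED     0x04u
#define SKETCH_CT_STATE_SUPPORTED \
	(SKETCH_CT_STATE_NEW | SKETCH_CT_STATE_ESTABLISHED | SKETCH_CT_STATE_RELATED)

/* Reject any mask that tries to match on a bit with no defined meaning. */
static int ct_state_mask_validate(uint32_t mask)
{
	return (mask & ~SKETCH_CT_STATE_SUPPORTED) ? -EINVAL : 0;
}

/* Masked comparison; only meaningful for masks that passed validation. */
static int ct_state_matches(uint32_t pkt_state, uint32_t key, uint32_t mask)
{
	assert(ct_state_mask_validate(mask) == 0);
	return (pkt_state & mask) == (key & mask);
}
```

Validating at flow-install time means a match on an undefined bit fails loudly instead of silently matching every packet.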
    • R
      tcp: remove improper preemption check in tcp_xmit_probe_skb() · e2e8009f
      Renato Westphal committed
      Commit e520af48 introduced the following bug when setting the
      TCP_REPAIR sockoption:
      
      [ 2860.657036] BUG: using __this_cpu_add() in preemptible [00000000] code: daemon/12164
      [ 2860.657045] caller is __this_cpu_preempt_check+0x13/0x20
      [ 2860.657049] CPU: 1 PID: 12164 Comm: daemon Not tainted 4.2.3 #1
      [ 2860.657051] Hardware name: Dell Inc. PowerEdge R210 II/0JP7TR, BIOS 2.0.5 03/13/2012
      [ 2860.657054]  ffffffff81c7f071 ffff880231e9fdf8 ffffffff8185d765 0000000000000002
      [ 2860.657058]  0000000000000001 ffff880231e9fe28 ffffffff8146ed91 ffff880231e9fe18
      [ 2860.657062]  ffffffff81cd1a5d ffff88023534f200 ffff8800b9811000 ffff880231e9fe38
      [ 2860.657065] Call Trace:
      [ 2860.657072]  [<ffffffff8185d765>] dump_stack+0x4f/0x7b
      [ 2860.657075]  [<ffffffff8146ed91>] check_preemption_disabled+0xe1/0xf0
      [ 2860.657078]  [<ffffffff8146edd3>] __this_cpu_preempt_check+0x13/0x20
      [ 2860.657082]  [<ffffffff817e0bc7>] tcp_xmit_probe_skb+0xc7/0x100
      [ 2860.657085]  [<ffffffff817e1e2d>] tcp_send_window_probe+0x2d/0x30
      [ 2860.657089]  [<ffffffff817d1d8c>] do_tcp_setsockopt.isra.29+0x74c/0x830
      [ 2860.657093]  [<ffffffff817d1e9c>] tcp_setsockopt+0x2c/0x30
      [ 2860.657097]  [<ffffffff81767b74>] sock_common_setsockopt+0x14/0x20
      [ 2860.657100]  [<ffffffff817669e1>] SyS_setsockopt+0x71/0xc0
      [ 2860.657104]  [<ffffffff81865172>] entry_SYSCALL_64_fastpath+0x16/0x75
      
      Since tcp_xmit_probe_skb() can be called from process context, use
      NET_INC_STATS() instead of NET_INC_STATS_BH().
      
      Fixes: e520af48 ("tcp: add TCPWinProbe and TCPKeepAlive SNMP counters")
      Signed-off-by: Renato Westphal <renatow@taghos.com.br>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e2e8009f
    • J
      tipc: conditionally expand buffer headroom over udp tunnel · e5356794
      Jon Paul Maloy committed
      In commit d999297c ("tipc: reduce locking scope during packet reception")
      we altered the packet retransmission function. Since then, when
      retransmitting packets, we create a clone of the original buffer
      using __pskb_copy(skb, MIN_H_SIZE), where MIN_H_SIZE is the size of
      the area we want to have copied, but also the smallest possible TIPC
      packet size. The value of MIN_H_SIZE is 24.
      
      Unfortunately, __pskb_copy() also has the effect that the headroom
      of the cloned buffer takes the size MIN_H_SIZE. This is too small
      for carrying the packet over the UDP tunnel bearer, which requires
      a minimum headroom of 28 bytes. A change to just use pskb_copy()
      lets the clone inherit the original headroom of 80 bytes, but also
      assumes that the copied data area is of at least that size, something
      that is not always the case. So that is not a viable solution.
      
      We now fix this by adding a check for sufficient headroom in the
      transmit function of udp_media.c, and expanding it when necessary.
      
      Fixes: commit d999297c ("tipc: reduce locking scope during packet reception")
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e5356794
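The "check for sufficient headroom and expand when necessary" approach can be illustrated with a toy packet buffer in userspace C. Everything here (`struct pkt_sketch`, `pkt_init()`, `pkt_ensure_headroom()`) is hypothetical; the kernel works on `struct sk_buff` with its own helpers, which this sketch does not use.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy packet: free space (headroom) followed by the data. */
struct pkt_sketch {
	unsigned char *head; /* start of the allocation */
	size_t headroom;     /* free bytes in front of the data */
	size_t len;          /* number of data bytes */
};

/* Create a packet with the given headroom and a copy of the data. */
static int pkt_init(struct pkt_sketch *p, size_t headroom,
		    const void *data, size_t len)
{
	p->head = malloc(headroom + len);
	if (!p->head)
		return -1;
	p->headroom = headroom;
	p->len = len;
	memcpy(p->head + headroom, data, len);
	return 0;
}

/* Grow the headroom to at least 'needed' bytes, reallocating if required. */
static int pkt_ensure_headroom(struct pkt_sketch *p, size_t needed)
{
	unsigned char *bigger;

	assert(p != NULL && p->head != NULL);
	if (p->headroom >= needed)
		return 0; /* already enough room, nothing to do */

	bigger = malloc(needed + p->len);
	if (!bigger)
		return -1;
	memcpy(bigger + needed, p->head + p->headroom, p->len);
	free(p->head);
	p->head = bigger;
	p->headroom = needed;
	return 0;
}
```

Using the numbers from the commit message, a clone created with 24 bytes of headroom would be expanded to the 28 bytes the UDP bearer needs, while buffers that already have enough headroom take the early return and pay no copy cost.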
    • J
      tipc: allow non-linear first fragment buffer · 45c8b7b1
      Jon Paul Maloy committed
      The current code for message reassembly is erroneously assuming that
      the first arriving fragment buffer is always linear, and then goes
      ahead resetting the fragment list of that buffer in anticipation of
      more arriving fragments.
      
      However, if the buffer already happens to be non-linear, we will
      inadvertently drop the already attached fragment list, and later
      on trigger a BUG() in __pskb_pull_tail().
      
      We see this happen when running fragmented TIPC multicast across UDP,
      something made possible since
      commit d0f91938 ("tipc: add ip/udp media type")
      
      We fix this by not resetting the fragment list when the buffer is
      non-linear, and by initializing our private fragment list tail pointer
      to the tail of the existing fragment list.
      
      Fixes: commit d0f91938 ("tipc: add ip/udp media type")
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      45c8b7b1
    • J
      openvswitch: Allocate memory for ovs internal device stats. · 1241365f
      James Morse committed
      "openvswitch: Remove vport stats" removed the per-vport statistics, in
      order to use the netdev's statistics fields.
      "openvswitch: Fix ovs_vport_get_stats()" fixed the export of these stats
      to user-space, by using the provided netdev_ops to collate them - but ovs
      internal devices still use an unallocated dev->tstats field to count
      packets, which are no longer exported by this api.
      
      Allocate the dev->tstats field for ovs internal devices, and wire up
      ndo_get_stats64 with the original implementation of
      ovs_vport_get_stats().
      
      On its own, "openvswitch: Fix ovs_vport_get_stats()" fixes the OOPs,
      unmasking a full-on panic on arm64:
      
      =============%<==============
      [<ffffffbffc00ce4c>] internal_dev_recv+0xa8/0x170 [openvswitch]
      [<ffffffbffc0008b4>] do_output.isra.31+0x60/0x19c [openvswitch]
      [<ffffffbffc000bf8>] do_execute_actions+0x208/0x11c0 [openvswitch]
      [<ffffffbffc001c78>] ovs_execute_actions+0xc8/0x238 [openvswitch]
      [<ffffffbffc003dfc>] ovs_packet_cmd_execute+0x21c/0x288 [openvswitch]
      [<ffffffc0005e8c5c>] genl_family_rcv_msg+0x1b0/0x310
      [<ffffffc0005e8e60>] genl_rcv_msg+0xa4/0xe4
      [<ffffffc0005e7ddc>] netlink_rcv_skb+0xb0/0xdc
      [<ffffffc0005e8a94>] genl_rcv+0x38/0x50
      [<ffffffc0005e76c0>] netlink_unicast+0x164/0x210
      [<ffffffc0005e7b70>] netlink_sendmsg+0x304/0x368
      [<ffffffc0005a21c0>] sock_sendmsg+0x30/0x4c
      [SNIP]
      Kernel panic - not syncing: Fatal exception in interrupt
      =============%<==============
      
      Fixes: 8c876639 ("openvswitch: Remove vport stats.")
      Signed-off-by: James Morse <james.morse@arm.com>
      Acked-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1241365f
    • D
      net: Really fix vti6 with oif in dst lookups · f1900fb5
      David Ahern committed
      6e28b000 ("net: Fix vti use case with oif in dst lookups for IPv6")
      is missing the checks on FLOWI_FLAG_SKIP_NH_OIF. Add them.
      
      Fixes: 42a7b32b ("xfrm: Add oif to dst lookups")
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f1900fb5
    • J
      tipc: extend broadcast link window size · 53387c4e
      Jon Paul Maloy committed
      The default fixed broadcast window size is currently set to 20 packets.
      This is a very low value, set at a time when we were still testing on
      10 Mb/s hubs, and a change to it is long overdue.
      
      Commit 7845989c ("net: tipc: fix stall during bclink wakeup procedure")
      revealed a problem with this low value. For messages of importance LOW,
      the backlog queue limit is calculated as 30 packets, while a single
      maximum-sized message of 66000 bytes, carried across a 1500 MTU
      network, consists of 46 packets.
      
      This leads to the following scenario (among others leading to the same
      situation):
      
      1: Msg 1 of 46 packets is sent. 20 packets go to the transmit queue, 26
         packets to the backlog queue.
      2: An attempt is made to send Msg 2 of 46 packets, but it is rejected
         because there is no more space in the backlog queue at this level.
         The sender is added to the wakeup queue with a "pending packets
         chain size" number of 46.
      3: Some packets in the transmit queue are acked and released. We try to
         wake up the sender, but the pending size of 46 is bigger than the LOW
         wakeup limit of 30, so this doesn't happen.
      4: Subsequent acks release all the remaining buffers. Each time we test
         for the wakeup criteria and find that 46 still is larger than 30,
         even after both the transmit and the backlog queues are empty.
      5: The sender is never woken up and given a chance to send its message.
         It is stuck.
      
      We could now loosen the wakeup criteria (used by link_prepare_wakeup())
      to become equal to the send criteria (used by tipc_link_xmit()), i.e.,
      by ignoring the "pending packets chain size" value altogether, or we can
      just increase the queue limits so that the criteria can be satisfied
      anyway. There are good reasons (potentially multiple waiting senders) to
      not opt for the former solution, so we choose the latter one.
      
      This commit fixes the problem by giving the broadcast link window a
      default value of 50 packets. We also introduce a new minimum link
      window size BCLINK_MIN_WIN of 32, which is enough to always avoid the
      described situation. Finally, in order to not break any existing users
      which may set the window explicitly, we enforce that the window is set
      to the new minimum value in case the user is trying to set it to
      anything lower.
      
      Fixes: 7845989c ("net: tipc: fix stall during bclink wakeup procedure")
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      53387c4e
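The arithmetic behind the stall, and the clamping of a user-supplied window to the new minimum, can be reproduced in a few lines of C. This is a simplified model, not the TIPC implementation: the helper names are hypothetical, the wakeup test is reduced to a single comparison, and the 1460-byte per-packet payload (a 1500 byte MTU minus an assumed 40-byte fragment header) is an assumption chosen because it reproduces the 46-packet figure quoted above.

```c
#include <assert.h>

/* Values quoted in the commit message. */
#define SKETCH_BCLINK_MIN_WIN 32
#define SKETCH_BCLINK_DEF_WIN 50

/* Number of packets needed to carry a message (ceiling division). */
static unsigned int pkts_for_msg(unsigned int msg_bytes,
				 unsigned int payload_per_pkt)
{
	assert(payload_per_pkt > 0);
	return (msg_bytes + payload_per_pkt - 1) / payload_per_pkt;
}

/* Simplified wakeup test: wake only if the pending chain fits the limit. */
static int can_wake(unsigned int pending_pkts, unsigned int wakeup_limit)
{
	return pending_pkts <= wakeup_limit;
}

/* Never let a user-supplied window drop below the minimum. */
static unsigned int clamp_bclink_window(unsigned int requested)
{
	return requested < SKETCH_BCLINK_MIN_WIN ? SKETCH_BCLINK_MIN_WIN
						 : requested;
}
```

Under these assumptions `pkts_for_msg(66000, 1460)` gives 46, and `can_wake(46, 30)` is false no matter how many acks arrive, which is the stuck sender from the scenario. `clamp_bclink_window(20)` yields 32, so an explicit low setting is raised rather than rejected.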
  3. 21 Oct, 2015 2 commits
  4. 19 Oct, 2015 3 commits
    • S
      xfrm: Fix pmtu discovery for local generated packets. · ca064bd8
      Steffen Klassert committed
      Commit 044a832a ("xfrm: Fix local error reporting crash
      with interfamily tunnels") moved the setting of skb->protocol
      behind the last access of the inner mode family to fix an
      interfamily crash. Unfortunately now skb->protocol might not
      be set at all, so we fail to dispatch to the inner address family.
      As a result, the local error handler is not called and the
      mtu value is not reported back to userspace.
      
      We fix this by setting skb->protocol on message size errors
      before we call xfrm_local_error.
      
      Fixes: 044a832a ("xfrm: Fix local error reporting crash with interfamily tunnels")
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      ca064bd8
    • J
      openvswitch: Scrub skb between namespaces · 740dbc28
      Joe Stringer committed
      If OVS receives a packet from another namespace, then the packet should
      be scrubbed. However, people have already begun to rely on the behaviour
      that skb->mark is preserved across namespaces, so retain this one field.
      
      This is mainly to address information leakage between namespaces when
      using OVS internal ports, but by placing it in ovs_vport_receive() it is
      more generally applicable, meaning it should not be overlooked if other
      port types are allowed to be moved into namespaces in future.
      Signed-off-by: Joe Stringer <joestringer@nicira.com>
      Acked-by: Pravin B Shelar <pshelar@nicira.com>
      Acked-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      740dbc28
    • A
      netlink: Trim skb to alloc size to avoid MSG_TRUNC · db65a3aa
      Arad, Ronen committed
      netlink_dump() allocates skb based on the calculated min_dump_alloc or
      a per socket max_recvmsg_len.
      min_alloc_size is the maximum space required for the attributes of any
      single netdev, as calculated by rtnl_calcit().
      max_recvmsg_len tracks the user provided buffer to netlink_recvmsg.
      It is capped at 16KiB.
      The intention is to avoid small allocations and to minimize the number
      of calls required to obtain dump information for all net devices.
      
      netlink_dump packs as many small messages as can fit within an skb
      that was sized for the largest single netdev information. The actual
      space available within an skb is larger than what is requested. It can
      be much larger, up to nearly 2x when the allocation is aligned to the
      next power of 2.
      
      Allowing netlink_dump to use all the space available within the
      allocated skb increases the buffer size a user has to provide to avoid
      truncation (i.e. the MSG_TRUNC flag being set).
      
      It was observed that with many VLANs configured on at least one netdev,
      a buffer of nearly 64KiB was necessary to avoid the "Message truncated"
      error in "ip link" or "bridge [-c[ompressvlans]] vlan show" when
      min_alloc_size was only a little over 32KiB.
      
      This patch trims the skb to the allocated size in order to allow the
      user to avoid truncation with a more reasonable buffer size.
      Signed-off-by: Ronen Arad <ronen.arad@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      db65a3aa
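The "nearly 2x" effect can be modeled with a power-of-two allocator. This is a rough sketch with hypothetical helper names: it ignores skb bookkeeping overhead and only shows how much extra space appears when a request slightly above 32KiB is rounded up, and how much would have to be hidden to get back to the requested size.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next power of two, as a power-of-two allocator would. */
static size_t roundup_pow2(size_t n)
{
	size_t p = 1;

	assert(n > 0 && n <= ((size_t)1 << 30));
	while (p < n)
		p <<= 1;
	return p;
}

/* Space a dump could fill if it used everything the allocator handed out. */
static size_t usable_untrimmed(size_t requested)
{
	return roundup_pow2(requested);
}

/* Extra bytes to hide so the dump fills no more than was requested. */
static size_t trim_excess(size_t requested)
{
	return roundup_pow2(requested) - requested;
}
```

For a request of 33280 bytes (32KiB plus 512), the untrimmed buffer is 65536 bytes, which is why a reader sized for a little over 32KiB would see truncation; trimming 32256 bytes brings the usable space back to what was asked for.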
  5. 17 Oct, 2015 1 commit
    • N
      netfilter: ipset: Fix sleeping memory allocation in atomic context · 00db674b
      Nikolay Borisov committed
      Commit 00590fdd introduced RCU locking in list type and in
      doing so introduced a memory allocation in list_set_add, which
      is done in an atomic context, due to the fact that ipset rcu
      list modifications are serialised with a spin lock. The reason
      why we can't use a mutex is that in addition to modifying the
      list with ipset commands, it's also being modified when a
      particular ipset rule timeout expires aka garbage collection.
      This gc is triggered from set_cleanup_entries, which in turn
      is invoked from a timer thus requiring the lock to be bh-safe.
      
      Concretely the following call chain can lead to "sleeping function
      called in atomic context" splat:
      call_ad -> list_set_uadt -> list_set_uadd -> kzalloc(, GFP_KERNEL).
      This is a problem because GFP_KERNEL allows initiating direct reclaim,
      so the allocation path can potentially sleep.
      
      To fix the issue, change the allocation type to GFP_ATOMIC, to
      correctly reflect that it is occurring in an atomic context.
      
      Fixes: 00590fdd ("netfilter: ipset: Introduce RCU locking in list type")
      Signed-off-by: Nikolay Borisov <kernel@kyup.com>
      Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      00db674b
  6. 16 Oct, 2015 9 commits
  7. 15 Oct, 2015 2 commits
  8. 14 Oct, 2015 1 commit
    • J
      tipc: eliminate risk of stalled link synchronization · 0f8b8e28
      Jon Paul Maloy committed
      In commit 6e498158 ("tipc: move link synch and failover to link aggregation level")
      we introduced a new mechanism for performing link failover and
      synchronization. We have now detected a bug in this mechanism.
      
      During link synchronization we use the arrival of any packet on
      the tunnel link to trigger a check for whether it has reached the
      synchronization point or not. This has turned out to be too
      permissive, since it may cause an arriving non-last SYNCH packet to
      end the synch state, just to see the next SYNCH packet initiate a
      new synch state with a new, higher synch point. This is not fatal,
      but should be avoided, because it may significantly extend the
      synchronization period, while at the same time we are not allowed
      to send NACKs if packets are lost. In the worst case, a low-traffic
      user may see its traffic stall until a LINK_PROTOCOL state message
      triggers the link to leave synchronization state.
      
      At the same time, LINK_PROTOCOL packets which happen to have a (non-
      valid) sequence number lower than the tunnel link's rcv_nxt value will
      be consistently dropped, and will never be able to resolve the situation
      described above.
      
      We fix this by exempting LINK_PROTOCOL packets from the sequence number
      check, as they should be. We also reduce (but don't completely
      eliminate) the risk of entering multiple synchronization states by only
      allowing the (logically) first SYNCH packet to initiate a synchronization
      state. This works independently of actual packet arrival order.
      
      Fixes: commit 6e498158 ("tipc: move link synch and failover to link aggregation level")
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0f8b8e28
  9. 13 Oct, 2015 4 commits
  10. 12 Oct, 2015 2 commits
    • C
      svcrdma: Fix NFS server crash triggered by 1MB NFS WRITE · 3be7f328
      Chuck Lever committed
      Now that the NFS server advertises a maximum payload size of 1MB
      for RPC/RDMA again, it crashes in svc_process_common() when an NFS
      client sends a 1MB NFS WRITE on an NFS/RDMA mount.
      
      The server has set up a 259 element array of struct page pointers
      in rq_pages[] for each incoming request. The last element of the
      array is NULL.
      
      When an incoming request has been completely received,
      rdma_read_complete() attempts to set the starting page of the
      incoming page vector:
      
        rqstp->rq_arg.pages = &rqstp->rq_pages[head->hdr_count];
      
      and the page to use for the reply:
      
        rqstp->rq_respages = &rqstp->rq_arg.pages[page_no];
      
      But the value of page_no has already accounted for head->hdr_count.
      Thus rq_respages now points past the end of the incoming pages.
      
      For NFS WRITE operations smaller than the maximum, this is harmless.
      But when the NFS WRITE operation is as large as the server's max
      payload size, rq_respages now points at the last entry in rq_pages,
      which is NULL.
      
      Fixes: cc9a903d ('svcrdma: Change maximum server payload . . .')
      BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=270
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Reviewed-by: Sagi Grimberg <sagig@dev.mellanox.co.il>
      Reviewed-by: Steve Wise <swise@opengridcomputing.com>
      Reviewed-by: Shirley Ma <shirley.ma@oracle.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      3be7f328
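The double counting of head->hdr_count is easy to see in a userspace model of the page array. The sketch below is an illustration under assumptions: `struct rqst_sketch` and its helpers are hypothetical, the example values hdr_count = 1 and page_no = 257 are made up to hit the last slot, and indexing from the base array is shown as one way to avoid the double count, not as a copy of the actual patch.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_RQ_PAGES 259 /* array size quoted in the commit message */

/* Hypothetical, simplified request: a page array whose last slot is NULL. */
struct rqst_sketch {
	void *rq_pages[SKETCH_RQ_PAGES];
	void **arg_pages;   /* start of the incoming page vector */
	void **rq_respages; /* page to use for the reply */
};

/* Point every slot except the last at some backing storage. */
static void rqst_init(struct rqst_sketch *r, char *backing)
{
	size_t i;

	for (i = 0; i < SKETCH_RQ_PAGES - 1; i++)
		r->rq_pages[i] = &backing[i];
	r->rq_pages[SKETCH_RQ_PAGES - 1] = NULL;
}

/* Buggy: page_no already includes hdr_count, so it is counted twice. */
static void set_pages_buggy(struct rqst_sketch *r, size_t hdr_count,
			    size_t page_no)
{
	assert(hdr_count + page_no < SKETCH_RQ_PAGES);
	r->arg_pages = &r->rq_pages[hdr_count];
	r->rq_respages = &r->arg_pages[page_no];
}

/* Index from the base array so hdr_count is counted only once. */
static void set_pages_fixed(struct rqst_sketch *r, size_t hdr_count,
			    size_t page_no)
{
	assert(hdr_count <= page_no && page_no < SKETCH_RQ_PAGES);
	r->arg_pages = &r->rq_pages[hdr_count];
	r->rq_respages = &r->rq_pages[page_no];
}
```

With the example values, the buggy variant lands on slot 258, the NULL entry, while the base-indexed variant lands on slot 257, which still holds a valid page.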
    • L
      netfilter: ipt_rpfilter: remove the nh_scope test in rpfilter_lookup_reverse · cc4998fe
      lucien committed
      The --accept-local option applies when res.type == RTN_LOCAL, i.e. for
      routes that should come from the local table. For those routes,
      however, fib_create_info() sets the fib_info's nh->nh_scope to
      RT_SCOPE_NOWHERE (> RT_SCOPE_HOST):
      
      	if (cfg->fc_scope == RT_SCOPE_HOST) {
      		struct fib_nh *nh = fi->fib_nh;
      
      		/* Local address is added. */
      		if (nhs != 1 || nh->nh_gw)
      			goto err_inval;
      		nh->nh_scope = RT_SCOPE_NOWHERE;   <===
      		nh->nh_dev = dev_get_by_index(net, fi->fib_nh->nh_oif);
      		err = -ENODEV;
      		if (!nh->nh_dev)
      			goto failure;
      
      but in our rpfilter_lookup_reverse():
      
      	if (dev_match || flags & XT_RPFILTER_LOOSE)
      		return FIB_RES_NH(res).nh_scope <= RT_SCOPE_HOST;
      
      If nh->nh_scope > RT_SCOPE_HOST, this test fails, so packets that rely
      on the --accept-local option will never pass.
      
      The test seems bogus and can be removed to fix this issue:
      
      	if (dev_match || flags & XT_RPFILTER_LOOSE)
      		return FIB_RES_NH(res).nh_scope <= RT_SCOPE_HOST;
      
      ipv6 does not have this issue.
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      cc4998fe
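Why the scope test can never succeed for local routes follows directly from the numeric scope values. The sketch below is a userspace model: the scope constants mirror the values of RT_SCOPE_HOST (254) and RT_SCOPE_NOWHERE (255) in the kernel's rtnetlink header, while the two functions are hypothetical simplifications of the check before and after removing the test.

```c
#include <assert.h>
#include <stdbool.h>

/* Same numeric values as RT_SCOPE_HOST and RT_SCOPE_NOWHERE. */
#define SKETCH_RT_SCOPE_HOST    254
#define SKETCH_RT_SCOPE_NOWHERE 255

/* Simplified model of the check with the nh_scope test in place. */
static bool rpfilter_with_scope_test(bool dev_match, bool loose, int nh_scope)
{
	assert(nh_scope >= 0 && nh_scope <= 255);
	if (dev_match || loose)
		return nh_scope <= SKETCH_RT_SCOPE_HOST;
	return false;
}

/* Simplified model of the check once the nh_scope test is removed. */
static bool rpfilter_without_scope_test(bool dev_match, bool loose)
{
	return dev_match || loose;
}
```

A local route carries scope 255, so `rpfilter_with_scope_test(true, false, SKETCH_RT_SCOPE_NOWHERE)` is false even though the device matches; without the scope test the same lookup is accepted.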
  11. 11 Oct, 2015 2 commits