Merge branch 'ipfrags'

Jesper Dangaard Brouer says:

====================
This patchset is V2, with some trivial code fixes, which were noticed
by DaveM.

It is still a partial respin of my fragmentation optimization patches:

 http://thread.gmane.org/gmane.linux.network/250914

This is not the complete patchset from the gmane link above.  In this
patchset, I primarily focus on adjusting cachelines for better SMP/NUMA
performance.  Once this patchset has been agreed upon, I will
continue and respin the rest of my patches.

This time around, I have created a frag DoS generator, via the tool
trafgen (http://netsniff-ng.org/), to create a stable DoS scenario
(no longer relying on frame dropping due to disabled flow-control).

Two 10G interfaces are under test, and use Ethernet flow-control.  A
third interface is used for generating the DoS attack (this interface
is also 10G, but it does not need to be, as 500Kpps DoS is enough).

Test types summary (netperf):
 Test-20G64K     == 2x10G with 65K fragments
 Test-20G3F      == 2x10G with 3x fragments (3*1472 bytes)
 Test-20G64K+DoS == Same as 20G64K with frag DoS
 Test-20G3F+DoS  == Same as 20G3F with frag DoS

Patch list:
 Patch-01 - net: cacheline adjust struct netns_frags for better frag performance
 Patch-02 - net: cacheline adjust struct inet_frags for better frag performance
 Patch-03 - net: cacheline adjust struct inet_frag_queue
 Patch-04 - net: frag helper functions for mem limit tracking
 Patch-05 - net: use lib/percpu_counter API for fragmentation mem accounting
 Patch-06 - net: frag, move LRU list maintenance outside of rwlock

Performance table summary:

 Test-type:  Test-20G64K      Test-20G3F  20G64K+DoS  20G3F+DoS
 ----------  -----------      ----------  ----------  ---------
 net-next:   15114.5 Mbit/s     8954.21     2444.28    3918.01 Mbit/s
 Patch-01:   16075.8 Mbit/s     8976.18     2621.49    4072.79 Mbit/s
 Patch-02:   17806.9 Mbit/s     9280.32     2478.62    4274.59 Mbit/s
 Patch-03:   17317.4 Mbit/s     9308.62     2546.05    4336.59 Mbit/s
 Patch-04:   17635.9 Mbit/s     9256.16     2535.25    4327.63 Mbit/s
 Patch-05:   18027.0 Mbit/s     9918.99     2492.62    3621.68 Mbit/s
 Patch-06:   18486.7 Mbit/s    10723.20     3657.85    4560.64 Mbit/s

I cannot explain the under-DoS regression that patch-05/percpu_counter
introduces.  But patch-06/LRU-lock corrects the situation again.

Below is a testlab setup description, with links to the trafgen DoS
packet config used.

Testlab
=======

Server setup
------------

The machine acting as a server:
 - 2x CPU (E5-2630)
   - Thus a NUMA arch/machine
 - 4x 10Gbit/s ports
   - NICs 2x Intel Dual port 82599 based (driver ixgbe)

Setup:
 - Interfaces use Ethernet flow control
 - Flush all iptables
 - Remove all iptables related modules
 - Kill irqbalance
 - Pin each 10G NIC port to a *single* CPU each

Pinning can easily be done by command hacks::

 for x in /proc/irq/*/eth8*/../smp_affinity_list ; do echo 1 > $x; done
 for x in /proc/irq/*/eth9*/../smp_affinity_list ; do echo 3 > $x; done
 for x in /proc/irq/*/eth31*/../smp_affinity_list; do echo 6 > $x; done
 for x in /proc/irq/*/eth32*/../smp_affinity_list; do echo 8 > $x; done

Notice NUMA setting: The CPU to NIC tying is carefully chosen
according to the NUMA node setup.  Thus, NICs connected to a PCI-e slot
that is connected to a physical CPU socket are tied together.

Choosing only a single CPU per NIC (port) is just to ease provoking
and debugging this performance issue.  (In real setups, you can
choose more CPUs, just remember the NUMA node in the equation).

Tools
-----

Netperf is used, with option -T to ensure CPU binding.

The netserver processes are NAPI pinned::

 numactl -m0 -c0  netserver
 numactl -m1 -c 1 netserver -p 1337

I now have a frag DoS generator, created via the tool:
 trafgen (see: http://netsniff-ng.org/)

Trafgen packet config file:
 http://people.netfilter.org/hawk/frag_work/trafgen/frag_packet03_small_frag.txf

Notice, I'm using features of trafgen, recently developed by Daniel
Borkmann, thus you need the latest git tree to use my trafgen packet
config.
 git://github.com/borkmann/netsniff-ng.git

Command line:
 trafgen --dev eth51 --conf frag_packet03_small_frag.txf -V -k 100 --cpus 2

Test types
----------

Test(20G64K) UDP-64K 2x 10Gbit/s with no DoS traffic:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 export SIZE=$((65507)); export TIME=$((20)); export LOG=/tmp/netperf.log ;\
 netperf -p 1337 -H 192.168.31.2 -T7,7 -t UDP_STREAM -l $TIME -- -m $SIZE >> ${LOG}.31 &\
 netperf -H 192.168.81.2 -T2,2 -t UDP_STREAM -l $TIME -- -m $SIZE >> ${LOG}.81 && \
 wait $! && tail -n3 ${LOG}.* && \
 tail -n3 ${LOG}.{31,81} | awk 'BEGIN{sum=0;} /212992 / {sum+=$4; print " +"$4} /==/ {print " file:"$2} END{print "sum:"sum" Mbit/s"}'

Test(20G3F) UDP-3xfrags 2x 10Gbit/s with no DoS traffic:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 export SIZE=$((3*1472)); export TIME=$((20)); export LOG=/tmp/netperf.log ;\
 netperf -p 1337 -H 192.168.31.2 -T7,7 -t UDP_STREAM -l $TIME -- -m $SIZE >> ${LOG}.31 &\
 netperf -H 192.168.81.2 -T2,2 -t UDP_STREAM -l $TIME -- -m $SIZE >> ${LOG}.81 && \
 wait $! && tail -n3 ${LOG}.* && \
 tail -n3 ${LOG}.{31,81} | awk 'BEGIN{sum=0;} /212992 / {sum+=$4; print " +"$4} /==/ {print " file:"$2} END{print "sum:"sum" Mbit/s"}'

Awk script for summing results:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 tail -n3 ${LOG}.{31,81} | awk 'BEGIN{sum=0;} /212992 / {sum+=$4; print " +"$4} /==/ {print " file:"$2} END{print "sum:"sum" Mbit/s"}'
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
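
The cacheline adjustments described for Patch-01 to Patch-03 can be illustrated outside the kernel. What follows is a minimal userspace sketch, not code from the patchset: it assumes 64-byte cache lines, uses C11 `alignas` as a stand-in for the kernel's `____cacheline_aligned_in_smp`, and every `demo_` name is hypothetical. It only shows how alignment moves a frequently written counter onto a different cache line than a neighbouring field.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Assumption: 64-byte cache lines (common on x86-64). The kernel derives
 * the real value from SMP_CACHE_BYTES; this constant is only for the demo. */
#define DEMO_CACHELINE 64

/* Hypothetical stand-in for a frequently written memory counter. */
struct demo_counter {
	long count;
};

/* Without alignment: the hot counter sits right next to nqueues,
 * so writers of either field bounce the same cache line. */
struct demo_frags_packed {
	int nqueues;
	struct demo_counter mem;
	int timeout;
};

/* With alignment: the counter starts on its own cache line boundary,
 * similar in spirit to the "mem" member in struct netns_frags. */
struct demo_frags_aligned {
	int nqueues;
	alignas(DEMO_CACHELINE) struct demo_counter mem;
	int timeout;
};

/* Index of the cache line (relative to the struct start) holding an offset. */
static size_t demo_cacheline_of(size_t offset)
{
	return offset / DEMO_CACHELINE;
}
```

Comparing `demo_cacheline_of(offsetof(struct demo_frags_packed, mem))` with the same expression for `struct demo_frags_aligned` shows the counter moving from the line shared with `nqueues` to a line of its own. The price is padding: the aligned struct is larger, which is why the patches only align the members that are actually contended.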

5a1dc317 · David S. Miller · 656a05c8 · 3ef0eb0d · 5a1dc317 · 5a1dc317
6 changed files
--- a/include/net/inet_frag.h
+++ b/include/net/inet_frag.h
 #ifndef __NET_FRAG_H__
 #define __NET_FRAG_H__

+#include <linux/percpu_counter.h>
+
 struct netns_frags {
 	int			nqueues;
-	atomic_t		mem;
 	struct list_head	lru_list;
+	spinlock_t		lru_lock;
+
+	/* The percpu_counter "mem" needs to be cacheline aligned.
+	 *  mem.count must not share cacheline with other writers
+	 */
+	struct percpu_counter   mem ____cacheline_aligned_in_smp;

 	/* sysctls */
 	int			timeout;
@@ -13,12 +20,11 @@ struct netns_frags {
 };

 struct inet_frag_queue {
-	struct hlist_node	list;
-	struct netns_frags	*net;
-	struct list_head	lru_list;   /* lru list member */
 	spinlock_t		lock;
-	atomic_t		refcnt;
 	struct timer_list	timer;      /* when will this queue expire? */
+	struct list_head	lru_list;   /* lru list member */
+	struct hlist_node	list;
+	atomic_t		refcnt;
 	struct sk_buff		*fragments; /* list of received fragments */
 	struct sk_buff		*fragments_tail;
 	ktime_t			stamp;
@@ -31,24 +37,29 @@ struct inet_frag_queue {
 #define INET_FRAG_LAST_IN	1

 	u16			max_size;
+
+	struct netns_frags	*net;
 };

 #define INETFRAGS_HASHSZ		64

 struct inet_frags {
 	struct hlist_head	hash[INETFRAGS_HASHSZ];
-	rwlock_t		lock;
-	u32			rnd;
-	int			qsize;
+	/* This rwlock is a global lock (separate per IPv4, IPv6 and
+	 * netfilter). Important to keep this on a separate cacheline.
+	 */
+	rwlock_t		lock ____cacheline_aligned_in_smp;
 	int			secret_interval;
 	struct timer_list	secret_timer;
+	u32			rnd;
+	int			qsize;

 	unsigned int		(*hashfn)(struct inet_frag_queue *);
+	bool			(*match)(struct inet_frag_queue *q, void *arg);
 	void			(*constructor)(struct inet_frag_queue *q,
 						void *arg);
 	void			(*destructor)(struct inet_frag_queue *);
 	void			(*skb_free)(struct sk_buff *);
-	bool			(*match)(struct inet_frag_queue *q, void *arg);
 	void			(*frag_expire)(unsigned long data);
 };

@@ -72,4 +83,59 @@ static inline void inet_frag_put(struct inet_frag_queue *q, struct inet_frags *f
 		inet_frag_destroy(q, f, NULL);
 }

+/* Memory Tracking Functions. */
+
+/* The default percpu_counter batch size is not big enough to scale to
+ * fragmentation mem acct sizes.
+ * The mem size of a 64K fragment is approx:
+ *  (44 fragments * 2944 truesize) + frag_queue struct(200) = 129736 bytes
+ */
+static unsigned int frag_percpu_counter_batch = 130000;
+
+static inline int frag_mem_limit(struct netns_frags *nf)
+{
+	return percpu_counter_read(&nf->mem);
+}
+
+static inline void sub_frag_mem_limit(struct inet_frag_queue *q, int i)
+{
+	__percpu_counter_add(&q->net->mem, -i, frag_percpu_counter_batch);
+}
+
+static inline void add_frag_mem_limit(struct inet_frag_queue *q, int i)
+{
+	__percpu_counter_add(&q->net->mem, i, frag_percpu_counter_batch);
+}
+
+static inline void init_frag_mem_limit(struct netns_frags *nf)
+{
+	percpu_counter_init(&nf->mem, 0);
+}
+
+static inline int sum_frag_mem_limit(struct netns_frags *nf)
+{
+	return percpu_counter_sum_positive(&nf->mem);
+}
+
+static inline void inet_frag_lru_move(struct inet_frag_queue *q)
+{
+	spin_lock(&q->net->lru_lock);
+	list_move_tail(&q->lru_list, &q->net->lru_list);
+	spin_unlock(&q->net->lru_lock);
+}
+
+static inline void inet_frag_lru_del(struct inet_frag_queue *q)
+{
+	spin_lock(&q->net->lru_lock);
+	list_del(&q->lru_list);
+	spin_unlock(&q->net->lru_lock);
+}
+
+static inline void inet_frag_lru_add(struct netns_frags *nf,
+				     struct inet_frag_queue *q)
+{
+	spin_lock(&nf->lru_lock);
+	list_add_tail(&q->lru_list, &nf->lru_list);
+	spin_unlock(&nf->lru_lock);
+}
 #endif
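
Editorial note on the memory tracking helpers above: the difference between `frag_mem_limit()` (a cheap, approximate read) and `sum_frag_mem_limit()` (an exact sum) follows from how a batched per-CPU counter works. The following is a simplified, single-threaded userspace model, not the kernel's `lib/percpu_counter` implementation: the `demo_` names and the fixed CPU count are assumptions, and the locking and preemption handling of the real code are left out.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NR_CPUS 4 /* assumption: fixed CPU count for the model */

/* Simplified model of a batched per-CPU counter. */
struct demo_percpu_counter {
	int64_t count;               /* shared value, only updated in batches */
	int32_t local[DEMO_NR_CPUS]; /* per-CPU deltas not yet folded in */
};

/* Add "amount" on behalf of "cpu". The shared count is only touched once
 * the local delta reaches +/- batch, which keeps writers off the shared
 * cache line most of the time. */
static void demo_counter_add(struct demo_percpu_counter *c, int cpu,
			     int32_t amount, int32_t batch)
{
	int64_t delta = (int64_t)c->local[cpu] + amount;

	if (delta >= batch || delta <= -batch) {
		c->count += delta; /* the real code takes a lock here */
		c->local[cpu] = 0;
	} else {
		c->local[cpu] = (int32_t)delta;
	}
}

/* Cheap read: ignores the per-CPU deltas, so it can lag behind. */
static int64_t demo_counter_read(const struct demo_percpu_counter *c)
{
	return c->count;
}

/* Exact read: folds in every per-CPU delta, at the cost of visiting all CPUs. */
static int64_t demo_counter_sum(const struct demo_percpu_counter *c)
{
	int64_t sum = c->count;

	for (int cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
		sum += c->local[cpu];
	return sum;
}
```

In this model the cheap read can be off by up to roughly `batch` per CPU. That is the trade-off behind a batch of 130000: fewer updates of the shared count on the fast path, and a less precise value for the threshold check in the evictor, while the exported statistics use the exact sum.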
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -288,7 +288,7 @@ static inline int ip6_frag_nqueues(struct net *net)

 static inline int ip6_frag_mem(struct net *net)
 {
-	return atomic_read(&net->ipv6.frags.mem);
+	return sum_frag_mem_limit(&net->ipv6.frags);
 }
 #endif


--- a/net/ipv4/inet_fragment.c
+++ b/net/ipv4/inet_fragment.c
@@ -73,8 +73,9 @@ EXPORT_SYMBOL(inet_frags_init);
 void inet_frags_init_net(struct netns_frags *nf)
 {
 	nf->nqueues = 0;
-	atomic_set(&nf->mem, 0);
+	init_frag_mem_limit(nf);
 	INIT_LIST_HEAD(&nf->lru_list);
+	spin_lock_init(&nf->lru_lock);
 }
 EXPORT_SYMBOL(inet_frags_init_net);

@@ -91,6 +92,8 @@ void inet_frags_exit_net(struct netns_frags *nf, struct inet_frags *f)
 	local_bh_disable();
 	inet_frag_evictor(nf, f, true);
 	local_bh_enable();
+
+	percpu_counter_destroy(&nf->mem);
 }
 EXPORT_SYMBOL(inet_frags_exit_net);

@@ -98,9 +101,9 @@ static inline void fq_unlink(struct inet_frag_queue *fq, struct inet_frags *f)
 {
 	write_lock(&f->lock);
 	hlist_del(&fq->list);
-	list_del(&fq->lru_list);
 	fq->net->nqueues--;
 	write_unlock(&f->lock);
+	inet_frag_lru_del(fq);
 }

 void inet_frag_kill(struct inet_frag_queue *fq, struct inet_frags *f)
@@ -117,12 +120,8 @@ void inet_frag_kill(struct inet_frag_queue *fq, struct inet_frags *f)
 EXPORT_SYMBOL(inet_frag_kill);

 static inline void frag_kfree_skb(struct netns_frags *nf, struct inet_frags *f,
-		struct sk_buff *skb, int *work)
+		struct sk_buff *skb)
 {
-	if (work)
-		*work -= skb->truesize;
-
-	atomic_sub(skb->truesize, &nf->mem);
 	if (f->skb_free)
 		f->skb_free(skb);
 	kfree_skb(skb);
@@ -133,6 +132,7 @@ void inet_frag_destroy(struct inet_frag_queue *q, struct inet_frags *f,
 {
 	struct sk_buff *fp;
 	struct netns_frags *nf;
+	unsigned int sum, sum_truesize = 0;

 	WARN_ON(!(q->last_in & INET_FRAG_COMPLETE));
 	WARN_ON(del_timer(&q->timer) != 0);
@@ -143,13 +143,14 @@ void inet_frag_destroy(struct inet_frag_queue *q, struct inet_frags *f,
 	while (fp) {
 		struct sk_buff *xp = fp->next;

-		frag_kfree_skb(nf, f, fp, work);
+		sum_truesize += fp->truesize;
+		frag_kfree_skb(nf, f, fp);
 		fp = xp;
 	}
-
+	sum = sum_truesize + f->qsize;
 	if (work)
-		*work -= f->qsize;
-	atomic_sub(f->qsize, &nf->mem);
+		*work -= sum;
+	sub_frag_mem_limit(q, sum);

 	if (f->destructor)
 		f->destructor(q);
@@ -164,22 +165,23 @@ int inet_frag_evictor(struct netns_frags *nf, struct inet_frags *f, bool force)
 	int work, evicted = 0;

 	if (!force) {
-		if (atomic_read(&nf->mem) <= nf->high_thresh)
+		if (frag_mem_limit(nf) <= nf->high_thresh)
 			return 0;
 	}

-	work = atomic_read(&nf->mem) - nf->low_thresh;
+	work = frag_mem_limit(nf) - nf->low_thresh;
 	while (work > 0) {
-		read_lock(&f->lock);
+		spin_lock(&nf->lru_lock);
+
 		if (list_empty(&nf->lru_list)) {
-			read_unlock(&f->lock);
+			spin_unlock(&nf->lru_lock);
 			break;
 		}

 		q = list_first_entry(&nf->lru_list,
 				struct inet_frag_queue, lru_list);
 		atomic_inc(&q->refcnt);
-		read_unlock(&f->lock);
+		spin_unlock(&nf->lru_lock);

 		spin_lock(&q->lock);
 		if (!(q->last_in & INET_FRAG_COMPLETE))
@@ -233,9 +235,9 @@ static struct inet_frag_queue *inet_frag_intern(struct netns_frags *nf,

 	atomic_inc(&qp->refcnt);
 	hlist_add_head(&qp->list, &f->hash[hash]);
-	list_add_tail(&qp->lru_list, &nf->lru_list);
 	nf->nqueues++;
 	write_unlock(&f->lock);
+	inet_frag_lru_add(nf, qp);
 	return qp;
 }

@@ -250,7 +252,8 @@ static struct inet_frag_queue *inet_frag_alloc(struct netns_frags *nf,

 	q->net = nf;
 	f->constructor(q, arg);
-	atomic_add(f->qsize, &nf->mem);
+	add_frag_mem_limit(q, f->qsize);
+
 	setup_timer(&q->timer, f->frag_expire, (unsigned long)q);
 	spin_lock_init(&q->lock);
 	atomic_set(&q->refcnt, 1);

--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -122,7 +122,7 @@ int ip_frag_nqueues(struct net *net)

 int ip_frag_mem(struct net *net)
 {
-	return atomic_read(&net->ipv4.frags.mem);
+	return sum_frag_mem_limit(&net->ipv4.frags);
 }

 static int ip_frag_reasm(struct ipq *qp, struct sk_buff *prev,
@@ -161,13 +161,6 @@ static bool ip4_frag_match(struct inet_frag_queue *q, void *a)
 		qp->user == arg->user;
 }

-/* Memory Tracking Functions. */
-static void frag_kfree_skb(struct netns_frags *nf, struct sk_buff *skb)
-{
-	atomic_sub(skb->truesize, &nf->mem);
-	kfree_skb(skb);
-}
-
 static void ip4_frag_init(struct inet_frag_queue *q, void *a)
 {
 	struct ipq *qp = container_of(q, struct ipq, q);
@@ -340,6 +333,7 @@ static inline int ip_frag_too_far(struct ipq *qp)
 static int ip_frag_reinit(struct ipq *qp)
 {
 	struct sk_buff *fp;
+	unsigned int sum_truesize = 0;

 	if (!mod_timer(&qp->q.timer, jiffies + qp->q.net->timeout)) {
 		atomic_inc(&qp->q.refcnt);
@@ -349,9 +343,12 @@ static int ip_frag_reinit(struct ipq *qp)
 	fp = qp->q.fragments;
 	do {
 		struct sk_buff *xp = fp->next;
-		frag_kfree_skb(qp->q.net, fp);
+
+		sum_truesize += fp->truesize;
+		kfree_skb(fp);
 		fp = xp;
 	} while (fp);
+	sub_frag_mem_limit(&qp->q, sum_truesize);

 	qp->q.last_in = 0;
 	qp->q.len = 0;
@@ -496,7 +493,8 @@ static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
 				qp->q.fragments = next;

 			qp->q.meat -= free_it->len;
-			frag_kfree_skb(qp->q.net, free_it);
+			sub_frag_mem_limit(&qp->q, free_it->truesize);
+			kfree_skb(free_it);
 		}
 	}

@@ -519,7 +517,7 @@ static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
 	qp->q.stamp = skb->tstamp;
 	qp->q.meat += skb->len;
 	qp->ecn |= ecn;
-	atomic_add(skb->truesize, &qp->q.net->mem);
+	add_frag_mem_limit(&qp->q, skb->truesize);
 	if (offset == 0)
 		qp->q.last_in |= INET_FRAG_FIRST_IN;

@@ -531,9 +529,7 @@ static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
 	    qp->q.meat == qp->q.len)
 		return ip_frag_reasm(qp, prev, dev);

-	write_lock(&ip4_frags.lock);
-	list_move_tail(&qp->q.lru_list, &qp->q.net->lru_list);
-	write_unlock(&ip4_frags.lock);
+	inet_frag_lru_move(&qp->q);
 	return -EINPROGRESS;

 err:
@@ -617,7 +613,7 @@ static int ip_frag_reasm(struct ipq *qp, struct sk_buff *prev,
 		head->len -= clone->len;
 		clone->csum = 0;
 		clone->ip_summed = head->ip_summed;
-		atomic_add(clone->truesize, &qp->q.net->mem);
+		add_frag_mem_limit(&qp->q, clone->truesize);
 	}

 	skb_push(head, head->data - skb_network_header(head));
@@ -645,7 +641,7 @@ static int ip_frag_reasm(struct ipq *qp, struct sk_buff *prev,
 		}
 		fp = next;
 	}
-	atomic_sub(sum_truesize, &qp->q.net->mem);
+	sub_frag_mem_limit(&qp->q, sum_truesize);

 	head->next = NULL;
 	head->dev = dev;

--- a/net/ipv6/netfilter/nf_conntrack_reasm.c
+++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
@@ -319,7 +319,7 @@ static int nf_ct_frag6_queue(struct frag_queue *fq, struct sk_buff *skb,
 	fq->q.meat += skb->len;
 	if (payload_len > fq->q.max_size)
 		fq->q.max_size = payload_len;
-	atomic_add(skb->truesize, &fq->q.net->mem);
+	add_frag_mem_limit(&fq->q, skb->truesize);

 	/* The first fragment.
 	 * nhoffset is obtained from the first fragment, of course.
@@ -328,9 +328,8 @@ static int nf_ct_frag6_queue(struct frag_queue *fq, struct sk_buff *skb,
 		fq->nhoffset = nhoff;
 		fq->q.last_in |= INET_FRAG_FIRST_IN;
 	}
-	write_lock(&nf_frags.lock);
-	list_move_tail(&fq->q.lru_list, &fq->q.net->lru_list);
-	write_unlock(&nf_frags.lock);
+
+	inet_frag_lru_move(&fq->q);
 	return 0;

 discard_fq:
@@ -398,7 +397,7 @@ nf_ct_frag6_reasm(struct frag_queue *fq, struct net_device *dev)
 		clone->ip_summed = head->ip_summed;

 		NFCT_FRAG6_CB(clone)->orig = NULL;
-		atomic_add(clone->truesize, &fq->q.net->mem);
+		add_frag_mem_limit(&fq->q, clone->truesize);
 	}

 	/* We have to remove fragment header from datagram and to relocate
@@ -422,7 +421,7 @@ nf_ct_frag6_reasm(struct frag_queue *fq, struct net_device *dev)
 			head->csum = csum_add(head->csum, fp->csum);
 		head->truesize += fp->truesize;
 	}
-	atomic_sub(head->truesize, &fq->q.net->mem);
+	sub_frag_mem_limit(&fq->q, head->truesize);

 	head->local_df = 1;
 	head->next = NULL;

--- a/net/ipv6/reassembly.c
+++ b/net/ipv6/reassembly.c
@@ -327,7 +327,7 @@ static int ip6_frag_queue(struct frag_queue *fq, struct sk_buff *skb,
 	}
 	fq->q.stamp = skb->tstamp;
 	fq->q.meat += skb->len;
-	atomic_add(skb->truesize, &fq->q.net->mem);
+	add_frag_mem_limit(&fq->q, skb->truesize);

 	/* The first fragment.
 	 * nhoffset is obtained from the first fragment, of course.
@@ -341,9 +341,7 @@ static int ip6_frag_queue(struct frag_queue *fq, struct sk_buff *skb,
 	    fq->q.meat == fq->q.len)
 		return ip6_frag_reasm(fq, prev, dev);

-	write_lock(&ip6_frags.lock);
-	list_move_tail(&fq->q.lru_list, &fq->q.net->lru_list);
-	write_unlock(&ip6_frags.lock);
+	inet_frag_lru_move(&fq->q);
 	return -1;

 discard_fq:
@@ -429,7 +427,7 @@ static int ip6_frag_reasm(struct frag_queue *fq, struct sk_buff *prev,
 		head->len -= clone->len;
 		clone->csum = 0;
 		clone->ip_summed = head->ip_summed;
-		atomic_add(clone->truesize, &fq->q.net->mem);
+		add_frag_mem_limit(&fq->q, clone->truesize);
 	}

 	/* We have to remove fragment header from datagram and to relocate
@@ -467,7 +465,7 @@ static int ip6_frag_reasm(struct frag_queue *fq, struct sk_buff *prev,
 		}
 		fp = next;
 	}
-	atomic_sub(sum_truesize, &fq->q.net->mem);
+	sub_frag_mem_limit(&fq->q, sum_truesize);

 	head->next = NULL;
 	head->dev = dev;
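
Editorial note on Patch-06: the diff replaces LRU list updates done under the global rwlock with `inet_frag_lru_add()`, `inet_frag_lru_move()` and `inet_frag_lru_del()`, each guarded by a dedicated `lru_lock`. The pattern can be sketched in userspace as below. This is an illustration under stated assumptions, not the kernel code: the `demo_` names are hypothetical, and a C11 atomic flag stands in for `spinlock_t`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list node, in the style of list_head. */
struct demo_list {
	struct demo_list *prev, *next;
};

/* LRU list with its own lock, independent of any hash table lock. */
struct demo_lru {
	struct demo_list head;
	atomic_bool locked; /* stand-in for a spinlock */
};

static void demo_lock(struct demo_lru *lru)
{
	while (atomic_exchange_explicit(&lru->locked, true, memory_order_acquire))
		; /* spin until the previous holder releases */
}

static void demo_unlock(struct demo_lru *lru)
{
	atomic_store_explicit(&lru->locked, false, memory_order_release);
}

static void demo_lru_init(struct demo_lru *lru)
{
	lru->head.prev = lru->head.next = &lru->head;
	atomic_init(&lru->locked, false);
}

static void demo_unlink(struct demo_list *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
}

static void demo_link_tail(struct demo_list *e, struct demo_list *head)
{
	e->prev = head->prev;
	e->next = head;
	head->prev->next = e;
	head->prev = e;
}

/* Append a new entry as most recently used. */
static void demo_lru_add(struct demo_lru *lru, struct demo_list *e)
{
	demo_lock(lru);
	demo_link_tail(e, &lru->head);
	demo_unlock(lru);
}

/* Mark an existing entry as most recently used. */
static void demo_lru_move(struct demo_lru *lru, struct demo_list *e)
{
	demo_lock(lru);
	demo_unlink(e);
	demo_link_tail(e, &lru->head);
	demo_unlock(lru);
}

/* Remove an entry from the LRU list. */
static void demo_lru_del(struct demo_lru *lru, struct demo_list *e)
{
	demo_lock(lru);
	demo_unlink(e);
	demo_unlock(lru);
}

/* Least recently used entry (eviction candidate), or NULL if empty. */
static struct demo_list *demo_lru_first(struct demo_lru *lru)
{
	struct demo_list *first;

	demo_lock(lru);
	first = (lru->head.next == &lru->head) ? NULL : lru->head.next;
	demo_unlock(lru);
	return first;
}
```

The design point is that every fragment touching the LRU list now contends only on a small, short-held lock, instead of taking the hash table lock in write mode for each `list_move_tail()`.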