Commit df632d3c authored by Linus Torvalds

Merge tag 'nfs-for-3.7-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client updates from Trond Myklebust:
 "Features include:

   - Remove CONFIG_EXPERIMENTAL dependency from NFSv4.1
     Aside from the issues discussed at the LKS, distros are shipping
     NFSv4.1 with all the trimmings.
   - Fix fdatasync()/fsync() for the corner case of a server reboot.
   - NFSv4 OPEN access fix: finally distinguish correctly between
     open-for-read and open-for-execute permissions in all situations.
   - Ensure that the TCP socket is closed when we're in CLOSE_WAIT
   - More idmapper bugfixes
   - Lots of pNFS bugfixes and cleanups to remove unnecessary state and
     make the code easier to read.
   - In cases where a pNFS read or write fails, allow the client to
      resume trying layoutgets after two minutes of read/write-through-mds.
   - More net namespace fixes to the NFSv4 callback code.
   - More net namespace fixes to the NFSv3 locking code.
   - More NFSv4 migration preparatory patches.
     Including patches to detect network trunking in both NFSv4 and
     NFSv4.1
   - pNFS block updates to optimise LAYOUTGET calls."

* tag 'nfs-for-3.7-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (113 commits)
  pnfsblock: cleanup nfs4_blkdev_get
  NFS41: send real read size in layoutget
  NFS41: send real write size in layoutget
  NFS: track direct IO left bytes
  NFSv4.1: Cleanup ugliness in pnfs_layoutgets_blocked()
  NFSv4.1: Ensure that the layout sequence id stays 'close' to the current
  NFSv4.1: Deal with seqid wraparound in the pNFS return-on-close code
  NFSv4 set open access operation call flag in nfs4_init_opendata_res
  NFSv4.1: Remove the dependency on CONFIG_EXPERIMENTAL
  NFSv4 reduce attribute requests for open reclaim
  NFSv4: nfs4_open_done first must check that GETATTR decoded a file type
  NFSv4.1: Deal with wraparound when updating the layout "barrier" seqid
  NFSv4.1: Deal with wraparound issues when updating the layout stateid
  NFSv4.1: Always set the layout stateid if this is the first layoutget
  NFSv4.1: Fix another refcount issue in pnfs_find_alloc_layout
  NFSv4: don't put ACCESS in OPEN compound if O_EXCL
  NFSv4: don't check MAY_WRITE access bit in OPEN
  NFS: Set key construction data for the legacy upcall
  NFSv4.1: don't do two EXCHANGE_IDs on mount
  NFS: nfs41_walk_client_list(): re-lock before iterating
  ...
......@@ -12,9 +12,47 @@ and work is in progress on adding support for minor version 1 of the NFSv4
protocol.
The purpose of this document is to provide information on some of the
upcall interfaces that are used in order to provide the NFS client with
some of the information that it requires in order to fully comply with
the NFS spec.
special features of the NFS client that can be configured by system
administrators.

The nfs4_unique_id parameter
============================

NFSv4 requires clients to identify themselves to servers with a unique
string. File open and lock state shared between one client and one server
is associated with this identity. To support robust NFSv4 state recovery
and transparent state migration, this identity string must not change
across client reboots.

Without any other intervention, the Linux client uses a string that contains
the local system's node name. System administrators, however, often do not
take care to ensure that node names are fully qualified and do not change
over the lifetime of a client system. Node names can have other
administrative requirements that require particular behavior that does not
work well as part of an nfs_client_id4 string.

The nfs.nfs4_unique_id boot parameter specifies a unique string that can be
used instead of a system's node name when an NFS client identifies itself to
a server. Thus, if the system's node name is not unique, or it changes, its
nfs.nfs4_unique_id stays the same, preventing collision with other clients
or loss of state during NFS reboot recovery or transparent state migration.

The nfs.nfs4_unique_id string is typically a UUID, though it can contain
anything that is believed to be unique across all NFS clients. An
nfs4_unique_id string should be chosen when a client system is installed,
just as a system's root file system gets a fresh UUID in its label at
install time.

The string should remain fixed for the lifetime of the client. It can be
changed safely if care is taken that the client shuts down cleanly and all
outstanding NFSv4 state has expired, to prevent loss of NFSv4 state.

This string can be stored in an NFS client's grub.conf, or it can be provided
via a net boot facility such as PXE. It may also be specified as an nfs.ko
module parameter. Specifying a uniquifier string is not supported for NFS
clients running in containers.
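
For example, a distribution installer might generate the uniquifier once
with uuidgen(1) and then pass it either on the kernel command line or as a
module option (the UUID value and the file names below are placeholders,
not values prescribed by the kernel):

	# On the kernel command line, e.g. in grub.conf:
	nfs.nfs4_unique_id=0e148b48-2958-446e-8e9a-9dc38f5c109b

	# Or as a module option, e.g. in /etc/modprobe.d/nfs.conf:
	options nfs nfs4_unique_id=0e148b48-2958-446e-8e9a-9dc38f5c109b
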
The DNS resolver
================
......
......@@ -1730,6 +1730,11 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
will be autodetected by the client, and it will fall
back to using the idmapper.
To turn off this behaviour, set the value to '0'.
nfs.nfs4_unique_id=
[NFS4] Specify an additional fixed unique
identification string that NFSv4 clients can insert into
their nfs_client_id4 string. This is typically a
UUID that is generated at system install time.
nfs.send_implementation_id =
[NFSv4.1] Send client implementation identification
......
......@@ -7,7 +7,6 @@
*/
#include <linux/types.h>
#include <linux/utsname.h>
#include <linux/kernel.h>
#include <linux/ktime.h>
#include <linux/slab.h>
......@@ -19,6 +18,8 @@
#include <asm/unaligned.h>
#include "netns.h"
#define NLMDBG_FACILITY NLMDBG_MONITOR
#define NSM_PROGRAM 100024
#define NSM_VERSION 1
......@@ -40,6 +41,7 @@ struct nsm_args {
u32 proc;
char *mon_name;
char *nodename;
};
struct nsm_res {
......@@ -70,7 +72,7 @@ static struct rpc_clnt *nsm_create(struct net *net)
};
struct rpc_create_args args = {
.net = net,
.protocol = XPRT_TRANSPORT_UDP,
.protocol = XPRT_TRANSPORT_TCP,
.address = (struct sockaddr *)&sin,
.addrsize = sizeof(sin),
.servername = "rpc.statd",
......@@ -83,10 +85,54 @@ static struct rpc_clnt *nsm_create(struct net *net)
return rpc_create(&args);
}
static int nsm_mon_unmon(struct nsm_handle *nsm, u32 proc, struct nsm_res *res,
struct net *net)
static struct rpc_clnt *nsm_client_get(struct net *net)
{
static DEFINE_MUTEX(nsm_create_mutex);
struct rpc_clnt *clnt;
struct lockd_net *ln = net_generic(net, lockd_net_id);
spin_lock(&ln->nsm_clnt_lock);
if (ln->nsm_users) {
ln->nsm_users++;
clnt = ln->nsm_clnt;
spin_unlock(&ln->nsm_clnt_lock);
goto out;
}
spin_unlock(&ln->nsm_clnt_lock);
mutex_lock(&nsm_create_mutex);
clnt = nsm_create(net);
if (!IS_ERR(clnt)) {
ln->nsm_clnt = clnt;
smp_wmb();
ln->nsm_users = 1;
}
mutex_unlock(&nsm_create_mutex);
out:
return clnt;
}
static void nsm_client_put(struct net *net)
{
struct lockd_net *ln = net_generic(net, lockd_net_id);
struct rpc_clnt *clnt = ln->nsm_clnt;
int shutdown = 0;
spin_lock(&ln->nsm_clnt_lock);
if (ln->nsm_users) {
if (--ln->nsm_users)
ln->nsm_clnt = NULL;
shutdown = !ln->nsm_users;
}
spin_unlock(&ln->nsm_clnt_lock);
if (shutdown)
rpc_shutdown_client(clnt);
}
static int nsm_mon_unmon(struct nsm_handle *nsm, u32 proc, struct nsm_res *res,
struct rpc_clnt *clnt)
{
int status;
struct nsm_args args = {
.priv = &nsm->sm_priv,
......@@ -94,31 +140,24 @@ static int nsm_mon_unmon(struct nsm_handle *nsm, u32 proc, struct nsm_res *res,
.vers = 3,
.proc = NLMPROC_NSM_NOTIFY,
.mon_name = nsm->sm_mon_name,
.nodename = clnt->cl_nodename,
};
struct rpc_message msg = {
.rpc_argp = &args,
.rpc_resp = res,
};
clnt = nsm_create(net);
if (IS_ERR(clnt)) {
status = PTR_ERR(clnt);
dprintk("lockd: failed to create NSM upcall transport, "
"status=%d\n", status);
goto out;
}
BUG_ON(clnt == NULL);
memset(res, 0, sizeof(*res));
msg.rpc_proc = &clnt->cl_procinfo[proc];
status = rpc_call_sync(clnt, &msg, 0);
status = rpc_call_sync(clnt, &msg, RPC_TASK_SOFTCONN);
if (status < 0)
dprintk("lockd: NSM upcall RPC failed, status=%d\n",
status);
else
status = 0;
rpc_shutdown_client(clnt);
out:
return status;
}
......@@ -138,6 +177,7 @@ int nsm_monitor(const struct nlm_host *host)
struct nsm_handle *nsm = host->h_nsmhandle;
struct nsm_res res;
int status;
struct rpc_clnt *clnt;
dprintk("lockd: nsm_monitor(%s)\n", nsm->sm_name);
......@@ -150,7 +190,15 @@ int nsm_monitor(const struct nlm_host *host)
*/
nsm->sm_mon_name = nsm_use_hostnames ? nsm->sm_name : nsm->sm_addrbuf;
status = nsm_mon_unmon(nsm, NSMPROC_MON, &res, host->net);
clnt = nsm_client_get(host->net);
if (IS_ERR(clnt)) {
status = PTR_ERR(clnt);
dprintk("lockd: failed to create NSM upcall transport, "
"status=%d, net=%p\n", status, host->net);
return status;
}
status = nsm_mon_unmon(nsm, NSMPROC_MON, &res, clnt);
if (unlikely(res.status != 0))
status = -EIO;
if (unlikely(status < 0)) {
......@@ -182,9 +230,11 @@ void nsm_unmonitor(const struct nlm_host *host)
if (atomic_read(&nsm->sm_count) == 1
&& nsm->sm_monitored && !nsm->sm_sticky) {
struct lockd_net *ln = net_generic(host->net, lockd_net_id);
dprintk("lockd: nsm_unmonitor(%s)\n", nsm->sm_name);
status = nsm_mon_unmon(nsm, NSMPROC_UNMON, &res, host->net);
status = nsm_mon_unmon(nsm, NSMPROC_UNMON, &res, ln->nsm_clnt);
if (res.status != 0)
status = -EIO;
if (status < 0)
......@@ -192,6 +242,8 @@ void nsm_unmonitor(const struct nlm_host *host)
nsm->sm_name);
else
nsm->sm_monitored = 0;
nsm_client_put(host->net);
}
}
......@@ -430,7 +482,7 @@ static void encode_my_id(struct xdr_stream *xdr, const struct nsm_args *argp)
{
__be32 *p;
encode_nsm_string(xdr, utsname()->nodename);
encode_nsm_string(xdr, argp->nodename);
p = xdr_reserve_space(xdr, 4 + 4 + 4);
*p++ = cpu_to_be32(argp->prog);
*p++ = cpu_to_be32(argp->vers);
......
......@@ -12,6 +12,10 @@ struct lockd_net {
struct delayed_work grace_period_end;
struct lock_manager lockd_manager;
struct list_head grace_list;
spinlock_t nsm_clnt_lock;
unsigned int nsm_users;
struct rpc_clnt *nsm_clnt;
};
extern int lockd_net_id;
......
......@@ -596,6 +596,7 @@ static int lockd_init_net(struct net *net)
INIT_DELAYED_WORK(&ln->grace_period_end, grace_ender);
INIT_LIST_HEAD(&ln->grace_list);
spin_lock_init(&ln->nsm_clnt_lock);
return 0;
}
......
......@@ -95,8 +95,8 @@ config NFS_SWAP
This option enables swapon to work on files located on NFS mounts.
config NFS_V4_1
bool "NFS client support for NFSv4.1 (EXPERIMENTAL)"
depends on NFS_V4 && EXPERIMENTAL
bool "NFS client support for NFSv4.1"
depends on NFS_V4
select SUNRPC_BACKCHANNEL
help
This option enables support for minor version 1 of the NFSv4 protocol
......
......@@ -37,6 +37,7 @@
#include <linux/bio.h> /* struct bio */
#include <linux/buffer_head.h> /* various write calls */
#include <linux/prefetch.h>
#include <linux/pagevec.h>
#include "../pnfs.h"
#include "../internal.h"
......@@ -162,25 +163,39 @@ static struct bio *bl_alloc_init_bio(int npg, sector_t isect,
return bio;
}
static struct bio *bl_add_page_to_bio(struct bio *bio, int npg, int rw,
static struct bio *do_add_page_to_bio(struct bio *bio, int npg, int rw,
sector_t isect, struct page *page,
struct pnfs_block_extent *be,
void (*end_io)(struct bio *, int err),
struct parallel_io *par)
struct parallel_io *par,
unsigned int offset, int len)
{
isect = isect + (offset >> SECTOR_SHIFT);
dprintk("%s: npg %d rw %d isect %llu offset %u len %d\n", __func__,
npg, rw, (unsigned long long)isect, offset, len);
retry:
if (!bio) {
bio = bl_alloc_init_bio(npg, isect, be, end_io, par);
if (!bio)
return ERR_PTR(-ENOMEM);
}
if (bio_add_page(bio, page, PAGE_CACHE_SIZE, 0) < PAGE_CACHE_SIZE) {
if (bio_add_page(bio, page, len, offset) < len) {
bio = bl_submit_bio(rw, bio);
goto retry;
}
return bio;
}
static struct bio *bl_add_page_to_bio(struct bio *bio, int npg, int rw,
sector_t isect, struct page *page,
struct pnfs_block_extent *be,
void (*end_io)(struct bio *, int err),
struct parallel_io *par)
{
return do_add_page_to_bio(bio, npg, rw, isect, page, be,
end_io, par, 0, PAGE_CACHE_SIZE);
}
/* This is basically copied from mpage_end_io_read */
static void bl_end_io_read(struct bio *bio, int err)
{
......@@ -228,14 +243,6 @@ bl_end_par_io_read(void *data, int unused)
schedule_work(&rdata->task.u.tk_work);
}
static bool
bl_check_alignment(u64 offset, u32 len, unsigned long blkmask)
{
if ((offset & blkmask) || (len & blkmask))
return false;
return true;
}
static enum pnfs_try_status
bl_read_pagelist(struct nfs_read_data *rdata)
{
......@@ -246,15 +253,15 @@ bl_read_pagelist(struct nfs_read_data *rdata)
sector_t isect, extent_length = 0;
struct parallel_io *par;
loff_t f_offset = rdata->args.offset;
size_t bytes_left = rdata->args.count;
unsigned int pg_offset, pg_len;
struct page **pages = rdata->args.pages;
int pg_index = rdata->args.pgbase >> PAGE_CACHE_SHIFT;
const bool is_dio = (header->dreq != NULL);
dprintk("%s enter nr_pages %u offset %lld count %u\n", __func__,
rdata->pages.npages, f_offset, (unsigned int)rdata->args.count);
if (!bl_check_alignment(f_offset, rdata->args.count, PAGE_CACHE_MASK))
goto use_mds;
par = alloc_parallel(rdata);
if (!par)
goto use_mds;
......@@ -284,36 +291,53 @@ bl_read_pagelist(struct nfs_read_data *rdata)
extent_length = min(extent_length, cow_length);
}
}
if (is_dio) {
pg_offset = f_offset & ~PAGE_CACHE_MASK;
if (pg_offset + bytes_left > PAGE_CACHE_SIZE)
pg_len = PAGE_CACHE_SIZE - pg_offset;
else
pg_len = bytes_left;
f_offset += pg_len;
bytes_left -= pg_len;
isect += (pg_offset >> SECTOR_SHIFT);
} else {
pg_offset = 0;
pg_len = PAGE_CACHE_SIZE;
}
hole = is_hole(be, isect);
if (hole && !cow_read) {
bio = bl_submit_bio(READ, bio);
/* Fill hole w/ zeroes w/o accessing device */
dprintk("%s Zeroing page for hole\n", __func__);
zero_user_segment(pages[i], 0, PAGE_CACHE_SIZE);
zero_user_segment(pages[i], pg_offset, pg_len);
print_page(pages[i]);
SetPageUptodate(pages[i]);
} else {
struct pnfs_block_extent *be_read;
be_read = (hole && cow_read) ? cow_read : be;
bio = bl_add_page_to_bio(bio, rdata->pages.npages - i,
bio = do_add_page_to_bio(bio, rdata->pages.npages - i,
READ,
isect, pages[i], be_read,
bl_end_io_read, par);
bl_end_io_read, par,
pg_offset, pg_len);
if (IS_ERR(bio)) {
header->pnfs_error = PTR_ERR(bio);
bio = NULL;
goto out;
}
}
isect += PAGE_CACHE_SECTORS;
isect += (pg_len >> SECTOR_SHIFT);
extent_length -= PAGE_CACHE_SECTORS;
}
if ((isect << SECTOR_SHIFT) >= header->inode->i_size) {
rdata->res.eof = 1;
rdata->res.count = header->inode->i_size - f_offset;
rdata->res.count = header->inode->i_size - rdata->args.offset;
} else {
rdata->res.count = (isect << SECTOR_SHIFT) - f_offset;
rdata->res.count = (isect << SECTOR_SHIFT) - rdata->args.offset;
}
out:
bl_put_extent(be);
......@@ -461,6 +485,106 @@ map_block(struct buffer_head *bh, sector_t isect, struct pnfs_block_extent *be)
return;
}
static void
bl_read_single_end_io(struct bio *bio, int error)
{
struct bio_vec *bvec = bio->bi_io_vec + bio->bi_vcnt - 1;
struct page *page = bvec->bv_page;
/* Only one page in bvec */
unlock_page(page);
}
static int
bl_do_readpage_sync(struct page *page, struct pnfs_block_extent *be,
unsigned int offset, unsigned int len)
{
struct bio *bio;
struct page *shadow_page;
sector_t isect;
char *kaddr, *kshadow_addr;
int ret = 0;
dprintk("%s: offset %u len %u\n", __func__, offset, len);
shadow_page = alloc_page(GFP_NOFS | __GFP_HIGHMEM);
if (shadow_page == NULL)
return -ENOMEM;
bio = bio_alloc(GFP_NOIO, 1);
if (bio == NULL)
return -ENOMEM;
isect = (page->index << PAGE_CACHE_SECTOR_SHIFT) +
(offset / SECTOR_SIZE);
bio->bi_sector = isect - be->be_f_offset + be->be_v_offset;
bio->bi_bdev = be->be_mdev;
bio->bi_end_io = bl_read_single_end_io;
lock_page(shadow_page);
if (bio_add_page(bio, shadow_page,
SECTOR_SIZE, round_down(offset, SECTOR_SIZE)) == 0) {
unlock_page(shadow_page);
bio_put(bio);
return -EIO;
}
submit_bio(READ, bio);
wait_on_page_locked(shadow_page);
if (unlikely(!test_bit(BIO_UPTODATE, &bio->bi_flags))) {
ret = -EIO;
} else {
kaddr = kmap_atomic(page);
kshadow_addr = kmap_atomic(shadow_page);
memcpy(kaddr + offset, kshadow_addr + offset, len);
kunmap_atomic(kshadow_addr);
kunmap_atomic(kaddr);
}
__free_page(shadow_page);
bio_put(bio);
return ret;
}
static int
bl_read_partial_page_sync(struct page *page, struct pnfs_block_extent *be,
unsigned int dirty_offset, unsigned int dirty_len,
bool full_page)
{
int ret = 0;
unsigned int start, end;
if (full_page) {
start = 0;
end = PAGE_CACHE_SIZE;
} else {
start = round_down(dirty_offset, SECTOR_SIZE);
end = round_up(dirty_offset + dirty_len, SECTOR_SIZE);
}
dprintk("%s: offset %u len %d\n", __func__, dirty_offset, dirty_len);
if (!be) {
zero_user_segments(page, start, dirty_offset,
dirty_offset + dirty_len, end);
if (start == 0 && end == PAGE_CACHE_SIZE &&
trylock_page(page)) {
SetPageUptodate(page);
unlock_page(page);
}
return ret;
}
if (start != dirty_offset)
ret = bl_do_readpage_sync(page, be, start, dirty_offset - start);
if (!ret && (dirty_offset + dirty_len < end))
ret = bl_do_readpage_sync(page, be, dirty_offset + dirty_len,
end - dirty_offset - dirty_len);
return ret;
}
/* Given an unmapped page, zero it or read in page for COW, page is locked
* by caller.
*/
......@@ -494,7 +618,6 @@ init_page_for_write(struct page *page, struct pnfs_block_extent *cow_read)
SetPageUptodate(page);
cleanup:
bl_put_extent(cow_read);
if (bh)
free_buffer_head(bh);
if (ret) {
......@@ -566,6 +689,7 @@ bl_write_pagelist(struct nfs_write_data *wdata, int sync)
struct parallel_io *par = NULL;
loff_t offset = wdata->args.offset;
size_t count = wdata->args.count;
unsigned int pg_offset, pg_len, saved_len;
struct page **pages = wdata->args.pages;
struct page *page;
pgoff_t index;
......@@ -574,10 +698,13 @@ bl_write_pagelist(struct nfs_write_data *wdata, int sync)
NFS_SERVER(header->inode)->pnfs_blksize >> PAGE_CACHE_SHIFT;
dprintk("%s enter, %Zu@%lld\n", __func__, count, offset);
/* Check for alignment first */
if (!bl_check_alignment(offset, count, PAGE_CACHE_MASK))
goto out_mds;
if (header->dreq != NULL &&
(!IS_ALIGNED(offset, NFS_SERVER(header->inode)->pnfs_blksize) ||
!IS_ALIGNED(count, NFS_SERVER(header->inode)->pnfs_blksize))) {
dprintk("pnfsblock nonblock aligned DIO writes. Resend MDS\n");
goto out_mds;
}
/* At this point, wdata->pages is a (sequential) list of nfs_pages.
* We want to write each, and if there is an error set pnfs_error
* to have it redone using nfs.
......@@ -674,10 +801,11 @@ bl_write_pagelist(struct nfs_write_data *wdata, int sync)
if (!extent_length) {
/* We've used up the previous extent */
bl_put_extent(be);
bl_put_extent(cow_read);
bio = bl_submit_bio(WRITE, bio);
/* Get the next one */
be = bl_find_get_extent(BLK_LSEG2EXT(header->lseg),
isect, NULL);
isect, &cow_read);
if (!be || !is_writable(be, isect)) {
header->pnfs_error = -EINVAL;
goto out;
......@@ -694,7 +822,26 @@ bl_write_pagelist(struct nfs_write_data *wdata, int sync)
extent_length = be->be_length -
(isect - be->be_f_offset);
}
if (be->be_state == PNFS_BLOCK_INVALID_DATA) {
dprintk("%s offset %lld count %Zu\n", __func__, offset, count);
pg_offset = offset & ~PAGE_CACHE_MASK;
if (pg_offset + count > PAGE_CACHE_SIZE)
pg_len = PAGE_CACHE_SIZE - pg_offset;
else
pg_len = count;
saved_len = pg_len;
if (be->be_state == PNFS_BLOCK_INVALID_DATA &&
!bl_is_sector_init(be->be_inval, isect)) {
ret = bl_read_partial_page_sync(pages[i], cow_read,
pg_offset, pg_len, true);
if (ret) {
dprintk("%s bl_read_partial_page_sync fail %d\n",
__func__, ret);
header->pnfs_error = ret;
goto out;
}
ret = bl_mark_sectors_init(be->be_inval, isect,
PAGE_CACHE_SECTORS);
if (unlikely(ret)) {
......@@ -703,15 +850,35 @@ bl_write_pagelist(struct nfs_write_data *wdata, int sync)
header->pnfs_error = ret;
goto out;
}
/* Expand to full page write */
pg_offset = 0;
pg_len = PAGE_CACHE_SIZE;
} else if ((pg_offset & (SECTOR_SIZE - 1)) ||
(pg_len & (SECTOR_SIZE - 1))) {
/* ahh, nasty case. We have to do sync full sector
* read-modify-write cycles.
*/
unsigned int saved_offset = pg_offset;
ret = bl_read_partial_page_sync(pages[i], be, pg_offset,
pg_len, false);
pg_offset = round_down(pg_offset, SECTOR_SIZE);
pg_len = round_up(saved_offset + pg_len, SECTOR_SIZE)
- pg_offset;
}
bio = bl_add_page_to_bio(bio, wdata->pages.npages - i, WRITE,
bio = do_add_page_to_bio(bio, wdata->pages.npages - i, WRITE,
isect, pages[i], be,
bl_end_io_write, par);
bl_end_io_write, par,
pg_offset, pg_len);
if (IS_ERR(bio)) {
header->pnfs_error = PTR_ERR(bio);
bio = NULL;
goto out;
}
offset += saved_len;
count -= saved_len;
isect += PAGE_CACHE_SECTORS;
last_isect = isect;
extent_length -= PAGE_CACHE_SECTORS;
......@@ -729,17 +896,16 @@ bl_write_pagelist(struct nfs_write_data *wdata, int sync)
}
write_done:
wdata->res.count = (last_isect << SECTOR_SHIFT) - (offset);
if (count < wdata->res.count) {
wdata->res.count = count;
}
wdata->res.count = wdata->args.count;
out:
bl_put_extent(be);
bl_put_extent(cow_read);
bl_submit_bio(WRITE, bio);
put_parallel(par);
return PNFS_ATTEMPTED;
out_mds:
bl_put_extent(be);
bl_put_extent(cow_read);
kfree(par);
return PNFS_NOT_ATTEMPTED;
}
......@@ -874,7 +1040,7 @@ static void free_blk_mountid(struct block_mount_id *mid)
}
}
/* This is mostly copied from the filelayout's get_device_info function.
/* This is mostly copied from the filelayout_get_device_info function.
* It seems much of this should be at the generic pnfs level.
*/
static struct pnfs_block_dev *
......@@ -1011,33 +1177,95 @@ bl_clear_layoutdriver(struct nfs_server *server)
return 0;
}
static bool
is_aligned_req(struct nfs_page *req, unsigned int alignment)
{
return IS_ALIGNED(req->wb_offset, alignment) &&
IS_ALIGNED(req->wb_bytes, alignment);
}
static void
bl_pg_init_read(struct nfs_pageio_descriptor *pgio, struct nfs_page *req)
{
if (!bl_check_alignment(req->wb_offset, req->wb_bytes, PAGE_CACHE_MASK))
if (pgio->pg_dreq != NULL &&
!is_aligned_req(req, SECTOR_SIZE))
nfs_pageio_reset_read_mds(pgio);
else
pnfs_generic_pg_init_read(pgio, req);
}
static bool
bl_pg_test_read(struct nfs_pageio_descriptor *pgio, struct nfs_page *prev,
struct nfs_page *req)
{
if (pgio->pg_dreq != NULL &&
!is_aligned_req(req, SECTOR_SIZE))
return false;
return pnfs_generic_pg_test(pgio, prev, req);
}
/*
* Return the number of contiguous bytes for a given inode
* starting at page frame idx.
*/
static u64 pnfs_num_cont_bytes(struct inode *inode, pgoff_t idx)
{
struct address_space *mapping = inode->i_mapping;
pgoff_t end;
/* Optimize common case that writes from 0 to end of file */
end = DIV_ROUND_UP(i_size_read(inode), PAGE_CACHE_SIZE);
if (end != NFS_I(inode)->npages) {
rcu_read_lock();
end = radix_tree_next_hole(&mapping->page_tree, idx + 1, ULONG_MAX);
rcu_read_unlock();
}
if (!end)
return i_size_read(inode) - (idx << PAGE_CACHE_SHIFT);
else
return (end - idx) << PAGE_CACHE_SHIFT;
}
static void
bl_pg_init_write(struct nfs_pageio_descriptor *pgio, struct nfs_page *req)
{
if (!bl_check_alignment(req->wb_offset, req->wb_bytes, PAGE_CACHE_MASK))
if (pgio->pg_dreq != NULL &&
!is_aligned_req(req, PAGE_CACHE_SIZE)) {
nfs_pageio_reset_write_mds(pgio);
else
pnfs_generic_pg_init_write(pgio, req);
} else {
u64 wb_size;
if (pgio->pg_dreq == NULL)
wb_size = pnfs_num_cont_bytes(pgio->pg_inode,
req->wb_index);
else
wb_size = nfs_dreq_bytes_left(pgio->pg_dreq);
pnfs_generic_pg_init_write(pgio, req, wb_size);
}
}
static bool
bl_pg_test_write(struct nfs_pageio_descriptor *pgio, struct nfs_page *prev,
struct nfs_page *req)
{
if (pgio->pg_dreq != NULL &&
!is_aligned_req(req, PAGE_CACHE_SIZE))
return false;
return pnfs_generic_pg_test(pgio, prev, req);
}
static const struct nfs_pageio_ops bl_pg_read_ops = {
.pg_init = bl_pg_init_read,
.pg_test = pnfs_generic_pg_test,
.pg_test = bl_pg_test_read,
.pg_doio = pnfs_generic_pg_readpages,
};
static const struct nfs_pageio_ops bl_pg_write_ops = {
.pg_init = bl_pg_init_write,
.pg_test = pnfs_generic_pg_test,
.pg_test = bl_pg_test_write,
.pg_doio = pnfs_generic_pg_writepages,
};
......
......@@ -41,6 +41,7 @@
#define PAGE_CACHE_SECTORS (PAGE_CACHE_SIZE >> SECTOR_SHIFT)
#define PAGE_CACHE_SECTOR_SHIFT (PAGE_CACHE_SHIFT - SECTOR_SHIFT)
#define SECTOR_SIZE (1 << SECTOR_SHIFT)
struct block_mount_id {
spinlock_t bm_lock; /* protects list */
......@@ -172,7 +173,6 @@ struct bl_msg_hdr {
/* blocklayoutdev.c */
ssize_t bl_pipe_downcall(struct file *, const char __user *, size_t);
void bl_pipe_destroy_msg(struct rpc_pipe_msg *);
struct block_device *nfs4_blkdev_get(dev_t dev);
int nfs4_blkdev_put(struct block_device *bdev);
struct pnfs_block_dev *nfs4_blk_decode_device(struct nfs_server *server,
struct pnfs_device *dev);
......
......@@ -53,22 +53,6 @@ static int decode_sector_number(__be32 **rp, sector_t *sp)
return 0;
}
/* Open a block_device by device number. */
struct block_device *nfs4_blkdev_get(dev_t dev)
{
struct block_device *bd;
dprintk("%s enter\n", __func__);
bd = blkdev_get_by_dev(dev, FMODE_READ, NULL);
if (IS_ERR(bd))
goto fail;
return bd;
fail:
dprintk("%s failed to open device : %ld\n",
__func__, PTR_ERR(bd));
return NULL;
}
/*
* Release the block device
*/
......@@ -172,11 +156,12 @@ nfs4_blk_decode_device(struct nfs_server *server,
goto out;
}
bd = nfs4_blkdev_get(MKDEV(reply->major, reply->minor));
bd = blkdev_get_by_dev(MKDEV(reply->major, reply->minor),
FMODE_READ, NULL);
if (IS_ERR(bd)) {
rc = PTR_ERR(bd);
dprintk("%s failed to open device : %d\n", __func__, rc);
rv = ERR_PTR(rc);
dprintk("%s failed to open device : %ld\n", __func__,
PTR_ERR(bd));
rv = ERR_CAST(bd);
goto out;
}
......
......@@ -683,8 +683,7 @@ encode_pnfs_block_layoutupdate(struct pnfs_block_layout *bl,
p = xdr_encode_hyper(p, lce->bse_length << SECTOR_SHIFT);
p = xdr_encode_hyper(p, 0LL);
*p++ = cpu_to_be32(PNFS_BLOCK_READWRITE_DATA);
list_del(&lce->bse_node);
list_add_tail(&lce->bse_node, &bl->bl_committing);
list_move_tail(&lce->bse_node, &bl->bl_committing);
bl->bl_count--;
count++;
}
......
......@@ -12,6 +12,7 @@
#include <linux/sunrpc/svc.h>
#include <linux/sunrpc/svcsock.h>
#include <linux/nfs_fs.h>
#include <linux/errno.h>
#include <linux/mutex.h>
#include <linux/freezer.h>
#include <linux/kthread.h>
......@@ -23,6 +24,7 @@
#include "nfs4_fs.h"
#include "callback.h"
#include "internal.h"
#include "netns.h"
#define NFSDBG_FACILITY NFSDBG_CALLBACK
......@@ -37,7 +39,32 @@ static struct nfs_callback_data nfs_callback_info[NFS4_MAX_MINOR_VERSION + 1];
static DEFINE_MUTEX(nfs_callback_mutex);
static struct svc_program nfs4_callback_program;
unsigned short nfs_callback_tcpport6;
static int nfs4_callback_up_net(struct svc_serv *serv, struct net *net)
{
int ret;
struct nfs_net *nn = net_generic(net, nfs_net_id);
ret = svc_create_xprt(serv, "tcp", net, PF_INET,
nfs_callback_set_tcpport, SVC_SOCK_ANONYMOUS);
if (ret <= 0)
goto out_err;
nn->nfs_callback_tcpport = ret;
dprintk("NFS: Callback listener port = %u (af %u, net %p)\n",
nn->nfs_callback_tcpport, PF_INET, net);
ret = svc_create_xprt(serv, "tcp", net, PF_INET6,
nfs_callback_set_tcpport, SVC_SOCK_ANONYMOUS);
if (ret > 0) {
nn->nfs_callback_tcpport6 = ret;
dprintk("NFS: Callback listener port = %u (af %u, net %p)\n",
nn->nfs_callback_tcpport6, PF_INET6, net);
} else if (ret != -EAFNOSUPPORT)
goto out_err;
return 0;
out_err:
return (ret) ? ret : -ENOMEM;
}
/*
* This is the NFSv4 callback kernel thread.
......@@ -78,38 +105,23 @@ nfs4_callback_svc(void *vrqstp)
* Prepare to bring up the NFSv4 callback service
*/
static struct svc_rqst *
nfs4_callback_up(struct svc_serv *serv, struct rpc_xprt *xprt)
nfs4_callback_up(struct svc_serv *serv)
{
int ret;
ret = svc_create_xprt(serv, "tcp", &init_net, PF_INET,
nfs_callback_set_tcpport, SVC_SOCK_ANONYMOUS);
if (ret <= 0)
goto out_err;
nfs_callback_tcpport = ret;
dprintk("NFS: Callback listener port = %u (af %u)\n",
nfs_callback_tcpport, PF_INET);
ret = svc_create_xprt(serv, "tcp", &init_net, PF_INET6,
nfs_callback_set_tcpport, SVC_SOCK_ANONYMOUS);
if (ret > 0) {
nfs_callback_tcpport6 = ret;
dprintk("NFS: Callback listener port = %u (af %u)\n",
nfs_callback_tcpport6, PF_INET6);
} else if (ret == -EAFNOSUPPORT)
ret = 0;
else
goto out_err;
return svc_prepare_thread(serv, &serv->sv_pools[0], NUMA_NO_NODE);
out_err:
if (ret == 0)
ret = -ENOMEM;
return ERR_PTR(ret);
}
#if defined(CONFIG_NFS_V4_1)
static int nfs41_callback_up_net(struct svc_serv *serv, struct net *net)
{
/*
* Create an svc_sock for the back channel service that shares the
* fore channel connection.
* Returns the input port (0) and sets the svc_serv bc_xprt on success
*/
return svc_create_xprt(serv, "tcp-bc", net, PF_INET, 0,
SVC_SOCK_ANONYMOUS);
}
/*
* The callback service for NFSv4.1 callbacks
*/
......@@ -149,28 +161,9 @@ nfs41_callback_svc(void *vrqstp)
* Bring up the NFSv4.1 callback service
*/
static struct svc_rqst *
nfs41_callback_up(struct svc_serv *serv, struct rpc_xprt *xprt)
nfs41_callback_up(struct svc_serv *serv)
{
struct svc_rqst *rqstp;
int ret;
/*
* Create an svc_sock for the back channel service that shares the
* fore channel connection.
* Returns the input port (0) and sets the svc_serv bc_xprt on success
*/
ret = svc_create_xprt(serv, "tcp-bc", &init_net, PF_INET, 0,
SVC_SOCK_ANONYMOUS);
if (ret < 0) {
rqstp = ERR_PTR(ret);
goto out;
}
/*
* Save the svc_serv in the transport so that it can
* be referenced when the session backchannel is initialized
*/
xprt->bc_serv = serv;
INIT_LIST_HEAD(&serv->sv_cb_list);
spin_lock_init(&serv->sv_cb_lock);
......@@ -180,90 +173,74 @@ nfs41_callback_up(struct svc_serv *serv, struct rpc_xprt *xprt)
svc_xprt_put(serv->sv_bc_xprt);
serv->sv_bc_xprt = NULL;
}
out:
dprintk("--> %s return %ld\n", __func__,
IS_ERR(rqstp) ? PTR_ERR(rqstp) : 0);
return rqstp;
}
static inline int nfs_minorversion_callback_svc_setup(u32 minorversion,
struct svc_serv *serv, struct rpc_xprt *xprt,
static void nfs_minorversion_callback_svc_setup(struct svc_serv *serv,
struct svc_rqst **rqstpp, int (**callback_svc)(void *vrqstp))
{
if (minorversion) {
*rqstpp = nfs41_callback_up(serv, xprt);
*callback_svc = nfs41_callback_svc;
}
return minorversion;
*rqstpp = nfs41_callback_up(serv);
*callback_svc = nfs41_callback_svc;
}
static inline void nfs_callback_bc_serv(u32 minorversion, struct rpc_xprt *xprt,
struct nfs_callback_data *cb_info)
struct svc_serv *serv)
{
if (minorversion)
xprt->bc_serv = cb_info->serv;
/*
* Save the svc_serv in the transport so that it can
* be referenced when the session backchannel is initialized
*/
xprt->bc_serv = serv;
}
#else
static inline int nfs_minorversion_callback_svc_setup(u32 minorversion,
struct svc_serv *serv, struct rpc_xprt *xprt,
struct svc_rqst **rqstpp, int (**callback_svc)(void *vrqstp))
static int nfs41_callback_up_net(struct svc_serv *serv, struct net *net)
{
return 0;
}
static void nfs_minorversion_callback_svc_setup(struct svc_serv *serv,
struct svc_rqst **rqstpp, int (**callback_svc)(void *vrqstp))
{
*rqstpp = ERR_PTR(-ENOTSUPP);
*callback_svc = ERR_PTR(-ENOTSUPP);
}
static inline void nfs_callback_bc_serv(u32 minorversion, struct rpc_xprt *xprt,
struct nfs_callback_data *cb_info)
struct svc_serv *serv)
{
}
#endif /* CONFIG_NFS_V4_1 */
/*
* Bring up the callback thread if it is not already up.
*/
int nfs_callback_up(u32 minorversion, struct rpc_xprt *xprt)
static int nfs_callback_start_svc(int minorversion, struct rpc_xprt *xprt,
struct svc_serv *serv)
{
struct svc_serv *serv = NULL;
struct svc_rqst *rqstp;
int (*callback_svc)(void *vrqstp);
struct nfs_callback_data *cb_info = &nfs_callback_info[minorversion];
char svc_name[12];
int ret = 0;
int minorversion_setup;
struct net *net = &init_net;
int ret;
mutex_lock(&nfs_callback_mutex);
if (cb_info->users++ || cb_info->task != NULL) {
nfs_callback_bc_serv(minorversion, xprt, cb_info);
goto out;
}
serv = svc_create(&nfs4_callback_program, NFS4_CALLBACK_BUFSIZE, NULL);
if (!serv) {
ret = -ENOMEM;
goto out_err;
}
/* As there is only one thread we need to over-ride the
* default maximum of 80 connections
*/
serv->sv_maxconn = 1024;
nfs_callback_bc_serv(minorversion, xprt, serv);
ret = svc_bind(serv, net);
if (ret < 0) {
printk(KERN_WARNING "NFS: bind callback service failed\n");
goto out_err;
}
if (cb_info->task)
return 0;
minorversion_setup = nfs_minorversion_callback_svc_setup(minorversion,
serv, xprt, &rqstp, &callback_svc);
if (!minorversion_setup) {
switch (minorversion) {
case 0:
/* v4.0 callback setup */
rqstp = nfs4_callback_up(serv, xprt);
rqstp = nfs4_callback_up(serv);
callback_svc = nfs4_callback_svc;
break;
default:
nfs_minorversion_callback_svc_setup(serv,
&rqstp, &callback_svc);
}
if (IS_ERR(rqstp)) {
ret = PTR_ERR(rqstp);
goto out_err;
}
if (IS_ERR(rqstp))
return PTR_ERR(rqstp);
svc_sock_update_bufs(serv);
......@@ -276,41 +253,165 @@ int nfs_callback_up(u32 minorversion, struct rpc_xprt *xprt)
svc_exit_thread(cb_info->rqst);
cb_info->rqst = NULL;
cb_info->task = NULL;
goto out_err;
return PTR_ERR(cb_info->task);
}
dprintk("nfs_callback_up: service started\n");
return 0;
}
static void nfs_callback_down_net(u32 minorversion, struct svc_serv *serv, struct net *net)
{
struct nfs_net *nn = net_generic(net, nfs_net_id);
if (--nn->cb_users[minorversion])
return;
dprintk("NFS: destroy per-net callback data; net=%p\n", net);
svc_shutdown_net(serv, net);
}
static int nfs_callback_up_net(int minorversion, struct svc_serv *serv, struct net *net)
{
struct nfs_net *nn = net_generic(net, nfs_net_id);
int ret;
if (nn->cb_users[minorversion]++)
return 0;
dprintk("NFS: create per-net callback data; net=%p\n", net);
ret = svc_bind(serv, net);
if (ret < 0) {
printk(KERN_WARNING "NFS: bind callback service failed\n");
goto err_bind;
}
switch (minorversion) {
case 0:
ret = nfs4_callback_up_net(serv, net);
break;
case 1:
ret = nfs41_callback_up_net(serv, net);
break;
default:
printk(KERN_ERR "NFS: unknown callback version: %d\n",
minorversion);
ret = -EINVAL;
break;
}
out:
if (ret < 0) {
printk(KERN_ERR "NFS: callback service start failed\n");
goto err_socks;
}
return 0;
err_socks:
svc_rpcb_cleanup(serv, net);
err_bind:
dprintk("NFS: Couldn't create callback socket: err = %d; "
"net = %p\n", ret, net);
return ret;
}
static struct svc_serv *nfs_callback_create_svc(int minorversion)
{
struct nfs_callback_data *cb_info = &nfs_callback_info[minorversion];
struct svc_serv *serv;
/*
* Check whether we're already up and running.
*/
if (cb_info->task) {
/*
* Note: increase service usage, because later in case of error
* svc_destroy() will be called.
*/
svc_get(cb_info->serv);
return cb_info->serv;
}
/*
* Sanity check: if there's no task,
* we should be the first user ...
*/
if (cb_info->users)
printk(KERN_WARNING "nfs_callback_create_svc: no kthread, %d users??\n",
cb_info->users);
serv = svc_create(&nfs4_callback_program, NFS4_CALLBACK_BUFSIZE, NULL);
if (!serv) {
printk(KERN_ERR "nfs_callback_create_svc: create service failed\n");
return ERR_PTR(-ENOMEM);
}
/* As there is only one thread we need to over-ride the
* default maximum of 80 connections
*/
serv->sv_maxconn = 1024;
dprintk("nfs_callback_create_svc: service created\n");
return serv;
}
/*
* Bring up the callback thread if it is not already up.
*/
int nfs_callback_up(u32 minorversion, struct rpc_xprt *xprt)
{
struct svc_serv *serv;
struct nfs_callback_data *cb_info = &nfs_callback_info[minorversion];
int ret;
struct net *net = xprt->xprt_net;
mutex_lock(&nfs_callback_mutex);
serv = nfs_callback_create_svc(minorversion);
if (IS_ERR(serv)) {
ret = PTR_ERR(serv);
goto err_create;
}
ret = nfs_callback_up_net(minorversion, serv, net);
if (ret < 0)
goto err_net;
ret = nfs_callback_start_svc(minorversion, xprt, serv);
if (ret < 0)
goto err_start;
cb_info->users++;
/*
* svc_create creates the svc_serv with sv_nrthreads == 1, and then
* svc_prepare_thread increments that. So we need to call svc_destroy
* on both success and failure so that the refcount is 1 when the
* thread exits.
*/
if (serv)
svc_destroy(serv);
err_net:
svc_destroy(serv);
err_create:
mutex_unlock(&nfs_callback_mutex);
return ret;
out_err:
dprintk("NFS: Couldn't create callback socket or server thread; "
"err = %d\n", ret);
cb_info->users--;
if (serv)
svc_shutdown_net(serv, net);
goto out;
err_start:
nfs_callback_down_net(minorversion, serv, net);
dprintk("NFS: Couldn't create server thread; err = %d\n", ret);
goto err_net;
}
/*
* Kill the callback thread if it's no longer being used.
*/
void nfs_callback_down(int minorversion)
void nfs_callback_down(int minorversion, struct net *net)
{
struct nfs_callback_data *cb_info = &nfs_callback_info[minorversion];
mutex_lock(&nfs_callback_mutex);
nfs_callback_down_net(minorversion, cb_info->serv, net);
cb_info->users--;
if (cb_info->users == 0 && cb_info->task != NULL) {
kthread_stop(cb_info->task);
svc_shutdown_net(cb_info->serv, &init_net);
dprintk("nfs_callback_down: service stopped\n");
svc_exit_thread(cb_info->rqst);
dprintk("nfs_callback_down: service destroyed\n");
cb_info->serv = NULL;
cb_info->rqst = NULL;
cb_info->task = NULL;
......
......@@ -194,7 +194,7 @@ extern __be32 nfs4_callback_recall(struct cb_recallargs *args, void *dummy,
struct cb_process_state *cps);
#if IS_ENABLED(CONFIG_NFS_V4)
extern int nfs_callback_up(u32 minorversion, struct rpc_xprt *xprt);
extern void nfs_callback_down(int minorversion);
extern void nfs_callback_down(int minorversion, struct net *net);
extern int nfs4_validate_delegation_stateid(struct nfs_delegation *delegation,
const nfs4_stateid *stateid);
extern int nfs4_set_callback_sessionid(struct nfs_client *clp);
......@@ -209,6 +209,5 @@ extern int nfs4_set_callback_sessionid(struct nfs_client *clp);
extern unsigned int nfs_callback_set_tcpport;
extern unsigned short nfs_callback_tcpport;
extern unsigned short nfs_callback_tcpport6;
#endif /* __LINUX_FS_NFS_CALLBACK_H */
......@@ -122,7 +122,15 @@ static struct pnfs_layout_hdr * get_layout_by_fh_locked(struct nfs_client *clp,
ino = igrab(lo->plh_inode);
if (!ino)
continue;
get_layout_hdr(lo);
spin_lock(&ino->i_lock);
/* Is this layout in the process of being freed? */
if (NFS_I(ino)->layout != lo) {
spin_unlock(&ino->i_lock);
iput(ino);
continue;
}
pnfs_get_layout_hdr(lo);
spin_unlock(&ino->i_lock);
return lo;
}
}
......@@ -158,7 +166,7 @@ static u32 initiate_file_draining(struct nfs_client *clp,
ino = lo->plh_inode;
spin_lock(&ino->i_lock);
if (test_bit(NFS_LAYOUT_BULK_RECALL, &lo->plh_flags) ||
mark_matching_lsegs_invalid(lo, &free_me_list,
pnfs_mark_matching_lsegs_invalid(lo, &free_me_list,
&args->cbl_range))
rv = NFS4ERR_DELAY;
else
......@@ -166,7 +174,7 @@ static u32 initiate_file_draining(struct nfs_client *clp,
pnfs_set_layout_stateid(lo, &args->cbl_stateid, true);
spin_unlock(&ino->i_lock);
pnfs_free_lseg_list(&free_me_list);
put_layout_hdr(lo);
pnfs_put_layout_hdr(lo);
iput(ino);
return rv;
}
......@@ -196,9 +204,18 @@ static u32 initiate_bulk_draining(struct nfs_client *clp,
continue;
list_for_each_entry(lo, &server->layouts, plh_layouts) {
if (!igrab(lo->plh_inode))
ino = igrab(lo->plh_inode);
if (!ino)
continue;
spin_lock(&ino->i_lock);
/* Is this layout in the process of being freed? */
if (NFS_I(ino)->layout != lo) {
spin_unlock(&ino->i_lock);
iput(ino);
continue;
get_layout_hdr(lo);
}
pnfs_get_layout_hdr(lo);
spin_unlock(&ino->i_lock);
BUG_ON(!list_empty(&lo->plh_bulk_recall));
list_add(&lo->plh_bulk_recall, &recall_list);
}
......@@ -211,12 +228,12 @@ static u32 initiate_bulk_draining(struct nfs_client *clp,
ino = lo->plh_inode;
spin_lock(&ino->i_lock);
set_bit(NFS_LAYOUT_BULK_RECALL, &lo->plh_flags);
if (mark_matching_lsegs_invalid(lo, &free_me_list, &range))
if (pnfs_mark_matching_lsegs_invalid(lo, &free_me_list, &range))
rv = NFS4ERR_DELAY;
list_del_init(&lo->plh_bulk_recall);
spin_unlock(&ino->i_lock);
pnfs_free_lseg_list(&free_me_list);
put_layout_hdr(lo);
pnfs_put_layout_hdr(lo);
iput(ino);
}
return rv;
......
......@@ -93,10 +93,10 @@ static struct nfs_subversion *find_nfs_version(unsigned int version)
spin_unlock(&nfs_version_lock);
return nfs;
}
};
}
spin_unlock(&nfs_version_lock);
return ERR_PTR(-EPROTONOSUPPORT);;
return ERR_PTR(-EPROTONOSUPPORT);
}
struct nfs_subversion *get_nfs_version(unsigned int version)
......@@ -498,7 +498,8 @@ nfs_get_client(const struct nfs_client_initdata *cl_init,
return nfs_found_client(cl_init, clp);
}
if (new) {
list_add(&new->cl_share_link, &nn->nfs_client_list);
list_add_tail(&new->cl_share_link,
&nn->nfs_client_list);
spin_unlock(&nn->nfs_client_lock);
new->cl_flags = cl_init->init_flags;
return rpc_ops->init_client(new, timeparms, ip_addr,
......@@ -668,7 +669,8 @@ int nfs_init_server_rpcclient(struct nfs_server *server,
{
struct nfs_client *clp = server->nfs_client;
server->client = rpc_clone_client(clp->cl_rpcclient);
server->client = rpc_clone_client_set_auth(clp->cl_rpcclient,
pseudoflavour);
if (IS_ERR(server->client)) {
dprintk("%s: couldn't create rpc_client!\n", __func__);
return PTR_ERR(server->client);
......@@ -678,16 +680,6 @@ int nfs_init_server_rpcclient(struct nfs_server *server,
timeo,
sizeof(server->client->cl_timeout_default));
server->client->cl_timeout = &server->client->cl_timeout_default;
if (pseudoflavour != clp->cl_rpcclient->cl_auth->au_flavor) {
struct rpc_auth *auth;
auth = rpcauth_create(pseudoflavour, server->client);
if (IS_ERR(auth)) {
dprintk("%s: couldn't create credcache!\n", __func__);
return PTR_ERR(auth);
}
}
server->client->cl_softrtry = 0;
if (server->flags & NFS_MOUNT_SOFT)
server->client->cl_softrtry = 1;
......@@ -761,6 +753,8 @@ static int nfs_init_server(struct nfs_server *server,
data->timeo, data->retrans);
if (data->flags & NFS_MOUNT_NORESVPORT)
set_bit(NFS_CS_NORESVPORT, &cl_init.init_flags);
if (server->options & NFS_OPTION_MIGRATION)
set_bit(NFS_CS_MIGRATION, &cl_init.init_flags);
/* Allocate or find a client reference we can use */
clp = nfs_get_client(&cl_init, &timeparms, NULL, RPC_AUTH_UNIX);
......@@ -855,7 +849,6 @@ static void nfs_server_set_fsinfo(struct nfs_server *server,
if (server->wsize > NFS_MAX_FILE_IO_SIZE)
server->wsize = NFS_MAX_FILE_IO_SIZE;
server->wpages = (server->wsize + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
server->pnfs_blksize = fsinfo->blksize;
server->wtmult = nfs_block_bits(fsinfo->wtmult, NULL);
......
......@@ -2072,7 +2072,7 @@ static void nfs_access_add_rbtree(struct inode *inode, struct nfs_access_entry *
nfs_access_free_entry(entry);
}
static void nfs_access_add_cache(struct inode *inode, struct nfs_access_entry *set)
void nfs_access_add_cache(struct inode *inode, struct nfs_access_entry *set)
{
struct nfs_access_entry *cache = kmalloc(sizeof(*cache), GFP_KERNEL);
if (cache == NULL)
......@@ -2098,6 +2098,20 @@ static void nfs_access_add_cache(struct inode *inode, struct nfs_access_entry *s
spin_unlock(&nfs_access_lru_lock);
}
}
EXPORT_SYMBOL_GPL(nfs_access_add_cache);
void nfs_access_set_mask(struct nfs_access_entry *entry, u32 access_result)
{
entry->mask = 0;
if (access_result & NFS4_ACCESS_READ)
entry->mask |= MAY_READ;
if (access_result &
(NFS4_ACCESS_MODIFY | NFS4_ACCESS_EXTEND | NFS4_ACCESS_DELETE))
entry->mask |= MAY_WRITE;
if (access_result & (NFS4_ACCESS_LOOKUP|NFS4_ACCESS_EXECUTE))
entry->mask |= MAY_EXEC;
}
EXPORT_SYMBOL_GPL(nfs_access_set_mask);
static int nfs_do_access(struct inode *inode, struct rpc_cred *cred, int mask)
{
......
......@@ -46,6 +46,7 @@
#include <linux/kref.h>
#include <linux/slab.h>
#include <linux/task_io_accounting_ops.h>
#include <linux/module.h>
#include <linux/nfs_fs.h>
#include <linux/nfs_page.h>
......@@ -78,6 +79,7 @@ struct nfs_direct_req {
atomic_t io_count; /* i/os we're waiting for */
spinlock_t lock; /* protect completion state */
ssize_t count, /* bytes actually processed */
bytes_left, /* bytes left to be sent */
error; /* any reported error */
struct completion completion; /* wait for i/o completion */
......@@ -190,6 +192,12 @@ static void nfs_direct_req_release(struct nfs_direct_req *dreq)
kref_put(&dreq->kref, nfs_direct_req_free);
}
ssize_t nfs_dreq_bytes_left(struct nfs_direct_req *dreq)
{
return dreq->bytes_left;
}
EXPORT_SYMBOL_GPL(nfs_dreq_bytes_left);
/*
* Collects and returns the final error value/byte-count.
*/
......@@ -390,6 +398,7 @@ static ssize_t nfs_direct_read_schedule_segment(struct nfs_pageio_descriptor *de
user_addr += req_len;
pos += req_len;
count -= req_len;
dreq->bytes_left -= req_len;
}
/* The nfs_page now hold references to these pages */
nfs_direct_release_pages(pagevec, npages);
......@@ -450,23 +459,28 @@ static ssize_t nfs_direct_read(struct kiocb *iocb, const struct iovec *iov,
ssize_t result = -ENOMEM;
struct inode *inode = iocb->ki_filp->f_mapping->host;
struct nfs_direct_req *dreq;
struct nfs_lock_context *l_ctx;
dreq = nfs_direct_req_alloc();
if (dreq == NULL)
goto out;
dreq->inode = inode;
dreq->bytes_left = iov_length(iov, nr_segs);
dreq->ctx = get_nfs_open_context(nfs_file_open_context(iocb->ki_filp));
dreq->l_ctx = nfs_get_lock_context(dreq->ctx);
if (dreq->l_ctx == NULL)
l_ctx = nfs_get_lock_context(dreq->ctx);
if (IS_ERR(l_ctx)) {
result = PTR_ERR(l_ctx);
goto out_release;
}
dreq->l_ctx = l_ctx;
if (!is_sync_kiocb(iocb))
dreq->iocb = iocb;
NFS_I(inode)->read_io += iov_length(iov, nr_segs);
result = nfs_direct_read_schedule_iovec(dreq, iov, nr_segs, pos, uio);
if (!result)
result = nfs_direct_wait(dreq);
NFS_I(inode)->read_io += result;
out_release:
nfs_direct_req_release(dreq);
out:
......@@ -706,6 +720,7 @@ static ssize_t nfs_direct_write_schedule_segment(struct nfs_pageio_descriptor *d
user_addr += req_len;
pos += req_len;
count -= req_len;
dreq->bytes_left -= req_len;
}
/* The nfs_page now hold references to these pages */
nfs_direct_release_pages(pagevec, npages);
......@@ -814,6 +829,7 @@ static ssize_t nfs_direct_write_schedule_iovec(struct nfs_direct_req *dreq,
get_dreq(dreq);
atomic_inc(&inode->i_dio_count);
NFS_I(dreq->inode)->write_io += iov_length(iov, nr_segs);
for (seg = 0; seg < nr_segs; seg++) {
const struct iovec *vec = &iov[seg];
result = nfs_direct_write_schedule_segment(&desc, vec, pos, uio);
......@@ -825,7 +841,6 @@ static ssize_t nfs_direct_write_schedule_iovec(struct nfs_direct_req *dreq,
pos += vec->iov_len;
}
nfs_pageio_complete(&desc);
NFS_I(dreq->inode)->write_io += desc.pg_bytes_written;
/*
* If no bytes were started, return the error, and let the
......@@ -849,16 +864,21 @@ static ssize_t nfs_direct_write(struct kiocb *iocb, const struct iovec *iov,
ssize_t result = -ENOMEM;
struct inode *inode = iocb->ki_filp->f_mapping->host;
struct nfs_direct_req *dreq;
struct nfs_lock_context *l_ctx;
dreq = nfs_direct_req_alloc();
if (!dreq)
goto out;
dreq->inode = inode;
dreq->bytes_left = count;
dreq->ctx = get_nfs_open_context(nfs_file_open_context(iocb->ki_filp));
dreq->l_ctx = nfs_get_lock_context(dreq->ctx);
if (dreq->l_ctx == NULL)
l_ctx = nfs_get_lock_context(dreq->ctx);
if (IS_ERR(l_ctx)) {
result = PTR_ERR(l_ctx);
goto out_release;
}
dreq->l_ctx = l_ctx;
if (!is_sync_kiocb(iocb))
dreq->iocb = iocb;
......
......@@ -259,7 +259,7 @@ nfs_file_fsync_commit(struct file *file, loff_t start, loff_t end, int datasync)
struct dentry *dentry = file->f_path.dentry;
struct nfs_open_context *ctx = nfs_file_open_context(file);
struct inode *inode = dentry->d_inode;
int have_error, status;
int have_error, do_resend, status;
int ret = 0;
dprintk("NFS: fsync file(%s/%s) datasync %d\n",
......@@ -267,15 +267,23 @@ nfs_file_fsync_commit(struct file *file, loff_t start, loff_t end, int datasync)
datasync);
nfs_inc_stats(inode, NFSIOS_VFSFSYNC);
do_resend = test_and_clear_bit(NFS_CONTEXT_RESEND_WRITES, &ctx->flags);
have_error = test_and_clear_bit(NFS_CONTEXT_ERROR_WRITE, &ctx->flags);
status = nfs_commit_inode(inode, FLUSH_SYNC);
if (status >= 0 && ret < 0)
status = ret;
have_error |= test_bit(NFS_CONTEXT_ERROR_WRITE, &ctx->flags);
if (have_error)
if (have_error) {
ret = xchg(&ctx->error, 0);
if (!ret && status < 0)
if (ret)
goto out;
}
if (status < 0) {
ret = status;
goto out;
}
do_resend |= test_bit(NFS_CONTEXT_RESEND_WRITES, &ctx->flags);
if (do_resend)
ret = -EAGAIN;
out:
return ret;
}
EXPORT_SYMBOL_GPL(nfs_file_fsync_commit);
......@@ -286,13 +294,22 @@ nfs_file_fsync(struct file *file, loff_t start, loff_t end, int datasync)
int ret;
struct inode *inode = file->f_path.dentry->d_inode;
ret = filemap_write_and_wait_range(inode->i_mapping, start, end);
if (ret != 0)
goto out;
mutex_lock(&inode->i_mutex);
ret = nfs_file_fsync_commit(file, start, end, datasync);
mutex_unlock(&inode->i_mutex);
out:
do {
ret = filemap_write_and_wait_range(inode->i_mapping, start, end);
if (ret != 0)
break;
mutex_lock(&inode->i_mutex);
ret = nfs_file_fsync_commit(file, start, end, datasync);
mutex_unlock(&inode->i_mutex);
/*
* If nfs_file_fsync_commit detected a server reboot, then
* resend all dirty pages that might have been covered by
* the NFS_CONTEXT_RESEND_WRITES flag
*/
start = 0;
end = LLONG_MAX;
} while (ret == -EAGAIN);
return ret;
}
......
......@@ -32,6 +32,8 @@
#include <asm/uaccess.h>
#include "internal.h"
#define NFSDBG_FACILITY NFSDBG_CLIENT
/*
......
......@@ -55,18 +55,19 @@
static const struct cred *id_resolver_cache;
static struct key_type key_type_id_resolver_legacy;
struct idmap {
struct rpc_pipe *idmap_pipe;
struct key_construction *idmap_key_cons;
struct mutex idmap_mutex;
};
struct idmap_legacy_upcalldata {
struct rpc_pipe_msg pipe_msg;
struct idmap_msg idmap_msg;
struct key_construction *key_cons;
struct idmap *idmap;
};
struct idmap {
struct rpc_pipe *idmap_pipe;
struct idmap_legacy_upcalldata *idmap_upcall_data;
struct mutex idmap_mutex;
};
/**
* nfs_fattr_init_names - initialise the nfs_fattr owner_name/group_name fields
* @fattr: fully initialised struct nfs_fattr
......@@ -158,7 +159,7 @@ static int nfs_map_string_to_numeric(const char *name, size_t namelen, __u32 *re
return 0;
memcpy(buf, name, namelen);
buf[namelen] = '\0';
if (strict_strtoul(buf, 0, &val) != 0)
if (kstrtoul(buf, 0, &val) != 0)
return 0;
*res = val;
return 1;
......@@ -330,7 +331,6 @@ static ssize_t nfs_idmap_get_key(const char *name, size_t namelen,
ret = nfs_idmap_request_key(&key_type_id_resolver_legacy,
name, namelen, type, data,
data_size, idmap);
idmap->idmap_key_cons = NULL;
mutex_unlock(&idmap->idmap_mutex);
}
return ret;
......@@ -364,7 +364,7 @@ static int nfs_idmap_lookup_id(const char *name, size_t namelen, const char *typ
if (data_size <= 0) {
ret = -EINVAL;
} else {
ret = strict_strtol(id_str, 10, &id_long);
ret = kstrtol(id_str, 10, &id_long);
*id = (__u32)id_long;
}
return ret;
......@@ -465,8 +465,6 @@ nfs_idmap_new(struct nfs_client *clp)
struct rpc_pipe *pipe;
int error;
BUG_ON(clp->cl_idmap != NULL);
idmap = kzalloc(sizeof(*idmap), GFP_KERNEL);
if (idmap == NULL)
return -ENOMEM;
......@@ -510,7 +508,6 @@ static int __rpc_pipefs_event(struct nfs_client *clp, unsigned long event,
switch (event) {
case RPC_PIPEFS_MOUNT:
BUG_ON(clp->cl_rpcclient->cl_dentry == NULL);
err = __nfs_idmap_register(clp->cl_rpcclient->cl_dentry,
clp->cl_idmap,
clp->cl_idmap->idmap_pipe);
......@@ -632,9 +629,6 @@ static int nfs_idmap_prepare_message(char *desc, struct idmap *idmap,
substring_t substr;
int token, ret;
memset(im, 0, sizeof(*im));
memset(msg, 0, sizeof(*msg));
im->im_type = IDMAP_TYPE_GROUP;
token = match_token(desc, nfs_idmap_tokens, &substr);
......@@ -665,6 +659,35 @@ static int nfs_idmap_prepare_message(char *desc, struct idmap *idmap,
return ret;
}
static bool
nfs_idmap_prepare_pipe_upcall(struct idmap *idmap,
struct idmap_legacy_upcalldata *data)
{
if (idmap->idmap_upcall_data != NULL) {
WARN_ON_ONCE(1);
return false;
}
idmap->idmap_upcall_data = data;
return true;
}
static void
nfs_idmap_complete_pipe_upcall_locked(struct idmap *idmap, int ret)
{
struct key_construction *cons = idmap->idmap_upcall_data->key_cons;
kfree(idmap->idmap_upcall_data);
idmap->idmap_upcall_data = NULL;
complete_request_key(cons, ret);
}
static void
nfs_idmap_abort_pipe_upcall(struct idmap *idmap, int ret)
{
if (idmap->idmap_upcall_data != NULL)
nfs_idmap_complete_pipe_upcall_locked(idmap, ret);
}
static int nfs_idmap_legacy_upcall(struct key_construction *cons,
const char *op,
void *aux)
......@@ -677,29 +700,28 @@ static int nfs_idmap_legacy_upcall(struct key_construction *cons,
int ret = -ENOMEM;
/* msg and im are freed in idmap_pipe_destroy_msg */
data = kmalloc(sizeof(*data), GFP_KERNEL);
data = kzalloc(sizeof(*data), GFP_KERNEL);
if (!data)
goto out1;
msg = &data->pipe_msg;
im = &data->idmap_msg;
data->idmap = idmap;
data->key_cons = cons;
ret = nfs_idmap_prepare_message(key->description, idmap, im, msg);
if (ret < 0)
goto out2;
BUG_ON(idmap->idmap_key_cons != NULL);
idmap->idmap_key_cons = cons;
ret = -EAGAIN;
if (!nfs_idmap_prepare_pipe_upcall(idmap, data))
goto out2;
ret = rpc_queue_upcall(idmap->idmap_pipe, msg);
if (ret < 0)
goto out3;
nfs_idmap_abort_pipe_upcall(idmap, ret);
return ret;
out3:
idmap->idmap_key_cons = NULL;
out2:
kfree(data);
out1:
......@@ -714,21 +736,32 @@ static int nfs_idmap_instantiate(struct key *key, struct key *authkey, char *dat
authkey);
}
static int nfs_idmap_read_message(struct idmap_msg *im, struct key *key, struct key *authkey)
static int nfs_idmap_read_and_verify_message(struct idmap_msg *im,
struct idmap_msg *upcall,
struct key *key, struct key *authkey)
{
char id_str[NFS_UINT_MAXLEN];
int ret = -EINVAL;
int ret = -ENOKEY;
/* ret = -ENOKEY */
if (upcall->im_type != im->im_type || upcall->im_conv != im->im_conv)
goto out;
switch (im->im_conv) {
case IDMAP_CONV_NAMETOID:
if (strcmp(upcall->im_name, im->im_name) != 0)
break;
sprintf(id_str, "%d", im->im_id);
ret = nfs_idmap_instantiate(key, authkey, id_str);
break;
case IDMAP_CONV_IDTONAME:
if (upcall->im_id != im->im_id)
break;
ret = nfs_idmap_instantiate(key, authkey, im->im_name);
break;
default:
ret = -EINVAL;
}
out:
return ret;
}
......@@ -740,14 +773,16 @@ idmap_pipe_downcall(struct file *filp, const char __user *src, size_t mlen)
struct key_construction *cons;
struct idmap_msg im;
size_t namelen_in;
int ret;
int ret = -ENOKEY;
/* If instantiation is successful, anyone waiting for key construction
* will have been woken up and someone else may now have used
* idmap_key_cons - so after this point we may no longer touch it.
*/
cons = ACCESS_ONCE(idmap->idmap_key_cons);
idmap->idmap_key_cons = NULL;
if (idmap->idmap_upcall_data == NULL)
goto out_noupcall;
cons = idmap->idmap_upcall_data->key_cons;
if (mlen != sizeof(im)) {
ret = -ENOSPC;
......@@ -768,16 +803,19 @@ idmap_pipe_downcall(struct file *filp, const char __user *src, size_t mlen)
if (namelen_in == 0 || namelen_in == IDMAP_NAMESZ) {
ret = -EINVAL;
goto out;
}
}
ret = nfs_idmap_read_message(&im, cons->key, cons->authkey);
ret = nfs_idmap_read_and_verify_message(&im,
&idmap->idmap_upcall_data->idmap_msg,
cons->key, cons->authkey);
if (ret >= 0) {
key_set_timeout(cons->key, nfs_idmap_cache_timeout);
ret = mlen;
}
out:
complete_request_key(cons, ret);
nfs_idmap_complete_pipe_upcall_locked(idmap, ret);
out_noupcall:
return ret;
}
......@@ -788,14 +826,9 @@ idmap_pipe_destroy_msg(struct rpc_pipe_msg *msg)
struct idmap_legacy_upcalldata,
pipe_msg);
struct idmap *idmap = data->idmap;
struct key_construction *cons;
if (msg->errno) {
cons = ACCESS_ONCE(idmap->idmap_key_cons);
idmap->idmap_key_cons = NULL;
complete_request_key(cons, msg->errno);
}
/* Free memory allocated in nfs_idmap_legacy_upcall() */
kfree(data);
if (msg->errno)
nfs_idmap_abort_pipe_upcall(idmap, msg->errno);
}
static void
......@@ -803,7 +836,8 @@ idmap_release_pipe(struct inode *inode)
{
struct rpc_inode *rpci = RPC_I(inode);
struct idmap *idmap = (struct idmap *)rpci->private;
idmap->idmap_key_cons = NULL;
nfs_idmap_abort_pipe_upcall(idmap, -EPIPE);
}
int nfs_map_name_to_uid(const struct nfs_server *server, const char *name, size_t namelen, __u32 *uid)
......
......@@ -547,8 +547,8 @@ EXPORT_SYMBOL_GPL(nfs_getattr);
static void nfs_init_lock_context(struct nfs_lock_context *l_ctx)
{
atomic_set(&l_ctx->count, 1);
l_ctx->lockowner = current->files;
l_ctx->pid = current->tgid;
l_ctx->lockowner.l_owner = current->files;
l_ctx->lockowner.l_pid = current->tgid;
INIT_LIST_HEAD(&l_ctx->list);
}
......@@ -557,9 +557,9 @@ static struct nfs_lock_context *__nfs_find_lock_context(struct nfs_open_context
struct nfs_lock_context *pos;
list_for_each_entry(pos, &ctx->lock_context.list, list) {
if (pos->lockowner != current->files)
if (pos->lockowner.l_owner != current->files)
continue;
if (pos->pid != current->tgid)
if (pos->lockowner.l_pid != current->tgid)
continue;
atomic_inc(&pos->count);
return pos;
......@@ -578,7 +578,7 @@ struct nfs_lock_context *nfs_get_lock_context(struct nfs_open_context *ctx)
spin_unlock(&inode->i_lock);
new = kmalloc(sizeof(*new), GFP_KERNEL);
if (new == NULL)
return NULL;
return ERR_PTR(-ENOMEM);
nfs_init_lock_context(new);
spin_lock(&inode->i_lock);
res = __nfs_find_lock_context(ctx);
......
......@@ -101,11 +101,11 @@ struct nfs_client_initdata {
*/
struct nfs_parsed_mount_data {
int flags;
int rsize, wsize;
int timeo, retrans;
int acregmin, acregmax,
unsigned int rsize, wsize;
unsigned int timeo, retrans;
unsigned int acregmin, acregmax,
acdirmin, acdirmax;
int namlen;
unsigned int namlen;
unsigned int options;
unsigned int bsize;
unsigned int auth_flavor_len;
......@@ -464,6 +464,7 @@ static inline void nfs_inode_dio_wait(struct inode *inode)
{
inode_dio_wait(inode);
}
extern ssize_t nfs_dreq_bytes_left(struct nfs_direct_req *dreq);
/* nfs4proc.c */
extern void __nfs4_read_done_cb(struct nfs_read_data *);
......@@ -483,6 +484,12 @@ extern int _nfs4_call_sync_session(struct rpc_clnt *clnt,
struct nfs4_sequence_args *args,
struct nfs4_sequence_res *res,
int cache_reply);
extern int nfs40_walk_client_list(struct nfs_client *clp,
struct nfs_client **result,
struct rpc_cred *cred);
extern int nfs41_walk_client_list(struct nfs_client *clp,
struct nfs_client **result,
struct rpc_cred *cred);
/*
* Determine the device name as a string
......
......@@ -5,6 +5,7 @@
#ifndef __NFS_NETNS_H__
#define __NFS_NETNS_H__
#include <linux/nfs4.h>
#include <net/net_namespace.h>
#include <net/netns/generic.h>
......@@ -22,6 +23,9 @@ struct nfs_net {
struct list_head nfs_volume_list;
#if IS_ENABLED(CONFIG_NFS_V4)
struct idr cb_ident_idr; /* Protected by nfs_client_lock */
unsigned short nfs_callback_tcpport;
unsigned short nfs_callback_tcpport6;
int cb_users[NFS4_MAX_MINOR_VERSION + 1];
#endif
spinlock_t nfs_client_lock;
struct timespec boot_time;
......
......@@ -132,8 +132,8 @@ struct nfs4_lock_owner {
struct nfs4_lock_state {
struct list_head ls_locks; /* Other lock stateids */
struct nfs4_state * ls_state; /* Pointer to open state */
#define NFS_LOCK_INITIALIZED 1
int ls_flags;
#define NFS_LOCK_INITIALIZED 0
unsigned long ls_flags;
struct nfs_seqid_counter ls_seqid;
nfs4_stateid ls_stateid;
atomic_t ls_count;
......@@ -191,6 +191,8 @@ struct nfs4_state_recovery_ops {
int (*establish_clid)(struct nfs_client *, struct rpc_cred *);
struct rpc_cred * (*get_clid_cred)(struct nfs_client *);
int (*reclaim_complete)(struct nfs_client *);
int (*detect_trunking)(struct nfs_client *, struct nfs_client **,
struct rpc_cred *);
};
struct nfs4_state_maintenance_ops {
......@@ -223,7 +225,7 @@ extern int nfs4_proc_exchange_id(struct nfs_client *clp, struct rpc_cred *cred);
extern int nfs4_destroy_clientid(struct nfs_client *clp);
extern int nfs4_init_clientid(struct nfs_client *, struct rpc_cred *);
extern int nfs41_init_clientid(struct nfs_client *, struct rpc_cred *);
extern int nfs4_do_close(struct nfs4_state *state, gfp_t gfp_mask, int wait, bool roc);
extern int nfs4_do_close(struct nfs4_state *state, gfp_t gfp_mask, int wait);
extern int nfs4_server_capabilities(struct nfs_server *server, struct nfs_fh *fhandle);
extern int nfs4_proc_fs_locations(struct rpc_clnt *, struct inode *, const struct qstr *,
struct nfs4_fs_locations *, struct page *);
......@@ -320,9 +322,15 @@ extern void nfs4_renew_state(struct work_struct *);
/* nfs4state.c */
struct rpc_cred *nfs4_get_setclientid_cred(struct nfs_client *clp);
struct rpc_cred *nfs4_get_renew_cred_locked(struct nfs_client *clp);
int nfs4_discover_server_trunking(struct nfs_client *clp,
struct nfs_client **);
int nfs40_discover_server_trunking(struct nfs_client *clp,
struct nfs_client **, struct rpc_cred *);
#if defined(CONFIG_NFS_V4_1)
struct rpc_cred *nfs4_get_machine_cred_locked(struct nfs_client *clp);
struct rpc_cred *nfs4_get_exchange_id_cred(struct nfs_client *clp);
int nfs41_discover_server_trunking(struct nfs_client *clp,
struct nfs_client **, struct rpc_cred *);
extern void nfs4_schedule_session_recovery(struct nfs4_session *, int);
#else
static inline void nfs4_schedule_session_recovery(struct nfs4_session *session, int err)
......@@ -351,7 +359,7 @@ extern void nfs41_handle_server_scope(struct nfs_client *,
extern void nfs4_put_lock_state(struct nfs4_lock_state *lsp);
extern int nfs4_set_lock_state(struct nfs4_state *state, struct file_lock *fl);
extern void nfs4_select_rw_stateid(nfs4_stateid *, struct nfs4_state *,
fmode_t, fl_owner_t, pid_t);
fmode_t, const struct nfs_lockowner *);
extern struct nfs_seqid *nfs_alloc_seqid(struct nfs_seqid_counter *counter, gfp_t gfp_mask);
extern int nfs_wait_on_sequence(struct nfs_seqid *seqid, struct rpc_task *task);
......@@ -372,6 +380,9 @@ extern bool nfs4_disable_idmapping;
extern unsigned short max_session_slots;
extern unsigned short send_implementation_id;
#define NFS4_CLIENT_ID_UNIQ_LEN (64)
extern char nfs4_client_id_uniquifier[NFS4_CLIENT_ID_UNIQ_LEN];
/* nfs4sysctl.c */
#ifdef CONFIG_SYSCTL
int nfs4_register_sysctl(void);
......
......@@ -84,7 +84,7 @@ struct nfs_client *nfs4_alloc_client(const struct nfs_client_initdata *cl_init)
static void nfs4_destroy_callback(struct nfs_client *clp)
{
if (__test_and_clear_bit(NFS_CS_CALLBACK, &clp->cl_res_state))
nfs_callback_down(clp->cl_mvops->minor_version);
nfs_callback_down(clp->cl_mvops->minor_version, clp->cl_net);
}
static void nfs4_shutdown_client(struct nfs_client *clp)
......@@ -185,6 +185,7 @@ struct nfs_client *nfs4_init_client(struct nfs_client *clp,
rpc_authflavor_t authflavour)
{
char buf[INET6_ADDRSTRLEN + 1];
struct nfs_client *old;
int error;
if (clp->cl_cons_state == NFS_CS_READY) {
......@@ -230,6 +231,17 @@ struct nfs_client *nfs4_init_client(struct nfs_client *clp,
if (!nfs4_has_session(clp))
nfs_mark_client_ready(clp, NFS_CS_READY);
error = nfs4_discover_server_trunking(clp, &old);
if (error < 0)
goto error;
if (clp != old) {
clp->cl_preserve_clid = true;
nfs_put_client(clp);
clp = old;
atomic_inc(&clp->cl_count);
}
return clp;
error:
......@@ -239,6 +251,248 @@ struct nfs_client *nfs4_init_client(struct nfs_client *clp,
return ERR_PTR(error);
}
/*
* SETCLIENTID just did a callback update with the callback ident in
* "drop," but server trunking discovery claims "drop" and "keep" are
* actually the same server. Swap the callback IDs so that "keep"
* will continue to use the callback ident the server now knows about,
* and so that "keep"'s original callback ident is destroyed when
* "drop" is freed.
*/
static void nfs4_swap_callback_idents(struct nfs_client *keep,
struct nfs_client *drop)
{
struct nfs_net *nn = net_generic(keep->cl_net, nfs_net_id);
unsigned int save = keep->cl_cb_ident;
if (keep->cl_cb_ident == drop->cl_cb_ident)
return;
dprintk("%s: keeping callback ident %u and dropping ident %u\n",
__func__, keep->cl_cb_ident, drop->cl_cb_ident);
spin_lock(&nn->nfs_client_lock);
idr_replace(&nn->cb_ident_idr, keep, drop->cl_cb_ident);
keep->cl_cb_ident = drop->cl_cb_ident;
idr_replace(&nn->cb_ident_idr, drop, save);
drop->cl_cb_ident = save;
spin_unlock(&nn->nfs_client_lock);
}
/**
* nfs40_walk_client_list - Find server that recognizes a client ID
*
* @new: nfs_client with client ID to test
* @result: OUT: found nfs_client, or new
* @cred: credential to use for trunking test
*
* Returns zero, a negative errno, or a negative NFS4ERR status.
* If zero is returned, an nfs_client pointer is planted in "result."
*
* NB: nfs40_walk_client_list() relies on the new nfs_client being
* the last nfs_client on the list.
*/
int nfs40_walk_client_list(struct nfs_client *new,
struct nfs_client **result,
struct rpc_cred *cred)
{
struct nfs_net *nn = net_generic(new->cl_net, nfs_net_id);
struct nfs_client *pos, *n, *prev = NULL;
struct nfs4_setclientid_res clid = {
.clientid = new->cl_clientid,
.confirm = new->cl_confirm,
};
int status;
spin_lock(&nn->nfs_client_lock);
list_for_each_entry_safe(pos, n, &nn->nfs_client_list, cl_share_link) {
/* If "pos" isn't marked ready, we can't trust the
* remaining fields in "pos" */
if (pos->cl_cons_state < NFS_CS_READY)
continue;
if (pos->rpc_ops != new->rpc_ops)
continue;
if (pos->cl_proto != new->cl_proto)
continue;
if (pos->cl_minorversion != new->cl_minorversion)
continue;
if (pos->cl_clientid != new->cl_clientid)
continue;
atomic_inc(&pos->cl_count);
spin_unlock(&nn->nfs_client_lock);
if (prev)
nfs_put_client(prev);
status = nfs4_proc_setclientid_confirm(pos, &clid, cred);
if (status == 0) {
nfs4_swap_callback_idents(pos, new);
nfs_put_client(pos);
*result = pos;
dprintk("NFS: <-- %s using nfs_client = %p ({%d})\n",
__func__, pos, atomic_read(&pos->cl_count));
return 0;
}
if (status != -NFS4ERR_STALE_CLIENTID) {
nfs_put_client(pos);
dprintk("NFS: <-- %s status = %d, no result\n",
__func__, status);
return status;
}
spin_lock(&nn->nfs_client_lock);
prev = pos;
}
/*
* No matching nfs_client found. This should be impossible,
* because the new nfs_client has already been added to
* nfs_client_list by nfs_get_client().
*
* Don't BUG(), since the caller is holding a mutex.
*/
if (prev)
nfs_put_client(prev);
spin_unlock(&nn->nfs_client_lock);
pr_err("NFS: %s Error: no matching nfs_client found\n", __func__);
return -NFS4ERR_STALE_CLIENTID;
}
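A concrete scenario (addresses hypothetical): mounts of 192.0.2.1 and 192.0.2.2 each perform SETCLIENTID and receive the same clientid4. The walk above then issues SETCLIENTID_CONFIRM for the new client's (clientid, confirm) pair against the already-known nfs_client; if the server confirms it, the two addresses are trunks of the same server, so the caller discards the new nfs_client and shares state through the existing one, after nfs4_swap_callback_idents() preserves the callback ident the server last saw.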
#ifdef CONFIG_NFS_V4_1
/*
* Returns true if the client IDs match
*/
static bool nfs4_match_clientids(struct nfs_client *a, struct nfs_client *b)
{
if (a->cl_clientid != b->cl_clientid) {
dprintk("NFS: --> %s client ID %llx does not match %llx\n",
__func__, a->cl_clientid, b->cl_clientid);
return false;
}
dprintk("NFS: --> %s client ID %llx matches %llx\n",
__func__, a->cl_clientid, b->cl_clientid);
return true;
}
/*
* Returns true if the server owners match
*/
static bool
nfs4_match_serverowners(struct nfs_client *a, struct nfs_client *b)
{
struct nfs41_server_owner *o1 = a->cl_serverowner;
struct nfs41_server_owner *o2 = b->cl_serverowner;
if (o1->minor_id != o2->minor_id) {
dprintk("NFS: --> %s server owner minor IDs do not match\n",
__func__);
return false;
}
if (o1->major_id_sz != o2->major_id_sz)
goto out_major_mismatch;
if (memcmp(o1->major_id, o2->major_id, o1->major_id_sz) != 0)
goto out_major_mismatch;
dprintk("NFS: --> %s server owners match\n", __func__);
return true;
out_major_mismatch:
dprintk("NFS: --> %s server owner major IDs do not match\n",
__func__);
return false;
}
/**
* nfs41_walk_client_list - Find nfs_client that matches a client/server owner
*
* @new: nfs_client with client ID to test
* @result: OUT: found nfs_client, or new
* @cred: credential to use for trunking test
*
* Returns zero, a negative errno, or a negative NFS4ERR status.
* If zero is returned, an nfs_client pointer is planted in "result."
*
* NB: nfs41_walk_client_list() relies on the new nfs_client being
* the last nfs_client on the list.
*/
int nfs41_walk_client_list(struct nfs_client *new,
struct nfs_client **result,
struct rpc_cred *cred)
{
struct nfs_net *nn = net_generic(new->cl_net, nfs_net_id);
struct nfs_client *pos, *n, *prev = NULL;
int error;
spin_lock(&nn->nfs_client_lock);
list_for_each_entry_safe(pos, n, &nn->nfs_client_list, cl_share_link) {
/* If "pos" isn't marked ready, we can't trust the
* remaining fields in "pos", especially the client
* ID and serverowner fields. Wait for CREATE_SESSION
* to finish. */
if (pos->cl_cons_state < NFS_CS_READY) {
atomic_inc(&pos->cl_count);
spin_unlock(&nn->nfs_client_lock);
if (prev)
nfs_put_client(prev);
prev = pos;
error = nfs_wait_client_init_complete(pos);
if (error < 0) {
nfs_put_client(pos);
spin_lock(&nn->nfs_client_lock);
continue;
}
spin_lock(&nn->nfs_client_lock);
}
if (pos->rpc_ops != new->rpc_ops)
continue;
if (pos->cl_proto != new->cl_proto)
continue;
if (pos->cl_minorversion != new->cl_minorversion)
continue;
if (!nfs4_match_clientids(pos, new))
continue;
if (!nfs4_match_serverowners(pos, new))
continue;
spin_unlock(&nn->nfs_client_lock);
dprintk("NFS: <-- %s using nfs_client = %p ({%d})\n",
__func__, pos, atomic_read(&pos->cl_count));
*result = pos;
return 0;
}
/*
* No matching nfs_client found. This should be impossible,
* because the new nfs_client has already been added to
* nfs_client_list by nfs_get_client().
*
* Don't BUG(), since the caller is holding a mutex.
*/
spin_unlock(&nn->nfs_client_lock);
pr_err("NFS: %s Error: no matching nfs_client found\n", __func__);
return -NFS4ERR_STALE_CLIENTID;
}
#endif /* CONFIG_NFS_V4_1 */
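NFSv4.1 needs no extra RPC for this test: EXCHANGE_ID has already returned the server's identity, so the walk above declares two nfs_clients to be the same server exactly when the clientid4 values match and the server_owner major and minor IDs match, per the two helper predicates.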
static void nfs4_destroy_server(struct nfs_server *server)
{
nfs_server_return_all_delegations(server);
......
......@@ -95,16 +95,25 @@ nfs4_file_fsync(struct file *file, loff_t start, loff_t end, int datasync)
int ret;
struct inode *inode = file->f_path.dentry->d_inode;
ret = filemap_write_and_wait_range(inode->i_mapping, start, end);
if (ret != 0)
goto out;
mutex_lock(&inode->i_mutex);
ret = nfs_file_fsync_commit(file, start, end, datasync);
if (!ret && !datasync)
/* application has asked for meta-data sync */
ret = pnfs_layoutcommit_inode(inode, true);
mutex_unlock(&inode->i_mutex);
out:
do {
ret = filemap_write_and_wait_range(inode->i_mapping, start, end);
if (ret != 0)
break;
mutex_lock(&inode->i_mutex);
ret = nfs_file_fsync_commit(file, start, end, datasync);
if (!ret && !datasync)
/* application has asked for meta-data sync */
ret = pnfs_layoutcommit_inode(inode, true);
mutex_unlock(&inode->i_mutex);
/*
* If nfs_file_fsync_commit detected a server reboot, then
* resend all dirty pages that might have been covered by
* the NFS_CONTEXT_RESEND_WRITES flag
*/
start = 0;
end = LLONG_MAX;
} while (ret == -EAGAIN);
return ret;
}
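The loop closes the fsync()/fdatasync() hole around server reboots: when a commit's write verifier no longer matches (see the nfs_commit_release_pages() hunk further down, which now sets NFS_CONTEXT_RESEND_WRITES), the affected pages are redirtied and the commit path reports -EAGAIN, so the range is widened to the whole file (0 .. LLONG_MAX) and the writeback-plus-commit pass repeats until it completes without detecting a reboot.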
......
......@@ -190,8 +190,6 @@ static int filelayout_async_handle_error(struct rpc_task *task,
* i/o and all i/o waiting on the slot table to the MDS until
* layout is destroyed and a new valid layout is obtained.
*/
set_bit(NFS_LAYOUT_INVALID,
&NFS_I(inode)->layout->plh_flags);
pnfs_destroy_layout(NFS_I(inode));
rpc_wake_up(&tbl->slot_tbl_waitq);
goto reset;
......@@ -205,7 +203,7 @@ static int filelayout_async_handle_error(struct rpc_task *task,
case -EPIPE:
dprintk("%s DS connection error %d\n", __func__,
task->tk_status);
filelayout_mark_devid_invalid(devid);
nfs4_mark_deviceid_unavailable(devid);
clear_bit(NFS_INO_LAYOUTCOMMIT, &NFS_I(inode)->flags);
_pnfs_return_layout(inode);
rpc_wake_up(&tbl->slot_tbl_waitq);
......@@ -269,6 +267,21 @@ filelayout_set_layoutcommit(struct nfs_write_data *wdata)
(unsigned long) NFS_I(hdr->inode)->layout->plh_lwb);
}
bool
filelayout_test_devid_unavailable(struct nfs4_deviceid_node *node)
{
return filelayout_test_devid_invalid(node) ||
nfs4_test_deviceid_unavailable(node);
}
static bool
filelayout_reset_to_mds(struct pnfs_layout_segment *lseg)
{
struct nfs4_deviceid_node *node = FILELAYOUT_DEVID_NODE(lseg);
return filelayout_test_devid_unavailable(node);
}
/*
* Call ops for the async read/write cases
* In the case of dense layouts, the offset needs to be reset to its
......@@ -453,7 +466,7 @@ static void filelayout_commit_release(void *calldata)
struct nfs_commit_data *data = calldata;
data->completion_ops->completion(data);
put_lseg(data->lseg);
pnfs_put_lseg(data->lseg);
nfs_put_client(data->ds_clp);
nfs_commitdata_release(data);
}
......@@ -608,13 +621,13 @@ filelayout_check_layout(struct pnfs_layout_hdr *lo,
d = nfs4_find_get_deviceid(NFS_SERVER(lo->plh_inode)->pnfs_curr_ld,
NFS_SERVER(lo->plh_inode)->nfs_client, id);
if (d == NULL) {
dsaddr = get_device_info(lo->plh_inode, id, gfp_flags);
dsaddr = filelayout_get_device_info(lo->plh_inode, id, gfp_flags);
if (dsaddr == NULL)
goto out;
} else
dsaddr = container_of(d, struct nfs4_file_layout_dsaddr, id_node);
/* Found deviceid is being reaped */
if (test_bit(NFS_DEVICEID_INVALID, &dsaddr->id_node.flags))
/* Found deviceid is unavailable */
if (filelayout_test_devid_unavailable(&dsaddr->id_node))
goto out_put;
fl->dsaddr = dsaddr;
......@@ -931,7 +944,7 @@ filelayout_pg_init_write(struct nfs_pageio_descriptor *pgio,
nfs_init_cinfo(&cinfo, pgio->pg_inode, pgio->pg_dreq);
status = filelayout_alloc_commit_info(pgio->pg_lseg, &cinfo, GFP_NOFS);
if (status < 0) {
put_lseg(pgio->pg_lseg);
pnfs_put_lseg(pgio->pg_lseg);
pgio->pg_lseg = NULL;
goto out_mds;
}
......@@ -985,7 +998,7 @@ filelayout_clear_request_commit(struct nfs_page *req,
out:
nfs_request_remove_commit_list(req, cinfo);
spin_unlock(cinfo->lock);
put_lseg(freeme);
pnfs_put_lseg(freeme);
}
static struct list_head *
......@@ -1018,7 +1031,7 @@ filelayout_choose_commit_list(struct nfs_page *req,
* off due to a rewrite, in which case it will be done in
* filelayout_clear_request_commit
*/
buckets[i].wlseg = get_lseg(lseg);
buckets[i].wlseg = pnfs_get_lseg(lseg);
}
set_bit(PG_COMMIT_TO_DS, &req->wb_flags);
cinfo->ds->nwritten++;
......@@ -1128,7 +1141,7 @@ filelayout_scan_ds_commit_list(struct pnfs_commit_bucket *bucket,
if (list_empty(src))
bucket->wlseg = NULL;
else
get_lseg(bucket->clseg);
pnfs_get_lseg(bucket->clseg);
}
return ret;
}
......@@ -1159,12 +1172,12 @@ static void filelayout_recover_commit_reqs(struct list_head *dst,
/* NOTE cinfo->lock is NOT held, relying on fact that this is
* only called on single thread per dreq.
* Can't take the lock because need to do put_lseg
* Can't take the lock because need to do pnfs_put_lseg
*/
for (i = 0, b = cinfo->ds->buckets; i < cinfo->ds->nbuckets; i++, b++) {
if (transfer_commit_list(&b->written, dst, cinfo, 0)) {
BUG_ON(!list_empty(&b->written));
put_lseg(b->wlseg);
pnfs_put_lseg(b->wlseg);
b->wlseg = NULL;
}
}
......@@ -1200,7 +1213,7 @@ alloc_ds_commits(struct nfs_commit_info *cinfo, struct list_head *list)
if (list_empty(&bucket->committing))
continue;
nfs_retry_commit(&bucket->committing, bucket->clseg, cinfo);
put_lseg(bucket->clseg);
pnfs_put_lseg(bucket->clseg);
bucket->clseg = NULL;
}
/* Caller will clean up entries put on list */
......
......@@ -128,24 +128,14 @@ filelayout_mark_devid_invalid(struct nfs4_deviceid_node *node)
set_bit(NFS_DEVICEID_INVALID, &node->flags);
}
static inline bool
filelayout_test_layout_invalid(struct pnfs_layout_hdr *lo)
{
return test_bit(NFS_LAYOUT_INVALID, &lo->plh_flags);
}
static inline bool
filelayout_test_devid_invalid(struct nfs4_deviceid_node *node)
{
return test_bit(NFS_DEVICEID_INVALID, &node->flags);
}
static inline bool
filelayout_reset_to_mds(struct pnfs_layout_segment *lseg)
{
return filelayout_test_devid_invalid(FILELAYOUT_DEVID_NODE(lseg)) ||
filelayout_test_layout_invalid(lseg->pls_layout);
}
extern bool
filelayout_test_devid_unavailable(struct nfs4_deviceid_node *node);
extern struct nfs_fh *
nfs4_fl_select_ds_fh(struct pnfs_layout_segment *lseg, u32 j);
......@@ -158,7 +148,7 @@ struct nfs4_pnfs_ds *nfs4_fl_prepare_ds(struct pnfs_layout_segment *lseg,
extern void nfs4_fl_put_deviceid(struct nfs4_file_layout_dsaddr *dsaddr);
extern void nfs4_fl_free_deviceid(struct nfs4_file_layout_dsaddr *dsaddr);
struct nfs4_file_layout_dsaddr *
get_device_info(struct inode *inode, struct nfs4_deviceid *dev_id, gfp_t gfp_flags);
filelayout_get_device_info(struct inode *inode, struct nfs4_deviceid *dev_id, gfp_t gfp_flags);
void nfs4_ds_disconnect(struct nfs_client *clp);
#endif /* FS_NFS_NFS4FILELAYOUT_H */
......@@ -690,7 +690,7 @@ decode_and_add_device(struct inode *inode, struct pnfs_device *dev, gfp_t gfp_fl
* of available devices, and return it.
*/
struct nfs4_file_layout_dsaddr *
get_device_info(struct inode *inode, struct nfs4_deviceid *dev_id, gfp_t gfp_flags)
filelayout_get_device_info(struct inode *inode, struct nfs4_deviceid *dev_id, gfp_t gfp_flags)
{
struct pnfs_device *pdev = NULL;
u32 max_resp_sz;
......@@ -804,13 +804,14 @@ nfs4_fl_prepare_ds(struct pnfs_layout_segment *lseg, u32 ds_idx)
struct nfs4_pnfs_ds *ds = dsaddr->ds_list[ds_idx];
struct nfs4_deviceid_node *devid = FILELAYOUT_DEVID_NODE(lseg);
if (filelayout_test_devid_invalid(devid))
if (filelayout_test_devid_unavailable(devid))
return NULL;
if (ds == NULL) {
printk(KERN_ERR "NFS: %s: No data server for offset index %d\n",
__func__, ds_idx);
goto mark_dev_invalid;
filelayout_mark_devid_invalid(devid);
return NULL;
}
if (!ds->ds_clp) {
......@@ -818,14 +819,12 @@ nfs4_fl_prepare_ds(struct pnfs_layout_segment *lseg, u32 ds_idx)
int err;
err = nfs4_ds_connect(s, ds);
if (err)
goto mark_dev_invalid;
if (err) {
nfs4_mark_deviceid_unavailable(devid);
return NULL;
}
}
return ds;
mark_dev_invalid:
filelayout_mark_devid_invalid(devid);
return NULL;
}
module_param(dataserver_retrans, uint, 0644);
......
......@@ -192,25 +192,13 @@ static rpc_authflavor_t nfs4_negotiate_security(struct inode *inode, struct qstr
struct rpc_clnt *nfs4_create_sec_client(struct rpc_clnt *clnt, struct inode *inode,
struct qstr *name)
{
struct rpc_clnt *clone;
struct rpc_auth *auth;
rpc_authflavor_t flavor;
flavor = nfs4_negotiate_security(inode, name);
if ((int)flavor < 0)
return ERR_PTR(flavor);
return ERR_PTR((int)flavor);
clone = rpc_clone_client(clnt);
if (IS_ERR(clone))
return clone;
auth = rpcauth_create(flavor, clone);
if (!auth) {
rpc_shutdown_client(clone);
clone = ERR_PTR(-EIO);
}
return clone;
return rpc_clone_client_set_auth(clnt, flavor);
}
static struct vfsmount *try_location(struct nfs_clone_mount *mountdata,
......
(This diff has been collapsed.)
......@@ -51,18 +51,21 @@
#include <linux/bitops.h>
#include <linux/jiffies.h>
#include <linux/sunrpc/clnt.h>
#include "nfs4_fs.h"
#include "callback.h"
#include "delegation.h"
#include "internal.h"
#include "pnfs.h"
#include "netns.h"
#define NFSDBG_FACILITY NFSDBG_STATE
#define OPENOWNER_POOL_SIZE 8
const nfs4_stateid zero_stateid;
static DEFINE_MUTEX(nfs_clid_init_mutex);
static LIST_HEAD(nfs4_clientid_list);
int nfs4_init_clientid(struct nfs_client *clp, struct rpc_cred *cred)
......@@ -73,12 +76,13 @@ int nfs4_init_clientid(struct nfs_client *clp, struct rpc_cred *cred)
};
unsigned short port;
int status;
struct nfs_net *nn = net_generic(clp->cl_net, nfs_net_id);
if (test_bit(NFS4CLNT_LEASE_CONFIRM, &clp->cl_state))
goto do_confirm;
port = nfs_callback_tcpport;
port = nn->nfs_callback_tcpport;
if (clp->cl_addr.ss_family == AF_INET6)
port = nfs_callback_tcpport6;
port = nn->nfs_callback_tcpport6;
status = nfs4_proc_setclientid(clp, NFS4_CALLBACK, port, cred, &clid);
if (status != 0)
......@@ -96,6 +100,56 @@ int nfs4_init_clientid(struct nfs_client *clp, struct rpc_cred *cred)
return status;
}
/**
* nfs40_discover_server_trunking - Detect server IP address trunking (mv0)
*
* @clp: nfs_client under test
* @result: OUT: found nfs_client, or clp
* @cred: credential to use for trunking test
*
* Returns zero, a negative errno, or a negative NFS4ERR status.
* If zero is returned, an nfs_client pointer is planted in
* "result".
*
* Note: The returned client may not yet be marked ready.
*/
int nfs40_discover_server_trunking(struct nfs_client *clp,
struct nfs_client **result,
struct rpc_cred *cred)
{
struct nfs4_setclientid_res clid = {
.clientid = clp->cl_clientid,
.confirm = clp->cl_confirm,
};
struct nfs_net *nn = net_generic(clp->cl_net, nfs_net_id);
unsigned short port;
int status;
port = nn->nfs_callback_tcpport;
if (clp->cl_addr.ss_family == AF_INET6)
port = nn->nfs_callback_tcpport6;
status = nfs4_proc_setclientid(clp, NFS4_CALLBACK, port, cred, &clid);
if (status != 0)
goto out;
clp->cl_clientid = clid.clientid;
clp->cl_confirm = clid.confirm;
status = nfs40_walk_client_list(clp, result, cred);
switch (status) {
case -NFS4ERR_STALE_CLIENTID:
set_bit(NFS4CLNT_LEASE_CONFIRM, &clp->cl_state);
case 0:
/* Sustain the lease, even if it's empty. If the clientid4
* goes stale it's of no use for trunking discovery. */
nfs4_schedule_state_renewal(*result);
break;
}
out:
return status;
}
struct rpc_cred *nfs4_get_machine_cred_locked(struct nfs_client *clp)
{
struct rpc_cred *cred = NULL;
......@@ -275,6 +329,33 @@ int nfs41_init_clientid(struct nfs_client *clp, struct rpc_cred *cred)
return status;
}
/**
* nfs41_discover_server_trunking - Detect server IP address trunking (mv1)
*
* @clp: nfs_client under test
* @result: OUT: found nfs_client, or clp
* @cred: credential to use for trunking test
*
* Returns NFS4_OK, a negative errno, or a negative NFS4ERR status.
* If NFS4_OK is returned, an nfs_client pointer is planted in
* "result".
*
* Note: The returned client may not yet be marked ready.
*/
int nfs41_discover_server_trunking(struct nfs_client *clp,
struct nfs_client **result,
struct rpc_cred *cred)
{
int status;
status = nfs4_proc_exchange_id(clp, cred);
if (status != NFS4_OK)
return status;
set_bit(NFS4CLNT_LEASE_CONFIRM, &clp->cl_state);
return nfs41_walk_client_list(clp, result, cred);
}
struct rpc_cred *nfs4_get_exchange_id_cred(struct nfs_client *clp)
{
struct rpc_cred *cred;
......@@ -729,11 +810,8 @@ static void __nfs4_close(struct nfs4_state *state,
if (!call_close) {
nfs4_put_open_state(state);
nfs4_put_state_owner(owner);
} else {
bool roc = pnfs_roc(state->inode);
nfs4_do_close(state, gfp_mask, wait, roc);
}
} else
nfs4_do_close(state, gfp_mask, wait);
}
void nfs4_close_state(struct nfs4_state *state, fmode_t fmode)
......@@ -865,7 +943,7 @@ void nfs4_put_lock_state(struct nfs4_lock_state *lsp)
if (list_empty(&state->lock_states))
clear_bit(LK_STATE_IN_USE, &state->flags);
spin_unlock(&state->state_lock);
if (lsp->ls_flags & NFS_LOCK_INITIALIZED) {
if (test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags)) {
if (nfs4_release_lockowner(lsp) == 0)
return;
}
......@@ -911,17 +989,25 @@ int nfs4_set_lock_state(struct nfs4_state *state, struct file_lock *fl)
}
static bool nfs4_copy_lock_stateid(nfs4_stateid *dst, struct nfs4_state *state,
fl_owner_t fl_owner, pid_t fl_pid)
const struct nfs_lockowner *lockowner)
{
struct nfs4_lock_state *lsp;
fl_owner_t fl_owner;
pid_t fl_pid;
bool ret = false;
if (lockowner == NULL)
goto out;
if (test_bit(LK_STATE_IN_USE, &state->flags) == 0)
goto out;
fl_owner = lockowner->l_owner;
fl_pid = lockowner->l_pid;
spin_lock(&state->state_lock);
lsp = __nfs4_find_lock_state(state, fl_owner, fl_pid, NFS4_ANY_LOCK_TYPE);
if (lsp != NULL && (lsp->ls_flags & NFS_LOCK_INITIALIZED) != 0) {
if (lsp != NULL && test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags) != 0) {
nfs4_stateid_copy(dst, &lsp->ls_stateid);
ret = true;
}
......@@ -946,11 +1032,11 @@ static void nfs4_copy_open_stateid(nfs4_stateid *dst, struct nfs4_state *state)
* requests.
*/
void nfs4_select_rw_stateid(nfs4_stateid *dst, struct nfs4_state *state,
fmode_t fmode, fl_owner_t fl_owner, pid_t fl_pid)
fmode_t fmode, const struct nfs_lockowner *lockowner)
{
if (nfs4_copy_delegation_stateid(dst, state->inode, fmode))
return;
if (nfs4_copy_lock_stateid(dst, state, fl_owner, fl_pid))
if (nfs4_copy_lock_stateid(dst, state, lockowner))
return;
nfs4_copy_open_stateid(dst, state);
}
......@@ -1289,7 +1375,7 @@ static int nfs4_reclaim_open_state(struct nfs4_state_owner *sp, const struct nfs
if (status >= 0) {
spin_lock(&state->state_lock);
list_for_each_entry(lock, &state->lock_states, ls_locks) {
if (!(lock->ls_flags & NFS_LOCK_INITIALIZED))
if (!test_bit(NFS_LOCK_INITIALIZED, &lock->ls_flags))
pr_warn_ratelimited("NFS: "
"%s: Lock reclaim "
"failed!\n", __func__);
......@@ -1361,7 +1447,7 @@ static void nfs4_clear_open_state(struct nfs4_state *state)
spin_lock(&state->state_lock);
list_for_each_entry(lock, &state->lock_states, ls_locks) {
lock->ls_seqid.flags = 0;
lock->ls_flags &= ~NFS_LOCK_INITIALIZED;
clear_bit(NFS_LOCK_INITIALIZED, &lock->ls_flags);
}
spin_unlock(&state->state_lock);
}
......@@ -1595,8 +1681,8 @@ static int nfs4_check_lease(struct nfs_client *clp)
return nfs4_recovery_handle_error(clp, status);
}
/* Set NFS4CLNT_LEASE_EXPIRED for all v4.0 errors and for recoverable errors
* on EXCHANGE_ID for v4.1
/* Set NFS4CLNT_LEASE_EXPIRED and reclaim reboot state for all v4.0 errors
* and for recoverable errors on EXCHANGE_ID for v4.1
*/
static int nfs4_handle_reclaim_lease_error(struct nfs_client *clp, int status)
{
......@@ -1606,8 +1692,12 @@ static int nfs4_handle_reclaim_lease_error(struct nfs_client *clp, int status)
return -ESERVERFAULT;
/* Lease confirmation error: retry after purging the lease */
ssleep(1);
clear_bit(NFS4CLNT_LEASE_CONFIRM, &clp->cl_state);
break;
case -NFS4ERR_STALE_CLIENTID:
clear_bit(NFS4CLNT_LEASE_CONFIRM, &clp->cl_state);
nfs4_state_clear_reclaim_reboot(clp);
nfs4_state_start_reclaim_reboot(clp);
break;
case -NFS4ERR_CLID_INUSE:
pr_err("NFS: Server %s reports our clientid is in use\n",
......@@ -1698,6 +1788,109 @@ static int nfs4_purge_lease(struct nfs_client *clp)
return 0;
}
/**
* nfs4_discover_server_trunking - Detect server IP address trunking
*
* @clp: nfs_client under test
* @result: OUT: found nfs_client, or clp
*
* Returns zero or a negative errno. If zero is returned,
* an nfs_client pointer is planted in "result".
*
* Note: since we are invoked in process context, and
* not from inside the state manager, we cannot use
* nfs4_handle_reclaim_lease_error().
*/
int nfs4_discover_server_trunking(struct nfs_client *clp,
struct nfs_client **result)
{
const struct nfs4_state_recovery_ops *ops =
clp->cl_mvops->reboot_recovery_ops;
rpc_authflavor_t *flavors, flav, save;
struct rpc_clnt *clnt;
struct rpc_cred *cred;
int i, len, status;
dprintk("NFS: %s: testing '%s'\n", __func__, clp->cl_hostname);
len = NFS_MAX_SECFLAVORS;
flavors = kcalloc(len, sizeof(*flavors), GFP_KERNEL);
if (flavors == NULL) {
status = -ENOMEM;
goto out;
}
len = rpcauth_list_flavors(flavors, len);
if (len < 0) {
status = len;
goto out_free;
}
clnt = clp->cl_rpcclient;
save = clnt->cl_auth->au_flavor;
i = 0;
mutex_lock(&nfs_clid_init_mutex);
status = -ENOENT;
again:
cred = ops->get_clid_cred(clp);
if (cred == NULL)
goto out_unlock;
status = ops->detect_trunking(clp, result, cred);
put_rpccred(cred);
switch (status) {
case 0:
break;
case -EACCES:
if (clp->cl_machine_cred == NULL)
break;
/* Handle case where the user hasn't set up machine creds */
nfs4_clear_machine_cred(clp);
case -NFS4ERR_DELAY:
case -ETIMEDOUT:
case -EAGAIN:
ssleep(1);
dprintk("NFS: %s after status %d, retrying\n",
__func__, status);
goto again;
case -NFS4ERR_CLID_INUSE:
case -NFS4ERR_WRONGSEC:
status = -EPERM;
if (i >= len)
break;
flav = flavors[i++];
if (flav == save)
flav = flavors[i++];
clnt = rpc_clone_client_set_auth(clnt, flav);
if (IS_ERR(clnt)) {
status = PTR_ERR(clnt);
break;
}
clp->cl_rpcclient = clnt;
goto again;
case -NFS4ERR_MINOR_VERS_MISMATCH:
status = -EPROTONOSUPPORT;
break;
case -EKEYEXPIRED:
nfs4_warn_keyexpired(clp->cl_hostname);
case -NFS4ERR_NOT_SAME: /* FixMe: implement recovery
* in nfs4_exchange_id */
status = -EKEYEXPIRED;
}
out_unlock:
mutex_unlock(&nfs_clid_init_mutex);
out_free:
kfree(flavors);
out:
dprintk("NFS: %s: status = %d\n", __func__, status);
return status;
}
#ifdef CONFIG_NFS_V4_1
void nfs4_schedule_session_recovery(struct nfs4_session *session, int err)
{
......@@ -2008,6 +2201,7 @@ static void nfs4_state_manager(struct nfs_client *clp)
pr_warn_ratelimited("NFS: state manager%s%s failed on NFSv4 server %s"
" with error %d\n", section_sep, section,
clp->cl_hostname, -status);
ssleep(1);
nfs4_end_drain_session(clp);
nfs4_clear_state_manager_bit(clp);
}
......
......@@ -9,6 +9,7 @@
#include <linux/nfs_idmap.h>
#include <linux/nfs_fs.h>
#include "nfs4_fs.h"
#include "callback.h"
static const int nfs_set_port_min = 0;
......
......@@ -447,12 +447,14 @@ static int nfs4_stat_to_errno(int);
encode_sequence_maxsz + \
encode_putfh_maxsz + \
encode_open_maxsz + \
encode_access_maxsz + \
encode_getfh_maxsz + \
encode_getattr_maxsz)
#define NFS4_dec_open_sz (compound_decode_hdr_maxsz + \
decode_sequence_maxsz + \
decode_putfh_maxsz + \
decode_open_maxsz + \
decode_access_maxsz + \
decode_getfh_maxsz + \
decode_getattr_maxsz)
#define NFS4_enc_open_confirm_sz \
......@@ -467,11 +469,13 @@ static int nfs4_stat_to_errno(int);
encode_sequence_maxsz + \
encode_putfh_maxsz + \
encode_open_maxsz + \
encode_access_maxsz + \
encode_getattr_maxsz)
#define NFS4_dec_open_noattr_sz (compound_decode_hdr_maxsz + \
decode_sequence_maxsz + \
decode_putfh_maxsz + \
decode_open_maxsz + \
decode_access_maxsz + \
decode_getattr_maxsz)
#define NFS4_enc_open_downgrade_sz \
(compound_encode_hdr_maxsz + \
......@@ -1509,8 +1513,12 @@ static void encode_open_stateid(struct xdr_stream *xdr,
nfs4_stateid stateid;
if (ctx->state != NULL) {
const struct nfs_lockowner *lockowner = NULL;
if (l_ctx != NULL)
lockowner = &l_ctx->lockowner;
nfs4_select_rw_stateid(&stateid, ctx->state,
fmode, l_ctx->lockowner, l_ctx->pid);
fmode, lockowner);
if (zero_seqid)
stateid.seqid = 0;
encode_nfs4_stateid(xdr, &stateid);
......@@ -2216,6 +2224,8 @@ static void nfs4_xdr_enc_open(struct rpc_rqst *req, struct xdr_stream *xdr,
encode_putfh(xdr, args->fh, &hdr);
encode_open(xdr, args, &hdr);
encode_getfh(xdr, &hdr);
if (args->access)
encode_access(xdr, args->access, &hdr);
encode_getfattr_open(xdr, args->bitmask, args->open_bitmap, &hdr);
encode_nops(&hdr);
}
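With args->access nonzero, the OPEN compound as encoded above is, in order: SEQUENCE, PUTFH, OPEN, GETFH, ACCESS, GETATTR (the _noattr variant below simply omits GETFH). Piggy-backing ACCESS lets the client distinguish open-for-read from open-for-execute permission in the same round trip; the decode side stores the answer in access_supported/access_result.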
......@@ -2252,7 +2262,9 @@ static void nfs4_xdr_enc_open_noattr(struct rpc_rqst *req,
encode_sequence(xdr, &args->seq_args, &hdr);
encode_putfh(xdr, args->fh, &hdr);
encode_open(xdr, args, &hdr);
encode_getfattr(xdr, args->bitmask, &hdr);
if (args->access)
encode_access(xdr, args->access, &hdr);
encode_getfattr_open(xdr, args->bitmask, args->open_bitmap, &hdr);
encode_nops(&hdr);
}
......@@ -4095,7 +4107,7 @@ static int decode_change_info(struct xdr_stream *xdr, struct nfs4_change_info *c
return -EIO;
}
static int decode_access(struct xdr_stream *xdr, struct nfs4_accessres *access)
static int decode_access(struct xdr_stream *xdr, u32 *supported, u32 *access)
{
__be32 *p;
uint32_t supp, acc;
......@@ -4109,8 +4121,8 @@ static int decode_access(struct xdr_stream *xdr, struct nfs4_accessres *access)
goto out_overflow;
supp = be32_to_cpup(p++);
acc = be32_to_cpup(p);
access->supported = supp;
access->access = acc;
*supported = supp;
*access = acc;
return 0;
out_overflow:
print_overflow_msg(__func__, xdr);
......@@ -5642,7 +5654,8 @@ static int decode_getdeviceinfo(struct xdr_stream *xdr,
* and places the remaining xdr data in xdr_buf->tail
*/
pdev->mincount = be32_to_cpup(p);
xdr_read_pages(xdr, pdev->mincount); /* include space for the length */
if (xdr_read_pages(xdr, pdev->mincount) != pdev->mincount)
goto out_overflow;
/* Parse notification bitmap, verifying that it is zero. */
p = xdr_inline_decode(xdr, 4);
......@@ -5887,7 +5900,7 @@ static int nfs4_xdr_dec_access(struct rpc_rqst *rqstp, struct xdr_stream *xdr,
status = decode_putfh(xdr);
if (status != 0)
goto out;
status = decode_access(xdr, res);
status = decode_access(xdr, &res->supported, &res->access);
if (status != 0)
goto out;
decode_getfattr(xdr, res->fattr, res->server);
......@@ -6228,6 +6241,8 @@ static int nfs4_xdr_dec_open(struct rpc_rqst *rqstp, struct xdr_stream *xdr,
status = decode_getfh(xdr, &res->fh);
if (status)
goto out;
if (res->access_request)
decode_access(xdr, &res->access_supported, &res->access_result);
decode_getfattr(xdr, res->f_attr, res->server);
out:
return status;
......@@ -6276,6 +6291,8 @@ static int nfs4_xdr_dec_open_noattr(struct rpc_rqst *rqstp,
status = decode_open(xdr, res);
if (status)
goto out;
if (res->access_request)
decode_access(xdr, &res->access_supported, &res->access_result);
decode_getfattr(xdr, res->f_attr, res->server);
out:
return status;
......
......@@ -41,6 +41,7 @@
#include <scsi/osd_ore.h>
#include "objlayout.h"
#include "../internal.h"
#define NFSDBG_FACILITY NFSDBG_PNFS_LD
......@@ -606,8 +607,14 @@ static bool aligned_on_raid_stripe(u64 offset, struct ore_layout *layout,
void objio_init_write(struct nfs_pageio_descriptor *pgio, struct nfs_page *req)
{
unsigned long stripe_end = 0;
u64 wb_size;
pnfs_generic_pg_init_write(pgio, req);
if (pgio->pg_dreq == NULL)
wb_size = i_size_read(pgio->pg_inode) - req_offset(req);
else
wb_size = nfs_dreq_bytes_left(pgio->pg_dreq);
pnfs_generic_pg_init_write(pgio, req, wb_size);
if (unlikely(pgio->pg_lseg == NULL))
return; /* Not pNFS */
......
......@@ -102,6 +102,7 @@ nfs_create_request(struct nfs_open_context *ctx, struct inode *inode,
unsigned int offset, unsigned int count)
{
struct nfs_page *req;
struct nfs_lock_context *l_ctx;
/* try to allocate the request struct */
req = nfs_page_alloc();
......@@ -109,11 +110,12 @@ nfs_create_request(struct nfs_open_context *ctx, struct inode *inode,
return ERR_PTR(-ENOMEM);
/* get lock context early so we can deal with alloc failures */
req->wb_lock_context = nfs_get_lock_context(ctx);
if (req->wb_lock_context == NULL) {
l_ctx = nfs_get_lock_context(ctx);
if (IS_ERR(l_ctx)) {
nfs_page_free(req);
return ERR_PTR(-ENOMEM);
return ERR_CAST(l_ctx);
}
req->wb_lock_context = l_ctx;
/* Initialize the request struct. Initially, we assume a
* long write-back delay. This will be adjusted in
......@@ -290,7 +292,9 @@ static bool nfs_can_coalesce_requests(struct nfs_page *prev,
{
if (req->wb_context->cred != prev->wb_context->cred)
return false;
if (req->wb_lock_context->lockowner != prev->wb_lock_context->lockowner)
if (req->wb_lock_context->lockowner.l_owner != prev->wb_lock_context->lockowner.l_owner)
return false;
if (req->wb_lock_context->lockowner.l_pid != prev->wb_lock_context->lockowner.l_pid)
return false;
if (req->wb_context->state != prev->wb_context->state)
return false;
......
(This diff has been collapsed.)
......@@ -62,9 +62,6 @@ enum {
NFS_LAYOUT_RW_FAILED, /* get rw layout failed stop trying */
NFS_LAYOUT_BULK_RECALL, /* bulk recall affecting layout */
NFS_LAYOUT_ROC, /* some lseg had roc bit set */
NFS_LAYOUT_DESTROYED, /* no new use of layout allowed */
NFS_LAYOUT_INVALID, /* layout is being destroyed */
NFS_LAYOUT_RETURNED, /* layout has already been returned */
};
enum layoutdriver_policy_flags {
......@@ -140,6 +137,7 @@ struct pnfs_layout_hdr {
atomic_t plh_outstanding; /* number of RPCs out */
unsigned long plh_block_lgets; /* block LAYOUTGET if >0 */
u32 plh_barrier; /* ignore lower seqids */
unsigned long plh_retry_timestamp;
unsigned long plh_flags;
loff_t plh_lwb; /* last write byte for layoutcommit */
struct rpc_cred *plh_lc_cred; /* layoutcommit cred */
......@@ -172,12 +170,12 @@ extern int nfs4_proc_getdevicelist(struct nfs_server *server,
struct pnfs_devicelist *devlist);
extern int nfs4_proc_getdeviceinfo(struct nfs_server *server,
struct pnfs_device *dev);
extern void nfs4_proc_layoutget(struct nfs4_layoutget *lgp, gfp_t gfp_flags);
extern struct pnfs_layout_segment* nfs4_proc_layoutget(struct nfs4_layoutget *lgp, gfp_t gfp_flags);
extern int nfs4_proc_layoutreturn(struct nfs4_layoutreturn *lrp);
/* pnfs.c */
void get_layout_hdr(struct pnfs_layout_hdr *lo);
void put_lseg(struct pnfs_layout_segment *lseg);
void pnfs_get_layout_hdr(struct pnfs_layout_hdr *lo);
void pnfs_put_lseg(struct pnfs_layout_segment *lseg);
void pnfs_pageio_init_read(struct nfs_pageio_descriptor *, struct inode *,
const struct nfs_pgio_completion_ops *);
......@@ -188,28 +186,29 @@ void set_pnfs_layoutdriver(struct nfs_server *, const struct nfs_fh *, u32);
void unset_pnfs_layoutdriver(struct nfs_server *);
void pnfs_generic_pg_init_read(struct nfs_pageio_descriptor *, struct nfs_page *);
int pnfs_generic_pg_readpages(struct nfs_pageio_descriptor *desc);
void pnfs_generic_pg_init_write(struct nfs_pageio_descriptor *, struct nfs_page *);
void pnfs_generic_pg_init_write(struct nfs_pageio_descriptor *pgio,
struct nfs_page *req, u64 wb_size);
int pnfs_generic_pg_writepages(struct nfs_pageio_descriptor *desc);
bool pnfs_generic_pg_test(struct nfs_pageio_descriptor *pgio, struct nfs_page *prev, struct nfs_page *req);
void pnfs_set_lo_fail(struct pnfs_layout_segment *lseg);
int pnfs_layout_process(struct nfs4_layoutget *lgp);
struct pnfs_layout_segment *pnfs_layout_process(struct nfs4_layoutget *lgp);
void pnfs_free_lseg_list(struct list_head *tmp_list);
void pnfs_destroy_layout(struct nfs_inode *);
void pnfs_destroy_all_layouts(struct nfs_client *);
void put_layout_hdr(struct pnfs_layout_hdr *lo);
void pnfs_put_layout_hdr(struct pnfs_layout_hdr *lo);
void pnfs_set_layout_stateid(struct pnfs_layout_hdr *lo,
const nfs4_stateid *new,
bool update_barrier);
int pnfs_choose_layoutget_stateid(nfs4_stateid *dst,
struct pnfs_layout_hdr *lo,
struct nfs4_state *open_state);
int mark_matching_lsegs_invalid(struct pnfs_layout_hdr *lo,
int pnfs_mark_matching_lsegs_invalid(struct pnfs_layout_hdr *lo,
struct list_head *tmp_list,
struct pnfs_layout_range *recall_range);
bool pnfs_roc(struct inode *ino);
void pnfs_roc_release(struct inode *ino);
void pnfs_roc_set_barrier(struct inode *ino, u32 barrier);
bool pnfs_roc_drain(struct inode *ino, u32 *barrier);
bool pnfs_roc_drain(struct inode *ino, u32 *barrier, struct rpc_task *task);
void pnfs_set_layoutcommit(struct nfs_write_data *wdata);
void pnfs_cleanup_layoutcommit(struct nfs4_layoutcommit_data *data);
int pnfs_layoutcommit_inode(struct inode *inode, bool sync);
......@@ -233,6 +232,7 @@ struct nfs4_threshold *pnfs_mdsthreshold_alloc(void);
/* nfs4_deviceid_flags */
enum {
NFS_DEVICEID_INVALID = 0, /* set when MDS clientid recalled */
NFS_DEVICEID_UNAVAILABLE, /* device temporarily unavailable */
};
/* pnfs_dev.c */
......@@ -242,6 +242,7 @@ struct nfs4_deviceid_node {
const struct pnfs_layoutdriver_type *ld;
const struct nfs_client *nfs_client;
unsigned long flags;
unsigned long timestamp_unavailable;
struct nfs4_deviceid deviceid;
atomic_t ref;
};
......@@ -254,34 +255,12 @@ void nfs4_init_deviceid_node(struct nfs4_deviceid_node *,
const struct nfs4_deviceid *);
struct nfs4_deviceid_node *nfs4_insert_deviceid_node(struct nfs4_deviceid_node *);
bool nfs4_put_deviceid_node(struct nfs4_deviceid_node *);
void nfs4_mark_deviceid_unavailable(struct nfs4_deviceid_node *node);
bool nfs4_test_deviceid_unavailable(struct nfs4_deviceid_node *node);
void nfs4_deviceid_purge_client(const struct nfs_client *);
static inline void
pnfs_mark_layout_returned(struct pnfs_layout_hdr *lo)
{
set_bit(NFS_LAYOUT_RETURNED, &lo->plh_flags);
}
static inline void
pnfs_clear_layout_returned(struct pnfs_layout_hdr *lo)
{
clear_bit(NFS_LAYOUT_RETURNED, &lo->plh_flags);
}
static inline bool
pnfs_test_layout_returned(struct pnfs_layout_hdr *lo)
{
return test_bit(NFS_LAYOUT_RETURNED, &lo->plh_flags);
}
static inline int lo_fail_bit(u32 iomode)
{
return iomode == IOMODE_RW ?
NFS_LAYOUT_RW_FAILED : NFS_LAYOUT_RO_FAILED;
}
static inline struct pnfs_layout_segment *
get_lseg(struct pnfs_layout_segment *lseg)
pnfs_get_lseg(struct pnfs_layout_segment *lseg)
{
if (lseg) {
atomic_inc(&lseg->pls_refcount);
......@@ -406,12 +385,12 @@ static inline void pnfs_destroy_layout(struct nfs_inode *nfsi)
}
static inline struct pnfs_layout_segment *
get_lseg(struct pnfs_layout_segment *lseg)
pnfs_get_lseg(struct pnfs_layout_segment *lseg)
{
return NULL;
}
static inline void put_lseg(struct pnfs_layout_segment *lseg)
static inline void pnfs_put_lseg(struct pnfs_layout_segment *lseg)
{
}
......@@ -443,7 +422,7 @@ pnfs_roc_set_barrier(struct inode *ino, u32 barrier)
}
static inline bool
pnfs_roc_drain(struct inode *ino, u32 *barrier)
pnfs_roc_drain(struct inode *ino, u32 *barrier, struct rpc_task *task)
{
return false;
}
......
......@@ -40,6 +40,8 @@
#define NFS4_DEVICE_ID_HASH_SIZE (1 << NFS4_DEVICE_ID_HASH_BITS)
#define NFS4_DEVICE_ID_HASH_MASK (NFS4_DEVICE_ID_HASH_SIZE - 1)
#define PNFS_DEVICE_RETRY_TIMEOUT (120*HZ)
static struct hlist_head nfs4_deviceid_cache[NFS4_DEVICE_ID_HASH_SIZE];
static DEFINE_SPINLOCK(nfs4_deviceid_lock);
......@@ -218,6 +220,30 @@ nfs4_put_deviceid_node(struct nfs4_deviceid_node *d)
}
EXPORT_SYMBOL_GPL(nfs4_put_deviceid_node);
void
nfs4_mark_deviceid_unavailable(struct nfs4_deviceid_node *node)
{
node->timestamp_unavailable = jiffies;
set_bit(NFS_DEVICEID_UNAVAILABLE, &node->flags);
}
EXPORT_SYMBOL_GPL(nfs4_mark_deviceid_unavailable);
bool
nfs4_test_deviceid_unavailable(struct nfs4_deviceid_node *node)
{
if (test_bit(NFS_DEVICEID_UNAVAILABLE, &node->flags)) {
unsigned long start, end;
end = jiffies;
start = end - PNFS_DEVICE_RETRY_TIMEOUT;
if (time_in_range(node->timestamp_unavailable, start, end))
return true;
clear_bit(NFS_DEVICEID_UNAVAILABLE, &node->flags);
}
return false;
}
EXPORT_SYMBOL_GPL(nfs4_test_deviceid_unavailable);
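Together these helpers give layout drivers a self-expiring failure flag. A minimal usage sketch, mirroring nfs4_fl_prepare_ds() elsewhere in this series:

	/* Before dispatching I/O through a data server: */
	if (nfs4_test_deviceid_unavailable(devid))
		return NULL;	/* caller redirects I/O through the MDS */

	/* After a failed data-server connect: */
	nfs4_mark_deviceid_unavailable(devid);

Because nfs4_test_deviceid_unavailable() clears the bit once PNFS_DEVICE_RETRY_TIMEOUT (120 seconds) has elapsed, the device is retried automatically after two minutes instead of staying failed forever, and time_in_range() keeps the window arithmetic safe across jiffies wraparound.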
static void
_deviceid_purge_client(const struct nfs_client *clp, long hash)
{
......@@ -276,3 +302,4 @@ nfs4_deviceid_mark_client_invalid(struct nfs_client *clp)
}
rcu_read_unlock();
}
......@@ -88,6 +88,7 @@ enum {
Opt_sharecache, Opt_nosharecache,
Opt_resvport, Opt_noresvport,
Opt_fscache, Opt_nofscache,
Opt_migration, Opt_nomigration,
/* Mount options that take integer arguments */
Opt_port,
......@@ -147,6 +148,8 @@ static const match_table_t nfs_mount_option_tokens = {
{ Opt_noresvport, "noresvport" },
{ Opt_fscache, "fsc" },
{ Opt_nofscache, "nofsc" },
{ Opt_migration, "migration" },
{ Opt_nomigration, "nomigration" },
{ Opt_port, "port=%s" },
{ Opt_rsize, "rsize=%s" },
......@@ -676,6 +679,9 @@ static void nfs_show_mount_options(struct seq_file *m, struct nfs_server *nfss,
if (nfss->options & NFS_OPTION_FSCACHE)
seq_printf(m, ",fsc");
if (nfss->options & NFS_OPTION_MIGRATION)
seq_printf(m, ",migration");
if (nfss->flags & NFS_MOUNT_LOOKUP_CACHE_NONEG) {
if (nfss->flags & NFS_MOUNT_LOOKUP_CACHE_NONE)
seq_printf(m, ",lookupcache=none");
......@@ -1106,7 +1112,7 @@ static int nfs_get_option_ul(substring_t args[], unsigned long *option)
string = match_strdup(args);
if (string == NULL)
return -ENOMEM;
rc = strict_strtoul(string, 10, option);
rc = kstrtoul(string, 10, option);
kfree(string);
return rc;
......@@ -1243,6 +1249,12 @@ static int nfs_parse_mount_options(char *raw,
kfree(mnt->fscache_uniq);
mnt->fscache_uniq = NULL;
break;
case Opt_migration:
mnt->options |= NFS_OPTION_MIGRATION;
break;
case Opt_nomigration:
mnt->options &= ~NFS_OPTION_MIGRATION;
break;
/*
* options that take numeric values
......@@ -1535,6 +1547,10 @@ static int nfs_parse_mount_options(char *raw,
if (mnt->minorversion && mnt->version != 4)
goto out_minorversion_mismatch;
if (mnt->options & NFS_OPTION_MIGRATION &&
mnt->version != 4 && mnt->minorversion != 0)
goto out_migration_misuse;
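Once accepted, for example via "mount -t nfs4 -o migration server:/export /mnt" (hypothetical export), the option is recorded in nfs_server->options as NFS_OPTION_MIGRATION and shows up in /proc/mounts through nfs_show_mount_options(); per the NFS_CS_MIGRATION flag added to struct nfs_client further down, it is meant to make the client present a migration-capable client ID to the server.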
/*
* verify that any proto=/mountproto= options match the address
* families in the addr=/mountaddr= options.
......@@ -1572,6 +1588,10 @@ static int nfs_parse_mount_options(char *raw,
printk(KERN_INFO "NFS: mount option vers=%u does not support "
"minorversion=%u\n", mnt->version, mnt->minorversion);
return 0;
out_migration_misuse:
printk(KERN_INFO
"NFS: 'migration' not supported for this NFS version\n");
return 0;
out_nomem:
printk(KERN_INFO "NFS: not enough memory to parse option\n");
return 0;
......@@ -2494,7 +2514,7 @@ EXPORT_SYMBOL_GPL(nfs_kill_super);
/*
* Clone an NFS2/3/4 server record on xdev traversal (FSID-change)
*/
struct dentry *
static struct dentry *
nfs_xdev_mount(struct file_system_type *fs_type, int flags,
const char *dev_name, void *raw_data)
{
......@@ -2642,6 +2662,7 @@ unsigned int nfs_idmap_cache_timeout = 600;
bool nfs4_disable_idmapping = true;
unsigned short max_session_slots = NFS4_DEF_SLOT_TABLE_SIZE;
unsigned short send_implementation_id = 1;
char nfs4_client_id_uniquifier[NFS4_CLIENT_ID_UNIQ_LEN] = "";
EXPORT_SYMBOL_GPL(nfs_callback_set_tcpport);
EXPORT_SYMBOL_GPL(nfs_callback_tcpport);
......@@ -2649,6 +2670,7 @@ EXPORT_SYMBOL_GPL(nfs_idmap_cache_timeout);
EXPORT_SYMBOL_GPL(nfs4_disable_idmapping);
EXPORT_SYMBOL_GPL(max_session_slots);
EXPORT_SYMBOL_GPL(send_implementation_id);
EXPORT_SYMBOL_GPL(nfs4_client_id_uniquifier);
#define NFS_CALLBACK_MAXPORTNR (65535U)
......@@ -2659,7 +2681,7 @@ static int param_set_portnr(const char *val, const struct kernel_param *kp)
if (!val)
return -EINVAL;
ret = strict_strtoul(val, 0, &num);
ret = kstrtoul(val, 0, &num);
if (ret == -EINVAL || num > NFS_CALLBACK_MAXPORTNR)
return -EINVAL;
*((unsigned int *)kp->arg) = num;
......@@ -2674,6 +2696,8 @@ static struct kernel_param_ops param_ops_portnr = {
module_param_named(callback_tcpport, nfs_callback_set_tcpport, portnr, 0644);
module_param(nfs_idmap_cache_timeout, int, 0644);
module_param(nfs4_disable_idmapping, bool, 0644);
module_param_string(nfs4_unique_id, nfs4_client_id_uniquifier,
NFS4_CLIENT_ID_UNIQ_LEN, 0600);
MODULE_PARM_DESC(nfs4_disable_idmapping,
"Turn off NFSv4 idmapping when using 'sec=sys'");
module_param(max_session_slots, ushort, 0644);
......@@ -2682,6 +2706,7 @@ MODULE_PARM_DESC(max_session_slots, "Maximum number of outstanding NFSv4.1 "
module_param(send_implementation_id, ushort, 0644);
MODULE_PARM_DESC(send_implementation_id,
"Send implementation ID with NFSv4.1 exchange_id");
MODULE_PARM_DESC(nfs4_unique_id, "nfs_client_id4 uniquifier string");
MODULE_ALIAS("nfs4");
#endif /* CONFIG_NFS_V4 */
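The uniquifier can be supplied at module load time, e.g. "modprobe nfs nfs4_unique_id=<unique-string>", or as "nfs.nfs4_unique_id=<unique-string>" on the kernel command line when NFS is built in (values here are placeholders). With mode 0600, the current setting is readable and writable only by root, via /sys/module/nfs/parameters/nfs4_unique_id.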
......@@ -846,6 +846,7 @@ static int nfs_writepage_setup(struct nfs_open_context *ctx, struct page *page,
int nfs_flush_incompatible(struct file *file, struct page *page)
{
struct nfs_open_context *ctx = nfs_file_open_context(file);
struct nfs_lock_context *l_ctx;
struct nfs_page *req;
int do_flush, status;
/*
......@@ -860,9 +861,12 @@ int nfs_flush_incompatible(struct file *file, struct page *page)
req = nfs_page_find_request(page);
if (req == NULL)
return 0;
do_flush = req->wb_page != page || req->wb_context != ctx ||
req->wb_lock_context->lockowner != current->files ||
req->wb_lock_context->pid != current->tgid;
l_ctx = req->wb_lock_context;
do_flush = req->wb_page != page || req->wb_context != ctx;
if (l_ctx) {
do_flush |= l_ctx->lockowner.l_owner != current->files
|| l_ctx->lockowner.l_pid != current->tgid;
}
nfs_release_request(req);
if (!do_flush)
return 0;
......@@ -1576,6 +1580,7 @@ static void nfs_commit_release_pages(struct nfs_commit_data *data)
/* We have a mismatch. Write the page again */
dprintk(" mismatch\n");
nfs_mark_request_dirty(req);
set_bit(NFS_CONTEXT_RESEND_WRITES, &req->wb_context->flags);
next:
nfs_unlock_and_release_request(req);
}
......
......@@ -81,12 +81,16 @@ struct nfs_access_entry {
int mask;
};
struct nfs_lockowner {
fl_owner_t l_owner;
pid_t l_pid;
};
struct nfs_lock_context {
atomic_t count;
struct list_head list;
struct nfs_open_context *open_context;
fl_owner_t lockowner;
pid_t pid;
struct nfs_lockowner lockowner;
};
struct nfs4_state;
......@@ -99,6 +103,7 @@ struct nfs_open_context {
unsigned long flags;
#define NFS_CONTEXT_ERROR_WRITE (0)
#define NFS_CONTEXT_RESEND_WRITES (1)
int error;
struct list_head list;
......@@ -355,6 +360,8 @@ extern int nfs_refresh_inode(struct inode *, struct nfs_fattr *);
extern int nfs_post_op_update_inode(struct inode *inode, struct nfs_fattr *fattr);
extern int nfs_post_op_update_inode_force_wcc(struct inode *inode, struct nfs_fattr *fattr);
extern int nfs_getattr(struct vfsmount *, struct dentry *, struct kstat *);
extern void nfs_access_add_cache(struct inode *, struct nfs_access_entry *);
extern void nfs_access_set_mask(struct nfs_access_entry *, u32);
extern int nfs_permission(struct inode *, int);
extern int nfs_open(struct inode *, struct file *);
extern int nfs_release(struct inode *, struct file *);
......
......@@ -39,6 +39,7 @@ struct nfs_client {
unsigned long cl_flags; /* behavior switches */
#define NFS_CS_NORESVPORT 0 /* - use ephemeral src port */
#define NFS_CS_DISCRTRY 1 /* - disconnect on RPC retry */
#define NFS_CS_MIGRATION 2 /* - transparent state migr */
struct sockaddr_storage cl_addr; /* server identifier */
size_t cl_addrlen;
char * cl_hostname; /* hostname of server */
......@@ -81,6 +82,7 @@ struct nfs_client {
/* The flags used for obtaining the clientid during EXCHANGE_ID */
u32 cl_exchange_flags;
struct nfs4_session *cl_session; /* shared session */
bool cl_preserve_clid;
struct nfs41_server_owner *cl_serverowner;
struct nfs41_server_scope *cl_serverscope;
struct nfs41_impl_id *cl_implid;
......@@ -125,6 +127,7 @@ struct nfs_server {
unsigned int namelen;
unsigned int options; /* extra options enabled by mount */
#define NFS_OPTION_FSCACHE 0x00000001 /* - local caching enabled */
#define NFS_OPTION_MIGRATION 0x00000002 /* - NFSv4 migration enabled */
struct nfs_fsid fsid;
__u64 maxfilesize; /* maximum file size */
......
......@@ -251,7 +251,6 @@ struct nfs4_layoutget_res {
struct nfs4_layoutget {
struct nfs4_layoutget_args args;
struct nfs4_layoutget_res res;
struct pnfs_layout_segment **lsegpp;
gfp_t gfp_flags;
};
......@@ -335,6 +334,7 @@ struct nfs_openargs {
struct nfs_seqid * seqid;
int open_flags;
fmode_t fmode;
u32 access;
__u64 clientid;
struct stateowner_id id;
union {
......@@ -369,6 +369,9 @@ struct nfs_openres {
struct nfs4_string *owner;
struct nfs4_string *group_owner;
struct nfs4_sequence_res seq_res;
__u32 access_request;
__u32 access_supported;
__u32 access_result;
};
/*
......
......@@ -130,6 +130,8 @@ struct rpc_clnt *rpc_bind_new_program(struct rpc_clnt *,
const struct rpc_program *, u32);
void rpc_task_reset_client(struct rpc_task *task, struct rpc_clnt *clnt);
struct rpc_clnt *rpc_clone_client(struct rpc_clnt *);
struct rpc_clnt *rpc_clone_client_set_auth(struct rpc_clnt *,
rpc_authflavor_t);
void rpc_shutdown_client(struct rpc_clnt *);
void rpc_release_client(struct rpc_clnt *);
void rpc_task_release_client(struct rpc_task *);
......
......@@ -173,8 +173,7 @@ struct rpc_xprt {
unsigned int min_reqs; /* min number of slots */
atomic_t num_reqs; /* total slots */
unsigned long state; /* transport state */
unsigned char shutdown : 1, /* being shut down */
resvport : 1; /* use a reserved port */
unsigned char resvport : 1; /* use a reserved port */
unsigned int swapper; /* we're swapping over this
transport */
unsigned int bind_index; /* bind function index */
......
(This diff has been collapsed.)
(This diff has been collapsed.)
......@@ -1119,8 +1119,8 @@ rpc_fill_super(struct super_block *sb, void *data, int silent)
return -ENOMEM;
if (rpc_populate(root, files, RPCAUTH_lockd, RPCAUTH_RootEOF, NULL))
return -ENOMEM;
dprintk("RPC: sending pipefs MOUNT notification for net %p%s\n", net,
NET_NAME(net));
dprintk("RPC: sending pipefs MOUNT notification for net %p%s\n",
net, NET_NAME(net));
sn->pipefs_sb = sb;
err = blocking_notifier_call_chain(&rpc_pipefs_notifier_list,
RPC_PIPEFS_MOUNT,
......@@ -1155,8 +1155,8 @@ static void rpc_kill_sb(struct super_block *sb)
sn->pipefs_sb = NULL;
mutex_unlock(&sn->pipefs_sb_lock);
put_net(net);
dprintk("RPC: sending pipefs UMOUNT notification for net %p%s\n", net,
NET_NAME(net));
dprintk("RPC: sending pipefs UMOUNT notification for net %p%s\n",
net, NET_NAME(net));
blocking_notifier_call_chain(&rpc_pipefs_notifier_list,
RPC_PIPEFS_UMOUNT,
sb);
......
......@@ -1022,7 +1022,7 @@ static int rpciod_start(void)
* Create the rpciod thread and wait for it to start.
*/
dprintk("RPC: creating workqueue rpciod\n");
wq = alloc_workqueue("rpciod", WQ_MEM_RECLAIM, 0);
wq = alloc_workqueue("rpciod", WQ_MEM_RECLAIM, 1);
rpciod_workqueue = wq;
return rpciod_workqueue != NULL;
}
......
......@@ -730,19 +730,24 @@ static unsigned int xdr_align_pages(struct xdr_stream *xdr, unsigned int len)
if (xdr->nwords == 0)
return 0;
if (nwords > xdr->nwords) {
nwords = xdr->nwords;
len = nwords << 2;
}
/* Realign pages to current pointer position */
iov = buf->head;
if (iov->iov_len > cur)
if (iov->iov_len > cur) {
xdr_shrink_bufhead(buf, iov->iov_len - cur);
xdr->nwords = XDR_QUADLEN(buf->len - cur);
}
/* Truncate page data and move it into the tail */
if (buf->page_len > len)
if (nwords > xdr->nwords) {
nwords = xdr->nwords;
len = nwords << 2;
}
if (buf->page_len <= len)
len = buf->page_len;
else if (nwords < xdr->nwords) {
/* Truncate page data and move it into the tail */
xdr_shrink_pagelen(buf, buf->page_len - len);
xdr->nwords = XDR_QUADLEN(buf->len - cur);
xdr->nwords = XDR_QUADLEN(buf->len - cur);
}
return len;
}
......
......@@ -231,7 +231,7 @@ EXPORT_SYMBOL_GPL(xprt_reserve_xprt);
static void xprt_clear_locked(struct rpc_xprt *xprt)
{
xprt->snd_task = NULL;
if (!test_bit(XPRT_CLOSE_WAIT, &xprt->state) || xprt->shutdown) {
if (!test_bit(XPRT_CLOSE_WAIT, &xprt->state)) {
smp_mb__before_clear_bit();
clear_bit(XPRT_LOCKED, &xprt->state);
smp_mb__after_clear_bit();
......@@ -504,9 +504,6 @@ EXPORT_SYMBOL_GPL(xprt_wait_for_buffer_space);
*/
void xprt_write_space(struct rpc_xprt *xprt)
{
if (unlikely(xprt->shutdown))
return;
spin_lock_bh(&xprt->transport_lock);
if (xprt->snd_task) {
dprintk("RPC: write space: waking waiting task on "
......@@ -679,7 +676,7 @@ xprt_init_autodisconnect(unsigned long data)
struct rpc_xprt *xprt = (struct rpc_xprt *)data;
spin_lock(&xprt->transport_lock);
if (!list_empty(&xprt->recv) || xprt->shutdown)
if (!list_empty(&xprt->recv))
goto out_abort;
if (test_and_set_bit(XPRT_LOCKED, &xprt->state))
goto out_abort;
......@@ -1262,7 +1259,6 @@ struct rpc_xprt *xprt_create_transport(struct xprt_create *args)
static void xprt_destroy(struct rpc_xprt *xprt)
{
dprintk("RPC: destroying transport %p\n", xprt);
xprt->shutdown = 1;
del_timer_sync(&xprt->timer);
rpc_destroy_wait_queue(&xprt->binding);
......
(This diff has been collapsed.)
(This diff has been collapsed.)