Commits · c07127b48c6367255fb4506e6d6ba6e219205607 · openanolis / cloud-kernel

15 Oct, 2014 · 1 commit

dlm: fix missing endian conversion of rcom_status flags · c07127b4

Committed by Neale Ferguson on Oct 14, 2014

The flags are already converted to le when being sent,
but are not being converted back to cpu when received.
Signed-off-by: Neale Ferguson <neale@sinenomine.net>
Signed-off-by: David Teigland <teigland@redhat.com>

c07127b4
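The conversion described in the commit above can be illustrated with a small portable sketch. This is not the kernel code: `host_to_wire_le32` and `wire_le32_to_host` are hypothetical stand-ins for the kernel's `cpu_to_le32()` and `le32_to_cpu()`, and the 32-bit width of the flags is an assumption. The point is only that a value encoded as little-endian when sent must be decoded when received, otherwise a big-endian host misreads it.

```c
#include <stdint.h>

/* Hypothetical helper (not a kernel API): encode a host-order value as
 * four little-endian bytes, as a sender would before transmitting. */
static void host_to_wire_le32(uint32_t value, uint8_t wire[4])
{
    wire[0] = (uint8_t)(value & 0xff);
    wire[1] = (uint8_t)((value >> 8) & 0xff);
    wire[2] = (uint8_t)((value >> 16) & 0xff);
    wire[3] = (uint8_t)((value >> 24) & 0xff);
}

/* Hypothetical helper: decode four little-endian bytes back into host
 * order. This is the step the commit says was missing on receive. */
static uint32_t wire_le32_to_host(const uint8_t wire[4])
{
    return (uint32_t)wire[0] |
           ((uint32_t)wire[1] << 8) |
           ((uint32_t)wire[2] << 16) |
           ((uint32_t)wire[3] << 24);
}
```

Because both helpers work byte by byte, the round trip gives the same result on little-endian and big-endian hosts.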

09 Aug, 2012 · 1 commit

dlm: fix unlock balance warnings · 475f230c

Committed by David Teigland on Aug 02, 2012

The in_recovery rw_semaphore has always been acquired and
released by different threads by design.  To work around
the "BUG: bad unlock balance detected!" messages, adjust
things so the dlm_recoverd thread always does both down_write
and up_write.
Signed-off-by: David Teigland <teigland@redhat.com>

475f230c

17 Jul, 2012 · 2 commits

dlm: use idr instead of list for recovered rsbs · 1d7c484e

Committed by David Teigland on May 15, 2012

When a large number of resources are being recovered,
a linear search of the recover_list takes a long time.
Use an idr in place of a list.
Signed-off-by: David Teigland <teigland@redhat.com>

1d7c484e

dlm: use rsbtbl as resource directory · c04fecb4

Committed by David Teigland on May 10, 2012

Remove the dir hash table (dirtbl), and use
the rsb hash table (rsbtbl) as the resource
directory.  It has always been an unnecessary
duplication of information.

This improves efficiency by using a single rsbtbl
lookup in many cases where both rsbtbl and dirtbl
lookups were needed previously.

This eliminates the need to handle cases of rsbtbl
and dirtbl being out of sync.

In many cases there will be memory savings because
the dir hash table no longer exists.
Signed-off-by: David Teigland <teigland@redhat.com>

c04fecb4

03 May, 2012 · 1 commit

dlm: fixes for nodir mode · 4875647a

Committed by David Teigland on Apr 26, 2012

The "nodir" mode (statically assign master nodes instead
of using the resource directory) has always been highly
experimental, and never seriously used.  This commit
fixes a number of problems, making nodir much more usable.

- Major change to recovery: recover all locks and restart
  all in-progress operations after recovery.  In some
  cases it's not possible to know which in-progress locks
  to recover, so recover all.  (Most require recovery
  in nodir mode anyway since rehashing changes most
  master nodes.)

- Change the way nodir mode is enabled, from a command
  line mount arg passed through gfs2, into a sysfs
  file managed by dlm_controld, consistent with the
  other config settings.

- Allow recovering MSTCPY locks on an rsb that has not
  yet been turned into a master copy.

- Ignore RCOM_LOCK and RCOM_LOCK_REPLY recovery messages
  from a previous, aborted recovery cycle.  Base this
  on the local recovery status not being in the state
  where any nodes should be sending LOCK messages for the
  current recovery cycle.

- Hold rsb lock around dlm_purge_mstcpy_locks() because it
  may run concurrently with dlm_recover_master_copy().

- Maintain highbast on process-copy lkb's (in addition to
  the master as is usual), because the lkb can switch
  back and forth between being a master and being a
  process copy as the master node changes in recovery.

- When recovering MSTCPY locks, flag rsb's that have
  non-empty convert or waiting queues for granting
  at the end of recovery.  (Rename flag from LOCKS_PURGED
  to RECOVER_GRANT and similar for the recovery function,
  because it's not only resources with purged locks
  that need a grant attempt.)

- Replace a couple of unnecessary assertion panics with
  error messages.
Signed-off-by: David Teigland <teigland@redhat.com>

4875647a

27 Apr, 2012 · 1 commit

dlm: limit rcom debug messages · d6e24788

Committed by David Teigland on Apr 23, 2012

Unify the checking for both types of ignored
rcom messages, and replace the two log_debug
statements with a single, rate limited debug
message.
Signed-off-by: David Teigland <teigland@redhat.com>

d6e24788

04 Jan, 2012 · 1 commit

dlm: add node slots and generation · 757a4271

Committed by David Teigland on Oct 20, 2011

Slot numbers are assigned to nodes when they join the lockspace.
The slot number chosen is the minimum unused value starting at 1.
Once a node is assigned a slot, that slot number will not change
while the node remains a lockspace member.  If the node leaves
and rejoins it can be assigned a new slot number.

A new generation number is also added to a lockspace.  It is
set and incremented during each recovery along with the slot
collection/assignment.

The slot numbers will be passed to gfs2 which will use them as
journal id's.
Signed-off-by: David Teigland <teigland@redhat.com>

757a4271
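The slot rule in the commit above ("the minimum unused value starting at 1") can be sketched as follows. This is an illustration only, not the dlm implementation: the function name `lowest_free_slot` and the plain array of in-use slot numbers are assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: return the smallest slot number >= 1 that does not
 * appear in used[0..n-1]. With n slots in use the answer is at most n + 1,
 * so the outer loop always terminates. */
static int lowest_free_slot(const int *used, size_t n)
{
    for (int slot = 1; ; slot++) {
        bool taken = false;

        for (size_t i = 0; i < n; i++) {
            if (used[i] == slot) {
                taken = true;
                break;
            }
        }
        if (!taken)
            return slot;
    }
}
```

With slots 1, 2 and 4 in use, a joining node would get slot 3. A node that leaves and rejoins is simply run through the same search again, which is why it may come back with a different number.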

11 Mar, 2011 · 1 commit

dlm: record full callback state · 8304d6f2

Committed by David Teigland on Feb 21, 2011

Change how callbacks are recorded for locks. Previously, information
about multiple callbacks was combined into a couple of variables that
indicated what the end result should be. In some situations, we
could not tell from this combined state what the exact sequence of
callbacks were, and would end up either delivering the callbacks in
the wrong order, or suppress redundant callbacks incorrectly. This
new approach records all the data for each callback, leaving no
uncertainty about what needs to be delivered.
Signed-off-by: David Teigland <teigland@redhat.com>

8304d6f2

01 Dec, 2009 · 1 commit

dlm: always use GFP_NOFS · 573c24c4

Committed by David Teigland on Nov 30, 2009

Replace all GFP_KERNEL and ls_allocation with GFP_NOFS.
ls_allocation would be GFP_KERNEL for userland lockspaces
and GFP_NOFS for file system lockspaces.

It was discovered that any lockspaces on the system can
affect all others by triggering memory reclaim in the
file system which could in turn call back into the dlm
to acquire locks, deadlocking dlm threads that were
shared by all lockspaces, like dlm_recv.
Signed-off-by: David Teigland <teigland@redhat.com>

573c24c4

22 Feb, 2008 · 1 commit

dlm: fix rcom_names message to self · 599e0f58

Committed by David Teigland on Feb 21, 2008

The recent patch to validate data lengths in rcom_names messages
failed to account for fake messages a node directs to itself before
ever sending it.  In this case we need to fill in the message length
in the header for the validation code to use.
Signed-off-by: David Teigland <teigland@redhat.com>

599e0f58

06 Feb, 2008 · 1 commit

dlm: proper types for asts and basts · e5dae548

Committed by David Teigland on Feb 06, 2008

Use proper types for ast and bast functions, and use
consistent type for ast param.
Signed-off-by: David Teigland <teigland@redhat.com>

e5dae548

04 Feb, 2008 · 5 commits

dlm: verify that places expecting rcom_lock have packet long enough · ae773d0b

Committed by Al Viro on Jan 25, 2008

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Teigland <teigland@redhat.com>

ae773d0b

dlm: missing length check in check_config() · 02ed16b6

Committed by Al Viro on Jan 25, 2008

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Teigland <teigland@redhat.com>

02ed16b6

dlm: use proper type for ->ls_recover_buf · 4007685c

Committed by Al Viro on Jan 25, 2008

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Teigland <teigland@redhat.com>

4007685c

dlm: do not byteswap rcom_config · 93ff2971

Committed by Al Viro on Jan 25, 2008

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Teigland <teigland@redhat.com>

93ff2971

dlm: do not byteswap rcom_lock · 163a1859

Committed by Al Viro on Jan 25, 2008

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Teigland <teigland@redhat.com>

163a1859

31 Jan, 2008 · 1 commit

dlm: clean ups · dbcfc347

Committed by David Teigland on Jan 29, 2008

A couple small clean-ups.  Remove unnecessary wrapper-functions in
rcom.c, and remove unnecessary casting and an unnecessary ASSERT in
util.c.
Signed-off-by: David Teigland <teigland@redhat.com>

dbcfc347

10 Oct, 2007 · 1 commit

[DLM] block dlm_recv in recovery transition · c36258b5

Committed by David Teigland on Sep 27, 2007

Introduce a per-lockspace rwsem that's held in read mode by dlm_recv
threads while working in the dlm.  This allows dlm_recv activity to be
suspended when the lockspace transitions to, from and between recovery
cycles.

The specific bug prompting this change is one where an in-progress
recovery cycle is aborted by a new recovery cycle.  While dlm_recv was
processing a recovery message, the recovery cycle was aborted and
dlm_recoverd began cleaning up.  dlm_recv decremented recover_locks_count
on an rsb after dlm_recoverd had reset it to zero.  This is fixed by
suspending dlm_recv (taking write lock on the rwsem) before aborting the
current recovery.

The transitions to/from normal and recovery modes are simplified by using
this new ability to block dlm_recv.  The switch from normal to recovery
mode means dlm_recv goes from processing locking messages, to saving them
for later, and vice versa.  Races are avoided by blocking dlm_recv when
setting the flag that switches between modes.
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

c36258b5

14 Aug, 2007 · 1 commit

[DLM] fix NULL ls usage · 41684f95

Committed by David Teigland on Jul 13, 2007

Fix regression in recent patch "[DLM] variable allocation" which
attempts to dereference an "ls" struct when it's NULL.
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

41684f95

09 Jul, 2007 · 2 commits

[DLM] variable allocation · 44f487a5

Committed by Patrick Caulfield on Jun 06, 2007

Add a new flag, DLM_LSFL_FS, to be used when a file system creates a lockspace.
This flag causes the dlm to use GFP_NOFS for allocations instead of GFP_KERNEL.
(This updated version of the patch uses gfp_t for ls_allocation.)
Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com>
Signed-Off-By: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

44f487a5

[DLM] wait for config check during join [6/6] · 8b0e7b2c

Committed by David Teigland on May 18, 2007

Joining the lockspace should wait for the initial round of inter-node
config checks to complete before returning. This way, if there's a
configuration mismatch between the joining node and the existing nodes,
the join can fail and return an error to the application.
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

8b0e7b2c

06 Feb, 2007 · 4 commits

[DLM] rename dlm_config_info fields · 68c817a1

Committed by David Teigland on Jan 09, 2007

Add a "ci_" prefix to the fields in the dlm_config_info struct so that we
can use macros to add configfs functions to access them (in a later
patch). No functional changes in this patch, just naming changes.
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

68c817a1

[DLM] change some log_error to log_debug · 8ec68867

Committed by David Teigland on Jan 09, 2007

Some common, non-error messages should use log_debug instead of log_error
so they can be turned off.
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

8ec68867

[DLM] add version check · 9e971b71

Committed by David Teigland on Dec 13, 2006

Check if we receive a message from another lockspace member running a
version of the dlm with an incompatible inter-node message protocol.
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

9e971b71

[DLM] fix old rcom messages · 38aa8b0c

Committed by David Teigland on Dec 13, 2006

A reply to a recovery message will often be received after the relevant
recovery sequence has aborted and the next recovery sequence has begun.
We need to ignore replies to these old messages from the previous
recovery.  There's already a way to do this for synchronous recovery
requests using the rc_id number, but not for async.

Each recovery sequence already has a locally unique sequence number
associated with it.  This patch adds a field to the rcom (recovery
message) structure where this recovery sequence number can be placed,
rc_seq.  When a node sends a reply to a recovery request, it copies the
rc_seq number it received into rc_seq_reply.  When the first node receives
the reply to its recovery message, it will check whether rc_seq_reply
matches the current recovery sequence number, ls_recover_seq, and if not
then it ignores the old reply.

An old, inadequate approach to filtering out old replies (checking if the
current stage of recovery has moved back to the start) has been removed
from two spots.

The protocol version number is changed to reflect the different rcom
structures.
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

38aa8b0c
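The `rc_seq` / `rc_seq_reply` exchange described in the commit above can be sketched like this. The field names and `ls_recover_seq` come from the commit message. Everything else is assumed for illustration: the `struct rcom_sketch` layout, the 64-bit field width, and the helper names `make_reply` and `reply_is_current` are hypothetical and do not reflect the real `struct dlm_rcom`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down recovery message: only the two sequence
 * fields named in the commit message are modeled. */
struct rcom_sketch {
    uint64_t rc_seq;        /* sender's current recovery sequence number */
    uint64_t rc_seq_reply;  /* in a reply: rc_seq copied from the request */
};

/* Replying node: echo the requester's rc_seq back in rc_seq_reply. */
static struct rcom_sketch make_reply(const struct rcom_sketch *request,
                                     uint64_t replier_recover_seq)
{
    struct rcom_sketch reply;

    reply.rc_seq = replier_recover_seq;
    reply.rc_seq_reply = request->rc_seq;
    return reply;
}

/* Requesting node: a reply is accepted only if it answers a request sent
 * during the current recovery sequence (ls_recover_seq). */
static bool reply_is_current(const struct rcom_sketch *reply,
                             uint64_t ls_recover_seq)
{
    return reply->rc_seq_reply == ls_recover_seq;
}
```

A reply built from a request with `rc_seq` 7 is dropped once the requester's `ls_recover_seq` has moved on to 8, which is the stale-reply case the commit targets.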

30 Nov, 2006 · 4 commits

[DLM] fix format warnings in rcom.c and recoverd.c · 57adf7ee

Committed by Ryusuke Konishi on Nov 29, 2006

This fixes the following gcc warnings generated on
the architectures where uint64_t != unsigned long long (e.g. ppc64).

fs/dlm/rcom.c:154: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'uint64_t'
fs/dlm/rcom.c:154: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'uint64_t'
fs/dlm/recoverd.c:48: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'uint64_t'
fs/dlm/recoverd.c:202: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'uint64_t'
fs/dlm/recoverd.c:210: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'uint64_t'
Signed-off-by: Ryusuke Konishi <ryusuke@osrg.net>
Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

57adf7ee
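The commit message above does not say how the warnings were fixed. One common way to silence this class of warning, shown here purely as an assumption, is to cast the `uint64_t` argument to `unsigned long long` so that it matches `%llx` on every architecture. The helper name `format_seq` is hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: print a 64-bit sequence number in hex. The explicit
 * cast makes the argument type match "%llx" even where uint64_t is defined
 * as unsigned long rather than unsigned long long. */
static int format_seq(char *buf, size_t len, uint64_t seq)
{
    return snprintf(buf, len, "%llx", (unsigned long long)seq);
}
```

In userspace code the `PRIx64` macro from `<inttypes.h>` is an alternative to the cast.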

[DLM] don't accept replies to old recovery messages · 98f176fb

Committed by David Teigland on Nov 27, 2006

We often abort a recovery after sending a status request to a remote node.
We want to ignore any potential status reply we get from the remote node.
If we get one of these unwanted replies, we've often moved on to the next
recovery message and incremented the message sequence counter, so the
reply will be ignored due to the seq number. In some cases, we've not
moved on to the next message so the seq number of the reply we want to
ignore is still correct, causing the reply to be accepted. The next
recovery message will then mistake this old reply as a new one.

To fix this, we add the flag RCOM_WAIT to indicate when we can accept a
new reply. We clear this flag if we abort recovery while waiting for a
reply. Before the flag is set again (to allow new replies) we know that
any old replies will be rejected due to their sequence number. We also
initialize the recovery-message sequence number to a random value when a
lockspace is first created. This makes it clear when messages are being
rejected from an old instance of a lockspace that has since been
recreated.
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

98f176fb
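The interaction between the wait flag and the sequence number in the commit above can be modeled with a small state machine. This is a sketch under assumptions, not the dlm code: `struct ls_sketch` and the three helper names are hypothetical, the `waiting` field stands in for the `RCOM_WAIT` flag, and the random initial sequence value mentioned in the commit is not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-lockspace state for one outstanding status request. */
struct ls_sketch {
    bool waiting;   /* stands in for the RCOM_WAIT flag */
    uint64_t seq;   /* recovery-message sequence counter */
};

/* Start a new request: bump the counter first, then allow replies. Any
 * older reply now carries a stale sequence number. */
static uint64_t send_request(struct ls_sketch *ls)
{
    ls->seq++;
    ls->waiting = true;
    return ls->seq;
}

/* Abort recovery while waiting: stop accepting replies, even ones whose
 * sequence number still matches. */
static void abort_recovery(struct ls_sketch *ls)
{
    ls->waiting = false;
}

/* Accept a reply only if one is expected and its sequence number matches. */
static bool accept_reply(struct ls_sketch *ls, uint64_t reply_seq)
{
    if (!ls->waiting || reply_seq != ls->seq)
        return false;
    ls->waiting = false;
    return true;
}
```

The case the commit fixes is the first rejection in this sequence: request, abort, late reply with a still-matching sequence number. Without the flag that reply would be accepted.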

[DLM] fix size of STATUS_REPLY message · 1babdb45

Committed by David Teigland on Nov 27, 2006

When the not_ready routine sends a "fake" status reply with blank status
flags, it needs to use the correct size for a normal STATUS_REPLY by
including the size of the would-be config parameters. We also fill in the
non-existent config parameters with an invalid lvblen value so it's easier
to notice if these invalid parameters are ever being used.
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

1babdb45

[DLM] status messages ping-pong between unmounted nodes · 435618b7

Committed by David Teigland on Nov 02, 2006

Red Hat BZ 213682

If two nodes leave the lockspace (while unmounting the fs in the case of
gfs) after one has sent a STATUS message to the other, STATUS/STATUS_REPLY
messages will then ping-pong between the nodes when neither of them can
find the lockspace in question any longer.  We kill this by not sending
another STATUS message when we get a STATUS_REPLY for an unknown
lockspace.
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

435618b7

24 Aug, 2006 · 1 commit

[DLM] sequence number missing in not_ready reply · f5888750

Committed by David Teigland on Aug 23, 2006

When a status reply is sent for a lockspace that doesn't yet exist, the
message sequence number from the sender was not being copied into the
reply causing the sender to ignore the reply.
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

f5888750

10 Aug, 2006 · 1 commit

[DLM] reject replies to old requests · 4a99c3d9

Committed by David Teigland on Aug 09, 2006

When recoveries are aborted by other recoveries we can get replies to
status or names requests that we've given up on.  This can cause problems
if we're making another request and receive an old reply.  Add a sequence
number to status/names requests and reject replies that don't match.  A
field already exists for the seq number that's used in other message
types.
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

4a99c3d9

09 Aug, 2006 · 1 commit

[DLM] show nodeid for recovery message · faa0f267

Committed by David Teigland on Aug 08, 2006

To aid debugging, it's useful to be able to see what nodeid the dlm is
waiting on for a message reply.
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

faa0f267

23 Feb, 2006 · 1 commit

[DLM] Remove range locks from the DLM · 3bcd3687

Committed by David Teigland on Feb 23, 2006

This patch removes support for range locking from the DLM.
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

3bcd3687

18 Jan, 2006 · 1 commit

[DLM] The core of the DLM for GFS2/CLVM · e7fd4179

Committed by David Teigland on Jan 18, 2006

This is the core of the distributed lock manager which is required
to use GFS2 as a cluster filesystem. It is also used by CLVM and
can be used as a standalone lock manager independently of either
of these two projects.

It implements VAX-style locking modes.
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steve Whitehouse <swhiteho@redhat.com>

e7fd4179

openanolis / cloud-kernel · synced successfully more than 1 year ago
