1. 06 Feb, 2018 1 commit
  2. 29 Jan, 2018 3 commits
    • J
      fs: handle inode->i_version more efficiently · f02a9ad1
      Jeff Layton committed
      Since i_version is mostly treated as an opaque value, we can exploit that
      fact to avoid incrementing it when no one is watching. With that change,
      we can avoid incrementing the counter on writes, unless someone has
      queried for it since it was last incremented. If the a/c/mtime don't
      change, and the i_version hasn't changed, then there's no need to dirty
      the inode metadata on a write.
      
      Convert the i_version counter to an atomic64_t, and use the lowest order
      bit to hold a flag that will tell whether anyone has queried the value
      since it was last incremented.
      
      When we go to maybe increment it, we fetch the value and check the flag
      bit.  If it's clear then we don't need to do anything if the update
      isn't being forced.
      
      If we do need to update, then we increment the counter by 2, and clear
      the flag bit, and then use a CAS op to swap it into place. If that
      works, we return true. If it doesn't then do it again with the value
      that we fetch from the CAS operation.
      
      On the query side, if the flag is already set, then we just shift the
      value down by 1 bit and return it. Otherwise, we set the flag in our
      on-stack value and again use cmpxchg to swap it into place if it hasn't
      changed. If it has, then we use the value from the cmpxchg as the new
      "old" value and try again.
      
      This method allows us to avoid incrementing the counter on writes (and
      dirtying the metadata) under typical workloads. We only need to increment
      if it has been queried since it was last changed.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Acked-by: Dave Chinner <dchinner@redhat.com>
      Tested-by: Krzysztof Kozlowski <krzk@kernel.org>
      f02a9ad1
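The flag-bit scheme described in this commit message can be sketched in userspace with C11 atomics. This is only an illustrative model, not the kernel code: `struct iver`, `iver_maybe_inc()` and `iver_query()` are hypothetical names, and the kernel's `atomic64_t` and `cmpxchg` are replaced here by `<stdatomic.h>` equivalents.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; names are illustrative, not the kernel API. */
#define IVER_QUERIED   1ULL  /* lowest bit: value was queried since last increment */
#define IVER_INCREMENT 2ULL  /* the counter itself lives in the upper 63 bits */

struct iver {
    _Atomic uint64_t raw;
};

/* Bump the counter only if forced, or if someone queried it since the last bump. */
static bool iver_maybe_inc(struct iver *v, bool force)
{
    uint64_t cur = atomic_load(&v->raw);

    for (;;) {
        if (!force && !(cur & IVER_QUERIED))
            return false;               /* nobody is watching: skip the update */

        /* Add 2 (one counter step) and clear the "queried" flag. */
        uint64_t next = (cur + IVER_INCREMENT) & ~IVER_QUERIED;

        if (atomic_compare_exchange_weak(&v->raw, &cur, next))
            return true;
        /* On failure, cur holds the freshly observed value; try again. */
    }
}

/* Return the counter and make sure the "queried" flag is set afterwards. */
static uint64_t iver_query(struct iver *v)
{
    uint64_t cur = atomic_load(&v->raw);

    while (!(cur & IVER_QUERIED)) {
        if (atomic_compare_exchange_weak(&v->raw, &cur, cur | IVER_QUERIED))
            break;
        /* On failure, cur was reloaded; re-check the flag. */
    }
    return cur >> 1;                    /* drop the flag bit */
}
```

In this model a write that follows another write with no query in between returns `false` from `iver_maybe_inc()`, which corresponds to the case where the inode metadata does not need to be dirtied.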
    • J
      fs: don't take the i_lock in inode_inc_iversion · 7594c461
      Jeff Layton committed
      The rationale for taking the i_lock when incrementing this value is
      lost in antiquity. The readers of the field don't take it (at least
      not universally), so my assumption is that it was only done here to
      serialize incrementors.
      
      If that is indeed the case, then we can drop the i_lock from this
      codepath and treat it as an atomic64_t for the purposes of
      incrementing it. This allows us to use inode_inc_iversion without
      any danger of lock inversion.
      
      Note that the read side is not fetched atomically with this change.
      The assumption here is that that is not a critical issue since the
      i_version is not fully synchronized with anything else anyway.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      7594c461
    • J
      fs: new API for handling inode->i_version · ae5e165d
      Jeff Layton committed
      Add a documentation blob that explains what the i_version field is, how
      it is expected to work, and how it is currently implemented by various
      filesystems.
      
      We already have inode_inc_iversion. Add several other functions for
      manipulating and accessing the i_version counter. For now, the
      implementation is trivial and basically works the way that all of the
      open-coded i_version accesses work today.
      
      Future patches will convert existing users of i_version to use the new
      API, and then convert the backend implementation to do things more
      efficiently.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      ae5e165d
  3. 26 Jan, 2018 4 commits
    • C
      regulator: add PM suspend and resume hooks · f7efad10
      Chunyan Zhang committed
      This patch allows consumers to set a suspend voltage. Doing so just
      sets the "uV" in constraint::regulator_state. When the PM core calls
      regulator_suspend_late() through its callback as the system enters
      suspend, the regulator device then carries out its suspend activity.
      
      It assumes that if any consumer sets a suspend voltage, the regulator
      device should be enabled in the suspend state. If the suspend voltage
      of a regulator device is set to zero for all consumers, the regulator
      device will be off in the suspend state.
      
      This patch also provides a new function hook to regulator devices for
      resuming from suspend states.
      Signed-off-by: Chunyan Zhang <zhang.chunyan@linaro.org>
      Signed-off-by: Mark Brown <broonie@kernel.org>
      f7efad10
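The on/off rule stated in this commit message (on in suspend if any consumer set a suspend voltage, off if all consumers set zero) might be modelled as follows. This is a sketch under assumptions: `struct demo_consumer` and `demo_enabled_in_suspend()` are hypothetical names, and the real regulator core keeps this state in its own structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-consumer request; 0 means the consumer set zero volts. */
struct demo_consumer {
    int suspend_uV;
};

/* The regulator stays on in suspend if any consumer asked for a non-zero voltage. */
static bool demo_enabled_in_suspend(const struct demo_consumer *c, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (c[i].suspend_uV != 0)
            return true;
    }
    return false;   /* every consumer set zero: off in the suspend state */
}
```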
    • C
      regulator: empty the old suspend functions · aa27bbc6
      Chunyan Zhang committed
      Regulator suspend/resume functions should only be called by PM suspend
      core via registering dev_pm_ops, and regulator devices should implement
      the callback functions.  Thus, any regulator consumer shouldn't call
      the regulator suspend/resume functions directly.
      
      In order to avoid compile errors, two empty functions with the same names
      are still left in place for the time being.
      Signed-off-by: Chunyan Zhang <zhang.chunyan@linaro.org>
      Signed-off-by: Mark Brown <broonie@kernel.org>
      aa27bbc6
    • C
      regulator: leave one item to record whether regulator is enabled · 72069f99
      Chunyan Zhang committed
      The items "disabled" and "enabled" are a little redundant, since only one
      of them would be set to record whether the regulator device should stay on
      or be switched off in suspend states.
      
      So this patch removes "disabled" and keeps only "enabled":
        - enabled == 1 for regulator-on-in-suspend
        - enabled == 0 for regulator-off-in-suspend
        - enabled == -1 means do nothing when entering suspend mode.
      Signed-off-by: Chunyan Zhang <zhang.chunyan@linaro.org>
      Signed-off-by: Mark Brown <broonie@kernel.org>
      72069f99
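The three values of "enabled" listed in this commit message can be read as a small decision table. The sketch below uses hypothetical names (`enum demo_suspend_action`, `demo_action_for()`); only the values 1, 0 and -1 come from the commit message.

```c
/* Hypothetical actions a suspend path could take for one regulator. */
enum demo_suspend_action {
    DEMO_LEAVE_ALONE,   /* enabled == -1: do nothing when entering suspend */
    DEMO_SWITCH_OFF,    /* enabled ==  0: regulator-off-in-suspend */
    DEMO_KEEP_ON        /* enabled ==  1: regulator-on-in-suspend */
};

/* Map the single tri-state field to an action. */
static enum demo_suspend_action demo_action_for(int enabled)
{
    switch (enabled) {
    case 1:
        return DEMO_KEEP_ON;
    case 0:
        return DEMO_SWITCH_OFF;
    default:
        /* -1; treating other values the same way is an assumption of this sketch. */
        return DEMO_LEAVE_ALONE;
    }
}
```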
    • A
      module/retpoline: Warn about missing retpoline in module · caf7501a
      Andi Kleen committed
      There's a risk that a kernel which has full retpoline mitigations becomes
      vulnerable when a module gets loaded that hasn't been compiled with the
      right compiler or the right option.
      
      To enable detection of that mismatch at module load time, add a module info
      string "retpoline" at build time when the module was compiled with
      retpoline support. This only covers compiled C source; assembler source
      and prebuilt object files are not checked.
      
      If a retpoline enabled kernel detects a non retpoline protected module at
      load time, print a warning and report it in the sysfs vulnerability file.
      
      [ tglx: Massaged changelog ]
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: gregkh@linuxfoundation.org
      Cc: torvalds@linux-foundation.org
      Cc: jeyu@kernel.org
      Cc: arjan@linux.intel.com
      Link: https://lkml.kernel.org/r/20180125235028.31211-1-andi@firstfloor.org
      caf7501a
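The load-time check described in this commit message amounts to looking for a "retpoline" module info string and warning when a retpoline-enabled kernel does not find it. A userspace sketch, assuming a hypothetical key/value representation of module info (`struct demo_modinfo`) rather than the kernel's actual lookup helpers:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one module info entry. */
struct demo_modinfo {
    const char *key;
    const char *value;
};

/* True if the module carries a "retpoline" info string. */
static bool demo_module_has_retpoline(const struct demo_modinfo *info, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(info[i].key, "retpoline") == 0)
            return true;
    }
    return false;
}

/* A warning is only warranted when the kernel itself is retpoline-enabled. */
static bool demo_should_warn(bool kernel_retpoline,
                             const struct demo_modinfo *info, size_t n)
{
    return kernel_retpoline && !demo_module_has_retpoline(info, n);
}
```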
  4. 25 Jan, 2018 1 commit
  5. 24 Jan, 2018 2 commits
  6. 22 Jan, 2018 1 commit
  7. 20 Jan, 2018 4 commits
  8. 19 Jan, 2018 1 commit
  9. 18 Jan, 2018 4 commits
  10. 17 Jan, 2018 5 commits
  11. 16 Jan, 2018 14 commits
    • A
      blkcg: simplify statistic accumulation code · ddc21231
      Arnd Bergmann committed
      Some older compilers (gcc-4.4 through 4.6 in particular) struggle
      with the way that blkg_rwstat_read() returns a structure, leading
      to excessive stack usage and rather inefficient code:
      
      block/blk-cgroup.c: In function 'blkg_destroy':
      block/blk-cgroup.c:354:1: error: the frame size of 1296 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
      block/cfq-iosched.c: In function 'cfqg_stats_add_aux':
      block/cfq-iosched.c:753:1: error: the frame size of 1928 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
      block/bfq-cgroup.c: In function 'bfqg_stats_add_aux':
      block/bfq-cgroup.c:299:1: error: the frame size of 1928 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
      
      I also notice that there is no point in using atomic accesses
      for the local variables, so storing the temporaries in simple 'u64'
      variables not only avoids the stack usage on older compilers but
      also improves the object code on modern versions.
      
      Fixes: e6269c44 ("blkcg: add blkg_[rw]stat->aux_cnt and replace cfq_group->dead_stats with it")
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      ddc21231
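The idea of keeping temporaries in plain u64 variables instead of passing a whole statistics structure by value can be sketched as below. This is a userspace approximation with C11 atomics; `struct demo_rwstat`, `DEMO_NR_STATS` and `demo_rwstat_add_aux()` are hypothetical names and do not reproduce the blkcg data structures.

```c
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_NR_STATS 4   /* hypothetical number of counters per stat */

struct demo_rwstat {
    _Atomic uint64_t cnt[DEMO_NR_STATS];
    _Atomic uint64_t aux_cnt[DEMO_NR_STATS];
};

/* Fold the counters of 'from' into the auxiliary counters of 'to'. */
static void demo_rwstat_add_aux(struct demo_rwstat *to, struct demo_rwstat *from)
{
    uint64_t sum[DEMO_NR_STATS];   /* plain locals: no atomics needed here */

    for (int i = 0; i < DEMO_NR_STATS; i++)
        sum[i] = atomic_load(&from->cnt[i]) + atomic_load(&from->aux_cnt[i]);

    /* Only the shared destination needs atomic updates. */
    for (int i = 0; i < DEMO_NR_STATS; i++)
        atomic_fetch_add(&to->aux_cnt[i], sum[i]);
}
```

The locals occupy `DEMO_NR_STATS * 8` bytes of stack in this sketch, and no structure is returned by value.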
    • F
      nubus: Add support for the driver model · 7f86c765
      Finn Thain committed
      This patch brings basic support for the Linux Driver Model to the
      NuBus subsystem.
      
      For flexibility, the matching of boards with drivers is left up to the
      drivers. This is also the approach taken by NetBSD. A board may have
      many functions, and drivers may have to consider many functional
      resources and board resources in order to match a device.
      
      This implementation does not bind drivers to resources (nor does it bind
      many drivers to the same board). Apple's NuBus declaration ROM design
      is flexible enough to allow that, but I don't see a need to support it
      as we don't use the "slot zero" resources (in the main logic board ROM).
      
      Eliminate the global nubus_boards linked list by rewriting the procfs
      board iterator around bus_for_each_dev(). Hence the nubus device refcount
      can be used to determine the lifespan of board objects.
      
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Tested-by: Stan Johnson <userm57@yahoo.com>
      Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      7f86c765
    • F
      nubus: Adopt standard linked list implementation · 41b84816
      Finn Thain committed
      This increases code re-use and improves readability.
      
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Tested-by: Stan Johnson <userm57@yahoo.com>
      Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      41b84816
    • F
      nubus: Rename struct nubus_dev · 189e19e8
      Finn Thain committed
      It is misleading to call a functional resource a "device". In adopting
      the Linux Driver Model, the struct device will be embedded in struct
      nubus_board. That will compound the terminology problem because drivers
      will bind with boards, not with functional resources. Avoid this by
      renaming struct nubus_dev as struct nubus_rsrc. "Functional resource"
      is the vendor's terminology so this helps avoid confusion.
      
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Tested-by: Stan Johnson <userm57@yahoo.com>
      Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      189e19e8
    • F
      nubus: Rework /proc/bus/nubus/s/ implementation · 2f7dd07e
      Finn Thain committed
      The /proc/bus/nubus/s/ directory tree for any slot s is missing a lot
      of information. The struct file_operations methods have long been left
      unimplemented (hence the familiar compile-time warning, "Need to set
      some I/O handlers here").
      
      Slot resources have a complex structure which varies depending on board
      function. The logic for interpreting these ROM data structures is found
      in nubus.c. Let's not duplicate that logic in proc.c.
      
      Create the /proc/bus/nubus/s/ inodes while scanning slot s. During
      descent through slot resource subdirectories, call the new
      nubus_proc_add_foo() functions to create the procfs inodes.
      
      Also add a new function, nubus_seq_write_rsrc_mem(), to write the
      contents of a particular slot resource to a given seq_file. This is
      used by the procfs file_operations methods, to finally give userspace
      access to slot ROM information, such as the available video modes.
      Tested-by: Stan Johnson <userm57@yahoo.com>
      Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      2f7dd07e
    • F
      nubus: Clean up whitespace · 4bccc4b6
      Finn Thain committed
      Tested-by: Stan Johnson <userm57@yahoo.com>
      Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      4bccc4b6
    • F
      nubus: Remove redundant code · 9f97977d
      Finn Thain committed
      Eliminate unused values from struct nubus_dev to save wasted memory
      (a Radius PrecisionColor 24X card has about 95 functional resources
      and up to six such cards may be fitted). Also remove redundant static
      variable initialization, an unreachable !MACH_IS_MAC conditional,
      the unused nubus_find_device() function, the bogus get_nubus_list()
      prototype and the pointless card_present temporary variable.
      Tested-by: Stan Johnson <userm57@yahoo.com>
      Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      9f97977d
    • F
      nubus: Call proc_mkdir() not more than once per slot directory · 6c8b89ea
      Finn Thain committed
      This patch fixes the following WARNING.
      
      proc_dir_entry 'nubus/a' already registered
      Modules linked in:
      CPU: 0 PID: 1 Comm: swapper Tainted: G        W       4.13.0-00036-gd57552077387 #1
      Stack from 01c1bd9c:
              01c1bd9c 003c2c8b 01c1bdc0 0001b0fe 00000000 00322f4a 01c43a20 01c43b0c
              01c8c420 01c1bde8 0001b1b8 003a4ac3 00000148 000faa26 00000009 00000000
              01c1bde0 003a4b6c 01c1bdfc 01c1be20 000faa26 003a4ac3 00000148 003a4b6c
              01c43a71 01c8c471 01c10000 00326430 0043d00c 00000005 01c71a00 0020bce0
              00322964 01c1be38 000fac04 01c43a20 01c8c420 01c1bee0 01c8c420 01c1be50
              000fac4c 01c1bee0 00000000 01c43a20 00000000 01c1bee8 0020bd26 01c1bee0
      Call Trace: [<0001b0fe>] __warn+0xae/0xde
       [<00322f4a>] memcmp+0x0/0x5c
       [<0001b1b8>] warn_slowpath_fmt+0x2e/0x36
       [<000faa26>] proc_register+0xbe/0xd8
       [<000faa26>] proc_register+0xbe/0xd8
       [<00326430>] sprintf+0x0/0x20
       [<0020bce0>] nubus_proc_attach_device+0x0/0x1b8
       [<00322964>] strcpy+0x0/0x22
       [<000fac04>] proc_mkdir_data+0x64/0x96
       [<000fac4c>] proc_mkdir+0x16/0x1c
       [<0020bd26>] nubus_proc_attach_device+0x46/0x1b8
       [<0020bce0>] nubus_proc_attach_device+0x0/0x1b8
       [<00322964>] strcpy+0x0/0x22
       [<00001ba6>] kernel_pg_dir+0xba6/0x1000
       [<004339a2>] proc_bus_nubus_add_devices+0x1a/0x2e
       [<000faa40>] proc_create_data+0x0/0xf2
       [<0003297c>] parse_args+0x0/0x2d4
       [<00433a08>] nubus_proc_init+0x52/0x5a
       [<00433944>] nubus_init+0x0/0x44
       [<00433982>] nubus_init+0x3e/0x44
       [<000020dc>] do_one_initcall+0x38/0x196
       [<000020a4>] do_one_initcall+0x0/0x196
       [<0003297c>] parse_args+0x0/0x2d4
       [<00322964>] strcpy+0x0/0x22
       [<00040004>] __up_read+0xe/0x40
       [<004231d4>] repair_env_string+0x0/0x7a
       [<0042312e>] kernel_init_freeable+0xee/0x194
       [<00423146>] kernel_init_freeable+0x106/0x194
       [<00433944>] nubus_init+0x0/0x44
       [<000a6000>] kfree+0x0/0x156
       [<0032768c>] kernel_init+0x0/0xda
       [<00327698>] kernel_init+0xc/0xda
       [<0032768c>] kernel_init+0x0/0xda
       [<00002a90>] ret_from_kernel_thread+0xc/0x14
      ---[ end trace 14a6d619908ea253 ]---
      ------------[ cut here ]------------
      
      This gets repeated with each additional functional resource.
      
      The problem here is the call to proc_mkdir() when the directory already
      exists. Each nubus_board gets a directory, such as /proc/bus/nubus/s/
      where s is the hex slot number. Therefore, store the 'procdir' pointer
      in struct nubus_board instead of struct nubus_dev.
      Tested-by: Stan Johnson <userm57@yahoo.com>
      Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      6c8b89ea
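Storing the directory pointer per board, so that the directory is created at most once per slot, might look like the following userspace model. `demo_proc_mkdir()` is a hypothetical stand-in for proc_mkdir() that only counts calls, and `struct demo_board` is not the real struct nubus_board.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a procfs directory entry. */
struct demo_proc_dir {
    char name[4];
};

/* Hypothetical board: one cached directory per slot. */
struct demo_board {
    unsigned slot;
    struct demo_proc_dir *procdir;
};

static int demo_mkdir_calls;   /* how often the "mkdir" really ran */

/* Stand-in for proc_mkdir(): allocates an entry and counts the call. */
static struct demo_proc_dir *demo_proc_mkdir(const char *name)
{
    struct demo_proc_dir *d = calloc(1, sizeof(*d));

    if (!d)
        return NULL;
    snprintf(d->name, sizeof(d->name), "%s", name);
    demo_mkdir_calls++;
    return d;
}

/* Create the slot directory on first use, then reuse the cached pointer. */
static struct demo_proc_dir *demo_board_procdir(struct demo_board *b)
{
    if (!b->procdir) {
        char name[4];

        snprintf(name, sizeof(name), "%x", b->slot & 0xf);  /* hex slot number */
        b->procdir = demo_proc_mkdir(name);
    }
    return b->procdir;
}
```

Each functional resource of a board would call `demo_board_procdir()` and receive the same entry, so the duplicate-registration warning in the trace above has no counterpart in this model.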
    • F
      nubus: Use static functions where possible · 460cf95e
      Finn Thain committed
      This fixes a couple of warnings from 'make W=1':
      drivers/nubus/nubus.c:790: warning: no previous prototype for 'nubus_probe_slot'
      drivers/nubus/nubus.c:824: warning: no previous prototype for 'nubus_scan_bus'
      Tested-by: Stan Johnson <userm57@yahoo.com>
      Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      460cf95e
    • F
      nubus: Fix up header split · 1ff2775a
      Finn Thain committed
      Due to the '#ifdef __KERNEL__' being located in the wrong place, some
      definitions from the kernel API were placed in the UAPI header during
      the scripted header split. Fix this. Also, remove the duplicate comment
      which is only relevant to the UAPI header.
      
      Fixes: 607ca46e ("UAPI: (Scripted) Disintegrate include/linux")
      Tested-by: Stan Johnson <userm57@yahoo.com>
      Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      1ff2775a
    • F
      nubus: Avoid array underflow and overflow · 2f828fb2
      Finn Thain committed
      Check array indices. Avoid sprintf. Use buffers of sufficient size.
      Use appropriate types for array length parameters.
      Tested-by: Stan Johnson <userm57@yahoo.com>
      Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      2f828fb2
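The practices named in this commit message (check array indices, avoid sprintf, size buffers properly, use suitable length types) can be illustrated generically. The helpers below are hypothetical and are not taken from the nubus code; they only show the pattern with `size_t` lengths and `snprintf`.

```c
#include <stddef.h>
#include <stdio.h>

/* Bounds-checked table lookup: returns NULL instead of reading out of range. */
static const char *demo_lookup(const char *const *table, size_t table_len,
                               size_t idx)
{
    return idx < table_len ? table[idx] : NULL;
}

/*
 * Format a slot number in hex without overflowing buf.
 * Returns the number of characters written, or -1 if buf is too small.
 */
static int demo_format_slot(char *buf, size_t buf_len, unsigned slot)
{
    int n = snprintf(buf, buf_len, "%x", slot);

    if (n < 0 || (size_t)n >= buf_len)
        return -1;   /* output error or truncated output */
    return n;
}
```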
    • A
      hrtimer: Implement support for softirq based hrtimers · 5da70160
      Anna-Maria Gleixner committed
      hrtimer callbacks are always invoked in hard interrupt context. Several
      users in tree require soft interrupt context for their callbacks and
      achieve this by combining a hrtimer with a tasklet. The hrtimer schedules
      the tasklet in hard interrupt context and the tasklet callback gets invoked
      in softirq context later.
      
      That's suboptimal and aside from that the real-time patch moves most of the
      hrtimers into softirq context. So adding native support for hrtimers
      expiring in softirq context is a valuable extension for both mainline and
      the RT patch set.
      
      Each valid hrtimer clock id has two associated hrtimer clock bases: one for
      timers expiring in hardirq context and one for timers expiring in softirq
      context.
      
      Implement the functionality to associate a hrtimer with the hard or softirq
      related clock bases and update the relevant functions to take them into
      account when the next expiry time needs to be evaluated.
      
      Add a check into the hard interrupt context handler functions to check
      whether the first expiring softirq based timer has expired. If it's expired
      the softirq is raised and the accounting of softirq based timers to
      evaluate the next expiry time for programming the timer hardware is skipped
      until the softirq processing has finished. At the end of the softirq
      processing the regular processing is resumed.
      Suggested-by: Thomas Gleixner <tglx@linutronix.de>
      Suggested-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: keescook@chromium.org
      Link: http://lkml.kernel.org/r/20171221104205.7269-29-anna-maria@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      5da70160
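The two mechanisms described in this commit message (a hard and a soft clock base per clock id, and a hard interrupt check that raises the softirq and then skips soft timers until processing finishes) can be modelled roughly as below. All names are hypothetical, and the layout with soft bases placed after the hard ones is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical clock ids; the real set is larger. */
enum demo_clock { DEMO_CLOCK_MONOTONIC, DEMO_CLOCK_REALTIME, DEMO_NR_CLOCKS };

/* Each clock id has two bases; soft bases follow the hard ones in this sketch. */
static unsigned demo_base_index(enum demo_clock clk, bool soft)
{
    return (unsigned)clk + (soft ? DEMO_NR_CLOCKS : 0);
}

struct demo_cpu_base {
    uint64_t softirq_expires_next;  /* expiry of the first soft timer */
    bool softirq_activated;         /* softirq raised and not yet finished */
    unsigned softirq_raised;        /* counts raises, for demonstration only */
};

/* Called from the hard interrupt path with the current time. */
static void demo_hardirq_check(struct demo_cpu_base *cb, uint64_t now)
{
    if (cb->softirq_activated)
        return;                     /* soft timers are skipped until done */

    if (now >= cb->softirq_expires_next) {
        cb->softirq_expires_next = UINT64_MAX;
        cb->softirq_activated = true;
        cb->softirq_raised++;       /* stands in for raising the softirq */
    }
}

/* Called at the end of softirq processing: resume regular accounting. */
static void demo_softirq_done(struct demo_cpu_base *cb, uint64_t next_soft_expiry)
{
    cb->softirq_activated = false;
    cb->softirq_expires_next = next_soft_expiry;
}
```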
    • J
      delayacct: Account blkio completion on the correct task · c96f5471
      Josh Snyder committed
      Before commit:
      
        e33a9bba ("sched/core: move IO scheduling accounting from io_schedule_timeout() into scheduler")
      
      delayacct_blkio_end() was called after context-switching into the task which
      completed I/O.
      
      This resulted in double counting: the task would account a delay both waiting
      for I/O and for time spent in the runqueue.
      
      With e33a9bba, delayacct_blkio_end() is called by try_to_wake_up().
      In ttwu, we have not yet context-switched. This is more correct, in that
      the delay accounting ends when the I/O is complete.
      
      But delayacct_blkio_end() relies on 'get_current()', and we have not yet
      context-switched into the task whose I/O completed. This results in the
      wrong task having its delay accounting statistics updated.
      
      Instead of doing that, pass the task_struct being woken to delayacct_blkio_end(),
      so that it can update the statistics of the correct task.
      Signed-off-by: Josh Snyder <joshs@netflix.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Acked-by: Balbir Singh <bsingharora@gmail.com>
      Cc: <stable@vger.kernel.org>
      Cc: Brendan Gregg <bgregg@netflix.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-block@vger.kernel.org
      Fixes: e33a9bba ("sched/core: move IO scheduling accounting from io_schedule_timeout() into scheduler")
      Link: http://lkml.kernel.org/r/1513613712-571-1-git-send-email-joshs@netflix.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      c96f5471
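The difference between updating "the current task" and updating the task that is being woken can be shown with a small model. `struct demo_task`, `demo_current` and both functions are hypothetical; the delay statistics are reduced to a single counter for illustration.

```c
/* Hypothetical task with one delay-accounting counter. */
struct demo_task {
    unsigned long blkio_delay_count;
};

/* Stand-in for get_current(): the task running on this CPU. */
static struct demo_task *demo_current;

/* Old behaviour: always charges whoever is current, i.e. the waker. */
static void demo_blkio_end_old(void)
{
    demo_current->blkio_delay_count++;
}

/* Fixed behaviour: the caller passes the task whose I/O completed. */
static void demo_blkio_end(struct demo_task *p)
{
    p->blkio_delay_count++;
}
```

When the wakeup path runs, `demo_current` is still the waker, so only the second form charges the sleeper.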
    • A
      hrtimer: Add clock bases and hrtimer mode for softirq context · 98ecadd4
      Anna-Maria Gleixner committed
      Currently hrtimer callback functions are always executed in hard interrupt
      context. Users of hrtimers, which need their timer function to be executed
      in soft interrupt context, make use of tasklets to get the proper context.
      
      Add additional hrtimer clock bases for timers which must expire in softirq
      context, so the detour via the tasklet can be avoided. This is also
      required for RT, where the majority of hrtimers are moved into softirq
      hrtimer context.
      
      The selection of the expiry mode happens via a mode bit. Introduce
      HRTIMER_MODE_SOFT and the matching combinations with the ABS/REL/PINNED
      bits and update the decoding of hrtimer_mode in tracepoints.
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: keescook@chromium.org
      Link: http://lkml.kernel.org/r/20171221104205.7269-27-anna-maria@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      98ecadd4
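Selecting the expiry context through a mode bit, combined with the ABS/REL/PINNED bits, can be sketched with a flag enum. The `DEMO_` names and the numeric bit values are assumptions made for this sketch, and `demo_mode_name()` is only a toy decoder, not the tracepoint code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical mode bits; the numeric values are assumptions of this sketch. */
enum demo_hrtimer_mode {
    DEMO_MODE_ABS    = 0x00,   /* absolute expiry time */
    DEMO_MODE_REL    = 0x01,   /* expiry relative to now */
    DEMO_MODE_PINNED = 0x02,   /* timer bound to a CPU */
    DEMO_MODE_SOFT   = 0x04,   /* callback runs in softirq context */

    DEMO_MODE_ABS_SOFT        = DEMO_MODE_ABS | DEMO_MODE_SOFT,
    DEMO_MODE_REL_SOFT        = DEMO_MODE_REL | DEMO_MODE_SOFT,
    DEMO_MODE_ABS_PINNED_SOFT = DEMO_MODE_ABS | DEMO_MODE_PINNED | DEMO_MODE_SOFT,
    DEMO_MODE_REL_PINNED_SOFT = DEMO_MODE_REL | DEMO_MODE_PINNED | DEMO_MODE_SOFT
};

/* True if the timer should be queued on a softirq clock base. */
static bool demo_mode_is_soft(unsigned mode)
{
    return (mode & DEMO_MODE_SOFT) != 0;
}

/* Toy decoder producing strings such as "REL|PINNED|SOFT". */
static void demo_mode_name(unsigned mode, char *buf, size_t len)
{
    snprintf(buf, len, "%s%s%s",
             (mode & DEMO_MODE_REL) ? "REL" : "ABS",
             (mode & DEMO_MODE_PINNED) ? "|PINNED" : "",
             (mode & DEMO_MODE_SOFT) ? "|SOFT" : "");
}
```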