Commits · 324a56e16e44baecac3ca799fd216154145c14bf · openeuler / raspberrypi-kernel

12 Dec, 2013 1 commit

kernfs: s/sysfs_dirent/kernfs_node/ and rename its friends accordingly · 324a56e1

Authored by Tejun Heo on Dec 11, 2013

kernfs has just been separated out from sysfs and we're already in
full conflict mode.  Nothing can make the situation any worse.  Let's
take the chance to name things properly.

This patch performs the following renames.

* s/sysfs_elem_dir/kernfs_elem_dir/
* s/sysfs_elem_symlink/kernfs_elem_symlink/
* s/sysfs_elem_attr/kernfs_elem_file/
* s/sysfs_dirent/kernfs_node/
* s/sd/kn/ in kernfs proper
* s/parent_sd/parent/
* s/target_sd/target/
* s/dir_sd/parent/
* s/to_sysfs_dirent()/rb_to_kn()/
* misc renames of local vars when they conflict with the above

Because md, mic and gpio dig into sysfs details, this patch ends up
modifying them.  All are sysfs_dirent renames and trivial.  While we
can avoid these by introducing a dummy wrapping struct sysfs_dirent
around kernfs_node, given the limited usage outside kernfs and sysfs
proper, I don't think such workaround is called for.

This patch is strictly rename only and doesn't introduce any
functional difference.

- mic / gpio renames were missing.  Spotted by kbuild test robot.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Cc: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

09 Dec, 2013 6 commits

driver core: fix device_create() error path · bbc780f8

Authored by David Herrmann on Nov 21, 2013

We call put_device() in the error path, which is fine for dev==NULL.
However, if kobject_set_name_vargs() fails, we have dev!=NULL but
device_initialize() has not been called yet.

Fix this by splitting device_register() into explicit calls to
device_add() and an early call to device_initialize().
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
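
The ordering problem described in this commit can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct fake_device`, `fake_initialize()`, `fake_put()` and `fake_create()` are hypothetical stand-ins and not the driver-core API. It shows why the reference count should be set up before any step that can fail, so that the error path can drop the reference unconditionally.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct device: only a refcount and a flag. */
struct fake_device {
    int refcount;
    bool added;
};

static int released;  /* counts how many objects the release path freed */

/* Models device_initialize(): take the initial reference. */
static void fake_initialize(struct fake_device *dev)
{
    dev->refcount = 1;
    dev->added = false;
}

/* Models put_device(): drop a reference, free on the last one. NULL is a no-op. */
static void fake_put(struct fake_device *dev)
{
    if (!dev)
        return;
    if (--dev->refcount == 0) {
        free(dev);
        released++;
    }
}

/*
 * Models the fixed flow: initialize early, then run the steps that may
 * fail, then add. name_fails simulates a failure while setting the name.
 */
static struct fake_device *fake_create(bool name_fails)
{
    struct fake_device *dev = calloc(1, sizeof(*dev));

    if (!dev)
        goto error;
    fake_initialize(dev);      /* refcount is valid from here on */
    if (name_fails)
        goto error;            /* safe: fake_put() sees refcount == 1 */
    dev->added = true;         /* models device_add() */
    return dev;

error:
    fake_put(dev);             /* handles both dev == NULL and dev != NULL */
    return NULL;
}
```

In this model, a failure after `fake_initialize()` releases the object exactly once, which is the property the split into an early initialize and a later add is meant to provide.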

firmware: dmi-sysfs: Remove "dmi" directory on module exit · d0f80f9a

Authored by Bjorn Helgaas on Dec 05, 2013

With CONFIG_DEBUG_KOBJECT_RELEASE=y, removing and immediately reloading the
dmi-sysfs module causes the following warning:

sysfs: cannot create duplicate filename '/firmware/dmi'
kobject_add_internal failed for dmi with -EEXIST, don't try to register things with the same name in the same directory.

The "dmi" directory stays in sysfs until the dmi_kobj is released, and
DEBUG_KOBJECT_RELEASE delays that.

I don't think we can hit this problem in normal usage because dmi_kobj is
static and nothing outside dmi-sysfs can get a reference to it, so the
only way to delay the "dmi" release is with DEBUG_KOBJECT_RELEASE.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

firmware: dmi-sysfs: Don't remove dmi-sysfs "raw" file explicitly · a61aca28

Authored by Bjorn Helgaas on Dec 05, 2013

Removing the dmi-sysfs module causes the following warning:

  # modprobe -r dmi_sysfs
  WARNING: CPU: 11 PID: 6785 at fs/sysfs/inode.c:325 sysfs_hash_and_remove+0xa9/0xb0()
  sysfs: can not remove 'raw', no directory

This is because putting the entry kobject, e.g., for
"/sys/firmware/dmi/entries/19-0", removes the directory and all its
contents.  By the time dmi_sysfs_entry_release() runs, the "raw" file
inside ".../19-0/" has already been removed.

Therefore, we don't need to remove the "raw" bin file at all in
dmi_sysfs_entry_release().
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

firmware: Suppress fallback warnings when CONFIG_FW_LOADER_USER_HELPER=n · 68aeeaaa

Authored by Takashi Iwai on Dec 02, 2013

The commit [3e358ac2: firmware: Be a bit more verbose about direct
firmware loading failure] introduced a new warning message about
falling back to user helper, but this isn't true when
CONFIG_FW_LOADER_USER_HELPER isn't set.

In this patch, clear the FW_OPT_FALLBACK flag in the case without
userhelper, so that the corresponding code will be disabled.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

firmware: Use bit flags instead of boolean combos · 14c4bae7

Authored by Takashi Iwai on Dec 02, 2013

More than two boolean arguments to a function are rather confusing and
error-prone for callers.  Let's make the behavior bit flags instead of
triple combos.

A nice suggestion by Borislav Petkov.
Acked-by: Borislav Petkov <bp@suse.de>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
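
As a rough illustration of the idea in this commit, the sketch below packs three behaviors into one flags word. The flag names, their bit values, and `struct fw_request` are assumptions made for this example and may differ from what the patch itself defines.

```c
#include <stdbool.h>

/* Illustrative option bits; the values are assumed for this sketch. */
enum {
    FW_OPT_UEVENT   = 1U << 0,  /* notify user space about the request */
    FW_OPT_NOWAIT   = 1U << 1,  /* asynchronous request */
    FW_OPT_FALLBACK = 1U << 2,  /* allow falling back to the user helper */
};

/* Hypothetical decoded form of a request, used only to show the mapping. */
struct fw_request {
    bool uevent;
    bool nowait;
    bool fallback;
};

/*
 * With three positional booleans a call reads load(name, true, false, true),
 * which says little about what is being asked for. With a flags word the
 * call site names every option it enables.
 */
static struct fw_request decode_flags(unsigned int opt_flags)
{
    struct fw_request req = {
        .uevent   = (opt_flags & FW_OPT_UEVENT) != 0,
        .nowait   = (opt_flags & FW_OPT_NOWAIT) != 0,
        .fallback = (opt_flags & FW_OPT_FALLBACK) != 0,
    };

    return req;
}
```

A caller would then write something like `decode_flags(FW_OPT_UEVENT | FW_OPT_FALLBACK)`, and adding a fourth option later does not change any existing call.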

firmware: Introduce request_firmware_direct() · bba3a87e

Authored by Takashi Iwai on Dec 02, 2013

When CONFIG_FW_LOADER_USER_HELPER is set, request_firmware() falls
back to the usermode helper for loading via udev when the direct
loading fails.  But recent udev versions use a far too long timeout (60
seconds) for non-existent firmware.  This is unacceptable for
drivers like the microcode loader, which load firmware optionally,
i.e. it is not an error if the requested file does not exist.

This patch provides a new helper function, request_firmware_direct().
It behaves the same as request_firmware() except that it doesn't
fall back to the usermode helper but returns an error immediately if the
f/w can't be loaded directly in the kernel.

Without CONFIG_FW_LOADER_USER_HELPER=y, request_firmware_direct() is
just an alias of request_firmware(), for obvious reasons.
Tested-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Ming Lei <ming.lei@canonical.com>
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
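
The difference between the two entry points can be modelled in userspace as follows. This is a sketch, not the firmware loader: `load_direct()`, `load_via_helper()`, `request_common()`, `request_fw()` and `request_fw_direct()` are hypothetical names, the single existing file is made up, and the helper is modelled as always timing out.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

static int helper_calls;  /* how often the slow user helper path was taken */

/* Hypothetical direct loader: only "present.bin" exists in this model. */
static int load_direct(const char *name)
{
    return strcmp(name, "present.bin") == 0 ? 0 : -ENOENT;
}

/* Hypothetical user helper: modelled as timing out for every request. */
static int load_via_helper(const char *name)
{
    (void)name;
    helper_calls++;
    return -ETIMEDOUT;
}

/* Common path: try the direct load, then optionally fall back. */
static int request_common(const char *name, bool allow_fallback)
{
    int ret = load_direct(name);

    if (ret == 0 || !allow_fallback)
        return ret;                /* success, or fail fast without the helper */
    return load_via_helper(name);  /* may block for the helper timeout */
}

/* Models request_firmware(): direct load with user helper fallback. */
static int request_fw(const char *name)
{
    return request_common(name, true);
}

/* Models request_firmware_direct(): direct load only. */
static int request_fw_direct(const char *name)
{
    return request_common(name, false);
}
```

For an optional file that does not exist, the direct variant in this model returns `-ENOENT` immediately and never touches the helper, which is the behavior the commit message asks for.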

05 Dec, 2013 2 commits

PowerCap: Fix mode for energy counter · 95677a9a

Authored by Srinivas Pandruvada on Dec 04, 2013

As per the powercap sysfs documentation, the energy_uj field is read-only
if it can't be reset. Currently it always allows writes, but they fail
if there is no reset callback.
Change the mode field to read-only if there is no reset callback.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Reported-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
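
The rule described in this commit amounts to choosing the file mode from the presence of a callback. The sketch below assumes a hypothetical `struct zone_ops` with a single `reset_energy_uj` hook and uses plain octal permission values; the actual structure and macros in the powercap code may look different.

```c
#include <stddef.h>

/* Hypothetical subset of a zone's operations; only the reset hook matters. */
struct zone_ops {
    int (*reset_energy_uj)(void);
};

/* Permission bits in octal: read-only for everyone vs. owner-writable. */
#define MODE_RO 0444
#define MODE_RW 0644

/* Example callback, present only so that a non-NULL hook can be shown. */
static int dummy_reset(void)
{
    return 0;
}

/* Expose the attribute as writable only when a write could succeed. */
static unsigned int energy_uj_mode(const struct zone_ops *ops)
{
    return (ops && ops->reset_energy_uj) ? MODE_RW : MODE_RO;
}
```

With this choice, user space can tell from the file permissions alone whether a reset is supported, instead of finding out from a failed write.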

PNP: fix restoring devices after hibernation · 8a37ea50

Authored by Dmitry Torokhov on Dec 05, 2013

On returning from hibernation 'restore' callback is called,
not 'resume'.  Fix it.

Fixes: eaf140b6 (PNP: convert PNP driver bus legacy pm_ops to dev_pm_ops)
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: 3.12+ <stable@vger.kernel.org> # 3.12+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

04 Dec, 2013 3 commits

video: vt8500: fix error handling in probe() · 46ac2956

Authored by Dan Carpenter on Dec 02, 2013

We shouldn't kfree(fbi) because that was allocated with devm_kzalloc().
There were several error paths which returned directly instead of
releasing resources.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>

atmel_lcdfb: fix module autoload · 5a0973f3

Authored by Johan Hovold on Oct 22, 2013

Add missing module device table which is needed for module autoloading.
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>

cpuidle: Check for dev before deregistering it. · 813e8e3d

Authored by Konrad Rzeszutek Wilk on Dec 03, 2013

If not, we could end up in the unfortunate situation where
we dereference a NULL pointer b/c we have cpuidle disabled.

This is the case when booting under Xen (which uses the
ACPI P/C states but disables the CPU idle driver) - and can
be easily reproduced when booting with cpuidle.off=1.

BUG: unable to handle kernel NULL pointer dereference at           (null)
IP: [<ffffffff8156db4a>] cpuidle_unregister_device+0x2a/0x90
.. snip..
Call Trace:
 [<ffffffff813b15b4>] acpi_processor_power_exit+0x3c/0x5c
 [<ffffffff813af0a9>] acpi_processor_stop+0x61/0xb6
 [<ffffffff814215bf>] __device_release_driver+0fffff81421653>] device_release_driver+0x23/0x30
 [<ffffffff81420ed8>] bus_remove_device+0x108/0x180
 [<ffffffff8141d9d9>] device_del+0x129/0x1c0
 [<ffffffff813cb4b0>] ? unregister_xenbus_watch+0x1f0/0x1f0
 [<ffffffff8141da8e>] device_unregister+0x1e/0x60
 [<ffffffff814243e9>] unregister_cpu+0x39/0x60
 [<ffffffff81019e03>] arch_unregister_cpu+0x23/0x30
 [<ffffffff813c3c51>] handle_vcpu_hotplug_event+0xc1/0xe0
 [<ffffffff813cb4f5>] xenwatch_thread+0x45/0x120
 [<ffffffff810af010>] ? abort_exclusive_wait+0xb0/0xb0
 [<ffffffff8108ec42>] kthread+0xd2/0xf0
 [<ffffffff8108eb70>] ? kthread_create_on_node+0x180/0x180
 [<ffffffff816ce17c>] ret_from_fork+0x7c/0xb0
 [<ffffffff8108eb70>] ? kthread_create_on_node+0x180/0x180

This problem also appears in 3.12 and could be a candidate for backport.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

03 Dec, 2013 15 commits

[SCSI] bfa: Fix crash when symb name set for offline vport · 22a08538

Authored by Vijaya Mohan Guvva on Nov 21, 2013

This patch fixes a crash when trying to set the symbolic name for an offline
vport through sysfs. The crash is due to the uninitialized pointer lport->ns,
which gets initialized only on linkup (port online).
Signed-off-by: Vijaya Mohan Guvva <vmohan@brocade.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

cpufreq: fix garbage kobjects on errors during suspend/resume · 2167e239

Authored by Bjørn Mork on Dec 03, 2013

This is effectively a revert of commit 5302c3fb ("cpufreq: Perform
light-weight init/teardown during suspend/resume"), which enabled
suspend/resume optimizations leaving the sysfs files in place.

Errors during suspend/resume are not handled properly, leaving
dead sysfs attributes in case of failures.  There are a number of
functions with special code for the "frozen" case, and all these
need to also have special error handling.

The problem is easy to demonstrate by making cpufreq_driver->init()
or cpufreq_driver->get() fail during resume.

The code is too complex for a simple fix, with split code paths
in multiple blocks within a number of functions.  It is therefore
best to revert the patch enabling this code until the error handling
is in place.

Examples of problems resulting from resume errors:

WARNING: CPU: 0 PID: 6055 at fs/sysfs/file.c:343 sysfs_open_file+0x77/0x212()
missing sysfs attribute operations for kobject: (null)
Modules linked in: [stripped as irrelevant]
CPU: 0 PID: 6055 Comm: grep Tainted: G      D      3.13.0-rc2 #153
Hardware name: LENOVO 2776LEG/2776LEG, BIOS 6EET55WW (3.15 ) 12/19/2011
 0000000000000009 ffff8802327ebb78 ffffffff81380b0e 0000000000000006
 ffff8802327ebbc8 ffff8802327ebbb8 ffffffff81038635 0000000000000000
 ffffffff811823c7 ffff88021a19e688 ffff88021a19e688 ffff8802302f9310
Call Trace:
 [<ffffffff81380b0e>] dump_stack+0x55/0x76
 [<ffffffff81038635>] warn_slowpath_common+0x7c/0x96
 [<ffffffff811823c7>] ? sysfs_open_file+0x77/0x212
 [<ffffffff810386e3>] warn_slowpath_fmt+0x41/0x43
 [<ffffffff81182dec>] ? sysfs_get_active+0x6b/0x82
 [<ffffffff81182382>] ? sysfs_open_file+0x32/0x212
 [<ffffffff811823c7>] sysfs_open_file+0x77/0x212
 [<ffffffff81182350>] ? sysfs_schedule_callback+0x1ac/0x1ac
 [<ffffffff81122562>] do_dentry_open+0x17c/0x257
 [<ffffffff8112267e>] finish_open+0x41/0x4f
 [<ffffffff81130225>] do_last+0x80c/0x9ba
 [<ffffffff8112dbbd>] ? inode_permission+0x40/0x42
 [<ffffffff81130606>] path_openat+0x233/0x4a1
 [<ffffffff81130b7e>] do_filp_open+0x35/0x85
 [<ffffffff8113b787>] ? __alloc_fd+0x172/0x184
 [<ffffffff811232ea>] do_sys_open+0x6b/0xfa
 [<ffffffff811233a7>] SyS_openat+0xf/0x11
 [<ffffffff8138c812>] system_call_fastpath+0x16/0x1b

The failure to restore cpufreq devices on cancelled hibernation is
not a new bug. It is caused by the ACPI _PPC call failing unless the
hibernate is completed. This makes the acpi_cpufreq driver fail its
init.

Previously, the cpufreq device could be restored by offlining the
cpu temporarily.  And as a complete hibernation cycle would do this,
it would be automatically restored most of the time.  But after
commit 5302c3fb the leftover sysfs attributes will block any
device add action.  Therefore offlining and onlining CPU 1 will no
longer restore the cpufreq object, and a complete suspend/resume
cycle will replace it with garbage.

Fixes: 5302c3fb ("cpufreq: Perform light-weight init/teardown during suspend/resume")
Cc: 3.12+ <stable@vger.kernel.org> # 3.12+
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

gpiolib: change a warning to debug message when failing to get gpio · 351cfe0f

Authored by Heikki Krogerus on Nov 29, 2013

It's the driver's responsibility to react to a failure to get
the gpio descriptors, not the framework's. Since there are
some common peripherals that may or may not have certain
pins connected to gpio lines, depending on the platform,
printing the warning there may end up generating useless bug
reports.
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

powerpc/gpio: Fix the wrong GPIO input data on MPC8572/MPC8536 · 1aeef303

Authored by Liu Gang on Nov 22, 2013

For MPC8572/MPC8536, the status of GPIOs defined as output
cannot be determined by reading the GPDAT register, so the code
uses a shadow data register instead. But the code may give the
wrong status of GPIOs defined as input under some scenarios:

1. If some pins were configured as inputs and were asserted
high before booting the kernel, the shadow data has been
initialized with those pin values.
2. Some pins have been configured as output first and have
been set to the high value, then reconfigured as input.

The above cases will cause the shadow data for those input
pins to be set to high. Then reading the pin status will
always return high even if the actual pin status is low.

The code should eliminate the effects of the shadow data on
the input pins, and the status of those pins should be
read directly from GPDAT.

Cc: stable@vger.kernel.org
Acked-by: Scott Wood <scottwood@freescale.com>
Acked-by: Anatolij Gustschin <agust@denx.de>
Signed-off-by: Liu Gang <Gang.Liu@freescale.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
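
The read path this commit asks for can be sketched with plain integers: take input pins from the data register and output pins from the shadow copy. The function names are hypothetical, registers are passed in as values rather than read from hardware, and the sketch assumes that a set direction bit means "output" and that pin 0 is the least significant bit.

```c
#include <stdint.h>

/*
 * dat:    value read from the data register (trusted for input pins only)
 * dir:    direction mask, bit set = pin configured as output (assumption)
 * shadow: software copy of the last value written to the outputs
 */
static uint32_t gpio_read_all(uint32_t dat, uint32_t dir, uint32_t shadow)
{
    uint32_t inputs  = dat & ~dir;    /* hardware state for input pins */
    uint32_t outputs = shadow & dir;  /* shadow state for output pins */

    return inputs | outputs;
}

/* Returns 0 or 1 for a single pin, counting from the least significant bit. */
static int gpio_get(uint32_t dat, uint32_t dir, uint32_t shadow,
                    unsigned int pin)
{
    return (int)((gpio_read_all(dat, dir, shadow) >> pin) & 1U);
}
```

Masking the shadow value with the direction register is what keeps a stale high bit, left over from boot or from an earlier output configuration, from leaking into the value reported for an input pin.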

gpiolib: use platform GPIO mappings as fallback · 35c5d7fd

Authored by Alexandre Courbot on Nov 23, 2013

For platforms that use device tree or ACPI as the standard way to look
GPIOs up, allow the platform-defined GPIO mappings to be used as a
fallback. This may be useful for platforms that need extra GPIO mappings
not defined by the firmware.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

gpiolib: fix lookup of platform-mapped GPIOs · 7cc67b9c

Authored by Alexandre Courbot on Nov 23, 2013

A typo resulted in GPIO lookup failing unconditionally.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Reported-by: Stephen Warren <swarren@wwwdotorg.org>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

sh-pfc: sh7372: Fix pin bias setup · 71493de7

Authored by Laurent Pinchart on Nov 28, 2013

When computing the pin configuration register offset the bias setup code
erroneously compares the pin number range with the loop index instead of
the pin number. Fix it.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
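
The class of bug described here, comparing a range bound with the loop index instead of the pin number, can be shown with a small lookup table. Everything in this sketch is hypothetical: the `portcr_group` layout, the pin ranges, and the offsets are made up and are not taken from the sh7372 driver.

```c
#include <stddef.h>

struct portcr_group {
    unsigned int end_pin;  /* last pin number covered by this group */
    unsigned int offset;   /* register block offset for the group */
};

/* Made-up layout for illustration; groups are sorted by end_pin. */
static const struct portcr_group groups[] = {
    {  45, 0x1000 },
    {  75, 0x2000 },
    { 190, 0x3000 },
};

#define NGROUPS (sizeof(groups) / sizeof(groups[0]))

/* Returns the register offset for pin, or -1 if the pin is out of range. */
static long portcr_offset(unsigned int pin)
{
    size_t i;

    for (i = 0; i < NGROUPS; i++) {
        /*
         * A check of the form "i <= groups[i].end_pin" is already true
         * for i == 0, so every pin would resolve to the first group.
         * The comparison has to use the pin number.
         */
        if (pin <= groups[i].end_pin)
            return (long)(groups[i].offset + pin);
    }
    return -1;
}
```

With the index-based comparison, pin 60 in this table would be given an offset in the first block rather than the second, which is the kind of wrong register address the fix avoids.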

sh-pfc: r8a7740: Fix pin bias setup · 5d276194

Authored by Laurent Pinchart on Nov 28, 2013

When computing the pin configuration register offset the bias setup code
erroneously compares the pin number range with the loop index instead of
the pin number. Fix it.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

leds: pwm: Fix for deferred probe in DT booted mode · aa1a6d6d

Authored by Peter Ujfalusi on Nov 28, 2013

We need to make sure that the error code from devm_of_pwm_get() is the one
the module returns in case of failure.
Restructure the code to make this possible for DT booted case.
With this patch the driver can ask for deferred probing when the board is
booted with DT.
Fixes for example omap4-sdp board's keyboard backlight led.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Bryan Wu <cooloney@gmail.com>

uio: we cannot mmap unaligned page contents · b6550287

Authored by Linus Torvalds on Dec 02, 2013

In commit 7314e613 ("Fix a few incorrectly checked
[io_]remap_pfn_range() calls") the uio driver started more properly
checking the passed-in user mapping arguments against the size of the
actual uio driver data.

That in turn exposed that some driver authors apparently didn't realize
that mmap can only work on a page granularity, and had tried to use it
with smaller mappings, with the new size check catching that out.

So since it's not just the user mmap() arguments that can be confused,
make the uio mmap code also verify that the uio driver has the memory
allocated at page boundaries in order for mmap to work.  If the device
memory isn't properly aligned, we return

  [ENODEV]
    The fildes argument refers to a file whose type is not supported by mmap().

as per the open group documentation on mmap.
Reported-by: Holger Brunck <holger.brunck@keymile.com>
Acked-by: Greg KH <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
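
The alignment check described in this commit can be sketched as a mask test on the region's start address. The names below are hypothetical and the page size of 4096 bytes is an assumption made for the example; a kernel would use its own page size and mask definitions.

```c
#include <errno.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096UL                     /* assumed page size */
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))  /* clears the in-page bits */

/*
 * Returns 0 if a memory region starting at addr can be mapped, or -ENODEV
 * if it does not start on a page boundary, since mappings work at page
 * granularity.
 */
static int check_mmap_region(uintptr_t addr)
{
    if (addr & ~SKETCH_PAGE_MASK)
        return -ENODEV;
    return 0;
}
```

Rejecting the request up front is the simpler option here: mapping the containing page instead would expose whatever else shares that page with the device memory.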

[SCSI] enclosure: fix WARN_ON in dual path device removing · a1470c7b

Authored by James Bottomley on Nov 15, 2013

Bug report from: wenxiong@linux.vnet.ibm.com

The issue happens in a dual controller configuration. We got
sysfs warnings when removing the ipr module with rmmod.

enclosure_unregister() in drivers/misc/enclosure.c calls device_unregister()
for each component device: device_unregister() -> device_del() -> kobject_del()
-> sysfs_remove_dir(). sysfs_remove_dir() sets kobj->sd = NULL.

For each component device,
enclosure_component_release() -> enclosure_remove_links() -> sysfs_remove_link()
checks kobj->sd again, but it was already set to NULL during
device_unregister(). That is why we saw all these sysfs warnings.

Tested-by: wenxiong@linux.vnet.ibm.com
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] pm80xx: Tasklets synchronization fix. · 6cd60b37

Authored by Nikith Ganigarakoppal on Nov 11, 2013

When multiple vectors are used, the vector variable is overwritten,
resulting in unhandled operations for those vectors.
This fix prevents the problem by maintaining the HBA instance and
vector values for each irq.

[jejb: checkpatch fixes]
Signed-off-by: Nikith.Ganigarakoppal@pmcs.com
Signed-off-by: Anandkumar.Santhanam@pmcs.com
Reviewed-by: Jack Wang <jinpu.wang@profitbricks.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] pm80xx: Resetting the phy state. · 7d029005

Authored by Nikith Ganigarakoppal on Oct 30, 2013

Set the phy state for the hard reset response.
After a hard reset is sent to a device, the phy down event sets
the phy state to zero, but the phy up event does not set
the phy state again. This causes problems for successive
hard resets.

Signed-off-by: Nikith.Ganigarakoppal@pmcs.com
Signed-off-by: Anandkumar.Santhanam@pmcs.com
Reviewed-by: Jack Wang <jinpu.wang@profitbricks.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] pm80xx: Fix for direct attached device. · 34a9b81b

Authored by Nikith Ganigarakoppal on Oct 30, 2013

For a direct attached SATA device, the delay is not enough.
It causes a crash on the set device state command response;
wait_for_completion is the best solution for this.

Signed-off-by: Nikith.Ganigarakoppal@pmcs.com
Signed-off-by: Anandkumar.Santhanam@pmcs.com
Reviewed-by: Jack Wang <jinpu.wang@profitbricks.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] pm80xx: Module author addition · 94f33c16

Authored by Nikith Ganigarakoppal on Nov 13, 2013

Signed-off-by: Nikith.Ganigarakoppal@pmcs.com
Signed-off-by: Anandkumar.Santhanam@pmcs.com
Reviewed-by: Jack Wang <jinpu.wang@profitbricks.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

02 Dec, 2013 4 commits

pinctrl: abx500: Fix header file include guard · a9e51fe5

Authored by Axel Lin on Dec 01, 2013

Fix a trivial typo.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

net/mlx4_en: Remove selftest TX queues empty condition · 833846e8

Authored by Eugenia Emantayev on Dec 01, 2013

Remove waiting for TX queues to become empty during selftest.
This check is not necessary for any purpose, and might put
the driver into an infinite loop.
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

virtio_net: make all RX paths handle erors consistently · f121159d

Authored by Michael S. Tsirkin on Nov 28, 2013

receive_mergeable() now handles errors internally.
Do the same for the big and small packet paths; otherwise
the logic is too hard to follow.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

virtio_net: fix error handling for mergeable buffers · 8fc3b9e9

Authored by Michael S. Tsirkin on Nov 28, 2013

Eric Dumazet noticed that if we encounter an error
when processing a mergeable buffer, we don't
dequeue all of the buffers from this packet;
the result is almost sure to be loss of networking.

Jason Wang noticed that we also leak a page and that we don't decrement
the rq buf count, so we won't repost buffers (a resource leak).

Fix both issues.

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Michael Dalton <mwdalton@google.com>
Reported-by: Eric Dumazet <edumazet@google.com>
Reported-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

01 Dec, 2013 3 commits

[SCSI] hpsa: return 0 from driver probe function on success, not 1 · 88bf6d62

Authored by Stephen M. Cameron on Nov 01, 2013

A return value of 1 is interpreted as an error.  See pci_driver.
in local_pci_probe().  If you're wondering how this ever could
have worked, it's because it used to be the case that only return
values less than zero were interpreted as failure.  But even in
the current kernel if the driver registers its various entry
points with the kernel, and then returns a value which is
interpreted as failure, those registrations aren't undone, so
the driver still mostly works.  However, the driver's remove
function wouldn't be called on rmmod, and pci power management
functions wouldn't work.  In the case of Smart Array, since it
has a battery backed cache (or else no cache) even if the driver
is not shut down properly as long as there is no outstanding
i/o, nothing too bad happens, which is why it took so long to
notice.

Requesting backport to stable because the change to pci-driver.c
which requires driver probe functions to return 0 occurred between
2.6.35 and 2.6.36 (the pci power management breakage) and again
between 3.7 and 3.8 (pci_dev->driver getting set to NULL in
local_pci_probe() preventing driver remove function from being
called on rmmod.)
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] hpsa: do not discard scsi status on aborted commands · 2e311fba

Authored by Stephen M. Cameron on Sep 23, 2013

We inadvertently discarded the scsi status for aborted commands.
For some commands (e.g. reads from tape drives) these can't be retried,
and if we discarded the scsi status, the scsi mid layer couldn't notice
anything was wrong and the error was not reported.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

virtio_net: Fixed a trivial typo (fitler --> filter) · 99e872ae

Authored by Thomas Huth on Nov 29, 2013

"MAC filter" sounds more reasonable than "MAC fitler".
Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

30 Nov, 2013 6 commits

ixgbe: Make ixgbe_identify_qsfp_module_generic static · 88217547

Authored by Mark Rustad on Nov 23, 2013

Correct a namespace complaint by making the function static
and moving the prototype into the .c file.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: turn NETIF_F_HW_L2FW_DOFFLOAD off by default · 8bf1264d

Authored by John Fastabend on Nov 12, 2013

NETIF_F_HW_L2FW_DOFFLOAD allows upper layer net devices such
as macvlan to use queues in the hardware to directly submit and
receive skbs.

This creates a subtle change in the datapath though. One change
being the skb may no longer use the root devices qdisc.

Because users may not expect this we can't enable the feature
by default unless the hardware can offload all the software
functionality above it. So for now disable it by default and
let users opt in.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: ixgbe_fwd_ring_down needs to be static · ae72c8d0

Authored by John Fastabend on Nov 09, 2013

When compiling with -Wstrict-prototypes gcc catches a static
I missed.

./ixgbe_main.c:4254: warning: no previous prototype for 'ixgbe_fwd_ring_down'
Reported-by: Phillip Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000: fix possible reset_task running after adapter down · 74a1b1ea

Authored by Vladimir Davydov on Nov 23, 2013

On e1000_down(), we should ensure every asynchronous work is canceled
before proceeding. Since the watchdog_task can schedule other works
apart from itself, it should be stopped first, but currently it is
stopped after the reset_task. This can result in the following race
leading to the reset_task running after the module unload:

e1000_down_and_stop():			e1000_watchdog():
----------------------			-----------------

cancel_work_sync(reset_task)
					schedule_work(reset_task)
cancel_delayed_work_sync(watchdog_task)

The patch moves cancel_delayed_work_sync(watchdog_task) to the beginning
of e1000_down_and_stop() thus ensuring the race is impossible.

Cc: Tushar Dave <tushar.n.dave@intel.com>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000: fix lockdep warning in e1000_reset_task · b2f963bf

Authored by Vladimir Davydov on Nov 23, 2013

The patch fixes the following lockdep warning, which is 100%
reproducible on network restart:

======================================================
[ INFO: possible circular locking dependency detected ]
3.12.0+ #47 Tainted: GF
-------------------------------------------------------
kworker/1:1/27 is trying to acquire lock:
 ((&(&adapter->watchdog_task)->work)){+.+...}, at: [<ffffffff8108a5b0>] flush_work+0x0/0x70

but task is already holding lock:
 (&adapter->mutex){+.+...}, at: [<ffffffffa0177c0a>] e1000_reset_task+0x4a/0xa0 [e1000]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&adapter->mutex){+.+...}:
       [<ffffffff810bdb5d>] lock_acquire+0x9d/0x120
       [<ffffffff816b8cbc>] mutex_lock_nested+0x4c/0x390
       [<ffffffffa017233d>] e1000_watchdog+0x7d/0x5b0 [e1000]
       [<ffffffff8108b972>] process_one_work+0x1d2/0x510
       [<ffffffff8108ca80>] worker_thread+0x120/0x3a0
       [<ffffffff81092c1e>] kthread+0xee/0x110
       [<ffffffff816c3d7c>] ret_from_fork+0x7c/0xb0

-> #0 ((&(&adapter->watchdog_task)->work)){+.+...}:
       [<ffffffff810bd9c0>] __lock_acquire+0x1710/0x1810
       [<ffffffff810bdb5d>] lock_acquire+0x9d/0x120
       [<ffffffff8108a5eb>] flush_work+0x3b/0x70
       [<ffffffff8108b5d8>] __cancel_work_timer+0x98/0x140
       [<ffffffff8108b693>] cancel_delayed_work_sync+0x13/0x20
       [<ffffffffa0170cec>] e1000_down_and_stop+0x3c/0x60 [e1000]
       [<ffffffffa01775b1>] e1000_down+0x131/0x220 [e1000]
       [<ffffffffa0177c12>] e1000_reset_task+0x52/0xa0 [e1000]
       [<ffffffff8108b972>] process_one_work+0x1d2/0x510
       [<ffffffff8108ca80>] worker_thread+0x120/0x3a0
       [<ffffffff81092c1e>] kthread+0xee/0x110
       [<ffffffff816c3d7c>] ret_from_fork+0x7c/0xb0

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&adapter->mutex);
                               lock((&(&adapter->watchdog_task)->work));
                               lock(&adapter->mutex);
  lock((&(&adapter->watchdog_task)->work));

 *** DEADLOCK ***

3 locks held by kworker/1:1/27:
 #0:  (events){.+.+.+}, at: [<ffffffff8108b906>] process_one_work+0x166/0x510
 #1:  ((&adapter->reset_task)){+.+...}, at: [<ffffffff8108b906>] process_one_work+0x166/0x510
 #2:  (&adapter->mutex){+.+...}, at: [<ffffffffa0177c0a>] e1000_reset_task+0x4a/0xa0 [e1000]

stack backtrace:
CPU: 1 PID: 27 Comm: kworker/1:1 Tainted: GF            3.12.0+ #47
Hardware name: System manufacturer System Product Name/P5B-VM SE, BIOS 0501    05/31/2007
Workqueue: events e1000_reset_task [e1000]
 ffffffff820f6000 ffff88007b9dba98 ffffffff816b54a2 0000000000000002
 ffffffff820f5e50 ffff88007b9dbae8 ffffffff810ba936 ffff88007b9dbac8
 ffff88007b9dbb48 ffff88007b9d8f00 ffff88007b9d8780 ffff88007b9d8f00
Call Trace:
 [<ffffffff816b54a2>] dump_stack+0x49/0x5f
 [<ffffffff810ba936>] print_circular_bug+0x216/0x310
 [<ffffffff810bd9c0>] __lock_acquire+0x1710/0x1810
 [<ffffffff8108a5b0>] ? __flush_work+0x250/0x250
 [<ffffffff810bdb5d>] lock_acquire+0x9d/0x120
 [<ffffffff8108a5b0>] ? __flush_work+0x250/0x250
 [<ffffffff8108a5eb>] flush_work+0x3b/0x70
 [<ffffffff8108a5b0>] ? __flush_work+0x250/0x250
 [<ffffffff8108b5d8>] __cancel_work_timer+0x98/0x140
 [<ffffffff8108b693>] cancel_delayed_work_sync+0x13/0x20
 [<ffffffffa0170cec>] e1000_down_and_stop+0x3c/0x60 [e1000]
 [<ffffffffa01775b1>] e1000_down+0x131/0x220 [e1000]
 [<ffffffffa0177c12>] e1000_reset_task+0x52/0xa0 [e1000]
 [<ffffffff8108b972>] process_one_work+0x1d2/0x510
 [<ffffffff8108b906>] ? process_one_work+0x166/0x510
 [<ffffffff8108ca80>] worker_thread+0x120/0x3a0
 [<ffffffff8108c960>] ? manage_workers+0x2c0/0x2c0
 [<ffffffff81092c1e>] kthread+0xee/0x110
 [<ffffffff81092b30>] ? __init_kthread_worker+0x70/0x70
 [<ffffffff816c3d7c>] ret_from_fork+0x7c/0xb0
 [<ffffffff81092b30>] ? __init_kthread_worker+0x70/0x70

== The issue background ==

The problem occurs, because e1000_down(), which is called under
adapter->mutex by e1000_reset_task(), tries to synchronously cancel
e1000 auxiliary works (reset_task, watchdog_task, phy_info_task,
fifo_stall_task), which take adapter->mutex in their handlers. So the
question is what does adapter->mutex protect there?

The adapter->mutex was introduced by commit 0ef4ee ("e1000: convert to
private mutex from rtnl") as a replacement for rtnl_lock() taken in the
asynchronous handlers. It was aimed at fixing a similar lockdep warning
issued when e1000_down() was called under rtnl_lock(), and it fixed it,
but unfortunately it introduced the lockdep warning described above.
Anyway, that said the source of this bug is that the asynchronous works
were made to take rtnl_lock() some time ago, so let's look deeper and
find why it was added there.

The rtnl_lock() was added to asynchronous handlers by commit 338c15
("e1000: fix occasional panic on unload") in order to prevent
asynchronous handlers from execution after the module is unloaded
(e1000_down() is called) as it follows from the comment to the commit:

> Net drivers in general have an issue where timers fired
> by mod_timer or work threads with schedule_work are running
> outside of the rtnl_lock.
>
> With no other lock protection these routines are vulnerable
> to races with driver unload or reset paths.
>
> The longer term solution to this might be a redesign with
> safer locks being taken in the driver to guarantee no
> reentrance, but for now a safe and effective fix is
> to take the rtnl_lock in these routines.

I'm not sure if this locking scheme fixed the problem or just made it
unlikely, although I lean toward the latter. Anyway, this was a long time
ago when e1000 auxiliary works were implemented as timers scheduling
real work handlers in their routines. The e1000_down() function only
canceled the timers, but left the real handlers running if they were
running, which could result in work execution after module unload.
Today, the e1000 driver uses sane delayed works instead of the pair
timer+work to implement its delayed asynchronous handlers, and the
e1000_down() synchronously cancels all the works so that the problem
that commit 338c15 tried to cope with disappeared, and we don't need any
locks in the handlers any more. Moreover, any locking there can
potentially result in a deadlock.

So, this patch reverts commits 0ef4ee and 338c15.

Fixes: 0ef4eedc ("e1000: convert to private mutex from rtnl")
Fixes: 338c15e4 ("e1000: fix occasional panic on unload")
Cc: Tushar Dave <tushar.n.dave@intel.com>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000: prevent oops when adapter is being closed and reset simultaneously · 6a7d64e3

Authored by yzhu1 on Nov 23, 2013

This change is based on a similar change made to e1000e support in
commit bb9e44d0 ("e1000e: prevent oops when adapter is being closed
and reset simultaneously").  The same issue has also been observed
on the older e1000 cards.

Here, we have increased the RESET_COUNT value to 50 because the e1000
NIC is accessed so often under stress tests that a RESET_COUNT of 25 is
not enough. Experimentation has shown that a RESET_COUNT of 50 is
sufficient.
Signed-off-by: yzhu1 <yanjun.zhu@windriver.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>