Commit 1758fedd, authored by Linus Torvalds

Merge tag 's390-5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Vasily Gorbik:

 - Improve stop_machine wait logic: replace cpu_relax_yield call in
   generic stop_machine function with a weak stop_machine_yield
   function. This is overridden on s390, which yields the current cpu to
   the neighbouring cpu after a couple of retries, instead of blindly
   giving up the cpu to the hypervisor. This significantly improves
   stop_machine performance on s390 in overcommitted scenarios.

   This includes common code changes which have been Acked by Peter
   Zijlstra and Thomas Gleixner.

 - Improve jump label transformation speed: transform jump labels
   without using stop_machine.

 - Refactoring of the vfio-ccw cp handling, simplifying the code and
   avoiding unneeded allocating/copying.

 - Various vfio-ccw fixes (ccw translation, state machine).

 - Add support for vfio-ap queue interrupt control in the guest. This
   includes s390 kvm changes which have been Acked by Christian
   Borntraeger.

 - Add protected virtualization support for virtio-ccw.

 - Enforce both CONFIG_SMP and CONFIG_HOTPLUG_CPU. This allows removing
   code that most likely did not work at all; besides, s390 did not even
   compile for !CONFIG_SMP.

 - Support for special flagged EP11 CPRBs for zcrypt.

 - Handle PCI devices with no support for new MIO instructions.

 - Avoid KASAN false positives in reworked stack unwinder.

 - Couple of fixes for the QDIO layer.

 - Convert s390 specific documentation to ReST format.

 - Let s390 crypto modules return -ENODEV instead of -EOPNOTSUPP if
   hardware is missing. This way our modules behave like most other
   modules, which is also what systemd's systemd-modules-load.service
   expects.

 - Replace defconfig with performance_defconfig, so there is one config
   file less to maintain.

 - Remove the SCLP call home device driver, which was never useful.

 - Cleanups all over the place.

* tag 's390-5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (83 commits)
  docs: s390: s390dbf: typos and formatting, update crash command
  docs: s390: unify and update s390dbf kdocs at debug.c
  docs: s390: restore important non-kdoc parts of s390dbf.rst
  vfio-ccw: Fix the conversion of Format-0 CCWs to Format-1
  s390/pci: correctly handle MIO opt-out
  s390/pci: deal with devices that have no support for MIO instructions
  s390: ap: kvm: Enable PQAP/AQIC facility for the guest
  s390: ap: implement PAPQ AQIC interception in kernel
  vfio: ap: register IOMMU VFIO notifier
  s390: ap: kvm: add PQAP interception for AQIC
  s390/unwind: cleanup unused READ_ONCE_TASK_STACK
  s390/kasan: avoid false positives during stack unwind
  s390/qdio: don't touch the dsci in tiqdio_add_input_queues()
  s390/qdio: (re-)initialize tiqdio list entries
  s390/dasd: Fix a precision vs width bug in dasd_feature_list()
  s390/cio: introduce driver_override on the css bus
  vfio-ccw: make convert_ccw0_to_ccw1 static
  vfio-ccw: Remove copy_ccw_from_iova()
  vfio-ccw: Factor out the ccw0-to-ccw1 transition
  vfio-ccw: Copy CCW data outside length calculation
  ...
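
The stop_machine item at the top of the pull message can be pictured with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: every `demo_` name, the CPU count, and the retry threshold are hypothetical. The kernel works with per-CPU variables, cpumask helpers, and a hypervisor call rather than plain arrays. The check that the neighbour is currently preempted is also an assumption about how the s390 override decides when a directed yield is worthwhile.

```c
#include <stdbool.h>

#define DEMO_NR_CPUS     8   /* hypothetical CPU count for the model */
#define DEMO_SPIN_RETRY  3   /* hypothetical "couple of retries" threshold */

/* Per-CPU retry counters; the kernel would use real per-CPU variables. */
static int demo_retry[DEMO_NR_CPUS];

/* Next CPU after this_cpu that is set in mask, wrapping around; -1 if none. */
static int demo_next_cpu_wrap(int this_cpu, const bool mask[DEMO_NR_CPUS])
{
    for (int step = 1; step < DEMO_NR_CPUS; step++) {
        int cpu = (this_cpu + step) % DEMO_NR_CPUS;

        if (mask[cpu])
            return cpu;
    }
    return -1;
}

/*
 * Model of an architecture-specific yield: keep spinning for a few calls,
 * then name the neighbouring CPU to yield to, but only if that neighbour
 * is marked as preempted. Returns -1 when the caller should keep spinning.
 */
static int demo_stop_machine_yield(int this_cpu,
                                   const bool mask[DEMO_NR_CPUS],
                                   const bool preempted[DEMO_NR_CPUS])
{
    int cpu;

    if (++demo_retry[this_cpu] < DEMO_SPIN_RETRY)
        return -1;
    demo_retry[this_cpu] = 0;

    cpu = demo_next_cpu_wrap(this_cpu, mask);
    if (cpu >= 0 && preempted[cpu])
        return cpu;
    return -1;
}
```

In this model the first calls from a CPU return -1 (plain spinning). Only the call that reaches the threshold returns a target CPU. The counter then resets, so a directed yield happens at most once per `DEMO_SPIN_RETRY` calls.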
......@@ -33,3 +33,26 @@ Description: Contains the PIM/PAM/POM values, as reported by the
in sync with the values current in the channel subsystem).
Note: This is an I/O-subchannel specific attribute.
Users: s390-tools, HAL
What: /sys/bus/css/devices/.../driver_override
Date: June 2019
Contact: Cornelia Huck <cohuck@redhat.com>
linux-s390@vger.kernel.org
Description: This file allows the driver for a device to be specified. When
specified, only a driver with a name matching the value written
to driver_override will have an opportunity to bind to the
device. The override is specified by writing a string to the
driver_override file (echo vfio-ccw > driver_override) and
may be cleared with an empty string (echo > driver_override).
This returns the device to standard matching rules binding.
Writing to driver_override does not automatically unbind the
device from its current driver or make any attempt to
automatically load the specified driver. If no driver with a
matching name is currently loaded in the kernel, the device
will not bind to any driver. This also allows devices to
opt-out of driver binding using a driver_override name such as
"none". Only a single driver may be specified in the override,
there is no support for parsing delimiters.
Note that unlike the mechanism of the same name for pci, this
file does not allow to override basic matching rules. I.e.,
the driver must still match the subchannel type of the device.
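
The matching rule described above can be sketched as a small user-space model. The `demo_` structures and the function name are hypothetical stand-ins, not the kernel's types. The sketch also assumes a driver handles exactly one subchannel type, which is a simplification.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the kernel's subchannel and driver. */
struct demo_subchannel {
    int type;                     /* subchannel type of the device */
    const char *driver_override;  /* NULL when no override is set */
};

struct demo_driver {
    const char *name;
    int type;                     /* subchannel type the driver handles */
};

/* Returns 1 if drv may bind to sch under the rules described above. */
static int demo_css_match(const struct demo_subchannel *sch,
                          const struct demo_driver *drv)
{
    /* With an override set, only the driver of that exact name is eligible. */
    if (sch->driver_override && strcmp(sch->driver_override, drv->name) != 0)
        return 0;

    /* The override never bypasses the basic subchannel type check. */
    return sch->type == drv->type;
}
```

An override such as "none" matches no driver name, so every candidate is rejected in the first check. That is how a device opts out of binding in this model.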
......@@ -478,7 +478,7 @@
others).
ccw_timeout_log [S390]
See Documentation/s390/CommonIO for details.
See Documentation/s390/common_io.rst for details.
cgroup_disable= [KNL] Disable a particular controller
Format: {name of the controller(s) to disable}
......@@ -516,7 +516,7 @@
/selinux/checkreqprot.
cio_ignore= [S390]
See Documentation/s390/CommonIO for details.
See Documentation/s390/common_io.rst for details.
clk_ignore_unused
[CLK]
Prevents the clock framework from automatically gating
......
......@@ -27,7 +27,7 @@ not strictly considered I/O devices. They are considered here as well,
although they are not the focus of this document.
Some additional information can also be found in the kernel source under
Documentation/s390/driver-model.txt.
Documentation/s390/driver-model.rst.
The css bus
===========
......@@ -38,7 +38,7 @@ into several categories:
* Standard I/O subchannels, for use by the system. They have a child
device on the ccw bus and are described below.
* I/O subchannels bound to the vfio-ccw driver. See
Documentation/s390/vfio-ccw.txt.
Documentation/s390/vfio-ccw.rst.
* Message subchannels. No Linux driver currently exists.
* CHSC subchannels (at most one). The chsc subchannel driver can be used
to send asynchronous chsc commands.
......
===============================
IBM 3270 Display System support
===============================
This file describes the driver that supports local channel attachment
of IBM 3270 devices. It consists of three sections:
* Introduction
* Installation
* Operation
INTRODUCTION.
Introduction
============
This paper describes installing and operating 3270 devices under
Linux/390. A 3270 device is a block-mode rows-and-columns terminal of
......@@ -17,12 +21,12 @@ twenty and thirty years ago.
You may have 3270s in-house and not know it. If you're using the
VM-ESA operating system, define a 3270 to your virtual machine by using
the command "DEF GRAF <hex-address>" This paper presumes you will be
defining four 3270s with the CP/CMS commands
defining four 3270s with the CP/CMS commands:
DEF GRAF 620
DEF GRAF 621
DEF GRAF 622
DEF GRAF 623
- DEF GRAF 620
- DEF GRAF 621
- DEF GRAF 622
- DEF GRAF 623
Your network connection from VM-ESA allows you to use x3270, tn3270, or
another 3270 emulator, started from an xterm window on your PC or
......@@ -34,7 +38,8 @@ This paper covers installation of the driver and operation of a
dialed-in x3270.
INSTALLATION.
Installation
============
You install the driver by installing a patch, doing a kernel build, and
running the configuration script (config3270.sh, in this directory).
......@@ -59,13 +64,15 @@ Use #CP TERM CONMODE 3270 to change it to 3270. If you generate only
at boot time to a 3270 if it is a 3215.
In brief, these are the steps:
1. Install the tub3270 patch
2. (If a module) add a line to a file in /etc/modprobe.d/*.conf
2. (If a module) add a line to a file in `/etc/modprobe.d/*.conf`
3. (If VM) define devices with DEF GRAF
4. Reboot
5. Configure
To test that everything works, assuming VM and x3270,
1. Bring up an x3270 window.
2. Use the DIAL command in that window.
3. You should immediately see a Linux login screen.
......@@ -74,7 +81,8 @@ Here are the installation steps in detail:
1. The 3270 driver is a part of the official Linux kernel
source. Build a tree with the kernel source and any necessary
patches. Then do
patches. Then do::
make oldconfig
(If you wish to disable 3215 console support, edit
.config; change CONFIG_TN3215's value to "n";
......@@ -84,20 +92,22 @@ Here are the installation steps in detail:
make modules_install
2. (Perform this step only if you have configured tub3270 as a
module.) Add a line to a file /etc/modprobe.d/*.conf to automatically
module.) Add a line to a file `/etc/modprobe.d/*.conf` to automatically
load the driver when it's needed. With this line added, you will see
login prompts appear on your 3270s as soon as boot is complete (or
with emulated 3270s, as soon as you dial into your vm guest using the
command "DIAL <vmguestname>"). Since the line-mode major number is
227, the line to add should be:
227, the line to add should be::
alias char-major-227 tub3270
3. Define graphic devices to your vm guest machine, if you
haven't already. Define them before you reboot (reipl):
DEFINE GRAF 620
DEFINE GRAF 621
DEFINE GRAF 622
DEFINE GRAF 623
- DEFINE GRAF 620
- DEFINE GRAF 621
- DEFINE GRAF 622
- DEFINE GRAF 623
4. Reboot. The reboot process scans hardware devices, including
3270s, and this enables the tub3270 driver once loaded to respond
......@@ -113,7 +123,8 @@ Here are the installation steps in detail:
changes to /etc/inittab.
Then notify /sbin/init that /etc/inittab has changed, by issuing
the telinit command with the q operand:
the telinit command with the q operand::
cd Documentation/s390
sh config3270.sh
sh /tmp/mkdev3270
......@@ -121,7 +132,8 @@ Here are the installation steps in detail:
This should be sufficient for your first time. If your 3270
configuration has changed and you're reusing config3270, you
should follow these steps:
should follow these steps::
Change 3270 configuration
Reboot
Run config3270 and /tmp/mkdev3270
......@@ -132,8 +144,10 @@ Here are the testing steps in detail:
1. Bring up an x3270 window, or use an actual hardware 3278 or
3279, or use the 3270 emulator of your choice. You would be
running the emulator on your PC or workstation. You would use
the command, for example,
the command, for example::
x3270 vm-esa-domain-name &
if you wanted a 3278 Model 4 with 43 rows of 80 columns, the
default model number. The driver does not take advantage of
extended attributes.
......@@ -144,7 +158,8 @@ Here are the testing steps in detail:
2. Use the DIAL command instead of the LOGIN command to connect
to one of the virtual 3270s you defined with the DEF GRAF
commands:
commands::
dial my-vm-guest-name
3. You should immediately see a login prompt from your
......@@ -171,14 +186,17 @@ Here are the testing steps in detail:
Wrong major number? Wrong minor number? There's your
problem!
D. Do you get the message
D. Do you get the message::
"HCPDIA047E my-vm-guest-name 0620 does not exist"?
If so, you must issue the command "DEF GRAF 620" from your VM
3215 console and then reboot the system.
OPERATION.
==========
The driver defines three areas on the 3270 screen: the log area, the
input area, and the status area.
......@@ -203,8 +221,10 @@ which indicates no scrolling will occur. (If you hit ENTER with "Linux
Running" and nothing typed, the application receives a newline.)
You may change the scrolling timeout value. For example, the following
command line:
command line::
echo scrolltime=60 > /proc/tty/driver/tty3270
changes the scrolling timeout value to 60 sec. Set scrolltime to 0 if
you wish to prevent scrolling entirely.
......@@ -228,7 +248,8 @@ cause an EOF also by typing "^D" and hitting ENTER.
No PF key is preassigned to cause a job suspension, but you may cause a
job suspension by typing "^Z" and hitting ENTER. You may wish to
assign this function to a PF key. To make PF7 cause job suspension,
execute the command:
execute the command::
echo pf7=^z > /proc/tty/driver/tty3270
If the input you type does not end with the two characters "^n", the
......@@ -243,8 +264,10 @@ command is entered into the stack only when the input area is not made
invisible (such as for password entry) and it is not identical to the
current top entry. PF10 rotates backward through the command stack;
PF11 rotates forward. You may assign the backward function to any PF
key (or PA key, for that matter), say, PA3, with the command:
key (or PA key, for that matter), say, PA3, with the command::
echo -e pa3=\\033k > /proc/tty/driver/tty3270
This assigns the string ESC-k to PA3. Similarly, the string ESC-j
performs the forward function. (Rationale: In bash with vi-mode line
editing, ESC-k and ESC-j retrieve backward and forward history.
......@@ -252,15 +275,19 @@ Suggestions welcome.)
Is a stack size of twenty commands not to your liking? Change it on
the fly. To change to saving the last 100 commands, execute the
command:
command::
echo recallsize=100 > /proc/tty/driver/tty3270
Have a command you issue frequently? Assign it to a PF or PA key! Use
the command
the command::
echo pf24="mkdir foobar; cd foobar" > /proc/tty/driver/tty3270
to execute the commands mkdir foobar and cd foobar immediately when you
hit PF24. Want to see the command line first, before you execute it?
Use the -n option of the echo command:
Use the -n option of the echo command::
echo -n pf24="mkdir foo; cd foo" > /proc/tty/driver/tty3270
......
===========================
Linux for S/390 and zSeries
===========================
Common Device Support (CDS)
Device Driver I/O Support Routines
Authors : Ingo Adlung
Cornelia Huck
Authors:
- Ingo Adlung
- Cornelia Huck
Copyright, IBM Corp. 1999-2002
Introduction
============
This document describes the common device support routines for Linux/390.
Different than other hardware architectures, ESA/390 has defined a unified
......@@ -34,11 +38,13 @@ below. Some of them implement common Linux device driver interfaces, while
some of them are ESA/390 platform specific.
Note:
In order to write a driver for S/390, you also need to look into the interface
described in Documentation/s390/driver-model.txt.
In order to write a driver for S/390, you also need to look into the interface
described in Documentation/s390/driver-model.rst.
Note for porting drivers from 2.4:
The major changes are:
* The functions use a ccw_device instead of an irq (subchannel).
* All drivers must define a ccw_driver (see driver-model.txt) and the associated
functions.
......@@ -57,10 +63,7 @@ The major changes are:
ccw_device_get_ciw()
get commands from extended sense data.
ccw_device_start()
ccw_device_start_timeout()
ccw_device_start_key()
ccw_device_start_key_timeout()
ccw_device_start(), ccw_device_start_timeout(), ccw_device_start_key(), ccw_device_start_key_timeout()
initiate an I/O request.
ccw_device_resume()
......@@ -82,12 +85,15 @@ first level interrupt handler only and does not comprise a device driver
callable interface. Instead, the functional description of do_IO() also
describes the input to the device specific interrupt handler.
Note: All explanations apply also to the 64 bit architecture s390x.
Note:
All explanations apply also to the 64 bit architecture s390x.
Common Device Support (CDS) for Linux/390 Device Drivers
========================================================
General Information
-------------------
The following chapters describe the I/O related interface routines the
Linux/390 common device support (CDS) provides to allow for device specific
......@@ -101,6 +107,7 @@ can be found in the architecture specific C header file
linux/arch/s390/include/asm/irq.h.
Overview of CDS interface concepts
----------------------------------
Different to other hardware platforms, the ESA/390 architecture doesn't define
interrupt lines managed by a specific interrupt controller and bus systems
......@@ -164,18 +171,26 @@ get_ciw() - get command information word
This call enables a device driver to get information about supported commands
from the extended SenseID data.
struct ciw *
ccw_device_get_ciw(struct ccw_device *cdev, __u32 cmd);
::
struct ciw *
ccw_device_get_ciw(struct ccw_device *cdev, __u32 cmd);
cdev - The ccw_device for which the command is to be retrieved.
cmd - The command type to be retrieved.
==== ========================================================
cdev The ccw_device for which the command is to be retrieved.
cmd The command type to be retrieved.
==== ========================================================
ccw_device_get_ciw() returns:
NULL - No extended data available, invalid device or command not found.
!NULL - The command requested.
===== ================================================================
NULL No extended data available, invalid device or command not found.
!NULL The command requested.
===== ================================================================
ccw_device_start() - Initiate I/O Request
::
ccw_device_start() - Initiate I/O Request
The ccw_device_start() routines is the I/O request front-end processor. All
device driver I/O requests must be issued using this routine. A device driver
......@@ -186,24 +201,26 @@ This description also covers the status information passed to the device
driver's interrupt handler as this is related to the rules (flags) defined
with the associated I/O request when calling ccw_device_start().
int ccw_device_start(struct ccw_device *cdev,
::
int ccw_device_start(struct ccw_device *cdev,
struct ccw1 *cpa,
unsigned long intparm,
__u8 lpm,
unsigned long flags);
int ccw_device_start_timeout(struct ccw_device *cdev,
int ccw_device_start_timeout(struct ccw_device *cdev,
struct ccw1 *cpa,
unsigned long intparm,
__u8 lpm,
unsigned long flags,
int expires);
int ccw_device_start_key(struct ccw_device *cdev,
int ccw_device_start_key(struct ccw_device *cdev,
struct ccw1 *cpa,
unsigned long intparm,
__u8 lpm,
__u8 key,
unsigned long flags);
int ccw_device_start_key_timeout(struct ccw_device *cdev,
int ccw_device_start_key_timeout(struct ccw_device *cdev,
struct ccw1 *cpa,
unsigned long intparm,
__u8 lpm,
......@@ -211,63 +228,73 @@ int ccw_device_start_key_timeout(struct ccw_device *cdev,
unsigned long flags,
int expires);
cdev : ccw_device the I/O is destined for
cpa : logical start address of channel program
user_intparm : user specific interrupt information; will be presented
============= =============================================================
cdev ccw_device the I/O is destined for
cpa logical start address of channel program
user_intparm user specific interrupt information; will be presented
back to the device driver's interrupt handler. Allows a
device driver to associate the interrupt with a
particular I/O request.
lpm : defines the channel path to be used for a specific I/O
lpm defines the channel path to be used for a specific I/O
request. A value of 0 will make cio use the opm.
key : the storage key to use for the I/O (useful for operating on a
key the storage key to use for the I/O (useful for operating on a
storage with a storage key != default key)
flag : defines the action to be performed for I/O processing
expires : timeout value in jiffies. The common I/O layer will terminate
flag defines the action to be performed for I/O processing
expires timeout value in jiffies. The common I/O layer will terminate
the running program after this and call the interrupt handler
with ERR_PTR(-ETIMEDOUT) as irb.
============= =============================================================
Possible flag values are :
Possible flag values are:
DOIO_ALLOW_SUSPEND - channel program may become suspended
DOIO_DENY_PREFETCH - don't allow for CCW prefetch; usually
========================= =============================================
DOIO_ALLOW_SUSPEND channel program may become suspended
DOIO_DENY_PREFETCH don't allow for CCW prefetch; usually
this implies the channel program might
become modified
DOIO_SUPPRESS_INTER - don't call the handler on intermediate status
DOIO_SUPPRESS_INTER don't call the handler on intermediate status
========================= =============================================
The cpa parameter points to the first format 1 CCW of a channel program :
The cpa parameter points to the first format 1 CCW of a channel program::
struct ccw1 {
struct ccw1 {
__u8 cmd_code;/* command code */
__u8 flags; /* flags, like IDA addressing, etc. */
__u16 count; /* byte count */
__u32 cda; /* data address */
} __attribute__ ((packed,aligned(8)));
} __attribute__ ((packed,aligned(8)));
with the following CCW flags values defined :
with the following CCW flags values defined:
CCW_FLAG_DC - data chaining
CCW_FLAG_CC - command chaining
CCW_FLAG_SLI - suppress incorrect length
CCW_FLAG_SKIP - skip
CCW_FLAG_PCI - PCI
CCW_FLAG_IDA - indirect addressing
CCW_FLAG_SUSPEND - suspend
=================== =========================
CCW_FLAG_DC data chaining
CCW_FLAG_CC command chaining
CCW_FLAG_SLI suppress incorrect length
CCW_FLAG_SKIP skip
CCW_FLAG_PCI PCI
CCW_FLAG_IDA indirect addressing
CCW_FLAG_SUSPEND suspend
=================== =========================
Via ccw_device_set_options(), the device driver may specify the following
options for the device:
DOIO_EARLY_NOTIFICATION - allow for early interrupt notification
DOIO_REPORT_ALL - report all interrupt conditions
========================= ======================================
DOIO_EARLY_NOTIFICATION allow for early interrupt notification
DOIO_REPORT_ALL report all interrupt conditions
========================= ======================================
The ccw_device_start() function returns :
The ccw_device_start() function returns:
0 - successful completion or request successfully initiated
-EBUSY - The device is currently processing a previous I/O request, or there is
======== ======================================================================
0 successful completion or request successfully initiated
-EBUSY The device is currently processing a previous I/O request, or there is
a status pending at the device.
-ENODEV - cdev is invalid, the device is not operational or the ccw_device is
-ENODEV cdev is invalid, the device is not operational or the ccw_device is
not online.
======== ======================================================================
When the I/O request completes, the CDS first level interrupt handler will
accumulate the status in a struct irb and then call the device interrupt handler.
......@@ -282,9 +309,11 @@ never started, even though ccw_device_start() returned with successful completio
The irb may contain an error value, and the device driver should check for this
first:
-ETIMEDOUT: the common I/O layer terminated the request after the specified
========== =================================================================
-ETIMEDOUT the common I/O layer terminated the request after the specified
timeout value
-EIO: the common I/O layer terminated the request due to an error state
-EIO the common I/O layer terminated the request due to an error state
========== =================================================================
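
As a rough illustration of "check for this first", the following user-space sketch mimics the kernel's error-pointer convention. The `demo_` helpers are hypothetical re-implementations, written on the assumption that error values are encoded in the top 4095 addresses. A real driver would use the kernel's own `IS_ERR()` and `PTR_ERR()` on the irb instead.

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095   /* assumed size of the error-pointer range */

struct demo_irb { int placeholder; };   /* stand-in for struct irb */

/* Encode a negative errno value as a pointer. */
static inline void *demo_err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

/* True if ptr lies in the range reserved for encoded errors. */
static inline int demo_is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}

/* Recover the negative errno value from an error pointer. */
static inline long demo_ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/*
 * First step of a hypothetical interrupt handler: returns 0 when irb is a
 * real status block, or the negative errno (-ETIMEDOUT, -EIO) it encodes.
 */
static int demo_check_irb(const struct demo_irb *irb)
{
    if (demo_is_err(irb))
        return (int)demo_ptr_err(irb);
    return 0;
}
```

Only when the check returns 0 is it safe to dereference the irb and look at the status fields.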
If the concurrent sense flag in the extended status word (esw) in the irb is
set, the field erw.scnt in the esw describes the number of device specific
......@@ -294,6 +323,7 @@ sensing by the device driver itself is required.
The device interrupt handler can use the following definitions to investigate
the primary unit check source coded in sense byte 0 :
======================= ====
SNS0_CMD_REJECT 0x80
SNS0_INTERVENTION_REQ 0x40
SNS0_BUS_OUT_CHECK 0x20
......@@ -301,36 +331,41 @@ SNS0_EQUIPMENT_CHECK 0x10
SNS0_DATA_CHECK 0x08
SNS0_OVERRUN 0x04
SNS0_INCOMPL_DOMAIN 0x01
======================= ====
Depending on the device status, multiple of those values may be set together.
Please refer to the device specific documentation for details.
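
Because several of these bits may be set together, a handler typically tests each one individually. The sketch below uses the values from the table above. The `demo_` names are hypothetical, and the fact that bit 0x02 is simply skipped is an assumption made here because the table does not list it.

```c
#include <stddef.h>
#include <stdint.h>

#define SNS0_CMD_REJECT        0x80
#define SNS0_INTERVENTION_REQ  0x40
#define SNS0_BUS_OUT_CHECK     0x20
#define SNS0_EQUIPMENT_CHECK   0x10
#define SNS0_DATA_CHECK        0x08
#define SNS0_OVERRUN           0x04
#define SNS0_INCOMPL_DOMAIN    0x01

struct demo_sns0_name {
    uint8_t bit;
    const char *name;
};

static const struct demo_sns0_name demo_sns0_names[] = {
    { SNS0_CMD_REJECT,       "SNS0_CMD_REJECT" },
    { SNS0_INTERVENTION_REQ, "SNS0_INTERVENTION_REQ" },
    { SNS0_BUS_OUT_CHECK,    "SNS0_BUS_OUT_CHECK" },
    { SNS0_EQUIPMENT_CHECK,  "SNS0_EQUIPMENT_CHECK" },
    { SNS0_DATA_CHECK,       "SNS0_DATA_CHECK" },
    { SNS0_OVERRUN,          "SNS0_OVERRUN" },
    { SNS0_INCOMPL_DOMAIN,   "SNS0_INCOMPL_DOMAIN" },
};

/*
 * Store the names of all conditions set in sense byte 0 into out[]
 * (at most max entries) and return how many were stored.
 */
static size_t demo_decode_sns0(uint8_t sns0, const char *out[], size_t max)
{
    size_t count = 0;
    size_t n = sizeof(demo_sns0_names) / sizeof(demo_sns0_names[0]);

    for (size_t i = 0; i < n && count < max; i++) {
        if (sns0 & demo_sns0_names[i].bit)
            out[count++] = demo_sns0_names[i].name;
    }
    return count;
}
```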
The irb->scsw.cstat field provides the (accumulated) subchannel status :
SCHN_STAT_PCI - program controlled interrupt
SCHN_STAT_INCORR_LEN - incorrect length
SCHN_STAT_PROG_CHECK - program check
SCHN_STAT_PROT_CHECK - protection check
SCHN_STAT_CHN_DATA_CHK - channel data check
SCHN_STAT_CHN_CTRL_CHK - channel control check
SCHN_STAT_INTF_CTRL_CHK - interface control check
SCHN_STAT_CHAIN_CHECK - chaining check
========================= ============================
SCHN_STAT_PCI program controlled interrupt
SCHN_STAT_INCORR_LEN incorrect length
SCHN_STAT_PROG_CHECK program check
SCHN_STAT_PROT_CHECK protection check
SCHN_STAT_CHN_DATA_CHK channel data check
SCHN_STAT_CHN_CTRL_CHK channel control check
SCHN_STAT_INTF_CTRL_CHK interface control check
SCHN_STAT_CHAIN_CHECK chaining check
========================= ============================
The irb->scsw.dstat field provides the (accumulated) device status :
DEV_STAT_ATTENTION - attention
DEV_STAT_STAT_MOD - status modifier
DEV_STAT_CU_END - control unit end
DEV_STAT_BUSY - busy
DEV_STAT_CHN_END - channel end
DEV_STAT_DEV_END - device end
DEV_STAT_UNIT_CHECK - unit check
DEV_STAT_UNIT_EXCEP - unit exception
===================== =================
DEV_STAT_ATTENTION attention
DEV_STAT_STAT_MOD status modifier
DEV_STAT_CU_END control unit end
DEV_STAT_BUSY busy
DEV_STAT_CHN_END channel end
DEV_STAT_DEV_END device end
DEV_STAT_UNIT_CHECK unit check
DEV_STAT_UNIT_EXCEP unit exception
===================== =================
Please see the ESA/390 Principles of Operation manual for details on the
individual flag meanings.
Usage Notes :
Usage Notes:
ccw_device_start() must be called disabled and with the ccw device lock held.
......@@ -387,19 +422,26 @@ setting the CCW suspend flag on a particular CCW, the channel program execution
is suspended. In order to resume channel program execution the CIO layer
provides the ccw_device_resume() routine.
int ccw_device_resume(struct ccw_device *cdev);
::
cdev - ccw_device the resume operation is requested for
int ccw_device_resume(struct ccw_device *cdev);
==== ================================================
cdev ccw_device the resume operation is requested for
==== ================================================
The ccw_device_resume() function returns:
0 - suspended channel program is resumed
-EBUSY - status pending
-ENODEV - cdev invalid or not-operational subchannel
-EINVAL - resume function not applicable
-ENOTCONN - there is no I/O request pending for completion
========= ==============================================
0 suspended channel program is resumed
-EBUSY status pending
-ENODEV cdev invalid or not-operational subchannel
-EINVAL resume function not applicable
-ENOTCONN there is no I/O request pending for completion
========= ==============================================
Usage Notes:
Please have a look at the ccw_device_start() usage notes for more details on
suspended channel programs.
......@@ -412,22 +454,28 @@ command is provided.
ccw_device_halt() must be called disabled and with the ccw device lock held.
int ccw_device_halt(struct ccw_device *cdev,
::
int ccw_device_halt(struct ccw_device *cdev,
unsigned long intparm);
cdev : ccw_device the halt operation is requested for
intparm : interruption parameter; value is only used if no I/O
======= =====================================================
cdev ccw_device the halt operation is requested for
intparm interruption parameter; value is only used if no I/O
is outstanding, otherwise the intparm associated with
the I/O request is returned
======= =====================================================
The ccw_device_halt() function returns :
The ccw_device_halt() function returns:
0 - request successfully initiated
-EBUSY - the device is currently busy, or status pending.
-ENODEV - cdev invalid.
-EINVAL - The device is not operational or the ccw device is not online.
======= ==============================================================
0 request successfully initiated
-EBUSY the device is currently busy, or status pending.
-ENODEV cdev invalid.
-EINVAL The device is not operational or the ccw device is not online.
======= ==============================================================
Usage Notes :
Usage Notes:
A device driver may write a never-ending channel program by writing a channel
program that at its end loops back to its beginning by means of a transfer in
......@@ -438,25 +486,34 @@ can then perform an appropriate action. Prior to interrupt of an outstanding
read to a network device (with or without PCI flag) a ccw_device_halt()
is required to end the pending operation.
ccw_device_clear() - Terminage I/O Request Processing
::
ccw_device_clear() - Terminage I/O Request Processing
In order to terminate all I/O processing at the subchannel, the clear subchannel
(CSCH) command is used. It can be issued via ccw_device_clear().
ccw_device_clear() must be called disabled and with the ccw device lock held.
int ccw_device_clear(struct ccw_device *cdev, unsigned long intparm);
::
int ccw_device_clear(struct ccw_device *cdev, unsigned long intparm);
cdev: ccw_device the clear operation is requested for
intparm: interruption parameter (see ccw_device_halt())
======= ===============================================
cdev ccw_device the clear operation is requested for
intparm interruption parameter (see ccw_device_halt())
======= ===============================================
The ccw_device_clear() function returns:
0 - request successfully initiated
-ENODEV - cdev invalid
-EINVAL - The device is not operational or the ccw device is not online.
======= ==============================================================
0 request successfully initiated
-ENODEV cdev invalid
-EINVAL The device is not operational or the ccw device is not online.
======= ==============================================================
Miscellaneous Support Routines
------------------------------
This chapter describes various routines to be used in a Linux/390 device
driver programming environment.
......@@ -466,7 +523,8 @@ get_ccwdev_lock()
Get the address of the device specific lock. This is then used in
spin_lock() / spin_unlock() calls.
::
__u8 ccw_device_get_path_mask(struct ccw_device *cdev);
__u8 ccw_device_get_path_mask(struct ccw_device *cdev);
Get the mask of the path currently available for cdev.
S/390 common I/O-Layer - command line parameters, procfs and debugfs entries
============================================================================
======================
S/390 common I/O-Layer
======================
command line parameters, procfs and debugfs entries
===================================================
Command line parameters
-----------------------
......@@ -28,14 +32,20 @@ Command line parameters
keywords can be used to refer to the CCW based boot device and CCW console
device respectively (these are probably useful only when combined with the '!'
operator). The '!' operator will cause the I/O-layer to _not_ ignore a device.
The command line is parsed from left to right.
The command line
is parsed from left to right.
For example::
For example,
cio_ignore=0.0.0023-0.0.0042,0.0.4711
will ignore all devices ranging from 0.0.0023 to 0.0.0042 and the device
0.0.4711, if detected.
As another example,
As another example::
cio_ignore=all,!0.0.4711,!0.0.fd00-0.0.fd02
will ignore all devices but 0.0.4711, 0.0.fd00, 0.0.fd01, 0.0.fd02.
By default, no devices are ignored.
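
The left-to-right rule is what makes the second example work: "all" first marks everything as ignored, and the later '!' entries clear individual devices again. A minimal user-space sketch of that evaluation order follows. It is not the kernel's parser: all `demo_` names are hypothetical, only bus IDs of the form 0.0.xxxx are handled, and the boot-device and console keywords mentioned above are not modeled.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define DEMO_DEVNOS 0x10000   /* device numbers 0x0000..0xffff */

/* One flag per device number; true means "ignored". */
static bool demo_ignored[DEMO_DEVNOS];

static void demo_set_range(unsigned from, unsigned to, bool ignore)
{
    for (unsigned d = from; d <= to && d < DEMO_DEVNOS; d++)
        demo_ignored[d] = ignore;
}

/* Parse "0.0.xxxx" or "0.0.xxxx-0.0.yyyy"; returns false if malformed. */
static bool demo_parse_range(const char *tok, unsigned *from, unsigned *to)
{
    int used = 0;

    if (sscanf(tok, "0.0.%4x%n", from, &used) != 1)
        return false;
    if (tok[used] == '\0') {
        *to = *from;              /* single device */
        return true;
    }
    if (tok[used] != '-')
        return false;
    tok += used + 1;
    used = 0;
    if (sscanf(tok, "0.0.%4x%n", to, &used) != 1 || tok[used] != '\0')
        return false;
    return *from <= *to;
}

/* Apply a cio_ignore= style list; later entries override earlier ones. */
static bool demo_cio_ignore(const char *arg)
{
    char buf[256];

    if (strlen(arg) >= sizeof(buf))
        return false;
    strcpy(buf, arg);
    memset(demo_ignored, 0, sizeof(demo_ignored));   /* default: none ignored */

    for (char *tok = strtok(buf, ","); tok; tok = strtok(NULL, ",")) {
        bool ignore = true;
        unsigned from, to;

        if (*tok == '!') {        /* '!' means: do not ignore */
            ignore = false;
            tok++;
        }
        if (strcmp(tok, "all") == 0) {
            from = 0;
            to = DEMO_DEVNOS - 1;
        } else if (!demo_parse_range(tok, &from, &to)) {
            return false;
        }
        demo_set_range(from, to, ignore);
    }
    return true;
}
```

Swapping the order to `!0.0.4711,all` would leave 0.0.4711 ignored in this model, because the later "all" overrides the earlier exception.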
......@@ -54,6 +64,7 @@ Command line parameters
devices.
For example, if devices 0.0.0023 to 0.0.0042 and 0.0.4711 are ignored,
- echo free 0.0.0030-0.0.0032 > /proc/cio_ignore
will un-ignore devices 0.0.0030 to 0.0.0032 and will leave devices 0.0.0023
to 0.0.002f, 0.0.0033 to 0.0.0042 and 0.0.4711 ignored;
......@@ -75,13 +86,17 @@ Command line parameters
disappears and then reappears, it will then be ignored. To make
known devices go away, you need the "purge" command (see below).
For example,
For example::
"echo add 0.0.a000-0.0.accc, 0.0.af00-0.0.afff > /proc/cio_ignore"
will add 0.0.a000-0.0.accc and 0.0.af00-0.0.afff to the list of ignored
devices.
You can remove already known but now ignored devices via
You can remove already known but now ignored devices via::
"echo purge > /proc/cio_ignore"
All devices ignored but still registered and not online (= not in use)
will be deregistered and thus removed from the system.
......@@ -121,5 +136,5 @@ debugfs entries
The level of logging can be changed to be more or less verbose by piping to
/sys/kernel/debug/s390dbf/cio_*/level a number between 0 and 6; see the
documentation on the S/390 debug feature (Documentation/s390/s390dbf.txt)
documentation on the S/390 debug feature (Documentation/s390/s390dbf.rst)
for details.
==================
DASD device driver
==================
S/390's disk devices (DASDs) are managed by Linux via the DASD device
driver. It is valid for all types of DASDs and represents them to
......@@ -34,19 +36,22 @@ accessibility of the DASD from other OSs. In a later stage we will
provide support of partitions, maybe VTOC oriented or using a kind of
partition table in the label record.
USAGE
Usage
=====
-Low-level format (?CKD only)
For using an ECKD-DASD as a Linux harddisk you have to low-level
format the tracks by issuing the BLKDASDFORMAT-ioctl on that
device. This will erase any data on that volume including IBM volume
labels, VTOCs etc. The ioctl may take a 'struct format_data *' or
'NULL' as an argument.
typedef struct {
labels, VTOCs etc. The ioctl may take a `struct format_data *` or
'NULL' as an argument::
typedef struct {
int start_unit;
int stop_unit;
int blksize;
} format_data_t;
} format_data_t;
When a NULL argument is passed to the BLKDASDFORMAT ioctl the whole
disk is formatted to a blocksize of 1024 bytes. Otherwise start_unit
and stop_unit are the first and last track to be formatted. If
......@@ -56,17 +61,23 @@ up to the last track. blksize can be any power of two between 512 and
1kB blocks anyway and you gain approx. 50% of capacity increasing your
blksize from 512 byte to 1kB.
-Make a filesystem
Make a filesystem
=================
Then you can mk??fs the filesystem of your choice on that volume or
partition. For reasons of sanity you should build your filesystem on
the partition /dev/dd?1 instead of the whole volume. You only lose 3kB
but may be sure that you can reuse your data after introduction of a
real partition table.
BUGS:
Bugs
====
- Performance sometimes is rather low because we don't fully exploit clustering
TODO-List:
TODO-List
=========
- Add IBM'S Disk layout to genhd
- Enhance driver to use more than one major number
- Enable usage as a module
......
=============================
S/390 driver model interfaces
-----------------------------
=============================
1. CCW devices
--------------
......@@ -8,9 +9,9 @@ All devices which can be addressed by means of ccws are called 'CCW devices' -
even if they aren't actually driven by ccws.
All ccw devices are accessed via a subchannel, this is reflected in the
structures under devices/:
structures under devices/::
devices/
devices/
- system/
- css0/
- 0.0.0000/0.0.0815/
......@@ -35,14 +36,18 @@ be found under bus/ccw/devices/.
All ccw devices export some data via sysfs.
cutype: The control unit type / model.
cutype:
The control unit type / model.
devtype: The device type / model, if applicable.
devtype:
The device type / model, if applicable.
availability: Can be 'good' or 'boxed'; 'no path' or 'no device' for
availability:
Can be 'good' or 'boxed'; 'no path' or 'no device' for
disconnected devices.
online: An interface to set the device online and offline.
online:
An interface to set the device online and offline.
In the special case of the device being disconnected (see the
notify function under 1.2), piping 0 to online will forcibly delete
the device.
......@@ -52,9 +57,11 @@ The device drivers can add entries to export per-device data and interfaces.
There is also some data exported on a per-subchannel basis (see under
bus/css/devices/):
chpids: Via which chpids the device is connected.
chpids:
Via which chpids the device is connected.
pimpampom: The path installed, path available and path operational masks.
pimpampom:
The path installed, path available and path operational masks.
There also might be additional data, for example for block devices.
......@@ -74,9 +81,9 @@ b. After a. has been performed, if necessary, the device is finally brought up
------------------------------------
The basic struct ccw_device and struct ccw_driver data structures can be found
under include/asm/ccwdev.h.
under include/asm/ccwdev.h::
struct ccw_device {
struct ccw_device {
spinlock_t *ccwlock;
struct ccw_device_private *private;
struct ccw_device_id id;
......@@ -87,9 +94,9 @@ struct ccw_device {
void (*handler) (struct ccw_device *dev, unsigned long intparm,
struct irb *irb);
};
};
struct ccw_driver {
struct ccw_driver {
struct module *owner;
struct ccw_device_id *ids;
int (*probe) (struct ccw_device *);
......@@ -99,16 +106,16 @@ struct ccw_driver {
int (*notify) (struct ccw_device *, int);
struct device_driver driver;
char *name;
};
};
The 'private' field contains data needed for internal i/o operation only, and
is not available to the device driver.
Each driver should declare in a MODULE_DEVICE_TABLE into which CU types/models
and/or device types/models it is interested. This information can later be found
in the struct ccw_device_id fields:
in the struct ccw_device_id fields::
struct ccw_device_id {
struct ccw_device_id {
__u16 match_flags;
__u16 cu_type;
......@@ -117,34 +124,50 @@ struct ccw_device_id {
__u8 dev_model;
unsigned long driver_info;
};
};
The functions in ccw_driver should be used in the following way:
probe: This function is called by the device layer for each device the driver
probe:
This function is called by the device layer for each device the driver
is interested in. The driver should only allocate private structures
to put in dev->driver_data and create attributes (if needed). Also,
the interrupt handler (see below) should be set here.
int (*probe) (struct ccw_device *cdev);
::
int (*probe) (struct ccw_device *cdev);
Parameters: cdev - the device to be probed.
Parameters:
cdev
- the device to be probed.
remove: This function is called by the device layer upon removal of the driver,
remove:
This function is called by the device layer upon removal of the driver,
the device or the module. The driver should perform cleanups here.
int (*remove) (struct ccw_device *cdev);
::
Parameters: cdev - the device to be removed.
int (*remove) (struct ccw_device *cdev);
Parameters:
cdev
- the device to be removed.
set_online: This function is called by the common I/O layer when the device is
set_online:
This function is called by the common I/O layer when the device is
activated via the 'online' attribute. The driver should finally
setup and activate the device here.
int (*set_online) (struct ccw_device *);
::
int (*set_online) (struct ccw_device *);
Parameters: cdev - the device to be activated. The common layer has
Parameters:
cdev
- the device to be activated. The common layer has
verified that the device is not already online.
......@@ -152,15 +175,22 @@ set_offline: This function is called by the common I/O layer when the device is
de-activated via the 'online' attribute. The driver should shut
down the device, but not de-allocate its private data.
int (*set_offline) (struct ccw_device *);
::
int (*set_offline) (struct ccw_device *);
Parameters: cdev - the device to be deactivated. The common layer has
Parameters:
cdev
- the device to be deactivated. The common layer has
verified that the device is online.
notify: This function is called by the common I/O layer for some state changes
notify:
This function is called by the common I/O layer for some state changes
of the device.
Signalled to the driver are:
* In online state, device detached (CIO_GONE) or last path gone
(CIO_NO_PATH). The driver must return !0 to keep the device; for
return code 0, the device will be deleted as usual (also when no
......@@ -174,10 +204,16 @@ notify: This function is called by the common I/O layer for some state changes
wants the device back: !0 for keeping, 0 to make the device being
removed and re-registered.
int (*notify) (struct ccw_device *, int);
::
int (*notify) (struct ccw_device *, int);
Parameters:
cdev
- the device whose state changed.
Parameters: cdev - the device whose state changed.
event - the event that happened. This can be one of CIO_GONE,
event
- the event that happened. This can be one of CIO_GONE,
CIO_NO_PATH or CIO_OPER.
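The return-value convention of notify can be illustrated with a small stand-alone sketch. Everything here is an assumption made so the code compiles outside the kernel: the numeric values of the event constants, the `keep_disconnected` policy flag and the `example_notify()` name are hypothetical. Only the meaning of the return codes is taken from the description above.

```c
#include <stddef.h>

/* Stand-ins for illustration only; the real definitions live in the
 * s390 kernel headers and their values are assumed here. */
struct ccw_device;
enum { CIO_GONE = 1, CIO_NO_PATH = 2, CIO_OPER = 4 };

/* Hypothetical driver policy: keep devices that lose their connection? */
static int keep_disconnected = 1;

static int example_notify(struct ccw_device *cdev, int event)
{
	(void)cdev;			/* unused in this sketch */

	switch (event) {
	case CIO_GONE:
	case CIO_NO_PATH:
		/* !0 keeps the device, 0 lets it be deleted as usual */
		return keep_disconnected;
	case CIO_OPER:
		/* !0 takes the device back, 0 removes and re-registers it */
		return 1;
	default:
		return 0;
	}
}
```

A driver that cannot operate its device after a disconnect would simply return 0 for CIO_GONE and CIO_NO_PATH.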
The handler field of the struct ccw_device is meant to be set to the interrupt
......@@ -189,7 +225,9 @@ before the driver is called, and is deregistered during set_offline() after the
driver has been called. Also, after registering / before deregistering, path
grouping resp. disbanding of the path group (if applicable) are performed.
void (*handler) (struct ccw_device *dev, unsigned long intparm, struct irb *irb);
::
void (*handler) (struct ccw_device *dev, unsigned long intparm, struct irb *irb);
Parameters: dev - the device the handler is called for
intparm - the intparm which allows the device driver to identify
......@@ -237,18 +275,22 @@ only the logical state and not the physical state, since we cannot track the
latter consistently due to lacking machine support (we don't need to be aware
of it anyway).
status - Can be 'online' or 'offline'.
status
- Can be 'online' or 'offline'.
Piping 'on' or 'off' sets the chpid logically online/offline.
Piping 'on' to an online chpid triggers path reprobing for all devices
the chpid connects to. This can be used to force the kernel to re-use
a channel path the user knows to be online, but the machine hasn't
created a machine check for.
type - The physical type of the channel path.
type
- The physical type of the channel path.
shared - Whether the channel path is shared.
shared
- Whether the channel path is shared.
cmg - The channel measurement group.
cmg
- The channel measurement group.
3. System devices
-----------------
......@@ -279,9 +321,8 @@ Netiucv connections show up under devices/iucv/ as "netiucv<ifnum>". The interfa
number is assigned sequentially to the connections defined via the 'connection'
attribute.
user - shows the connection partner.
buffer - maximum buffer size.
Pipe to it to change buffer size.
user
- shows the connection partner.
buffer
- maximum buffer size. Pipe to it to change buffer size.
:orphan:
=================
s390 Architecture
=================
.. toctree::
:maxdepth: 1
cds
3270
debugging390
driver-model
monreader
qeth
s390dbf
vfio-ap
vfio-ccw
zfcpdump
dasd
common_io
text_files
.. only:: subproject and html
Indices
=======
* :ref:`genindex`
=================================================
Linux API for read access to z/VM Monitor Records
=================================================
Date : 2004-Nov-26
Author: Gerald Schaefer (geraldsc@de.ibm.com)
Linux API for read access to z/VM Monitor Records
=================================================
Description
===========
This item delivers a new Linux API in the form of a misc char device that is
usable from user space and allows read access to the z/VM Monitor Records
collected by the *MONITOR System Service of z/VM.
collected by the `*MONITOR` System Service of z/VM.
User Requirements
=================
The z/VM guest on which you want to access this API needs to be configured in
order to allow IUCV connections to the *MONITOR service, i.e. it needs the
IUCV *MONITOR statement in its user entry. If the monitor DCSS to be used is
order to allow IUCV connections to the `*MONITOR` service, i.e. it needs the
IUCV `*MONITOR` statement in its user entry. If the monitor DCSS to be used is
restricted (likely), you also need the NAMESAVE <DCSS NAME> statement.
This item will use the IUCV device driver to access the z/VM services, so you
need a kernel with IUCV support. You also need z/VM version 4.4 or 5.1.
......@@ -50,7 +52,9 @@ Your guest virtual storage has to end below the starting address of the DCSS
and you have to specify the "mem=" kernel parameter in your parmfile with a
value greater than the ending address of the DCSS.
Example: DEF STOR 140M
Example::
DEF STOR 140M
This defines a storage size of 140MB for your guest; the parameter "mem=160M"
is added to the parmfile.
......@@ -66,24 +70,27 @@ kernel, the kernel parameter "monreader.mondcss=<DCSS NAME>" can be specified
in the parmfile.
The default name for the DCSS is "MONDCSS" if none is specified. In case that
there are other users already connected to the *MONITOR service (e.g.
there are other users already connected to the `*MONITOR` service (e.g.
Performance Toolkit), the monitor DCSS is already defined and you have to use
the same DCSS. The CP command Q MONITOR (Class E privileged) shows the name
of the monitor DCSS, if already defined, and the users connected to the
*MONITOR service.
`*MONITOR` service.
Refer to the "z/VM Performance" book (SC24-6109-00) on how to create a monitor
DCSS if your z/VM doesn't have one already. You need Class E privileges to
define and save a DCSS.
Example:
--------
modprobe monreader mondcss=MYDCSS
::
modprobe monreader mondcss=MYDCSS
This loads the module and sets the DCSS name to "MYDCSS".
NOTE:
-----
This API provides no interface to control the *MONITOR service, e.g. specify
This API provides no interface to control the `*MONITOR` service, e.g. specify
which data should be collected. This can be done by the CP command MONITOR
(Class E privileged), see "CP Command and Utility Reference".
......@@ -98,6 +105,7 @@ If your distribution does not support udev, a device node will not be created
automatically and you have to create it manually after loading the module.
Therefore you need to know the major and minor numbers of the device. These
numbers can be found in /sys/class/misc/monreader/dev.
Typing cat /sys/class/misc/monreader/dev will give an output of the form
<major>:<minor>. The device node can be created via the mknod command, enter
mknod <name> c <major> <minor>, where <name> is the name of the device node
......@@ -105,10 +113,13 @@ to be created.
Example:
--------
# modprobe monreader
# cat /sys/class/misc/monreader/dev
10:63
# mknod /dev/monreader c 10 63
::
# modprobe monreader
# cat /sys/class/misc/monreader/dev
10:63
# mknod /dev/monreader c 10 63
This loads the module with the default monitor DCSS (MONDCSS) and creates a
device node.
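A program that creates the node itself first has to split the `<major>:<minor>` text. The helper below is a minimal sketch of that step; `parse_dev_numbers()` is a hypothetical name, and reading the sysfs file is left to the caller.

```c
#include <stdio.h>

/*
 * Parse the "<major>:<minor>" text read from the sysfs "dev" attribute.
 * Returns 0 on success, -1 if the text does not have that form.
 */
static int parse_dev_numbers(const char *text, unsigned int *major_nr,
			     unsigned int *minor_nr)
{
	if (sscanf(text, "%u:%u", major_nr, minor_nr) != 2)
		return -1;
	return 0;
}
```

The two numbers can then be combined with makedev() and passed to mknod(2), which corresponds to the mknod command shown above.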
......@@ -133,20 +144,21 @@ last byte of data. The start address is needed to handle "end-of-frame" records
correctly (domain 1, record 13), i.e. it can be used to determine the record
start offset relative to a 4K page (frame) boundary.
See "Appendix A: *MONITOR" in the "z/VM Performance" document for a description
See "Appendix A: `*MONITOR`" in the "z/VM Performance" document for a description
of the monitor control element layout. The layout of the monitor records can
be found here (z/VM 5.1): http://www.vm.ibm.com/pubs/mon510/index.html
The layout of the data stream provided by the monreader device is as follows:
...
<0 byte read>
<first MCE> \
<first set of records> |
... |- data set
<last MCE> |
<last set of records> /
<0 byte read>
...
The layout of the data stream provided by the monreader device is as follows::
...
<0 byte read>
<first MCE> \
<first set of records> |
... |- data set
<last MCE> |
<last set of records> /
<0 byte read>
...
There may be more than one combination of MCE and corresponding record set
within one data set and the end of each data set is indicated by a successful
......@@ -165,14 +177,18 @@ As with most char devices, error conditions are indicated by returning a
negative value for the number of bytes read. In this case, the errno variable
indicates the error condition:
EIO: reply failed, read data is invalid and the application
EIO:
reply failed, read data is invalid and the application
should discard the data read since the last successful read with 0 size.
EFAULT: copy_to_user failed, read data is invalid and the application should
EFAULT:
copy_to_user failed, read data is invalid and the application should
discard the data read since the last successful read with 0 size.
EAGAIN: occurs on a non-blocking read if there is no data available at the
EAGAIN:
occurs on a non-blocking read if there is no data available at the
moment. There is no data missing or corrupted, just try again or rather
use polling for non-blocking reads.
EOVERFLOW: message limit reached, the data read since the last successful
EOVERFLOW:
message limit reached, the data read since the last successful
read with 0 size is valid but subsequent records may be missing.
In the last case (EOVERFLOW) there may be missing data, in the first two cases
......@@ -183,7 +199,7 @@ Open:
-----
Only one user is allowed to open the char device. If it is already in use, the
open function will fail (return a negative value) and set errno to EBUSY.
The open function may also fail if an IUCV connection to the *MONITOR service
The open function may also fail if an IUCV connection to the `*MONITOR` service
cannot be established. In this case errno will be set to EIO and an error
message with an IPUSER SEVER code will be printed into syslog. The IPUSER SEVER
codes are described in the "z/VM Performance" book, Appendix A.
......@@ -194,4 +210,3 @@ As soon as the device is opened, incoming messages will be accepted and they
will account for the message limit, i.e. opening the device without reading
from it will provoke the "message limit reached" error (EOVERFLOW error code)
eventually.
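The read semantics described in this document (a 0-byte read ends a data set, a negative return value signals an error) suggest a loop like the following. It is a sketch under assumptions: `read_data_set()` is a hypothetical helper, the buffer handling is deliberately simple, and the reaction to each error code is left to the caller.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Collect one data set from fd into buf.
 * Returns the number of bytes collected once the terminating 0-byte read
 * arrives, or a negative errno value. On -EIO and -EFAULT the caller should
 * discard what was read since the last 0-byte read; on -EOVERFLOW that data
 * is valid but later records may be missing; on -EAGAIN (non-blocking read)
 * nothing is lost and the caller can poll and retry.
 */
static long read_data_set(int fd, char *buf, size_t cap)
{
	size_t used = 0;

	for (;;) {
		ssize_t n;

		if (used == cap)
			return -ENOBUFS;	/* buffer too small for the data set */
		n = read(fd, buf + used, cap - used);
		if (n == 0)
			return (long)used;	/* end of data set */
		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, just retry */
			return -errno;
		}
		used += (size_t)n;
	}
}
```

Since opened-but-unread devices eventually hit the message limit, such a loop should run continuously for as long as the device is open.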
=============================
IBM s390 QDIO Ethernet Driver
=============================
OSA and HiperSockets Bridge Port Support
========================================
Uevents
-------
To generate the events the device must be assigned a role of either
a primary or a secondary Bridge Port. For more information, see
......@@ -13,12 +17,15 @@ of some configured Bridge Port device on the channel changes, a udev
event with ACTION=CHANGE is emitted on behalf of the corresponding
ccwgroup device. The event has the following attributes:
BRIDGEPORT=statechange - indicates that the Bridge Port device changed
BRIDGEPORT=statechange
indicates that the Bridge Port device changed
its state.
ROLE={primary|secondary|none} - the role assigned to the port.
ROLE={primary|secondary|none}
the role assigned to the port.
STATE={active|standby|inactive} - the newly assumed state of the port.
STATE={active|standby|inactive}
the newly assumed state of the port.
When run on HiperSockets Bridge Capable Port hardware with host address
notifications enabled, a udev event with ACTION=CHANGE is emitted.
......@@ -26,25 +33,32 @@ It is emitted on behalf of the corresponding ccwgroup device when a host
or a VLAN is registered or unregistered on the network served by the device.
The event has the following attributes:
BRIDGEDHOST={reset|register|deregister|abort} - host address
BRIDGEDHOST={reset|register|deregister|abort}
host address
notifications are started afresh, a new host or VLAN is registered or
deregistered on the Bridge Port HiperSockets channel, or address
notifications are aborted.
VLAN=numeric-vlan-id - VLAN ID on which the event occurred. Not included
VLAN=numeric-vlan-id
VLAN ID on which the event occurred. Not included
if no VLAN is involved in the event.
MAC=xx:xx:xx:xx:xx:xx - MAC address of the host that is being registered
MAC=xx:xx:xx:xx:xx:xx
MAC address of the host that is being registered
or deregistered from the HiperSockets channel. Not reported if the
event reports the creation or destruction of a VLAN.
NTOK_BUSID=x.y.zzzz - device bus ID (CSSID, SSID and device number).
NTOK_BUSID=x.y.zzzz
device bus ID (CSSID, SSID and device number).
NTOK_IID=xx - device IID.
NTOK_IID=xx
device IID.
NTOK_CHPID=xx - device CHPID.
NTOK_CHPID=xx
device CHPID.
NTOK_CHID=xxxx - device channel ID.
NTOK_CHID=xxxx
device channel ID.
Note that the NTOK_* attributes refer to devices other than the one
Note that the `NTOK_*` attributes refer to devices other than the one
connected to the system on which the OS is running.
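A consumer of these events typically receives them as a list of KEY=value strings. The sketch below shows one way to pick out the attributes listed above; `uevent_get()` and `bridgeport_became_active()` are hypothetical helpers, and how the strings are obtained (for example from a udev rule or a netlink listener) is outside its scope.

```c
#include <stddef.h>
#include <string.h>

/* Return the value of 'key' in a NULL-terminated KEY=value list, or NULL. */
static const char *uevent_get(const char *const env[], const char *key)
{
	size_t klen = strlen(key);
	size_t i;

	for (i = 0; env[i]; i++)
		if (strncmp(env[i], key, klen) == 0 && env[i][klen] == '=')
			return env[i] + klen + 1;
	return NULL;
}

/* Returns 1 if the event reports a Bridge Port that has become active. */
static int bridgeport_became_active(const char *const env[])
{
	const char *bp = uevent_get(env, "BRIDGEPORT");
	const char *state = uevent_get(env, "STATE");

	return bp && state &&
	       strcmp(bp, "statechange") == 0 &&
	       strcmp(state, "active") == 0;
}
```

The same lookup works for the BRIDGEDHOST, VLAN, MAC and `NTOK_*` attributes.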
==================
S390 Debug Feature
==================
files:
- arch/s390/kernel/debug.c
- arch/s390/include/asm/debug.h
Description:
------------
The goal of this feature is to provide a kernel debug logging API
where log records can be stored efficiently in memory, where each component
(e.g. device drivers) can have one separate debug log.
One purpose of this is to inspect the debug logs after a production system crash
in order to analyze the reason for the crash.
If the system still runs but only a subcomponent which uses dbf fails,
it is possible to look at the debug logs on a live system via the Linux
debugfs filesystem.
The debug feature may also be very useful for kernel and driver development.
Design:
-------
Kernel components (e.g. device drivers) can register themselves at the debug
feature with the function call :c:func:`debug_register()`.
This function initializes a
debug log for the caller. Each debug log has a number of debug areas,
exactly one of which is active at any time. Each debug area consists of contiguous
pages in memory. The debug areas store debug entries (log records),
which are written by event and exception calls.
An event-call writes the specified debug entry to the active debug
area and updates the log pointer for the active area. If the end
of the active debug area is reached, a wrap around is done (ring buffer)
and the next debug entry will be written at the beginning of the active
debug area.
An exception-call writes the specified debug entry to the log and
switches to the next debug area. This is done in order to be sure
that the records which describe the origin of the exception are not
overwritten when a wrap around for the current area occurs.
The debug areas themselves are also ordered in the form of a ring buffer.
When an exception is thrown in the last debug area, the following debug
entries are then written again in the very first area.
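The interplay of the two ring buffers can be shown with a small user-space model. It is not the kernel implementation: the structure, the sizes and the `model_*` names are made up for illustration, and a debug entry is reduced to a single int.

```c
#define NR_AREAS		4	/* illustrative sizes, not kernel values */
#define ENTRIES_PER_AREA	8

struct dbf_model {
	int entries[NR_AREAS][ENTRIES_PER_AREA];
	int next[NR_AREAS];	/* next slot to write, kept per area */
	int active;		/* index of the active area */
};

/* Store one entry in the active area, wrapping around at its end. */
static void model_write(struct dbf_model *m, int value)
{
	int a = m->active;

	m->entries[a][m->next[a]] = value;
	m->next[a] = (m->next[a] + 1) % ENTRIES_PER_AREA;
}

/* Event call: write the entry and stay in the active area. */
static void model_event(struct dbf_model *m, int value)
{
	model_write(m, value);
}

/* Exception call: write the entry, then switch to the next area so that
 * later events cannot overwrite the records leading up to it. */
static void model_exception(struct dbf_model *m, int value)
{
	model_write(m, value);
	m->active = (m->active + 1) % NR_AREAS;
}
```

In this model, ten events in a row overwrite the two oldest slots of area 0, whereas a single exception leaves area 0 untouched from then on until the areas themselves have wrapped around.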
There are four versions for the event- and exception-calls: One for
logging raw data, one for text, one for numbers (unsigned int and long),
and one for sprintf-like formatted strings.
Each debug entry contains the following data:
- Timestamp
- Cpu-Number of calling task
- Level of debug entry (0...6)
- Return Address to caller
- Flag, if entry is an exception or not
The debug logs can be inspected in a live system through entries in
the debugfs-filesystem. Under the toplevel directory "``s390dbf``" there is
a directory for each registered component, which is named like the
corresponding component. debugfs is normally mounted at
``/sys/kernel/debug``, so the debug feature can be accessed under
``/sys/kernel/debug/s390dbf``.
The directories contain files which represent different views
of the debug log. Each component can decide which views should be
used by registering them with the function :c:func:`debug_register_view()`.
Predefined views for hex/ascii, sprintf and raw binary data are provided.
It is also possible to define other views. The content of
a view can be inspected simply by reading the corresponding debugfs file.
All debug logs have an actual debug level (range from 0 to 6).
The default level is 3. Event and Exception functions have a :c:data:`level`
parameter. Only debug entries with a level that is lower than or equal
to the actual level are written to the log. This means, when
writing events, high priority log entries should have a low level
value whereas low priority entries should have a high one.
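Expressed as code, the filter rule is a single comparison. The helper below is a user-space sketch with a hypothetical name; it is not part of the debug feature API.

```c
#define MODEL_MIN_LEVEL	0
#define MODEL_MAX_LEVEL	6

/*
 * Returns 1 if an entry with 'entry_level' would be written to a log whose
 * actual level is 'actual_level', 0 otherwise. Entry levels outside the
 * documented range 0..6 are rejected.
 */
static int model_would_log(int actual_level, int entry_level)
{
	if (entry_level < MODEL_MIN_LEVEL || entry_level > MODEL_MAX_LEVEL)
		return 0;
	return entry_level <= actual_level;
}
```

With the default level of 3, entries logged at levels 0 to 3 are kept and entries at levels 4 to 6 are dropped.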
The actual debug level can be changed with the help of the debugfs-filesystem
through writing a number string "x" to the ``level`` debugfs file which is
provided for every debug log. Debugging can be switched off completely
by using "-" on the ``level`` debugfs file.
Example::
> echo "-" > /sys/kernel/debug/s390dbf/dasd/level
It is also possible to deactivate the debug feature globally for every
debug log. You can change the behavior using 2 sysctl parameters in
``/proc/sys/s390dbf``:
There are currently 2 possible triggers, which stop the debug feature
globally. The first possibility is to use the ``debug_active`` sysctl. If
set to 1 the debug feature is running. If ``debug_active`` is set to 0 the
debug feature is turned off.
The second trigger which stops the debug feature is a kernel oops.
That prevents the debug feature from overwriting debug information that
happened before the oops. After an oops you can reactivate the debug feature
by piping 1 to ``/proc/sys/s390dbf/debug_active``. Nevertheless, it's not
suggested to use an oopsed kernel in a production environment.
If you want to disallow the deactivation of the debug feature, you can use
the ``debug_stoppable`` sysctl. If you set ``debug_stoppable`` to 0 the debug
feature cannot be stopped. If the debug feature is already stopped, it
will stay deactivated.
Kernel Interfaces:
------------------
.. kernel-doc:: arch/s390/kernel/debug.c
.. kernel-doc:: arch/s390/include/asm/debug.h
Predefined views:
-----------------
.. code-block:: c
extern struct debug_view debug_hex_ascii_view;
extern struct debug_view debug_raw_view;
extern struct debug_view debug_sprintf_view;
Examples
--------
.. code-block:: c
/*
* hex_ascii- + raw-view Example
*/
#include <linux/init.h>
#include <linux/module.h>
#include <asm/debug.h>
static debug_info_t *debug_info;
static int init(void)
{
/* register 4 debug areas with one page each and 4 byte data field */
debug_info = debug_register("test", 1, 4, 4 );
debug_register_view(debug_info, &debug_hex_ascii_view);
debug_register_view(debug_info, &debug_raw_view);
debug_text_event(debug_info, 4 , "one ");
debug_int_exception(debug_info, 4, 4711);
debug_event(debug_info, 3, &debug_info, 4);
return 0;
}
static void cleanup(void)
{
debug_unregister(debug_info);
}
module_init(init);
module_exit(cleanup);
.. code-block:: c
/*
* sprintf-view Example
*/
#include <linux/init.h>
#include <linux/module.h>
#include <asm/debug.h>
static debug_info_t *debug_info;
static int init(void)
{
/* register 4 debug areas with one page each and data field for */
/* format string pointer + 2 varargs (= 3 * sizeof(long)) */
debug_info = debug_register("test", 1, 4, sizeof(long) * 3);
debug_register_view(debug_info, &debug_sprintf_view);
debug_sprintf_event(debug_info, 2 , "first event in %s:%i\n",__FILE__,__LINE__);
debug_sprintf_exception(debug_info, 1, "pointer to debug info: %p\n",&debug_info);
return 0;
}
static void cleanup(void)
{
debug_unregister(debug_info);
}
module_init(init);
module_exit(cleanup);
Debugfs Interface
-----------------
Views of the debug logs can be inspected by reading the corresponding
debugfs files:
Example::
> ls /sys/kernel/debug/s390dbf/dasd
flush hex_ascii level pages raw
> cat /sys/kernel/debug/s390dbf/dasd/hex_ascii | sort -k2,2 -s
00 00974733272:680099 2 - 02 0006ad7e 07 ea 4a 90 | ....
00 00974733272:682210 2 - 02 0006ade6 46 52 45 45 | FREE
00 00974733272:682213 2 - 02 0006adf6 07 ea 4a 90 | ....
00 00974733272:682281 1 * 02 0006ab08 41 4c 4c 43 | EXCP
01 00974733272:682284 2 - 02 0006ab16 45 43 4b 44 | ECKD
01 00974733272:682287 2 - 02 0006ab28 00 00 00 04 | ....
01 00974733272:682289 2 - 02 0006ab3e 00 00 00 20 | ...
01 00974733272:682297 2 - 02 0006ad7e 07 ea 4a 90 | ....
01 00974733272:684384 2 - 00 0006ade6 46 52 45 45 | FREE
01 00974733272:684388 2 - 00 0006adf6 07 ea 4a 90 | ....
See section about predefined views for explanation of the above output!
Changing the debug level
------------------------
Example::
> cat /sys/kernel/debug/s390dbf/dasd/level
3
> echo "5" > /sys/kernel/debug/s390dbf/dasd/level
> cat /sys/kernel/debug/s390dbf/dasd/level
5
Flushing debug areas
--------------------
Debug areas can be flushed by piping the number of the desired
area (0...n) to the debugfs file "flush". When using "-" all debug areas
are flushed.
Examples:
1. Flush debug area 0::
> echo "0" > /sys/kernel/debug/s390dbf/dasd/flush
2. Flush all debug areas::
> echo "-" > /sys/kernel/debug/s390dbf/dasd/flush
Changing the size of debug areas
------------------------------------
It is possible to change the size of debug areas by piping
the number of pages to the debugfs file "pages". The resize request will
also flush the debug areas.
Example:
Define 4 pages for the debug areas of debug feature "dasd"::
> echo "4" > /sys/kernel/debug/s390dbf/dasd/pages
Stopping the debug feature
--------------------------
Example:
1. Check if stopping is allowed::
> cat /proc/sys/s390dbf/debug_stoppable
2. Stop debug feature::
> echo 0 > /proc/sys/s390dbf/debug_active
crash Interface
----------------
The ``crash`` tool since v5.1.0 has a built-in command
``s390dbf`` to display all the debug logs or export them to the file system.
With this tool it is possible
to investigate the debug logs on a live system and with a memory dump after
a system crash.
Investigating raw memory
------------------------
One last possibility to investigate the debug logs on a live
system and after a system crash is to look at the raw memory
under VM or at the Service Element.
It is possible to find the anchor of the debug-logs through
the ``debug_area_first`` symbol in the System map. Then one has
to follow the correct pointers of the data-structures defined
in debug.h and find the debug-areas in memory.
Normally modules which use the debug feature will also have
a global variable with the pointer to the debug-logs. Following
this pointer it will also be possible to find the debug logs in
memory.
For this method it is recommended to use '16 * x + 4' byte (x = 0..n)
for the length of the data field in :c:func:`debug_register()` in
order to see the debug entries well formatted.
Predefined Views
----------------
There are three predefined views: hex_ascii, raw and sprintf.
The hex_ascii view shows the data field in hex and ascii representation
(e.g. ``45 43 4b 44 | ECKD``).
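The representation can be reproduced with a short user-space formatter. This is a sketch, not the code behind the hex_ascii view: `format_hex_ascii()` is a hypothetical name, and printing non-printable bytes as '.' is an assumption based on the sample output in this document.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Render 'len' bytes as "xx xx ... | ascii".
 * Returns the number of characters written (without the trailing NUL),
 * or -1 if 'out' is too small.
 */
static int format_hex_ascii(char *out, size_t out_size,
			    const unsigned char *data, size_t len)
{
	size_t i;
	int n = 0;

	/* 3 chars per hex byte, 2 for "| ", 1 per ascii char, 1 for NUL */
	if (out_size < 4 * len + 3)
		return -1;
	for (i = 0; i < len; i++)
		n += sprintf(out + n, "%02x ", data[i]);
	n += sprintf(out + n, "| ");
	for (i = 0; i < len; i++)
		out[n++] = isprint(data[i]) ? (char)data[i] : '.';
	out[n] = '\0';
	return n;
}
```

For the four bytes 0x45 0x43 0x4b 0x44 this produces the string shown above.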
The raw view returns a bytestream as the debug areas are stored in memory.
The sprintf view formats the debug entries in the same way as the sprintf
function would do. The sprintf event/exception functions write to the
debug entry a pointer to the format string (size = sizeof(long))
and for each vararg a long value. So e.g. for a debug entry with a format
string plus two varargs one would need to allocate a (3 * sizeof(long))
byte data area in the debug_register() function.
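The size rule can be written down as a one-line helper. `sprintf_entry_size()` is a hypothetical name used only for this sketch; the result is what would be passed as the data field length when registering the debug log.

```c
#include <stddef.h>

/*
 * Data area needed for one sprintf-style debug entry: one long for the
 * pointer to the format string plus one long for each vararg.
 */
static size_t sprintf_entry_size(unsigned int nr_args)
{
	return (1 + (size_t)nr_args) * sizeof(long);
}
```

A format string with two varargs therefore needs `sprintf_entry_size(2)`, that is 3 * sizeof(long) bytes.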
IMPORTANT:
Using "%s" in sprintf event functions is dangerous. You can only
use "%s" in the sprintf event functions, if the memory for the passed string
is available as long as the debug feature exists. The reason behind this is
that due to performance considerations only a pointer to the string is stored
in the debug feature. If you log a string that is freed afterwards, you will
get an OOPS when inspecting the debug feature, because then the debug feature
will access the already freed memory.
NOTE:
If using the sprintf view do NOT use other event/exception functions
than the sprintf-event and -exception functions.
The format of the hex_ascii and sprintf view is as follows:
- Number of area
- Timestamp (formatted as seconds and microseconds since 00:00:00 Coordinated
Universal Time (UTC), January 1, 1970)
- level of debug entry
- Exception flag (* = Exception)
- Cpu-Number of calling task
- Return Address to caller
- data field
The format of the raw view is:
- Header as described in debug.h
- datafield
A typical line of the hex_ascii view will look like the following (first line
is only for explanation and will not be displayed when 'cating' the view)::
area time level exception cpu caller data (hex + ascii)
--------------------------------------------------------------------------
00 00964419409:440690 1 - 00 88023fe
Defining views
--------------
Views are specified with the 'debug_view' structure. There are defined
callback functions which are used for reading and writing the debugfs files:
.. code-block:: c
struct debug_view {
char name[DEBUG_MAX_PROCF_LEN];
debug_prolog_proc_t* prolog_proc;
debug_header_proc_t* header_proc;
debug_format_proc_t* format_proc;
debug_input_proc_t* input_proc;
void* private_data;
};
where:
.. code-block:: c
typedef int (debug_header_proc_t) (debug_info_t* id,
struct debug_view* view,
int area,
debug_entry_t* entry,
char* out_buf);
typedef int (debug_format_proc_t) (debug_info_t* id,
struct debug_view* view, char* out_buf,
const char* in_buf);
typedef int (debug_prolog_proc_t) (debug_info_t* id,
struct debug_view* view,
char* out_buf);
typedef int (debug_input_proc_t) (debug_info_t* id,
struct debug_view* view,
struct file* file, const char* user_buf,
size_t in_buf_size, loff_t* offset);
The "private_data" member can be used as pointer to view specific data.
It is not used by the debug feature itself.
The output when reading a debugfs file is structured like this::
"prolog_proc output"
"header_proc output 1" "format_proc output 1"
"header_proc output 2" "format_proc output 2"
"header_proc output 3" "format_proc output 3"
...
When a view is read from the debugfs, the Debug Feature calls the
'prolog_proc' once for writing the prolog.
Then 'header_proc' and 'format_proc' are called for each
existing debug entry.
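The call order described above (prolog once, then header and format for each entry) can be sketched in userspace C. The callback types here are simplified stand-ins for the typedefs shown earlier, and every ``toy_`` name is hypothetical. The sketch also assumes the output buffer is large enough and does no bounds checking.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Simplified, hypothetical callback types (not the kernel typedefs). */
typedef int (toy_prolog_fn)(char *out_buf);
typedef int (toy_header_fn)(int entry_nr, char *out_buf);
typedef int (toy_format_fn)(int value, char *out_buf);

struct toy_view {
	toy_prolog_fn *prolog;	/* may be NULL */
	toy_header_fn *header;	/* may be NULL */
	toy_format_fn *format;	/* may be NULL */
};

static int toy_prolog(char *out_buf)
{
	return sprintf(out_buf, "toy view\n");
}

static int toy_header(int entry_nr, char *out_buf)
{
	return sprintf(out_buf, "%02d ", entry_nr);
}

static int toy_format(int value, char *out_buf)
{
	return sprintf(out_buf, "data: %08x\n", (unsigned int)value);
}

/*
 * Render a view: prolog once, then header + format per entry.
 * Each callback returns the number of bytes it wrote.
 * The caller must supply a sufficiently large buffer.
 */
static int toy_render_view(const struct toy_view *v, const int *entries,
			   int n, char *out)
{
	int len = 0;

	out[0] = '\0';
	if (v->prolog)
		len += v->prolog(out + len);
	for (int i = 0; i < n; i++) {
		if (v->header)
			len += v->header(i, out + len);
		if (v->format)
			len += v->format(entries[i], out + len);
	}
	return len;
}
```

Each callback appends to the buffer and reports how much it wrote, so the render loop only has to accumulate the returned lengths.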
The input_proc can be used to implement functionality that runs when the view
is written to (e.g. with ``echo "0" > /sys/kernel/debug/s390dbf/dasd/level``).
For header_proc, the default function :c:func:`debug_dflt_header_fn()`, which
is defined in debug.h, can be used. It produces the same header output as the
predefined views.
E.g::
00 00964419409:440761 2 - 00 88023ec
To see how to use the callback functions, check the implementation
of the default views.
Example:
.. code-block:: c
  #include <asm/debug.h>

  #define UNKNOWNSTR "data: %08x"

  /* NULL-terminated message table, indexed by the logged integer. */
  const char* messages[] =
  {"This error...........\n",
   "That error...........\n",
   "Problem..............\n",
   "Something went wrong.\n",
   "Everything ok........\n",
   NULL
  };

  static int debug_test_format_fn(
     debug_info_t *id, struct debug_view *view,
     char *out_buf, const char *in_buf
  )
  {
    int rc = 0;

    /* An entry must hold at least one 4-byte integer. */
    if (id->buf_size >= 4) {
       int msg_nr = *((int*)in_buf);

       /* "- 1" excludes the terminating NULL entry of the table. */
       if (msg_nr < sizeof(messages) / sizeof(char*) - 1)
          rc += sprintf(out_buf, "%s", messages[msg_nr]);
       else
          rc += sprintf(out_buf, UNKNOWNSTR, msg_nr);
    }
    return rc;
  }

  struct debug_view debug_test_view = {
    "myview",                 /* name of view */
    NULL,                     /* no prolog */
    &debug_dflt_header_fn,    /* default header for each entry */
    &debug_test_format_fn,    /* our own format function */
    NULL,                     /* no input function */
    NULL                      /* no private data */
  };
test:
=====
.. code-block:: c
  debug_info_t *debug_info;
  int i;
  ...
  debug_info = debug_register("test", 0, 4, 4);
  debug_register_view(debug_info, &debug_test_view);
  for (i = 0; i < 10; i++)
    debug_int_event(debug_info, 1, i);
::
> cat /sys/kernel/debug/s390dbf/test/myview
00 00964419734:611402 1 - 00 88042ca This error...........
00 00964419734:611405 1 - 00 88042ca That error...........
00 00964419734:611408 1 - 00 88042ca Problem..............
00 00964419734:611411 1 - 00 88042ca Something went wrong.
00 00964419734:611414 1 - 00 88042ca Everything ok........
00 00964419734:611417 1 - 00 88042ca data: 00000005
00 00964419734:611419 1 - 00 88042ca data: 00000006
00 00964419734:611422 1 - 00 88042ca data: 00000007
00 00964419734:611425 1 - 00 88042ca data: 00000008
00 00964419734:611428 1 - 00 88042ca data: 00000009
This diff is collapsed.
ibm 3270 changelog
------------------
.. include:: 3270.ChangeLog
:literal:
ibm 3270 config3270.sh
----------------------
.. literalinclude:: config3270.sh
:language: shell
==================================
vfio-ccw: the basic infrastructure
==================================
......@@ -11,9 +12,11 @@ virtual machine, while vfio is the means.
Unlike other hardware architectures, s390 defines a unified I/O access
method, the so-called Channel I/O. It has its own access patterns:
- Channel programs run asynchronously on a separate (co)processor.
- The channel subsystem will access any memory designated by the caller
in the channel program directly, i.e. there is no iommu involved.
Thus when we introduce vfio support for these devices, we realize it
with a mediated device (mdev) implementation. The vfio mdev will be
added to an iommu group, so as to make itself able to be managed by the
......@@ -24,6 +27,7 @@ to perform I/O instructions.
This document does not intend to explain the s390 I/O architecture in
every detail. More information and references can be found here:
- A good start to know Channel I/O in general:
https://en.wikipedia.org/wiki/Channel_I/O
- s390 architecture:
......@@ -80,6 +84,7 @@ until interrupted. The I/O completion result is received by the
interrupt handler in the form of interrupt response block (IRB).
Back to vfio-ccw, in short:
- ORBs and channel programs are built in guest kernel (with guest
physical addresses).
- ORBs and channel programs are passed to the host kernel.
......@@ -106,6 +111,7 @@ it gets sent to hardware.
Within this implementation, we have two drivers for two types of
devices:
- The vfio_ccw driver for the physical subchannel device.
This is an I/O subchannel driver for the real subchannel device. It
realizes a group of callbacks and registers to the mdev framework as a
......@@ -137,7 +143,7 @@ devices:
vfio_pin_pages and a vfio_unpin_pages interfaces from the vfio iommu
backend for the physical devices to pin and unpin pages by demand.
Below is a high Level block diagram.
Below is a high Level block diagram::
+-------------+
| |
......@@ -158,6 +164,7 @@ Below is a high Level block diagram.
+-------------+
The process of how these work together.
1. vfio_ccw.ko drives the physical I/O subchannel, and registers the
physical device (with callbacks) to mdev framework.
When vfio_ccw probing the subchannel device, it registers device
......@@ -178,17 +185,17 @@ vfio-ccw I/O region
An I/O region is used to accept channel program request from user
space and store I/O interrupt result for user space to retrieve. The
definition of the region is:
definition of the region is::
struct ccw_io_region {
#define ORB_AREA_SIZE 12
struct ccw_io_region {
#define ORB_AREA_SIZE 12
__u8 orb_area[ORB_AREA_SIZE];
#define SCSW_AREA_SIZE 12
#define SCSW_AREA_SIZE 12
__u8 scsw_area[SCSW_AREA_SIZE];
#define IRB_AREA_SIZE 96
#define IRB_AREA_SIZE 96
__u8 irb_area[IRB_AREA_SIZE];
__u32 ret_code;
} __packed;
} __packed;
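As a cross-check of the layout above, the region can be mirrored in userspace C and its size and field offsets verified at compile time. This is a sketch under stated assumptions: ``ccw_io_region_sketch`` is a hypothetical name, ``uint8_t``/``uint32_t`` stand in for ``__u8``/``__u32``, and the GCC/Clang ``packed`` attribute stands in for the kernel's ``__packed``.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ORB_AREA_SIZE	12
#define SCSW_AREA_SIZE	12
#define IRB_AREA_SIZE	96

/* Userspace mirror of the I/O region; the name is hypothetical. */
struct ccw_io_region_sketch {
	uint8_t  orb_area[ORB_AREA_SIZE];	/* guest ORB, written by user space */
	uint8_t  scsw_area[SCSW_AREA_SIZE];	/* SCSW of the virtual subchannel */
	uint8_t  irb_area[IRB_AREA_SIZE];	/* I/O interrupt result */
	uint32_t ret_code;			/* result code of the request */
} __attribute__((packed));

/* 12 + 12 + 96 + 4 bytes, with no padding because the struct is packed. */
_Static_assert(sizeof(struct ccw_io_region_sketch) == 124,
	       "region must be 124 bytes");
_Static_assert(offsetof(struct ccw_io_region_sketch, scsw_area) == 12,
	       "scsw_area follows orb_area");
_Static_assert(offsetof(struct ccw_io_region_sketch, irb_area) == 24,
	       "irb_area follows scsw_area");
_Static_assert(offsetof(struct ccw_io_region_sketch, ret_code) == 120,
	       "ret_code follows irb_area");
```

Because the region is exchanged between user space and the kernel as raw bytes, pinning the offsets with static assertions catches accidental layout changes at build time.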
While starting an I/O request, orb_area should be filled with the
guest ORB, and scsw_area should be filled with the SCSW of the Virtual
......@@ -205,7 +212,7 @@ vfio-ccw follows what vfio-pci did on the s390 platform and uses
vfio-iommu-type1 as the vfio iommu backend.
* CCW translation APIs
A group of APIs (start with 'cp_') to do CCW translation. The CCWs
A group of APIs (start with `cp_`) to do CCW translation. The CCWs
passed in by a user space program are organized with their guest
physical memory addresses. These APIs will copy the CCWs into kernel
space, and assemble a runnable kernel channel program by updating the
......@@ -217,12 +224,14 @@ vfio-iommu-type1 as the vfio iommu backend.
This driver utilizes the CCW translation APIs and introduces
vfio_ccw, which is the driver for the I/O subchannel devices you want
to pass through.
vfio_ccw implements the following vfio ioctls:
vfio_ccw implements the following vfio ioctls::
VFIO_DEVICE_GET_INFO
VFIO_DEVICE_GET_IRQ_INFO
VFIO_DEVICE_GET_REGION_INFO
VFIO_DEVICE_RESET
VFIO_DEVICE_SET_IRQS
This provides an I/O region, so that the user space program can pass a
channel program to the kernel, to do further CCW translation before
issuing them to a real device.
......@@ -236,32 +245,49 @@ bit more detail how an I/O request triggered by the QEMU guest will be
handled (without error handling).
Explanation:
Q1-Q7: QEMU side process.
K1-K5: Kernel side process.
Q1. Get I/O region info during initialization.
Q2. Setup event notifier and handler to handle I/O completion.
- Q1-Q7: QEMU side process.
- K1-K5: Kernel side process.
Q1.
Get I/O region info during initialization.
Q2.
Setup event notifier and handler to handle I/O completion.
... ...
Q3. Intercept a ssch instruction.
Q4. Write the guest channel program and ORB to the I/O region.
K1. Copy from guest to kernel.
K2. Translate the guest channel program to a host kernel space
Q3.
Intercept a ssch instruction.
Q4.
Write the guest channel program and ORB to the I/O region.
K1.
Copy from guest to kernel.
K2.
Translate the guest channel program to a host kernel space
channel program, which becomes runnable for a real device.
K3. With the necessary information contained in the orb passed in
K3.
With the necessary information contained in the orb passed in
by QEMU, issue the ccwchain to the device.
K4. Return the ssch CC code.
Q5. Return the CC code to the guest.
K4.
Return the ssch CC code.
Q5.
Return the CC code to the guest.
... ...
K5. Interrupt handler gets the I/O result and write the result to
K5.
Interrupt handler gets the I/O result and write the result to
the I/O region.
K6. Signal QEMU to retrieve the result.
Q6. Get the signal and event handler reads out the result from the I/O
K6.
Signal QEMU to retrieve the result.
Q6.
Get the signal and event handler reads out the result from the I/O
region.
Q7. Update the irb for the guest.
Q7.
Update the irb for the guest.
Limitations
-----------
......@@ -295,6 +321,6 @@ Reference
1. ESA/s390 Principles of Operation manual (IBM Form. No. SA22-7832)
2. ESA/390 Common I/O Device Commands manual (IBM Form. No. SA22-7204)
3. https://en.wikipedia.org/wiki/Channel_I/O
4. Documentation/s390/cds.txt
4. Documentation/s390/cds.rst
5. Documentation/vfio.txt
6. Documentation/vfio-mediated-device.txt
==================================
The s390 SCSI dump tool (zfcpdump)
==================================
System z machines (z900 or higher) provide hardware support for creating system
dumps on SCSI disks. The dump process is initiated by booting a dump tool, which
......
......@@ -23,7 +23,6 @@ show up in /proc/sys/kernel:
- auto_msgmni
- bootloader_type [ X86 only ]
- bootloader_version [ X86 only ]
- callhome [ S390 only ]
- cap_last_cap
- core_pattern
- core_pipe_limit
......@@ -171,21 +170,6 @@ Documentation/x86/boot.txt for additional information.
==============================================================
callhome:
Controls the kernel's callhome behavior in case of a kernel panic.
The s390 hardware allows an operating system to send a notification
to a service organization (callhome) in case of an operating system panic.
When the value in this file is 0 (which is the default behavior)
nothing happens in case of a kernel panic. If this value is set to "1"
the complete kernel oops message is send to the IBM customer service
organization in case the mainframe the Linux operating system is running
on has a service contract with IBM.
==============================================================
cap_last_cap
Highest valid capability of the running kernel. Exports
......
......@@ -13718,7 +13718,7 @@ L: linux-s390@vger.kernel.org
L: kvm@vger.kernel.org
S: Supported
F: drivers/s390/cio/vfio_ccw*
F: Documentation/s390/vfio-ccw.txt
F: Documentation/s390/vfio-ccw.rst
F: include/uapi/linux/vfio_ccw.h
S390 ZCRYPT DRIVER
......@@ -13738,7 +13738,7 @@ S: Supported
F: drivers/s390/crypto/vfio_ap_drv.c
F: drivers/s390/crypto/vfio_ap_private.h
F: drivers/s390/crypto/vfio_ap_ops.c
F: Documentation/s390/vfio-ap.txt
F: Documentation/s390/vfio-ap.rst
S390 ZFCP DRIVER
M: Steffen Maier <maier@linux.ibm.com>
......
......@@ -346,8 +346,6 @@ static inline unsigned long __pack_fe01(unsigned int fpmode)
#define spin_cpu_relax() barrier()
#define spin_cpu_yield() spin_cpu_relax()
#define spin_end() HMT_medium()
#define spin_until_cond(cond) \
......
......@@ -24,7 +24,6 @@ CONFIG_CRASH_DUMP=y
# CONFIG_SECCOMP is not set
CONFIG_NET=y
# CONFIG_IUCV is not set
CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
CONFIG_DEVTMPFS=y
CONFIG_BLK_DEV_RAM=y
# CONFIG_BLK_DEV_XPRAM is not set
......
kexec-purgatory.c
purgatory
purgatory.lds
purgatory.ro