BIO Routines

This documentation is rather sparse; you are probably best 
off looking at the code for specific details.

The BIO library is an IO abstraction that was originally 
inspired by the need to have callbacks to perform IO to FILE 
pointers when using Windows 3.1 DLLs.  There are two types 
of BIO: a source/sink type and a filter type.
The source/sink methods are as follows:
-	BIO_s_mem()  memory buffer - a read/write byte array that
	grows until memory runs out :-).
-	BIO_s_file()  FILE pointer - A wrapper around the normal 
	'FILE *' commands, good for use with stdin/stdout.
-	BIO_s_fd()  File descriptor - A wrapper around file 
	descriptors, often used with pipes.
-	BIO_s_socket()  Socket - Used around sockets.  It is 
	mostly in the Microsoft world that sockets are different 
	from file descriptors and there are all those ugly winsock 
	commands.
-	BIO_s_null()  Null - read nothing and write nothing; a 
	useful endpoint for filter type BIOs, specifically things 
	like the message digest BIO.

The filter types are
-	BIO_f_buffer()  IO buffering - does output buffering into 
	larger chunks and performs input buffering to allow gets() 
	type functions.
-	BIO_f_md()  Message digest - a transparent filter that can 
	be asked to return a message digest for the data that has 
	passed through it.
-	BIO_f_cipher()  Encrypt or decrypt all data passing 
	through the filter.
-	BIO_f_base64()  Base64 decode on read and encode on write.
-	BIO_f_ssl()  A filter that performs SSL encryption on the 
	data sent through it.

Base BIO functions.
The BIO library has a set of base functions that are 
implemented for each particular type.  Filter BIOs will 
normally call the equivalent function on the source/sink BIO 
that they are layered on top of after they have performed 
some modification to the data stream.  Multiple filter BIOs 
can be 'pushed' into a stack of modifiers, so to read from a 
file, unbase64 it, then decrypt it, a BIO_f_cipher, 
BIO_f_base64 and a BIO_s_file would probably be used.  If a 
sha-1 and md5 message digest needed to be generated, a stack 
of two BIO_f_md() BIOs and a BIO_s_null() BIO could be used.
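
As a rough illustration of this layering, here is a 
stand-alone toy model in plain C.  It does not call the BIO 
library at all; toy_bio, toy_stack() and the three write 
functions are hypothetical names made up for this sketch.  
Each filter does its work and then calls the same operation 
on the layer below it.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a BIO: one write method plus a link downwards. */
typedef struct toy_bio {
    int (*write)(struct toy_bio *b, const char *data, int len);
    struct toy_bio *next;   /* NULL for a source/sink */
    char buf[256];          /* storage used by the sink */
    int used;               /* bytes stored (sink) or counted (filter) */
} toy_bio;

/* Sink: append to the local buffer, like a very small memory BIO. */
int toy_sink_write(toy_bio *b, const char *data, int len)
{
    if (len < 0 || b->used + len > (int)sizeof(b->buf))
        return -1;
    memcpy(b->buf + b->used, data, (size_t)len);
    b->used += len;
    return len;
}

/* Filter: upper-case the data, then hand it to the next layer. */
int toy_upper_write(toy_bio *b, const char *data, int len)
{
    char tmp[256];
    int i;

    if (b->next == NULL || len < 0 || len > (int)sizeof(tmp))
        return -1;
    for (i = 0; i < len; i++)
        tmp[i] = (char)toupper((unsigned char)data[i]);
    return b->next->write(b->next, tmp, len);
}

/* Transparent filter: count the bytes, pass them on unchanged. */
int toy_count_write(toy_bio *b, const char *data, int len)
{
    int ret;

    if (b->next == NULL)
        return -1;
    ret = b->next->write(b->next, data, len);
    if (ret > 0)
        b->used += ret;
    return ret;
}

/* Place 'top' above 'below' in the stack. */
void toy_stack(toy_bio *top, toy_bio *below)
{
    top->next = below;
}
```

Stacking the counting filter on the upper-casing filter on 
the sink, then writing "hello" to the top, leaves "HELLO" in 
the sink and a count of 5 in the counting filter.
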
The base functions are
-	BIO *BIO_new(BIO_METHOD *type); Create a new BIO of type 'type'.
-	int BIO_free(BIO *a); Free a BIO structure.  Depending on 
	the configuration, this will free the underlying data 
	object for a source/sink BIO.
-	int BIO_read(BIO *b, char *data, int len); Read up to 'len' 
	bytes into 'data'.
-	int BIO_gets(BIO *bp,char *buf, int size); Depending on 
	the BIO, this can either be a 'get special' or a get one 
	line of data, as per fgets().
-	int BIO_write(BIO *b, char *data, int len); Write 'len' 
	bytes from 'data' to the 'b' BIO.
-	int BIO_puts(BIO *bp,char *buf); Either a 'put special' or 
	a write of a null terminated string as per fputs().
-	long BIO_ctrl(BIO *bp,int cmd,long larg,char *parg);  A 
	control function which is used to manipulate the BIO 
	structure and modify its state and/or report on it.  This 
	function is just about never used directly; rather it 
	should be used in conjunction with BIO_METHOD specific 
	macros.
-	BIO *BIO_push(BIO *new_top, BIO *old); new_top is added to the
	top of the 'old' BIO list.  new_top should be a filter BIO.
	Writes go through 'new_top' first; reads go through it last.
	'old' is returned.
-	BIO *BIO_pop(BIO *bio); the new topmost BIO is returned, NULL if
	there are no more.

If a particular low level BIO method is not supported 
(normally BIO_gets()), -2 will be returned if that method is 
called.  Otherwise the IO methods (read, write, gets, puts) 
will return the number of bytes read or written, and 0 or -1 
for error (or end of input).  For the -1 case, 
BIO_should_retry(bio) can be called to determine if it was a 
genuine error or a temporary problem.  -2 will also be 
returned if the BIO has not been initialised yet.  In all 
cases, the correct error codes are set (accessible via the 
ERR library).
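
The return value convention above can be summarised in a 
small helper.  This is only a sketch in plain C: the 
toy_io_status enum and toy_classify() are hypothetical names, 
and the should_retry argument stands in for the result of 
calling BIO_should_retry(bio) after the IO call.

```c
/* Hypothetical outcome categories for one read/write/gets/puts call. */
typedef enum {
    TOY_IO_OK,           /* ret > 0: that many bytes were moved */
    TOY_IO_RETRY,        /* 0 or -1, but only a temporary problem */
    TOY_IO_FAIL_OR_EOF,  /* 0 or -1: genuine error or end of input */
    TOY_IO_UNSUPPORTED   /* -2: method missing or BIO not initialised */
} toy_io_status;

/* Map a return value plus the retry indication to one category. */
toy_io_status toy_classify(int ret, int should_retry)
{
    if (ret > 0)
        return TOY_IO_OK;
    if (ret == -2)
        return TOY_IO_UNSUPPORTED;
    if (should_retry)
        return TOY_IO_RETRY;
    return TOY_IO_FAIL_OR_EOF;
}
```

The -2 test comes before the retry test because an 
unsupported method will not start working if the call is 
simply repeated.
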


The following functions are convenience functions:
-	int BIO_printf(BIO *bio, char *format, ...);  printf but 
	to a BIO handle.
-	long BIO_ctrl_int(BIO *bp,int cmd,long larg,int iarg); a 
	convenience function to allow different argument types 
	to be passed to BIO_ctrl().
-	int BIO_dump(BIO *b,char *bytes,int len); output 'len' 
	bytes from 'bytes' in a hex dump debug format.
-	long BIO_debug_callback(BIO *bio, int cmd, char *argp, int 
	argi, long argl, long ret) - a default debug BIO callback; 
	this is mentioned below.  To use this one normally has to 
	use the BIO_set_callback_arg() function to assign an 
	output BIO for the callback to use.
-	BIO *BIO_find_type(BIO *bio,int type); when there is a 'stack'
	of BIOs, this function scans the list and returns the first
	that is of type 'type', as listed in buffer.h under BIO_TYPE_XXX.
-	void BIO_free_all(BIO *bio); Free the bio and all other BIOs
	in the list.  It walks the bio->next_bio list.



Extra commands are normally implemented as macros calling BIO_ctrl().
-	BIO_number_read(BIO *bio) - the number of bytes processed 
	by BIO_read(bio,...).
-	BIO_number_written(BIO *bio) - the number of bytes written 
	by BIO_write(bio,...).
-	BIO_reset(BIO *bio) - 'reset' the BIO.
-	BIO_eof(BIO *bio) - non zero if we are at the current end 
	of input.
-	BIO_set_close(BIO *bio, int close_flag) - set the close flag.
-	BIO_get_close(BIO *bio) - return the close flag.
-	BIO_pending(BIO *bio) - return the number of bytes waiting 
	to be read (normally buffered internally).
-	BIO_flush(BIO *bio) - output any data waiting to be output.
-	BIO_should_retry(BIO *io) - after a BIO_read/BIO_write 
	operation returns 0 or -1, a call to this function will 
	return non zero if you should retry the call later (this 
	is for non-blocking IO).
-	BIO_should_read(BIO *io) - we should retry when data can 
	be read.
-	BIO_should_write(BIO *io) - we should retry when data can 
	be written.
-	BIO_method_name(BIO *io) - return a string for the method name.
-	BIO_method_type(BIO *io) - return the unique ID of the BIO method.
-	BIO_set_callback(BIO *io,  long (*callback)(BIO *io, int 
	cmd, char *argp, int argi, long argl, long ret)); - sets 
	the debug callback.
-	BIO_get_callback(BIO *io) - return the assigned function 
	as mentioned above.
-	BIO_set_callback_arg(BIO *io, char *arg)  - assign some 
	data against the BIO.  This is normally used by the debug 
	callback but could in reality be used for anything.  To 
	get an idea of how all this works, have a look at the code 
	in the default debug callback mentioned above.  The 
	callback can modify the return values.

Details of the BIO_METHOD structure.
typedef struct bio_method_st
	{
	int type;
	char *name;
	int (*bwrite)();
	int (*bread)();
	int (*bputs)();
	int (*bgets)();
	long (*ctrl)();
	int (*create)();
	int (*destroy)();
	} BIO_METHOD;

The 'type' is the numeric type of the BIO; these are listed in buffer.h.
'name' is a textual representation of the BIO 'type'.
The 7 function pointers point to the respective function 
methods, some of which can be NULL if not implemented.
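
To make the role of the NULL function pointers concrete, the 
following plain C sketch shows a cut-down method table and 
the kind of dispatch that yields -2 for a missing method.  It 
is a toy model and not the library's code: toy_method, 
toy_obj, toy_read() and toy_gets() are hypothetical names, 
and the sense of 'init' (non-zero once usable) is an 
assumption of the sketch.

```c
#include <stddef.h>

/* Cut-down, hypothetical method table with only two IO methods. */
typedef struct toy_method {
    const char *name;
    int (*bread)(char *buf, int len);
    int (*bgets)(char *buf, int size);   /* NULL: not implemented */
} toy_method;

typedef struct toy_obj {
    const toy_method *method;
    int init;   /* assumed: non-zero once the object is usable */
} toy_obj;

/* A "null" source: every read reports end of input. */
int toy_null_read(char *buf, int len)
{
    (void)buf;
    (void)len;
    return 0;
}

/* This method type implements read but leaves gets out. */
const toy_method toy_null_method = { "toy null", toy_null_read, NULL };

/* Dispatch a read; -2 if unsupported or not yet initialised. */
int toy_read(toy_obj *b, char *buf, int len)
{
    if (b == NULL || b->method == NULL || b->method->bread == NULL)
        return -2;
    if (!b->init)
        return -2;
    return b->method->bread(buf, len);
}

/* Dispatch a gets; -2 if unsupported or not yet initialised. */
int toy_gets(toy_obj *b, char *buf, int size)
{
    if (b == NULL || b->method == NULL || b->method->bgets == NULL)
        return -2;
    if (!b->init)
        return -2;
    return b->method->bgets(buf, size);
}
```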

The BIO structure
typedef struct bio_st
	{
	BIO_METHOD *method;
	long (*callback)(BIO * bio, int mode, char *argp, int 
		argi, long argl, long ret);
	char *cb_arg; /* first argument for the callback */
	int init;
	int shutdown;
	int flags;      /* extra storage */
	int num;
	char *ptr;
	struct bio_st *next_bio; /* used by filter BIOs */
	int references;
	unsigned long num_read;
	unsigned long num_write;
	} BIO;

-	'method' is the BIO method.
-	'callback', when configured, is called before and after 
	each BIO method is called for that particular BIO.  This 
	is intended primarily for debugging and informational feedback.
-	'init' is 0 until the BIO can be used for operation.  
	Often, after a BIO is created, a number of operations may 
	need to be performed before it is available for use.  An 
	example is BIO_s_socket().  A socket needs to be 
	assigned to the BIO before it can be used.
-	'shutdown': this flag indicates if the underlying 
	communication primitive being used should be closed/freed 
	when the BIO is closed.
-	'flags' is used to hold extra state.  It is primarily used 
	to hold information about why a non-blocking operation 
	failed and to record startup protocol information for the 
	SSL BIO.
-	'num' and 'ptr' are used to hold instance specific state 
	like file descriptors or local data structures.
-	'next_bio' is used by filter BIOs to hold the pointer of the
	next BIO in the chain.  Written data is sent to this BIO and
	data read is taken from it.
-	'references' is used to indicate the number of pointers to 
	this structure.  This needs to be '1' before a call to 
	BIO_free() is made if the BIO_free() function is to 
	actually free() the structure, otherwise the reference 
	count is just decreased.  The actual BIO subsystem does 
	not really use this functionality but it is useful when 
	used in more advanced applications.
-	num_read and num_write are the total number of bytes 
	read/written via the 'read()' and 'write()' methods.
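
The 'references' rule can be sketched with a toy structure in 
plain C.  toy_ref, toy_ref_new() and toy_ref_free() are 
hypothetical names; the sketch only mirrors the behaviour 
described above, where the structure is really freed only 
when the count is 1 and is otherwise just decremented.

```c
#include <stdlib.h>

/* Toy object carrying nothing but a reference count. */
typedef struct toy_ref {
    int references;
} toy_ref;

/* Create an object with a single reference held by the caller. */
toy_ref *toy_ref_new(void)
{
    toy_ref *r = malloc(sizeof(*r));

    if (r != NULL)
        r->references = 1;
    return r;
}

/* Returns 1 if the memory was released, 0 if only the count dropped. */
int toy_ref_free(toy_ref *r)
{
    if (r == NULL)
        return 0;
    if (r->references > 1) {
        r->references--;
        return 0;
    }
    free(r);
    return 1;
}
```

An application that hands the same object to two owners 
would bump 'references' to 2; the first toy_ref_free() then 
only drops the count and the second one releases the memory.
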

BIO_ctrl operations.
The following is the list of standard commands passed as the 
second parameter to BIO_ctrl() and should be supported by 
all BIOs as best as possible.  Some are optional, some are 
mandatory; in any case, where it makes sense, a filter BIO 
should pass such requests to underlying BIOs.
-	BIO_CTRL_RESET	- Reset the BIO back to an initial state.
-	BIO_CTRL_EOF	- return 0 if we are not at the end of input, 
	non 0 if we are.
-	BIO_CTRL_INFO	- BIO specific special command, normal
	information return.
-	BIO_CTRL_SET	- set IO specific parameter.
-	BIO_CTRL_GET	- get IO specific parameter.
-	BIO_CTRL_GET_CLOSE - Get the close on BIO_free() flag, one 
	of BIO_CLOSE or BIO_NOCLOSE.
-	BIO_CTRL_SET_CLOSE - Set the close on BIO_free() flag.
-	BIO_CTRL_PENDING - Return the number of bytes available 
	for instant reading.
-	BIO_CTRL_FLUSH	- Output pending data, return number of bytes output.
-	BIO_CTRL_SHOULD_RETRY - After an IO error (-1 returned) 
	should we 'retry' when IO is possible on the underlying IO object.
-	BIO_CTRL_RETRY_TYPE - What kind of IO are we waiting on.

The following command is a special BIO_s_file() specific option.
-	BIO_CTRL_SET_FILENAME - specify a file to open for IO.

The BIO_CTRL_RETRY_TYPE needs a little more explanation.  
When performing non-blocking IO, or say reading on a memory 
BIO, when no data is present (or cannot be written), 
BIO_read() and/or BIO_write() will return -1.  
BIO_should_retry(bio) will return true if this is due to an 
IO condition rather than an actual error.  In the case of 
BIO_s_mem(), a read when there is no data will return -1 and 
indicate a retry when there is more 'read' data.
The retry type is deduced from 2 macros,
BIO_should_read(bio) and BIO_should_write(bio).
Now while it may appear obvious that a BIO_read() failure 
should indicate that a retry should be performed when more 
read data is available, this is often not true when using 
things like an SSL BIO.  During the SSL protocol startup 
multiple reads and writes are performed, triggered by any 
SSL_read or SSL_write.
So to write code that will transparently handle either a 
socket or SSL BIO,
	i=BIO_read(bio,..);
	if (i == -1)
		{
		if (BIO_should_retry(bio))
			{
			if (BIO_should_read(bio))
				{
				/* call us again when BIO can be read */
				}
			if (BIO_should_write(bio))
				{
				/* call us again when BIO can be written */
				}
			}
		}

At this point in time only read and write conditions can be 
used, but in the future I can see the need for other 
conditions.  Specifically with SSL there could be a condition 
of an X509 certificate lookup taking place, and so the 
non-blocking BIO_read would require a retry when the 
certificate lookup subsystem has finished its lookup.  This 
all makes more sense and is easy to use in an event loop type 
setup.
When using the SSL BIO, either SSL_read() or SSL_write() 
can be called during the protocol startup and things will 
still work correctly.
The nice aspect of the use of the BIO_should_retry() macro 
is that all the errno codes that indicate a non-fatal error 
are encapsulated in one place.  The Windows specific error 
codes and WSAGetLastError() calls are also hidden from the 
application.

Notes on each BIO method.
Normally just buffer.h is required, but depending on the 
BIO_METHOD, ssl.h or evp.h will also be required.

BIO_METHOD *BIO_s_mem(void);
-	BIO_set_mem_buf(BIO *bio, BUF_MEM *bm, int close_flag) - 
	set the underlying BUF_MEM structure for the BIO to use.
-	BIO_get_mem_ptr(BIO *bio, char **pp) - if pp is not NULL, 
	set it to point to the memory array and return the number 
	of bytes available.
A read/write BIO.  Any data written is appended to the 
memory array and any read is read from the front.  This BIO 
can be used for read/write at the same time. BIO_gets() is 
supported in the fgets() sense.
BIO_CTRL_INFO can be used to retrieve pointers to the memory 
buffer and its length.
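
The append-on-write, read-from-the-front behaviour can be 
modelled with a small growable array.  This is a plain C 
sketch, not the BIO_s_mem() implementation; toy_mem, 
toy_mem_write() and toy_mem_read() are hypothetical names.  
A read on an empty buffer returns -1, which in the real BIO 
would be paired with a should-retry indication.

```c
#include <stdlib.h>
#include <string.h>

typedef struct toy_mem {
    char *data;
    size_t len;   /* bytes currently held */
    size_t cap;   /* bytes allocated */
} toy_mem;

/* Append 'len' bytes, growing the array as needed. */
int toy_mem_write(toy_mem *m, const char *in, int len)
{
    if (len < 0)
        return -1;
    if (len == 0)
        return 0;
    if (m->len + (size_t)len > m->cap) {
        size_t ncap = (m->len + (size_t)len) * 2;
        char *p = realloc(m->data, ncap);

        if (p == NULL)
            return -1;
        m->data = p;
        m->cap = ncap;
    }
    memcpy(m->data + m->len, in, (size_t)len);
    m->len += (size_t)len;
    return len;
}

/* Read up to 'len' bytes from the front; -1 when nothing is held. */
int toy_mem_read(toy_mem *m, char *out, int len)
{
    size_t n;

    if (len < 0 || m->len == 0)
        return -1;
    n = (size_t)len < m->len ? (size_t)len : m->len;
    memcpy(out, m->data, n);
    /* Slide the unread tail down to the front of the array. */
    memmove(m->data, m->data + n, m->len - n);
    m->len -= n;
    return (int)n;
}
```

A toy_mem should start out zeroed ({ NULL, 0, 0 }) and its 
'data' pointer should be passed to free() when finished.
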

BIO_METHOD *BIO_s_file(void);
-	BIO_set_fp(BIO *bio, FILE *fp, int close_flag) - set 'FILE *' to use.
-	BIO_get_fp(BIO *bio, FILE **fp) - get the 'FILE *' in use.
-	BIO_read_filename(BIO *bio, char *name) - read from file.
-	BIO_write_filename(BIO *bio, char *name) - write to file.
-	BIO_append_filename(BIO *bio, char *name) - append to file.
This BIO sits over the normal system fread()/fgets() type 
functions. Gets() is supported.  This BIO in theory could be 
used for read and write but it is best to think of each BIO 
of this type as either a read or a write BIO, not both.

BIO_METHOD *BIO_s_socket(void);
BIO_METHOD *BIO_s_fd(void);
-	BIO_sock_should_retry(int i) - the underlying function 
	used to determine if a call should be retried; the 
	argument is the '0' or '-1' returned by the previous BIO 
	operation.
-	BIO_fd_should_retry(int i) - same as 
	BIO_sock_should_retry() except that it is different internally.
-	BIO_set_fd(BIO *bio, int fd, int close_flag) - set the 
	file descriptor to use.
-	BIO_get_fd(BIO *bio, int *fd) - get the file descriptor.
These two methods are very similar.  Gets() is not 
supported; if you want this functionality, put a 
BIO_f_buffer() onto it.  This BIO is bi-directional if the 
underlying file descriptor is.  This is normally the case 
for sockets but not the case for stdio descriptors.

BIO_METHOD *BIO_s_null(void);
Read and write as much data as you like, it all disappears 
into this BIO.

BIO_METHOD *BIO_f_buffer(void);
-	BIO_get_buffer_num_lines(BIO *bio) - return the number of 
	complete lines in the buffer.
-	BIO_set_buffer_size(BIO *bio, long size) - set the size of 
	the buffers.
This type performs input and output buffering.  It performs 
both at the same time.  The size of the buffer can be set 
via the set buffer size option.  Data buffered for output is 
only written when the buffer fills.
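
The output side of this can be pictured with a toy buffer in 
plain C.  It is a sketch under stated assumptions, not the 
BIO_f_buffer() code: toy_obuf, toy_obuf_write() and 
toy_obuf_flush() are hypothetical names, the buffer size of 
8 is arbitrary, and the 'out' array stands in for the 
underlying BIO.

```c
#include <string.h>

#define TOY_BUF_SIZE 8

typedef struct toy_obuf {
    char buf[TOY_BUF_SIZE];   /* data waiting to be output */
    int used;
    char out[64];             /* stands in for the underlying BIO */
    int out_len;
} toy_obuf;

/* Hand everything buffered so far to the "underlying BIO". */
int toy_obuf_flush(toy_obuf *b)
{
    int n = b->used;

    if (b->out_len + n > (int)sizeof(b->out))
        return -1;
    memcpy(b->out + b->out_len, b->buf, (size_t)n);
    b->out_len += n;
    b->used = 0;
    return n;
}

/* Buffer the data; pass it on only when the buffer is full. */
int toy_obuf_write(toy_obuf *b, const char *data, int len)
{
    int done = 0;

    if (len < 0)
        return -1;
    while (done < len) {
        int room = TOY_BUF_SIZE - b->used;
        int n = (len - done < room) ? len - done : room;

        memcpy(b->buf + b->used, data + done, (size_t)n);
        b->used += n;
        done += n;
        if (b->used == TOY_BUF_SIZE && toy_obuf_flush(b) < 0)
            return -1;
    }
    return len;
}
```

Writing 5 bytes leaves nothing in 'out'; writing 5 more 
pushes out the first 8 and keeps 2 pending until a flush, 
which is why BIO_flush() matters at the end of output.
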

BIO_METHOD *BIO_f_ssl(void);
-	BIO_set_ssl(BIO *bio, SSL *ssl, int close_flag) - the SSL 
	structure to use.
-	BIO_get_ssl(BIO *bio, SSL **ssl) - get the SSL structure 
	in use.
The SSL bio is a little different from normal BIOs because 
the underlying SSL structure is a little different.  An SSL 
structure performs IO via a read and write BIO.  These can 
be different and are normally set via the
SSL_set_rbio()/SSL_set_wbio() calls.  The SSL_set_fd() calls 
are just wrappers that create socket BIOs and then call 
SSL_set_bio() where the read and write BIOs are the same.  
The BIO_push() operation makes the SSL's IO BIOs the same, so 
make sure the BIO pushed is capable of two directional 
traffic.  If it is not, you will have to install the BIOs 
via the more conventional SSL_set_bio() call.  BIO_pop() will retrieve
the 'SSL read' BIO.

BIO_METHOD *BIO_f_md(void);
-	BIO_set_md(BIO *bio, EVP_MD *md) - set the message digest 
	to use.
-	BIO_get_md(BIO *bio, EVP_MD **mdp) - return the digest 
	method in use in mdp, return 0 if not set yet.
-	BIO_reset() reinitializes the digest (EVP_DigestInit()) 
	and passes the reset to the underlying BIOs.
All data read or written via BIO_read() or BIO_write() to 
this BIO will be added to the calculated digest.  This 
implies that this BIO is only one directional.  If read and 
write operations are performed, two separate BIO_f_md() BIOs 
are required to generate digests on both the input and the 
output.  BIO_gets(BIO *bio, char *md, int size) will place the 
generated digest into 'md' and return the number of bytes.  
The EVP_MAX_MD_SIZE should probably be used to size the 'md' 
array.  Reading the digest will also reset it.

BIO_METHOD *BIO_f_cipher(void);
-	BIO_reset() reinitializes the cipher.
-	BIO_flush() should be called when the last bytes have been 
	output to flush the final block of block ciphers.
-	BIO_get_cipher_status(BIO *b), when called after the last 
	read from a cipher BIO, returns non-zero if the data 
	decrypted correctly, otherwise, 0.
-	BIO_set_cipher(BIO *b, EVP_CIPHER *c, unsigned char *key, 
	unsigned char *iv, int encrypt) - This function is used to 
	set up a cipher BIO.  The lengths of key and iv are 
	specified by the choice of EVP_CIPHER.  'encrypt' is 1 to 
	encrypt and 0 to decrypt.

BIO_METHOD *BIO_f_base64(void);
-	BIO_flush() should be called when the last bytes have been output.
This BIO base64 encodes when writing and base64 decodes when 
reading.  It will scan the input until a suitable begin line 
is found.  After reading data, BIO_reset() will reset the 
BIO to start scanning again.  Do not mix reading and writing 
on the same base64 BIO.  It is meant as a single stream BIO.
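
The reason a flush is needed after the last write can be 
seen in a toy encoder written in plain C.  This is not the 
BIO_f_base64() code; toy_b64, toy_b64_write() and 
toy_b64_flush() are hypothetical names, the 'out' array 
stands in for the underlying BIO, and line wrapping is left 
out.  Base64 works on groups of 3 input bytes, so 1 or 2 
trailing bytes have to wait until the encoder is told that 
no more input is coming.

```c
#include <string.h>

static const char toy_b64_tab[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

typedef struct toy_b64 {
    unsigned char pend[3];  /* input bytes not yet encoded */
    int npend;
    char out[128];          /* stands in for the underlying BIO */
    int out_len;
} toy_b64;

/* Encode the 1 to 3 pending bytes as one 4 character group. */
static int toy_b64_emit(toy_b64 *e)
{
    unsigned long v;
    char *o;

    if (e->out_len + 4 > (int)sizeof(e->out))
        return -1;
    /* Missing bytes of a short final group count as zero. */
    v = (unsigned long)e->pend[0] << 16;
    if (e->npend > 1)
        v |= (unsigned long)e->pend[1] << 8;
    if (e->npend > 2)
        v |= e->pend[2];
    o = e->out + e->out_len;
    o[0] = toy_b64_tab[(v >> 18) & 63];
    o[1] = toy_b64_tab[(v >> 12) & 63];
    o[2] = e->npend > 1 ? toy_b64_tab[(v >> 6) & 63] : '=';
    o[3] = e->npend > 2 ? toy_b64_tab[v & 63] : '=';
    e->out_len += 4;
    e->npend = 0;
    return 4;
}

/* Encode complete 3 byte groups; keep any remainder pending. */
int toy_b64_write(toy_b64 *e, const char *data, int len)
{
    int i;

    if (len < 0)
        return -1;
    for (i = 0; i < len; i++) {
        e->pend[e->npend++] = (unsigned char)data[i];
        if (e->npend == 3 && toy_b64_emit(e) < 0)
            return -1;
    }
    return len;
}

/* Emit the padded final group if any input is still pending. */
int toy_b64_flush(toy_b64 *e)
{
    return e->npend > 0 ? toy_b64_emit(e) : 0;
}
```

Writing the 5 bytes "hello" to a zeroed toy_b64 produces only 
"aGVs"; the remaining "bG8=" appears once toy_b64_flush() is 
called.
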

Directions	type
both		BIO_s_mem()
one/both	BIO_s_file()
both		BIO_s_fd()
both		BIO_s_socket() 
both		BIO_s_null()
both		BIO_f_buffer()
one		BIO_f_md()  
one		BIO_f_cipher()  
one		BIO_f_base64()  
both		BIO_f_ssl()

It is easy to mix one and two directional BIOs; all one has 
to do is to keep two separate BIO pointers for reading and 
writing and be careful about usage of underlying BIOs.  The 
SSL bio by its very nature has to be two directional but 
the BIO_push() command will push the one BIO into the SSL 
BIO for both reading and writing.

The best example program to look at is apps/enc.c and/or perhaps apps/dgst.c.