/* stb_image - v1.48 - public domain JPEG/PNG reader - http://nothings.org/stb_image.c
   when you control the images you're loading
                                     no warranty implied; use at your own risk

   Do this:
      #define STB_IMAGE_IMPLEMENTATION
   before you include this file in *one* C or C++ file to create the implementation.

   #define STBI_ASSERT(x) to avoid using assert.h.
   #define STBI_MALLOC, STBI_REALLOC, and STBI_FREE to avoid using malloc,realloc,free
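
   A minimal sketch of that setup, assuming hypothetical my_malloc,
   my_realloc and my_free functions provided by your application (the three
   allocation macros must be defined together or not at all):

      #define STBI_ASSERT(x)      ((void)0)
      #define STBI_MALLOC(sz)     my_malloc(sz)
      #define STBI_REALLOC(p,sz)  my_realloc(p,sz)
      #define STBI_FREE(p)        my_free(p)
      #define STB_IMAGE_IMPLEMENTATION
      #include "stb_image.h"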


   QUICK NOTES:
      Primarily of interest to game developers and other people who can
          avoid problematic images and only need the trivial interface

      JPEG baseline (no JPEG progressive)
      PNG 1/2/4/8-bit-per-channel (16 bpc not supported)

      TGA (not sure what subset, if a subset)
      BMP non-1bpp, non-RLE
      PSD (composited view only, no extra channels)

      GIF (*comp always reports as 4-channel)
      HDR (radiance rgbE format)
      PIC (Softimage PIC)

      - decode from memory or through FILE (define STBI_NO_STDIO to remove code)
      - decode from arbitrary I/O callbacks
      - SIMD acceleration on x86/x64

   Full documentation under "DOCUMENTATION" below.


   Revision 1.49 release notes:

      - The old STBI_SIMD system which allowed installing a user-defined
        IDCT etc. has been removed. If you need this, don't upgrade. My
        assumption is that almost nobody was doing this, and those who
        were will find the next bullet item more satisfactory anyway.

      - x86 platforms now make use of SSE2 SIMD instructions if available.
        This release is 2x faster on our test JPEGs, mostly due to SIMD.
        This work was done by Fabian "ryg" Giesen.

      - Compilation of SIMD code can be suppressed with
            #define STBI_NO_SIMD
        It should not be necessary to disable it unless you have issues
        compiling (e.g. using an x86 compiler which doesn't support SSE
        intrinsics or that doesn't support the method used to detect
        SSE2 support at run-time), and even those can be reported as
        bugs so I can refine the built-in compile-time checking to be
        smarter.        

      - RGB values computed for JPEG images are slightly different from
        previous versions of stb_image. (This is due to using less
        integer precision in SIMD.) The C code has been adjusted so
        that the same RGB values will be computed regardless of whether
        SIMD support is available, so your app should always produce
        consistent results. But these results are slightly different from
        previous versions. (Specifically, about 3% of available YCbCr values
        will compute different RGB results from pre-1.49 versions by +-1;
        most of the deviating values are one smaller in the G channel.)

      - If you must produce consistent results with previous versions of
        stb_image, #define STBI_JPEG_OLD and you will get the same results
        you used to; however, you will not get the SIMD speedups for
        the YCbCr-to-RGB conversion step (although you should still see
        significant JPEG speedup from the other changes).

        Please note that STBI_JPEG_OLD is a temporary feature; it will be
        removed in future versions of the library. It is only intended for
        back-compatibility use.


   Latest revision history:
      1.49 (2014-12-25) optimize JPG, incl. x86 SIMD
                        PGM/PPM support
                        allocation macros
                        stbi_load_into() -- load into pre-defined memory
                        STBI_MALLOC,STBI_REALLOC,STBI_FREE
      1.48 (2014-12-14) fix incorrectly-named assert()
      1.47 (2014-12-14) 1/2/4-bit PNG support (both grayscale and paletted)
                        optimize PNG
                        fix bug in interlaced PNG with user-specified channel count
      1.46 (2014-08-26) fix broken tRNS chunk in non-paletted PNG
      1.45 (2014-08-16) workaround MSVC-ARM internal compiler error by wrapping malloc
      1.44 (2014-08-07) warnings
      1.43 (2014-07-15) fix MSVC-only bug in 1.42
      1.42 (2014-07-09) no _CRT_SECURE_NO_WARNINGS; error-path fixes; STBI_ASSERT
      1.41 (2014-06-25) fix search&replace that messed up comments/error messages

   See end of file for full revision history.


 ============================    Contributors    =========================
              
 Image formats                                Bug fixes & warning fixes
    Sean Barrett (jpeg, png, bmp)                Marc LeBlanc
    Nicolas Schulz (hdr, psd)                    Christopher Lloyd
    Jonathan Dummer (tga)                        Dave Moore
    Jean-Marc Lienher (gif)                      Won Chun
    Tom Seddon (pic)                             the Horde3D community
    Thatcher Ulrich (psd)                        Janez Zemva
                                                 Jonathan Blow
                                                 Laurent Gomila
 Extensions, features                            Aurelien Pocheville
    Jetro Lauha (stbi_info)                      Raymond Barbiero
    James "moose2000" Brown (iPhone PNG)         David Woo
    Ben "Disch" Wenger (io callbacks)            Roy Eltham
    Martin "SpartanJ" Golini                     Luke Graham
    Omar Cornut (1/2/4-bit png)                  Thomas Ruf
                                                 John Bartholomew
 Optimizations & bugfixes                        Ken Hamada
    Fabian "ryg" Giesen                          Cort Stratton
    Arseny Kapoulkine                            Blazej Dariusz Roszkowski
                                                 Thibault Reuille
                                                 Paul Du Bois
                                                 Guillaume George
                                                 Jerry Jansson
  If your name should be here but                Hayaki Saito
  isn't, let Sean know.                          Johan Duparc
                                                 Ronny Chevalier
                                                 Michal Cichon
*/

#ifndef STBI_INCLUDE_STB_IMAGE_H
#define STBI_INCLUDE_STB_IMAGE_H

// DOCUMENTATION
//
// Limitations:
//    - no jpeg progressive support
//    - no 16-bit-per-channel PNG
//    - no 12-bit-per-channel JPEG
//    - no 1-bit BMP
//    - GIF always returns *comp=4
//
// Basic usage (see HDR discussion below for HDR usage):
//    int x,y,n;
//    unsigned char *data = stbi_load(filename, &x, &y, &n, 0);
//    // ... process data if not NULL ... 
//    // ... x = width, y = height, n = # 8-bit components per pixel ...
//    // ... replace '0' with '1'..'4' to force that many components per pixel
//    // ... but 'n' will always be the number that it would have been if you said 0
//    stbi_image_free(data);
//
// Standard parameters:
//    int *x       -- outputs image width in pixels
//    int *y       -- outputs image height in pixels
//    int *comp    -- outputs # of image components in image file
//    int req_comp -- if non-zero, # of image components requested in result
//
// The return value from an image loader is an 'unsigned char *' which points
// to the pixel data, or NULL on an allocation failure or if the image is
// corrupt or invalid. The pixel data consists of *y scanlines of *x pixels,
// with each pixel consisting of N interleaved 8-bit components; the first
// pixel pointed to is top-left-most in the image. There is no padding between
// image scanlines or between pixels, regardless of format. The number of
// components N is 'req_comp' if req_comp is non-zero, or *comp otherwise.
// If req_comp is non-zero, *comp has the number of components that _would_
// have been output otherwise. E.g. if you set req_comp to 4, you will always
// get RGBA output, but you can check *comp to see if it's trivially opaque
// because e.g. there were only 3 channels in the source image.
//
// An output image with N components has the following components interleaved
// in this order in each pixel:
//
//     N=#comp     components
//       1           grey
//       2           grey, alpha
//       3           red, green, blue
//       4           red, green, blue, alpha
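//
// As a sketch of that layout (pixel_component and its parameters are
// illustrative names, not part of the API), component c of the pixel in
// column px of row py, for an image w pixels wide with n components per
// pixel, can be fetched like this:
//
//    static stbi_uc pixel_component(stbi_uc const *data, int w, int n, int px, int py, int c)
//    {
//       // scanlines are tightly packed, so a row is exactly w*n bytes
//       return data[((size_t) py * w + px) * n + c];
//    }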
//
// If image loading fails for any reason, the return value will be NULL,
// and *x, *y, *comp will be unchanged. The function stbi_failure_reason()
// can be queried for an extremely brief, end-user unfriendly explanation
// of why the load failed. Define STBI_NO_FAILURE_STRINGS to avoid
// compiling these strings at all, and STBI_FAILURE_USERMSG to get slightly
// more user-friendly ones.
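//
// A minimal sketch of a failure check ("image.png" is a placeholder name;
// assumes <stdio.h>, and that neither STBI_NO_STDIO nor
// STBI_NO_FAILURE_STRINGS is defined):
//
//    int w,h,n;
//    unsigned char *data = stbi_load("image.png", &w, &h, &n, 4);
//    if (data == NULL) {
//       fprintf(stderr, "stbi_load failed: %s\n", stbi_failure_reason());
//    } else {
//       // ... w*h RGBA pixels; n is the channel count stored in the file ...
//       stbi_image_free(data);
//    }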
//
// Paletted PNG, BMP, GIF, and PIC images are automatically depalettized.
//
// ===========================================================================
//
// I/O callbacks
//
// I/O callbacks allow you to read from arbitrary sources, like packaged
// files or some other source. Data read from callbacks are processed
// through a small internal buffer (currently 128 bytes) to try to reduce
// overhead. 
//
// The three functions you must define are "read" (reads some bytes of data),
// "skip" (skips some bytes of data), "eof" (reports if the stream is at the end).
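//
// As a hedged sketch, callbacks that read from a block of memory might look
// like this (mem_stream and the mem_* names are illustrative only, as are
// bytes and num_bytes; assumes <string.h> for memcpy):
//
//    typedef struct { unsigned char const *data; int size, pos; } mem_stream;
//
//    static int mem_read(void *user, char *data, int size)
//    {
//       mem_stream *m = (mem_stream *) user;
//       int left = m->size - m->pos;
//       if (size > left) size = left;
//       memcpy(data, m->data + m->pos, size);
//       m->pos += size;
//       return size;                           // number of bytes actually read
//    }
//
//    static void mem_skip(void *user, int n)
//    {
//       mem_stream *m = (mem_stream *) user;   // n may be negative ("unget")
//       m->pos += n;
//       if (m->pos < 0) m->pos = 0;
//       if (m->pos > m->size) m->pos = m->size;
//    }
//
//    static int mem_eof(void *user)
//    {
//       mem_stream *m = (mem_stream *) user;
//       return m->pos >= m->size;
//    }
//
//    int x,y,n;
//    stbi_io_callbacks cb = { mem_read, mem_skip, mem_eof };
//    mem_stream m = { bytes, num_bytes, 0 };
//    unsigned char *pixels = stbi_load_from_callbacks(&cb, &m, &x, &y, &n, 0);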
//
// ===========================================================================
//
// SIMD support
//
// The JPEG decoder will automatically use SIMD kernels on x86 platforms
// where supported.
//
// (The old do-it-yourself SIMD API is no longer supported in the current
// code.)
//
// The code will automatically detect if the required SIMD instructions are
// available, and fall back to the generic C version where they're not.
//
// The output of the JPEG decoder is slightly different from versions before
// SIMD support was introduced (that is, versions before 1.49). The
// difference is only +-1 in the 8-bit RGB channels, and only on a small
// fraction of pixels. You can force the pre-1.49 behavior by defining
// STBI_JPEG_OLD, but this will disable some of the SIMD decoding path
// and hence cost some performance.
//
// If for some reason you do not want to use any of the SIMD code, or if
// you have issues compiling it, you can disable it entirely by
// defining STBI_NO_SIMD.
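//
// For example (a sketch; the define has to be seen before the
// implementation is compiled):
//
//    #define STBI_NO_SIMD
//    #define STB_IMAGE_IMPLEMENTATION
//    #include "stb_image.h"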
//
// ===========================================================================
//
// HDR image support   (disable by defining STBI_NO_HDR)
//
// stb_image now supports loading HDR images in general, and currently
// the Radiance .HDR file format, although the support is provided
// generically. You can still load any file through the existing interface;
// if you attempt to load an HDR file, it will be automatically remapped to
// LDR, assuming gamma 2.2 and an arbitrary scale factor defaulting to 1;
// both of these constants can be reconfigured through this interface:
//
//     stbi_hdr_to_ldr_gamma(2.2f);
//     stbi_hdr_to_ldr_scale(1.0f);
//
// (note, do not use _inverse_ constants; stb_image will invert them
// appropriately).
//
// Additionally, there is a new, parallel interface for loading files as
// (linear) floats to preserve the full dynamic range:
//
//    float *data = stbi_loadf(filename, &x, &y, &n, 0);
// 
// If you load LDR images through this interface, those images will
// be promoted to floating point values, run through the inverse of
// constants corresponding to the above:
//
//     stbi_ldr_to_hdr_scale(1.0f);
//     stbi_ldr_to_hdr_gamma(2.2f);
//
// Finally, given a filename (or an open file or memory block--see header
// file for details) containing image data, you can query for the "most
// appropriate" interface to use (that is, whether the image is HDR or
// not), using:
//
//     stbi_is_hdr(char *filename);
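//
// A sketch of choosing between the two interfaces ("probe.hdr" is a
// placeholder filename):
//
//    int x,y,n;
//    if (stbi_is_hdr("probe.hdr")) {
//       float *fdata = stbi_loadf("probe.hdr", &x, &y, &n, 0);
//       // ... linear float components, if fdata is not NULL ...
//       stbi_image_free(fdata);
//    } else {
//       unsigned char *data = stbi_load("probe.hdr", &x, &y, &n, 0);
//       // ... 8-bit components, if data is not NULL ...
//       stbi_image_free(data);
//    }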
//
// ===========================================================================
//
// iPhone PNG support:
//
// By default we convert iphone-formatted PNGs back to RGB, even though
// they are internally encoded differently. You can disable this conversion
// by calling stbi_convert_iphone_png_to_rgb(0), in which case
// you will always just get the native iphone "format" through (which
// is BGR stored in RGB).
//
// Call stbi_set_unpremultiply_on_load(1) as well to force a divide per
// pixel to remove any premultiplied alpha *only* if the image file explicitly
// says there's premultiplied data (currently only happens in iPhone images,
// and only if iPhone convert-to-rgb processing is on).
//


#ifndef STBI_NO_STDIO
#include <stdio.h>
#endif // STBI_NO_STDIO

#define STBI_VERSION 1

enum
{
   STBI_default = 0, // only used for req_comp

   STBI_grey       = 1,
   STBI_grey_alpha = 2,
   STBI_rgb        = 3,
   STBI_rgb_alpha  = 4
};

typedef unsigned char stbi_uc;

#ifdef __cplusplus
extern "C" {
#endif

#ifdef STB_IMAGE_STATIC
#define STBIDEF static
#else
#define STBIDEF extern
#endif

//////////////////////////////////////////////////////////////////////////////
//
// PRIMARY API - works on images of any type
//

//
// load image by filename, open file, or memory buffer
//

typedef struct
{
   int      (*read)  (void *user,char *data,int size);   // fill 'data' with 'size' bytes.  return number of bytes actually read 
   void     (*skip)  (void *user,int n);                 // skip the next 'n' bytes, or 'unget' the last -n bytes if negative
   int      (*eof)   (void *user);                       // returns nonzero if we are at end of file/data
} stbi_io_callbacks;

STBIDEF stbi_uc *stbi_load               (char              const *filename,           int *x, int *y, int *comp, int req_comp);
STBIDEF stbi_uc *stbi_load_from_memory   (stbi_uc           const *buffer, int len   , int *x, int *y, int *comp, int req_comp);
STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk  , void *user, int *x, int *y, int *comp, int req_comp);

#ifndef STBI_NO_STDIO
STBIDEF stbi_uc *stbi_load_from_file  (FILE *f,                  int *x, int *y, int *comp, int req_comp);
// for stbi_load_from_file, file pointer is left pointing immediately after image
#endif

#ifndef STBI_NO_HDR
   STBIDEF float *stbi_loadf                 (char const *filename,           int *x, int *y, int *comp, int req_comp);
   STBIDEF float *stbi_loadf_from_memory     (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
   STBIDEF float *stbi_loadf_from_callbacks  (stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp);

   #ifndef STBI_NO_STDIO
   STBIDEF float *stbi_loadf_from_file  (FILE *f,                int *x, int *y, int *comp, int req_comp);
   #endif
#endif

#ifndef STBI_NO_HDR
   STBIDEF void   stbi_hdr_to_ldr_gamma(float gamma);
   STBIDEF void   stbi_hdr_to_ldr_scale(float scale);

   STBIDEF void   stbi_ldr_to_hdr_gamma(float gamma);
   STBIDEF void   stbi_ldr_to_hdr_scale(float scale);
#endif // STBI_NO_HDR

// stbi_is_hdr is always defined, but always returns false if STBI_NO_HDR
STBIDEF int    stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user);
STBIDEF int    stbi_is_hdr_from_memory(stbi_uc const *buffer, int len);
#ifndef STBI_NO_STDIO
STBIDEF int      stbi_is_hdr          (char const *filename);
STBIDEF int      stbi_is_hdr_from_file(FILE *f);
#endif // STBI_NO_STDIO


// get a VERY brief reason for failure
// NOT THREADSAFE
STBIDEF const char *stbi_failure_reason  (void); 

// free the loaded image -- this is just free()
STBIDEF void     stbi_image_free      (void *retval_from_stbi_load);

// get image dimensions & components without fully decoding
STBIDEF int      stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp);
STBIDEF int      stbi_info_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp);

#ifndef STBI_NO_STDIO
STBIDEF int      stbi_info            (char const *filename,     int *x, int *y, int *comp);
STBIDEF int      stbi_info_from_file  (FILE *f,                  int *x, int *y, int *comp);

#endif
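
// A sketch of querying an image without decoding it ("image.png" is a
// placeholder name; assumes <stdio.h>; the info functions return nonzero
// on success and 0 on failure):
//
//    int w,h,n;
//    if (stbi_info("image.png", &w, &h, &n))
//       printf("%d x %d pixels, %d components\n", w, h, n);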



// for image formats that explicitly notate that they have premultiplied alpha,
// we just return the colors as stored in the file. set this flag to force
// unpremultiplication. results are undefined if the unpremultiply overflows.
STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply);

// indicate whether we should process iphone images back to canonical format,
// or just pass them through "as-is"
STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert);


// ZLIB client - used by PNG, available for other purposes

STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen);
STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header);
STBIDEF char *stbi_zlib_decode_malloc(const char *buffer, int len, int *outlen);
STBIDEF int   stbi_zlib_decode_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);

STBIDEF char *stbi_zlib_decode_noheader_malloc(const char *buffer, int len, int *outlen);
STBIDEF int   stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
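
// A sketch of the malloc-based decoder (zdata and zlen are placeholders for
// your own zlib stream; the result is NULL on failure). This assumes the
// returned buffer is allocated through STBI_MALLOC like the image loaders'
// output, so it is released here with stbi_image_free(), which wraps
// STBI_FREE:
//
//    int outlen;
//    char *out = stbi_zlib_decode_malloc(zdata, zlen, &outlen);
//    if (out != NULL) {
//       // ... out holds outlen bytes of decompressed data ...
//       stbi_image_free(out);
//    }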


#ifdef __cplusplus
}
#endif

//
//
////   end header file   /////////////////////////////////////////////////////
#endif // STBI_INCLUDE_STB_IMAGE_H

#ifdef STB_IMAGE_IMPLEMENTATION

#include <stdarg.h>
#include <stddef.h> // ptrdiff_t on osx
#include <stdlib.h>
#include <string.h>

#ifndef STBI_NO_HDR
#include <math.h>  // ldexp
#endif

#ifndef STBI_NO_STDIO
#include <stdio.h>
#endif

#ifndef STBI_ASSERT
#include <assert.h>
#define STBI_ASSERT(x) assert(x)
#endif


#ifndef _MSC_VER
   #ifdef __cplusplus
   #define stbi_inline inline
   #else
   #define stbi_inline
   #endif
#else
   #define stbi_inline __forceinline
#endif


#ifdef _MSC_VER
typedef unsigned short stbi__uint16;
typedef   signed short stbi__int16;
typedef unsigned int   stbi__uint32;
typedef   signed int   stbi__int32;
#else
#include <stdint.h>
typedef uint16_t stbi__uint16;
typedef int16_t  stbi__int16;
typedef uint32_t stbi__uint32;
typedef int32_t  stbi__int32;
#endif

// should produce compiler error if size is wrong
typedef unsigned char validate_uint32[sizeof(stbi__uint32)==4 ? 1 : -1];

#ifdef _MSC_VER
#define STBI_NOTUSED(v)  (void)(v)
#else
#define STBI_NOTUSED(v)  (void)sizeof(v)
#endif

#ifdef _MSC_VER
#define STBI_HAS_LROTL
#endif

#ifdef STBI_HAS_LROTL
   #define stbi_lrot(x,y)  _lrotl(x,y)
#else
   #define stbi_lrot(x,y)  (((x) << (y)) | ((x) >> (32 - (y))))
#endif

#if defined(STBI_MALLOC) && defined(STBI_FREE) && defined(STBI_REALLOC)
// ok
#elif !defined(STBI_MALLOC) && !defined(STBI_FREE) && !defined(STBI_REALLOC)
// ok
#else
#error "Must define all or none of STBI_MALLOC, STBI_FREE, and STBI_REALLOC."
#endif

#ifndef STBI_MALLOC
#define STBI_MALLOC(sz)    malloc(sz)
#define STBI_REALLOC(p,sz) realloc(p,sz)
#define STBI_FREE(p)       free(p)
#endif

#if !defined(STBI_NO_SIMD) && (defined(__x86_64__) || defined(_M_X64) || defined(__i386) || defined(_M_IX86))
#define STBI_SSE2
#include <emmintrin.h>

#ifdef _MSC_VER

#if _MSC_VER >= 1400  // not VC6
#include <intrin.h> // __cpuid
static int stbi__cpuid3(void)
{
   int info[4];
   __cpuid(info,1);
   return info[3];
}
#else
static int stbi__cpuid3(void)
{
   int res;
   __asm {
      mov  eax,1
      cpuid
      mov  res,edx
   }
   return res;
}
#endif

#define STBI_SIMD_ALIGN(type, name) __declspec(align(16)) type name

static int stbi__sse2_available()
{
   int info3 = stbi__cpuid3();
   return ((info3 >> 26) & 1) != 0;
}
#else // assume GCC-style if not VC++
#define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))

static int stbi__sse2_available()
{
#if defined(__GNUC__) && (__GNUC__ * 100 + __GNUC_MINOR__) >= 408 // GCC 4.8 or later
   // GCC 4.8+ has a nice way to do this
   return __builtin_cpu_supports("sse2");
#else
   // portable way to do this, preferably without using GCC inline ASM?
   // just bail for now.
   return 0;
#endif
}
#endif
#endif

#ifndef STBI_SIMD_ALIGN
#define STBI_SIMD_ALIGN(type, name) type name
#endif

///////////////////////////////////////////////
//
//  stbi__context struct and start_xxx functions

// stbi__context structure is our basic context used by all images, so it
// contains all the IO context, plus some basic image information
typedef struct
{
   stbi__uint32 img_x, img_y;
   int img_n, img_out_n;
   
   stbi_io_callbacks io;
   void *io_user_data;

   int read_from_callbacks;
   int buflen;
   stbi_uc buffer_start[128];

   stbi_uc *img_buffer, *img_buffer_end;
   stbi_uc *img_buffer_original;
} stbi__context;


static void stbi__refill_buffer(stbi__context *s);

// initialize a memory-decode context
static void stbi__start_mem(stbi__context *s, stbi_uc const *buffer, int len)
{
   s->io.read = NULL;
   s->read_from_callbacks = 0;
   s->img_buffer = s->img_buffer_original = (stbi_uc *) buffer;
   s->img_buffer_end = (stbi_uc *) buffer+len;
}

// initialize a callback-based context
static void stbi__start_callbacks(stbi__context *s, stbi_io_callbacks *c, void *user)
{
   s->io = *c;
   s->io_user_data = user;
   s->buflen = sizeof(s->buffer_start);
   s->read_from_callbacks = 1;
   s->img_buffer_original = s->buffer_start;
   stbi__refill_buffer(s);
}

#ifndef STBI_NO_STDIO

static int stbi__stdio_read(void *user, char *data, int size)
{
   return (int) fread(data,1,size,(FILE*) user);
}

static void stbi__stdio_skip(void *user, int n)
{
   fseek((FILE*) user, n, SEEK_CUR);
}

static int stbi__stdio_eof(void *user)
{
   return feof((FILE*) user);
}

static stbi_io_callbacks stbi__stdio_callbacks =
{
   stbi__stdio_read,
   stbi__stdio_skip,
   stbi__stdio_eof,
};

static void stbi__start_file(stbi__context *s, FILE *f)
{
   stbi__start_callbacks(s, &stbi__stdio_callbacks, (void *) f);
}

//static void stop_file(stbi__context *s) { }

#endif // !STBI_NO_STDIO

static void stbi__rewind(stbi__context *s)
{
   // conceptually rewind SHOULD rewind to the beginning of the stream,
   // but we just rewind to the beginning of the initial buffer, because
   // we only use it after doing 'test', which only ever looks at at most 92 bytes
   s->img_buffer = s->img_buffer_original;
}

static int      stbi__jpeg_test(stbi__context *s);
static stbi_uc *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp);
static int      stbi__png_test(stbi__context *s);
static stbi_uc *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__png_info(stbi__context *s, int *x, int *y, int *comp);
static int      stbi__bmp_test(stbi__context *s);
static stbi_uc *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__tga_test(stbi__context *s);
static stbi_uc *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__tga_info(stbi__context *s, int *x, int *y, int *comp);
static int      stbi__psd_test(stbi__context *s);
static stbi_uc *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
#ifndef STBI_NO_HDR
static int      stbi__hdr_test(stbi__context *s);
static float   *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
#endif
static int      stbi__pic_test(stbi__context *s);
static stbi_uc *stbi__pic_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__gif_test(stbi__context *s);
static stbi_uc *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__gif_info(stbi__context *s, int *x, int *y, int *comp);


// this is not threadsafe
static const char *stbi__g_failure_reason;

STBIDEF const char *stbi_failure_reason(void)
{
   return stbi__g_failure_reason;
}

static int stbi__err(const char *str)
{
   stbi__g_failure_reason = str;
   return 0;
}

static void *stbi__malloc(size_t size)
{
    return STBI_MALLOC(size);
}

// stbi__err - error
// stbi__errpf - error returning pointer to float
// stbi__errpuc - error returning pointer to unsigned char

#ifdef STBI_NO_FAILURE_STRINGS
   #define stbi__err(x,y)  0
#elif defined(STBI_FAILURE_USERMSG)
   #define stbi__err(x,y)  stbi__err(y)
#else
   #define stbi__err(x,y)  stbi__err(x)
#endif

#define stbi__errpf(x,y)   ((float *) (stbi__err(x,y)?NULL:NULL))
#define stbi__errpuc(x,y)  ((unsigned char *) (stbi__err(x,y)?NULL:NULL))

STBIDEF void stbi_image_free(void *retval_from_stbi_load)
{
   STBI_FREE(retval_from_stbi_load);
}

#ifndef STBI_NO_HDR
static float   *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp);
static stbi_uc *stbi__hdr_to_ldr(float   *data, int x, int y, int comp);
#endif

static unsigned char *stbi_load_main(stbi__context *s, int *x, int *y, int *comp, int req_comp)
{
   if (stbi__jpeg_test(s)) return stbi__jpeg_load(s,x,y,comp,req_comp);
   if (stbi__png_test(s))  return stbi__png_load(s,x,y,comp,req_comp);
   if (stbi__bmp_test(s))  return stbi__bmp_load(s,x,y,comp,req_comp);
   if (stbi__gif_test(s))  return stbi__gif_load(s,x,y,comp,req_comp);
   if (stbi__psd_test(s))  return stbi__psd_load(s,x,y,comp,req_comp);
   if (stbi__pic_test(s))  return stbi__pic_load(s,x,y,comp,req_comp);

   #ifndef STBI_NO_HDR
   if (stbi__hdr_test(s)) {
      float *hdr = stbi__hdr_load(s, x,y,comp,req_comp);
      return stbi__hdr_to_ldr(hdr, *x, *y, req_comp ? req_comp : *comp);
   }
   #endif

   // test tga last because it's a crappy test!
   if (stbi__tga_test(s))
      return stbi__tga_load(s,x,y,comp,req_comp);
   return stbi__errpuc("unknown image type", "Image not of any known type, or corrupt");
}

#ifndef STBI_NO_STDIO

static FILE *stbi__fopen(char const *filename, char const *mode)
{
   FILE *f;
#if defined(_MSC_VER) && _MSC_VER >= 1400
   if (0 != fopen_s(&f, filename, mode))
      f=0;
#else
   f = fopen(filename, mode);
#endif
   return f;
}


STBIDEF stbi_uc *stbi_load(char const *filename, int *x, int *y, int *comp, int req_comp)
{
   FILE *f = stbi__fopen(filename, "rb");
   unsigned char *result;
   if (!f) return stbi__errpuc("can't fopen", "Unable to open file");
   result = stbi_load_from_file(f,x,y,comp,req_comp);
   fclose(f);
   return result;
}

STBIDEF stbi_uc *stbi_load_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
{
   unsigned char *result;
   stbi__context s;
   stbi__start_file(&s,f);
   result = stbi_load_main(&s,x,y,comp,req_comp);
   if (result) {
      // need to 'unget' all the characters in the IO buffer
      fseek(f, - (int) (s.img_buffer_end - s.img_buffer), SEEK_CUR);
   }
   return result;
}
#endif //!STBI_NO_STDIO

STBIDEF stbi_uc *stbi_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
{
   stbi__context s;
   stbi__start_mem(&s,buffer,len);
   return stbi_load_main(&s,x,y,comp,req_comp);
}

STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
{
   stbi__context s;
   stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
   return stbi_load_main(&s,x,y,comp,req_comp);
}

#ifndef STBI_NO_HDR

static float *stbi_loadf_main(stbi__context *s, int *x, int *y, int *comp, int req_comp)
{
   unsigned char *data;
   #ifndef STBI_NO_HDR
   if (stbi__hdr_test(s))
      return stbi__hdr_load(s,x,y,comp,req_comp);
   #endif
   data = stbi_load_main(s, x, y, comp, req_comp);
   if (data)
      return stbi__ldr_to_hdr(data, *x, *y, req_comp ? req_comp : *comp);
   return stbi__errpf("unknown image type", "Image not of any known type, or corrupt");
}

STBIDEF float *stbi_loadf_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
{
   stbi__context s;
   stbi__start_mem(&s,buffer,len);
   return stbi_loadf_main(&s,x,y,comp,req_comp);
}

STBIDEF float *stbi_loadf_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
{
   stbi__context s;
   stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
   return stbi_loadf_main(&s,x,y,comp,req_comp);
}

#ifndef STBI_NO_STDIO
STBIDEF float *stbi_loadf(char const *filename, int *x, int *y, int *comp, int req_comp)
{
   float *result;
   FILE *f = stbi__fopen(filename, "rb");
   if (!f) return stbi__errpf("can't fopen", "Unable to open file");
   result = stbi_loadf_from_file(f,x,y,comp,req_comp);
   fclose(f);
   return result;
}

STBIDEF float *stbi_loadf_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
{
   stbi__context s;
   stbi__start_file(&s,f);
   return stbi_loadf_main(&s,x,y,comp,req_comp);
}
#endif // !STBI_NO_STDIO

#endif // !STBI_NO_HDR

// these is-hdr-or-not functions are defined regardless of whether
// STBI_NO_HDR is defined, for API simplicity; if STBI_NO_HDR is defined,
// they always report false!

STBIDEF int stbi_is_hdr_from_memory(stbi_uc const *buffer, int len)
{
   #ifndef STBI_NO_HDR
   stbi__context s;
   stbi__start_mem(&s,buffer,len);
   return stbi__hdr_test(&s);
   #else
   STBI_NOTUSED(buffer);
   STBI_NOTUSED(len);
   return 0;
   #endif
}

#ifndef STBI_NO_STDIO
STBIDEF int      stbi_is_hdr          (char const *filename)
{
   FILE *f = stbi__fopen(filename, "rb");
   int result=0;
   if (f) {
      result = stbi_is_hdr_from_file(f);
      fclose(f);
   }
   return result;
}

STBIDEF int      stbi_is_hdr_from_file(FILE *f)
{
   #ifndef STBI_NO_HDR
   stbi__context s;
   stbi__start_file(&s,f);
   return stbi__hdr_test(&s);
   #else
   return 0;
   #endif
}
#endif // !STBI_NO_STDIO

STBIDEF int      stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user)
{
   #ifndef STBI_NO_HDR
   stbi__context s;
   stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
   return stbi__hdr_test(&s);
   #else
   return 0;
   #endif
}

#ifndef STBI_NO_HDR
static float stbi__h2l_gamma_i=1.0f/2.2f, stbi__h2l_scale_i=1.0f;
static float stbi__l2h_gamma=2.2f, stbi__l2h_scale=1.0f;

STBIDEF void   stbi_hdr_to_ldr_gamma(float gamma) { stbi__h2l_gamma_i = 1/gamma; }
STBIDEF void   stbi_hdr_to_ldr_scale(float scale) { stbi__h2l_scale_i = 1/scale; }

STBIDEF void   stbi_ldr_to_hdr_gamma(float gamma) { stbi__l2h_gamma = gamma; }
STBIDEF void   stbi_ldr_to_hdr_scale(float scale) { stbi__l2h_scale = scale; }
#endif


//////////////////////////////////////////////////////////////////////////////
//
// Common code used by all image loaders
//

enum
{
   SCAN_load=0,
   SCAN_type,
   SCAN_header
};

static void stbi__refill_buffer(stbi__context *s)
{
   int n = (s->io.read)(s->io_user_data,(char*)s->buffer_start,s->buflen);
   if (n == 0) {
      // at end of file, treat same as if from memory, but need to handle case
      // where s->img_buffer isn't pointing to safe memory, e.g. 0-byte file
      s->read_from_callbacks = 0;
      s->img_buffer = s->buffer_start;
      s->img_buffer_end = s->buffer_start+1;
      *s->img_buffer = 0;
   } else {
      s->img_buffer = s->buffer_start;
      s->img_buffer_end = s->buffer_start + n;
   }
}

stbi_inline static stbi_uc stbi__get8(stbi__context *s)
{
   if (s->img_buffer < s->img_buffer_end)
      return *s->img_buffer++;
   if (s->read_from_callbacks) {
      stbi__refill_buffer(s);
      return *s->img_buffer++;
   }
   return 0;
}

stbi_inline static int stbi__at_eof(stbi__context *s)
{
   if (s->io.read) {
      if (!(s->io.eof)(s->io_user_data)) return 0;
      // if feof() is true, check if buffer = end
      // special case: we've only got the special 0 character at the end
      if (s->read_from_callbacks == 0) return 1;
   }

   return s->img_buffer >= s->img_buffer_end;   
}

static void stbi__skip(stbi__context *s, int n)
{
   if (s->io.read) {
      int blen = (int) (s->img_buffer_end - s->img_buffer);
      if (blen < n) {
         s->img_buffer = s->img_buffer_end;
         (s->io.skip)(s->io_user_data, n - blen);
         return;
      }
   }
   s->img_buffer += n;
}

static int stbi__getn(stbi__context *s, stbi_uc *buffer, int n)
{
   if (s->io.read) {
      int blen = (int) (s->img_buffer_end - s->img_buffer);
      if (blen < n) {
         int res, count;

         memcpy(buffer, s->img_buffer, blen);
         
         count = (s->io.read)(s->io_user_data, (char*) buffer + blen, n - blen);
         res = (count == (n-blen));
         s->img_buffer = s->img_buffer_end;
         return res;
      }
   }

   if (s->img_buffer+n <= s->img_buffer_end) {
      memcpy(buffer, s->img_buffer, n);
      s->img_buffer += n;
      return 1;
   } else
      return 0;
}

static int stbi__get16be(stbi__context *s)
{
   int z = stbi__get8(s);
   return (z << 8) + stbi__get8(s);
}

static stbi__uint32 stbi__get32be(stbi__context *s)
{
   stbi__uint32 z = stbi__get16be(s);
   return (z << 16) + stbi__get16be(s);
}

static int stbi__get16le(stbi__context *s)
{
   int z = stbi__get8(s);
   return z + (stbi__get8(s) << 8);
}

static stbi__uint32 stbi__get32le(stbi__context *s)
{
   stbi__uint32 z = stbi__get16le(s);
   return z + (stbi__get16le(s) << 16);
}
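
// Illustrative sketch, not part of stb_image: the 16/32-bit readers above
// assemble multi-byte integers one byte at a time, so the result does not
// depend on host endianness. The sketch_* helpers below are hypothetical
// names that read from a plain byte array instead of an stbi__context.
static unsigned int sketch_read16be(const unsigned char *p)
{
   return ((unsigned int) p[0] << 8) | p[1];   // first byte is most significant
}

static unsigned int sketch_read16le(const unsigned char *p)
{
   return p[0] | ((unsigned int) p[1] << 8);   // first byte is least significant
}

static unsigned long sketch_read32be(const unsigned char *p)
{
   // two big-endian 16-bit halves, high half first
   return ((unsigned long) sketch_read16be(p) << 16) | sketch_read16be(p + 2);
}
// For the bytes {0x12,0x34,0x56,0x78}, sketch_read16be gives 0x1234,
// sketch_read16le gives 0x3412, and sketch_read32be gives 0x12345678.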

//////////////////////////////////////////////////////////////////////////////
//
//  generic converter from built-in img_n to req_comp
//    individual types do this automatically as much as possible (e.g. jpeg
//    does all cases internally since it needs to colorspace convert anyway,
//    and it never has alpha, so very few cases ). png can automatically
//    interleave an alpha=255 channel, but falls back to this for other cases
//
//  assume data buffer is malloced, so malloc a new one and free that one
//  only failure mode is malloc failing

static stbi_uc stbi__compute_y(int r, int g, int b)
{
   return (stbi_uc) (((r*77) + (g*150) +  (29*b)) >> 8);
}
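
// Illustrative sketch, not part of stb_image: the weights 77, 150 and 29 in
// stbi__compute_y sum to 256, so the >> 8 acts as a divide by 256 and the
// result always fits in 0..255. sketch_luma is a hypothetical stand-alone
// copy of that arithmetic.
static unsigned char sketch_luma(int r, int g, int b)
{
   // fixed-point weighted sum of the three channels, scaled back down by 256
   return (unsigned char) ((r*77 + g*150 + b*29) >> 8);
}
// White (255,255,255) maps to 255, black to 0, and pure red (255,0,0) to 76.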

static unsigned char *stbi__convert_format(unsigned char *data, int img_n, int req_comp, unsigned int x, unsigned int y)
{
   int i,j;
   unsigned char *good;

   if (req_comp == img_n) return data;
   STBI_ASSERT(req_comp >= 1 && req_comp <= 4);

   good = (unsigned char *) stbi__malloc(req_comp * x * y);
   if (good == NULL) {
      STBI_FREE(data);
      return stbi__errpuc("outofmem", "Out of memory");
   }

   for (j=0; j < (int) y; ++j) {
      unsigned char *src  = data + j * x * img_n   ;
      unsigned char *dest = good + j * x * req_comp;

      #define COMBO(a,b)  ((a)*8+(b))
      #define CASE(a,b)   case COMBO(a,b): for(i=x-1; i >= 0; --i, src += a, dest += b)
      // convert source image with img_n components to one with req_comp components;
      // avoid switch per pixel, so use switch per scanline and massive macros
      switch (COMBO(img_n, req_comp)) {
         CASE(1,2) dest[0]=src[0], dest[1]=255; break;
         CASE(1,3) dest[0]=dest[1]=dest[2]=src[0]; break;
         CASE(1,4) dest[0]=dest[1]=dest[2]=src[0], dest[3]=255; break;
         CASE(2,1) dest[0]=src[0]; break;
         CASE(2,3) dest[0]=dest[1]=dest[2]=src[0]; break;
         CASE(2,4) dest[0]=dest[1]=dest[2]=src[0], dest[3]=src[1]; break;
         CASE(3,4) dest[0]=src[0],dest[1]=src[1],dest[2]=src[2],dest[3]=255; break;
         CASE(3,1) dest[0]=stbi__compute_y(src[0],src[1],src[2]); break;
         CASE(3,2) dest[0]=stbi__compute_y(src[0],src[1],src[2]), dest[1] = 255; break;
         CASE(4,1) dest[0]=stbi__compute_y(src[0],src[1],src[2]); break;
         CASE(4,2) dest[0]=stbi__compute_y(src[0],src[1],src[2]), dest[1] = src[3]; break;
         CASE(4,3) dest[0]=src[0],dest[1]=src[1],dest[2]=src[2]; break;
         default: STBI_ASSERT(0);
      }
      #undef CASE
   }

   STBI_FREE(data);
   return good;
}
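
// Illustrative sketch, not part of stb_image: what the CASE(3,4) line in
// stbi__convert_format does for one scanline, written as a plain loop.
// sketch_rgb_to_rgba is a hypothetical helper; the caller supplies both
// buffers (3 bytes per source pixel, 4 bytes per destination pixel).
static void sketch_rgb_to_rgba(const unsigned char *src, unsigned char *dest, int pixels)
{
   int i;
   for (i = 0; i < pixels; ++i, src += 3, dest += 4) {
      dest[0] = src[0];
      dest[1] = src[1];
      dest[2] = src[2];
      dest[3] = 255;   // source has no alpha, so mark every pixel opaque
   }
}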

#ifndef STBI_NO_HDR
static float   *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp)
{
   int i,k,n;
   float *output = (float *) stbi__malloc(x * y * comp * sizeof(float));
   if (output == NULL) { STBI_FREE(data); return stbi__errpf("outofmem", "Out of memory"); }
   // compute number of non-alpha components
   if (comp & 1) n = comp; else n = comp-1;
   for (i=0; i < x*y; ++i) {
      for (k=0; k < n; ++k) {
         output[i*comp + k] = (float) (pow(data[i*comp+k]/255.0f, stbi__l2h_gamma) * stbi__l2h_scale);
      }
      if (k < comp) output[i*comp + k] = data[i*comp+k]/255.0f;
   }
   STBI_FREE(data);
   return output;
}

#define stbi__float2int(x)   ((int) (x))
static stbi_uc *stbi__hdr_to_ldr(float   *data, int x, int y, int comp)
{
   int i,k,n;
   stbi_uc *output = (stbi_uc *) stbi__malloc(x * y * comp);
   if (output == NULL) { STBI_FREE(data); return stbi__errpuc("outofmem", "Out of memory"); }
   // compute number of non-alpha components
   if (comp & 1) n = comp; else n = comp-1;
   for (i=0; i < x*y; ++i) {
      for (k=0; k < n; ++k) {
         float z = (float) pow(data[i*comp+k]*stbi__h2l_scale_i, stbi__h2l_gamma_i) * 255 + 0.5f;
         if (z < 0) z = 0;
         if (z > 255) z = 255;
         output[i*comp + k] = (stbi_uc) stbi__float2int(z);
      }
      if (k < comp) {
         float z = data[i*comp+k] * 255 + 0.5f;
         if (z < 0) z = 0;
         if (z > 255) z = 255;
         output[i*comp + k] = (stbi_uc) stbi__float2int(z);
      }
   }
   STBI_FREE(data);
   return output;
}
#endif

//////////////////////////////////////////////////////////////////////////////
//
//  "baseline" JPEG/JFIF decoder (not actually fully baseline implementation)
//
//    simple implementation
//      - channel subsampling of at most 2 in each dimension
//      - doesn't support delayed output of y-dimension
//      - simple interface (only one output format: 8-bit interleaved RGB)
//      - doesn't try to recover corrupt jpegs
//      - doesn't allow partial loading, loading multiple at once
//      - still fast on x86 (copying globals into locals doesn't help x86)
//      - allocates lots of intermediate memory (full size of all components)
//        - non-interleaved case requires this anyway
//        - allows good upsampling (see next)
//    high-quality
//      - upsampled channels are bilinearly interpolated, even across blocks
//      - quality integer IDCT derived from IJG's 'slow'
//    performance
//      - fast huffman; reasonable integer IDCT
//      - uses a lot of intermediate memory, could cache poorly
//      - load http://nothings.org/remote/anemones.jpg 3 times on 2.8GHz P4
//          stb_jpeg:   1.34 seconds (MSVC6, default release build)
//          stb_jpeg:   1.06 seconds (MSVC6, processor = Pentium Pro)
//          IJL11.dll:  1.08 seconds (compiled by intel)
//          IJG 1998:   0.98 seconds (MSVC6, makefile provided by IJG)
//          IJG 1998:   0.95 seconds (MSVC6, makefile + proc=PPro)

// huffman decoding acceleration
#define FAST_BITS   9  // larger handles more cases; smaller stomps less cache

typedef struct
{
   stbi_uc  fast[1 << FAST_BITS];
   // weirdly, repacking this into AoS is a 10% speed loss, instead of a win
   stbi__uint16 code[256];
   stbi_uc  values[256];
   stbi_uc  size[257];
   unsigned int maxcode[18];
   int    delta[17];   // old 'firstsymbol' - old 'firstcode'
} stbi__huffman;

typedef struct
{
   stbi__context *s;
   stbi__huffman huff_dc[4];
   stbi__huffman huff_ac[4];
   stbi_uc dequant[4][64];
   stbi__int16 fast_ac[4][1 << FAST_BITS];

// sizes for components, interleaved MCUs
   int img_h_max, img_v_max;
   int img_mcu_x, img_mcu_y;
   int img_mcu_w, img_mcu_h;

// definition of jpeg image component
   struct
   {
      int id;
      int h,v;
      int tq;
      int hd,ha;
      int dc_pred;

      int x,y,w2,h2;
      stbi_uc *data;
      void *raw_data;
      stbi_uc *linebuf;
   } img_comp[4];

   stbi__uint32         code_buffer; // jpeg entropy-coded buffer
   int            code_bits;   // number of valid bits
   unsigned char  marker;      // marker seen while filling entropy buffer
   int            nomore;      // flag if we saw a marker so must stop

   int scan_n, order[4];
   int restart_interval, todo;

// kernels
   void (*idct_block_kernel)(stbi_uc *out, int out_stride, short data[64]);
   void (*YCbCr_to_RGB_kernel)(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step);
   stbi_uc *(*resample_row_hv_2_kernel)(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs);
} stbi__jpeg;

static int stbi__build_huffman(stbi__huffman *h, int *count)
{
   int i,j,k=0,code;
   // build size list for each symbol (from JPEG spec)
   for (i=0; i < 16; ++i)
      for (j=0; j < count[i]; ++j)
         h->size[k++] = (stbi_uc) (i+1);
   h->size[k] = 0;

   // compute actual symbols (from jpeg spec)
   code = 0;
   k = 0;
   for(j=1; j <= 16; ++j) {
      // compute delta to add to code to compute symbol id
      h->delta[j] = k - code;
      if (h->size[k] == j) {
         while (h->size[k] == j)
            h->code[k++] = (stbi__uint16) (code++);
         if (code-1 >= (1 << j)) return stbi__err("bad code lengths","Corrupt JPEG");
      }
      // compute largest code + 1 for this size, preshifted as needed later
      h->maxcode[j] = code << (16-j);
      code <<= 1;
   }
   h->maxcode[j] = 0xffffffff;

   // build non-spec acceleration table; 255 is flag for not-accelerated
   memset(h->fast, 255, 1 << FAST_BITS);
   for (i=0; i < k; ++i) {
      int s = h->size[i];
      if (s <= FAST_BITS) {
         int c = h->code[i] << (FAST_BITS-s);
         int m = 1 << (FAST_BITS-s);
         for (j=0; j < m; ++j) {
            h->fast[c+j] = (stbi_uc) i;
         }
      }
   }
   return 1;
}
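
// Illustrative sketch, not part of stb_image: the code assignment performed
// by the first two loops of stbi__build_huffman, without the validity check
// or the acceleration tables. count[i] is the number of codes of length i+1,
// as in the JPEG DHT segment; sketch_assign_codes and its output arrays are
// hypothetical names. Returns the number of symbols written.
static int sketch_assign_codes(const int count[16], unsigned short *code, unsigned char *size)
{
   int len, j, k = 0;
   unsigned int next = 0;
   for (len = 1; len <= 16; ++len) {
      for (j = 0; j < count[len-1]; ++j) {
         size[k] = (unsigned char) len;
         code[k++] = (unsigned short) next++;   // codes of one length are consecutive
      }
      next <<= 1;   // moving to the next length appends a zero bit
   }
   return k;
}
// With two codes of length 2 and one of length 3, the codes come out as
// 00, 01 and 100 in binary.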

// build a table that decodes both magnitude and value of small ACs in
// one go.
static void stbi__build_fast_ac(stbi__int16 *fast_ac, stbi__huffman *h)
{
   int i;
   for (i=0; i < (1 << FAST_BITS); ++i) {
      stbi_uc fast = h->fast[i];
      fast_ac[i] = 0;
      if (fast < 255) {
         int rs = h->values[fast];
         int run = (rs >> 4) & 15;
         int magbits = rs & 15;
         int len = h->size[fast];

         if (magbits && len + magbits <= FAST_BITS) {
            // magnitude code followed by receive_extend code
            int k = ((i << len) & ((1 << FAST_BITS) - 1)) >> (FAST_BITS - magbits);
            int m = 1 << (magbits - 1);
            if (k < m) k += (~0U << magbits) + 1; // unsigned shift: left-shifting -1 is undefined in C
            // if the result is small enough, we can fit it in fast_ac table
            if (k >= -128 && k <= 127)
               fast_ac[i] = (stbi__int16) ((k << 8) + (run << 4) + (len + magbits));
         }
      }
   }
}

static void stbi__grow_buffer_unsafe(stbi__jpeg *j)
{
   do {
      int b = j->nomore ? 0 : stbi__get8(j->s);
      if (b == 0xff) {
         int c = stbi__get8(j->s);
         if (c != 0) {
            j->marker = (unsigned char) c;
            j->nomore = 1;
            return;
         }
      }
      j->code_buffer |= b << (24 - j->code_bits);
      j->code_bits += 8;
   } while (j->code_bits <= 24);
}

// (1 << n) - 1
static stbi__uint32 stbi__bmask[17]={0,1,3,7,15,31,63,127,255,511,1023,2047,4095,8191,16383,32767,65535};

// decode a jpeg huffman value from the bitstream
stbi_inline static int stbi__jpeg_huff_decode(stbi__jpeg *j, stbi__huffman *h)
{
   unsigned int temp;
   int c,k;

   if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);

   // look at the top FAST_BITS bits and determine what symbol ID it is,
   // if the code is at most FAST_BITS bits long
   c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
   k = h->fast[c];
   if (k < 255) {
      int s = h->size[k];
      if (s > j->code_bits)
         return -1;
      j->code_buffer <<= s;
      j->code_bits -= s;
      return h->values[k];
   }

   // naive test is to shift the code_buffer down so k bits are
   // valid, then test against maxcode. To speed this up, we've
   // preshifted maxcode left so that it has (16-k) 0s at the
   // end; in other words, regardless of the number of bits, it
   // wants to be compared against something shifted to have 16;
   // that way we don't need to shift inside the loop.
   temp = j->code_buffer >> 16;
   for (k=FAST_BITS+1 ; ; ++k)
      if (temp < h->maxcode[k])
         break;
   if (k == 17) {
      // error! code not found
      j->code_bits -= 16;
      return -1;
   }

   if (k > j->code_bits)
      return -1;

   // convert the huffman code to the symbol id
   c = ((j->code_buffer >> (32 - k)) & stbi__bmask[k]) + h->delta[k];
   STBI_ASSERT((((j->code_buffer) >> (32 - h->size[c])) & stbi__bmask[h->size[c]]) == h->code[c]);

   // convert the id to a symbol
   j->code_bits -= k;
   j->code_buffer <<= k;
   return h->values[c];
}

// bias[n] = (-1<<n) + 1
static int const stbi__jbias[16] = {0,-1,-3,-7,-15,-31,-63,-127,-255,-511,-1023,-2047,-4095,-8191,-16383,-32767};

// combined JPEG 'receive' and JPEG 'extend', since baseline
// always extends everything it receives.
stbi_inline static int stbi__extend_receive(stbi__jpeg *j, int n)
{
   unsigned int k;
   int sgn;
   if (j->code_bits < n) stbi__grow_buffer_unsafe(j);

   sgn = (stbi__int32)j->code_buffer >> 31; // sign bit is always in MSB
   k = stbi_lrot(j->code_buffer, n);
   j->code_buffer = k & ~stbi__bmask[n];
   k &= stbi__bmask[n];
   j->code_bits -= n;
   return k + (stbi__jbias[n] & ~sgn);
}
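
// Illustrative sketch, not part of stb_image: the JPEG 'extend' step that
// stbi__extend_receive implements with a sign mask instead of a comparison.
// An n-bit field whose top bit is clear encodes a negative value;
// sketch_extend is a hypothetical helper that takes the already-extracted
// bits (assumes 1 <= n <= 15).
static int sketch_extend(int bits, int n)
{
   if (bits < (1 << (n-1)))
      return bits - (1 << n) + 1;   // same offset as stbi__jbias[n]
   return bits;
}
// For n = 3: 000 -> -7, 011 -> -4, 100 -> 4, 111 -> 7.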

// given a value that's at position X in the zigzag stream,
// where does it appear in the 8x8 matrix coded as row-major?
static stbi_uc stbi__jpeg_dezigzag[64+15] =
{
    0,  1,  8, 16,  9,  2,  3, 10,
   17, 24, 32, 25, 18, 11,  4,  5,
   12, 19, 26, 33, 40, 48, 41, 34,
   27, 20, 13,  6,  7, 14, 21, 28,
   35, 42, 49, 56, 57, 50, 43, 36,
   29, 22, 15, 23, 30, 37, 44, 51,
   58, 59, 52, 45, 38, 31, 39, 46,
   53, 60, 61, 54, 47, 55, 62, 63,
   // let corrupt input sample past end
   63, 63, 63, 63, 63, 63, 63, 63,
   63, 63, 63, 63, 63, 63, 63
};

// decode one 64-entry block--
static int stbi__jpeg_decode_block(stbi__jpeg *j, short data[64], stbi__huffman *hdc, stbi__huffman *hac, stbi__int16 *fac, int b, stbi_uc *dequant)
{
   int diff,dc,k;
   int t;

   if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
   t = stbi__jpeg_huff_decode(j, hdc);
   if (t < 0) return stbi__err("bad huffman code","Corrupt JPEG");

   // 0 all the ac values now so we can do it 32-bits at a time
   memset(data,0,64*sizeof(data[0]));

   diff = t ? stbi__extend_receive(j, t) : 0;
   dc = j->img_comp[b].dc_pred + diff;
   j->img_comp[b].dc_pred = dc;
   data[0] = (short) (dc * dequant[0]);

   // decode AC components, see JPEG spec
   k = 1;
   do {
      unsigned int zig;
      int c,r,s;
      if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
      c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
      r = fac[c];
      if (r) { // fast-AC path
         k += (r >> 4) & 15; // run
         s = r & 15; // combined length
         j->code_buffer <<= s;
         j->code_bits -= s;
         // decode into unzigzag'd location
         zig = stbi__jpeg_dezigzag[k++];
         data[zig] = (short) ((r >> 8) * dequant[zig]);
      } else {
         int rs = stbi__jpeg_huff_decode(j, hac);
         if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
         s = rs & 15;
         r = rs >> 4;
         if (s == 0) {
            if (rs != 0xf0) break; // end block
            k += 16;
         } else {
            k += r;
            // decode into unzigzag'd location
            zig = stbi__jpeg_dezigzag[k++];
            data[zig] = (short) (stbi__extend_receive(j,s) * dequant[zig]);
         }
      }
   } while (k < 64);
   return 1;
}

// take a -128..127 value and clamp it and convert to 0..255
stbi_inline static stbi_uc stbi__clamp(int x)
{
   // trick to use a single test to catch both cases
   if ((unsigned int) x > 255) {
      if (x < 0) return 0;
      if (x > 255) return 255;
   }
   return (stbi_uc) x;
}
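
// Illustrative sketch, not part of stb_image: why the single unsigned
// comparison in stbi__clamp catches both out-of-range cases. Casting a
// negative int to unsigned yields a large value, so (unsigned) x > 255 is
// true for x < 0 as well as x > 255. sketch_clamp is a hypothetical copy.
static unsigned char sketch_clamp(int x)
{
   if ((unsigned int) x > 255)
      return (unsigned char) (x < 0 ? 0 : 255);   // only reached when out of range
   return (unsigned char) x;
}
// sketch_clamp(-5) is 0, sketch_clamp(300) is 255, sketch_clamp(128) is 128.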

#define stbi__f2f(x)  ((int) (((x) * 4096 + 0.5)))
#define stbi__fsh(x)  ((x) << 12)

// derived from jidctint -- DCT_ISLOW
#define STBI__IDCT_1D(s0,s1,s2,s3,s4,s5,s6,s7)       \
   int t0,t1,t2,t3,p1,p2,p3,p4,p5,x0,x1,x2,x3; \
   p2 = s2;                                    \
   p3 = s6;                                    \
   p1 = (p2+p3) * stbi__f2f(0.5411961f);             \
   t2 = p1 + p3*stbi__f2f(-1.847759065f);            \
   t3 = p1 + p2*stbi__f2f( 0.765366865f);            \
   p2 = s0;                                    \
   p3 = s4;                                    \
   t0 = stbi__fsh(p2+p3);                            \
   t1 = stbi__fsh(p2-p3);                            \
   x0 = t0+t3;                                 \
   x3 = t0-t3;                                 \
   x1 = t1+t2;                                 \
   x2 = t1-t2;                                 \
   t0 = s7;                                    \
   t1 = s5;                                    \
   t2 = s3;                                    \
   t3 = s1;                                    \
   p3 = t0+t2;                                 \
   p4 = t1+t3;                                 \
   p1 = t0+t3;                                 \
   p2 = t1+t2;                                 \
   p5 = (p3+p4)*stbi__f2f( 1.175875602f);            \
   t0 = t0*stbi__f2f( 0.298631336f);                 \
   t1 = t1*stbi__f2f( 2.053119869f);                 \
   t2 = t2*stbi__f2f( 3.072711026f);                 \
   t3 = t3*stbi__f2f( 1.501321110f);                 \
   p1 = p5 + p1*stbi__f2f(-0.899976223f);            \
   p2 = p5 + p2*stbi__f2f(-2.562915447f);            \
   p3 = p3*stbi__f2f(-1.961570560f);                 \
   p4 = p4*stbi__f2f(-0.390180644f);                 \
   t3 += p1+p4;                                \
   t2 += p2+p3;                                \
   t1 += p2+p4;                                \
   t0 += p1+p3;

// .344 seconds on 3*anemones.jpg
static void stbi__idct_block(stbi_uc *out, int out_stride, short data[64])
{
   int i,val[64],*v=val;
   stbi_uc *o;
   short *d = data;

   // columns
   for (i=0; i < 8; ++i,++d, ++v) {
      // if all zeroes, shortcut -- this avoids dequantizing 0s and IDCTing
      if (d[ 8]==0 && d[16]==0 && d[24]==0 && d[32]==0
           && d[40]==0 && d[48]==0 && d[56]==0) {
         //    no shortcut                 0     seconds
         //    (1|2|3|4|5|6|7)==0          0     seconds
         //    all separate               -0.047 seconds
         //    1 && 2|3 && 4|5 && 6|7:    -0.047 seconds
         int dcterm = d[0] << 2;
         v[0] = v[8] = v[16] = v[24] = v[32] = v[40] = v[48] = v[56] = dcterm;
      } else {
         STBI__IDCT_1D(d[ 0],d[ 8],d[16],d[24],d[32],d[40],d[48],d[56])
         // constants scaled things up by 1<<12; let's bring them back
         // down, but keep 2 extra bits of precision
         x0 += 512; x1 += 512; x2 += 512; x3 += 512;
         v[ 0] = (x0+t3) >> 10;
         v[56] = (x0-t3) >> 10;
         v[ 8] = (x1+t2) >> 10;
         v[48] = (x1-t2) >> 10;
         v[16] = (x2+t1) >> 10;
         v[40] = (x2-t1) >> 10;
         v[24] = (x3+t0) >> 10;
         v[32] = (x3-t0) >> 10;
      }
   }

   for (i=0, v=val, o=out; i < 8; ++i,v+=8,o+=out_stride) {
      // no fast case since the first 1D IDCT spread components out
      STBI__IDCT_1D(v[0],v[1],v[2],v[3],v[4],v[5],v[6],v[7])
      // constants scaled things up by 1<<12, plus we had 1<<2 from first
      // loop, plus horizontal and vertical each scale by sqrt(8) so together
      // we've got an extra 1<<3, so 1<<17 total we need to remove.
      // so we want to round that, which means adding 0.5 * 1<<17,
      // aka 65536. Also, we'll end up with -128 to 127 that we want
      // to encode as 0..255 by adding 128, so we'll add that before the shift
      x0 += 65536 + (128<<17);
      x1 += 65536 + (128<<17);
      x2 += 65536 + (128<<17);
      x3 += 65536 + (128<<17);
      // tried computing the shifts into temps, or'ing the temps to see
      // if any were out of range, but that was slower
      o[0] = stbi__clamp((x0+t3) >> 17);
      o[7] = stbi__clamp((x0-t3) >> 17);
      o[1] = stbi__clamp((x1+t2) >> 17);
      o[6] = stbi__clamp((x1-t2) >> 17);
      o[2] = stbi__clamp((x2+t1) >> 17);
      o[5] = stbi__clamp((x2-t1) >> 17);
      o[3] = stbi__clamp((x3+t0) >> 17);
      o[4] = stbi__clamp((x3-t0) >> 17);
   }
}

#ifdef STBI_SSE2

// sse2 integer IDCT. not the fastest possible implementation but it
// produces bit-identical results to the generic C version so it's
// fully "transparent".
static void stbi__idct_sse2(stbi_uc *out, int out_stride, short data[64])
{
   // This is constructed to match our regular (generic) integer IDCT exactly.
   __m128i row0, row1, row2, row3, row4, row5, row6, row7;
   __m128i tmp;

   // dot product constant: even elems=x, odd elems=y
   #define dct_const(x,y)  _mm_setr_epi16((x),(y),(x),(y),(x),(y),(x),(y))

   // out(0) = c0[even]*x + c0[odd]*y   (c0, x, y 16-bit, out 32-bit)
   // out(1) = c1[even]*x + c1[odd]*y
   #define dct_rot(out0,out1, x,y,c0,c1) \
      __m128i c0##lo = _mm_unpacklo_epi16((x),(y)); \
      __m128i c0##hi = _mm_unpackhi_epi16((x),(y)); \
      __m128i out0##_l = _mm_madd_epi16(c0##lo, c0); \
      __m128i out0##_h = _mm_madd_epi16(c0##hi, c0); \
      __m128i out1##_l = _mm_madd_epi16(c0##lo, c1); \
      __m128i out1##_h = _mm_madd_epi16(c0##hi, c1)

   // out = in << 12  (in 16-bit, out 32-bit)
   #define dct_widen(out, in) \
      __m128i out##_l = _mm_srai_epi32(_mm_unpacklo_epi16(_mm_setzero_si128(), (in)), 4); \
      __m128i out##_h = _mm_srai_epi32(_mm_unpackhi_epi16(_mm_setzero_si128(), (in)), 4)

   // wide add
   #define dct_wadd(out, a, b) \
      __m128i out##_l = _mm_add_epi32(a##_l, b##_l); \
      __m128i out##_h = _mm_add_epi32(a##_h, b##_h)

   // wide sub
   #define dct_wsub(out, a, b) \
      __m128i out##_l = _mm_sub_epi32(a##_l, b##_l); \
      __m128i out##_h = _mm_sub_epi32(a##_h, b##_h)

   // butterfly a/b, add bias, then shift by "s" and pack
   #define dct_bfly32o(out0, out1, a,b,bias,s) \
      { \
         __m128i abiased_l = _mm_add_epi32(a##_l, bias); \
         __m128i abiased_h = _mm_add_epi32(a##_h, bias); \
         dct_wadd(sum, abiased, b); \
         dct_wsub(dif, abiased, b); \
         out0 = _mm_packs_epi32(_mm_srai_epi32(sum_l, s), _mm_srai_epi32(sum_h, s)); \
         out1 = _mm_packs_epi32(_mm_srai_epi32(dif_l, s), _mm_srai_epi32(dif_h, s)); \
      }

   // 8-bit interleave step (for transposes)
   #define dct_interleave8(a, b) \
      tmp = a; \
      a = _mm_unpacklo_epi8(a, b); \
      b = _mm_unpackhi_epi8(tmp, b)

   // 16-bit interleave step (for transposes)
   #define dct_interleave16(a, b) \
      tmp = a; \
      a = _mm_unpacklo_epi16(a, b); \
      b = _mm_unpackhi_epi16(tmp, b)

   #define dct_pass(bias,shift) \
      { \
         /* even part */ \
         dct_rot(t2e,t3e, row2,row6, rot0_0,rot0_1); \
         __m128i sum04 = _mm_add_epi16(row0, row4); \
         __m128i dif04 = _mm_sub_epi16(row0, row4); \
         dct_widen(t0e, sum04); \
         dct_widen(t1e, dif04); \
         dct_wadd(x0, t0e, t3e); \
         dct_wsub(x3, t0e, t3e); \
         dct_wadd(x1, t1e, t2e); \
         dct_wsub(x2, t1e, t2e); \
         /* odd part */ \
         dct_rot(y0o,y2o, row7,row3, rot2_0,rot2_1); \
         dct_rot(y1o,y3o, row5,row1, rot3_0,rot3_1); \
         __m128i sum17 = _mm_add_epi16(row1, row7); \
         __m128i sum35 = _mm_add_epi16(row3, row5); \
         dct_rot(y4o,y5o, sum17,sum35, rot1_0,rot1_1); \
         dct_wadd(x4, y0o, y4o); \
         dct_wadd(x5, y1o, y5o); \
         dct_wadd(x6, y2o, y5o); \
         dct_wadd(x7, y3o, y4o); \
         dct_bfly32o(row0,row7, x0,x7,bias,shift); \
         dct_bfly32o(row1,row6, x1,x6,bias,shift); \
         dct_bfly32o(row2,row5, x2,x5,bias,shift); \
         dct_bfly32o(row3,row4, x3,x4,bias,shift); \
      }

   __m128i rot0_0 = dct_const(stbi__f2f(0.5411961f), stbi__f2f(0.5411961f) + stbi__f2f(-1.847759065f));
   __m128i rot0_1 = dct_const(stbi__f2f(0.5411961f) + stbi__f2f( 0.765366865f), stbi__f2f(0.5411961f));
   __m128i rot1_0 = dct_const(stbi__f2f(1.175875602f) + stbi__f2f(-0.899976223f), stbi__f2f(1.175875602f));
   __m128i rot1_1 = dct_const(stbi__f2f(1.175875602f), stbi__f2f(1.175875602f) + stbi__f2f(-2.562915447f));
   __m128i rot2_0 = dct_const(stbi__f2f(-1.961570560f) + stbi__f2f( 0.298631336f), stbi__f2f(-1.961570560f));
   __m128i rot2_1 = dct_const(stbi__f2f(-1.961570560f), stbi__f2f(-1.961570560f) + stbi__f2f( 3.072711026f));
   __m128i rot3_0 = dct_const(stbi__f2f(-0.390180644f) + stbi__f2f( 2.053119869f), stbi__f2f(-0.390180644f));
   __m128i rot3_1 = dct_const(stbi__f2f(-0.390180644f), stbi__f2f(-0.390180644f) + stbi__f2f( 1.501321110f));

   // rounding biases in column/row passes, see stbi__idct_block for explanation.
   __m128i bias_0 = _mm_set1_epi32(512);
   __m128i bias_1 = _mm_set1_epi32(65536 + (128<<17));

   // load
   row0 = _mm_load_si128((const __m128i *) (data + 0*8));
   row1 = _mm_load_si128((const __m128i *) (data + 1*8));
   row2 = _mm_load_si128((const __m128i *) (data + 2*8));
   row3 = _mm_load_si128((const __m128i *) (data + 3*8));
   row4 = _mm_load_si128((const __m128i *) (data + 4*8));
   row5 = _mm_load_si128((const __m128i *) (data + 5*8));
   row6 = _mm_load_si128((const __m128i *) (data + 6*8));
   row7 = _mm_load_si128((const __m128i *) (data + 7*8));

   // column pass
   dct_pass(bias_0, 10);

   {
      // 16bit 8x8 transpose pass 1
      dct_interleave16(row0, row4);
      dct_interleave16(row1, row5);
      dct_interleave16(row2, row6);
      dct_interleave16(row3, row7);

      // transpose pass 2
      dct_interleave16(row0, row2);
      dct_interleave16(row1, row3);
      dct_interleave16(row4, row6);
      dct_interleave16(row5, row7);

      // transpose pass 3
      dct_interleave16(row0, row1);
      dct_interleave16(row2, row3);
      dct_interleave16(row4, row5);
      dct_interleave16(row6, row7);
   }

   // row pass
   dct_pass(bias_1, 17);

   {
      // pack
      __m128i p0 = _mm_packus_epi16(row0, row1); // a0a1a2a3...a7b0b1b2b3...b7
      __m128i p1 = _mm_packus_epi16(row2, row3);
      __m128i p2 = _mm_packus_epi16(row4, row5);
      __m128i p3 = _mm_packus_epi16(row6, row7);

      // 8bit 8x8 transpose pass 1
      dct_interleave8(p0, p2); // a0e0a1e1...
      dct_interleave8(p1, p3); // c0g0c1g1...

      // transpose pass 2
      dct_interleave8(p0, p1); // a0c0e0g0...
      dct_interleave8(p2, p3); // b0d0f0h0...

      // transpose pass 3
      dct_interleave8(p0, p2); // a0b0c0d0...
      dct_interleave8(p1, p3); // a4b4c4d4...

      // store
      _mm_storel_epi64((__m128i *) out, p0); out += out_stride;
      _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p0, 0x4e)); out += out_stride;
      _mm_storel_epi64((__m128i *) out, p2); out += out_stride;
      _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p2, 0x4e)); out += out_stride;
      _mm_storel_epi64((__m128i *) out, p1); out += out_stride;
      _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p1, 0x4e)); out += out_stride;
      _mm_storel_epi64((__m128i *) out, p3); out += out_stride;
      _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p3, 0x4e));
   }

#undef dct_const
#undef dct_rot
#undef dct_widen
#undef dct_wadd
#undef dct_wsub
#undef dct_bfly32o
#undef dct_interleave8
#undef dct_interleave16
#undef dct_pass
}

#endif // STBI_SSE2

#define STBI__MARKER_none  0xff
// if there's a pending marker from the entropy stream, return that
// otherwise, fetch from the stream and get a marker. if there's no
// marker, return 0xff, which is never a valid marker value
static stbi_uc stbi__get_marker(stbi__jpeg *j)
{
   stbi_uc x;
   if (j->marker != STBI__MARKER_none) { x = j->marker; j->marker = STBI__MARKER_none; return x; }
   x = stbi__get8(j->s);
   if (x != 0xff) return STBI__MARKER_none;
   while (x == 0xff)
      x = stbi__get8(j->s);
   return x;
}
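
/* Illustrative sketch, not part of stb_image: the marker lookup above can be
   reproduced on a plain byte buffer. The names sketch_next_marker and
   SKETCH_MARKER_NONE are hypothetical, and the explicit bounds checks are an
   assumption made for this buffer-based variant. The cached j->marker case
   is left out to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MARKER_NONE 0xff

// Hypothetical helper: read one marker starting at buf[*pos].
// Returns the marker code, or SKETCH_MARKER_NONE when the byte at *pos is
// not 0xff or the buffer ends first. *pos is advanced past consumed bytes.
static unsigned char sketch_next_marker(const unsigned char *buf, size_t len, size_t *pos)
{
   unsigned char x;
   assert(buf != NULL && pos != NULL);
   if (*pos >= len) return SKETCH_MARKER_NONE;
   x = buf[(*pos)++];
   if (x != 0xff) return SKETCH_MARKER_NONE;
   // any number of 0xff fill bytes may precede the marker code itself
   while (x == 0xff) {
      if (*pos >= len) return SKETCH_MARKER_NONE;
      x = buf[(*pos)++];
   }
   return x;
}
```

   For the bytes ff ff d8 12, a first call at position 0 skips the fill byte
   and yields 0xd8; a second call sees 0x12 and reports "no marker".
*/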

// in each scan, we'll have scan_n components, and the order
// of the components is specified by order[]
#define STBI__RESTART(x)     ((x) >= 0xd0 && (x) <= 0xd7)

// after a restart interval, reset the entropy decoder and
// the dc prediction
static void stbi__jpeg_reset(stbi__jpeg *j)
{
   j->code_bits = 0;
   j->code_buffer = 0;
   j->nomore = 0;
   j->img_comp[0].dc_pred = j->img_comp[1].dc_pred = j->img_comp[2].dc_pred = 0;
   j->marker = STBI__MARKER_none;
   j->todo = j->restart_interval ? j->restart_interval : 0x7fffffff;
   // no more than 1<<31 MCUs if no restart_interval? that's plenty safe,
   // since we don't even allow 1<<30 pixels
}

static int stbi__parse_entropy_coded_data(stbi__jpeg *z)
{
   stbi__jpeg_reset(z);
   if (z->scan_n == 1) {
      int i,j;
      STBI_SIMD_ALIGN(short, data[64]);
      int n = z->order[0];
      // non-interleaved data, we just need to process one block at a time,
      // in trivial scanline order
      // number of blocks to do just depends on how many actual "pixels" this
      // component has, independent of interleaved MCU blocking and such
      int w = (z->img_comp[n].x+7) >> 3;
      int h = (z->img_comp[n].y+7) >> 3;
      for (j=0; j < h; ++j) {
         for (i=0; i < w; ++i) {
            int ha = z->img_comp[n].ha;
            if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
            z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
            // every data block is an MCU, so countdown the restart interval
            if (--z->todo <= 0) {
               if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
               // if it's NOT a restart, then just bail, so we get corrupt data
               // rather than no data
               if (!STBI__RESTART(z->marker)) return 1;
               stbi__jpeg_reset(z);
            }
         }
      }
   } else { // interleaved!
      int i,j,k,x,y;
      STBI_SIMD_ALIGN(short, data[64]);
      for (j=0; j < z->img_mcu_y; ++j) {
         for (i=0; i < z->img_mcu_x; ++i) {
            // scan an interleaved mcu... process scan_n components in order
            for (k=0; k < z->scan_n; ++k) {
               int n = z->order[k];
               // scan out an mcu's worth of this component; that's just determined
               // by the basic H and V specified for the component
               for (y=0; y < z->img_comp[n].v; ++y) {
                  for (x=0; x < z->img_comp[n].h; ++x) {
                     int x2 = (i*z->img_comp[n].h + x)*8;
                     int y2 = (j*z->img_comp[n].v + y)*8;
                     int ha = z->img_comp[n].ha;
                     if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
                     z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*y2+x2, z->img_comp[n].w2, data);
                  }
               }
            }
            // after all interleaved components, that's an interleaved MCU,
            // so now count down the restart interval
            if (--z->todo <= 0) {
               if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
               // if it's NOT a restart, then just bail, so we get corrupt data
               // rather than no data
               if (!STBI__RESTART(z->marker)) return 1;
               stbi__jpeg_reset(z);
            }
         }
      }
   }
   return 1;
}

static int stbi__process_marker(stbi__jpeg *z, int m)
{
   int L;
   switch (m) {
      case STBI__MARKER_none: // no marker found
         return stbi__err("expected marker","Corrupt JPEG");

      case 0xC2: // SOF2 - progressive
         return stbi__err("progressive jpeg","JPEG format not supported (progressive)");

      case 0xDD: // DRI - specify restart interval
         if (stbi__get16be(z->s) != 4) return stbi__err("bad DRI len","Corrupt JPEG");
         z->restart_interval = stbi__get16be(z->s);
         return 1;

      case 0xDB: // DQT - define quantization table
         L = stbi__get16be(z->s)-2;
         while (L > 0) {
            int q = stbi__get8(z->s);
            int p = q >> 4;
            int t = q & 15,i;
            if (p != 0) return stbi__err("bad DQT type","Corrupt JPEG");
            if (t > 3) return stbi__err("bad DQT table","Corrupt JPEG");
            for (i=0; i < 64; ++i)
               z->dequant[t][stbi__jpeg_dezigzag[i]] = stbi__get8(z->s);
            L -= 65;
         }
         return L==0;

      case 0xC4: // DHT - define huffman table
         L = stbi__get16be(z->s)-2;
         while (L > 0) {
            stbi_uc *v;
            int sizes[16],i,n=0;
            int q = stbi__get8(z->s);
            int tc = q >> 4;
            int th = q & 15;
            if (tc > 1 || th > 3) return stbi__err("bad DHT header","Corrupt JPEG");
            for (i=0; i < 16; ++i) {
               sizes[i] = stbi__get8(z->s);
               n += sizes[i];
            }
            L -= 17;
            if (tc == 0) {
               if (!stbi__build_huffman(z->huff_dc+th, sizes)) return 0;
               v = z->huff_dc[th].values;
            } else {
               if (!stbi__build_huffman(z->huff_ac+th, sizes)) return 0;
               v = z->huff_ac[th].values;
            }
            for (i=0; i < n; ++i)
               v[i] = stbi__get8(z->s);
            if (tc != 0)
               stbi__build_fast_ac(z->fast_ac[th], z->huff_ac + th);
            L -= n;
         }
         return L==0;
   }
   // check for comment block or APP blocks
   if ((m >= 0xE0 && m <= 0xEF) || m == 0xFE) {
      stbi__skip(z->s, stbi__get16be(z->s)-2);
      return 1;
   }
   return 0;
}

// after we see stbi__SOS
static int stbi__process_scan_header(stbi__jpeg *z)
{
   int i;
   int Ls = stbi__get16be(z->s);
   z->scan_n = stbi__get8(z->s);
   if (z->scan_n < 1 || z->scan_n > 4 || z->scan_n > (int) z->s->img_n) return stbi__err("bad stbi__SOS component count","Corrupt JPEG");
   if (Ls != 6+2*z->scan_n) return stbi__err("bad stbi__SOS len","Corrupt JPEG");
   for (i=0; i < z->scan_n; ++i) {
      int id = stbi__get8(z->s), which;
      int q = stbi__get8(z->s);
      for (which = 0; which < z->s->img_n; ++which)
         if (z->img_comp[which].id == id)
            break;
      if (which == z->s->img_n) return 0;
      z->img_comp[which].hd = q >> 4;   if (z->img_comp[which].hd > 3) return stbi__err("bad DC huff","Corrupt JPEG");
      z->img_comp[which].ha = q & 15;   if (z->img_comp[which].ha > 3) return stbi__err("bad AC huff","Corrupt JPEG");
      z->order[i] = which;
   }
   if (stbi__get8(z->s) != 0) return stbi__err("bad stbi__SOS","Corrupt JPEG");
   stbi__get8(z->s); // should be 63, but might be 0
   if (stbi__get8(z->s) != 0) return stbi__err("bad stbi__SOS","Corrupt JPEG");

   return 1;
}

static int stbi__process_frame_header(stbi__jpeg *z, int scan)
{
   stbi__context *s = z->s;
   int Lf,p,i,q, h_max=1,v_max=1,c;
   Lf = stbi__get16be(s);         if (Lf < 11) return stbi__err("bad stbi__SOF len","Corrupt JPEG"); // JPEG
   p  = stbi__get8(s);          if (p != 8) return stbi__err("only 8-bit","JPEG format not supported: 8-bit only"); // JPEG baseline
   s->img_y = stbi__get16be(s);   if (s->img_y == 0) return stbi__err("no header height", "JPEG format not supported: delayed height"); // Legal, but we don't handle it--but neither does IJG
   s->img_x = stbi__get16be(s);   if (s->img_x == 0) return stbi__err("0 width","Corrupt JPEG"); // JPEG requires
   c = stbi__get8(s);
   if (c != 3 && c != 1) return stbi__err("bad component count","Corrupt JPEG");    // JFIF requires
   s->img_n = c;
   for (i=0; i < c; ++i) {
      z->img_comp[i].data = NULL;
      z->img_comp[i].linebuf = NULL;
   }

   if (Lf != 8+3*s->img_n) return stbi__err("bad stbi__SOF len","Corrupt JPEG");

   for (i=0; i < s->img_n; ++i) {
      z->img_comp[i].id = stbi__get8(s);
      if (z->img_comp[i].id != i+1)   // JFIF requires
         if (z->img_comp[i].id != i)  // some version of jpegtran outputs non-JFIF-compliant files!
            return stbi__err("bad component ID","Corrupt JPEG");
      q = stbi__get8(s);
      z->img_comp[i].h = (q >> 4);  if (!z->img_comp[i].h || z->img_comp[i].h > 4) return stbi__err("bad H","Corrupt JPEG");
      z->img_comp[i].v = q & 15;    if (!z->img_comp[i].v || z->img_comp[i].v > 4) return stbi__err("bad V","Corrupt JPEG");
      z->img_comp[i].tq = stbi__get8(s);  if (z->img_comp[i].tq > 3) return stbi__err("bad TQ","Corrupt JPEG");
   }

   if (scan != SCAN_load) return 1;

   if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");

   for (i=0; i < s->img_n; ++i) {
      if (z->img_comp[i].h > h_max) h_max = z->img_comp[i].h;
      if (z->img_comp[i].v > v_max) v_max = z->img_comp[i].v;
   }

   // compute interleaved mcu info
   z->img_h_max = h_max;
   z->img_v_max = v_max;
   z->img_mcu_w = h_max * 8;
   z->img_mcu_h = v_max * 8;
   z->img_mcu_x = (s->img_x + z->img_mcu_w-1) / z->img_mcu_w;
   z->img_mcu_y = (s->img_y + z->img_mcu_h-1) / z->img_mcu_h;

   for (i=0; i < s->img_n; ++i) {
      // number of effective pixels (e.g. for non-interleaved MCU)
      z->img_comp[i].x = (s->img_x * z->img_comp[i].h + h_max-1) / h_max;
      z->img_comp[i].y = (s->img_y * z->img_comp[i].v + v_max-1) / v_max;
      // to simplify generation, we'll allocate enough memory to decode
      // the bogus oversized data from using interleaved MCUs and their
      // big blocks (e.g. a 16x16 iMCU on an image of width 33); we won't
      // discard the extra data until colorspace conversion
      z->img_comp[i].w2 = z->img_mcu_x * z->img_comp[i].h * 8;
      z->img_comp[i].h2 = z->img_mcu_y * z->img_comp[i].v * 8;
      z->img_comp[i].raw_data = stbi__malloc(z->img_comp[i].w2 * z->img_comp[i].h2+15);
      if (z->img_comp[i].raw_data == NULL) {
         for(--i; i >= 0; --i) {
            STBI_FREE(z->img_comp[i].raw_data);
            z->img_comp[i].data = NULL;
         }
         return stbi__err("outofmem", "Out of memory");
      }
      // align blocks for installable-idct using mmx/sse
      z->img_comp[i].data = (stbi_uc*) (((size_t) z->img_comp[i].raw_data + 15) & ~15);
      z->img_comp[i].linebuf = NULL;
   }

   return 1;
}
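
/* Illustrative sketch, not part of stb_image: the MCU and padding arithmetic
   in the frame header code above, pulled out into small helpers so the
   "16x16 iMCU on an image of width 33" case can be checked by hand. All
   sketch_* names are hypothetical; the formulas are the ones used above.

```c
#include <assert.h>

typedef struct {
   int mcu_w, mcu_h;   // interleaved MCU size in pixels
   int mcu_x, mcu_y;   // number of MCUs per row and per column
} sketch_mcu_layout;

// MCU grid for an image, given the largest sampling factors in the frame
static sketch_mcu_layout sketch_layout(int img_x, int img_y, int h_max, int v_max)
{
   sketch_mcu_layout m;
   assert(img_x > 0 && img_y > 0 && h_max > 0 && v_max > 0);
   m.mcu_w = h_max * 8;
   m.mcu_h = v_max * 8;
   m.mcu_x = (img_x + m.mcu_w - 1) / m.mcu_w;   // round up
   m.mcu_y = (img_y + m.mcu_h - 1) / m.mcu_h;
   return m;
}

// effective sample count of one component along one axis (rounded up)
static int sketch_effective(int img_dim, int factor, int factor_max)
{
   return (img_dim * factor + factor_max - 1) / factor_max;
}

// padded sample count actually allocated along one axis
static int sketch_padded(int mcu_count, int factor)
{
   return mcu_count * factor * 8;
}
```

   With a 33x17 image and h_max = v_max = 2, the grid is 3x2 MCUs. A
   component with h = 2 has 33 effective samples per row but is stored 48
   wide; a component with h = 1 has 17 effective samples and is stored 24
   wide.
*/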

// use comparisons since in some cases we handle more than one case (e.g. stbi__SOF)
#define stbi__DNL(x)         ((x) == 0xdc)
#define stbi__SOI(x)         ((x) == 0xd8)
#define stbi__EOI(x)         ((x) == 0xd9)
#define stbi__SOF(x)         ((x) == 0xc0 || (x) == 0xc1)
#define stbi__SOS(x)         ((x) == 0xda)

static int decode_jpeg_header(stbi__jpeg *z, int scan)
{
   int m;
   z->marker = STBI__MARKER_none; // initialize cached marker to empty
   m = stbi__get_marker(z);
   if (!stbi__SOI(m)) return stbi__err("no stbi__SOI","Corrupt JPEG");
   if (scan == SCAN_type) return 1;
   m = stbi__get_marker(z);
   while (!stbi__SOF(m)) {
      if (!stbi__process_marker(z,m)) return 0;
      m = stbi__get_marker(z);
      while (m == STBI__MARKER_none) {
         // some files have extra padding after their blocks, so ok, we'll scan
         if (stbi__at_eof(z->s)) return stbi__err("no stbi__SOF", "Corrupt JPEG");
         m = stbi__get_marker(z);
      }
   }
   if (!stbi__process_frame_header(z, scan)) return 0;
   return 1;
}

static int decode_jpeg_image(stbi__jpeg *j)
{
   int m;
   j->restart_interval = 0;
   if (!decode_jpeg_header(j, SCAN_load)) return 0;
   m = stbi__get_marker(j);
   while (!stbi__EOI(m)) {
      if (stbi__SOS(m)) {
         if (!stbi__process_scan_header(j)) return 0;
         if (!stbi__parse_entropy_coded_data(j)) return 0;
         if (j->marker == STBI__MARKER_none ) {
            // handle 0s at the end of image data from IP Kamera 9060
            while (!stbi__at_eof(j->s)) {
               int x = stbi__get8(j->s);
               if (x == 255) {
                  j->marker = stbi__get8(j->s);
                  break;
               } else if (x != 0) {
                  return 0;
               }
            }
            // if we reach eof without hitting a marker, stbi__get_marker() below will fail and we'll eventually return 0
         }
      } else {
         if (!stbi__process_marker(j, m)) return 0;
      }
      m = stbi__get_marker(j);
   }
   return 1;
}

// static jfif-centered resampling (across block boundaries)

typedef stbi_uc *(*resample_row_func)(stbi_uc *out, stbi_uc *in0, stbi_uc *in1,
                                    int w, int hs);

#define stbi__div4(x) ((stbi_uc) ((x) >> 2))

static stbi_uc *resample_row_1(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
{
   STBI_NOTUSED(out);
   STBI_NOTUSED(in_far);
   STBI_NOTUSED(w);
   STBI_NOTUSED(hs);
   return in_near;
}

static stbi_uc* stbi__resample_row_v_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
{
   // need to generate two samples vertically for every one in input
   int i;
   STBI_NOTUSED(hs);
   for (i=0; i < w; ++i)
      out[i] = stbi__div4(3*in_near[i] + in_far[i] + 2);
   return out;
}

static stbi_uc*  stbi__resample_row_h_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
{
   // need to generate two samples horizontally for every one in input
   int i;
   stbi_uc *input = in_near;

   if (w == 1) {
      // if only one sample, can't do any interpolation
      out[0] = out[1] = input[0];
      return out;
   }

   out[0] = input[0];
   out[1] = stbi__div4(input[0]*3 + input[1] + 2);
   for (i=1; i < w-1; ++i) {
      int n = 3*input[i]+2;
      out[i*2+0] = stbi__div4(n+input[i-1]);
      out[i*2+1] = stbi__div4(n+input[i+1]);
   }
   out[i*2+0] = stbi__div4(input[w-2]*3 + input[w-1] + 2);
   out[i*2+1] = input[w-1];

   STBI_NOTUSED(in_far);
   STBI_NOTUSED(hs);

   return out;
}

#define stbi__div16(x) ((stbi_uc) ((x) >> 4))

static stbi_uc *stbi__resample_row_hv_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
{
   // need to generate 2x2 samples for every one in input
   int i,t0,t1;
   if (w == 1) {
      out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
      return out;
   }

   t1 = 3*in_near[0] + in_far[0];
   out[0] = stbi__div4(t1+2);
   for (i=1; i < w; ++i) {
      t0 = t1;
      t1 = 3*in_near[i]+in_far[i];
      out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
      out[i*2  ] = stbi__div16(3*t1 + t0 + 8);
   }
   out[w*2-1] = stbi__div4(t1+2);

   STBI_NOTUSED(hs);

   return out;
}
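
/* Illustrative sketch, not part of stb_image: a standalone copy of the 2x2
   upsampling arithmetic above, so the 3:1 vertical blend and the 3:1
   horizontal blend can be traced on a tiny row. The name sketch_upsample_hv2
   and the parameter names are hypothetical; out must hold 2*w bytes.

```c
#include <assert.h>

// Hypothetical standalone version of the 2x2 filter for one output row.
// near_row is the closer input row, far_row the farther one.
static void sketch_upsample_hv2(unsigned char *out, const unsigned char *near_row,
                                const unsigned char *far_row, int w)
{
   int i, t0, t1;
   assert(w >= 1);
   if (w == 1) {
      // a single column: only the vertical blend applies
      out[0] = out[1] = (unsigned char) ((3*near_row[0] + far_row[0] + 2) >> 2);
      return;
   }
   // t1 is the vertically blended column, still scaled by 4
   t1 = 3*near_row[0] + far_row[0];
   out[0] = (unsigned char) ((t1 + 2) >> 2);
   for (i = 1; i < w; ++i) {
      t0 = t1;
      t1 = 3*near_row[i] + far_row[i];
      // horizontal blend of two scaled columns, total scale 16
      out[i*2-1] = (unsigned char) ((3*t0 + t1 + 8) >> 4);
      out[i*2  ] = (unsigned char) ((3*t1 + t0 + 8) >> 4);
   }
   out[w*2-1] = (unsigned char) ((t1 + 2) >> 2);
}
```

   With both input rows equal to {0, 16}, the four outputs are 0, 4, 12, 16:
   the edge samples pass through and the two middle samples sit one quarter
   and three quarters of the way between them.
*/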

#ifdef STBI_SSE2
static stbi_uc *stbi__resample_row_hv_2_sse2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
{
   // need to generate 2x2 samples for every one in input
   int i=0,t0,t1;
   __m128i bias = _mm_set1_epi16(8);

   if (w == 1) {
      out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
      return out;
   }

   t1 = 3*in_near[0] + in_far[0];
   // process groups of 8 pixels for as long as we can.
   // note we can't handle the last pixel in a row in this loop
   // because we need to handle the filter boundary conditions.
   for (; i < ((w-1) & ~7); i += 8) {
      // load and perform the vertical filtering pass
      // this uses 3*x + y = 4*x + (y - x)
      __m128i zero  = _mm_setzero_si128();
      __m128i farb  = _mm_loadl_epi64((__m128i *) (in_far + i));
      __m128i nearb = _mm_loadl_epi64((__m128i *) (in_near + i));
      __m128i farw  = _mm_unpacklo_epi8(farb, zero);
      __m128i nearw = _mm_unpacklo_epi8(nearb, zero);
      __m128i diff  = _mm_sub_epi16(farw, nearw);
      __m128i nears = _mm_slli_epi16(nearw, 2);
      __m128i curr  = _mm_add_epi16(nears, diff); // current row

      // horizontal filter works the same based on shifted versions of the current
      // row. "prev" is current row shifted right by 1 pixel; we need to
      // insert the previous pixel value (from t1).
      // "next" is current row shifted left by 1 pixel, with first pixel
      // of next block of 8 pixels added in.
      __m128i prv0 = _mm_slli_si128(curr, 2);
      __m128i nxt0 = _mm_srli_si128(curr, 2);
      __m128i prev = _mm_insert_epi16(prv0, t1, 0);
      __m128i next = _mm_insert_epi16(nxt0, 3*in_near[i+8] + in_far[i+8], 7);

      // horizontal filter, polyphase implementation since it's convenient:
      // even pixels = 3*cur + prev = cur*4 + (prev - cur)
      // odd  pixels = 3*cur + next = cur*4 + (next - cur)
      // note the shared term.
      __m128i curs = _mm_slli_epi16(curr, 2);
      __m128i prvd = _mm_sub_epi16(prev, curr);
      __m128i nxtd = _mm_sub_epi16(next, curr);
      __m128i curb = _mm_add_epi16(curs, bias);
      __m128i even = _mm_add_epi16(prvd, curb);
      __m128i odd  = _mm_add_epi16(nxtd, curb);

      // interleave even and odd pixels, then undo scaling.
      __m128i int0 = _mm_unpacklo_epi16(even, odd);
      __m128i int1 = _mm_unpackhi_epi16(even, odd);
      __m128i de0  = _mm_srli_epi16(int0, 4);
      __m128i de1  = _mm_srli_epi16(int1, 4);

      // pack and write output
      __m128i outv = _mm_packus_epi16(de0, de1);
      _mm_storeu_si128((__m128i *) (out + i*2), outv);

      // "previous" value for next iter
      t1 = 3*in_near[i+7] + in_far[i+7];
   }

   t0 = t1;
   t1 = 3*in_near[i] + in_far[i];
   out[i*2] = stbi__div16(3*t1 + t0 + 8);

   for (++i; i < w; ++i) {
      t0 = t1;
      t1 = 3*in_near[i]+in_far[i];
      out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
      out[i*2  ] = stbi__div16(3*t1 + t0 + 8);
   }
   out[w*2-1] = stbi__div4(t1+2);

   STBI_NOTUSED(hs);

   return out;
}
#endif

static stbi_uc *stbi__resample_row_generic(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
{
   // resample with nearest-neighbor
   int i,j;
   STBI_NOTUSED(in_far);
   for (i=0; i < w; ++i)
      for (j=0; j < hs; ++j)
         out[i*hs+j] = in_near[i];
   return out;
}

#ifdef STBI_JPEG_OLD
// this is the same YCbCr-to-RGB calculation that stb_image has used
// historically before the algorithm changes in 1.49
#define float2fixed(x)  ((int) ((x) * 65536 + 0.5))
static void stbi__YCbCr_to_RGB_row(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step)
{
   int i;
   for (i=0; i < count; ++i) {
      int y_fixed = (y[i] << 16) + 32768; // rounding
      int r,g,b;
      int cr = pcr[i] - 128;
      int cb = pcb[i] - 128;
      r = y_fixed + cr*float2fixed(1.40200f);
      g = y_fixed - cr*float2fixed(0.71414f) - cb*float2fixed(0.34414f);
      b = y_fixed                            + cb*float2fixed(1.77200f);
      r >>= 16;
      g >>= 16;
      b >>= 16;
      if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
      if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
      if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
      out[0] = (stbi_uc)r;
      out[1] = (stbi_uc)g;
      out[2] = (stbi_uc)b;
      out[3] = 255;
      out += step;
   }
}
#else
// this is a reduced-precision calculation of YCbCr-to-RGB introduced
// to make sure the code produces the same results in both SIMD and scalar
#define float2fixed(x)  (((int) ((x) * 4096.0f + 0.5f)) << 8)
static void stbi__YCbCr_to_RGB_row(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step)
{
   int i;
   for (i=0; i < count; ++i) {
      int y_fixed = (y[i] << 20) + (1<<19); // rounding
      int r,g,b;
      int cr = pcr[i] - 128;
      int cb = pcb[i] - 128;
      r = y_fixed +  cr* float2fixed(1.40200f);
      g = y_fixed + (cr*-float2fixed(0.71414f)) + ((cb*-float2fixed(0.34414f)) & 0xffff0000);
      b = y_fixed                               +   cb* float2fixed(1.77200f);
      r >>= 20;
      g >>= 20;
      b >>= 20;
      if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
      if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
      if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
      out[0] = (stbi_uc)r;
      out[1] = (stbi_uc)g;
      out[2] = (stbi_uc)b;
      out[3] = 255;
      out += step;
   }
}
#endif
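
/* Illustrative sketch, not part of stb_image: the reduced-precision
   YCbCr-to-RGB arithmetic above, applied to a single pixel. The names
   SKETCH_F2F, sketch_clamp and sketch_ycc_to_rgb are hypothetical. Like the
   code above, it assumes that right-shifting a negative int is an
   arithmetic shift, which the C standard leaves implementation-defined.

```c
#include <assert.h>

// 12-bit coefficient, pre-shifted so products line up with a 20-bit scale
#define SKETCH_F2F(x)  (((int) ((x) * 4096.0f + 0.5f)) << 8)

static unsigned char sketch_clamp(int v)
{
   return (unsigned char) (v < 0 ? 0 : (v > 255 ? 255 : v));
}

// Convert one pixel; y, cb and cr are expected in the range 0..255.
static void sketch_ycc_to_rgb(unsigned char rgb[3], int y, int cb, int cr)
{
   int y_fixed = (y << 20) + (1 << 19);   // 20 fractional bits plus rounding
   int r, g, b;
   assert(y >= 0 && y <= 255 && cb >= 0 && cb <= 255 && cr >= 0 && cr <= 255);
   cr -= 128;
   cb -= 128;
   r = y_fixed + cr * SKETCH_F2F(1.40200f);
   // the mask drops the low 16 bits of the cb term, as in the code above
   g = y_fixed + cr * -SKETCH_F2F(0.71414f)
               + (int) ((unsigned) (cb * -SKETCH_F2F(0.34414f)) & 0xffff0000u);
   b = y_fixed + cb * SKETCH_F2F(1.77200f);
   rgb[0] = sketch_clamp(r >> 20);
   rgb[1] = sketch_clamp(g >> 20);
   rgb[2] = sketch_clamp(b >> 20);
}
```

   Neutral chroma (cb = cr = 128) leaves all three channels equal to y, and
   extreme cr values push the red channel into the clamp at 0 or 255.
*/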

#ifdef STBI_SSE2
static void stbi__YCbCr_to_RGB_sse2(stbi_uc *out, stbi_uc const *y, stbi_uc const *pcb, stbi_uc const *pcr, int count, int step)
{
   int i = 0;

   // step == 3 is pretty ugly on the final interleave, and i'm not convinced
   // it's useful in practice (you wouldn't use it for textures, for example).
   // so just accelerate step == 4 case.
   //
   // note: unlike the IDCT, this isn't bit-identical to the integer version.
   if (step == 4) {
      // this is a fairly straightforward implementation and not super-optimized.
      __m128i signflip  = _mm_set1_epi8(-0x80);
      __m128i cr_const0 = _mm_set1_epi16(   (short) ( 1.40200f*4096.0f+0.5f));
      __m128i cr_const1 = _mm_set1_epi16( - (short) ( 0.71414f*4096.0f+0.5f));
      __m128i cb_const0 = _mm_set1_epi16( - (short) ( 0.34414f*4096.0f+0.5f));
      __m128i cb_const1 = _mm_set1_epi16(   (short) ( 1.77200f*4096.0f+0.5f));
      __m128i y_bias = _mm_set1_epi8((char) 128);
      __m128i xw = _mm_set1_epi16(255); // alpha channel

      for (; i+7 < count; i += 8) {
         // load
         __m128i y_bytes = _mm_loadl_epi64((__m128i *) (y+i));
         __m128i cr_bytes = _mm_loadl_epi64((__m128i *) (pcr+i));
         __m128i cb_bytes = _mm_loadl_epi64((__m128i *) (pcb+i));
         __m128i cr_biased = _mm_xor_si128(cr_bytes, signflip); // -128
         __m128i cb_biased = _mm_xor_si128(cb_bytes, signflip); // -128

         // unpack to short (and left-shift cr, cb by 8)
         __m128i yw  = _mm_unpacklo_epi8(y_bias, y_bytes);
         __m128i crw = _mm_unpacklo_epi8(_mm_setzero_si128(), cr_biased);
         __m128i cbw = _mm_unpacklo_epi8(_mm_setzero_si128(), cb_biased);

         // color transform
         __m128i yws = _mm_srli_epi16(yw, 4);
         __m128i cr0 = _mm_mulhi_epi16(cr_const0, crw);
         __m128i cb0 = _mm_mulhi_epi16(cb_const0, cbw);
         __m128i cb1 = _mm_mulhi_epi16(cbw, cb_const1);
         __m128i cr1 = _mm_mulhi_epi16(crw, cr_const1);
         __m128i rws = _mm_add_epi16(cr0, yws);
         __m128i gwt = _mm_add_epi16(cb0, yws);
         __m128i bws = _mm_add_epi16(yws, cb1);
         __m128i gws = _mm_add_epi16(gwt, cr1);

         // descale
         __m128i rw = _mm_srai_epi16(rws, 4);
         __m128i bw = _mm_srai_epi16(bws, 4);
         __m128i gw = _mm_srai_epi16(gws, 4);

         // back to byte, set up for transpose
         __m128i brb = _mm_packus_epi16(rw, bw);
         __m128i gxb = _mm_packus_epi16(gw, xw);

         // transpose to interleave channels
         __m128i t0 = _mm_unpacklo_epi8(brb, gxb);
         __m128i t1 = _mm_unpackhi_epi8(brb, gxb);
         __m128i o0 = _mm_unpacklo_epi16(t0, t1);
         __m128i o1 = _mm_unpackhi_epi16(t0, t1);

         // store
         _mm_storeu_si128((__m128i *) (out + 0), o0);
         _mm_storeu_si128((__m128i *) (out + 16), o1);
         out += 32;
      }
   }

   for (; i < count; ++i) {
      int y_fixed = (y[i] << 20) + (1<<19); // rounding
      int r,g,b;
      int cr = pcr[i] - 128;
      int cb = pcb[i] - 128;
      r = y_fixed + cr* float2fixed(1.40200f);
      g = y_fixed + cr*-float2fixed(0.71414f) + ((cb*-float2fixed(0.34414f)) & 0xffff0000);
      b = y_fixed                             +   cb* float2fixed(1.77200f);
      r >>= 20;
      g >>= 20;
      b >>= 20;
      if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
      if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
      if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
      out[0] = (stbi_uc)r;
      out[1] = (stbi_uc)g;
      out[2] = (stbi_uc)b;
      out[3] = 255;
      out += step;
   }
}
#endif
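
/* Illustrative sketch, not part of stb_image. The scalar tail loop above
   converts one YCbCr sample to RGB in fixed point with 20 fractional bits.
   The block below restates that arithmetic for a single pixel under a few
   assumptions: SKETCH_FIX is a hypothetical stand-in for float2fixed (whose
   definition lies outside this section), the 0xffff0000 mask on the cb term
   of green is left out, and clamping happens before the shift so that no
   negative value is ever shifted right.

```c
#include <assert.h>
#include <stddef.h>

// 20 fractional bits, mirroring the shifts used by the scalar loop.
#define SKETCH_FRAC_BITS 20
// Hypothetical helper: float constant -> fixed point with SKETCH_FRAC_BITS bits.
#define SKETCH_FIX(x) ((int) ((x) * (float) (1 << SKETCH_FRAC_BITS) + 0.5f))

// Clamp to 0..255 in the fixed-point domain, then drop the fractional bits.
static unsigned char sketch_clamp_descale(int v)
{
   if (v < 0) return 0;
   v >>= SKETCH_FRAC_BITS;
   return (unsigned char) (v > 255 ? 255 : v);
}

static void sketch_ycbcr_to_rgb(unsigned char y, unsigned char cb, unsigned char cr, unsigned char rgb[3])
{
   int y_fixed = ((int) y << SKETCH_FRAC_BITS) + (1 << (SKETCH_FRAC_BITS - 1)); // +0.5 for rounding
   int cbc = (int) cb - 128; // chroma is stored with a +128 bias
   int crc = (int) cr - 128;
   assert(rgb != NULL);
   rgb[0] = sketch_clamp_descale(y_fixed + crc * SKETCH_FIX(1.40200f));
   rgb[1] = sketch_clamp_descale(y_fixed - crc * SKETCH_FIX(0.71414f) - cbc * SKETCH_FIX(0.34414f));
   rgb[2] = sketch_clamp_descale(y_fixed + cbc * SKETCH_FIX(1.77200f));
}
```

   With both chroma values at 128 the offsets vanish, so
   sketch_ycbcr_to_rgb(128, 128, 128, rgb) leaves rgb as {128, 128, 128}.
*/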

// set up the kernels
static void stbi__setup_jpeg(stbi__jpeg *j)
{
   j->idct_block_kernel = stbi__idct_block;
   j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_row;
   j->resample_row_hv_2_kernel = stbi__resample_row_hv_2;

#ifdef STBI_SSE2
   if (stbi__sse2_available()) {
      j->idct_block_kernel = stbi__idct_sse2;
      #ifndef STBI_JPEG_OLD
      j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_sse2;
      #endif
      j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_sse2;
   }
#endif
}

// clean up the temporary component buffers
static void stbi__cleanup_jpeg(stbi__jpeg *j)
{
   int i;
   for (i=0; i < j->s->img_n; ++i) {
      if (j->img_comp[i].raw_data) {
         STBI_FREE(j->img_comp[i].raw_data);
         j->img_comp[i].raw_data = NULL;
         j->img_comp[i].data = NULL;
      }
      if (j->img_comp[i].linebuf) {
         STBI_FREE(j->img_comp[i].linebuf);
         j->img_comp[i].linebuf = NULL;
      }
   }
}

typedef struct
{
   resample_row_func resample;
   stbi_uc *line0,*line1;
   int hs,vs;   // expansion factor in each axis
   int w_lores; // horizontal pixels pre-expansion 
   int ystep;   // how far through vertical expansion we are
   int ypos;    // which pre-expansion row we're on
} stbi__resample;

static stbi_uc *load_jpeg_image(stbi__jpeg *z, int *out_x, int *out_y, int *comp, int req_comp)
{
   int n, decode_n;
   z->s->img_n = 0; // make stbi__cleanup_jpeg safe

   // validate req_comp
   if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");

   // load a jpeg image from whichever source
   if (!decode_jpeg_image(z)) { stbi__cleanup_jpeg(z); return NULL; }

   // determine actual number of components to generate
   n = req_comp ? req_comp : z->s->img_n;

   if (z->s->img_n == 3 && n < 3)
      decode_n = 1;
   else
      decode_n = z->s->img_n;

   // resample and color-convert
   {
      int k;
      unsigned int i,j;
      stbi_uc *output;
      stbi_uc *coutput[4];

      stbi__resample res_comp[4];

      for (k=0; k < decode_n; ++k) {
         stbi__resample *r = &res_comp[k];

         // allocate line buffer big enough for upsampling off the edges
         // with upsample factor of 4
         z->img_comp[k].linebuf = (stbi_uc *) stbi__malloc(z->s->img_x + 3);
         if (!z->img_comp[k].linebuf) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }

         r->hs      = z->img_h_max / z->img_comp[k].h;
         r->vs      = z->img_v_max / z->img_comp[k].v;
         r->ystep   = r->vs >> 1;
         r->w_lores = (z->s->img_x + r->hs-1) / r->hs;
         r->ypos    = 0;
         r->line0   = r->line1 = z->img_comp[k].data;

         if      (r->hs == 1 && r->vs == 1) r->resample = resample_row_1;
         else if (r->hs == 1 && r->vs == 2) r->resample = stbi__resample_row_v_2;
         else if (r->hs == 2 && r->vs == 1) r->resample = stbi__resample_row_h_2;
         else if (r->hs == 2 && r->vs == 2) r->resample = z->resample_row_hv_2_kernel;
         else                               r->resample = stbi__resample_row_generic;
      }

      // can't error after this, so this is safe
      output = (stbi_uc *) stbi__malloc(n * z->s->img_x * z->s->img_y + 1);
      if (!output) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }

      // now go ahead and resample
      for (j=0; j < z->s->img_y; ++j) {
         stbi_uc *out = output + n * z->s->img_x * j;
         for (k=0; k < decode_n; ++k) {
            stbi__resample *r = &res_comp[k];
            int y_bot = r->ystep >= (r->vs >> 1);
            coutput[k] = r->resample(z->img_comp[k].linebuf,
                                     y_bot ? r->line1 : r->line0,
                                     y_bot ? r->line0 : r->line1,
                                     r->w_lores, r->hs);
            if (++r->ystep >= r->vs) {
               r->ystep = 0;
               r->line0 = r->line1;
               if (++r->ypos < z->img_comp[k].y)
                  r->line1 += z->img_comp[k].w2;
            }
         }
         if (n >= 3) {
            stbi_uc *y = coutput[0];
            if (z->s->img_n == 3) {
               z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n);
            } else
               for (i=0; i < z->s->img_x; ++i) {
                  out[0] = out[1] = out[2] = y[i];
                  out[3] = 255; // not used if n==3
                  out += n;
               }
         } else {
            stbi_uc *y = coutput[0];
            if (n == 1)
               for (i=0; i < z->s->img_x; ++i) out[i] = y[i];
            else
               for (i=0; i < z->s->img_x; ++i) *out++ = y[i], *out++ = 255;
         }
      }
      stbi__cleanup_jpeg(z);
      *out_x = z->s->img_x;
      *out_y = z->s->img_y;
      if (comp) *comp  = z->s->img_n; // report original components, not output
      return output;
   }
}

static unsigned char *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
{
   stbi__jpeg j;
   j.s = s;
   stbi__setup_jpeg(&j);
   return load_jpeg_image(&j, x,y,comp,req_comp);
}

static int stbi__jpeg_test(stbi__context *s)
{
   int r;
   stbi__jpeg j;
   j.s = s;
   stbi__setup_jpeg(&j);
   r = decode_jpeg_header(&j, SCAN_type);
   stbi__rewind(s);
   return r;
}

static int stbi__jpeg_info_raw(stbi__jpeg *j, int *x, int *y, int *comp)
{
   if (!decode_jpeg_header(j, SCAN_header)) {
      stbi__rewind( j->s );
      return 0;
   }
   if (x) *x = j->s->img_x;
   if (y) *y = j->s->img_y;
   if (comp) *comp = j->s->img_n;
   return 1;
}

static int stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp)
{
   stbi__jpeg j;
   j.s = s;
   return stbi__jpeg_info_raw(&j, x, y, comp);
}

// public domain zlib decode    v0.2  Sean Barrett 2006-11-18
//    simple implementation
//      - all input must be provided in an upfront buffer
//      - all output is written to a single output buffer (can malloc/realloc)
//    performance
//      - fast huffman

// fast-way is faster to check than jpeg huffman, but slow way is slower
#define STBI__ZFAST_BITS  9 // accelerate all cases in default tables
#define STBI__ZFAST_MASK  ((1 << STBI__ZFAST_BITS) - 1)

// zlib-style huffman encoding
// (jpeg packs from left, zlib from right, so can't share code)
typedef struct
{
   stbi__uint16 fast[1 << STBI__ZFAST_BITS];
   stbi__uint16 firstcode[16];
   int maxcode[17];
   stbi__uint16 firstsymbol[16];
   stbi_uc  size[288];
   stbi__uint16 value[288]; 
} stbi__zhuffman;

stbi_inline static int stbi__bitreverse16(int n)
{
  n = ((n & 0xAAAA) >>  1) | ((n & 0x5555) << 1);
  n = ((n & 0xCCCC) >>  2) | ((n & 0x3333) << 2);
  n = ((n & 0xF0F0) >>  4) | ((n & 0x0F0F) << 4);
  n = ((n & 0xFF00) >>  8) | ((n & 0x00FF) << 8);
  return n;
}

stbi_inline static int stbi__bit_reverse(int v, int bits)
{
   STBI_ASSERT(bits <= 16);
   // to bit reverse n bits, reverse 16 and shift
   // e.g. 11 bits, bit reverse and shift away 5
   return stbi__bitreverse16(v) >> (16-bits);
}
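
/* Illustrative sketch, not part of stb_image. DEFLATE packs each Huffman
   code most-significant-bit first into a stream that is otherwise consumed
   from the least significant bit, so a code pulled off the stream LSB-first
   comes out bit-reversed. The block below repeats the reverse-16-then-shift
   trick from the two functions above, using hypothetical sketch_ names and
   unsigned arithmetic.

```c
#include <assert.h>

// Reverse all 16 low bits by swapping progressively larger groups.
static unsigned sketch_bitreverse16(unsigned n)
{
   n = ((n & 0xAAAA) >> 1) | ((n & 0x5555) << 1); // swap adjacent bits
   n = ((n & 0xCCCC) >> 2) | ((n & 0x3333) << 2); // swap bit pairs
   n = ((n & 0xF0F0) >> 4) | ((n & 0x0F0F) << 4); // swap nibbles
   n = ((n & 0xFF00) >> 8) | ((n & 0x00FF) << 8); // swap bytes
   return n;
}

// Reverse only the low `bits` bits: reverse 16, then shift the padding away.
static unsigned sketch_bit_reverse(unsigned v, int bits)
{
   assert(bits >= 1 && bits <= 16);
   return sketch_bitreverse16(v & 0xFFFF) >> (16 - bits);
}
```

   For instance, the 3-bit value 110 reverses to 011, so
   sketch_bit_reverse(6, 3) yields 3.
*/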

static int stbi__zbuild_huffman(stbi__zhuffman *z, stbi_uc *sizelist, int num)
{
   int i,k=0;
   int code, next_code[16], sizes[17];

   // DEFLATE spec for generating codes
   memset(sizes, 0, sizeof(sizes));
   memset(z->fast, 0, sizeof(z->fast));
   for (i=0; i < num; ++i) 
      ++sizes[sizelist[i]];
   sizes[0] = 0;
   for (i=1; i < 16; ++i)
      STBI_ASSERT(sizes[i] <= (1 << i));
   code = 0;
   for (i=1; i < 16; ++i) {
      next_code[i] = code;
      z->firstcode[i] = (stbi__uint16) code;
      z->firstsymbol[i] = (stbi__uint16) k;
      code = (code + sizes[i]);
      if (sizes[i])
         if (code-1 >= (1 << i)) return stbi__err("bad codelengths","Corrupt PNG");
      z->maxcode[i] = code << (16-i); // preshift for inner loop
      code <<= 1;
      k += sizes[i];
   }
   z->maxcode[16] = 0x10000; // sentinel
   for (i=0; i < num; ++i) {
      int s = sizelist[i];
      if (s) {
         int c = next_code[s] - z->firstcode[s] + z->firstsymbol[s];
         stbi__uint16 fastv = (stbi__uint16) ((s << 9) | i);
         z->size [c] = (stbi_uc     ) s;
         z->value[c] = (stbi__uint16) i;
         if (s <= STBI__ZFAST_BITS) {
            int k = stbi__bit_reverse(next_code[s],s);
            while (k < (1 << STBI__ZFAST_BITS)) {
               z->fast[k] = fastv;
               k += (1 << s);
            }
         }
         ++next_code[s];
      }
   }
   return 1;
}
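
/* Illustrative sketch, not part of stb_image. The function above follows the
   canonical-code construction from the DEFLATE specification (RFC 1951,
   section 3.2.2): count the codes of each length, derive the first code of
   each length, then hand out consecutive codes in symbol order. The block
   below isolates just that step; sketch_assign_codes and SKETCH_MAX_BITS are
   hypothetical names, and none of the fast-table or maxcode bookkeeping is
   reproduced.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_MAX_BITS 15

// Fill codes[i] with the canonical Huffman code for symbol i (0 if unused).
static void sketch_assign_codes(const unsigned char *lengths, int num, unsigned *codes)
{
   unsigned bl_count[SKETCH_MAX_BITS + 1];
   unsigned next_code[SKETCH_MAX_BITS + 1];
   unsigned code = 0;
   int i, bits;

   // step 1: how many symbols use each code length
   memset(bl_count, 0, sizeof(bl_count));
   for (i = 0; i < num; ++i) {
      assert(lengths[i] <= SKETCH_MAX_BITS);
      ++bl_count[lengths[i]];
   }
   bl_count[0] = 0; // length 0 means "symbol not present"

   // step 2: smallest code value for each length
   next_code[0] = 0;
   for (bits = 1; bits <= SKETCH_MAX_BITS; ++bits) {
      code = (code + bl_count[bits - 1]) << 1;
      next_code[bits] = code;
   }

   // step 3: consecutive codes within each length, in symbol order
   for (i = 0; i < num; ++i)
      codes[i] = lengths[i] ? next_code[lengths[i]]++ : 0;
}
```

   With the lengths 3,3,3,3,3,2,4,4 used as the example in RFC 1951, the
   symbols receive the codes 010, 011, 100, 101, 110, 00, 1110 and 1111.
*/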

// zlib-from-memory implementation for PNG reading
//    because PNG allows splitting the zlib stream arbitrarily,
//    and it's annoying structurally to have PNG call ZLIB call PNG,
//    we require PNG read all the IDATs and combine them into a single
//    memory buffer

typedef struct
{
   stbi_uc *zbuffer, *zbuffer_end;
   int num_bits;
   stbi__uint32 code_buffer;

   char *zout;
   char *zout_start;
   char *zout_end;
   int   z_expandable;

   stbi__zhuffman z_length, z_distance;
} stbi__zbuf;

stbi_inline static stbi_uc stbi__zget8(stbi__zbuf *z)
{
   if (z->zbuffer >= z->zbuffer_end) return 0;
   return *z->zbuffer++;
}

static void stbi__fill_bits(stbi__zbuf *z)
{
   do {
      STBI_ASSERT(z->code_buffer < (1U << z->num_bits));
      z->code_buffer |= stbi__zget8(z) << z->num_bits;
      z->num_bits += 8;
   } while (z->num_bits <= 24);
}

stbi_inline static unsigned int stbi__zreceive(stbi__zbuf *z, int n)
{
   unsigned int k;
   if (z->num_bits < n) stbi__fill_bits(z);
   k = z->code_buffer & ((1 << n) - 1);
   z->code_buffer >>= n;
   z->num_bits -= n;
   return k;   
}
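
/* Illustrative sketch, not part of stb_image. The two functions above keep a
   small bit buffer that is refilled one byte at a time and consumed from the
   least significant end. The block below shows the same idea in a
   stand-alone reader; sketch_bitreader and sketch_receive are hypothetical
   names, and the refill policy (top up only as far as the request needs) is
   a simplification of the fill-to-24-bits loop above.

```c
#include <assert.h>

typedef struct
{
   const unsigned char *p, *end; // unread input
   unsigned int bits;            // buffered bits, next bit in the LSB
   int count;                    // number of valid bits in `bits`
} sketch_bitreader;

// Return the next n bits (1..16) of the stream, LSB first.
static unsigned sketch_receive(sketch_bitreader *r, int n)
{
   unsigned k;
   assert(n >= 1 && n <= 16);
   while (r->count < n) {
      // past the end of input we feed zero bits, as stbi__zget8 does
      unsigned byte = (r->p < r->end) ? *r->p++ : 0;
      r->bits |= byte << r->count;
      r->count += 8;
   }
   k = r->bits & ((1u << n) - 1);
   r->bits >>= n;
   r->count -= n;
   return k;
}
```

   Feeding it the bytes 0xB5, 0x01 and asking for 3, 5 and then 4 bits
   returns 5, 22 and 1.
*/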

static int stbi__zhuffman_decode_slowpath(stbi__zbuf *a, stbi__zhuffman *z)
{
   int b,s,k;
   // not resolved by fast table, so compute it the slow way
   // use jpeg approach, which requires MSbits at top
   k = stbi__bit_reverse(a->code_buffer, 16);
   for (s=STBI__ZFAST_BITS+1; ; ++s)
      if (k < z->maxcode[s])
         break;
   if (s == 16) return -1; // invalid code!
   // code size is s, so:
   b = (k >> (16-s)) - z->firstcode[s] + z->firstsymbol[s];
   STBI_ASSERT(z->size[b] == s);
   a->code_buffer >>= s;
   a->num_bits -= s;
   return z->value[b];
}

stbi_inline static int stbi__zhuffman_decode(stbi__zbuf *a, stbi__zhuffman *z)
{
   int b,s;
   if (a->num_bits < 16) stbi__fill_bits(a);
   b = z->fast[a->code_buffer & STBI__ZFAST_MASK];
   if (b) {
      s = b >> 9;
      a->code_buffer >>= s;
      a->num_bits -= s;
      return b & 511;
   }
   return stbi__zhuffman_decode_slowpath(a, z);
}

static int stbi__zexpand(stbi__zbuf *z, char *zout, int n)  // need to make room for n bytes
{
   char *q;
   int cur, limit;
   z->zout = zout;
   if (!z->z_expandable) return stbi__err("output buffer limit","Corrupt PNG");
   cur   = (int) (z->zout     - z->zout_start);
   limit = (int) (z->zout_end - z->zout_start);
   while (cur + n > limit)
      limit *= 2;
   q = (char *) STBI_REALLOC(z->zout_start, limit);
   if (q == NULL) return stbi__err("outofmem", "Out of memory");
   z->zout_start = q;
   z->zout       = q + cur;
   z->zout_end   = q + limit;
   return 1;
}

static int stbi__zlength_base[31] = {
   3,4,5,6,7,8,9,10,11,13,
   15,17,19,23,27,31,35,43,51,59,
   67,83,99,115,131,163,195,227,258,0,0 };

static int stbi__zlength_extra[31]= 
{ 0,0,0,0,0,0,0,0,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,0,0,0 };

static int stbi__zdist_base[32] = { 1,2,3,4,5,7,9,13,17,25,33,49,65,97,129,193,
257,385,513,769,1025,1537,2049,3073,4097,6145,8193,12289,16385,24577,0,0};

static int stbi__zdist_extra[32] =
{ 0,0,0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13};

static int stbi__parse_huffman_block(stbi__zbuf *a)
{
   char *zout = a->zout;
   for(;;) {
      int z = stbi__zhuffman_decode(a, &a->z_length);
      if (z < 256) {
         if (z < 0) return stbi__err("bad huffman code","Corrupt PNG"); // error in huffman codes
         if (zout >= a->zout_end) {
            if (!stbi__zexpand(a, zout, 1)) return 0;
            zout = a->zout;
         }
         *zout++ = (char) z;
      } else {
         stbi_uc *p;
         int len,dist;
         if (z == 256) {
            a->zout = zout;
            return 1;
         }
         z -= 257;
         len = stbi__zlength_base[z];
         if (stbi__zlength_extra[z]) len += stbi__zreceive(a, stbi__zlength_extra[z]);
         z = stbi__zhuffman_decode(a, &a->z_distance);
         if (z < 0) return stbi__err("bad huffman code","Corrupt PNG");
         dist = stbi__zdist_base[z];
         if (stbi__zdist_extra[z]) dist += stbi__zreceive(a, stbi__zdist_extra[z]);
         if (zout - a->zout_start < dist) return stbi__err("bad dist","Corrupt PNG");
         if (zout + len > a->zout_end) {
            if (!stbi__zexpand(a, zout, len)) return 0;
            zout = a->zout;
         }
         p = (stbi_uc *) (zout - dist);
         if (dist == 1) { // run of one byte; common in images.
            stbi_uc v = *p;
            do *zout++ = v; while (--len);
         } else {
            do *zout++ = *p++; while (--len);
         }
      }
   }
}
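
/* Illustrative sketch, not part of stb_image. The match-copy at the end of
   the loop above moves bytes one at a time because an LZ77 match may overlap
   its own output: with a distance smaller than the length, bytes written
   early in the copy are read again later in the same copy. The block below
   isolates that behaviour; sketch_lz77_copy is a hypothetical helper working
   on array indices rather than the pointers used above.

```c
#include <assert.h>
#include <stddef.h>

// Append `len` bytes that repeat the data `dist` bytes behind position `pos`.
// Returns the new write position.
static size_t sketch_lz77_copy(unsigned char *out, size_t pos, size_t dist, size_t len)
{
   assert(dist >= 1 && dist <= pos); // a match cannot reach before the start
   while (len--) {
      out[pos] = out[pos - dist]; // may read a byte this same loop wrote
      ++pos;
   }
   return pos;
}
```

   Starting from the two bytes "ab", a match with distance 2 and length 5
   extends the output to "abababa"; a distance of 1 repeats a single byte,
   which is the run case the decoder above special-cases.
*/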

static int stbi__compute_huffman_codes(stbi__zbuf *a)
{
   static stbi_uc length_dezigzag[19] = { 16,17,18,0,8,7,9,6,10,5,11,4,12,3,13,2,14,1,15 };
   stbi__zhuffman z_codelength;
   stbi_uc lencodes[286+32+137];//padding for maximum single op
   stbi_uc codelength_sizes[19];
   int i,n;

   int hlit  = stbi__zreceive(a,5) + 257;
   int hdist = stbi__zreceive(a,5) + 1;
   int hclen = stbi__zreceive(a,4) + 4;

   memset(codelength_sizes, 0, sizeof(codelength_sizes));
   for (i=0; i < hclen; ++i) {
      int s = stbi__zreceive(a,3);
      codelength_sizes[length_dezigzag[i]] = (stbi_uc) s;
   }
   if (!stbi__zbuild_huffman(&z_codelength, codelength_sizes, 19)) return 0;

   n = 0;
   while (n < hlit + hdist) {
      int c = stbi__zhuffman_decode(a, &z_codelength);
      STBI_ASSERT(c >= 0 && c < 19);
      if (c < 16)
         lencodes[n++] = (stbi_uc) c;
      else if (c == 16) {
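         c = stbi__zreceive(a,2)+3;
         if (n == 0) return stbi__err("bad codelengths","Corrupt PNG"); // code 16 repeats the previous length, so one must exist
         memset(lencodes+n, lencodes[n-1], c);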
         n += c;
      } else if (c == 17) {
         c = stbi__zreceive(a,3)+3;
         memset(lencodes+n, 0, c);
         n += c;
      } else {
         STBI_ASSERT(c == 18);
         c = stbi__zreceive(a,7)+11;
         memset(lencodes+n, 0, c);
         n += c;
      }
   }
   if (n != hlit+hdist) return stbi__err("bad codelengths","Corrupt PNG");
   if (!stbi__zbuild_huffman(&a->z_length, lencodes, hlit)) return 0;
   if (!stbi__zbuild_huffman(&a->z_distance, lencodes+hlit, hdist)) return 0;
   return 1;
}

static int stbi__parse_uncomperssed_block(stbi__zbuf *a)
{
   stbi_uc header[4];
   int len,nlen,k;
   if (a->num_bits & 7)
      stbi__zreceive(a, a->num_bits & 7); // discard
   // drain the bit-packed data into header
   k = 0;
   while (a->num_bits > 0) {
      header[k++] = (stbi_uc) (a->code_buffer & 255); // suppress MSVC run-time check
      a->code_buffer >>= 8;
      a->num_bits -= 8;
   }
   STBI_ASSERT(a->num_bits == 0);
   // now fill header the normal way
   while (k < 4)
      header[k++] = stbi__zget8(a);
   len  = header[1] * 256 + header[0];
   nlen = header[3] * 256 + header[2];
   if (nlen != (len ^ 0xffff)) return stbi__err("zlib corrupt","Corrupt PNG");
   if (a->zbuffer + len > a->zbuffer_end) return stbi__err("read past buffer","Corrupt PNG");
   if (a->zout + len > a->zout_end)
      if (!stbi__zexpand(a, a->zout, len)) return 0;
   memcpy(a->zout, a->zbuffer, len);
   a->zbuffer += len;
   a->zout += len;
   return 1;
}
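
/* Illustrative sketch, not part of stb_image. A stored (uncompressed) DEFLATE
   block begins, after byte alignment, with a little-endian 16-bit LEN
   followed by NLEN, its one's complement; the function above rejects the
   block when the two disagree. The block below shows only that check on a
   4-byte header; sketch_stored_block_len is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

// Returns the stored block's byte count, or -1 if LEN and NLEN disagree.
static int sketch_stored_block_len(const unsigned char header[4])
{
   unsigned len, nlen;
   assert(header != NULL);
   len  = header[0] | ((unsigned) header[1] << 8); // little-endian LEN
   nlen = header[2] | ((unsigned) header[3] << 8); // little-endian NLEN
   if (nlen != (len ^ 0xffffu)) return -1;
   return (int) len;
}
```

   The header bytes 05 00 FA FF describe a 5-byte block, since 0x0005 and
   0xFFFA are complements of each other.
*/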

static int stbi__parse_zlib_header(stbi__zbuf *a)
{
   int cmf   = stbi__zget8(a);
   int cm    = cmf & 15;
   /* int cinfo = cmf >> 4; */
   int flg   = stbi__zget8(a);
   if ((cmf*256+flg) % 31 != 0) return stbi__err("bad zlib header","Corrupt PNG"); // zlib spec
   if (flg & 32) return stbi__err("no preset dict","Corrupt PNG"); // preset dictionary not allowed in png
   if (cm != 8) return stbi__err("bad compression","Corrupt PNG"); // DEFLATE required for png
   // window = 1 << (8 + cinfo)... but who cares, we fully buffer output
   return 1;
}
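
/* Illustrative sketch, not part of stb_image. The three tests above can be
   tried on concrete header bytes: the 16-bit value CMF*256+FLG must be a
   multiple of 31, the FDICT bit (32) must be clear, and the low nibble of
   CMF must be 8 for DEFLATE. The block below applies them in the same order
   to a 2-byte header; the function and the SKETCH_ZHDR_ result codes are
   hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

enum { SKETCH_ZHDR_OK = 0, SKETCH_ZHDR_BAD_CHECK, SKETCH_ZHDR_PRESET_DICT, SKETCH_ZHDR_BAD_METHOD };

static int sketch_check_zlib_header(const unsigned char hdr[2])
{
   int cmf, flg;
   assert(hdr != NULL);
   cmf = hdr[0];
   flg = hdr[1];
   if ((cmf * 256 + flg) % 31 != 0) return SKETCH_ZHDR_BAD_CHECK;   // FCHECK failed
   if (flg & 32)                    return SKETCH_ZHDR_PRESET_DICT; // FDICT set
   if ((cmf & 15) != 8)             return SKETCH_ZHDR_BAD_METHOD;  // CM is not DEFLATE
   return SKETCH_ZHDR_OK;
}
```

   The common header 78 9C passes all three tests (0x789C is 31 * 996),
   while 78 BB passes the checksum test but has the preset-dictionary bit
   set.
*/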

// @TODO: should statically initialize these for optimal thread safety
static stbi_uc stbi__zdefault_length[288], stbi__zdefault_distance[32];
static void stbi__init_zdefaults(void)
{
   int i;   // use <= to match clearly with spec
   for (i=0; i <= 143; ++i)     stbi__zdefault_length[i]   = 8;
   for (   ; i <= 255; ++i)     stbi__zdefault_length[i]   = 9;
   for (   ; i <= 279; ++i)     stbi__zdefault_length[i]   = 7;
   for (   ; i <= 287; ++i)     stbi__zdefault_length[i]   = 8;

   for (i=0; i <=  31; ++i)     stbi__zdefault_distance[i] = 5;
}

static int stbi__parse_zlib(stbi__zbuf *a, int parse_header)
{
   int final, type;
   if (parse_header)
      if (!stbi__parse_zlib_header(a)) return 0;
   a->num_bits = 0;
   a->code_buffer = 0;
   do {
      final = stbi__zreceive(a,1);
      type = stbi__zreceive(a,2);
      if (type == 0) {
         if (!stbi__parse_uncomperssed_block(a)) return 0;
      } else if (type == 3) {
         return 0;
      } else {
         if (type == 1) {
            // use fixed code lengths
            if (!stbi__zdefault_distance[31]) stbi__init_zdefaults();
            if (!stbi__zbuild_huffman(&a->z_length  , stbi__zdefault_length  , 288)) return 0;
            if (!stbi__zbuild_huffman(&a->z_distance, stbi__zdefault_distance,  32)) return 0;
         } else {
            if (!stbi__compute_huffman_codes(a)) return 0;
         }
         if (!stbi__parse_huffman_block(a)) return 0;
      }
   } while (!final);
   return 1;
}

static int stbi__do_zlib(stbi__zbuf *a, char *obuf, int olen, int exp, int parse_header)
{
   a->zout_start = obuf;
   a->zout       = obuf;
   a->zout_end   = obuf + olen;
   a->z_expandable = exp;

   return stbi__parse_zlib(a, parse_header);
}

STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen)
{
   stbi__zbuf a;
   char *p = (char *) stbi__malloc(initial_size);
   if (p == NULL) return NULL;
   a.zbuffer = (stbi_uc *) buffer;
   a.zbuffer_end = (stbi_uc *) buffer + len;
   if (stbi__do_zlib(&a, p, initial_size, 1, 1)) {
      if (outlen) *outlen = (int) (a.zout - a.zout_start);
      return a.zout_start;
   } else {
      STBI_FREE(a.zout_start);
      return NULL;
   }
}

STBIDEF char *stbi_zlib_decode_malloc(char const *buffer, int len, int *outlen)
{
   return stbi_zlib_decode_malloc_guesssize(buffer, len, 16384, outlen);
}

STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header)
{
   stbi__zbuf a;
   char *p = (char *) stbi__malloc(initial_size);
   if (p == NULL) return NULL;
   a.zbuffer = (stbi_uc *) buffer;
   a.zbuffer_end = (stbi_uc *) buffer + len;
   if (stbi__do_zlib(&a, p, initial_size, 1, parse_header)) {
      if (outlen) *outlen = (int) (a.zout - a.zout_start);
      return a.zout_start;
   } else {
      STBI_FREE(a.zout_start);
      return NULL;
   }
}

STBIDEF int stbi_zlib_decode_buffer(char *obuffer, int olen, char const *ibuffer, int ilen)
{
   stbi__zbuf a;
   a.zbuffer = (stbi_uc *) ibuffer;
   a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
   if (stbi__do_zlib(&a, obuffer, olen, 0, 1))
      return (int) (a.zout - a.zout_start);
   else
      return -1;
}

STBIDEF char *stbi_zlib_decode_noheader_malloc(char const *buffer, int len, int *outlen)
{
   stbi__zbuf a;
   char *p = (char *) stbi__malloc(16384);
   if (p == NULL) return NULL;
   a.zbuffer = (stbi_uc *) buffer;
   a.zbuffer_end = (stbi_uc *) buffer+len;
   if (stbi__do_zlib(&a, p, 16384, 1, 0)) {
      if (outlen) *outlen = (int) (a.zout - a.zout_start);
      return a.zout_start;
   } else {
      STBI_FREE(a.zout_start);
      return NULL;
   }
}

STBIDEF int stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen)
{
   stbi__zbuf a;
   a.zbuffer = (stbi_uc *) ibuffer;
   a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
   if (stbi__do_zlib(&a, obuffer, olen, 0, 0))
      return (int) (a.zout - a.zout_start);
   else
      return -1;
}

// public domain "baseline" PNG decoder   v0.10  Sean Barrett 2006-11-18
//    simple implementation
//      - only 8-bit samples
//      - no CRC checking
//      - allocates lots of intermediate memory
//        - avoids problem of streaming data between subsystems
//        - avoids explicit window management
//    performance
//      - uses stb_zlib, a PD zlib implementation with fast huffman decoding


typedef struct
{
   stbi__uint32 length;
   stbi__uint32 type;
} stbi__pngchunk;

static stbi__pngchunk stbi__get_chunk_header(stbi__context *s)
{
   stbi__pngchunk c;
   c.length = stbi__get32be(s);
   c.type   = stbi__get32be(s);
   return c;
}

static int stbi__check_png_header(stbi__context *s)
{
   static stbi_uc png_sig[8] = { 137,80,78,71,13,10,26,10 };
   int i;
   for (i=0; i < 8; ++i)
      if (stbi__get8(s) != png_sig[i]) return stbi__err("bad png sig","Not a PNG");
   return 1;
}
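
/* Illustrative sketch, not part of stb_image. The function above consumes
   eight bytes from the stbi__context and compares them with the PNG
   signature. The block below performs the same comparison on a plain memory
   buffer, which may be handy for sniffing a file type before handing data to
   a decoder; sketch_is_png is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Nonzero if buf starts with the 8-byte PNG signature.
static int sketch_is_png(const unsigned char *buf, size_t len)
{
   static const unsigned char sig[8] = { 137, 80, 78, 71, 13, 10, 26, 10 };
   assert(buf != NULL);
   return len >= 8 && memcmp(buf, sig, 8) == 0;
}
```

   Bytes 1 to 3 of the signature are the ASCII letters "PNG"; a buffer
   shorter than eight bytes is never accepted.
*/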

typedef struct
{
   stbi__context *s;
   stbi_uc *idata, *expanded, *out;
} stbi__png;


enum {
   STBI__F_none=0,
   STBI__F_sub=1,
   STBI__F_up=2,
   STBI__F_avg=3,
   STBI__F_paeth=4,
   // synthetic filters used for first scanline to avoid needing a dummy row of 0s
   STBI__F_avg_first,
   STBI__F_paeth_first
};

static stbi_uc first_row_filter[5] =
{
   STBI__F_none,
   STBI__F_sub,
   STBI__F_none,
   STBI__F_avg_first,
   STBI__F_paeth_first
};

static int stbi__paeth(int a, int b, int c)
{
   int p = a + b - c;
   int pa = abs(p-a);
   int pb = abs(p-b);
   int pc = abs(p-c);
   if (pa <= pb && pa <= pc) return a;
   if (pb <= pc) return b;
   return c;
}
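
/* Illustrative sketch, not part of stb_image. The Paeth predictor picks
   whichever of left (a), up (b) and upper-left (c) is closest to a + b - c,
   and the decoder adds that prediction back to each filtered byte. The block
   below applies it to one whole row in the straightforward way, treating
   out-of-image neighbours as zero; the sketch_ functions are hypothetical
   and skip the first-pixel and alpha-expansion handling that the real
   decoder performs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

static int sketch_paeth(int a, int b, int c)
{
   int p  = a + b - c; // initial estimate
   int pa = abs(p - a);
   int pb = abs(p - b);
   int pc = abs(p - c);
   if (pa <= pb && pa <= pc) return a; // ties prefer left, then up
   if (pb <= pc) return b;
   return c;
}

// Undo the Paeth filter for one row; prior may be NULL for the first row.
// bpp is the number of bytes per complete pixel.
static void sketch_unfilter_paeth(unsigned char *cur, const unsigned char *raw,
                                  const unsigned char *prior, int nbytes, int bpp)
{
   int k;
   assert(bpp >= 1);
   for (k = 0; k < nbytes; ++k) {
      int a = (k >= bpp) ? cur[k - bpp] : 0;            // left
      int b = prior ? prior[k] : 0;                     // up
      int c = (prior && k >= bpp) ? prior[k - bpp] : 0; // upper-left
      cur[k] = (unsigned char) ((raw[k] + sketch_paeth(a, b, c)) & 255);
   }
}
```

   With a prior row of {10, 20, 30}, one byte per pixel, and filtered bytes
   {5, 5, 5}, the prediction is the byte above each time and the row
   reconstructs to {15, 25, 35}.
*/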

#define STBI__BYTECAST(x)  ((stbi_uc) ((x) & 255))  // truncate int to byte without warnings

static stbi_uc stbi__depth_scale_table[9] = { 0, 0xff, 0x55, 0, 0x11, 0,0,0, 0x01 };
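
/* Illustrative sketch, not part of stb_image. The nonzero entries of the
   table above are the multipliers that stretch a 1-, 2-, 4- or 8-bit
   grayscale sample over the full 0..255 range, that is 255 divided by the
   largest value representable at that depth. The block below computes the
   same factors directly; sketch_depth_scale is a hypothetical name.

```c
#include <assert.h>

// Multiplier that maps the maximum sample of `depth` bits to 255.
static unsigned char sketch_depth_scale(int depth)
{
   assert(depth == 1 || depth == 2 || depth == 4 || depth == 8);
   return (unsigned char) (255 / ((1 << depth) - 1));
}
```

   The results 0xff, 0x55, 0x11 and 0x01 for depths 1, 2, 4 and 8 match the
   table entries at those indices.
*/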

// create the png data from post-deflated data
static int stbi__create_png_image_raw(stbi__png *a, stbi_uc *raw, stbi__uint32 raw_len, int out_n, stbi__uint32 x, stbi__uint32 y, int depth, int color)
{
   stbi__context *s = a->s;
   stbi__uint32 i,j,stride = x*out_n;
   stbi__uint32 img_len, img_width_bytes;
   int k;
   int img_n = s->img_n; // copy it into a local for later

   STBI_ASSERT(out_n == s->img_n || out_n == s->img_n+1);
   a->out = (stbi_uc *) stbi__malloc(x * y * out_n); // extra bytes to write off the end into
   if (!a->out) return stbi__err("outofmem", "Out of memory");

   img_width_bytes = (((img_n * x * depth) + 7) >> 3);
   img_len = (img_width_bytes + 1) * y;
   if (s->img_x == x && s->img_y == y) {
      if (raw_len != img_len) return stbi__err("not enough pixels","Corrupt PNG");
   } else { // interlaced:
      if (raw_len < img_len) return stbi__err("not enough pixels","Corrupt PNG");
   }

   for (j=0; j < y; ++j) {
      stbi_uc *cur = a->out + stride*j;
      stbi_uc *prior = cur - stride;
      int filter = *raw++;
      int filter_bytes = img_n;
      int width = x;
      if (filter > 4)
         return stbi__err("invalid filter","Corrupt PNG");

      if (depth < 8) {
         STBI_ASSERT(img_width_bytes <= x);
         cur += x*out_n - img_width_bytes; // store output to the rightmost img_width_bytes bytes, so we can decode in place
         filter_bytes = 1;
         width = img_width_bytes;
      }

      // if first row, use special filter that doesn't sample previous row
      if (j == 0) filter = first_row_filter[filter];

      // handle first byte explicitly
      for (k=0; k < filter_bytes; ++k) {
         switch (filter) {
            case STBI__F_none       : cur[k] = raw[k]; break;
            case STBI__F_sub        : cur[k] = raw[k]; break;
            case STBI__F_up         : cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
            case STBI__F_avg        : cur[k] = STBI__BYTECAST(raw[k] + (prior[k]>>1)); break;
            case STBI__F_paeth      : cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(0,prior[k],0)); break;
            case STBI__F_avg_first  : cur[k] = raw[k]; break;
            case STBI__F_paeth_first: cur[k] = raw[k]; break;
         }
      }

      if (depth == 8) {
         if (img_n != out_n)
            cur[img_n] = 255; // first pixel
         raw += img_n;
         cur += out_n;
         prior += out_n;
      } else {
         raw += 1;
         cur += 1;
         prior += 1;
      }

      // this is a little gross, so that we don't switch per-pixel or per-component
      if (depth < 8 || img_n == out_n) {
         int nk = (width - 1)*img_n;
         #define CASE(f) \
             case f:     \
                for (k=0; k < nk; ++k)
         switch (filter) {
            // "none" filter turns into a memcpy here; make that explicit.
            case STBI__F_none:         memcpy(cur, raw, nk); break;
            CASE(STBI__F_sub)          cur[k] = STBI__BYTECAST(raw[k] + cur[k-filter_bytes]); break;
            CASE(STBI__F_up)           cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
            CASE(STBI__F_avg)          cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k-filter_bytes])>>1)); break;
            CASE(STBI__F_paeth)        cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],prior[k],prior[k-filter_bytes])); break;
            CASE(STBI__F_avg_first)    cur[k] = STBI__BYTECAST(raw[k] + (cur[k-filter_bytes] >> 1)); break;
            CASE(STBI__F_paeth_first)  cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],0,0)); break;
         }
         #undef CASE
         raw += nk;
      } else {
         STBI_ASSERT(img_n+1 == out_n);
         #define CASE(f) \
             case f:     \
                for (i=x-1; i >= 1; --i, cur[img_n]=255,raw+=img_n,cur+=out_n,prior+=out_n) \
                   for (k=0; k < img_n; ++k)
         switch (filter) {
            CASE(STBI__F_none)         cur[k] = raw[k]; break;
            CASE(STBI__F_sub)          cur[k] = STBI__BYTECAST(raw[k] + cur[k-out_n]); break;
            CASE(STBI__F_up)           cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
            CASE(STBI__F_avg)          cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k-out_n])>>1)); break;
            CASE(STBI__F_paeth)        cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-out_n],prior[k],prior[k-out_n])); break;
            CASE(STBI__F_avg_first)    cur[k] = STBI__BYTECAST(raw[k] + (cur[k-out_n] >> 1)); break;
            CASE(STBI__F_paeth_first)  cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-out_n],0,0)); break;
         }
         #undef CASE
      }
   }

   // we make a separate pass to expand bits to pixels; for performance,
   // this could run two scanlines behind the above code, so it won't
   // interfere with filtering but will still be in the cache.
   if (depth < 8) {
      for (j=0; j < y; ++j) {
         stbi_uc *cur = a->out + stride*j;
         stbi_uc *in  = a->out + stride*j + x*out_n - img_width_bytes;
         // unpack 1/2/4-bit into an 8-bit buffer. allows us to keep the common 8-bit path optimal at minimal cost for 1/2/4-bit
         // PNG guarantees byte alignment; if width is not a multiple of 8/4/2 we'll decode dummy trailing data that will be skipped in the later loop
         stbi_uc scale = (color == 0) ? stbi__depth_scale_table[depth] : 1; // scale grayscale values to 0..255 range

         // note that the final byte might overshoot and write more data than desired.
         // we can allocate enough data that this never writes out of memory, but it
         // could also overwrite the next scanline. can it overwrite non-empty data
         // on the next scanline? yes, consider 1-pixel-wide scanlines with 1-bit-per-pixel.
         // so we need to explicitly clamp the final ones

         if (depth == 4) {
            for (k=x*img_n; k >= 2; k-=2, ++in) {
               *cur++ = scale * ((*in >> 4)       );
               *cur++ = scale * ((*in     ) & 0x0f);
            }
            if (k > 0) *cur++ = scale * ((*in >> 4)       );
         } else if (depth == 2) {
            for (k=x*img_n; k >= 4; k-=4, ++in) {
               *cur++ = scale * ((*in >> 6)       );
               *cur++ = scale * ((*in >> 4) & 0x03);
               *cur++ = scale * ((*in >> 2) & 0x03);
               *cur++ = scale * ((*in     ) & 0x03);
            }
            if (k > 0) *cur++ = scale * ((*in >> 6)       );
            if (k > 1) *cur++ = scale * ((*in >> 4) & 0x03);
            if (k > 2) *cur++ = scale * ((*in >> 2) & 0x03);
         } else if (depth == 1) {
            for (k=x*img_n; k >= 8; k-=8, ++in) {
               *cur++ = scale * ((*in >> 7)       );
               *cur++ = scale * ((*in >> 6) & 0x01);
               *cur++ = scale * ((*in >> 5) & 0x01);
               *cur++ = scale * ((*in >> 4) & 0x01);
               *cur++ = scale * ((*in >> 3) & 0x01);
               *cur++ = scale * ((*in >> 2) & 0x01);
               *cur++ = scale * ((*in >> 1) & 0x01);
               *cur++ = scale * ((*in     ) & 0x01);
            }
            if (k > 0) *cur++ = scale * ((*in >> 7)       );
            if (k > 1) *cur++ = scale * ((*in >> 6) & 0x01);
            if (k > 2) *cur++ = scale * ((*in >> 5) & 0x01);
            if (k > 3) *cur++ = scale * ((*in >> 4) & 0x01);
            if (k > 4) *cur++ = scale * ((*in >> 3) & 0x01);
            if (k > 5) *cur++ = scale * ((*in >> 2) & 0x01);
            if (k > 6) *cur++ = scale * ((*in >> 1) & 0x01);
         }
         if (img_n != out_n) {
            // insert alpha = 255
            stbi_uc *cur = a->out + stride*j;
            int i;
            if (img_n == 1) {
               for (i=x-1; i >= 0; --i) {
                  cur[i*2+1] = 255;
                  cur[i*2+0] = cur[i];
               }
            } else {
               STBI_ASSERT(img_n == 3);
               for (i=x-1; i >= 0; --i) {
                  cur[i*4+3] = 255;
                  cur[i*4+2] = cur[i*3+2];
                  cur[i*4+1] = cur[i*3+1];
                  cur[i*4+0] = cur[i*3+0];
               }
            }
         }
      }
   }

   return 1;
}
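
/*
   Illustrative sketch, not part of stb_image: the bit-expansion pass above
   unrolls one loop per depth. The same idea can be written generically for
   packed samples stored most-significant-bits first. unpack_samples is a
   hypothetical helper; unlike the code above it reads and writes separate
   buffers, and it assumes the caller passes scale = 255/(2^depth - 1) for
   grayscale (255, 85, 17) or 1 for palette indices.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical helper: expand `count` packed samples of `depth` bits
// (1, 2 or 4) into one byte per sample, multiplying each by `scale`.
static void unpack_samples(const unsigned char *in, unsigned char *out,
                           size_t count, int depth, unsigned char scale)
{
   size_t i;
   size_t per_byte;
   unsigned mask;
   assert(depth == 1 || depth == 2 || depth == 4);
   per_byte = (size_t) (8 / depth);        // samples stored in each input byte
   mask = (1u << depth) - 1u;              // low `depth` bits set
   for (i = 0; i < count; ++i) {
      // first sample of a byte sits in the top bits, last one in the low bits
      int shift = 8 - depth * (int) (i % per_byte + 1);
      out[i] = (unsigned char) (scale * ((in[i / per_byte] >> shift) & mask));
   }
}
```

   For example, the byte 0xB4 at depth 2 holds the samples 2,3,1,0, which a
   scale of 85 maps to 170,255,85,0.
*/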

static int stbi__create_png_image(stbi__png *a, stbi_uc *image_data, stbi__uint32 image_data_len, int out_n, int depth, int color, int interlaced)
{
   stbi_uc *final;
   int p;
   if (!interlaced)
      return stbi__create_png_image_raw(a, image_data, image_data_len, out_n, a->s->img_x, a->s->img_y, depth, color);

   // de-interlacing
   final = (stbi_uc *) stbi__malloc(a->s->img_x * a->s->img_y * out_n);
   if (final == NULL) return stbi__err("outofmem", "Out of memory");
   for (p=0; p < 7; ++p) {
      int xorig[] = { 0,4,0,2,0,1,0 };
      int yorig[] = { 0,0,4,0,2,0,1 };
      int xspc[]  = { 8,8,4,4,2,2,1 };
      int yspc[]  = { 8,8,8,4,4,2,2 };
      int i,j,x,y;
      // pass1_x[4] = 0, pass1_x[5] = 1, pass1_x[12] = 1
      x = (a->s->img_x - xorig[p] + xspc[p]-1) / xspc[p];
      y = (a->s->img_y - yorig[p] + yspc[p]-1) / yspc[p];
      if (x && y) {
         stbi__uint32 img_len = ((((a->s->img_n * x * depth) + 7) >> 3) + 1) * y;
         if (!stbi__create_png_image_raw(a, image_data, image_data_len, out_n, x, y, depth, color)) {
            STBI_FREE(final);
            return 0;
         }
         for (j=0; j < y; ++j) {
            for (i=0; i < x; ++i) {
               int out_y = j*yspc[p]+yorig[p];
               int out_x = i*xspc[p]+xorig[p];
               memcpy(final + out_y*a->s->img_x*out_n + out_x*out_n,
                      a->out + (j*x+i)*out_n, out_n);
            }
         }
         STBI_FREE(a->out);
         image_data += img_len;
         image_data_len -= img_len;
      }
   }
   a->out = final;

   return 1;
}
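
/*
   Illustrative sketch, not part of stb_image: the de-interlacing loop above
   sizes each of the seven Adam7 passes with a rounded-up division.
   adam7_pass_size is a hypothetical helper that isolates that computation,
   reusing the same origin and spacing tables.

```c
#include <assert.h>

// Hypothetical helper: dimensions of Adam7 pass `p` (0..6) for an image of
// `width` x `height` pixels. A result of 0 means the pass is empty.
static void adam7_pass_size(int width, int height, int p, int *pw, int *ph)
{
   static const int xorig[7] = { 0,4,0,2,0,1,0 };
   static const int yorig[7] = { 0,0,4,0,2,0,1 };
   static const int xspc[7]  = { 8,8,4,4,2,2,1 };
   static const int yspc[7]  = { 8,8,8,4,4,2,2 };
   assert(p >= 0 && p < 7);
   assert(width > 0 && height > 0);
   // ceil((size - origin) / spacing) in integer arithmetic
   *pw = (width  - xorig[p] + xspc[p] - 1) / xspc[p];
   *ph = (height - yorig[p] + yspc[p] - 1) / yspc[p];
}
```

   For an 8x8 image the seven passes hold 1,1,2,4,8,16,32 pixels, 64 in
   total, so every output pixel is written exactly once.
*/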

static int stbi__compute_transparency(stbi__png *z, stbi_uc tc[3], int out_n)
{
   stbi__context *s = z->s;
   stbi__uint32 i, pixel_count = s->img_x * s->img_y;
   stbi_uc *p = z->out;

   // compute color-based transparency, assuming we've
   // already got 255 as the alpha value in the output
   STBI_ASSERT(out_n == 2 || out_n == 4);

   if (out_n == 2) {
      for (i=0; i < pixel_count; ++i) {
         p[1] = (p[0] == tc[0] ? 0 : 255);
         p += 2;
      }
   } else {
      for (i=0; i < pixel_count; ++i) {
         if (p[0] == tc[0] && p[1] == tc[1] && p[2] == tc[2])
            p[3] = 0;
         p += 4;
      }
   }
   return 1;
}

static int stbi__expand_png_palette(stbi__png *a, stbi_uc *palette, int len, int pal_img_n)
{
   stbi__uint32 i, pixel_count = a->s->img_x * a->s->img_y;
   stbi_uc *p, *temp_out, *orig = a->out;

   p = (stbi_uc *) stbi__malloc(pixel_count * pal_img_n);
   if (p == NULL) return stbi__err("outofmem", "Out of memory");

   // between here and free(out) below, exiting would leak
   temp_out = p;

   if (pal_img_n == 3) {
      for (i=0; i < pixel_count; ++i) {
         int n = orig[i]*4;
         p[0] = palette[n  ];
         p[1] = palette[n+1];
         p[2] = palette[n+2];
         p += 3;
      }
   } else {
      for (i=0; i < pixel_count; ++i) {
         int n = orig[i]*4;
         p[0] = palette[n  ];
         p[1] = palette[n+1];
         p[2] = palette[n+2];
         p[3] = palette[n+3];
         p += 4;
      }
   }
   STBI_FREE(a->out);
   a->out = temp_out;

   STBI_NOTUSED(len);

   return 1;
}

static int stbi__unpremultiply_on_load = 0;
static int stbi__de_iphone_flag = 0;

STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply)
{
   stbi__unpremultiply_on_load = flag_true_if_should_unpremultiply;
}

STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert)
{
   stbi__de_iphone_flag = flag_true_if_should_convert;
}

static void stbi__de_iphone(stbi__png *z)
{
   stbi__context *s = z->s;
   stbi__uint32 i, pixel_count = s->img_x * s->img_y;
   stbi_uc *p = z->out;

   if (s->img_out_n == 3) {  // convert bgr to rgb
      for (i=0; i < pixel_count; ++i) {
         stbi_uc t = p[0];
         p[0] = p[2];
         p[2] = t;
         p += 3;
      }
   } else {
      STBI_ASSERT(s->img_out_n == 4);
      if (stbi__unpremultiply_on_load) {
         // convert bgr to rgb and unpremultiply
         for (i=0; i < pixel_count; ++i) {
            stbi_uc a = p[3];
            stbi_uc t = p[0];
            if (a) {
               p[0] = p[2] * 255 / a;
               p[1] = p[1] * 255 / a;
               p[2] =  t   * 255 / a;
            } else {
               p[0] = p[2];
               p[2] = t;
            } 
            p += 4;
         }
      } else {
         // convert bgr to rgb
         for (i=0; i < pixel_count; ++i) {
            stbi_uc t = p[0];
            p[0] = p[2];
            p[2] = t;
            p += 4;
         }
      }
   }
}
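
/*
   Illustrative sketch, not part of stb_image: the 4-channel branch of
   stbi__de_iphone swaps blue and red and, when un-premultiplying, rescales
   each color by 255/alpha. bgra_premul_to_rgba is a hypothetical helper
   that applies the same arithmetic to a single pixel. Like the loop above,
   it does not clamp, so it assumes each color value is <= alpha.

```c
#include <assert.h>

// Hypothetical helper: convert one premultiplied BGRA pixel to
// straight-alpha RGBA in place.
static void bgra_premul_to_rgba(unsigned char *p)
{
   unsigned char a, t;
   assert(p != 0);
   a = p[3];
   t = p[0];                                   // blue, saved before overwrite
   if (a) {
      p[0] = (unsigned char) (p[2] * 255 / a); // red
      p[1] = (unsigned char) (p[1] * 255 / a); // green
      p[2] = (unsigned char) (t    * 255 / a); // blue
   } else {
      // alpha of zero: nothing to divide by, so only swap the channels
      p[0] = p[2];
      p[2] = t;
   }
}
```

   For example, the BGRA pixel {32,64,128,128} becomes the RGBA pixel
   {255,127,63,128}; the integer division truncates rather than rounds.
*/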

#define STBI__PNG_TYPE(a,b,c,d)  (((a) << 24) + ((b) << 16) + ((c) << 8) + (d))

static int stbi__parse_png_file(stbi__png *z, int scan, int req_comp)
{
   stbi_uc palette[1024], pal_img_n=0;
   stbi_uc has_trans=0, tc[3];
   stbi__uint32 ioff=0, idata_limit=0, i, pal_len=0;
   int first=1,k,interlace=0, color=0, depth=0, is_iphone=0;
   stbi__context *s = z->s;

   z->expanded = NULL;
   z->idata = NULL;
   z->out = NULL;

   if (!stbi__check_png_header(s)) return 0;

   if (scan == SCAN_type) return 1;

   for (;;) {
      stbi__pngchunk c = stbi__get_chunk_header(s);
      switch (c.type) {
         case STBI__PNG_TYPE('C','g','B','I'):
            is_iphone = 1;
            stbi__skip(s, c.length);
            break;
         case STBI__PNG_TYPE('I','H','D','R'): {
            int comp,filter;
            if (!first) return stbi__err("multiple IHDR","Corrupt PNG");
            first = 0;
            if (c.length != 13) return stbi__err("bad IHDR len","Corrupt PNG");
            s->img_x = stbi__get32be(s); if (s->img_x > (1 << 24)) return stbi__err("too large","Very large image (corrupt?)");
            s->img_y = stbi__get32be(s); if (s->img_y > (1 << 24)) return stbi__err("too large","Very large image (corrupt?)");
            depth = stbi__get8(s);  if (depth != 1 && depth != 2 && depth != 4 && depth != 8)  return stbi__err("1/2/4/8-bit only","PNG not supported: 1/2/4/8-bit only");
            color = stbi__get8(s);  if (color > 6)         return stbi__err("bad ctype","Corrupt PNG");
            if (color == 3) pal_img_n = 3; else if (color & 1) return stbi__err("bad ctype","Corrupt PNG");
            comp  = stbi__get8(s);  if (comp) return stbi__err("bad comp method","Corrupt PNG");
            filter= stbi__get8(s);  if (filter) return stbi__err("bad filter method","Corrupt PNG");
            interlace = stbi__get8(s); if (interlace>1) return stbi__err("bad interlace method","Corrupt PNG");
            if (!s->img_x || !s->img_y) return stbi__err("0-pixel image","Corrupt PNG");
            if (!pal_img_n) {
               s->img_n = (color & 2 ? 3 : 1) + (color & 4 ? 1 : 0);
               if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");
               if (scan == SCAN_header) return 1;
            } else {
               // if paletted, then pal_n is our final components, and
               // img_n is # components to decompress/filter.
               s->img_n = 1;
               if ((1 << 30) / s->img_x / 4 < s->img_y) return stbi__err("too large","Corrupt PNG");
               // if SCAN_header, have to scan to see if we have a tRNS
            }
            break;
         }

         case STBI__PNG_TYPE('P','L','T','E'):  {
            if (first) return stbi__err("first not IHDR", "Corrupt PNG");
            if (c.length > 256*3) return stbi__err("invalid PLTE","Corrupt PNG");
            pal_len = c.length / 3;
            if (pal_len * 3 != c.length) return stbi__err("invalid PLTE","Corrupt PNG");
            for (i=0; i < pal_len; ++i) {
               palette[i*4+0] = stbi__get8(s);
               palette[i*4+1] = stbi__get8(s);
               palette[i*4+2] = stbi__get8(s);
               palette[i*4+3] = 255;
            }
            break;
         }

         case STBI__PNG_TYPE('t','R','N','S'): {
            if (first) return stbi__err("first not IHDR", "Corrupt PNG");
            if (z->idata) return stbi__err("tRNS after IDAT","Corrupt PNG");
            if (pal_img_n) {
               if (scan == SCAN_header) { s->img_n = 4; return 1; }
               if (pal_len == 0) return stbi__err("tRNS before PLTE","Corrupt PNG");
               if (c.length > pal_len) return stbi__err("bad tRNS len","Corrupt PNG");
               pal_img_n = 4;
               for (i=0; i < c.length; ++i)
                  palette[i*4+3] = stbi__get8(s);
            } else {
               if (!(s->img_n & 1)) return stbi__err("tRNS with alpha","Corrupt PNG");
               if (c.length != (stbi__uint32) s->img_n*2) return stbi__err("bad tRNS len","Corrupt PNG");
               has_trans = 1;
               for (k=0; k < s->img_n; ++k)
                  tc[k] = (stbi_uc) (stbi__get16be(s) & 255) * stbi__depth_scale_table[depth]; // non 8-bit images will be larger
            }
            break;
         }

         case STBI__PNG_TYPE('I','D','A','T'): {
            if (first) return stbi__err("first not IHDR", "Corrupt PNG");
            if (pal_img_n && !pal_len) return stbi__err("no PLTE","Corrupt PNG");
            if (scan == SCAN_header) { s->img_n = pal_img_n; return 1; }
            if (ioff + c.length > idata_limit) {
               stbi_uc *p;
               if (idata_limit == 0) idata_limit = c.length > 4096 ? c.length : 4096;
               while (ioff + c.length > idata_limit)
                  idata_limit *= 2;
               p = (stbi_uc *) STBI_REALLOC(z->idata, idata_limit); if (p == NULL) return stbi__err("outofmem", "Out of memory");
               z->idata = p;
            }
            if (!stbi__getn(s, z->idata+ioff,c.length)) return stbi__err("outofdata","Corrupt PNG");
            ioff += c.length;
            break;
         }

         case STBI__PNG_TYPE('I','E','N','D'): {
            stbi__uint32 raw_len;
            if (first) return stbi__err("first not IHDR", "Corrupt PNG");
            if (scan != SCAN_load) return 1;
            if (z->idata == NULL) return stbi__err("no IDAT","Corrupt PNG");
            // initial guess for decoded data size to avoid unnecessary reallocs
            raw_len = s->img_x * s->img_y * s->img_n /* pixels */ + s->img_y /* filter mode per row */;
            z->expanded = (stbi_uc *) stbi_zlib_decode_malloc_guesssize_headerflag((char *) z->idata, ioff, raw_len, (int *) &raw_len, !is_iphone);
            if (z->expanded == NULL) return 0; // zlib should set error
            STBI_FREE(z->idata); z->idata = NULL;
            if ((req_comp == s->img_n+1 && req_comp != 3 && !pal_img_n) || has_trans)
               s->img_out_n = s->img_n+1;
            else
               s->img_out_n = s->img_n;
            if (!stbi__create_png_image(z, z->expanded, raw_len, s->img_out_n, depth, color, interlace)) return 0;
            if (has_trans)
               if (!stbi__compute_transparency(z, tc, s->img_out_n)) return 0;
            if (is_iphone && stbi__de_iphone_flag && s->img_out_n > 2)
               stbi__de_iphone(z);
            if (pal_img_n) {
               // pal_img_n == 3 or 4
               s->img_n = pal_img_n; // record the actual colors we had
               s->img_out_n = pal_img_n;
               if (req_comp >= 3) s->img_out_n = req_comp;
               if (!stbi__expand_png_palette(z, palette, pal_len, s->img_out_n))
                  return 0;
            }
            STBI_FREE(z->expanded); z->expanded = NULL;
            return 1;
         }

         default:
            // if critical, fail
            if (first) return stbi__err("first not IHDR", "Corrupt PNG");
            if ((c.type & (1 << 29)) == 0) {
               #ifndef STBI_NO_FAILURE_STRINGS
               // not threadsafe
               static char invalid_chunk[] = "XXXX PNG chunk not known";
               invalid_chunk[0] = STBI__BYTECAST(c.type >> 24);
               invalid_chunk[1] = STBI__BYTECAST(c.type >> 16);
               invalid_chunk[2] = STBI__BYTECAST(c.type >>  8);
               invalid_chunk[3] = STBI__BYTECAST(c.type >>  0);
               #endif
               return stbi__err(invalid_chunk, "PNG not supported: unknown PNG chunk type");
            }
            stbi__skip(s, c.length);
            break;
      }
      // end of PNG chunk, read and skip CRC
      stbi__get32be(s);
   }
}
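
/*
   Illustrative sketch, not part of stb_image: the default case of the chunk
   loop above decides whether an unknown chunk may be skipped by testing bit
   29 of the packed type. With the STBI__PNG_TYPE packing, that is bit 5 of
   the first type byte, which is clear for an uppercase letter (critical
   chunk) and set for a lowercase one (ancillary chunk). PNG_TYPE and
   png_chunk_is_critical are hypothetical stand-ins, and the sketch assumes
   unsigned int has at least 32 bits.

```c
#include <assert.h>

// Same packing as STBI__PNG_TYPE: first type byte in the top 8 bits.
#define PNG_TYPE(a,b,c,d) \
   (((unsigned) (a) << 24) + ((unsigned) (b) << 16) + ((unsigned) (c) << 8) + (unsigned) (d))

// Hypothetical helper: nonzero when the chunk is critical, meaning a decoder
// that does not recognize it has to give up instead of skipping it.
static int png_chunk_is_critical(unsigned type)
{
   return (type & (1u << 29)) == 0;
}

// Small self-check of the two definitions above.
static void png_type_selfcheck(void)
{
   assert(PNG_TYPE('I','H','D','R') == 0x49484452u);
   assert( png_chunk_is_critical(PNG_TYPE('I','E','N','D')));
   assert(!png_chunk_is_critical(PNG_TYPE('t','E','X','t')));
}
```

   Packing the four type bytes into one integer is what lets the parser
   dispatch on the chunk type with a plain switch statement.
*/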

static unsigned char *stbi__do_png(stbi__png *p, int *x, int *y, int *n, int req_comp)
{
   unsigned char *result=NULL;
   if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
   if (stbi__parse_png_file(p, SCAN_load, req_comp)) {
      result = p->out;
      p->out = NULL;
      if (req_comp && req_comp != p->s->img_out_n) {
         result = stbi__convert_format(result, p->s->img_out_n, req_comp, p->s->img_x, p->s->img_y);
         p->s->img_out_n = req_comp;
         if (result == NULL) return result;
      }
      *x = p->s->img_x;
      *y = p->s->img_y;
      if (n) *n = p->s->img_out_n;
   }
   STBI_FREE(p->out);      p->out      = NULL;
   STBI_FREE(p->expanded); p->expanded = NULL;
   STBI_FREE(p->idata);    p->idata    = NULL;

   return result;
}

static unsigned char *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
{
   stbi__png p;
   p.s = s;
   return stbi__do_png(&p, x,y,comp,req_comp);
}

static int stbi__png_test(stbi__context *s)
{
   int r;
   r = stbi__check_png_header(s);
   stbi__rewind(s);
   return r;
}

static int stbi__png_info_raw(stbi__png *p, int *x, int *y, int *comp)
{
   if (!stbi__parse_png_file(p, SCAN_header, 0)) {
      stbi__rewind( p->s );
      return 0;
   }
   if (x) *x = p->s->img_x;
   if (y) *y = p->s->img_y;
   if (comp) *comp = p->s->img_n;
   return 1;
}

static int stbi__png_info(stbi__context *s, int *x, int *y, int *comp)
{
   stbi__png p;
   p.s = s;
   return stbi__png_info_raw(&p, x, y, comp);
}

// Microsoft/Windows BMP image
static int stbi__bmp_test_raw(stbi__context *s)
{
   int r;
   int sz;
   if (stbi__get8(s) != 'B') return 0;
   if (stbi__get8(s) != 'M') return 0;
   stbi__get32le(s); // discard filesize
   stbi__get16le(s); // discard reserved
   stbi__get16le(s); // discard reserved
   stbi__get32le(s); // discard data offset
   sz = stbi__get32le(s);
   r = (sz == 12 || sz == 40 || sz == 56 || sz == 108 || sz == 124);
   return r;
}

static int stbi__bmp_test(stbi__context *s)
{
   int r = stbi__bmp_test_raw(s);
   stbi__rewind(s);
   return r;
}


// returns 0..31 for the highest set bit
static int stbi__high_bit(unsigned int z)
{
   int n=0;
   if (z == 0) return -1;
   if (z >= 0x10000) n += 16, z >>= 16;
   if (z >= 0x00100) n +=  8, z >>=  8;
   if (z >= 0x00010) n +=  4, z >>=  4;
   if (z >= 0x00004) n +=  2, z >>=  2;
   if (z >= 0x00002) n +=  1, z >>=  1;
   return n;
}

static int stbi__bitcount(unsigned int a)
{
   a = (a & 0x55555555) + ((a >>  1) & 0x55555555); // max 2
   a = (a & 0x33333333) + ((a >>  2) & 0x33333333); // max 4
   a = (a + (a >> 4)) & 0x0f0f0f0f; // max 8 per 4, now 8 bits
   a = (a + (a >> 8)); // max 16 per 8 bits
   a = (a + (a >> 16)); // max 32 per 8 bits
   return a & 0xff;
}

static int stbi__shiftsigned(int v, int shift, int bits)
{
   int result;
   int z=0;

   if (shift < 0) v <<= -shift;
   else v >>= shift;
   result = v;

   z = bits;
   while (z < 8) {
      result += v >> z;
      z += bits;
   }
   return result;
}
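
/*
   Illustrative sketch, not part of stb_image: the three helpers above are
   combined by the BMP loader to pull one channel out of a masked pixel and
   widen it to 8 bits. The names below are hypothetical; high_bit and
   bit_count are simple loop-based stand-ins for stbi__high_bit and
   stbi__bitcount so that the sketch compiles on its own, and shift_signed
   repeats the logic of stbi__shiftsigned.

```c
#include <assert.h>

// Index of the highest set bit, or -1 when z is 0.
static int high_bit(unsigned z) { int n = -1; while (z) { ++n; z >>= 1; } return n; }

// Number of set bits.
static int bit_count(unsigned a) { int n = 0; while (a) { n += (int) (a & 1u); a >>= 1; } return n; }

// Move the field so its top bit lands on bit 7, then replicate the field
// into the lower bits so a full-intensity value widens to 255.
static int shift_signed(int v, int shift, int bits)
{
   int result, z;
   assert(bits > 0);
   if (shift < 0) v <<= -shift;
   else v >>= shift;
   result = v;
   for (z = bits; z < 8; z += bits)
      result += v >> z;
   return result;
}

// Hypothetical helper: channel selected by `mask` in pixel `v`, as 0..255.
// Assumes a nonzero mask that fits in 16 bits, as in a 16-bit BMP.
static unsigned char extract_channel(unsigned v, unsigned mask)
{
   int shift = high_bit(mask) - 7;   // right-shift amount; negative means shift left
   int count = bit_count(mask);
   return (unsigned char) shift_signed((int) (v & mask), shift, count);
}
```

   With the default 16-bit masks used by the loader (31 << 10, 31 << 5, 31),
   a full 5-bit field widens to 255 and the blue value 16 widens to 132.
*/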

static stbi_uc *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
{
   stbi_uc *out;
   unsigned int mr=0,mg=0,mb=0,ma=0, fake_a=0;
   stbi_uc pal[256][4];
   int psize=0,i,j,compress=0,width;
   int bpp, flip_vertically, pad, target, offset, hsz;
   if (stbi__get8(s) != 'B' || stbi__get8(s) != 'M') return stbi__errpuc("not BMP", "Corrupt BMP");
   stbi__get32le(s); // discard filesize
   stbi__get16le(s); // discard reserved
   stbi__get16le(s); // discard reserved
   offset = stbi__get32le(s);
   hsz = stbi__get32le(s);
   if (hsz != 12 && hsz != 40 && hsz != 56 && hsz != 108 && hsz != 124) return stbi__errpuc("unknown BMP", "BMP type not supported: unknown");
   if (hsz == 12) {
      s->img_x = stbi__get16le(s);
      s->img_y = stbi__get16le(s);
   } else {
      s->img_x = stbi__get32le(s);
      s->img_y = stbi__get32le(s);
   }
   if (stbi__get16le(s) != 1) return stbi__errpuc("bad BMP", "bad BMP");
   bpp = stbi__get16le(s);
   if (bpp == 1) return stbi__errpuc("monochrome", "BMP type not supported: 1-bit");
   flip_vertically = ((int) s->img_y) > 0;
   s->img_y = abs((int) s->img_y);
   if (hsz == 12) {
      if (bpp < 24)
         psize = (offset - 14 - 24) / 3;
   } else {
      compress = stbi__get32le(s);
      if (compress == 1 || compress == 2) return stbi__errpuc("BMP RLE", "BMP type not supported: RLE");
      stbi__get32le(s); // discard sizeof
      stbi__get32le(s); // discard hres
      stbi__get32le(s); // discard vres
      stbi__get32le(s); // discard colorsused
      stbi__get32le(s); // discard max important
      if (hsz == 40 || hsz == 56) {
         if (hsz == 56) {
            stbi__get32le(s);
            stbi__get32le(s);
            stbi__get32le(s);
            stbi__get32le(s);
         }
         if (bpp == 16 || bpp == 32) {
            mr = mg = mb = 0;
            if (compress == 0) {
               if (bpp == 32) {
                  mr = 0xffu << 16;
                  mg = 0xffu <<  8;
                  mb = 0xffu <<  0;
                  ma = 0xffu << 24;
                  fake_a = 1; // @TODO: check for cases like alpha value is all 0 and switch it to 255
                  STBI_NOTUSED(fake_a);
               } else {
                  mr = 31u << 10;
                  mg = 31u <<  5;
                  mb = 31u <<  0;
               }
            } else if (compress == 3) {
               mr = stbi__get32le(s);
               mg = stbi__get32le(s);
               mb = stbi__get32le(s);
               // not documented, but generated by photoshop and handled by mspaint
               if (mr == mg && mg == mb) {
                  // ?!?!?
                  return stbi__errpuc("bad BMP", "bad BMP");
               }
            } else
               return stbi__errpuc("bad BMP", "bad BMP");
         }
      } else {
         STBI_ASSERT(hsz == 108 || hsz == 124);
         mr = stbi__get32le(s);
         mg = stbi__get32le(s);
         mb = stbi__get32le(s);
         ma = stbi__get32le(s);
         stbi__get32le(s); // discard color space
         for (i=0; i < 12; ++i)
            stbi__get32le(s); // discard color space parameters
         if (hsz == 124) {
            stbi__get32le(s); // discard rendering intent
            stbi__get32le(s); // discard offset of profile data
            stbi__get32le(s); // discard size of profile data
            stbi__get32le(s); // discard reserved
         }
      }
      if (bpp < 16)
         psize = (offset - 14 - hsz) >> 2;
   }
   s->img_n = ma ? 4 : 3;
   if (req_comp && req_comp >= 3) // we can directly decode 3 or 4
      target = req_comp;
   else
      target = s->img_n; // if they want monochrome, we'll post-convert
   out = (stbi_uc *) stbi__malloc(target * s->img_x * s->img_y);
   if (!out) return stbi__errpuc("outofmem", "Out of memory");
   if (bpp < 16) {
      int z=0;
      if (psize == 0 || psize > 256) { STBI_FREE(out); return stbi__errpuc("invalid", "Corrupt BMP"); }
      for (i=0; i < psize; ++i) {
         pal[i][2] = stbi__get8(s);
         pal[i][1] = stbi__get8(s);
         pal[i][0] = stbi__get8(s);
         if (hsz != 12) stbi__get8(s);
         pal[i][3] = 255;
      }
      stbi__skip(s, offset - 14 - hsz - psize * (hsz == 12 ? 3 : 4));
      if (bpp == 4) width = (s->img_x + 1) >> 1;
      else if (bpp == 8) width = s->img_x;
      else { STBI_FREE(out); return stbi__errpuc("bad bpp", "Corrupt BMP"); }
      pad = (-width)&3;
      for (j=0; j < (int) s->img_y; ++j) {
         for (i=0; i < (int) s->img_x; i += 2) {
            int v=stbi__get8(s),v2=0;
            if (bpp == 4) {
               v2 = v & 15;
               v >>= 4;
            }
            out[z++] = pal[v][0];
            out[z++] = pal[v][1];
            out[z++] = pal[v][2];
            if (target == 4) out[z++] = 255;
            if (i+1 == (int) s->img_x) break;
            v = (bpp == 8) ? stbi__get8(s) : v2;
            out[z++] = pal[v][0];
            out[z++] = pal[v][1];
            out[z++] = pal[v][2];
            if (target == 4) out[z++] = 255;
         }
         stbi__skip(s, pad);
      }
   } else {
      int rshift=0,gshift=0,bshift=0,ashift=0,rcount=0,gcount=0,bcount=0,acount=0;
      int z = 0;
      int easy=0;
      stbi__skip(s, offset - 14 - hsz);
      if (bpp == 24) width = 3 * s->img_x;
      else if (bpp == 16) width = 2*s->img_x;
      else /* bpp = 32 and pad = 0 */ width=0;
      pad = (-width) & 3;
      if (bpp == 24) {
         easy = 1;
      } else if (bpp == 32) {
         if (mb == 0xff && mg == 0xff00 && mr == 0x00ff0000 && ma == 0xff000000)
            easy = 2;
      }
      if (!easy) {
         if (!mr || !mg || !mb) { STBI_FREE(out); return stbi__errpuc("bad masks", "Corrupt BMP"); }
         // right shift amt to put high bit in position #7
         rshift = stbi__high_bit(mr)-7; rcount = stbi__bitcount(mr);
         gshift = stbi__high_bit(mg)-7; gcount = stbi__bitcount(mg);
         bshift = stbi__high_bit(mb)-7; bcount = stbi__bitcount(mb);
         ashift = stbi__high_bit(ma)-7; acount = stbi__bitcount(ma);
      }
      for (j=0; j < (int) s->img_y; ++j) {
         if (easy) {
            for (i=0; i < (int) s->img_x; ++i) {
               unsigned char a;
               out[z+2] = stbi__get8(s);
               out[z+1] = stbi__get8(s);
               out[z+0] = stbi__get8(s);
               z += 3;
               a = (easy == 2 ? stbi__get8(s) : 255);
               if (target == 4) out[z++] = a;
            }
         } else {
            for (i=0; i < (int) s->img_x; ++i) {
               stbi__uint32 v = (stbi__uint32) (bpp == 16 ? stbi__get16le(s) : stbi__get32le(s));
               int a;
               out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mr, rshift, rcount));
               out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mg, gshift, gcount));
               out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mb, bshift, bcount));
               a = (ma ? stbi__shiftsigned(v & ma, ashift, acount) : 255);
               if (target == 4) out[z++] = STBI__BYTECAST(a); 
            }
         }
         stbi__skip(s, pad);
      }
   }
   if (flip_vertically) {
      stbi_uc t;
      for (j=0; j < (int) s->img_y>>1; ++j) {
         stbi_uc *p1 = out +      j     *s->img_x*target;
         stbi_uc *p2 = out + (s->img_y-1-j)*s->img_x*target;
         for (i=0; i < (int) s->img_x*target; ++i) {
            t = p1[i], p1[i] = p2[i], p2[i] = t;
         }
      }
   }

   if (req_comp && req_comp != target) {
      out = stbi__convert_format(out, target, req_comp, s->img_x, s->img_y);
      if (out == NULL) return out; // stbi__convert_format frees input on failure
   }

   *x = s->img_x;
   *y = s->img_y;
   if (comp) *comp = s->img_n;
   return out;
}

// Targa Truevision - TGA
// by Jonathan Dummer

static int stbi__tga_info(stbi__context *s, int *x, int *y, int *comp)
{
    int tga_w, tga_h, tga_comp;
    int sz;
    stbi__get8(s);                   // discard Offset
    sz = stbi__get8(s);              // color type
    if( sz > 1 ) {
        stbi__rewind(s);
        return 0;      // only RGB or indexed allowed
    }
    sz = stbi__get8(s);              // image type
    // only RGB or grey allowed, +/- RLE
    if ((sz != 1) && (sz != 2) && (sz != 3) && (sz != 9) && (sz != 10) && (sz != 11)) {
        stbi__rewind(s);   // rewind on failure, like the other early exits in this function
        return 0;
    }
    stbi__skip(s,9);
    tga_w = stbi__get16le(s);
    if( tga_w < 1 ) {
        stbi__rewind(s);
        return 0;   // test width
    }
    tga_h = stbi__get16le(s);
    if( tga_h < 1 ) {
        stbi__rewind(s);
        return 0;   // test height
    }
    sz = stbi__get8(s);               // bits per pixel
    // only RGB or RGBA or grey allowed
    if ((sz != 8) && (sz != 16) && (sz != 24) && (sz != 32)) {
        stbi__rewind(s);
        return 0;
    }
    tga_comp = sz;
    if (x) *x = tga_w;
    if (y) *y = tga_h;
    if (comp) *comp = tga_comp / 8;
    return 1;                   // seems to have passed everything
}

static int stbi__tga_test(stbi__context *s)
{
   int res;
   int sz;
   stbi__get8(s);      //   discard Offset
   sz = stbi__get8(s);   //   color type
   if ( sz > 1 ) return 0;   //   only RGB or indexed allowed
3910
   sz = stbi__get8(s);   //   image type
3911
   if ( (sz != 1) && (sz != 2) && (sz != 3) && (sz != 9) && (sz != 10) && (sz != 11) ) return 0;   //   only RGB or grey allowed, +/- RLE
3912 3913 3914 3915 3916 3917 3918 3919
   stbi__get16be(s);      //   discard palette start
   stbi__get16be(s);      //   discard palette length
   stbi__get8(s);         //   discard bits per palette color entry
   stbi__get16be(s);      //   discard x origin
   stbi__get16be(s);      //   discard y origin
   if ( stbi__get16be(s) < 1 ) return 0;      //   test width
   if ( stbi__get16be(s) < 1 ) return 0;      //   test height
   sz = stbi__get8(s);   //   bits per pixel
3920 3921 3922 3923 3924
   if ( (sz != 8) && (sz != 16) && (sz != 24) && (sz != 32) )
      res = 0;
   else
      res = 1;
   stbi__rewind(s);
3925 3926 3927
   return res;
}
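
/* Example (not part of stb_image): the checks above walk the fixed 18-byte TGA
   header. What follows is a minimal sketch of the same field layout, read from
   a memory buffer instead of an stbi__context. The example_* names are
   hypothetical and exist only for illustration. All 16-bit fields are
   little-endian.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical struct: one member per field of the 18-byte TGA header.
typedef struct {
   int id_length, colormap_type, image_type;
   int palette_start, palette_len, palette_bits;
   int x_origin, y_origin, width, height;
   int bits_per_pixel, descriptor;
} example_tga_header;

// Reads a little-endian 16-bit value.
static int example_get16le(const unsigned char *p)
{
   return p[0] | (p[1] << 8);
}

// Returns 1 and fills *h if buf holds at least 18 bytes, else returns 0.
static int example_tga_parse_header(const unsigned char *buf, size_t len, example_tga_header *h)
{
   if (len < 18) return 0;
   h->id_length      = buf[0];
   h->colormap_type  = buf[1];
   h->image_type     = buf[2];
   h->palette_start  = example_get16le(buf + 3);
   h->palette_len    = example_get16le(buf + 5);
   h->palette_bits   = buf[7];
   h->x_origin       = example_get16le(buf + 8);
   h->y_origin       = example_get16le(buf + 10);
   h->width          = example_get16le(buf + 12);
   h->height         = example_get16le(buf + 14);
   h->bits_per_pixel = buf[16];
   h->descriptor     = buf[17];
   return 1;
}
```

   Bit 5 of the descriptor byte selects top-to-bottom row order; stbi__tga_load
   derives tga_inverted from that bit.
*/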

static stbi_uc *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
{
   //   read in the TGA header stuff
   int tga_offset = stbi__get8(s);
   int tga_indexed = stbi__get8(s);
   int tga_image_type = stbi__get8(s);
   int tga_is_RLE = 0;
   int tga_palette_start = stbi__get16le(s);
   int tga_palette_len = stbi__get16le(s);
   int tga_palette_bits = stbi__get8(s);
   int tga_x_origin = stbi__get16le(s);
   int tga_y_origin = stbi__get16le(s);
   int tga_width = stbi__get16le(s);
   int tga_height = stbi__get16le(s);
   int tga_bits_per_pixel = stbi__get8(s);
   int tga_comp = tga_bits_per_pixel / 8;
   int tga_inverted = stbi__get8(s);
   //   image data
   unsigned char *tga_data;
   unsigned char *tga_palette = NULL;
   int i, j;
   unsigned char raw_data[4];
   int RLE_count = 0;
   int RLE_repeating = 0;
   int read_next_pixel = 1;

   //   do a tiny bit of processing
   if ( tga_image_type >= 8 )
   {
      tga_image_type -= 8;
      tga_is_RLE = 1;
   }
   /* int tga_alpha_bits = tga_inverted & 15; */
   tga_inverted = 1 - ((tga_inverted >> 5) & 1);

   //   error check
   if ( //(tga_indexed) ||
      (tga_width < 1) || (tga_height < 1) ||
      (tga_image_type < 1) || (tga_image_type > 3) ||
      ((tga_bits_per_pixel != 8) && (tga_bits_per_pixel != 16) &&
      (tga_bits_per_pixel != 24) && (tga_bits_per_pixel != 32))
      )
   {
      return NULL; // we don't report this as a bad TGA because we don't even know if it's TGA
   }

   //   If I'm paletted, then I'll use the number of bits from the palette
   if ( tga_indexed )
   {
      tga_comp = tga_palette_bits / 8;
   }

   //   tga info
   *x = tga_width;
   *y = tga_height;
   if (comp) *comp = tga_comp;

   tga_data = (unsigned char*)stbi__malloc( tga_width * tga_height * tga_comp );
   if (!tga_data) return stbi__errpuc("outofmem", "Out of memory");

   // skip to the data's starting position (offset usually = 0)
   stbi__skip(s, tga_offset );

   if ( !tga_indexed && !tga_is_RLE) {
      for (i=0; i < tga_height; ++i) {
         int y = tga_inverted ? tga_height -i - 1 : i;
         stbi_uc *tga_row = tga_data + y*tga_width*tga_comp;
         stbi__getn(s, tga_row, tga_width * tga_comp);
      }
   } else  {
      //   do I need to load a palette?
      if ( tga_indexed)
      {
         //   any data to skip? (offset usually = 0)
         stbi__skip(s, tga_palette_start );
         //   load the palette
         tga_palette = (unsigned char*)stbi__malloc( tga_palette_len * tga_palette_bits / 8 );
         if (!tga_palette) {
            STBI_FREE(tga_data);
            return stbi__errpuc("outofmem", "Out of memory");
         }
         if (!stbi__getn(s, tga_palette, tga_palette_len * tga_palette_bits / 8 )) {
            STBI_FREE(tga_data);
            STBI_FREE(tga_palette);
            return stbi__errpuc("bad palette", "Corrupt TGA");
         }
      }
      //   load the data
      for (i=0; i < tga_width * tga_height; ++i)
      {
         //   if I'm in RLE mode, do I need to get a RLE chunk?
         if ( tga_is_RLE )
         {
            if ( RLE_count == 0 )
            {
               //   yep, get the next byte as a RLE command
               int RLE_cmd = stbi__get8(s);
               RLE_count = 1 + (RLE_cmd & 127);
               RLE_repeating = RLE_cmd >> 7;
               read_next_pixel = 1;
            } else if ( !RLE_repeating )
            {
               read_next_pixel = 1;
            }
         } else
         {
            read_next_pixel = 1;
         }
         //   OK, if I need to read a pixel, do it now
         if ( read_next_pixel )
         {
            //   load however much data we did have
            if ( tga_indexed )
            {
               //   read in 1 byte, then perform the lookup
               int pal_idx = stbi__get8(s);
               if ( pal_idx >= tga_palette_len )
               {
                  //   invalid index
                  pal_idx = 0;
               }
               //   palette entries are tga_comp bytes wide (tga_palette_bits / 8),
               //   not tga_bits_per_pixel / 8, which is the size of the index itself
               pal_idx *= tga_comp;
               for (j = 0; j < tga_comp && j < (int) sizeof(raw_data); ++j)
               {
                  raw_data[j] = tga_palette[pal_idx+j];
               }
            } else
            {
               //   read in the data raw
               for (j = 0; j*8 < tga_bits_per_pixel; ++j)
               {
                  raw_data[j] = stbi__get8(s);
               }
            }
            //   clear the reading flag for the next pixel
            read_next_pixel = 0;
         } // end of reading a pixel

         // copy data
         for (j = 0; j < tga_comp; ++j)
           tga_data[i*tga_comp+j] = raw_data[j];

         //   in case we're in RLE mode, keep counting down
         --RLE_count;
      }
      //   do I need to invert the image?
      if ( tga_inverted )
      {
         for (j = 0; j*2 < tga_height; ++j)
         {
            int index1 = j * tga_width * tga_comp;
            int index2 = (tga_height - 1 - j) * tga_width * tga_comp;
            for (i = tga_width * tga_comp; i > 0; --i)
            {
               unsigned char temp = tga_data[index1];
               tga_data[index1] = tga_data[index2];
               tga_data[index2] = temp;
               ++index1;
               ++index2;
            }
         }
      }
      //   clear my palette, if I had one
      if ( tga_palette != NULL )
      {
         STBI_FREE( tga_palette );
      }
   }

   // swap RGB
   if (tga_comp >= 3)
   {
      unsigned char* tga_pixel = tga_data;
      for (i=0; i < tga_width * tga_height; ++i)
      {
         unsigned char temp = tga_pixel[0];
         tga_pixel[0] = tga_pixel[2];
         tga_pixel[2] = temp;
         tga_pixel += tga_comp;
      }
   }

   // convert to target component count
   if (req_comp && req_comp != tga_comp)
      tga_data = stbi__convert_format(tga_data, tga_comp, req_comp, tga_width, tga_height);

   //   the things I do to get rid of an error message, and yet keep
   //   Microsoft's C compilers happy... [8^(
   tga_palette_start = tga_palette_len = tga_palette_bits =
         tga_x_origin = tga_y_origin = 0;
   //   OK, done
   return tga_data;
}
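
/* Example (not part of stb_image): the RLE branch of the loader above reads a
   command byte, takes 1 + (cmd & 127) as the pixel count and cmd >> 7 as the
   "repeating" flag. A minimal sketch of that packet format as a standalone
   buffer-to-buffer decoder; example_tga_unrle and its parameters are
   hypothetical, and unlike the loader it reports truncated input.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Expands TGA RLE packets from src into dst, pixel_bytes bytes per pixel.
// Returns the number of pixels written, or -1 if src ends too early.
static int example_tga_unrle(const unsigned char *src, size_t src_len,
                             unsigned char *dst, int max_pixels, int pixel_bytes)
{
   size_t pos = 0;
   size_t pb = (size_t) pixel_bytes;
   int written = 0;
   while (written < max_pixels) {
      int cmd, count, i;
      if (pos >= src_len) return -1;
      cmd = src[pos++];
      count = 1 + (cmd & 127);
      if (count > max_pixels - written) count = max_pixels - written; // clamp to the image
      if (cmd >> 7) {
         // run packet: one pixel value, repeated count times
         if (src_len - pos < pb) return -1;
         for (i = 0; i < count; ++i)
            memcpy(dst + (size_t) (written + i) * pb, src + pos, pb);
         pos += pb;
      } else {
         // raw packet: count pixels stored back to back
         if (src_len - pos < (size_t) count * pb) return -1;
         memcpy(dst + (size_t) written * pb, src + pos, (size_t) count * pb);
         pos += (size_t) count * pb;
      }
      written += count;
   }
   return written;
}
```

   For instance, the bytes 0x82 0x07 expand to three pixels of value 7, and
   0x01 0x01 0x02 to the two literal pixels 1 and 2.
*/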

// *************************************************************************************************
// Photoshop PSD loader -- PD by Thatcher Ulrich, integration by Nicolas Schulz, tweaked by STB

static int stbi__psd_test(stbi__context *s)
{
   int r = (stbi__get32be(s) == 0x38425053);
   stbi__rewind(s);
   return r;
}

static stbi_uc *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
{
   int   pixelCount;
   int channelCount, compression;
   int channel, i, count, len;
   int w,h;
   stbi_uc *out;

   // Check identifier
   if (stbi__get32be(s) != 0x38425053)   // "8BPS"
      return stbi__errpuc("not PSD", "Corrupt PSD image");

   // Check file type version.
   if (stbi__get16be(s) != 1)
      return stbi__errpuc("wrong version", "Unsupported version of PSD image");

   // Skip 6 reserved bytes.
   stbi__skip(s, 6 );

   // Read the number of channels (R, G, B, A, etc).
   channelCount = stbi__get16be(s);
   if (channelCount < 0 || channelCount > 16)
      return stbi__errpuc("wrong channel count", "Unsupported number of channels in PSD image");

   // Read the rows and columns of the image.
   h = stbi__get32be(s);
   w = stbi__get32be(s);

   // Make sure the depth is 8 bits.
   if (stbi__get16be(s) != 8)
      return stbi__errpuc("unsupported bit depth", "PSD bit depth is not 8 bit");

   // Make sure the color mode is RGB.
   // Valid options are:
   //   0: Bitmap
   //   1: Grayscale
   //   2: Indexed color
   //   3: RGB color
   //   4: CMYK color
   //   7: Multichannel
   //   8: Duotone
   //   9: Lab color
   if (stbi__get16be(s) != 3)
      return stbi__errpuc("wrong color format", "PSD is not in RGB color format");

   // Skip the Mode Data.  (It's the palette for indexed color; other info for other modes.)
   stbi__skip(s,stbi__get32be(s) );

   // Skip the image resources.  (resolution, pen tool paths, etc)
   stbi__skip(s, stbi__get32be(s) );

   // Skip the reserved data.
   stbi__skip(s, stbi__get32be(s) );

   // Find out if the data is compressed.
   // Known values:
   //   0: no compression
   //   1: RLE compressed
   compression = stbi__get16be(s);
   if (compression > 1)
      return stbi__errpuc("bad compression", "PSD has an unknown compression format");

   // Create the destination image.
   out = (stbi_uc *) stbi__malloc(4 * w*h);
   if (!out) return stbi__errpuc("outofmem", "Out of memory");
   pixelCount = w*h;

   // Initialize the data to zero.
   //memset( out, 0, pixelCount * 4 );
   
   // Finally, the image data.
   if (compression) {
      // RLE as used by .PSD and .TIFF
      // Loop until you get the number of unpacked bytes you are expecting:
      //     Read the next source byte into n.
      //     If n is between 0 and 127 inclusive, copy the next n+1 bytes literally.
      //     Else if n is between -127 and -1 inclusive, copy the next byte -n+1 times.
      //     Else if n is -128 (byte value 128), noop.
      // Endloop

      // The RLE-compressed data is preceded by a 2-byte data count for each row in the data,
      // which we're going to just skip.
      stbi__skip(s, h * channelCount * 2 );

      // Read the RLE data by channel.
      for (channel = 0; channel < 4; channel++) {
         stbi_uc *p;
         
         p = out+channel;
         if (channel >= channelCount) {
            // Fill this channel with default data.
            for (i = 0; i < pixelCount; i++) *p = (channel == 3 ? 255 : 0), p += 4;
         } else {
            // Read the RLE data.
            count = 0;
            while (count < pixelCount) {
               len = stbi__get8(s);
               if (len == 128) {
                  // No-op.
               } else if (len < 128) {
                  // Copy next len+1 bytes literally.
                  len++;
                  count += len;
                  while (len) {
                     *p = stbi__get8(s);
                     p += 4;
                     len--;
                  }
               } else if (len > 128) {
                  stbi_uc   val;
                  // Next -len+1 bytes in the dest are replicated from next source byte.
                  // (Interpret len as a negative 8-bit int.)
                  len ^= 0x0FF;
                  len += 2;
                  val = stbi__get8(s);
                  count += len;
                  while (len) {
                     *p = val;
                     p += 4;
                     len--;
                  }
               }
            }
         }
      }
      
   } else {
      // We're at the raw image data.  It's each channel in order (Red, Green, Blue, Alpha, ...)
      // where each channel consists of an 8-bit value for each pixel in the image.
      
      // Read the data by channel.
      for (channel = 0; channel < 4; channel++) {
         stbi_uc *p;

         p = out + channel;
         if (channel >= channelCount) {
            // Fill this channel with default data.
            for (i = 0; i < pixelCount; i++) *p = channel == 3 ? 255 : 0, p += 4;
         } else {
            // Read the data.
            for (i = 0; i < pixelCount; i++)
               *p = stbi__get8(s), p += 4;
         }
      }
   }

   if (req_comp && req_comp != 4) {
      out = stbi__convert_format(out, 4, req_comp, w, h);
      if (out == NULL) return out; // stbi__convert_format frees input on failure
   }

   if (comp) *comp = channelCount;
   *y = h;
   *x = w;
   
   return out;
}
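
/* Example (not part of stb_image): the RLE scheme described in the comments of
   stbi__psd_load is the PackBits-style encoding. A minimal sketch of it as a
   contiguous buffer-to-buffer decoder; example_unpackbits is a hypothetical
   name. Unlike the loader above, which writes every fourth byte and trusts the
   run lengths, this sketch writes densely and rejects runs that would overrun
   either buffer.

```c
#include <assert.h>
#include <stddef.h>

// Decodes PackBits data from src until dst_len bytes have been produced.
// Returns dst_len on success, or -1 on truncated input or an overlong run.
static int example_unpackbits(const unsigned char *src, size_t src_len,
                              unsigned char *dst, int dst_len)
{
   size_t pos = 0;
   int count = 0;
   while (count < dst_len) {
      int n, len, i;
      if (pos >= src_len) return -1;
      n = src[pos++];
      if (n == 128) continue;            // no-op byte
      if (n < 128) {
         len = n + 1;                    // literal run of n+1 bytes
         if (len > dst_len - count || src_len - pos < (size_t) len) return -1;
         for (i = 0; i < len; ++i) dst[count++] = src[pos++];
      } else {
         len = 257 - n;                  // same value as (n ^ 0xFF) + 2
         if (len > dst_len - count || pos >= src_len) return -1;
         for (i = 0; i < len; ++i) dst[count++] = src[pos];
         ++pos;
      }
   }
   return count;
}
```

   The byte 0xFE is -2 as a signed 8-bit value, so it replicates the following
   byte -(-2)+1 = 3 times.
*/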

// *************************************************************************************************
// Softimage PIC loader
// by Tom Seddon
//
// See http://softimage.wiki.softimage.com/index.php/INFO:_PIC_file_format
// See http://ozviz.wasp.uwa.edu.au/~pbourke/dataformats/softimagepic/

static int stbi__pic_is4(stbi__context *s,const char *str)
{
   int i;
   for (i=0; i<4; ++i)
      if (stbi__get8(s) != (stbi_uc)str[i])
         return 0;

   return 1;
}

static int stbi__pic_test_core(stbi__context *s)
{
   int i;

   if (!stbi__pic_is4(s,"\x53\x80\xF6\x34"))
      return 0;

   for(i=0;i<84;++i)
      stbi__get8(s);

   if (!stbi__pic_is4(s,"PICT"))
      return 0;

   return 1;
}

typedef struct
{
   stbi_uc size,type,channel;
} stbi__pic_packet;

static stbi_uc *stbi__readval(stbi__context *s, int channel, stbi_uc *dest)
{
   int mask=0x80, i;

   for (i=0; i<4; ++i, mask>>=1) {
      if (channel & mask) {
         if (stbi__at_eof(s)) return stbi__errpuc("bad file","PIC file too short");
         dest[i]=stbi__get8(s);
      }
   }

   return dest;
}

static void stbi__copyval(int channel,stbi_uc *dest,const stbi_uc *src)
{
   int mask=0x80,i;

   for (i=0;i<4; ++i, mask>>=1)
      if (channel&mask)
         dest[i]=src[i];
}
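
/* Example (not part of stb_image): stbi__readval and stbi__copyval both walk a
   4-bit channel mask from 0x80 downward, so bit 0x80 selects R, 0x40 G, 0x20 B
   and 0x10 A of an RGBA pixel. A minimal sketch of that convention on plain
   byte arrays; the example_pic_* names are hypothetical.

```c
#include <assert.h>

// Number of bytes a packet with this channel mask stores per pixel.
static int example_pic_bytes_per_pixel(int channel)
{
   int mask = 0x80, i, n = 0;
   for (i = 0; i < 4; ++i, mask >>= 1)
      if (channel & mask) ++n;
   return n;
}

// Overwrites only the selected channels of an RGBA pixel; the rest keep
// whatever value dest already had.
static void example_pic_copy_masked(int channel, unsigned char *dest, const unsigned char *src)
{
   int mask = 0x80, i;
   for (i = 0; i < 4; ++i, mask >>= 1)
      if (channel & mask) dest[i] = src[i];
}
```

   This is why the PIC loader can pre-fill its RGBA buffer with 0xff: a file
   whose packets never select alpha (mask 0xE0) leaves every pixel opaque.
*/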

static stbi_uc *stbi__pic_load_core(stbi__context *s,int width,int height,int *comp, stbi_uc *result)
{
   int act_comp=0,num_packets=0,y,chained;
   stbi__pic_packet packets[10];

   // this will (should...) cater for even some bizarre stuff like having data
   // for the same channel in multiple packets.
   do {
      stbi__pic_packet *packet;

      if (num_packets==sizeof(packets)/sizeof(packets[0]))
         return stbi__errpuc("bad format","too many packets");

      packet = &packets[num_packets++];

      chained = stbi__get8(s);
      packet->size    = stbi__get8(s);
      packet->type    = stbi__get8(s);
      packet->channel = stbi__get8(s);

      act_comp |= packet->channel;

      if (stbi__at_eof(s))          return stbi__errpuc("bad file","file too short (reading packets)");
      if (packet->size != 8)  return stbi__errpuc("bad format","packet isn't 8bpp");
   } while (chained);

   *comp = (act_comp & 0x10 ? 4 : 3); // has alpha channel?

   for(y=0; y<height; ++y) {
      int packet_idx;

      for(packet_idx=0; packet_idx < num_packets; ++packet_idx) {
         stbi__pic_packet *packet = &packets[packet_idx];
         stbi_uc *dest = result+y*width*4;

         switch (packet->type) {
            default:
               return stbi__errpuc("bad format","packet has bad compression type");

            case 0: {//uncompressed
               int x;

               for(x=0;x<width;++x, dest+=4)
                  if (!stbi__readval(s,packet->channel,dest))
                     return 0;
               break;
            }

            case 1://Pure RLE
               {
                  int left=width, i;

                  while (left>0) {
                     stbi_uc count,value[4];

                     count=stbi__get8(s);
                     if (stbi__at_eof(s))   return stbi__errpuc("bad file","file too short (pure read count)");

                     if (count > left)
                        count = (stbi_uc) left;

                     if (!stbi__readval(s,packet->channel,value))  return 0;

                     for(i=0; i<count; ++i,dest+=4)
                        stbi__copyval(packet->channel,dest,value);
                     left -= count;
                  }
               }
               break;

            case 2: {//Mixed RLE
               int left=width;
               while (left>0) {
                  int count = stbi__get8(s), i;
                  if (stbi__at_eof(s))  return stbi__errpuc("bad file","file too short (mixed read count)");

                  if (count >= 128) { // Repeated
                     stbi_uc value[4];
                     int i;

                     if (count==128)
                        count = stbi__get16be(s);
                     else
                        count -= 127;
                     if (count > left)
                        return stbi__errpuc("bad file","scanline overrun");

                     if (!stbi__readval(s,packet->channel,value))
                        return 0;

                     for(i=0;i<count;++i, dest += 4)
                        stbi__copyval(packet->channel,dest,value);
                  } else { // Raw
                     ++count;
                     if (count>left) return stbi__errpuc("bad file","scanline overrun");

                     for(i=0;i<count;++i, dest+=4)
                        if (!stbi__readval(s,packet->channel,dest))
                           return 0;
                  }
                  left-=count;
               }
               break;
            }
         }
      }
   }

   return result;
}

static stbi_uc *stbi__pic_load(stbi__context *s,int *px,int *py,int *comp,int req_comp)
{
   stbi_uc *result;
   int i, x,y;

   for (i=0; i<92; ++i)
      stbi__get8(s);

   x = stbi__get16be(s);
   y = stbi__get16be(s);
   if (stbi__at_eof(s))  return stbi__errpuc("bad file","file too short (pic header)");
   // guard the division: a zero width must not divide by zero
   if (x != 0 && (1 << 28) / x < y) return stbi__errpuc("too large", "Image too large to decode");

   stbi__get32be(s); //skip `ratio'
   stbi__get16be(s); //skip `fields'
   stbi__get16be(s); //skip `pad'

   // intermediate buffer is RGBA
   result = (stbi_uc *) stbi__malloc(x*y*4);
   if (!result) return stbi__errpuc("outofmem", "Out of memory");
   memset(result, 0xff, x*y*4);

   if (!stbi__pic_load_core(s,x,y,comp, result)) {
      STBI_FREE(result);
      return 0; // failure reason already set; don't hand a NULL buffer to the converter
   }
   *px = x;
   *py = y;
   if (req_comp == 0) req_comp = *comp;
   result=stbi__convert_format(result,4,req_comp,x,y);

   return result;
}

static int stbi__pic_test(stbi__context *s)
{
   int r = stbi__pic_test_core(s);
   stbi__rewind(s);
   return r;
}

// *************************************************************************************************
// GIF loader -- public domain by Jean-Marc Lienher -- simplified/shrunk by stb
typedef struct
{
   stbi__int16 prefix;
   stbi_uc first;
   stbi_uc suffix;
} stbi__gif_lzw;

typedef struct
{
   int w,h;
   stbi_uc *out;                 // output buffer (always 4 components)
   int flags, bgindex, ratio, transparent, eflags;
   stbi_uc  pal[256][4];
   stbi_uc lpal[256][4];
   stbi__gif_lzw codes[4096];
   stbi_uc *color_table;
   int parse, step;
   int lflags;
   int start_x, start_y;
   int max_x, max_y;
   int cur_x, cur_y;
   int line_size;
} stbi__gif;

static int stbi__gif_test_raw(stbi__context *s)
{
   int sz;
   if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8') return 0;
   sz = stbi__get8(s);
   if (sz != '9' && sz != '7') return 0;
   if (stbi__get8(s) != 'a') return 0;
   return 1;
}

static int stbi__gif_test(stbi__context *s)
{
   int r = stbi__gif_test_raw(s);
   stbi__rewind(s);
   return r;
}

static void stbi__gif_parse_colortable(stbi__context *s, stbi_uc pal[256][4], int num_entries, int transp)
{
   int i;
   for (i=0; i < num_entries; ++i) {
      pal[i][2] = stbi__get8(s);
      pal[i][1] = stbi__get8(s);
      pal[i][0] = stbi__get8(s);
      pal[i][3] = transp == i ? 0 : 255;   // only the transparent index is cleared; transp == -1 means none
   }   
}

static int stbi__gif_header(stbi__context *s, stbi__gif *g, int *comp, int is_info)
{
   stbi_uc version;
   if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8')
      return stbi__err("not GIF", "Corrupt GIF");

   version = stbi__get8(s);
   if (version != '7' && version != '9')    return stbi__err("not GIF", "Corrupt GIF");
   if (stbi__get8(s) != 'a')                      return stbi__err("not GIF", "Corrupt GIF");

   stbi__g_failure_reason = "";
   g->w = stbi__get16le(s);
   g->h = stbi__get16le(s);
   g->flags = stbi__get8(s);
   g->bgindex = stbi__get8(s);
   g->ratio = stbi__get8(s);
   g->transparent = -1;

   if (comp != 0) *comp = 4;  // can't actually tell whether it's 3 or 4 until we parse the comments

   if (is_info) return 1;

   if (g->flags & 0x80)
      stbi__gif_parse_colortable(s,g->pal, 2 << (g->flags & 7), -1);

   return 1;
}
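
/* Example (not part of stb_image): stbi__gif_header reads the 6-byte signature
   followed by the 7-byte logical screen descriptor, and sizes the global color
   table as 2 << (flags & 7) entries when flag bit 0x80 is set. A minimal sketch
   of the same parse over a memory buffer; example_gif_screen and
   example_gif_parse_screen are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
   int w, h;
   int has_global_table;   // flags bit 0x80
   int table_entries;      // 2 << (flags & 7); only meaningful if the table is present
   int bgindex;
} example_gif_screen;

// Returns 1 and fills *out if buf starts with a GIF87a/GIF89a header, else 0.
static int example_gif_parse_screen(const unsigned char *buf, size_t len, example_gif_screen *out)
{
   int flags;
   if (len < 13) return 0;
   if (memcmp(buf, "GIF8", 4) != 0) return 0;
   if (buf[4] != '7' && buf[4] != '9') return 0;
   if (buf[5] != 'a') return 0;
   out->w = buf[6] | (buf[7] << 8);     // little-endian, like stbi__get16le
   out->h = buf[8] | (buf[9] << 8);
   flags = buf[10];
   out->has_global_table = (flags & 0x80) != 0;
   out->table_entries = 2 << (flags & 7);
   out->bgindex = buf[11];
   return 1;                            // buf[12] is the aspect ratio, unused here
}
```

   With flags = 0xF7 the low three bits are 7, giving 2 << 7 = 256 palette
   entries, the largest table the format allows.
*/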

static int stbi__gif_info_raw(stbi__context *s, int *x, int *y, int *comp)
{
   stbi__gif g;
   if (!stbi__gif_header(s, &g, comp, 1)) {
      stbi__rewind( s );
      return 0;
   }
   if (x) *x = g.w;
   if (y) *y = g.h;
   return 1;
}

static void stbi__out_gif_code(stbi__gif *g, stbi__uint16 code)
{
   stbi_uc *p, *c;

   // recurse to decode the prefixes, since the linked-list is backwards,
   // and working backwards through an interlaced image would be nasty
   if (g->codes[code].prefix >= 0)
      stbi__out_gif_code(g, g->codes[code].prefix);

   if (g->cur_y >= g->max_y) return;
  
   p = &g->out[g->cur_x + g->cur_y];
   c = &g->color_table[g->codes[code].suffix * 4];

   if (c[3] >= 128) {
      p[0] = c[2];
      p[1] = c[1];
      p[2] = c[0];
      p[3] = c[3];
   }
   g->cur_x += 4;

   if (g->cur_x >= g->max_x) {
      g->cur_x = g->start_x;
      g->cur_y += g->step;

      while (g->cur_y >= g->max_y && g->parse > 0) {
         g->step = (1 << g->parse) * g->line_size;
         g->cur_y = g->start_y + (g->step >> 1);
         --g->parse;
      }
   }
}
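
/* Example (not part of stb_image): when the interlace flag is set, the row
   stepping in stbi__out_gif_code starts with a spacing of 8 rows and, each time
   it runs off the bottom, restarts at half the next spacing (rows 4, then 2,
   then 1). A minimal sketch of the resulting row order, working in whole rows
   rather than byte offsets; example_gif_interlace_order is a hypothetical name.

```c
#include <assert.h>

// Fills rows[0..height-1] with the order in which an interlaced GIF stores its
// rows, and returns the number of entries written.
static int example_gif_interlace_order(int height, int *rows)
{
   int n = 0, step = 8, parse = 3, y = 0;
   while (n < height) {
      rows[n++] = y;
      y += step;
      // ran past the last row: start the next, finer pass
      while (y >= height && parse > 0) {
         step = 1 << parse;
         y = step >> 1;
         --parse;
      }
   }
   return n;
}
```

   For a 10-row image this yields 0, 8, 4, 2, 6, 1, 3, 5, 7, 9.
*/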

static stbi_uc *stbi__process_gif_raster(stbi__context *s, stbi__gif *g)
{
   stbi_uc lzw_cs;
   stbi__int32 len, code;
   stbi__uint32 first;
   stbi__int32 codesize, codemask, avail, oldcode, bits, valid_bits, clear;
   stbi__gif_lzw *p;

   lzw_cs = stbi__get8(s);
   clear = 1 << lzw_cs;
   first = 1;
   codesize = lzw_cs + 1;
   codemask = (1 << codesize) - 1;
   bits = 0;
   valid_bits = 0;
   for (code = 0; code < clear; code++) {
      g->codes[code].prefix = -1;
      g->codes[code].first = (stbi_uc) code;
      g->codes[code].suffix = (stbi_uc) code;
   }

   // support no starting clear code
   avail = clear+2;
   oldcode = -1;

   len = 0;
   for(;;) {
      if (valid_bits < codesize) {
         if (len == 0) {
            len = stbi__get8(s); // start new block
            if (len == 0) 
               return g->out;
         }
         --len;
         bits |= (stbi__int32) stbi__get8(s) << valid_bits;
         valid_bits += 8;
      } else {
         stbi__int32 code = bits & codemask;
         bits >>= codesize;
         valid_bits -= codesize;
         // @OPTIMIZE: is there some way we can accelerate the non-clear path?
         if (code == clear) {  // clear code
            codesize = lzw_cs + 1;
            codemask = (1 << codesize) - 1;
            avail = clear + 2;
            oldcode = -1;
            first = 0;
         } else if (code == clear + 1) { // end of stream code
            stbi__skip(s, len);
            while ((len = stbi__get8(s)) > 0)
               stbi__skip(s,len);
            return g->out;
         } else if (code <= avail) {
            if (first) return stbi__errpuc("no clear code", "Corrupt GIF");

            if (oldcode >= 0) {
               p = &g->codes[avail++];
               if (avail > 4096)        return stbi__errpuc("too many codes", "Corrupt GIF");
               p->prefix = (stbi__int16) oldcode;
               p->first = g->codes[oldcode].first;
               p->suffix = (code == avail) ? p->first : g->codes[code].first;
            } else if (code == avail)
               return stbi__errpuc("illegal code in raster", "Corrupt GIF");

            stbi__out_gif_code(g, (stbi__uint16) code);

            if ((avail & codemask) == 0 && avail <= 0x0FFF) {
               codesize++;
               codemask = (1 << codesize) - 1;
            }

            oldcode = code;
         } else {
            return stbi__errpuc("illegal code in raster", "Corrupt GIF");
         }
      } 
   }
}

static void stbi__fill_gif_background(stbi__gif *g)
{
   int i;
   stbi_uc *c = g->pal[g->bgindex];
   // @OPTIMIZE: write a dword at a time
   for (i = 0; i < g->w * g->h * 4; i += 4) {
      stbi_uc *p  = &g->out[i];
      p[0] = c[2];
      p[1] = c[1];
      p[2] = c[0];
      p[3] = c[3];
   }
}

// this function is designed to support animated gifs, although stb_image doesn't support it
static stbi_uc *stbi__gif_load_next(stbi__context *s, stbi__gif *g, int *comp, int req_comp)
{
   int i;
   stbi_uc *old_out = 0;

   if (g->out == 0) {
      if (!stbi__gif_header(s, g, comp,0))     return 0; // stbi__g_failure_reason set by stbi__gif_header
      g->out = (stbi_uc *) stbi__malloc(4 * g->w * g->h);
      if (g->out == 0)                      return stbi__errpuc("outofmem", "Out of memory");
      stbi__fill_gif_background(g);
   } else {
      // animated-gif-only path
      if (((g->eflags & 0x1C) >> 2) == 3) {
         old_out = g->out;
         g->out = (stbi_uc *) stbi__malloc(4 * g->w * g->h);
         if (g->out == 0)                   return stbi__errpuc("outofmem", "Out of memory");
         memcpy(g->out, old_out, g->w*g->h*4);
      }
   }
    
   for (;;) {
      switch (stbi__get8(s)) {
         case 0x2C: /* Image Descriptor */
         {
            stbi__int32 x, y, w, h;
            stbi_uc *o;

            x = stbi__get16le(s);
            y = stbi__get16le(s);
            w = stbi__get16le(s);
            h = stbi__get16le(s);
            if (((x + w) > (g->w)) || ((y + h) > (g->h)))
               return stbi__errpuc("bad Image Descriptor", "Corrupt GIF");

            g->line_size = g->w * 4;
            g->start_x = x * 4;
            g->start_y = y * g->line_size;
            g->max_x   = g->start_x + w * 4;
            g->max_y   = g->start_y + h * g->line_size;
            g->cur_x   = g->start_x;
            g->cur_y   = g->start_y;

            g->lflags = stbi__get8(s);

            if (g->lflags & 0x40) {
               g->step = 8 * g->line_size; // first interlaced spacing
               g->parse = 3;
            } else {
               g->step = g->line_size;
               g->parse = 0;
            }

            if (g->lflags & 0x80) {
               stbi__gif_parse_colortable(s,g->lpal, 2 << (g->lflags & 7), g->eflags & 0x01 ? g->transparent : -1);
               g->color_table = (stbi_uc *) g->lpal;
            } else if (g->flags & 0x80) {
               for (i=0; i < 256; ++i)  // @OPTIMIZE: reset only the previous transparent
                  g->pal[i][3] = 255;
               if (g->transparent >= 0 && (g->eflags & 0x01))
                  g->pal[g->transparent][3] = 0;
               g->color_table = (stbi_uc *) g->pal;
            } else
               return stbi__errpuc("missing color table", "Corrupt GIF");

            o = stbi__process_gif_raster(s, g);
            if (o == NULL) return NULL;

            if (req_comp && req_comp != 4)
               o = stbi__convert_format(o, 4, req_comp, g->w, g->h);
            return o;
         }

         case 0x21: // Extension Introducer (Graphic Control, Comment, etc.)
         {
            int len;
            if (stbi__get8(s) == 0xF9) { // Graphic Control Extension.
               len = stbi__get8(s);
               if (len == 4) {
                  g->eflags = stbi__get8(s);
                  stbi__get16le(s); // delay
                  g->transparent = stbi__get8(s);
               } else {
                  stbi__skip(s, len);
                  break;
               }
            }
            // skip the remaining data sub-blocks; a zero length byte ends the extension
            while ((len = stbi__get8(s)) != 0)
               stbi__skip(s, len);
            break;
         }

         case 0x3B: // gif stream termination code
            return (stbi_uc *) s; // using '1' causes warning on some compilers

         default:
            return stbi__errpuc("unknown code", "Corrupt GIF");
      }
   }
}

static stbi_uc *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
{
   stbi_uc *u = 0;
   stbi__gif g;
   memset(&g, 0, sizeof(g));

   u = stbi__gif_load_next(s, &g, comp, req_comp);
   if (u == (stbi_uc *) s) u = 0;  // end of animated gif marker
   if (u) {
      *x = g.w;
      *y = g.h;
   }

   return u;
}

static int stbi__gif_info(stbi__context *s, int *x, int *y, int *comp)
{
   return stbi__gif_info_raw(s,x,y,comp);
}


// *************************************************************************************************
// Radiance RGBE HDR loader
// originally by Nicolas Schulz
#ifndef STBI_NO_HDR
static int stbi__hdr_test_core(stbi__context *s)
{
   const char *signature = "#?RADIANCE\n";
   int i;
   for (i=0; signature[i]; ++i)
      if (stbi__get8(s) != signature[i])
         return 0;
   return 1;
}

static int stbi__hdr_test(stbi__context* s)
{
   int r = stbi__hdr_test_core(s);
   stbi__rewind(s);
   return r;
}

#define STBI__HDR_BUFLEN  1024
// reads one line into buffer (at most STBI__HDR_BUFLEN-1 characters, newline
// dropped); the rest of an over-long line is discarded
static char *stbi__hdr_gettoken(stbi__context *z, char *buffer)
{
   int len=0;
   char c = '\0';

   c = (char) stbi__get8(z);

   while (!stbi__at_eof(z) && c != '\n') {
      buffer[len++] = c;
      if (len == STBI__HDR_BUFLEN-1) {
         // flush to end of line
         while (!stbi__at_eof(z) && stbi__get8(z) != '\n')
            ;
         break;
      }
      c = (char) stbi__get8(z);
   }

   buffer[len] = 0;
   return buffer;
}

static void stbi__hdr_convert(float *output, stbi_uc *input, int req_comp)
{
   if ( input[3] != 0 ) {
      float f1;
      // Exponent
      f1 = (float) ldexp(1.0f, input[3] - (int)(128 + 8));
      if (req_comp <= 2)
         output[0] = (input[0] + input[1] + input[2]) * f1 / 3;
      else {
         output[0] = input[0] * f1;
         output[1] = input[1] * f1;
         output[2] = input[2] * f1;
      }
      if (req_comp == 2) output[1] = 1;
      if (req_comp == 4) output[3] = 1;
   } else {
      switch (req_comp) {
         case 4: output[3] = 1; /* fallthrough */
         case 3: output[0] = output[1] = output[2] = 0;
                 break;
         case 2: output[1] = 1; /* fallthrough */
         case 1: output[0] = 0;
                 break;
      }
   }
}
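
// Worked example of the RGBE decode in stbi__hdr_convert above (a sketch for
// illustration only; rgbe_to_float is a hypothetical stand-alone helper, not
// part of stb_image). Each mantissa byte is scaled by 2^(E - 136), i.e.
// 2^(E - 128) for the shared exponent times 2^-8 for the 8-bit mantissa.
#if 0
#include <math.h>
#include <stdio.h>

static void rgbe_to_float(float out[3], const unsigned char rgbe[4])
{
   if (rgbe[3] != 0) {
      float f1 = (float) ldexp(1.0, rgbe[3] - (128 + 8));
      out[0] = rgbe[0] * f1;
      out[1] = rgbe[1] * f1;
      out[2] = rgbe[2] * f1;
   } else {
      out[0] = out[1] = out[2] = 0;   // exponent byte 0 encodes black
   }
}

int main(void)
{
   // E = 129, so the scale is 2^(129 - 136) = 1/128
   const unsigned char rgbe[4] = { 128, 64, 32, 129 };
   float rgb[3];
   rgbe_to_float(rgb, rgbe);
   printf("%g %g %g\n", rgb[0], rgb[1], rgb[2]);   // prints "1 0.5 0.25"
   return 0;
}
#endif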

static float *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
{
   char buffer[STBI__HDR_BUFLEN];
   char *token;
   int valid = 0;
   int width, height;
   stbi_uc *scanline;
   float *hdr_data;
   int len;
   unsigned char count, value;
   int i, j, k, c1,c2, z;


   // Check identifier
   if (strcmp(stbi__hdr_gettoken(s,buffer), "#?RADIANCE") != 0)
      return stbi__errpf("not HDR", "Corrupt HDR image");

   // Parse header
   for(;;) {
      token = stbi__hdr_gettoken(s,buffer);
      if (token[0] == 0) break;
      if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
   }

   if (!valid)    return stbi__errpf("unsupported format", "Unsupported HDR format");

   // Parse width and height
   // can't use sscanf() if we're not using stdio!
   token = stbi__hdr_gettoken(s,buffer);
   if (strncmp(token, "-Y ", 3))  return stbi__errpf("unsupported data layout", "Unsupported HDR format");
   token += 3;
   height = (int) strtol(token, &token, 10);
   while (*token == ' ') ++token;
   if (strncmp(token, "+X ", 3))  return stbi__errpf("unsupported data layout", "Unsupported HDR format");
   token += 3;
   width = (int) strtol(token, NULL, 10);

   *x = width;
   *y = height;

   if (comp) *comp = 3;
   if (req_comp == 0) req_comp = 3;

   // Read data
   hdr_data = (float *) stbi__malloc(height * width * req_comp * sizeof(float));
   if (hdr_data == NULL) return stbi__errpf("outofmem", "Out of memory");

   // Load image data
   // image data is stored as some number of scanlines
   if ( width < 8 || width >= 32768) {
      // Read flat data
      for (j=0; j < height; ++j) {
         for (i=0; i < width; ++i) {
            stbi_uc rgbe[4];
           main_decode_loop:
            stbi__getn(s, rgbe, 4);
            stbi__hdr_convert(hdr_data + j * width * req_comp + i * req_comp, rgbe, req_comp);
         }
      }
   } else {
      // Read RLE-encoded data
      scanline = NULL;

      for (j = 0; j < height; ++j) {
         // an RLE scanline starts with the bytes 2, 2 and a 16-bit big-endian width
         c1 = stbi__get8(s);
         c2 = stbi__get8(s);
         len = stbi__get8(s);
         if (c1 != 2 || c2 != 2 || (len & 0x80)) {
            // not run-length encoded, so we have to actually use THIS data as a decoded
            // pixel (note this can't be a valid pixel--one of RGB must be >= 128)
            stbi_uc rgbe[4];
            rgbe[0] = (stbi_uc) c1;
            rgbe[1] = (stbi_uc) c2;
            rgbe[2] = (stbi_uc) len;
            rgbe[3] = (stbi_uc) stbi__get8(s);
            stbi__hdr_convert(hdr_data, rgbe, req_comp);
            i = 1;
            j = 0;
            STBI_FREE(scanline);
            goto main_decode_loop; // yes, this makes no sense
         }
         len <<= 8;
         len |= stbi__get8(s);
         if (len != width) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("invalid decoded scanline length", "corrupt HDR"); }
         if (scanline == NULL) {
            scanline = (stbi_uc *) stbi__malloc(width * 4);
            if (scanline == NULL) { STBI_FREE(hdr_data); return stbi__errpf("outofmem", "Out of memory"); }
         }

         // each of the four channels (R, G, B, E) is run-length coded separately
         for (k = 0; k < 4; ++k) {
            i = 0;
            while (i < width) {
               count = stbi__get8(s);
               if (count > 128) {
                  // Run
                  value = stbi__get8(s);
                  count -= 128;
                  // reject runs that would write past the end of the scanline
                  if (count > width - i) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("corrupt", "bad RLE data in HDR"); }
                  for (z = 0; z < count; ++z)
                     scanline[i++ * 4 + k] = value;
               } else {
                  // Dump
                  // a zero count makes no progress; an oversized one overruns the scanline
                  if (count == 0 || count > width - i) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("corrupt", "bad RLE data in HDR"); }
                  for (z = 0; z < count; ++z)
                     scanline[i++ * 4 + k] = stbi__get8(s);
               }
            }
         }
         for (i=0; i < width; ++i)
            stbi__hdr_convert(hdr_data+(j*width + i)*req_comp, scanline + i*4, req_comp);
      }
      STBI_FREE(scanline);
   }

   return hdr_data;
}
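
// Sketch of the per-channel run-length scheme decoded in stbi__hdr_load above
// (illustration only; rle_decode_channel is a hypothetical helper that reads
// from a memory buffer rather than an stbi__context). A count byte above 128
// means "repeat the next byte (count - 128) times"; otherwise it means "copy
// the next count bytes literally".
#if 0
#include <stddef.h>

// Decodes one channel of 'width' bytes from src into dst.
// Returns the number of source bytes consumed, or 0 on malformed input.
static size_t rle_decode_channel(unsigned char *dst, int width,
                                 const unsigned char *src, size_t src_len)
{
   size_t pos = 0;
   int i = 0;
   while (i < width) {
      int count, z;
      if (pos >= src_len) return 0;
      count = src[pos++];
      if (count > 128) {
         // run: one value byte repeated (count - 128) times
         count -= 128;
         if (count > width - i || pos >= src_len) return 0;
         for (z = 0; z < count; ++z) dst[i++] = src[pos];
         ++pos;
      } else {
         // literal dump: count bytes copied as-is
         if (count == 0 || count > width - i || (size_t) count > src_len - pos) return 0;
         for (z = 0; z < count; ++z) dst[i++] = src[pos++];
      }
   }
   return pos;
}

// Usage: { 0x83, 7, 0x02, 1, 2 } decodes to 7 7 7 1 2 for width 5.
#endif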

static int stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp)
{
   char buffer[STBI__HDR_BUFLEN];
   char *token;
   int valid = 0;

   if (strcmp(stbi__hdr_gettoken(s,buffer), "#?RADIANCE") != 0) {
      stbi__rewind( s );
      return 0;
   }

   for(;;) {
      token = stbi__hdr_gettoken(s,buffer);
      if (token[0] == 0) break;
      if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
   }

   if (!valid) {
      stbi__rewind( s );
      return 0;
   }
   token = stbi__hdr_gettoken(s,buffer);
   if (strncmp(token, "-Y ", 3)) {
      stbi__rewind( s );
      return 0;
   }
   token += 3;
   *y = (int) strtol(token, &token, 10);
   while (*token == ' ') ++token;
   if (strncmp(token, "+X ", 3)) {
      stbi__rewind( s );
      return 0;
   }
   token += 3;
   *x = (int) strtol(token, NULL, 10);
   *comp = 3;
   return 1;
}
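
// Sketch of the resolution-line parse used by the HDR loader and
// stbi__hdr_info above (illustration only; parse_hdr_resolution is a
// hypothetical helper). Only the "-Y <height> +X <width>" layout is
// accepted, as in the code above.
#if 0
#include <stdlib.h>
#include <string.h>

// Returns 1 and fills *w and *h on success, 0 for any other layout.
static int parse_hdr_resolution(const char *line, int *w, int *h)
{
   char *end;
   if (strncmp(line, "-Y ", 3)) return 0;
   *h = (int) strtol(line + 3, &end, 10);
   while (*end == ' ') ++end;
   if (strncmp(end, "+X ", 3)) return 0;
   *w = (int) strtol(end + 3, NULL, 10);
   return 1;
}

// Usage: parse_hdr_resolution("-Y 480 +X 640", &w, &h) yields w = 640, h = 480.
#endif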
#endif // STBI_NO_HDR

static int stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp)
{
   int hsz;
   if (stbi__get8(s) != 'B' || stbi__get8(s) != 'M') {
      stbi__rewind( s );
      return 0;
   }
   stbi__skip(s,12);        // file size, reserved fields, pixel data offset
   hsz = stbi__get32le(s);  // info header size identifies the BMP variant
   if (hsz != 12 && hsz != 40 && hsz != 56 && hsz != 108 && hsz != 124) {
      stbi__rewind( s );
      return 0;
   }
   if (hsz == 12) {
      *x = stbi__get16le(s);
      *y = stbi__get16le(s);
   } else {
      *x = stbi__get32le(s);
      *y = stbi__get32le(s);
   }
   if (stbi__get16le(s) != 1) {   // number of color planes must be 1
      stbi__rewind( s );
      return 0;
   }
   *comp = stbi__get16le(s) / 8;  // bits per pixel -> bytes per pixel
   return 1;
}

static int stbi__psd_info(stbi__context *s, int *x, int *y, int *comp)
{
   int channelCount;
   if (stbi__get32be(s) != 0x38425053) {   // signature "8BPS"
      stbi__rewind( s );
      return 0;
   }
   if (stbi__get16be(s) != 1) {            // file version
      stbi__rewind( s );
      return 0;
   }
   stbi__skip(s, 6);                       // reserved bytes
   channelCount = stbi__get16be(s);
   if (channelCount < 0 || channelCount > 16) {
      stbi__rewind( s );
      return 0;
   }
   *y = stbi__get32be(s);
   *x = stbi__get32be(s);
   if (stbi__get16be(s) != 8) {            // only 8 bits per channel
      stbi__rewind( s );
      return 0;
   }
   if (stbi__get16be(s) != 3) {            // only color mode 3 (RGB)
      stbi__rewind( s );
      return 0;
   }
   *comp = 4;
   return 1;
}

static int stbi__pic_info(stbi__context *s, int *x, int *y, int *comp)
{
   int act_comp=0,num_packets=0,chained;
   stbi__pic_packet packets[10];

   // every failure path rewinds so that the next format probe in
   // stbi__info_main starts at the beginning of the stream
   stbi__skip(s, 92);

   *x = stbi__get16be(s);
   *y = stbi__get16be(s);
   if (stbi__at_eof(s)) {
      stbi__rewind( s );
      return 0;
   }
   if ( (*x) != 0 && (1 << 28) / (*x) < (*y)) {
      stbi__rewind( s );
      return 0;
   }

   stbi__skip(s, 8);

   do {
      stbi__pic_packet *packet;

      if (num_packets==sizeof(packets)/sizeof(packets[0])) {
         stbi__rewind( s );
         return 0;
      }

      packet = &packets[num_packets++];
      chained = stbi__get8(s);
      packet->size    = stbi__get8(s);
      packet->type    = stbi__get8(s);
      packet->channel = stbi__get8(s);
      act_comp |= packet->channel;

      if (stbi__at_eof(s)) {
         stbi__rewind( s );
         return 0;
      }
      if (packet->size != 8) {
         stbi__rewind( s );
         return 0;
      }
   } while (chained);

   *comp = (act_comp & 0x10 ? 4 : 3);

   return 1;
}

static int stbi__info_main(stbi__context *s, int *x, int *y, int *comp)
{
   if (stbi__jpeg_info(s, x, y, comp))
      return 1;
   if (stbi__png_info(s, x, y, comp))
      return 1;
   if (stbi__gif_info(s, x, y, comp))
      return 1;
   if (stbi__bmp_info(s, x, y, comp))
      return 1;
   if (stbi__psd_info(s, x, y, comp))
      return 1;
   if (stbi__pic_info(s, x, y, comp))
      return 1;
   #ifndef STBI_NO_HDR
   if (stbi__hdr_info(s, x, y, comp))
      return 1;
   #endif
   // test tga last because it's a crappy test!
   if (stbi__tga_info(s, x, y, comp))
      return 1;
   return stbi__err("unknown image type", "Image not of any known type, or corrupt");
}

#ifndef STBI_NO_STDIO
STBIDEF int stbi_info(char const *filename, int *x, int *y, int *comp)
{
   FILE *f = stbi__fopen(filename, "rb");
   int result;
   if (!f) return stbi__err("can't fopen", "Unable to open file");
   result = stbi_info_from_file(f, x, y, comp);
   fclose(f);
   return result;
}

STBIDEF int stbi_info_from_file(FILE *f, int *x, int *y, int *comp)
{
   int r;
   stbi__context s;
   long pos = ftell(f);
   stbi__start_file(&s, f);
   r = stbi__info_main(&s,x,y,comp);
   fseek(f,pos,SEEK_SET);  // restore the caller's file position
   return r;
}
#endif // !STBI_NO_STDIO

STBIDEF int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp)
{
   stbi__context s;
   stbi__start_mem(&s,buffer,len);
   return stbi__info_main(&s,x,y,comp);
}

STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const *c, void *user, int *x, int *y, int *comp)
{
   stbi__context s;
   stbi__start_callbacks(&s, (stbi_io_callbacks *) c, user);
   return stbi__info_main(&s,x,y,comp);
}
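
// Usage sketch for the info entry points above (illustration only;
// print_image_info is a hypothetical caller, and 'data'/'size' stand for an
// encoded image already held in memory). The functions return 1 and fill in
// the dimensions on success, or 0 if no decoder recognizes the data.
#if 0
#include <stdio.h>

static void print_image_info(const unsigned char *data, int size)
{
   int w, h, comp;
   if (stbi_info_from_memory(data, size, &w, &h, &comp))
      printf("%d x %d, %d channel(s)\n", w, h, comp);
   else
      printf("not a supported image: %s\n", stbi_failure_reason());
}
#endif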

#endif // STB_IMAGE_IMPLEMENTATION

/*
   revision history:
      1.48 (2014-12-14) fix incorrectly-named assert()
      1.47 (2014-12-14) 1/2/4-bit PNG support, both direct and paletted (Omar Cornut & stb)
                        optimize PNG (ryg)
                        fix bug in interlaced PNG with user-specified channel count (stb)
      1.46 (2014-08-26)
             fix broken tRNS chunk (colorkey-style transparency) in non-paletted PNG
      1.45 (2014-08-16)
             fix MSVC-ARM internal compiler error by wrapping malloc
      1.44 (2014-08-07)
             various warning fixes from Ronny Chevalier
      1.43 (2014-07-15)
             fix MSVC-only compiler problem in code changed in 1.42
      1.42 (2014-07-09)
             don't define _CRT_SECURE_NO_WARNINGS (affects user code)
             fixes to stbi__cleanup_jpeg path
             added STBI_ASSERT to avoid requiring assert.h
      1.41 (2014-06-25)
             fix search&replace from 1.36 that messed up comments/error messages
      1.40 (2014-06-22)
             fix gcc struct-initialization warning
      1.39 (2014-06-15)
             fix to TGA optimization when req_comp != number of components in TGA;
             fix to GIF loading because BMP wasn't rewinding (whoops, no GIFs in my test suite)
             add support for BMP version 5 (more ignored fields)
      1.38 (2014-06-06)
             suppress MSVC warnings on integer casts truncating values
             fix accidental rename of 'skip' field of I/O
      1.37 (2014-06-04)
             remove duplicate typedef
      1.36 (2014-06-03)
             convert to header file single-file library
             if de-iphone isn't set, load iphone images color-swapped instead of returning NULL
      1.35 (2014-05-27)
             various warnings
             fix broken STBI_SIMD path
             fix bug where stbi_load_from_file no longer left file pointer in correct place
             fix broken non-easy path for 32-bit BMP (possibly never used)
             TGA optimization by Arseny Kapoulkine
      1.34 (unknown)
             use STBI_NOTUSED in stbi__resample_row_generic(), fix one more leak in tga failure case
      1.33 (2011-07-14)
             make stbi_is_hdr work in STBI_NO_HDR (as specified), minor compiler-friendly improvements
      1.32 (2011-07-13)
             support for "info" function for all supported filetypes (SpartanJ)
      1.31 (2011-06-20)
             a few more leak fixes, bug in PNG handling (SpartanJ)
      1.30 (2011-06-11)
             added ability to load files via callbacks to accommodate custom input streams (Ben Wenger)
             removed deprecated format-specific test/load functions
             removed support for installable file formats (stbi_loader) -- would have been broken for IO callbacks anyway
             error cases in bmp and tga give messages and don't leak (Raymond Barbiero, grisha)
             fix inefficiency in decoding 32-bit BMP (David Woo)
      1.29 (2010-08-16)
             various warning fixes from Aurelien Pocheville 
      1.28 (2010-08-01)
             fix bug in GIF palette transparency (SpartanJ)
      1.27 (2010-08-01)
             cast-to-stbi_uc to fix warnings
      1.26 (2010-07-24)
             fix bug in file buffering for PNG reported by SpartanJ
      1.25 (2010-07-17)
             refix trans_data warning (Won Chun)
      1.24 (2010-07-12)
             perf improvements reading from files on platforms with lock-heavy fgetc()
             minor perf improvements for jpeg
             deprecated type-specific functions so we'll get feedback if they're needed
             attempt to fix trans_data warning (Won Chun)
      1.23   fixed bug in iPhone support
      1.22 (2010-07-10)
             removed image *writing* support
             stbi_info support from Jetro Lauha
             GIF support from Jean-Marc Lienher
             iPhone PNG-extensions from James Brown
             warning-fixes from Nicolas Schulz and Janez Zemva (i.e. Janez (U+017D)emva)
      1.21   fix use of 'stbi_uc' in header (reported by jon blow)
      1.20   added support for Softimage PIC, by Tom Seddon
      1.19   bug in interlaced PNG corruption check (found by ryg)
      1.18 2008-08-02
             fix a threading bug (local mutable static)
      1.17   support interlaced PNG
      1.16   major bugfix - stbi__convert_format converted one too many pixels
      1.15   initialize some fields for thread safety
      1.14   fix threadsafe conversion bug
             header-file-only version (#define STBI_HEADER_FILE_ONLY before including)
      1.13   threadsafe
      1.12   const qualifiers in the API
      1.11   Support installable IDCT, colorspace conversion routines
      1.10   Fixes for 64-bit (don't use "unsigned long")
             optimized upsampling by Fabian "ryg" Giesen
      1.09   Fix format-conversion for PSD code (bad global variables!)
      1.08   Thatcher Ulrich's PSD code integrated by Nicolas Schulz
      1.07   attempt to fix C++ warning/errors again
      1.06   attempt to fix C++ warning/errors again
      1.05   fix TGA loading to return correct *comp and use good luminance calc
      1.04   default float alpha is 1, not 255; use 'void *' for stbi_image_free
      1.03   bugfixes to STBI_NO_STDIO, STBI_NO_HDR
      1.02   support for (subset of) HDR files, float interface for preferred access to them
      1.01   fix bug: possible bug in handling right-side up bmps... not sure
             fix bug: the stbi__bmp_load() and stbi__tga_load() functions didn't work at all
      1.00   interface to zlib that skips zlib header
      0.99   correct handling of alpha in palette
      0.98   TGA loader by lonesock; dynamically add loaders (untested)
      0.97   jpeg errors on too large a file; also catch another malloc failure
      0.96   fix detection of invalid v value - particleman@mollyrocket forum
      0.95   during header scan, seek to markers in case of padding
      0.94   STBI_NO_STDIO to disable stdio usage; rename all #defines the same
      0.93   handle jpegtran output; verbose errors
      0.92   read 4,8,16,24,32-bit BMP files of several formats
      0.91   output 24-bit Windows 3.0 BMP files
      0.90   fix a few more warnings; bump version number to approach 1.0
      0.61   bugfixes due to Marc LeBlanc, Christopher Lloyd
      0.60   fix compiling as c++
      0.59   fix warnings: merge Dave Moore's -Wall fixes
      0.58   fix bug: zlib uncompressed mode len/nlen was wrong endian
      0.57   fix bug: jpg last huffman symbol before marker was >9 bits but less than 16 available
      0.56   fix bug: zlib uncompressed mode len vs. nlen
      0.55   fix bug: restart_interval not initialized to 0
      0.54   allow NULL for 'int *comp'
      0.53   fix bug in png 3->4; speedup png decoding
      0.52   png handles req_comp=3,4 directly; minor cleanup; jpeg comments
      0.51   obey req_comp requests, 1-component jpegs return as 1-component,
             on 'test' only check type, not whether we support this variant
      0.50   first released version
*/