What should I know about Insight's 16-bit data compression?
Being relatively new to Insight processing, I would like to better understand how intermediate data is stored in 16-bit format. I haven't found any documentation, so I have the following questions:
Q: What processing should be applied to the original 32-bit floating-point input data before it can safely be stored in 16-bit (integer?) format? For example, is any amplitude balancing or scaling required?
A: Your data should be despiked. 16-bit storage gives a fixed compression ratio of about 1.5:1 (i.e. an output file roughly two-thirds the size of the uncompressed input) with a loss of precision of approximately 1 part in 32,000, provided that the data has no spikes. Hard zeroes (0.0) are preserved exactly.
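Insight's exact packing isn't documented here beyond those figures, but the arithmetic behind them can be illustrated with a hypothetical scale-and-round scheme (a sketch only, not Insight's implementation):

```python
import numpy as np

# A short synthetic float32 trace containing a hard zero (illustration only).
trace = np.array([0.0, 0.013, -0.75, 0.402, -0.0049], dtype=np.float32)

# Scale so the largest magnitude maps onto the int16 limit, then round.
scale = float(np.max(np.abs(trace))) / 32767.0
stored = np.round(trace / scale).astype(np.int16)

# Reconstruct the trace from the 16-bit copy.
restored = stored.astype(np.float32) * scale

print(stored[0] == 0)   # the hard zero survives exactly
print(np.max(np.abs(restored - trace)) / np.max(np.abs(trace)))
# worst-case error is no more than about 1 part in 65,000 of the peak
# amplitude, comfortably within the "1 part in 32,000" figure quoted above
```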
Q: Does the sample format used for internal processing depend on the storage format of the input data, or is processing always done at 32-bit or 16-bit?
A: Internal processing is always done at full precision, regardless of how the input data is stored.
Q: Are there any issues associated with converting the data back from 16-bit to 32-bit when the final data are written out to SEGY format?
A: Exporting back to 32-bit retains the precision you had at 16 bits; the precision lost when the data was first stored is not recovered.
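In other words, the exported 32-bit values are exactly the values reconstructed from the 16-bit store. A minimal sketch of that idea, again using a hypothetical scale-and-round scheme:

```python
import numpy as np

# Hypothetical 16-bit samples and their scale factor (illustration only).
scale = np.float32(1.0 / 32767.0)
stored = np.array([0, 517, -32767, 12040], dtype=np.int16)

# "Export": reconstruct 32-bit floats from the 16-bit copy.
exported = stored.astype(np.float32) * scale

# Re-packing the exported samples gives back the identical 16-bit values,
# so the round trip neither loses more precision nor recovers any.
assert np.array_equal(np.round(exported / scale).astype(np.int16), stored)
```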
Q: If a large spike is detected in a dataset stored at 16 bits, I assume this could impact the accuracy of the nearby amplitudes. Is the correct approach to fixing this therefore to recreate the dataset from despiked input, rather than just fixing the spike in the input data?
A: The volume is packed in blocks of 32 samples. For each block of 32 samples, the 16-bit range is scaled to match the range of those samples.
A spike of extreme amplitude blows out the range of the block, ruining the storage of the other samples in that block. This is why we recommend using 16-bit data only after the input has been processed to remove spikes.
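A rough numerical sketch of that effect (again hypothetical per-block scale-and-round packing, not Insight's actual code): one extreme spike in a 32-sample block forces the block scale so wide that the remaining 31 samples collapse onto a handful of integer levels.

```python
import numpy as np

def pack_block(block):
    """Hypothetical packing: scale the block so its largest magnitude maps
    to the int16 limit (32767), then round each sample to an integer."""
    scale = float(np.max(np.abs(block))) / 32767.0
    return np.round(block / scale).astype(np.int16), scale

def unpack_block(stored, scale):
    return stored.astype(np.float32) * scale

rng = np.random.default_rng(0)
block = rng.normal(0.0, 1.0, 32).astype(np.float32)      # one 32-sample block

# Clean block: reconstruction error is at most ~1/65,000 of the block's peak.
clean = unpack_block(*pack_block(block))
print(np.max(np.abs(clean - block)) / np.max(np.abs(block)))

# Same block with one extreme spike: the scale now reflects the spike, so the
# other 31 samples all round to zero and their amplitudes are simply lost.
spiked = block.copy()
spiked[7] = 1.0e6
restored = unpack_block(*pack_block(spiked))
others = np.arange(32) != 7
print(np.max(np.abs(restored[others] - spiked[others])))  # error ~ the samples themselves
```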
Approaches to despiking are data-dependent. We recommend consulting an experienced seismic processor for guidance.