Monday, June 18, 2012

SEG-Y values from the command line - The tape problem

We all love magnetic tape, don't we?  Well, we have problems with filenames when it comes to tapes.  A lot of the time we can tell the SEG-Y files from a tape apart by their trace header values.

Use this script to zip through lots and lots of nondescript files from a tape output.

#!/bin/bash

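# read 2 bytes at zero-based byte offset 3614 and join them into one hex string
# (no trailing "x" on the offset: newer hexdump versions reject it)
hexvalue=$(hexdump -C -s 3614 -n 2 "$1" | awk 'NR==1 {print $2 $3}')
# convert the hex string to decimal
let decnum=0x$hexvalue
echo $decnum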

Change the 3614 to the byte offset of the value you want to extract from the headers.  The offset counts from zero at the start of the file, and the first trace header starts at 3600.
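
To run the same lookup over a whole pile of files at once, something like the sketch below could work.  It assumes 2-byte big-endian values and uses od instead of hexdump (my swap, not the script above).  The name segy_value is made up for this example.

```shell
# Print "<file>  <value>" for the 2-byte big-endian value at a zero-based
# byte offset in each file. segy_value is a hypothetical helper name.
segy_value() {
  local offset=$1 f hex
  shift
  for f in "$@"; do
    # od prints the two bytes as hex; tr joins them into one string
    hex=$(od -An -tx1 -j "$offset" -N 2 "$f" 2>/dev/null | tr -d ' \n')
    if [ ${#hex} -eq 4 ]; then
      echo "$f  $((0x$hex))"
    else
      echo "$f  (could not read offset $offset)"
    fi
  done
}
```

Called as segy_value 3614 *.sgy, it prints one line per file, so the files that belong together should be easy to spot.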

Happy day!

Saturday, February 18, 2012

Creating EBCDIC Header Files From The SEG-Y

Hi Folks

Copy and paste this, make it executable and all.  It scans through a directory and extracts the EBCDIC header from each SEG-Y file.  Mighty fun?  Not!  Caution: this runs through every file in the directory, so be careful what's in there.


#!/bin/bash

# loop over every file in the current directory
for f in *
do
  # skip directories and header files left over from an earlier run
  [ -f "$f" ] || continue
  case "$f" in *_header.txt) continue ;; esac

  extention=_header.txt
  combo=$f$extention

  # the EBCDIC header is 40 cards of 80 bytes: offsets 0, 80, 160 ... 3120
  for i in $(seq 0 80 3120)
    do
      # convert one 80-byte card from EBCDIC to ASCII
      out=$(dd if="$f" bs=1 skip=$i count=80 conv=ascii status=noxfer 2>/dev/null)
      # quote $out so the column spacing inside the card survives
      echo "$out" >> "$combo"
    done
  echo "finished processing file $f"
done
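
For comparison, dd can also do the whole header in one pass.  This is a minimal sketch, assuming GNU dd: read the first 3200 bytes as one block, translate EBCDIC to ASCII, and let conv=unblock split the result into 80-character lines.  The function name ebcdic_header is made up for this example.

```shell
# Print the 3200-byte EBCDIC header of a SEG-Y file as lines of ASCII text.
# ebcdic_header is a hypothetical helper name, not part of any SEG-Y tool.
ebcdic_header() {
  # cbs=80 sets the card length; unblock strips trailing spaces from each
  # card and ends it with a newline
  dd if="$1" bs=3200 count=1 cbs=80 conv=ascii,unblock 2>/dev/null
}
```

Something like ebcdic_header l11f1.sgy > l11f1.sgy_header.txt would then write the same kind of header file as the loop above.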

Thursday, January 5, 2012

Name Names and Describe Descriptions

I hate to have to spell out something so fundamental but...  here it goes.  The name should abbreviate what the data is, what it was and what you did to it.

Here's a simple, run-of-the-mill seismic dataset I found on the web somewhere, as an example:

l11f1.sgy

At least they didn't call it l11f1.segy or l11f1.SEG-Y.  The standard is *.sgy.

If you found this dataset on a tape or disk someplace all by itself and had no idea what l11f1.sgy meant, what's your next step?  Well, the extension's *.sgy so you'd assume it's in SEG-Y format.  Next, look at the EBCDIC header.  Many times it looks like this one:


C 1 CLIENT U.S.G.S.              COMPANY  WHFC                  CREW NO 0
C 2 LINE l11f1.     AREA  Columbia River       MAP ID None
C 3 REEL NO 1         DAY-START OF REEL 263 YEAR 2000 OBSERVER
C 4 INSTRUMENT:  Triton        MODEL ISIS       SERIAL NO
C 5 DATA TRACES/RECORD 1      AUXILIARY TRACES/RECORD         CDP FOLD
C 6 SAMPLE INTERVAL 258     SAMPLES/TRACE  2050 BITS/IN      BYTES/SAMPLE  2
C 7 RECORDING FORMAT 3      FORMAT THIS REEL        MEASUREMENT SYSTEM
C 8 SAMPLE CODE: Short Integers
C 9 GAIN TYPE:
C10 FILTERS: ALIAS     HZ  NOTCH     HZ  BAND           HZ  SLOPE        DB/OCT
C11 SOURCE:                 NUMBER/POINT         POINT INTERVAL
C12 PATTERN:                               LENGTH        WIDTH
C13 SWEEP:           HZ          HZ  LENGTH 531  MS  CHANNEL NO     TYPE
C14 TAPER:                    MS                   MS  TYPE
C15 SPREAD:
C16 GEOPHONES:
C17 PATTERN:
C18 TRACES SORTED BY: RECORD
C19 AMPLITUDE RECOVERY:
C20 MAP PROJECTION
C21 PROCESSING:
C22 ACOUSTIC SOURCE: SIS-1000                 FIRE RATE: 530 ms  SECS
C23
C24
C25
C26
C27
C28
C29
C30
C31
C32
C33
C34
C35
C36
C37
C38
C39
C40 END EBCDIC

Okay it's:
  • Client:  USGS
  • Company:  WHFC
  • Crew:  0 (this means nothing)
  • Line: l11f1
  • Area:  Columbia River
  • Record Length:  531 MS (the LENGTH value in C13)
  • Sample Rate:  258
  • Instrument:  Triton
  • Model:  Isis
  • Reel:  1, starting on day 263 (not noted in binary header)
  • Year:  2000
  • Acoustic Source:  SIS-1000
  • Fire Rate:   530 MS

Okay, that's all fine and good for somebody who's spent 3 months struggling to get this data correct.  But your masterpiece is not a masterpiece until the rest of the world has seen it and understood it.  Looking at the data, it looks stacked.  It "might" be 2D because the line name in C2 matches the filename (3D data usually just notes the first inline, which is okay in some circumstances).  We know it's marine data (sort of) because it mentions the Columbia River.  After that, you're at a dead end.

Next we have to look at the trace headers.  We have little or nothing to go on for the X & Ys (byte locations 73-88).  Are those Xs and Ys, or are they lats and longs in some weird system we have to figure out?  Where in God's name is the Columbia River?  If I google it and come up with something that looks like the numbers that are in the X & Ys, are they gonna match?  Probably not.

There's some 10 digit number in byte locations 181-184.  I have no idea what that's supposed to represent, or what's in 193 (it looks like a dupe of the water depth that's in 61-68).
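
One way to poke at those trace header values from the command line is sketched below.  It assumes big-endian 4-byte integers and no extended textual headers (so the first trace header starts at offset 3600), and it uses od instead of hexdump.  The name trace_i32 is made up for this example.

```shell
# Print the signed 4-byte big-endian integer that starts at a given byte
# location (1-based, as in the SEG-Y spec) of the FIRST trace header.
# trace_i32 is a hypothetical helper; it assumes no extended textual headers.
trace_i32() {
  local file=$1 byteloc=$2 offset hex value
  offset=$((3600 + byteloc - 1))
  hex=$(od -An -tx1 -j "$offset" -N 4 "$file" 2>/dev/null | tr -d ' \n')
  if [ ${#hex} -ne 8 ]; then
    echo "could not read 4 bytes at offset $offset" >&2
    return 1
  fi
  value=$((0x$hex))
  # values of 2^31 and above are negative numbers in twos complement
  if [ "$value" -ge 2147483648 ]; then
    value=$((value - 4294967296))
  fi
  echo "$value"
}
```

For the file above, trace_i32 l11f1.sgy 73 and trace_i32 l11f1.sgy 77 would print the source X and Y of the first trace, which is at least a starting point for guessing the coordinate system.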

My point is:  Explain what's where!  If you spent hours and hours figuring out this magnificent number, please explain what it is and what it's there for.  If I spent months looking at this data, I'd know what it is too, but I have a deadline to figure this stuff out.  This data will stick around MUCH, much longer than you will.  You'd better do good housekeeping.

In the EBCDIC header - please describe:
  • Where the data you're presenting is located
  • Who it's for
  • When it was done
  • What the final process you're presenting is
  • What steps you took getting there
  • What I will need to load the data (for example, what the coordinates are in: projected, etc.)
  • Whether there is a grid, and if so, what it is
  • What the significant non-standard literals are
  • What the byte locations for your data are
  • Whether it's 2D or 3D

Finally the dataset naming convention (the bare minimum):

  • areaname_linename_processname.sgy

Nobody cares if you have a 300 character file name.  People are going to care if you have a file which means nothing to somebody who's not part of your inner circle.
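
If you want the convention applied the same way every time, a tiny helper can build the name for you.  This is only a sketch: lowercasing everything and turning spaces into hyphens are my assumptions, not part of the convention, and segy_name is a made-up function name.

```shell
# Build a file name of the form areaname_linename_processname.sgy.
# segy_name is a hypothetical helper; lowercasing and replacing spaces with
# hyphens are assumptions made for this sketch.
segy_name() {
  printf '%s_%s_%s.sgy\n' "$1" "$2" "$3" | tr 'A-Z ' 'a-z-'
}
```

For example, segy_name "Columbia River" l11f1 stack gives columbia-river_l11f1_stack.sgy (the process name "stack" is just a guess at what was done to that line).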

Tuesday, January 3, 2012

The Son of the Return of the SEG-Y Binary Header

A little more code, but a little more readable:


#!/bin/bash
echo -n "SEG-Y file name:  "
read -r _segyin


# hexdump offsets are zero-based: header bytes 3201-3204 start at offset 3200
_jobid=$(hexdump -C -s 3200 -n 4 "$_segyin" | awk 'NR==1 {print $2 $3 $4 $5}')
_linenumber=$(hexdump -C -s 3204 -n 4 "$_segyin" | awk 'NR==1 {print $2 $3 $4 $5}')
_reelnumber=$(hexdump -C -s 3208 -n 4 "$_segyin" | awk 'NR==1 {print $2 $3 $4 $5}')
_tracesperrec=$(hexdump -C -s 3212 -n 2 "$_segyin" | awk 'NR==1 {print $2 $3}')
_auxtraces=$(hexdump -C -s 3214 -n 2 "$_segyin" | awk 'NR==1 {print $2 $3}')
_sampleinterval=$(hexdump -C -s 3216 -n 2 "$_segyin" | awk 'NR==1 {print $2 $3}')
# offset 3218 is the sample interval of the original field recording
_origsampleinterval=$(hexdump -C -s 3218 -n 2 "$_segyin" | awk 'NR==1 {print $2 $3}')
# offset 3220 is the number of samples per trace
_samplespertrace=$(hexdump -C -s 3220 -n 2 "$_segyin" | awk 'NR==1 {print $2 $3}')
_filesize=$(stat -c%s "$_segyin")
_formatcode=$(hexdump -C -s 3224 -n 2 "$_segyin" | awk 'NR==1 {print $2 $3}')
_tracesortcode=$(hexdump -C -s 3228 -n 2 "$_segyin" | awk 'NR==1 {print $2 $3}')
_feetmeters=$(hexdump -C -s 3254 -n 2 "$_segyin" | awk 'NR==1 {print $2 $3}')


# convert the hex strings to decimal
let decjobid=0x$_jobid
let declinenumber=0x$_linenumber
let decreelnumber=0x$_reelnumber
let dectracesperrec=0x$_tracesperrec
let decauxtraces=0x$_auxtraces
let decsampleinterval=0x$_sampleinterval
let decorigsampleinterval=0x$_origsampleinterval
let decsamplespertrace=0x$_samplespertrace
let dectracesortcode=0x$_tracesortcode
let decformatcode=0x$_formatcode
let decfeetmeters=0x$_feetmeters


# bytes per sample depend on the format code
case $decformatcode in
  3) _bytespersample=2 ;;
  8) _bytespersample=1 ;;
  *) _bytespersample=4 ;;
esac


# each trace is a 240-byte header plus its samples,
# and the file starts with 3600 bytes of EBCDIC and binary headers
_bytespertrace=$(( decsamplespertrace * _bytespersample + 240 ))
_minus=$(( _filesize - 3600 ))
_numtraces=$(( _minus / _bytespertrace ))


# print a field only when it is non-zero
show() {
  if [ "$2" != 0 ]; then echo "$1:  $2"; fi
}


echo '************************'
echo 'Number of traces:  '$_numtraces

show 'Job Id' $decjobid
show 'Line number' $declinenumber
show 'Reel number' $decreelnumber
show 'Traces per record' $dectracesperrec
show 'Number of aux traces per record' $decauxtraces
show 'Sample interval' $decsampleinterval
show 'Original sample interval' $decorigsampleinterval
show 'Samples per trace' $decsamplespertrace

case $decformatcode in
  1) echo 'Format code:  4-byte IBM Floating Point' ;;
  2) echo 'Format code:  4-byte, twos complement integer' ;;
  3) echo 'Format code:  2-byte, twos complement integer' ;;
  4) echo 'Format code:  4-byte, fixed point with gain' ;;
  5) echo 'Format code:  4-byte IEEE floating point' ;;
  8) echo 'Format code:  1-byte, twos complement integer' ;;
esac

case $dectracesortcode in
  1) echo 'Trace sort code:  1 as recorded, no sorting' ;;
  2) echo 'Trace sort code:  2 CDP ensemble' ;;
  3) echo 'Trace sort code:  Single fold continuous profile' ;;
  4) echo 'Trace sort code:  Horizontally stacked' ;;
  5) echo 'Trace sort code:  5 Common source point' ;;
  6) echo 'Trace sort code:  6 Common receiver point' ;;
  7) echo 'Trace sort code:  7 Common offset point' ;;
  8) echo 'Trace sort code:  8 Common mid-point' ;;
  9) echo 'Trace sort code:  9 Common conversion point' ;;
esac

case $decfeetmeters in
  1) echo 'Measurement system:  Meters' ;;
  2) echo 'Measurement system:  Feet' ;;
esac
echo 'Dec filesize:  '$_filesize' bytes'
echo '************************'



You should get something that looks like this:

SEG-Y file name:  102.sgy
************************
Number of traces:  2272
Line number:  102
Reel number:  1
Traces per record:  1
Sample interval:  2000
Samples per trace:  1024
Format code:  4-byte IBM Floating Point
Measurement system:  Meters
Dec filesize:  9854992 bytes
************************

or on the rare occasion:


SEG-Y file name:  R_22.sgy
************************
Number of traces:  10118
Job Id:  666
Line number:  22
Reel number:  1
Traces per record:  1
Sample interval:  4000
Original sample interval:  2000
Samples per trace:  2001
Format code:  2-byte, twos complement integer
Trace sort code:  Horizontally stacked
Measurement system:  Meters
Dec filesize:  42924156 bytes
************************

La dee da!!
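
The trace count arithmetic is worth checking on its own, because a file size that doesn't divide evenly usually means one of the header values is wrong.  Here is a minimal sketch, assuming fixed-length traces and no extended textual headers.  The name segy_ntraces and the numbers in the example are made up.

```shell
# Number of traces = (file size - 3600) / (samples * bytes per sample + 240).
# segy_ntraces is a hypothetical helper; it assumes every trace has the same
# length and that the file has no extended textual headers.
segy_ntraces() {
  local filesize=$1 nsamples=$2 bps=$3 tracelen data
  tracelen=$((nsamples * bps + 240))
  data=$((filesize - 3600))
  # a remainder means the header values don't fit the file size
  if [ $((data % tracelen)) -ne 0 ]; then
    echo "warning: file size is not a whole number of traces" >&2
  fi
  echo $((data / tracelen))
}
```

For instance, segy_ntraces 2243600 500 4 describes a file of 500-sample, 4-byte traces and prints 1000 with no warning.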


Sunday, January 1, 2012

More SEG-Y Command Line Stuff: Binary header

A really sleazy bash script to read SEG-Y binary header data.  Basically a file stuffed full of command line stuff.

#!/bin/bash
echo -n "SEG-Y file name:  "
read _segyin


# hexdump offsets are zero-based: header bytes 3201-3204 start at offset 3200
_jobid=$(hexdump -C -s 3200 -n 4 "$_segyin" | awk 'NR==1 {print $2 $3 $4 $5}')
_linenumber=$(hexdump -C -s 3204 -n 4 "$_segyin" | awk 'NR==1 {print $2 $3 $4 $5}')
_reelnumber=$(hexdump -C -s 3208 -n 4 "$_segyin" | awk 'NR==1 {print $2 $3 $4 $5}')
_tracesperrec=$(hexdump -C -s 3212 -n 2 "$_segyin" | awk 'NR==1 {print $2 $3}')
_auxtraces=$(hexdump -C -s 3214 -n 2 "$_segyin" | awk 'NR==1 {print $2 $3}')
_sampleinterval=$(hexdump -C -s 3216 -n 2 "$_segyin" | awk 'NR==1 {print $2 $3}')
_samplespertrace=$(hexdump -C -s 3220 -n 2 "$_segyin" | awk 'NR==1 {print $2 $3}')
_filesize=$(stat -c%s "$_segyin")
_formatcode=$(hexdump -C -s 3224 -n 2 "$_segyin" | awk 'NR==1 {print $2 $3}')


# convert the hex strings to decimal
let decjobid=0x$_jobid
let declinenumber=0x$_linenumber
let decreelnumber=0x$_reelnumber
let dectracesperrec=0x$_tracesperrec
let decauxtraces=0x$_auxtraces
let decsampleinterval=0x$_sampleinterval
let decsamplespertrace=0x$_samplespertrace
let decformatcode=0x$_formatcode


echo 'Job Id:  '$decjobid
echo 'Line number:  '$declinenumber
echo 'Reel number:  '$decreelnumber
echo 'Traces per record:  '$dectracesperrec
echo 'Number of aux traces per record:  '$decauxtraces
echo 'Sample interval:  '$decsampleinterval
echo 'Samples per trace:  '$decsamplespertrace
echo 'Format code:  '$decformatcode
echo 'Dec filesize:  '$_filesize
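
The same idea can be written as a small table of field names and offsets, which is easier to extend.  This is a sketch only: it uses od instead of hexdump, assumes big-endian unsigned values, and segy_binhdr and the field labels are names made up for this example.

```shell
# Print selected SEG-Y binary header fields as decimal numbers.
# segy_binhdr is a hypothetical helper; each table row is
# "label  zero-based offset  length in bytes".
segy_binhdr() {
  local file=$1 name offset len hex
  while read -r name offset len; do
    hex=$(od -An -tx1 -j "$offset" -N "$len" "$file" 2>/dev/null | tr -d ' \n')
    # skip fields that could not be read in full (file too short)
    if [ ${#hex} -ne $((len * 2)) ]; then continue; fi
    echo "$name: $((0x$hex))"
  done <<'EOF'
job_id 3200 4
line_number 3204 4
reel_number 3208 4
traces_per_record 3212 2
aux_traces_per_record 3214 2
sample_interval 3216 2
samples_per_trace 3220 2
format_code 3224 2
EOF
}
```

Adding another field is then one more line in the table rather than two more lines of hexdump and let.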

Saturday, December 3, 2011

Reading SEG-Y from Command Line

Yes, I've been gone for a month or so...

Here it is. If you don't have your favorite reader, use these fun dd & hexdump commands from the command line.

Reading the EBCDIC header (the first 3200 bytes, as 80-character lines):
dd if=input_file.sgy bs=3200 count=1 cbs=80 conv=ascii,unblock


Reading the job number:
hexdump -C -s 3200 -n 4 input_file.sgy | awk 'NR==1 {print $2 $3 $4 $5;}'

Reading the line number:
hexdump -C -s 3204 -n 4 input_file.sgy | awk 'NR==1 {print $2 $3 $4 $5;}'

Reading the reel number:
hexdump -C -s 3208 -n 4 input_file.sgy | awk 'NR==1 {print $2 $3 $4 $5;}'

Reading the format code:
hexdump -C -s 3224 -n 2 input_file.sgy | awk 'NR==1 {print $2 $3;}'

Reading the sample interval:
hexdump -C -s 3216 -n 2 input_file.sgy | awk 'NR==1 {print $2 $3;}'

Reading the samples per trace:
hexdump -C -s 3220 -n 2 input_file.sgy | awk 'NR==1 {print $2 $3;}'

Reading the sort code:
hexdump -C -s 3228 -n 2 input_file.sgy | awk 'NR==1 {print $2 $3;}'
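
These commands print the values as hex strings.  To read them as ordinary numbers, the shell's own printf can do the conversion.  A minimal sketch (hex2dec is a made-up name):

```shell
# Convert a hex string such as 0fa0 (as printed by the commands above)
# to decimal. hex2dec is a hypothetical helper name.
hex2dec() {
  printf '%d\n' "0x$1"
}
```

So a sample interval that comes back as 0fa0 is hex2dec 0fa0, or 4000.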

Thursday, October 27, 2011

Arc Seconds to Degrees

Wouldn't you know it. I thought that I was just dim-witted or the people who were writing the SEG-Y were completely zonked - but there is a reason behind putting X/Y data in arc seconds. For a long time, every once in a while, I'd come across a 2 in the coordinate units field of the SEG-Y trace header (bytes 89-90), meaning the coordinates are in arc seconds. I was scratching my head saying - who on earth, why on earth? Okay, here's the rhyme - not sure of the reason.

Arc Seconds = 3600 * degrees
while
Degrees = arc seconds / 3600 (about 0.00027778 * arc seconds)

go figure. I got this all from here.
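
From the command line the conversion is a one-liner in awk.  This is a sketch that ignores any coordinate scalar stored in the trace header, and arcsec_to_deg is a made-up name.

```shell
# Convert arc seconds to decimal degrees (degrees = arc seconds / 3600).
# arcsec_to_deg is a hypothetical helper; it does not apply any
# coordinate scalar from the trace header.
arcsec_to_deg() {
  awk -v s="$1" 'BEGIN { printf "%.6f\n", s / 3600 }'
}
```

For example, arcsec_to_deg -442800 prints -123.000000.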

Grazie