Commit ae8faf32, authored and committed by thomas.forbriger

documentation

This is a legacy commit from before 2015-03-01.
It may be incomplete as well as inconsistent.
See COPYING.legacy and README.history for details.


SVN Path:     http://gpitrsvn.gpi.uni-karlsruhe.de/repos/TFSoftware/trunk
SVN Revision: 2407
SVN UUID:     67feda4a-a26e-11df-9d6e-31afc202ad0c
parent a7c22190
......@@ -3,7 +3,7 @@
----------------------------------------------------------------------------
$Id: README,v 1.4 2006-01-25 11:11:03 tforb Exp $
$Id: README,v 1.5 2007-09-29 17:44:59 tforb Exp $
\author Thomas Forbriger
\date 16/03/2002
......@@ -29,7 +29,7 @@ REVISIONS and CHANGES
\author Thomas Forbriger
\date 16/03/2002
GSE++ library: reading and writing GSE waveforms
GSE++ library: reading and writing %GSE waveforms
Copyright (c) 2002 by Thomas Forbriger (IMG Frankfurt)
......@@ -53,10 +53,10 @@ Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
\section goal Goal of the library
This library supports reading and writing of waveforms as defined by the
GSE2.1 standard (Provisional GSE2.1 Formats and Protocols, May 1997).
%GSE2.1 standard (Provisional %GSE2.1 Formats and Protocols, May 1997).
The library defines a modul GSE21::waveform. This modul containes classes
(TWID2, TSTA2) that hold GSE21 format elements. Further it contains
The library defines a module GSE2::waveform. This module contains classes
(TWID2, TSTA2) that hold %GSE2 format elements. Further it contains
functions to read and write waveform data in the CM6 subformat.
\section using Using the library
......@@ -64,7 +64,7 @@ Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
Example code is provided by gsexx_write_example.cc
The simplest way of writing CM6 encoded GSE data is
The simplest way of writing %CM6 encoded %GSE data is
\code
// open output file
std::ofstream os("data.gse");
......@@ -97,24 +97,60 @@ Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
}
\endcode
Corresponding code for reading %GSE waveform data can be found in
gsexx_tests.cc.
\code
// open input stream
std::ifstream is("data.gse");
// set maximum number of samples
const int msamples=2300;
// create array to read data to
int indata[msamples];
// create WID2 line instance and read from file
GSE2::waveform::TWID2 wid2line;
wid2line.read(is);
if (wid2line.Fsamps > msamples)
{
std::cerr << "ERROR: too many samples in file!" << std::endl;
abort();
}
// create instance of reader class for appropriate number of samples
GSE2::waveform::TDAT2readCM6 freader(wid2line.Fsamps);
int i=0;
while (freader.hot())
{
if (i >= wid2line.Fsamps)
{
std::cerr << "ERROR: missed last sample!" << std::endl;
abort();
}
indata[i] =freader(is);
i++;
}
\endcode
\section design Design Decisions
\subsection services Services and header files
The central aim of the library is to provide reading and writing of GSE2
data subformats (mainly CM6).
The primary aim of the library is to provide reading and writing of %GSE2
data subformats (mainly %CM6).
Since the main goal is reading and writing, the iostream headre will always
be included together with the libgsexx.h header.
Since the main goal is reading and writing, the \c iostream header file will
always be included together with the \c libgsexx.h header.
For ease of implementation the libgse.h always includes the string module
and the libtime module. Both are integral parts of the TWID2 class. They may
not easily be ommited.
For ease of implementation the \c gsexx.h always includes the string module.
This is an integral part of the TWID2 class. It may not easily be omitted.
\subsection GSEandSFF GSE2.1, GSE2.0 and SFF
With GSE2.0 the straight concept was to write a full waveform at once,
\subsection GSEandSFF %GSE2.1, %GSE2.0 and SFF
With %GSE2.0 the straightforward concept was to write a full waveform at once,
consisting of a WID2 line, a DAT2 identifier, the actual waveform data in
any sub-format encoding and a CHK2 line. This was fully compatible with the
SFF (Stuttgart File Format). With GSE2.1, however, a new STA2 line was
introduced. This GSE2.1 this line is mandatory. But it doesn't appear in the
SFF (Stuttgart File Format). With %GSE2.1, however, a new STA2 line was
introduced. In %GSE2.1 this line is mandatory. But it doesn't appear in the
SFF specification and is thus not allowed within SFF data.
There are many reasons for reading and writing the full waveform set at
......@@ -124,50 +160,55 @@ Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
\subsection classes Classes
TDAT2reader and TDAT2writer are the interface classes to the core of this
module. The are pure abstract classes. Implementations of this conecpt are
TCM6reader, TCM6writer, TCM8reader, TCM8writer, TINTreader, and TINTwriter,
of which only the CM6 types will be implemented in a first version. The
TWID2 class provides member functions to return an appropriate
implementation for reading or writing the GSE waveform sub-format selected
TDAT2read and TDAT2write are the interface classes to the core of this
module. They are abstract classes. Implementations of this concept are
TDAT2readCM6, TDAT2writeCM6, TDAT2readCM8, TDAT2writeCM8, TDAT2readINT, and
TDAT2writeINT, of which only the %CM6 types will be implemented in a first
version.
The TWID2 class provides member functions to return an appropriate
implementation for reading or writing the %GSE waveform sub-format selected
in the WID2 line.
Two classes T2nddiffrem and T2nddiffappl are helper classes the remove or
apply second differences from/to the integer data stream. They are used with
in the reader and writer classes.
Two classes Tremove_diff and Tapply_diff are helper classes. They remove or
apply differences from/to the integer data stream, respectively.
They are used within the reader and writer classes.
Particular helpers remove2nddiffT and apply2nddiffT for second differences
are provided through typedefs.
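To illustrate what removing and applying differences means, here is a minimal standalone sketch. The function names \c remove_diff and \c apply_diff are hypothetical and do not reproduce the interface of Tremove_diff or Tapply_diff; they only show the arithmetic. Second differences are obtained by running the first-difference step twice.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of libgsexx): replace each sample by its
// difference to the preceding sample. The sample before the first one is
// taken to be zero.
std::vector<int> remove_diff(const std::vector<int>& x)
{
  std::vector<int> d(x.size());
  int previous = 0;
  for (std::size_t i = 0; i < x.size(); ++i)
  {
    d[i] = x[i] - previous;
    previous = x[i];
  }
  return d;
}

// Hypothetical helper (not part of libgsexx): inverse operation. A running
// sum over the differences restores the original samples.
std::vector<int> apply_diff(const std::vector<int>& d)
{
  std::vector<int> x(d.size());
  int sum = 0;
  for (std::size_t i = 0; i < d.size(); ++i)
  {
    sum += d[i];
    x[i] = sum;
  }
  return x;
}
```

Calling \c remove_diff twice yields second differences, and calling \c apply_diff twice restores the original series. Note that this sketch does not guard against integer overflow.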
The classes TWID2 and TSTA2 simply hold the data from the corresponding GSE2
format lines. Further there are input/output operators that provide reading
The classes TWID2 and TSTA2 simply hold the data of the corresponding %GSE2
format lines. Further, there are input/output operators that provide reading
and writing from/to streams.
A class TCHK2 deals with the checksum. It may be initialized from an array.
In this case the checksum is calculated from the values in the array. Or it
may be initialized from an input stream, thus reading the CHK2 line. further
there is a stream input/output provided to read/write a CHK2 line from/to a
stream.
A class TCHK2 deals with the checksum. It accepts sample values through a
member function and builds up the checksum from them. Alternatively, it may
be set by reading a CHK2 line from an input stream. Further, stream
input/output is provided to read/write a CHK2 line from/to a stream.
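The idea of building up a checksum sample by sample can be sketched as follows. The class name \c ChecksumSketch and its members are hypothetical and are not the TCHK2 interface. The modulo rule with the value 100000000 is an assumption based on the commonly quoted GSE2.0 checksum procedure and should be verified against the format specification.

```cpp
#include <cstdlib>

// Hypothetical accumulator (not the TCHK2 class of libgsexx).
// Assumption: samples and the running sum are both reduced so that their
// absolute value stays below 100000000, and the final checksum is the
// absolute value of the running sum.
class ChecksumSketch
{
  public:
    // feed one sample into the running checksum
    void add(int value)
    {
      const long long modulo = 100000000LL;
      long long sample = value;
      if (std::llabs(sample) >= modulo)
      { sample -= (sample / modulo) * modulo; }
      sum_ += sample;
      if (std::llabs(sum_) >= modulo)
      { sum_ -= (sum_ / modulo) * modulo; }
    }
    // return the checksum for all samples fed so far
    long long value() const { return std::llabs(sum_); }
  private:
    long long sum_ = 0;
};
```

A reader would feed every decoded sample to such an accumulator and compare the result with the number found in the CHK2 line.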
\sa GSE2::waveform::TCHK2, GSE2::waveform::TWID2,
GSE2::waveform::differences::Tapply_diff,
GSE2::waveform::differences::Tremove_diff,
GSE2::waveform::remove2nddiffT,
GSE2::waveform::apply2nddiffT
\par The reader classes
The reader classes take a reference to an input stream on initialization.
They are further attached to this stream. They provide a read function that
allows bytewise reading from the input stream. The reader classes provide a
function that returns the number of characters read so far from the input
stream.
The reader classes take the number of expected input samples through their
constructor. Reading is accomplished sample by sample. After the last sample
the CHK2 line is read automatically by the reader class and checked against
the checksum calculated for the samples read. The member function hot()
indicates whether further samples remain to be read.
\sa GSE2::waveform::TDAT2read, GSE2::waveform::TDAT2readCM6
\par The writer classes
The writer classes take a reference to an output stream on initialization.
Further they are attached to this stream. They provide a write function that
allows bytewise feeding to the output stream. The writer classes provide a
function the returns the number of characters written to the stream
(excluding carriage return). This is essential together with the SFF library
the needs this information to include it in the DAST line. Writer classes
may be attached to a string stream to delay writing to the file.
The writer classes take the number of output samples through their
constructor. Writing is accomplished sample by sample. After the last sample
the CHK2 line is written automatically by the writer class with the checksum
calculated for the samples just written. The member function hot() indicates
whether further samples remain to be written.
\todo
Use a preprocessor macro to select the use of libtime. TWID2 holds the
fields in generic types. But in case we provide some conversion functions to
retrieve a libtime class from a WID2 line data class.
\sa GSE2::waveform::TDAT2write, GSE2::waveform::TDAT2writeCM6
\todo
Provide a check for the DAT2 specifier in the constructor for TDAT2read.
......@@ -177,5 +218,8 @@ Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
\todo
Provide a check function to look for the next identifier in the file.
\todo
Integer overflow could be checked in the class that applies differences.
*/
/* ----- END OF README ----- */
......@@ -3,7 +3,7 @@
*
* ----------------------------------------------------------------------------
*
* $Id: gsexx_TDAT2.cc,v 1.8 2006-01-25 11:11:04 tforb Exp $
* $Id: gsexx_TDAT2.cc,v 1.9 2007-09-29 17:45:00 tforb Exp $
* \author Thomas Forbriger
* \date 29/03/2002
*
......@@ -33,7 +33,7 @@
#define TF_GSEXX_TDAT2_CC_VERSION \
"TF_GSEXX_TDAT2_CC V1.0 "
#define TF_GSEXX_TDAT2_CC_CVSID \
"$Id: gsexx_TDAT2.cc,v 1.8 2006-01-25 11:11:04 tforb Exp $"
"$Id: gsexx_TDAT2.cc,v 1.9 2007-09-29 17:45:00 tforb Exp $"
#include <gsexx.h>
#include "gsexx_TDAT2.h"
......@@ -226,9 +226,7 @@ std::string TDAT2write::operator()(const intT& value)
/*! \class TDAT2readCM6
*
* This is an abstract base class, which defines a standard interface for
* reading %GSE2 %waveform data in any subformat. A derived class for
* each supported subformat must be defined to make this concept useable.
* This is a class to read %GSE2 %waveform data in %CM6 subformat.
*
* \sa GSE2::waveform::TDAT2read
*
......@@ -273,9 +271,7 @@ intT TDAT2readCM6::convert(std::istream& is)
/*! \class TDAT2writeCM6
*
* This is an abstract base class, which defines a standard interface for
* writing %GSE2 %waveform data in any subformat. A derived class for
* each supported subformat must be defined to make this concept useable.
* This is a class to write %GSE2 %waveform data in %CM6 subformat.
*
* \sa GSE2::waveform::TDAT2write
*
......
......@@ -212,6 +212,12 @@ c
c position format contents
c 1-5 a5 DAST (identifier)
c 7-16 i10 number of characters in encoded dataset
c From library version 1.10 this field may be -1.
c In this case the reading program has to determine
c the number of characters itself by detecting the
c CHK2 line. This change was necessary to implement
c the C++ version of libsff since this starts writing
c without having encoded the whole trace already.
c 18-33 e16.6 ampfac
c This is a factor to scale the (floating point)
c dataset to a desirable dynamic range
......
......@@ -3,7 +3,7 @@
*
* ----------------------------------------------------------------------------
*
* $Id: README,v 1.3 2006-03-28 15:58:18 tforb Exp $
* $Id: README,v 1.4 2007-09-29 17:45:01 tforb Exp $
* \author Thomas Forbriger
* \date 25/12/2003
*
......@@ -35,7 +35,8 @@
/*! \mainpage
The library is designed to provide a tool for SFF writing and reading.
The library is designed to provide a tool for SFF (Stuttgart File Format)
writing and reading.
It uses libgsexx for the GSE2 layer of SFF.
While libgsexx does not explicitly use an array class for time series,
libsffxx provides whole array reading and writing functionality.
......@@ -51,13 +52,14 @@ nchar field of the DAST line.
\warning
<b>
The library will write -1 to the nchar field of the DAST line.
You will need an updated version of libstuff or libsff (Fortran version) to
read data files written with libsffxx to Fortran code.
You will need an updated version (1.10 or higher) of \c libstuff or \c libsff
(Fortran version) to read data files written with \c libsffxx to Fortran code.
</b>
\section Contents
- \ref sec_main_dependencies
- \ref sec_main_format_definition
- \ref sec_main_structure
- \ref sec_main_structs
- \ref sec_main_header_classes
......@@ -70,11 +72,14 @@ read data files written with libsffxx to Fortran code.
<HR>
\section sec_main_dependencies Dependencies
The library depends on other code libaries. They are:
The library depends on other code libraries.
They are libtime, libgsexx, libaff, and STL.
The packages for libtime, libgsexx, and libaff should be available at the
place you obtained libsffxx from.
\subsection libtime
You should link against libtime++.a
You must link against libtime++.a
Date values in WID2 and SRCE structs are stored in libtime::TAbsoluteTime
objects.
......@@ -83,24 +88,36 @@ read data files written with libsffxx to Fortran code.
\subsection libgsexx
You should link against libgsexx.a
You must link against libgsexx.a
The GSE2 layer of the library relies on the code provided by libgsexx.
The header sffxx.h includes gsexx.h
\subsection libaff
You should link against libaff.a
You must link against libaff.a
The reading and writing classes sff::InputWaveform and sff::OutputWaveform
as well as the sff::TraceHeader::scanseries member function deal with
series type containers as defined in libaff.
The header sffxx.h includes aff/series.h and aff/iterator.h
\subsection STL
\subsection STL (Standard Template Library)
The sff::FREE block struct uses an STL list of strings.
<HR>
\section sec_main_format_definition SFF format definition
SFF (Stuttgart File Format) is based on the definition of the GSE2.0 format.
It supplements GSE2.0 by header elements that provide source specification
(SRCE lines), receiver information (INFO lines), and comments (FREE blocks).
It further provides an amplitude scaling factor (DAST line) to support
synthetic (non-integer) data.
The \ref page_sff_definition from file \c sff.doc in package \c libsff is
provided on a separate page: \ref page_sff_definition
<HR>
\section sec_main_structure Structure of the library modules
......@@ -302,4 +319,390 @@ read data files written with libsffxx to Fortran code.
*/
/* ========================================================================= */
/*! \page page_sff_definition Definition of the Stuttgart File Format
The following format definition is obtained from \c sff.doc which comes along
with \c libsff or \c libstuff.
The SFF was invented in 1996.
\date 28/09/2007
\par Contents
- \ref sec_definition_structure
- \ref sec_definition_elements
- \ref sec_definition_structure
- \ref subsec_definition_file_header
- \ref subsec_definition_data_block
- \ref sec_definition_elements
- \ref subsec_definition_stat_line
- \ref subsec_definition_free_block
- \ref subsec_definition_srce_line
- \ref subsec_definition_dast_line
- \ref subsec_definition_wid2_line
- \ref subsec_definition_dat2_indicator
- \ref subsec_definition_chk2_line
- \ref subsec_definition_info_line
- \ref sec_definition_coordinates
- \ref subsec_definition_cartesian_coordinates
- \ref subsec_definition_spherical_coordinates
The SFF (Stuttgart File Format) is an attempt to reconcile
different demands on the way seismic data used at the Institute
of Geophysics at Stuttgart University should be archived.
A single data format allows the standardization of software
used to perform common tasks on the data such as reading,
writing, processing and plotting of data. Software has to
be written only once, may be used by many people and may
be kept at a single place in the computer system.
The general structure of the file format is a header block
followed by one or more data blocks. Within the header and
the data blocks optional blocks containing additional
information are allowed. Each data block is structured as
described by the GSE2.0 format. The data are compressed
using second differences and are encoded into pure ASCII
characters using a six bit encoding scheme (CM6) also described
by the GSE2.0 format. The ASCII encoding ensures portability
of the data across different operating systems and computer
architectures. Moreover, it allows sending data via e-mail.
\section sec_definition_structure Overall structure
The whole datafile is ASCII readable with any text editor
and is therefore transferable from any system to any system
via email. You can extract valid GSE2.0 data blocks from
the files by just using a text editor to delete additional lines.
The whole file consists of one file header block and one or
more data blocks:
\verbatim
File Header
Data Block
.
.
.
\endverbatim
\subsection subsec_definition_file_header File Header
The File Header consists of a STAT line which is obligatory.
There may be an optional FREE block and/or an optional
SRCE line:
\verbatim
STAT line obligatory
FREE block optional
SRCE line optional
\endverbatim
\sa sff::FileHeader
\subsection subsec_definition_data_block Data Block
Each Data Block has to start with an obligatory DAST line
and a WID2 line defined in GSE2.0 format. After that
there have to follow the encoded data samples between
a DAT2 identifier and a CHK2 checksum. These lines may
be followed by an optional FREE block and/or an optional
INFO line.
\verbatim
DAST line obligatory
WID2 line obligatory \
DAT2 identifier obligatory | The GSE2.0 data block consists
dataset obligatory | of these four elements.
CHK2 line obligatory /
FREE block optional
INFO line optional
\endverbatim
\sa sff::TraceHeader
\section sec_definition_elements Definition of the elements
\subsection subsec_definition_stat_line STAT line
This line provides general information about the data file
<TABLE>
<TR>
<TH> position </TH><TH> format </TH><TH> contents </TH>
</TR><TR>
<TD> 1-5 </TD><TD> a5 </TD><TD> STAT (identifier) </TD>
</TR><TR>
<TD> 6-12 </TD><TD> f6.2 </TD><TD> library version <BR>
minor versions are counted in 0.01 steps<BR>
major versions are counted in 1.0 steps </TD>
</TR><TR>
<TD> 14-26 </TD><TD> a13 </TD><TD> timestamp of file creation time:
yymmdd.hhmmss </TD>
</TR><TR>
<TD> 28-37 </TD><TD> a10 </TD><TD> code with a combination of two
possible characters:<BR>
F: there follows a FREE block<BR>
S: there follows a SRCE line </TD>
</TR>
</TABLE>
\sa sff::STAT
\subsection subsec_definition_free_block FREE block
This is a block consisting of any number of lines that are 80 characters wide.
The start of this block is indicated by a single line
containing FREE in the first 5 positions. Another line
of this content indicates the end of the FREE block.
A FREE block may contain any useful information for
the user and has to follow no other standard than
a line length of 80 characters.
\sa sff::FREE
\subsection subsec_definition_srce_line SRCE line
This line holds information of the source that caused the
seismic signal
<TABLE>
<TR>
<TH> position </TH><TH> format </TH><TH> contents </TH>
</TR><TR>
<TD> 1-5 </TD><TD> a5 </TD><TD> SRCE (identifier) </TD>
</TR><TR>
<TD> 6-25 </TD><TD> a20 </TD><TD> type of source (any string like
"earthquake") </TD>
</TR><TR>
<TD> 27 </TD><TD> a1 </TD><TD> type of coordinate system:<BR>
C: cartesian<BR> S: spherical </TD>
</TR><TR>
<TD> 29-43 </TD><TD> f15.6 </TD><TD> c1: x, latitude
(see also \ref sec_definition_coordinates)</TD>
</TR><TR>
<TD> 44-58 </TD><TD> f15.6 </TD><TD> c2: y, longitude
(see also \ref sec_definition_coordinates)</TD>
</TR><TR>
<TD> 59-73 </TD><TD> f15.6 </TD><TD> c3: z, height
(see also \ref sec_definition_coordinates)</TD>
</TR><TR>
<TD> 75-80 </TD><TD> a6 </TD><TD> date of source event: yymmdd </TD>
</TR><TR>
<TD> 82-91 </TD><TD> a10 </TD><TD> time of source event: hhmmss.sss </TD>
</TR>
</TABLE>
\sa sff::SRCE
\subsection subsec_definition_dast_line DAST line
This line holds information on the actual dataset
<TABLE>
<TR>
<TH> position </TH><TH> format </TH><TH> contents </TH>
</TR><TR>
<TD> 1-5 </TD><TD> a5 </TD><TD> DAST (identifier) </TD>
</TR><TR>
<TD> 7-16 </TD><TD> i10 </TD><TD>
number of characters in encoded dataset <BR>
From library version 1.10 this field may be -1.
In this case the reading program has to determine
the number of characters itself by detecting the
CHK2 line. This change was necessary to implement
the C++ version of libsff since this starts writing
without having encoded the whole trace already.
</TD>
</TR><TR>
<TD> 18-33 </TD><TD> e16.6 </TD><TD> ampfac <BR>
This is a factor to scale the (floating point)
dataset to a desirable dynamic range
before converting it to Fortran integer
values. After reading the dataset and
decoding and converting it to floating point
you have to multiply each sample by ampfac
to get back the original values.
As the maximum range of integer values goes
from -(2.**31) to (2.**31)-1 you might
like to adjust the maximum integer value
to 0x7FFFFFFF. This may cause problems
as the second differences compressing algorithm
may increase the dynamic range of your data
by a factor of four in the worst case.
It is safe to adjust the largest absolute
value in the dataset to (2.**23)-1 which
is 0x7FFFFF. </TD>
</TR><TR>
<TD> 35-44 </TD><TD> a10 </TD><TD>
code with a combination of three possible
characters indicating possible optional blocks
and a following further dataset:<BR>
F: a FREE block follows after dataset<BR>
I: an INFO line follows after dataset<BR>
D: there is another Data Block following
in this file (this must be the last
character in code) </TD>
</TR>
</TABLE>
\sa sff::DAST
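The use of ampfac can be illustrated by a short sketch. The function name \c scale_to_int is hypothetical and not part of libsffxx or libsff. It follows the recommendation given above and scales the largest absolute value in the dataset to (2**23)-1, which leaves room for the factor of four that second differences may add in the worst case.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of libsffxx). Converts floating point
// samples to integers such that the largest absolute value becomes
// (2**23)-1 = 0x7FFFFF. Returns ampfac, the factor by which the integer
// samples have to be multiplied to get back the original values.
double scale_to_int(const std::vector<double>& data, std::vector<int>& idata)
{
  const double maxint = 8388607.0;
  double maxabs = 0.0;
  for (std::size_t i = 0; i < data.size(); ++i)
  { maxabs = std::max(maxabs, std::fabs(data[i])); }
  // a dataset containing only zeros is passed with ampfac 1
  const double ampfac = (maxabs > 0.0) ? maxabs / maxint : 1.0;
  idata.resize(data.size());
  for (std::size_t i = 0; i < data.size(); ++i)
  { idata[i] = static_cast<int>(std::lround(data[i] / ampfac)); }
  return ampfac;
}
```

After decoding, the reading program multiplies each integer sample by ampfac. The restored values differ from the original ones by at most the rounding error of half an integer step times ampfac.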
\subsection subsec_definition_wid2_line WID2 line
(is 132 characters wide!)
This waveform identification line holds information on the dataset
as defined in GSE2.0 format.
<TABLE>
<TR>
<TH> position </TH><TH> name </TH><TH> format </TH><TH> contents </TH>
</TR><TR>
<TD> 1-4 </TD><TD> id </TD><TD> a4 </TD><TD> WID2 (identifier) </TD>
</TR><TR>
<TD> 6-15 </TD><TD> date </TD><TD> i4,a1,i2,a1,i2 </TD><TD>
date of first sample: yyyy/mm/dd </TD>
</TR><TR>
<TD> 17-28 </TD><TD> time </TD><TD> i2,a1,i2,a1,f6.3</TD><TD>
time of first sample: hh:mm:ss.sss </TD>
</TR><TR>
<TD> 30-34 </TD><TD> station </TD><TD> a5 </TD><TD>
for a valid GSE2.0 block use ISC station code </TD>
</TR><TR>
<TD> 36-38 </TD><TD> channel </TD><TD> a3 </TD><TD>
for a valid GSE2.0 block use FDSN channel designator </TD>
</TR><TR>
<TD> 40-43 </TD><TD> auxid </TD><TD> a4 </TD><TD>
auxiliary identification code </TD>
</TR><TR>
<TD> 45-47 </TD><TD> datatype </TD><TD>a3 </TD><TD>
must be CM6 in SFF </TD>
</TR><TR>
<TD> 49-56 </TD><TD> samps </TD><TD> i8 </TD><TD>
number of samples </TD>
</TR><TR>
<TD> 58-68 </TD><TD> samprat </TD><TD> f11.6 </TD><TD>
data sampling rate in Hz </TD>
</TR><TR>
<TD> 70-79 </TD><TD> calib </TD><TD> e10.2 </TD><TD>
calibration factor </TD>
</TR><TR>
<TD> 81-87 </TD><TD> calper </TD><TD> f7.3 </TD><TD>
calibration period where calib is valid </TD>
</TR><TR>
<TD> 89-94 </TD><TD> instype </TD><TD> a6 </TD><TD>
instrument type (as defined in GSE2.0) </TD>
</TR><TR>
<TD> 96-100 </TD><TD> hang </TD><TD> f5.1 </TD><TD>
horizontal orientation of sensor, measured in degrees
clockwise from North (-1.0 if vertical) </TD>
</TR><TR>
<TD> 102-105 </TD><TD> vang </TD><TD> f4.1 </TD><TD>
vertical orientation of sensor, measured in degrees from
vertical (90.0 if horizontal) </TD>
</TR>
</TABLE>
\sa sff::WID2
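Since the WID2 line uses fixed column positions, fields can be extracted by position. The following sketch shows this for a few of the fields listed in the table. The names \c Wid2Sketch, \c field and \c parse_wid2 are hypothetical; the library itself provides sff::WID2 and GSE2::waveform::TWID2 for this purpose.

```cpp
#include <cstddef>
#include <string>

// Hypothetical container for a subset of the WID2 fields.
struct Wid2Sketch
{
  std::string station;
  std::string channel;
  std::string datatype;
  int samps;
  double samprat;
};

// Return the characters at positions first to last (counted from 1 as in
// the table). Returns an empty string if the line is too short.
std::string field(const std::string& line, std::size_t first, std::size_t last)
{
  if (first < 1 || last < first || line.size() < first)
  { return std::string(); }
  return line.substr(first - 1, last - first + 1);
}

// Extract some fields from a WID2 line. std::stoi and std::stod throw
// std::invalid_argument if a numeric field is empty or blank.
Wid2Sketch parse_wid2(const std::string& line)
{
  Wid2Sketch w;
  w.station = field(line, 30, 34);
  w.channel = field(line, 36, 38);
  w.datatype = field(line, 45, 47);
  w.samps = std::stoi(field(line, 49, 56));
  w.samprat = std::stod(field(line, 58, 68));
  return w;
}
```

The string fields are returned with their padding blanks. A real reader would additionally check the WID2 identifier in positions 1-4.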
\subsection subsec_definition_dat2_indicator DAT2 indicator
This line indicates the beginning of the encoded dataset.
The dataset follows in 80 characters wide lines.
\subsection subsec_definition_chk2_line CHK2 line
Provides a checksum for the dataset. The checksum has to be
calculated as defined in GSE2.0: