Version 2
Copyright © 2004-2012 Delaware Environmental Observing System
Published: February 27th 2004
Revision History | ||
---|---|---|
Revision 1.1 | 2004/02/27 | GEQ |
Initial version | ||
Revision 1.2 | 2004/03/02 | GEQ |
Updated database description section. | ||
Revision 1.3 | 2004/03/05 | GEQ |
Added historical processing information. | ||
Revision 1.4 | 2005/02/02 | GEQ |
Added stream table. | ||
Revision 1.5 | 2005/03/08 | GEQ |
Added update option. | ||
Revision 1.6 | 2005/05/31 | GEQ |
Added file action option. | ||
Revision 1.7 | 2005/08/09 | GEQ |
Added discussion of stream state to database parameters and operational use sections. | ||
Revision 1.8 | 2006/01/23 | GEQ |
Added discussion of flags and related issues. Added DelDOT-Main stream. | ||
Revision 2.1 | 2006/09/26 | GEQ |
Incremented version number of this document to 2. | ||
Revision 2.2 | 2007/04/02 | GEQ |
Added DGS well streams. | ||
Revision 2.3 | 2007/05/17 | GEQ |
Added records option. Added DNERR streams. Added append option. | ||
Revision 2.4 | 2007/07/27 | GEQ |
Added USACE streams. | ||
Revision 2.1.7 | 2008/08/14 | GEQ |
Added statistics option. | ||
Revision 2.1.8 | 2010/01/31 | GEQ |
Added NOS streams. | ||
Revision 2.1.11 | 2011/11/28 | GEQ |
Added idnwq program and DNREC-Water_Quality stream. |
Abstract
This technical note describes the DEOS ingest and idnwq programs, and the options for their use.
The DEOS ingest and idnwq programs parse data files containing meteorological data and store the data in the database for use by subsequent components of the DEOS system. The programs also implement QA/QC procedures.
The ingest program processes data in streams. The following table describes the supported streams.
Table 1. Ingest Streams
Stream Name | Description |
---|---|
NEXRAD | The only file type ingested at present is the digital precipitation array. |
DEOS | |
METAR-SFSS | A single-station dataset in a single file, as provided on the NWS web site. |
USGS-STREAM | USGS stream-flow stations. |
USGS-TIDAL | USGS tidal stations. |
RAWS | NWS fire weather stations. |
NDBC-BUOY | NDBC Ocean buoy stations. |
DelDOT-Main | State of Delaware Department of Transportation stations. |
DEOS-Wells | Delaware Geological Survey well data passed by DEOS communications. |
DGS-Wells | Delaware Geological Survey well data retrieved from web resources. |
DNERR-WQ-Multi | Delaware National Estuarine Research Reserve multi-station water quality data. |
DNERR-Buoy-Status | Delaware National Estuarine Research Reserve buoy status data. |
DNERR-Buoy-Wave | Delaware National Estuarine Research Reserve buoy wave data. |
USACE-Buoy-Current | US Army Corps of Engineers buoy data (current data). |
USACE-Buoy-Historical | US Army Corps of Engineers buoy data (historical data). |
NOS-Water-Level-Prediction | NOS data (prediction data). |
NOS-Water-Level-Obs | NOS data (observed data). |
DNREC-Water_Quality | DNREC water quality data. |
The ingest program is the main method for entering station data into the DEOS database.
The program is run from the command line with the command ingest.
ingest
{ -u username | --user username }
{ -p password | --password password }
{ -d database | --database database }
{ -s stream-name | --stream stream-name }
[ --update | --no-update ]
[ -a action | --file-action action ]
[ --append ]
[ --statistics ]
[ --records value ]
[ -v value | --verbose value ]
[ -h | --help ]
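As an illustration of the synopsis above, the following Python sketch assembles an argument list for ingest from the documented options. The helper name `build_ingest_command` and the example credentials are hypothetical; only the option names are taken from this note. The sketch builds the command but does not run it.

```python
from typing import List, Optional


def build_ingest_command(
    user: str,
    password: str,
    database: str,
    stream: str,
    update: Optional[bool] = None,
    file_action: Optional[str] = None,
    append: bool = False,
    statistics: bool = False,
    records: Optional[int] = None,
    verbose: Optional[int] = None,
) -> List[str]:
    """Return an argv list for the ingest program (hypothetical helper)."""
    if file_action is not None and file_action not in ("n", "m", "d", "c"):
        raise ValueError("file action must be one of n, m, d, c")
    # The four required options come first.
    argv = ["ingest", "--user", user, "--password", password,
            "--database", database, "--stream", stream]
    # --update and --no-update are alternatives; None leaves the
    # database setting in effect.
    if update is True:
        argv.append("--update")
    elif update is False:
        argv.append("--no-update")
    if file_action is not None:
        argv += ["--file-action", file_action]
    if append:
        argv.append("--append")
    if statistics:
        argv.append("--statistics")
    if records is not None:
        argv += ["--records", str(records)]
    if verbose is not None:
        argv += ["--verbose", str(verbose)]
    return argv


# Example values are placeholders, not real credentials.
cmd = build_ingest_command("deos_user", "secret", "deos", "USGS-STREAM",
                           update=False, file_action="m", records=0)
print(" ".join(cmd))
```

A list of this form could be handed to `subprocess.run(cmd, check=True)` on a host where the ingest program is installed.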
The ingest program uses database configuration parameters to determine how to handle the data streams it is presented with.
The specification for the user option is:
{ -u username | --user username | --user=username }
This option will allow the program to connect to the database using the username provided. The user must already exist, and have sufficient permissions to access the database specified. This item is required.
The specification for the password option is:
{ -p password | --password password | --password=password }
This option supplies the password that, in combination with the username, allows the program to connect to the database. This item is required.
The specification for the database option is:
{ -d database | --database database | --database=database }
This option will specify which database the program should use as a source of data, configuration items and a destination for any event logging items. This item is required.
The specification for the stream option is:
{ -s stream-name | --stream stream-name | --stream=stream-name }
This option specifies the name of the stream to process. This item is required.
The specification for the update option is:
[--update]
This option overrides any database setting and forces the ingest program to update any data already existing in the database.
The specification for the no update option is:
[--no-update]
This option overrides any database setting and forces the ingest program not to update any data already existing in the database. It is typically used to speed up the ingest process during bulk upload.
The specification for the file action option is:
[ -a { n | m | d | c } | --file-action { n | m | d | c } | --file-action= { n | m | d | c } ]
This option specifies what action is to be performed on the input file following successful processing. The possibilities are [n]othing, [m]ove, [d]elete or [c]opy. The default is to move the file to the designated archive directory. If a problem is detected during processing, the file will be moved to the appropriate unprocessed directory.
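The four file actions can be sketched in Python as follows. This is a minimal illustration of the successful-processing path only, not the ingest program's own code: the function name `apply_file_action` and the directory layout are assumptions, and the move to an unprocessed directory on failure is not shown.

```python
import shutil
import tempfile
from pathlib import Path


def apply_file_action(path: Path, action: str, archive_dir: Path) -> None:
    """Apply a post-processing action: n, m, d or c (hypothetical helper)."""
    if action == "n":
        return  # nothing: leave the file where it is
    if action == "m":
        # move: the file ends up only in the archive directory
        shutil.move(str(path), str(archive_dir / path.name))
    elif action == "d":
        path.unlink()  # delete: remove the input file
    elif action == "c":
        # copy: the file exists in both places afterwards
        shutil.copy2(str(path), str(archive_dir / path.name))
    else:
        raise ValueError(f"unknown file action: {action!r}")


# Demonstration in a temporary directory, so no real data is touched.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    archive = root / "archive"
    archive.mkdir()
    src = root / "obs.dat"
    src.write_text("sample\n")
    apply_file_action(src, "m", archive)
    moved = (archive / "obs.dat").exists() and not src.exists()
```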
The specification for the append option is:
[ --append ]
This option, if provided, indicates that a timestamp of the current UTC time be appended to files after they are copied or moved. The default is to not add a timestamp.
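A timestamp suffix of this kind might be produced as in the sketch below. This note does not state the timestamp format or separator, so the `YYYYMMDDHHMMSS` layout, the dot separator, and the function name `with_utc_timestamp` are all assumptions made for illustration.

```python
from datetime import datetime, timezone
from typing import Optional


def with_utc_timestamp(filename: str, now: Optional[datetime] = None) -> str:
    """Append a UTC timestamp to a file name (format is an assumption)."""
    if now is None:
        now = datetime.now(timezone.utc)
    # Convert to UTC in case the caller passed another time zone.
    stamp = now.astimezone(timezone.utc)
    return f"{filename}.{stamp:%Y%m%d%H%M%S}"


example = with_utc_timestamp(
    "obs.dat", datetime(2007, 5, 17, 12, 0, 0, tzinfo=timezone.utc))
```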
The specification for the statistics option is:
[ --statistics ]
This option, if provided, indicates that the ingest program should update the metadata statistics recorded in the database.
The specification for the records option is:
[ --records value | --records=value ]
This option specifies the maximum number of data records that will be processed. If the option is not provided, or the value is zero, all data records will be processed.
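The record limit, including the rule that zero means no limit, can be sketched as below. The function name `limit_records` is hypothetical, and rejecting negative values is an assumption, since this note does not say how they are treated.

```python
from itertools import islice
from typing import Iterable, Iterator, Optional


def limit_records(records: Iterable[str],
                  max_records: Optional[int] = None) -> Iterator[str]:
    """Yield at most max_records items; None or 0 means no limit."""
    if max_records is not None and max_records < 0:
        # Assumption: negative limits are treated as an error.
        raise ValueError("records value must not be negative")
    if not max_records:
        return iter(records)  # option omitted or zero: process everything
    return islice(records, max_records)


lines = ["rec1", "rec2", "rec3"]
first_two = list(limit_records(lines, 2))
everything = list(limit_records(lines, 0))
```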
The specification for the verbose option is:
[ -v value | --verbose value ]
The value provided for the verbose setting (a digit between 0 and 5) indicates how much output is generated, 0 being minimal and 5 being extremely verbose. Positive values send messages to the ELF; negative values also print the messages to the screen. If no value is provided, the setting is assumed to be 1.
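One way to read the verbose value is sketched below: the magnitude gives the level of detail and the sign decides whether messages are also echoed to the screen. That split is an interpretation of the description above rather than something this note spells out, and the function name `interpret_verbose` is hypothetical.

```python
from typing import Tuple


def interpret_verbose(value: int = 1) -> Tuple[int, bool]:
    """Split a verbose value into (level, echo_to_screen)."""
    level = abs(value)
    if level > 5:
        raise ValueError("verbose level must be between 0 and 5")
    # Assumption: a negative value means "log and also print to screen".
    return level, value < 0


default_setting = interpret_verbose()      # option given without a value
quiet_logging = interpret_verbose(3)       # log only
echoed_logging = interpret_verbose(-3)     # log and print
```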
The following database tables provide configuration options for the ingest program.
This table contains data for the network associated with the specified stream, such as destination path, network name and ID.
This table contains data for the specified ingest method for the stream, such as stream ID as well as the lock parameter discussed below.
This table contains data for the ingest data type (e.g., air temperature or precipitation, etc.), and minimum and maximum permitted values. Ingested items outside those bounds are flagged and thus not available for user viewing.
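The bounds check described for the data type table can be sketched as follows. The class and function names and the example limits are hypothetical; the real limits come from the database table, and this note does not say which flag bit records a range failure, so the sketch returns a plain boolean.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataTypeLimits:
    """Permitted range for one ingest data type (illustrative only)."""
    name: str
    minimum: float
    maximum: float


def out_of_bounds(value: float, limits: DataTypeLimits) -> bool:
    """Return True when the value lies outside the permitted range."""
    # Written as "not within" so that NaN is also treated as out of range.
    return not (limits.minimum <= value <= limits.maximum)


# Example limits are invented for this sketch.
air_temperature = DataTypeLimits("air temperature", -50.0, 60.0)
flag_normal = out_of_bounds(21.5, air_temperature)
flag_extreme = out_of_bounds(99.0, air_temperature)
```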
For operational use, the ingest program is expected to be run from the UNIX cron system at a time interval appropriate to the frequency of data updates at the remote site.
Once an ingest process starts processing a specific stream, no other ingest process may process that stream's data. This is implemented by a database lock field that is only relinquished when the ingest process holding it completes successfully.
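A per-stream lock of this kind can be sketched with a conditional update, shown here against an in-memory SQLite database so the example is self-contained. The table name `ingest_method` and the columns `stream_name` and `locked` are hypothetical stand-ins; the actual DEOS schema and database engine are not described in this note.

```python
import sqlite3


def acquire_stream_lock(conn: sqlite3.Connection, stream: str) -> bool:
    """Try to take the lock for a stream; return True on success."""
    # The WHERE clause makes the update succeed only if the lock is free.
    cur = conn.execute(
        "UPDATE ingest_method SET locked = 1 "
        "WHERE stream_name = ? AND locked = 0",
        (stream,),
    )
    conn.commit()
    return cur.rowcount == 1


def release_stream_lock(conn: sqlite3.Connection, stream: str) -> None:
    """Release the lock after the stream was processed successfully."""
    conn.execute(
        "UPDATE ingest_method SET locked = 0 WHERE stream_name = ?",
        (stream,),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ingest_method ("
    "stream_name TEXT PRIMARY KEY, locked INTEGER NOT NULL DEFAULT 0)"
)
conn.execute("INSERT INTO ingest_method (stream_name) VALUES ('RAWS')")
conn.commit()

first = acquire_stream_lock(conn, "RAWS")    # lock is free
second = acquire_stream_lock(conn, "RAWS")   # already held
release_stream_lock(conn, "RAWS")
third = acquire_stream_lock(conn, "RAWS")    # free again after release
```

Because the lock is released only after successful completion, a run that fails leaves the stream locked until the lock field is reset.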
The ingest program implements QA and QC processes on all ingested data according to the various levels described here.
Stage zero checks whether the data time is in the future. If so, bit 5 of the flag is set to 1.
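The stage zero check can be sketched as a single bit operation. The bit number 5 comes from the text above; counting bits from zero at the least significant end is an assumption, as are the function and constant names.

```python
from datetime import datetime, timedelta, timezone

# Bit number from the text; zero-based numbering is an assumption.
FUTURE_TIME_BIT = 5


def stage_zero(flag: int, obs_time: datetime, now: datetime) -> int:
    """Set the future-time bit when the observation time is after now."""
    if obs_time > now:
        flag |= 1 << FUTURE_TIME_BIT
    return flag


now = datetime(2006, 1, 23, 12, 0, tzinfo=timezone.utc)
past_flag = stage_zero(0, now - timedelta(hours=1), now)
future_flag = stage_zero(0, now + timedelta(hours=1), now)
```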
The idnwq program is used to enter historical DNREC water quality data into the database.
The program is run from the command line with the command idnwq.
idnwq
{ -u username | --user username }
{ -p password | --password password }
{ -d database | --database database }
{ -f filename | --filename filename }
[ --update | --no-update ]
[ --append ]
[ --statistics ]
[ -v value | --verbose value ]
[ -h | --help ]
The idnwq program uses database configuration parameters to determine how to handle the data it is presented with.
The specification for the user option is:
{ -u username | --user username | --user=username }
This option will allow the program to connect to the database using the username provided. The user must already exist, and have sufficient permissions to access the database specified. This item is required.
The specification for the password option is:
{ -p password | --password password | --password=password }
This option supplies the password that, in combination with the username, allows the program to connect to the database. This item is required.
The specification for the database option is:
{ -d database | --database database | --database=database }
This option will specify which database the program should use as a source of data, configuration items and a destination for any event logging items. This item is required.
The specification for the filename option is:
{ -f filename | --filename filename | --filename=filename }
This option specifies the name of the input file. This item is required.
The specification for the update option is:
[--update]
This option overrides any database setting and forces the idnwq program to update any data already existing in the database.
The specification for the no update option is:
[--no-update]
This option overrides any database setting and forces the idnwq program not to update any data already existing in the database. It is typically used to speed up the ingest process during bulk upload.
The specification for the append option is:
[ --append ]
This option, if provided, indicates that a timestamp of the current UTC time be appended to files after they are copied or moved. The default is to not add a timestamp.
The specification for the statistics option is:
[ --statistics ]
This option, if provided, indicates that the idnwq program should update the metadata statistics recorded in the database.
The specification for the verbose option is:
[ -v value | --verbose value ]
The value provided for the verbose setting (a digit between 0 and 5) indicates how much output is generated, 0 being minimal and 5 being extremely verbose. Positive values send messages to the ELF; negative values also print the messages to the screen. If no value is provided, the setting is assumed to be 1.
The filename option must point to an input file conforming to the description given in this section. The first line is assumed to be a header line and is skipped during processing.
Each record (line) is assumed to consist of the following comma-delimited fields.
Station Name
Date of observation in YYYY/MM/DD HH:MM:SS format
Data type ID
Data value
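Reading a file in this layout can be sketched with Python's `csv` module. The record type, the function name, and the sample rows below are invented for illustration; treating the data type ID as an integer and the data value as a floating-point number is also an assumption.

```python
import csv
import io
from datetime import datetime
from typing import Iterator, NamedTuple, TextIO


class Observation(NamedTuple):
    station: str
    time: datetime
    data_type_id: int   # assumption: IDs are integers
    value: float        # assumption: values are numeric


def read_observations(handle: TextIO) -> Iterator[Observation]:
    """Yield one Observation per data line, skipping the header line."""
    reader = csv.reader(handle)
    next(reader, None)  # the first line is a header and is skipped
    for row in reader:
        if not row:
            continue  # tolerate blank lines
        station, stamp, type_id, value = (field.strip() for field in row)
        yield Observation(
            station,
            datetime.strptime(stamp, "%Y/%m/%d %H:%M:%S"),
            int(type_id),
            float(value),
        )


# Sample content is made up; it only mirrors the field order above.
sample = io.StringIO(
    "Station,Date,Type,Value\n"
    "Example Station,2011/11/28 09:30:00,12,7.4\n"
    "Example Station,2011/11/28 10:30:00,12,7.6\n"
)
observations = list(read_observations(sample))
```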
The following database tables provide configuration options for the idnwq program.
This table contains data for the ingest data type (e.g., air temperature or precipitation, etc.), and minimum and maximum permitted values. Ingested items outside those bounds are flagged and thus not available for user viewing.