Commit f9e7bdda authored by BorjaEst

Version 0.4.0

Package simplification removing loop processing across sources
2 main scripts: norm & skim to divide functionality
Packaging using pip instead of docker
Documentation updated
parent b637952b
......@@ -3,8 +3,6 @@ config:
o3skim:
repo: 'https://git.scc.kit.edu/synergy.o3as/o3skim.git'
branch: master
dockerhub: synergyimk/o3skim
dockertag: latest
sqa_criteria:
qc_style:
......@@ -15,14 +13,6 @@ sqa_criteria:
tox_file: /o3skim/tox.ini
testenv:
- stylecheck
qc_coverage:
repos:
o3skim:
container: testing
tox:
tox_file: /o3skim/tox.ini
testenv:
- unittesting
qc_functional:
repos:
o3skim:
......
# Dockerfile has three Arguments: base, tag, branch
# base - base image (default: python)
# tag - tag for base image (default: 3.6-slim)
# branch - user repository branch to clone (default: master)
#
# To build the image:
# $ docker build -t <dockerhub_user>/<dockerhub_repo> --build-arg arg=value .
# or using default args:
# $ docker build -t <dockerhub_user>/<dockerhub_repo> .
# set the base image. default is python
ARG base=python
# set the tag (e.g. latest, 3.8, 3.7 : for python)
ARG tag=3.6-slim
# Base image, e.g. python:3.6-slim
FROM ${base}:${tag}
LABEL maintainer='B.Esteban, T.Kerzenmacher, V.Kozlov (KIT)'
# What branch to clone (!)
ARG branch=master
# Which user and group to use
ARG user=application
ARG group=standard
# Set environments
ENV LANG C.UTF-8
# Install system updates and tools
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y --no-install-recommends \
# Install system updates and tools
ca-certificates \
gcc \
g++ \
git && \
# Clean up & back to dialog front end
apt-get autoremove -y && \
apt-get clean -y && \
rm -rf /var/lib/apt/lists/*
ENV DEBIAN_FRONTEND=dialog
# Install user app:
RUN git clone --depth 1 -b ${branch} https://git.scc.kit.edu/synergy.o3as/o3skim.git app && \
# Install python application
cd app && \
pip3 install --no-cache-dir -e . && \
pip3 install --no-cache-dir tox && \
# Clean up
rm -rf /root/.cache/pip/* && \
rm -rf /tmp/*
WORKDIR /app
# Change user context and drop root privileges
RUN groupadd -r ${group} && \
useradd --no-log-init -r -d /app -g ${group} ${user} && \
chown -R ${user} .
USER ${user}
# Start default script
ENTRYPOINT [ "main" ]
#!/usr/bin/groovy
@Library(['github.com/indigo-dc/jenkins-pipeline-library@2.0.0']) _
@Library(['github.com/indigo-dc/jenkins-pipeline-library@2.1.0']) _
def projectConfig
......
......@@ -7,7 +7,7 @@
<div align="center">
[![Build Status](https://jenkins.eosc-synergy.eu/buildStatus/icon?job=eosc-synergy-org%2Fo3skim%2Ftest)](https://jenkins.eosc-synergy.eu/job/eosc-synergy-org/job/o3skim/job/master)
[![Build Status](https://jenkins.eosc-synergy.eu/buildStatus/icon?job=eosc-synergy-org%2Fo3skim%2Fmaster)](https://jenkins.eosc-synergy.eu/job/eosc-synergy-org/job/o3skim/job/master)
[![Documentation Status](https://readthedocs.org/projects/o3as/badge/?version=latest)](https://o3as.readthedocs.io/en/latest/?badge=latest)
[![pipeline status](https://git.scc.kit.edu/synergy.o3as/o3skim/badges/master/pipeline.svg)](https://git.scc.kit.edu/synergy.o3as/o3skim/-/commits/master)
[![coverage status](https://git.scc.kit.edu/synergy.o3as/o3skim/badges/master/coverage.svg)](https://git.scc.kit.edu/synergy.o3as/o3skim/-/commits/master)
......@@ -24,90 +24,20 @@
# 📝 Table of Contents
- [About](#about)
- [Build using docker](#build)
- [Run using udocker](#deployment)
- [Testing](#testing)
- [Documentation](https://o3as.readthedocs.io/en/latest)
- [Authors](#authors)
- [TODO](https://git.scc.kit.edu/synergy.o3as/o3skim/-/issues)
- [Issues & ToDo](https://git.scc.kit.edu/synergy.o3as/o3skim/-/issues)
# About <a name = "about"></a>
This project provides the tools to preprocess, standardize and reduce ozone data for later transfer and plotting.
## Prerequisites
To run the project as container, you need the following systems and container technologies:
- __Build machine__ with [docker](https://docs.docker.com/engine/install/)
- __Runtime machine__ with [udocker](https://indigo-dc.gitbook.io/udocker/installation_manual)
This software is shipped as python3 package, therefore you need to have python3
and pip installed. If not, please check [pip documentation](https://pip.pypa.io/en/stable/installing/) to find out how to run it in your system.
> Note udocker cannot be used to build containers, only to run them.
> Non admin rights? Check how to run [conda](https://docs.conda.io/en/latest/) in your machine.
# Built using docker <a name = "build"></a>
Download the repository at the __Build machine__ using git.
```sh
$ git clone git@git.scc.kit.edu:synergy.o3as/o3skim.git
Cloning into 'o3skim'...
...
```
Build the docker image at the __Build machine__ using docker.
```sh
$ docker build --tag o3skim .
...
Successfully built 69587025a70a
Successfully tagged o3skim:latest
```
If the build process succeeded, you can list the image on the docker image list:
```sh
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
o3skim latest 69587025a70a xx seconds ago 557MB
...
```
# Run using udocker <a name = "deployment"></a>
To deploy the application using __udocker__ at the __Runtime machine__ you need:
- Input path with data to skim, to be mounted on `/app/data` inside the container.
- Output path for skimmed results, to be mounted on `/app/output` inside the container.
- Configuration file with a data structure description at the input path in [YAML](https://yaml.org/) format.
This configuration file has to be mounted on `/app/sources.yaml` inside the container.
Once the requirements are met, pull the image from the image registry.
For example, to pull it from the synergy-imk official registry use:
```sh
$ udocker pull synergyimk/o3skim
...
Downloading layer: sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4
...
```
Once the repository is added and the image downloaded, create the local container:
```sh
$ udocker create --name=o3skim synergyimk/o3skim
fa42a912-b0d4-3bfb-987f-1c243863802d
```
Finally, run the container. Note that the described _data_, _output_ and _sources.yaml_ have to be provided. You also need to specify that the user _application_ runs inside the container:
```sh
$ udocker run \
--user=application \
--volume=${PWD}/sources.yaml:/app/sources.yaml \
--volume=${PWD}/data:/app/data \
--volume=${PWD}/output:/app/output \
o3skim --verbosity INFO ${action1} ${action2}
...
executing: main
...
2020-08-25 12:42:34,151 - INFO - Configuration found at: './sources.yaml'
2020-08-25 12:42:34,152 - INFO - Loading data from './data'
2020-08-25 12:42:34,261 - INFO - Skimming data to './output'
```
For a description of the main function and help on its commands, run:
```sh
$ udocker run --user=application o3skim --help
...
```
# Testing <a name = "testing"></a>
Testing is based on [sqa-baseline](https://indigo-dc.github.io/sqa-baseline/) criteria. On top, [tox](https://tox.readthedocs.io/en/latest/) automation is used to simplify the testing process.
......
......@@ -52,4 +52,4 @@ html_theme = 'sphinx_rtd_theme'
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
\ No newline at end of file
html_static_path = ['_static']
Local installation
==================================
To run tests, you need to install the tool in your system without docker.
As a first step, ensure you have the following dependencies:
- python_ > 3.6
- pip_ > 20.0.2
- gcc_
- g++_
After downloading the repository and checking the dependencies, install with pip:
.. code-block:: bash
$ pip install -e .
.. _python: https://www.python.org
.. _pip: https://pypi.org
.. _gcc: https://gcc.gnu.org
.. _g++:
o3skim package
Package index
==============
.. automodule:: o3skim
:members: loading, processing, grouping, saving
:members: process, save
o3skim loads
--------------------------
.. automodule:: o3skim.loads
:members: tco3, vmro3
:members: ccmi, ecmwf, esacci, sbuv
o3skim standardization
o3skim normalization
--------------------------
.. automodule:: o3skim.standardization
:members: tco3, vmro3
.. automodule:: o3skim.normalization
:members: run
o3skim operations
--------------------------
.. automodule:: o3skim.loads
.. automodule:: o3skim.operations
:members: run
o3skim extended_xarray
--------------------------
.. automodule:: o3skim.extended_xarray
:inherited-members:
:members: O3Accessor, TCO3Accessor, VMRO3Accessor
o3skim utils
--------------------------
.. automodule:: o3skim.utils
:inherited-members:
:members: cd, load, save, mergedicts
Tests
Test guidelines
==================================
Testing is based on sqa-baseline_ criteria, tox_ automation is used to
......
Build
===================
Download the code from the o3skim_ repository at the **Build machine**.
For example, using git_:
.. code-block:: bash
$ git clone git@git.scc.kit.edu:synergy.o3as/o3skim.git
Cloning into 'o3skim'...
...
.. _o3skim: https://git.scc.kit.edu/synergy.o3as/o3skim
.. _git: https://git-scm.com/
Build the container image at the **Build machine**.
For example, using docker_:
.. code-block:: bash
$ docker build --tag o3skim .
...
Successfully built 69587025a70a
Successfully tagged o3skim:latest
.. _docker: https://docs.docker.com/engine/reference/commandline/build
If the build process succeeded, then you should see the image name on the container images list:
.. code-block:: bash
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
o3skim latest 69587025a70a xx seconds ago 557MB
...
To use your newly generated image on the **Runtime machine**, the easiest way is to
push it to a dockerhub repository. For example, with docker_:
.. code-block:: bash
$ docker push <repository>/o3skim:<tag>
The push refers to repository [docker.io/........./o3skim]
...
7e84795fccac: Preparing
7e84795fccac: Layer already exists
ffaeb20d9e23: Layer already exists
4cdd6a90e552: Layer already exists
3e0762bebc71: Layer already exists
1e441fe06d90: Layer already exists
98ff2784e9f5: Layer already exists
2b99e2403063: Layer already exists
d0f104dc0a1f: Layer already exists
...: digest: sha256:...................... size: 2004
If you do not have internet access from the **Build machine** or **Runtime machine**
it is also possible to use `docker save`_ to export your images.
.. _`docker save`: https://docs.docker.com/engine/reference/commandline/save/
Command Line Interface
=======================
Usage:
.. code-block:: bash
usage: main [-h] [-f SOURCES_FILE] [-s {year,decade}]
[-v {DEBUG,INFO,WARNING,ERROR,CRITICAL}]
{lon_mean,lat_mean} [{lon_mean,lat_mean} ...]
To run the application using **udocker** at the **Runtime machine**
you need to provide the following volumes to the container:
- --volume, mount `/app/data`: Input path with data to skim.
- --volume, mount `/app/output`: Output path for skimmed results.
- --volume, mount `/app/sources.yaml`: Configuration file with a data structure
description at the input path in YAML_ format.
See :doc:`../getting_started/source-file` for a configuration example.
.. _YAML: https://yaml.org/
Also, in the specific case of udocker_, you need to specify that the
user `application` runs inside the container.
.. _udocker: https://indigo-dc.gitbook.io/udocker
For example, to run the container using udocker_ use the following:
.. code-block:: bash
$ udocker run \
--user=application \
--volume=${PWD}/sources.yaml:/app/sources.yaml \
--volume=${PWD}/data:/app/data \
--volume=${PWD}/output:/app/output \
o3skim --verbosity INFO lon_mean
...
executing: main
...
2020-08-25 12:42:34,151 - INFO - Configuration found at: './sources.yaml'
2020-08-25 12:42:34,152 - INFO - Loading data from './data'
2020-08-25 12:42:34,261 - INFO - Skimming data to './output'
For a description of the main function and help on its commands, run:
.. code-block:: bash
$ udocker run --user=application o3skim --help
Where positional arguments are the o3skim operations to perform:
.. code-block:: bash
o3skim is a tool for data pre-processing of ozone applications:
- lon_mean: Mean operation over longitude axis.
- lat_mean: Mean operation over latitude axis.
positional arguments:
{lon_mean,lat_mean} o3skim operations to perform
optional arguments:
-h, --help Show this help message and exit
-f SOURCES_FILE, --sources_file SOURCES_FILE
Custom sources YAML configuration; default:./sources.yaml
-s {year,decade}, --split_by {year,decade}
Period time to split output; default: None
-v {DEBUG,INFO,WARNING,ERROR,CRITICAL}, --verbosity {DEBUG,INFO,WARNING,ERROR,CRITICAL}
Sets the logging level; default: ERROR
Note that SOURCES_FILE is only modified for development purposes as usually any
file from host can be mounted using the container directive '--volume'.
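The two operations listed in the help text are means over one axis of the data. The following is a minimal plain-Python sketch of that idea, assuming a small grid stored as ``grid[lat][lon]``. The functions below only illustrate the arithmetic; they are not the tool's implementation, which works on netCDF datasets.

```python
from statistics import fmean


def lon_mean(grid):
    """Average over the longitude axis: one value per latitude row."""
    return [fmean(row) for row in grid]


def lat_mean(grid):
    """Average over the latitude axis: one value per longitude column."""
    return [fmean(column) for column in zip(*grid)]


# Hypothetical 2 (lat) x 3 (lon) grid of values.
grid = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
]

print(lon_mean(grid))  # [2.0, 5.0]
print(lat_mean(grid))  # [2.5, 3.5, 4.5]
```

Each operation removes the axis it averages over, which is why applying them reduces the size of the output.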
Deployment
==================================
To deploy the application using **udocker** at the **Runtime machine**
you need the o3skim container image.
The easiest way to deploy in your **Runtime machine** is by pulling the image
from a remote registry. You can use the official registry at synergyimk_ or use
the instructions at :doc:`build` to create your image and upload it to your own registry.
.. _synergyimk: https://hub.docker.com/r/synergyim
Once you have decided which registry to download from, pull the image from that registry.
For example, to pull it from the synergy-imk official registry use:
.. code-block:: bash
$ udocker pull synergyimk/o3skim
...
Downloading layer: sha256:......
...
Note it is also possible to use `udocker load`_ to import images generated by
`docker save`_.
.. _`udocker load`: https://indigo-dc.gitbook.io/udocker/user_manual#1-4-basic-flow
.. _`docker save`: https://docs.docker.com/engine/reference/commandline/save
Once the image is downloaded or imported, create the local container.
For example, if it was downloaded from synergyimk registry you can use:
.. code-block:: bash
$ udocker create --name=o3skim synergyimk/o3skim
fa42a912-b0d4-3bfb-987f-1c243863802d
Check the containers
available at the **Runtime machine**:
.. code-block:: bash
$ udocker ps
CONTAINER ID P M NAMES IMAGE
...
fa42a912-b0d4-3bfb-987f-1c243863802d . W ['o3skim'] synergyimk/o3skim:latest
Now you are ready to start using the container as `o3skim`. Read how to use the
:doc:`cli` as first steps to skim your data.
First steps
==================================
Prerequisites
----------------------------------
This software is shipped as **python3** package, therefore you need to have python3
and pip installed. If not, please check `pip documentation`_ to find out how to
install and run **pip** in your system with at least the following versions:
============= ===============
software version
============= ===============
python >= 3.6.12
pip >= 21.0.1
============= ===============
.. note:: Non admin rights? Check how to run conda_ in your machine.
.. _`pip documentation`: https://pip.pypa.io/en/stable/installing/
.. _conda: https://docs.conda.io/en/latest
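The minimum versions in the table above can be checked with a few lines of Python. This is a minimal sketch: the helper names ``parse_version`` and ``meets_minimum`` are illustrative and not part of the package, and only plain dotted numeric versions are handled.

```python
import sys

# Minimum versions taken from the prerequisites table.
MIN_PYTHON = "3.6.12"
MIN_PIP = "21.0.1"


def parse_version(text):
    """Turn a dotted version such as '21.0.1' into the tuple (21, 0, 1)."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(installed, minimum):
    """Return True when the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


# Check the interpreter that runs this script.
python_version = ".".join(str(number) for number in sys.version_info[:3])
status = "ok" if meets_minimum(python_version, MIN_PYTHON) else "too old"
print("python", python_version, status)

# A pip version string, as reported by `pip --version`, is checked the same way.
print("pip 20.0.2 ok?", meets_minimum("20.0.2", MIN_PIP))
```

Comparing tuples of integers avoids the pitfall of comparing version strings directly, where "3.10" would sort before "3.6".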
Installation
----------------------------------
Once **python3** and **pip** are running in your system, download the package and
install it using pip:
.. code-block:: bash
$ git clone https://git.scc.kit.edu/synergy.o3as/o3skim.git
Cloning into 'o3skim'...
...
$ cd o3skim
$ pip install .
...
Successfully installed o3skim-0.4.0
o3norm
=======================
You can standardize a dataset from any of the supported sources using the
**o3norm** command, which is provided when the package is installed.
.. code-block:: bash
usage: o3norm [-h] [-v {DEBUG,INFO,WARNING,ERROR,CRITICAL}] [-t TARGET]
(--tco3_zm | --vmro3_zm)
{CCMI-1,ECMWF,ESACCI,SBUV} ...
This command loads the model from the specified sources and produces a
standardized netCDF_ output with the following structure:
.. code-block::
Dimensions: (lat: _, lon: _, plev: _, time: _)
Coordinates:
* time (time) datetime64[ns] ____-__-__ ... ____-__-__T__:__:__
* plev (plev) float64 ___._ ... ___._
* lon (lon) float64 ___._ ... ___._
* lat (lat) float64 ___._ ... ___._
Data variables:
tco3_zm (time, lon, lat) float64 __
vmro3_zm (time, plev, lon, lat) float64 __
...
Attributes:
<The original dataset attributes>
.. _netCDF: https://www.unidata.ucar.edu/software/netcdf
Having a common data structure makes it easier to work with data from
multiple sources.
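One practical benefit of a fixed layout is that downstream code can verify it before processing. The sketch below assumes the variable dimensions have already been read into a plain mapping of names to dimension tuples. ``EXPECTED_DIMS`` and ``layout_problems`` are illustrative names, not part of the package.

```python
# Dimension order of the standardized variables, as listed in the structure above.
EXPECTED_DIMS = {
    "tco3_zm": ("time", "lon", "lat"),
    "vmro3_zm": ("time", "plev", "lon", "lat"),
}


def layout_problems(variables):
    """Return a message for each known variable whose dimensions differ.

    `variables` maps a variable name to its dimension names. Names that are
    not in EXPECTED_DIMS are ignored, since a dataset may hold other variables.
    """
    problems = []
    for name, dims in variables.items():
        expected = EXPECTED_DIMS.get(name)
        if expected is not None and tuple(dims) != expected:
            problems.append(f"{name}: found {tuple(dims)}, expected {expected}")
    return problems


# Hypothetical dataset description with one wrongly shaped variable.
dataset = {
    "tco3_zm": ("time", "lon", "lat"),
    "vmro3_zm": ("time", "lat"),
}
for message in layout_problems(dataset):
    print(message)
```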
The usage is very simple: call the **o3norm** command followed by the specific
model type you would like to load as a *Sub-command*. Note that some optional
arguments are common to all source types (for example --target) and others
are specific to each source type (for example --delimiter in the case of SBUV).
.. code-block::
positional arguments (Sub-commands):
{CCMI-1,ECMWF,ESACCI,SBUV}
Sub-commands
CCMI-1 CCMI-1 Source input
ECMWF ECMWF Source input
ESACCI ESACCI Source input
SBUV SBUV Source input
optional arguments:
-h, --help show this help message and exit
-v {DEBUG,INFO,WARNING,ERROR,CRITICAL}, --verbosity {DEBUG,INFO,WARNING,ERROR,CRITICAL}
Sets the logging level (default: INFO)
-t TARGET, --target TARGET
Target netCDF file (default: o3data)
--tco3_zm Standardization for total column ozone
--vmro3_zm Standardization for volume mixing ratio ozone
You can use the *optional argument* **--help** to see the *Command* and
*Sub-command* instructions. For example, to see the options available for the
*CCMI-1* you can use:
.. code-block::
$ o3norm CCMI-1 --help
usage: o3norm CCMI-1 [-h] [--time TIME] [--plev PLEV] [--lat LAT] [--lon LON]
variable paths [paths ...]
...
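The command shape shown above (common options, a required choice between two flags, then a sub-command with its own arguments) can be reproduced with the standard ``argparse`` module. The following is only a sketch that mirrors the usage text. The function ``build_parser`` and the file names are hypothetical, only the CCMI-1 sub-command is declared, and the actual o3norm parser may be written differently.

```python
import argparse

LEVELS = ["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"]


def build_parser():
    """Build a parser shaped like the o3norm usage text (illustrative only)."""
    parser = argparse.ArgumentParser(prog="o3norm")
    # Options common to every source type.
    parser.add_argument("-v", "--verbosity", choices=LEVELS, default="INFO")
    parser.add_argument("-t", "--target", default="o3data")
    # Exactly one of the two standardizations must be selected.
    variable = parser.add_mutually_exclusive_group(required=True)
    variable.add_argument("--tco3_zm", action="store_true")
    variable.add_argument("--vmro3_zm", action="store_true")
    # Each source type is a sub-command with its own arguments.
    subparsers = parser.add_subparsers(dest="source", required=True)
    ccmi = subparsers.add_parser("CCMI-1")
    ccmi.add_argument("variable")
    ccmi.add_argument("paths", nargs="+")
    return parser


# Hypothetical file names, parsed as if typed on the command line.
args = build_parser().parse_args(
    ["--tco3_zm", "-t", "mydata.nc", "CCMI-1", "toz", "a.nc", "b.nc"]
)
print(args.source, args.variable, args.paths, args.target)
```

Keeping the common options on the top-level parser and the source-specific ones on each sub-parser is what makes ``o3norm CCMI-1 --help`` show only the arguments relevant to that source.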
As a last example, the following commands show how to write standardized
netCDF_ output to the file *mydata.nc* using the files provided
from a CCMI-1 source:
.. code-block::
$ o3norm --tco3_zm -t mydata.nc CCMI-1 toz Ccmi/some_path/*.nc
$ o3norm --vmro3_zm -t mydata.nc CCMI-1 vmro3 Ccmi/some_path/*.nc
...
Note that the first command loads **toz** from the source as **tco3_zm** and the
second loads **vmro3** as **vmro3_zm**. Both commands target the file
*mydata.nc*, so that file will contain both variables.
.. _netCDF: https://www.unidata.ucar.edu/software/netcdf
o3skim
=======================
You can reduce an output dataset produced by **o3norm** using the **o3skim**
command, which is provided when the package is installed.
.. code-block:: bash
usage: o3skim [-h] [-v {DEBUG,INFO,WARNING,ERROR,CRITICAL}] [-o OUTPUT]
[--lon_mean] [--lat_mean] [--year_mean]