Commit 754083b5 authored by julian.gethmann

Merge branch 'master' into dev-pep484

parents b0ccaa82 490cd72c
......@@ -61,6 +61,11 @@ docs:
- pip install Sphinx
script:
- python setup.py docs
artifacts:
paths:
- public
only:
- tags
# This deploy job uses a simple deploy flow to Heroku, other providers, e.g. AWS Elastic Beanstalk
# are supported too: https://github.com/travis-ci/dpl
......@@ -68,5 +73,5 @@ docs:
# type: deploy
# environment: production
# script:
# - python setup.py
# - python setup.py
# - dpl --provider=heroku --app=$HEROKU_APP_NAME --api-key=$HEROKU_PRODUCTION_KEY
......@@ -8,7 +8,7 @@ Changelog
------------------
* remove not implemented argument `show_url` of `dump_cassandra_data()`
* write tests that are independent of access to the IBPT network
* move directory structure to new one (cassandra->src/cassandra) to complete `PyScaffold` structure
* move directory structure to new one (`cassandra`->`src/cassandra`) to complete `PyScaffold` structure
* Upgrade PyScaffold to be compatible with new setuptools versions
0.7.1 (2017-10-31)
......
......@@ -5,7 +5,7 @@
Contributing
============
Contributions are welcome, and they are greatly appreciated!
Contributions are welcome, and they are greatly appreciated!
You can contribute in many ways:
......@@ -15,7 +15,7 @@ Types of Contributions
Report Bugs
~~~~~~~~~~~
Report bugs at https://git.scc.kit.edu/las/py/cassandra/issues.
Report bugs at https://git.scc.kit.edu/las-software/15-3-DataProcessing/cassandra/issues.
If you are reporting a bug, please include:
......@@ -40,14 +40,16 @@ Write Documentation
~~~~~~~~~~~~~~~~~~~
cassandra could always use more documentation, whether as part of the
official cassandra docs, in docstrings.
official cassandra docs, in docstrings or the examples.
The documentation is generated automatically using `Sphinx docs`_ with the napoleon extension.
So please write your docstrings following `Google's docstring convention`_.
So please write your docstrings following `Google's docstring convention`_.
.. _`Sphinx docs`: http://www.sphinx-doc.org
.. _`Google's docstring convention`: http://www.sphinx-doc.org/en/stable/ext/example_google.html?highlight=napoleon
Since Sphinx does not handle Python 2's type annotations (`PEP 484 <https://www.python.org/dev/peps/pep-0484/#suggested-syntax-for-python-2-7-and-straddling-code>`_) well, it is still necessary to write the type in the docstring.
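A minimal sketch of what this means in practice: the signature carries a PEP 484 type comment for static checkers, and the Google-style docstring repeats the types so that Sphinx with napoleon can render them. The function `scale` and its parameters are hypothetical and not part of cassandra.

```python
from typing import List  # noqa: F401  (only needed by the type comment)


def scale(values, factor=2.0):
    # type: (List[float], float) -> List[float]
    """Return `values` multiplied by `factor`.

    Args:
        values (list of float): Numbers to scale.
        factor (float): Multiplier applied to each number.

    Returns:
        list of float: The scaled numbers.
    """
    return [value * factor for value in values]
```

The type comment is the Python 2.7 compatible syntax suggested by PEP 484; the types in the docstring are what ends up in the rendered documentation.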
Write Tests
~~~~~~~~~~~
......@@ -56,7 +58,7 @@ Try to increase the test coverage by writing tests.
Submit Feedback
~~~~~~~~~~~~~~~
The best way to send feedback is to file an issue at https://git.scc.kit.edu/las/py/cassandra/issues.
The best way to send feedback is to file an issue at https://git.scc.kit.edu/las-software/15-3-DataProcessing/cassandra/issues.
If you are proposing a feature:
......@@ -75,12 +77,13 @@ Ready to contribute? Here's how to set up `cassandra` for local development.
$ git clone git@git.scc.kit.edu:YOURNAME/cassandra.git
3. Install your local copy into a virtualenv. Assuming you have virtualenvwrapper installed, this is how you set up your fork for local development::
3. Install your local copy into a virtualenv or venv, which ships with Python 3::
$ mkvirtualenv cassandra
$ cd cassandra/
$ python3 -m venv .venv
$ source .venv/bin/activate
$ python setup.py develop
$ python -m pip install -r test-requirements.txt
$ python -m pip install -r test-requirements.txt -r requirements.txt
4. Create a branch for local development::
......@@ -96,10 +99,12 @@ Ready to contribute? Here's how to set up `cassandra` for local development.
To get flake8 and tox, just pip install them into your virtualenv.
Fix anything flagged by the pre-commit hooks, if there is something.
6. Check for upstream updates::
$ git remote add upstream https://git.scc.kit.edu/las/py/cassandra.git
$ git fetch upstream
$ git remote add upstream https://git.scc.kit.edu/las-software/15-3-DataProcessing/cassandra.git
$ git fetch upstream
$ git checkout master
$ git merge upstream/master
$ git checkout name-of-your-bugfix-or-feature
......@@ -121,14 +126,15 @@ Before you submit a pull request, check that it meets these guidelines:
1. The pull request should include tests.
2. If the pull request adds functionality, the docs should be updated. Put
your new functionality into a function with a docstring, and add the
feature to the list in README.rst.
3. The pull request should work for Python 2.7, 3.4, 3.5 and 3.6, and for possibly PyPy and other versions.
feature to the list in README.rst and CHANGELOG.rst.
3. The pull request should work for Python 2.7, 3.4, 3.5 and 3.6, and possibly PyPy and other versions.
Tips
----
..
Tips
----
Try to increase the test coverage. You can check the coverage by running
Try to increase the test coverage. You can check the coverage by running
$ make cov
$ make cov
and opening the `htmlcov/index.html` file.
and opening the `htmlcov/index.html` file.
......@@ -4,7 +4,7 @@ Cassandra
Cassandra_: ANKA's current (as of 2016) archiving database system. It offers a JSON API that is available from within the ANKA network (las-bernhard.anka.kit.edu).
EPICS: ANKA's newest control system (as of 2016), into which, among other things, the CLIC damping wiggler is integrated. Cassandra is filled from it. For further information see, e.g., the working-group talk by Walter Werner, the ANKA Confluence, and the `offizielle EPICS Seite`_
EPICS: ANKA's newest control system (as of 2016), into which, among other things, the CLIC damping wiggler is integrated. Cassandra is filled from it. For further information see, e.g., the working-group talk by Walter Werner, the ANKA Confluence, and the `offizielle EPICS Seite`_
PV: EPICS variable that can be queried here.
......@@ -23,7 +23,7 @@ Installation
::
python -m pip install "git+ssh://git@git.scc.kit.edu:/las/py/cassandra.git#egg=cassandra"
python -m pip install "git+ssh://git@git.scc.kit.edu:/las-software/15-3-DataProcessing/cassandra.git#egg=cassandra"
Script
......@@ -34,7 +34,7 @@ The script `./fetch.py` loads the data of a PV and plots it
cassandra.py Python class that abstracts Cassandra (based on the fetch.py script)
Under `bin` there is an example script that uses the Cassandra class (`plot_two.py`).
Under `bin` there is an example script that uses the Cassandra class (`plot_two.py`).
The command-line program for displaying averaged values from the Cassandra database, `get_mean_values.py`, is installed as well.
JSON files
......@@ -60,14 +60,16 @@ The docu is generated using `Sphinx <https://sphinx-doc.org>`_, is written in `R
You can build the documentation either via `setup.py`::
$ python setup.py docs
$ python setup.py build_sphinx
or by hand by running::
$ cd docs && sphinx-apidoc -f -o ./_source ../cassandra
In detail, this does something similar to the following steps.
Run the sphinx-apidoc::
$ cd docs && sphinx-apidoc -f -o ./_source ../src/cassandra
Include the filenames (`ls *.rst >> index.rst`) into `index.rst` at an appropriate place and add three spaces before them (i.e. mind the indentation!)
and finally run::
including filenames (`ls *.rst >> index.rst`) into `index.rst` at an appropriate place and add three spaces before them (i.e. mind the indentation!)
and finally by running::
$ make latexpdf
$ okular _build/latex/cassandra.pdf
......@@ -78,14 +80,17 @@ or::
To set up your own documentation if you're not using PyScaffold, you need to
install Sphinx and napoleon::
$ pip install Sphinx
$ pip install sphinxcontrib-napoleon
create the docs directory and initialize the project::
Create the docs directory and initialize the project::
$ mkdir docs
$ cd docs
$ sphinx-quickstart (project path: ../cassandra, source-build separated)
$ sphinx-quickstart (project path: ../src/cassandra, source-build separated)
configure it by editing `source/conf.py` and adding `sphinx.ext.napoleon` to the list of extensions and
Configure it by editing `source/conf.py` and adding `sphinx.ext.napoleon` to the list of extensions and
uncommenting `import sys` and `import os` and modifying `abspath('../..')`.
Create some required directories `mkdir _source _static`.
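As a rough sketch, the relevant part of `source/conf.py` could look like the excerpt below. It is not the project's actual configuration: the relative path passed to `abspath` is an assumption that depends on where `conf.py` lives, and `sphinx.ext.autodoc` is added only because napoleon is normally used together with it. Note that `sphinx.ext.napoleon` ships with Sphinx 1.3 and later, whereas the separately installed `sphinxcontrib-napoleon` package is registered as `sphinxcontrib.napoleon`.

```python
# docs/source/conf.py (excerpt): illustrative sketch only
import os
import sys

# Make the package importable for autodoc. The relative path is an
# assumption; adjust it to point at the directory containing the package.
sys.path.insert(0, os.path.abspath('../..'))

extensions = [
    'sphinx.ext.autodoc',   # pulls docstrings from the modules
    'sphinx.ext.napoleon',  # parses Google-style docstrings
]
```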
.. _changes:
.. include:: ../CHANGELOG.rst
.. _examples:
========
Examples
========
Here are some examples of how you can use the classes and the scripts.
When I write Cassandra I mean this Python class, cassandra refers to the package, and Cassandra-DB means the `Apache Cassandra <https://cassandra.apache.org/>`_ instance at KARA.
Scripts
-------
Classes
-------
Let's say we want to know when a specific fill started. We can ask the Cassandra-DB for the fill numbers in the time range from the beginning of the data logging (let's say 1st Jan, 2005) until today.
This can be done with the context manager (with statement) of Cassandra, which also takes care of exceptions.
The context manager needs at least the start and end date and time of the requested data and the process variable (PV) name.
So in our case this could be something like :obj:`datetime.datetime(2005, 1, 1, 0, 0, 1)` as `start` and `datetime.datetime.now()` as `end`. The PV name is `'A:SR:OperationStatus:01:FillNumber'`.
Putting it together, this gives us
.. code:: Python
    from cassandra.cassandra import Cassandra
    import datetime

    with Cassandra(start=datetime.datetime(2005, 1, 1, 0, 0, 1),
                   end=datetime.datetime.now(),
                   pv="A:SR:OperationStatus:01:FillNumber") as data:
        times, fills = data
Now the `times` list holds the datetimes that belong to the fill numbers, which are stored as integers in the `fills` list.
Since we did not provide a value for the optional argument `count`, the number of entries is not known in advance and there need not be exactly one entry per fill.
As it is not always convenient to provide :obj:`datetime.datetime` objects, you can also pass the `start` and `end` dates as strings in some common formats, such as ISO 8601, the format of CSS exports or the format used by the THz group.
For an explanation of the `count` argument see the documentation of Cassandra-DB's web interface in the Confluence wiki.
The optional argument `directory` is not meaningful for the context manager; it is there for creating objects.
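To illustrate how string-based start and end times can be handled, here is a rough sketch that tries a list of formats in turn. It is not the package's actual parser: `parse_time` and `FORMATS` are hypothetical names, and only the two string formats that appear in this document are assumed.

```python
import datetime

# Assumed formats: ISO 8601 without timezone and the "YYYY/MM/DD hh:mm:ss"
# form used in the examples. The real package may accept more.
FORMATS = ("%Y-%m-%dT%H:%M:%S", "%Y/%m/%d %H:%M:%S")


def parse_time(time_string):
    """Return a datetime for a datetime object or a supported time string."""
    if isinstance(time_string, datetime.datetime):
        return time_string
    if not isinstance(time_string, str):
        raise TypeError("Expected datetime or str, got {}".format(
            type(time_string).__name__))
    for fmt in FORMATS:
        try:
            return datetime.datetime.strptime(time_string, fmt)
        except ValueError:
            continue  # try the next known format
    raise ValueError("Unsupported time format: {!r}".format(time_string))
```

Trying the formats one after another keeps the list easy to extend, at the cost of reporting only a generic error when none of them matches.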
Now we can create another simple example that uses the string format mentioned above, the `count` argument and the helper class :class:`cassandra.cassandra.Pvs`. We need:
* the start time, which we provide as the string `"2005/01/01 00:00:01"`,
* the end time, which we may give as a :obj:`datetime.datetime` object,
* the PV name, for which we can use the `fill` entry of the `Pvs.pv` dictionary, and
* a `count` of `1`, because the frequency of the fills is very low.
We end up with code that might look like:
.. code:: Python
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    from cassandra import Cassandra, Pvs
    import datetime


    def fill_to_time(fill: int) -> datetime.datetime:
        """Return datetime of the KARA fill

        Args:
            fill: Fill number

        Returns:
            Fill time
        """
        # `datetime` is the module here, so go through `datetime.datetime`
        now_str = datetime.datetime.now().strftime("%Y/%m/%d %H:%M:%S")
        with Cassandra("2005/01/01 00:00:01", now_str, Pvs.pv["fill"], 1) as data:
            times, fills = data
        return times[fills.index(fill)]


    fill_to_time(6000)
The `data` object we get from the handler is a :obj:`tuple` of two :obj:`list` s that contain the date stamps as :obj:`datetime.datetime` returned from Cassandra and the corresponding data. In our case the latter are the fill numbers.
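The shape of this tuple can be explored without access to the database. The sketch below uses a hand-written stand-in for `data`; the timestamps and fill numbers are made up, and `fill_to_time_or_none` is a hypothetical helper, not part of the package.

```python
import datetime

# Hypothetical stand-in for the `data` tuple; real values come from Cassandra-DB.
data = (
    [datetime.datetime(2017, 7, 7, 8, 0, 0), datetime.datetime(2017, 7, 7, 20, 0, 0)],
    [6000, 6001],
)

times, fills = data

# Map each fill number to the first timestamp at which it was seen.
fill_times = {}
for time, fill in zip(times, fills):
    fill_times.setdefault(fill, time)


def fill_to_time_or_none(fill):
    """Return the first timestamp of `fill`, or None if it is not in the data."""
    return fill_times.get(fill)
```

Unlike `times[fills.index(fill)]`, the dictionary lookup returns `None` instead of raising `ValueError` when the fill number is missing.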
......@@ -7,10 +7,11 @@ Contents
.. toctree::
:maxdepth: 2
Examples <examples>
Module Reference <api/cassandra>
Changelog <changelog>
License <license>
Authors <authors>
Changelog <changelog>
Module Reference <api/modules>
Contributing <contributing>
......
......@@ -54,6 +54,7 @@ norecursedirs =
[aliases]
release = sdist bdist_wheel upload
docs = build_sphinx
[bdist_wheel]
# Use this option if your package is pure-python
......@@ -62,6 +63,7 @@ universal = 1
[build_sphinx]
source_dir = docs
build_dir = docs/_build
fresh-env = true
[devpi:upload]
# Options for the devpi: PyPI server and packaging tool
......
......@@ -9,6 +9,7 @@
"""
import sys
from setuptools import setup
# Add here console scripts and other entry points in ini-style format
......@@ -18,14 +19,16 @@ entry_points = """
# For example:
# fibonacci = cassandra.skeleton:run
"""
if sys.version_info <= (3,):
requires = ['pyscaffold==2.5.10']
else:
requires = ['pyscaffold>=3.0a0,<3.1a0']
def setup_package():
needs_sphinx = {'build_sphinx', 'upload_docs'}.intersection(sys.argv)
sphinx = ['sphinx'] if needs_sphinx else []
setup(setup_requires=['pyscaffold>=3.0.2a0,<3.1a0'] + sphinx,
entry_points=entry_points,
use_pyscaffold=True)
setup(setup_requires=requires + sphinx, entry_points=entry_points, use_pyscaffold=True)
if __name__ == "__main__":
......
# -*- coding: utf-8 -*-
""" A module to handle typical tasks concerning ANKA's ``Cassandra``.
Read data from ANKA's ``Cassandra`` and data exported by ``Control System Studio``'s ``Data Browser``
Read data from KARA's ``Cassandra`` database and data exported by ``Control System Studio``'s ``Data Browser``
.. moduleauthor:: Julian Gethmann <atb@gethmann.org>
To read data from CSS exports you need the :func:`css_read.load_css_data`.
To read data from the cassandra-db you can either use the class
:class:`cassandra.Cassandra`, typically via its context manager,
or use the :func:`pd.pvs2pd` function to export the data directly
to a :obj:`pandas.DataFrame`.
If you don't know a specific, but common, process variable name, chances are
high that you will find it in the :obj:`cassandra.Pvs.pv` dictionary.
Finally there are some helper functions, like time conversions, in the :class:`cassandra.CassandraHelper`
class.
"""
from pkg_resources import get_distribution, DistributionNotFound
......
......@@ -25,9 +25,9 @@ __copyright__ = "Julian Gethmann"
__license__ = "mit"
__all__ = ["Cassandra", "CassandraHelper", "Pvs"]
try: # Py2
try: # Py3
from urllib.error import URLError
except ImportError: # Py3
except ImportError: # Py2
from urllib2 import URLError
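With this fallback in place, the same `URLError` name can be caught on both interpreters. The following sketch shows one way a retry loop could build on it; `fetch_with_retries` and `flaky_fetch` are hypothetical and only stand in for the package's download code, which is not reproduced here.

```python
try:  # Python 3
    from urllib.error import URLError
except ImportError:  # Python 2
    from urllib2 import URLError


def fetch_with_retries(fetch, retries=3):
    """Call `fetch()` up to `retries` times and return its first result."""
    last_error = None
    for _ in range(retries):
        try:
            return fetch()
        except URLError as err:
            last_error = err  # remember the failure and try again
    raise URLError("Request failed after {} attempts: {}".format(retries, last_error))


attempts = []


def flaky_fetch():
    """Hypothetical request that fails twice before succeeding."""
    attempts.append(1)
    if len(attempts) < 3:
        raise URLError("simulated timeout")
    return "[]"


result = fetch_with_retries(flaky_fetch, retries=3)
```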
......@@ -217,16 +217,14 @@ class CassandraHelper(object):
except ValueError:
pass
except TypeError:
raise TypeError(
"Time ({}) must either be a `datetime.datetime` or a string"
" as described in the docstring and not a {}.".format(
time_string, type(time_string).__name__))
raise TypeError("Time ({}) must either be a `datetime.datetime` or a string"
" as described in the docstring and not a {}.".format(
time_string, type(time_string).__name__))
if not ret:
raise ValueError(
"`time_string` must not be of the form `{}` and type {}! "
"See docstring for possible forms.".format(
time_string, type(time_string).__name__))
raise ValueError("`time_string` must not be of the form `{}` and type {}! "
"See docstring for possible forms.".format(
time_string, type(time_string).__name__))
else:
return ret
......@@ -270,8 +268,8 @@ class CassandraHelper(object):
values.append(entry["value"])
# if entry["status"] == "OK" or entry["severity"]["level"] == "OK"]
timestamps = [
datetime.datetime.fromtimestamp(entry["time"] / 1e9)
for entry in json_data if entry["severity"]["level"] == "OK"
datetime.datetime.fromtimestamp(entry["time"] / 1e9) for entry in json_data
if entry["severity"]["level"] == "OK"
]
dataset = namedtuple("dataset", ["timestamps", "values"])
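A small, self-contained sketch of the conversion in this hunk: the `time` field is divided by 1e9 before it is passed to `fromtimestamp`, which suggests nanoseconds since the epoch, and only entries whose severity level is `"OK"` are kept. The sample entries are made up, and filtering the values with the same condition as the timestamps is an assumption about the intended behaviour.

```python
import datetime
from collections import namedtuple

# Made-up sample in the shape used above: "time", "value", "severity"/"level".
json_data = [
    {"time": 1499414888000000000, "value": 2.5, "severity": {"level": "OK"}},
    {"time": 1499414889000000000, "value": 0.0, "severity": {"level": "INVALID"}},
]

# Keep only entries that were recorded without an alarm.
ok_entries = [entry for entry in json_data if entry["severity"]["level"] == "OK"]

# Convert nanoseconds to seconds; fromtimestamp returns local time.
timestamps = [datetime.datetime.fromtimestamp(entry["time"] / 1e9) for entry in ok_entries]
values = [entry["value"] for entry in ok_entries]

dataset = namedtuple("dataset", ["timestamps", "values"])
result = dataset(timestamps=timestamps, values=values)
```

Filtering once and deriving both lists from `ok_entries` keeps timestamps and values the same length.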
......@@ -370,11 +368,7 @@ class Cassandra(object):
bool: True if the connection is ok, False if an error occurred
"""
try:
tmp = Cassandra(
"2017-07-07T08:08:08",
"2017-07-07T08:08:08",
Pvs.pv["energy"],
count=1)
tmp = Cassandra("2017-07-07T08:08:08", "2017-07-07T08:08:08", Pvs.pv["energy"], count=1)
tmp.timeout = 1
tmp._download_cassandra_data()
except (TimeoutError, URLError):
......@@ -413,7 +407,7 @@ class Cassandra(object):
})
def _download_cassandra_data(self):
# type: (...) -> List[str]
# type: (Cassandra) -> List[str]
"""Return Cassandra's JSON as a JSON object. """
url = self.gen_url()
try:
......@@ -431,9 +425,7 @@ class Cassandra(object):
except URLError:
pass
else:
raise URLError(
"Request had a timeout. Maybe you're not inside the ANKA-LAN"
)
raise URLError("Request had a timeout. Maybe you're not inside the ANKA-LAN")
elif CassandraHelper.PY2:
import urllib2 as request
for err_count in range(self.RETRIES):
......@@ -449,13 +441,12 @@ class Cassandra(object):
"Request had a timeout. Maybe you're not inside the IBPT-CN-LAN or provided a wrong PV name"
)
else:
raise NotImplementedError(
"Not implemented for other versions than 2 or 3!")
raise NotImplementedError("Not implemented for other versions than 2 or 3!")
return json_data
def dump_cassandra_data(self):
# type: (...) -> str
# type: (Cassandra) -> str
"""Dump the JSON file to a file named like the PV and time that is returned
Dump a JSON file fetched from the Cassandra `host` and return its name.
......@@ -472,6 +463,7 @@ class Cassandra(object):
return path.abspath(self.json_file)
def get_json_local(self):
# type: (Cassandra) -> Tuple[List[datetime.datetime], List[Union[List[float], float]]]
""" Return timestamps and values for given Cassandra object and copy JSON file if necessary."""
if not path.isfile(self.json_file) or path.getsize(self.json_file) <= 2:
self.dump_cassandra_data()
......
......@@ -3,8 +3,10 @@
:Authors: Julian Gethmann
:Contact: atb@gethmann.org
:Date: 2017-09-15
.. versionadded:: 0.5.5
.. versionchanged:: 0.7.0
"""
from datetime import datetime
from typing import Iterable, List, Optional, Tuple # flake8: noqa
......@@ -18,16 +20,16 @@ def pvs2pd(
start, # type: datetime.datetime
end, # type: datetime.datetime
pv_names, # type: Iterable[str]
count=None, # type: Optional[int]
upsample=None, # type: Optional[str]
save_local=False, # type: Optional[bool]
): # type (...) -> pd.DataFrame
# Optional[int], Optional[str], Optional[bool]) -> pd.DataFrame
"""Return a `pd.DataFrame` with data for all `pv_names` and `time` as index
count=None, # type: int
upsample=None, # type: str
save_local=False, # type: bool
):
# type: (...) -> pd.DataFrame
"""Return a :obj:`pandas.DataFrame` with data for all `pv_names` and `time` as index
Missing data points are filled with the last value (like in CSS).
The sampling rate is not corrected to be constant (like in CSS).
The `pd.DataFrame` starts with the start date and takes the last saved data
The `pandas.DataFrame` starts with the start date and takes the last saved data
for it and ends with the end date and also takes the filled data.
.. note::
......@@ -51,7 +53,7 @@ def pvs2pd(
are saved locally in the same directory.
Returns:
pd.DataFrame: "time" as index and `pv_names` as columns.
pandas.DataFrame: "time" as index and `pv_names` as columns.
Examples:
>>> from datetime import datetime
......@@ -71,22 +73,19 @@ def pvs2pd(
pv = Pvs.pv[pv_name] if ":" not in pv_name else pv_name
if save_local:
cas = Cassandra(start, end, pv=pv, count=None, directory=".")
data = cas.get_json_local(
) # type: Tuple[List[datetime.datetime], List[float]]
data = cas.get_json_local() # type: Tuple[List[datetime.datetime], List[float]]
collected_data = collected_data.join(
pd.DataFrame({
"time": data[0],
pv_name: data[1]
}).set_index("time"),
how="outer")
}).set_index("time"), how="outer")
else:
with Cassandra(start, end, pv=pv) as cas:
collected_data = collected_data.join(
pd.DataFrame({
"time": cas[0],
pv_name: cas[1]
}).set_index("time"),
how="outer")
}).set_index("time"), how="outer")
collected_data = collected_data.ffill()
collected_data = collected_data[collected_data.first_valid_index():]
# if start in collected_data.index:
......
......@@ -4,13 +4,12 @@
:Authors: Julian Gethmann
:Contact: phd@gethmann.org
.. lastupdated:: 2017-01-25
Return the mean and standard deviation for the PVs in the given time range.
"""
import argparse
import logging
from math import sqrt
from typing import Optional, Tuple, Union # flake8: noqa
from .cassandra import Cassandra, Pvs
......@@ -19,11 +18,12 @@ logging.getLogger('cassandra.tools').addHandler(logging.NullHandler())
def _get_mean_values(
start_time="2016/04/07 13:30:00",
end_time="2016/04/07 14:00:00",
pvs=("q1", "q2"),
output_format="full",
start_time="2016/04/07 13:30:00", # type: Optional[Union[datetime.datetime,str]]
end_time="2016/04/07 14:00:00", # type: Optional[Union[datetime.datetime,str]]
pvs=("q1", "q2"), # type: Tuple[str, ...]
output_format="full", # type: str
):
# type: (...) -> None
"""
.. versionchanged:: 0.3.0
"""
......@@ -43,37 +43,29 @@ def _get_mean_values(
"str": "{val}",
}
for pv in pvs:
with Cassandra(
start_time, end_time, Pvs.pv.get(pv, pv), count,
directory="/tmp/") as obj:
with Cassandra(start_time, end_time, Pvs.pv.get(pv, pv), count, directory="/tmp/") as obj:
if "str" in output_format:
_logger.info("Raw string for PV: {}".format(pv))
try:
print(obj[1][0])
except KeyError:
raise KeyError(
"No %s as output_format available!" % output_format)
raise KeyError("No %s as output_format available!" % output_format)
else:
_logger.info("Values for PV: {}".format(pv))
try:
mean = sum(obj[1]) / len(obj[1])
std = sqrt(
sum([(xi - mean)**2
for xi in obj[1]]) / (len(obj[1]) - 1))
print(output_str[output_format].format(
name=pv, val=mean, std=std))
std = sqrt(sum([(xi - mean)**2 for xi in obj[1]]) / (len(obj[1]) - 1))
print(output_str[output_format].format(name=pv, val=mean, std=std))
except ValueError:
raise ValueError(
"Probably too short time to get enough points")
raise ValueError("Probably too short time to get enough points")
except KeyError:
raise KeyError(
"No %s as output_format available!" % output_format)
raise KeyError("No %s as output_format available!" % output_format)
def get_mean_values():
parser = argparse.ArgumentParser(
description="Return the mean value "
"and standard deviation for the PVs in the given time range.")
# type: () -> None
parser = argparse.ArgumentParser(description="Return the mean value "
"and standard deviation for the PVs in the given time range.")
parser.add_argument(
"--start",
"-s",
......@@ -99,12 +91,7 @@ def get_mean_values():
required=False,
help="short name of the PV (see PVs class from cassandra module)")
parser.add_argument(
"--rawpv",
dest="rawpvs",
nargs="?",
type=str,
required=False,
help="full name of *one* PV")
"--rawpv", dest="rawpvs", nargs="?", type=str, required=False, help="full name of *one* PV")
parser.add_argument(
"--raw",
"-r",
......
......@@ -8,9 +8,6 @@
import datetime
import json
import os
import pathlib
from urllib import error, request
from urllib.error import URLError
import cassandra
import pytest
......@@ -19,6 +16,18 @@ from cassandra.cassandra import Cassandra
# from cassandra.cassandra import *
from .conftest import request_openurl
try:
import pathlib
except ImportError:
pass
try: # Py3
from urllib import request
from urllib.error import URLError
except ImportError: # Py2
from urllib2 import URLError
import urllib2 as request
@pytest.mark.usefixtures("cleandir", "setup_")
class TestCassandraClass(object):
......@@ -32,7 +41,9 @@ class TestCassandraClass(object):
assert cas2.count is None
assert cas2.start_time == start
assert cas2.end_time == end
assert pathlib.Path(cas2.directory).absolute() == pathlib.Path.cwd()