I had heard of this project a while back and was curious to give it a go, and this weekend I finally had a bit of time to test it. Skyline is a project created by Etsy, designed to automatically monitor graphs and detect anomalies.
I would normally have used a CentOS 6 installation to test this, but it turns out that Skyline is quite demanding in terms of dependencies: you will need fairly recent versions of several packages to make it work. This is why I ended up using Ubuntu (14.04 LTS at the time of writing).
You can choose to use pip in order to install dependencies but I preferred to use the distribution’s packages instead. If that works for you, here’s what you need to install:
    apt-get install python-numpy python-scipy python-pandas \
        python-patsy python-statsmodels python-msgpack \
        python-unittest2 python-mock python-simplejson \
        python-hiredis redis-server python-daemon python-flask
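If you want to double check that the distribution packages landed properly before going further, here is a minimal sketch that tries to import each library. The module names in the list are my assumption of the import name behind each package above (for example `daemon` for python-daemon), so adjust them if your distribution differs.

```python
import importlib

# Import names assumed to match the packages installed above.
SKYLINE_MODULES = [
    "numpy", "scipy", "pandas", "patsy", "statsmodels", "msgpack",
    "unittest2", "mock", "simplejson", "hiredis", "daemon", "flask",
]


def missing_modules(names):
    """Return the names from the list that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(SKYLINE_MODULES)
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All modules import fine.")
```

This only proves the modules can be imported, not that their versions are recent enough for Skyline.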
Next, you need the latest version of Skyline. You will need git installed for this, so you can just do:
    apt-get install git-core
    cd /opt
    git clone https://github.com/etsy/skyline.git
    cd skyline
    cp src/settings.py.example src/settings.py
    cp src/redis.conf /etc/redis/ # copying redis skyline config
    mkdir /var/log/skyline /var/dump /var/run/skyline
    chown -R redis /var/lib/redis/
    service redis-server restart # important
You will need to modify the port and address if, like me, you are not using the same machine; then you can start the daemons. There are two things I tripped over when I started them. First, make sure your host has its correct name and IP in /etc/hosts; Horizon will get upset if it doesn’t. Second, in settings.py, replace 127.0.0.1 with 0.0.0.0 if you want to be able to connect to the webapp from outside your machine. Last but not least, change the value of the HTTP interface to point to your Graphite instance; failing this, not much analysis can happen. Also note that you have to run the web interface on port 80; using anything different will fail.
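For the /etc/hosts point, here is a quick sketch to see what the machine's own name resolves to before starting Horizon. I am assuming that a name which fails to resolve is what upsets Horizon; the script only reports what the resolver says and does not touch Skyline itself.

```python
import socket


def resolve_hostname(name=None):
    """Return (name, ip); ip is None when the name does not resolve."""
    if name is None:
        name = socket.gethostname()
    try:
        return name, socket.gethostbyname(name)
    except socket.error:
        return name, None


if __name__ == "__main__":
    name, ip = resolve_hostname()
    if ip is None:
        print("%s does not resolve, add it to /etc/hosts" % name)
    else:
        print("%s resolves to %s" % (name, ip))
```

If the address printed is not the one you expect for this host, fix the entry in /etc/hosts first.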
Time to start it all:
    cd /opt/skyline
    bin/horizon.d start
    bin/analyzer.d start
    bin/webapp.d start
If you hit an issue at any point while starting these daemons, you are on your own: use the logs. I have included in the commands above the fixes for all the issues I had, so you should be alright starting all the daemons.
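Besides the logs, a simple way to see whether the daemons actually came up is to check that their TCP ports accept connections. The sketch below assumes Horizon's pickle listener is on port 2024 (the value I remember from the example settings) and the webapp on port 80 as described above. Both entries are placeholders to adapt to your own settings.py.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        conn = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        return False
    conn.close()
    return True


# Hypothetical values: adjust them to match your settings.py.
CHECKS = {
    "horizon (pickle listener)": ("127.0.0.1", 2024),
    "webapp": ("127.0.0.1", 80),
}

if __name__ == "__main__":
    for label, (host, port) in sorted(CHECKS.items()):
        state = "listening" if port_open(host, port) else "NOT reachable"
        print("%-28s %s:%d %s" % (label, host, port, state))
```

The analyzer does not listen on a port, so for that one the logs remain the only place to look.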
You will need to direct your Graphite metrics to a relay for duplication; this is nicely explained here.
You can also run a check to make sure it’s all good. Luckily for you, the project includes a little utility to test this:
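To give an idea of what the relay forwards, here is a small sketch of Graphite's pickle protocol: a list of (metric, (timestamp, value)) tuples, pickled and prefixed with a 4-byte length header. I am assuming Horizon's pickle listener is the destination of the relay. The host name and port in the comments are hypothetical, so check them against your own setup.

```python
import pickle
import struct
import time


def build_pickle_message(datapoints):
    """Frame [(metric, (timestamp, value)), ...] in Graphite's pickle protocol."""
    payload = pickle.dumps(datapoints, protocol=2)
    header = struct.pack("!L", len(payload))  # 4-byte big-endian length
    return header + payload


if __name__ == "__main__":
    sample = [("test.skyline.metric", (int(time.time()), 1.0))]
    message = build_pickle_message(sample)
    print("%d bytes framed" % len(message))
    # To send it by hand (hypothetical host, port assumed from the settings):
    #   import socket
    #   sock = socket.create_connection(("skyline.example.com", 2024))
    #   sock.sendall(message)
    #   sock.close()
```

In normal operation you never build these messages yourself, the carbon relay does it for you. This is only handy for pushing a test datapoint by hand.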
    /opt/skyline# python utils/seed_data.py
    Loading data over UDP via Horizon...
    Connecting to Redis...
    Congratulations! The data made it in. The Horizon pipeline seems to be working.