Breaking stuff, making cloud, playing games. General Harakiri of a bored mind off loaded to computing.

Feb 13 2017

Found a good post on how to use the rescue mode in Hetzner to set up a Logical Volume that spans multiple drives. This plus vsftpd can help set up terabytes of space for backup storage. In any case you should not use Hetzner for anything closely resembling a legit online service, website etc. There is a reason it is popular mainly as a seedbox host: they are quick to lock you out of your server.
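Back to the spanning volume: the usual LVM sequence looks roughly like the sketch below. The linked post is the authority here, this is only my summary. The device names /dev/sda and /dev/sdb, the volume group name backup and the volume name data are assumptions, and lvm_span_plan is a hypothetical helper that only prints the commands so you can review them before running anything as root in rescue mode.

```shell
# Print (do not run) the LVM commands for one logical volume spanning
# all the given disks. pvcreate wipes whatever is on those disks, so
# read the plan carefully before pasting it into a root shell.
lvm_span_plan() {
    vg="$1"   # volume group name (assumption: pick your own)
    lv="$2"   # logical volume name (assumption: pick your own)
    shift 2   # everything left is the list of disks
    echo "pvcreate $*"
    echo "vgcreate $vg $*"
    echo "lvcreate -l 100%FREE -n $lv $vg"
    echo "mkfs.ext4 /dev/$vg/$lv"
}
```

Calling lvm_span_plan backup data /dev/sda /dev/sdb prints four lines, one command each. Once they look right for your disks, run them in that order.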

Speaking of seedboxes, if you want to set one up, here is a good script: https://github.com/arakasi72/rtinst It can optionally install Webmin.

For file storage, search, download etc. I have not yet found a tool. It’s mostly find and scp.

If you feel adventurous, I found a tricked out seedbox setup script here: https://github.com/dannyti/seedbox-from-scratch It does everything and makes coffee.

With Bacula running on the local server I can send snapshots over to Hetzner, albeit at a snail’s pace. 30 Euros per month for 5.5TB of space is not too bad. I picked up a system from their auctions and it’s quite alright, but one shouldn’t expect performance or longevity from such systems. You get what you pay for, essentially.


Feb 06 2017

Start by setting up a container/VM with Ubuntu 12.04 LTS.

sudo apt-get install software-properties-common python-software-properties

sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
sudo apt-get install oracle-java8-set-default
echo 'JAVA_HOME="/usr/lib/jvm/java-8-oracle"' | sudo tee -a /etc/environment
source /etc/environment
wget -q -O - http://pkg.jenkins-ci.org/debian/jenkins-ci.org.key | sudo apt-key add -
echo deb http://pkg.jenkins-ci.org/debian binary/ | sudo tee /etc/apt/sources.list.d/jenkins.list
sudo apt-get update
sudo apt-get install jenkins
sudo apt-get install git
sudo apt-get install autoconf bison build-essential libssl-dev libyaml-dev libreadline6 libreadline6-dev zlib1g zlib1g-dev imagemagick libmagickcore-dev libmagickwand-dev sqlite3 libsqlite3-dev libxml2-dev unzip
sudo apt-get install redis-server postgresql-9.1 libpq-dev postgresql-contrib
wget -qO- https://deb.nodesource.com/setup_6.x | sudo bash -
sudo apt-get install -y nodejs
npm install -g phantomjs-prebuilt
sudo su - jenkins 

This step with rbenv may be unnecessary since the Jenkins rbenv plugin does the same thing.

git clone https://github.com/sstephenson/rbenv.git ~/.rbenv
echo 'export PATH="$HOME/.rbenv/bin:$PATH"' >> ~/.bashrc
echo 'eval "$(rbenv init -)"' >> ~/.bashrc
exec $SHELL

git clone https://github.com/sstephenson/ruby-build.git ~/.rbenv/plugins/ruby-build

#install ruby version if not using rbenv plugin
rbenv install 2.3.1
#Edit this to match github
git config --global user.name "John Doe"
git config --global user.email [email protected]

Run ssh-keygen and upload the public key to GitHub as required.

On the Jenkins UI install these plugins:

  • git
  • github
  • rbenv
  • Rake
  • EnvInject plugin to populate .envrc equivalent

Follow these guides for project setup:

http://www.jianshu.com/p/0c9cbbd6d787 (RVM specific)

http://www.webascender.com/Blog/ID/522/Setting-up-Jenkins-for-GitHub-Rails-Rspec (rbenv)

Postgres related

Configure postgres pg_hba.conf to match your database.yml. Contrary to popular belief you don’t need to copy any database config.

To use version 9.5 or newer:

sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt/ `lsb_release -cs`-pgdg main" >> /etc/apt/sources.list.d/pgdg.list'

wget -q https://www.postgresql.org/media/keys/ACCC4CF8.asc -O - | sudo apt-key add -

sudo apt-get update 

sudo apt-get install postgresql postgresql-contrib

Despite all the guides, much of Rails on Jenkins is trial and error. Setting up a new project is relatively easy as per the linked guides but YMMV.

In addition, I have had to edit code to make it Jenkins friendly. Enabling JUnit reports in Rails via gems is also very useful.

Note that installing a single instance of postgres on the Jenkins master is alright for small local development but not useful for large parallel builds, as you are essentially on one DB. I hope this is good enough for anyone to get started and to help improve upon the setup.

Possible further improvements:

  • Use containers for postgres and the build VM
  • Use EC2 or Azure VM plugins
  • Use Jenkins Pipeline

Dec 27 2016

Twitter Heron is an alternative to Apache Storm with a compatible API, as it was developed from Apache Storm. The major difference I found was that Heron opts to get rid of threads and goes with process based workers. This is surprising and I am not sure what advantage it would have over threads plus processes.

Heron also has done away with DRPC which is the one thing I need to provide direct access to a Storm cluster. I haven’t seen an alternative being mentioned.

Almost everything else is the same as Storm, except the sales page for Heron is better of course 🙂 The reason to explore Heron is ease of production use and DevOps. That is kind of difficult with Storm, but Azure HDInsight might help as it now supports Apache Storm 1.x in HDInsight 3.5.

I hope to learn more about the pros and cons from users who have been operating Storm and Heron in production.


Dec 27 2016

Finally got a metal storage rack that I will use as Rack space. About US$99. Finally so much space. Comes with removable half panels, just in case. It’s the same sort I have in my storage room. Why did I not think of this earlier!

metal rack for servers/devices

Since I was unplugging everything to get the rack installed, I took the plunge and installed that 2nd Xeon CPU on the workstation and it went without a hitch. This is the first time I have actually installed a CPU myself. I did try earlier without a heatsink and it was overheating so the system would not boot. The BIOS logs accurately pointed to as much. With the heatsink, no complaints. Cleaned out the system with compressed air and a vacuum cleaner to catch the dust.

Got a lot done for Christmas holidays I must say.

It took months for me to get my second Intel(R) Xeon(R) E5420 CPU for the used workstation I bought a year ago. There was some hassle and expensive shipping to get the heatsink, as the stock heatsink would not fit the Dell workstation spec. After all the upgrades, this beast of a Dell t5420 workstation is finally complete. My final server spec is:

  • Intel(R) Xeon(R) CPU E5420
  • 4TB external powered HDD
  • 2TB 2x enclosure, powered disk cloner that doubles as USB drive
  • Edgemax Pro 3 Gbps router and Unifi pro wireless (next in line for network upgrades)
  • 1 Gbps down and 500 Mbps up fiber internet connection

This excludes my desktop and peripherals.
It runs a decent Apache Storm cluster, mirrors OSS projects and chomps on AWS Kinesis streams while deploying to Azure app services.

There is no hyperthreading support unfortunately on this old Xeon and the 2 Disk LVM still suffers from IO issues. It could be the limits of 2009 SATA. I might have better luck with external storage for things like my Raspberry mirror. I have been very happy with WD My Book which is externally powered and has been my offline backup for almost 4 years now.

Wire management is going to get easier. Plus the Eubiq port is amazing for power supply. See picture

Eubiq power strip, still need to manage those cables.

It doesn’t sit on the floor so it’s gonna stay a lot cleaner and quieter. Once I am sure where to place what, I can do some cable management.

Full blast LXC containers manage and host all my apps. Maybe I will try Docker at some point.

The workstation is heavy and I forgot to take a picture of the installed dual Xeon CPUs. Maybe next time when I upgrade the disks.

Complete setup of my home workspace, much neater with everything else on the rack

my workspace, those cables need managing

Dec 25 2016

Save time and money. Go Backblaze. Crashplan sucks: terrible speed, and it has not improved in years. They are full of excuses about it, so they are unlikely to ever address it properly.

I am using both and going to ditch Crashplan once the subscription expires.

Keep it up Backblaze.

Dec 24 2016

I have been scouring the internet for opinions and alternatives to what Java has to offer in terms of both ORM and non-ORM database access. Unlike with Python, I do not clearly know what is out there for Java. Rails is easy: if you use Rails you use ActiveRecord and forget about anything else. Hibernate seems to have been the much-used ORM layer historically for Java and the Spring framework. In any case, for building small apps and micro-services which are data centric it would be overkill to use an ORM. Unlike Rails/Ruby, Java projects are much more active and well maintained.

Found an interesting thread on Reddit discussing alternatives to ORM even though the ask was about ORM. I found the following possible non-ORM alternatives that could come in handy.

JDBC: The pure database and SQL query part of the Java API. Verbose (which Java tends to be anyway) but no frills. This is the direct interface to the database using the API standard. I think it’s important to learn this way first anyway, plus it’s always going to work if you get it right. Where complex POJOs are not required, JDBC is the best option.

JDBI: An interesting wrapper to JDBC that makes it easier to use JDBC and less verbose. Surely worth a look and the DAO layer is all we need mostly to not have to write the same queries over and over again.

jOOQ: A Java 8 lambda-friendly interface to JDBC, much like JDBI but with a lot more features, including code generation from the database. A lot of focus on functional programming and lambdas, which makes sense as the same people also maintain jOOλ, a functional helper library for Java 8. It has neat data binding practices which I would certainly consider for SparkJava. It is dual-licensed: free and open source when used with open source databases, paid for commercial databases. So if you use MariaDB or PostgreSQL you’re fine. Overkill for small apps as some suggest, but it’s worth mentioning.

Sql2o: I only discovered it when I was evaluating SparkJava and it is amazing. It has as low a footprint as JDBC but adds Java class mapping and is extremely terse. It works with try-with-resources for added ease of use. Definitely looking like my go-to after JDBC.

ORMLite – It’s a lightweight ORM. Commonly used for Android apps but can be used as server side just the same. Weird licensing so I am unlikely to use it.

Once I outgrow JDBC, which is likely as I write more POJOs to save objects, I might consider Sql2o.

Dec 16 2016

I have been getting familiar with Java a lot more than I had planned to in my career. It so happens I need to develop a standard set of APIs for public consumption and I happened upon Swagger.io, which is an API framework/guideline. It’s similar to WSDL. It allows you to auto-generate code from a contract-first document. So you could write what your JSON looks like and Swagger will generate code. I noticed it supported tons of JAX-RS frameworks. Now JAX-RS is another API spec for Java https://jax-rs-spec.java.net/ and the two work so well together. There isn’t any official bridge between Swagger and JAX-RS but I was sold. So were other folks in my team. We can generate both server and client code from the API specification. Of course we need to write the business logic, but none of the grunt work.

So JAX-RS is what I am going with. It’s perfect for someone coming from the Python Flask/Django world. I suggest these good tutorial videos on JAX-RS with Jersey done by Koushik, from Java Brains. You can find the playlist here https://www.youtube.com/playlist?list=PLqq-6Pq4lTTZh5U8RbdXq0WaYvZBz2rbn There is also an advanced JAX-RS series which I haven’t checked out yet, but good to know if I need more info. Even though I have production APIs serving millions of users every day, there is so much I don’t know, and looking for answers always surprises me.

I am currently using Azure API services for deploying the JAX-RS services and it’s quite easy as they support git push.

The reason for getting in depth into JAX-RS is so I can do contract last and annotate for Swagger. But it really depends on your requirements and project.

Swagger is not covered in the videos. I will post about it once I have worked out a sample.

Edit: This is the best sample project I have found https://github.com/swagger-api/swagger-samples/tree/master/java/java-jersey-jaxrs

It has everything you need: a JAX-RS sample project with Swagger annotations. I recommend going this way, JAX-RS->Swagger, if you are not using the contract-first method.

Dec 14 2016

I have been using du and df with a few other bash scripts to gather disk space usage information on my Linux servers. I need to clean up disk space often, especially when I am reaching 70% usage, just to be sure I really need that next upgrade. On Windows we have WinDirStat and for KDE there is KDirStat (the tool that inspired WinDirStat), but for headless servers a GUI is not an option.

I have come across a command line tool called ncdu that can be installed from the regular repository. Just yum or apt-get this tool.


You can get per directory usage sorted by space taken like this:

ncdu <root-path-to-use>

You can even hit d to delete directories and files recursively. None of that xargs weirdness to handle large number of files. It takes care of that.


Give it a spin.
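
To know when it is time to fire up ncdu, the 70% check I mentioned can be scripted around df. This is just a minimal sketch, not what I actually run: over_threshold is a made-up helper name, it expects the output of df -P on stdin, and it assumes mount points without spaces in them.

```shell
# Read `df -P` output on stdin and print every mount point whose usage
# is at or above the given percentage (70 if no argument is given).
# Assumes the POSIX column layout: capacity in column 5, mount point in 6.
over_threshold() {
    limit="${1:-70}"
    awk -v limit="$limit" '
        NR > 1 {                      # skip the header line
            use = $5
            sub(/%/, "", use)         # "80%" -> "80"
            if (use + 0 >= limit) print $6 " " use "%"
        }'
}
```

Run it as df -P | over_threshold 70 and then point ncdu at whatever mount point it prints.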

Nov 04 2016

If you read my posts you will realize I am a fan of Linode, have been for years. They are great and reasonable people.

I decided to give DigitalOcean a try since I have heard great things about it.

I found the connectivity was awesome from here in Singapore to my first Droplet in London! They have quite a few locations, but the list is hard to find on their site. Like you really have to search the FAQ etc. So here is a screenshot of all their locations.

Digital ocean data center locations

That’s a pretty cool list of locations. Bangalore & Singapore! Wow. And it all starts at $5, but with my referral link you get a free $10 credit so you can try things out. And I get some referral credits so I can ping other data centers 😀

I wanted to find out more about how I could make it easy to deploy images with something like Docker, Chef or Puppet. I need the ability to maintain multiple regions with the same exact image and all changes applied without too much DevOps.

I would definitely use my free credits to test other locations like Singapore and Bangalore.

Here are my MTR results to London from Singapore using an actual VPS. I would expect an average RTT of 270-350 ms so this is pretty good.


Start: Fri Nov 4 13:45:06 2016
HOST: xeon                        Loss%   Snt   Last    Avg   Best   Wrst  StDev
 1.|-- gateway                     0.0%   100    0.3    0.3    0.2    3.9    0.3
 2.|--                             0.0%   100    1.8  201.2    1.6  3931.  726.0
 3.|-- 103-6-148-41.myrepublic.c   0.0%   100    1.7   14.7    1.5   57.4   18.5
 4.|-- 103-6-148-13.myrepublic.c   0.0%   100    2.5    3.3    2.2   19.8    2.5
 5.|--                             0.0%   100    7.7    3.4    2.2   15.5    1.6
 6.|-- ae-0.r21.sngpsi05.sg.bb.g  21.0%   100   11.4    4.1    2.1   14.1    2.7
 7.|-- ae-8.r24.londen12.uk.bb.g   0.0%   100  185.8  187.1  185.8  193.7    1.4
 8.|-- ae-1.r25.londen12.uk.bb.g   0.0%   100  182.2  183.4  182.1  193.1    1.9
 9.|-- ae-2.r02.londen01.uk.bb.g   0.0%   100  183.3  183.8  183.2  191.6    1.0
10.|-- hds.r02.londen01.uk.bb.gi   0.0%   100  189.0  189.4  188.3  231.4    4.4
11.|-- ???                        100.0   100    0.0    0.0    0.0    0.0    0.0
12.|--                             0.0%   100  183.4  183.5  182.9  188.8    1.0

The lower the better. 185 ms is the lowest I have ever seen.

I added the IP to AWS Route 53 latency based routing DNS and noticed that while most countries in Europe were getting the London IP assigned, the UK itself was not. This could be either an AWS problem or a Digital Ocean problem.

So if you are going to give it a try for free (I paid the $5) you can use my code by clicking through here http://www.digitalocean.com/?refcode=3a149653659e

Oct 30 2016

I was having this terrible problem that prevented use of my home server. I was starting to regret many things, like buying a refurbished workstation, the refurbished memory DIMMs, or the fact that I chose CentOS 7 over Ubuntu, which I use for development.

A careful and determined Google searchathon revealed the issue. My reason to persist was that other than disk IO causing freezing there was no issue with the system. I needed to rsync all my backups from the servers locally. Backups cost me over US$500 per year. I wanted to save some of that money by at least getting rid of server snapshots that cost US$240 per year.


OK, back to the problem:

“CentOS or any distro appears to freeze and become completely unrecoverable except by hard reset when a heavy disk I/O task is performed, for example copying files over CIFS, rsync or any heavy reads and writes.”

I don’t even think it took the massive I/O that many blog posts suggest to cause this. Just about anything would cause it. Pretty stupid if you ask me that after 6 years the problem is still packaged and shipped to everyone.

The search led to a solution from 2010! This was a blog post, and a Stack Overflow post linking to that blog post.

Gist is, it’s your IO Scheduler, the default one is [cfq] more on this below.

I don’t immediately modify my server just because of one ServerFault.com answer or blog entry, so I checked the official documentation at Red Hat https://access.redhat.com/solutions/5427 It does appear to be the case that my system is using [cfq] as well.

I switched to deadline, since noop does basically no scheduling and is mainly useful inside VMs and containers.


This is the blog post for reference:



Change your IO Scheduler from CFQ to something else.


Check which scheduler your disk uses. Whether you use LVM or not is irrelevant here. Use pvscan to find your disk device names.

$ cat /sys/block/sda/queue/scheduler
noop anticipatory deadline [cfq]

$ echo deadline | sudo tee /sys/block/sda/queue/scheduler

The above command applies the change immediately, but the setting is lost on reboot.

Do this for any or all drives. To persist across reboots refer to the Red Hat documentation linked above. You can use /etc/rc.local as well.
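
If you go the /etc/rc.local route, something along these lines could do it for every drive in one go. Treat it as a sketch under my own assumptions, not the Red Hat recommended way: active_scheduler and set_scheduler are names I made up, the sd* pattern assumes SATA/SCSI style device names, and the optional second argument exists only so the function can be pointed at a fake directory for a dry run.

```shell
# Print the active scheduler for one queue file, i.e. the name inside
# the square brackets in "noop anticipatory deadline [cfq]".
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$1"
}

# Switch every sd* disk to the wanted scheduler.
# $1 = scheduler name, $2 = sysfs root (defaults to /sys/block).
set_scheduler() {
    wanted="$1"
    root="${2:-/sys/block}"
    for queue in "$root"/sd*/queue/scheduler; do
        [ -w "$queue" ] || continue   # skips missing or read-only entries
        if [ "$(active_scheduler "$queue")" != "$wanted" ]; then
            echo "$wanted" > "$queue"
        fi
    done
}
```

With those two functions in /etc/rc.local, a single set_scheduler deadline line (run as root) applies the setting on every boot.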