Dec 272016

Twitter Heron is an alternative to Apache Storm with compatible API as it was developed from Apache Storm. The major difference I found was that Heron opts to get rid of Threads and going with Process based workers. This is surprising and I am not sure the any advantage it would have over Threads+Process.

Heron also has done away with DRPC which is the one thing I need to provide direct access to a Storm cluster. I haven’t seen an alternative being mentioned.

Most of everything else is same as Storm except the sales page for Heron is better of course 🙂 The reason to explore Heron is ease of Production use and DevOps. It’s kind of difficult with Storm but Azure HDInsight might help as it has now Apache Storm 1.x supported in their HDInsight 3.5 version.

I hope to learn more from users of Storm and Heron who have been operating it in Production on what’s the pros and cons.


Dec 272016

Finally got a metal storage rack that I will use as Rack space. About US$99. Finally so much space. Comes with removable half panels, just in case. It’s the same sort I have in my storage room. Why did I not think of this earlier!

metal rack for servers/devices

Since I was unplugging everything to get the rack installed, I took the plunge and installed that 2nd Xeon CPU on the workstation and it went without a hitch. This is the first time I have actually installed the CPU myself. I did try earlier without a heatsink and it was overheating so the system would not boot. The BIOS logs accurately pointed as much, with the heatsink, no complains. Cleaned out the system with compressed air and a vacuum cleaner to catch the dust.

Got a lot done for Christmas holidays I must say.

It tooks months for me to get my Second Intel(R) Xeon(R) CPU E5420 CPU for the used Workstation I bought an year ago. There was some hassle and expensive shipping to get the Heatsink as the stock heatsink would not fit the Dell workstation spec. After all the upgrades to this beast of a Dell t5420 Workstation is finally complete.  My final server spec is

Intel(R) Xeon(R) CPU E5420
4TB external powered HDD
2TB 2x enclosure, powered disk cloner that doubles as USB drive
Edgemax Pro 3 Gbps router and Unifi pro wireless (next in line for network upgrades)
1Gbps down and 500mbps up fiber internet connection
excludes my desktop and peripherals.
Runs a decent Apache Storm cluster, mirrors OSS projects and chomps on AWS Kinesis streams while deploying to Azure app services.

There is no hyperthreading support unfortunately on this old Xeon and the 2 Disk LVM still suffers from IO issues. It could be the limits of 2009 SATA. I might have better luck with external storage for things like my Raspberry mirror. I have been very happy with WD My Book which is externally powered and has been my offline backup for almost 4 years now.

Wire management is going to get easier. Plus the Eubiq port is amazing for power supply. See picture

Eubiq power strip, still need to manage those cables.

It doesn’t sit on the floor so it’s gonna stay a lot cleaner and quieter. Once I am sure where to place what i an do some cable management.

Full blast LXC containers manage and host all my apps. Maybe I will try Docker at some point.

The workstation is heavy and I forgot to take picture of the installed dual Xeon CPUs. Maybe next time when I upgrade the disks.

Complete setup of my home workspace, much neater with everything else on the rack

my workspace, those cables need managing

Dec 252016

Save time and money. Go Backblaze. Crashplan sucks, terrible speed and it has not improved in years. They are full of excuses about it though so they are unlikely to address it in any correct way.

I am using both and going to ditch crashplan once subscription expires.

Keep it up Backblaze.

Dec 242016

I have been scouring the internet for opinions and alternatives to what Java has to offer in terms of both ORM and non-ORM for database access. Unlike Python, I do not clearly know what is out there for Java. Rails is easy, if you use Rails you use ActiveRecord and forget about anything else. Hibernate seems to be much-used historically for Java and Spring framework as ORM layer. In any case, for building small apps and micro-services which are data centric it would be overkill to use ORM. Unlike Rails/Ruby, Java projects are much more active and well maintained.

Found an interesting thread on Reddit discussing alternatives to ORM even though the ask was about ORM. I found the following possible non-ORM alternatives that could come in handy.

JDBC : Pure database and SQL query part of Java API. Verbose (which Java tends to be anyway) but no frills. This is the direct interface to Database using the API standard. I think it’s important to learn this way first anyway plus it’s always going to work if you get it right. Where complex POJO are not required , JDBC is the best option.

JDBI: An interesting wrapper to JDBC that makes it easier to use JDBC and less verbose. Surely worth a look and the DAO layer is all we need mostly to not have to write the same queries over and over again.

JOOQ: Java 8 lambda supporting interface to JDBC much like JDBI but with lot more features including code generation from database. A lot of focus on functional programming and lambdas. Good reason for it as the author also created lambdaj for Java 5. It has neat data binding practices which I would certainly consider for SparkJava. It is commercial though OSS. it is still free for non-commercial databases. So if you use MariaDB or PostgresSQL you’re fine. Overkill for small apps as some suggest but it’s worth mentioning.

SQL2o – I only discovered it when I was evaluating SparkJava and it is amazing. It is as low footprint as JDBC but adds Java class mapping, extremely terse. This could be a go to instead of JDBC. Try-with-resources for added ease of use.  Definitely looking like my go-to after JDBC

ORMLite – It’s a lightweight ORM. Commonly used for Android apps but can be used as server side just the same. Weird licensing so I am unlikely to use it.

Once I run out of JDBC which is likely as I write more POJO to save objects, I might consider SQL2O

Dec 162016

I have been getting familiar with Java a lot more than I had planned to in my career. It so happens I need to develop a standard set of APIs for public consumption and I happened upon which is an API Framework/Guideline. Its similar to WSDL. It allows you to auto-generate code from Contract-first Document. So you could write what your JSON looks like and Swagger will generate code. I noticed it supported tons of JAX-RS framework. Now JAX-RS is another API-Spec for Java and it works so well together. There isn’t any official bridge between Swagger and JAX-RS but I was sold. So were other folks in my team. We can generate both server and client code from API specification. Ofcourse we need to do the Business Logic but none of the grunt work.

So JAX-RS is what I am going with. It’s perfect for someone coming from Python Flask/Django world. I suggest this good tutorial videos on JAX-RS with jersey done by Koushik, from Java Brains. You can find the playlist here There is also advanced JAX-RS which I haven’t checked out yet, but good to know if I need more info. Even though I have production APIs serving millions of users every day, there is so much I don’t know and looking for answers always surprises me.

I am currently using Azure API services for deploying the JAX-RS services and it’s quite a easy as they support git push.

The reason for getting in depth in to JAX-RS is so i can do contract last and annotate for Swagger. But it really depends on your requirements and project

Swagger is not covered in the videos. I will post about it once I have worked out a sample.

Edit: This is the best sample project I have found

It has everything you need. A JAX-RS sample project with Swagger annotations. I recommend this way, JAX-RS->Swagger, instead if you are not going Contract First method

Dec 142016

I have been using du and df with a bit of other bash scripts to gather disk space usage information on my linux servers. I need to clean up disk space often specially when I am reaching 70% usage just to be sure I will really need that next upgrade. On Windows we have WinDirStat and for KDE there is KDirStat (same tool ported to Windows) but for headless Servers GUI is not an option.

I have come accross the command line tool called ncdu that can be installed from the regular Repository. Just yum or apt-get this tool.


You can get per directory usage sorted by space taken like this

ncdu <root-path-to-use>

You can even hit d to delete directories and files recursively. None of that xargs weirdness to handle large number of files. It takes care of that.


Give it a spin.