For the last few years, it's been all Node all the time, but when it came to the crunch on a big new work project, I was surprised to find myself getting a little nervous about the MEAN stack. I love MongoDB (and especially love what they are doing with Compass, which hopefully will have aggregation support soon – that's right Thomas…I am talking to you!) but when you are implementing stuff that will sit alongside old-school tech stacks and you need to provision internal servers (I know, right?), everyone just gets too nervous when you go NoSQL.
To get around this I have been playing around with some nice Node SQL tools, but they just don't seem mature enough yet. And to be fair, I have yet to find a project that would genuinely benefit from a NoSQL strategy. And I have someone on the team who actually prefers to do data cleaning and analytics in SQL rather than in Python pandas (WTF??!! …seriously).
So the plan I have gone with is Angular and Django. The team's analytics stack is Python/Jupyter so there is a nice link here, and I had forgotten how nice it is to work with the Django back end. I moved away from Django because it just became so messy for SPA apps, but as a REST API it is just the best. Also, having a Python toolset in the back end for business logic is a breath of fresh air (I love you lodash.js…but even so…)
Having been away from Django for a few years, the place to go for a refresher was of course that darn polls tutorial they have on their site (how many times have we all done that??!!), but I also found another great resource here. Greg's Angular/Django post is perfect for covering all those little annoying setup things (and frankly that guy's posts are must-reads for IT folks). And, perhaps to be expected, Pluralsight has some really nice Django/Angular stuff, so check it out! Overall though, with good ol' Django, it was just a few hours of pottering around and I was back up to speed.
What the heck happened to the October 2016 release date for my software?! Apologies all – circumstances intervened and I left the quiet university life to take on another project, which messed with the timeline somewhat.
Back on track now and hoping to be in beta by April 2017.
The other problem, of course, with building your own software is the endless scope creep! I have taken the extra time to include some nice aggregated sequence-searching tools and corpus-grouping functionality (and thanks for the suggestions on this) but will leave the other updates for the post-release version.
I had planned to work up a quick tutorial on typical R commands for those starting with Python Notebook, but when I was looking around to see what resources were already out there, I came across these – Greg Hamel's Introduction to R and Python for Data Analysis.
No more need to work up a tutorial – what an amazing set of resources (and kind of hard to find, too!). It's a well put together 30-part series on each language that gives a really nice, balanced view of both. You should head on over and make sure you are across these, and thank this guy (thanks Greg!)
And don't go all should-I-learn-Python-or-R on me! Make sure you know them both! For me, R wins out for the sheer joy of doing exploratory stuff, but Python has the edge when it comes to plugging into other systems, and it just generally plays better in messy tech situations.
Yes, I know…what a thrilling blog post title! But you know, sometimes I get the feeling we get all caught up in overly complicated ways to solve problems – it's neural nets this, boosting that, recommenders the other, yada yada yada…like whatever!
We forget that in data analysis it is so important to have an intuitive grasp of the basics. With that in mind, I just had to post this amazing image from Sean Owen's great blog post on understanding the relationships between different probability distributions. I started searching for something like this because I got into a long and thrilling conversation the other day about different distributions, and everyone seemed a little too vague about it all. I tell you, I felt uneasy!
So check out the post and ensure you have a zen understanding of how they all fit together!
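Those relationships are easy to sanity-check yourself with a few lines of simulation. Here is a minimal sketch of one of them, using only the Python standard library (the rate and sample counts are illustrative choices of mine, not from Sean Owen's post): the sum of k independent exponentials is a Gamma distribution.

```python
import random

random.seed(42)

RATE = 2.0    # exponential rate parameter (lambda)
K = 3         # number of independent exponentials to sum
N = 200_000   # simulation draws

# The sum of K independent Exponential(RATE) variables is
# Gamma(shape=K, rate=RATE), which has mean K / RATE.
sums = [sum(random.expovariate(RATE) for _ in range(K)) for _ in range(N)]
mean_of_sums = sum(sums) / N

print(f"simulated mean: {mean_of_sums:.3f}, theoretical: {K / RATE:.3f}")
```

The simulated mean should land very close to the theoretical K / RATE = 1.5 – and swapping in other distributions is a nice way to make the rest of the chart stick.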
I have finally released Stelupa out into the wild! Here is a walkthrough of the features. I am still working on the backend query engine but expect it to be finalised by October 2016.
Start using Stelupa
One of the things about MusicXML is that it does not explicitly encode time information; it just lists every event in document order. You can't extract a particular note or rest from the data and find its exact position in time.
To solve this I have created a Node module that takes a MusicXML file and converts it into MusicJSON! This provides all the time information you would expect, such as tempo, the absolute location of each note or rest, its location within its measure, the locations of the beginnings of measures, and an ISO 8601-compliant timestamp in milliseconds calculated from the location and tempo information.
Having this kind of information encoded in JSON makes data visualisation far simpler and also makes the data set well suited to a range of machine learning and machine intelligence applications.
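The underlying arithmetic is straightforward once you know the divisions-per-quarter value (from MusicXML's `divisions` element) and the tempo. Here is a sketch of that calculation in Python for clarity (the module itself is JavaScript, and this function name and signature are just illustrative, not the module's API):

```python
def onset_ms(position_divisions: int,
             divisions_per_quarter: int,
             tempo_bpm: float) -> float:
    """Absolute onset time in milliseconds for an event that starts at
    position_divisions, assuming a constant tempo expressed in quarter
    notes per minute. divisions_per_quarter comes from MusicXML's
    <divisions> element."""
    ms_per_quarter = 60_000.0 / tempo_bpm
    return position_divisions * ms_per_quarter / divisions_per_quarter

# One quarter note (4 divisions at 4 divisions per quarter) at 120 bpm
# lasts 500 ms, so the next event starts at 500 ms.
print(onset_ms(4, 4, 120))  # 500.0
```

With this per-event timestamp in hand, the ISO 8601 location is just a formatting step on top of the millisecond value.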
I have included a piano-roll visualization Express app in this Node module (it generates the visualization below), and it is the test I have found most handy. You can check it all out on my GitHub at https://github.com/jgab3103/musicXML2Json
After using all kinds of editors over the years, from Emacs to Word to LaTeX, I have to say there has never been a better time to be doing research. Tools like Jupyter Notebook and the MEAN stack mean that you can easily have access to an industrial-strength data store right alongside your academic writing. There is nothing cooler than having access to variables in academic writing, and being able to put together interactive visualisations so people can immerse themselves in your data while they are reading your stuff (well, OK, there is probably some cooler stuff than this…but even so).
But while this brave new world is cool, sometimes you do need to be able to print stuff out. Especially if you work across different disciplines, it's not enough to have all your stuff living on the web. You need to be able to create high-quality, camera-ready papers suitable for hard-copy printing.
To solve the issue, I created an Angular research template. Styled for even the most discerning tweed-wearing academic type, it's a way to work in the web environment with a view to printing. It includes:
- A simple CSS setup to manage A4 pages and margins, fonts, and some Twitter Bootstrap styling – super handy for basic layout stuff.
- An Angular-driven website under the hood, with all the advantages that come with that. You can beautifully organise your data in factories and all that other Angular garb. Directives, along with the CSS, give you that nice LaTeX-style control that we all love and crave.
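For the A4-and-margins part, the standard CSS print machinery does most of the work. A minimal sketch of the kind of rules involved (the class names here are my own illustrative choices, not the template's actual selectors):

```css
/* Paged-media rules: A4 sheet with print margins. */
@page {
  size: A4;
  margin: 25mm 20mm;
}

@media print {
  /* Hide on-screen chrome (nav bars, buttons) when printing. */
  .no-print {
    display: none;
  }
  /* Force a section to start on a fresh page. */
  .page-break {
    page-break-before: always;
  }
}
```

The `@page` rule sets the sheet size and margins, and the `@media print` block lets the same Angular app render one way on screen and another way on paper.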
You can check it out on my GitHub (which is another neat thing about this setup – simple version control!).