Niall McCarroll's Professional Home Page

Recent Blog Posts

Various musings about everything and nothing...

Mo Farah's Olympic (Rio 2016) 5000m final victory, in tweets

javascript, raphaeljs

21st August 2016

Four years on from the London Olympics he's only gone and done it again - the double double 5000m/10000m.

Mo Farah's "mobot" victory gesture. Once again, I tracked the tweets using the Twitter streaming API (search terms #gomo, #motime, @mo_farah, #mofarah) before, during and after the race.

The interesting thing is that the distribution of tweets over time is pretty similar to last time. Even the absolute rates in tweets per second are similar, despite the fact that the race started at 01.37am British Summer Time. You can compare them yourselves by looking at my original post from 2012.

More...

Compiling V8 on the Raspberry Pi 2

raspberrypi

4th April 2015

For a new project I needed d8, a shell for the V8 JavaScript engine, on my Raspberry Pi 2 running 2015-02-16-raspbian-wheezy. After some digging around, it turned out to be fairly straightforward to build from source, but in case it helps anyone else, here are the steps I took.

More...

Startrails

photography, raspberrypi

14th December 2014

Equipped with only a Raspberry Pi, a Raspberry Pi Camera, and a small USB lithium battery pack, I set out to capture some astrophotography images on a clear winter's night.

In particular I was aiming to capture star trails - the trails that appear to be left by stars as they move across the night sky (in fact it is the observer who moves as the earth rotates). I would take a series of still images using the Raspberry Pi's camera and store them on the Pi's memory card. Later I would try to combine the images, which would hopefully reveal the star trails.
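
As a rough illustration of what "combining" a series of stills can mean, here is a minimal sketch of lighten stacking: for every pixel, keep the brightest value seen in any frame, so a star that moves between frames leaves a streak. This is an assumption about one common technique, not necessarily the method used in the full post. The function name `stack_lighten` and the list-of-rows frame format are hypothetical, chosen so the example runs without any imaging library.

```python
def stack_lighten(frames):
    """Combine equally sized greyscale frames by keeping, for each pixel,
    the brightest value found in any frame (a "lighten" blend).

    Each frame is a list of rows; each row is a list of brightness values.
    """
    frames = iter(frames)
    # Start from a copy of the first frame so the inputs are not modified.
    stacked = [list(row) for row in next(frames)]
    for frame in frames:
        for y, row in enumerate(frame):
            for x, value in enumerate(row):
                if value > stacked[y][x]:
                    stacked[y][x] = value
    return stacked


# Three tiny 1x4 "frames" in which a bright star moves one pixel per frame.
frames = [
    [[200, 10, 10, 10]],
    [[10, 200, 10, 10]],
    [[10, 10, 200, 10]],
]
trail = stack_lighten(frames)
print(trail)  # [[200, 200, 200, 10]]
```

With real photographs the same idea is usually applied with an imaging library rather than nested lists; Pillow's `ImageChops.lighter`, for example, performs this per-pixel maximum on two images.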

More...

Getting started with PySpark - Part 2

pyspark, python, data science

5th May 2014

In Part 1 we looked at installing the data processing engine Apache Spark and started to explore some features of its Python API, PySpark. In this article, we look in more detail at using PySpark.

More...

Getting started with PySpark - Part 1

pyspark, python, data science

2nd March 2014

Apache Spark is a relatively new data processing engine implemented in Scala and Java that can run on a cluster to process and analyze large amounts of data. Spark performance is particularly good if the cluster has sufficient main memory to hold the data being analyzed. Several sub-projects run on top of Spark and provide graph analysis (GraphX), a Hive-based SQL engine (Shark), machine learning algorithms (MLlib) and realtime streaming (Spark Streaming). Spark has also recently been promoted from incubator status to a top-level Apache project.

In this series of blog posts, we'll look at installing Spark on a cluster and explore using its Python API bindings, PySpark, for a number of practical data science tasks. This first post focuses on installation and getting started.

More...

Building a Raspberry Pi mini cluster - part 2

raspberrypi

4th December 2013

In Part 1 I looked at the hardware required to build a mini Raspberry Pi cluster with 5 machines, where one of the machines acts as the "head" node, connecting the cluster to the internet via a wireless LAN link, and the other four are "worker" nodes.

In this post I'll describe how to configure Raspbian OS on each of the machines. We'll want to be able to log in to each of the worker nodes from the head node without a password, and we'll want to enable internet access on the head node and each of the worker nodes.

More...

Building a Raspberry Pi mini cluster - part 1

raspberrypi

12th October 2013

The Raspberry Pi is an amazing, cheap single board computer designed to hook up to a TV and help with teaching programming.

I was inspired by a story from Southampton University in the UK to build my own (but rather more modestly sized) Raspberry Pi cluster. The cluster has 4 worker nodes (Rev 2 model Bs with 512MB RAM), and 1 head node (Rev 1 model B with 256MB RAM). Each of the nodes runs Raspbian.

More...

simpleproxy

python

14th July 2013

This snippet, simpleproxy, is a simple command line tool, written for Python 2 and Python 3, for forwarding network connections made to a specified proxy port on to another port (possibly on a different host).

simpleproxy can forward multiple connections at the same time and uses an approach called event driven programming. Rather than starting a thread to handle each proxy connection, the program sits in an event loop and waits for activity on each of the connections, moving data whenever a connection is ready. This approach is particularly suitable for programs which do a lot of I/O (and this one does little else).
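
To make the event loop idea concrete, here is a minimal sketch of a forwarding loop built on Python 3's standard `selectors` module. It is not simpleproxy's actual code: the function name `run_proxy`, its parameters and the `stop_event` shutdown mechanism are all assumptions made for illustration. A single loop waits for any registered socket to become readable, then either accepts a new connection or moves data to the other side of the pair.

```python
import selectors
import socket


def run_proxy(listen_sock, target_addr, stop_event, poll_interval=0.1):
    """Forward every connection accepted on listen_sock to target_addr.

    One event loop serves all connections; no per-connection threads.
    stop_event is a threading.Event-like object used to end the loop.
    """
    sel = selectors.DefaultSelector()
    listen_sock.setblocking(False)
    sel.register(listen_sock, selectors.EVENT_READ)
    peers = {}  # maps each socket to the socket on the other side

    def close_pair(sock):
        # Close a socket and its partner, if they are still open.
        for s in (sock, peers.get(sock)):
            if s is not None and s in peers:
                del peers[s]
                sel.unregister(s)
                s.close()

    try:
        while not stop_event.is_set():
            # Wait until at least one socket has something to read.
            for key, _events in sel.select(timeout=poll_interval):
                sock = key.fileobj
                if sock is listen_sock:
                    client, _addr = listen_sock.accept()
                    # Simplification: data sockets stay blocking so that
                    # sendall() can be used without write buffering.
                    client.setblocking(True)
                    upstream = socket.create_connection(target_addr)
                    peers[client], peers[upstream] = upstream, client
                    sel.register(client, selectors.EVENT_READ)
                    sel.register(upstream, selectors.EVENT_READ)
                elif sock in peers:  # skip sockets closed earlier this pass
                    try:
                        chunk = sock.recv(4096)
                    except OSError:
                        chunk = b""
                    if chunk:
                        peers[sock].sendall(chunk)
                    else:
                        close_pair(sock)  # peer closed its end
    finally:
        for s in list(peers):
            close_pair(s)
        sel.unregister(listen_sock)
        sel.close()
```

The caller creates, binds and listens on `listen_sock`, and remains responsible for closing it afterwards. A more complete proxy would also register for write readiness and buffer outgoing data, so that one slow receiver cannot stall the loop.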

More...

Northern lights - capturing time-lapse video clips from webcams

photography

26th January 2013

I've always wanted to see the northern lights in person, but in the meantime I experimented with capturing time-lapse video from webcams set up on the Nature of Jokkmokk site.

While being careful not to overload the site, I set up a job to download images from one of the site's webcams every 5 minutes over a 5-day period, and found that the northern lights lit up the night of the 23rd/24th January 2013. It was then fairly simple to stitch the images from this night together using ffmpeg and some advice in Paul Rouget's blog post.

More...

pyworksheet, an online tool for experimenting with python

python

1st January 2013

I was recently inspired by the Scala worksheet plugin for Eclipse and could see how it was a great resource for learning to program in Scala. I started working on trying to create something similar for Python.

I've hosted an initial effort on Google App Engine. To try it, visit http://pyworksheet.appspot.com. You can enter Python code into the widget and press the Go button to execute it - execution results are pasted into the code as comments and the widget display is refreshed.

The widget uses the excellent CodeMirror JavaScript library (http://codemirror.net) to provide the code editor component with Python-specific smart indenting and syntax coloring (CodeMirror also supports a large number of other languages).
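
The "results pasted back as comments" behaviour can be approximated in a few lines. The sketch below is not pyworksheet's implementation; the function name `annotate` and the `# =>` marker are assumptions for illustration. It runs the top-level statements one at a time and, whenever a statement is a bare expression, appends its value as a comment on the line where the expression ends.

```python
import ast


def annotate(source):
    """Run source statement by statement and return it with the value of
    each top-level expression appended as a trailing comment."""
    lines = source.splitlines()
    namespace = {}
    for node in ast.parse(source).body:
        if isinstance(node, ast.Expr):
            # A bare expression: evaluate it and record the result.
            code = compile(ast.Expression(node.value), "<worksheet>", "eval")
            value = eval(code, namespace)
            if value is not None:
                lines[node.end_lineno - 1] += "  # => " + repr(value)
        else:
            # Any other statement (assignment, def, import...): just run it.
            code = compile(ast.Module([node], type_ignores=[]), "<worksheet>", "exec")
            exec(code, namespace)
    return "\n".join(lines)


print(annotate("x = 2\nx * 21"))
# x = 2
# x * 21  # => 42
```

Note that this executes whatever code it is given, so a hosted tool would need sandboxing that the sketch does not attempt. It also relies on `end_lineno`, which requires Python 3.8 or later.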

More...

Older Posts

twitstreamer - commandline tool for reading tweets from the Twitter streaming API (1st December 2012)
twitfetcher - commandline tool for searching for and fetching tweets (21st November 2012)
Analyzing co-occurrence networks with Gephi - This blog post describes some early results from analyzing co-occurrence networks of individuals named in news stories (12th October 2012)
Displaying recent volcanic eruptions on a 3D globe with webgl/three.js - This blog post uses webgl/three.js to plot the location of recent volcanic eruptions on a 3D globe (22nd August 2012)
Mo Farah's Olympic 5000m final victory, in tweets - This blog post uses the g.raphaeljs library to chart tweeting rates during the 5000m final (12th August 2012)
Visualizing Reading FC's Winning 2011/2012 Season - This blog post uses the d3 library to visualize Reading FC's incredible 2011/2012 season (8th August 2012)
Connecting a Raspberry Pi to a ubuntu netbook - This blog post describes how to make a direct ethernet connection between the Raspberry Pi and a netbook/laptop running Ubuntu 12.04 (27th July 2012)
Heatmap - A javascript bookmarklet for adding heatmaps to HTML tables (17th July 2012)
Perill de caigudes (6th June 2012)
HBaseload - A utility for importing/exporting between hbase and csv (12th December 2011)
Minicrawler - An extensible queue-based web-crawler (12th December 2011)
svgworld - svgworld renders a 3d projection of a world map using svg (23rd December 2010)
PySparql - Simple ways to access dbpedia data using python and SPARQL (11th January 2010)
S3 Cache - A local file cache for Amazon S3 using python and boto (9th September 2009)
j2js - An article with some tips about how to use GWT to generate javascript library code from java code which is not user-interface related (17th June 2009)
Snapshot - Snapshot demonstrates a way to create a static, standalone copy of the dynamic content in a web page (7th February 2009)
PPyPNG - A pure python script with limited functionality for creating portable network graphics (PNG) format files (6th July 2008)