Saturday, December 28, 2013

Streaming Systems history, part 3

Finally, here we are at the third part of the streaming systems history. I'm sorry it took so long, but I had many things to do for work and I was very busy with real-life stuff.
In this part I'm going to introduce systems from 2011. If you find any mistake or anything incomplete, feel free to let me know by commenting on the post! Not all the 2011 systems are shown in this post; I will cover the rest in the fourth and final part.

In this part I'm going to quickly talk about Storm (2011) and WebRTC (2011).

Storm is a free and open-source distributed computation environment (check out the link!) originally developed by Twitter. The building blocks, where information is created and manipulated, are called "spouts" (blocks that produce data) and "bolts" (blocks that receive and process streamed data). With them, developers can build a pipeline; they are very similar to MapReduce jobs, the main difference being that they can (theoretically) run forever. Every spout and bolt can be run by one or more workers, which is where the core of the Storm computation is executed.

Given spouts and bolts as basic building blocks, developers can build Directed Acyclic Graph (DAG) pipelines. The programming language is Java, thus the programming representation is imperative. Storm pipelines can be deployed on the cloud: that is, they can run on a local machine and have spouts or bolts somewhere on the cloud. This implementation detail became very important as time went by. The pipeline is dynamic at node level, that is, a Storm spout or bolt can alter its number of workers at runtime. Faults are dealt with by reconfiguring the pipeline at runtime, thanks to some controller threads that constantly run checking for errors. I call this kind of controller "dynamic adaptive", like the ones seen in DryadLINQ and S4.

Web Real-Time Communication (WebRTC) is an API currently being drafted by the World Wide Web Consortium (W3C). It enables browser-to-browser connections for applications like video, chat or VoIP. Many applications have been built on this API since its initial release, for example peer-to-peer file sharing or video conferencing. The API is not a real streaming system per se, but it brings a whole new kind of connectivity to browser applications, opening the path for future browser streaming applications.

Given that WebRTC is only an API, we can build any kind of pipeline (arbitrary pipelines). It can only be programmed in JavaScript, so, like Storm, its programming representation is imperative, while the deployment, for now, is limited to the web browser. The pipeline flexibility is dynamic at topology level, as nodes (browsers) can connect to and disconnect from the pipeline at runtime. There is no built-in fault tolerance or load balancing, so once the pipeline is disrupted it cannot be reconstructed. Ideally, we would need a piece of software on top of it to do so.

This was a small part compared to the previous two. As I mentioned, I've been very busy lately, so this is all I found time to write. The next part will hopefully be the last one (this one was supposed to be, but it turned out to be very long).
See ya!


Monday, November 25, 2013

Streaming Systems history, part 2

Here I am again with some history on streaming systems! Today I would like to dive into systems born from 2005 to 2010. If you find any mistake, please comment on the post and let me know!

In this post I'm going to talk about the second generation of streaming systems: Borealis (2005), SPC (2006), DryadLINQ (2008) and S4 (2010).

Borealis is a distributed stream processing engine which inherits much from Aurora and Medusa. Aurora is a framework for monitoring applications born around 2002. At a high level, the system model is made of operators which receive and forward data through continuous queries. After Aurora was born, Aurora* was proposed, which is a version of Aurora with distribution and scalability. Borealis implements all the functionality of Aurora and is designed not only to support dynamic query modification and revision, but also to be flexible and highly scalable.

Borealis can run Directed Acyclic Graphs (DAGs) as pipelines, but not arbitrary pipelines. A bachelor thesis (if I'm not mistaken) proposed a graphical user interface for Borealis that lets users build pipelines using visual building blocks. Otherwise, one can implement a pipeline directly in C# (imperative programming). Borealis is the only system of the second generation that supports deployment not only on clusters and the cloud, but also on pervasive devices (e.g. microcontrollers).
At runtime, it is flexible at node level to handle failures: it can reinitialise a failed node while the stream is running, thanks to a replication mechanism that sends the data to the closest upstream replica. Load balancing is again performed with load shedding.

SPC stands for Stream Processing Core and is a middleware for distributed stream processing which targets data mining applications. It was built to support applications that extract data from multiple digital data streams. Topologies are composed of Processing Elements (which are what I call nodes), which implement application-defined operators and are connected by stream subscriptions.

Like Borealis, this system supports DAGs as topology and has an imperative programming language as well as a graphical user interface. Deployment is only available on clusters of machines and the cloud. It is flexible at pipeline level, in the sense that it can change the pipeline while the stream is flowing. The idea is to let the pipeline be extended at runtime: new nodes can join the pipeline and be connected to the stream while it is running. Again like Borealis, it uses replication to cope with faults, and it has no means of load balancing.

DryadLINQ introduces a new programming model for large-scale distributed computing. It supports both general-purpose declarative and imperative operations on datasets thanks to a high-level programming language. A streaming application built with DryadLINQ is composed of LINQ expressions that the compiler automatically translates into a distributed execution plan (which DryadLINQ calls a "job graph", and which is a "topology") passed to the Dryad execution platform.

DryadLINQ supports DAG pipelines and both imperative and declarative programming models. Deployment only targets clusters of machines and, like SPC, it supports dynamic topology reconfiguration at runtime. Fault tolerance is tackled with reconfiguration, while load balancing is handled by a dynamic adaptive controller. In more detail, DryadLINQ exploits some hooks in the Dryad API to mutate the pipeline at runtime to improve performance. Ideally, performance is improved by aggregating nodes, thus decreasing I/O times.

Last but not least, we have S4 (Simple Scalable Streaming System), which is again a general-purpose distributed streaming platform developed by Yahoo! and inspired by the Actor model and MapReduce. It allows developers to program applications that process unbounded streams of data. Developers can set up pipelines with Processing Nodes that host Processing Elements. Each Processing Element (PE) is associated with a function and the type of event that it consumes.

S4, like almost all the others, also supports DAGs as topology. The programming model is imperative and deployment is possible on clusters of machines and the cloud. The pipeline is flexible at node level, while for fault tolerance it supports reconfiguration (it can reconfigure a pipeline by splitting the nodes deployed on a failed host among the remaining available execution resources), and for load balancing it again uses a dynamic adaptive controller.

And that's it. Hopefully I haven't made many mistakes.

See you next time with the last part!

Friday, November 15, 2013

Streaming Systems history, part 1

Lately I've been very busy studying streaming systems in general. I've taken a look at the streaming systems of the last decade and at what the literature proposed. I want to share my findings by summing things up here. I'm going to divide this series of posts in three, one for each generation of streaming systems. All the data should be more or less correct, but feel free to comment below to ask questions or correct mistakes if there are any! I would like to be as precise as possible.

The first generation of streaming systems includes, among others, StreamIt (2002), Sawzall (2003) and CQL (2003).

StreamIt is a compilation infrastructure and programming language created to set up pipelines of streams. Users can create any kind of pipeline topology thanks to the different kinds of filters available. The filter structures available are:

  • Pipeline: lets the user build a filter which has one input channel and one output channel; with a series of pipeline filters you can therefore only build linear streaming pipelines.
  • SplitJoin: the first filter is a Split which has two output streams, followed by different pipeline filters and, at the end, a Join filter which joins the work performed in parallel.
  • Feedback Loop: lets the user build a node which has two output streams, one that goes on in the pipeline and one that feeds back into the node itself. Useful for tail-recursive computations (e.g. Fibonacci).

The programming model is imperative, with a Java/C++-like syntax. Pipelines built with StreamIt can be run on clusters of machines. To cope with load problems it has a reinitialisation mechanism: if a bottleneck is found, the pipeline is restarted with a different configuration to cope with the load changes.

Sawzall is a procedural domain-specific programming language which includes support for statistical aggregation of values read or computed from the input, very similar to Pig. Sawzall was developed by Google to process log data generated by Google's servers. It processes one record at a time and emits its output in tabular form. Sawzall is stateless and, thanks to MapReduce, each Sawzall program can run on multiple hosts (clusters of machines).

Last but not least, CQL (Continuous Query Language) is an SQL-based language for writing and maintaining continuous queries over a stream of data. Those queries are suitable for reactive and real-time programs, for example to keep an up-to-date view of data. It was implemented as part of a project named STREAM.
Here is an example of a CQL query:

Select Sum(O.cost)
From Orders O, Fulfillments F [Range 1 Day]
Where O.orderID = F.orderID And F.clerk = "Alice"
    And O.customer = "Bob"

As you can see, CQL is much like SQL, with the difference of having a time span on the query (the one-day range in the example). This is because queries are performed over continuous streams of data, so a time range has to be specified to get a result. In the example, the query selects the sum of the costs of the orders placed by Bob and fulfilled by Alice within the last day.
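
To make the role of the time range more concrete, here is a rough JavaScript sketch of what a single evaluation of that query could compute. It is only an illustration under my own assumptions: the in-memory arrays, the timestamp field and the sumCosts function are hypothetical and have nothing to do with how STREAM is actually implemented.

```javascript
var DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical in-memory version of the query: each fulfillment carries a
// timestamp in milliseconds, and only those from the last rangeMs are joined.
function sumCosts(orders, fulfillments, now, rangeMs) {
    var total = 0;
    fulfillments.forEach(function (f) {
        var inWindow = f.timestamp > now - rangeMs && f.timestamp <= now;
        if (!inWindow || f.clerk !== 'Alice') { return; }
        orders.forEach(function (o) {
            if (o.orderID === f.orderID && o.customer === 'Bob') {
                total += o.cost;
            }
        });
    });
    return total;
}

var now = 10 * DAY_MS;
var orders = [
    { orderID: 1, customer: 'Bob', cost: 20 },
    { orderID: 2, customer: 'Bob', cost: 5 },
    { orderID: 3, customer: 'Carol', cost: 100 }
];
var fulfillments = [
    { orderID: 1, clerk: 'Alice', timestamp: now - 1000 },       // inside the window
    { orderID: 2, clerk: 'Alice', timestamp: now - 2 * DAY_MS }, // too old
    { orderID: 3, clerk: 'Alice', timestamp: now - 1000 }        // wrong customer
];

console.log(sumCosts(orders, fulfillments, now, DAY_MS)); // 20
```

A continuous query would repeat this evaluation as time advances and new tuples arrive; the sketch only shows one instant.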

And that's it for the first part of this streaming systems history. I'm sorry for not writing more about Sawzall; if you have suggestions, comments or corrections, please comment on the post!

Thursday, November 7, 2013

I'm still alive

Hi all, I'm sorry I couldn't post much lately. I've been on vacation for a while, and then suddenly a big chunk of work arrived and I had practically no time to write here. I'm not even programming anymore, just writing papers and working as a teaching assistant. Hopefully I'll be back on track after the 15th, the deadline for the last paper I wrote.
Until next time, see ya!

Tuesday, August 13, 2013

Install a specific version of Node.JS

Quick post (a personal reminder above all) about how to install a specific version of Node.js. I'm working on Ubuntu 12.04. The idea is to clone the whole repository, check out only the version you are interested in, and then build and install it. Here are the commands.

git clone
cd node
git checkout "v0.8.18"
export JOBS=2
./configure
make
make install

And you are good to go!

Monday, August 12, 2013

I'm still alive - !false and !undefined in JavaScript

Hi all, I've been very busy lately, and then I went for a holiday, so I didn't update the blog. Sorry!
I recently came back from the seaside and started to work on a bug I had had since forever. I thought it was a deeply rooted bug spread across different methods. It turns out it was not. Here is what I discovered:

First of all, I had an array that kept track of which distributed processes had answered a poll request. Each process has an ID, which corresponds to an index of the array. The array is first filled with false values. As an example, imagine that the process with ID 0 is polled. At first the value in the array at index 0 is false, but when the process answers, I set the value at index 0 to true.

If a process doesn't reply within a certain threshold time, I execute something. I used to check this by going through the array in this way:

for(var i = 0; i < processes.length; i++)
    if(!array[i])
        //do something

In other words, if array[i] is false, it means the process did not answer.
This would be correct if I did not have to take into account the fact that, concurrently, some processes may be spawned, thus growing the processes array. Since I didn't poll the newborn processes, I don't want them to be checked. Of course, with the code shown, they were. JavaScript returns undefined for the indexes of an array which have not been initialised, and !undefined evaluates to true exactly like !false. This clearly led to a bug which always executed something, even when it should not have. Luckily, with JavaScript I could correct this very easily:

for(var i = 0; i < processes.length; i++)
    if(array[i] === false)
        //do something
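
To see the difference in isolation, here is a small self-contained sketch. The processes and answered arrays and the silentProcesses function are made-up names for this illustration, not the ones from my project.

```javascript
// Hypothetical data: processes 0..2 were polled, process 3 was spawned later
// and never polled, so index 3 of `answered` was never assigned.
var processes = ['p0', 'p1', 'p2', 'p3'];
var answered = [true, false, true];

// Returns the indexes considered "silent", using either the loose check
// (!answered[i]) or the strict one (answered[i] === false).
function silentProcesses(strict) {
    var silent = [];
    for (var i = 0; i < processes.length; i++) {
        var missing = strict ? answered[i] === false : !answered[i];
        if (missing) { silent.push(i); }
    }
    return silent;
}

console.log(!undefined === !false);  // true: both expressions evaluate to true
console.log(silentProcesses(false)); // [ 1, 3 ]: the never-polled process is reported too
console.log(silentProcesses(true));  // [ 1 ]: only the polled process that stayed silent
```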

And that's it!

Monday, July 1, 2013

Read a file line by line in node.js

Hi all. Today I decided to add a small piece of functionality to my project: when I start my framework, I run things one after the other from a list of commands; I decided to add a feature that reads a particular file (which I want to be a .k file) and extracts the commands from it.
To do so I need some things:

  1. Read a file
  2. Make sure it's a .k file
  3. Split it line by line
  4. Execute each line
Reading a file in node.js is very simple. We just need to import the fs module and call the function readFile:

var fs = require('fs');

fs.readFile(cmd[1], 'utf8', function(err, data) {
    if (err) throw err;
    //do something with the file
});

This will read the file specified in cmd[1] (which is the input variable I give). Next, I would like to add some more checks (for example, that it needs to be a .k file):

if(cmd[1].lastIndexOf(".k") === cmd[1].length - 2){
    fs.readFile(cmd[1], 'utf8', function(err, data) {
        if (err) throw err;
        //do something with the file
    });
} else {
    console.log("Not a .k file!");
}

Finally, let's read the file line by line:

if(cmd[1].lastIndexOf(".k") === cmd[1].length - 2){
    fs.readFile(cmd[1], 'utf8', function(err, data) {
        if (err) throw err;
        var commands = data.split('\n');
        for(var i = 0; i < commands.length; i++){
            //commands[i] contains one line of the input
        }
    });
} else {
    console.log("Not a .k file!");
}

Notice the .split('\n') call, which splits the string into an array at every '\n' occurrence. Hope it helps!
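
For completeness, here is a slightly more defensive sketch of the same four steps. It is not the code from my project: readCommands and the demo file name are hypothetical, it uses the synchronous fs.readFileSync only to keep the example short, and it checks the extension with path.extname instead of searching for ".k" in the string.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Hypothetical helper: returns the non-empty lines of a .k file,
// or throws if the file has a different extension.
function readCommands(file) {
    if (path.extname(file) !== '.k') {
        throw new Error('Not a .k file: ' + file);
    }
    return fs.readFileSync(file, 'utf8')
        .split(/\r?\n/) // accept both \n and \r\n line endings
        .map(function (line) { return line.trim(); })
        .filter(function (line) { return line.length > 0; });
}

// Usage with a throwaway file in the temp directory.
var demoFile = path.join(os.tmpdir(), 'demo-commands.k');
fs.writeFileSync(demoFile, 'start\r\nconnect worker1\n\nstop\n');
var commands = readCommands(demoFile);
console.log(commands); // [ 'start', 'connect worker1', 'stop' ]
fs.unlinkSync(demoFile);
```

Dropping the empty lines avoids executing an empty command when the file ends with a newline.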

Thursday, June 27, 2013

OUYA Development

Hi all. I've been quite busy lately. I travelled more than I expected and I didn't work much on JavaScript and my project. What I did was mostly bug fixing on not-so-interesting stuff, which did not lead me to interesting discoveries about the language in general.
What I wanted to post about today is OUYA, a new gaming platform. It's a small console running Android. You plug it into your TV through an HDMI cable and can start browsing the store to buy games.

Yesterday I installed the ODK, OUYA Development Kit, following the instructions here. I would like to try and write some games, as I have always been attracted by game development.
Since I don't own the console, I thought I could not test what I produce. Luckily, there are some Android phones that have more or less the same power, so apparently tests may be run on the virtual device from Eclipse. I will let you know if this is actually possible (I still don't see how to emulate the controller, for example...).

If I ever start programming something, I will show the outcomes here.

Tuesday, May 28, 2013


I'm sorry for the long lack of updates. Recently I've been very busy. I travelled a lot and I had a lot of work to do, so not really much time to work on my project and discover new interesting things.
Today I want to post something that bothered me for a while. I was calling an asynchronous function, mc_cli.get(), inside a for loop and wanted to keep the index variable, as it was going to be used in the callback.
The first approach was the following, and of course it did not work:

for(var i = 0; i < list.length; i++){
    mc_cli.get(list[i], function(err, response) {
        do_something(i);
    });
}

With this approach, every callback sees the value i has after the loop has finished. So, for example, if the list has 10 elements, it always calls do_something(10). To fix the problem I tried with a closure:

do_something((function(x){return x})(i)) 

The idea is to somehow "keep" the variable, so that when the callback fires the function is called with the right index. Unfortunately, this approach won't work either: the inner function is only invoked when the callback runs, and by then i already holds its final value. Creating the closure outside the for loop and calling it in the callback did not lead to a satisfying result either.
Later on, I managed to fix the problem like this:

for(var i = 0; i < list.length; i++){
    (function(i){
        mc_cli.get(parsed_result.list[i], function(err, response) {
            do_something(i);
        });
    })(i);
}

The index is now the input parameter of an anonymous function, which is invoked immediately at every iteration. That function calls the asynchronous function, and when the callback fires it calls do_something(i) with its own local parameter, which is the correct index (and not the final value of the loop counter). Basically the index is treated as an input parameter, thus "remembering" it throughout the execution of the asynchronous function.
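
Here is a self-contained sketch that reproduces both behaviours without a real client. fakeGet and flush are hypothetical stand-ins for an asynchronous call such as mc_cli.get(): callbacks are queued and only run once the loop is over.

```javascript
// Hypothetical stand-in for an asynchronous client: callbacks are queued
// and only executed later, when flush() is called.
var pending = [];
function fakeGet(key, callback) {
    pending.push(function () { callback(null, 'value-for-' + key); });
}
function flush() {
    while (pending.length > 0) { pending.shift()(); }
}

var list = ['a', 'b', 'c'];

// Broken version: every callback shares the same variable i.
var broken = [];
for (var i = 0; i < list.length; i++) {
    fakeGet(list[i], function (err, response) { broken.push(i); });
}
flush();

// Fixed version: each iteration gets its own copy of the index.
var fixed = [];
for (var j = 0; j < list.length; j++) {
    (function (index) {
        fakeGet(list[index], function (err, response) { fixed.push(index); });
    })(j);
}
flush();

console.log(broken); // [ 3, 3, 3 ]
console.log(fixed);  // [ 0, 1, 2 ]
```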
Hope I helped somebody!

Friday, April 19, 2013

Running a script at startup with Raspberry Pi and Raspbian

First of all, I'm sorry for the lack of updates. I've been very busy with teaching and grading. Plus, my research is not leading me to discover new libraries or programming models, so I have nothing to write here.
Lately, I've been working on deploying my Raspberry Pi integrated with some sensors. I wanted a robust deployment, but how to do that? Should I bring my monitor wherever I want to deploy the Pi to set it up?
Luckily there is a very easy way to avoid that. Whatever happens (like the power supply being unplugged), whenever the Pi is turned on again it will resume what it was doing, effectively removing the need to connect it to a screen and a mouse/keyboard.
The procedure is very simple: the idea is to add a script to the /etc/init.d/ path, as many of the scripts that run at boot are started from there. Here's an example:

#! /bin/sh
# /etc/init.d/myscript

# do something like running your node.js server or client!
cd mystuff/mynodeserver/
node my_server.js

exit 0

Once you save this file in the location specified before, you should make it executable. I usually run chmod 777, but it is safer to use chmod 755, so that everybody can execute the script but only its owner (root) can modify it.
Once this is done, you just have to update the symbolic links so that the script is executed at startup, and that's it! To update them just run update-rc.d myscript defaults.
To remove it, run update-rc.d -f myscript remove.
And it is as simple as that. I hope I helped somebody with this little trick!

Thursday, April 4, 2013

ZeroMQ on Node.JS and Socket inspection

First of all, I'm sorry for the lack of updates. Lately I've been writing papers and not really coding. Moreover in the last few days I was on vacation, so no computer either.
Anyways. In my project I'm using ZeroMQ which is a very good socket library. I use that to make my workers communicate with each other.

Lately my main concern was message loss. Since I increase and decrease the number of workers, it may happen that some worker gets shut down when it is receiving, processing or sending a message.
The very first approach I had to solve the issue was to save the timestamp when a message was received and then wait some time. If no message was received within that time (time_now - last_received_message > some_variable), I assumed no more messages would ever arrive and shut down the worker. Moreover, a flag tells me whether a message is being processed (that is, when a message is received the flag is set to true, and when the message leaves the worker the flag is set to false).

The problem is that I cannot access the socket's queue to check what is inside and whether I have to wait some more time before shutting the worker down. Eventually I found out about the getsockopt() function and its return values.
Before showing the code, I have to say that this is not a final solution, nor necessarily the right way to do it. As for my sockets, I use PULL and PUSH. This means that each of them can only report two values. For the PULL socket, which is read-only:

0 = nothing to read
1 = have something to read

For the PUSH socket which is write-only they are:
0 = can't write
2 = can write

BUT. (getsockopt(ZMQ_EVENTS) & ZMQ_POLLOUT) > 0 does not mean there are no messages in the queue. It just means that the queue is not full and the socket is ready to accept some more messages for sending. On the other hand, (getsockopt(ZMQ_EVENTS) & ZMQ_POLLIN) == 0 guarantees that the incoming queue is empty. Mind the parentheses: == and > bind more tightly than &.

if(msg.command == 'kill'){
    setInterval(function(){
        var time_now = new Date().getTime();
        //if 10 seconds passed without receiving any message or no message received at all (producer or useless worker)
        if(time_now - last_message_received > 10000 && !execution_flag || !last_message_received || (receiver.getsockopt(zmq.ZMQ_EVENTS) & zmq.ZMQ_POLLIN) === 0 && !execution_flag){
            //shut down the worker
        }
    }, 1000);
}

So basically I set up a timer that fires every second and checks whether 10 seconds have passed without a message while the worker is idle, whether nothing was ever received, or whether the worker is idle AND the POLLIN bit is 0.
I still have to test this approach, but the given values for the bitmasks are correct.
If you have a better idea I'm open to suggestions. For now I think I will keep it this way.
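
Since bitmask tests are easy to get wrong, here is a small sketch that works on plain numbers instead of a real socket. The two constants have the values ZeroMQ defines for these event bits; canRead and canWrite are hypothetical helper names for this illustration.

```javascript
// Event bits as defined by ZeroMQ for the ZMQ_EVENTS socket option.
var ZMQ_POLLIN = 1;  // at least one message can be read
var ZMQ_POLLOUT = 2; // at least one message can be written

// Hypothetical helpers working on a plain number instead of a real socket.
function canRead(events) { return (events & ZMQ_POLLIN) !== 0; }
function canWrite(events) { return (events & ZMQ_POLLOUT) !== 0; }

var events = ZMQ_POLLIN; // pretend the socket reported "something to read"

// Without parentheses, == binds tighter than &, so this evaluates
// events & (ZMQ_POLLIN == 0), that is events & false, which is always 0.
var unparenthesised = events & ZMQ_POLLIN == 0;

// Using | instead of & sets the bit rather than testing it,
// so this comparison can never be true.
var withOr = (events | ZMQ_POLLIN) === 0;

console.log(canRead(events), canWrite(events)); // true false
console.log(unparenthesised, withOr);           // 0 false
```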

Tuesday, March 26, 2013

Citing websites on BibTeX

While writing the paper I'm currently working on, I realised I also needed to cite some web references. As I'm used to BibTeX entries that are papers or books, I didn't really know how to cite websites.
Apparently this is not directly supported by BibTeX, which is strange since the web is growing more and more important, so I looked for another way to cite online resources. There is no agreed-upon way to do it yet, and some tricks use the @MISC type. I will use the @ONLINE type:

@ONLINE{key,
author = {Caio, Tizio},
title = {Tizio Caio Website of important reference},
month = mar,
year = {2013},
url = {}
}

Other ways to work around this include @MISC (as mentioned before), @OTHER, and @BOOKLET. The code is basically the same, and so is the result:

@MISC{key,
author = {Caio, Tizio},
title = {Tizio Caio Website of important reference},
month = mar,
year = {2013},
url = {}
}

@OTHER{key,
author = {Caio, Tizio},
title = {Tizio Caio Website of important reference},
month = mar,
year = {2013},
url = {}
}

@BOOKLET{key,
author = {Caio, Tizio},
title = {Tizio Caio Website of important reference},
month = mar,
year = {2013},
url = {}
}

I hope this is going to be of some kind of help, while waiting for some standard way to do it.

Friday, March 22, 2013

Arduino & Node.JS

Lately I've been experimenting quite a lot with the Raspberry Pi, but I was curious about what Node can do on Arduino. So I got an Arduino and searched for a useful library to run Node on it. I immediately found what I was looking for: Noduino.

The framework is really easy to run: just download the branch, then upload on your Arduino the du.ino file and you are good to go. I tried a very simple example given in the repo: test.walkLED.js and built a very simple circuit:

I think the most important thing was that the setup is very simple: it took me longer to build the circuit than to install things. I won't paste the example code here, but if you are interested I suggest you go and check that particular example out. It shows how simple the code is when dealing with such low-level interfaces.

Monday, March 18, 2013

Cobbler Breakout Kit

When I first ordered my Raspberry Pis, I didn't realize I would need some kind of male-to-female cables to connect the breadboard to the I/O pins on the Pis, which are male. I decided to spend a little bit more and buy the Cobbler Breakout Kits. These kits are very easy to solder and give some help in building circuits: on the board there are all the names of the pins, which correspond to the pins on the Pi. Here is one already soldered (don't mind the bad soldering...):

It comes in 3 parts: the blue one, the black one and some small pins that have to be soldered to the left and to the right of the blue board. First of all one needs to solder the black part, where the ribbon goes, to the board. The soldering is done keeping the board upside down, so that the pins of the black connector are visible. Then I had to solder pin by pin:

Then it comes to the pins on the left and right of the board. To do that I put the pins on the breadboard, then I put on top of them the board and soldered them one by one again:

Here is a top view of the breakout kit with all the names of the pins. As I said, it makes life easier, as you have them right there and you don't have to look up an explanation of the pins.

I had many things to do, so this update came late. I just ordered an analog-to-digital converter. In a following update I will explain how to connect it to the Pi in order to read analog data as digital output.

Monday, March 11, 2013

Raspberry Pi and Button Press Example

Hi there. This post is not going to be about JavaScript; it is centered on the Raspberry Pi and what I did today. As pointed out in an old blog post, I'm working on a piece of software that creates streams of data and passes them through a topology. As a data source, I thought it would be very nice to have a Raspberry Pi which gathers data from the environment, for example.
Today I started to implement something on the RPi. Nothing serious, but I wanted to test the GPIO. First of all I had to install some Python dependencies to read data from it:

$ sudo apt-get update
$ sudo apt-get install python-dev
$ sudo apt-get install python-rpi.gpio

After this, I had to learn a little bit about the GPIO. The General Purpose Input/Output is an interface available on some devices, capable of getting inputs and sending outputs. In the RPi, these are the male pins situated above the RPi logo in the latest devices.
The following schematic shows which pin corresponds to which usage:

Today what I will do is connect a simple button and trigger some event on the RPi. The event will only be a print line saying something. The purpose of this test project was to get my hands dirty a little bit with the RPi by starting with something extremely simple.


- 1 Raspberry Pi
- 1 Ribbon
- 1 Breadboard
- Some wires
- 1 10k Ohm resistor
- 1 Button sensor

How to build

Place the button on the breadboard. On the row of one pin of the button, I had to connect one head of the resistor. The other head had to be connected on another row. On the same row of this head, then, I had to connect the 3v3 pin (pin 1) of the RPi. To do that I had to use the ribbon and then use a wire from the other end of the ribbon.

I have chosen to use the pin called GPIO17 on the image shown. From that pin I connected a wire which was placed on the same row in which one head of the button and one of the resistor were placed. Finally, the ground pin was connected to the other end of the button.


The code is a simple piece of Python that reads the input from pin number 17. Here it is:

import RPi.GPIO as gpio

pin = 17
gpio.setmode(gpio.BCM)  # pin 17 refers to GPIO17 (BCM numbering)
gpio.setup(pin, gpio.IN)

while True:
    state = gpio.input(pin)
    if state == False:
        print('button press')
        # wait until the button is released
        while state == False:
            state = gpio.input(pin)

Be careful and respect indentation as Python will complain otherwise. I have chosen pin 17 but any GPIO pin will work as long as you change it in your code. The code is extremely simple: with a while True it checks if the button has been pressed. If that is the case, then, it will print the message. While the button is pressed it will just fall into the second while statement.

To run the code, just do this

sudo python

And that's it. Here is a picture of what I built.

As you can see, the blue wire is the 3v3 pin, the yellow one is the GPIO17 while the black one is ground.
I hope to build some cooler stuff and show it here.

Tuesday, March 5, 2013

Passing parameters to setInterval & setTimeout

Sometimes we need to play with timed events. JavaScript comes to the rescue giving access to setTimeout and setInterval. The first one fires only once after a certain amount of time passes, while the second one fires every time a given interval passes. For example if we give 1000 as one of the input variables to setInterval, it will fire every second.

One problem that may arise using these functions is when we want to call a function defined somewhere else and pass a parameter to it. If we try to do the following:

var myFun = function(param) {
    //do something
};

var myParam = 'parameter';

setTimeout(myFun(myParam), 1000);

Instead of executing the function after 1000ms, this executes it immediately. This is because JavaScript first evaluates the call myFun(myParam) and then passes its return value as the callback to setTimeout. To avoid this, the usual way is to pass the function itself as a variable, thus omitting the parentheses. This of course leads to one problem: how to pass parameters to the callback function? It is done in the following way:

var myFun = function(param) {
    //do something
};

var myParam = 'parameter';

setTimeout(myFun, 1000, myParam, ...);

The three dots there indicate that every further argument will be passed as an input parameter to the callback function. The same concept applies to the setInterval function.
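
Two other common ways to get the same result are wrapping the call in an anonymous function or using Function.prototype.bind. The sketch below shows the three variants side by side; calls, viaClosure and viaBind are names I made up for the example.

```javascript
var calls = [];
var myFun = function (param) {
    calls.push(param);
};

// 1. Wrap the call in an anonymous function.
var viaClosure = function () { myFun('closure'); };

// 2. Pre-fill the argument with bind.
var viaBind = myFun.bind(null, 'bind');

setTimeout(viaClosure, 10);
setTimeout(viaBind, 10);
// 3. Extra arguments after the delay are passed on to the callback.
setTimeout(myFun, 10, 'extra-arg');

// Nothing has run yet at this point: all three are only scheduled.
console.log(calls.length); // 0
```

All three variants schedule the call instead of executing it immediately, which is the whole point.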

Monday, March 4, 2013

Remote access to file through file editor on ubuntu

While taking some measurements on one of my programs, I noticed that instead of going through FTP every time to get the results, I could just open a text editor. For this I installed gedit on the remote Ubuntu machine I'm working on. I tried the following command:

gedit filename.txt

Cannot open display: 
Run 'gedit --help' to see a full list of available command line options.

As you can see, it produced that error. The problem I had (apparently quite common) was that when I logged on to my remote machine through ssh I forgot to include the correct flag:

ssh -X

Mind the capital X in the flag. I kept logging in with a lowercase x, which of course led to the wrong behaviour.
I'm sorry for the small update, but I noticed that many programmers asked this on StackOverflow, so I thought it may be useful to keep it written somewhere :-).

Saturday, March 2, 2013

Appending text to a file in Node.JS

Recently I've been testing the application I'm working on. Since it is already complicated by itself, I don't want to add database support to it. Instead, I thought about writing data to a file. In the context of my software, I'm sending messages to different processes on remote machines and I want to be sure I'm not losing any of those messages. To check this, when I receive a message I store its ID (every message has one) in a file, appending it to the end. Then, by using the sort bash command, I sort them and see which messages I lost (or simply through wc -l filename I can check that all the messages I sent arrived).
To write to a file in Node.js this is the usual process:

var fs = require('fs');
fs.writeFile('log.txt', 'Hello', 'utf8', function (err) {
    if (err) throw err;
});

But this would overwrite the content of the file and write 'Hello' in it. To append text to a file, Node recently added the following function to its API:

var fs = require('fs');
fs.appendFile('log.txt', 'Hello', 'utf8', function (err) {
    if (err) throw err;
});

As you can see it works exactly like the writeFile function, except that it appends 'Hello' at the end of the file. Moreover, if the file doesn't exist yet, it will create it for you. Pretty useful!
While watching the file fill up with IDs (I kept inspecting it through wc -l filename), I noticed that sometimes it "stalled" on some value, then after a while a bunch of IDs were added at once, then it stalled again, and so on. I was pretty frustrated by this, since I thought it was a problem with my topology. In the end I asked Google about the appendFile function, since I was interested in its performance, and I found out that it is asynchronous. That means the odd behaviour was never an error: the writes were simply completing asynchronously. Thus I searched for a synchronous way to append text to a file, and on the Node API page I found this:

var fs = require('fs');
fs.appendFileSync('log.txt', 'Hello', 'utf8');

As you can see it works a little bit differently from what appendFile does: it doesn't take a callback function anymore. And that's it. Now it appends correctly each time the process receives a message, so I can better monitor what's going on.
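To make the idea concrete, here is a minimal sketch of the logging described above. The file location and the logMessageId helper are assumptions made up for this example, not the code of my application; the last lines mimic what sort and wc -l would report.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Hypothetical log location; a temp directory keeps the sketch self-contained.
var logFile = path.join(os.tmpdir(), 'message-ids-' + process.pid + '.txt');

// Start from an empty file so that repeated runs give the same count.
fs.writeFileSync(logFile, '', 'utf8');

// Called for every received message: the ID is written before the call returns.
function logMessageId(id) {
    fs.appendFileSync(logFile, id + '\n', 'utf8');
}

[3, 1, 2].forEach(logMessageId);

// Rough equivalent of `sort -n` and `wc -l` on the log file.
var ids = fs.readFileSync(logFile, 'utf8').split('\n').filter(Boolean).map(Number);
ids.sort(function (a, b) { return a - b; });
console.log(ids.length + ' IDs logged: ' + ids.join(', '));
```

Keep in mind that the synchronous version blocks the process while it writes, so it is fine for testing but probably not something to leave in a busy server.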

Thursday, February 21, 2013

Dynamically resizing Google Chrome Extensions

Lately I've been working on an extension for Google Chrome to read the latest entry of the famous xkcd comic. At a certain point I stopped because I didn't understand how to dynamically resize the popup window of the Chrome extension. It took quite some research to understand what was going wrong. Here is how I created the extension, which problem I had and how I solved it.

How I did it

The extension is very simple: it executes a GET request to the xkcd website, gets the content of the index page and returns it. What I used to do, for simplicity, was to get the content, paste it into a hidden div, get the image of the comic of the day (through a getElementById call) and put it in a visible div.

The Problem

What one would expect is that the popup page of the extension notices the change of content and resizes itself to fit the new content. But this doesn't happen. The popup page remains as small as it can get without styles, or keeps a fixed width/height if one is specified in the CSS file. I didn't want a fixed width/height, as comics usually don't have a fixed size (sometimes they are as small as a strip, sometimes they are quite big).
Since the popup doesn't resize, I thought about resizing it using JavaScript:

var width =  document.getElementById("comic").offsetWidth + 10;
var height = document.getElementById("comic").offsetHeight + 10;

//[...]
document.getElementsByTagName("html")[0].style.width = width + "px";
document.getElementsByTagName("html")[0].style.height = height + "px";

Notice that I get the width and the height of the image and add a small offset to make it look nicer. The fact is that this approach wasn't working either. When opening the extension, one would see it flicker: first a frame of the correct size, and then a resize to the default 10 by 10 pixel frame.

The Solution

The solution was quite simple, but without somebody mentioning it on StackOverflow I would still be stuck. The fact is that the popup of a Chrome extension doesn't resize if there is a hidden element in the HTML tree. That is, the whole page I was hiding was preventing me from resizing the popup frame. So the solution is:

document.getElementById("hiddenDiv").innerHTML = "";

And that solved every problem. The frame now is the correct size and I can read every two/three days the new xkcd comic.

Friday, February 15, 2013

Newline after every character

I was printing some results for a chart when, after 10 minutes of executing the test, I realised I forgot to put the \n character after the number I wanted to print. I knew the numbers I was printing were smaller than 10, so I came up with a useful command to append a new line after each character in a file:

sed 's/\(.\)/\1\n/g' -i filename

And that's it. As easy as that. I'm sorry for the lack of updates but lately I've been very busy.
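Since this blog is mostly about JavaScript, here is a rough equivalent of that one-liner as a function. The name newlineAfterEveryChar is made up for this sketch, and it works on a string in memory rather than editing a file in place:

```javascript
// Same idea as the sed one-liner: put a newline after every character.
// `.` does not match line breaks, so existing newlines are left alone.
function newlineAfterEveryChar(text) {
    return text.replace(/(.)/g, '$1\n');
}

console.log(JSON.stringify(newlineAfterEveryChar('4712'))); // "4\n7\n1\n2\n"
```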

Wednesday, February 13, 2013

Removing an event listener from an element

Today at work I was asked how to remove an event listener from an element which had a click event. At first I didn't really remember how to do it, so I Googled a bit and I found this way:

document.getElementById("your_element_id").addEventListener('click', function(){
    //do something
    this.removeEventListener('click', arguments.callee, false);
}, false);

As you can see, all the work is done by the addEventListener and removeEventListener functions. The first one takes as parameters the event (click in this case) and a function that tells the event listener what it has to do with that event. The third argument specifies whether the listener is registered as a capturing listener or not (false by default).

The function removeEventListener takes basically the same input parameters. Called on the element that has to drop the listener, it specifies which event (click), which function (in this case I use arguments.callee, as callee is a property of the arguments object that refers to the currently executing function), and a third argument that specifies whether the listener was registered as a capturing listener or not.
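One caveat: arguments.callee is not allowed in strict mode, so a named function is the safer way to let a listener remove itself. The following sketch assumes an environment where EventTarget and Event exist as globals (browsers and recent Node.js versions); the 'ping' event and the onPing name are invented for the example, and the third argument is left at its default.

```javascript
// A plain EventTarget stands in for a DOM element here.
var target = new EventTarget();
var calls = 0;

// A named function can refer to itself, so arguments.callee is not needed.
function onPing() {
    calls += 1;
    target.removeEventListener('ping', onPing);
}

target.addEventListener('ping', onPing);
target.dispatchEvent(new Event('ping'));
target.dispatchEvent(new Event('ping')); // the listener is already gone
console.log(calls); // 1
```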

Of course there are simpler ways to do this. It turned out that the person who asked me how to remove an event listener from an element was using jQuery. In that case the code looks even simpler:

var handler = function() {
   //do something
};

$('#foo').unbind();
$('#foo').unbind('click');
$('#foo').unbind('click', handler);
$('#foo').unbind('click', handler, false);

As in the previous example, up to three parameters are involved here, and jQuery is smart enough to accept all of these forms. The first one will remove every listener bound to the element. The second one removes every function bound to that event. The third one will remove that particular function bound to that particular event; note that handler must be the very same function that was bound, so a freshly written anonymous function would not match anything. The fourth one just spells out the third argument, which by default is false.
And that's it. Pretty easy.

Tuesday, February 12, 2013

setInterval/setTimeout and this

I think everybody is familiar with the setInterval and the setTimeout functions. They create a timed event which will occur after a given timeout. setInterval creates an interval: after the given time it will repeat the same action over and over, while the setTimeout function will only execute the action once. Let's see a small example on how they work:

var timed_function = function(){
    //do something
};

var timeout = setInterval(timed_function, 1000);

The shown example will execute every second (1000 milliseconds) the function timed_function.

Now, I recently worked with a timed event and I had to access a variable within this. The naive approach was the following:

//previously set variable
this.message = "hello!";

var timed_function = function(){
    console.log(this.message);
};

var timeout = setInterval(timed_function, 1000);

But this printed undefined. I then discovered that when a function is invoked by setInterval or setTimeout, this refers to the window object, so it completely loses the object in which the function was set up. The idea to solve this problem is to wrap the call inside another function, which has to be called to initialise the timeout:

//previously set variable
this.message = "hello!";

var wrapper = function(that) {
    var timed_function = function(){
        console.log(that.message);
    };

    var timeout = setInterval(timed_function, 1000);
};

wrapper(this);

In this way I basically instantiate the timeout through the wrapper function, which keeps a reference to this thanks to its input variable. The input variable is then used inside timed_function. Hope this helped somebody. I had a couple of issues trying to solve this because I was confused by the scope of the variables. Some folks helped me on StackOverflow luckily, even though some of them were short-sighted and arrogant as always.
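For completeness, here is a small sketch that compares the naive callback with two ways of keeping the reference: a closure (the same trick as the wrapper) and Function.prototype.bind. The Greeter object and its method names are invented for this example. The callbacks are returned instead of being handed to a timer so that they can be tried out directly.

```javascript
function Greeter(message) {
    this.message = message;
}

// Naive: when invoked as a plain callback, `this` is not the Greeter instance.
Greeter.prototype.naiveCallback = function () {
    return function () {
        return this && this.message;
    };
};

// Closure: capture the instance in a variable, like the wrapper does.
Greeter.prototype.closureCallback = function () {
    var that = this;
    return function () {
        return that.message;
    };
};

// bind: fix `this` for the callback once and for all.
Greeter.prototype.boundCallback = function () {
    return function () {
        return this.message;
    }.bind(this);
};

var greeter = new Greeter('hello!');
console.log(greeter.naiveCallback()());   // undefined
console.log(greeter.closureCallback()()); // hello!
console.log(greeter.boundCallback()());   // hello!
```

Any of the last two callbacks can be passed to setInterval or setTimeout and will still see the message.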

Friday, February 8, 2013

Node.JS on Raspberry Pi with OSX, part 3

In this final part I will show how I installed Node.JS on the Raspberry Pi on the Raspbian Wheezy OS.
First of all, open a terminal and execute the following commands:

cd /usr/src
sudo wget

This will change the directory to /usr/src/ and download Node.JS version 0.8.16. I know this is not the latest version, but I needed it because my project runs on it and it's not yet time to upgrade (even though I'm pretty sure everything is going to be all right).
After the download is completed, extract the tarball:

sudo tar xvzf node-v0.8.16.tar.gz
cd node-v0.8.16

After this you will be in the Node folder. You should then execute the following three commands to build Node.

sudo ./configure
sudo make
sudo make install

This process requires quite some time to complete (especially make). I suggest you don't waste time waiting for it to complete and do something else. The Raspberry Pi is not going to break if it stays connected for a long time.
After the commands are finished, Node.JS is ready to be used! You can either run some tests with the following command:

sudo make test

Or just start writing your own web server.
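As a starting point, a minimal web server could look like the following sketch. The greeting text, the startServer helper and port 8080 are arbitrary choices made for this example; only the http module that ships with Node is used.

```javascript
var http = require('http');

// Answers every request with a plain-text greeting.
function handleRequest(req, res) {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end('Hello from the Raspberry Pi!\n');
}

// Hypothetical helper: creates the server and starts listening on the given port.
function startServer(port) {
    var server = http.createServer(handleRequest);
    server.listen(port, function () {
        console.log('Server listening on port ' + port);
    });
    return server;
}

// Uncomment to actually start the server:
// startServer(8080);
```

Once started, pointing a browser on the same network to port 8080 of the Raspberry Pi should show the greeting.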

Wednesday, February 6, 2013

Node.JS on Raspberry Pi with OSX, part 2

So after you plug everything in (the power supply last), you should see Raspbian Wheezy booting on the screen. Lots of logs in the console and finally what looks like a BIOS screen:

If you don't, then you have the problem I had. In the past hours I struggled a lot because I couldn't see anything on the screen. I tried reinstalling the OS image on the SD card (in at least three different ways) and plugging in different devices, but nothing happened. In the end the problem was indeed the screen (and not the OS as I was supposing). Basically I was using a DELL screen which had a DVI input, and I was using an HDMI-DVI cable. Apparently the Raspberry Pi doesn't like that kind of screen input. By chance I found a screen with an HDMI input and I could test the Raspberry Pi on it, and it worked!
If you have a different problem I suggest you try asking on the official forum; there are some helpful people over there.

Anyways, it may also happen that you see something, but not that screen. If that's the case try running the following command:

sudo raspi-config

Then I suggest you set up the keyboard through configure_keyboard and set the timezone through change_timezone. If you happen to have a fairly big SD card (more than 4GB at least), then I suggest you also check out expand_rootfs to enable all that space. Also, remember to set up ssh when setting the keyboard by running

sudo setupcon

Then you are good to go! Just hit <Finish> and enjoy your brand new Raspberry Pi. In the next (and hopefully last) post I will show how to install Node.JS and some other useful packages.

Tuesday, February 5, 2013

Node.JS on Raspberry Pi with OSX, part 1

These days I borrowed a Raspberry Pi. I would like to try to set up a Node.js server on it. I decided to write this post to show how I did the whole thing. Just to make things clear, I'm running on OSX.

So first of all, once you have your Raspberry Pi, you should write the OS image to the SD card. Some RPis come with an SD card with everything set up. If that's the case, you are lucky and can skip this post. Otherwise, just connect the SD card to your machine and write the OS image to it. To do so you first need to download the image from here. I'm using Raspbian Wheezy, which is an optimised version of Debian for the RPi. After having downloaded the image, unzip it and download this app. Unzip it and open it. It will ask you to locate the image of the OS first, then you will have to choose the SD card on which you want to write the image, and finally enter your password. After it is done, you can remove the SD card and plug it into your RPi. The following image shows how to set up the wiring for your RPi.

Now you have everything ready to start programming on your RPi. In the next post (hopefully) I will show how to set everything up at a software level to start your own Node.js server on it.

Monday, February 4, 2013

Measuring performance of a processor with Node.js

Some time ago I needed to measure the performance of the cores I was using on a machine while a Node.js server was running. I came up with different ideas, including writing a bash script that would measure it the hardcore way. I then realised I could use the os module that comes with Node and use the data it returns to measure it myself.

First of all, take a look at the specification of the method os.cpus() here. It shows how to call it and what the return value is. To measure the performance of the system, it is sufficient to take consecutive snapshots, compute a delta, sum all the data and compute the usage percentage like so:

//include the module
var os = require('os');

//data collected on the previous tick
var previous_cpu_usage;

//[...] collect previous_cpu_usage

//get the cpu data
var cpu_usage = os.cpus();

//iterate through it
for(var index = 0, len = cpu_usage.length; index < len; index++) {
    var cpu = cpu_usage[index], total = 0;
    var prev_cpu = previous_cpu_usage[index];
    //for each type compute delta
    for(var type in cpu.times){
        total += cpu.times[type] - prev_cpu.times[type];
    }
    //print it for now
    for(type in cpu.times){
        if(type === "user" || type === "idle")
            console.log("\t", type, Math.round(100 * (cpu.times[type] - prev_cpu.times[type])/ total));
    }
}

The idea is very simple. Here I'm not showing how to collect data consecutively. This can be done by means of two timeouts: the first one collects the first snapshot, the second one collects the second snapshot, and in the callback of the second timeout the data gets aggregated. I'm not showing the code exactly as it looks in my solution, as it involves message passing through a (big) topology.

In the example shown, I'm printing the data that corresponds to user and idle time, since it's the one most people may be interested in. Anyways, with this code any percentage out of the returned data can be computed.
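To show the whole cycle in one piece, here is a self-contained sketch. It is a simplification and not the code from my topology: the helper names cpuUsagePercent and sampleCpuUsage are made up, the first snapshot is taken immediately instead of in a first timeout, and the 200 ms delay is an arbitrary choice.

```javascript
var os = require('os');

// Turns two os.cpus() snapshots into per-core percentages for every time type.
function cpuUsagePercent(previous, current) {
    return current.map(function (cpu, index) {
        var prev = previous[index].times;
        var total = 0;
        var type;
        for (type in cpu.times) {
            total += cpu.times[type] - prev[type];
        }
        var result = {};
        for (type in cpu.times) {
            // Guard against a zero delta when the snapshots are too close together.
            result[type] = total === 0
                ? 0
                : Math.round(100 * (cpu.times[type] - prev[type]) / total);
        }
        return result;
    });
}

// Takes one snapshot now and a second one after delayMs, then reports the usage.
function sampleCpuUsage(delayMs, callback) {
    var first = os.cpus();
    setTimeout(function () {
        callback(cpuUsagePercent(first, os.cpus()));
    }, delayMs);
}

sampleCpuUsage(200, function (usage) {
    usage.forEach(function (core, index) {
        console.log('core', index, 'user', core.user + '%', 'idle', core.idle + '%');
    });
});
```

Keeping the computation in a function that only takes two snapshots makes it easy to check with hand-written numbers, independently of the machine it runs on.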

Sunday, February 3, 2013

Bitwise operations

Some time ago I was talking with a second-year student after a C exam. He was pretty satisfied about the job done, but asked me a question about the ~ operator. At first I didn't quite remember, so I told him that was probably the bitwise not. Later I discovered I was right, so I took the opportunity to refresh my knowledge of the bitwise operations in JavaScript. The purpose of this article is to show which bitwise operations JavaScript offers.

First of all, we can only apply bitwise operations to numeric operands that have integer values. These integers are treated by the operation as 32-bit integers instead of the equivalent floating-point representation. Some of these operations perform Boolean algebra on the single bits, while the others are used to shift bits.

Bitwise NOT ( ~ )

This operator simply flips all the bits of the given integer. In numeric terms, it changes the sign of the number and subtracts 1. For example ~0x0000000f becomes the bit pattern 0xfffffff0 after the evaluation, which JavaScript reports as -16.

Bitwise AND ( & )

Performs a bitwise Boolean AND operation and sets a bit in the result only if it is set in both operands. For example 0xff666fff & 0x00fff000 will evaluate to 0x00666000.

Bitwise OR ( | )

The bitwise OR behaves like the bitwise AND, but a bit is set in the result if it is set in one or both of the operands.

Bitwise XOR ( ^ )

The bitwise exclusive OR behaves like a normal XOR. A bit in the result is set only if it is set in the first operand and not in the second, or vice versa. If it is set in both or in neither, the bit in the result is set to 0.

Shift Right ( >> )

This operator moves all the bits in the first operand to the right by the number of bits specified in the second operand (up to 31). The bits that are shifted off the right are lost, and the filling on the left is given by the initial sign of the first operand (if the operand is positive, it will be zero-filled, otherwise filled with ones).

Shift Right with zero Fill ( >>> )

Behaves exactly like the shift right operator, but the filling on the left is always 0, regardless of the sign of the first operand at the beginning of the operation.

Shift Left ( << )

This operator moves all the bits in the first operand to the left by the number of bits specified in the second operand (again, up to 31). The rightmost part of the number is filled with 0s.

These operations are not used very often by JavaScript programmers, but it's important to know they exist.
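To see the operators at work, here is a short sketch with one example per operator. The variable names are arbitrary; the hexadecimal strings are obtained with toString(16), and >>> 0 is used to view a negative result as an unsigned 32-bit pattern.

```javascript
var notResult = ~0x0000000f;                 // -16
var notPattern = (notResult >>> 0).toString(16); // "fffffff0"
var andResult = 0xff666fff & 0x00fff000;     // 0x00666000
var orResult = 0x0f | 0xf0;                  // 0xff
var xorResult = 0x0f ^ 0xff;                 // 0xf0
var shiftRight = -16 >> 2;                   // -4 (sign is preserved)
var shiftRightZero = -16 >>> 2;              // 1073741820 (zero-filled)
var shiftLeft = 1 << 4;                      // 16

console.log(notResult, notPattern);
console.log(andResult.toString(16), orResult.toString(16), xorResult.toString(16));
console.log(shiftRight, shiftRightZero, shiftLeft);
```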

Saturday, February 2, 2013

instanceof & typeof Operators

Sometimes, even if it is considered a bad habit, we need to check the class of an object in JavaScript. Classes may be defined by the programmer, but I will cover this topic later. For now let's just focus on classes defined by the interpreter. There are many, for example:
  • Object
  • String
  • Number
  • Date
  • Function
  • ...
To check if an object is an instance of a specific class, JavaScript offers the instanceof operator, which may be used in the following way:

var mynmbr = new Number(1);
var mystrng = new String("disasterjs!");
var myfunc = function(){
    console.log("hello world!");
};
mynmbr instanceof Number //returns true
mynmbr instanceof Object //returns true
mystrng instanceof String //returns true
mystrng instanceof Object //returns true
mynmbr instanceof String //returns false
myfunc instanceof Function //returns true

Be careful about what you are checking when using instanceof. For example, numbers and strings not created through the constructor are treated as primitive values, thus instanceof in that case produces a possibly unexpected result:

var mystrng = "disasterjs!";
var mynmbr = 1;

mynmbr instanceof Number //returns false
mynmbr instanceof Object //returns false
mystrng instanceof String //returns false
mystrng instanceof Object //returns false

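What about the typeof operator? This operator returns the type of its operand as a string. The following table summarises its possible return values:

Type of the operand                              Result of typeof
Undefined                                        "undefined"
Null                                             "object"
Boolean                                          "boolean"
Number                                           "number"
String                                           "string"
Function object                                  "function"
Host object (provided by the JS environment)     Implementation-dependent
E4X XML object                                   "xml"
Any other object                                 "object"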

Programmers must be careful here when checking null, as typeof null returns "object". Here are a couple of examples on how to use typeof.

typeof 1 === "number" //true
typeof Number(1) === "number" //true
typeof 1 === "object" //false
typeof undefined_variable === "undefined" //true
typeof true === "boolean" //true
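Because typeof reports "object" for null, arrays and wrapped primitives alike, a small helper can make checks more explicit. The describeType function below is a hypothetical helper written for this post, not something built into the language:

```javascript
// Hypothetical helper: a slightly more precise type name than typeof alone.
function describeType(value) {
    if (value === null) return 'null';         // typeof null is "object"
    if (Array.isArray(value)) return 'array';  // typeof [] is "object"
    return typeof value;
}

console.log(typeof null);                   // object
console.log(typeof new Number(1));          // object
console.log(describeType(null));            // null
console.log(describeType([1, 2, 3]));       // array
console.log(describeType('disasterjs!'));   // string
console.log(describeType(function () {}));  // function
```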

And that's it.

Friday, February 1, 2013

JavaScript, passing-by-reference and slice()

Some time ago I had a problem with arrays and assignments. I was working with a distributed system in JavaScript; the root process sent some "work" to do to some subprocesses (workers) and those returned an array as result, which was modified inside the callback of the dispatching function.

The issue was that after modifying the array, I assigned it to a variable and passed it to a function. While I was working with this array in the callback, it somehow got modified. I later understood that this happens because arrays in JavaScript are objects, and a variable only holds a reference to them. That is, assigning the array to the variable before passing it to the function simply saved its reference in the variable, without really duplicating it.

The following example will show the problem.

var arr = [1, 2, 3]

var arr2 = arr;

arr2[0] = 0;

console.log(arr[0]);
//prints 0

As shown, this prints zero. To solve the issue I first used a custom procedure to copy every single value of the array into a new one. This is pretty painful, and later I discovered I could simply use the slice() function, which returns a new array containing the same elements. The following snippet, based on the previous example, shows the black magic:

var arr = [1, 2, 3]

var arr2 = arr.slice();

arr2[0] = 0;

console.log(arr[0]);
//prints 1

Basically that's it. It's a very naive problem, and a skilled programmer may quickly find the issue, but knowing the existence of the slice() function may save quite some time.
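One thing worth knowing is that slice() makes a shallow copy: the new array is independent, but nested arrays or objects inside it are still shared. A short sketch (the variable names are arbitrary):

```javascript
var matrix = [[1, 2], [3, 4]];
var copy = matrix.slice();

copy[0] = [9, 9];  // replaces an element of the copy only
copy[1][0] = 0;    // the nested array is still shared with the original

console.log(matrix[0][0]); // 1
console.log(matrix[1][0]); // 0
```

For arrays of numbers or strings, like the one in my example, this makes no difference and slice() is all that is needed.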

How to syntax-highlight code on Blogger

One of the first issues I had with this blog was how to syntax-highlight code snippets. All I could find on the internet was garbage and old tutorials with the previous Blogger interface. Then I finally found this blog that proposed a working solution. Actually, it's the only solution that worked, so I'm suggesting it here, but I'm not entirely satisfied with the CSS that comes with it, so I might slightly modify it sooner or later.
If somebody knows a better solution, let me know in the comments, I'll be glad to take a look at it.

Thursday, January 31, 2013

Hello World!

console.log("I'm opening this blog to talk about JavaScript and things I come around with it.");
console.log("Updates are not going to be very frequent.");