I’ve been using this python implementation of simulated annealing to deduce Singapore bus service from a database of bus stops (see previous posts about Google Map API).

I wanted to play a little bit more with this algo and decided to port it to OCaml: you can find the code on GitHub.

As a first try, I wrote a direct code translation. Only few points differ:

  • The OCaml code use a stream implementation (lazy list) … (mostly equivalent to my previous Erlang implementation).
  • The OCaml code uses array structure where Python uses list.
  • The OCaml implementation of function reversed_section do only one array reverse in all cases.

Result is quite ok. Without any optimization on the algo, the OCaml native code performed quite faster (around 10 times faster), meaning that solving the TSP problem for SG bus services took 40 minutes for OCaml version where the Python code ran for hours (of course we could improve this version too).

The algo itself can be improve (I guestimate the time for SG map problem can be reduce by one more order of magnitude with the same simulated annealing approach).

But in coming post (code is there, need time to clean it), I’ll be looking at other interesting problems I found inside this one, especially on how to walk in solutions space in quasi-random order.

This post is a following to the previous post about drawing SG bus route with google map API.

Status of the system

This is for now only a toy demo, all in a single javascript document that draw the map when loaded (the javascript file is generated from a KML file by a python program that also computes the bus service routes).

Toward drawing all the bus services …

Drawing all the bus services on the same map not only is *very heavy* for the browser (at least for firefox, and its SVG renderer) but also give very little information (it is too much data at one time).

See by yourself:

Even zooming (on Dhoby Ghaut station):

What could be done …

  • To solve the problem of too many services displayed at same time, definitively a good UI could improve usability.
    What I have in mind is a kind of UI which enable to select which services to display … but maybe something better can be done like auto-selection of services to display depending of zoom factor or user itinary?
  • The previous point could also help to not send all the data to the client (in order to save bandwith).
  • Automatic route simplification: on one route, when adjacent stops are “too close” to each other, we could remove one of them on the drawing (which would make it lighter, and also remove some artifact like tiny wigzag caused by stop on the opposite side of the road).
  • Adding some constraints manually (editing few distance between some stops) to correct some wrong path.

More?

As in the previous post conclusions, I think a mashup of existing itinery service to display the itinery on a map could be quite interesting.

Playing with the DB of SG bus stop locations and google map API.

My goal is to try to draw the route of the bus services … but the DB has no info about the order of the stops per service (there is info of which services stop for each bus stop). Following are some experiments, with display about bus service 124.

First try was kind of funny (not unexpected!) as I draw the bus route “as stops are found in DB” … which apparently is from left to right:

So, to get a better idea of the bus route, I’m trying to draw the shortest path … better and even quite enough:

But of course this is inexact map as:

  1. this shortest route uses all bus stops without taking care of which ones are on the way to go or the way to go back (this data is not in my DB);
  2. the length of the path does not depend of the road … those are only straight lines!

But anyway, this give a better view of the bus service than nothing or a guide book of bus stop names …

New example in ocamerl lib of erlocaml project: a tiny map-reduce-like (1) word-count program (word-count is the simple example often used to illustrate map-reduce principle).

First part of this example is an ocaml distributed erlang node [l52]. It is composed of one permanent registered (’wc’) mbox [l40] which upon request create other mboxes implementing the mappers [l19] (on request, a mapper read a text file [l8] and send it word by word to reducer [l29]).

Second part of this map-reduce example is an erlang module implementing two functionalities. One of them is the reduce part implementation [l34] done by a process which accept messages from mappers and update a dictionary accordingly.
Second responsibility of this module is orchestration of the map-reduce flow [l90], consisting of running all mappers [l66], waiting end of all those executions [l79], and collecting result of all reducers [l95].

Assuming EPMD is running, this script run an ocaml node and an erlang node, and run a simple map-reduce to count words in two test files.

As usual, not sure it’s a useful example, but it was sure fun to write :)

(1) this is not a map-reduce lib … there is no failure case taken care of, processes pipe is mostly done through messages over network (while map-reduce is mainly meant to use an efficient filesystem based communication), etc!

Little intro step-by-step to the ocaml-erlang message exchange mechanism offered by ocamerl (a lib of erlocaml).

Subject of the following code is a wonderfull application … named “echo server”!

  1. In the interpretor shell of ocaml, we create a hidden erlang node and an activity to echo all received message (both on console as well as in respond to the caller).
  2. In the erlang interpretor shell, we send few message and compare them to the received ones.

The following assumes code is running on machine “comp1″.

Ocaml shell:


(* getting Ocamerl available in toplevel *)
ocaml> #use "topfind";; (* ocamerl used findlib for install *)
ocaml> #thread;; (* ocamerl is multithreaded app *)
ocaml> #require "ocamerl";;

(* creating/declaring a node, assuming epmd is running *)
ocaml> let o1 = Ocamerl.Enode.run "o1" ~cookie:"cookie";;
val o1 : Ocamerl.Enode.t = <abstr>

(* creating/declaring a mbox, equivalent of erlang process *)
ocaml> let m = Ocamerl.Enode.create_mbox o1;;
val m : Ocamerl.Enode.Mbox.t = <abstr>
ocaml> Ocamerl.Enode.register_mbox o1 m "echoer";; (* give it a name *)
- : unit = ()

ocaml> Ocamerl.Enode.Mbox.create_activity m (fun msg -> match msg with
    | Ocamerl.Eterm.ET_tuple [|pid; any;|] ->
        Printf.eprintf "MSG:%s\n%!" (Ocamerl.Eterm.to_string msg);
        Ocamerl.Enode.send o1 pid any
    | _ ->
        () (* drop unexpected msg *)
);;
- : unit = ()

Erlang shell:


# starting erlang node with same cookie
erl -sname e1 -setcookie cookie

% check connection
erl> pong = net_adm:ping(o1@comp1).
pong

% utility to print whatever is in message queue
erl> F = fun() -> receive W -> io:format("got back: ~w~n", [W]) after 1 -> error end end.
#Fun<erl_eval.20.67289768>

% some tests ... send data, received it back
erl> {echo1, o1@comp1} ! {self(), {1,2,"test"}}.
{<0.37.0>,{1,2,"test"}}
erl> F().
got back: {1,2,[116,101,115,116]}
ok
% in the mean time, ocaml shell also display the data

That’s it! A wonderfull echo server :)

Amongst things on the to-do list:

  • Should not have to create a mbox and set its activity separately (need some wrapper)
  • Could have an onode ocaml toplevel which run one node by default and offer direct interface (e.g. “send”).

Yet another useless example for erlocaml, but this time it is not completely silly … it does something :)

In fact the project eocarve uses:

Idea is simple: an ocaml node run and provide a API for Erlang node to call the seamcarving library. See eocarve wiki for details/example.

Aim of this project was mainly to demonstrate use of ocamerl lib … however it may be usefull for an Erlang web app which would need seamcarving (heavy weight for CPU!). Had fun to integrate those lib in same app.

Minor updates for erlocaml (more exactly on ocamerl) … mostly code cleaning and minor refactoring!

Only one new feature: ability for Erlang processes to send message to unregistered ocaml processes. An example of that is “ex_node_mult” which generalize “ex_node_double” by dynamically creating ocaml process to perform a multiplication … yep I know, not very useful


% erlang code using the ocaml node
{byn, OcamlNode} ! {self(), 2},
By2 = receive M -> M after 1 -> error,
{byn, OcamlNode} ! {self(), 3},
By3 = receive M -> M after 1 -> error,
P1 = make_ref(),
P2 = make_ref(),
By2 ! {self(), P1, 21},
By3 ! {self(), P1, 21},
ok = receive {P1, 42} -> ok after 1 -> error end,
ok = receive {P2, 63} -> ok after 1 -> error end

With this feature, ocamerl begin to be usable … and that’s exactly what I will be doing next: experiment with more useful (at least less silly) examples! Some ideas are:

  • tiny mapreduce example with ocaml workers (erlang doing the network part);
  • using the ocaml image processing lib from erlang (not best example as lot of data need to be exchange … this is task to be solved for future erlocaml development);
  • others???

Google for it, Euler problem #1 has got a lot of solution implementations. Most of them use similar algo: produce the list of all numbers in the range [0-1000], filter only number divisible by 3 or 5, and compute the sum. This is perfectly fine, and it’s wonderfull to code when your language allow some kind of lazy list (default in Haskell, or often called stream library in other language).

But can we construct the sequence of “number divisible by 3 and 5″ directly?

Let’s have a look at what this sequence looks like:


m = 42
l = sorted(list(set(range(0, m, 3) + range(0, m, 5))))
l
[0, 3, 5, 6, 9, 10, 12, 15, 18, 20, 21, 24, 25, 27, 30, 33, 35, 36, 39, 40]

Well … cannot see a pattern … let’s look at the sequence of difference between 2 consecutive numbers:


map(lambda (x,y): y-x, zip(l, l[1:]))
[3, 2, 1, 3, 1, 2, 3, 3, 2, 1, 3, 1, 2, 3, 3, 2, 1, 3, 1]

Here we go … I have no idea why, but seems that we can construct the sequence using the list of increment 3,2,1,3,1,2,3 (same with m=300).

And then, here come a different python solution:


from itertools import cycle, takewhile

def succ(b, inc):
  while True:
    b += inc.next()
    yield b

sum(takewhile(lambda x: x < 1000, succ(0, cycle([3,2,1,3,1,2,3]))))

By the way, there must be a mathematical solution to this first problem, some kind of magic formula … any pointer?

Was inspired by Rober Pickering post about a post about Euler problem 14 …

Without changing the algorithm, I ran a simple test to compare F#, ocaml byte code and ocaml native code … and the result is quite surprising, for me at least!

The program do a lot of computation on integer (even 64 bits); I remember (no ref) that ocaml is more efficient for floating point operation that for integer. But anyway, just wanted an idea of how to port this code to Ocaml.

Following is the code in F# as seen on cited post:


#light
let rec seq_length x n =
    match x with
    | x when x = 0L -> (n + 1)
    | x when x = 1L -> seq_length 0L (n + 1)
    | x when x%2L=0L ->seq_length (x/2L) (n + 1)
    | _ -> seq_length (3L*x + 1L) (n + 1)

let rec loop i imax n =
  let n' = seq_length i 0
  let imax, n = if n' > n then i, n' else imax, n
  if i < 1000000L then loop (i + 1L) imax n else imax

print_any (loop 1L 0L 0)

And here is an Ocaml version:


let ( ** ) = Int64.mul
let ( // ) = Int64.div
let ( %% ) = Int64.rem 

let rec seq_length x n =
    match x with
    | 0L -> (n + 1)
    | 1L -> seq_length 0L (n + 1)
    | x when x %% 2L = 0L -> seq_length (x // 2L) (n + 1)
    | _ -> seq_length (Int64.succ (3L ** x)) (n + 1)

let rec loop i imax n =
  let n' = seq_length i 0 in
  let imax, n = if n' > n then (i, n' ;) else (imax, n) in
  if i < 1000000L then loop (Int64.succ i) imax n else imax

let _ = print_string (Int64.to_string (loop 1L 0L 0))

Surprisingly (for me!), the compiled F# code is much faster:

  • time mono eul14.exe: 12.979s
  • time ./eul14.byte: 58.145s
  • time ./eul14.native: 30.234s

I’m not sure why Ocaml seems to be slower for this test. Can it be a GC issue?

I was playing with this mini raytracer and modify it to use glcaml instead of glut.

I was delighted to see how the native code is faster than the byte compiled one … of course this is expected but after playing for some time with the byte code I was surprise how fast the result came after the first native compilation :)

(below is google graph; time unit is second)

I’ve been writing here some ideas or development status.

But looks like my readers are more oriented towards code examples or solutions … so I’ll try to be more pragmatic in future.

Anyway my code is better than my English (at least I hope so!) so that should improve quality of this blog ;)

As exercise of Ocaml, I wrote a very very simple library for inference using fuzzy logic.

There is only the bare minimum … but it’s often enough anyway ;). If someone is interested I can:

  • write a small doc
  • write a example
  • complete some parts

This is my first project on GitHub.

Quick news about ocamerl:

  • I changed the build system to ocamlbuild (need ocaml 3.10).
  • … and that’s all! can’t find time to clean all the mess.

So anyway, here is the current state of ocamerl:

  • ocaml can register a hidden erlang node (epmd).
  • ocaml node can respond to ping and keep connection up with erlang node (net_kernel process)
  • erlang can send (some) terms to named ocaml process
  • ocaml can send (some) terms to erlang process if it received the pid in a previous message

That’s all for now! Argh!

Amongst things I’d like to change:

  • simplify terms manipulation in ocaml
  • add features (most important initial message from ocaml to erlang)
  • complete redesign of concurrency (using events or JoCaml maybe!)
  • add a minimum of documentation

I need motivation! In fact I have no project which would need ocaml + erlang … for now.

How could an Erlang process communicate with an ICE-based application? The mix seems a bit strange (why would you need ICE when you already have Erlang ;) ) but without entering in details of why we would like to do so, let’s consider how can we plug an Erlang sub-system in a ICE application.

One way to go would be to write a slice2erl compiler! That’s presumably lot of work. This kind of work is being done for OCaml in the Hydro project … and is on my list of next things to play with.

Another way to get Erlang talk to ICE application would be via a custom intermediate translator between the Erlang app and the ICE system (using some intermediate language having a complete interface with both Erlang and ICE, like C, Java, or Python); yes, this surely add complexity and slow down the full system, but efficiency of distributed application is more often bound by design than by runtime ;).

Btw, as far as ease-of-development is concern, I have the same kind of feeling when comparing C++ to Python or ICE to Erlang :)

Having used different kinds of language (C++, python, ruby, ocaml, erlang), I still can’t choose what kind of typing I prefer as far as static vs dynamic typing is concerned (I will not discuss the strong vs duck typing).

I’m not convince that a static type system is a useful “bug avoidance system” (for my kind of development habits at least!). Not that it cannot prevent some bugs, but those are probably the “easy bugs” that would have show-up in unit-tests … and this is exactly the point: unit-tests are anyway necessary to test the functional aspect, so type kind of bug will be found in either kind of development environment.

Having written lot of code in python and C++, I sure better love python: I will not repeat here why language like python or ruby are more easy to read, more concise and so on …

However in team environment, a type is a first documentation feature! In a clear API, the type information is nearly enough to use the API (most python API documentation define all types of the parameters …).

Also with my last experiences with ocaml, I discover that having to define the type oblige to think more about what is the data. When thinking of the type of a data, you end-up asking you question like: what is it that I want to manipulate? if it should have this and that method, is it an application object? is it a subcomponent of my current object? is it a facet/interface of my object? can I abstract this “thing”? … all that is a good point in favor of strongly typed language at compile time as it help to concentrate on defining data!

Next Page »