Samson's machete

A preview of Guile-Lua (rebirth), and some opinions.

Posted by NalaGinrut on 5 September 2016 6:36 PM (guile | lua | compiler | parser | language design)

Oops, seems I haven't updated my blog for 11 months. What I'm doing?

I quit my boring day-job, and have my second daughter, and I've been invited to publish a paper about GNU Artanis on Schemeworkshop2016 affiliated by ICFP (well, I'll have a post for it after the conference), and I've tried many cool projects with my friends of SZDIY Community, and I'm planning to get my hands dirty on Artifitial Intellegence (but I dislike machine learning), and I may have a secret project end of this year (folks, you'll love it)...

But now, I would like to introduce our Lua frontend on GNU Guile, it's not ready for real work, but most of the work is done. There was a bit old, half baked Lua frontend in Guile branch, but seems unmaintain for a long time. I've rewritten one from scratch. Please don't ask me why bother to reinvent wheel. The new one exists now, so let's talk about it. I would like to avoid redundant work in other projects, but not in this one.

The term frontend here doesn't mean the web layout work for a page. A language frontend on Guile is a programming language implemented with Scheme language and taking advantage of all the Guile existing libs/modules, proper tail call, first class continuation, and all the optimizing.

Where is it?

It's still very experimental, but you may get it from github:

git clone git://

For anyone who wants to try, please make sure you have the latest Guile-2.1+ from master branch. It only works for the latest Guile. And it'll display many debug info such as AST analysis, environment printing, before you see the final result. As I said, it's not usable yet. Run it like this:

cd guile-lua-rebirth
guile -L ./

GNU Guile
Copyright (C) 1995-2016 Free Software Foundation, Inc.

Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
This program is free software, and you are welcome to redistribute it
under certain conditions; type `,show c' for details.

Enter `,help' for help.
[email protected](guile-user)> ,L lua
Happy hacking with Lua!  To switch back, type `,L scheme'.
[email protected](guile-user)> 

Now Guile REPL become a Lua REPL. Just try it as you know about Lua.

How does it work?

The most frequent statment I've heard like this: it should be easy to write frontend on Guile, you just parse Lua code and translate it to Scheme. Then let Scheme handle the scope and continuations or blabla. So the noticeable work is almost only the tokenizing and parsing.

NO! It's not how guile-lua-rebirth work, nor does any serious language frontend on Guile. We're not writing an interpreter of Lua with Scheme. We're trying to build a real compiler for Lua, with Scheme. For a multi-language platform like Guile, the Intermediate Language optimizing and code generation has been done. What we have to do is to parse Lua code to AST, and convert the AST to tree-il (the first layer of Guile intermediate representation) without losing any information especially the environment. In addition, we have to handle primitives for Lua, most of them are written in Scheme, and we have to find a proper way to call them from Lua acrossing the modules calling. When most of the work is done, we have to write many libs being compatible with origin Lua as possible to make this language implementation usable. Finally, we have to find a good way to let modules written in Scheme and Lua could be called by each other. Then you may take advantage of the contributions to Guile community.

So, the simplest work is the tokenizing and parsing in our work.

What do we have now?

Guile-lua-rebirth has a well tested lexer (tokenizer), and LALR(1) parser for complete Lua-5.2 grammar, a new designed Abstract Syntax Tree (AST) layer for pre-optimizing, arithmetic operations with paramatric polymorphism (you may plus a number to a string, overloading, if you're familiar with such a term), Lua tables, proper tail call...

Too many obscured terms, huh? Don't worry, they're not so important for a Lua user. Folks just focus on Lua programming, and ignore what the compiler it is. Unless you want to contribute, of course, it's welcome. I'll try to explain how Lua is implemented on Guile in the later posts.

What's new?

In addition, I'm trying to add new features into Lua. The rule to extend is not to break the compatibility as possible. I've added two so far.

Fixed ISSUE-1

The first new feature is actually a fix of Lua language. Let's see the code:


-- define with colon reference
function a.b.c:foo()
   return self

-- ==> table: 0x1a12ba0

-- ==> nil

-- define with point reference
   return self

-- ==> nil

-- ==> nil

This is a common issue in Lua. The variable 'a' is a table, which contains another table 'b', and 'b' contains table 'c'. Lua use this approach to mimic Object-Oriented programming: you may treat tables just like classes.

When you define a function in a table, there're two methods to reference the value with a key in the table. The colon reference, and the point reference. The only difference is that colon-ref will pass the current table object as a hidden argument into the function, and bound to the variable named self. Just like this object in Javascript or C++, although they're not the same, they're similar. For point-ref, the self is unbound to any value, so it's nil in default according to Lua spec.

The interesting point is that you have to use colon-ref if you want to get meaningful `self' value. Otherwise, you will get nil, even if you've defined the function with colon. (We name this issue as ISSUE-1 for our later context.)

It is rediculous.

I think it's a simple logical problem. If you define the function with colon, then you must want to reference a meaningful `self' value. If you don't, and you don't want any other people to reference the meaningful `self' value, you'll never define the function with colon, which mentions the compiler pass the current table to the defined function as a hidden argument. On the other hand, if people write in their code, they must want to get a meaningful result of `self', rather than nil. So the problem is that ISSUE-1 prevents people to get meaningful result even if they've defined the function with colon, and make people confuse with code because they can't figure out where's the error when they encoutered an unexpected nil somewhere.

My idea? I dropped the old design to pass the current table as the hidden argument. Because such an implementation will cause ISSUE-1. What I did in guile-lua-rebirth is to bind `self' to the current table in the nearest environment of the defined function. This makes `self' get a solid value when the function is defined with colon. No matter how you reference the `self' with colon or point.

---- The new design in guile-lua-rebirth

-- define with colon reference
function a.b.c:foo()
   return self

-- ==> table: 0x1a12ba0

-- ==> table: 0x1a12ba0

Of course, the activity is unchanged when you defined the function with point, the `self' will be nil either. So it is unnecessary to show the redundant code here.

Added `continue' statement

Many people thought Lua is just similar to C, since most of the syntax looks like C. Well, it is not true. At least, you can't use `continue'. You may read the discussion on StackOverflow. And the explain from the Lua author:

This is a common complaint. The Lua authors felt that continue was only one of a number of possible new control flow mechanisms
 (the fact that it cannot work with the scope rules of repeat/until was a secondary factor.)
In Lua 5.2, there is a goto statement which can be easily used to do the same job.

There's no `continue' statement in Lua, but this feature is useful in certain context. For example, you want to print all odd number from 1 to 10.

---- For Lua-5.1
for i=1,10 do
  if i%2~=0 then print(i) end
---- You have to change code logic to make it work

---- For Lua-5.2+ or Lua-jit
for i=1,10 do
  if i % 2 == 0 then goto continue end
---- You have to use GOTO which is obsolete for modern language

---- For guile-lua-rebirth
for i=1,10 do
  if i%2==0 then
---- No need to explain for any C user

Guile-lua-rebirth hasn't implemented GOTO yet. It won't be too difficult to implement when we have delimited-continuations. But I'm still thinking if we really need it, except for compatibility with Lua community. After all, Edsger Dijkstra has the famous paper GOTO statement considered harmful. But folks may argue against as "continuations are GOTOs too, and you plan to add it".

Well, I don't know.

I have to close comment since I found my blog is unstable to get comments from folks. It's all my bad, I should build a better blog. And I saw comments from you for encouraging me. Thank you very much Hayden Jones!

Actor-Model: a revealed black box in Scheme

Posted by NalaGinrut on 16 September 2015 6:20 PM (scheme | erlang | actors | concurrent | csp | pi-caculus | continuations | delimited-continuations | )

There're serveral concurrent models nowadays, CSP, Actors, π-calculus ...etc. It is believed that these concurrent models could bring high performance and scalable concurrent network service. Actor-model(I may call it Actors in the rest of the article) is a hot one that many folks would like to talk about it. But rather than using it, have you imagined to implement one for better learning it? Well, if your answer is yes then this is the article for you. I'll show you a feature-limited tiny actor-model implementation in Scheme, although you may call it a poor man's threading system, hope that's enough for you to understand the principle.

BTW. folks may kindly mention LFE(stands for Lisp Flavored Erlang) which is a Lisp syntax frontend implemented on Erlang compiler. In this article, I'm trying to show a tiny Actors framework written with pure Scheme. So they're very different in concept.

I would pick GNU Guile, because we need its delimited-continuations for coroutine. But it won't stop you to pick your fav Scheme implementation for Actors if you get the principle in this article.

Actors? Now that you have Scheme...

Why I've picked Actors rather than others, say, CSP? Because Actors has something to do with Scheme historically. And this is a Scheme blog, don't you know? Alright, not true, I write many things, but this blog is built with Scheme. ;-)

After Carl Hewitt introduced Actor-model in 1973, Gerald Sussman and Guy Steele's attempt to understand Actors resulted in the creation of the Scheme. Then they published a paper to show the conclusion: "we discovered that the 'actors' and the lambda expressions were identical in implementation."

It is believed that a proper Actors implementation requires at least two key features provided by languages[1]: first-class-continuation and proper-tail-call. First-class-continuation is used to construct light-weight process in userland, and the proper-tail-call is actually more efficient, in addition to being clearer. Maybe it's the reason why Scheme stick to them in the original. I'll try to explain why they're important for Actors, later. Well, folks may say that there'd be message passing and pattern matching as well, but these features could be contructed relative easily upon any mainstream languages. It's unnecessay to be provided by language itself.

The benefit of Actors is well-known: lock-free by means of message passing, async (if you play them properly with non-block), high concurrency...

Of course, there're caveats in Actors too. The actor model has no direct notion of inheritance or hierarchy, which means it is time consuming and confusing to program actors with trends of common behaviour. Besides, a *real* Actors framework demands scalability, which means it should support exchanging states and continuations between distributed nodes. It is a hard problem for efficient implementing. Moreover, it is not easy to do static-analysis debuggning and optimizing...etc for such a model.[2] These prolems seem to have been preventing Actors to be a mainstream computing model for a long time. For me, I would like to use it only in a high concurrent and scalable server design. It's great for server developping, but maybe not so generic for common usage just like what Object-Oriented Model does.

What is Actors exactly?

According to Carl Hewitt, unlike previous models of computation, Actors was inspired by physics, including general relativity and quantum mechanics. In theory, Actors can be described by several of axioms/laws/formulas. This article is not going to explain Actors in formal mathmatics, since it's painful to understand for non-mathmaticians.

The Actors, from programming perspective, at least consist of these priciples:

  1. Light-weight processes (actors) are the units for scheduling, it is unnessary to be the OS-level process; 
  2. Actors must communicate only by means of message(immutable) passing;
  3. Each actor has its own mailbox, and can be accessed only by its owner. This means the states are local;
  4. A proper pattern matching machanism for parsing the messages;
  5. Actors may block themselves, but no actor should block the thread on which it is running;
  6. Actors may spawn new actors;
  7. Sharing states and processes by distributed nodes.

Then we may write a prove-of-concept one by following them. Some features maybe considered to be very hard, but they're somehow implemented in certain modern languages, say, Erlang.

First, we have to consider how to implement light-weight processes, say, actors. The best way is not calling fork, nor pthread, but continuations.

Delimited Continuations

Continuations are dragons! Keep distance from it and pray some hero to slaught it for you is safer!

Alright, you might leave now.

Oh you don't buy it, right?

Continuations is good for implementing lightweight threads/coroutines. Unlike fork/pthread, such threads won't trap into OS kernel, hence this so-called green-thread would reduce much overhead, and lockfree (actually, just less locks). In brief, it's faster. Unfortunately, I'm not going to discuss what is continuations and how they're implemented in GNU Guile in this article. But it is necessary to explain the difference between full-continuations(call/cc) and delimited-continuations in a simple way. To show why the later would be better for light-weight coroutines. Maybe I could write another article to discuss them, deeper.

Similarly, I will discuss continuations from programing perspective.

When we talk about continuations, people may mention call/cc, as folks may know it is full-continuations. Full stack will be copied when capturing it, nevertheless, the worse situation is in Guile, there'll be two stacks are copied, continuations-stack and VM-stack. Please don't worry if you don't understand what I said. It's too heavy to use call/cc, that's what I want to say. Please see the figure below:

This figure maybe too simple, but you get the idea about how full-continuations are captured in principle.

And how about delimited continuations?

Again, it's too simple to show the true implementation in Guile, but you got how delimited-continuations are captured. Here's a better article if you want to learn more.

I hope it could help to understand why delimited-continuations is excellent for implementing light-weight processes. The point is that you just need to copy few stack frames (rather than full stack) in userland, which is light enough for scheduling.

I'm not going to discuss how to handle delimited-continuations in detail here, because it could be a very large topic. One may play delimited-continuations in many fancy ways. Here I just give an simplest example to show the principle of a coroutine.

(use-modules (ice-9 control)) ; import delimited-continuations module

(define workqueue '())

;; silly function to show the principle
(define (fun x)
  (display "start\n")
  (abort) ; yield here, the rest computation (continuation) is stopped then scheduled
  (display x)(newline)
  (display "end\n"))

;; very simple scheduler, just add a yield thread to a queue
(define (scheduler k)
  (set! workqueue (cons k workqueue)))

;; % is a breviate of call-with-prompt
(define (spawn-thread proc)
  (% (proc) scheduler))

(spawn-thread (lambda () (fun 123)))
;; ==> start

;; ==> (#<partial-continuation 28baba0>) ; or something similar

((car workqueue)) ; run this thread
;; ==> 123

You may notice that the key points are two, % for drawing the bottom line for capturing and abort for yielding. Besides, we have scheduler to deal with the captured continuation, it's the second argument of % as you might see.

Well, now I believe it's enough to enter Actors.

Actors in Scheme

There're various ways to implement Actors, here I'm trying to mimic an Erlang style one. Here's a ping-pong game code. If you ever played Erlang, you'll be familiar with the ping-pong code in Erlang manual:


-export([start/0, ping/2, pong/0]).

ping(0, Pong_PID) ->
    Pong_PID ! finished,
    io:format("ping finished~n", []);

ping(N, Pong_PID) ->
    Pong_PID ! {ping, self()},
        pong ->
            io:format("Ping received pong~n", [])
    ping(N - 1, Pong_PID).

pong() ->
        finished ->
            io:format("Pong finished~n", []);
        {ping, Ping_PID} ->
            io:format("Pong received ping~n", []),
            Ping_PID ! pong,

start() ->
    Pong_PID = spawn(tut15, pong, []),
    spawn(tut15, ping, [3, Pong_PID]).

But if you transfer Erlang code to S-expr:

(define (ping . args)
  (match args
    ((0 pong-pid)
     (! pong-pid 'finished)
     (format #t "ping finished~%"))
    ((n pong-pid)
     (! pong-pid (list 'ping (self)))
      ('pong #t
       (format #t "ping received pong~%")))
     (ping (1- n) pong-pid))))

(define (pong)
   ('finished #t
    (format #t "pong finished~%"))
   (('ping ping-pid)
    (format #t "pong received ping~%")
    (! ping-pid 'pong)

(define (start)
  (let ((pong-pid (spawn pong '())))
    (spawn ping (list 3 pong-pid))

Well, looks similar huh? Anyway, I just intended to write so, maybe helpful for you to learn Actors in Guile compared to Erlang code.

Before we try the code above to work, we have to implment the Actors framework first. We need to implement message passing, mailbox check, process spawn, and scheduler. Let me show the code directly:

;;==============Tiny framework of Actor-model=========================
(use-modules (ice-9 control) ; for delimited-continuations
             (ice-9 match)   ; for pattern matching
             (ice-9 q))      ; for queue data structure

(define *mailbox* (make-hash-table))
(define *task-queue* (make-q))
(define (gen-pid) (gensym "actor-"))

;; send message to the pid (uniq to a process)
(define-syntax-rule (! pid msg)
  (let ((mq (hashq-ref *mailbox* pid)))
    (if mq
        (enq! mq msg)
        (error '! "send msg: invalid pid" pid))))

(define-syntax-rule (has-task?)
  (not (q-empty? *task-queue*)))

;; get pid of the current process
(define-syntax-rule (self)
  (if (has-task?)
      (car (q-front *task-queue*))
      (error 'self "No task!")))

;; check mail box for current process, schedule to sleep when it's empty
(define-syntax-rule (receive body ...)
  (let lp()
    (match (hashq-ref *mailbox* (self))
      ((or #f (? q-empty?)) (abort 'sleep) (lp))
      ;; Very important!!! We must schedule the process after each receive scheduled,
      ;; or we failed to capture the correct continuation here. Don't blame me if you
      ;; see ghost when remove it, follow me to keep you safe!!!
      (mq ((lambda (_) (match _ body ...)) (deq! mq)) (abort 'sleep)))))

(define-syntax-rule (%schedule)
  (when (has-task?) ((cdr (q-front *task-queue*)))))

(define-syntax-rule (active-scheduler)
  (% (%schedule) scheduler))

;; a simple scheduler to dispatch the process with status
;; `sleep' to suspend the current process and run the next process from queue
;; `quit' to exit the current process
(define (scheduler k s)
  (case s
     (enq! *task-queue* (cons (car (deq! *task-queue*)) k)))
     (hashq-remove! *mailbox* (car (deq! *task-queue*)))))

;; spawn a new process to run proc with args as arguments
(define (spawn proc args)
  (let ((pid (gen-pid)))
    (hashq-set! *mailbox* pid (make-q))
    (enq! *task-queue* (cons pid (lambda () (apply proc args) (abort 'quit))))

Pretty easy, right? The principle is explicitly:

  1. spawn a process as running a function
  2. send messages between processes, depends on how you write these functions
  3. use receive to check mailbox for the current process and handle them
  4. schedule the current process when mailbox is empty
  5. quit the process properly when the function is over (no loop again)

And maybe you've noticed that the two key features mentioned above: first-class-continuations and proper-tail-call plays great role in the Actors. Actually, we're using first-class-delimited-continuations for implementing light-weight processes. The proper-tail-call, well, you see it everywhere in the code, right? it makes the code clearer and elegant. Or you may imagine how to rewrite them in while loops.

Here is a complete version you may try.

Don't be upset if you can't understand delimited-continuations properly, it looks easy but not easy at all. I may write another article for it. But this time, please focus on Actors' principle only.


I'm sorry but I didn't implement serializable-continuations which is used to Sharing states and processes by distributed nodes. It's another big topic to make it properly. I may discuss it in the future.


[1] [email protected], William.D.Clinger.

[2] Why has the actor model not succeeded? Paul.Mackay

Do some quick and dirty with Guile FFI

Posted by NalaGinrut on 28 March 2015 5:09 AM (guile | scheme | ffi | c | pointer | bytevector | pcre | compiler | mal)

These days I'm facinated in working on MAL (stands for Make a Lisp) project with Guile. I've done several steps so far, and I'll send pull request when it's all finished. And I found my implementation in Guile-2.1 runs faster than most of others, only little slower than C version. Seems there's something to be expected. ;-)

Today I don't want to talk about this project. I want to discuss FFI in Guile. The reason why I jump to play FFI is because the spec of MAL uses PCRE for its tokenizer hint. Personally, I don't think it's good idea. Because in this case, PCRE regex hides all the lexer details for compiler writers. But if it's for beginner compiler hackers, this would reduce their pain on lexer.

Let me get into the topic in short, Guile has no good enough PCRE implementation so far. Oh, yes, there is my favorite irregex, very cool stuff, and supports most of PCRE features. Well, but I don't want to send my pull request including this lib, since it's too big. I would like to try a tiny & elegant solution. Then I saw FFI.

This case uses libpcre, if you need libpcre2, least you have this article, right?


There's something we need to put in front of code:

(use-modules (rnrs)            ; for bytevectors
             (system foreign)) ; for FFI

;; Get dynamic link from .so file, note you don't have to write .so explicitly.
(define pcre-ffi (dynamic-link "libpcre"))

Bind functions

Now pick existed functions in the .so lib you want to bind to Guile.

(define %pcre-compile2
  (pointer->procedure '*                                       ; specify return type, here, it's a pointer
                       (dynamic-func "pcre_compile2" pcre-ffi) ; get function pointer you want
                       (list '* int '* '* '* '*))              ; declare arguments' types according to function signature

Note, no matter what kind of C pointer you faced, '* is the only way to go.

pointer->procedure is used to convert C function pointer to a Guile callable procedure.

Helper functions

These helper functions would be useful as you will learn later.

(define (make-blob-pointer len)
  (bytevector->pointer (make-bytevector len)))

(define* (make-c-type-wrapper v l h type set ref #:optional (endian 'little))
  (define size (sizeof type))
  (define _obj (make-bytevector size 0))
  (or (and (> v l) (< v h)) (error "value is overflow!" v))
  (if (> size 1) (set _obj 0 v endian) (set _obj 0 v)) 
  (lambda (cmd . arg)
    (case cmd
      ((ref) (if (> size 1) (ref _obj 0 endian) (ref _obj 0)))
       (or (and (> (car arg) l) (< (car arg) h)) (error "value is overflow!" v))
       (if (> size 1) (set _obj 0 (car arg) endian) (set _obj 0 (car arg)))) 
      ((&) (bytevector->pointer _obj))
      (else (error "Invalid cmd, should be ref/set/&" cmd)))))

;; Assuming we're little endian in this case
(define (make-C-uint8 x)
  (make-c-type-wrapper x 0 255 uint8 bytevector-u8-set! bytevector-u8-ref))
(define (make-C-sint8 x)
  (make-c-type-wrapper x -128 127 sint8 bytevector-s8-set! bytevector-s8-ref))
(define (make-C-uint16 x)
  (make-c-type-wrapper x 0 65535 int16 bytevector-u16-set! bytevector-u16-ref))
(define (make-C-sint16 x)
  (make-c-type-wrapper x -32768 32767 sint16 bytevector-s16-set! bytevector-s16-ref))
(define (make-C-uint32 x)
  (make-c-type-wrapper x 0 4294967295 uint32 bytevector-u32-set! bytevector-u32-ref))
(define (make-C-sint32 x)
  (make-c-type-wrapper x -2147483648 2147483647 sint32 bytevector-s32-set! bytevector-s32-ref))
;; Try to finish the rest by yourself! Don't forget float!

C pointer tricks

IMO, the only difficulty in FFI is how you handle various C pointers to meet the arguments. Other types, int, long...are trivial.There're at least 4 situations.

Regular C pointer point to nothing, any types except pointer type

non_pointer_type *a;

This kind of pointer is usually used to return value if one doesn't want to use function returning mechanism.

(let ((a %null-pointer)) ; In Guile, you have to point to NULL explicitly
  (func a) ; now 'a' holds something returned from func for later using

Regular C pointer point to certain variable, any types except pointer type

non_pointer_type a = certain_obj;
non_pointer_type *p = &a;

// Assuming we have this silly function in .so lib
int func(int *x)
  *x = 10;
  return 0;

Usually, this kind of pointer makes side-effect for the pointed variable, maybe change the value of it.

But we can't use scm->pointer directly!!!, since the pointer returned by this procedure is the pointer in VM, not in C stack or heap! Obviously, we have to get help from bytevectors.

;; For integer, including int/short/long

(let* ((a (make-C-uint8 5))
       (p (a '&)))
  (func p)
  (a 'ref))
;; ==> 10 ; Yay!!!

;; Please try other types by yourself

Pointer to pointer pointing to nothing

This kind of pointer is usually used to hold the memory block allocated within the callee function. Take a look this tutorial if you have any question.

non_pointer_type **ptr;

For Guile, you have two choices for freeing the allocated block. One is to free it as C programer does; another is to register its finalizer, then all the jobs delievered to GC, say, it'll be freed automatically when there's no reference to it.

The second point need to be noted is that you have to allocate a proper bytevector to hold the pointer of pointer. We need to allocate memory blocks manually. Because C will allocate memeory in stack automatically, but Guile wouldn't do it for C code since Guile doesn't know C grammar for that job. One of possible design is to embed a C parser for doing that. But it's out of our topic.

The last point is to remember to use dereference-pointer.

(define manual-free
  (pointer->procedure void (dynamic-func "xxx_free" xxx-ffi) (list '*))

;; NOTE: finalizer has to be a C function pointer rather than a Guile procedure!
(define auto-free (dynamic-func "xxx_free" xxx-ffi))

(let ((pp (make-blob-pointer (sizeof ptrdiff_t)))) ; allocate memory to store a pointer
  (malloc_something_func pp)
  ;; set finalizer, GC will free it automatically, as a Lisper
  (set-pointer-finalizer! (dereference-pointer pp) auto-free)
  ... ; doing your job
  (manual-free (dereference-pointer pp))) ; or you may free as a C programer

You don't have to take care of pp, since it's allocated by GC. What you should take care is (dereference-pointer pp) which points to a block allocated in malloc_something_func.

Three and higher level pointers

No!!! Don't ask me about it....

Practical example

Here's an workable example using FFI:

git clone

And you may try tokenizer of MAL:

(use-modules (nala pcre))

(define *token-re*
  (new-pcre "[\\s,]*([email protected]|[\\[\\]{}()'`~^@]|\"(?:\\\\.|[^\\\\\"])*\"|;[^\n]*|[^\\s\\[\\]{}('\"`,;)]*)"))

(define (tokenizer str)
  (filter (lambda (s) (and (not (string-null? s)) (not (string=? (substring s 0 1) ";"))))
          (pcre-search *token-re* str)))

(tokenizer "nil true ,false")
;; ==> ("nil" "true" "false")


There're at least two problems if you want to use pure FFI.

One is reentry issue. If something C stuff can't promise you reentry, you may have to write some C wrapper for that.

The second is that FFI can't handle pointers elegantly although it looks like elegant. Actually, you have to endure many segmentfalt before you get them work. And it's hard to debug when you use pure FFI. The same situation would be easier in C code.

Even my guile-pcre-ffi works fine, there's fatal bug causes segmentfault while you try to free pcre object, no matter explicitly or inexplicitly. Unfortunately, I haven't find the reason. As I said, it seems not so nice to debug...

[ANN] GNU Artanis 0.0.3 released [alpha]

Posted by NalaGinrut on 2 March 2015 7:08 PM (guile | scheme | web | framework | announcement | artanis)

I'm pleased to announce artanis-0.0.3 here.

GNU Artanis is a web application framework(WAF) written in Guile Scheme. It is designed to support the development of dynamic websites, web applications, web services and web resources. Artanis provides several tools for web development: database access, templating frameworks, session management, URL-remapping for RESTful, page caching, and so on.

GNU Artanis is under GPLv3+ & LGPLv3+ (dual licenses).

GNU Artanis is also the official project of SZDIY community. It's used to build server side of SZDIY common service. It is offered to GNU project to make free software better.

Here are the compressed sources:   (432KB)   (352KB)

Here are the GPG detached signatures[*]:

Use a mirror for higher download bandwidth:

Here are the MD5 and SHA1 checksums:

751adf2bee25fd780041142ee4f714f6  artanis-0.0.3.tar.gz
d4aa8076c5af142785037546a378cc61  artanis-0.0.3.tar.bz2
b33cd373f6d969db7e25ce38b4567ea6fb85adc6  artanis-0.0.3.tar.gz
acc5d2fa70f620639aeae9b965cc167562601c3a  artanis-0.0.3.tar.bz2

[*] Use a .sig file to verify that the corresponding file (without the .sig suffix) is intact. First, be sure to download both the .sig file and the corresponding tarball. Then, run a command like this:

  gpg --verify artanis-0.0.3.tar.gz.sig

If that command fails because you don't have the required public key, then run this command to import it:

  gpg --keyserver --recv-keys EE78E925

and rerun the 'gpg --verify' command.

This release was bootstrapped with the following tools:

  Autoconf 2.69.120-5dcda-dirty


Changes in 0.0.3
* Notable changes
  Fixed several bugs.
  Support JSONP.

Simple solution to mkdir a tree

Posted by NalaGinrut on 1 March 2015 1:18 AM (algorithm | scheme | guile | data structure | tree | dfs | artanis | high order function | fp)

The problem

Yesterday night I've written command line interface (CLI) for GNU Artanis, after took care of my daughter to sleep, I had only one hour to hack something.

My aim is to add `art create proj-name' command to create a bunch of files/directories for initializing a new Artanis web app. Just like Rails does. Well, since my Turing brain has limited time for this hack, I have to choose my favorite way to traverse the tree for this job, the recursive way. ;-)

Depth first traversal

One of the proper way to mkdir a directory tree is to use DFS. I chose the pre-order DFS.

If you have well structured Tree implementation like this, then the pseudo code below could be simple enough to try.

;; This is pseudo code
(define (preorder node visit)
   ((leaf? node) #t)
    (visit node)
    (for-each preorder (children node)))))

Actually, the same algorithm is used in my unfinished data-taistructure lib Nashkel. Could be optimized to iterative one though...

Tree structure

Here's a problem: I don't known if I'll add/remove directories in the future, so I need a flexible design to let me do this job easily. A proper tree design is a good start.

For example:

|-- app
|   |-- controller
|   |-- model
|   `-- view
|-- lib
|-- log
|-- prv
|-- pub
|   |-- css
|   |-- img
|   |   `-- upload
|   `-- js
|-- sys
|   |-- i18n
|   `-- pages
|-- test
|   |-- benchmark
|   |-- functional
|   `-- unit
`-- tmp
    `-- cache

In Scheme, the most convenient data structure is the list:

(define *dir-arch*
  '((app (model controller view)) ; MVC stuff
    (sys (pages i18n)) ; system stuff
    (log) ; log files
    (lib) ; libs
    (pub ((img (upload)) css js)) ; public assets
    (prv) ; private stuff, say, something dedicated config or tokens
    (tmp (cache)) ; temporary files
    (test (unit functional benchmark)))) ; tests stuffs

Let me explain this tree structure built with list, since the algorithm implementation has something to do with the data structure design. I gave the rule below:

((node1 (child1_1 child1_2 ...))
 (node2 ((node3 (child3_1 child3_2 ...)) child2_1 child2_2 (node4 (child4_1 child4_2 ...)) ...))
 (node5) ; no any children

The disadvantage of this tree is hard to modify dynamically (alright, you may use set-car! and set-cdr!). But it's easy to handle statically, because the hierarchy is clear.

Design the high-order-function

Now that I'm using Scheme, I'll choose FP way containly. High order function just like design pattern of FP. Design a good and generic one will save your lot time in the future development.

Here is mine:

(define (dfs tree visit level)
  (match tree
    (() #t)
    (((r (children ...)) rest ...)
     (visit r level)
     (for-each (lambda (x) (dfs (list x) visit (cons r level))) children)
     (dfs rest visit level))
    (((r) rest ...)
     (visit r level)
     (dfs rest visit level))
    ((children ...)
     (visit (car children) level)
     (dfs (cdr children) visit level))))

So it's easy to mkdir a directory tree now:

(define (create-framework)
  (define (->path x l)
    (format #f "~{~a~^/~}" (reverse (cons x l))))
  (dfs *dir-arch* (lambda (x l) (mkdir (->path x l))) '()))

This CLI feature won't appear in 0.0.3, and because it'll bring great change to GNU Artanis, maybe it should be in 0.1.0. Dunno...

Happy hacking!

[ANN] GNU Artanis 0.0.2 released [alpha]

Posted by NalaGinrut on 29 January 2015 6:31 PM (guile | scheme | web | framework | announcement | artanis)

I'm pleased to announce artanis-0.0.2 here.

GNU Artanis is a web application framework(WAF) written in Guile Scheme. It is designed to support the development of dynamic websites, web applications, web services and web resources. Artanis provides several tools for web development: database access, templating frameworks, session management, URL-remapping for RESTful, page caching, and so on.

GNU Artanis is under GPLv3+ & LGPLv3+ (dual licenses).

GNU Artanis is also the official project of SZDIY community. It's used to build server side of SZDIY common service. It is offered to GNU project to make free software better.

Here are the compressed sources:   (440KB)   (360KB)

Here are the GPG detached signatures[*]:

Use a mirror for higher download bandwidth:

Here are the MD5 and SHA1 checksums:

0914f4511263a725973f9f1462a18d53  artanis-0.0.2.tar.gz
b288e77c5986b5b95ce03f8ef15a4d86  artanis-0.0.2.tar.bz2
af62cdf790ee9540172201109b6b6e5be6dd3fce  artanis-0.0.2.tar.gz
2aa174e8fdc12cbe8e7c108fac9cdf0ecf569f41  artanis-0.0.2.tar.bz2

[*] Use a .sig file to verify that the corresponding file (without the .sig suffix) is intact. First, be sure to download both the .sig file and the corresponding tarball. Then, run a command like this:

  gpg --verify artanis-0.0.2.tar.gz.sig

If that command fails because you don't have the required public key, then run this command to import it:

  gpg --keyserver --recv-keys EE78E925

and rerun the 'gpg --verify' command.

This release was bootstrapped with the following tools:

  Autoconf 2.69.120-5dcda-dirty


Changes in 0.0.2
* Notable changes
  Updated for GNU project.

[ANN]Artanis-0.0.1 released!

Posted by NalaGinrut on 1 January 2015 5:52 PM (guile | scheme | web | framework | http | server | html5 | artanis)

Artanis is a new web application framework (WAF) written with pure GNU Guile Scheme. Artanis is free software, under GPLv3 and LGPLv3.

Artanis contains common HTTP stuffs (cookies,authentication,cache,sessions...), URL-remapping, HTML templating, and various experimental methods to handle Database: SSQL(SQL in s-expr), FPRM (Functional Programming Relational Mapping, and SQL-mapping. Now it supports mysql/postgresql/sqlite3 as DBD.

The current performance of server in Artanis is weak. It is planed to implement an async green-thread server with Guile's brand new delimited-continuations in next big version (0.2).

Artanis was announced as certificated awesome project in 2013 Lisp in summer projects contest. It's an official project of SZDIY community for building the server side of web-service and mobile-app of the community.

Convert hex to bin in Guile Scheme

Posted by NalaGinrut on 15 July 2014 6:20 PM (scheme | guile | hex2bin | h2bin | high order function)

Unfortunatly, Guile doesn't provide a method to convert hex to bin (When I'm writing, the current version is 2.0.11). People have to write their own one.

But fortunatly, writing code in Scheme is kind of enjoy. Every time I've done my program in Scheme, a new idea will strike my head to refactor it. Then you can't stop to continue to program.

Well, sometimes, we just want a quick way to do our job. But in Scheme, you may become a lib/framework writer unexpectedly. Maybe it's the worst point of Scheme. And one may enjoy wasting such time...hey I'm sorry Boss...

Here's my version, you may take it for free, or donate some fixes/suggestions/comments. ;-)

(use-modules (rnrs)) ; we need bytevectors in rnrs module

(define (%numstr->bin str final base)
  (define len (string-length str))
  (let lp((i 0) (ret '()))
     ((= i len) (final (reverse! ret)))
     (else (lp (+ i 2) (cons (string->number (substring str i (+ i 2)) base) ret))))))
;; NOTE: substring in Guile happens to be copy-on-write, so it would be efficient

(define (hex->bin str)
  (%numstr->bin str u8-list->bytevector 16))

(define (hex->list str)
  (%numstr->bin str identity 16))

(define (hex->ascii str)
  (%numstr->bin str (lambda (x) (utf8->string (u8-list->bytevector x))) 16))

;; (hex->bin "4a4b4c")
;; => #vu8(74 75 76)

;; (hex->list "4a4b4c")
;; => (74 75 76)

;; (hex->ascii "4a4b4c")
;; => "JKL"

In the beginning, I just need hex->bin, but finally, I wrote a high order function for a tiny hex lib.

Response to comments:

Suggestions from @Raymond:

1) get rid of the `final' argument of %numstr->bin.

(%numstr->bin str final base) isn't any shorter or faster than (final (%numstr->bin str base)); in the latter case it's easier to understand what's happening and there aren't as many positional parameters to remember.

Answer:Yes, you're right. I agree a callback in final position could be optimized out and put it outside.

2) rename %numstr->bin to %numstr->u8-list.

If you eliminate the `final' parameter as suggested, then %numstr->bin always returns the same type, and you can use a more descriptive name.

Answer:Yes, it's a related optimize. ;-)

Uroboros: The power of high order function

Posted by NalaGinrut on 30 June 2014 3:46 PM (fp | programming | scheme | guile | design pattern | high order functions | gcd | refactoring | SICP)

According to SICP, high-order function gives us the ability to build abstractions by assigning names to common patterns and then to work in terms of the abstractions directly. It's very useful while you're trying to refactor you code.

There're several built-in classical high-order functions in most modern languages, like map, reduce, filter, accumulator... Frequently, we use them for solving most problems.

But it doesn't mean you can't find or invent new one. Actually, you may be inspired during programming and find a new pattern which could be abstracted to use high-order functions.

This article introduced a new high-order function for certain cases. I wrote it in Scheme, but you can rewrite it in any languages supported first-class function.


It's easy to find such a pattern in some cases: the input data needs to be handled repeatedly, each round the result would be the input of next round, *BUT* we need to pass two arguments to the kernel function. This is similar to pipeline pattern, but different. Because pipeline pattern pass only one argument amount rounds.

To make this clearer, let's try a simple example.

Considering a GCD(Great Common Divisor) function:

(define (my-gcd x y)
  (if (zero? y)
      (my-gcd y (modulo x y))))

You may found a rule that my-gcd was called repeatedly, and each time the arguments are related to the last round. Actually, the kernel function my-gcd generates the input arguments for next round. This situation appears in most tail-call procedures.

So let me introduce a new pattern named urob stands for Uroboros. It's a snake eating itself. I can't say this metaphor is very proper, but I hope it can active your imagination.

(define (urob func init val pred)
 (if (pred val)
     (call-with-values (lambda () (func init val))
       (lambda (x y) (urob func x y pred)))))

Very simple huh?

The pred would predicate if the recursive calling should be end. And func is the kernel function. Let's try it for GCD:

(define (my-gcd2 x y)
 (urob (lambda (a b) (values b (modulo a b)))
       x y zero?))

"But why we drop the previous simple and elegant GCD implementation for this?"

Someone may shout.

I'll explain it in the end. Now let's try more complex case:

A delimiter lexer

A delimiter lexer is easy to understand. You specifiy a delimiters list, and the lexer will tokenize the input string:

;; (lexer str delimiters)
(lexer "hello.x+(bye)"  ".+()")
;; ==> ("hello" "." "x" "+" "(" "bye" ")")

It's not hard to implement such a lexer with loop:

(define (lexer str delimiters)
 (define cs (->char-set delimiters))
 (define-syntax-rule (-> w r)
  (if (null? w)
      (cons (list->string (reverse! w)) r)))
 (let lp((lst (string->list str)) (ret '()) (word '()))
   ((null? lst) (reverse! ret))
    (if (char-set-contains? cs (car lst))
        (lp (cdr lst) (cons (string (car lst)) (-> word ret)) '())
        (lp (cdr lst) ret (cons (car lst) word)))))))

But I'm going to show you a uroboros version:

(use-modules (ice-9 rdelim) (rnrs))

(define (lexer str delimiters)
 (define (-> c) (if (eof-object? c) '() (list (string c))))
 (define (tokenizer lst str)
  (call-with-input-string str
   (lambda (port)
    (let* ((token (read-delimited delimiters port 'peek))
           (delim (-> (read-char port)))  
           (rest (get-string-all port)))
         (values (if (string-null? token) `(,@lst ,@delim) `(,@lst ,token ,@delim))
  (urob tokenizer '() str string-null?))

So what?

Choosing a proper high-order function could help you to break you code into more maintainable independent functions, and reuse the code largely. Although understanding the classical high-order functions is important in functional programming, you can even invent your own during the practicing.

I always thought, if design pattern is so significant for OOP, maybe high-order function has equal status in FP. Maybe both.

Who knows...

Oba pipeline style!

Posted by NalaGinrut on 25 April 2014 4:51 PM (scheme | guile | python | pipeline | programming)

The concept of pipeline in functional programming means a succession of functions that operate, one after another, on an array of data, which consists of a chain of processing elements arranged so that the output of each element is the input of the next. One of the famous practices is Unix pipeline.

This article is about pipeline-like programming pattern, rather than 'yet another tutorial of pipeline tools'.

Let's see some code:

In Python

Here's a silly code as an example.

def get_number():
    return input("Give me a number: ")

def num_filter(num):
    print "oh you input %d!" % num
    return num % 2

def word_picker(hit):
    print "checking..."
    return ('even', 'odd')[hit]

def result_show(result):
    print "It is an %s number." % result


The process is very simple, get a number, and check if it's even or odd, finally it print out the result.

But this code has a problem, each time you want to add a level in your cascaded functions chain, you have to modify the code. The worst case is when you have more than 10 levels in your functions calling chain, you can't write it as a chain. You have to split them like this:

num = get_number()
hit = num_filter(num)
result = word_picker(hit)

This could be clearer, and it's easy to add a 'add_one' step after 'get_number', which plus 1 to the number you just typed. But you've added a lot of code and several temprorary variables. Let's hope your compiler can eliminate all the redundant part for you during the optimization.

Anyway, it's better not to modify the original code when you want to add new code.

Now here is the pipeline one:

def make_pipeline(procs):
    return lambda x: reduce(lambda y,p:p(y), procs, x)

procs = [num_filter, word_picker, result_show]
f = make_pipeline(procs)

def add_one(x):
    return 1+x

f2 = make_pipeline(procs.insert(0, add_one))

In Scheme

Although many pythoners enjoy lambda and reduce, not all the people have realized that these stuffs are borrowed from Lisp/Scheme land. There're cooler things in Scheme, maybe we should drop reduce and give others a try, say, fold:

(use-modules (srfi srfi-1)) ; `fold' dwells srfi-1

(define (get-number)
  (display "Give me a number: ")

(define (num-filter num)
  (format #t "oh you input ~d!~%" num)
  ;; ~d is similar to %d in Python, and ~% means newline
  (modulo num 2))

(define (word-picker hit)
  (display "checking...\n")
  (list-ref '("even" "odd") hit)) ; list accessing in Scheme

(define (result-show result)
  (format #t "It is an ~a number.~%" result))
  ;; ~a is similar to %r in Python

(define (make-pipeline . procs)
  (lambda (x) (fold (lambda (y p) (y p)) x procs)))

(define f (make-pipeline num-filter word-picker result-show))

(f (get-number))

See, there's no much difference from Python code. And you may read the manual of fold here.

After all, the content of this article is nothing but introduced a possible way to implement compose, which is very useful and interesting in programming. Now you already know what it is and how it can be, maybe it's time to hack more code, huh?