Information Engineering,
University of the Ryukyus
Shinji KONO, Kento YOGI
Suitable for writing applicationNow Available in http://sourceforge.jp/projects/cbc/
Verification aware
Compatibility with C
No function call
No environment on heap (unlike scheme)
Satiable for Embedded System
Hands out in http://www.ie.u-ryukyu.ac.jp/~kono/paper.html
We'd like to use continuation in C, or
Existing application written in C.
setjmp/try-catch is close, but too expesive to use.
So we designed Continuation based C
struct interface1 { int i; };
struct interface2 { int o; };
__code f(struct interface1 a) {
struct interface2 b; b.o=a.i;
goto g(b);
}
Just a segment of code
Architecture Independent Assembler
State Machine like programming language
Programming Language lower than C
Hardware Description Language
Specification Language
Do we have to write everything in code segments?
It is possible, but too cruel... for human...
void *env;
__code (*exit)(int);
__code h(char *s) {
printf(s);
goto (*exit)(0),env;
}
int main() {
env = __environment; exit = __return;
goto h("hello World\n");
}
__envrionment is a actually a frame pointer.
__return is a actually a address of part of function.
goto f(1),env; changes frame pointer and go.
Implementation is easyWanna return back to the middle of the function?
Insert extra function call for that.
If everything is written in CbC, there are no environment.
All states are kept in interface or global variable.
Continuation it self is an address.
A kind of light weight continuation.
vs. scheme continuation (entire stack image)
Thread Scheduler
Context Switch
Synchronization Primitives
I/O wait semantics
Cell Simulator written in CbC
CbC can write thread schedule by itself.
goto scheduler(self, task_list);State enumeration can be written in itself.
code scheduler(Thread self,TaskPtr list)
{
TaskPtr t = list;
TaskPtr e;
list = list->next;
goto list->thread->next(list->thread,list);
}
goto record_and_search_state(self, task_list);
Slower than SPIN, but faster than Java PathFinder.
State enumeration of Dining Philosophers .
process | states/sec CbC | states/sec SPIN | sec JPF |
5 | 38,984 / 0.66 | 94 / 0.008 | 3.98 |
6 | 159,299 / 3.79 | 212 / 0.01 | 7.33 |
7 | 845,529 / 27.59 | 494 / 0.03 | 26.29 |
8 | 3915,727 / 199.80 | 1,172 / 0.04 | 123.16 |
Advantage.
We can modify unit of interleaving (by hand)
We have our own full set C compiler
Very few optimization
Suitable for experiments
Not so fast
It is difficult to make good compiler
If we have GCC quality compiler implementation...
Idea is very old (2002).
Forcing C tail call elimination for all code segment.gcc before version 4.x is difficult to handle.
limited capability of Tail Call Elimination (TCE).gcc version 4.x is Ok. Implementation is done by B4 (Kento YOGI).
(a very good student)
Syntax tree and RTL is architecture indepent.
RTL: pseudo machine code
We implement CbC in Parser and RTL generation part.
int f() { return g(); }
_f: // Normal code
subl $12, %esp
call _g
addl $12, %esp
ret
_f: // Tail Call Eliminated code
jmp _g
__code is a typeUnlike C-- or Scheme, special syntax is necessary for
restriction on void function
no need for TCE check.
Special treatment for code segment is possible
different Binary API
Architecture independent (in theory...)
GCC expand_call() (generate rtx from syntax tree)
4463 lines 18527 145469 calls.cCbC expand_cbc_goto() for code segment
783 lines 2935 23651 cbc-goto.hActually code segment implementation is easier.
__code carg4(struct arg args0,struct arg args1,GCC simply give up TCE in this case.
int i, int j,int k,int l)
{
goto carg5(args1,args0,j,k,l,i);
}
Check overrap and saving in stack is necessary.
goto with environment , __return+, __environmentInvestigating
PowerPC, SPU is not working now.Theoretically architecture independent, but...
TCE it is not working well in this architecture
unknown insn
Version dependency. GCC changes drastically even in
case of minor change. (4.2.1 to 4.2.3), currently 4.3.0
is not supported.
f0(int i) {Artificial bench mark with heavy sub routing call overhead.
int k,j;
k = 3+i; j = g0(i+3);
return k+4+j;
}
g0(int i) {
return h0(i+4)+i;
}
h0(int i) {
return i+4;
}
typedef char *stack;
struct cont_interface {
// General Return Continuation
__code (*ret)();
};
__code f(int i,stack sp) {
int k,j;
k = 3+i;
goto f_g0(i,k,sp);
}
struct f_g0_interface {
// Specialized Return Continuation
__code (*ret)();
int i_,k_,j_;
};
__code f_g1(int j,stack sp);
__code f_g0(int i,int k,stack sp) { // Caller
struct f_g0_interface *c =
(struct f_g0_interface *)(
sp -= sizeof(struct f_g0_interface));
c->ret = f_g1;
c->k_ = k;
c->i_ = i;
goto g(i+3,sp);
}
__code f_g1(int j,stack sp) { // Continuation
struct f_g0_interface *c =
(struct f_g0_interface *)sp;
int k = c->k_;
sp+=sizeof(struct f_g0_interface);
c = (struct f_g0_interface *)sp;
goto (c->ret)(k+4+j,sp);
}
__code g_h1(int j,stack sp);
__code g(int i,stack sp) { // Caller
struct f_g0_interface *c =
(struct f_g0_interface *)(
sp -= sizeof(struct f_g0_interface));
c->ret = g_h1;
c->i_ = i;
goto h(i+3,sp);
}
__code g_h1(int j,stack sp) {
// Continuation
struct f_g0_interface *c =
(struct f_g0_interface *)sp;
int i = c->i_;
sp+=sizeof(struct f_g0_interface);
c = (struct f_g0_interface *)sp;
goto (c->ret)(j+i,sp);
}
This is awfully long...
__code h(int i,stack sp) {
struct f_g0_interface *c =
(struct f_g0_interface *)sp;
goto (c->ret)(i+4,sp);
}
struct main_continuation {
// General Return Continuation
__code (*ret)();
__code (*main_ret)();
void *env;
};
__code main_return(int i,stack sp) {
if (loop-->0)
goto f(233,sp);
printf("#0103:%d ",i);
goto (( (struct main_continuation *)sp)->main_ret)(0),
((struct main_continuation *)sp)->env;
}
__code f2_1(int i,char *sp) {
int k,j;
k = 3+i;
goto g2_1(k,i+3,sp);
}
__code g2_1(int k,int i,char *sp) {
goto h2_11(k,i+4,sp);
}
__code f2_0_1(int k,int j,char *sp);
__code h2_1_1(int i,int k,int j,char *sp) {
goto f2_0_1(k,i+j,sp);
}
Slightly short.
__code h2_11(int i,int k,char *sp) {
goto h2_1_1(i,k,i+4,sp);
}
__code f2_0_1(int k,int j,char *sp) {
goto (( (struct cont_interface *)
sp)->ret)(k+4+j,sp);
}
__code main_return2_1(int i,stack sp) {
if (loop-->0)
goto f2_1(233,sp);
printf("#0165:%d ",i);
goto (( (struct main_continuation *)sp)->main_ret)(0),
((struct main_continuation *)sp)->env;
}
compiler | ./conv1 1 function call | ./conv1 2 (CPS 1) | ./conv1 3 (CPS 2) |
icro-C | 8.97 | 2.19 | 2.73 |
CC | 4.87 | 3.08 | 3.65 |
CC (+omit) | 4.20 | 2.25 | 2.76 |
CC (+fast) | 3.44 | 1.76 | 2.34 |
Micro-C, GCC bench mark (in sec)
Don't think it is general result. This is an artificial bench mark.
Have to follow classical ABICode segment is too fine grain
Have to save many registers
for possible future use
It is not for human
Now we are going to have practical language with very fast continuation.
Available in https://sourceforge.jp/projects/cbc/
Implementing goto with environment in GCC
CbC based Many Core Kernel
(Cerium Engine on PS3)
CbC Self Model Checker
C2CbC converter
CbC based Many Core Kernel
Sequential Implementation in CwC
Generate code segment
Execute each code segment on each Core
No, I think it is a matter of time...