As I walked into that office, a heavy sense of pressure seemed to hang in the air.
A slim yet dignified middle-aged man personally poured me a cup of tea. My host was the deputy factory manager, who had his hands in nearly all of the factory's operations; he seemed a leader who liked to oversee things himself. Even in the brief moment it took to pour the tea, he signed a funding request for a security guard who had stopped by. Such hands-on involvement surprised me in a client leader of his stature. He was approachable and made a point of handling things personally; his down-to-earth style left the impression of a leader who preferred rolling up his sleeves to delegating the work.
This was my first visit to an enterprise of such scale and renown. Though merely one subordinate factory within a larger conglomerate, its name was known throughout the country, and its economic influence over the local area was immense, almost monopolistic. Walking through its expansive facilities, I could sense the enormous weight this factory carried in the region. Being granted a tour of such a consequential operation was an experience I will not soon forget.
As a developer-born entrepreneur, I confess such an experience left me feeling rather tense and stiff.
"So... what's your company, again?"
It was the third time he had asked. In just ten minutes.
Our conversation was occasionally interrupted as he signed documents or gave verbal instructions. Yet he kept adding tea to my cup, even before I'd had a chance to take a sip. He seemed to use the gesture as a subtle but deliberate signal to refocus our discussion, and then he would ask me the same question once more.
"We're a startup focused on AI-Infra. We provide on-prem private AI solutions based on open-source large language models. Maybe you've tried ChatGPT…"
Acting swiftly to seize this unexpected opportunity, I retrieved my iPad mini from my bag and pulled up the pitch deck I had prepared in advance. Placing it before him on the table, I began introducing our company as he listened attentively.
The font size was damnably small on the 7.9-inch iPad mini screen!
I cursed silently in my head.
However, he did not seem bothered by the size of the text. Instead, he leaned in closely, teacup in hand, listening intently while carefully reading every word on the slides.
"…that's how we make it work, and we've learned that a certain scenario in your department may need an automated solution to reduce human intervention…"
Five minutes! He hadn't stopped me. This was my chance to get to the product's unique feature!
"If what you said is true, there may indeed be some applicability for this approach."
Sitting up straighter and leaning forward slightly, he offered, "We've been getting a lot of pitches for similar offerings lately. In fact, many that have come across my desk don't seem entirely sound."
"We actually considered privacy and data security for enterprise customers from the start. Please see this graph…"
Finally! This is the part of our story that I take most pride in. From the beginning, my vision was to build a cloud-native AI-Infra company focused on empowering organizations with privately held, yet responsibly governed AI. One of our founding principles holds that in the future, not only will everyone have their own private AI, but these systems must respect privacy and ensure data security. The rationale is quite simple: once humans digitize their experiences into AI avatars, the security of these creations will matter not just to individuals and families, but to societies at large.
From my perspective, for companies investing substantial time and financial resources to digitize core data and worker experiences into proprietary AI systems, wouldn't diligently protecting those assets be only prudent? Surprisingly, he did not respond to this point directly, but instead inquired about other aspects of our proposal.
I couldn't help but gently point out to him that privacy compliance and data security should be mandatory investment priorities, not optional considerations. To emphasize my point, I briefly recounted a few salient examples where lapses in these areas negatively impacted companies. My aim was not to lecture, but to highlight risks our solutions are designed to help mitigate. While respecting his wisdom and authority, I felt a duty to bring additional relevant perspectives to the discussion. By weaving real-world instances into our dialogue, I hoped he would see both strengthened commercial rationale and societal importance for the protections we advocate.
Suddenly, he turned to me with an impassive expression and spoke in a tone that seemed designed to screen for sales gimmicks: "My data resides in-house. Why would there be security issues if we already have robust systems in place?"
I offered a relevant example in response. There was a case, I explained, of a disgruntled employee who had been laid off. Seeking revenge, the individual maliciously deleted the company's entire database, crippling its operations. While the perpetrator was eventually brought to justice, the company nevertheless collapsed, unable to recover from such a catastrophic breach. With this example, I tried to demonstrate that while many security measures can defend against outside attacks, they may not protect against insider threats. Merely keeping data in-house cannot by itself guarantee safety. No protective system is infallible; clever adversaries will exploit unforeseen cracks, whether external or internal.
"That's bullshit!"
He seemed caught off guard by the example, responding abruptly before rising to fetch water and refill our teacups. Upon sitting again, he repeated the phrase in a contemplative tone.
"That's bullshit!"
Well, it is impossible to persuade someone who refuses to accept factual reality. This is a prominent organization with an exceptionally low employee turnover rate; the town's residents would give anything for a job there, and the local government itself must stay in its good graces. Under such conditions, he evidently felt confident enough to dismiss the facts.
However, no one can thrive without respecting the facts.
I thought I should stop while I was ahead. Since they were only interested in the AI core, I should let it be and give the customer what they want. After all, I was there to do business, not to babysit them.
I promptly steered the conversation toward aspects he was more interested in. Eventually, he arranged for his IT manager to liaise with me, allowing us to delve into more in-depth discussions.
Yet I still believe in my ideals: sooner or later, people will realize that building a truly valuable AI system goes hand in hand with privacy protection and data security. They are an integrated whole; without them, a product can never succeed and remains merely a shoddy, temporary draft.
But until that day comes, I have to say: just give it up, no one cares about security.
Not until it hurts them someday.
ChatGPT is so hot that everybody is trying to use it everywhere. Recently I used it in my compiler to help users with error regeneration. Here's how I did it.
Laco is an optimizing compiler for compact embedded systems, written by me. It supports the Scheme programming language. In this post, I'm going to write some intentionally buggy code and compare the regular error message with the AI reviewer's advice.
Here's a trivial Scheme program:
(let ((a (1)))
(display b))
It means to bind the variable a to the value of (1), and then print the undefined variable b. Obviously it's wrong code: (1) means applying the integer 1 as a function, which is definitely an error. If you run it with any Scheme compiler or interpreter, it may throw an error like this:
Usually, such error messages are not friendly to newbies.
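For readers who don't know Scheme, the same two mistakes can be reproduced in Python. This analogy is my own illustration, not part of Laco:

```python
# Python analogue of the broken Scheme program above.
n = 1
try:
    n()          # like (1): applying the integer 1 as a function
except TypeError as err:
    print(err)   # 'int' object is not callable

try:
    print(b)     # like (display b): b was never defined
except NameError as err:
    print(err)
```

As in Scheme, the runtime tells you *that* something went wrong, but not how to fix it.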
First, you need an OpenAI account to get an API token. Now set the token string in the environment variable LACO_OPENAI_KEY:
export LACO_OPENAI_KEY=your_openai_key
The logic is simple: if the compiler encounters a compile error and LACO_OPENAI_KEY is set, Laco wraps the source code in a certain prompt and sends it to the OpenAI server for advice. ChatGPT may give you different advice each time; most of the time, it tells you what's wrong with the code and how to fix it properly. Here's one possible result:
Is AI good enough for error regeneration? I have to say it's not what I expected, but it works, somewhat. I've tried many prompts to get the result I wanted, but I didn't find a perfect one. Sometimes the AI is too verbose and misses the point.
The critical limitation is that the AI has to be given the hint that "the code can't be compiled". For our example, the (1) part is usually only detected at runtime. Fortunately, my Laco compiler detects this error at compile time, so the AI correctly gets to the point.
What if the compiler can't detect a runtime error? The AI may well (and quite randomly) congratulate you that your code is completely correct. So the current AI (GPT-3.5-turbo) is not good enough to do the work on its own. But if you've confirmed the code is wrong, the AI will give you friendlier advice for fixing the bug.
Is the advice good enough? Well, it depends. The AI doesn't know the correct code in your mind, so even when it finds the wrong part, it may not give you the fix you expected. For example, in our case, the AI may tell you (1) is wrong, but the fix I expected was to drop the parens, i.e. 1. The AI's solution was instead, at random, to add a quote to form a list, i.e. '(1). I can't blame the AI, but it's the wrong fix for me. This problem can't have a perfect solution unless the AI is connected to my brain.
I think the advice may not be very useful for veterans, but it is still useful for newbies. I haven't tested complex programs yet. In my opinion, the AI is more like an assistant or a tool at present.
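The wrap-and-ask flow described earlier can be sketched in a few lines. This is an illustration only: the prompt wording and helper names are my assumptions, and this is Python rather than Laco's own code; only the Chat Completions endpoint and payload shape follow OpenAI's public API.

```python
import json
import os
import urllib.request

def build_prompt(source, error_message):
    """Wrap the failing source and the compiler error into a review request.
    The exact wording Laco uses is not published; this is an illustration."""
    return (
        "The following Scheme code fails to compile.\n"
        f"Compiler error: {error_message}\n"
        "Explain what is wrong and how to fix it:\n\n" + source
    )

def ask_openai(prompt, api_key):
    """Send the prompt to OpenAI's Chat Completions endpoint (gpt-3.5-turbo)."""
    req = urllib.request.Request(
        "https://api.openai.com/v1/chat/completions",
        data=json.dumps({
            "model": "gpt-3.5-turbo",
            "messages": [{"role": "user", "content": prompt}],
        }).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

key = os.environ.get("LACO_OPENAI_KEY")
prompt = build_prompt("(let ((a (1)))\n  (display b))",
                      "attempt to apply non-procedure 1")
# Mirror Laco's behavior: only ask for advice when the key is configured.
if key:
    print(ask_openai(prompt, key))
```

The guard on the key matches the compiler's behavior: no key, no network call, and you just get the plain error message.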
What do you think? Comments are welcome.
Hi LambdaChip folks!
Thank you for all the support to LambdaChip!
The supply-chain problems caused by the repeated lockdowns in mainland China made the failure of our business operations inevitable. I'm here to make an announcement:
1. The company behind LambdaChip was officially closed in August 2022;
2. LambdaChip has been renamed Animula and donated to the HardenedLinux community;
3. The Animula project will continue to be maintained as a free software project;
4. Feel free to file bugs if you run into issues while using the Alonzo board.
Best regards.
In 1961, a meteorologist was using a computer to forecast the weather when, somehow, the machine crashed. Fortunately, his system was reliable enough to recover its state: after rebooting, he restored the intermediate data and continued the computation. But something strange happened: the results of the resumed run diverged wildly from those of a run started from scratch. It later turned out that a few digits after the decimal point had been truncated when the data was restored. An ordinary person might dismiss this as a negligible loss of precision, but from that moment humanity began to realize that weather systems are sensitively dependent on initial conditions. This wasn't actually the first time humans had truly met the problem: a hundred years before that meteorologist, Poincaré discovered homoclinic orbits while studying the three-body problem; likewise, he found that orbits near hyperbolic points depend sensitively on initial values, making their trajectories impossible to predict.
To understand sensitivity to initial conditions mathematically, this article would have to be filled with derivations of ordinary differential equations. Instead, we can grasp it qualitatively in one sentence. If you know the butterfly effect, a butterfly flapping its wings in Brazil may set off a tornado in Texas a month later. That sentence carries an implicit truth: a deterministic system can have intrinsic randomness.
It is precisely this "randomness within determinism" that can make where you end up ten years from now depend on a single decision you made ten years ago. The 2021 Nobel Prize in Physics went to three scientists for their groundbreaking research on complex systems. One defining trait of complex systems is exactly this "randomness within determinism"; it proves mathematically that determinism and randomness are not strict opposites.
Obviously, founding a startup is a complex system.
Suppose, like me, you're a founder who chose to start a company at the end of 2020. You smoothly execute your perfectly planned international business, starting out amid economic hardship and capturing the Chinese, American, and European markets one by one. Ten years later, facing the temptation of an IPO, you discover that the company about to list in the US cannot legally consolidate the finances of its subsidiaries in China, the US, and Europe, leaving insufficient profit on the books to go public.
Your thoughts drift back ten years. You smile faintly and open Google, or phone a friend, or, like me, you confidently know about the VIE structure and have heard of the Cayman Islands. You know it only takes some money to sort out, and everything goes smoothly. Luckily you're a well-informed person, and you smile at yourself. But perhaps, like me, you can't really tell what the future of VIEs in China looks like; in the post-trade-war era, you wouldn't dare bet on it.
Your thoughts drift back ten years once more. You smile faintly: let your future CFO worry about those things. You don't know who she is or where she is, but surely there will be one eventually. Let professionals do professional work, isn't that right? You only need to think about the product. Luckily you know your own role; you don't spend time on things you can't figure out, you just want to nail your own plan. Isn't that how you got through a decade and more of schooling? You smile at yourself. But perhaps, like me, you've only just heard of PLG, Product-Led Growth. Ha, what a straightforward methodology: let the product grow by itself while you put your feet up as the boss. You love it; you want to pin it on the office door.
But then your co-founder tells you that, relying mainly on the B2B market for the next three years, you cannot count on PLG to drive growth; you may well need plenty of custom projects just to scrape by. Fuck project work! Fuck the clients! you say coldly through gritted teeth. And yet, it tastes so good! There's money to be made, and you get a production environment, landing your product in real, concrete scenarios. Wasn't that why you founded a hard-tech company in the first place?
Luckily, like me, you often talk with founder friends and read VC research reports. You know project work can't make up too large a share of the business, or you become a big outsourcing shop with margins under 10%, worth nothing on the secondary market.
Your thoughts return to ten years ago yet again. This time, even if it means living on porridge and pickles, you won't fall into the project-work trap for a little revenue and cash flow. Luckily, you can rewind time as often as you like. You smile at yourself.
That's enough. Thinking about too many problems makes it hard to sleep, and you can hardly let an IPO ten years away make you depressed. You go shower and sleep with an easy mind. You smile at the future.
Wait. When did the butterfly flap its wings?
Ten years later, in your dream, you keep asking yourself that question. Luckily, you think, it's only a dream. You smile at yourself.
Software is a complex system too, and life even more so. You understand without my saying it.
Every beginning is hard, but it matters. Remember sensitivity to initial conditions: either do it carefully and well, or never do it at all.
Mae govannen, nothlir!
After careful consideration and consultation, as the maintainer of GNU Artanis, I hereby solemnly declare that GNU Artanis is donated to the HardenedLinux platform. This decision does not change the fact that GNU Artanis is an official GNU project; rather, it further advances GNU Artanis in Debian GNU/Linux based solutions. This was decided by me, since the maintainer has the right to decide the direction of the project.
I was dubbed a GNU maintainer by RMS over 10 years ago. Personally, I have full respect for RMS, even when he was caught in the whirlwind over the misinterpretation of his comments. I don't think this donation causes a separation from GNU; rather, it strengthens GNU's connection to the modern FOSS world. And I hope no one will think I have any problem with GNU or RMS.
Nothing will change, apart from trivial actions:
1. GNU Artanis on GitLab will be transferred from my own namespace to the HardenedLinux namespace.
2. GNU Artanis will mention HardenedLinux in releases.
Officially, GNU Artanis is maintained on Savannah, and folks can discuss it on the mailing list. But we also provide GitLab for modern people.
Nothing has changed, really, except for progress.
HardenedLinux was founded by Shawn Chang, who is an old-school hacker like me. Basically, it works out diverse best practices based on an enhanced Debian GNU/Linux.
It has gained a good reputation in the security industry and has long been a big supporter of free software.
Two years ago, Shawn invited GNU Artanis to join HardenedLinux, but I didn't accept, because I was not sure about the future plan of GNU Artanis.
Today, I'm the CEO of LambdaChip, and I'm thinking about products that take advantage of the power of the Scheme language. GNU Artanis proved product-ready in my previous day job; it is the first product-level Scheme web framework. It deserves a better plan for the future.
Joining HardenedLinux will be a new beginning, and we have the opportunity to be recognized by a mutual community, as well as to be cross-endorsed.
The interesting part is that 15 years ago, Shawn was the first person who told me about the free software movement and the story of RMS.
15 years later, he once again introduced me to a path full of hope and thorns. This time, we will fight side by side for the dream of free software and old-school hackers.
See changelog.
LambdaChip v0.3.3 was released!
This is a bugfix release, so there are no new features, but we fixed many fatal bugs.
LambdaChip v0.3.1 was released!
This is a bugfix release, so there are no new features.
In the past decade, I've been trying to use the Scheme programming language in production environments. The existing lore around Scheme is fascinating, paradoxical, and mystical. However, although Scheme is prominent in academia, I'm pretty sure only a few people have tried it in a product, because I don't see many discussions about the problems of using Scheme in product development. Most people just follow other people's opinions, and few discuss the real problems and how to deal with them.
My choice is GNU Guile, though I have also used other Scheme implementations. Of course, my experience is limited, and I don't claim GNU Guile is mature enough to replace Python. My opinions are not only about GNU Guile; they apply to other Scheme implementations as well. I'd recommend you take a look at Chez Scheme, Racket, Chicken Scheme... to learn about the progress of the Scheme community over the last two decades.
So this article is not going to tell you why to choose Scheme rather than Python; that's not the way to go. Rather, if you have already noticed the trend of functional programming and want to look at some well-known functional languages, this article may help you learn about the power of Scheme in a product.
There are tons of answers, too many to list exhaustively. If you are not familiar with Scheme land, here is what you should know:
- Scheme provides great expressiveness, letting you write painless code that reduces complexity.
- Low-level details are not the main concern in business logic; Scheme offers a good way to do rapid development.
- Less complexity brings fewer bugs. This doesn't mean there are no fatal bugs in Scheme code; that depends on the programmer.
- Scheme is easy to understand. This is important when we have to analyze the program to debug.
- Scheme is a standardized language; the grammar and standard API are well defined.
- Full lambda calculus support. Combining lambdas provides better expressiveness when you need it.
- From the perspective of professional software development, Scheme is worth investing in. I'll make that case in this article.
Well, yes, it's hard to hire a professional Scheme programmer directly; not many people have experience using Scheme in a product.
Five years ago, when I first tried to build a Scheme team, I was concerned about this issue too. However, two months later, I realized it's not a problem at all: we can train a Scheme programmer at low cost. So here is my short story about recruiting.
At that time, we used C++14 for system programming and Scheme for business-logic scripting. Since I was concerned about the difficulty of hiring Scheme programmers, my idea was to ask only about the functional programming parts of C++. Nowadays, C++ has many functional features, so C++ programmers have to know something about FP. It's fashionable, right?
If the candidates were good enough to get our offer, I would teach them SICP during newbie training. Of course, SICP is a big book for newbies, so I only picked the first two chapters. "Data abstraction" is for understanding object orientation, and "procedure abstraction" is for polishing algorithm implementation. We dropped the "metalinguistic abstraction" and "machine abstraction" parts of SICP, since we weren't in the compiler business (we are now, but that's another story I'll tell later).
The training lasted one month. Afterwards, they started doing small jobs on the product: fixing bugs or implementing tiny features. SICP is not only a book but also a formalized educational system created by profound minds. We can take advantage of that system, finish the training quickly, and apply what was learned in daily work while it's still warm in the brain.
The SICP educational system can help us train good Scheme programmers in a short time. Any technology company should have a training period for newbies, and we can do it in one month. It's a good and affordable solution. We've kept doing it this way for five years, and so far so good.
Think about it: we spent zero extra money on this training, and we transferred the knowledge of SICP into a product in a short time.
Not true anymore. Have you heard of Chez Scheme?
Many years ago, when I was still a postgraduate, I was fascinated by the speed of programming languages.
When I became a professional developer, I realized there are at least three kinds of speed in programming:
1. Learning speed: how easy is it to learn the language from zero?
2. Development speed: how much can it reduce the coding work?
3. Execution speed: how fast does it execute?
Knowing this, I suddenly understood why Ruby on Rails is popular in web development, although Ruby was considered a slow language.
If we only considered execution speed, assembly or even machine language would be a better choice.
My friend, please, don't trust this conjecture anymore. It's not true in real work.
If you want a true story, here it is: the language itself is the most important part of any ecosystem.
The Python ecosystem is only for the Python language. Python has a big ecosystem for historical reasons, but that doesn't mean other languages need a comparably big one.
There's only one true ecosystem: the C ecosystem. The modules of dynamic languages are often just bindings of existing C libraries.
So if a language has good FFI (Foreign Function Interface) support, it can take advantage of all C libraries. Today, people may care more about distributed processing, and a distributed system can be heterogeneous, so we don't have to build everything in just one ecosystem.
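To make the FFI point concrete, here is a small sketch using Python's ctypes (an analogy of my own; Guile has its own FFI in the (system foreign) module, which I'm not reproducing here). It borrows strlen straight from libc instead of reimplementing it:

```python
import ctypes
import ctypes.util

# Load the C standard library through the FFI. The same idea applies to
# Guile's (system foreign) module or any other decent FFI.
libc = ctypes.CDLL(ctypes.util.find_library("c") or "libc.so.6")

# Declare the signature of strlen from <string.h> before calling it.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

print(libc.strlen(b"hello"))  # 5
```

One declared signature, and the whole C ecosystem is one call away; this is the leverage a good FFI gives to any language, Scheme included.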
On the other hand, a big ecosystem may contain a large number of packages, but that doesn't mean all those packages and their dependencies are of good quality and well maintained. We shouldn't have blind faith in it.
So let me break this illusion: an ecosystem may be significant in some situations, but it is not the most important thing for a language. When we create a language, we care about its paradigm, grammar, expressiveness, and optimization penalties. These are the most important things for a programming language.
In the past, a programming language was created and driven by a company, and people had no reason to contribute to its ecosystem without payment. For a commercialized programming language community, the ecosystem was therefore important; otherwise it would grow slowly and the investment would be wasted. Nowadays, programming languages are mainly driven by the FOSS (free and open-source software) community, and the ecosystem grows naturally under FOSS power. An experienced developer will worry more about the expressiveness of a language, since it decides your coding efficiency.
Today, deep learning is hot. I'd like to introduce the AISCM project, which is written in GNU Guile, uses the LLVM JIT, and builds on TensorFlow. It works, it's cool, and I've contributed some patches to it. However, I don't have much time to spend on it, so if you hope to use Scheme for deep learning seriously, I'd recommend you contribute to AISCM.
Be warned, there's already a mid-sized codebase in AISCM, and it contains support for OpenCV and FFmpeg as well. Thanks to the author, Jan Wedekind.
As for other Scheme implementations, I have less information about their progress on deep learning, but that's just my short-sightedness.
For historical reasons, there are many Scheme implementations, and the community is fragmented. Today, we see some hope.
First, Scheme standardization has made good progress in the past decade. R7RS has separated Scheme into a small and a large language. IMHO, it's a good step, because it keeps the minimalism of Scheme while providing a way to make Scheme an industrial-level library spec.
My own business relies on R7RS-small, which could have been created just for us. I'll talk about it later.
Another hope is the modern package manager.
For me, I chose GNU Guix, a functional package manager written in GNU Guile. Most GNU Guile libraries can be installed with GNU Guix. The package manager creates the ecosystem; in practice, we don't say "ecosystem", since that's a commercial concept, and GNU is an operating system community.
Of course, Racket and Chicken Scheme have good package managers as well, which helps them form good communities.
Anyway, a consistent standard spec and cool package managers may fix the fragmented community, but it needs time.
Functional programming folks may want to use a pure functional programming language, meaning one with no side effects.
Yes, Scheme is not pure; it's a multi-paradigm language, not only a functional one. But it provides the full set of functional features you find in pure languages. Although Scheme has side effects, you decide what to use and when. I don't think it's a good idea to forbid side effects entirely; people should have the freedom to choose how to use a language. Scheme provides this freedom, for better or worse. Either way, you are free to choose.
Nowadays, most GUI programs are being replaced by SaaS. That's why I mention web development rather than traditional GUI development.
In 2013, I started to write GNU Artanis, a modern web framework in GNU Guile. At that time, I just wanted to cover some simple cases of using Scheme on the web.
I didn't expect GNU Artanis to help me build an industrial-level SOA (Service-Oriented Architecture) for a logistics robotics system, but it did. And now, GNU Artanis is no small piece of code: it contains URL remapping, relational mapping, HTML templates, an asynchronous coroutine server core, integration with modern JavaScript frontend frameworks like React, etc. You can build your site or service in just one minute.
Racket and Chicken Scheme are good choices for web development too. Before I started to write GNU Artanis, I used Chicken Scheme for web programming.
Scheme has been well researched in academia for decades. From embedded systems to distributed server development, there are tons of Scheme papers showing positive outcomes.
So people will ask: for such a good language, how hard is it to learn and use? The answer was already given in the "first illusion" section:
In practice, we spent zero money on Scheme training; with SICP as a sound educational system, we could transfer the knowledge into a product in a short time.
The zero-money story is true, and we didn't count my salary, since I didn't spend much time training them myself. Of course, we discussed programming issues, but most of the discussions were about general programming, not Scheme specifics. They could fluently write Scheme code by themselves; if you've ever learned Scheme, you won't doubt it.
The cost is so low that it's harmless to give it a try.
As a seasoned C/C++ programmer, I have to say we shouldn't waste our time coding in C/C++ for the non-system-programming parts. Business logic may change with the requirements, and it's not flexible to write everything in C/C++. Don't waste your time on refactoring and debugging.
Scheme's powerful expressiveness helps you reduce coding work.
But it depends on how you use it.
If you expect many libraries to save you work, Scheme is not the choice; that usually matters for a proof of concept, where you may prefer Python.
If you use Scheme as a scripting layer over your C/C++ codebase, it's a good way to go. Scheme is minimalist and easy to embed as script code in a system.
The next time you're about to complain about bad code in a review, ask yourself whether you ever gave your team members a chance to acquire real CS knowledge for serious programming. Scheme can bring such a chance; it's easy to learn and, finally, good enough to use in a product.
I'm glad that you finally ask me this question.
In the past two decades, we have seen great progress in modern software development: erupting FOSS communities, DevOps, SaaS, deep learning, functional programming, etc. We want to bring efficient development to embedded software, and a better workflow for integrating embedded software into AIoT solutions.
Please follow us on Twitter, join our Reddit forum, and bookmark our site. We occasionally publish cool things.
Finally, choose your weapon wisely. And it's not bad to diversify and learn more things.
LambdaChip is a functional-language virtual machine designed for embedded systems.
First, you need Docker; I recommend the official installation document.
Then you can pull Artanis image:
docker pull registry.gitlab.com/hardenedlinux/artanis:latest
Because content created inside the Docker environment is volatile, I'd recommend you create a workspace in the host environment before jumping into the Docker environment.
mkdir myapp
Now you can run a container:
docker run -it --rm -p 3000:3000 -v $PWD/myapp:/myapp registry.gitlab.com/hardenedlinux/artanis:latest bash
Let me explain the options: -it gives you an interactive terminal; --rm removes the container when you exit; -p 3000:3000 maps the Artanis server port to the host; and -v mounts the myapp workspace you just created into the container, so your work persists on the host.
Now you can create your first Web App with Artanis:
art create hello
cd hello
art draw controller hello world
When everything is prepared, you should do this to boot up Artanis server:
art work -h 0.0.0.0
Because the default host is 127.0.0.1, which cannot be reached through the mapped socket port from outside the container, we set the host to 0.0.0.0 to make sure you can check out your work in an outside browser.
You may want to visit http://localhost:3000/hello/world in your browser.
For more details, please read the manual of GNU Artanis.
Recently, I released GNU Artanis-0.5. Here's the release note.
So what's new in it? We list some notable changes here and ignore miscellaneous bugfixes.
The most notable news is that Artanis supports Guile-3 since Artanis-0.5. Guile-3 is a significant version because it contains better optimizations and a JIT. However, Guile-3 is incompatible with Guile-2 in some respects; you may read more in my mail. That is to say, you should not use Guile-2 anymore as of Artanis-0.5.
I believe some people may ask the same question: why not GnuTLS?
Alright, we don't reject GnuTLS, but its Guile bindings have been dropped by some distros, and I'm not going to write my own guile-gnutls wrapper. So I prefer to wait.
It's possible to support multiple security libraries if necessary. Fortunately, NSS has good code quality and is a nice option for our design in the long term.
In Artanis-0.5, libnss is used only for hashing algorithms. Previously the hash functions (MD5, SHA, etc.) were implemented in pure Scheme; taking advantage of an existing industrial-grade library is obviously the better choice for both efficiency and security.
In the future, we may add more crypto and security features.
Ssql is a feature that maps s-expressions to SQL strings. We fixed two issues to conform to the SQL standard.
String values should be single-quoted
In SQL, strings should be single-quoted, not double-quoted. For example:
(->sql select * from 'Persons (where (/and #:name "john" #:company "lambadchip.com")))
And you should get the result:
"select * from Persons where name='john' and company='lambadchip.com';"
Number values shouldn't be quoted
(->sql select * from 'Persons (where #:age 15))
And you should get the result:
"select * from Persons where age=15;"
In the current design of Artanis, these APIs are used for controlling the client's cookies.
These APIs affect only rc-set-cookie, which handles new cookies created on the server side:
:cookies-ref
:cookies-set!
:cookies-setattr!
These APIs affect only rc-cookie, which handles cookies sent by the client:
:cookies-remove!
:cookies-check
:cookies-value
You may use #:cookies to initialize a cookie instance and then control it; the result will be added to the HTTP response automatically. If you want to modify an existing cookie sent by the client, you have to link the client cookie with the cookie instance you initialized:
(get "/set"
#:cookies '(names a)
(lambda (rc)
(:cookies-set! rc 'a "a" 123)
;; Don't forget the path should be the same, otherwise the modified
;; cookie will not affect.
(:cookies-setattr! rc 'a #:path "/test")
"done"))
(get "/test"
#:cookies '(names a)
(lambda (rc)
;; link with :cookies-set!
(:cookies-set! rc 'a "a" (:cookies-value rc "a"))
(:cookies-setattr! rc 'a #:expires 21600 #:secure #t)
"yes"))
For more details, please read Artanis manual about cookies.
Rust is a strange language.
It has a modern, promising design and good performance compared to C++. It has proved to be an alternative systems language that can be used to write an OS; although there's no production-level Rust OS yet, several implementations are potential players. Expectations are sky-high, it is trying every trick in the bag to grow, and it is gaining popularity among the young crowd. All of this is very good for a novel and healthy community.
On the other hand, this language seems to have no idea where to go. No, I'm not talking about its dream goal; of course I know its goal is explicitly stated: safety, speed, and concurrency. What confuses me is the road it chose toward that goal. The Rust community chose an iterative model rather than a waterfall model to develop the language. That is to say, the code and interfaces keep changing while the design is still not settled. Although this is the popular development model for business software in today's internet industry, I doubt it's the best way for a foundational, forward-looking programming language. However, it's too early to say whether this development model is bad for today's industry. We can see many companies trying to use Rust even though its design is still unstable. In the past, taking such risks in a serious product would have been insane. But today's industry iterates rapidly; maybe our understanding is too old for it, and maybe Rust's model is a brand-new way to improve the industry. People can adapt to fast-changing things even when it's their foundational language, and there could be stronger automated tooling to rely on. Who knows; let's give it a try, warriors, the boss pays for it.
This article is about the Rust frontend for GCC. It's reasonable for me (or any potential contributor) to criticize Rust's development model, since it interferes with the efforts of Rust's own GCC frontend.
So far, the only reliable implementation is the officially supported LLVM frontend. I don't know if there are more, but I will not choose one that isn't based on a mainstream compiler infrastructure. After all, this is a systems language; once you choose, you're not expected to change for a decade. In my experience leading software development over the past decade, it's not simple to change fundamental things, from libraries to compilers to languages. Not impossible, but why waste time on it when you have plenty of other work to deal with?
Stop; this is not a criticism piece against Rust, so I'm not going to complain any further.
OK, that's enough. Now that we're trying to work on the Rust frontend for GCC, it's fair to say we're interested in this language and hope to promote it. As a perennial GNU hacker, it's natural for me to think about Rust on GCC. I have some experience writing language frontends; the architecture is similar, and so is the theory. However, gcc-rust is a very hard project that cannot be done successfully by a one-man effort. In this article, I'll introduce the ideas and efforts around the FOSS community. I hope this information helps people understand the situation and helps potential contributors take part.
Of course, Rust on LLVM is already far ahead, but what can I say?
Rust on GCC is a Hydra. Why? Because there are at least three possible ways to go. Oh well, three heads of one dragon.
These three ways were originally described by @redbrain, one of the enthusiasts of Rust on GCC (https://github.com/redbrain/gccrs/issues/2#issuecomment-563255365). Unfortunately, he has dropped his efforts, but his work on Rust on GCC is still inspiring.
The first way is the most lightweight. The original Rust compiler provides an IR (Intermediate Representation) named MIR, so the idea is really simple: take advantage of the frontend of Rust on LLVM, then convert MIR to GENERIC, the language-independent IR in GCC. The pro is that we don't have to deal with Rust's occasionally changing grammar. The idea is achievable, since the original Rust compiler has a similar architecture for converting MIR to LLVM IR. The problem is that this Rust-on-GCC frontend requires the original rustc, which may drag in the LLVM dependency chain. People may then wonder why we bother to implement Rust on GCC at all: it neither reduces the dependencies in the development environment nor is likely to be widely accepted by GCC folks. Not to mention the license issue.
The second way is to use the GCC JIT interface on top of the original Rust compiler. This architecture implies adding a new backend (libgccjit, https://gcc.gnu.org/onlinedocs/jit/cp/index.html) to the original Rust compiler. So it's not the Rust on GCC most people expect; it's a feature request asking the Rust developers to support yet another compiler backend and maintain the code within the Rust community. This is the perfect approach only if the Rust developers accept it. However, the hard part would be the license issue, and it all depends on the willingness of the Rust community.
The third way is the traditional one: a full C++ implementation of Rust on GCC. There's a good existing example, gccgo, the Go language frontend for GCC. This is the hardest way, building a Rust compiler from scratch. The pro is that the developers can do everything they like to build an efficient Rust compiler that challenges the original one, and it doesn't require LLVM as a prerequisite. Finally, there is no license issue at all; we can make it GPLv3+. However, there must be a strong group to maintain such a frontend; gccgo has a strong team hired by Google. Not to mention you have to keep up with the volatile Rust grammar periodically.
All these approaches are independent of each other, so it's impossible to combine the three forces. The MIR way is being worked on by @sapir (https://github.com/sapir/gcc-rust/tree/rust) and is still in progress. The libgccjit way requires a connection to the Rust community that is out of my reach, and I don't know of anyone working on it. As a GNU hacker, my preferred way is the third one. It was dropped by @redbrain, but some inspiring code is left.
My plan is not to work on it secretly and then suddenly throw out a release to shock people. I have too many projects in my TODO; I do want gcc-rust to exist, and I do want more friends working on it. Let me emphasize it again: it's impossible to do this as a one-man effort.
Fortunately, @SimplyTheOther has written a fully operational lexer and parser for a recent Rust syntax. It's great work. I have enough experience writing parsers for practical languages to know it's time-consuming work; that's why I appreciate @SimplyTheOther's work so much. Of course, I agree with @SimplyTheOther that the most time-consuming part will be type checking; at least for Rust, I'm afraid that's likely to be true.
After I helped fix some tiny bugs to make it work smoothly, I started to write an enhanced AST dumper. This is reasonable for us: first, I need to get familiar with @SimplyTheOther's AST design; second, we need a human-readable AST dump for debugging.
Let me show you a silly example:
fn abc(x: u32, y: u32) -> u32 {
    return x + y;
}

fn main() {
    {
        1 + 1;
    }
    println!("Hello World!");
    abc(1, 1);
}
Alright, it's silly, meaningless code; let's see what the AST dump looks like.
Assuming you saved the source code as test.rs, run the AST dump:
rust1 -frust-dump-parse test.rs
The output looks like this:
Crate:
  inner attributes: none
  items:
    u32 abc(x : u32, y : u32)
    BlockExpr:
    {
      outer attributes: none
      inner attributes: none
      statements:
        ExprStmtWithoutBlock:
          return ArithmeticOrLogicalExpr: x + y
      final expression: none
    }
    void main()
    BlockExpr:
    {
      outer attributes: none
      inner attributes: none
      statements:
        ExprStmtWithBlock:
          BlockExpr:
          {
            outer attributes: none
            inner attributes: none
            statements:
              ExprStmtWithoutBlock:
                ArithmeticOrLogicalExpr: 1 + 1
            final expression: none
          }
        ExprStmtWithoutBlock:
          println!("Hello World!")
        ExprStmtWithoutBlock:
          outer attributes: none
          StructExpr:
            PathInExpr:
              abc(1, 1)
          inner attributes: none
      final expression: none
    }
It's useful for debugging and extending the parser. After all, if we can't make sure the parser is correct, how can we move forward to type checking and IR transformation?
The project repo is here, patches are welcome: https://github.com/philberty/gccrs
Before answering this question, I guess you may wonder why I used `rust1` rather than a traditional name like `gccrust`.
`rust1` is the Rust compiler proper, while `gccrust` would be the driver that chains the compiler, assembler, and linker together. We haven't done the IR transformation yet, so there's nothing to assemble and link except the AST. Oh my poor Rust compiler!
It's not so hard to transform the AST to GENERIC, the current top-level IR of GCC. However, it's pointless to rush before we've made sure the parser handles the Rust grammar correctly. Not to mention the critical feature of Rust: the linear type system.
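As a taste of what that type checker must eventually enforce, here is a small example of Rust's ownership discipline, the everyday face of its linear type system (this is plain Rust, nothing gccrs-specific):

```rust
// Taking `s` by value moves ownership into this function;
// the caller's binding becomes unusable afterwards, and a
// conforming compiler must reject any later use of it.
fn consume(s: String) -> usize {
    s.len()
}

fn main() {
    let greeting = String::from("Hello World!");
    let n = consume(greeting);
    // println!("{}", greeting); // ERROR: value moved; the compiler must reject this
    println!("consumed {} bytes", n);
}
```

Tracking these moves through branches, loops, and borrows is exactly the analysis a GCC frontend would have to reproduce faithfully, which is why type checking is expected to be the most time-consuming part.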
If you have followed me so far, I can answer the question of what comes next.
1. The type system: too big to discuss here.
2. The static analysis.
One advantage of llvm-rust is that it can use LLVM's static analysis. Fortunately, David Malcolm recently finished his GCC static analysis framework (https://lwn.net/Articles/806099/). I'm not sure whether there will be any hook for the frontend to interact with, or whether we lose access once we hand things off to the next-level IR. Anyway, it's an exciting GCC feature to research.
3. A new middle IR for Rust-specific pre-optimization.
This idea is largely inspired by gccgo, which inserts a specifically designed IR for the Go language, named GOGO, before transforming to GENERIC, and does some pre-optimization on it. From a compiler-design perspective this is reasonable, since some information is lost after lowering and can never be recovered for certain optimizations.
4. AST optimizing/refactoring.
There could be caveats or limitations in the current AST. So far I've found just one: it lacks a record of printing state, so we can't dump the AST with pretty indentation. I've implemented the indentation with a global variable. I don't like it, but it's safe since there's no threading in AST dumping.
5. AST to GENERIC/GIMPLE. Nothing to say.
6. Memory management model.
I haven't thought about it deeply, but gccgo has been optimizing its memory footprint for many years and finally got great results. So I don't think this will be easy work for us.
7. Exception throwing and recovery.
One of Rust's ideals is "safety", which is easy for ordinary users to take from the manual and the advertising. However, the real work is behind the syntax: if we fail to implement it correctly in the compiler, then no one can guarantee the safety. After all, the grammar and the feature descriptions are just pieces of paper, not the magic itself.
8. Runtime, standard library, package support, compatibility, etc ...
Hey, we're a small community with several core developers, so this is just my personal opinion, but I think these are things that will be done sooner or later.
Happy hacking!
In 1997, a guy named Paco Underhill wrote a book called "Why We Buy: The Science of Shopping". After publication it became hugely popular within a certain circle, was reprinted many times, and was hailed as "the sales bible of the new era". You don't need to read it, and you'll see why in a moment. So what's in it? It is an insightful and highly practical book. It argues that smart selling always adapts to customers' habits rather than trying to change them, and it therefore proposes applying scientific statistics and analysis in sales settings: studying how to raise purchase rates by rearranging product placement, the use of floor space, the pairing of signage, and so on. Of course, the cases in the book are tied to specific retail scenarios and the lessons can't be copied directly, but the analytical method it introduces is general.
A few examples.
Through observation and statistics, the author found that a certain bookstore placed bestsellers and discounted books together near the entrance, but most customers coming through the door were drawn to the discounted books, and after picking one they rarely browsed deeper into the store. The store earned a reputation for its discounts, but the discounted books were so attractively priced and so easy to grab that the other books lost their chance of exposure to readers. Through surveys he found that when male customers simply wanted to walk in and grab a drink, the red Coca-Cola signage usually caught their attention. Through field studies he found that when the checkout area wasn't clearly separated from the display area, a crowded checkout would drive most shoppers in the display area to give up buying because they couldn't browse in peace. None of these conclusions is profound, but in the past few people noticed them, and there was no systematic method to analyze and address them. The author was among the earliest to apply scientific methods to selling, so he inspired many people.
At this point you may ask: if the book is this good, and this is exactly what I've always wanted to learn, why don't you recommend reading it?
Simple: it's outdated.
We can't deny the remarkable insight of Mr. Paco, who made his name on this bestseller. But at this point I usually have to face one or two questions from the contrarians first.
The first question is usually: what evidence supports these plausible-sounding conclusions? According to the book, the diligent Mr. Paco and a few assistants staked out stores with camcorders for long stretches at a time; over 20 years they shot thousands of hours of footage and conducted a great number of personal interviews. All the conclusions are the fruit of their sweat.
The second question will probably be: what line of work are you in, that you dare to pontificate about the science of selling?
Good. The lovely contrarians may now exit the stage. We must admit contrarians are sometimes useful; their only role here was to raise these two key questions for me.
Why is the first question key? Clearly, the diligent Mr. Paco worked too hard. Today sensors and cameras are everywhere, faces can be recognized, and biometric information can be technically linked to occupation, family, living habits, and bank accounts. To do this kind of scientific selling, you no longer need to stake out with a camcorder and crunch numbers for months. While we browse the web, masses of scripts watch our every move: the speed of our mouse wheel, the number of scrolls, how long we pause, all of it can be used to estimate which part of a page we are probably reading and to quantify our interest in the content. When you lack the urge to shop, pushing buy-buy-buy recommendations at you based on a profile drawn from your personal browsing history has long been trivially easy.
I must stress Mr. Paco's great insight once more. He proposed that purchases depend on the shopping environment at that moment. That is, we cannot carry store A's conclusions into store B like the man who notched his boat to find his sword; we must run the same study in store B. Even that is not enough: many customers' interest in buying arises only at the moment they enter the store, so to grasp their behavior precisely we must analyze it then and there. Mr. Paco pointed out that as information channels widen, blind faith in brands is fading; customers loyal to one brand for life will grow ever rarer, every decision a customer makes is made afresh, and no conclusion can be taken for granted. This means that tracking customers in real time at the point of sale is the only road to precision selling. But however diligent Mr. Paco was, he could not shoulder a camera and calculate every customer's psychology at every moment. Big data can. I know the term "big data" has been worn out, but right here in this article, the technology I am describing and the problem it solves are part of the very essence of big data; the term itself might be pleased that after all these years someone has finally used it correctly.
I think I've now answered why you don't need to read the book. In this era, if you still shoulder a camera to shoot thousands of hours of video and hire people to record customer behavior, your competitors will be snickering. If you want to grasp this era's sales systems built on big data and IoT, that's a fascinating subject, but it's beyond the scope of this article. This article isn't about how to make more money with modern technology; perhaps we can discuss that another time.
So what does this article want to talk about? That answers the contrarians' second question: what I do, and why I understand these futuristic sales techniques. Because I know computers; it's how I earn my living. And why do I hold forth on this future science of selling? Because it may be violating my privacy.
There's a famous story that Clinton won the 1992 election thanks to his campaign team's slogan: "It's the economy, stupid!" The slogan was a thunderclap at the time: the US economy was in a slump, everyone knew the economy was the problem, and Bill Clinton hoisting the economic banner could shake half the globe more than Bill Gates could.
But if I tried for the same awakening effect in a 2020 article, this approach wouldn't work, because most people don't understand what privacy is; if I raised my arm and shouted, what followed might be bafflement and mockery. That would be awkward, so instead of shouting, let me explain what privacy is.
First, I think a normal discussion must exclude the extremes, like dropping the highest and the lowest score. We won't spread doomsday-style privacy panic, so this article doesn't advocate wrapping yourself up like a rice dumpling in the information age and leaking no personal information at all. At the other extreme, we won't debate whether privacy protection is necessary. I think anyone with normal intelligence and social experience accepts that privacy protection is necessary; it's self-evident and not worth arguing about.
But even if we agree that privacy protection is necessary, we still need to work out where exactly privacy matters.
I'm a pragmatist, so this article won't discuss the philosophical meaning of privacy protection. Let's just talk about the trouble privacy leaks can bring to your personal life.
Many people have seen DiCaprio's blockbuster Inception, in which each dream thief has a personal totem whose properties and details only they themselves know and must never reveal; otherwise, if someone built a dream to deceive them, they would have no way to see through it without outside help. Mapped onto real life, the metaphor's logic is this: your mind must hold a region of information that belongs to you alone and must not be casually leaked, or you will be unable to see through a scam carefully crafted against you personally.
Take an example. One day I get a call from a stranger saying a close friend of mine is in trouble and asking me to transfer her some money. Of course I don't believe it. But then she reveals a small piece of information that only I and that friend know. Since the chance of that leaking is very low, as an ordinary person I reason that she probably really is close to my friend, and for my friend's sake I choose to trust her. Normally I should call my friend to confirm, but here's the problem: if my friend's line happens to be busy, the caller is pressing, and the amount is small, it's hard to refuse. Yet once the money is sent, if it was a scam it's very hard to recover. Someone may say an Alipay or WeChat transfer can be traced, but how do I prove I was scammed? Customer service can't cancel a transfer just because I claim fraud, and, absurdly, because I'm the one claiming fraud, the burden of proof falls on me. Being scammed wouldn't be the end of the world, but would I be happy?
When privacy comes up, many people first shoot back with a question: I've done nothing shameful, why should I fear others knowing my information? As this story shows, privacy doesn't mean shameful secrets; it means information worth guarding but not for outsiders' ears. Its leakage is neutral to society, causing neither harm nor benefit, yet for the individual it can be a source of trouble. It's easy to imagine: the love letters between you and your ex get dug up and passed around. Nothing shameful there, right? But are you happy?
Of course at this point I'll face a challenge: I'm not rich or famous; no one has a reason to come after me. But people get scammed not only through their own greed but also because their information leaked, the scammer knows them inside out, and the cost of fraud has become too low. There's a joke that all the ways of making big money are written in the criminal law. Read in reverse, that's quite sensible: by legislating this way, the criminal law drives the cost of those "big money" schemes extremely high, so would-be offenders have to count the cost. Apart from the few who are bold because they're skilled, or simply too stupid, the vast majority won't even think of breaking those laws. But if the cost of doing something bad is too low, all bets are off. Evidence? Plenty. Every day we receive harassing sales calls from people who know our situation inside out, who can even recite our ID numbers. Do they feel they're doing wrong and hold back? Not at all. They just think: this information is there for the taking, everyone does it, why shouldn't I?
In this era we have over-trusted computers and the internet; we've put all our information online. And as humanity masters ever more advanced technologies, their wild growth exudes a kind of greed, devouring things it was never meant to touch, even scheming to swallow the things you never intended to hand to the internet.
One day, who will pay for this entirely foreseeable chaos?
From Mr. Paco's book we can fully understand that these future-facing information-gathering technologies carry real value; we cannot deny it, and once we've tasted the sweetness of this future science of selling, even the most rational person will struggle to resist. Given that, we can hardly claim these technologies are illegal and evil: clearly they can raise our quality of life and powerfully stimulate economic demand.
Kevin Kelly wrote in What Technology Wants that sacrificing some privacy is worthwhile if technology improves our quality of life. Then we must ask: which privacy is worth sacrificing?
Honestly, I don't know. I suspect you don't either.
When I review Web standards as a privacy reviewer at W3C, facing users' ever-shifting needs and the constant innovation of the technology built around them, a voice keeps echoing in my head: human needs and technical innovation are two irrepressible dragons; only by harnessing them to our will can we avoid being devoured by them. Our current strategy is to block. On the browser side, any technical behavior that could leak user privacy, we block at the level of Web standards, letting internet standards dictate what must not be collected or leaked. Blocking is a thankless approach; we constantly end up in highly technical debates. When Google or Apple finally polish something they think is wonderful and submit it, they may be imagining the exciting products it enables, only to be met head-on with our privacy challenge. More often we argue to a stalemate, and we can rarely muster enough proof that certain data raises privacy concerns, so we usually choose to let users opt in: the browser asks the user's preference, and the user can change that choice at any time through settings. That is my current answer: I don't know, you probably don't either, so the choice is left to you, and you can change it anytime. This will surely change the user experience: in the near future you'll find the web constantly popping up choices about whether to expose this or that bit of privacy. It's not the best approach, but at least we have a choice. For some of our work, see "How the Internet Standards We Live By Are Made".
Someone may say: even with a standard in place, an actual product can simply ignore it; what binding force does a standard have? Well, since this article has touched on the science of selling, let's say a word about markets too. If your product plays fast and loose with privacy but offers a seemingly good experience, that only defines your market positioning; you've actually left your competitor a better path to carve up the market. In other words, if you don't occupy the privacy hill, your rival will. Whether you rate that hill, and whether you can take it, depends on your skill. Whether privacy matters in selling isn't for any one person to decide; the market decides. The EU's General Data Protection Regulation (GDPR) conjured a new market out of thin air, the privacy market. Isn't that a new driver of demand? If China wanted to imitate GDPR, how much resistance would there be? Remember, don't go to extremes, and don't use the meaningless excuse that "privacy protection doesn't suit China's national conditions" to mask your ignorance, as if the Chinese market were second-class. The point is not the collection of private data but whether, once collection is permitted, leaks can be held legally accountable. GDPR doesn't forbid collecting private data either, but once a company leaks user data, the punishment is lethal. As I said before: if you want to "make big money", first weigh whether you can carry that much responsibility.
As a businessman, I don't want to discuss privacy beyond business, say in philosophy, social science, or politics; that's beyond my competence. The technology underneath is the same, but the angles and the stakes are completely different.
In this great information age it's unrealistic to expose no privacy at all, but I hold to this view: barring force majeure, I have the right to refuse data collection by companies I don't trust. If no such company exists in some service market, that market is well worth my building one, and I very much hope people who share my view will try to seize such markets. If we're like-minded, we could even start a business together, so that an internet market whose structure seems frozen can be broken open and enter the next round of healthy competition.
In business, what's frightening isn't competition; it's a frozen landscape with no opening to break it.
Although as users we cannot decide which privacy we should give up, as a company we can absolutely decide which privacy we don't need. Many companies today grab data indiscriminately, as much as possible, but more data isn't necessarily a good thing. When big data first emerged, the market went through a craze of data worship, as if having data meant having everything. But now that algorithms and computing power are assured, processing reveals the data to be too vast and too messy for clear features to emerge. Looking back, companies discover that algorithms and compute were not a blessing but a trap: they didn't make much money; the premium-priced algorithm engineers and the hardware vendors selling compute ran off with it.
No entrepreneur fails to calculate this; perhaps the time just hasn't come. As someone trained in management, I'm certain of one thing: enterprises can only move toward refined operation; the grab-everything playbook is a product of its era. In the mature phase of the information age, refinement means refined selection of information, and efficiency in processing it. How to do subtraction on information is the more challenging problem. My idea is to use privacy protection as the scissors; that gives me a clear motive and methodology for cutting away unnecessary data.
Each age brings forth its own talents, each leading the fashion for a few years. Perhaps in the next era we will see ourselves acquire a new persona: the consumer persona. Distilled from all the commercial data collected about us, it represents us only at the moment of consumption; when merchants try to sell, this persona precisely reflects our shopping preferences. But it cannot be mapped back onto other aspects of us. It exists only in the market; it can be a commodity, yet with enough personhood to reflect our psychology as consumers all the more accurately. This consumer persona lives in big data, and it is safe for our lives.
A fable goes like this: a young man asked a master sculptor how to learn sculpture. The master replied: "Sculpture is simple. Find a block of marble and remove the parts you don't need."
The marble is easy to find, and we now know which parts aren't needed. All that's left is to remove them.
Today I released GNU Artanis 0.4.1. GNU Artanis is a production-level web framework for the Scheme programming language. Please see the release note for a fuller description.
So what's new?
Nowadays, industry people use web frameworks for RESTful APIs, and an API should be managed by its version. Of course we can add a version number to a controller in MVC, but that's not so convenient. In 0.4.1 we provide an API generator. Currently we only generate RESTful APIs, but the module is extensible and not limited to the RESTful style.
A new "art api" command was added, and the usage is explicit:
art api -c -v v1
You may omit the "-v v1" option, since the default version is "v1"; Artanis will warn you if the version already exists.
After this command, an API file will be generated:
creating restful API v1
create app/api/v1.scm
The generated "app/api/v1.scm" looks like this:
;; RESTful API v1 definition of tt
;; Please add your license header here.
;; This file is generated automatically by GNU Artanis.
(define-restful-api v1) ; DO NOT REMOVE THIS LINE!!!
"define-restful-api" is a macro that pre-defined many things. Now you may use "api-define" to define you preferred API:
(api-define
 hello
 (lambda (rc) "hello"))
Please notice that you should visit "v1/hello":
curl localhost:3000/v1/hello
For more details, please check out the manual.
If you're interested in GNU Artanis, you can get a quick start with Docker; please read Install GNU Artanis with Docker.
Happy hacking!
For the past 30 years, GNU Hurd has been the official kernel of the GNU operating system. The term "kernel" may not be precise here: "the kernel" is the classical concept of monolithic OS design. A microkernel system still has a "kernel", though a smaller one, but GNU Hurd doesn't follow the simple "kernel + userland" design anyway. GNU Hurd is a collection of servers, which interact with its microkernel, GNU Mach, via IPC (Inter-Process Communication) for system-level requests. Sounds a little familiar? Imagine running a bunch of Docker containers in the cloud, with a centralized node for scheduling.
Alright, alright, I know you're thinking about Kubernetes, please stop it, really.
Linux is a kernel of a complete OS: GNU or Android. The popular Linux distros are mostly GNU OSes, although not necessarily a full GNU, like Debian/Arch/Ubuntu/SUSE; we call them GNU/Linux distros. Android is not a GNU/Linux distro, because it doesn't follow GNU's system design. Android OS is independent of GNU OS.
Now we know GNU Hurd follows a "microkernel + servers + applications" pattern, with the "servers + applications" in userland. If we still consider "microkernel + servers" a generic kernel concept, what happens if we replace this generic kernel with another one?
Since the 1970s, there have been several microkernels:
Amoeba. Authored by Andrew Tanenbaum, who is also known as the author of Minix. An interesting point: the Python programming language was originally designed for this operating system. Well, we really can't anticipate the future; there are so many interesting turns in our history.
Minix. Well, no need to introduce, huh?
Chorus. One of the earliest microkernels (the other being Mach), and a real-time distributed OS. In 1997, Sun Microsystems acquired it for their JavaOS dream. Discontinued; unlucky.
Mach is the microkernel of GNU Hurd; GNU hackers maintain a modified version named GNU Mach. Nowadays, L4 is believed to be the better one. A microkernel has no traditional system calls, so IPC efficiency becomes the critical factor in evaluating a microkernel design. Mach belongs to the first generation of microkernels, and its IPC was too slow. In the early 1990s microkernels got a bad reputation for inefficiency, so people made many efforts to change the situation. In 1987, the German computer scientist Jochen Liedtke started his third OS design, named after himself (Liedtke's 3rd system, aka L3), but he still found his microkernel too slow. He then realized that IPC was the critical factor, so he started his fourth OS design (yes, L4) and concentrated on optimizing IPC with hardware-specific features. L4 became the new game changer, the second generation of microkernels, bringing unbelievably fast IPC: only about 0.04us on a 500MHz Pentium III.
Unfortunately, Jochen Liedtke passed away suddenly on June 10th, 2001. His legacy is still shining in this world.
It's hard to say Richard Stallman's choice of Mach for GNU Hurd was wrong, because Mach has proved itself in industry: macOS and WinNT are both based on Mach, and their versions of Mach have been greatly improved; WinNT even re-implemented its microkernel following the Mach design. Two main developers of the original Mach went on to senior roles: Avie Tevanian was Apple's software CTO, and Richard Rashid was a top-level researcher at Microsoft Research. Given this background, we can hardly say the Mach design is not powerful enough for an ambitious OS like GNU Hurd. However, GNU Mach is an older version that may not compare to the Mach derivatives used by macOS and WinNT, and GNU Hurd hasn't been maintained by its original maintainers for decades. Anyway, today we do have better alternatives.
In 2014, seL4 was released under GPLv2, big news for the free-and-open-source OS world, because seL4 is the only known OS kernel that has fully passed formal verification (so far fully for ARM; x86 is not yet complete). seL4 is so extraordinary that we can't ignore it anymore; if we want to enhance or re-create GNU Hurd, I don't think any microkernel other than seL4 should be chosen.
Before answering this question, we have a story to tell: the story of the struggle with L4 in the Hurd community over the past decades.
There were many efforts to port Hurd from Mach to L4; unfortunately, all of the known efforts failed. However, those experiments yielded a lot of experience, which will be very useful for future Hurd development.
In the beginning, Hurd people tried to write a Mach-on-L4 layer to get the ball rolling, then gradually move the Hurd servers to use L4 interfaces rather than Mach ones. As you may imagine, this approach makes Hurd run slower at first, but one could expect better performance once the L4 port was complete. I have to say this approach looked promising, but sadly the Hurd people eventually concluded that L4 might not be suitable for a general-purpose operating system design. I don't think that conclusion holds today, because Genode proves that L4 is suitable for everything from embedded systems to general-purpose operating systems.
Hurd/L4 development stopped in 2004. It left several important experiences and documents about L4 on Hurd.
Hurd people also tried Coyotos and Viengoos, microkernels different from L4; those projects haven't proved successful so far either. They are outside the scope of this article, since I only want to focus on L4.
Today, the Hurd-NG (next generation) plan focuses more on a design independent of the current Hurd. Another effort is x15, a brand-new microkernel intended as an alternative to Mach; it's believed to be the more reasonable plan since it can reuse all existing Hurd components. Mach is the only part that would be replaced.
Now let's get back to seL4.
First, it's GPLv2, not as good as GPLv3 but still very good in my opinion.
seL4 can pass formal verification because it is so small (yet still powerful): around 10,000 lines of code. Mach on x86 has about 90,000 lines; Linux, more than 2.7 million. It's OK for a kernel to be small, but if it's too small, we need to add the features a general-purpose OS requires; and the more features we add, the harder it becomes to keep passing formal verification. So there has to be a compromise: we can only make sure seL4 passes verification at a certain level. If we can control the granularity of the necessary requirements, seL4's verification is still meaningful. But what extra features do we need? That's another open topic.
If we replace Mach with L4, we have to support L4 in glibc, for example implementing the C standard library on top of the L4 APIs. That is never a small job, plus the necessary testing! Fortunately, a microkernel is different from Linux: Linux has many system calls, but in a microkernel, in theory, there could be only two APIs, message-send and message-receive. Please don't forget that you're passing messages, rather than making function calls, to request resources from the kernel. You may ask: how do other microkernel OSes support C/C++? Well, they may not choose glibc. For example, Zircon, the microkernel of Google's brand-new OS Fuchsia, chooses musl. However, I'm a GNU hacker and I want to see a complete, better GNU OS, so glibc is necessary for GNU tools and applications.
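To illustrate the "two primitives" idea, here is a hedged toy model (written in Rust, with channels standing in for kernel IPC; real L4/seL4 APIs look nothing like this). A user-space "file server" thread answers read requests, and the libc-like side is built purely out of send and receive, exactly the way a Hurd server answers a client:

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

// In this toy microkernel model, the only "system calls" are
// sending and receiving messages of this type.
enum Msg {
    Read { offset: usize, len: usize, reply: Sender<Vec<u8>> },
}

// A user-space "file server" holding the file contents.
fn file_server(rx: Receiver<Msg>) {
    let data = b"Hello from the Hurd!".to_vec();
    for msg in rx {
        match msg {
            Msg::Read { offset, len, reply } => {
                let end = (offset + len).min(data.len());
                // Out-of-range reads yield an empty chunk.
                let chunk = data.get(offset..end).unwrap_or(&[]).to_vec();
                let _ = reply.send(chunk);
            }
        }
    }
}

// What a libc read() might reduce to: one message send, one receive.
fn read_via_ipc(server: &Sender<Msg>, offset: usize, len: usize) -> Vec<u8> {
    let (reply_tx, reply_rx) = channel();
    server.send(Msg::Read { offset, len, reply: reply_tx }).unwrap();
    reply_rx.recv().unwrap()
}

fn main() {
    let (tx, rx) = channel();
    thread::spawn(move || file_server(rx));
    let bytes = read_via_ipc(&tx, 0, 5);
    println!("{}", String::from_utf8_lossy(&bytes));
}
```

The point of the sketch is the shape of the work: porting glibc to L4 means rewriting every POSIX entry point as a message exchange like read_via_ipc, which is why it's such a large job even though the kernel interface itself is tiny.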
I've discussed the seL4 plan on the Hurd-L4 mailing list. So far, there's no coding work for it. I think it's better to share more of my ideas on my blog to get more advice, or potential contributors. Personally, I have many projects underway; with good scheduling and sufficient ideas and information, I think it could happen.
BTW, if we really get L4 on Hurd someday, we may have to call the operating systems "Debian GNU/Hurd/L4" and "Debian GNU/Hurd/Mach" to distinguish them. I'm kidding; if that day comes, we'll have special codenames for them. ;-)
There are many explanations and rumors around this question, but IMHO the answer is not so complex: there aren't enough contributors. Let me ask a deeper question: why are fewer people interested in it? Well, I think many people are interested, but they lack the knowledge to help GNU Hurd technically. After all, research on and application of monolithic kernels is prominent in the industry, while you usually learn about microkernels only if you're a serious CS student. That's not cool, because the success of Linux was built on massive numbers of contributors without serious CS backgrounds.
I believe there should be more blogs or YouTube channels to help people learn about GNU Hurd. I guess most people never knew GNU Hurd can work as a Debian or Arch distro, say Debian/Hurd or Arch/Hurd, and it works fine as a command-line server. Few people know GNU Hurd has had working releases for years. I would suggest you try a QEMU image, which won't harm your current system.
Some people may ask: I'm familiar with Linux, how can I start with Hurd? Does that mean I have to learn new commands and tools from scratch?
Actually, most "Linux people" neither use Linux directly nor are familiar with it, because Linux is a kernel; you only use Linux when you're a kernel developer. What you're using are GNU tools and other POSIX-compatible applications. Please don't forget that you're actually using the GNU operating system. That is to say, if you switch to Debian/Hurd, you may use it just like Debian/Linux; if you don't run "uname -r", you may not even realize it's Debian/Hurd. Of course, if there were no differences at all, we wouldn't need Hurd, right? The differences appear when you do system-level programming and more complex software system development, which is outside the scope of this article.
"But why Hurd? There's Linux, right?"
Well, yes, there was Minix; why did Linus write Linux? If you ask, there could be some technical reasons for you. But the only true reason, the intrinsic one, is always "because I like it, and it does no harm".
On Dec 8th, 2019, the Italian artist Maurizio Cattelan presented a new sculpture at Art Basel Miami Beach: a banana duct-taped to the wall, priced at $120,000. The organizer hoped visitors would think about how the value of a thing is established. Some people like the idea, some don't. Well, that's beside the point in this article; I don't care how people feel about it.
The title of this artwork is "Comedian". So who is the comedian?
GNU Hurd looks like the banana on the GNU wall. It cannot move, because it's taped there as the official GNU generic kernel. So far, more than a million dollars' worth of effort has gone into its development. Of course, Hurd contributors rarely get paid for writing Hurd, but think about the income the hardcore contributors have forgone over so many years. It's claimed to be valuable, even priceless to somebody (like me), but it has never been used in a scenario that matches its claimed value.
Huh, it sounds like we Hurd contributors are the comedians, year after year blindly, arduously pushing a rock to the top.
Sometimes when we look back to the history, old efforts had revealed the future to us, however, we never realized because it's too early to open it up. We may only realize the future when we're in the future. When we see something like Kubernetes or else, we may suddenly realize the value of the effort of GNU Hurd. If I can see those mocking face laughing at GNU Hurd decade ago, I guess I would see the real comedian. It's the fearless of ignorance facing the unknown future.
Yes, I know Kubernetes is never GNU Hurd, I heard you. They're different thing, I know.
GNU Hurd is not the banana too, fortunately.
To many people, the word "punk" is both familiar and strange. I love punk music: the chords are simple and direct, usually just three chords repeated over and over, strummed open while you sing, so the lyrics are what matter — the point is expression, not showing off technique. Carried over into culture, that rebellious attitude becomes anti-mainstream and anti-consumerist; put plainly, a feeling of "I've got talent, but I'm not selling." It may sound a bit naive, but it's sincere. I can't live up to it myself, yet it can hardly be called a pejorative.
But "punk" has another usage: mocking self-contradictory rebellion. The classic example is "wellness punk" — obsessing over healthy living while recklessly wrecking yourself with indulgence, like brewing a cup of goji berries and then pulling an all-nighter.
In this article, "punk" takes that last meaning.
Let's talk about the state of open source software today.
A few years ago, during the maker craze, our SZDIY community somehow got swept into the frenzy. We're usually too embarrassed to brag about that madness, because it really would sound like bragging, and we were never big names in the industry anyway — most people have never even heard of us.
For instance, a major company in the industry once approached us: "let's do a maker something-or-other together." A fine thing, of course, but our community has always been scrappy, and whenever technology comes up we habitually ask one more question: "Is it open source?"
"Of course it's open source!" they declared, thumping their chest. "Our API is open source!"
Someone else hearing this might think it was some new form of open source — oh, an open source API, what a Chinese innovation; someone might even write an article proclaiming it China's next Fifth Great Invention.
But we scrappy folks don't understand innovation; we only know a little computer history. An "open source API" — isn't that the defining essence of closed source?
Then there was a big Internet company chasing the maker trend. Chasing how? One year, everyone in the scene was making wristbands; those who couldn't squeeze in seemed ready to make every ring and band a body can wear — finger rings, neck rings — everything short of the Monkey King's golden headband.
But this company found its positioning precisely — at the time, not a single wristband in the industry was open source — that's why they're a big company: what an eye.
Thus came the first open source wristband in history. Such a great milestone that we were notified immediately — to come apply.
Apply? Why apply? Because you had to sign an NDA!
Sign the NDA and you get the source code — for tinkering only, no publishing allowed.
As I said before, we scrappy folks don't understand innovation; we only know a little computer history.
There was once an MIT guy who, faced with signing an NDA to get source code, started a movement. That movement swept the globe, planted the ideals of free software deep in people's minds, reshaped the entire industry, and made freely opened source code the main theme of the software world — even Microsoft now has to put on a show of courtship. That MIT guy is the spiritual leader of free software, Stallman, known as RMS.
Back to the point: this licensing scheme — NDA first, then source code, then restrictions on what you may do with it — is proprietary software in the textbook sense, the exact opposite of free software. The very concept of free software was born to break the shackles of the proprietary software monopoly.
Peddling proprietary software to convince young people that this is open source is about as sleazy as teaching the young to take the thief for their father.
So anyone who knows the history understands: software whose source is open != open source software, just as "a university in Beijing != Peking University."
Which raises a problem: since software with visible source isn't necessarily the open source software we mean, the name invites confusion. In fact, the open source software we talk about falls within the scope of free software.
What is free software? Software you can freely use, modify, distribute, and study. To satisfy these four freedoms, open source is merely the basic precondition. Free software is therefore a superset of open source; conversely, as shown above, merely emphasizing open source guarantees nothing else.
But the concept of "freedom" is not so convenient to mention in certain countries and regions, so to accommodate them we use the internationally common term "Free and Open Source Software," abbreviated FOSS.
Free software and open source software are not opposites; the latter merely assumes the necessary freedoms no longer need emphasis. Its problem is that the information it wants to omit cannot actually be omitted.
It's free software and proprietary software that are opposites.
After all this, one thing is still missing. Many years ago we debated what the highest form of open source would be. We were too young then, and scrappy as we were, short on innovation and imagination. We all agreed that pursuing users' freedom without losing commercial viability could count as the highest form of open source — which is really just the core idea of free software.
Then, a while ago, we were proven wrong. In that moment we finally grasped the highest, deepest, most perfect form of open source: PowerPoint open source.
Fellow punks, I concede.
If my kid didn't do her homework, she only has to say so — fine, make it up later. But if she didn't do it yet insists she did, and even claims she handed it in to the teacher, then we have a problem.
And if some parent at that moment says, oh, she's still young, be patient, I believe in her ability — that's a remarkable parent indeed. You don't guide her onto the right path, and you keep praising the wrong behavior; one suspects the child isn't yours.
Ah well. The Fifth Great Invention hasn't arrived yet, but at least we still have the four. Not bad.
After all, patience is the only thing we're still rich in.
In 1945, Samuel Eilenberg and Saunders Mac Lane founded category theory. At the time they regarded this unremarkable thing merely as a tool for solving some problems in algebraic topology. It was so abstract and otherworldly that for a long while afterwards category theory was discussed as philosophy, until a young man named Grothendieck had the sudden idea of actually applying it to mathematical research. The man later crowned a god of mathematics enlightened mortal mathematicians in that instant, and category theory began turning from a philosophy into something mathematicians took seriously. So much of pure mathematics never enters applied fields or attracts attention that most mathematicians probably never imagined this arcane art would one day shine in computing.
In August of that same year, Japan accepted the Potsdam Declaration, and the deadliest episode of global violence in human history was one signature away from its closing period. No one imagined that, just as a great upheaval of human history drew its curtain, a great shock in the history of technology was already brewing.
Seventy-four years later, in that same month, I stepped out of a taxi and stood on Potsdamer Platz in Berlin, the Berlin Wall beside me. Gazing at the Scandic hotel in the sunset glow — towers like uncut jade suffused with gold, a place where past and present meet — what filled my mind was not the imminent ICFP or the talk I was to give a dozen hours later. I was thinking of a crucial point: this is Potsdamer Platz, occupied by thriving commerce, not Potsdam, covered in historical relics. Potsdamer Platz is to Potsdam as JavaScript is to Java — or, as the Chinese joke goes, as Uncle Lei Feng is to Leifeng Pagoda. In the end I had no chance to mourn the past here, for I had come facing the future.
Confucius said, "It is virtuous manners which constitute the excellence of a neighborhood." Meaning: pick a good place to settle. What counts as good? A place whose people know benevolence, righteousness, integrity and shame — otherwise living there lowers your intelligence. "If a man does not dwell among the virtuous, how can he be wise?"
Berlin looks like a good place to me. The spring wind crossed the eastern wall day after day until it blew the Berlin Wall down. Having gained some wisdom from that, and therefore some confidence, I'll write an article explaining what elegance means in a programming language.
On elegance, English offers roughly two families of words:
One is like the slang "posh": roughly, someone who likes to make things fancy and "package" themselves in some fashion;
The other is like "grace," describing propriety from the inside out. If one day you truly "arrive" — substance and standing both — then "elegant" describes such a person.
Clearly, even someone who starts out posh can travel some path toward elegant. As the English saying goes, "fake it till you make it": you begin by posturing, but with growing experience, immersion, and your own effort, you gradually become the person you imagined.
Now let's look at this from the viewpoint of category theory and label a group of people — labeling people is all the rage these days, I hear. Take people as a category. If you build a room and stick a "posh" label on everyone who enters, that room is a posh Functor; likewise you get an elegant Functor. But the posh crowd can become the elegant crowd via fake-it-till-you-make-it, and that method is called a natural transformation. These are the most basic concepts of modern functional programming. They're simple; anyone can learn them.
In Chinese, "propriety" carries only half the meaning of elegance; the other half is restraint. What's the benefit of restraint? Restraint demands abstraction. Expressing what you mean at a higher level of abstraction is far more expressive than stating it directly.
Good programming style is about hiding detail and exposing only the semantics. Reading the code, you can quickly grasp its meaning in context without haggling over the details.
On the other hand, when you must care about details — when debugging, say — the implementation details are already encapsulated, so the search space for the bug shrinks and debugging gets faster.
A few days ago, someone on reddit argued with me that C++ obviously has iteration syntax and he couldn't see why I insisted on a Functor, implying that I didn't know the syntax. I explained: iteration syntax not only exposes the details, it is a statement — it cannot be passed as an argument, so it cannot be composed with other functions; a Functor can be composed with other Functors, and in this concise way you can build bigger, more complex programs. I don't know whether he got it, but many readers clearly did; it's not hard to understand.
But all this has its scope: it doesn't apply to programs too small and simple, for which such refinement isn't worth it, except as an exercise.
So elegant programming means that kind of propriety: hiding details well and abstracting out the functionality, so that a complex program looks like a composition of features. Easy to say; but engineering complexity — above all the complexity caused by side effects — turns this "just compose the features" methodology into a luxurious pursuit. The great way is simple, but walking it is full of thorns. So clever people tried to invent tools to make walking it less painful, and functional languages arose from that need.
Yet for real engineering, switching languages is not easy. Fortunately, functional programming also exists as a paradigm: as long as your language supports a few basic features — closures (or lambdas), tail recursion, and so on — you can fold the functional paradigm into your daily development. Sometimes it's clumsy compared to a true functional language, but engineering is not art; engineering aims to solve real problems, and for now that's enough.
In English writing there is a term, "elegant variation," for needlessly substituting alternative descriptions: the meaning is the same, but deliberately avoiding the clearer phrasing creates difficulty for the reader. It even seems to imply something; when the reader tries to dig, there turns out to be nothing there — a defect in the reading experience.
The opening sentences of the previous section, for example, are elegant variation. All I meant was that the weather in Berlin is lovely; nothing more.
In pursuing elegance in programming, we should avoid the same trap.
If you're writing a simple standalone web server, don't go constructing Functors; the while loop has its dignity too. And this isn't just about functional programming: back when object orientation was in vogue, young people likewise reached for design patterns without a second thought. Let me say it again: the while loop has its dignity. Everything from a tree down to a pointer has its dignity, and respecting them is respecting your own time.
But if your program is part of a large system, however small and simple, it is worth the effort to make it elegant. You surely don't want to maintain that program for the rest of your life, and when someone takes it over you want a clean handoff. In short, respecting other people's time is respecting your own.
Of course, to trust books entirely is worse than to have no books. Someone who reads every secret manual and juggles every paradigm dazzlingly is still no match for someone who writes documentation conscientiously. If you ask me, the most elegant programming paradigm is writing good documentation and comments. Documentation and comments are different: the former is for users unfamiliar with the program's internals, the latter for collaborators — a distinction worth another article.
Mention functional programming in the old days, and many people's first thought was Lisp or Scheme. Times change; today their first thought is Haskell.
Someone asked on Quora whether SICP is still worth studying as an FP textbook these days. The answer is yes: Haskell showcases the more modern side of functional programming — Functor, Applicative, composition, Monad — while SICP presents the classics: closures, higher-order procedures, laziness, the fundamentals. They are one lineage, and you use those classics just the same in Haskell.
In a way this makes sense: modern functional programming mostly revolves around category theory, and Haskell was designed with category theory in mind, which makes it especially concise and elegant at expressing categorical ideas.
Achieving the same effect in a more traditional language like C++ is painful, but entirely possible; if interested, read "8 essential patterns you should know about functional programming in C++14."
Still, the problem remains that switching languages is hard for real projects. A few years ago I ran into an old friend who had just joined a rather famous Silicon Valley Internet company, and he enthused at length about its thriving Haskell culture. I asked whether production actually used it; he hesitated and said production still ran on JavaScript — Haskell was more for learning paradigms from. Which sounds about right.
That was years ago; whether they've switched to Haskell by now is hard to say. After all, once you're deeply familiar with a paradigm, you'll want to switch to a language designed for it. So in my view, starting from the paradigm — learning it and practicing it — is the more rational methodology.
Why do I stress that functional programming does not equal Haskell? I see many people, imitating these paradigms in other languages, consciously or not transplanting Haskell syntax. That is entirely unnecessary; to some degree it means the essence wasn't understood — you're just draping one language's syntax over another. The result is ugly, and you don't even know why you're draping it. For example, Haskell tells you Array is a Monad: that's because Haskell's Array is an ADT elegantly implemented as a Monad, not because an array is inherently a Monad.
Haskell is fine, but it does create a learning problem: the elegance it displays belongs to Haskell alone and doesn't transfer to other languages. If instead you work from the principles of category theory, each language has its own natural way of doing things, with no need to mimic Haskell's syntax — and that works much better.
In recent years, functional ideas have seeped into many corners of the IT industry: Google's MapReduce is the most famous; there's the currently hot AWS Lambda architecture; and Docker, of uncertain life expectancy (if you know Scheme, you'll see Docker is essentially one huge continuation). Much else deserves mention, especially formal verification, a standing topic in blockchain and chip design. Even hardware design is turning functional — you may forget Verilog and VHDL; future FPGA work may well be done in functional languages, including Chisel, built as a Scala dialect.
Of course, what young people care about today is probably the rather overexposed deep learning. Beyond Python, does the functional world have anything relevant? First, a report: I never got Haskell's tensorflow binding to run. But if you want to use tensorflow from Scheme or Lua, consider AIScm in the future. AIScm only binds tensorflow (plus a pile of vision-related libraries); I certainly got it running, but the interface is too low-level, so we're building something Keras-like on top of it. No way around it: even the cleaning lady can chat about deep learning these days, and we need a recommender system in our product. If some day we can drop Python and exploit AIScm's LLVM-JIT-based optimization, all the better.
Overall, the future looks promising. If people around you haven't heard of functional programming, feel free to point them to my articles.
If it's bragging and gossip you want, there's plenty of material — and plenty of time later. For now, the only thing I want to say is: the weather in Berlin is lovely. That's all.
To C++ folks: if functional programming still seems like an academic theory or a confusing concept to you, you're falling behind. It is already used heavily in daily product development today. I'd like to share with you 8 essential patterns to help you grasp this powerful weapon quickly and practically. It's not hard, though I hope you're familiar with the C++14 features.
This article is not for mathematicians or deep PLT (Programming Language Theory) researchers. It's for you, the everyday professional programmer.
"Hey, if this is yet another math-textbook-copying article that wastes my time, I'll close it at once!"
I promise it's not. I'll show you standard C++ code, not formulas. Let's talk about functional programming like real working programmers. The aim is to help your daily programming, not to research PLT. If an important math concept isn't that useful in real programming, I'll skip it in this article.
Please get yourself a C++14 compiler; I'd recommend GCC 6.0+ or Clang 7.0+.
No, you don't need any prior knowledge of Haskell or Lisp/Scheme.
The example code is not optimized for production; on the contrary, I intentionally wrote it without further optimization for better understanding. I hope you take away the ideas, not just copy the code into your work.
I know there are many articles that try to explain the basics of functional programming. I'm not going to judge any of them here; however, if my explanation looks different from other articles, don't be surprised. ;-)
Unfortunately, we have to learn a little category theory, even if you dislike mathematics. Fortunately, I'm here to give you a one-line explanation:
In practical daily programming, we mostly care about FinSet, a particular kind of category; a typical FinSet-like object is an array or a list.
OK, that's enough for a programmer without a CS background. You don't even need to read about FinSet on Wikipedia (it wouldn't help much anyway).
Easy, right?
So why is it so important? Because it gives you "learn once, apply everywhere" — that is, the power of higher-level abstraction. Of course you don't have to use it, but people always want more expressiveness for the sake of productivity; that's why they keep studying higher-level abstraction.
(Categories are also used to abstract database schemas, which is beyond the scope of this article.)
In functional programming, the most basic concept is the closure.
"Wait, I've heard it should be the lambda!"
The closure in functional programming has nothing to do with closure in automata theory.
A closure can be thought of as the technical implementation of the mathematical concept "lambda". It usually contains two essential parts, an "environment" and an "entry"; if you come from the imperative world, read them as "the function's symbol table" and "the function's entry pointer". That's all the CS you need for closures here.
I guess you've already learned how to use lambdas in C++; it's the simplest concept in this article. If not, here's an example:
#include <functional>
using fn_t = std::function<int(int)>;
fn_t func(int x)
{
return [=](int y){ return x + y; };
}
"Hey, I know it's curried function, and it could be done with std::bind!"
OK, OK, you know more. But it's just an example for showing closure.
Anyway, the point is that you can create/pass/return a first-class-function. That's important for a functional programming featured language.
Lazy means delay to evaluate the result till it's necessary.
First we introduce thunk. A thunk is nullary function, say, a function without any parameter:
using thunk_t = std::function<int(void)>;
The return type is incidental; the point is "nullary".
Why does "nullary" matter? Because you have closures: you can capture the values you need, so parameters become unnecessary. You may also notice that thunks help you unify interfaces.
Why does the thunk matter for laziness? In brief, the thunk captures the values it needs, but it never computes its body until you call it. That is exactly what you need to implement lazy evaluation.
#include <functional>
#include <iostream>
int main()
{
int x = 5;
auto thunk = [x](){
std::cout << "now it run" std::endl;
return x + 10;
};
std::cout << "Thunk will not run before you run it" << std::endl;
std::cout << thunk() << std::endl;
return 0;
}
Laziness is sometimes very useful. Years ago, I wrote a message-passing framework for a product and used laziness for performance. Usually, you create a message by encapsulating a computed result. However, some messages require heavy computation, and unfortunately the receiver may drop them, wasting the work. With lazy messages, the computation happens only when the receiver calls the thunk — that is, only if the message was not dropped.
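To make the idea concrete, here is a minimal sketch of that pattern (my own toy types, not the original framework): the message stores a thunk instead of a precomputed payload, so dropped messages never pay for the computation.

```cpp
#include <functional>

// A message carries a thunk instead of a precomputed payload.
// The expensive computation runs only if the receiver asks for it.
struct lazy_message {
  std::function<int()> payload; // thunk: captured inputs, deferred work
};

// Sender side: capture the inputs, but do NOT compute yet.
inline lazy_message make_message(int a, int b) {
  return { [a, b]() { return a * b; /* imagine this is expensive */ } };
}

// Receiver side: only messages that survive filtering are forced.
inline int receive(const lazy_message &m, bool dropped) {
  if (dropped)
    return 0;         // dropped: the expensive body never runs
  return m.payload(); // forced: compute now
}
```

The key point is that make_message is cheap no matter how expensive the body of the thunk is; the cost moves to the call site that actually needs the value.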
First, the Map procedure is often confused because of the Haskell programming language: Haskell has fmap, which maps between categories via Functors. However, that doesn't mean every kind of Map maps between Functors. The general Map is only guaranteed, by the way it's used, to map between categories — and not every such mapping is a Functor.
I'm not here to blame Haskell, but you should be aware of the non-trivial difference between Map and fmap, in case you get lost in this article.
Personally, I like the alternative name in C++, std::transform, which makes it easier for newbies to understand what it is for. Obviously, it transforms an iterable data structure into another according to a given transform function. This is exactly what we need to construct a Functor class, as I'll explain later.
#include <algorithm>
#include <iostream>
#include <list>
int main()
{
std::list<int> c = {1, 2, 3};
std::transform (c.begin (), c.end (), c.begin (),
[](int i){ return '0' + i; });
for(char i : c) std::cout << i << std::endl;
return 0;
}
"You said Array is a Category, but I've heard that an Array is a functor!"
Not exactly, ADT (abstract data type) could be a Functor under certain condition, but Array is usually referred to a data structure which is not a mapping at all. ADT is unnecessary to be a Functor, if you're really interested in this topic, please read <<When is an abstract data type a functor?>> from Pablo Nogueira.
In mathematics, a Functor is a special map from one Category to another. "Special" here means its mathematical-structure MUST be preserved after the mapping. This could be even easier, if two Categories are both FinSet, then the Functor between them is actually a Function. Remember what I said previously? Most containers you use in daily programming are FinSet.
So if you try to convert an array to another array by a certain function, you're actually constructing a Functor. If you can abstract you operation to be more general, you've done a Functor class.
There're many code examples to construct a Functor class if you Google it. However, some code examples are a bit outdated. Let me share you how to do it more elegantly with Lambdas.
Here's our Functor class:
// functor.hh
#include <algorithm>
#include <functional>
using namespace std;
template <class from, class to> class Functor
{
public:
Functor (function<to (from)> op) : op_ (op){};
~Functor (){};
template <class T> T operator()(T c)
{
transform (c.begin (), c.end (), c.begin (), op_);
return c;
};
private:
function<to (from)> op_;
};
In principle, this Functor class can map any container to another, depending on the given mapping between elements — the op function here.
Then let's try to map a list:
// test1.cpp
#include <iostream>
#include <list>
#include <array>
#include "functor.hh"
using namespace std;
int main ()
{
list<int> a = {1, 2, 3};
// <int, int> implies "int -> int" type annotation
auto f = Functor<int, int> ([](int x) { return 2 * x; });
auto g = Functor<int, int> ([](int x) { return 10 * x; });
auto z = Functor<int, int> ([](int x) { return x + 1; });
// Function composition preserving
auto result1 = g(f(a));
auto result2 = f(g(a));
cout << "Check: " << (result2==result1?"yes":"no") << endl;
// We use Functor p to print out the final result
auto p = Functor<int, int> ([](int x) {
cout << x << endl;
return x;
});
p(result1);
p(result2);
return 0;
}
You may switch the list to a std::array to check that arrays work as well (note <array> is already included). The Functor class is general enough for finite enumerable collections — the FinSet-like structures we mentioned earlier in this article, a special case of a general category. In fact, it should cover any enumerable data structure that provides iterators. I'll leave polishing it for product-level development to you.
You may notice that result1 equals result2 — that's function composition at work (and note the order happens to be irrelevant here because f and g, both multiplications, commute).
You may also notice that g(z(a)) != z(g(a)); however, you can add a transformation to make them equivalent. In mathematics, such a transformation is called a natural transformation.
Exercise: try a "Byte -> char" Functor that maps an integer collection to chars and filters out all non-printing chars. HINT: you may first need to define a Byte class.
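As a rough, self-contained illustration of that fix-up idea (my own toy example, not a formally precise natural transformation), suppose g scales by 10 and z adds 1: then g∘z and z∘g disagree, but one extra elementwise mapping applied after z∘g makes the two pipelines agree.

```cpp
#include <algorithm>
#include <list>

// Helper: map a unary function over a list (a tiny stand-in for the
// Functor class from the article).
inline std::list<int> map_list(std::list<int> c, int (*op)(int)) {
  std::transform(c.begin(), c.end(), c.begin(), op);
  return c;
}

inline int g(int x) { return 10 * x; }  // scale
inline int z(int x) { return x + 1; }   // shift
// Fix-up: 10*x + 1 + 9 == 10*(x + 1), so applying `fix` after z∘g
// reproduces g∘z elementwise.
inline int fix(int x) { return x + 9; }

inline std::list<int> g_after_z(std::list<int> a) {
  return map_list(map_list(a, z), g);
}
inline std::list<int> z_after_g_fixed(std::list<int> a) {
  return map_list(map_list(map_list(a, g), z), fix);
}
```

The interesting part is that the correction is itself just another elementwise mapping — the two composite pipelines are reconciled by a third Functor-like step, which is the flavor of a natural transformation.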
Many years ago, the most confusing yet powerful concept in functional programming was the Continuation. Nowadays, the hottest one is the Monad.
Neither is easy to understand. Oh well, now that it's so hot...
The first thing you need to know is that the Monad in programming differs from the Monad in mathematics, although they're related. That's good news, since you don't have to learn the math to understand it.
Let me ask you a question:
using vp = std::shared_ptr<int>;
struct box_t { vp val; };
using box_tp = std::shared_ptr<box_t>;
int sum(box_tp x, box_tp y, box_tp z)
{
int xv = (x && x->val)? *x->val : 0;
int yv = (y && y->val)? *y->val : 0;
int zv = (z && z->val)? *z->val : 0;
return xv + yv + zv;
}
Do you see the problem? Every time you define a function that takes box_tp, you have to check for null pointers in every argument — unless you push the checking onto your users. Each of these validations is a tiny but non-trivial code snippet, and they all follow the same repeated form.
"It sounds like what Maybe Monad does in Haskell!"
Good, that's the point.
We might wish for this kind of macro in C++:
/* Fake Code! */
#define SUM(v1, v2, ...) sum(check(v1), check(v2), ...)
so that it would expand recursively, wrapping "check" around each argument. What an elegant language that would be!
"I know Scheme macros can do that!"
Hmm... but your boss knows nothing about Scheme. Let's get back to C++.
BTW, this kind of repeated but non-trivial code snippet is called *boilerplate code*. Monads can help eliminate it elegantly. A Monad embeds an object into an object with a richer structure; the critical point is that the embedding does not change the original semantics, and this is guaranteed by mathematics. That guarantee is what makes programs verifiable.
You may wonder how the elimination works; it's simple: wrap the boilerplate code into a Functor, then bind it to the function you want to apply. Note that your function implementation no longer needs to do any validation:
int sum2(int x, int y, int z)
{
return x + y + z;
}
// box_tp -> int
auto check = [](box_tp b){ return ((b && b->val) ? *b->val : 0); };
auto safe_sum = BIND(check, sum2);
The purpose is to bind our preferred function sum2 to the check function, so that each time we call safe_sum, the validation code is injected before the raw sum2 is applied; even if we pass a nullptr, it won't crash. Notice that we wrote the validation code in check exactly once, no matter how many arguments need validating.
"So what is BIND?"
Good question, hold my beer...
#include "functor.hh"
#include <functional>
#include <iostream>
#include <memory>
#include <vector>
using vp = std::shared_ptr<int>;
struct box_t { vp val; };
using box_tp = std::shared_ptr<box_t>;
box_tp make_box (int x)
{
box_t b{ std::make_shared<int> (x) };
return std::make_shared<box_t> (b);
}
// An oversimplified Monad, but it's enough to show the usage.
class Monad
{
public:
Monad (){};
~Monad (){};
/* I embedded the boilerplate nullptr validation code inside,
so this Monad becomes a Maybe Monad.
NOTE: it must be static so it can be used without capturing `this`.
*/
static int Maybe(box_tp p){
return ((p && p->val) ? *p->val : 0);
}
/* NOTE: The original >>= would take Maybe as an argument, but
here we take an object-oriented shortcut and define Maybe
as a static public method.
*/
std::function<box_tp(box_tp, box_tp, box_tp)>
operator>>= (std::function<int(int, int, int)> fn)
{
return [this, fn](box_tp x, box_tp y, box_tp z) {
std::vector<box_tp> args = {x, y, z};
// ugly, will be better in C++17
std::vector<int> safe_args = {0, 0, 0};
std::transform (args.begin (), args.end (),
safe_args.begin (), Maybe);
/* I know it's clumsy to unwrap safe_args here, but the
more elegant std::apply only appears in C++17:
return std::apply(fn, args_in_tuple);
*/
return make_box(fn (safe_args[0], safe_args[1],
safe_args[2]));
};
};
};
int main ()
{
auto m = Monad ();
// The original function you defined
auto sum = [](int x, int y, int z) { return x + y + z; };
// Bind to the Monad for validation
auto safe_sum = m >>= sum;
auto a = make_box (1);
auto b = make_box (2);
auto c = make_box (3);
/* We have to use Monad::Maybe to read the result, because
by the Monad laws safe_sum must itself return a box_tp.
*/
std::cout << "1 + 2 + 3 = "
<< Monad::Maybe(safe_sum (a, b, c))
<< std::endl;
std::cout << "1 + 2 + nullptr = "
<< Monad::Maybe(safe_sum (a, b, nullptr))
<< std::endl;
return 0;
}
Have you noticed that I renamed bind to >>=, the bind operator in Haskell? I guess you've seen it somewhere before.
Alright, I confess it's not nice for generic typing; but this is C++ — only limited type inference exists in C++14, and std::any arrives in C++17.
I have to tell you: the Monad of mathematics is of little use in daily programming, so you don't have to spend much time on the original math concept (unless, like me, you love math).
As I wrote in the comment, the nullptr validation code was inlined into the Monad class, so it is effectively a Maybe Monad. Although C++17 adds std::optional, I hope you now understand the basic principle of a Monad: embedding boilerplate code by binding your function to a special pre-applied function (a Functor, actually).
This is not a complete Monad implementation; it lacks many of the mathematical properties. Those properties are critical for program verification. In a language like Haskell, they are built into the language: if your code compiles, certain code snippets are guaranteed correct by mathematics.
The Monad we talk about in programming today is a refined and constrained concept from category theory. Thanks to Phil Wadler, whose famous paper "Comprehending Monads" made this significant contribution, we can use Monads to enhance our programming practice.
The Maybe Monad is the simplest Monad. I may write a deeper article on Monads in the future. IMHO, the power of Monads is very similar to that of Continuations: they can be used for parsers, interpreters, control flow, exception handling, etc. If you're familiar with CPS (Continuation-Passing Style), you'll recognize these as abilities of Continuations too. That is to say, both Monads and CPS can serve as an IR (Intermediate Representation) in compiler development.
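To hint at that similarity, here is a tiny CPS sketch (my own illustration, not a full CPS transform): the "rest of the computation" is passed in as a continuation, and a failure path simply invokes a different continuation — morally the same short-circuiting the Maybe Monad gave us above.

```cpp
#include <functional>

// CPS sketch: instead of returning a value (or crashing), safe_div
// hands its result to one of two continuations. The error path
// short-circuits, just as the Maybe Monad skips over nullptr.
using cont_t = std::function<int(int)>;

inline int safe_div(int a, int b, cont_t on_ok, cont_t on_err) {
  if (b == 0)
    return on_err(0);  // failure continuation: short-circuit
  return on_ok(a / b); // success continuation: keep computing
}

// Chained usage: compute (100 / divisor) + 1 on success, -1 on failure.
inline int demo(int divisor) {
  return safe_div(100, divisor,
                  [](int r) { return r + 1; },
                  [](int)   { return -1; });
}
```

The "happy path" and the "error path" are both just functions, so control flow becomes an ordinary value you can pass around and compose — which is exactly what makes CPS (and Monads) useful as a compiler IR.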
Exercise: it seems >>= can be implemented with a Functor — and in category theory, it should be. Try implementing >>= with the Functor described earlier. You may need to change the Functor implementation a little.
Exercise: actually, box_tp is a Monad too. From the elegance perspective, it would be better to define it properly so that everything obeys the Monad laws. I've spent too much time writing this article to write that code as well — do you want to take up the challenge?
Question: do you think current C++ (14 or higher) is good enough to do functional programming elegantly? "Elegantly" here means with little redundant or confusing code.
To mathematicians and categorists:
1. "m >>= eta(x)" would be "Monad::Maybe >>= [](box_tp x){ return x; }"; the Monad laws hold.
2. "m >>= sum(args)" is actually "m.bind(Monad::Maybe(x), f)"; the Monad laws hold.
3. We can compose Monads: "(m >>= sum(args)) >>= print(args)". This requires making the Monad class more complete, but I don't want to spend more time writing mathematically elegant code in C++ — so irritating... after all, C++ is an industrial engineering language.
Wait — Fold here is not the fold-expression that appears in C++17, and std::reduce also only appears in C++17. In this article, we talk about C++14 only.
Fold and Reduce are very similar higher-order functions; in fact, they are the same function defined with different interfaces.
Fold differs a little from the Functor we introduced: the Functor gives you another container, while Fold usually gives you a single value. Of course, a Fold could act like a Functor, depending on what you finally return.
In C++, it's called std::accumulate.
#include <iostream>
#include <numeric>
#include <vector>
int main()
{
std::vector<int> v{1, 2, 3, 4, 5};
int result = std::accumulate(v.begin(), v.end(), 0,
[](int x, int y){ return x + y; });
std::cout << result << std::endl;
return 0;
}
In this case, we compute 1+2+3+4+5. You can see the pattern and apply it to your own cases.
Exercise: modify the Functor class to implement a Fold-Functor.
Originally, MRV (multiple return values) was implemented for passing multiple values conveniently between continuations. Today, most languages are not continuation-based, so IMHO MRV is mainly useful for a compact coding style. In Python, it's relatively explicit:
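As noted above, a Fold can also return a container rather than a single value. Here is a minimal sketch (my own example) where the accumulator of std::accumulate is itself a vector, so the fold behaves like the Functor from earlier:

```cpp
#include <numeric>
#include <vector>

// A fold whose accumulator is a container: each step appends the
// transformed element, so the fold yields "another Category" just
// as the Functor class did. (Copies the accumulator each step —
// fine for a sketch, not for production.)
inline std::vector<int> double_all(const std::vector<int> &v) {
  return std::accumulate(v.begin(), v.end(), std::vector<int>{},
                         [](std::vector<int> acc, int x) {
                           acc.push_back(2 * x);
                           return acc;
                         });
}
```

This is also a reasonable starting point for the Fold-Functor exercise below.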
def func():
return 1, 2
a, b = func()
In C++, there is no syntactic MRV support; you have to combine std::tie and std::tuple:
#include <tuple>
#include <iostream>
std::tuple<int, int> func()
{
return std::make_tuple(1, 2);
}
int main()
{
int a, b;
std::tie(a, b) = func();
std::cout << "a: " << a << ", b: " << b << std::endl;
return 0;
}
"Why can't I use struct to return multiple values?"
MVR just likes anonymous struct for returning multiple values, you don't have to define a specific struct for a function. And no explicit assignment for each value. As I said, compacted coding style.
You may be familiar with std::lock_guard which is helpful to release the locks automatically, when you get out of certain scope. However, it's only locks. What if you want to do similar clean work for other data structure? Say, you want to restore your integer to a safe value in a similar context?
// guard.hh
#include <functional>
#include <exception>
#include <iostream>
using namespace std;
using thunk_t = function<void()>;
struct my_guard
{
my_guard (thunk_t init, thunk_t clean) : clean_ (clean)
{ init (); }
~my_guard () { clean_ (); }
thunk_t clean_;
};
using GUARD = my_guard;
Here's an example. Assume we have a function named work which accepts two ints, a and b, and changes their values. We need to make sure a and b are restored to a safe value whenever an exception happens. In this example, 0 is the safe value. The expected result is that a and b are 0 after calling work().
// test2.cc
#include "guard.hh"
using namespace std;
void work (int &a, int &b) noexcept
{
auto init = [&]() { a = 1; b = 2; };
auto clean = [&]() { a = 0; b = 0; };
GUARD guard (init, clean);
try
{
cout << "Start working, a = " << a << ", b = " << b << endl;
/* Unfortunately, an error happened!
(of course, we trigger it intentionally; it's just an example)
*/
throw runtime_error ("I was shot!");
/* The rest of the code will never be executed because of the
exception above, so we have to rely on the guard.
*/
}
catch (const exception &e)
{
cout << "Error happened: " << e.what () << endl;
/* Sometimes we need to return or rethrow to the caller here,
so we may get no chance to restore the values inside the
catch block; the guard's destructor does it for us when the
scope of work() unwinds.
*/
return;
}
cout << "The rest lines have no chance to be excecuted!" << endl;
}
int main ()
{
int a = 0, b = 0;
cout << "In the beginning, a = " << a << ", b = " << b << endl;
work (a, b);
cout << "work done!" << endl;
cout << "b' is " << b << ", it's "
<< (b ? "dangerous" : "safe") << endl;
return 0;
}
These are all very basic concepts in functional programming. Fortunately, the more complex things are constructed from these basic blocks. I've skipped many mathematical concepts to keep the article simple, which may make it less precise and formal; but I hope you can now understand these concepts from a C++ programming perspective.
Food for thought:
1. What's the benefit of using functional programming patterns rather than piling up redundant procedures?
2. Is it necessary for a small program?
3. What about a small-core but flexible, scalable system?
4. What about a huge, serious, long-term-maintained system?
Enjoy!
First, I'd recommend that newbies try Docker for a painless installation.
Now let's figure out what's new in 0.4:
If you're writing a dashboard page or a RESTful API, you may need authentication. 0.4 provides a new feature to authenticate automatically:
(get "/dashboard"
#:with-auth "/dashboard/login"
(lambda (rc) ...)) ; render dashboard page
Use #:with-auth to specify the login page URL. When users visit "/dashboard", Artanis will check the SID from cookies; if authentication fails, it redirects to that URL automatically.
Artanis provides many patterns for #:with-auth; please check them out in the manual.
If you encounter any status other than 200, Artanis will throw a system page to indicate the exception. Previously, Artanis only supported static system pages; that is, you had to put your customized 404.html into the sys/pages/ directory of the application folder.
Now dynamic pages are supported. For example:
(http-status
404
(lambda () "My prefered 404 page!"))
So you can generate JSON for an exceptional status, which is helpful for a WebAPI. Another way is to render a view in MVC — it depends on your requirements. Please check it out in the manual.
Here are the situations in which you should clean the cache and recompile all your WebApp modules:
1. You upgraded Artanis or any dependency libraries.
2. You changed code in the lib/ directory of the application folder.
3. You want to make sure the caches are clean for debugging.
Now you have a friend:
art work --refresh
The redirect-to API has changed in two ways:
1. Since 0.4, redirect-to always generates an absolute URL for the Location header.
2. #:scheme has been removed, as it is no longer necessary.
The details and technical explanation are in the manual.
There are some other small features; I've only listed the most notable ones. Please read the Changes file in the release log.
The Internet has become part of our lives; one could say no network, no life. I once doubted this claim myself: in this day and age, do people deep in the mountain valleys have no life? Then, on a trip to a remote and poor corner of the country, I happened to see the results of the "connect every village" program: telephone and Internet reach the valleys, and e-commerce like Taobao is available there too. Only the logistics are imperfect: a shop in the village serves as a pickup point and buyers collect parcels themselves. So I believe my claim all the more: even though the Internet has yet to cover 100% of the world, it is already part of our lives.
Since the Internet has become essential infrastructure for our lives, how are its standards actually made? As a W3C Invited Expert (IE), let me talk about how Internet standards are set.
Speaking of Internet standards, many people can't tell the W3C and the IETF apart. They are two common, parallel standards systems. The W3C's most typical output is the HTML5 standard, while the IETF produces that big pile of standards headed by "RFC".
But the two systems are not opposed, nor is one above the other. An example will illustrate their relationship.
The WebSocket protocol (RFC 6455), for instance, belongs to the IETF; the W3C likewise has a WebSocket API standard, which explicitly declares itself related to RFC 6455. The WebSocket API is mainly called on the web front end from JavaScript or WebAssembly, while back-end web frameworks must care about the implementation details of RFC 6455; the two together complete WebSocket application development.
Does that make W3C versus IETF an absolute front-end/back-end distinction? Not quite, since the WebSocket API must be implemented by browsers, and to implement it a browser obviously has to consider the specifics of RFC 6455.
Still, in most cases that understanding does little harm: the W3C focuses mainly on the Web side — HTML5, CSS3, audio and video, font standards, and so on — while the IETF is mainly what back-end frameworks consider.
Distinguishing by product experience also sort of works: the W3C leans toward the parts users can directly perceive, the IETF toward the parts users don't care about but which carry real value for a company's business.
Beyond that, the IETF truly embodies the openness of the Internet: anyone can submit their own RFC (Request for Comments) and push it toward becoming a standard. The IETF has no real membership system; anyone can sign up and participate.
The W3C does have a bar to entry. Its participants are either Members or Invited Experts. Members are usually representatives of large companies in the industry and must pay membership fees — from a few thousand US dollars to tens of thousands of euros per year, depending on region and company size. Invited Experts pay nothing, but must be invited.
Here is the question many people hold but find awkward to voice: who are you people, none of whom I know, to sit there setting standards the rest of us must follow? If you have that doubt, hear out the reasoning and the resistance will fade.
First, Internet standards exist mainly to lower costs for practitioners, not to collect protection money. This differs from the USB or Bluetooth standards, which are essentially big vendors and consortia forming a club and charging protection money under the polite name of licensing fees. Internet standards charge no licensing fees. As noted, the IETF is completely open with no bar to entry; the W3C collects corporate membership fees, but if a company or individual doesn't want to help set standards, not paying has no bearing on whether you can make money with HTML5. On the other hand, making these standards takes real expertise — commerce, technology, regulation and more must be weighed, in global collaboration — and the result everyone picks up and uses must be dependable. That trust in professional competence is the cornerstone of standardization.
If a team opening a new Internet company insisted on setting its own standards, you needn't wait for the product launch; it's already doomed.
Suppose I'm a character with money to burn and simply refuse to follow these standards — then what? The world doesn't generally produce a fool this rich, but as idle chat, assume one exists and see what the company runs into. The first problem: your product doesn't fit anything in the industry chain. Ignore W3C standards and your users can't use mainstream browsers — fine, the boss is rich and has a crack team, so he builds a new browser and burns money to drive installs; solvable, and Google once did roughly that. Ignore the IETF and no existing web framework works — the boss is rich and builds one from scratch; also fine. As the saying goes, a problem money can solve is not a problem — and those two fixes are the outer limit of what money can solve. What remains is what money cannot solve: time. Burning money for users may seem to shorten the time to market, but your competitors are not standing still; while you're still devising your own standards, their product has shipped and finished its first round of validation. And every standard starts out riddled with problems; before it is finalized and published there are years of testing and deliberation — time costs that no amount of money can compress.
So the conclusion is simple: anyone serious about building products will not fight the standards and make their own life miserable. All the more so given that the heavy cost of standard-setting is carried for you by big vendors' R&D budgets and by unpaid independent third-party experts. A clear head understands that respecting Internet standards is respecting one's own limited energy; a clear-headed boss understands that respecting standards is respecting the money in his own hand.
Since I know the W3C's process better, I'll confine myself to the W3C here.
To avoid drowning in detail, a quick sketch of the structure: work runs mainly through two kinds of groups, Working Groups and Interest Groups. A Working Group handles topics that have already taken shape and have a methodology — WebAuthn, say, is already defining details and APIs, operating as formal standardization. An Interest Group handles topics not yet formed. The one I serve on, the Privacy Interest Group (PING), mostly gathers opinions, discusses at length, studies the content of other topics, and tries to build a methodology. When one day our material is largely worked out, it will convert into a Working Group and set standards formally. Not yet, though; one very practical problem is that we still haven't found a satisfactory standard definition of "privacy".
That makes it sound like an Interest Group such as PING has no real influence, doesn't it?
Although PING is not yet fully prepared, a process-reform proposal voted on this June granted PING considerable power: 1) every proposal must pass PING's review; 2) when necessary, PING has something like veto power.
Among the 25 voting members, only Google voted against. The matter is still under internal discussion; one view goes: if W3C published a standard that even its own privacy group couldn't endorse, it wouldn't look good in the headlines.
After the vote, Google filed a Formal Objection in August. Further details are beyond the scope of this article; interested readers can see Bloomberg's report. Personally I'm not fond of Bloomberg — its earlier report about backdoors planted on Chinese-made motherboards lacked evidence and was never accounted for — but this particular report is fairly objective and worth reading.
In any case, W3C has shown that it takes privacy protection very seriously.
Back to the topic: how does PING actually work? The process is simple to describe, but the workload is substantial. Take a recent case: a WebAudio proposal comes in with the entire standard's details already laid out. To review its privacy implications you have to work through everything in it piece by piece before you can discuss the possible privacy problems with its authors. At this level everyone is technically strong; you cannot pass judgment on someone's work without understanding it, or you damage your own credibility. If PING doesn't approve, the proposal is sent back for resubmission, and the next submission must list all the previously raised issues along with the fixes and responses, so as to pass review as quickly as possible.
In short, the whole standard-drafting and review process is very long, possibly lasting many years with plenty of back-and-forth, but everyone endures it patiently, because once something goes wrong, the cost is far higher than this kind of meticulous review.
As a popular-science piece, this article leaves out details as much as possible and keeps technical jargon to a minimum.
Besides spreading some common knowledge about Internet standardization, its purpose is to tell you that the power of big tech companies is not without checks. The expansion of technical power and the pushback against it form a meta-cycle that gives the Internet its vitality and its chances to keep improving. Of course, standard-setting is also an art of compromise; if a standard is too strict, companies will find it hard to embrace even if they want to. Still, rather than planning to compromise from the start, I'd rather start out at the extreme and be forced to compromise in the end; only that strategy gives the principles I hold a chance to advance a little further.
"Those who contend for fame do so at court; those who contend for profit do so in the marketplace." — Strategies of the Warring States, Qin Strategy I (《战国策·秦策一》)
Every year, the third Saturday of September is Software Freedom Day, and a wave of celebration sweeps the world. Perhaps having finally figured out its commercial road ahead, Microsoft, the old proprietary-software hegemon, invited free-software leader Richard Stallman (RMS hereafter) to give a talk, and subsequently open-sourced MSVC's standard library. This wave of celebrating software freedom feels like a commemoration of Prometheus stealing fire, the only difference being that RMS was not chained to a mountaintop to suffer for it. The software world went from initial mockery, to tolerance, to observation, learning, then following, and finally to actively leading this trend. Seven years ago, if you talked to someone about free software and open source, you'd mostly get incomprehension or even scorn. Today, bring up free software and open source and the other person might even take over the conversation and rattle off the details, even if you once knew more than she or he did. The times have changed, and this positive shift has driven the entire computer industry forward worldwide.
Before writing this article, I learned that RMS had recently been pressured into resigning as president of the Free Software Foundation over inappropriate remarks, and I felt anxious and worried. With the free software movement in full swing, the departure of its leading figure made everyone fear a blow to the movement's development. In fact, after talking with RMS and some friends over the past couple of days, I understood this was yet another case of a report misread by the media, and a compromise made under the enormous pressure of political correctness. It reminds me of the US election, when the safest statement was to support Hillary; dare to say you supported Trump and the pressure was on. But while mouths stayed politically correct, bodies were honest at the ballot box. RMS has since said he will continue to take part in the free software movement; for people who quietly get real work done, this affair is just a minor episode.
Enough of that — we all know that's the wicked American imperialists at work. Let's talk about something positive instead: what we should care about is the future of free software in China.
Before any discussion of free software, RMS habitually stresses that "free" does not mean "free of charge". I suppose this stems from cultural differences, since "free" in English can mean gratis. But Chinese people, I dare say, all know in their bones that freedom has a price. So I'll waste no words on this point.
Since 2012 I have been contributing to free software as a GNU hacker; you can find my name, Mu Lei, at https://www.gnu.org/people/. As you can see from that page, I don't have many companions in mainland China, but that doesn't mean there aren't plenty of free and open source software contributors from China.
Against that background, here are my views.
I'm not here to preach the morality of free software; I just want to lay out the reasoning and wait for a group of future winners to understand it. A group of great tech companies will surely rise in China. Their founders may still be drinking porridge at a street stall, but that needn't stop them from hearing me out, laying a theoretical foundation for their undertakings, and discovering what makes their ideals distinctive.
"None can love freedom heartily, but good men; the rest love not freedom, but license." — John Milton
Everything about free software revolves around licenses. First you need to know how software makes money: traditionally, by selling proprietary licenses, also called software authorizations.
Some people will tell you not to bother learning about free software, that it's no good at all. When you hear that, first ask how the speaker makes a living; if it's from proprietary software, you already know their position.
What is a proprietary license? Simply put, after you buy the software you obtain only the right to use it, and the scope of use is restricted by the license. Software under such licenses is proprietary software.
So what's wrong with proprietary software? For many ordinary users it has plenty of advantages. The user experience is good, because it has been polished to product grade. There's quality assurance: when something breaks, there's technical support. And proprietary software is easy to advertise, because to you it's a black box: however big its boasts, you can only believe them, since users cannot see the source code, and all your trust in it rests on the slides from the launch event. The sharp eyes of the masses can do nothing about it.
The opposite concept is free software. Free software, put plainly, guarantees users four freedoms: to use it freely, to study it freely, to pass it on to others freely, and to improve it and release the improvements freely.
Let's go through them one by one.
Freedom to use means that once you legally obtain a copy, how you use it is up to you. An obvious question arises: what if someone uses inherently good software for bad purposes? The answer is easy: the relevant laws will punish such people. A software license drafted by a commercial company has no authority to overstep the law and restrict users' freedom of use in its place — wouldn't that be vigilante justice? Moreover, it's entirely unnecessary for a commercial company, with its limited capacity, to shoulder that social responsibility; business should stick to business.
Freedom to study means you may legally examine the software itself. Industry at home may not yet be so generous about this; many companies fear their products will be studied and overtaken. Rather than generosity, I see this as a matter of vision. A big vision can make money and a small one can too; we cannot persuade everyone to choose the big vision, but we must admit that only a vision big enough produces the final winner. Many people rave about the important projects Facebook and Google have open-sourced, but envying the fish at the water's edge is useless — look at how those projects are run after being opened. Releasing source code is like courtship in one respect: the confession is only the beginning. A relationship dies if you don't nurture it, and a project dies if its community is poorly run.
Freedom to pass it on — technically this is called freedom to redistribute, but let's keep it colloquial. Some end-user license agreements forbid you from giving the software to others after purchase; buy an OS disc, install it, lend it to a friend to install, and that's defined as piracy. This is really a sign of the vendor's incompetence at monetization: because it's hard to charge the new user after a hand-off, the vendor shifts the burden onto users by stripping away their freedom to redistribute. Of course, times have changed; vendors like Microsoft found ways to charge over the network, so redistributing install images is no longer a big deal. Solutions can always be thought up; we shouldn't make users bear the cost of our own failure to find one, right?
Freedom to improve and release. We often speak of shanzhai knockoffs; I recommend the book The Little Book of Plagiarism by Richard Posner. The English for shanzhai is "copycat", a word that comes from kittens imitating their mother after birth, gradually internalizing her behavior through imitation until they can stand on their own. I won't elaborate; readers will take the point. Improving an existing software project and releasing the result is an important creative act. We should be clear about why China's computer industry so detests "tweaking some code and passing it off as your own". The problem is mostly not that this mode of creation is bad, but that some people are shameless enough to copy a little and claim all the credit, even raking in large grants or padding KPIs with it. I fully understand all of that, but it is not the fault of "improve and release" as a creative model; it's the fault of people, and of flawed review systems. Safeguarding this freedom protects one of the springs of our creativity; otherwise we are crippling ourselves.
These four freedoms presuppose one another; none can be dropped. And to guarantee all four, open source code is the most basic requirement. This is the conceptual difference between free software and open source software: the latter is like buying the casket and returning the pearls, prizing the pretty box while leaving out the pearls of freedom. Listen to me: by all means take the pretty box, but the four pearls inside are more valuable, and they belong to you too. Take them all. There's no shame in taking what is yours; leaving it behind is what invites ridicule.
So free software and open source software are not in conflict; we just must not conclude, because open source emphasizes the openness of source code, that free software doesn't matter. As advised, take the pearls along with the box. Open source licenses that satisfy the free software definition we now call free and open source software (FOSS hereafter). There are a great many FOSS licenses, covering essentially every open source license you normally encounter: MIT, BSD 2/3-clause, Apache-2.0, and so on. You can find a list here: https://www.gnu.org/licenses/license-list.en.html
Finally, I'm bound to face one question: even with the source code opened, it takes real professional skill to read it. If someone doesn't understand software, what use is open source code to them?
Let me whisper an answer: I have a full set of fine cookware at home, but I can't cook, and even those who can need real skill to make the most of it. What I can do is keep that cookware from being stolen; otherwise, even if a great chef visits my home, I still won't get a good meal.
I first encountered free software in 2006, when, alongside the earliest open courseware — MIT OCW — I finally had a systematic, formal channel for learning computer software. I was in graduate school then, but China's computer education was badly out of touch, and teaching and lab work were saturated with proprietary software. As a student in a computer-related field, if I could only use proprietary software like an ordinary user, what was the point of my family spending all that money on my education? I might not even have found a decent job. So for students in related fields, free software is an endless treasure trove; how to convert that treasure into wealth in their own lives is up to their own effort.
For ordinary users, the value free software brings is indirect. They experience better products because of the enormous changes free software forces on the industry, but they may never directly perceive its value. That's fine; ordinary users don't need to care, unless one day they realize they plainly have four freedoms that have been taken away without any justification. From a practical standpoint, end users need not worry about these issues, just as I enjoy red wine without needing to care about the wine industry's life-and-death struggles over change, or its disputes over technical detail and merit. I just enjoy the fruits of the change and pay my money for the wine, as a return on those practitioners' courage and commitment to reform.
For the computer industry, free software, although usually free of charge, cannot be used to cut development costs, because free licenses normally disclaim warranty (WITHOUT ANY WARRANTY) — something many lay critics wield as evidence against free software. Such people really don't know the business. Anyone who understands commerce sees at once that the no-warranty clause is precisely the setup for charging later; in other words, free software licenses were destined from the start to involve commerce. So both "free software must run on idealism alone" and "free software is anti-business" are laughable. The difference is that a free software license will not let payment issues compromise your four freedoms: you can pay for a subscription service, or pay more for consulting, but even without paying a cent you can still obtain the source code and the four freedoms.
Back to our question: why can't free software cut development costs in commercial work? No warranty! You either buy services or pay well for reliable engineers. But then how is that different from just buying proprietary software? Because it lowers development cost... hang on, my head's not muddled.
Development cost includes time as well as money. You can't save the money, but you can save time and raise efficiency. Free software and its communities greatly reduce information asymmetry, which shortens development time dramatically. In the proprietary era, if you reported a bug you might wait ages for technical support, and even with a solid in-house team you could only sit and wait. In the free software era you can find and eliminate the problem yourself, paying for technical services only for the parts you can't solve. The money is a small matter, really; when you help manage the R&D team of a large project, what worries you is not the cost of buying software but how fast your iterations can move — otherwise the whole project may miss delivery and be finished.
In this era, free software in China no longer needs lofty pronouncements; it needs the down-to-earth building of symbiosis among universities, communities, and commercial companies.
I listed many benefits of free software above, but is that its real value? No — that is merely what I personally have come to understand over a dozen years. Free software runs deeper than my understanding. Let me give two examples.
A few years ago the Japanese hacker NIIBE Yutaka (one of the maintainers of the encryption software GnuPG; he implemented the ed25519 algorithm in GPG) visited the SZDIY community in Shenzhen. Only by talking with him did I learn that Japan had already gone through one rise of free software, which ebbed along with the decline of the Japanese semiconductor industry; but the foundation laid was excellent, and a second rise is brewing. Back then many young people wrote drivers and did porting for Japanese semiconductor makers out of sheer enthusiasm, and the makers were very friendly to hobbyists, actively providing documentation. That code was contributed to the free software community, good commercial cooperation naturally grew out of it, and everyone won. And now, while I was still eagerly explaining to him my thinking above and how free software might combine with Chinese industry, NIIBE was contemplating a bigger picture: freedom of computing. In those days cloud computing wasn't as hot as now, and open hardware hadn't taken off either. What he was considering was whether, when computing resources are massively monopolized by giant consortia, we still have freedom over our computing power. That goes beyond free software itself, yet stands firmly on its foundations.
A while ago, in an internal GNU discussion, RMS asked what topics remained worth exploring, and I answered: privacy. I said so not only because China's tech industry has severe privacy violations, but also because I am myself a W3C Invited Expert, mainly working on privacy-related standards drafting and review. But RMS didn't respond to me; he and the other hackers began discussing another important topic: that many countries and regions in the world — many African countries, and the poor regions of every country — cannot enjoy computer products and the Internet at low cost the way we do. The point was that free software must take these situations into account rather than endlessly piling on features, or it may not even compile and run on lower-performance machines. At bottom this is again a question of freedom of computing; as technology advances we seem to be gradually losing it. We enjoy low-cost cloud computing only because the consortia are still positioning themselves for a larger game.
One thing must be made clear: among those in the discussion, RMS and me included, not one person is anti-business. The consortia's positioning is commercially beyond reproach, and to some extent a good thing. But there will always be people in this world who observe and think from another angle, and that doesn't make it either-or. Seen differently, meeting low-cost computing needs and solving the computing-freedom problem of underdeveloped regions could well become good business too.
All of this is to say that the meaning of free software has long since been elevated; it is no longer a debate about whether to open source. Open source is already an indisputable worldwide trend. When we imagine the future, we can no longer confine our eyes to so shallow and basic a question as "is it necessary to open the source?". If we sit down to discuss the future of Chinese software, there is one and only one baseline scenario: on the premise that source must be opened, we discuss which road to take. Open early, open late — it doesn't matter; just be prepared for the fact that you must open. When the world's great tide arrives, don't shut yourself in; Chinese people should know best what losses come from closing oneself off.
So when is the right time to open the source?
I've stressed before that opening the source is not decisive; community building is. So until you have a good community-building plan, a decision to invest sufficient people and resources, a model for commercial conversion and profit, and so on, opening or not makes little visible difference. Do you think people will pay attention just because you open-sourced? Certainly not; building an open source community resembles running a startup in many respects.
So when at this stage I speak of "the future of free software in China", what I mostly mean is: how can commercial companies and the free software developer community jointly build free software communities capable of commercial conversion?
For free software developers, software freedom should not become a religion; it cannot perch on high, above the mortal world. By "above the mortal world" I don't mean free software enthusiasts should go use proprietary software; I mean free software can and should enter commerce. Your passion can generate value; you just need a better model to bring your inner convictions and your income into some balance.
For commercial companies, free software is not a flood or a beast. I once heard a software practitioner say: we know the proprietary model; doing free software will kill us. He's right, I think, but what will kill them is not free software — it's the trend. Far from being the beast that kills them, free software is precisely their key weapon for facing the trend. Nobody facing a flood throws away the plank at hand while blaming the plank for bringing the flood.
Others say: "I believe in market supply and demand; I must first know where the demand is. Prove to me that real demand for free software exists." Do you know why Microsoft is Microsoft and Apple is Apple? Because they never confined their ambitions to present demand; they kept trying to create new supply-demand relationships. Asking this kind of question only shows one doesn't actually understand the market economy. Microsoft actively embraced open source — why? The day everyone refuses anything but open source, it stands undefeated once again: it has once more conjured a new supply-demand relationship out of the prevailing wind. Whether open source necessarily satisfies the original demand better than closed source doesn't matter commercially; don't spend your limited life doubting the trend. That sounds like following the crowd, but the decision is in your own hands.
Having written this much, I'm starting to doubt my own writing ability. When Zhuge Liang planned the future of the Three Kingdoms in the Longzhong Plan, it took fewer than 350 characters, while I've written all this and still haven't made the problem clear.
Does that prove Zhuge Liang's brilliance? I don't see it that way; I suspect it's that Liu Bei was brilliant — a pitch of under 350 characters, and he got it at one telling.
So I'm more interested in becoming someone like Liu Bei, because such people are scarcer. I don't want to be Zhuge Liang, because in today's industry, professional managers as excellent as Zhuge Liang are plentiful.
RMS didn't respond to my privacy topic, and I was rather disappointed that day, though I admit what they discussed was also meaningful. A few days later RMS was invited to speak at Microsoft, and he brought up privacy.
It seems he had long been prepared, and answered me in this way.
Then again, maybe it shows that my pace of doing and thinking has fallen behind once more.
But I don't see it that way either. Working in a place, I must adapt to that place's ways and speed. Drop Bill Gates alone into China with the conditions I have, and he wouldn't necessarily do better than I.
Yet we are always moving forward.
If you're struggling with the installation and dependencies of GNU Artanis, here's an easier way to go.
First, you need Docker; I recommend the official installation document.
Then you can pull the Artanis image:
docker pull registry.gitlab.com/nalaginrut/artanis
Now you can run a container:
docker run -it --rm -v /var/www:/var/www -p 3000:3000 registry.gitlab.com/nalaginrut/colt bash
I'm going to explain something:
When everything is prepared, you should do this to boot up Artanis server:
art work -h 0.0.0.0
The default host is 127.0.0.1, which can't be reached through the mapped port from outside the container, so we set the host to 0.0.0.0 to make sure you can check out your work in an outside browser.
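The loopback-versus-all-interfaces distinction is easy to see outside of Docker too. A small Python sketch (just an illustration of the binding behavior, unrelated to Artanis itself):

```python
import socket

def bound_addr(host):
    """Bind a TCP socket to the given host and report the address it got."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind((host, 0))              # port 0: let the OS pick a free port
    addr = s.getsockname()[0]
    s.close()
    return addr

# 127.0.0.1 is reachable only from the same host, so Docker's -p mapping
# has nothing to forward; 0.0.0.0 listens on every network interface.
print(bound_addr("127.0.0.1"))  # → 127.0.0.1
print(bound_addr("0.0.0.0"))    # → 0.0.0.0
```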
Enjoy!
If you encounter this kind of error from Nginx:
/xxx/yyy/colt.png" failed (13: Permission denied)
Then here's a checklist for solving it quickly:
1) Assuming the user option in /etc/nginx/nginx.conf is www-data
# /etc/nginx/nginx.conf
user www-data;
2) The public directory that serves static files, say /xxx/yyy/, must be owned by www-data:www-data
sudo chown -R www-data:www-data /xxx/yyy
3) The path /xxx/yyy MUST NOT be /root/yyy; this point cost me a lot of time to figure out
4) My suggestion is /var/www/yyy, and set permissions properly
# optional, my personal suggestion
sudo chown www-data:www-data /var/www
# required to do
sudo chmod 750 -R /xxx/yyy
sudo find /xxx/yyy -type f -print0 | xargs -0 chmod 640
Oops, seems I haven't updated my blog for 11 months. What have I been doing?
I quit my boring day job, had my second daughter, was invited to publish a paper about GNU Artanis at the Scheme Workshop 2016 affiliated with ICFP (well, I'll have a post about it after the conference), tried many cool projects with my friends from the SZDIY Community, I'm planning to get my hands dirty with Artificial Intelligence (though I dislike machine learning), and I may have a secret project at the end of this year (folks, you'll love it)...
But now, I would like to introduce our Lua frontend on GNU Guile. It's not ready for real work, but most of the work is done. There was an old, half-baked Lua frontend in a Guile branch, but it seems to have been unmaintained for a long time, so I've rewritten one from scratch. Please don't ask me why I bothered to reinvent the wheel: the new one exists now, so let's talk about it. I would like to avoid redundant work in other projects, but not in this one.
The term frontend here doesn't mean layout work for a web page. A language frontend on Guile is a programming language implemented in Scheme that takes advantage of all the existing Guile libs/modules, proper tail calls, first-class continuations, and all the optimizations.
It's still very experimental, but you may get it from github:
git clone git://github.com/NalaGinrut/guile-lua-rebirth.git
For anyone who wants to try it, please make sure you have the latest Guile 2.1+ from the master branch; it only works with the latest Guile. It will display a lot of debug info, such as AST analysis and environment printing, before you see the final result. As I said, it's not usable yet. Run it like this:
cd guile-lua-rebirth
guile -L ./
GNU Guile 2.1.3.127-cb421-dirty
Copyright (C) 1995-2016 Free Software Foundation, Inc.

Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
This program is free software, and you are welcome to redistribute it
under certain conditions; type `,show c' for details.

Enter `,help' for help.
scheme@(guile-user)> ,L lua
Happy hacking with Lua! To switch back, type `,L scheme'.
lua@(guile-user)>
Now the Guile REPL has become a Lua REPL. Just try whatever you know about Lua.
The most frequent statement I've heard goes like this: it should be easy to write a frontend on Guile — you just parse Lua code and translate it to Scheme, then let Scheme handle the scoping and continuations and blah blah. So the noticeable work is almost only the tokenizing and parsing.
NO! That's not how guile-lua-rebirth works, nor any serious language frontend on Guile. We're not writing an interpreter of Lua in Scheme; we're trying to build a real compiler for Lua, with Scheme. For a multi-language platform like Guile, the intermediate-language optimization and code generation are already done. What we have to do is parse Lua code into an AST, and convert the AST to tree-il (the first layer of Guile's intermediate representation) without losing any information, especially the environment. In addition, we have to handle the primitives for Lua — most of them are written in Scheme — and find a proper way to call them from Lua across module boundaries. When most of that is done, we have to write many libs as compatible with the original Lua as possible to make this language implementation usable. Finally, we have to find a good way to let modules written in Scheme and Lua call each other. Then you can take advantage of the contributions of the Guile community.
So the tokenizing and parsing are actually the simplest part of our work.
Guile-lua-rebirth has a well-tested lexer (tokenizer) and an LALR(1) parser for the complete Lua 5.2 grammar, a newly designed Abstract Syntax Tree (AST) layer for pre-optimization, arithmetic operations with parametric polymorphism (you may add a number to a string — overloading, if you're familiar with that term), Lua tables, proper tail calls...
Too many obscure terms, huh? Don't worry, they're not so important for a Lua user. Folks can just focus on Lua programming and ignore what the compiler is. Unless you want to contribute — that's welcome, of course. I'll try to explain how Lua is implemented on Guile in later posts.
In addition, I'm trying to add new features to Lua. The rule for extensions is not to break compatibility as far as possible. I've added two so far.
The first new feature is actually a fix of the Lua language. Let's see the code:
a={b={c={}}}

-- define with colon reference
function a.b.c:foo() return self end
print(a.b.c:foo()) -- ==> table: 0x1a12ba0
print(a.b.c.foo()) -- ==> nil

-----------------------------

-- define with point reference
function a.b.c.bar() return self end
print(a.b.c:bar()) -- ==> nil
print(a.b.c.bar()) -- ==> nil
This is a common issue in Lua. The variable 'a' is a table, which contains another table 'b', and 'b' contains table 'c'. Lua uses this approach to mimic object-oriented programming: you may treat tables like classes.
When you define a function in a table, there are two ways to reference the value of a key in the table: the colon reference and the point reference. The only difference is that colon-ref passes the current table object into the function as a hidden argument, bound to a variable named self — like the this object in JavaScript or C++; not the same, but similar. With point-ref, self is not bound to any value, so it's nil by default according to the Lua spec.
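The hidden-argument idea maps loosely onto bound methods in other languages. A tiny Python sketch — an analogy only, not the Lua semantics:

```python
class Table:
    # like `function a.b.c:foo() return self end` -- self is the hidden argument
    def foo(self):
        return self

t = Table()
print(t.foo() is t)        # → True  (bound call: the object is supplied as self)
print(Table.foo(t) is t)   # → True  (explicit self, like passing the table by hand)

# Calling Table.foo() with no argument simply fails with a TypeError:
# self is missing entirely, the rough analogue of Lua's point-ref
# leaving `self' as nil.
try:
    Table.foo()
except TypeError:
    print("missing self")  # → missing self
```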
The interesting point is that you have to use colon-ref at the call site if you want to get a meaningful `self' value. Otherwise you will get nil, even if you defined the function with a colon. (We'll name this ISSUE-1 for later reference.)
It is ridiculous.
I think it's a simple logical problem. If you define the function with a colon, then you must want to reference a meaningful `self' value. If you don't — if you don't want anyone to reference a meaningful `self' — you'd never define the function with a colon, which tells the compiler to pass the current table to the defined function as a hidden argument. On the other hand, if people write a.b.c.foo() in their code, they must want a meaningful `self' rather than nil. So the problem is that ISSUE-1 prevents people from getting a meaningful result even when they defined the function with a colon, and confuses them because they can't figure out where the error is when they encounter an unexpected nil somewhere.
My idea? I dropped the old design of passing the current table as a hidden argument, because that implementation causes ISSUE-1. What guile-lua-rebirth does instead is bind `self' to the current table in the nearest environment of the defined function. This gives `self' a solid value whenever the function is defined with a colon, no matter whether you reference it with a colon or a point.
---- The new design in guile-lua-rebirth
a={b={c={}}}

-- define with colon reference
function a.b.c:foo() return self end
print(a.b.c:foo()) -- ==> table: 0x1a12ba0
print(a.b.c.foo()) -- ==> table: 0x1a12ba0
Of course, the behavior is unchanged when you define the function with a point: `self' will be nil either way, so there's no need to show the redundant code here.
Many people think Lua is just similar to C, since most of its syntax looks like C. Well, that's not true. At least, you can't use `continue'. You may read the discussion on StackOverflow, and the explanation from the Lua authors:
This is a common complaint. The Lua authors felt that continue was only one of a number of possible new control flow mechanisms (the fact that it cannot work with the scope rules of repeat/until was a secondary factor.) In Lua 5.2, there is a goto statement which can be easily used to do the same job.
So there's no `continue' statement in Lua, but the feature is useful in certain contexts. For example, say you want to print all odd numbers from 1 to 10.
---- For Lua-5.1
for i=1,10 do
  if i%2~=0 then print(i) end
end
---- You have to change the code logic to make it work

---- For Lua-5.2+ or LuaJIT
for i=1,10 do
  if i % 2 == 0 then goto continue end
  print(i)
  ::continue::
end
---- You have to use GOTO, which is obsolete for a modern language

---- For guile-lua-rebirth
for i=1,10 do
  if i%2==0 then continue
  else print(i) end
end
---- No need to explain for any C user
Guile-lua-rebirth hasn't implemented GOTO yet. It won't be too difficult to implement once we have delimited continuations. But I'm still wondering whether we really need it, except for compatibility with the Lua community. After all, Edsger Dijkstra wrote the famous paper "Go To Statement Considered Harmful". Then again, folks may argue back: "continuations are GOTOs too, and you plan to add those."
Well, I don't know.
I have to close comments, since I found my blog too unstable to receive comments from folks. It's all my bad; I should build a better blog. And I did see your encouraging comments. Thank you very much, Hayden Jones!
There are several concurrency models nowadays — CSP, Actors, π-calculus, etc. It is believed that these models can deliver high-performance, scalable concurrent network services. The actor model (I'll call it Actors in the rest of this article) is a hot one that many folks like to talk about. But rather than just using it, have you ever thought of implementing one to understand it better? If your answer is yes, this is the article for you. I'll show you a feature-limited, tiny actor-model implementation in Scheme. You may call it a poor man's threading system, but I hope it's enough for you to understand the principle.
BTW, folks may kindly mention LFE (which stands for Lisp Flavored Erlang), a Lisp-syntax frontend implemented on the Erlang compiler. In this article I'm trying to show a tiny Actors framework written in pure Scheme, so they're very different in concept.
I would pick GNU Guile, because we need its delimited continuations for coroutines. But that shouldn't stop you from picking your favorite Scheme implementation for Actors once you get the principle in this article.
Why did I pick Actors rather than others, say, CSP? Because Actors has something to do with Scheme historically. And this is a Scheme blog, don't you know? Alright, not quite true — I write about many things — but this blog is built with Scheme. ;-)
After Carl Hewitt introduced the actor model in 1973, Gerald Sussman and Guy Steele's attempt to understand Actors resulted in the creation of Scheme. They then published a paper with the conclusion: "we discovered that the 'actors' and the lambda expressions were identical in implementation."
It is believed that a proper Actors implementation requires at least two key features provided by the language[1]: first-class continuations and proper tail calls. First-class continuations are used to construct light-weight processes in userland, and proper tail calls are more efficient as well as clearer; maybe that's why Scheme has stuck with them from the beginning. I'll try to explain later why they're important for Actors. Folks may say there should be message passing and pattern matching as well, but those features can be constructed relatively easily on top of any mainstream language; they needn't be provided by the language itself.
The benefits of Actors are well known: lock-free programming by means of message passing, async behavior (if you play them properly with non-blocking I/O), high concurrency...
Of course, there are caveats in Actors too. The actor model has no direct notion of inheritance or hierarchy, which makes it time-consuming and confusing to program actors that share common behavior. Besides, a *real* Actors framework demands scalability, which means it should support exchanging states and continuations between distributed nodes — a hard problem to implement efficiently. Moreover, it is not easy to do static analysis, debugging, optimization, etc. for such a model.[2] These problems seem to have kept Actors from becoming a mainstream computing model for a long time. For me, I would use it only in a highly concurrent, scalable server design: it's great for server development, but maybe not as generic for common usage as the object-oriented model.
According to Carl Hewitt, unlike previous models of computation, Actors was inspired by physics, including general relativity and quantum mechanics. In theory, Actors can be described by a handful of axioms/laws/formulas. This article is not going to explain Actors in formal mathematics, since that's painful for non-mathematicians.
From a programming perspective, Actors consists of at least these principles:
1. Light-weight processes (actors) are the units of scheduling; they need not be OS-level processes;
2. Actors must communicate only by means of (immutable) message passing;
3. Each actor has its own mailbox, which can be accessed only by its owner — states are local;
4. A proper pattern-matching mechanism for parsing the messages;
5. Actors may block themselves, but no actor should block the thread on which it is running;
6. Actors may spawn new actors;
7. States and processes can be shared by distributed nodes.
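Principles 2-4 can be sketched in any language before we get to the Scheme version. A deliberately minimal, scheduler-free Python sketch (names like `spawn`/`send`/`receive` here are just illustrative, not the framework below):

```python
from collections import deque

mailboxes = {}                  # pid -> message queue (principle 3: local mailboxes)

def spawn(pid):
    """Register an actor by giving it an empty mailbox."""
    mailboxes[pid] = deque()
    return pid

def send(to_pid, msg):          # principle 2: communicate only via messages
    mailboxes[to_pid].append(msg)

def receive(pid):
    """Pop the oldest message from pid's own mailbox, or None if empty."""
    return mailboxes[pid].popleft() if mailboxes[pid] else None

ping = spawn("ping")
pong = spawn("pong")

send(pong, ("ping", ping))      # ping asks pong to reply, carrying its own pid
tag, sender = receive(pong)     # principle 4: dispatch on the message's shape
if tag == "ping":
    send(sender, "pong")

print(receive(ping))  # → pong
```

What this toy omits — suspending an actor on an empty mailbox and resuming it later — is exactly where continuations come in below.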
Then we can write a proof of concept by following them. Some of these features may look very hard, but they're all implemented in certain modern languages, say, Erlang.
First, we have to consider how to implement light-weight processes, i.e. actors. The best way is not fork, nor pthreads, but continuations.
Continuations are dragons! Keep your distance and pray for some hero to slay them for you — that's safer!
Alright, you might leave now.
Oh, you don't buy it, right?
Continuations are good for implementing lightweight threads/coroutines. Unlike fork/pthreads, such threads won't trap into the OS kernel, so these so-called green threads reduce a lot of overhead and need fewer locks (not quite lock-free, just fewer locks). In brief, they're faster. Unfortunately, I'm not going to discuss what continuations are and how they're implemented in GNU Guile in this article. But it is necessary to explain, in a simple way, the difference between full continuations (call/cc) and delimited continuations, to show why the latter are better for light-weight coroutines. Maybe I'll write another article to discuss them more deeply.
Similarly, I will discuss continuations from a programming perspective.
When we talk about continuations, people may mention call/cc, which as folks may know captures the full continuation. The full stack is copied when capturing it; worse still, in Guile two stacks are copied, the continuation stack and the VM stack. Please don't worry if you don't understand that — call/cc is too heavy to use, that's all I want to say. Please see the figure below:
This figure may be too simple, but it gives you the idea of how full continuations are captured in principle.
And how about delimited continuations?
Again, this is too simple to show the true implementation in Guile, but you get how delimited continuations are captured. Here's a better article if you want to learn more.
I hope this helps you understand why delimited continuations are excellent for implementing light-weight processes. The point is that you only need to copy a few stack frames (rather than the full stack) in userland, which is light enough for scheduling.
I'm not going to discuss how to handle delimited continuations in detail here, because that could be a very large topic; one can play with delimited continuations in many fancy ways. I'll just give the simplest example to show the principle of a coroutine.
(use-modules (ice-9 control)) ; import delimited-continuations module

(define workqueue '())

;; silly function to show the principle
(define (fun x)
  (display "start\n")
  (abort) ; yield here; the rest of the computation (the continuation) is stopped, then scheduled
  (display x)(newline)
  (display "end\n"))

;; very simple scheduler: just add a yielded thread to a queue
(define (scheduler k)
  (set! workqueue (cons k workqueue)))

;; % is an abbreviation of call-with-prompt
(define (spawn-thread proc)
  (% (proc) scheduler))

(spawn-thread (lambda () (fun 123)))
;; ==> start

workqueue
;; ==> (#<partial-continuation 28baba0>) ; or something similar

((car workqueue)) ; run this thread
;; ==> 123
;; end
You may notice there are two key points: % draws the bottom line (the prompt) for capturing, and abort yields. Besides, we have scheduler to deal with the captured continuation; it's the second argument of % as you can see.
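If delimited continuations are unfamiliar, the same suspend-and-reschedule shape can be sketched with Python generators, where `yield` plays the role of `abort` and the saved generator is the "partial continuation" the scheduler parks (an analogy only; generators are far weaker than real continuations):

```python
from collections import deque

work_queue = deque()

def fun(x):
    print("start")
    yield                 # "abort": suspend; the rest of fun is the saved continuation
    print(x)
    print("end")

def spawn_thread(gen):
    """Run a generator until its first yield, then queue its continuation."""
    try:
        next(gen)
        work_queue.append(gen)   # "scheduler": park the suspended rest of the work
    except StopIteration:
        pass

spawn_thread(fun(123))
# → prints "start"; the remainder of fun is now parked in work_queue

# Resume the parked continuation, as ((car workqueue)) does in the Scheme code
try:
    next(work_queue.popleft())
except StopIteration:
    pass
# → prints "123" then "end"
```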
Well, now I believe that's enough background to move on to Actors.
There are various ways to implement Actors; here I'm trying to mimic an Erlang-style one, with a ping-pong game. If you've ever played with Erlang, you'll be familiar with the ping-pong code from the Erlang manual:
-module(tut15).
-export([start/0, ping/2, pong/0]).

ping(0, Pong_PID) ->
    Pong_PID ! finished,
    io:format("ping finished~n", []);
ping(N, Pong_PID) ->
    Pong_PID ! {ping, self()},
    receive
        pong ->
            io:format("Ping received pong~n", [])
    end,
    ping(N - 1, Pong_PID).

pong() ->
    receive
        finished ->
            io:format("Pong finished~n", []);
        {ping, Ping_PID} ->
            io:format("Pong received ping~n", []),
            Ping_PID ! pong,
            pong()
    end.

start() ->
    Pong_PID = spawn(tut15, pong, []),
    spawn(tut15, ping, [3, Pong_PID]).
But if you transcribe the Erlang code into S-expressions:
(define (ping . args)
  (match args
    ((0 pong-pid)
     (! pong-pid 'finished)
     (format #t "ping finished~%"))
    ((n pong-pid)
     (! pong-pid (list 'ping (self)))
     (receive
      ('pong #t (format #t "ping received pong~%")))
     (ping (1- n) pong-pid))))

(define (pong)
  (receive
   ('finished #t (format #t "pong finished~%"))
   (('ping ping-pid)
    (format #t "pong received ping~%")
    (! ping-pid 'pong)
    (pong))))

(define (start)
  (let ((pong-pid (spawn pong '())))
    (spawn ping (list 3 pong-pid))
    (active-scheduler)))
Looks similar, huh? Anyway, I intended it that way; it may help you learn Actors in Guile by comparing with the Erlang code.
Before the code above can work, we have to implement the Actors framework itself: message passing, mailbox checks, process spawning, and the scheduler. Let me show the code directly:
;;==============Tiny framework of Actor-model=========================
(use-modules (ice-9 control) ; for delimited-continuations
             (ice-9 match)   ; for pattern matching
             (ice-9 q))      ; for the queue data structure

(define *mailbox* (make-hash-table))
(define *task-queue* (make-q))

(define (gen-pid) (gensym "actor-"))

;; send message to the pid (unique to a process)
(define-syntax-rule (! pid msg)
  (let ((mq (hashq-ref *mailbox* pid)))
    (if mq
        (enq! mq msg)
        (error '! "send msg: invalid pid" pid))))

(define-syntax-rule (has-task?)
  (not (q-empty? *task-queue*)))

;; get the pid of the current process
(define-syntax-rule (self)
  (if (has-task?)
      (car (q-front *task-queue*))
      (error 'self "No task!")))

;; check the mailbox of the current process; schedule it to sleep when empty
(define-syntax-rule (receive body ...)
  (let lp ()
    (match (hashq-ref *mailbox* (self))
      ((or #f (? q-empty?))
       (abort 'sleep)
       (lp))
      ;; Very important!!! We must schedule the process after each received
      ;; message, or we fail to capture the correct continuation here. Don't
      ;; blame me if you see a ghost when you remove it; follow me to stay safe!!!
      (mq
       ((lambda (_) (match _ body ...)) (deq! mq))
       (abort 'sleep)))))

(define-syntax-rule (%schedule)
  (when (has-task?)
    ((cdr (q-front *task-queue*)))))

(define-syntax-rule (active-scheduler)
  (% (%schedule) scheduler))

;; a simple scheduler to dispatch processes by status:
;; `sleep' suspends the current process and runs the next one from the queue
;; `quit' exits the current process
(define (scheduler k s)
  (case s
    ((sleep) (enq! *task-queue* (cons (car (deq! *task-queue*)) k)))
    ((quit) (hashq-remove! *mailbox* (car (deq! *task-queue*)))))
  (active-scheduler))

;; spawn a new process to run proc with args as arguments
(define (spawn proc args)
  (let ((pid (gen-pid)))
    (hashq-set! *mailbox* pid (make-q))
    (enq! *task-queue*
          (cons pid
                (lambda ()
                  (apply proc args)
                  (abort 'quit))))
    pid))
Pretty easy, right? The principle is explicit:
1. spawn a process to run a function;
2. send messages between processes, depending on how you write these functions;
3. use receive to check the mailbox of the current process and handle the messages;
4. schedule the current process away when its mailbox is empty;
5. quit the process properly when the function is over (no more looping).
And maybe you've noticed that the two key features mentioned above, first-class continuations and proper tail calls, play a great role in Actors. Actually, we're using first-class delimited continuations to implement the light-weight processes. As for proper tail calls — well, you see them everywhere in the code, right? They make the code clearer and more elegant. Or try to imagine rewriting them as while loops.
Here is a complete version you may try.
Don't be upset if you can't understand delimited continuations properly; they look easy but aren't easy at all. I may write another article about them. This time, please focus only on the principles of Actors.
I'm sorry, but I didn't implement serializable continuations, which would be needed to share states and processes between distributed nodes. Doing that properly is another big topic; I may discuss it in the future.
[1] "Scheme@33", William D. Clinger.
[2] "Why has the actor model not succeeded?", Paul Mackay, http://www.doc.ic.ac.uk/~nd/surprise_97/journal/vol2/pjm2/
These days I'm fascinated by working on the MAL (which stands for Make a Lisp) project with Guile. I've finished several steps so far, and I'll send a pull request when it's all done. I found that my implementation on Guile 2.1 runs faster than most of the others, only a little slower than the C version. Seems there's something to look forward to. ;-)
Today I don't want to talk about that project, though; I want to discuss the FFI in Guile. The reason I jumped into the FFI is that the MAL spec uses PCRE for its tokenizer hint. Personally, I don't think that's a good idea, because the PCRE regex hides all the lexer details from compiler writers. But if it's aimed at beginner compiler hackers, it does reduce the pain of writing a lexer.
To get to the point quickly: Guile has no good enough PCRE implementation so far. Oh yes, there is my favorite irregex — very cool stuff, supporting most PCRE features — but I don't want my pull request to include such a big lib. I'd like to try a tiny & elegant solution instead. That's when I turned to the FFI.
This case uses libpcre; if you need libpcre2, hmm... at least you have this article, right?
There's something we need to put in front of the code:
(use-modules (rnrs)            ; for bytevectors
             (system foreign)) ; for the FFI

;; Get a dynamic link to the .so file; note you don't have to write .so explicitly.
(define pcre-ffi (dynamic-link "libpcre"))
Now pick the existing functions in the .so lib that you want to bind to Guile.
(define %pcre-compile2
  (pointer->procedure
   '*                                       ; specify the return type; here, a pointer
   (dynamic-func "pcre_compile2" pcre-ffi)  ; get the function pointer you want
   (list '* int '* '* '* '*)))              ; declare argument types per the function signature
...
Note: no matter what kind of C pointer you face, '* is the only way to go.
pointer->procedure is used to convert a C function pointer into a callable Guile procedure.
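To see pointer->procedure in action before we tackle PCRE, here's a minimal warm-up of my own, binding libc's strlen (not part of the PCRE binding; it assumes libc symbols are reachable through the global handle that (dynamic-link) with no argument returns):

```scheme
(use-modules (system foreign))

;; (dynamic-link) with no argument returns the global symbol handle,
;; which normally includes libc
(define strlen
  (pointer->procedure size_t
                      (dynamic-func "strlen" (dynamic-link))
                      (list '*)))

(strlen (string->pointer "hello")) ; => 5
```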
These helper functions will be useful, as you'll see later.
(define (make-blob-pointer len)
  (bytevector->pointer (make-bytevector len)))

(define* (make-c-type-wrapper v l h type set ref #:optional (endian 'little))
  (define size (sizeof type))
  (define _obj (make-bytevector size 0))
  (or (and (> v l) (< v h)) (error "value is overflow!" v))
  (if (> size 1)
      (set _obj 0 v endian)
      (set _obj 0 v))
  (lambda (cmd . arg)
    (case cmd
      ((ref) (if (> size 1) (ref _obj 0 endian) (ref _obj 0)))
      ((set)
       (or (and (> (car arg) l) (< (car arg) h))
           (error "value is overflow!" v))
       (if (> size 1)
           (set _obj 0 (car arg) endian)
           (set _obj 0 (car arg))))
      ((&) (bytevector->pointer _obj))
      (else (error "Invalid cmd, should be ref/set/&" cmd)))))

;; Assuming we're little-endian in this case
(define (make-C-uint8 x)
  (make-c-type-wrapper x 0 255 uint8 bytevector-u8-set! bytevector-u8-ref))
(define (make-C-sint8 x)
  (make-c-type-wrapper x -128 127 sint8 bytevector-s8-set! bytevector-s8-ref))
(define (make-C-uint16 x)
  (make-c-type-wrapper x 0 65535 uint16 bytevector-u16-set! bytevector-u16-ref))
(define (make-C-sint16 x)
  (make-c-type-wrapper x -32768 32767 sint16 bytevector-s16-set! bytevector-s16-ref))
(define (make-C-uint32 x)
  (make-c-type-wrapper x 0 4294967295 uint32 bytevector-u32-set! bytevector-u32-ref))
(define (make-C-sint32 x)
  (make-c-type-wrapper x -2147483648 2147483647 sint32 bytevector-s32-set! bytevector-s32-ref))
...
;; Try to finish the rest by yourself! Don't forget float!
IMO, the only difficulty in FFI is how you handle the various C pointers to satisfy the arguments. The other types, int, long... are trivial. There are at least four situations.
Regular C pointer pointing to nothing (any type except pointer types)

non_pointer_type *a;
func(a);
This kind of pointer is usually used to return a value when one doesn't want to use the function's return mechanism.
(let ((a %null-pointer)) ; in Guile, you have to point to NULL explicitly
  (func a)               ; now 'a' holds something returned from func for later use
  ...)

Regular C pointer pointing to a certain variable (any type except pointer types)
non_pointer_type a = certain_obj;
non_pointer_type *p = &a;

// Assuming we have this silly function in the .so lib
int func(int *x) { *x = 10; return 0; }
Usually, this kind of pointer produces a side effect on the pointed-to variable, e.g. changing its value.
But we can't use scm->pointer directly!!! The pointer returned by that procedure is a pointer inside the VM, not into the C stack or heap! Obviously, we have to get help from bytevectors.
;; For integers, including int/short/long
(let* ((a (make-C-uint8 5))
       (p (a '&)))
  (func p)
  (a 'ref)) ; ==> 10  Yay!!!
;; Please try the other types by yourself

Pointer to a pointer pointing to nothing
This kind of pointer is usually used to hold a memory block allocated within the callee function. Take a look at this tutorial if you have any questions.
non_pointer_type **ptr;
malloc_something_func(ptr);
In Guile, you have two choices for freeing the allocated block. One is to free it as a C programmer does; the other is to register a finalizer, delegating the job to the GC, which means the block will be freed automatically when there are no more references to it.
The second point to note is that you have to allocate a proper bytevector to hold the pointer-to-pointer. We need to allocate such memory blocks manually: C allocates stack memory automatically, but Guile won't do that for C code, since Guile doesn't know the C declarations involved. One possible design would be to embed a C parser for that job, but that's beyond our topic.
The last point is to remember to use dereference-pointer.
(define manual-free
  (pointer->procedure void
                      (dynamic-func "xxx_free" xxx-ffi)
                      (list '*)))

;; NOTE: a finalizer has to be a C function pointer rather than a Guile procedure!
(define auto-free (dynamic-func "xxx_free" xxx-ffi))

(let ((pp (make-blob-pointer (sizeof ptrdiff_t)))) ; allocate memory to store a pointer
  (malloc_something_func pp)
  ;; set the finalizer; GC will free it automatically, as befits a Lisper
  (set-pointer-finalizer! (dereference-pointer pp) auto-free)
  ...                                     ; do your job
  (manual-free (dereference-pointer pp))) ; or you may free it as a C programmer would
You don't have to take care of pp, since it's allocated by the GC. What you should take care of is (dereference-pointer pp), which points to the block allocated in malloc_something_func.
Three-and-higher-level pointers

No!!! Don't ask me about them....
Here's a workable example using FFI:
git clone https://github.com/NalaGinrut/guile-pcre-ffi.git
And you may try the tokenizer of MAL:
(use-modules (nala pcre))

(define *token-re*
  (new-pcre "[\\s,]*(~@|[\\[\\]{}()'`~^@]|\"(?:\\\\.|[^\\\\\"])*\"|;[^\n]*|[^\\s\\[\\]{}('\"`,;)]*)"))

(define (tokenizer str)
  (filter (lambda (s)
            (and (not (string-null? s))
                 (not (string=? (substring s 0 1) ";"))))
          (pcre-search *token-re* str)))

(tokenizer "nil true ,false")
;; ==> ("nil" "true" "false")
There're at least two problems if you want to use pure FFI.
One is the reentrancy issue. If some C code can't promise you reentrancy, you may have to write a C wrapper for it.
The second is that FFI can't handle pointers elegantly, even though it looks elegant. In practice, you may have to endure many segfaults before you get things working, and it's hard to debug when you use pure FFI. The same situation would be easier in C code.
Even though my guile-pcre-ffi works fine, there's a fatal bug that causes a segfault when you try to free a pcre object, whether explicitly or implicitly. Unfortunately, I haven't found the reason. As I said, it doesn't seem so nice to debug...
I'm pleased to announce artanis-0.0.3 here.
GNU Artanis is a web application framework(WAF) written in Guile Scheme. It is designed to support the development of dynamic websites, web applications, web services and web resources. Artanis provides several tools for web development: database access, templating frameworks, session management, URL-remapping for RESTful, page caching, and so on.
GNU Artanis is under GPLv3+ & LGPLv3+ (dual licenses).
GNU Artanis is also the official project of SZDIY community. It's used to build server side of SZDIY common service. It is offered to GNU project to make free software better.
Here are the compressed sources:
http://alpha.gnu.org/gnu/artanis//artanis-0.0.3.tar.gz (432KB) http://alpha.gnu.org/gnu/artanis//artanis-0.0.3.tar.bz2 (352KB)
Here are the GPG detached signatures[*]:
http://alpha.gnu.org/gnu/artanis//artanis-0.0.3.tar.gz.sig http://alpha.gnu.org/gnu/artanis//artanis-0.0.3.tar.bz2.sig
Use a mirror for higher download bandwidth:
http://www.gnu.org/order/ftp.html
Here are the MD5 and SHA1 checksums:
751adf2bee25fd780041142ee4f714f6  artanis-0.0.3.tar.gz
d4aa8076c5af142785037546a378cc61  artanis-0.0.3.tar.bz2
b33cd373f6d969db7e25ce38b4567ea6fb85adc6  artanis-0.0.3.tar.gz
acc5d2fa70f620639aeae9b965cc167562601c3a  artanis-0.0.3.tar.bz2
[*] Use a .sig file to verify that the corresponding file (without the .sig suffix) is intact. First, be sure to download both the .sig file and the corresponding tarball. Then, run a command like this:
gpg --verify artanis-0.0.3.tar.gz.sig
If that command fails because you don't have the required public key, then run this command to import it:
gpg --keyserver keys.gnupg.net --recv-keys EE78E925
and rerun the 'gpg --verify' command.
This release was bootstrapped with the following tools:
Autoconf 2.69.120-5dcda-dirty
Guile 2.1.0.2306-22c9e

Changes in 0.0.3
* Notable changes
  Fixed several bugs.
  Support JSONP.
Last night I wrote a command line interface (CLI) for GNU Artanis. After putting my daughter to sleep, I had only one hour to hack on something.
My aim is to add an `art create proj-name' command that creates a bunch of files/directories to initialize a new Artanis web app, just like Rails does. Well, since my Turing brain had limited time for this hack, I chose my favorite way to traverse the tree for this job: the recursive way. ;-)
One of the proper ways to mkdir a directory tree is to use DFS. I chose pre-order DFS.
If you have a well-structured tree implementation like this, then the pseudo code below could be simple enough to try.
;; This is pseudo code
(define (preorder node visit)
  (cond
   ((leaf? node) #t)
   (else
    (visit node)
    (for-each (lambda (n) (preorder n visit))
              (children node)))))
Actually, the same algorithm is used in my unfinished data-structure lib Nashkel. It could be optimized into an iterative one, though...
Here's a problem: I don't know whether I'll add/remove directories in the future, so I need a flexible design that lets me do this job easily. A proper tree design is a good start.
For example:
|-- app
|   |-- controller
|   |-- model
|   `-- view
|-- lib
|-- log
|-- prv
|-- pub
|   |-- css
|   |-- img
|   |   `-- upload
|   `-- js
|-- sys
|   |-- i18n
|   `-- pages
|-- test
|   |-- benchmark
|   |-- functional
|   `-- unit
`-- tmp
    `-- cache
In Scheme, the most convenient data structure is the list:
(define *dir-arch*
  '((app (model controller view))        ; MVC stuff
    (sys (pages i18n))                   ; system stuff
    (log)                                ; log files
    (lib)                                ; libs
    (pub ((img (upload)) css js))        ; public assets
    (prv)                                ; private stuff, say, dedicated config or tokens
    (tmp (cache))                        ; temporary files
    (test (unit functional benchmark)))) ; test stuff
Let me explain this tree structure built with lists, since the algorithm implementation has something to do with the data structure design. Here is the rule:

((node1 (child1_1 child1_2 ...))
 (node2 ((node3 (child3_1 child3_2 ...))
         child2_1 child2_2
         (node4 (child4_1 child4_2 ...))
         ...))
 (node5) ; no children at all
 ...)
The disadvantage of this tree is that it's hard to modify dynamically (alright, you may use set-car! and set-cdr!). But it's easy to handle statically, because the hierarchy is clear.
Now that I'm using Scheme, I'll certainly choose the FP way. Higher-order functions are something like the design patterns of FP. Designing a good, generic one will save you a lot of time in future development.
Here is mine:
(use-modules (ice-9 match)) ; for `match'

(define (dfs tree visit level)
  (match tree
    (() #t)
    (((r (children ...)) rest ...)
     (visit r level)
     (for-each (lambda (x) (dfs (list x) visit (cons r level))) children)
     (dfs rest visit level))
    (((r) rest ...)
     (visit r level)
     (dfs rest visit level))
    ((children ...)
     (visit (car children) level)
     (dfs (cdr children) visit level))))
So it's easy to mkdir a directory tree now:
(define (create-framework)
  (define (->path x l)
    (format #f "~{~a~^/~}" (reverse (cons x l))))
  (dfs *dir-arch*
       (lambda (x l) (mkdir (->path x l)))
       '()))
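Before actually running mkdir, a dry run is handy. Here's a hypothetical list-framework of my own (assuming the dfs and *dir-arch* definitions above) that collects the would-be paths instead of creating them:

```scheme
;; collect the would-be paths instead of creating directories
(define (list-framework)
  (define (->path x l)
    (format #f "~{~a~^/~}" (reverse (cons x l))))
  (let ((acc '()))
    (dfs *dir-arch*
         (lambda (x l) (set! acc (cons (->path x l) acc)))
         '())
    (reverse acc)))

;; (list-framework)
;; => ("app" "app/model" "app/controller" "app/view" "sys" "sys/pages" ...)
```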
This CLI feature won't appear in 0.0.3, and because it'll bring great changes to GNU Artanis, maybe it should go into 0.1.0. Dunno...
Happy hacking!
I'm pleased to announce artanis-0.0.2 here.
GNU Artanis is a web application framework(WAF) written in Guile Scheme. It is designed to support the development of dynamic websites, web applications, web services and web resources. Artanis provides several tools for web development: database access, templating frameworks, session management, URL-remapping for RESTful, page caching, and so on.
GNU Artanis is under GPLv3+ & LGPLv3+ (dual licenses).
GNU Artanis is also the official project of SZDIY community. It's used to build server side of SZDIY common service. It is offered to GNU project to make free software better.
Here are the compressed sources:
http://alpha.gnu.org/gnu/artanis/artanis-0.0.2.tar.gz (440KB) http://alpha.gnu.org/gnu/artanis/artanis-0.0.2.tar.bz2 (360KB)
Here are the GPG detached signatures[*]:
http://alpha.gnu.org/gnu/artanis/artanis-0.0.2.tar.gz.sig http://alpha.gnu.org/gnu/artanis/artanis-0.0.2.tar.bz2.sig
Use a mirror for higher download bandwidth:
http://www.gnu.org/order/ftp.html
Here are the MD5 and SHA1 checksums:
0914f4511263a725973f9f1462a18d53  artanis-0.0.2.tar.gz
b288e77c5986b5b95ce03f8ef15a4d86  artanis-0.0.2.tar.bz2
af62cdf790ee9540172201109b6b6e5be6dd3fce  artanis-0.0.2.tar.gz
2aa174e8fdc12cbe8e7c108fac9cdf0ecf569f41  artanis-0.0.2.tar.bz2
[*] Use a .sig file to verify that the corresponding file (without the .sig suffix) is intact. First, be sure to download both the .sig file and the corresponding tarball. Then, run a command like this:
gpg --verify artanis-0.0.2.tar.gz.sig
If that command fails because you don't have the required public key, then run this command to import it:
gpg --keyserver keys.gnupg.net --recv-keys EE78E925
and rerun the 'gpg --verify' command.
This release was bootstrapped with the following tools:
Autoconf 2.69.120-5dcda-dirty
Guile 2.1.0.2301-7b0a8

Changes in 0.0.2
* Notable changes
  Updated for GNU project.
Artanis is a new web application framework (WAF) written in pure GNU Guile Scheme. Artanis is free software, under GPLv3 and LGPLv3.
Artanis contains common HTTP stuff (cookies, authentication, cache, sessions...), URL-remapping, HTML templating, and various experimental methods for handling databases: SSQL (SQL in s-expr), FPRM (Functional Programming Relational Mapping), and SQL-mapping. It currently supports mysql/postgresql/sqlite3 as DBDs.
The current performance of the server in Artanis is weak. An async green-thread server built on Guile's brand-new delimited continuations is planned for the next big version (0.2).
Artanis was certified as an awesome project in the 2013 Lisp in Summer Projects contest. It's an official project of the SZDIY community for building the server side of the community's web services and mobile apps.
Unfortunately, Guile doesn't provide a method to convert hex to binary (as I write this, the current version is 2.0.11). People have to write their own.
But fortunately, writing code in Scheme is kind of enjoyable. Every time I finish a program in Scheme, a new idea strikes me and I refactor it. Then you can't stop programming.
Well, sometimes we just want a quick way to do our job. But in Scheme, you may become a lib/framework writer unexpectedly. Maybe that's the worst part of Scheme. And one may enjoy wasting such time... hey, I'm sorry, Boss...
Here's my version; you may take it for free, or donate some fixes/suggestions/comments. ;-)
(use-modules (rnrs)) ; we need bytevectors from the rnrs module

(define (%numstr->bin str final base)
  (define len (string-length str))
  (let lp ((i 0) (ret '()))
    (cond
     ((= i len) (final (reverse! ret)))
     (else
      (lp (+ i 2)
          (cons (string->number (substring str i (+ i 2)) base) ret))))))
;; NOTE: substring in Guile happens to be copy-on-write, so this is efficient

(define (hex->bin str) (%numstr->bin str u8-list->bytevector 16))
(define (hex->list str) (%numstr->bin str identity 16))
(define (hex->ascii str)
  (%numstr->bin str (lambda (x) (utf8->string (u8-list->bytevector x))) 16))

;; (hex->bin "4a4b4c")   ;; => #vu8(74 75 76)
;; (hex->list "4a4b4c")  ;; => (74 75 76)
;; (hex->ascii "4a4b4c") ;; => "JKL"
In the beginning I just needed hex->bin, but in the end I wrote a higher-order function for a tiny hex lib.
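The opposite direction makes a nice companion exercise. Here's a hedged sketch of a bin->hex (the name is mine, not from any Guile module; it relies on the ~x directive of (ice-9 format)):

```scheme
(use-modules (rnrs)        ; bytevector->u8-list
             (ice-9 format)) ; CL-style format with ~x

(define (bin->hex bv)
  (string-concatenate
   (map (lambda (n) (format #f "~2,'0x" n)) ; two hex digits, zero-padded
        (bytevector->u8-list bv))))

;; (bin->hex #vu8(74 75 76)) ;; => "4a4b4c"
```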
Response to comments:
Suggestions from @Raymond:
1) get rid of the `final' argument of %numstr->bin.
(%numstr->bin str final base) isn't any shorter or faster than (final (%numstr->bin str base)); in the latter case it's easier to understand what's happening and there aren't as many positional parameters to remember.
Answer: Yes, you're right. I agree that a callback in final position can be factored out and applied outside.
2) rename %numstr->bin to %numstr->u8-list.
If you eliminate the `final' parameter as suggested, then %numstr->bin always returns the same type, and you can use a more descriptive name.
Answer: Yes, it's a related optimization. ;-)
According to SICP, higher-order functions give us the ability to build abstractions by assigning names to common patterns and then working in terms of the abstractions directly. They're very useful when you're trying to refactor your code.
There are several built-in classical higher-order functions in most modern languages, like map, reduce, filter, accumulate... Frequently, we use them to solve most problems.
But that doesn't mean you can't find or invent a new one. Actually, you may be inspired during programming and find a new pattern that could be abstracted into a higher-order function.
This article introduces a new higher-order function for certain cases. I wrote it in Scheme, but you can rewrite it in any language that supports first-class functions.
It's easy to spot the pattern in some cases: the input data needs to be handled repeatedly, and each round's result becomes the input of the next round, *BUT* we need to pass two arguments to the kernel function. This is similar to the pipeline pattern, but different, because the pipeline pattern passes only one argument between rounds.
To make this clearer, let's try a simple example.
Consider a GCD (Greatest Common Divisor) function:
(define (my-gcd x y)
  (if (zero? y)
      x
      (my-gcd y (modulo x y))))
You may have noticed a rule: my-gcd is called repeatedly, and each time the arguments are related to the previous round. Actually, the kernel function my-gcd generates the input arguments for the next round. This situation appears in most tail-call procedures.
So let me introduce a new pattern named urob, which stands for Uroboros: a snake eating itself. I can't say this metaphor is very apt, but I hope it can activate your imagination.
(define (urob func init val pred)
  (if (pred val)
      init
      (call-with-values
          (lambda () (func init val))
        (lambda (x y) (urob func x y pred)))))
Very simple huh?
The pred predicate decides whether the recursive calls should end, and func is the kernel function. Let's try it for GCD:
(define (my-gcd2 x y)
  (urob (lambda (a b) (values b (modulo a b)))
        x y zero?))

"But why would we drop the previous simple and elegant GCD implementation for this?"
Someone may shout.
I'll explain that in the end. Now let's try a more complex case:
A delimiter lexer is easy to understand. You specify a delimiters list, and the lexer will tokenize the input string:
;; (lexer str delimiters)
(lexer "hello.x+(bye)" ".+()")
;; ==> ("hello" "." "x" "+" "(" "bye" ")")
It's not hard to implement such a lexer with loop:
(define (lexer str delimiters)
  (define cs (->char-set delimiters))
  (define-syntax-rule (-> w r)
    (if (null? w) r (cons (list->string (reverse! w)) r)))
  (let lp ((lst (string->list str)) (ret '()) (word '()))
    (cond
     ((null? lst) (reverse! (-> word ret))) ; flush any trailing word too
     ((char-set-contains? cs (car lst))
      (lp (cdr lst) (cons (string (car lst)) (-> word ret)) '()))
     (else (lp (cdr lst) ret (cons (car lst) word))))))
But I'm going to show you a uroboros version:
(use-modules (ice-9 rdelim) (rnrs))

(define (lexer str delimiters)
  (define (-> c)
    (if (eof-object? c) '() (list (string c))))
  (define (tokenizer lst str)
    (call-with-input-string str
      (lambda (port)
        (let* ((token (read-delimited delimiters port 'peek))
               (delim (-> (read-char port)))
               (rest (get-string-all port)))
          (values (if (string-null? token)
                      `(,@lst ,@delim)
                      `(,@lst ,token ,@delim))
                  rest)))))
  (urob tokenizer '() str string-null?))
Choosing a proper higher-order function helps you break your code into more maintainable, independent functions and reuse code broadly. Although understanding the classical higher-order functions is important in functional programming, you can also invent your own through practice.
I've always thought that if design patterns are so significant for OOP, maybe higher-order functions have equal status in FP. Maybe both.
Who knows...
The concept of a pipeline in functional programming means a succession of functions that operate, one after another, on a stream of data: a chain of processing elements arranged so that the output of each element is the input of the next. One of the most famous practices is the Unix pipeline.
This article is about the pipeline-like programming pattern, rather than 'yet another tutorial on pipeline tools'.
Let's see some code:
Here's some silly code as an example.
def get_number():
    return input("Give me a number: ")

def num_filter(num):
    print "oh you input %d!" % num
    return num % 2

def word_picker(hit):
    print "checking..."
    return ('even', 'odd')[hit]

def result_show(result):
    print "It is an %s number." % result

result_show(word_picker(num_filter(get_number())))
The process is very simple: get a number, check whether it's even or odd, and finally print out the result.
But this code has a problem: each time you want to add a level to your cascaded function chain, you have to modify the code. The worst case is when you have more than 10 levels in your call chain; you can't write it as a chain anymore. You have to split it like this:
num = get_number()
hit = num_filter(num)
result = word_picker(hit)
result_show(result)
This could be clearer, and it's easy to add an 'add_one' step after 'get_number', which adds 1 to the number you just typed. But you've added a lot of code and several temporary variables. Let's hope your compiler can eliminate all the redundant parts during optimization.
Anyway, it's better not to modify the original code when you want to add new code.
Now here is the pipeline one:
def make_pipeline(procs):
    return lambda x: reduce(lambda y, p: p(y), procs, x)

procs = [num_filter, word_picker, result_show]
f = make_pipeline(procs)
f(get_number())

def add_one(x):
    return 1 + x

procs.insert(0, add_one)  # insert mutates the list (and returns None)
f2 = make_pipeline(procs)
f2(get_number())
Although many Pythoners enjoy lambda and reduce, not all of them have realized that these things were borrowed from Lisp/Scheme land. There are cooler things in Scheme; maybe we should drop reduce and give something else a try, say, fold:
(use-modules (srfi srfi-1)) ; `fold' dwells in srfi-1

(define (get-number)
  (display "Give me a number: ")
  (read))

(define (num-filter num)
  ;; ~d is similar to %d in Python, and ~% means newline
  (format #t "oh you input ~d!~%" num)
  (modulo num 2))

(define (word-picker hit)
  (display "checking...\n")
  (list-ref '("even" "odd") hit)) ; list accessing in Scheme

(define (result-show result)
  ;; ~a is similar to %r in Python
  (format #t "It is an ~a number.~%" result))

(define (make-pipeline . procs)
  ;; fold passes each element (here, a procedure) and the accumulated value
  (lambda (x) (fold (lambda (p v) (p v)) x procs)))

(define f (make-pipeline num-filter word-picker result-show))
(f (get-number))
See, there's not much difference from the Python code. And you may read the manual of fold here.
After all, the content of this article is nothing but a possible way to implement compose, which is very useful and interesting in programming. Now that you know what it is and how it can be done, maybe it's time to hack more code, huh?
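As an aside, here's a minimal sketch of such a compose (my-compose is my own name for illustration; Guile ships a compose procedure of its own in the core):

```scheme
(use-modules (srfi srfi-1)) ; for fold-right

(define (my-compose . procs)
  ;; apply the procedures right-to-left, like mathematical composition
  (lambda (x) (fold-right (lambda (p v) (p v)) x procs)))

((my-compose 1+ (lambda (n) (* n 2))) 10) ; => 21, i.e. (1+ (* 10 2))
```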
Besides being a Scheme implementation, Guile is also an extension language platform. This means you can write new languages on top of it, which could be totally different from Scheme: say, PHP/Lua/Ecmascript... and all these front ends take advantage of Guile's compiler optimization machinery.
This article introduces a simple way to define a very simple programming language, in 50 lines of code, in Guile.
The contents may require some basic knowledge of compiler principles. If you're not comfortable with the terms in this article, please read the Dragon Book first.
It's better to clarify something before we start. When I talk about 'creating a language', I mean designing the grammar of the language in a certain form, usually BNF. And when I say 'implementing a language', I mean writing a lexer/parser, then transforming the source code into an intermediate form, like an AST (Abstract Syntax Tree) or another IL (Intermediate Language), for some optimizations. Finally, we have two choices:
1. Interpret the IL to get the meaningful result -- it's an interpreter!
2. Generate some kind of code to store in a file -- it's a compiler: if the final code is bytecode, you're writing a VM-targeted one; if it's native code, an AOT compiler.
This article is about the front end only: the lexer and parser, plus transforming a simple AST (actually, the list type in Scheme) into another kind of AST: tree-il, the first level of Guile's intermediate languages. After the tree-il is generated, the rest of the compiling work is taken care of by Guile.
So we don't have to face the complicated compiler-optimization stuff. This feature makes Guile very easy to implement new languages with. Well, now we need to create a language first; let's name it 'simple'. Here's the BNF:
exp ::= exp op exp
      | number
op  ::= * | /  ;; `multi' and `div' have higher precedence than `plus' and `minus'
      | + | -
Very easy, huh? Only elementary arithmetic. The only difficulty is handling the precedence; we can rewrite it like this:
exp  ::= exp + term
       | exp - term
       | term
term ::= term * factor
       | term / factor
       | factor
This new BNF is clearer than the old one, and it guarantees the correct precedence.
Guile has an integrated LALR(1) parser generator in its core, but no lexer generator, so users have to write lexers manually. That's fine with me, since writing a lexer is interesting. The key point, then, is to know what this lalr module wants to eat.
Anyway, there's an alternative external lexer generator: silex.
The parser generated by the lalr-parser macro is a function that takes two parameters. The first parameter is a lexical analyzer while the second is an error handler. A token is either a symbol, or a record created by the function make-lexical-token:
(make-lexical-token category source value)
A lexical token record has three fields: category, which must be a symbol; source, a source location object; and value, the semantic value associated with the token. For example, a string token would have its category set to 'STRING and its semantic value set to the string value "hello". The field accessors are:
lexical-token-category
lexical-token-source
lexical-token-value
Once the end of file is encountered, the lexical analyzer must return the symbol '*eoi*. Other than that, your lexer must return token records to the parser, then let the parser do its work.
Before we start our 50-lines tour, we need some preparation. The code below loads the LALR module and defines two helper macros.
;; Be sure you imported the LALR module:
(use-modules (system base lalr))

;; Two helper macros to create the token record for returning
(define-syntax-rule (port-source-location port)
  (make-source-location (port-filename port)
                        (port-line port)
                        (port-column port)
                        (false-if-exception (ftell port))
                        #f))

(define-syntax-rule (return port category value)
  (make-lexical-token category (port-source-location port) value))
The lexer is the tool for lexical analysis; its job is to produce tokens.
Let's see some code:
These functions are useful for recognizing the different kinds of token.
(define (is-whitespace? c)
  (char-set-contains? char-set:whitespace c))

(define (is-number? c)
  (char-set-contains? char-set:hex-digit c))

;; operators; in this simple case we have just four
(define (is-op? c)
  (string-contains "+-*/" (string c)))

(define (is-delimiter? c)
  (or (eof-object? c)
      (string-contains " +-*/;\n" (string c))))
And these two functions are used to read the two basic token types of our Simple language: numbers and operators.
(define (get-number port)
  (let lp ((c (peek-char port)) (ret '()))
    (cond
     ((is-delimiter? c)
      ;; encountered a delimiter, so finish reading the number
      ;; and convert it to a number representation
      (string->number (list->string (reverse ret))))
     (else
      (read-char port) ; consume the char
      (lp (peek-char port) (cons c ret))))))

(define (get-op port)
  (string->symbol (string (read-char port))))
The key function is next-token, which checks, reads, and returns the proper token to the parser.
(define (next-token port)
  (let ((c (peek-char port)))
    (cond
     ((or (eof-object? c) (char=? c #\nl)) ; end of line, or end of source
      '*eoi*)                              ; return '*eoi* because the LALR module needs it
     ((is-whitespace? c)
      (read-char port) (next-token port))  ; skip whitespace
     ((is-number? c)
      (return port 'number (get-number port)))
     ((is-op? c)
      (return port (get-op port) #f))
     (else
      (read-char port) (next-token port)))))
This tokenizer is important because it's what we pass to the parser.
(define (make-simple-tokenizer port)
  (lambda () (next-token port)))
The tokenizer must be a thunk (a function without any arguments). Each time the thunk is called, it returns a token. The parser will call it automatically, as many times as needed, depending on the length of your source code. That's what the LALR parser needs to be fed.
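You can check the tokenizer on its own at the REPL, assuming the definitions above (this little probe is mine, not part of the final code):

```scheme
(call-with-input-string "1+2"
  (lambda (port)
    (let* ((next (make-simple-tokenizer port))
           (t1 (next))   ; number token: 1
           (t2 (next))   ; operator token: +
           (t3 (next)))  ; number token: 2
      (list (lexical-token-value t1)
            (lexical-token-category t2)
            (lexical-token-value t3)))))
;; => (1 + 2)
```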
Now it's the parser part:
(define (make-parser)
  (lalr-parser
   ;; Since we handled precedence manually in the BNF,
   ;; we don't need to specify it in lalr-parser:
   ;; (number (left: + -) (left: * /))
   (number + - * /)
   (program (exp) : $1
            (*eoi*) : (call-with-input-string "" read)) ; *eof-object*
   (exp (exp + term) : `(+ ,$1 ,$3)
        (exp - term) : `(- ,$1 ,$3)
        (term) : $1)
   (term (term * factor) : `(* ,$1 ,$3)
         (term / factor) : `(/ ,$1 ,$3)
         (factor) : $1)
   (factor (number) : `(number ,$1))))
You may have noticed that we just converted the BNF into a parse tree, which is actually a list in Scheme, e.g.:
'(* (number 1) (number 3))
This is one of the perfect features of Scheme: you can use this fundamental data type to represent a tree. You don't have to write any new or complex data structure for that. That's why I emphasized that it's very cool to implement new languages with Scheme.
Then you may try this line in the REPL to test whether the precedence is correct:
(call-with-input-string "1+1*2/3-4"
  (lambda (port)
    ((make-parser) (make-simple-tokenizer port) error)))

It should return:
(- (+ (number 1)
      (/ (* (number 1) (number 2))
         (number 3)))
   (number 4))

Maybe you're not so comfortable with this form; it's actually the same as:
(1 + ((1 * 2) / 3)) - 4
It's the correct precedence according to our BNF.
The code below defines a function named compile-tree-il, which transforms our parse tree into a sort of AST: Tree-IL [1]. You see, only a few lines.
(use-modules (ice-9 match)       ; for `match'
             (language tree-il)) ; for `parse-tree-il'

(define (compile-tree-il exp env opts)
  (values (parse-tree-il (comp exp '())) env env))

(define (comp src e)
  (match src
    (('number x) `(const ,x))
    ;; If you're using the master branch of Guile, please use `call' instead of `apply'.
    ;; If you are using stable-2.0 or Guile-2.0.x, `apply' should be fine.
    ((op x y) `(apply (primitive ,op) ,(comp x e) ,(comp y e)))))
You may put all the code shown above into one file, say, simple.scm, and make sure it's in the language/simple directory. This matters because Guile picks up language front ends from the language directory when you ask it to load a language.
Although we've done all the coding work for our Simple language, we have to write one more file to make Guile aware of it: the spec file. The syntax is very easy:
(define-module (language simple spec)
  #:use-module (system base language)
  #:use-module (language simple simple)
  #:export (simple))

;; The definition of the Simple language.
;; You don't have to understand it all; just copy and modify it from another front end.
(define-language simple
  #:title "simple"
  #:reader (lambda (port env)
             ((make-parser) (make-simple-tokenizer port) error))
  #:compilers `((tree-il . ,compile-tree-il))
  #:printer write)
You may type all the code yourself; that's better. But here is a git repo if you need an example:
git clone https://github.com/NalaGinrut/simple.git
Please type sudo make install, then run the Guile REPL (just run 'guile' on your command line). Type ,L simple in your REPL, and you should see:
GNU Guile 2.1.0.1771-48c2a
Copyright (C) 1995-2014 Free Software Foundation, Inc.

Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
This program is free software, and you are welcome to redistribute it
under certain conditions; type `,show c' for details.

Enter `,help' for help.
scheme@(guile-user)> ,L simple
;;; note: auto-compilation is enabled, set GUILE_AUTO_COMPILE=0
;;;       or pass the --no-auto-compile argument to disable.
;;; compiling /usr/local/share/guile/2.2/language/simple/spec.scm
;;; compiled /home/nalaginrut/.cache/guile/ccache/2.2-LE-8-3.4/usr/local/share/guile/2.2/language/simple/spec.scm.go
Happy hacking with simple!  To switch back, type `,L scheme'.
simple@(guile-user)>
OK, now you see the language has changed to 'simple'; just do some math in it:
1+1
=> 2
3*4
=> 12
'simple' is just a simple language, maybe too simple for a serious compiler writer, because even a front end takes a lot of time and hacking power, to say nothing of the back end. Fortunately, Guile provides a nice way to let language fans focus on the grammar rather than on optimization. What's more, all the language front ends can call each other. If you're interested in this feature, please read Wingo's post [2].
I wrote an IMP[3] language before. I expect to share more in the future.
----------
Refs
[1] Tree-IL in Guile: https://www.gnu.org/software/guile/manual/html_node/Tree_002dIL.html
[2] Andy Wingo introduced Ecmascript in Guile, and inter-calling between Ecmascript and Scheme: http://wingolog.org/archives/2009/02/22/ecmascript-for-guile
[3] https://github.com/NalaGinrut/imp
Take it easy if you're uncomfortable with the word 'elegant'. This article is not driven by any academic purpose; I'm just trying to show an interesting way to write chaotic Scheme code. ;-)
IMO, learning a language in a chaotic way is fascinating.
Before your adventure, please make sure your Scheme interpreter supports R7RS.
For example, the latest Racket (5.3+) or GNU Guile (2.0.10+).
If you picked GNU Guile, please enable the R7RS symbols feature:
;; run this line in your Guile REPL:
(read-enable 'r7rs-symbols)
OK
Now here is the code:
;; Try this code:
(apply + `(,((lambda (|2+3|) #(1)2+3) #e2e3)
           ,(* 120 .((*(/(+(/(* #e144#) 84))))))
           ,(*(/(+ #e5/7)) 5)))
So, if you copy this code or type it in correctly, what does it show?
Surprised, huh?
Now let me explain the code, in case you can't figure it out.
We can split this chaotic code into several parts. The first part is:
((lambda (|2+3|) #(1)2+3) ; *exp1*
 #e2e3)                   ; *exp2*
As you may know, the symbols in the parens after lambda are the (formal) parameter names. So |2+3| means a parameter named "2+3". Note that "2+3" here won't be evaluated to 5, but is stored as a symbol used as the parameter name. This is new syntax in R7RS: an identifier can be represented by a sequence of zero or more characters enclosed within vertical lines (|), analogous to string literals.
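Here is a minimal sketch of this in Guile. Since the vertical-bar syntax needs the reader option enabled before the code is read, the sketch keeps the bars inside a string and feeds it through read by hand; at a live REPL you could just type (define |2+3| 42) directly after enabling the option:

```scheme
;; Enable R7RS |...| identifiers in Guile's reader:
(read-enable 'r7rs-symbols)

;; The bars only appear inside a string here, so this file loads even
;; before the reader option takes effect:
(define program-text "(begin (define |2+3| 42) (+ |2+3| 8))")

(define result
  (eval (call-with-input-string program-text read)
        (interaction-environment)))

;; |2+3| is just a *name*; it is never evaluated as arithmetic, so the
;; result is 42 + 8, not 5 + 8:
(display result) (newline)   ; prints 50
```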
#(1) means a vector containing one element: the integer 1. It's reasonable to think of it as a C array, because of the random access:
int arr[1] = {1};
The right paren ')' is a delimiter in Scheme, so *exp1* can be rewritten as:
(lambda (x) ; imagine x is replaced by a variable named "2+3"
  #(1)
  x)        ; return x
Now it's easy to see that this is an anonymous function which always returns the parameter passed in. The vector #(1) is evaluated and ignored in this context.
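You can check this shape in any Scheme REPL; with a plain name instead of |2+3| it needs no special reader option:

```scheme
;; A lambda body may contain several expressions; only the value of the
;; last one is returned, so the vector literal #(1) is evaluated and
;; immediately discarded:
(display ((lambda (x) #(1) x) 2000)) (newline)   ; prints 2000
```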
Let's consider *exp2*. The first question is "In what order should this code be read?"
Is it e2e3 after a hashtag? No.
Actually, it's #e followed by 2e3.
2e3 means $2.0\times 10^3 = 2000.0$.
#e means "produce an exact number". So the code *exp2* can be rewritten as:
(* 2 (expt 10 3))
;; NOTE:
;; #e means "produce an exact number", NOT "convert it to an exact number".
;; The difference is that the cast can't promise exactness.
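A small Guile sketch of why that distinction matters: the literal #e0.1 is read as exactly 1/10, while converting an already-inexact 0.1 can only recover the binary float's value (the printed fraction below is Guile's; other Schemes may print it differently):

```scheme
;; #e asks the *reader* to produce an exact number, so no precision is
;; ever lost; casting an inexact float afterwards is already too late:
(display #e0.1) (newline)                ; prints 1/10
(display (inexact->exact 0.1)) (newline) ; a huge dyadic fraction, not 1/10
```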
So the whole first part:
((lambda (|2+3|) #(1)2+3)
 #e2e3)
simply evaluates to 2000.
Let's move to the second part:
(* 120 .((*(/(+(/(* #e144#) 84))))))
Well, you already know the meaning of #e here, right?
And 144# seems like mysterious, ambiguous syntax. Actually, it's guaranteed by the string->number grammar of R5RS[0]. You may think of it as $ddd\# = \mathrm{inexact}(ddd \times 10)$ [1].
(string->number "144#") ;; ==> 1440.0
144#                    ;; ==> 1440.0
#e144#                  ;; ==> 1440
;; #e means "produce an exact number"
The REPL calls the read procedure to read and parse "144#", which in turn calls string->number to convert it to the inexact number 1440.0.
With all the hints so far, the rest of the code is easy to analyze; it's your turn, if you're still here. ;-)
Happy hacking!
-------------
NOTE:
[0] Actually, this feature was explicitly removed in R6RS and R7RS: # can no longer be used in place of digits in number representations.
[1] Racket/Chicken/Guile have this feature. Some Scheme implementations don't guarantee it: Scheme48 replaces # with 5 rather than 0. Its code seems to do so intentionally, but I've no idea why.
Well, I love Scheme. That's all.
OK, seriously, Scheme is cool enough to be worth the hassle, especially after you've learned SICP.
I picked GNU Guile Scheme for this work, because it appears in almost every GNU/Linux distro.
But you should install Guile 2 rather than Guile 1.8:
sudo apt-get install guile-2.0
- Hey! I'll close this page if you tell me to output HTML with a bunch of `display' calls!
- Don't worry, we're not going that way.
Here's a nice blog engine from Andy Wingo; Tekuti is its name. Tekuti is a Git-based blog engine, so there's no SQL-injection risk: all the data is handled by Git on the server side.
I know many folks like to build their blogs on GitHub nowadays; all you need is a static page generator, and I could write another tutorial for that in Scheme too! But you don't have to use GitHub if you already have a VPS, right?
Anyway, this blog is built with Tekuti.
I suggest you try my fork of Tekuti; it's easier to use:
git clone http://github.com/NalaGinrut/nala-tekuti.git
And compile then install it:
./configure && make && sudo make install
Then create your own config file from the template and edit it:
cd nala-tekuti/tekuti
cp config.template config.scm
emacs config.scm
There are several items in config.scm, but the most significant are `*admin-user*' and `*admin-pass*'. Change them to your preferred admin name and password, and modify the others as you wish.
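For illustration only, a hypothetical sketch of what those two entries might look like in config.scm; the real shape is whatever config.template uses, so the define form and the values below are assumptions:

```scheme
;; Hypothetical config.scm excerpt -- follow config.template, which may
;; use a different form; the values here are placeholders:
(define *admin-user* "admin")      ; your admin login name
(define *admin-pass* "change-me")  ; change this before going live!
```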
Although Tekuti can run independently with its own internal server, it's better to take advantage of a reverse proxy in Nginx, rather than exposing port 8080 to the internet.
You may add these lines to your /etc/nginx/nginx.conf:
location /blog {
    proxy_pass http://127.0.0.1:8080;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}
Then restart your Nginx:
sudo service nginx restart
Now you can start your blog engine:
sudo service tekuti start
Then visit the site http://your_domain_name/blog if you're lucky enough ;-)
Happy hacking!
After nearly a month of tinkering, I've finally got this new blog working. If it has anything special, it's mainly technical: this blog is written entirely in Guile, using Andy Wingo's blog framework Tekuti. I made some improvements and fixed a few bugs.
The architecture is simple: Nginx sits at the front and reverse-proxies requests to Tekuti's service port. Although Tekuti can be self-sufficient, having Nginx in front always puts my mind at ease (and I suspect it's more than just a psychological effect).
I'll write a tutorial later on how to build a blog with Guile.