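Jekyll 2024-02-08T18:30:15+00:00 https://ya0guang.com/feed.xml The place where 摇光 stirs things up without thinking. Hack Music Photography Touhou Computer Science ya0guang ya0guang@protonmail.com Photography and Research 2023-12-23T13:00:07+00:00 2023-12-23T13:00:07+00:00 https://ya0guang.com/caprice/ResearchAndPhotography<p>I recently picked up my camera again and was startled by how absurdly similar photography and research are (at least in computer science). I strongly recommend that academic grunts give photography a try, or that photographers give research a try (?).</p>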
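<h2 id="从创新开始">Starting from Novelty</h2>
<p>The first thing to stress is novelty. The angles from which photography innovates are almost uncannily like those in research.</p>
<blockquote>
<p>The things we take for granted today, if placed beneath the wheel of history, would reveal their greatness at once.</p>
</blockquote>
<h3 id="选题和时代">Topic Selection and the Times</h3>
<p>Put simply, choosing a topic means finding something that truly nobody in history has photographed. Before Duchamp's urinal (<a href="https://en.wikipedia.org/wiki/Fountain_(Duchamp)">Fountain</a>) became art, nobody had imagined that such a thing could be art?!</p>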
<p><img src="https://upload.wikimedia.org/wikipedia/commons/thumb/f/fa/Fontaine_Duchamp.jpg/300px-Fontaine_Duchamp.jpg" alt="Fountain" /></p>
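<p>Photographers become photographers largely because they achieve some breakthrough in subject matter, and such breakthroughs are often tied to their era. The rise of industrial civilization and industrial ruins, for example, made Edward Burtynsky's work possible. In research, too, much work follows the times. The current boom in large language models (LLMs) has reignited discussion of AI's safety, ethics, and social impact. Changing times make some works possible and make others timeless.</p>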
<p><img src="https://images.squarespace-cdn.com/content/v1/5915c70e59cc6830f44a9d74/1499802260383-6GLAL2L7OSS2LF4ORW2S/URM_01_97_big.jpg" alt="Densified Oil Drums #1" /></p>
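<p>I really like that some security conferences have an award called the “Test of Time Award”. The point of the award is that a paper may not have drawn enough attention when it appeared, but as times change its value gets rediscovered. Photography has something similar. It does not come in the form of an award, but every photo taken in the present becomes history in the very next second. That gives many photographers' pictures not only aesthetic meaning but also value for historical research. In the last century, when China was very underdeveloped, few people in the country owned a camera or could afford expensive film. Photographers who visited China back then, such as Marc Riboud, left behind many precious images of the China and the Chinese people of that era. And when these photos, or papers, buried under history are unearthed again, they can give us new inspiration.</p>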
<p><img src="http://marcriboud.com/wp-content/uploads/2015/11/marc-riboud-chine-ancienne-014.jpg" alt="Harvest in Shaanxi province, 1957" /></p>
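<h3 id="技术驱动">Driven by Technology</h3>
<p>Clearly, unlike many traditional art forms whose tools and techniques have hardly changed, photographic technology has iterated very quickly over the past hundred-odd years. The brush, ink, paper, and inkstone of Chinese painting seem essentially the same as in ancient times, and the evolution of canvas and pigments in oil painting (and egg tempera) over several centuries can fairly be called sluggish.</p>
<p>But photography is different, and so is research.</p>
<p>When the <a href="https://zh.wikipedia.org/zh-hk/%E9%93%B6%E7%89%88%E6%91%84%E5%BD%B1%E6%B3%95">daguerreotype</a> had just been invented as a technique, it was hard to imagine photography becoming an art, and the technique severely limited what could be photographed. On one hand the equipment was enormous and complicated; on the other, exposures often took as long as half an hour. These limits raised the barrier to entry and confined subject matter to still life. This reminds me of early computers: people had to plug all kinds of cables into electronic and mechanical apparatus filling half a room just to complete one simple computation. Inevitably, those who could use photography or computers in the early days were few and far between.</p>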
<p><img src="https://upload.wikimedia.org/wikipedia/commons/thumb/6/6c/ENIAC_Penn1.jpg/1920px-ENIAC_Penn1.jpg" alt="ENIAC" /></p>
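<p>However, with the invention of black-and-white film, the technical barrier to photography dropped dramatically. The same happened with computers, which gradually entered the era of midrange computers and minicomputers. A lower technical barrier let more and more people get their hands on these instruments. But turning a photo into a work of art, like turning the use of a computer into a research result, is not that easy, which makes the similarity between research and photography even more obvious. Admittedly, every technical breakthrough makes some new things possible, and the people who pull off these “new things” inevitably leave their names on the technology. In photography, those names include William Eggleston, a pioneer of color photography, and Stephen Shore, who kept evolving his equipment. In computer science, they include Fernando Corbató, a pioneer of time-sharing operating systems, John Backus, a pioneer of programming languages, and Alan Kay, a pioneer of graphical interfaces.</p>
<h2 id="新的角度">A New Angle</h2>
<p>Many academic papers apply existing techniques to new problems, while many photographs look at things from a new angle. In photography, that angle can be high or low, far or near. A photo is a vehicle for the subjective selection and presentation of images of the objective world, so the photographer decides what appears in it. Joel Sternfeld has a body of work, <em>American Prospects</em>, that lists many “absurd” American scenes he saw. The angle determines the composition, and it also determines how the content is presented.</p>
<p><img src="https://cdn.shortpixel.ai/spai/q_lossless+w_888+to_webp+ret_img/independent-photo.com/wp-content/uploads/2020/02/Joel-Sternfeld-American-Prospects-8-1511x1200.jpg" alt="A photo from American Prospects" /></p>
<p>In research, angle also covers many things. Taking security as an example, one needs to analyze and trade off from the angles of usability, threat model, technique, and so on. Security has never been the goal of building a system; it is a by-product. But which security problems need solving is closely tied to the nature of the system itself. Something that looks unremarkable from one angle may turn out to be a big problem from another. Take side channel attacks: everyone knows they can leak information. What people (once) did not know is that as you type letters one by one into Google's search box, the sizes of Google's response packets leak what is being typed (into the search engine). That really is taking a known attack technique and getting creative with it.</p>
<h3 id="决定性瞬间">The Decisive Moment</h3>
<p>The angle mentioned here is not only spatial; it can also be temporal. The decisive moment is an idea proposed by Henri Cartier-Bresson, a heavyweight of documentary photography. For me, working in security research, what is the decisive moment?</p>
<p><img src="https://upload.wikimedia.org/wikipedia/zh/d/d1/Vj_day_kiss.jpg" alt="V-J Day kiss" /></p>
<p>It is finding a vulnerability or discovering a new attack. Code and systems are constantly changing, so catching the vulnerability in a particular version is probably the decisive moment of security research. This decisive moment may become a CVE. When a vulnerability surfaces, it often makes people think “a bug can show up here?” Yet after some time passes, the thought may instead become “a bug can show up here!” (lol)</p>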
<h2 id="taste">Taste</h2>
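<p>For photography as an art, taste matters a lot. Research also has a notion of “taste”.</p>
<p>When reviewing papers, different reviewers can have wildly different opinions about the same paper. Some focus on absolute novelty, some care about whether the work helps in the real world, and some look at whether the engineering is well done. The same goes for a photographer's work, the photo. Different people care about different things, such as composition, color, or theme.</p>
<p>Still, I have found that good photos and good papers have one thing in common: they “tell a story”. The way the story is told, however, is either strictly rational or deeply emotional. A paper tells a story that is lucid and clear and lets the reader think along with the author, while the story a photo tells is the viewer's own private reading of the work. How should one read these things? Viewers can probably just ask themselves: “What do I see?”</p>
<h3 id="糖水-engineering-effort--和学术垃圾">Sugar Water, Engineering Effort, and Academic Junk</h3>
<p>Finally, a few words about the less dazzling stuff. In photography there is a class of work people call “sugar water”. Such work usually has good composition, balanced color, excellent post-processing, and, sometimes, a pretty model. Yet it will never be the work of a “photographer” in the serious sense. Why? Probably because it is just industrial saccharin piled up from polished surfaces. It mostly has no soul.</p>
<p>Academic output has its own kind of junk: work with no insight. Unlike many harsh reviewers in the security community, I do not think engineering effort is useless. It may sound like something “anyone else could do”, but many projects unavoidably need a great deal of engineering to show that a simple idea works. Most of the time spent on a piece of research may well go into engineering. These things are at least a necessary evil.</p>
<h2 id="总结">Conclusion</h2>
<p>I've rambled a lot, yet I still get schooled by my advisor at every meeting. I need to think more, learn more, and <del>do more research</del> take more photos!</p>ya0guang ya0guang@protonmail.com GPUTEE: From Getting Started to Ascending to Heaven (Not Yet Ascended) 2023-08-18T10:10:07+00:00 2023-08-18T10:10:07+00:00 https://ya0guang.com/tech/GPUTEE<p>Setting up an H100 GPU TEE from scratch, and getting a taste of the pain of Artifact Evaluation along the way. This post mainly records the pitfalls; what is already in the documentation is not repeated. My knowledge of VMs and GPUs is basically blank, and after two days of tinkering I finally got stuck at a point that needs NVIDIA's support.</p>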
<h2 id="参考资料">References</h2>
<ul>
<li><a href="https://docs.nvidia.com/confidential-computing-deployment-guide.pdf">Confidential Computing Deployment Guide</a></li>
<li><a href="https://docs.nvidia.com/cuda/cuda-installation-guide-linux/contents.html">CUDA Installation Guide</a></li>
<li><a href="https://github.com/AMDESE/AMDSEV">GitHub: AMDSEV</a></li>
</ul>
<h2 id="system-setting">System Setting</h2>
<p>My server has one H100 and two EPYC 914 CPUs, housed in an ASUS ESC8000A-E12 chassis.</p>
<h2 id="bios">BIOS</h2>
<p>Most BIOSes now seem to support SEV (SEV-SNP); just enable or configure it as the documentation says. There is one small pitfall in the documentation, though: you also need to enable IOMMU.
This is what makes it possible to pass a PCIe device through to the VM.</p>
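<h2 id="host-side">Host Side</h2>
<p>This is basically a matter of following the documented steps to build a customized 5.19 kernel.
That step mostly goes smoothly, but one thing may be missing from the kernel after the build: <code class="language-plaintext highlighter-rouge">vfio-pci</code>.
It is not a big problem, but the documentation does not mention it, and I took many detours before figuring out how to deal with it.
The conclusion first: just run <code class="language-plaintext highlighter-rouge">sudo modprobe vfio-pci</code>.
I went through many online guides for setting up vfio. One classic approach is to add <code class="language-plaintext highlighter-rouge">amd_iommu=on iommu=pt</code> to <code class="language-plaintext highlighter-rouge">GRUB_CMDLINE_LINUX_DEFAULT</code> in <code class="language-plaintext highlighter-rouge">/etc/default/grub</code>.
That is not badly wrong in general. The nasty part is that SNP initialization does not allow the IOMMU to be set to <code class="language-plaintext highlighter-rouge">pt</code> (passthrough).
Make good use of dmesg to spot problems like this.</p>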
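<p>As a small illustration of the pitfall above, here is a minimal sketch that flags <code class="language-plaintext highlighter-rouge">iommu=pt</code> on a kernel command line before you reboot into an SNP host. The function name and the sample command line are made up for this example; on a real host the string would come from <code class="language-plaintext highlighter-rouge">/proc/cmdline</code>, and dmesg remains the authoritative place to confirm what SNP actually complained about.</p>

```python
def has_iommu_passthrough(cmdline: str) -> bool:
    """Return True if the kernel command line requests IOMMU passthrough mode."""
    # Kernel parameters are whitespace-separated tokens.
    return "iommu=pt" in cmdline.split()


if __name__ == "__main__":
    # Hypothetical sample; replace with open("/proc/cmdline").read() on a real host.
    sample = "BOOT_IMAGE=/boot/vmlinuz root=/dev/sda1 ro amd_iommu=on iommu=pt"
    if has_iommu_passthrough(sample):
        print("warning: iommu=pt is set; SNP initialization refused it in my setup")
```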
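<p>Another odd point: the documentation gives the H100's device ID as <code class="language-plaintext highlighter-rouge">2336</code>, but the one I actually got is <code class="language-plaintext highlighter-rouge">2331</code>.
This may be the difference between an engineering sample and a production unit, and is not worth digging into.
Just remember to use the correct device ID in the later steps.</p>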
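<p>To avoid copying the wrong ID from the documentation, you can read it off the machine. The sketch below parses one line of <code class="language-plaintext highlighter-rouge">lspci -nn</code> output; the sample line is an assumption about what such output looks like on my kind of setup, not a captured log, and the helper name is hypothetical.</p>

```python
import re

# `lspci -nn` prints IDs as [vendor:device], four hex digits each.
PCI_ID_RE = re.compile(r"\[([0-9a-fA-F]{4}):([0-9a-fA-F]{4})\]")


def parse_pci_ids(lspci_line):
    """Extract (vendor_id, device_id) from one `lspci -nn` line, or None."""
    match = PCI_ID_RE.search(lspci_line)
    if match is None:
        return None
    return match.group(1).lower(), match.group(2).lower()


if __name__ == "__main__":
    # Hypothetical sample line; 10de is NVIDIA's PCI vendor ID.
    sample = "41:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:2331] (rev a1)"
    print(parse_pci_ids(sample))
```

<p>The class code in the first pair of brackets (<code class="language-plaintext highlighter-rouge">[0302]</code> here) has no colon, so the pattern skips it and only matches the vendor/device pair.</p>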
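<h2 id="guest-side">Guest Side</h2>
<p>Here comes the nastiest part of the documentation.
It uses <code class="language-plaintext highlighter-rouge">Ubuntu 22.04.2 LTS</code> and includes a download link.
That link does not work at all, and this release is also hard to find on Ubuntu's website, because the current release (as of 2023.08.18) is <code class="language-plaintext highlighter-rouge">22.04.3 LTS</code>.
I figured that a difference of one minor-minor version should not matter much.
I was dead wrong: I installed the driver in the guest over and over and still could not see the GPU.
Then it suddenly dawned on me, and I redid everything with <code class="language-plaintext highlighter-rouge">22.04.2 LTS</code>.
Of course it still did not work. Perhaps the kernel was upgraded during installation: by the time I logged in the kernel was already <code class="language-plaintext highlighter-rouge">6.2</code>, whereas the guest kernel should be <code class="language-plaintext highlighter-rouge">5.19</code>.
After another round of tinkering, it finally worked.</p>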
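<p>Since the silent kernel upgrade cost me the most time, a quick version check inside the guest is worth having. This is only a sketch: the helper names are mine, and the expected version <code class="language-plaintext highlighter-rouge">(5, 19)</code> is simply the one that worked in the setup described above.</p>

```python
import platform


def kernel_major_minor(release):
    """Parse a `uname -r` style string such as '5.19.0-43-generic'."""
    major, minor = release.split(".")[:2]
    # Keep only the leading digits of the minor field (handles e.g. '19-rc1').
    digits = ""
    for ch in minor:
        if not ch.isdigit():
            break
        digits += ch
    return int(major), int(digits)


def guest_kernel_ok(release, expected=(5, 19)):
    """Return True if the running kernel matches the expected major.minor."""
    return kernel_major_minor(release) == expected


if __name__ == "__main__":
    current = platform.release()  # same string as `uname -r` on Linux
    print(current, "ok" if guest_kernel_ok(current) else "unexpected kernel version")
```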
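<p>My amazing junior labmate gave me two links that made it easy to download the kernel and the older Ubuntu release:</p>
<ul>
<li><a href="https://old-releases.ubuntu.com/releases/">Old Ubuntus</a></li>
<li><a href="https://github.com/pimlie/ubuntu-mainline-kernel.sh">One-command kernel installation</a></li>
</ul>
<h2 id="nvidia-firmware">NVIDIA Firmware</h2>
<p>Just when I thought everything was done, the firmware version turned out to be wrong.
Most likely the GPU was bought too early, and the documentation uses a newer version.
Waiting for NVIDIA's reply. GG.</p>
<h2 id="一些吐槽">Some Rants</h2>
<ol>
<li>NVIDIA's <a href="https://github.com/NVIDIA/nvtrust/blob/main/docs/deployment_guide.pdf">documentation in the repo</a> and the <a href="https://docs.nvidia.com/confidential-computing-deployment-guide.pdf">documentation on the official site</a> are not the same!!!!</li>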
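</ol>ya0guang ya0guang@protonmail.com Brief Notes on a Fractured 2022 2023-08-12T14:10:07+00:00 2023-08-12T14:10:07+00:00 https://ya0guang.com/caprice/2022Final<p>The things that have happened over the past few years get more absurd every year. When someone like me, who surfs the web all the time, no longer finds such absurdity absurd, something has already gone wrong with the person.
With my procrastination acting up as usual, I finally found time on a plane to finish this unfinished post (2023 is already more than half over, but as long as I set the date to 2022, probably nobody will notice it was written in 2023). This post is written as the thoughts come, with no organization or discipline whatsoever.</p>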
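<h2 id="割裂">Fracture</h2>
<p>When one group of people cannot understand another at all, to the point of hostility or even attacks, a fracture appears. As an overseas Chinese, I have keenly felt fractures of every kind in recent years, and have increasingly come to appreciate the saying: “Humans cannot understand one another.”
Whenever this happens, I feel the greatness of the Trisolaran civilization: they can convey exactly what they think to another individual without words. If people could understand each other's thoughts, there would not be so many strange suspicions between them.</p>
<h3 id="疫情">The Pandemic</h3>
<p>There is no escaping the pandemic as a topic. Those who supported lockdowns and those who supported reopening each had their own reasons and arguments. Internet algorithms, and perhaps some absurd speech censorship, keep the two sides inside information cocoons, and so people become more and more foolish. Getting information takes no effort today, but actively getting it from multiple channels is something most people, me included, often fail to do. Accepting the bombardment of push notifications from all kinds of apps seems to have turned me into a slave of certain algorithms. A slave obeys unconditionally and barely resists the work. While I browse, the people who control these algorithms earn advertising fees, I spend my time, and what I get may just be the same information over and over. I probably need to change how I get information.</p>
<p>By August 2023, people around the world seem to have begun to forget the pandemic, leaving only the people and families who were hurt waiting for their scars to heal, or perhaps they never will. The pandemic has not gone away; it has just become less important in the face of survival. A few strong players who hold resources might have monopolized even more wealth with another three or four years of lockdown, but for most people another three or four months would have made survival itself a problem. Are these people's sacrifices insignificant? Under the grand narrative, many of [those who can still make their voices heard] are numb to this. Perhaps they do not know what is happening in other corners of the same world, or perhaps they no longer care.</p>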
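<h2 id="衰退">Recession?</h2>
<p>Speaking of recession, its root may also be the pandemic. After their expansion, the big Internet companies did not get the prosperity they expected. People have to pay to heal the wounds of the pandemic years, so where would the money for the Internet come from? Having grown too big to steer, the companies then cut off limbs to survive. All my observations here are based on Internet companies and may well not apply to other fields. The entire research group where I interned was laid off, and companies large, medium, and small froze hiring or even started layoffs. It is hard to say what was wrong with these employees or with the companies; there were just some unlucky people and some forced decisions. When the storm comes, merely scraping by is already hard, and distant visions of the future are bubbles. People who cannot directly bring in revenue are left behind, and thousands upon thousands of them become the one number that dresses up the profit in an earnings report.</p>
<p>I cannot help wondering whether there is a safe harbor in the world. Maybe. I do not think there is a permanent one. Workplaces resemble means of transport in their accident and fatality rates. A company is like a car on the road: the accident rate is moderate, the harm varies, and most accidents are not that deadly. Schools and public institutions are like an airplane: the accident rate is very low, but when an accident happens there is little chance of getting lucky (writing this on a plane is not exactly auspicious, but I do not believe in such superstition). Government departments are perhaps like an aircraft carrier: if something goes wrong, the fallout is wide.</p>
<p>But then, who cares about the people riding e-bikes who do not even have a car?</p>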
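<h2 id="回家">Going Home</h2>
<p>I have wanted to go home for a long time. It is not that I cannot, but that I dare not. A few years ago I feared the pandemic; these years I fear the visa. Who is to blame? If I said it out loud I might not be able to go home, right? Those who get it, get it, haha. I like China very much, but I really dislike its “environment”. The government of the so-called most developed country also handles some issues in rather absurd ways, especially guns and drugs. Is there a paradise anywhere? I do not know. What I do hope is that we can make the problems a little fewer in the places that still have them.</p>
<h2 id="尾声">Epilogue</h2>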
<p>The post has ended; the journey has not. This year I gradually started thinking about questions I did not use to think about: what suits me and what do I like? What are (should) my relationships be like? Where should I be and what should I do in the future? Maybe this is what men think about once they get older :)</p>ya0guang ya0guang@protonmail.com Verifying Constant-time 2023-03-01T10:10:07+00:00 2023-03-01T10:10:07+00:00 https://ya0guang.com/tech/ConstantTime<p>Methods for verifying constant-time.</p>
<h2 id="intro">Intro</h2>
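<p>Constant-time is an important property of cryptographic implementations and plays a major role in defending against side channels. Especially in this post-Spectre era, microarchitectural side-channel attacks have become a serious threat. In recent years researchers in this area have churned out many constant-time papers, a lot of them related to microarchitectural side channels, Almeida (“Brother A”) and Barthe (“Brother B”) above all. I also noticed that many of the big names in this area are European, perhaps because Europeans are good at math and fond of all kinds of verification.</p>
<p>Broadly speaking, constant-time means that running time is independent of secrets. Differences in running time mainly come from:</p>
<ul>
<li>Control-flow dependency: for example, a secret appears in an <code class="language-plaintext highlighter-rouge">if</code> condition</li>
<li>Latency of instructions: some instructions do not take a fixed time to run; their latency may depend on the operands (e.g., <code class="language-plaintext highlighter-rouge">idiv</code>)</li>
<li>Data/code access dependency: when data is accessed based on a secret, a memory access obviously takes longer than a register or cache access</li>
<li>Microarchitectural: even when constant-time holds under the usual model, speculative execution may still leak secrets</li>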
</ul>
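<p>The first source in the list above, control-flow dependency, is the easiest to show in code. The sketch below contrasts an early-exit comparison, whose running time depends on where the first mismatch is, with a version that always touches every byte. The function names are mine. Python itself gives no timing guarantees, so treat this as an illustration of the idea rather than production crypto; for real code, the standard library's <code class="language-plaintext highlighter-rouge">hmac.compare_digest</code> is the intended tool.</p>

```python
import hmac


def leaky_equal(a: bytes, b: bytes) -> bool:
    """Early-exit comparison: the loop length depends on the secret data."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:          # secret-dependent branch
            return False
    return True


def accumulate_equal(a: bytes, b: bytes) -> bool:
    """Branch-free over the data: always processes every byte."""
    if len(a) != len(b):    # length is treated as public here
        return False
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y       # nonzero iff some byte differs
    return diff == 0


if __name__ == "__main__":
    secret, guess = b"s3cret-tag", b"s3cret-tab"
    print(leaky_equal(secret, guess), accumulate_equal(secret, guess))
    print(hmac.compare_digest(secret, guess))
```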
<h2 id="summary">Summary</h2>
<table>
<thead>
<tr>
<th>Paper</th>
<th>Abstract Language</th>
<th>IR</th>
<th>Binary</th>
<th>Notes</th>
<th>Code</th>
<th>Comments</th>
</tr>
</thead>
<tbody>
<tr>
<td><a href="#PLDI'20">PLDI’20</a></td>
<td>✅</td>
<td>✅</td>
<td>-</td>
<td><a href="https://ya0guang.com/tiddly/output/notebook.html#Constant-Time%20Foundations%20for%20the%20New%20Spectre%20Era">here</a></td>
<td><a href="https://github.com/PLSysSec/pitchfork-angr">link</a></td>
<td>Symbolic Execution</td>
</tr>
<tr>
<td><a href="#PLDI'19">PLDI’19</a></td>
<td>✅ DSL</td>
<td>✅</td>
<td>-</td>
<td><a href="https://ya0guang.com/tiddly/output/notebook.html#FaCT%3A%20A%20Flexible%2C%20Constant-Time%20Programming%20Language">here</a></td>
<td><a href="https://github.com/PLSysSec/FaCT">link</a></td>
<td>z3 + ct-verif</td>
</tr>
</tbody>
</table>
<h2 id="a-dilemma-in-verification">A Dilemma in Verification</h2>
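<p>This is an especially hard problem for constant-time verification. There is a gap between source code and assembly code. As a result, even when the constant-time property verifies on the source code, the corresponding assembly code may not satisfy it. Compilers generally guarantee semantic consistency when optimizing, but they do not guarantee constant-time.
Verifying at the source level offers richer context and more information, but results verified on the binary are more trustworthy.</p>
<p>This makes verification difficult. The mainstream approach today is to verify on the IR: it retains much of the semantic information while also sitting fairly close to low-level assembly in the compilation pipeline.</p>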
<h2 id="abstract-language-model">Abstract Language Model</h2>
<p>Work at this level mainly solves problems in theory. The authors first define an abstract language model and then define the possible side-channel leaks on that model, e.g., through small-step semantics (<a href="#PLDI'20">PLDI’20</a>). The model needs to help reason about constant-time for real-world assembly/IR.</p>
<h2 id="at-ir-level">At IR Level</h2>
<p>Most work does verification at the IR level, especially LLVM IR. Many analysis tools are available at this level, and it is fairly close to low-level assembly code. There may still be a gap between constant-time at the IR and constant-time in the assembly.</p>
<h2 id="executable-binary">Executable Binary</h2>
<p>Work at this level is generally not easy to formalize. Most of it seems to use approaches close to testing (<a href="#DATE'17">DATE’17</a>, <a href="#SP'20">SP’20</a>).</p>
<h2 id="related-work">Related Work</h2>
<h3 id="patching">Patching</h3>
<p>Rewrite the code/binary to make it constant-time</p>
<h3 id="implementation">Implementation</h3>
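<p>Industry has quite a few implementations. How can constant-time be achieved while keeping performance high?</p>
<h2 id="some-thoughts">Some Thoughts</h2>
<p>How can constant-time be achieved in other general-purpose languages (not C/C++)? Rust can now guarantee safety, which also matters a lot for applications such as crypto. How do you get constant-time in Rust? I have seen the <a href="https://github.com/dalek-cryptography/subtle">subtle</a> crate, but I do not know whether it can deliver constant time in a language that keeps evolving like Rust. Two questions I can think of:</p>
<ul>
<li>When emitting assembly code, rustc does not actually have that much control over the LLVM backend. How (or whether) can we make sure that lowering LLVM IR to assembly does not violate constant-time? This question also applies to earlier work that verifies constant-time using LLVM.</li>
<li>Rust is still evolving (e.g., MIR). Is there a reasonably effective way to verify constant-time (at a high level) on such a system? <a href="https://github.com/facebookexperimental/MIRAI/blob/main/documentation/TagAnalysis.md">MIRAI</a> does provide constant-time verification, but in our experience its verification is quite weak.</li>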
</ul>
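<p>For readers who have not seen this style of code, the core trick behind constant-time libraries is to replace secret-dependent branches with arithmetic on masks. Below is a minimal sketch of a branchless select on 32-bit values. It is written in Python only to show the bit manipulation; the name <code class="language-plaintext highlighter-rouge">ct_select</code> is hypothetical and this is not the API of any particular crate. Whether a compiler keeps such code branch-free is exactly the open question raised above.</p>

```python
MASK32 = 0xFFFFFFFF


def ct_select(choice: int, a: int, b: int) -> int:
    """Return a if choice == 1, b if choice == 0, without branching on choice."""
    mask = (-choice) & MASK32           # 1 -> 0xFFFFFFFF, 0 -> 0x00000000
    return (a & mask) | (b & ~mask & MASK32)


if __name__ == "__main__":
    print(hex(ct_select(1, 0xAAAA, 0x5555)))
    print(hex(ct_select(0, 0xAAAA, 0x5555)))
```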
<h2 id="reference">Reference</h2>
<ol>
<li><a name="PLDI'20" href="https://dl.acm.org/doi/pdf/10.1145/3385412.3385970">PLDI’20</a> Sunjay Cauligi, Craig Disselkoen, Klaus v. Gleissenthall, Dean Tullsen, Deian Stefan, Tamara Rezk, and Gilles Barthe. 2020. Constant-time foundations for the new spectre era. In Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2020). Association for Computing Machinery, New York, NY, USA, 913–926. https://doi.org/10.1145/3385412.3385970</li>
<li><a name="POPL'19" href="https://dl.acm.org/doi/pdf/10.1145/3371075">POPL’20</a> Gilles Barthe, Sandrine Blazy, Benjamin Grégoire, Rémi Hutin, Vincent Laporte, David Pichardie, and Alix Trieu. 2019. Formal verification of a constant-time preserving C compiler. Proc. ACM Program. Lang. 4, POPL, Article 7 (January 2020), 30 pages. https://doi.org/10.1145/3371075</li>
<li><a name="Security'16" href="https://www.usenix.org/conference/usenixsecurity16/technical-sessions/presentation/almeida">Security’16</a> Almeida, Jose Bacelar, et al. “Verifying {Constant-Time} Implementations.” 25th USENIX Security Symposium (USENIX Security 16). 2016.</li>
<li><a name="DATE'17" href="https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7927267">DATE’17</a> Reparaz, Oscar, Josep Balasch, and Ingrid Verbauwhede. “Dude, is my code constant time?.” Design, Automation & Test in Europe Conference & Exhibition (DATE), 2017. IEEE, 2017.</li>
<li><a name="EuroSP'18" href="https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8406587"> EuroSP’18</a> Simon, Laurent, David Chisnall, and Ross Anderson. “What you get is what you C: Controlling side effects in mainstream C compilers.” 2018 IEEE European Symposium on Security and Privacy (EuroS&P). IEEE, 2018.</li>
<li><a name="CCS'22" href="https://dl.acm.org/doi/pdf/10.1145/3548606.3560689"> CCS’22</a> Ammanaghatta Shivakumar, Basavesh, et al. “Enforcing Fine-grained Constant-time Policies.” Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security. 2022.</li>
<li><a name="INDOCRYPT'20" href="http://repositorium.uminho.pt/bitstream/1822/71547/1/Certified%20Compilation%20for%20Cryptography.pdf"> INDOCRYPT’20</a> Almeida, José Bacelar, et al. “Certified compilation for cryptography: Extended x86 instructions and constant-time verification.” Progress in Cryptology–INDOCRYPT 2020: 21st International Conference on Cryptology in India, Bangalore, India, December 13–16, 2020, Proceedings 21. Springer International Publishing, 2020.</li>
<li><a name="Security'19" href="https://www.usenix.org/system/files/sec19-gleissenthall.pdf"> Security’19</a> Gleissenthall, Klaus V., et al. “IODINE: Verifying constant-time execution of hardware.” 28th USENIX Security Symposium (USENIX Security 19). 2019.</li>
<li><a name="PLDI'19" href="https://dl.acm.org/doi/pdf/10.1145/3314221.3314605"> PLDI’19</a> Cauligi, Sunjay, et al. “FaCT: a DSL for timing-sensitive computation.” Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation. 2019.</li>
<li><a name="SP'20" href="https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9152766"> SP’20 </a> Daniel, Lesly-Ann, Sébastien Bardin, and Tamara Rezk. “Binsec/rel: Efficient relational symbolic execution for constant-time at binary-level.” 2020 IEEE Symposium on Security and Privacy (SP). IEEE, 2020.</li>
</ol>
<h2 id="less-related-papers">Less Related Papers</h2>
<ol>
<li><a href="https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9833713&tag=1">“They’re not that hard to mitigate”: What Cryptographic Library Developers Think About Timing Attacks</a></li>
<li><a href="https://dl.acm.org/doi/pdf/10.1145/3133956.3134078">Jasmin: High-Assurance and High-Speed Cryptography</a></li>
<li><a href="https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9519449">SoK: Computer-Aided Cryptography</a></li>
<li><a href="https://dl.acm.org/doi/pdf/10.1145/3213846.3213851">Eliminating timing side-channel leaks using program repair</a></li>
<li><a href="https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9370305">Memory-Safe Elimination of Side Channels</a></li>
<li><a href="https://dl.acm.org/doi/pdf/10.1145/3460120.3484583">Constantine: Automatic Side- Channel Resistance Using Efficient Control and Data Flow Linearization</a></li>
</ol>
<h2 id="other-resources">Other Resources</h2>
<ol>
<li><a name="Agner" href="https://agner.org/optimize/"> Agner Software optimization resources</a> There is a table of instruction latency and throughput: <a href="https://agner.org/optimize/instruction_tables.pdf">pdf</a>.</li>
<li><a href="https://stackoverflow.com/questions/53401547/is-clmul-constant-time">Is CLMUL constant time?</a></li>
<li><a href="https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/secure-coding/mitigate-timing-side-channel-crypto-implementation.html">Intel: Guidelines for Mitigating Timing Side Channels Against Cryptographic Implementations</a></li>
<li><a href="https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html">Intel® 64 and IA-32 Architectures Software Developer Manuals</a></li>
</ol>ya0guang ya0guang@protonmail.com Academia-Related, Check Monthly! 2022-10-09T10:10:07+00:00 2022-10-09T10:10:07+00:00 https://ya0guang.com/meta/AcademicFollowing<h1 id="academic-conferences">Academic Conferences</h1>
<p>Mainly focus on Security, System & PL.</p>
<h2 id="security">Security</h2>
<h3 id="top-tier">Top Tier</h3>
<ul>
<li><a href="https://www.ndss-symposium.org">NDSS, Mar.</a></li>
<li><a href="https://www.ieee-security.org/TC/SP2023/">SP, May</a></li>
<li><a href="https://www.usenix.org/conference/usenixsecurity22">Security, Aug.</a></li>
<li><a href="https://www.sigsac.org/ccs/CCS2022/">CCS, Nov.</a></li>
</ul>
<h3 id="great-ones">Great Ones</h3>
<ul>
<li><a href="https://www.ieee-security.org/TC/EuroSP2023/">Euro SP, July</a></li>
<li><a href="https://asiaccs2023.org">Asia CCS, June</a></li>
<li><a href="https://asiaccs2023.org">DSN, June</a></li>
<li><a href="https://www.acsac.org">ACSAC, Dec.</a></li>
<li><a href="https://esorics2022.compute.dtu.dk">ESORICS, Sep.</a></li>
<li><a href="https://raid2022.cs.ucy.ac.cy">RAID, Oct.</a></li>
</ul>
<h2 id="system">System</h2>
<ul>
<li><a href="https://sosp2021.mpi-sws.org">SOSP, Oct.</a></li>
<li><a href="https://www.usenix.org/conference/osdi22">OSDI, July</a></li>
<li><a href="https://www.usenix.org/conference/atc22">ATC, July</a></li>
<li><a href="https://2022.eurosys.org">EuroSys, Apr.</a></li>
<li><a href="https://asplos-conference.org">ASPLOS</a></li>
</ul>
<h2 id="pl">PL</h2>
<ul>
<li><a href="https://pldi22.sigplan.org">PLDI, June</a></li>
<li><a href="https://popl22.sigplan.org">POPL, Jan.</a></li>
<li><a href="https://www.sigplan.org/Conferences/OOPSLA/">OOPSLA, Oct.</a></li>
</ul>
<h2 id="others">Others</h2>
<ul>
<li><a href="https://www.usenix.org/conference/nsdi22">NSDI, Apr.</a></li>
</ul>
<h1 id="researcherlab">Researcher/Lab</h1>
<h2 id="security-1">Security</h2>
<ul>
<li><a href="https://davejingtian.org/">Dave Tian</a> Purdue</li>
<li><a href="https://reyammer.io/">Yanick Fratantonio</a>: a big name who left academia</li>
<li><a href="https://cseweb.ucsd.edu/~dstefan/">Deian Stefan</a> UCSD, works on PL + Security, mainly WASM & Verification</li>
<li><a href="https://teecertlabs.com/">TEECert</a></li>
<li><a href="https://www.andrew.cmu.edu/user/bparno/">Bryan Parno</a></li>
<li><a href="http://people.eecs.berkeley.edu/~sseshia/">Sanjit A. Seshia</a></li>
<li><a href="https://hiroki-chen.notion.site/">Haobin</a></li>
<li><a href="https://binsec.github.io/">BINSEC</a> Open-source binary security tool from academia</li>
</ul>
<h2 id="verification">Verification</h2>
<ul>
<li><a href="https://gbarthe.github.io">Gilles Barthe</a></li>
<li><a href="https://www.inesctec.pt/en/people/jose-bacelar-almeida#short_bio">José Bacelar Almeida</a></li>
</ul>
<h2 id="pl-1">PL</h2>
<ul>
<li><a href="https://lawrencecpaulson.github.io/">Machine Logic</a></li>
<li><a href="https://people.mpi-sws.org/~rupak/">Rupak Majumdar</a></li>
<li><a href="https://www.cs.princeton.edu/~appel/">Appel</a></li>
</ul>ya0guang ya0guang@protonmail.com SMT/SAT from 0 to 0.1 2022-06-10T22:52:07+00:00 2022-06-10T22:52:07+00:00 https://ya0guang.com/tech/SMTSAT101<p>I have been learning a bit about SMT/SAT solvers recently; here is a brief summary.</p>
<h1 id="概念">Concepts</h1>
<ul>
<li>SMT: satisfiability modulo theories; <a href="https://en.wikipedia.org/wiki/Satisfiability_modulo_theories">wiki</a></li>
<li>SAT: (boolean) satisfiability; <a href="https://en.wikipedia.org/wiki/SAT_solver">wiki</a></li>
</ul>
<p>SMT can be seen as a generalization of SAT. SAT is basically limited to deciding the satisfiability of a boolean expression, that is, whether there is an assignment that makes the expression true (SAT).
If no such assignment exists, the expression cannot be satisfied (UNSAT).</p>
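<p>To make SAT and UNSAT concrete, here is a deliberately naive sketch that tries every assignment of a formula in conjunctive normal form. The clause encoding (lists of nonzero integers, negative meaning negation) follows the common DIMACS convention; the function name is mine. Real solvers are far smarter than this exponential loop.</p>

```python
from itertools import product


def brute_force_sat(clauses, num_vars):
    """Return a satisfying assignment {var: bool} for a CNF formula, or None.

    clauses: list of clauses; each clause is a list of nonzero ints,
             where k means variable k and -k means NOT variable k.
    """
    for bits in product([False, True], repeat=num_vars):
        # A clause holds if at least one of its literals is true.
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return dict(enumerate(bits, start=1))
    return None  # UNSAT: no assignment works


if __name__ == "__main__":
    # (x1 OR x2) AND (NOT x1 OR x2) AND (NOT x2 OR x1)
    print(brute_force_sat([[1, 2], [-1, 2], [-2, 1]], num_vars=2))
    # x1 AND NOT x1
    print(brute_force_sat([[1], [-1]], num_vars=1))
```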
<h1 id="software-verification">Software Verification</h1>
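<p>For this section, the chapter of the same name in the <em>Handbook of Satisfiability</em> is the main reference.</p>
<p>In general a program can be abstracted as State + State Transition. The former is a set of mappings (e.g., registers and memories); the latter is a relation between two states.</p>
<p>Given the program's current state $s$ and a desired property $p$, we feed $s$ and $\neg p$ to the SMT solver, that is, we ask whether $s \land \neg p$ is satisfiable. If the result is UNSAT, no state described by $s$ violates $p$, so the property holds.
Conversely, if the result is SAT, there is some case in which $p$ is violated, and the solver can usually give a concrete counterexample.</p>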
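<p>The $s \land \neg p$ query can be mimicked without a solver on a tiny domain. The sketch below enumerates 8-bit values instead of calling an SMT solver, so it is only a stand-in for the real thing; the function and variable names are made up for this example.</p>

```python
def find_counterexample(state, prop, domain):
    """Search for x with state(x) AND NOT prop(x); None plays the role of UNSAT."""
    for x in domain:
        if state(x) and not prop(x):
            return x        # "SAT": a concrete counterexample to the property
    return None             # "UNSAT": prop holds in every state allowed by `state`


def increment_grows(x):
    """Property p: adding 1 to an 8-bit value makes it larger."""
    return ((x + 1) & 0xFF) > x


if __name__ == "__main__":
    eight_bit = range(256)
    # With the state constraint x < 128, the property holds.
    print(find_counterexample(lambda x: x < 128, increment_grows, eight_bit))
    # Without it, the overflow at 255 is reported as a counterexample.
    print(find_counterexample(lambda x: True, increment_grows, eight_bit))
```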
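<p>Behind an SMT solver there is actually SAT. In a computer, integers are represented as bit vectors (e.g., a 32-bit integer consists of 32 bits), and computation on them can be “represented” as bit-wise logic and arithmetic.
This “representation” works on the same principle as implementing binary arithmetic with AND, OR, and NOT gates in digital circuits. Once the states have been turned into boolean values, the SMT solver hands them to a SAT solver.</p>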
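<p>The reduction from integer arithmetic to boolean logic is often called bit-blasting. As a rough sketch of the idea, the code below adds two 8-bit integers using only boolean operations on individual bits, in the way a ripple-carry adder circuit would. The helper names are mine, and a real solver would emit these gates as clauses rather than evaluate them.</p>

```python
def to_bits(value, width):
    """Little-endian list of booleans for the low `width` bits of value."""
    return [bool((value >> i) & 1) for i in range(width)]


def from_bits(bits):
    """Inverse of to_bits."""
    return sum(1 << i for i, bit in enumerate(bits) if bit)


def full_adder(a, b, carry_in):
    """One-bit adder built from XOR, AND, and OR."""
    a_xor_b = a != b
    total = a_xor_b != carry_in
    carry_out = (a and b) or (carry_in and a_xor_b)
    return total, carry_out


def add_bitvectors(x, y, width=8):
    """Ripple-carry addition modulo 2**width using only boolean gates."""
    carry = False
    out = []
    for a, b in zip(to_bits(x, width), to_bits(y, width)):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return from_bits(out)


if __name__ == "__main__":
    print(add_bitvectors(200, 100))  # wraps around like an 8-bit register
```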
<h1 id="soundness">Soundness</h1>
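<p>However, SMT solvers sometimes give wrong answers, which has given rise to quite a lot of work on SMT fuzzing. Some work also runs several SMT solvers on the same problem and looks for disagreements among their results.</p>
<p>For more specific literature, see <a href="https://ya0guang.com/tiddly/output/notebook.html#SAT%20%26%20SMT">my note</a>.</p>
<p>People have proposed many solutions to this problem. One direction is to design formally verified SAT/SMT solvers; the other is to make the proofs checkable.
The two can also be combined. Related tools and papers can be found in the note above as well.</p>ya0guang ya0guang@protonmail.com Symbols and Interpretation 2022-03-06T22:52:07+00:00 2022-03-06T22:52:07+00:00 https://ya0guang.com/caprice/SymbolAndInterpretation<p>After nearly a month of being tormented by <a href="https://coq.inria.fr/">Coq</a>, I finally finished Logical Foundations. As the first volume of the <a href="https://softwarefoundations.cis.upenn.edu/">Software Foundations</a> series, it really has some substance. Here I only briefly discuss the deeper understanding I gained of two concepts, symbols and interpretation. The main text barely touches anything related to programming itself.</p>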
<h1 id="preface">Preface</h1>
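<p>I do not intend to summarize the knowledge in the book. Instead I want to share some understanding and thoughts about two important concepts, symbols and interpretation, that came up while reading.</p>
<p>First, a short introduction to Coq: put simply, it is a functional programming language. The strange thing is that it can also be used as a proof assistant. Some projects required me to use Coq (and I was also fairly interested in it), so I started with the document <a href="https://cel.archives-ouvertes.fr/inria-00001173v6/document">Coq in a Hurry</a>. It was of no use at all: it could not help me learn Coq in a hurry. So I set off on the <del>broad highway</del> narrow winding trail of Software Foundations.</p>
<p>When I first saw this thing I just laughed: software foundations? How basic can it be? The first book is even called Logical Foundations. Hilarious, it is just an introductory book, surely I can blast through it in minutes? Later I came to understand everything: in PL, anything that claims to be foundational is a scam. At Penn this course is numbered CIS 500, which means it is for graduate students and top undergraduates. Of course, by the time I had been lured on board and discovered all this, it was too late.</p>
<p>Anyway, it is still a very good book, because it is very much alive. It seems to be updated before each offering of the course, which means its contents still work on the latest stable Coq. That is unlike certain zombie textbooks which, even when published in 2021, insist on using the authors' own software whose support ended N years ago (and which only runs on Windows). Yes, I am talking about Thinking Programs! The book itself is quite good, but the code in it is far too dated: it is buried-in-the-ground code for heirloom software, and extremely unfriendly to the command line and to Mac.</p>
<p>Thanks to the authors of SF for providing public access so that humanity can learn from it freely and at no cost.</p>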
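<h1 id="符号">Symbols</h1>
<p>As a small child I often pondered one question: what exactly is “我” (“I”)? Nothing philosophical is involved here. Replacing “我” in the question with any other word does not change its nature: what exactly is “红色” (“red”)? Or what exactly is “dASYg”?</p>
<p>Clever kids may have noticed that the [thing] inside the quotation marks is purely a symbol, and it may carry no meaning at all. But then the question gets strange: why can the symbol “我” express the meaning of 【myself】? Who first gave “我” the meaning of 【I】? Does everyone who speaks Chinese understand that “我” stands for 【I】? These big questions shook my young mind, and it was only when I learned that there is a special kind of human called color-blind that I gradually began to understand.</p>
<p>In their eyes (more specifically, those of the red-green color-blind), “红色” does not mean the 【red】 that I see. And for those who cannot read Chinese characters, “红色” cannot even express the meaning of 【red】. It is at most a symbol, and they cannot even tell whether it is one symbol or two. In fact, symbols may simply be different in different people's eyes. All the text here is likewise nothing more than a pile of stacked-up symbols, and I am rather glad that “you”, who are observing this pile of symbols, can guess what I am trying to say.</p>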
<h2 id="formally">Formally</h2>
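<p>First, a wiki <a href="https://en.wikipedia.org/wiki/Symbol_(formal)">link</a> on formal symbols. In Chinese, “formal” is translated as 形式化. I never quite understood what the word really meant, but I am gradually getting it now. A formal symbol is a symbol stripped of meaning; it depends only on what the symbol looks like. It can basically be seen as the opposite of the idiom 【得意忘形】 (read literally: grasp the meaning and forget the form).</p>
<p>In the pursuit of formalization, many textbooks bring up language. This is also the most interesting part: linguistics, computer science, mathematics, and even philosophy actually meet in this odd place! If formal symbols are to form a sentence (combining symbols into more complex symbols), they must rely on a grammar (<a href="https://en.wikipedia.org/wiki/Formal_grammar">formal grammar</a>). This grammar is no different from the grammar we learned in foreign-language classes when we were young: it is a set of rewriting rules for symbols. In English, for example, in the simplest subject-verb-object sentence, each of the three words (symbols) subject, verb, and object can be rewritten into another symbol that meets the requirements, producing another concrete sentence.</p>
<p>Mathematical logic also depends heavily on symbols. Logical derivation can be seen as a relationship of semantics (meaning) on the one hand, and on the other as pure symbolic calculation based on rules (sequent calculus). Push this further and it touches philosophy. There is a branch called 【<a href="https://zh.wikipedia.org/wiki/%E5%BD%A2%E4%B8%8A%E5%AD%B8">形而上学</a>】 (metaphysics), which has quite a bit to do with symbols and logic. Since I do not understand it, I will not talk nonsense here.
I do think the Chinese term is ambiguous, though. How should one read “而上”? As 【built on top of XXX】, or as 【another realm above XXX】? The English word is Metaphysics, and I lean toward the latter reading.</p>
<p>I want to close this section with the example “1 + 1 = 2”. Ever since second grade I have not understood why this equation holds. But before thinking about why one plus one equals two, perhaps the question more worth attention is why “1” stands for 【one】.</p>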
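<h1 id="解释">Interpretation</h1>
<p>Put simply, interpretation is assigning meaning to symbols (<a href="https://zh.wikipedia.org/wiki/%E8%A7%A3%E9%87%8B_(%E9%82%8F%E8%BC%AF)">wiki</a>). When I ask why “XXX” stands for the meaning 【XXX】, perhaps the most reasonable answer is: we assigned the meaning 【XXX】 to the symbol “XXX”. Back to the example above: that “1” denotes 【one】 and “2” denotes 【two】 is, at bottom, people representing meanings with symbols for convenience. The same meaning may be represented by different symbols, and the same symbol may mean different things under different interpretations. Roman and Arabic numerals use different symbols, yet numbers of equal value have the same meaning; the symbol “艹” means different things in Chinese and in Japanese (笑), that is, it has two interpretations. It is worth noting that the (“笑”) in parentheses just now does not mean I actually laughed; it indicates that “艹” can mean 【laugh】 in Japanese. Ambiguity really is everywhere.</p>
<blockquote>
<p>Humans cannot understand one another.</p>
</blockquote>
<p>In formal logic, interpretation is not required, but humans perhaps do need some interpretation. I have come to feel this famous line more deeply. How do you know that the “thing” someone expresses with symbols is the same “thing” as the interpretation of those symbols in their head? Language often seems a little feeble. It cannot necessarily express a person's real thoughts precisely, nor can it guarantee that the person receiving the message will understand it as it was meant. At this point I suddenly grasped the superiority of the Trisolarans as a species: they can sense the thoughts in each other's minds, and this expression of consciousness, freed from language, conveys meaning more precisely. I also understood why the Tower of Babel could not be built: if even the languages do not match, mutual understanding is out of the question.</p>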
<h2 id="愚蠢的机器">愚蠢的机器</h2>
<p>在很久之前接触到“图形计算器”这个单词的时候,还觉得它是一个十分酷炫的概念。其中有一个功能是“符号计算”。这个词语对彼时还是人类幼崽我的心灵造成了更大的冲击:啥是符号计算?我久久不能理解它,并需要一个【解释】。难道加减乘除不是符号?这里的符号是啥玩意呢?后来我才逐渐理解一切,简单的说它与数值计算相对应,可以在计算中掺入符号(e.g., X, Y, Z)。掺入了符号的多项式化简就需要使用这种能力。</p>
<blockquote>
<p>电脑很笨的,只认识0和1。</p>
</blockquote>
<p>这是在年轻时某人对我说过的话。起初我并不能理解这句话,因为电脑上能显示出各种语言的文字和段落,它能做各种计算,甚至“理解”我需要做什么。不过现在想想或许这句话也有道理,这些文本和计算并非被机器所理解了,而是作为一个个符号被展示(print)出来而已。真正解释这些符号背后的意义的并不是计算机,而是人类,而它们所真正【理解】的东西就只有0和1了。不过话又说回来,既然人类无法相互理解,人类就能理解计算机吗?人类为什么就能知道计算机无法理解这些符号的意义呢?或许只有真正设计出芯片的人们更有发言权吧,而我们只知道这个黑箱子会产生可以被我们所【理解】的内容。</p>
<p>在计算机中,神经网络的局限性多被认为是【不可解释】。虽然有一些解释它们的方法被开发出来,似乎这仍然是困扰学者的一个大问题。人们既然无法理解某一坨模型到底在想什么,那么凭什么信任它能够产生足够正确的结果呢?我们在喂给了机器大量的数据(符号)之后,机器也能够习得这些符号之间的推演规则。然而我们并不能理解机器为什么会输出某些奇奇怪怪的符号,甚至于机器本身也不能理解这些符号的意义。所以,机器或许也需要一个解释吧。待到机器真正能够【解释】的那天,或许【计算机】就不只是“计算机”了。</p>
<h1 id="后记">Afterword</h1>
<p>Then again, why do we believe that we ourselves 【understand】 symbols? Could it be that the human brain merely replaces these symbols with more complex symbols (a low-level representation), and that we only act after processing those symbols according to specific rules? (lol)</p>ya0guang ya0guang@protonmail.com Hand-Rolling a Compiler from Scratch 2021-12-12T01:00:07+00:00 2021-12-12T01:00:07+00:00 https://ya0guang.com/tech/CompilerFromScratch<p>I had an itch this year and signed up for an Implementation of PL course, which requires building a compiler by hand. Here are some notes on the process. I feel this may be one of the most hardcore CS courses at my school.</p>
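<p>The instructor actually offered a choice between Racket and Python, but I really could not get used to Racket's endless parentheses, so I chose Python. Luckily, the latest Python 3.10 introduced the new <code class="language-plaintext highlighter-rouge">match case</code> feature, which cut the amount of code dramatically! It is so good!!! (I suspect implementing the compiler in Rust might be even more fun.)</p>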
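<p>The source code currently lives in <a href="https://github.com/ya0guang/ToyPythonCompiler">this repo</a>. Each milestone is tagged. If anyone is interested, go have a look at the code! I think my code quality is passable?</p>
<p><del>Leaving this post open for now and filling it in slowly!</del> It is finals review time, so I can start filling it in.</p>
<p>The course focuses mainly on the compilation passes. The processing before compilation, such as generating the AST, is not covered, because automated tools can generate the AST from a grammar definition.</p>
<p>Started: 2021-10-16</p>
<p>Last updated: 2021-12-12</p>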
<h1 id="overview">Overview</h1>
<blockquote>
<p>Rome was not built in a day. (Anonymous)</p>
</blockquote>
<blockquote>
<p>Reinventing the wheel is just so much fun. (Me)</p>
</blockquote>
<h2 id="features">Features</h2>
<p>The saying about Rome applies equally to programming languages: we need to add new features to the language one step at a time.</p>
<ol>
<li>$L_{Int}$: integers, arithmetic on them, and output (<code class="language-plaintext highlighter-rouge">print</code>)</li>
<li>$L_{Var}$: adds variables and assignment, and accepts input (<code class="language-plaintext highlighter-rouge">input_int</code>)</li>
<li>$L_{If}$: adds <code class="language-plaintext highlighter-rouge">bool</code> and the <code class="language-plaintext highlighter-rouge">if</code> statement</li>
<li>$L_{While}$: adds the <code class="language-plaintext highlighter-rouge">while</code> loop. <code class="language-plaintext highlighter-rouge">for</code> is not covered, but it can be implemented in terms of <code class="language-plaintext highlighter-rouge">while</code></li>
<li>$L_{Tup}$: adds <code class="language-plaintext highlighter-rouge">tuple</code>; <code class="language-plaintext highlighter-rouge">list</code> can be implemented in terms of it</li>
<li>$L_{Fun}$: adds <code class="language-plaintext highlighter-rouge">function</code>, with support for tail-call optimization</li>
<li>$L_{Lambda}$: adds <code class="language-plaintext highlighter-rouge">Lambda</code>, with support for closures</li>
<li>$L_{Dyn}$: supports dynamic typing</li>
</ol>
<p>The syntax supported in the end is as follows:</p>
<h3 id="concrete-syntax">Concrete Syntax</h3>
<p><img src="/assets/images/Compiler/LambdaAST.png" alt="ConcreteSyntax" /></p>
<h3 id="abstract-syntax">Abstract Syntax</h3>
<p><img src="/assets/images/Compiler/AbstractSyntax.png" alt="AbstractSyntax" /></p>
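<p>It is only a small subset of Python, but then again this thing is just a course assignment.</p>
<h1 id="passes">Passes</h1>
<p>As we all learned in second grade, a compiler is made up of a bunch of passes. My implementation does no optimization above the level of a single function, so the compilation of each function is independent. This makes it possible to control compilation better from a higher level.</p>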
<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">Compiler</span><span class="p">:</span>
<span class="s">"""compile the whole program, and for each function, call methods in `CompileFunction`"""</span>
<span class="c1"># TODO: How to ref CompileFunction.arg_passing?
</span> <span class="c1"># num_arg_passing_regs = len(CompileFunction.arg_passing)
</span> <span class="n">num_arg_passing_regs</span> <span class="o">=</span> <span class="mi">6</span>
<span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="bp">self</span><span class="p">.</span><span class="n">functions</span> <span class="o">=</span> <span class="p">[]</span>
<span class="c1"># function -> CompileFunction instance
</span> <span class="bp">self</span><span class="p">.</span><span class="n">function_compilers</span> <span class="o">=</span> <span class="p">{}</span>
<span class="c1"># function -> {original_name: new_name}
</span> <span class="bp">self</span><span class="p">.</span><span class="n">function_limit_renames</span> <span class="o">=</span> <span class="p">{}</span>
<span class="bp">self</span><span class="p">.</span><span class="n">num_uniquified_counter</span> <span class="o">=</span> <span class="mi">0</span>
<span class="k">class</span> <span class="nc">CompileFunction</span><span class="p">:</span>
<span class="s">"""compile a single function"""</span>
<span class="n">temp_count</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="mi">0</span>
<span class="n">tup_temp_count</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="mi">0</span>
<span class="c1"># used for tracking static stack usage
</span> <span class="n">normal_stack_count</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="mi">0</span>
<span class="n">shadow_stack_count</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="mi">0</span>
<span class="n">tuple_vars</span> <span class="o">=</span> <span class="p">[]</span>
<span class="c1"># `callq`: include first `arg_num` registers in its read-set R
</span> <span class="n">arg_passing</span> <span class="o">=</span> <span class="p">[</span><span class="n">Reg</span><span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">arg_passing</span><span class="p">]</span>
<span class="c1"># `callq`: include all caller_saved registers in write-set W
</span> <span class="n">caller_saved</span> <span class="o">=</span> <span class="p">[</span><span class="n">Reg</span><span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">caller_saved</span><span class="p">]</span>
<span class="n">callee_saved</span> <span class="o">=</span> <span class="p">[</span><span class="n">Reg</span><span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">callee_saved</span><span class="p">]</span>
<span class="n">builtin_functions</span> <span class="o">=</span> <span class="p">[</span><span class="s">'input_int'</span><span class="p">,</span> <span class="s">'print'</span><span class="p">,</span> <span class="s">'len'</span><span class="p">]</span>
<span class="n">builtin_functions</span> <span class="o">=</span> <span class="p">[</span><span class="n">Name</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">builtin_functions</span><span class="p">]</span>
<span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">name</span><span class="p">:</span> <span class="nb">str</span><span class="p">):</span>
<span class="bp">self</span><span class="p">.</span><span class="n">name</span> <span class="o">=</span> <span class="n">name</span>
<span class="bp">self</span><span class="p">.</span><span class="n">basic_blocks</span> <span class="o">=</span> <span class="p">{}</span>
<span class="c1"># mappings from a single instruction to a set
</span> <span class="bp">self</span><span class="p">.</span><span class="n">read_set_dict</span> <span class="o">=</span> <span class="p">{}</span>
<span class="bp">self</span><span class="p">.</span><span class="n">write_set_dict</span> <span class="o">=</span> <span class="p">{}</span>
<span class="bp">self</span><span class="p">.</span><span class="n">live_before_set_dict</span> <span class="o">=</span> <span class="p">{}</span>
<span class="bp">self</span><span class="p">.</span><span class="n">live_after_set_dict</span> <span class="o">=</span> <span class="p">{}</span>
<span class="c1"># this list can be changed for testing spilling
</span> <span class="bp">self</span><span class="p">.</span><span class="n">allocatable</span> <span class="o">=</span> <span class="p">[</span><span class="n">Reg</span><span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">allocatable</span><span class="p">]</span>
<span class="n">all_reg</span> <span class="o">=</span> <span class="p">[</span><span class="n">Reg</span><span class="p">(</span><span class="s">'r11'</span><span class="p">),</span> <span class="n">Reg</span><span class="p">(</span><span class="s">'r15'</span><span class="p">),</span> <span class="n">Reg</span><span class="p">(</span><span class="s">'rsp'</span><span class="p">),</span> <span class="n">Reg</span><span class="p">(</span>
<span class="s">'rbp'</span><span class="p">),</span> <span class="n">Reg</span><span class="p">(</span><span class="s">'rax'</span><span class="p">)]</span> <span class="o">+</span> <span class="bp">self</span><span class="p">.</span><span class="n">allocatable</span>
<span class="bp">self</span><span class="p">.</span><span class="n">int_graph</span> <span class="o">=</span> <span class="n">UndirectedAdjList</span><span class="p">()</span>
<span class="bp">self</span><span class="p">.</span><span class="n">move_graph</span> <span class="o">=</span> <span class="n">UndirectedAdjList</span><span class="p">()</span>
<span class="bp">self</span><span class="p">.</span><span class="n">control_flow_graph</span> <span class="o">=</span> <span class="n">DirectedAdjList</span><span class="p">()</span>
<span class="bp">self</span><span class="p">.</span><span class="n">live_before_block</span> <span class="o">=</span> <span class="p">{}</span>
<span class="bp">self</span><span class="p">.</span><span class="n">prelude_label</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">name</span>
<span class="c1"># assign this when iterating CFG
</span> <span class="bp">self</span><span class="p">.</span><span class="n">conclusion_label</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">name</span> <span class="o">+</span> <span class="s">'conclusion'</span>
<span class="bp">self</span><span class="p">.</span><span class="n">basic_blocks</span><span class="p">[</span><span class="bp">self</span><span class="p">.</span><span class="n">conclusion_label</span><span class="p">]</span> <span class="o">=</span> <span class="p">[]</span>
<span class="c1"># make the initial conclusion non-empty to avoid errors
</span> <span class="c1"># TODO: come up with a more elegant solution, maybe from `live_before_block`
</span> <span class="c1"># self.basic_blocks[self.conclusion_label] = [Expr(Call(Name('input_int'), []))]
</span> <span class="c1"># why need this?
</span> <span class="bp">self</span><span class="p">.</span><span class="n">sorted_control_flow_graph</span> <span class="o">=</span> <span class="p">[]</span>
<span class="bp">self</span><span class="p">.</span><span class="n">used_callee</span> <span class="o">=</span> <span class="nb">set</span><span class="p">()</span>
<span class="bp">self</span><span class="p">.</span><span class="n">stack_frame_size</span><span class="p">:</span> <span class="nb">int</span>
<span class="bp">self</span><span class="p">.</span><span class="n">shadow_stack_size</span><span class="p">:</span> <span class="nb">int</span>
<span class="c1"># Reserved registers
</span> <span class="bp">self</span><span class="p">.</span><span class="n">color_reg_map</span> <span class="o">=</span> <span class="p">{}</span>
<span class="n">color_from</span> <span class="o">=</span> <span class="o">-</span><span class="mi">5</span>
<span class="k">for</span> <span class="n">reg</span> <span class="ow">in</span> <span class="n">all_reg</span><span class="p">:</span>
<span class="bp">self</span><span class="p">.</span><span class="n">color_reg_map</span><span class="p">[</span><span class="n">color_from</span><span class="p">]</span> <span class="o">=</span> <span class="n">reg</span>
<span class="n">color_from</span> <span class="o">+=</span> <span class="mi">1</span>
</code></pre></div></div>
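<p>To make the "one compiler object per function" split concrete, here is a minimal sketch. It is not the code from the assignment: <code class="language-plaintext highlighter-rouge">FunctionState</code>, <code class="language-plaintext highlighter-rouge">Driver</code>, <code class="language-plaintext highlighter-rouge">fresh_temp</code> and <code class="language-plaintext highlighter-rouge">compile_all</code> are hypothetical names, and the "pass" is a toy that only hands out temporaries. The point is that all per-function state lives on the instance, so two functions never see each other's counters or lists.</p>

```python
from dataclasses import dataclass, field


@dataclass
class FunctionState:
    """Hypothetical stand-in for CompileFunction: per-function state only."""
    name: str
    temp_count: int = 0
    # default_factory gives every instance its own list; a bare class-level
    # `[]` would be shared by all functions.
    tuple_vars: list = field(default_factory=list)

    def fresh_temp(self) -> str:
        """Return a temporary name that is unique within this function."""
        temp = f"{self.name}_tmp_{self.temp_count}"
        self.temp_count += 1
        return temp


class Driver:
    """Hypothetical stand-in for Compiler: owns one state object per function."""

    def __init__(self):
        # function name -> FunctionState
        self.function_compilers = {}

    def compile_all(self, functions: dict) -> dict:
        """`functions` maps a function name to a list of statement strings."""
        compiled = {}
        for name, body in functions.items():
            state = FunctionState(name)
            self.function_compilers[name] = state
            # Toy "pass": bind every statement to a fresh temporary.
            compiled[name] = [(state.fresh_temp(), stmt) for stmt in body]
        return compiled


driver = Driver()
result = driver.compile_all({"main": ["x = 1", "y = 2"], "add": ["return a + b"]})
```

<p>Each function gets its own numbering (<code class="language-plaintext highlighter-rouge">main_tmp_0</code>, <code class="language-plaintext highlighter-rouge">add_tmp_0</code>), which is the property the independent per-function design relies on.</p>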
<h2 id="shrink">shrink</h2>
<p>Gather the statements scattered through the source file that don't belong to any function and cobble them together into a <code class="language-plaintext highlighter-rouge">main</code>.</p>
<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="k">def</span> <span class="nf">shrink</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">p</span><span class="p">:</span> <span class="n">Module</span><span class="p">)</span> <span class="o">-></span> <span class="n">Module</span><span class="p">:</span>
<span class="s">"""create main function, making the module body a series of function definitions"""</span>
<span class="k">assert</span><span class="p">(</span><span class="nb">isinstance</span><span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="n">Module</span><span class="p">))</span>
<span class="c1"># main_args = arguments([], [], [], [], [])
</span> <span class="n">main</span> <span class="o">=</span> <span class="n">FunctionDef</span><span class="p">(</span><span class="s">'main'</span><span class="p">,</span> <span class="n">args</span><span class="o">=</span><span class="p">[],</span> <span class="n">body</span><span class="o">=</span><span class="p">[],</span>
<span class="n">decorator_list</span><span class="o">=</span><span class="p">[],</span> <span class="n">returns</span><span class="o">=</span><span class="nb">int</span><span class="p">)</span>
<span class="n">new_module</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">c</span> <span class="ow">in</span> <span class="n">p</span><span class="p">.</span><span class="n">body</span><span class="p">:</span>
<span class="n">match</span> <span class="n">c</span><span class="p">:</span>
<span class="n">case</span> <span class="n">FunctionDef</span><span class="p">(</span><span class="n">name</span><span class="p">,</span> <span class="n">_args</span><span class="p">,</span> <span class="n">_body</span><span class="p">,</span> <span class="n">_deco_list</span><span class="p">,</span> <span class="n">_rv_type</span><span class="p">):</span>
<span class="n">new_module</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">c</span><span class="p">)</span>
<span class="bp">self</span><span class="p">.</span><span class="n">functions</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">name</span><span class="p">)</span>
<span class="c1"># print("DEBUG, function: ", name, "args: ", args.args, body, _deco_list, _rv_type)
</span> <span class="n">case</span> <span class="n">stmt</span><span class="p">():</span>
<span class="n">main</span><span class="p">.</span><span class="n">body</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">c</span><span class="p">)</span>
<span class="n">main</span><span class="p">.</span><span class="n">body</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">Return</span><span class="p">(</span><span class="n">Constant</span><span class="p">(</span><span class="mi">0</span><span class="p">)))</span>
<span class="n">new_module</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">main</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="s">"DEBUG, new module: "</span><span class="p">,</span> <span class="n">new_module</span><span class="p">)</span>
<span class="k">return</span> <span class="n">Module</span><span class="p">(</span><span class="n">new_module</span><span class="p">)</span>
</code></pre></div></div>
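<p>The current implementation doesn't support global variables, though adding them shouldn't be too hard. One option is to put the variable definitions scattered at the top level of the source file on the heap, so that other functions can access them.</p>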
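<p>For readers without the course's support code, the shrink pass can be approximated with the standard library <code class="language-plaintext highlighter-rouge">ast</code> module. This is a sketch under that assumption: the assignment uses a simplified <code class="language-plaintext highlighter-rouge">FunctionDef</code> whose arguments are plain lists, whereas the version below works on real <code class="language-plaintext highlighter-rouge">ast</code> nodes and builds <code class="language-plaintext highlighter-rouge">main</code> from a parsed template.</p>

```python
import ast


def shrink(module: ast.Module) -> ast.Module:
    """Move every top-level non-function statement into a synthesized `main`."""
    # Parse a template instead of building FunctionDef by hand, so the node
    # has whatever fields the running Python version expects.
    main = ast.parse("def main() -> int:\n    pass").body[0]
    functions, main_body = [], []
    for node in module.body:
        if isinstance(node, ast.FunctionDef):
            functions.append(node)
        else:
            main_body.append(node)
    # Like the pass above, end main with `return 0`.
    main_body.append(ast.Return(value=ast.Constant(value=0)))
    main.body = main_body
    new_module = ast.Module(body=functions + [main], type_ignores=[])
    # Fill in line/column info for the nodes created here.
    return ast.fix_missing_locations(new_module)


source = "x = 1\ndef f(a):\n    return a + 1\nprint(f(x))\n"
shrunk = shrink(ast.parse(source))
names = [fn.name for fn in shrunk.body]
```

<p>After the pass the module body contains only function definitions (<code class="language-plaintext highlighter-rouge">f</code> and <code class="language-plaintext highlighter-rouge">main</code>), and <code class="language-plaintext highlighter-rouge">x</code> has become a local of <code class="language-plaintext highlighter-rouge">main</code>, which is exactly why global variables need separate treatment.</p>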
<h2 id="uniquify">uniquify</h2>
<p>Variable shadowing can occur inside a function. To disambiguate, we simply rename each variable where needed.</p>
<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="k">def</span> <span class="nf">uniquify</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">p</span><span class="p">:</span> <span class="n">Module</span><span class="p">)</span> <span class="o">-></span> <span class="n">Module</span><span class="p">:</span>
<span class="k">class</span> <span class="nc">Uniquify</span><span class="p">(</span><span class="n">NodeTransformer</span><span class="p">):</span>
<span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">outer</span><span class="p">:</span> <span class="n">Compiler</span><span class="p">,</span> <span class="n">mapping</span><span class="p">:</span> <span class="nb">dict</span><span class="p">):</span>
<span class="bp">self</span><span class="p">.</span><span class="n">outer_instance</span> <span class="o">=</span> <span class="n">outer</span>
<span class="bp">self</span><span class="p">.</span><span class="n">uniquify_mapping</span> <span class="o">=</span> <span class="n">mapping</span>
<span class="nb">super</span><span class="p">().</span><span class="n">__init__</span><span class="p">()</span>
<span class="k">def</span> <span class="nf">visit_Lambda</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">node</span><span class="p">):</span>
<span class="bp">self</span><span class="p">.</span><span class="n">generic_visit</span><span class="p">(</span><span class="n">node</span><span class="p">)</span>
<span class="n">match</span> <span class="n">node</span><span class="p">:</span>
<span class="n">case</span> <span class="n">Lambda</span><span class="p">(</span><span class="n">args</span><span class="p">,</span> <span class="n">body_expr</span><span class="p">):</span>
<span class="n">new_mapping</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">uniquify_mapping</span><span class="p">.</span><span class="n">copy</span><span class="p">()</span>
<span class="n">new_args</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">v</span> <span class="ow">in</span> <span class="n">args</span><span class="p">:</span>
<span class="n">new_v</span> <span class="o">=</span> <span class="n">v</span> <span class="o">+</span> <span class="s">"_"</span> <span class="o">+</span> <span class="nb">str</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">outer_instance</span><span class="p">.</span><span class="n">num_uniquified_counter</span><span class="p">)</span>
<span class="c1"># find the new name in the previous mapping
</span> <span class="n">new_mapping</span><span class="p">[</span><span class="n">new_mapping</span><span class="p">[</span><span class="n">v</span><span class="p">]]</span> <span class="o">=</span> <span class="n">new_v</span>
<span class="c1"># delete the old mapping
</span> <span class="k">del</span> <span class="n">new_mapping</span><span class="p">[</span><span class="n">v</span><span class="p">]</span>
<span class="n">new_args</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">new_v</span><span class="p">)</span>
<span class="bp">self</span><span class="p">.</span><span class="n">outer_instance</span><span class="p">.</span><span class="n">num_uniquified_counter</span> <span class="o">+=</span> <span class="mi">1</span>
<span class="n">new_uniquifier</span> <span class="o">=</span> <span class="n">Uniquify</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">outer_instance</span><span class="p">,</span> <span class="n">new_mapping</span><span class="p">)</span>
<span class="n">new_body_expr</span> <span class="o">=</span> <span class="n">new_uniquifier</span><span class="p">.</span><span class="n">visit</span><span class="p">(</span><span class="n">body_expr</span><span class="p">)</span>
<span class="k">return</span> <span class="n">Lambda</span><span class="p">(</span><span class="n">new_args</span><span class="p">,</span> <span class="n">new_body_expr</span><span class="p">)</span>
<span class="n">case</span> <span class="n">_</span><span class="p">:</span>
<span class="k">return</span> <span class="n">node</span>
<span class="k">def</span> <span class="nf">visit_Name</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">node</span><span class="p">):</span>
<span class="bp">self</span><span class="p">.</span><span class="n">generic_visit</span><span class="p">(</span><span class="n">node</span><span class="p">)</span>
<span class="n">match</span> <span class="n">node</span><span class="p">:</span>
<span class="n">case</span> <span class="n">Name</span><span class="p">(</span><span class="nb">id</span><span class="p">):</span>
<span class="k">if</span> <span class="nb">id</span> <span class="ow">in</span> <span class="bp">self</span><span class="p">.</span><span class="n">uniquify_mapping</span><span class="p">:</span>
<span class="k">return</span> <span class="n">Name</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">uniquify_mapping</span><span class="p">[</span><span class="nb">id</span><span class="p">])</span>
<span class="k">else</span><span class="p">:</span>
<span class="k">return</span> <span class="n">node</span>
<span class="n">case</span> <span class="n">_</span><span class="p">:</span>
<span class="k">return</span> <span class="n">node</span>
<span class="k">def</span> <span class="nf">do_uniquify</span><span class="p">(</span><span class="n">stmts</span><span class="p">:</span> <span class="nb">list</span><span class="p">,</span> <span class="n">uniquify_mapping</span><span class="p">:</span> <span class="nb">dict</span><span class="p">)</span> <span class="o">-></span> <span class="nb">list</span><span class="p">:</span>
<span class="s">"""change the variable names of statements in place according to the uniquify_mapping"""</span>
<span class="n">uniquifier</span> <span class="o">=</span> <span class="n">Uniquify</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">uniquify_mapping</span><span class="p">)</span>
<span class="n">new_body</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">s</span> <span class="ow">in</span> <span class="n">stmts</span><span class="p">:</span>
<span class="n">new_body</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">uniquifier</span><span class="p">.</span><span class="n">visit</span><span class="p">(</span><span class="n">s</span><span class="p">))</span>
<span class="k">return</span> <span class="n">new_body</span>
<span class="k">assert</span><span class="p">(</span><span class="nb">isinstance</span><span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="n">Module</span><span class="p">))</span>
<span class="k">for</span> <span class="n">f</span> <span class="ow">in</span> <span class="n">p</span><span class="p">.</span><span class="n">body</span><span class="p">:</span>
<span class="k">assert</span><span class="p">(</span><span class="nb">isinstance</span><span class="p">(</span><span class="n">f</span><span class="p">,</span> <span class="n">FunctionDef</span><span class="p">))</span>
<span class="n">uniquify_mapping</span> <span class="o">=</span> <span class="p">{}</span>
<span class="n">new_args</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">v</span> <span class="ow">in</span> <span class="n">f</span><span class="p">.</span><span class="n">args</span><span class="p">:</span>
<span class="k">print</span><span class="p">(</span><span class="s">"DEBUG, v: "</span><span class="p">,</span> <span class="n">v</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="s">"type: "</span><span class="p">,</span> <span class="nb">type</span><span class="p">(</span><span class="n">v</span><span class="p">[</span><span class="mi">0</span><span class="p">]))</span>
<span class="n">new_arg_name</span> <span class="o">=</span> <span class="n">v</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">+</span> <span class="s">"_"</span> <span class="o">+</span> <span class="nb">str</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">num_uniquified_counter</span><span class="p">)</span>
<span class="n">uniquify_mapping</span><span class="p">[</span><span class="n">v</span><span class="p">[</span><span class="mi">0</span><span class="p">]]</span> <span class="o">=</span> <span class="n">new_arg_name</span>
<span class="n">new_args</span><span class="p">.</span><span class="n">append</span><span class="p">((</span><span class="n">new_arg_name</span><span class="p">,</span> <span class="n">v</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span> <span class="p">))</span>
<span class="bp">self</span><span class="p">.</span><span class="n">num_uniquified_counter</span> <span class="o">+=</span> <span class="mi">1</span>
<span class="n">f</span><span class="p">.</span><span class="n">args</span> <span class="o">=</span> <span class="n">new_args</span>
<span class="n">f</span><span class="p">.</span><span class="n">body</span> <span class="o">=</span> <span class="n">do_uniquify</span><span class="p">(</span><span class="n">f</span><span class="p">.</span><span class="n">body</span><span class="p">,</span> <span class="n">uniquify_mapping</span><span class="p">)</span>
<span class="k">return</span> <span class="n">p</span>
</code></pre></div></div>
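<p>The renaming idea can also be tried out on the standard <code class="language-plaintext highlighter-rouge">ast</code> module. The sketch below is an illustration, not the assignment's code: it assumes real <code class="language-plaintext highlighter-rouge">ast.arguments</code> nodes, renames only positional parameters of functions and lambdas, and uses a shared <code class="language-plaintext highlighter-rouge">itertools.count()</code> in place of <code class="language-plaintext highlighter-rouge">num_uniquified_counter</code>. A lambda parameter that shadows an outer name gets a new mapping for the lambda body only.</p>

```python
import ast
import itertools


class Uniquify(ast.NodeTransformer):
    """Rename function and lambda parameters to globally unique names."""

    def __init__(self, counter=None, mapping=None):
        super().__init__()
        # The counter is shared by nested transformers so suffixes never repeat.
        self.counter = counter if counter is not None else itertools.count()
        self.mapping = dict(mapping or {})

    def _fresh(self, name: str) -> str:
        return f"{name}_{next(self.counter)}"

    def _rename_params(self, args: ast.arguments) -> dict:
        """Rename positional parameters in place; return the extended mapping."""
        inner = dict(self.mapping)
        for arg in args.args:
            inner[arg.arg] = self._fresh(arg.arg)
            arg.arg = inner[arg.arg]
        return inner

    def visit_FunctionDef(self, node: ast.FunctionDef):
        child = Uniquify(self.counter, self._rename_params(node.args))
        node.body = [child.visit(stmt) for stmt in node.body]
        return node

    def visit_Lambda(self, node: ast.Lambda):
        # The lambda body sees the lambda's own names, shadowing outer ones.
        child = Uniquify(self.counter, self._rename_params(node.args))
        node.body = child.visit(node.body)
        return node

    def visit_Name(self, node: ast.Name):
        node.id = self.mapping.get(node.id, node.id)
        return node


tree = ast.parse("def f(x):\n    g = lambda x: x + 1\n    return g(x)\n")
renamed = ast.unparse(Uniquify().visit(tree))
```

<p>Here the outer <code class="language-plaintext highlighter-rouge">x</code> becomes <code class="language-plaintext highlighter-rouge">x_0</code> and the lambda's <code class="language-plaintext highlighter-rouge">x</code> becomes <code class="language-plaintext highlighter-rouge">x_1</code>, so the shadowing is gone. Local variables such as <code class="language-plaintext highlighter-rouge">g</code> are deliberately left untouched in this sketch.</p>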
<h2 id="reveal-functions">reveal functions</h2>
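<p>Functions in Python are first-class citizens and can be used like variables, so at the AST level a reference to a function is indistinguishable from a reference to a variable. The semantics differ, though: assigning a function to a variable actually takes the function's address rather than its body, which is not the same as assigning anything else (e.g. a number, or the value of another variable). We therefore need to tell the two apart and convert every reference to a function defined in the source from <code class="language-plaintext highlighter-rouge">Name(XXX)</code> to <code class="language-plaintext highlighter-rouge">FunRef(XXX)</code>. (<code class="language-plaintext highlighter-rouge">FunRef</code> is an AST node the instructor cooked up.)</p>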
<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="k">def</span> <span class="nf">reveal_functions</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">p</span><span class="p">:</span> <span class="n">Module</span><span class="p">)</span> <span class="o">-></span> <span class="n">Module</span><span class="p">:</span>
<span class="s">"""change `Name(f)` to `FunRef(f)` for functions defined in the module"""</span>
<span class="k">class</span> <span class="nc">RevealFunction</span><span class="p">(</span><span class="n">NodeTransformer</span><span class="p">):</span>
<span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">outer</span><span class="p">:</span> <span class="n">Compiler</span><span class="p">):</span>
<span class="bp">self</span><span class="p">.</span><span class="n">outer_instance</span> <span class="o">=</span> <span class="n">outer</span>
<span class="nb">super</span><span class="p">().</span><span class="n">__init__</span><span class="p">()</span>
<span class="k">def</span> <span class="nf">visit_Call</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">node</span><span class="p">):</span>
<span class="bp">self</span><span class="p">.</span><span class="n">generic_visit</span><span class="p">(</span><span class="n">node</span><span class="p">)</span>
<span class="n">match</span> <span class="n">node</span><span class="p">:</span>
<span class="n">case</span> <span class="n">Call</span><span class="p">(</span><span class="n">Name</span><span class="p">(</span><span class="n">f</span><span class="p">),</span> <span class="n">args</span><span class="p">)</span> <span class="k">if</span> <span class="n">f</span> <span class="ow">in</span> <span class="bp">self</span><span class="p">.</span><span class="n">outer_instance</span><span class="p">.</span><span class="n">functions</span><span class="p">:</span>
<span class="c1"># what if f is a builtin function? guard needed
</span> <span class="k">return</span> <span class="n">Call</span><span class="p">(</span><span class="n">FunRef</span><span class="p">(</span><span class="n">f</span><span class="p">),</span> <span class="n">args</span><span class="p">)</span>
<span class="n">case</span> <span class="n">_</span><span class="p">:</span>
<span class="k">return</span> <span class="n">node</span>
<span class="k">assert</span><span class="p">(</span><span class="nb">isinstance</span><span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="n">Module</span><span class="p">))</span>
<span class="c1"># Why doesn't this work?
</span> <span class="c1"># new_body = RevealFunction(self).visit_Call(p)
</span> <span class="c1"># p.body = new_body
</span>
<span class="k">for</span> <span class="n">f</span> <span class="ow">in</span> <span class="n">p</span><span class="p">.</span><span class="n">body</span><span class="p">:</span>
<span class="n">new_body</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">s</span> <span class="ow">in</span> <span class="n">f</span><span class="p">.</span><span class="n">body</span><span class="p">:</span>
<span class="n">new_line</span> <span class="o">=</span> <span class="n">RevealFunction</span><span class="p">(</span><span class="bp">self</span><span class="p">).</span><span class="n">visit_Call</span><span class="p">(</span><span class="n">s</span><span class="p">)</span>
<span class="n">new_body</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">new_line</span><span class="p">)</span>
<span class="c1"># print("DEBUG, new node: ", ast.dump(n))
</span> <span class="n">f</span><span class="p">.</span><span class="n">body</span> <span class="o">=</span> <span class="n">new_body</span>
<span class="k">return</span> <span class="n">p</span>
</code></pre></div></div>
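<p>A self-contained variant on the standard <code class="language-plaintext highlighter-rouge">ast</code> module, offered as a sketch rather than the assignment's solution. Two assumptions are baked in: <code class="language-plaintext highlighter-rouge">FunRef</code> here is a hypothetical subclass of <code class="language-plaintext highlighter-rouge">ast.expr</code> (the course ships its own), and every <em>load</em> of a known function name is rewritten, not just call targets, so that <code class="language-plaintext highlighter-rouge">g = f</code> is covered too. It also hints at a likely answer to the question left in the comment above: calling <code class="language-plaintext highlighter-rouge">visit_Call</code> directly on a <code class="language-plaintext highlighter-rouge">Module</code> returns the <code class="language-plaintext highlighter-rouge">Module</code> itself (it is not a <code class="language-plaintext highlighter-rouge">Call</code>), so assigning that result to <code class="language-plaintext highlighter-rouge">p.body</code> cannot work, whereas <code class="language-plaintext highlighter-rouge">visit</code> dispatches by node type and can be applied to the whole tree.</p>

```python
import ast


class FunRef(ast.expr):
    """Hypothetical marker node: a reference to a function defined in the module."""
    _fields = ("name",)

    def __init__(self, name: str):
        self.name = name


class RevealFunctions(ast.NodeTransformer):
    def __init__(self, functions):
        super().__init__()
        self.functions = set(functions)

    def visit_Name(self, node: ast.Name):
        # Only reads of a known function name become FunRef; stores and
        # builtins such as `print` are left as plain Name nodes.
        if isinstance(node.ctx, ast.Load) and node.id in self.functions:
            return FunRef(node.id)
        return node


def reveal_functions(module: ast.Module) -> ast.Module:
    functions = [n.name for n in module.body if isinstance(n, ast.FunctionDef)]
    # visit() recurses through the whole tree, so one call on the Module is enough.
    return RevealFunctions(functions).visit(module)


tree = reveal_functions(ast.parse("def f(a):\n    return a\ng = f\nprint(f(1))\n"))
assign, call_stmt = tree.body[1], tree.body[2]
```

<p>After the pass, the right-hand side of <code class="language-plaintext highlighter-rouge">g = f</code> and the callee in <code class="language-plaintext highlighter-rouge">f(1)</code> are both <code class="language-plaintext highlighter-rouge">FunRef</code> nodes, while <code class="language-plaintext highlighter-rouge">print</code> stays a <code class="language-plaintext highlighter-rouge">Name</code>.</p>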
<h2 id="convert-assignments">Convert Assignments</h2>
<p><strong>TODO</strong></p>
<h2 id="convert-to-closure">Convert to Closure</h2>
<p><strong>TODO</strong></p>
<h2 id="limit-function">Limit Function</h2>
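<p>When a function has more than six parameters, the extra ones cannot be passed in registers. The usual approach is to put them on the stack, but to allow tail-call optimization we can instead put them on the shadow stack (which is really the heap). Concretely: create a tuple holding the sixth argument and everything after it, and pass that tuple as the sixth argument.</p>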
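<p>Stripped of all AST handling, the transformation is just list slicing. The sketch below uses plain Python values; <code class="language-plaintext highlighter-rouge">limit_call_args</code> and <code class="language-plaintext highlighter-rouge">param_aliases</code> are hypothetical helper names, and <code class="language-plaintext highlighter-rouge">tup_arg</code> is the tuple parameter name used in the pass itself.</p>

```python
def limit_call_args(args: list, max_regs: int = 6) -> list:
    """Pack the max_regs-th argument and everything after it into one tuple."""
    if len(args) <= max_regs:
        return list(args)
    return list(args[:max_regs - 1]) + [tuple(args[max_regs - 1:])]


def param_aliases(params: list, max_regs: int = 6, tup_name: str = "tup_arg") -> dict:
    """Map each spilled parameter name to (tuple name, index inside the tuple)."""
    if len(params) <= max_regs:
        return {}
    return {p: (tup_name, i) for i, p in enumerate(params[max_regs - 1:])}


# Call site: eight arguments shrink to five plain ones plus one tuple.
packed = limit_call_args([1, 2, 3, 4, 5, 6, 7, 8])
# Definition site: uses of f, g, h must become tup_arg[0], tup_arg[1], tup_arg[2].
aliases = param_aliases(["a", "b", "c", "d", "e", "f", "g", "h"])
```

<p>Note that a function with exactly six arguments is left alone; only when there are more than six does the sixth slot get repurposed for the tuple.</p>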
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="k">def</span> <span class="nf">limit_functions</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">p</span><span class="p">:</span> <span class="n">Module</span><span class="p">)</span> <span class="o">-></span> <span class="n">Module</span><span class="p">:</span>
<span class="s">"""limit functions to 6 arguments, anything more gets put into a 6th tuple-type argument"""</span>
<span class="k">class</span> <span class="nc">LimitFunction</span><span class="p">(</span><span class="n">NodeTransformer</span><span class="p">):</span>
<span class="c1"># limit call sites & convert name of args
</span>
<span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">outer</span><span class="p">:</span> <span class="n">Compiler</span><span class="p">,</span> <span class="n">mapping</span><span class="p">:</span> <span class="nb">dict</span> <span class="o">=</span> <span class="p">{}):</span>
<span class="bp">self</span><span class="p">.</span><span class="n">outer_instance</span> <span class="o">=</span> <span class="n">outer</span>
<span class="bp">self</span><span class="p">.</span><span class="n">mapping</span> <span class="o">=</span> <span class="n">mapping</span>
<span class="nb">super</span><span class="p">().</span><span class="n">__init__</span><span class="p">()</span>
<span class="k">def</span> <span class="nf">visit_Name</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">node</span><span class="p">):</span>
<span class="c1"># substitute the >5th arguments with subscript of a tuple
</span> <span class="bp">self</span><span class="p">.</span><span class="n">generic_visit</span><span class="p">(</span><span class="n">node</span><span class="p">)</span>
<span class="n">match</span> <span class="n">node</span><span class="p">:</span>
<span class="n">case</span> <span class="n">Name</span><span class="p">(</span><span class="n">n</span><span class="p">)</span> <span class="k">if</span> <span class="n">n</span> <span class="ow">in</span> <span class="bp">self</span><span class="p">.</span><span class="n">mapping</span><span class="p">.</span><span class="n">keys</span><span class="p">():</span>
<span class="k">return</span> <span class="bp">self</span><span class="p">.</span><span class="n">mapping</span><span class="p">[</span><span class="n">n</span><span class="p">]</span>
<span class="n">case</span> <span class="n">_</span><span class="p">:</span>
<span class="k">return</span> <span class="n">node</span>
<span class="k">def</span> <span class="nf">visit_Call</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">node</span><span class="p">):</span>
<span class="bp">self</span><span class="p">.</span><span class="n">generic_visit</span><span class="p">(</span><span class="n">node</span><span class="p">)</span>
<span class="n">match</span> <span class="n">node</span><span class="p">:</span>
<span class="n">case</span> <span class="n">Call</span><span class="p">(</span><span class="n">FunRef</span><span class="p">(</span><span class="n">f</span><span class="p">),</span> <span class="n">args</span><span class="p">)</span> <span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="n">args</span><span class="p">)</span> <span class="o">></span> <span class="n">Compiler</span><span class="p">.</span><span class="n">num_arg_passing_regs</span><span class="p">:</span>
<span class="c1"># print("DEBUG, HIT in visit_FunRef: ", ast.dump(node))
</span> <span class="n">new_args</span> <span class="o">=</span> <span class="n">args</span><span class="p">[:</span><span class="n">Compiler</span><span class="p">.</span><span class="n">num_arg_passing_regs</span> <span class="o">-</span> <span class="mi">1</span><span class="p">]</span>
<span class="n">new_args</span><span class="p">.</span><span class="n">append</span><span class="p">(</span>
<span class="n">Tuple</span><span class="p">(</span><span class="n">args</span><span class="p">[</span><span class="n">Compiler</span><span class="p">.</span><span class="n">num_arg_passing_regs</span> <span class="o">-</span> <span class="mi">1</span><span class="p">:],</span> <span class="n">Load</span><span class="p">()))</span>
<span class="k">return</span> <span class="n">Call</span><span class="p">(</span><span class="n">FunRef</span><span class="p">(</span><span class="n">f</span><span class="p">),</span> <span class="n">new_args</span><span class="p">)</span>
<span class="n">case</span> <span class="n">_</span><span class="p">:</span>
<span class="k">return</span> <span class="n">node</span>
<span class="k">def</span> <span class="nf">args_need_limit</span><span class="p">(</span><span class="n">args</span><span class="p">):</span>
<span class="k">if</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">args</span><span class="p">,</span> <span class="nb">list</span><span class="p">):</span>
<span class="c1"># print("DEBUG, args: ", args, type(args))
</span> <span class="c1"># print("DEBUG, argsAST: ", args[1][0])
</span> <span class="k">return</span> <span class="nb">len</span><span class="p">(</span><span class="n">args</span><span class="p">)</span> <span class="o">></span> <span class="n">Compiler</span><span class="p">.</span><span class="n">num_arg_passing_regs</span>
<span class="c1"># else:
</span> <span class="c1"># print("DEBUG, args: ", ast.dump(args))
</span> <span class="k">return</span> <span class="bp">False</span>
<span class="k">assert</span><span class="p">(</span><span class="nb">isinstance</span><span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="n">Module</span><span class="p">))</span>
<span class="c1"># limit defines
</span> <span class="k">for</span> <span class="n">f</span> <span class="ow">in</span> <span class="n">p</span><span class="p">.</span><span class="n">body</span><span class="p">:</span>
<span class="n">match</span> <span class="n">f</span><span class="p">:</span>
<span class="n">case</span> <span class="n">FunctionDef</span><span class="p">(</span><span class="n">_name</span><span class="p">,</span> <span class="n">args</span><span class="p">,</span> <span class="n">_body</span><span class="p">,</span> <span class="n">_deco_list</span><span class="p">,</span> <span class="n">_rv_type</span><span class="p">)</span> <span class="k">if</span> <span class="n">args_need_limit</span><span class="p">(</span><span class="n">args</span><span class="p">):</span>
<span class="c1"># args = args.args
</span> <span class="n">new_args</span> <span class="o">=</span> <span class="n">args</span><span class="p">[:</span><span class="mi">5</span><span class="p">]</span>
<span class="n">arg_tup</span> <span class="o">=</span> <span class="p">(</span><span class="s">'tup_arg'</span><span class="p">,</span> <span class="n">TupleType</span><span class="p">([</span><span class="n">a</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="k">for</span> <span class="n">a</span> <span class="ow">in</span> <span class="n">args</span><span class="p">[</span><span class="mi">5</span><span class="p">:]]))</span>
<span class="n">alias_mapping</span> <span class="o">=</span> <span class="p">{}</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="nb">len</span><span class="p">(</span><span class="n">args</span><span class="p">)):</span>
<span class="n">alias_mapping</span><span class="p">[</span><span class="n">args</span><span class="p">[</span><span class="n">i</span><span class="p">][</span><span class="mi">0</span><span class="p">]]</span> <span class="o">=</span> <span class="n">Subscript</span><span class="p">(</span>
<span class="n">Name</span><span class="p">(</span><span class="s">'tup_arg'</span><span class="p">),</span> <span class="n">Constant</span><span class="p">(</span><span class="n">i</span> <span class="o">-</span> <span class="mi">5</span><span class="p">),</span> <span class="n">Load</span><span class="p">())</span>
<span class="k">print</span><span class="p">(</span><span class="s">"DEBUG, alias_mapping: "</span><span class="p">,</span> <span class="n">alias_mapping</span><span class="p">)</span>
<span class="n">new_args</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">arg_tup</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="s">"DEBUG, new_args: "</span><span class="p">,</span> <span class="n">new_args</span><span class="p">)</span>
<span class="n">f</span><span class="p">.</span><span class="n">args</span> <span class="o">=</span> <span class="n">new_args</span>
<span class="n">new_body</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">s</span> <span class="ow">in</span> <span class="n">f</span><span class="p">.</span><span class="n">body</span><span class="p">:</span>
<span class="c1"># new_line is new node
</span> <span class="n">new_line</span> <span class="o">=</span> <span class="n">LimitFunction</span><span class="p">(</span>
<span class="bp">self</span><span class="p">,</span> <span class="n">alias_mapping</span><span class="p">).</span><span class="n">visit_Name</span><span class="p">(</span><span class="n">s</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="s">"DEBUG, new_line: "</span><span class="p">,</span> <span class="n">new_line</span><span class="p">)</span>
<span class="n">new_body</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">new_line</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="s">"DEBUG, new_body: "</span><span class="p">,</span> <span class="n">new_body</span><span class="p">)</span>
<span class="n">f</span><span class="p">.</span><span class="n">body</span> <span class="o">=</span> <span class="n">new_body</span>
<span class="n">case</span> <span class="n">_</span><span class="p">:</span>
<span class="k">assert</span><span class="p">(</span><span class="nb">isinstance</span><span class="p">(</span><span class="n">f</span><span class="p">,</span> <span class="n">FunctionDef</span><span class="p">))</span>
<span class="n">new_body</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">s</span> <span class="ow">in</span> <span class="n">f</span><span class="p">.</span><span class="n">body</span><span class="p">:</span>
<span class="n">new_line</span> <span class="o">=</span> <span class="n">LimitFunction</span><span class="p">(</span><span class="bp">self</span><span class="p">).</span><span class="n">visit_Call</span><span class="p">(</span><span class="n">s</span><span class="p">)</span>
<span class="n">new_body</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">new_line</span><span class="p">)</span>
<span class="n">f</span><span class="p">.</span><span class="n">body</span> <span class="o">=</span> <span class="n">new_body</span>
<span class="k">return</span> <span class="n">p</span>
</code></pre></div></div>
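<p>Tail call optimization discards the current function's stack frame, so arguments can no longer be passed to the callee on the stack. The surplus arguments are therefore packed into a tuple and handed over via the shadow stack instead.</p>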
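<p>That said, I feel it should also be possible to tear down the current stack frame, prepare the callee's arguments, and then jump straight into the callee (skipping its stack initialization). The instructor pointed out, though, that preparing the arguments may need things on the current stack frame, so there is a chance of messing things up. There always seems to be a way around that: for example, first extend the current stack frame and build the callee's arguments in a clean area, then set up the callee's stack frame by popping our own frame and copying the arguments over. I don't know how production compilers do it; there may well be a more efficient approach.</p>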
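<p>The argument-limiting step in the listing above can be sketched on plain Python lists. This is a simplified illustration, not the pass itself: <code class="language-plaintext highlighter-rouge">limit_call_args</code>, <code class="language-plaintext highlighter-rouge">overflow_aliases</code>, and the constant <code class="language-plaintext highlighter-rouge">NUM_ARG_PASSING_REGS = 6</code> are names and values assumed here for the example.</p>

```python
from typing import Dict, List, Tuple

# Assumption for this sketch: six argument-passing registers.
NUM_ARG_PASSING_REGS = 6


def limit_call_args(args: List[str], num_regs: int = NUM_ARG_PASSING_REGS) -> List[object]:
    """Keep the first num_regs - 1 arguments and pack the rest into one tuple."""
    if len(args) <= num_regs:
        return list(args)  # everything fits in registers, nothing to do
    limited: List[object] = list(args[:num_regs - 1])
    limited.append(tuple(args[num_regs - 1:]))  # the last register carries the tuple
    return limited


def overflow_aliases(params: List[str], num_regs: int = NUM_ARG_PASSING_REGS) -> Dict[str, Tuple[str, int]]:
    """Map each overflow parameter name to (tuple parameter name, index in the tuple)."""
    if len(params) <= num_regs:
        return {}
    return {name: ("tup_arg", i) for i, name in enumerate(params[num_regs - 1:])}


print(limit_call_args(list("abcdefgh")))
print(overflow_aliases(list("abcdefgh")))
```

<p>Inside the callee, every use of an overflow parameter is then rewritten into a subscript on the tuple parameter, which is the job <code class="language-plaintext highlighter-rouge">alias_mapping</code> does in the real pass.</p>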
<h2 id="expose-allocation">Expose Allocation</h2>
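<p>We chose to store tuples on the heap and keep the pointers to them on a shadow stack (root stack). This makes the garbage collector's job easier (more on that later). This pass only affects tuples: it turns a tuple's initialization into a series of statements, followed by a final expression that yields the tuple's value.</p>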
<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="k">def</span> <span class="nf">expose_allocation_hide</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">t</span><span class="p">:</span> <span class="n">Tuple</span><span class="p">)</span> <span class="o">-></span> <span class="n">Begin</span><span class="p">:</span>
<span class="c1"># Autograder call `expose_allocation` in a wrong way, so this is hidden from the autograder
</span> <span class="s">"""convert a tuple creation into a begin"""</span>
<span class="k">assert</span><span class="p">(</span><span class="nb">isinstance</span><span class="p">(</span><span class="n">t</span><span class="p">,</span> <span class="n">Tuple</span><span class="p">))</span>
<span class="n">content</span> <span class="o">=</span> <span class="n">t</span><span class="p">.</span><span class="n">elts</span>
<span class="n">body</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">content</span><span class="p">)):</span>
<span class="k">if</span> <span class="ow">not</span> <span class="n">CompileFunction</span><span class="p">.</span><span class="n">is_atm</span><span class="p">(</span><span class="n">content</span><span class="p">[</span><span class="n">i</span><span class="p">]):</span>
<span class="k">print</span><span class="p">(</span><span class="s">"DEBUG: ???"</span><span class="p">)</span>
<span class="n">temp_name</span> <span class="o">=</span> <span class="s">'temp_tup'</span> <span class="o">+</span> \
<span class="nb">str</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">tup_temp_count</span><span class="p">)</span> <span class="o">+</span> <span class="s">'X'</span> <span class="o">+</span> <span class="nb">str</span><span class="p">(</span><span class="n">i</span><span class="p">)</span>
<span class="n">body</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">Assign</span><span class="p">([</span><span class="n">Name</span><span class="p">(</span><span class="n">temp_name</span><span class="p">)],</span> <span class="n">content</span><span class="p">[</span><span class="n">i</span><span class="p">]))</span>
<span class="n">tup_bytes</span> <span class="o">=</span> <span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">content</span><span class="p">)</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span> <span class="o">*</span> <span class="mi">8</span>
<span class="n">if_cond</span> <span class="o">=</span> <span class="n">Compare</span><span class="p">(</span><span class="n">BinOp</span><span class="p">(</span><span class="n">GlobalValue</span><span class="p">(</span><span class="s">'free_ptr'</span><span class="p">),</span> <span class="n">Add</span><span class="p">(),</span> <span class="n">Constant</span><span class="p">(</span><span class="n">tup_bytes</span><span class="p">)),</span> <span class="p">[</span>
<span class="n">Lt</span><span class="p">()],</span> <span class="p">[</span><span class="n">GlobalValue</span><span class="p">(</span><span class="s">'fromspace_end'</span><span class="p">)])</span>
<span class="c1"># TODO: Expr(Constant(0)) OK here?
</span> <span class="n">body</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">If</span><span class="p">(</span><span class="n">if_cond</span><span class="p">,</span> <span class="p">[],</span> <span class="p">[</span><span class="n">Collect</span><span class="p">(</span><span class="n">tup_bytes</span><span class="p">)]))</span>
<span class="n">var</span> <span class="o">=</span> <span class="n">Name</span><span class="p">(</span><span class="s">"pyc_temp_tup_"</span> <span class="o">+</span> <span class="nb">str</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">tup_temp_count</span><span class="p">))</span>
<span class="n">body</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">Assign</span><span class="p">([</span><span class="n">var</span><span class="p">],</span> <span class="n">Allocate</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">content</span><span class="p">),</span> <span class="n">t</span><span class="p">.</span><span class="n">has_type</span><span class="p">)))</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">content</span><span class="p">)):</span>
<span class="k">if</span> <span class="ow">not</span> <span class="n">CompileFunction</span><span class="p">.</span><span class="n">is_atm</span><span class="p">(</span><span class="n">content</span><span class="p">[</span><span class="n">i</span><span class="p">]):</span>
<span class="n">body</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">Assign</span><span class="p">([</span><span class="n">Subscript</span><span class="p">(</span><span class="n">var</span><span class="p">,</span> <span class="n">Constant</span><span class="p">(</span><span class="n">i</span><span class="p">),</span> <span class="n">Store</span><span class="p">())],</span> <span class="n">Name</span><span class="p">(</span>
<span class="s">'temp_tup'</span> <span class="o">+</span> <span class="nb">str</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">tup_temp_count</span><span class="p">)</span> <span class="o">+</span> <span class="s">'X'</span> <span class="o">+</span> <span class="nb">str</span><span class="p">(</span><span class="n">i</span><span class="p">))))</span>
<span class="k">else</span><span class="p">:</span>
<span class="n">body</span><span class="p">.</span><span class="n">append</span><span class="p">(</span>
<span class="n">Assign</span><span class="p">([</span><span class="n">Subscript</span><span class="p">(</span><span class="n">var</span><span class="p">,</span> <span class="n">Constant</span><span class="p">(</span><span class="n">i</span><span class="p">),</span> <span class="n">Store</span><span class="p">())],</span> <span class="n">content</span><span class="p">[</span><span class="n">i</span><span class="p">]))</span>
<span class="bp">self</span><span class="p">.</span><span class="n">tup_temp_count</span> <span class="o">+=</span> <span class="mi">1</span>
<span class="k">return</span> <span class="n">Begin</span><span class="p">(</span><span class="n">body</span><span class="p">,</span> <span class="n">var</span><span class="p">)</span>
</code></pre></div></div>
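<p>Note that the steps from allocating the memory to finishing the tuple must be atomic, because the allocated memory is uninitialized. If another operation (which also allocates) is inserted in between, memory corruption occurs, most likely because some global pointers end up pointing somewhere wild.</p>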
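<p>The space check that the generated <code class="language-plaintext highlighter-rouge">If</code> performs can be simulated with a tiny bump allocator. This is only a sketch under assumptions: the class <code class="language-plaintext highlighter-rouge">Heap</code> and its methods are hypothetical, and <code class="language-plaintext highlighter-rouge">collect</code> is a stand-in that pretends every tuple was garbage rather than a real collector.</p>

```python
class Heap:
    """Toy model of the from-space: addresses are plain byte offsets."""

    def __init__(self, size_bytes: int) -> None:
        self.free_ptr = 0
        self.fromspace_end = size_bytes
        self.collections = 0

    def collect(self, needed_bytes: int) -> None:
        # Stand-in for the real collector: assume nothing was live.
        self.collections += 1
        self.free_ptr = 0

    def allocate_tuple(self, length: int) -> int:
        # One extra 8-byte word for the tag, as in `tup_bytes` above.
        tup_bytes = (length + 1) * 8
        # Same comparison as the generated If: collect unless there is room.
        if not (self.free_ptr + tup_bytes < self.fromspace_end):
            self.collect(tup_bytes)
        addr = self.free_ptr
        self.free_ptr += tup_bytes  # bump the pointer past the new tuple
        return addr


heap = Heap(64)
print([heap.allocate_tuple(2) for _ in range(3)], heap.collections)
```

<p>Because the element expressions are evaluated into temporaries before the check, nothing between the allocation and the last store can trigger another allocation, which is what keeps the sequence effectively atomic.</p>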
<h2 id="remove-complex-op">Remove Complex Op*</h2>
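<p>This pass turns complex operations and operands into several simpler operations. For example, the operands of an addition must be atoms (e.g. a constant or a variable).</p>
<p>The full function is too long to include here; see the repo.</p>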
<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="k">def</span> <span class="nf">rco_exp</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">e</span><span class="p">:</span> <span class="n">expr</span><span class="p">,</span> <span class="n">need_atomic</span><span class="p">:</span> <span class="nb">bool</span><span class="p">)</span> <span class="o">-></span> <span class="n">typing</span><span class="p">.</span><span class="n">Tuple</span><span class="p">[</span><span class="n">expr</span><span class="p">,</span> <span class="n">Temporaries</span><span class="p">]:</span>
<span class="n">temps</span> <span class="o">=</span> <span class="p">[]</span>
<span class="c1"># tail must be assigned in the match cases
</span> <span class="k">if</span> <span class="n">CompileFunction</span><span class="p">.</span><span class="n">is_atm</span><span class="p">(</span><span class="n">e</span><span class="p">):</span>
<span class="s">"""nothing need to do if it's already an `atm`"""</span>
<span class="k">return</span> <span class="p">(</span><span class="n">e</span><span class="p">,</span> <span class="n">temps</span><span class="p">)</span>
<span class="n">match</span> <span class="n">e</span><span class="p">:</span>
<span class="n">case</span> <span class="n">XXX</span><span class="p">:</span>
<span class="n">tail</span> <span class="o">=</span> <span class="n">YYY</span>
<span class="k">return</span> <span class="p">(</span><span class="n">tail</span><span class="p">,</span> <span class="n">temps</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">rco_stmt</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">s</span><span class="p">:</span> <span class="n">stmt</span><span class="p">)</span> <span class="o">-></span> <span class="n">List</span><span class="p">[</span><span class="n">stmt</span><span class="p">]:</span>
<span class="n">result</span> <span class="o">=</span> <span class="p">[]</span>
<span class="n">temps</span> <span class="o">=</span> <span class="p">[]</span>
<span class="n">match</span> <span class="n">s</span><span class="p">:</span>
<span class="n">case</span> <span class="n">XXX</span><span class="p">:</span>
<span class="bp">self</span><span class="p">.</span><span class="n">rco_exp</span><span class="p">(</span><span class="n">XXX</span><span class="p">)</span>
<span class="c1"># late binding
</span> <span class="k">for</span> <span class="n">binding</span> <span class="ow">in</span> <span class="n">temps</span><span class="p">:</span>
<span class="c1"># print("DEBUG, binding: ", binding)
</span> <span class="n">result</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">Assign</span><span class="p">([</span><span class="n">binding</span><span class="p">[</span><span class="mi">0</span><span class="p">]],</span> <span class="n">binding</span><span class="p">[</span><span class="mi">1</span><span class="p">]))</span>
</code></pre></div></div>
<p>This at least saves a <code class="language-plaintext highlighter-rouge">return</code> line in every case. The calls here have to be made recursively.</p>
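<p>All in all, the code still has plenty of room for optimization and some redundancy, but I tried my best to balance readability and simplicity. At the very least, I think the code I wrote is pretty damn good.</p>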
<h2 id="explicate-control">Explicate Control</h2>
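<p>Here the program's control flow has to be laid out sensibly. As we all learned back in third grade, once a function has diverging control flow it needs basic blocks to carry the individual branches and loops, and that is exactly what explicate control does. This pass is also quite bloated, so <a href="https://github.com/ya0guang/ToyPythonCompiler/blob/16b8c1ca43260156d689bea36d4552b0a545ac38/compiler.py#L601">I won't paste the code</a>. At a high level, though, it breaks down into five main parts:</p>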
<ul>
<li><code class="language-plaintext highlighter-rouge">explicate_pred</code>: handles <code class="language-plaintext highlighter-rouge">if</code> and <code class="language-plaintext highlighter-rouge">while</code> loops. Nested <code class="language-plaintext highlighter-rouge">if</code>s also have to be considered here.</li>
<li><code class="language-plaintext highlighter-rouge">explicate_assign</code>: handles assignments.</li>
<li><code class="language-plaintext highlighter-rouge">explicate_tail</code>: handles <code class="language-plaintext highlighter-rouge">return</code>. Tail call optimization can be done here.</li>
<li><code class="language-plaintext highlighter-rouge">explicate_effect</code>: handles other expressions that have <strong>only</strong> side effects, such as <code class="language-plaintext highlighter-rouge">print()</code>, or an <code class="language-plaintext highlighter-rouge">input()</code> whose result is not assigned to any variable (the latter is usually used to pause a program).</li>
<li><code class="language-plaintext highlighter-rouge">explicate_stmt</code>: dispatches a statement to one of the cases above.</li>
</ul>
<p>On top of that I implemented a lazy evaluation optimization, which cuts down on trivial basic blocks that contain nothing but a one-way <code class="language-plaintext highlighter-rouge">jmp</code>.</p>
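<p>One way to picture that optimization is the sketch below, where statements are plain strings and the two branches are passed as thunks so that a branch is only built when it is actually needed. This is an illustration under assumptions, not the pass in the repo: <code class="language-plaintext highlighter-rouge">BlockBuilder</code>, <code class="language-plaintext highlighter-rouge">create_block</code>, and the string-based statements are all made up for the example.</p>

```python
from typing import Callable, Dict, List

Stmts = List[str]


class BlockBuilder:
    def __init__(self) -> None:
        self.blocks: Dict[str, Stmts] = {}
        self.counter = 0

    def create_block(self, stmts: Stmts) -> Stmts:
        """Return a jump to stmts, allocating a new block only when it is needed."""
        if len(stmts) == 1 and stmts[0].startswith("goto "):
            return stmts  # already a lone jump: reuse it instead of wrapping it
        label = f"block_{self.counter}"
        self.counter += 1
        self.blocks[label] = stmts
        return [f"goto {label}"]

    def explicate_pred(self, cond: str, thn: Callable[[], Stmts], els: Callable[[], Stmts]) -> Stmts:
        # Constant conditions: only the taken branch is ever forced.
        if cond == "True":
            return thn()
        if cond == "False":
            return els()
        then_jump = self.create_block(thn())
        else_jump = self.create_block(els())
        return [f"if {cond}: {then_jump[0]} else: {else_jump[0]}"]


builder = BlockBuilder()
print(builder.explicate_pred("x < 1", lambda: ["goto done"], lambda: ["y = 2", "goto done"]))
print(builder.blocks)
```

<p>In the example, the then-branch is already a lone jump, so no extra block is created for it; only the else-branch gets a label.</p>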
<h2 id="select-instruction">Select Instruction</h2>
<p>This is mostly brute force: find suitable x86-64 instructions for every expression. <a href="https://github.com/ya0guang/ToyPythonCompiler/blob/16b8c1ca43260156d689bea36d4552b0a545ac38/compiler.py#L772">The code is too long to paste</a>.</p>
<p>I pulled a bit of a trick here. I don't know whether the compiler field has a proper name for it, but I can borrow the term <em>late binding</em>: the rhs of every <code class="language-plaintext highlighter-rouge">Assign</code> statement is first bound to an <code class="language-plaintext highlighter-rouge">Unnamed_Pyc_Var</code> in <code class="language-plaintext highlighter-rouge">select_expr()</code>, and is then bound according to the lhs via <code class="language-plaintext highlighter-rouge">bound_unamed()</code> in <code class="language-plaintext highlighter-rouge">rco_stmt()</code>. This greatly reduces redundant code (I find the instructor's code too complicated; it makes my head hurt = =).</p>
<p>I wish the AST had been designed a bit more simply; as it stands it is still somewhat too complicated.</p>
<h2 id="register-allocation">Register Allocation</h2>
<h3 id="liveness-analysis">Liveness Analysis</h3>
<h3 id="interference-graph">Interference Graph</h3>
<h3 id="graph-coloring">Graph Coloring</h3>
<h2 id="patch-instruction">Patch Instruction</h2>
<h2 id="prelude-and-conclusion">Prelude and Conclusion</h2>
<h1 id="something-interesting">Something Interesting</h1>
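<p>Here are some other techniques used along the way. I'm not sure I'll have time to cover them in detail, but I can at least leave placeholders for now.</p>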
<h2 id="dynamic-typing">Dynamic Typing</h2>
<h2 id="garbage-collection">Garbage Collection</h2>
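<p>Since tuples live on the heap, a GC is needed. For that, a root stack (shadow stack) is set up to hold all tuple pointers.</p>
<p>The memory layout looks like this:
<img src="/assets/images/Compiler/rootStack.png" alt="root stack" /></p>
<p>The reason for putting all tuple pointers on the shadow stack is that the GC needs a starting point for picking up garbage, and the shadow stack is that starting point. More precisely, everything that needs garbage collection should go there.</p>
<p>The GC finds <em>every element reachable from the shadow stack</em> to make sure those are not collected, while the ones it cannot find are doomed. Reachable elements include those pointed to by pointers in the root stack and, recursively, those pointed to by pointers inside those elements. This raises some questions: what if an element is pointed to twice? What if some pointers form a cycle?</p>
<p>So the actual layout of a tuple in memory is this:
<img src="/assets/images/Compiler/tupleStruct.png" alt="tupleStruct" /></p>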
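<p>The first 64-bit word of a tuple stores some metadata, including:</p>
<ul>
<li>pointer mask: which elements of the tuple are pointers</li>
<li>vector length: the length of the tuple</li>
<li>forwarding: marks, while the GC is running, whether the tuple has already been processed</li>
</ul>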
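<p>As an illustration of how these three fields can share one word, here is a small bit-packing sketch. The exact bit positions (bit 0 for forwarding, the next 6 bits for the length, the pointer mask above that) are an assumption chosen for this example, not something stated in this post, and the helper names are hypothetical.</p>

```python
from typing import List

LEN_BITS = 6  # assumed width of the length field


def make_tag(is_pointer: List[bool]) -> int:
    """Pack pointer mask, length, and a 'not yet forwarded' bit into one integer."""
    n = len(is_pointer)
    assert n < (1 << LEN_BITS), "tuple too long for the length field"
    mask = 0
    for i, p in enumerate(is_pointer):
        if p:
            mask |= 1 << i  # bit i of the mask says element i is a pointer
    # Lowest bit set to 1 means: this word is a tag, the tuple has not been copied.
    return (mask << (1 + LEN_BITS)) | (n << 1) | 1


def tag_length(tag: int) -> int:
    return (tag >> 1) & ((1 << LEN_BITS) - 1)


def tag_pointer_mask(tag: int) -> int:
    return tag >> (1 + LEN_BITS)


def is_forwarded(tag: int) -> bool:
    return tag & 1 == 0


tag = make_tag([False, True, False])
print(tag, tag_length(tag), bin(tag_pointer_mask(tag)), is_forwarded(tag))
```

<p>With a layout like this the collector can read one word and know how many fields to scan and which of them to follow.</p>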
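<p>The GC algorithm itself is Two-Space Copying. The name is fairly self-explanatory, so I won't go into it here; see <a href="https://en.wikipedia.org/wiki/Cheney's_algorithm">Wiki</a> for details. I read through the C implementation of this thing and it was quite painful. It isn't written in a complicated way, but if I hadn't known what the algorithm was, I would have had a hard time telling that it was a GC at all (maybe I just lack experience).</p>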
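<p>For readers who prefer code to prose, here is a toy two-space copy in the style of Cheney's algorithm over a simulated heap. It is a sketch under assumptions, not the course runtime: tuples are Python lists, pointers are <code class="language-plaintext highlighter-rouge">Ref</code> objects, and forwarding information is kept in a side dictionary instead of in the tuple's tag word.</p>

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass(frozen=True)
class Ref:
    addr: int  # index of a tuple within a space


Field = Union[int, Ref]
Space = List[List[Field]]


def cheney_collect(fromspace: Space, roots: List[Ref]) -> Tuple[Space, List[Ref]]:
    """Copy everything reachable from `roots` into a fresh to-space."""
    tospace: Space = []
    forwarding: Dict[int, int] = {}  # from-space address -> to-space address

    def copy(ref: Ref) -> Ref:
        # A tuple is copied at most once; later visits reuse the forwarding entry.
        if ref.addr not in forwarding:
            forwarding[ref.addr] = len(tospace)
            tospace.append(list(fromspace[ref.addr]))
        return Ref(forwarding[ref.addr])

    new_roots = [copy(r) for r in roots]

    # The to-space doubles as the work queue: everything before `scan` is done.
    scan = 0
    while scan < len(tospace):
        obj = tospace[scan]
        for i, field in enumerate(obj):
            if isinstance(field, Ref):
                obj[i] = copy(field)
        scan += 1
    return tospace, new_roots


# Tuples 0 and 1 form a cycle, tuple 2 is garbage, tuple 3 points to tuple 1 twice.
heap: Space = [[1, Ref(1)], [Ref(0), 2], [99], [Ref(1), Ref(1)]]
live, roots = cheney_collect(heap, [Ref(3)])
print(len(live), roots)
```

<p>The forwarding record is what answers the two questions raised earlier: a tuple that is pointed to twice is copied once and both pointers are redirected to the same copy, and a cycle terminates because the second visit finds the tuple already forwarded.</p>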
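<p>Introducing the GC requires some extra work: initialize the shadow stack when the program starts; reserve a register (<code class="language-plaintext highlighter-rouge">r15</code>) as the shadow stack pointer; reserve another register (<code class="language-plaintext highlighter-rouge">r11</code>) as the pointer for tuple access (x86 requires the memory write to go through a register, and <code class="language-plaintext highlighter-rouge">rax</code> is already used in instruction patching to avoid having two memory operands, so it cannot be reused for tuples); and check before every allocation whether there is enough space, allocating more if there isn't.</p>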
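<h2 id="data-flow-analysis">Data Flow Analysis</h2>
ya0guang ya0guang@protonmail.com This year I got the itch and signed up for an Implementation of PL course, which means building a compiler by hand. Here are some notes on the process; it may well be one of the most hardcore CS courses at my school.
A Stretch of Time in California 2021-07-08T22:00:07+00:00 2021-07-08T22:00:07+00:00 https://ya0guang.com/caprice/SeveralNightsInCalifornia
<p><del>I just got back from California and can't wait to jot down what I saw and experienced.</del>
It took me more than a month to come back and finish this blog post.</p>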
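<h1 id="楔子">Prologue</h1>
<p>This summer I was lucky enough to land an internship at a company that is a big name in China but whose ownership is American. Interns were not required to work onsite, but a company event happened to give me the chance to go to California and experience a different kind of life.</p>
<h1 id="客观事实">Objective Facts</h1>
<p>Only after returning to the village-in-a-city (or city-in-a-village) where my university sits did I realize what a blessed place California really is.</p>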
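<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Every day you get to bask in sunshine, yet it never gets too hot;
from afar it still looks like desert, yet wherever people live is covered in greenery;
the night air is crisp without feeling bleak, so you can sleep in comfort.
</code></pre></div></div>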
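<p>Several generations have turned this somewhat desert-like place into somewhere very pleasant to live. That has brought some problems too:</p>
<ul>
<li><strong>Rat race</strong>. I didn't get much of a first-hand feel for this, but California parents probably push their kids every bit as hard as parents back in China.</li>
<li><strong>Traffic jams</strong>. Everyone was still in the tail end of WFH, so the problem wasn't too bad yet. Still, I often saw a big clump of cars squeezed into one or two lanes at freeway ramps, so once everyone is fully back at work, rush hour will probably have its days of total gridlock.</li>
<li><strong>Unsafe</strong>. California is more or less famous for being unsafe. I was lucky not to run into any safety problems (probably because I do so much security :), but <strong>everyone</strong> here is afraid to leave a bag in the car, and nearly every parking lot reminds you not to leave bags inside, which tells you indirectly how common car break-ins are. Besides that, Asian hate <em>may</em> also be fairly serious in California.</li>
<li><strong>Slightly pricey</strong>. Put simply, basically everything in the stores costs 50%+ more than in the little village where I normally live. That is understandable, though, since there is also a lot more you can buy here.</li>
</ul>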
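<p>Of course, there is also plenty to praise here:</p>
<ul>
<li><strong>Lots of people</strong>. Relative to the hamlet I live in, that is. Three weeks of socializing here nearly matched a whole year's worth for me :P</li>
<li><strong>Good weather</strong>. I've already praised California's weather, but let me say it again, this time by way of contrast. On my first day back in the village it greeted me with a downpour, and after grocery shopping I sat in the car for a good ten minutes before daring to get out. I didn't get out because the rain had eased, mind you, but because a huge bolt of lightning nearly hit me in the face, which made me feel that wet clothes would be easier to deal with than being struck. Naturally, the internet got knocked out as well QAQ. Meanwhile, the umbrella I had packed for rain ended up serving as a parasol and never once did its rain-blocking job. Frequent rain has its upside too, of course: I have never washed my car in my life.</li>
<li><strong>Convenient living</strong>. The economic base and superstructure that Chinese life requires are basically all here. The various food delivery services, and even fresh grocery delivery, were a blessing for someone like me who hadn't rented a car, and the range of Chinese food is quite complete. Given US labor costs, of course, you can't expect the ultra-low delivery fees of China. In short, even without speaking English you can rely entirely on Chinese-run shops for all the basic needs of life. <del>Maybe the prices aren't their problem; maybe they're my problem.</del></li>
</ul>
<p>Beyond that, I want to talk about work and life in more detail.</p>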
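<h1 id="主观感受">Subjective Impressions</h1>
<p>Next come some subjective impressions and experiences; allow me to go through them one by one.</p>
<h2 id="工作与生活">Work and Life</h2>
<p>First, and perhaps most important, is this point. WLB (Work Life Balance) really exists in California! (It exists at most companies in the US, too.) Programmers' jobs don't come with that much pressure, and while the company asks them to "deliver", it also takes responsibility for their "growth". Having never worked in China, I'm in no position to compare the US with it, but the rumors that paint Chinese internet companies as something like purgatory are frightening enough. So what kind of life do working people here have? The ones I know of: social king/queen, playing music, sports, mixing cocktails, etc. I think being able to find something you love doing and draw joy from it is a very happy thing.</p>
<h1 id="好吃的">Good Food!</h1>
<p>California really does have a lot of good food, and the key thing is the sheer variety. For a country boy like me, it opened a lavish door to a new world. The place I live is also called the countryside by Californians, but they clearly have a somewhat skewed idea of what real countryside is. To be honest, though, the flavors mostly can't match those in China. I recommend <strong>锦鲤缘</strong> and <strong>老家陕西</strong>; these two are probably the ones I found really tasty! If any friends know other good places, feel free to tell me!</p>
<h1 id="一些遗憾">Some Regrets</h1>
<p>Since I wasn't driving, I missed many places with great scenery. I'll take my time visiting them when I come back for conferences! I plan to tour every place that has lent its name to a macOS release. There were also some people I didn't manage to meet, so that will have to wait until next time!</p>
<h1 id="不是结尾的尾声">An Epilogue That Isn't an Ending</h1>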
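<p>Many thanks to the old friends, new friends, and senior labmates I had never met before for all the help they gave me during my time in California! These people were just too good to me! Especially my mentor; I feel like I made him worry himself sick (facepalm).</p>
ya0guang ya0guang@protonmail.com
Computer Mysticism: Mostly Rust Pitfalls I Stepped Into 2021-06-24T12:00:07+00:00 2021-06-24T12:00:07+00:00 https://ya0guang.com/tech/ComputerXuanice_Rust
<p>As the saying goes:</p>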
<blockquote>
<p>My whole life<br />
is a life of wrestling with the Rust Compiler.<br />
- Me</p>
</blockquote>
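<p>And as another saying goes:</p>
<blockquote>
<p>When you think something is wrong with Rust,<br />
it is not because Rust has a problem,<br />
but because you have a problem.<br />
- Me again</p>
</blockquote>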
<h1 id="cstring-ownership">CString Ownership</h1>
<p>I've been doing Teaclave development lately and using a lot of FFI stuff. CString in particular left a deep impression on me.</p>
<h2 id="problem">Problem</h2>
<div class="language-rs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">NativeSymbol</span> <span class="p">{</span>
<span class="n">symbol</span><span class="p">:</span> <span class="nn">CString</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="s">"teaclave_open_input"</span><span class="p">)</span><span class="nf">.as_ptr</span><span class="p">()</span> <span class="k">as</span> <span class="mi">_</span><span class="p">,</span>
<span class="n">func_ptr</span><span class="p">:</span> <span class="n">wasm_open_input</span> <span class="k">as</span> <span class="o">*</span><span class="k">const</span> <span class="nb">c_void</span><span class="p">,</span>
<span class="n">signature</span><span class="p">:</span> <span class="nn">CString</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="s">"($)i"</span><span class="p">)</span><span class="nf">.as_ptr</span><span class="p">()</span> <span class="k">as</span> <span class="mi">_</span><span class="p">,</span>
<span class="n">attachment</span><span class="p">:</span> <span class="nn">std</span><span class="p">::</span><span class="nn">ptr</span><span class="p">::</span><span class="nf">null</span><span class="p">(),</span>
<span class="p">},</span>
</code></pre></div></div>
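<p>The code above creates a <code class="language-plaintext highlighter-rouge">NativeSymbol</code> struct, where the two C strings <code class="language-plaintext highlighter-rouge">symbol</code> and <code class="language-plaintext highlighter-rouge">signature</code> are both created while the struct is being built, and pointers to them are taken.<br />
This is exactly where the problem lies. By Rust's ownership rules, data created in and owned by this scope is dropped on leaving the braces (the scope); more precisely, each temporary <code class="language-plaintext highlighter-rouge">CString</code> is dropped at the end of the statement that creates it.<br />
In other words, the <code class="language-plaintext highlighter-rouge">CString</code> is gone but the pointer is still there. What is that? A textbook dangling pointer! The nastiest part is that Rust does not report an error! <del>I don't know why it doesn't, but this feels wrong.</del> Creating a raw pointer in Rust is fine; it is dereferencing it that is unsafe. Here the dereference of the raw pointer does not happen in (safe) Rust but inside an unsafe FFI call, so Rust reports no error at compile time.</p>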
<h2 id="solution">Solution</h2>
<p>So what is the fix? The first option is to use (byte) string literals:</p>
<div class="language-rs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">NativeSymbol</span> <span class="p">{</span>
<span class="n">symbol</span><span class="p">:</span> <span class="n">b</span><span class="s">"teaclave_open_input</span><span class="se">\0</span><span class="s">"</span><span class="nf">.as_ptr</span><span class="p">()</span> <span class="k">as</span> <span class="mi">_</span><span class="p">,</span>
<span class="n">func_ptr</span><span class="p">:</span> <span class="n">wasm_open_input</span> <span class="k">as</span> <span class="o">*</span><span class="k">const</span> <span class="nb">c_void</span><span class="p">,</span>
<span class="n">signature</span><span class="p">:</span> <span class="n">b</span><span class="s">"($)i</span><span class="se">\0</span><span class="s">"</span><span class="nf">.as_ptr</span><span class="p">()</span> <span class="k">as</span> <span class="mi">_</span><span class="p">,</span>
<span class="n">attachment</span><span class="p">:</span> <span class="nn">std</span><span class="p">::</span><span class="nn">ptr</span><span class="p">::</span><span class="nf">null</span><span class="p">(),</span>
<span class="p">},</span>
</code></pre></div></div>
<p>String literals in Rust have a static lifetime, so pointers to them are not simply invalidated.</p>
<p>Besides that, it seems a reference (<code class="language-plaintext highlighter-rouge">&</code>) could also be used. I haven't tried that approach yet and don't know what the implementation would look like; I'll update this once I've done it.</p>
ya0guang ya0guang@protonmail.com