<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-5802665338436118579</id><updated>2011-04-22T07:17:30.894+09:00</updated><category term='Recent Plans'/><category term='python'/><title type='text'>mocibb's Notes</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://mocibb.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5802665338436118579/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://mocibb.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>mocibb</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>5</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-5802665338436118579.post-1675665020708593338</id><published>2009-05-20T12:37:00.003+09:00</published><updated>2009-05-20T12:45:20.173+09:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Recent Plans'/><title type='text'>Plans-20090520</title><content type='html'>Although work has been fairly busy lately, I still want to brush up on algorithms.&lt;br /&gt;1. Algorithms&lt;br /&gt;   Work through USACO problems (at least three per week)&lt;br /&gt;2. Studying compilers&lt;br /&gt;   Keep reading the spidermonkey code&lt;br /&gt;   Keep reading Engineering a Compiler&lt;br /&gt;Plus English listening practice; it feels like a lot.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' 
src='https://blogger.googleusercontent.com/tracker/5802665338436118579-1675665020708593338?l=mocibb.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mocibb.blogspot.com/feeds/1675665020708593338/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5802665338436118579&amp;postID=1675665020708593338' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5802665338436118579/posts/default/1675665020708593338'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5802665338436118579/posts/default/1675665020708593338'/><link rel='alternate' type='text/html' href='http://mocibb.blogspot.com/2009/05/20090520.html' title='Plans-20090520'/><author><name>mocibb</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5802665338436118579.post-7888846837591774555</id><published>2009-03-09T16:33:00.021+09:00</published><updated>2009-03-10T19:00:18.307+09:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>Be careful when using os.walk</title><content type='html'>Today I wrote a Python script that walks a directory tree and searches the contents of the files in it.&lt;br /&gt;The directory traversal uses os.walk, roughly as follows:&lt;br /&gt;&lt;pre class="brush: py"&gt;&lt;br /&gt;import os&lt;br /&gt;&lt;br /&gt;def all_files(path):&lt;br /&gt;    # Yield the full path of every file below path (Python 2 code).&lt;br /&gt;    for root, dirs, files in os.walk(path, topdown=False):&lt;br /&gt;        for name in files:&lt;br /&gt;            yield os.path.join(root, name)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;The caller passes in a path, and the generator yields the files under that directory.&lt;br /&gt;&lt;pre class="brush: py"&gt;&lt;br /&gt;print list(all_files("c:/test"))&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;When I actually ran it, however, some files under the test directory were never returned.&lt;br /&gt;Changing the argument to u"c:/test" made all of them show up.&lt;br /&gt;I looked at the os.walk source: all os.walk does is call listdir and then recurse into each subdirectory.&lt;br /&gt;&lt;br /&gt;&lt;pre 
class="brush: py"&gt;&lt;br /&gt;try:&lt;br /&gt;    # Note that listdir and error are globals in this module due&lt;br /&gt;    # to earlier import-*.&lt;br /&gt;    names = listdir(top)&lt;br /&gt;except error, err:&lt;br /&gt;    if onerror is not None:&lt;br /&gt;        onerror(err)&lt;br /&gt;    return&lt;br /&gt;&lt;br /&gt;dirs, nondirs = [], []&lt;br /&gt;for name in names:&lt;br /&gt;    if isdir(join(top, name)):&lt;br /&gt;        dirs.append(name)&lt;br /&gt;    else:&lt;br /&gt;        nondirs.append(name)&lt;br /&gt;&lt;br /&gt;if topdown:&lt;br /&gt;    yield top, dirs, nondirs&lt;br /&gt;for name in dirs:&lt;br /&gt;    path = join(top, name)&lt;br /&gt;    if followlinks or not islink(path):&lt;br /&gt;        for x in walk(path, topdown, onerror, followlinks):&lt;br /&gt;            yield x&lt;br /&gt;if not topdown:&lt;br /&gt;    yield top, dirs, nondirs&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;After some debugging, the problem turned out to be in the listdir call here.&lt;br /&gt;Next I looked at the listdir implementation (posixmodule.c):&lt;br /&gt;&lt;pre class="brush: cpp; tab-size:4; gutter:true"&gt;&lt;br /&gt;static PyObject *&lt;br /&gt;posix_listdir(PyObject *self, PyObject *args)&lt;br /&gt;{&lt;br /&gt;/* XXX Should redo this putting the (now four) versions of opendir&lt;br /&gt;   in separate files instead of having them all here... */&lt;br /&gt;#if defined(MS_WINDOWS) &amp;amp;&amp;amp; !defined(HAVE_OPENDIR)&lt;br /&gt;&lt;br /&gt;PyObject *d, *v;&lt;br /&gt;HANDLE hFindFile;&lt;br /&gt;BOOL result;&lt;br /&gt;WIN32_FIND_DATA FileData;&lt;br /&gt;char namebuf[MAX_PATH+5]; /* Overallocate for \\*.*\0 */&lt;br /&gt;char *bufptr = namebuf;&lt;br /&gt;Py_ssize_t len = sizeof(namebuf)-5; /* only claim to have space for MAX_PATH */&lt;br /&gt;&lt;br /&gt;#ifdef Py_WIN_WIDE_FILENAMES&lt;br /&gt;/* If on wide-character-capable OS see if argument&lt;br /&gt;   is Unicode and if so use wide API.  
*/&lt;br /&gt;if (unicode_file_names()) {&lt;br /&gt; PyObject *po;&lt;br /&gt; if (PyArg_ParseTuple(args, "U:listdir", &amp;amp;po)) {&lt;br /&gt;  ......&lt;br /&gt;}&lt;br /&gt;#endif&lt;br /&gt;&lt;br /&gt;if (!PyArg_ParseTuple(args, "et#:listdir",&lt;br /&gt;                      Py_FileSystemDefaultEncoding, &amp;amp;bufptr, &amp;amp;len))&lt;br /&gt; return NULL;&lt;br /&gt;if (len &gt; 0) {&lt;br /&gt; char ch = namebuf[len-1];&lt;br /&gt; if (ch != SEP &amp;amp;&amp;amp; ch != ALTSEP &amp;amp;&amp;amp; ch != ':')&lt;br /&gt;  namebuf[len++] = '/';&lt;br /&gt;}&lt;br /&gt;strcpy(namebuf + len, "*.*");&lt;br /&gt;&lt;br /&gt;if ((d = PyList_New(0)) == NULL)&lt;br /&gt; return NULL;&lt;br /&gt;&lt;br /&gt;hFindFile = FindFirstFile(namebuf, &amp;amp;FileData);&lt;br /&gt;......&lt;br /&gt;#endif /* which OS */&lt;br /&gt;}  /* end of posix_listdir */&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;The code above has been trimmed: I removed the preprocessor branches for other operating systems and kept only the Windows-related part.&lt;br /&gt;Depending on whether the path passed in is a byte string or unicode, a different branch is executed.&lt;br /&gt;If a byte string is passed, execution continues from line 26 (the PyArg_ParseTuple call with "et#:listdir").&lt;br /&gt;The path is processed first.&lt;br /&gt;If the path does not end with a separator (\, / or :), a / is appended, followed by *.*&lt;br /&gt;So if c:/test is passed in, it becomes c:/test/*.* after processing.&lt;br /&gt;That looks fine, but in some character sets (for example shift_jis), \ is not the only character that ends with the byte 0x5C; some multi-byte characters end with it too.&lt;br /&gt;For example, in shift_jis:&lt;br /&gt;&gt;&gt;&gt; '表'&lt;br /&gt;'\x95\\'&lt;br /&gt;So a directory whose name ends with '表' does not become&lt;br /&gt;'表/*.*' after processing, but '表*.*'.&lt;br /&gt;What gets listed is then not the contents of that directory but the directory itself!&lt;br /&gt;listdir returns normally; the result is just wrong.&lt;br /&gt;That result is handed back to os.walk, so the output of os.walk is wrong as well.&lt;br /&gt;&lt;br /&gt;Conclusion:&lt;br /&gt;When using os.walk, always pass a unicode path.</content><link rel='replies' type='application/atom+xml' href='http://mocibb.blogspot.com/feeds/7888846837591774555/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5802665338436118579&amp;postID=7888846837591774555' title='0 Comments'/><link 
rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5802665338436118579/posts/default/7888846837591774555'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5802665338436118579/posts/default/7888846837591774555'/><link rel='alternate' type='text/html' href='http://mocibb.blogspot.com/2009/03/alerthello-world-alerthello-world.html' title='Be careful when using os.walk'/><author><name>mocibb</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5802665338436118579.post-856201587272501735</id><published>2008-07-15T15:49:00.000+09:00</published><updated>2008-07-15T17:53:34.758+09:00</updated><title type='text'>A few notes on treetop</title><content type='html'>&lt;div class="notextitle"&gt;treetop is a PEG (parsing expression grammar) parser generator for Ruby.&lt;/div&gt;&lt;div class="notextext"&gt;1. PEG differs from CFG; I see two main differences.&lt;br /&gt;(1) It is simpler to use than a CFG and may be faster.&lt;br /&gt;Another advantage is that token scanning and the grammar are written together.&lt;br /&gt;(2) Matching depends on order&lt;br /&gt;A/B is not the same as B/A, so take care here.&lt;br /&gt;The vast majority of grammars can be parsed with PEG.&lt;br /&gt;When I first started and something went wrong, I always suspected that some grammars simply could not be parsed with PEG,&lt;br /&gt;but in the end the cause was always the order of the rules I had written, so please trust PEG :).&lt;br /&gt;&lt;br /&gt;2. Points to watch when using it&lt;br /&gt;(1) treetop has no way to match case-insensitively; one workaround is to write the rule in the following form.&lt;br /&gt;&lt;pre&gt;rule select_keyword&lt;br /&gt; [Ss] [Ee] [Ll] [Ee] [Cc] [Tt]&lt;br /&gt;end&lt;br /&gt;&lt;/pre&gt;  (2) Left recursion&lt;br /&gt;The PEG parsing algorithm cannot handle left recursion well, so we have to rewrite such rules by hand.&lt;br /&gt;&lt;br /&gt;(3) Ordering&lt;br /&gt;If the expressions on both sides of a / can match,&lt;br /&gt;put the longer one first, then the shorter one.&lt;br /&gt;&lt;br /&gt;(4) Matching a keyword itself, not a word that merely starts with the keyword&lt;br /&gt;&lt;pre&gt;rule end_keyword&lt;br /&gt;'end' !(!' 
' .)&lt;br /&gt;end&lt;br /&gt;&lt;/pre&gt; (5) Matching identifiers&lt;br /&gt;&lt;pre&gt;rule identifier&lt;br /&gt; (!keyword name) / (keyword name)&lt;br /&gt;end&lt;br /&gt;rule name&lt;br /&gt; [a-zA-Z]+&lt;br /&gt;end&lt;br /&gt;&lt;/pre&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://mocibb.blogspot.com/feeds/856201587272501735/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5802665338436118579&amp;postID=856201587272501735' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5802665338436118579/posts/default/856201587272501735'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5802665338436118579/posts/default/856201587272501735'/><link rel='alternate' type='text/html' href='http://mocibb.blogspot.com/2008/07/treetop.html' title='A few notes on treetop'/><author><name>mocibb</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5802665338436118579.post-6866093114566876065</id><published>2007-04-07T23:11:00.000+09:00</published><updated>2007-04-07T23:19:32.090+09:00</updated><title type='text'>Accessing blogspot on Ubuntu</title><content type='html'>I found a method online for reaching blogspot, which is blocked.&lt;br /&gt;&lt;br /&gt;1. 
Edit the /etc/hosts file&lt;br /&gt;and add&lt;br /&gt;72.14.219.190   mocibb.blogspot.com&lt;br /&gt;&lt;br /&gt;2. Create proxy.pac and set it as the automatic proxy configuration script in firefox.&lt;br /&gt;&lt;pre&gt;function FindProxyForURL(url,host){&lt;br /&gt;   // Send blogspot hosts through the proxy.&lt;br /&gt;   if(dnsDomainIs(host, ".blogspot.com")){&lt;br /&gt;       return "PROXY 72.14.219.190:80";&lt;br /&gt;   }&lt;br /&gt;   // Everything else connects directly.&lt;br /&gt;   return "DIRECT";&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;Reference:&lt;br /&gt;1.http://my.opera.com/fermi/blog/2007-03-22-how-to-visit-the-banned-blogspot</content><link rel='replies' type='application/atom+xml' href='http://mocibb.blogspot.com/feeds/6866093114566876065/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5802665338436118579&amp;postID=6866093114566876065' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5802665338436118579/posts/default/6866093114566876065'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5802665338436118579/posts/default/6866093114566876065'/><link rel='alternate' type='text/html' href='http://mocibb.blogspot.com/2007/04/ubuntublogspot.html' title='Accessing blogspot on Ubuntu'/><author><name>mocibb</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5802665338436118579.post-1159467738718123341</id><published>2007-04-07T22:10:00.000+09:00</published><updated>2007-04-07T22:33:10.447+09:00</updated><title type='text'>Installing python on Ubuntu</title><content type='html'>Today I installed python on my machine; here is a summary of the steps.&lt;br /&gt;&lt;br /&gt;Ubuntu 6.10 already ships with python installed by default, version 2.4.&lt;br /&gt;Be careful not to uninstall the default 2.4, because some packages may depend on that specific version of python.&lt;br /&gt;&lt;br /&gt;1. 
Unpack the source&lt;br /&gt;&gt;$mkdir ~/python&lt;br /&gt;&gt;$cp Python-2.5.tgz ~/python&lt;br /&gt;&gt;$cd ~/python&lt;br /&gt;&gt;$tar -zxvf Python-2.5.tgz&lt;br /&gt;&lt;br /&gt;Unpacking creates the directory ~/python/Python-2.5&lt;br /&gt;2. Build and install&lt;br /&gt;Building python requires libc6-dev&lt;br /&gt;&gt;$sudo apt-get install libc6-dev&lt;br /&gt;&gt;$cd ~/python/Python-2.5&lt;br /&gt;&gt;$./configure&lt;br /&gt;&gt;$make&lt;br /&gt;&gt;$sudo make install&lt;br /&gt;The default install prefix is /usr/local; a different one can be chosen by passing --prefix with the install directory to ./configure.</content><link rel='replies' type='application/atom+xml' href='http://mocibb.blogspot.com/feeds/1159467738718123341/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5802665338436118579&amp;postID=1159467738718123341' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5802665338436118579/posts/default/1159467738718123341'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5802665338436118579/posts/default/1159467738718123341'/><link rel='alternate' type='text/html' href='http://mocibb.blogspot.com/2007/04/ubuntupython.html' title='Installing python on Ubuntu'/><author><name>mocibb</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry></feed>
