I just started my new job at Microsoft in Mountain View. I’m a program manager (sounds fancier than it actually is) on the Mac team here. It’s been a pretty exciting month, arriving in San Francisco with no idea where I’d be living. Luckily I was able to find a gorgeous townhouse in Potrero Hill, so I’m taking the Caltrain to work every day. I’ll post pictures of my new pad + workplace soon.
A pretty big rewrite of the worker part of dsage has now landed in trac. It’s ticket #3600 on the Sage trac and is eagerly awaiting review. Here’s a quick rundown of the major changes:
1. Workers no longer poll the server for new jobs.
It’s been a while since I’ve blogged about dsage so here’s a big braindump on what’s been happening…
I was really happy with the amount of activity and interest in distributed computing at Sage Devel Days 1. I think the major participants were William Stein, Glenn Tarbox and Bill Furnish.
As the week went on it became clear that dsage as it stands right now does not fulfill the needs of some (maybe even many, or most) users. As a result, I am now aware of several “next gen” dsage proposals such as dsageNG and some stuff that Bill wrote (I think by the time dev1 was over, Bill was at dsage4).
Bill and I chatted a bit about the current architecture of dsage, and the major problem he saw was that dsage workers used polling, so there could be a significant (i.e., seconds) delay between when a new job arrived and when it was processed by the workers.
I thought about this for a while and I think that he’s absolutely right. Therefore, I’ve rewritten parts of the worker/server code so now workers will start on a job immediately. This should bring down the overhead of running dsage jobs considerably.
The rewrite is also more Twisted, since it uses Twisted’s async process communication. It was actually surprisingly easy to write a worker pool using the existing tools in the framework.
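To make the difference between polling and immediate dispatch concrete, here is a minimal sketch. It is not dsage's code and does not use Twisted: a standard-library queue stands in for the server, and the names `worker`, `jobs` and `results` are made up for illustration. The point is only that a worker blocked on the queue wakes up as soon as a job arrives, instead of waiting for its next poll.

```python
import queue
import threading
import time


def worker(jobs, results):
    """Run jobs as soon as they arrive (hypothetical stand-in for a dsage worker)."""
    while True:
        job = jobs.get()  # blocks until a job is pushed; there is no poll interval
        if job is None:   # sentinel value: shut the worker down
            break
        submitted_at, func, args = job
        results.put((func(*args), time.monotonic() - submitted_at))


jobs, results = queue.Queue(), queue.Queue()
thread = threading.Thread(target=worker, args=(jobs, results))
thread.start()

# "Submit" a job and measure how long it took to come back.
jobs.put((time.monotonic(), pow, (5, 2)))
value, latency = results.get(timeout=5)

jobs.put(None)
thread.join()
print(value, latency)
```

With a polling worker, the latency would include whatever is left of the poll interval; here it is only the cost of waking the thread and running the job.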
I also added a convenience function called eval_function() that allows you to submit a live function as a job. This works for any function that can be pickled (its arguments must be picklable as well, of course). For example:
sage: def f(n):
...       return n*n
...
sage: j = d.eval_function(f, ((25,),{}), job_name='square')
sage: j
625
sage: print j.wall_time
0:00:00.144780
This is much, much faster than it was before the worker rewrite.
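The "can be pickled" requirement is worth a quick check before submitting a job. The sketch below uses only the standard-library pickle module; `is_picklable` is a hypothetical helper, not part of dsage, and dsage's own serialization may accept more or less than plain pickle does.

```python
import pickle


def square(n):
    return n * n


def is_picklable(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


print(is_picklable(square))            # a named, module-level function
print(is_picklable(lambda n: n * n))   # an anonymous function
print(is_picklable(((25,), {})))       # the (args, kwargs) pair from the example
```

Plain pickle stores a function as a reference to its module and name, which is why a named top-level function run from a script goes through while a lambda does not.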
Having eval_function also makes it really easy to do the map part of map-reduce (also referred to as scatter/gather). For example:
sage: jobs = d.map(f, [10..20])
sage: jobs
[100, 121, 144, 169, 196, 225, 256, 289, 324, 361, 400]
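For readers who have not seen scatter/gather before, the same pattern can be sketched locally. This is an illustration only: it uses a thread pool from the standard library rather than dsage workers, and `scatter_gather` is a made-up name.

```python
from concurrent.futures import ThreadPoolExecutor


def f(n):
    return n * n


def scatter_gather(func, inputs, max_workers=4):
    """Submit one job per input (scatter), then collect results in input order (gather)."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(func, x) for x in inputs]
        return [future.result() for future in futures]


print(scatter_gather(f, range(10, 21)))
# [100, 121, 144, 169, 196, 225, 256, 289, 324, 361, 400]
```

Gathering in submission order keeps the results lined up with the inputs even when jobs finish out of order.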
Also, dsage now supports the new @parallel convenience decorator that William wrote:
sage: P = parallel(p_iter = d.parallel_iter)
sage: @P
....: def f(n,m):
....:     return n+m
....:
sage: f([(1,2), (5, 10/3)])
[((1, 2), 3), ((5, 10/3), 25/3)]
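The shape of that output (each argument tuple paired with its result) is easy to reproduce in a toy decorator. The sketch below is not William's implementation and ignores the p_iter argument entirely; `parallel_sketch` is a hypothetical name, a thread pool stands in for the workers, and Fraction stands in for Sage's exact rationals.

```python
from concurrent.futures import ThreadPoolExecutor
from fractions import Fraction


def parallel_sketch(func):
    """Turn func(*args) into a function that maps over a list of argument tuples."""
    def run_all(arg_tuples):
        with ThreadPoolExecutor() as pool:
            results = list(pool.map(lambda args: func(*args), arg_tuples))
        # Pair every argument tuple with the value computed for it.
        return list(zip(arg_tuples, results))
    return run_all


@parallel_sketch
def add(n, m):
    return n + m


pairs = add([(1, 2), (5, Fraction(10, 3))])
print(pairs)
```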
William and I both agree that the strategy for Sage should be to have something that is very fast on local multicore machines (multiprocessing module comes to mind) while also having something that will work both on local clusters and WANs (dsage).
If anyone is interested in helping me to test out the new version and making it more robust, please let me know!
I’m hoping that this stuff will make it into the next major release of Sage since I will be going on vacation for 6 weeks (hooray) on July 14th.
I discovered freehg.org today, which is a free Mercurial repo hosting site. In spirit it’s very akin to github, but obviously lacks many of the features and much of the polish. However, for this particular purpose, it’s more than sufficient.
If you access many *nix machines on a regular basis, you’ve probably been annoyed that your custom configuration files are not immediately available. I used to scp config files around all the time, but that got old quick. I’m going to show you my current setup for making sure that all the dotfiles (zshrc, vimrc, etc.) that I use are version controlled and easily accessible.
First, collect all your dotfiles in one directory and make it an hg repository. I use ~/.dotfiles, but you can use whatever you like. Here is what my .dotfiles looks like:
iapetus:~/.dotfiles> ls -l
total 54k
-rw-r--r-- 1 yqiang staff   85 2008-07-04 10:20 ackrc
-rw-r--r-- 1 yqiang staff   50 2008-05-29 18:24 bash_profile
-rw-r--r-- 1 yqiang staff 2.0k 2008-05-29 18:24 bashrc
-rw-r--r-- 1 yqiang staff  569 2008-07-04 14:09 create_symlinks.py
drwx------ 3 yqiang staff  102 2008-06-13 15:52 gtk-2.0
-rw-r--r-- 1 yqiang staff  624 2008-07-04 09:48 gvimrc
-rw-r--r-- 1 yqiang staff   48 2008-06-17 15:46 hgignore
-rw-r--r-- 1 yqiang staff  454 2008-07-04 10:30 hgrc
drwx------ 5 yqiang staff  170 2008-06-29 17:55 irssi
-rw-r--r-- 1 yqiang staff  403 2008-06-22 11:20 pdbrc
-rw-r--r-- 1 yqiang staff  642 2008-06-29 10:49 screenrc
-rw-r--r-- 1 yqiang staff 6.8k 2008-07-04 13:39 vimrc
-rw-r--r-- 1 yqiang staff 5.8k 2008-06-29 17:43 zshrc
Then create a freehg.org account and initialize a public repo there. You can find mine at:
http://freehg.org/u/yqiang/dotfiles/
Now, when you access a new machine, to get all your dotfiles in order, just do:
veritas:~ > hg clone http://freehg.org/u/yqiang/dotfiles/ .dotfiles
requesting all changes
adding changesets
adding manifests
adding file changes
added 22 changesets with 37 changes to 15 files
15 files updated, 0 files merged, 0 files removed, 0 files unresolved
veritas:~ > cd .dotfiles
veritas:~/.dotfiles > python create_symlinks.py
Symlinking /home/yqiang/.dotfiles/zshrc to /home/yqiang/.zshrc
Symlinking /home/yqiang/.dotfiles/gvimrc to /home/yqiang/.gvimrc
Symlinking /home/yqiang/.dotfiles/bash_profile to /home/yqiang/.bash_profile
Symlinking /home/yqiang/.dotfiles/hgignore to /home/yqiang/.hgignore
Symlinking /home/yqiang/.dotfiles/bashrc to /home/yqiang/.bashrc
Symlinking /home/yqiang/.dotfiles/hgrc to /home/yqiang/.hgrc
Symlinking /home/yqiang/.dotfiles/pdbrc to /home/yqiang/.pdbrc
Symlinking /home/yqiang/.dotfiles/ackrc to /home/yqiang/.ackrc
Symlinking /home/yqiang/.dotfiles/screenrc to /home/yqiang/.screenrc
Symlinking /home/yqiang/.dotfiles/vimrc to /home/yqiang/.vimrc
create_symlinks.py is a simple Python script that will create the symlinks for you. Here is the code for it:
#!/usr/bin/env python
import os

home = os.path.abspath(os.environ['HOME'])
path = os.path.join(home, '.dotfiles')
# Entries in the repository that should not be linked into $HOME.
excludes = ['gtk-2.0', 'create_symlinks.py']

for f in os.listdir(path):
    # Skip hidden entries such as the .hg directory.
    if f.startswith('.'):
        continue
    if f not in excludes:
        dst = os.path.join(home, '.' + f)
        # Resolve against the repository, not the current working directory.
        src = os.path.join(path, f)
        try:
            print("Symlinking %s to %s" % (src, dst))
            os.symlink(src, dst)
        except Exception as msg:
            print("Failed to symlink %s to %s " % (src, dst))
            print(msg)
Tada. All your config files are in place now. If you’re really into it, you can set up a cron job that automatically does an hg pull so you don’t even have to think about it.
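One way to script that pull is sketched below. It assumes hg is on the PATH and the repository lives in ~/.dotfiles; the script name pull_dotfiles.py, the helper names and the --run switch are all made up for this example. It uses hg's global -R option to select the repository and pull -u to update the working copy after pulling.

```python
import os
import subprocess
import sys


def pull_command(repo_dir):
    """Build the hg command that pulls and updates the given repository."""
    return ["hg", "-R", repo_dir, "pull", "-u"]


def pull_dotfiles(repo_dir):
    """Run the pull and return hg's exit status."""
    return subprocess.call(pull_command(repo_dir))


# Only touch the repository when explicitly asked to (hypothetical --run switch).
if __name__ == "__main__" and "--run" in sys.argv:
    sys.exit(pull_dotfiles(os.path.expanduser("~/.dotfiles")))
```

A crontab entry that runs `python ~/.dotfiles/pull_dotfiles.py --run` once an hour would keep a machine current; new files still need a rerun of create_symlinks.py before they show up in your home directory.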
I haven’t looked at sage -h for a while and was surprised to see how many useful convenience features have been added. I will highlight a few that I use constantly and that make Sage development more convenient.
sage -b [branch] -- switch to and build SAGE branch in devel/sage-branch
sage -br [branch] -- switch to, build, and run SAGE branch in devel/sage-branch
sage -clone [new branch] -- clone and run a new branch of the SAGE library from current branch
sage -python -- run the python interpreter
sage -sh -- run $SHELL (/opt/local/bin/zsh) with SAGE environment variables set
sage -t [-optional] [-verbose] [-long] -- test examples in .py, .pyx, .sage or .tex files
-optional -- include examples with 'optional' and 'package'
-long -- include lines with the phrase 'long time'
-verbose -- print debuging output during the test
In particular, sage -sh is really useful for setting all the shell variables.
You can get a list of all the command line options by doing
sage -advanced
If you’re like me and are curious about everything on your system, you might like this tip:
http://www.tipstrs.com/tip/1821/Fix–home-directory-after-installing-Leopard
It shows you how to remove the /home directory in your root on a Leopard machine.
Check out how “global” Sage development is here:
http://lite.sagemath.org/devmap.html
This was developed by Harald Schilly. If you’re a Sage developer and want to show up on the map, contact Harald Schilly.
Updated version of Colloquy which fixes Python plugins on Leopard
April 9th, 2008
You can find an updated binary distribution of Colloquy here:
http://yiqiang.org/Colloquy.zip
The only modification is that it is linked using -weak_library so that it uses Python 2.5 if it exists on your machine and falls back to Python 2.3 if you’re using 10.4. This is needed because Python plugins for Colloquy need the pyobjc bridge, which is in Python 2.5 (as shipped with Leopard), but not Python 2.3.
You can find a sample plugin here:
http://yiqiang.org/sage-devel-trac.py
To install it, drop it into
~/Library/Application Support/Colloquy/PlugIns/
and either restart Colloquy or type /reload plugins.
Today the Sage team received some very exciting and encouraging news.
Chris DiBona, who is the Open Source Programs Manager at Google, was able to secure funding for several students to work on Sage this summer. The students and the projects are:
Gary Furnish (Rewrite and Vastly Optimize Symbolic Computation)
Mike Hansen (Combinatorial Species)
Robert Miller (Backtracking Algorithms and Permutation Groups)
Yi Qiang (Distributed Computing with DSage)
More details are in the original proposal:
http://yiqiang.org/google_proposal.pdf
Thanks again to Google and everyone who worked on making this happen!






