ruminations

code, math, life

Archive for July 4th, 2008

dsage news


It’s been a while since I’ve blogged about dsage so here’s a big braindump on what’s been happening…

I was really happy with the amount of activity and interest in distributed computing at Sage Devel Days 1. I think the major participants were William Stein, Glenn Tarbox and Bill Furnish.

As the week went on it became clear that dsage as it stands right now does not fulfill the needs of some (maybe even many, or most) users. As a result, I am now aware of several “next gen” dsage proposals, such as dsageNG and some stuff that Bill wrote (I think by the time dev1 was over, Bill was at dsage4 ;-)). Bill and I chatted a bit about the current architecture of dsage, and the major problem he saw was that dsage workers poll for jobs, so there could be a significant delay (i.e., seconds) between when a new job arrived and when the workers started processing it.

I thought about this for a while and I think that he’s absolutely right. Therefore, I’ve rewritten parts of the worker/server code so now workers will start on a job immediately. This should bring down the overhead of running dsage jobs considerably. 

The rewrite is also more in the spirit of Twisted, since it now uses Twisted’s asynchronous process communication. It was actually surprisingly easy to write a worker pool using the existing tools in the framework.
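To make the polling problem concrete, here is a minimal sketch of the alternative: a worker that blocks until a job arrives instead of checking for work on a timer. It is not dsage code and it does not use Twisted; it swaps in the standard library's threading and queue modules, and the names worker, jobs and results are made up for the illustration.

```python
import queue
import threading
import time


def worker(jobs, results):
    """Run jobs as they arrive; there is no polling interval to wait out."""
    while True:
        job = jobs.get()      # blocks, and wakes up as soon as a job is put
        if job is None:       # sentinel value: shut the worker down
            break
        func, args, submitted = job
        # Record the result and the time elapsed since submission.
        results.put((func(*args), time.monotonic() - submitted))


def square(n):
    return n * n


jobs = queue.Queue()
results = queue.Queue()
thread = threading.Thread(target=worker, args=(jobs, results))
thread.start()

# Submit one job together with its submission timestamp.
jobs.put((square, (25,), time.monotonic()))
value, latency = results.get(timeout=5)

jobs.put(None)                # ask the worker to exit
thread.join()
print(value, latency)
```

A polling worker would instead sleep for a fixed interval between checks, so a job could wait up to that whole interval before anything picks it up.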

I also added a convenience function called eval_function() that allows you to submit a live function as a job. This works for any function that can be pickled (its arguments must be picklable as well, of course). For example:

sage: def f(n):
...     return n*n
...
sage: j = d.eval_function(f, ((25,),{}), job_name='square')
sage: j
625
sage: print j.wall_time
0:00:00.144780

This is much, much faster than it was before the workers were rewritten.
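Since eval_function() only works with things that can be pickled, it can help to check that before submitting a job. The sketch below assumes the standard pickle module; is_picklable is a hypothetical helper, not part of dsage. Note that pickle stores a function by reference (its module and name), so anonymous functions such as a lambda cannot be pickled.

```python
import pickle


def is_picklable(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


# The argument structure from the example above: (args, kwargs).
job_args = ((25,), {})
ok_args = is_picklable(job_args)

# A named built-in function pickles by reference to its module and name.
ok_builtin = is_picklable(len)

# A lambda has no importable name, so pickle refuses it.
ok_lambda = is_picklable(lambda n: n * n)

print(ok_args, ok_builtin, ok_lambda)
```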

Having eval_function also makes it really easy to do the map part of map reduce (it’s also referred to as scatter/gather). For example:

sage: jobs = d.map(f, [10..20])
sage: jobs
[100, 121, 144, 169, 196, 225, 256, 289, 324, 361, 400]
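For readers without a dsage server at hand, the same scatter/gather pattern can be sketched in plain Python. This is only an illustration: it assumes a local thread pool from concurrent.futures in place of dsage workers, and scatter_gather is a made-up name.

```python
from concurrent.futures import ThreadPoolExecutor


def f(n):
    return n * n


def scatter_gather(func, inputs, max_workers=4):
    """Scatter the inputs over a pool, then gather results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(func, inputs))


# range(10, 21) is the plain-Python equivalent of Sage's [10..20].
squares = scatter_gather(f, range(10, 21))
print(squares)
```

The results come back in the same order as the inputs, which matches the output of d.map above.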

Also, dsage now supports the new @parallel convenience decorator that William wrote:

sage: P = parallel(p_iter = d.parallel_iter)
sage: @P
....: def f(n,m):
....:     return n+m
....:
sage: f([(1,2), (5, 10/3)])
[((1, 2), 3), ((5, 10/3), 25/3)]
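The shape of that output, a list of (arguments, result) pairs, can be reproduced with a small decorator. The sketch below is not William's implementation: parallel_sketch is a hypothetical stand-in that runs the calls on a local thread pool, and Fraction replaces Sage's exact rationals.

```python
from concurrent.futures import ThreadPoolExecutor
from fractions import Fraction


def parallel_sketch(func):
    """Hypothetical stand-in for @parallel: call func on each argument tuple."""
    def wrapper(arg_tuples):
        arg_tuples = list(arg_tuples)
        with ThreadPoolExecutor(max_workers=2) as pool:
            # Unpack each tuple into positional arguments for func.
            results = list(pool.map(lambda args: func(*args), arg_tuples))
        # Pair every argument tuple with the value computed for it.
        return list(zip(arg_tuples, results))
    return wrapper


@parallel_sketch
def add(n, m):
    return n + m


pairs = add([(1, 2), (5, Fraction(10, 3))])
print(pairs)
```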

William and I both agree that the strategy for Sage should be to have something that is very fast on local multicore machines (the multiprocessing module comes to mind) while also having something that will work both on local clusters and WANs (dsage).

If anyone is interested in helping me to test out the new version and making it more robust, please let me know!

I’m hoping that this stuff will make it into the next major release of Sage, since I will be going on vacation for 6 weeks (hooray) on July 14th.

Written by Yi Qiang

July 4th, 2008 at 3:44 pm

Posted in sage


publishing your dotfiles


I discovered freehg.org today, which is a free Mercurial repo hosting site. In spirit it’s very akin to GitHub, but obviously lacks many of the features and much of the polish. However, for this particular purpose, it’s more than sufficient.

If you access many *nix machines on a regular basis, you’ve probably been annoyed that your custom configuration files are not immediately available. I used to scp config files around all the time, but that got old quick. I’m going to show you my current setup for making sure that all the dotfiles (zshrc, vimrc, etc.) that I use are version controlled and easily accessible.

First, collect all your dotfiles in one directory and make it an hg repository. I use ~/.dotfiles, but you can use whatever you like. Here is what my .dotfiles looks like:

iapetus:~/.dotfiles> ls -l
total 54k
-rw-r--r-- 1 yqiang staff   85 2008-07-04 10:20 ackrc
-rw-r--r-- 1 yqiang staff   50 2008-05-29 18:24 bash_profile
-rw-r--r-- 1 yqiang staff 2.0k 2008-05-29 18:24 bashrc
-rw-r--r-- 1 yqiang staff  569 2008-07-04 14:09 create_symlinks.py
drwx------ 3 yqiang staff  102 2008-06-13 15:52 gtk-2.0
-rw-r--r-- 1 yqiang staff  624 2008-07-04 09:48 gvimrc
-rw-r--r-- 1 yqiang staff   48 2008-06-17 15:46 hgignore
-rw-r--r-- 1 yqiang staff  454 2008-07-04 10:30 hgrc
drwx------ 5 yqiang staff  170 2008-06-29 17:55 irssi
-rw-r--r-- 1 yqiang staff  403 2008-06-22 11:20 pdbrc
-rw-r--r-- 1 yqiang staff  642 2008-06-29 10:49 screenrc
-rw-r--r-- 1 yqiang staff 6.8k 2008-07-04 13:39 vimrc
-rw-r--r-- 1 yqiang staff 5.8k 2008-06-29 17:43 zshrc

Then create a freehg.org account and initialize a public repo there. You can find mine at:

http://freehg.org/u/yqiang/dotfiles/

Now, when you access a new machine, to get all your dotfiles in order, just do:

veritas:~ > hg clone http://freehg.org/u/yqiang/dotfiles/ .dotfiles
requesting all changes
adding changesets
adding manifests
adding file changes
added 22 changesets with 37 changes to 15 files
15 files updated, 0 files merged, 0 files removed, 0 files unresolved
veritas:~ > cd .dotfiles
veritas:~/.dotfiles > python create_symlinks.py
Symlinking /home/yqiang/.dotfiles/zshrc to /home/yqiang/.zshrc
Symlinking /home/yqiang/.dotfiles/gvimrc to /home/yqiang/.gvimrc
Symlinking /home/yqiang/.dotfiles/bash_profile to /home/yqiang/.bash_profile
Symlinking /home/yqiang/.dotfiles/hgignore to /home/yqiang/.hgignore
Symlinking /home/yqiang/.dotfiles/bashrc to /home/yqiang/.bashrc
Symlinking /home/yqiang/.dotfiles/hgrc to /home/yqiang/.hgrc
Symlinking /home/yqiang/.dotfiles/pdbrc to /home/yqiang/.pdbrc
Symlinking /home/yqiang/.dotfiles/ackrc to /home/yqiang/.ackrc
Symlinking /home/yqiang/.dotfiles/screenrc to /home/yqiang/.screenrc
Symlinking /home/yqiang/.dotfiles/vimrc to /home/yqiang/.vimrc

create_symlinks.py is a simple Python script that will create the symlinks for you. Here is the code for it:

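#!/usr/bin/env python
"""Symlink every file in ~/.dotfiles to a dot-prefixed name in $HOME."""
import os

home = os.path.abspath(os.environ['HOME'])
path = os.path.join(home, '.dotfiles')
# Entries in the repository that should not be linked into $HOME.
excludes = ['gtk-2.0', 'create_symlinks.py']

for f in os.listdir(path):
    # Skip hidden entries such as the repository's own .hg directory.
    if f.startswith('.'):
        continue
    if f not in excludes:
        dst = os.path.join(home, '.' + f)
        # Build the source from the repository path rather than the current
        # directory, so the script works no matter where it is run from.
        src = os.path.join(path, f)
        try:
            print("Symlinking %s to %s" % (src, dst))
            os.symlink(src, dst)
        except OSError as msg:
            print("Failed to symlink %s to %s" % (src, dst))
            print(msg)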

Tada. All your config files are in place now. If you’re really into it, you can run a cron script that automatically does an hg pull so you don’t even have to think about it.
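As a rough sketch of what such a cron job could run, the script below wraps hg pull with its --update flag so the working copy is refreshed as well. The file name pull_dotfiles.py, the helper names and the crontab schedule in the comment are all assumptions made for the example.

```python
#!/usr/bin/env python
# Hypothetical pull_dotfiles.py; an example crontab entry (every 30 minutes):
#   */30 * * * * python /path/to/pull_dotfiles.py
import os
import subprocess


def build_pull_command(repo_dir):
    """Build the hg command that pulls new changesets and updates the files."""
    return ["hg", "--repository", repo_dir, "pull", "--update"]


def pull_dotfiles(repo_dir):
    """Run the pull; return hg's exit code, or None if hg is not installed."""
    try:
        return subprocess.call(build_pull_command(repo_dir))
    except OSError:
        return None


repo = os.path.join(os.path.expanduser("~"), ".dotfiles")
command = build_pull_command(repo)

# Only pull when run as a script and the clone actually exists.
if __name__ == "__main__" and os.path.isdir(repo):
    pull_dotfiles(repo)
```

Because the dotfiles in your home directory are symlinks into the repository, an updated working copy takes effect without rerunning create_symlinks.py, unless a new file was added.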

Written by Yi Qiang

July 4th, 2008 at 2:40 pm

Posted in linux, sage
