EuroPython 2008, Day 1
Posted on July 7th, 2008 in python |
Here are some flashbacks from today’s schedule.
My god it’s full of files by Tommi Virtanen was a talk about the different file system abstractions found throughout the software world and the lack of a Pythonic one. A special emphasis was on the abstraction provided by the various VFS modules, which give developers the luxury of treating ZIP files as if they were just another folder on the hard drive, and similarly abstract away the remoteness of network file systems. There are several implementations, ranging from FUSE with kernel support and KIO that comes with KDE, to GnomeVFS and the newer GVFS from the GNOME project. What surprised Markos was the fact that neither Windows nor OS X has built-in support for a comprehensive VFS abstraction that would allow accessing remote files as if they were on a local file system. Besides network access, another advantage of using a module for file system abstraction is testing: a script could be tested to check that it properly manipulates files in /etc, or a program could be exercised under strange conditions, for example when the disk is full.
The talk was a comprehensive overview of the VFS world, and we found out that Twisted has a broken VFS implementation (as of 2008). But it was only an overview, and the implementation is left to the conference attendees. What was new for me was the Allmydata Tahoe encrypted distributed file system. Hopefully I’ll have some time to see how it performs.
Why I want you to use eggs by Ignas Mikalajūnas was a presentation, or rather a motivational talk, about using Python eggs. I didn’t quite grasp what exactly the advantages were, but yes, eggs had some issues and now they’re supposed to be much better. Be cool and use them.
PyPy status by Maciej Fijalkowski was an excellent talk about the current state of PyPy. The room was full, and PyPy seems to be generating a lot of interest this year. The interface to C is done via the ctypes module, which is a bit slow but is getting better. They’ve got the SQLite bindings working and demoed Django running on pypy-c. Django still relies on one obscure CPython internal quirk, but 1.0 is to be PyPy compatible. Pylons, Twisted and Nevow work, as should most pure Python code. There is, however, a lack of C library bindings, which they can’t really afford to develop if we want to see a PyPy release by the end of the year.
PyPy uses proper garbage collection, not reference counting, so there are some interesting cases where code snippets fail. For example,
open('xxx','w').write('foo')
var = open('xxx').read()
will not work correctly, because without reference counting the file object from the first line is not necessarily collected, and therefore closed and flushed, by the time the second line is executed.
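A minimal sketch of a variant that should not depend on the garbage collector at all: it closes the file explicitly with a with block before reading it back. The helper name write_then_read and the temporary path are made up for illustration, they are not from the talk.

```python
import os
import tempfile


def write_then_read(path, text):
    """Write text to path, then read it back (hypothetical helper)."""
    # Leaving the with block closes and flushes the file right away,
    # whether the interpreter uses reference counting or a tracing GC.
    with open(path, "w") as f:
        f.write(text)
    with open(path) as f:
        return f.read()


# Use a scratch directory so the example does not clobber a real file.
demo_path = os.path.join(tempfile.mkdtemp(), "xxx")
result = write_then_read(demo_path, "foo")
print(result)
```

The point of the design is simply that the lifetime of the file handle is spelled out in the code instead of being left to whenever the object happens to be collected.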
PyPy-C currently takes somewhere between 0.8 and 2 times as long as CPython to run the same code, and they are progressing slowly but steadily. The bets are on the JIT.
What I liked was the fact that the team dumped half of their semi-cool stuff that wasn’t fully working and wasn’t that important for PyPy. By doing that, they might actually manage to release in time.
Building data mashups with SnapLogic by Mike Pittaro. A presentation about SnapLogic, a framework for creating distributed processing pipelines that lets you take different sources, be it a file, a DB or a stream from somewhere, do some processing on them and pass the result on to another process or out of the system. What impressed me was the ability to join database tables with records in files. While the operation is fairly simple, it’s not often implemented. Mike built an example that draws information from different Trac instances and aggregates it in one place, so he can have a better overview of the system.
Data portability and Python by Christian Scholz was a talk about all the semantic buzzwords: OpenID, OAuth, FOAF, XFN, RDF, XRDS, microformats and whatnot. Great for newcomers to the technologies behind social networks. Christian is the author of the pydataportability Python packages and showed us some snippets of the library in action.
Later in the evening came Guido’s keynote about the upcoming Python 3.0 release and its pros and cons, followed by Q&A. Nothing particularly new: Python 3.0 will hopefully deal with the Unicode errors and bring some fresh air into Python.
That about covers it for today.
Coincidence?
Posted on June 4th, 2008 in dovhcajt |
The Linux rt2x00 channel 12 and 13 problem, solved
Posted on May 31st, 2008 in dovhcajt |
The Linux rt2x00 wireless drivers come with channels 12 and 13 disabled by default, because regulations differ around the world. In the US, wireless usage is restricted to channels 1 to 11, while in Europe channels 12 and 13 are also permitted. However, the rt2x00 driver defaults to not supporting the last two channels, and finding a way to enable them requires a fair amount of googling.
To spare others the pain: the solution to the rt2x00 channel 12 and 13 problem is to make the new wireless stack, mac80211, aware of the fact that you are not a resident of the US. The rt2x00 driver will then show the additional channels as available if you run ‘iwlist ra0 freq’.
A replacement for bzip2 compression
Posted on May 15th, 2008 in debian, linux |
I was looking for a replacement for the old and slow bzip2. Not to say it doesn’t work, it’s just too slow for my use case, and gzip just doesn’t bring enough space savings. And, after all, why settle for less than you’re able to achieve?
So I checked out the Linux compression utilities against my requirements:
- it should be included in Debian repositories
- it should compress approximately as well as bzip2 does in a similar time, but should decompress faster
- if possible, it should be free software
The only real contestant is 7-zip, which uses the super efficient LZMA algorithm for compression. It can be quite slow, though. So the idea is to fine-tune the utility so that the archive takes at most the space bzip2 would need, and decompresses faster. p7zip, the Unix port, has compression settings similar to gzip’s, ranging from 1 to 9. I tested some of them to find the optimal settings for my use case and made some benchmarks. I used three different test files, all of which were tar files, but with different contents. Test case 1 was 40MB of text, test case 2 was about 200MB of a recent Haiku OS image, and test case 3 was essentially a 70MB bunch of Java JAR files.
This is how the archives compressed. Numbers are normalized to bzip2, for comparison.

Time needed for compression:

Time needed for decompression:

You can see that 7zip always decompresses faster, and that in general, higher 7z compression makes the archive decompress faster. Interesting.
Some more info:
- 7zip was Debian package p7zip-full 4.57~dfsg.1-1
- bzip2 was Debian package 1.0.5-0.1
- test machine was 2.16GHz Macbook with 2GB RAM, doing only the tests
- frequency scaling was off
- all files were first cached in RAM by doing “cat file > /dev/null” so disk I/O was not the bottleneck
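For anyone who wants to repeat a comparison like this, here is a rough sketch of the measuring and normalizing steps. Big assumption: it uses Python's standard bz2 and lzma modules (xz-style LZMA with presets) rather than the bzip2 and p7zip command line tools tested above, so the numbers will not match the ones in this post. The function names benchmark and normalize and the sample data are made up for illustration.

```python
import bz2
import lzma
import time


def benchmark(data, lzma_presets=(1, 5, 9)):
    """Measure size, compression and decompression time per compressor."""
    # Assumption: stdlib codecs stand in for the bzip2 and p7zip tools.
    codecs = {"bzip2": (lambda d: bz2.compress(d, 9), bz2.decompress)}
    for preset in lzma_presets:
        codecs["lzma-%d" % preset] = (
            lambda d, p=preset: lzma.compress(d, preset=p),
            lzma.decompress,
        )
    results = {}
    for name, (compress, decompress) in codecs.items():
        t0 = time.perf_counter()
        packed = compress(data)
        t1 = time.perf_counter()
        unpacked = decompress(packed)
        t2 = time.perf_counter()
        if unpacked != data:
            raise ValueError("round trip failed for " + name)
        results[name] = {
            "size": len(packed),
            "compress_s": t1 - t0,
            "decompress_s": t2 - t1,
        }
    return results


def normalize(results, baseline="bzip2"):
    """Express every measurement relative to the baseline (baseline = 1.0)."""
    base = results[baseline]
    return {
        name: {k: (v / base[k] if base[k] else float("nan")) for k, v in r.items()}
        for name, r in results.items()
    }


# Small in-memory sample; real tests would read the tar files instead.
sample = b"some repetitive text for a quick smoke test\n" * 20000
results = benchmark(sample)
normalized = normalize(results)
for name, row in sorted(normalized.items()):
    print(name, row)
```

Since the data is already in memory when the timers start, disk I/O stays out of the picture, which is the same idea as caching the files with cat beforehand.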
Disconnecting the hard drive, the software way
Posted on May 8th, 2008 in debian, linux |
A hard drive recently died, taking away lots of data and filling the logs with failed IDE requests. Annoying in too many ways. So, how do you make the disk stop yelling?
The easiest is to make the kernel not see it. There’s a neat trick that enables you to do just that, and it’s called module unbinding. So, basically you unmount the drive (or what’s left of it), then you only need to figure out what device to disconnect from which driver, which isn’t that hard. If you do “ls -al /sys/block/sda/”, you’ll see something like this:
drwxr-xr-x 8 root root 0 maj 8 21:47 .
drwxr-xr-x 20 root root 0 maj 8 21:47 ..
-r--r--r-- 1 root root 4096 maj 8 21:47 capability
-r--r--r-- 1 root root 4096 maj 8 21:47 dev
lrwxrwxrwx 1 root root 0 maj 8 21:47 device -> ../../devices/pci0000:00/0000:00:1f.2/host2/target2:0:1/2:0:1:0
drwxr-xr-x 2 root root 0 maj 8 21:47 holders
drwxr-xr-x 2 root root 0 maj 8 21:47 power
drwxr-xr-x 3 root root 0 maj 8 21:47 queue
-r--r--r-- 1 root root 4096 maj 8 21:47 range
-r--r--r-- 1 root root 4096 maj 8 21:47 removable
drwxr-xr-x 4 root root 0 maj 8 21:47 sda1
drwxr-xr-x 4 root root 0 maj 8 21:47 sda2
-r--r--r-- 1 root root 4096 maj 8 21:47 size
drwxr-xr-x 2 root root 0 maj 8 21:47 slaves
-r--r--r-- 1 root root 4096 maj 8 21:47 stat
lrwxrwxrwx 1 root root 0 maj 8 21:47 subsystem -> ../../block
-rw-r--r-- 1 root root 4096 maj 8 21:47 uevent
The line of our interest is
lrwxrwxrwx 1 root root 0 maj 8 21:47 device -> ../../devices/pci0000:00/0000:00:1f.2/host2/target2:0:1/2:0:1:0
where you can see that the device id is “2:0:1:0”. The last step is to actually unbind the device:
cd /sys/block/sda/device/driver
echo -n "2:0:1:0" > unbind
Et voila, your hard drive is no more as far as the kernel knows.
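The same two steps could also be scripted. This is only a sketch under the assumption that sysfs is laid out as in the listing above. The function names and the sysfs_root parameter are made up for illustration (the parameter is there so the code can be tried against a fake directory tree), and writing to the real unbind file needs root.

```python
import os


def scsi_device_id(block_dev, sysfs_root="/sys"):
    """Return the id the 'device' symlink points at, e.g. '2:0:1:0'."""
    link = os.path.join(sysfs_root, "block", block_dev, "device")
    # The last path component of the resolved symlink is the device id.
    return os.path.basename(os.path.realpath(link))


def unbind(block_dev, sysfs_root="/sys"):
    """Write the device id into the driver's unbind file (needs root)."""
    dev_id = scsi_device_id(block_dev, sysfs_root)
    unbind_path = os.path.join(
        sysfs_root, "block", block_dev, "device", "driver", "unbind"
    )
    # No trailing newline, mirroring 'echo -n' in the shell version.
    with open(unbind_path, "w") as f:
        f.write(dev_id)
    return dev_id


# Intended use on a real system, as root and after unmounting the drive:
#     unbind("sda")
```

Nothing is executed on import, so an accidental run does not detach a disk.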



