<h1>Deploying Dockerised Applications on CERN OpenStack</h1>
<p>Tibor Šimko, Tibor's Musings (http://tiborsimko.org/), 2015-02-25T19:00:00+01:00</p>
<p>In a <a class="reference external" href="http://tiborsimko.org/docker-for-python-applications.html">previous blog post</a> we have seen how to
use Docker for developing Python web applications. In this blog post
we shall see how to deploy such a dockerised web application on the
CERN OpenStack cloud, for example to demonstrate the application to
colleagues.</p>
<div class="section" id="simple-approach">
<h2>Simple approach</h2>
<p>Ideally, we would use the new <a class="reference external" href="https://github.com/docker/machine">docker-machine</a> management tool, which makes it
easy to create and manage Docker hosts on various remote cloud
platforms, including OpenStack. However, <tt class="docutils literal"><span class="pre">docker-machine</span></tt> is still
at a relatively early development stage; its OpenStack driver supports
only Ubuntu host systems, while we would like to use the vanilla
CERN OpenStack cloud with its usual offer of CERN Linux images.</p>
<p>Let us therefore use a simple approach that consists of (1)
creating a bare OpenStack virtual machine, (2) provisioning it with
Docker, and (3) deploying our web application by "exporting"
it from the local development environment and "importing" it into the
remote Docker instance by using <tt class="docutils literal">ssh</tt>. We shall create some useful
bash functions to ease the job.</p>
<p>We shall use the "hello world" example web application from the
<a class="reference external" href="http://tiborsimko.org/docker-for-python-applications.html">previous blog post</a>,
deploying it on a new CERN CentOS 7 virtual machine on CERN OpenStack.</p>
</div>
<div class="section" id="create-cern-centos-7-vm">
<h2>Create CERN CentOS 7 VM</h2>
<p>We start by creating a fresh vanilla CERN CentOS 7 virtual machine
on the CERN OpenStack, named <tt class="docutils literal"><span class="pre">simko-testvm-cc7</span></tt>, using the <tt class="docutils literal">CC7 Extra -
x86_64 <span class="pre">[2015-02-10]</span></tt> image:</p>
<pre class="literal-block">
$ source ./Personal\ simko-openrc.sh
$ openstack server create \
    --key-name simko_openstack_key \
    --image "CC7 Extra - x86_64 [2015-02-10]" \
    --flavor m1.medium \
    simko-testvm-cc7
</pre>
<p>(We could also use the <a class="reference external" href="https://openstack.cern.ch/">web interface</a>
to create it.)</p>
</div>
<div class="section" id="define-useful-local-shell-functions">
<h2>Define useful local shell functions</h2>
<p>While the machine is being created, let us define two convenient local
shell functions so that we can work with the remote VM from the
laptop. Here, <tt class="docutils literal">rssh</tt> stands for "remote ssh" and <tt class="docutils literal">rdocker</tt> stands
for "remote docker":</p>
<pre class="literal-block">
$ function rssh() { ssh -C -i ~/.ssh/cernopenstack.pem "root@$RDOCKERHOST" "$@" ;}
$ function rdocker() { ssh -C -i ~/.ssh/cernopenstack.pem "root@$RDOCKERHOST" docker "$@" ;}
</pre>
<p>The remote machine we'd like to manage can be changed at any time by
setting the environment variable <tt class="docutils literal">RDOCKERHOST</tt>. Our new VM is
called <tt class="docutils literal"><span class="pre">simko-testvm-cc7</span></tt>, so:</p>
<pre class="literal-block">
$ export RDOCKERHOST=simko-testvm-cc7.cern.ch
</pre>
<p>We can now use the <tt class="docutils literal">rssh</tt> and <tt class="docutils literal">rdocker</tt> commands to work comfortably
with the remote VM from our laptop.</p>
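<p>If <tt class="docutils literal">RDOCKERHOST</tt> is unset, the ssh target degenerates to a bare <tt class="docutils literal">root@</tt>. A minimal sketch of a guard, assuming bash; the helper name <tt class="docutils literal">rdocker_target</tt> is hypothetical and not part of the original setup:</p>
<pre class="literal-block">
# Hypothetical helper: print the ssh target, or fail if RDOCKERHOST is unset.
rdocker_target() {
    if [ -z "${RDOCKERHOST:-}" ]; then
        echo "RDOCKERHOST is not set" >&2
        return 1
    fi
    printf 'root@%s\n' "$RDOCKERHOST"
}
</pre>
<p>A wrapper could then start with <tt class="docutils literal">target=$(rdocker_target) || return 1</tt> and pass <tt class="docutils literal">"$target"</tt> to <tt class="docutils literal">ssh</tt>, so that a forgotten <tt class="docutils literal">export</tt> is reported before any connection is attempted.</p>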
</div>
<div class="section" id="provision-newly-created-remote-vm">
<h2>Provision newly created remote VM</h2>
<p>Once the VM is up and running, we provision it by simply installing
and enabling Docker:</p>
<pre class="literal-block">
$ rssh yum install -y docker-io
$ rssh systemctl start docker
$ rssh systemctl enable docker
</pre>
<p>We also need to open the HTTP port in the firewall:</p>
<pre class="literal-block">
$ rssh yum install -y firewalld
$ rssh systemctl start firewalld
$ rssh firewall-cmd --add-service http --permanent
$ rssh firewall-cmd --reload
$ rssh systemctl enable firewalld
</pre>
<p>That's it! Our new virtual machine is now fully ready to serve
dockerised applications.</p>
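<p>The provisioning commands assume that the new VM already accepts SSH connections, which may take a few minutes after creation. A minimal sketch of a polling helper, assuming bash; the function name <tt class="docutils literal">retry</tt> and the numbers in the usage note are illustrative assumptions, not part of the original setup:</p>
<pre class="literal-block">
# Hypothetical helper: run a command until it succeeds,
# giving up after the given number of attempts.
# Usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
    local attempts=$1 delay=$2 i=1
    shift 2
    until "$@"; do
        if [ "$i" -ge "$attempts" ]; then
            return 1
        fi
        i=$((i + 1))
        sleep "$delay"
    done
}
</pre>
<p>For example, <tt class="docutils literal">retry 30 10 rssh true</tt> would try to log in up to 30 times, pausing 10 seconds between attempts, and return a non-zero status if the machine never answers.</p>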
</div>
<div class="section" id="deploy-application-image">
<h2>Deploy application image</h2>
<p>Before deploying the application, let us build it one last time:</p>
<pre class="literal-block">
$ cd ~/private/src/helloworld
$ docker-compose build
</pre>
<p>We can now use <tt class="docutils literal">docker save</tt> on the laptop to "export" the freshly built
application image and pipe it to the provisioned OpenStack machine via
<tt class="docutils literal">ssh</tt>, where it is "imported" by means of <tt class="docutils literal">docker load</tt>.
Using our handy shell functions, the operation is simple:</p>
<pre class="literal-block">
$ docker save helloworld_web | rdocker load
</pre>
<p>The image transfer takes about two minutes, the image size being about
750 MB.</p>
</div>
<div class="section" id="run-dockerised-application">
<h2>Run dockerised application</h2>
<p>The application image should now appear on the remote VM. Let us
confirm that the image was transferred correctly:</p>
<pre class="literal-block">
$ rdocker images
REPOSITORY          TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
helloworld_web      latest              d7f9887ccdab        18 hours ago        752.6 MB
</pre>
<p>We can now start the application on the remote machine:</p>
<pre class="literal-block">
$ rdocker run -d -p 80:5000 helloworld_web
e0da6d79f780bff22d29bf0a7b3b1a03d7617b38b8696c839043c910a48bc705
</pre>
<p>Let us verify that it runs correctly:</p>
<pre class="literal-block">
$ firefox http://simko-testvm-cc7.cern.ch
</pre>
<p>We can also remotely check its logs:</p>
<pre class="literal-block">
$ rdocker logs -f e0da6d79f780
* Running on http://0.0.0.0:5000/ (Press CTRL+C to quit)
* Restarting with stat
128.141.95.173 - - [25/Feb/2015 18:48:27] "GET / HTTP/1.1" 200 -
</pre>
</div>
<div class="section" id="conclusions">
<h2>Conclusions</h2>
<p>Using a few simple bash functions, we can easily deploy dockerised
local applications on the remote CERN OpenStack cloud, for example to
demo our developments to colleagues.</p>
</div>
<div class="section" id="appendix-scientific-linux-cern-6">
<h2>Appendix: Scientific Linux CERN 6</h2>
<p>If you would like to use a Scientific Linux CERN 6 virtual machine
as a host, the provisioning step would look like this:</p>
<pre class="literal-block">
$ export RDOCKERHOST=simko-testvm-slc6.cern.ch
$ rssh yum install -y docker-io
$ rssh service docker start
$ rssh /sbin/iptables -I INPUT -p tcp -m tcp --dport 80 -j ACCEPT
$ rssh /etc/init.d/iptables save
</pre>
<p>The rest of the procedure remains exactly the same.</p>
<p><strong>Side note:</strong> if the SLC6 system does not show its full disk space
capacity on the root partition (e.g. it shows only 8 GB instead of 40
GB), one needs to <a class="reference external" href="http://information-technology.web.cern.ch/book/cern-cloud-infrastructure-user-guide/administering-vms/resizing-disks">extend the partition</a>
as follows:</p>
<pre class="literal-block">
# growpart /dev/vda 2
# reboot # ... and after reboot ...
# pvresize /dev/vda2
# lvextend -l +100%FREE /dev/mapper/VolGroup00-LogVol00
# resize2fs /dev/mapper/VolGroup00-LogVol00
</pre>
</div>
<h1>Troubleshooting Mbsync Duplicate UID Errors</h1>
<p>Tibor Šimko, 2015-02-16T20:15:00+01:00</p>
<p>I've been using the excellent <a class="reference external" href="http://isync.sourceforge.net/">mbsync</a>
tool to synchronise IMAP mailboxes between cloud and laptop. (mbsync
seems both faster and less resource-hungry than <a class="reference external" href="http://offlineimap.org/">offlineimap</a>.) Once, probably due to a botched move
operation, the synchronisation started to report <em>"Maildir error:
duplicate UID"</em> messages. How to quickly repair this situation?</p>
<p>Here is the error message:</p>
<pre class="literal-block">
$ mbsync -a
[...]
Synchronizing...
Selecting master INBOX...
Selecting slave INBOX...
Loading master...
Loading slave...
Maildir error: duplicate UID 2.
</pre>
<p>This means the INBOX folder has problematic messages with UID=2. Here
is how to find them:</p>
<pre class="literal-block">
$ cd ~/Local/mbsyncmail/INBOX # where my INBOX lives
$ find . -name "*U=2:*" -exec ls -l {} \;
-rw------- 1 simko simko 1523 Dec 20 21:20 ./cur/1419106858.5661_2.pcuds06,U=2:2,S
-rw------- 1 simko simko 39500 Feb 13 10:20 ./cur/1423819205.29514_1.pcuds06,U=2:2,S
</pre>
<p>Note the <tt class="docutils literal">,U=2:</tt> part of the filename: the second message
somehow got the same UID as the first one. (Probably through a wrong move
between folders without changing the file name.)</p>
<p>The fix consists of keeping the first (older) message as is, and
renaming the second (newer) message file to remove the <tt class="docutils literal"><span class="pre">,U=...</span></tt>
suffix:</p>
<pre class="literal-block">
$ mv ./cur/1423819205.29514_1.pcuds06,U=2:2,S ./cur/1423819205.29514_1.pcuds06
[...]
-rw------- 1 simko simko 1523 Dec 20 21:20 ./cur/1419106858.5661_2.pcuds06,U=2:2,S
-rw------- 1 simko simko 39500 Feb 13 10:20 ./cur/1423819205.29514_1.pcuds06
</pre>
<p>This deduplicates the problematic UID and forces <tt class="docutils literal">mbsync</tt> to create
a new UID for the second message at its next run.</p>
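<p>The rename can also be scripted. A minimal sketch using shell parameter expansion, which drops everything from the first <tt class="docutils literal">,U=</tt> onwards just as the manual <tt class="docutils literal">mv</tt> does; the helper name <tt class="docutils literal">strip_uid</tt> is hypothetical:</p>
<pre class="literal-block">
# Hypothetical helper: print a message file name without its ",U=..." suffix.
strip_uid() {
    printf '%s\n' "${1%%,U=*}"
}
</pre>
<p>With <tt class="docutils literal">f</tt> holding the name of the newer duplicate, the fix would then read <tt class="docutils literal">mv "$f" "$(strip_uid "$f")"</tt>.</p>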
<p>Was this message the only problematic one?</p>
<pre class="literal-block">
$ ls -lR cur | grep -o 'U=.*:' | sort | uniq -d
U=38:
U=39:
U=4:
U=7:
</pre>
<p>Nope, there are four more messages to fix in the same manner. Once
done, the synchronisation works well again:</p>
<pre class="literal-block">
$ mbsync -a | grep -i error | wc -l
0
</pre>
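<p>As a variation on the <tt class="docutils literal">ls</tt> pipeline above, the duplicated UIDs can be listed as plain numbers with <tt class="docutils literal">find</tt> and <tt class="docutils literal">sed</tt>. This is only a sketch, assuming that every synchronised message file name carries a <tt class="docutils literal">,U=number</tt> part as in the examples above; the helper name <tt class="docutils literal">dup_uids</tt> is hypothetical:</p>
<pre class="literal-block">
# Hypothetical helper: print each UID that occurs in more than one
# message file name below the given Maildir folder.
dup_uids() {
    find "$1" -type f -name '*,U=*' |
        sed -n 's/.*,U=\([0-9][0-9]*\).*/\1/p' |
        sort -n |
        uniq -d
}
</pre>
<p>Running <tt class="docutils literal">dup_uids ~/Local/mbsyncmail/INBOX</tt> before and after the renames gives a quick check that no duplicates remain.</p>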
<h1>Running Multiple Daemon Processes in Docker</h1>
<p>Tibor Šimko, 2015-02-11T21:00:00+01:00</p>
<p>A Docker container is usually dedicated to running one daemon process,
such as Apache to serve a web application. If the application needs a
cache, or a database, then the container running the application is
<em>linked</em> to another container running the cache, and yet another
container running the database. However, what if we need (or want) to
run several daemon applications inside the same single container?</p>
<div class="section" id="the-use-case">
<h2>The use case</h2>
<p>While it is a kind of anti-pattern to run several daemons inside the
same container, instead of running one container per daemon
and then cross-linking them, it may sometimes be useful. A concrete
example from the <a class="reference external" href="https://github.com/inveniosoftware/invenio/">Invenio</a> digital library
software world: we would like to fully emulate what users get when
they run e.g. Invenio v1.0.5 on a vanilla CentOS 5 server system.
This means running the then-available Python-2.4, with the then-available
version of MySQL-5.0 and the then-available pre-packaged version of the
MySQL-python library.</p>
<p>Creating an Invenio docker image that relies on CentOS 5 and runs
everything inside the same container will allow us to achieve this
emulation. (Theoretically, we could create several CentOS 5 based
docker images and linked containers, one for running the Apache, one
for running the MySQL, etc. However, we may just as well run
everything inside the same image, considering that we can use the same
kick-start installation scripts to provision the box, and considering
that providing support to older installations in this way is
relatively rare.)</p>
</div>
<div class="section" id="the-technique">
<h2>The technique</h2>
<p>Let's assume then that we'd like to run an MTA daemon process such as
Exim, a web server process such as Apache, and a database server
process such as MySQL all inside the same CentOS 5 based container.</p>
<p>The trick is to use <a class="reference external" href="http://supervisord.org/">supervisord</a>: we run the
<tt class="docutils literal">supervisord</tt> daemon, which takes care of keeping all the other
wanted daemons alive.</p>
<p>Here is a minimal <tt class="docutils literal">Dockerfile</tt> to test the idea:</p>
<pre class="literal-block">
FROM centos:5
RUN yum install -y epel-release && \
    yum update -y && \
    yum install -y exim \
                   httpd \
                   mysql-server \
                   supervisor && \
    yum clean all
RUN /sbin/service mysqld start && \
    mysqladmin -u root password ''
RUN echo "[supervisord]" > /etc/supervisord.conf && \
    echo "nodaemon=true" >> /etc/supervisord.conf && \
    echo "" >> /etc/supervisord.conf && \
    echo "[program:exim]" >> /etc/supervisord.conf && \
    echo "command=/usr/sbin/exim -bd -q1h" >> /etc/supervisord.conf && \
    echo "" >> /etc/supervisord.conf && \
    echo "[program:mysqld]" >> /etc/supervisord.conf && \
    echo "command=/usr/bin/mysqld_safe" >> /etc/supervisord.conf && \
    echo "" >> /etc/supervisord.conf && \
    echo "[program:httpd]" >> /etc/supervisord.conf && \
    echo "command=/usr/sbin/apachectl -D FOREGROUND" >> /etc/supervisord.conf
CMD ["/usr/bin/supervisord"]
</pre>
<p>Here is the corresponding <tt class="docutils literal"><span class="pre">docker-compose.yml</span></tt>:</p>
<pre class="literal-block">
web:
  build: .
  command: /usr/bin/supervisord
</pre>
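<p>The chain of <tt class="docutils literal">echo</tt> commands in the <tt class="docutils literal">Dockerfile</tt> produces a short INI-style file. As a sketch, the same content could be generated on the host with a here-document and then copied into the image with an <tt class="docutils literal">ADD</tt> instruction. The function name <tt class="docutils literal">write_supervisord_conf</tt> and this workflow are assumptions for illustration, not what the <tt class="docutils literal">Dockerfile</tt> above does:</p>
<pre class="literal-block">
# Hypothetical helper: write the supervisord configuration to the given path.
write_supervisord_conf() {
    cat > "$1" <<'EOF'
[supervisord]
nodaemon=true

[program:exim]
command=/usr/sbin/exim -bd -q1h

[program:mysqld]
command=/usr/bin/mysqld_safe

[program:httpd]
command=/usr/sbin/apachectl -D FOREGROUND
EOF
}
</pre>
<p>Calling <tt class="docutils literal">write_supervisord_conf supervisord.conf</tt> next to the <tt class="docutils literal">Dockerfile</tt> keeps the configuration readable as one file, at the price of one more file in the build context.</p>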
</div>
<div class="section" id="the-verification">
<h2>The verification</h2>
<p>Let us build the image and start the container:</p>
<pre class="literal-block">
$ docker-compose build
$ docker-compose up
web_1 | 2015-02-11 20:56:56,265 CRIT Supervisor running as root (no user in config file)
web_1 | 2015-02-11 20:56:56,282 INFO supervisord started with pid 1
web_1 | 2015-02-11 20:56:56,283 INFO spawned: 'httpd' with pid 6
web_1 | 2015-02-11 20:56:56,284 INFO spawned: 'mysqld' with pid 7
web_1 | 2015-02-11 20:56:56,285 INFO spawned: 'exim' with pid 8
web_1 | 2015-02-11 20:56:57,282 INFO success: httpd entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
web_1 | 2015-02-11 20:56:57,283 INFO success: mysqld entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
web_1 | 2015-02-11 20:56:57,285 INFO success: exim entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
</pre>
<p>Let us connect to the running container to confirm that the daemons
are up and running:</p>
<pre class="literal-block">
$ docker exec -i -t 6cc431e69c5f bash
[root@6cc431e69c5f /]# ps aux | grep mysql
root 7 0.1 0.0 10844 2168 ? S 20:56 0:00 /bin/sh /usr/bin/mysqld_safe
mysql 83 0.3 0.3 188400 28532 ? Sl 20:56 0:00 /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql --pid-file=/var/run/mysqld/mysqld.pid --skip-external-locking --log-error=/var/log/mysqld.log --socket=/var/lib/mysql/mysql.sock
root 105 0.0 0.0 61256 1792 ? S+ 20:57 0:00 grep mysql
[root@6cc431e69c5f /]# ps aux | grep httpd
root 10 0.0 0.0 174412 7076 ? S 20:56 0:00 /usr/sbin/httpd -D FOREGROUND
apache 21 0.0 0.0 174544 3704 ? S 20:56 0:00 /usr/sbin/httpd -D FOREGROUND
apache 22 0.0 0.0 174544 3704 ? S 20:56 0:00 /usr/sbin/httpd -D FOREGROUND
apache 23 0.0 0.0 174544 3704 ? S 20:56 0:00 /usr/sbin/httpd -D FOREGROUND
apache 24 0.0 0.0 174544 3704 ? S 20:56 0:00 /usr/sbin/httpd -D FOREGROUND
apache 25 0.0 0.0 174544 3704 ? S 20:56 0:00 /usr/sbin/httpd -D FOREGROUND
apache 26 0.0 0.0 174544 3704 ? S 20:56 0:00 /usr/sbin/httpd -D FOREGROUND
apache 27 0.0 0.0 174544 3704 ? S 20:56 0:00 /usr/sbin/httpd -D FOREGROUND
apache 28 0.0 0.0 174544 3704 ? S 20:56 0:00 /usr/sbin/httpd -D FOREGROUND
root 107 0.0 0.0 61256 1884 ? S+ 20:57 0:00 grep httpd
[root@6cc431e69c5f /]# ps aux | grep exim
exim 8 0.0 0.0 79924 6388 ? S 20:56 0:00 /usr/sbin/exim -bd -q1h
</pre>
</div>
<h1>Using Docker for Developing Python Applications</h1>
<p>Tibor Šimko, 2015-01-29T19:00:00+01:00</p>
<p><a class="reference external" href="http://www.docker.com/">Docker</a> has become a popular software solution
for deploying applications inside isolated Linux software
containers. From a Python point of view, one could consider
Docker containers as "virtual environments on steroids", because they
encapsulate and isolate not only the application's Python prerequisites
(say, a given version of the PyPDF2 package), but also any non-Python
utilities of the operating system that the application relies on (say,
a given version of LibreOffice). The following primer shows how to use
Docker for developing Python applications.</p>
<div class="section" id="installation">
<h2>Installation</h2>
<p>Installing <tt class="docutils literal">docker</tt> on Debian GNU/Linux is easy:</p>
<pre class="literal-block">
sudo apt-get install docker.io
</pre>
<p>The <tt class="docutils literal">docker</tt> daemon now runs:</p>
<pre class="literal-block">
$ docker info
Containers: 0
Images: 289
Storage Driver: aufs
Root Dir: /var/lib/docker/aufs
Dirs: 289
Execution Driver: native-0.2
Kernel Version: 3.16.0-4-amd64
Operating System: Debian GNU/Linux 8 (jessie)
WARNING: No memory limit support
WARNING: No swap limit support
</pre>
<p>We can use it as is; however, note the memory and swap limit warnings,
which are worth fixing before we continue.</p>
</div>
<div class="section" id="enabling-memory-swap-limit-support">
<h2>Enabling memory/swap limit support</h2>
<p>On Debian GNU/Linux systems, the memory limit and swap limit features
can be set by configuring kernel boot parameters. This is done by
editing <tt class="docutils literal">/etc/default/grub</tt> in the following way:</p>
<pre class="literal-block">
$ sudo vim /etc/default/grub # edit GRUB_CMDLINE_LINUX as follows
$ grep GRUB_CMDLINE_LINUX /etc/default/grub
GRUB_CMDLINE_LINUX="cgroup_enable=memory swapaccount=1"
$ sudo update-grub
$ sudo shutdown -r now
</pre>
<p>After the reboot, the two WARNING lines disappear.</p>
<p>Note that memory accounting of running containers will be inspectable
via cgroup and friends:</p>
<pre class="literal-block">
$ systemd-cgtop
$ ls -l /sys/fs/cgroup/memory/system.slice
</pre>
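<p>Whether the two boot parameters actually reached the running kernel can be checked against <tt class="docutils literal">/proc/cmdline</tt>. A minimal sketch; the helper name <tt class="docutils literal">has_memcg_opts</tt> is hypothetical, and it only inspects the command line string it is given, not what Docker reports:</p>
<pre class="literal-block">
# Hypothetical helper: succeed only if both kernel options appear
# as separate words on the given command line string.
has_memcg_opts() {
    case " $1 " in
        *" cgroup_enable=memory "*) ;;
        *) return 1 ;;
    esac
    case " $1 " in
        *" swapaccount=1 "*) return 0 ;;
        *) return 1 ;;
    esac
}
</pre>
<p>After the reboot, <tt class="docutils literal">has_memcg_opts "$(cat /proc/cmdline)" && echo ok</tt> should print <tt class="docutils literal">ok</tt> if the GRUB change took effect.</p>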
</div>
<div class="section" id="enabling-cern-dns">
<h2>Enabling CERN DNS</h2>
<p>One more installation-related comment, of importance to users inside
CERN. The network works best when one specifies the CERN DNS IPs
explicitly. This can be done by passing the <tt class="docutils literal"><span class="pre">--dns</span></tt> parameter to the
<tt class="docutils literal">docker</tt> commands below, or else it can be done globally by
configuring <tt class="docutils literal">DOCKER_OPTS</tt> in the following way:</p>
<pre class="literal-block">
$ sudo vim /etc/default/docker # edit DOCKER_OPTS as follows
$ grep DOCKER_OPTS /etc/default/docker
DOCKER_OPTS="--dns 137.138.16.5 --dns 137.138.17.5 --dns 8.8.8.8 --dns 8.8.4.4"
$ sudo /etc/init.d/docker restart
</pre>
</div>
<div class="section" id="throw-away-python-containers">
<h2>Throw-away Python containers</h2>
<p>Now that Docker is installed, how to use it to develop Python
applications? We can start by pulling pre-existing Python docker
images from the <a class="reference external" href="https://registry.hub.docker.com/_/python/">docker registry hub</a>:</p>
<pre class="literal-block">
$ docker search python
$ docker pull python:2.7
$ docker pull python:3.4
</pre>
<p>This will permit us to start throw-away Python containers:</p>
<pre class="literal-block">
$ docker run -i -t --rm python:2.7
Python 2.7.9 (default, Jan 28 2015, 01:38:45)
[GCC 4.9.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 1 + 1
2
</pre>
<p>This creates an interactive (<tt class="docutils literal"><span class="pre">-i</span></tt>) Python container attached to the
terminal (<tt class="docutils literal"><span class="pre">-t</span></tt>) that will be removed once we quit the session
(<tt class="docutils literal"><span class="pre">--rm</span></tt>).</p>
<p>Throw-away containers are useful for quickly testing Python constructs. For
example, how fast are Python list comprehensions in various Python
versions?</p>
<pre class="literal-block">
$ docker run -i -t --rm python:2.7 python -m timeit "[i for i in range(1000)]"
10000 loops, best of 3: 82.2 usec per loop
$ docker run -i -t --rm python:3.3 python -m timeit "[i for i in range(1000)]"
10000 loops, best of 3: 83 usec per loop
$ docker run -i -t --rm python:3.4 python -m timeit "[i for i in range(1000)]"
10000 loops, best of 3: 87.7 usec per loop
</pre>
<p>The higher the version, the slower Python seems to be; but let's
not digress.</p>
</div>
<div class="section" id="simple-application">
<h2>Simple application</h2>
<p>Suppose we are developing a simple Python application, such as a web
site based on the <a class="reference external" href="http://flask.pocoo.org/">Flask</a> framework. Here is
a minimal "hello world" code example:</p>
<pre class="literal-block">
$ cat app.py
from flask import Flask

app = Flask(__name__)


@app.route('/')
def hello():
    return 'Hello world'


if __name__ == "__main__":
    app.run(host="0.0.0.0", debug=True)
</pre>
<p>with the following requirements:</p>
<pre class="literal-block">
$ cat requirements.txt
Flask
</pre>
<p>The application is started as:</p>
<pre class="literal-block">
$ python app.py
</pre>
<p>It will run on <tt class="docutils literal"><span class="pre">http://0.0.0.0:5000</span></tt> and simply greet its user.</p>
</div>
<div class="section" id="dockerfile">
<h2>Dockerfile</h2>
<p>Let us build a Docker image that lets us start a container running this
application. While we could start an interactive Python container as
described above, install the prerequisites and save the work for later, it
is best to fully automate the creation of Docker images by means of a
<tt class="docutils literal">Dockerfile</tt>.</p>
<p>For our simple application, the <tt class="docutils literal">Dockerfile</tt> would look as follows:</p>
<pre class="literal-block">
$ cat Dockerfile
FROM python:2.7
ADD requirements.txt /tmp/requirements.txt
RUN pip install -r /tmp/requirements.txt
ADD . /code
WORKDIR /code
EXPOSE 5000
CMD ["python", "app.py"]
</pre>
<p>This means we are starting from the Python-2.7 Docker image, adding the
current <tt class="docutils literal">requirements.txt</tt> file and running <tt class="docutils literal">pip</tt> on it to install Flask, then
adding the current directory as the <tt class="docutils literal">/code</tt> directory in the container
and making it the working directory. The application will listen on port 5000
when the container starts by means of <tt class="docutils literal">python app.py</tt>.</p>
<p>The docker image can then be built by running:</p>
<pre class="literal-block">
$ docker build -t tiborsimko/helloworld .
</pre>
<p>A new container can be instantiated out of this image as follows:</p>
<pre class="literal-block">
$ docker run -p 5000:5000 tiborsimko/helloworld
</pre>
<p>On the host OS, we see the web site running on port 5000 that is
exposed from the container to the host system.</p>
<p>Another useful option is <tt class="docutils literal"><span class="pre">-v</span></tt> (for volume management), which lets us
mount the current working directory under <tt class="docutils literal">/code</tt> in the container,
so that we can use our preferred editor on the host machine to edit
the application and see the changes live in the container. This can
be achieved with the <tt class="docutils literal"><span class="pre">-v</span> <span class="pre">$(pwd):/code</span></tt> option (the host path should be absolute), but there's another way to
automate this.</p>
</div>
<div class="section" id="docker-compose">
<h2>docker-compose</h2>
<p><tt class="docutils literal"><span class="pre">docker-compose</span></tt> provides useful composition services on top of
Docker that let us automate the building and running of
containers. First install it as follows:</p>
<pre class="literal-block">
sudo pip install docker-compose==1.1.0-rc1
</pre>
<p>You may need to upgrade <tt class="docutils literal">PyYAML</tt> beforehand:</p>
<pre class="literal-block">
sudo apt-get remove python-openssl
sudo apt-get install libyaml-dev
sudo pip install PyYAML
</pre>
<p>(Note that the above example replaces system Python packages with
locally installed ones, which may be dangerous. A better technique
would be to use <a class="reference external" href="https://github.com/mitsuhiko/pipsi">pipsi</a> that
installs Python programs and their dependencies into virtual
environments, permitting their better isolation from system Python
package versions.)</p>
<p>Here is the <tt class="docutils literal"><span class="pre">docker-compose</span></tt> configuration for our simple application
example:</p>
<pre class="literal-block">
$ cat docker-compose.yml
web:
  build: .
  command: python app.py
  ports:
    - "5000:5000"
  volumes:
    - .:/code
</pre>
<p>The building is then done via:</p>
<pre class="literal-block">
$ docker-compose build
</pre>
<p>and a container can be fired up via:</p>
<pre class="literal-block">
$ docker-compose up
</pre>
<p>Note how the docker command line options are stored in a more readable
YAML configuration, including exposing port 5000 and mounting the current
working directory under <tt class="docutils literal">/code</tt>. Basically, <tt class="docutils literal"><span class="pre">docker-compose</span></tt>
lets us automate via YAML what we would otherwise have to express
by hand via docker command line options.</p>
<p>This advantage will be even more apparent for complex applications
that need to link several containers together,
such as a Python application running inside the <tt class="docutils literal">web</tt> container
that is linked to a <tt class="docutils literal">redis</tt> container for caching, a <tt class="docutils literal">db</tt> container
running a PostgreSQL database, and a <tt class="docutils literal">worker</tt> container running Celery
tasks.</p>
</div>
<div class="section" id="dockerignore">
<h2>.dockerignore</h2>
<p>If we want to share our created image with others, it is useful to
define a <tt class="docutils literal">.dockerignore</tt> file that excludes certain files
or directories from the built Docker image. A good
example is <tt class="docutils literal">.git</tt>: by putting it in <tt class="docutils literal">.dockerignore</tt>, we won't
expose our local unstable branches to friends, though we still retain
the option to have them available for local development by volume
mounting.</p>
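<p>For the <tt class="docutils literal">.git</tt> case, the file needs a single line. A minimal sketch that writes it next to the <tt class="docutils literal">Dockerfile</tt>; the helper name <tt class="docutils literal">write_dockerignore</tt> is hypothetical, and it overwrites any existing <tt class="docutils literal">.dockerignore</tt> in the target directory:</p>
<pre class="literal-block">
# Hypothetical helper: create a .dockerignore that excludes .git
# in the given directory (default: the current directory).
write_dockerignore() {
    printf '%s\n' .git > "${1:-.}/.dockerignore"
}
</pre>
<p>Further entries, one path or pattern per line, can be appended as the project grows.</p>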
</div>
<div class="section" id="docker-build-cache">
<h2>Docker build cache</h2>
<p>Why does our <tt class="docutils literal">Dockerfile</tt> contain the following part?</p>
<pre class="literal-block">
ADD requirements.txt /tmp/requirements.txt
RUN pip install -r /tmp/requirements.txt
ADD . /code
</pre>
<p>The <tt class="docutils literal">requirements.txt</tt> file is added in the third line under <tt class="docutils literal">/code</tt>
anyway, isn't it?</p>
<p>The reason the <tt class="docutils literal">requirements.txt</tt> file was added explicitly before
the rest of the code is the docker build cache. If we repeat the
build process several times, docker caches prior layers (roughly
speaking prior RUN statements) and reuses them whenever possible. For
example, if <tt class="docutils literal">requirements.txt</tt> did not change, and only our
<tt class="docutils literal">app.py</tt> changed, our application requirements won't
have to be installed over and over; Docker will reuse the previously built
layers.</p>
<p>Automated build cache is one of the very cool features of Docker. It
makes building images and creating containers an easy, fast, and
disposable process. It is therefore important to write <tt class="docutils literal">Dockerfile</tt>
in such a manner that most of the pre-requisite installation job is
being done before we add our code.</p>
</div>
<div class="section" id="container-user">
<h2>Container user</h2>
<p>If we run bash shell in the built container:</p>
<pre class="literal-block">
$ docker run -i -t --rm tiborsimko/helloworld bash
root@06436a85c124:/code# id
uid=0(root) gid=0(root) groups=0(root)
</pre>
<p>we'll see that the container process runs as root, which is not the
best from the point of view of security.</p>
<p>It is desirable to create a new user that the application would run
as. Ideally, it would be a user with the same UID as the main user of
the host system, so that if we mount current directory into the
container, and if the build process needs to create some files (say by
running Bower and friends), all files created within the container
would bear the same ownership as the files in the host system.</p>
<p>This can be achieved in <tt class="docutils literal">Dockerfile</tt> via:</p>
<pre class="literal-block">
RUN adduser --uid 1000 --disabled-password --gecos '' tiborsimko && \
chown -R tiborsimko:tiborsimko /code
USER tiborsimko
</pre>
<p>before starting the application.</p>
</div>
<div class="section" id="wash-your-bowl">
<h2>Wash your bowl</h2>
<p>Suppose we've been developing Docker images and running Docker
containers for some time. Some crust may have accumulated while we
were tweaking the smallest bits of the <tt class="docutils literal">Dockerfile</tt>. How do we clean
up after ourselves?</p>
<p>We can remove all containers by running:</p>
<pre class="literal-block">
$ docker rm $(docker ps -aq)
202c5f3e482e
93112fa2ad87
</pre>
<p>We can remove all "incompletely built" images by running:</p>
<pre class="literal-block">
$ docker images | grep none | awk '{print "docker rmi " $3;}' | sh
</pre>
</div>
<div class="section" id="conclusions">
<h2>Conclusions</h2>
<p>We have seen a simple example of how to start developing Python
applications using Docker. More realistic examples of Docker
configurations will be committed to various <a class="reference external" href="https://github.com/inveniosoftware">inveniosoftware</a> projects in the coming days.</p>
</div>
<h1>One Key, Two Functions, or CapsLock as an Escape and Control</h1>
<p>Tibor Šimko, 2014-10-12T19:32:00+02:00</p>
<p>In the search for better keyboard ergonomics and improved typing
productivity, the <tt class="docutils literal">CapsLock</tt> key seems to occupy a prime spot for not
much use. Many programmers redefine it to become a modifier key. One
popular habit: Vi users prefer to set it to <tt class="docutils literal">Escape</tt>, while Emacs
users set it to <tt class="docutils literal">Control</tt>. What if one likes both?</p>
<!-- PELICAN_END_SUMMARY --><div class="section" id="best-of-both-worlds">
<h2>Best of both worlds</h2>
<p>The best of both worlds can be achieved by means of the <a class="reference external" href="https://github.com/alols/xcape.git">xcape</a> utility, which lets us
configure modifier keys to act as other keys when pressed and released
on their own. By means of <tt class="docutils literal">xcape</tt>, we can configure the keyboard in
such a manner that <tt class="docutils literal">CapsLock</tt> generates <tt class="docutils literal">Escape</tt> when
pressed on its own, and <tt class="docutils literal">Ctrl</tt> when pressed together with another key.</p>
<p>First install it:</p>
<pre class="literal-block">
sudo apt-get install xcape
</pre>
<p>Then configure it in the following way:</p>
<pre class="literal-block">
# make CapsLock behave like Ctrl:
setxkbmap -option ctrl:nocaps
# make short-pressed Ctrl behave like Escape:
xcape -e 'Control_L=Escape'
</pre>
<p>Done. When short-pressed, <tt class="docutils literal">CapsLock</tt> will generate <tt class="docutils literal">Esc</tt>. When
pressed with another key like <tt class="docutils literal">a</tt>, it will generate the <tt class="docutils literal"><span class="pre">Ctrl-a</span></tt>
sequence. Could we be any more efficient?</p>
</div>
<div class="section" id="usage-context">
<h2>Usage context</h2>
<p>I'm using <tt class="docutils literal">xcape</tt> in my keyboard layout modification script together
with several other useful utilities, such as <a class="reference external" href="https://packages.debian.org/sid/xinput">xinput</a>, which can detect
attached keyboards before layout modifications are applied to
them.</p>
<p>For example, I'm using the Dvorak keyboard layout on both my laptop's
internal keyboard and on the attached Kinesis Advantage external
keyboard. <tt class="docutils literal">xinput</tt> detects the attached devices and
<tt class="docutils literal">setxkbmap</tt> then modifies their layout.</p>
<p>The core of the script looks like:</p>
<pre class="literal-block">
# set internal keyboard layout:
deviceid=$(xinput -list | grep 'AT .* keyboard' | head -1 | grep -oE 'id=[0-9]+' | sed 's/id=//g')
if [ "${deviceid}" != "" ]; then
    setxkbmap -device "${deviceid}" dvorak
    setxkbmap -device "${deviceid}" -option ctrl:nocaps # make CapsLock behave like Ctrl
fi
# set Kinesis keyboard layout:
for deviceid in $(xinput -list | grep ' HID ' | grep -oE 'id=[0-9]+' | sed 's/id=//g'); do
    if [ "${deviceid}" != "" ]; then
        setxkbmap -device "${deviceid}" us # 'us' but it means 'dvorak' actually due to Kinesis
        setxkbmap -device "${deviceid}" -option ctrl:nocaps # make CapsLock behave like Ctrl
    fi
done
</pre>
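<p>The device lookup in the script above is plain text processing, so it can
be tried out without touching any real keyboard. Here is a minimal sketch;
the helper names <tt class="docutils literal">device_ids</tt> and <tt class="docutils literal">sample_xinput_output</tt> as well as the
sample device list are made up for illustration and will differ from what
<tt class="docutils literal">xinput</tt> prints on a real machine:</p>

```shell
# Hypothetical helper: read `xinput -list` output on stdin and print the
# numeric IDs of the devices whose description matches the given pattern.
device_ids() {
    grep -- "$1" | grep -oE 'id=[0-9]+' | sed 's/id=//'
}

# Illustrative sample of xinput output (device names and IDs are made up).
sample_xinput_output() {
    cat <<'EOF'
Virtual core keyboard                         id=3  [master keyboard (2)]
    AT Translated Set 2 keyboard              id=12 [slave  keyboard (3)]
    HID 05f3:0007                             id=14 [slave  keyboard (3)]
    HID 05f3:0007                             id=15 [slave  keyboard (3)]
EOF
}

# Same selection logic as in the script, run against the sample:
sample_xinput_output | device_ids 'AT .* keyboard' | head -1
sample_xinput_output | device_ids ' HID '
```

<p>On a real system one would pipe <tt class="docutils literal">xinput <span class="pre">-list</span></tt> into <tt class="docutils literal">device_ids</tt>
instead of the sample.</p>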
<p>Further keymap re-definitions are done via <tt class="docutils literal">xmodmap</tt> statements.
For example, on a ThinkPad x240 keyboard, the <tt class="docutils literal">PrtSc</tt> key is located
next to the right <tt class="docutils literal">Ctrl</tt> key, where it is easily pressed by mistake. So we
can turn it into Control as well:</p>
<pre class="literal-block">
# amend PrtSc key: (useful for ThinkPad x240):
xmodmap -e "keycode 107 = Control_R"
xmodmap -e "add Control = Control_R"
</pre>
</div>
<div class="section" id="troubleshooting">
<h2>Troubleshooting</h2>
<p>If the <tt class="docutils literal">CapsLock</tt> key gets "stuck" and produces, say, uppercase letter
combinations instead of lowercase ones, the easiest way to undo the
<tt class="docutils literal">xcape</tt> mapping is to restore the <tt class="docutils literal">CapsLock</tt> key function via
<tt class="docutils literal">xmodmap</tt>:</p>
<pre class="literal-block">
xmodmap -e 'keycode 0x42 = Caps_Lock'
</pre>
<p>followed by re-running of the keyboard modification script.</p>
</div>
<div class="section" id="conclusions">
<h2>Conclusions</h2>
<p><tt class="docutils literal">xcape</tt>, when paired with <tt class="docutils literal">xinput</tt>, <tt class="docutils literal">setxkbmap</tt> and other
<tt class="docutils literal">xmodmap</tt> friends, provides a very efficient way to
overload and remap even the modifier keys' behaviour to one's liking.</p>
</div>
Accelerating Web Sites with Varnish2014-09-18T21:13:00+02:002014-09-18T21:13:00+02:00Tibor Šimkotag:tiborsimko.org,2014-09-18:/varnish-http-accelerator.html<p><a class="reference external" href="https://www.varnish-cache.org/">Varnish</a> is a popular HTTP
accelerator that can speed up web sites. Here is an example of how to
set it up on an SLC6 box in order to test the responsiveness of the <a class="reference external" href="http://opendata.cern.ch/">CERN Open Data</a> portal.</p>
<!-- PELICAN_END_SUMMARY --><div class="section" id="installation">
<h2>Installation</h2>
<p>Official Varnish packages for Scientific Linux 6 (a distribution that
is binary compatible with CentOS6 and RHEL6) are outdated. To
install the latest Varnish version 4, one can use Varnish's own package
repository:</p>
<pre class="literal-block">
sudo rpm --nosignature -i https://repo.varnish-cache.org/redhat/varnish-4.0.el6.rpm
sudo yum install -y varnish
</pre>
</div>
<div class="section" id="configuration">
<h2>Configuration</h2>
<p>Let us configure Varnish so that it would listen on port 80 and
forward traffic to our web application that runs on port 8080 on the
same machine.</p>
<p>First, let's configure Varnish listening port:</p>
<pre class="literal-block">
sudo perl -pi -e 's,VARNISH_LISTEN_PORT=6081,VARNISH_LISTEN_PORT=80,g' /etc/sysconfig/varnish
</pre>
<p>and make it use more memory while we are at it:</p>
<pre class="literal-block">
sudo perl -pi -e 's,VARNISH_STORAGE_SIZE=256M,VARNISH_STORAGE_SIZE=512M,g' /etc/sysconfig/varnish
</pre>
<p>The web application we are trying to accelerate, in this case the CERN
Open Data test instance, runs on top of Apache. Let us make it listen on port
8080 only and on the incoming IP address 127.0.0.1 only:</p>
<pre class="literal-block">
sudo perl -pi -e 's,Listen 80,Listen 8080,g' /etc/httpd/conf/httpd.conf
sudo -u apache perl -pi -e 's,128.142.151.32:80,127.0.0.1:8080,g' /opt/open-data/.virtualenvs/opendata/var/invenio.base-instance/apache/invenio-apache-vhost.conf
</pre>
<p>After restarting Apache and Varnish:</p>
<pre class="literal-block">
sudo /etc/init.d/httpd restart
sudo /etc/init.d/varnish restart
</pre>
<p>we can check that the processes are indeed listening where they should:</p>
<pre class="literal-block">
$ sudo netstat -lp | grep varnish
tcp 0 0 *:http *:* LISTEN 50690/varnishd
tcp 0 0 localhost:6082 *:* LISTEN 50685/varnishd
tcp 0 0 *:http *:* LISTEN 50690/varnishd
$ sudo netstat -lp | grep httpd
tcp 0 0 *:webcache *:* LISTEN 50592/httpd
unix 2 [ ACC ] STREAM LISTENING 5289212 50592/httpd /opt/open-data/.virtualenvs/opendata/var/run.50592.0.1.sock
</pre>
<p>and that a web client connecting from a laptop sees things as it
should:</p>
<pre class="literal-block">
$ curl -I http://opendata.cern.ch/
HTTP/1.1 200 OK
Date: Thu, 18 Sep 2014 19:35:08 GMT
Server: Apache
Content-Type: text/html; charset=utf-8
X-Varnish: 98611 3
Age: 99
Via: 1.1 varnish-v4
Content-Length: 8934
Connection: keep-alive
</pre>
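<p>The <tt class="docutils literal"><span class="pre">X-Varnish</span></tt> and <tt class="docutils literal">Age</tt> headers above are what tell us that the
response went through the cache. As far as I know, Varnish 4 lists two
transaction IDs in <tt class="docutils literal"><span class="pre">X-Varnish</span></tt> when the object was served from cache and
only one otherwise. Under that assumption, a small sketch of a header
check could look like this; the helper name <tt class="docutils literal">varnish_cache_status</tt> is
invented for the example:</p>

```shell
# Hypothetical helper: read HTTP response headers on stdin and report whether
# the X-Varnish header carries two transaction IDs (assumed sign of a hit).
varnish_cache_status() {
    tr -d '\r' | awk 'tolower($1) == "x-varnish:" { found = 1; print (NF >= 3 ? "hit" : "miss") }
                      END { if (!found) print "no-varnish" }'
}
```

<p>It would be used as <tt class="docutils literal">curl <span class="pre">-sI</span> <span class="pre">http://opendata.cern.ch/</span> | varnish_cache_status</tt>.</p>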
<p>However, because Varnish acts as a proxy, the Apache log records all
incoming requests as coming from 127.0.0.1:</p>
<pre class="literal-block">
$ tail /opt/open-data/.virtualenvs/opendata/var/log/apache.log
127.0.0.1 - - [18/Sep/2014:21:24:15 +0200] "GET /gen/almond.js?5127e506 HTTP/1.1" 304 - "http://opendata.cern.ch/" "Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Firefox/31.0 Iceweasel/31.1.0" 660
127.0.0.1 - - [18/Sep/2014:21:24:15 +0200] "GET /gen/invenio.js?8e21d7fc HTTP/1.1" 304 - "http://opendata.cern.ch/" "Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Firefox/31.0 Iceweasel/31.1.0" 606
127.0.0.1 - - [18/Sep/2014:21:24:15 +0200] "GET /gen/jquery.js?a6392293 HTTP/1.1" 304 - "http://opendata.cern.ch/" "Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Firefox/31.0 Iceweasel/31.1.0" 1019
</pre>
<p>Let's fix it.</p>
</div>
<div class="section" id="mod-rpaf">
<h2>mod_rpaf</h2>
<p>The "reverse proxy add forward" module <a class="reference external" href="https://github.com/gnif/mod_rpaf">mod_rpaf</a> can help us here.
However, it is not available for RHEL6 out of the box.</p>
<p>One could take it from CentOS 6, the binary compatible
distribution:</p>
<pre class="literal-block">
sudo rpm -ivh ftp://ftp.pbone.net/mirror/ftp5.gwdg.de/pub/opensuse/repositories/Apache:/Modules/CentOS_CentOS-6/x86_64/mod_rpaf-0.6-1.2.x86_64.rpm
for configoption in "LoadModule rpaf_module modules/mod_rpaf-2.0.so" \
                    "RPAFenable On" \
                    "RPAFsethostname On" \
                    "RPAFproxy_ips 127.0.0.1 ::1" \
                    "RPAFheader X-Forwarded-For"; do
    if ! grep -q "${configoption}" /etc/httpd/conf.d/mod_rpaf.conf; then
        echo "${configoption}" | sudo tee -a /etc/httpd/conf.d/mod_rpaf.conf
    fi
done
</pre>
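<p>The loop above appends each option only if it is not there yet, so that
it can be re-run safely. The same idea can be sketched as a small reusable
function; <tt class="docutils literal">append_line_once</tt> is a hypothetical name, and the sketch writes
to the file directly, so for a root-owned file one would still go through
<tt class="docutils literal">sudo tee <span class="pre">-a</span></tt> as above:</p>

```shell
# Hypothetical helper: append a line to a file unless that exact line is
# already present (-x: whole line, -F: fixed string, -q: quiet).
append_line_once() {
    local file="$1" line="$2"
    touch "${file}"
    grep -qxF -- "${line}" "${file}" || echo "${line}" >> "${file}"
}
```

<p>Matching whole lines as fixed strings avoids surprises with characters
such as <tt class="docutils literal">.</tt> that plain <tt class="docutils literal">grep</tt> would treat as pattern syntax.</p>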
<p>We can also easily compile it ourselves:</p>
<pre class="literal-block">
cd /tmp
wget http://www.stderr.net/apache/rpaf/download/mod_rpaf-0.6.tar.gz
tar xvfz mod_rpaf-0.6.tar.gz
cd mod_rpaf-0.6
sudo yum install -y httpd-devel
sudo apxs -i -c -n mod_rpaf-2.0.so mod_rpaf-2.0.c
# gives /usr/lib64/httpd/modules/mod_rpaf-2.0.so
</pre>
<p>Once available, let's configure <tt class="docutils literal">mod_rpaf</tt> as follows:</p>
<pre class="literal-block">
$ sudo vim /etc/httpd/conf.d/mod_rpaf.conf
$ cat /etc/httpd/conf.d/mod_rpaf.conf
LoadModule rpaf_module modules/mod_rpaf-2.0.so
RPAFenable On
RPAFsethostname On
RPAFproxy_ips 127.0.0.1 ::1
RPAFheader X-Forwarded-For
</pre>
<p>After restarting Apache, we see real IP addresses in the Apache log:</p>
<pre class="literal-block">
86.209.237.81 - - [18/Sep/2014:21:59:27 +0200] "POST /results/83ce2e1d87cb0b8a190d34e69cba4786 HTTP/1.1" 200 43145 "http://opendata.cern.ch/search" "Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Firefox/31.0 Iceweasel/31.1.0" 490552
86.209.237.81 - - [18/Sep/2014:21:59:27 +0200] "GET /facet/collection/83ce2e1d87cb0b8a190d34e69cba4786?parent=CMS-Derived-Datasets HTTP/1.1" 200 13 "http://opendata.cern.ch/search" "Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Firefox/31.0 Iceweasel/31.1.0" 12762
86.209.237.81 - - [18/Sep/2014:21:59:27 +0200] "POST /results/83ce2e1d87cb0b8a190d34e69cba4786 HTTP/1.1" 200 43145 "http://opendata.cern.ch/search" "Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Firefox/31.0 Iceweasel/31.1.0" 968808
86.209.237.81 - - [18/Sep/2014:21:59:27 +0200] "POST /results/83ce2e1d87cb0b8a190d34e69cba4786 HTTP/1.1" 200 24569 "http://opendata.cern.ch/search" "Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Firefox/31.0 Iceweasel/31.1.0" 957070
86.209.237.81 - - [18/Sep/2014:21:59:27 +0200] "POST /results/83ce2e1d87cb0b8a190d34e69cba4786 HTTP/1.1" 200 24569 "http://opendata.cern.ch/search" "Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Firefox/31.0 Iceweasel/31.1.0" 4294376
</pre>
<p>All is well; the basic configuration is finished.</p>
</div>
<div class="section" id="massaging-cookies">
<h2>Massaging cookies</h2>
<p>To profit from the web acceleration, we can speed things up by
configuring Varnish to cache everything coming from the backend
application except for the search pages. Let's do this regardless of
what our backend application says. This is because our application
does not offer any user-specific functionality that would
differentiate one guest user from another. Every user is treated
equally; there is no login, no restricted data, etc.</p>
<p>The Invenio back-end application currently does not handle
<tt class="docutils literal"><span class="pre">Set-Cookie</span></tt> very nicely, see <a class="reference external" href="https://github.com/inveniosoftware/invenio/issues/2291">issue #2291</a>. Let us
assume therefore that we need to remove this header from all
application responses, except for search pages.</p>
<p>(Actually, we could also cache the search pages, but let's assume that
we would like to amend cookies coming from the backend application
only for certain URLs. This "advanced" configuration may be needed
later in production.)</p>
<p>Let's start by cloning the default configuration:</p>
<pre class="literal-block">
sudo cp /etc/varnish/default.vcl /etc/varnish/opendata.vcl
sudo vim /etc/varnish/opendata.vcl
</pre>
<p>Let's introduce the following differences, where we basically unset
<tt class="docutils literal">req.http.Cookie</tt> and <tt class="docutils literal"><span class="pre">beresp.http.set-cookie</span></tt> for all pages
except our wanted <tt class="docutils literal">/search</tt> URL:</p>
<pre class="literal-block">
sudo diff -u /etc/varnish/default.vcl /etc/varnish/opendata.vcl
--- /etc/varnish/default.vcl 2014-06-24 11:40:31.000000000 +0200
+++ /etc/varnish/opendata.vcl 2014-09-18 21:30:20.940161421 +0200
@@ -23,6 +23,10 @@
#
# Typically you clean up the request here, removing cookies you don't need,
# rewriting the request, etc.
+
+ if (!(req.url ~ "^/search")) {
+ unset req.http.Cookie;
+ }
}
sub vcl_backend_response {
@@ -30,6 +34,11 @@
#
# Here you clean the response headers, removing silly Set-Cookie headers
# and other mistakes your backend does.
+
+ if (!(bereq.url ~ "^/search")) {
+ unset beresp.http.set-cookie;
+ set beresp.ttl = 1h;
+ }
}
sub vcl_deliver {
</pre>
<p>We can activate the new configuration like this:</p>
<pre class="literal-block">
sudo perl -pi -e 's,VARNISH_VCL_CONF=/etc/varnish/default.vcl,VARNISH_VCL_CONF=/etc/varnish/opendata.vcl,g' /etc/sysconfig/varnish
</pre>
</div>
<div class="section" id="performance-measurements">
<h2>Performance measurements</h2>
<p>Let's measure the response time speed-up with the Apache <tt class="docutils literal">ab</tt> tool. The
old configuration gives:</p>
<pre class="literal-block">
laptop> ab -n 100 -c 5 http://opendata.cern.ch/
Requests per second: 32.38 [#/sec] (mean)
</pre>
<p>Restarting varnish with the new configuration gives:</p>
<pre class="literal-block">
$ sudo service varnish restart
$ ab -n 100 -c 5 http://opendata.cern.ch/
Requests per second: 52.36 [#/sec] (mean)
</pre>
<p>We can serve 52 reqs/sec vs 32 reqs/sec. This does not seem much in
terms of increase, but this measurement was done over a slow ADSL line
which limits the throughput somewhat.</p>
<p>Here is a throughput comparison on the server itself:</p>
<pre class="literal-block">
$ ab -n 100 -c 5 http://127.0.0.1:80/
Total transferred: 946000 bytes
HTML transferred: 923100 bytes
Requests per second: 2156.52 [#/sec] (mean)
Time per request: 2.319 [ms] (mean)
Time per request: 0.464 [ms] (mean, across all concurrent requests)
Transfer rate: 19922.54 [Kbytes/sec] received
$ ab -n 100 -c 5 http://127.0.0.1:8080/
Total transferred: 931641 bytes
HTML transferred: 909141 bytes
Requests per second: 72.28 [#/sec] (mean)
Time per request: 69.175 [ms] (mean)
Time per request: 13.835 [ms] (mean, across all concurrent requests)
Transfer rate: 657.61 [Kbytes/sec] received
</pre>
<p>We are much, much faster: about 2.1k reqs/sec vs 72 reqs/sec, i.e. roughly a 30-fold speed-up.</p>
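<p>The speed-up factors follow directly from the "Requests per second"
figures reported by <tt class="docutils literal">ab</tt> above. A small sketch of the arithmetic, where
<tt class="docutils literal">speedup</tt> is just a throwaway helper name:</p>

```shell
# Ratio of two "Requests per second" figures, rounded to one decimal place.
speedup() {
    awk -v fast="$1" -v slow="$2" 'BEGIN { printf "%.1f\n", fast / slow }'
}

speedup 2156.52 72.28   # on the server itself: Varnish vs Apache -> 29.8
speedup 52.36 32.38     # from the laptop over the ADSL line      -> 1.6
```

<p>The gap between the two factors illustrates how much of the laptop
measurement is dominated by the network rather than by the server.</p>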
</div>
<div class="section" id="slashdot-effect">
<h2>Slashdot effect</h2>
<p>Let's try to increase the number of client connections and observe
response times when simulating 5 and 100 concurrent users:</p>
<pre class="literal-block">
laptop> ab -n 100 -c 5 http://opendata.cern.ch/
Requests per second: 52.36 [#/sec] (mean)
laptop> ab -n 1000 -c 100 http://opendata.cern.ch/
Requests per second: 57.78 [#/sec] (mean)
</pre>
<p>The cache can easily serve such increased traffic, because the pages
are served from memory via an efficient event-driven model.</p>
<p>Note that a proper user scalability test would require distributed
testing with some processes putting heat on the backend, e.g. via <a class="reference external" href="https://packages.debian.org/sid/siege">siege</a>. However, we are interested
here in a rule of thumb only.</p>
</div>
<div class="section" id="reboot-persistent-configuration">
<h2>Reboot-persistent configuration</h2>
<p>Here is how to make Varnish start after reboot:</p>
<pre class="literal-block">
$ sudo chkconfig | grep http
httpd 0:off 1:off 2:on 3:on 4:on 5:on 6:off
$ sudo chkconfig | grep varnish
varnish 0:off 1:off 2:off 3:off 4:off 5:off 6:off
varnishlog 0:off 1:off 2:off 3:off 4:off 5:off 6:off
varnishncsa 0:off 1:off 2:off 3:off 4:off 5:off 6:off
$ sudo chkconfig varnish on
$ sudo chkconfig | grep varnish
varnish 0:off 1:off 2:on 3:on 4:on 5:on 6:off
varnishlog 0:off 1:off 2:off 3:off 4:off 5:off 6:off
varnishncsa 0:off 1:off 2:off 3:off 4:off 5:off 6:off
</pre>
</div>
<div class="section" id="nicer-error-page">
<h2>Nicer error page</h2>
<p>By default, the Varnish error page is not exactly user-friendly.
E.g. stop Apache and observe "Error 503 Backend fetch failed".</p>
<p>To make the error page simpler and to hide the Varnish server signature,
we can edit <tt class="docutils literal">vcl_backend_error</tt>:</p>
<pre class="literal-block">
$ sudo vim /etc/varnish/opendata.vcl
$ sudo diff -u /etc/varnish/default.vcl /etc/varnish/opendata.vcl
[...]
+
+sub vcl_backend_error {
+ set beresp.http.Content-Type = "text/html; charset=utf-8";
+ set beresp.http.Retry-After = "5";
+ synthetic( {"<!DOCTYPE html>
+<html>
+ <head>
+ <title>"} + beresp.status + " " + beresp.reason + {"</title>
+ </head>
+ <body>
+ <h1>Error "} + beresp.status + " " + beresp.reason + {"</h1>
+ </body>
+</html>
+"} );
+ return(deliver);
+}
</pre>
</div>
<div class="section" id="logging">
<h2>Logging</h2>
<p>To enable logging of incoming queries on the Varnish level, do:</p>
<pre class="literal-block">
$ sudo /etc/init.d/varnishncsa start
$ cat /var/log/varnish/varnishncsa.log
128.141.95.173 - - [05/Nov/2014:13:09:15 +0100] "GET http://opendata.cern.ch/ HTTP/1.1" 200 11612 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Firefox/31.0 Iceweasel/31.1.0"
128.141.95.173 - - [05/Nov/2014:13:09:15 +0100] "GET http://opendata.cern.ch/gen/opendata.css?eb2f0489 HTTP/1.1" 200 0 "http://opendata.cern.ch/" "Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Firefox/31.0 Iceweasel/31.1.0"
128.141.95.173 - - [05/Nov/2014:13:09:15 +0100] "GET http://opendata.cern.ch/gen/invenio.css?56a680c2 HTTP/1.1" 200 0 "http://opendata.cern.ch/" "Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Firefox/31.0 Iceweasel/31.1.0"
128.141.164.203 - - [05/Nov/2014:13:09:39 +0100] "GET http://opendata.cern.ch/ HTTP/1.1" 200 11597 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:32.0) Gecko/20100101 Firefox/32.0"
</pre>
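<p>Since <tt class="docutils literal">varnishncsa</tt> writes one request per line with the client address
in the first column, the log lends itself to quick summaries. A minimal
sketch that counts requests per client, with the helper name
<tt class="docutils literal">requests_per_client</tt> invented for the example:</p>

```shell
# Hypothetical helper: count requests per client IP in an NCSA-style log
# read from stdin, busiest clients first.
requests_per_client() {
    awk '{ print $1 }' | sort | uniq -c | sort -k1,1nr -k2,2 | awk '{ print $2, $1 }'
}
```

<p>It would be run as <tt class="docutils literal">requests_per_client &lt; /var/log/varnish/varnishncsa.log</tt>.</p>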
</div>
<div class="section" id="conclusions">
<h2>Conclusions</h2>
<p>Varnish is a widely used HTTP accelerator for web applications. Its
use for the CERN Open Data portal seems perfectly plausible. One can
relatively easily configure it to amend <tt class="docutils literal"><span class="pre">Set-Cookie</span></tt> for certain
pages in case of a (buggy) web application. The setup on the SLC6
platform seems stable under heavy load.</p>
Local Handling of GitHub Pull Requests2014-06-24T10:36:00+02:002014-06-24T10:36:00+02:00Tibor Šimkotag:tiborsimko.org,2014-06-24:/github-local-handling-of-pull-requests.html<p>If a project uses a GitHub repository and receives many pull requests
coming from many developers, it is not necessary to add a remote
repository for every developer to one's local checkout in order to
work with pull requests locally. By tweaking one's <tt class="docutils literal">.git/config</tt>
one can automatically receive all upstream pull requests as remote
<tt class="docutils literal">upstream/pr</tt> branches.</p>
<!-- PELICAN_END_SUMMARY --><p>As an example, let's take the <a class="reference external" href="http://github.com/inveniosoftware/invenio">Invenio</a> project whose
<tt class="docutils literal">upstream</tt> usually points to
<tt class="docutils literal">git@github.com:inveniosoftware/invenio.git</tt>. If one edits
<tt class="docutils literal">.git/config</tt> of the local checkout to add a new "pull" line to the
fetch directive in the following way:</p>
<pre class="literal-block">
[remote "upstream"]
url = git@github.com:inveniosoftware/invenio.git
fetch = +refs/heads/*:refs/remotes/upstream/*
fetch = +refs/pull/*/head:refs/remotes/upstream/pr/*
</pre>
<p>then running the usual <tt class="docutils literal">git fetch upstream</tt> will now also bring in
all pull request updates:</p>
<pre class="literal-block">
$ git fetch upstream
remote: Counting objects: 13, done.
remote: Compressing objects: 100% (12/12), done.
remote: Total 13 (delta 1), reused 6 (delta 1)
Unpacking objects: 100% (13/13), done.
From github.com:inveniosoftware/invenio
b0b9220..17054ee pu -> upstream/pu
+ 63c7c21...0ec9487 refs/pull/1744/head -> upstream/pr/1744 (forced update)
e3a98eb..cdf4e16 refs/pull/1765/head -> upstream/pr/1765
+ bd7e369...b030ec4 refs/pull/1807/head -> upstream/pr/1807 (forced update)
</pre>
<p>so that one can do the usual:</p>
<pre class="literal-block">
$ git log master..upstream/pr/1744 --stat
</pre>
<p>to inspect, test, review, cherry-pick, amend, merge and otherwise work
with the pull request #1744 locally, without having to add the
individual developer repository as a remote.</p>
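<p>Instead of editing <tt class="docutils literal">.git/config</tt> by hand, the extra "pull" refspec can
also be added from the command line with <tt class="docutils literal">git config <span class="pre">--add</span></tt>. Here is a
minimal sketch, assuming the remote is called <tt class="docutils literal">upstream</tt> as above; the
function name <tt class="docutils literal">add_pr_refspec</tt> is made up for this example:</p>

```shell
# Sketch: add the extra "pull" refspec from the command line instead of
# editing .git/config by hand. Assumes the named remote already exists.
add_pr_refspec() {
    local remote="${1:-upstream}"
    local refspec="+refs/pull/*/head:refs/remotes/${remote}/pr/*"
    # Only add the refspec if it is not configured yet.
    if ! git config --get-all "remote.${remote}.fetch" | grep -qxF -- "${refspec}"; then
        git config --add "remote.${remote}.fetch" "${refspec}"
    fi
}
```

<p>Running <tt class="docutils literal">add_pr_refspec</tt> inside the checkout and then <tt class="docutils literal">git fetch upstream</tt>
should give the same result as the manual edit.</p>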
<p><em>(Thanks to Jiří Kunčar for the tip.)</em></p>
Quick Jumping Around Project Files in Emacs2014-05-27T18:00:00+02:002014-05-27T18:00:00+02:00Tibor Šimkotag:tiborsimko.org,2014-05-27:/emacs-find-file-in-repository.html<p>Imagine you are hacking on a project, working on some file, and you
would like to quickly open another file of the same project, located
in a completely different and possibly nested subdirectory. How to do
this efficiently in Emacs?</p>
<!-- PELICAN_END_SUMMARY --><div class="section" id="usual-solutions-are-not-fast-enough">
<h2>Usual solutions are not fast enough</h2>
<p>There are several ways to address the task of "jumping around
files": (a) use <strong>find-file</strong> or <strong>eshell</strong> and type the desired file
name from memory, using auto-completion of pathnames; (b) use <strong>dired</strong> and
jump between directories via caret etc; (c) use <strong>cedet</strong> or a similar
project file browser to jump between files or functions via a
sidebar; (d) use <strong>tags</strong> to jump to the wanted function definition
location; etc. However, the listed techniques are either relatively
slow or work only for tagged source code, while we'd like to jump to
any arbitrary file of the project.</p>
</div>
<div class="section" id="find-file-in-repository-meets-ido-ubiquitous">
<h2>find-file-in-repository meets ido-ubiquitous</h2>
<p>The way I do this more efficiently is via <a class="reference external" href="https://github.com/hoffstaetter/find-file-in-repository">find-file-in-repository</a> combined
with <a class="reference external" href="https://github.com/technomancy/ido-ubiquitous">ido-ubiquitous</a>
fuzzy completion. (Together with <a class="reference external" href="https://github.com/gempesaw/ido-vertical-mode.el">ido-vertical-mode</a> for extra eye
candy.)</p>
<p>(Note that find-file-in-repository works with projects using the git,
mercurial, darcs, bazaar, monotone, or svn source code management
systems.)</p>
<p>Here is one possible configuration of these packages:</p>
<div class="highlight"><pre><span></span><span class="p">(</span><span class="nb">require</span> <span class="ss">'find-file-in-repository</span><span class="p">)</span>
<span class="p">(</span><span class="nb">require</span> <span class="ss">'ido</span><span class="p">)</span>
<span class="p">(</span><span class="nb">require</span> <span class="ss">'ido-ubiquitous</span><span class="p">)</span>
<span class="p">(</span><span class="nb">require</span> <span class="ss">'ido-vertical-mode</span><span class="p">)</span>
<span class="p">(</span><span class="nv">global-set-key</span> <span class="p">(</span><span class="nv">kbd</span> <span class="s">"C-x f"</span><span class="p">)</span> <span class="ss">'find-file-in-repository</span><span class="p">)</span>
<span class="p">(</span><span class="nv">ido-ubiquitous-mode</span> <span class="mi">1</span><span class="p">)</span>
<span class="p">(</span><span class="nv">ido-vertical-mode</span><span class="p">)</span>
<span class="p">(</span><span class="k">setq</span> <span class="nv">ido-vertical-define-keys</span> <span class="ss">'C-n-C-p-up-down-left-right</span><span class="p">)</span>
</pre></div>
<p>The quick opening of project files is bound to the <tt class="docutils literal"><span class="pre">C-x</span> f</tt> keyboard
shortcut to create an analogy with the usual <tt class="docutils literal"><span class="pre">C-x</span> <span class="pre">C-f</span></tt> file opening
shortcut.</p>
</div>
<div class="section" id="example">
<h2>Example</h2>
<p>Suppose we are working on, say, a fuzzy tokeniser in <a class="reference external" href="http://invenio-sofware.org/">Invenio</a> and we would like to quickly open,
say, the search engine API documentation. In other words, we are
editing:</p>
<pre class="literal-block">
modules/bibindex/lib/tokenizers/BibIndexAuthorTokenizer.py
</pre>
<p>and we would like to jump to:</p>
<pre class="literal-block">
modules/websearch/doc/hacking/search-engine-api.webdoc
</pre>
<p>Here is the key sequence to achieve this:</p>
<pre class="literal-block">
C-x f s e a RET
</pre>
<p>It takes only five keystrokes, thanks to fuzzy ido mapping of <tt class="docutils literal">s e
a</tt> to mean <tt class="docutils literal"><span class="pre">search-engine-api.webdoc</span></tt> based on starting letters of
the file name. Since there is no other file named like this, ido
offers search engine API guide as the first auto-complete hit, so it
is sufficient to press <tt class="docutils literal">RET</tt> without further ado.</p>
</div>
<div class="section" id="live-demo">
<h2>Live demo</h2>
<img alt="Emacs find-file-in-repository" src="http://tiborsimko.org/images/emacs-find-file-in-repository.gif" />
</div>
<div class="section" id="learning-from-the-past">
<h2>Learning from the past</h2>
<p>Moreover, ido learns from the user's past actions and behaviour, so if you
happen to edit <tt class="docutils literal">websearch_templates.py</tt> more frequently than
<tt class="docutils literal">websubmit_templates.py</tt>, because you are working more with the
Search component than with the Submit component, then typing:</p>
<pre class="literal-block">
C-x f wstempl RET
</pre>
<p>will bring up the "correct" search file rather than the submit file.
In other words, <tt class="docutils literal">w s</tt> can stand either for <tt class="docutils literal">websearch</tt> or
<tt class="docutils literal">websubmit</tt>, depending on your past behaviour. Neat, isn't it?</p>
</div>
Sending Encrypted Messages with GnuPG2014-04-29T18:00:00+02:002014-04-29T18:00:00+02:00Tibor Šimkotag:tiborsimko.org,2014-04-29:/gpg-email.html<p>GnuPG is useful to encrypt sensitive information. It can be used in
many diverse scenarios, including (1) encrypting sensitive files
located on your computer, to be consumed by yourself; (2) encrypting
sensitive information to send to your friends via email, to be
consumed by others. Here is a detailed recipe for the second use
case.</p>
<!-- PELICAN_END_SUMMARY --><p>Suppose you have some sensitive information that you would like to
send me and that you would like to encrypt in a way that nobody else
can read.</p>
<p>Firstly, check whether you have GnuPG installed, and install it if
necessary:</p>
<pre class="literal-block">
$ which gpg
$ sudo apt-get install gnupg
</pre>
<p>Secondly, you should look up and import my GnuPG public key:</p>
<pre class="literal-block">
$ gpg --keyserver pool.sks-keyservers.net --search tibor@simko.info
</pre>
<p>When my key is found, you will be offered an option to import it to
your keyring.</p>
<p>Alternatively, on the <a class="reference external" href="http://tiborsimko.org/pages/about.html">about</a> page of my
blog, you'll see that my GnuPG key is 0xBA5A2B67, so that you can
import it directly via the following one-liner:</p>
<pre class="literal-block">
$ gpg --keyserver pool.sks-keyservers.net --recv-key 0xBA5A2B67
</pre>
<p>Thirdly, if your favourite Mail User Agent supports PGP/GPG natively,
then you can follow your MUA's instructions on how to encrypt messages
for given recipients, and you are done.</p>
<p>Alternatively, you can also simply create a file containing the
sensitive information, encrypt it locally, and then send it to me as
a regular email attachment. Just use your favourite editor to create
the file containing the sensitive information:</p>
<pre class="literal-block">
$ vim secrets.txt
</pre>
<p>Now encrypt this file with me as the recipient:</p>
<pre class="literal-block">
$ gpg -a -r tibor@simko.info -e secrets.txt
</pre>
<p>This will create an encrypted file <tt class="docutils literal">secrets.txt.asc</tt> that will look like this:</p>
<pre class="literal-block">
$ head -10 secrets.txt.asc
-----BEGIN PGP MESSAGE-----
Version: GnuPG v1
hQIOA7kDrqeaBLJPEAf+N0brfbNomGt/53F+NpilHxE7be1kXUWSgau3QK4ME8ZN
2/eDheBy4iyVvTkBUHfjAdHtItROQU5/YdSZ/z8CAHOvShfqSJZB0J4cGCt+BmWw
TnYcmuWTphDgLqHLjzmmILRPfBOKffAx9hTFuVo5FgLyuYBwgJQ0QAVpkteJYbtA
kEsOy9H8FXlM+C7OuNGUwfyxsIWOXVxOkue1/Btu9s2Xi5s2qoBl1SbdF9aV603X
/XRhvYdkwQuGqGS2bl6QACSr/POM0gjL4Q5A6tZUCviG1jUqgGenel55flCIovwk
jYhEGaOkAyGYqsn9lZbBI92UwLnCZ75vAcwC4Q7wWgf6AxC8EqMS+wMXSbSM6YvZ
NYW+9sDb5TnrM7JOVvHkJfM/CbbdKxYPjBq7wf7skLATxLVxlWEf2hg7HP9y1ugO
</pre>
<p>You can send this file to me in an otherwise plain text email; even
from command-line, if you want:</p>
<pre class="literal-block">
$ echo "hi, here is that information you wanted" | \
mutt -a secrets.txt.asc -s greetings -- tibor@simko.info
</pre>
<p>Done. Only I shall be able to decrypt the sensitive part of the
message, even if others get to see it.</p>
SLC Kernel OpenAFS Upgrade Enforcement2014-01-21T11:26:00+01:002014-01-21T11:26:00+01:00Tibor Šimkotag:tiborsimko.org,2014-01-21:/slc-kernel-openafs-upgrade-enforcement.html<p>When running vanilla Scientific Linux CERN distribution, i.e. without
Puppet or Quattor to manage the node, regular upgrades of
the operating system can lead to a kernel dependency problem due to
the openafs module. Here is a recipe on how to fix it.</p>
<!-- PELICAN_END_SUMMARY --><p>Consider doing a regular OS update of a CERN VMM powered box:</p>
<pre class="literal-block">
yum update -y
</pre>
<p>If this fails with a kernel dependency error of the following kind:</p>
<pre class="literal-block">
ERROR with rpm_check_debug vs depsolve:
package kernel-2.6.32-358.23.2.el6.x86_64 (which is newer than kernel-2.6.32-358.14.1.el6.x86_64) is already installed
package kernel-2.6.32-431.el6.x86_64 (which is newer than kernel-2.6.32-358.14.1.el6.x86_64) is already installed
package kernel-2.6.32-431.1.2.el6.x86_64 (which is newer than kernel-2.6.32-358.14.1.el6.x86_64) is already installed
package kernel-debug-2.6.32-431.1.2.el6.x86_64 (which is newer than kernel-debug-2.6.32-431.el6.x86_64) is already installed
</pre>
<p>then (and only then!) one should temporarily remove the openafs kernel
module and client, update, and reinstall them in order to solve the
dependency problem:</p>
<pre class="literal-block">
rpm -e --nodeps $(rpm -qa kernel-module-openafs-\*)
yum erase -y openafs-client
yum update -y
yum install -y openafs-client kernel-module-openafs
chkconfig afs on
</pre>
<p>After which one can reboot into the new kernel:</p>
<pre class="literal-block">
shutdown -r now
</pre>
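<p>When several boxes need checking, the error pattern shown earlier can be detected mechanically before applying the workaround. The following is a hedged sketch only: the function name <tt class="docutils literal">kernel_conflicts</tt> is hypothetical, and the regular expression is modelled solely on the sample error lines quoted in this post, so other yum versions may word the message differently.</p>

```python
import re

# Modelled on lines such as:
# package kernel-2.6.32-431.el6.x86_64 (which is newer than
# kernel-2.6.32-358.14.1.el6.x86_64) is already installed
CONFLICT_RE = re.compile(
    r"^package (kernel\S*) \(which is newer than (kernel\S*)\) is already installed$"
)


def kernel_conflicts(yum_output):
    """Return (installed, older) kernel package pairs found in yum output."""
    pairs = []
    for line in yum_output.splitlines():
        match = CONFLICT_RE.match(line.strip())
        if match:
            pairs.append(match.groups())
    return pairs
```

<p>An empty list means the output shows no such conflict, in which case the forced removal of the openafs module should not be needed.</p>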
<p>P.S. Inspiration taken from <a class="reference external" href="https://wiki.physik.uni-bonn.de/atlas/public/index.php/CernComputerSetupForAtlas">CernComputerSetupForAtlas</a>.</p>
Building Vagrant Boxes with Veewee2013-12-15T10:00:00+01:002013-12-15T10:00:00+01:00Tibor Šimkotag:tiborsimko.org,2013-12-15:/vagrant-veewee.html<p><a class="reference external" href="https://github.com/jedi4ever/veewee">Veewee</a> is a useful tool
helping to build <a class="reference external" href="http://www.vagrantup.com/">Vagrant</a> boxes for
virtualised development. Here is how I build Debian, Scientific
Linux, CentOS, and FreeBSD boxes.</p>
<!-- PELICAN_END_SUMMARY --><div class="section" id="installing-prerequisites">
<h2>Installing prerequisites</h2>
<p>If you do not have Vagrant installed yet, you should start with that:</p>
<pre class="literal-block">
sudo apt-get install vagrant
</pre>
<p>You may also need to enable VT-x virtualisation in BIOS in order to be
able to run Vagrant with VirtualBox on your laptop.</p>
<p>Let's now install veewee:</p>
<pre class="literal-block">
sudo apt-get install ruby-dev
sudo gem install veewee
</pre>
</div>
<div class="section" id="building-debian-7-box">
<h2>Building Debian-7 box</h2>
<p>A Debian box template is already available:</p>
<pre class="literal-block">
veewee vbox templates | grep Debian
</pre>
<p>so let's clone it:</p>
<pre class="literal-block">
veewee vbox define debian7-amd64 'Debian-7.2.0-amd64-netboot'
</pre>
<p>I had already downloaded a Debian ISO image file previously, so I
edited the template to use it. I opened the definition:</p>
<pre class="literal-block">
vim definitions/debian7-amd64/definition.rb
</pre>
<p>and replaced the ISO image name <tt class="docutils literal"><span class="pre">debian-7.2.0-amd64-netinst.iso</span></tt> with the
one I had downloaded, <tt class="docutils literal"><span class="pre">debian-7.2.0-amd64-i386-netinst.iso</span></tt>, which I placed in the
<tt class="docutils literal">iso</tt> subdirectory.</p>
<p>Optionally, one can also download VirtualBox guest additions:</p>
<pre class="literal-block">
sudo aptitude install virtualbox-guest-additions-iso
wget http://download.virtualbox.org/virtualbox/4.2.16/VBoxGuestAdditions_4.2.16.iso
</pre>
<p>and again place it in the <tt class="docutils literal">iso</tt> subdirectory.</p>
<p>Now let's build the base box:</p>
<pre class="literal-block">
veewee vbox build 'debian7-amd64'
</pre>
<p>Depending on the laptop specs and the network speed, the box will be
built in about twenty minutes.</p>
<p>Export it for vagrant via:</p>
<pre class="literal-block">
vagrant package --base debian7-amd64 --output debian7-amd64.box
</pre>
<p>Now it can be used with vagrant in the usual manner; we can add it
via:</p>
<pre class="literal-block">
vagrant box add debian7-amd64 ./debian7-amd64.box
</pre>
<p>and test it out by creating and connecting to a new Debian virtual
machine:</p>
<pre class="literal-block">
mkdir -p ~/private/vagrant/test-vm-debian7
cd ~/private/vagrant/test-vm-debian7
vagrant init
cat > Vagrantfile <<EOF1
Vagrant.configure("2") do |config|
  config.vm.box = "debian7-amd64"
  config.vm.hostname = 'localhost.localdomain'
  config.vm.network :forwarded_port, host: 8080, guest: 8080
  config.vm.network :forwarded_port, host: 8443, guest: 8443
end
EOF1
vagrant up
vagrant ssh
</pre>
<p>We are done.</p>
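<p>Since the box name in the Vagrantfile must match the name given to <tt class="docutils literal">vagrant box add</tt>, it can help to generate the file from a single variable. Here is a small sketch under that assumption; the function name <tt class="docutils literal">render_vagrantfile</tt> is hypothetical and the settings simply mirror the Vagrantfile above.</p>

```python
def render_vagrantfile(box_name, ports=(8080, 8443)):
    """Return Vagrantfile text for a box previously added via `vagrant box add`."""
    lines = [
        'Vagrant.configure("2") do |config|',
        '  config.vm.box = "%s"' % box_name,
        "  config.vm.hostname = 'localhost.localdomain'",
    ]
    # Forward each port to the same port number on the host
    for port in ports:
        lines.append(
            "  config.vm.network :forwarded_port, host: %d, guest: %d" % (port, port)
        )
    lines.append("end")
    return "\n".join(lines) + "\n"
```

<p>Writing the result of <tt class="docutils literal"><span class="pre">render_vagrantfile("debian7-amd64")</span></tt> to <tt class="docutils literal">Vagrantfile</tt> gives the same configuration as the heredoc.</p>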
</div>
<div class="section" id="building-scientificlinux-6-4-box">
<h2>Building ScientificLinux-6.4 box</h2>
<p>Here is a complete compact recipe on how to build a Scientific Linux
box:</p>
<pre class="literal-block">
veewee vbox templates | grep scientificlinux
veewee vbox define sl6-x86_64 'scientificlinux-6.4-x86_64-netboot'
curl -C - -L 'http://ftp.heanet.ie/pub/rsync.scientificlinux.org/6.4/x86_64/iso/SL-64-x86_64-2013-03-18-boot.iso' -o 'iso/SL-64-x86_64-2013-03-18-boot.iso'
md5sum 'iso/SL-64-x86_64-2013-03-18-boot.iso'
veewee vbox build 'sl6-x86_64'
# fails at the very end due to chef needing ruby >= 1.9.2 so let's remove chef/puppet and restart...
vim definitions/sl6-x86_64/definition.rb # edit :postinstall_files to comment out chef and puppet loading
veewee vbox build 'sl6-x86_64' --force # restart vbox creation
# now it succeeded
vagrant package --base sl6-x86_64 --output sl6-x86_64.box
vagrant box add sl6-x86_64 ./sl6-x86_64.box
</pre>
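<p>The <tt class="docutils literal">md5sum</tt> step in the recipe is presumably there so that the result can be compared with the checksum published by the mirror. The same check can be sketched in Python with the standard <tt class="docutils literal">hashlib</tt> module; the helper name <tt class="docutils literal">file_md5</tt> is an illustration only, and the expected checksum has to come from the mirror, not from this post.</p>

```python
import hashlib


def file_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so that large ISO files need little memory
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

<p>The returned string can then be compared with the published value before starting the build.</p>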
</div>
<div class="section" id="building-centos-6-4-box">
<h2>Building CentOS-6.4 box</h2>
<p>The same recipe obviously works to build CentOS too, since they both
stem from RHEL:</p>
<pre class="literal-block">
veewee vbox define centos6-x86_64 'CentOS-6.4-x86_64-netboot'
vim definitions/centos6-x86_64/definition.rb # comment away chef and puppet
veewee vbox build 'centos6-x86_64' # press `No' so that we download manually
curl -C - -L 'http://www.mirrorservice.org/sites/mirror.centos.org/6.4/isos/x86_64/CentOS-6.4-x86_64-netinstall.iso' -o 'iso/CentOS-6.4-x86_64-netinstall.iso'
md5sum 'iso/CentOS-6.4-x86_64-netinstall.iso'
veewee vbox build 'centos6-x86_64' # now go on
vagrant package --base centos6-x86_64 --output centos6-x86_64.box
vagrant box add centos6-x86_64 ./centos6-x86_64.box
</pre>
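<p>The Scientific Linux and CentOS recipes share the same four core commands, so they can be expressed as one parametrised sequence. The sketch below only builds the command lists; the function name <tt class="docutils literal">box_recipe</tt> is hypothetical, and the manual steps (editing <tt class="docutils literal">definition.rb</tt>, downloading the ISO) are deliberately left out.</p>

```python
def box_recipe(name, template):
    """Return the define/build/package/add commands for one box, in order."""
    box_file = "%s.box" % name
    return [
        ["veewee", "vbox", "define", name, template],
        ["veewee", "vbox", "build", name],
        ["vagrant", "package", "--base", name, "--output", box_file],
        ["vagrant", "box", "add", name, "./" + box_file],
    ]
```

<p>Each list could be passed to <tt class="docutils literal">subprocess.check_call</tt> in turn, e.g. for <tt class="docutils literal"><span class="pre">box_recipe("centos6-x86_64",</span> <span class="pre">"CentOS-6.4-x86_64-netboot")</span></tt>.</p>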
</div>
<div class="section" id="building-freebsd-9-1-box">
<h2>Building FreeBSD-9.1 box</h2>
<p>I've also built a FreeBSD box using the same approach. The
installation was timing out, so I had to add more <tt class="docutils literal">Wait</tt>
statements to the box definition:</p>
<pre class="literal-block">
$ vim definitions/freebsd-9.1/definition.rb
$ less definitions/freebsd-9.1/definition.rb
[...]
'<Enter>',
'<Wait><Wait><Wait><Wait><Wait><Wait><Wait><Wait><Wait><Wait>',
'<Wait><Wait><Wait><Wait><Wait><Wait><Wait><Wait><Wait><Wait>',
'/bin/sh<Enter>',
'mdmfs -s 100m md1 /tmp<Enter>',
[...]
</pre>
<p>after which the build succeeded.</p>
</div>
JSON Select Speed Test with MongoDB and PostgreSQL2013-10-17T11:02:00+02:002013-10-17T11:02:00+02:00Tibor Šimkotag:tiborsimko.org,2013-10-17:/postgresql-mongodb-json-select-speed.html<p>MongoDB is a popular JSON database. PostgreSQL has added nice JSON
capabilities lately. How do the two databases compare in terms of
JSON select speed when using Python connectors? Let's find out.</p>
<!-- PELICAN_END_SUMMARY --><p>The test was performed on an HP EliteBook 8440p using Python 2.7, MongoDB
2.4.8 with pymongo 2.6.3, and PostgreSQL 9.3 with psycopg2 2.5.1, all
in their default Debian GNU/Linux configurations.</p>
<div class="section" id="test-data-set">
<h2>Test data set</h2>
<p>The test data set consisted of 50,000 JSON records representing book
metadata, looking like this:</p>
<pre class="literal-block">
[...]
{
"recid": 1494701,
"title": "The Feynman lectures on physics; New millennium ed.",
"author": "Feynman, Richard Phillips",
"isbn": "9780465024933",
"subject": ["53"],
"url": "http://www.feynmanlectures.caltech.edu/I_toc.html",
"publisher": "Basic Books",
"place": "New York, NY",
"year": "2010"
},
[...]
</pre>
<p>The test queries will be performed on the year field which is
represented as a string.</p>
</div>
<div class="section" id="database-definitions">
<h2>Database definitions</h2>
<p>The databases were created as follows:</p>
<div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">psycopg2</span>
<span class="kn">from</span> <span class="nn">pymongo</span> <span class="kn">import</span> <span class="n">MongoClient</span>
<span class="n">pg_con</span> <span class="o">=</span> <span class="n">psycopg2</span><span class="o">.</span><span class="n">connect</span><span class="p">(</span><span class="s2">"dbname=simko user=simko"</span><span class="p">)</span>
<span class="n">pg_cur</span> <span class="o">=</span> <span class="n">pg_con</span><span class="o">.</span><span class="n">cursor</span><span class="p">()</span>
<span class="n">mg_db</span> <span class="o">=</span> <span class="n">MongoClient</span><span class="p">()</span><span class="o">.</span><span class="n">test_database</span>
<span class="n">mg_col</span> <span class="o">=</span> <span class="n">mg_db</span><span class="o">.</span><span class="n">test_collection</span>
<span class="k">def</span> <span class="nf">mongodb_create_table</span><span class="p">():</span>
    <span class="n">mg_db</span><span class="o">.</span><span class="n">create_collection</span><span class="p">(</span><span class="s1">'test_collection'</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">mongodb_drop_table</span><span class="p">():</span>
    <span class="n">mg_db</span><span class="o">.</span><span class="n">drop_collection</span><span class="p">(</span><span class="s1">'test_collection'</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">postgresql_create_table</span><span class="p">():</span>
    <span class="n">pg_cur</span><span class="o">.</span><span class="n">execute</span><span class="p">(</span><span class="s2">"CREATE TABLE test (recid INTEGER PRIMARY KEY, data json);"</span><span class="p">)</span>
    <span class="n">pg_con</span><span class="o">.</span><span class="n">commit</span><span class="p">()</span>
<span class="k">def</span> <span class="nf">postgresql_drop_table</span><span class="p">():</span>
    <span class="n">pg_cur</span><span class="o">.</span><span class="n">execute</span><span class="p">(</span><span class="s2">"DROP TABLE IF EXISTS test;"</span><span class="p">)</span>
    <span class="n">pg_con</span><span class="o">.</span><span class="n">commit</span><span class="p">()</span>
</pre></div>
<p>In some tests, a dedicated year index will be used:</p>
<div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">mongodb_create_index</span><span class="p">():</span>
    <span class="n">mg_col</span><span class="o">.</span><span class="n">create_index</span><span class="p">(</span><span class="s1">'year'</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">mongodb_drop_index</span><span class="p">():</span>
    <span class="n">mg_col</span><span class="o">.</span><span class="n">drop_index</span><span class="p">(</span><span class="s1">'year_1'</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">postgresql_create_index</span><span class="p">():</span>
    <span class="n">pg_cur</span><span class="o">.</span><span class="n">execute</span><span class="p">(</span><span class="s2">"CREATE INDEX year ON test ((data->>'year'));"</span><span class="p">)</span>
    <span class="n">pg_con</span><span class="o">.</span><span class="n">commit</span><span class="p">()</span>
<span class="k">def</span> <span class="nf">postgresql_drop_index</span><span class="p">():</span>
    <span class="n">pg_cur</span><span class="o">.</span><span class="n">execute</span><span class="p">(</span><span class="s2">"DROP INDEX year;"</span><span class="p">)</span>
    <span class="n">pg_con</span><span class="o">.</span><span class="n">commit</span><span class="p">()</span>
</pre></div>
</div>
<div class="section" id="database-connections">
<h2>Database connections</h2>
<p>One possibly important factor to consider is that PyMongo returns
strings as Python Unicode strings, while psycopg2 can return either
Python binary UTF-8 strings or Python Unicode strings, depending on
its settings. Therefore let us perform two series of tests,
once letting psycopg2 return Python binary strings and once forcing
psycopg2 to return Unicode strings. The latter can be achieved
globally via:</p>
<div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">psycopg2.extensions</span>
<span class="n">psycopg2</span><span class="o">.</span><span class="n">extensions</span><span class="o">.</span><span class="n">register_type</span><span class="p">(</span><span class="n">psycopg2</span><span class="o">.</span><span class="n">extensions</span><span class="o">.</span><span class="n">UNICODE</span><span class="p">)</span>
<span class="n">psycopg2</span><span class="o">.</span><span class="n">extensions</span><span class="o">.</span><span class="n">register_type</span><span class="p">(</span><span class="n">psycopg2</span><span class="o">.</span><span class="n">extensions</span><span class="o">.</span><span class="n">UNICODEARRAY</span><span class="p">)</span>
</pre></div>
</div>
<div class="section" id="database-statistics">
<h2>Database statistics</h2>
<p>Database statistics, such as table size, index size, and number of
rows, can be obtained for the two databases as follows:</p>
<div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">mongodb_stats</span><span class="p">():</span>
    <span class="k">print</span> <span class="s2">"** mongodb stats"</span>
    <span class="k">print</span> <span class="s2">"count </span><span class="si">%d</span><span class="s2">"</span> <span class="o">%</span> <span class="n">mg_col</span><span class="o">.</span><span class="n">database</span><span class="o">.</span><span class="n">command</span><span class="p">(</span><span class="s1">'collstats'</span><span class="p">,</span> <span class="s1">'test_collection'</span><span class="p">)[</span><span class="s1">'count'</span><span class="p">]</span>
    <span class="k">print</span> <span class="s2">"size </span><span class="si">%d</span><span class="s2">"</span> <span class="o">%</span> <span class="n">mg_col</span><span class="o">.</span><span class="n">database</span><span class="o">.</span><span class="n">command</span><span class="p">(</span><span class="s1">'collstats'</span><span class="p">,</span> <span class="s1">'test_collection'</span><span class="p">)[</span><span class="s1">'size'</span><span class="p">]</span>
    <span class="k">print</span> <span class="s2">"storage size </span><span class="si">%d</span><span class="s2">"</span> <span class="o">%</span> <span class="n">mg_col</span><span class="o">.</span><span class="n">database</span><span class="o">.</span><span class="n">command</span><span class="p">(</span><span class="s1">'collstats'</span><span class="p">,</span> <span class="s1">'test_collection'</span><span class="p">)[</span><span class="s1">'storageSize'</span><span class="p">]</span>
    <span class="k">print</span> <span class="s2">"index size </span><span class="si">%d</span><span class="s2">"</span> <span class="o">%</span> <span class="n">mg_col</span><span class="o">.</span><span class="n">database</span><span class="o">.</span><span class="n">command</span><span class="p">(</span><span class="s1">'collstats'</span><span class="p">,</span> <span class="s1">'test_collection'</span><span class="p">)[</span><span class="s1">'totalIndexSize'</span><span class="p">]</span>
<span class="k">def</span> <span class="nf">postgresql_stats</span><span class="p">():</span>
    <span class="k">print</span> <span class="s2">"** postgresql stats"</span>
    <span class="n">pg_cur</span><span class="o">.</span><span class="n">execute</span><span class="p">(</span><span class="s2">"select pg_relation_size('test'),pg_total_relation_size('test')"</span><span class="p">)</span>
    <span class="n">pg_relation_size</span><span class="p">,</span> <span class="n">pg_total_relation_size</span> <span class="o">=</span> <span class="n">pg_cur</span><span class="o">.</span><span class="n">fetchone</span><span class="p">()</span>
    <span class="n">pg_cur</span><span class="o">.</span><span class="n">execute</span><span class="p">(</span><span class="s2">"select count(*) from test"</span><span class="p">)</span>
    <span class="n">pg_row_count</span> <span class="o">=</span> <span class="n">pg_cur</span><span class="o">.</span><span class="n">fetchone</span><span class="p">()</span>
    <span class="k">print</span> <span class="s2">"count </span><span class="si">%d</span><span class="s2">"</span> <span class="o">%</span> <span class="n">pg_row_count</span>
    <span class="k">print</span> <span class="s2">"table storage size </span><span class="si">%d</span><span class="s2">"</span> <span class="o">%</span> <span class="n">pg_relation_size</span>
    <span class="k">print</span> <span class="s2">"index size </span><span class="si">%d</span><span class="s2">"</span> <span class="o">%</span> <span class="p">(</span><span class="n">pg_total_relation_size</span> <span class="o">-</span> <span class="n">pg_relation_size</span><span class="p">)</span>
</pre></div>
</div>
<div class="section" id="timing-helpers">
<h2>Timing helpers</h2>
<p>To measure the speed of the (slow) data loading step, I used IPython's
<tt class="docutils literal">%time</tt> facility.</p>
<p>To measure the speed of the (fast) select queries, I used IPython's
<tt class="docutils literal">%timeit</tt> facility.</p>
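<p>Outside IPython, a similar measurement can be approximated with the standard library's <tt class="docutils literal">timeit</tt> module. This is a minimal sketch, not the code used for the results in this post; the helper name <tt class="docutils literal">best_time_ms</tt> and its default repeat counts are assumptions chosen for illustration.</p>

```python
import timeit


def best_time_ms(func, repeat=3, number=10):
    """Return the best per-call time in milliseconds over several runs."""
    # timeit.repeat returns one total (in seconds) per run of `number` calls
    totals = timeit.repeat(func, repeat=repeat, number=number)
    return min(totals) / number * 1000.0
```

<p>For example, <tt class="docutils literal"><span class="pre">best_time_ms(lambda:</span> <span class="pre">mongodb_test_select_speed('1970'))</span></tt> would time one of the select functions defined later in this post. Taking the minimum rather than the mean reduces the influence of other activity on the laptop.</p>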
</div>
<div class="section" id="test-1-json-loading">
<h2>Test 1: JSON loading</h2>
<p>The test data set of 50,000 JSON records was loaded as follows:</p>
<div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">json</span>
<span class="k">def</span> <span class="nf">mongodb_load_books</span><span class="p">():</span>
    <span class="k">for</span> <span class="n">book</span> <span class="ow">in</span> <span class="n">json</span><span class="o">.</span><span class="n">loads</span><span class="p">(</span><span class="nb">open</span><span class="p">(</span><span class="s1">'books.json'</span><span class="p">,</span> <span class="s1">'r'</span><span class="p">)</span><span class="o">.</span><span class="n">read</span><span class="p">()):</span>
        <span class="n">mg_col</span><span class="o">.</span><span class="n">insert</span><span class="p">(</span><span class="n">book</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">postgresql_load_books</span><span class="p">():</span>
    <span class="k">for</span> <span class="n">book</span> <span class="ow">in</span> <span class="n">json</span><span class="o">.</span><span class="n">loads</span><span class="p">(</span><span class="nb">open</span><span class="p">(</span><span class="s1">'books.json'</span><span class="p">,</span> <span class="s1">'r'</span><span class="p">)</span><span class="o">.</span><span class="n">read</span><span class="p">()):</span>
        <span class="n">pg_cur</span><span class="o">.</span><span class="n">execute</span><span class="p">(</span><span class="s1">'INSERT INTO test (recid,data) VALUES (</span><span class="si">%s</span><span class="s1">,</span><span class="si">%s</span><span class="s1">);'</span><span class="p">,</span> <span class="p">(</span><span class="n">book</span><span class="p">[</span><span class="s1">'recid'</span><span class="p">],</span> <span class="n">json</span><span class="o">.</span><span class="n">dumps</span><span class="p">(</span><span class="n">book</span><span class="p">)))</span>
    <span class="n">pg_con</span><span class="o">.</span><span class="n">commit</span><span class="p">()</span>
</pre></div>
<p>It may be interesting to note that internally <tt class="docutils literal">json.loads()</tt> parses
records already as Unicode strings:</p>
<pre class="literal-block">
{u'author': u'Rodi, Wolfgang',
u'isbn': u'0080445446',
u'place': u'San Diego, CA',
u'publisher': u'Elsevier',
u'recid': 6904,
u'title': u'Engineering Turbulence Modelling and Experiments 6: ERCOFTAC International Symposium on Engineering Turbulence and Measurements - ETMM6',
u'url': u'https://cds.cern.ch/auth.py?r=EBLIB_P_318118_0',
u'year': u'2005'}
</pre>
<p>These are stored as UTF-8 strings in MongoDB and PostgreSQL.</p>
<p>Here are the resulting database sizes:</p>
<pre class="literal-block">
============================ ========== ==========
database                     MongoDB    PostgreSQL
---------------------------- ---------- ----------
item count [#]                   50,000     50,000
table size [B]               14,219,776 15,269,888
index size [B]                1,627,024  1,171,456
index size w/ year index [B]  2,747,136  2,310,144
============================ ========== ==========
</pre>
<p>We can see that MongoDB and PostgreSQL produce databases of
similar size. Note the extra <tt class="docutils literal">recid</tt> column in PostgreSQL, which
leads to a slightly larger table.</p>
<p>Here are the timing results of the data loading step:</p>
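<p>The cost of the dedicated year index can be read off the size table by subtraction. The snippet below is a small worked example, assuming that the "w/ year" row reports the total index size after the year index has been added.</p>

```python
# Index sizes in bytes, copied from the size table above
index_size = {"MongoDB": 1627024, "PostgreSQL": 1171456}
# Assumption: this row is the total index size including the year index
index_size_with_year = {"MongoDB": 2747136, "PostgreSQL": 2310144}

# Extra bytes attributable to the year index alone
year_index_cost = {
    db: index_size_with_year[db] - index_size[db] for db in index_size
}
```

<p>Under that assumption the year index adds 1,120,112 bytes in MongoDB and 1,138,688 bytes in PostgreSQL, i.e. roughly 1.1 MB in both databases.</p>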
<pre class="literal-block">
========= ========== ==========
operation MongoDB PostgreSQL
--------- ---------- ----------
data load 9.470 sec 2.570 sec
========= ========== ==========
</pre>
<p>The data load was significantly faster with PostgreSQL (about 3.7x);
this is probably due to the single commit statement at the end of the
load process, leading to batch-like insertion of all the books with
PostgreSQL, while the inserts were being done in a book-by-book manner
with MongoDB. Introducing an explicit PostgreSQL commit after each
insert would make PostgreSQL slower. However, I'm not as interested
in data set loading timings here as in the data set select timings
below; hence these were just tangential observations.</p>
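<p>The difference between one commit at the end and one commit per insert can be illustrated with a self-contained sketch. Note that it uses the standard library's <tt class="docutils literal">sqlite3</tt> with an in-memory database purely as a stand-in, so that it runs without a PostgreSQL server; it shows the two commit patterns only, and its timings say nothing about PostgreSQL. The function name <tt class="docutils literal">load</tt> is hypothetical.</p>

```python
import sqlite3


def load(rows, commit_each):
    """Insert rows into a fresh in-memory table; return the stored row count."""
    con = sqlite3.connect(":memory:")
    cur = con.cursor()
    cur.execute("CREATE TABLE test (recid INTEGER PRIMARY KEY, data TEXT)")
    for recid, data in rows:
        cur.execute("INSERT INTO test (recid, data) VALUES (?, ?)", (recid, data))
        if commit_each:
            con.commit()  # one transaction per row
    if not commit_each:
        con.commit()  # a single transaction for the whole load
    cur.execute("SELECT count(*) FROM test")
    count = cur.fetchone()[0]
    con.close()
    return count
```

<p>Both variants store the same rows; only the number of transactions differs, which is the factor suspected above of influencing the load time.</p>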
</div>
<div class="section" id="test-2-searching-for-years-returning-record-ids-back">
<h2>Test 2: searching for years, returning record IDs back</h2>
<p>Let us now search inside the JSON structure for a certain year and
return a Python list of record IDs.</p>
<div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">mongodb_test_select_speed</span><span class="p">(</span><span class="n">year</span><span class="p">):</span>
    <span class="n">res</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="k">for</span> <span class="n">book</span> <span class="ow">in</span> <span class="n">mg_col</span><span class="o">.</span><span class="n">find</span><span class="p">({</span><span class="s1">'year'</span><span class="p">:</span><span class="n">year</span><span class="p">},{</span><span class="s1">'recid'</span><span class="p">:</span><span class="mi">1</span><span class="p">,</span> <span class="s1">'_id'</span><span class="p">:</span><span class="mi">0</span><span class="p">}):</span>
        <span class="n">res</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">book</span><span class="p">[</span><span class="s1">'recid'</span><span class="p">])</span>
    <span class="k">return</span> <span class="n">res</span>
<span class="k">def</span> <span class="nf">postgresql_test_select_speed</span><span class="p">(</span><span class="n">year</span><span class="p">):</span>
    <span class="n">res</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="n">pg_cur</span><span class="o">.</span><span class="n">execute</span><span class="p">(</span><span class="s2">"SELECT recid FROM test WHERE data->>'year'=</span><span class="si">%s</span><span class="s2">;"</span><span class="p">,</span> <span class="p">(</span><span class="n">year</span><span class="p">,))</span>
    <span class="k">for</span> <span class="n">row</span> <span class="ow">in</span> <span class="n">pg_cur</span><span class="o">.</span><span class="n">fetchall</span><span class="p">():</span>
        <span class="n">res</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">row</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span>
    <span class="k">return</span> <span class="n">res</span>
</pre></div>
<p>Let us also test queries for year greater than a certain value:</p>
<div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">mongodb_test_select_speed_greater</span><span class="p">(</span><span class="n">year</span><span class="p">):</span>
    <span class="n">res</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="k">for</span> <span class="n">book</span> <span class="ow">in</span> <span class="n">mg_col</span><span class="o">.</span><span class="n">find</span><span class="p">({</span><span class="s1">'year'</span><span class="p">:</span> <span class="p">{</span><span class="s1">'$gt'</span><span class="p">:</span> <span class="n">year</span><span class="p">}},{</span><span class="s1">'recid'</span><span class="p">:</span><span class="mi">1</span><span class="p">,</span> <span class="s1">'_id'</span><span class="p">:</span><span class="mi">0</span><span class="p">}):</span>
        <span class="n">res</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">book</span><span class="p">[</span><span class="s1">'recid'</span><span class="p">])</span>
    <span class="k">return</span> <span class="n">res</span>
<span class="k">def</span> <span class="nf">postgresql_test_select_speed_greater</span><span class="p">(</span><span class="n">year</span><span class="p">):</span>
    <span class="n">res</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="n">pg_cur</span><span class="o">.</span><span class="n">execute</span><span class="p">(</span><span class="s2">"SELECT recid FROM test WHERE data->>'year'></span><span class="si">%s</span><span class="s2">;"</span><span class="p">,</span> <span class="p">(</span><span class="n">year</span><span class="p">,))</span>
    <span class="k">for</span> <span class="n">row</span> <span class="ow">in</span> <span class="n">pg_cur</span><span class="o">.</span><span class="n">fetchall</span><span class="p">():</span>
        <span class="n">res</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">row</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span>
    <span class="k">return</span> <span class="n">res</span>
</pre></div>
<p>The MongoDB connector yields one dictionary per hit, where a
dictionary looks like:</p>
<pre class="literal-block">
{u'recid': 1618033}
</pre>
<p>The PostgreSQL connector returns a list of tuples, where one
tuple looks like:</p>
<pre class="literal-block">
(1618033,)
</pre>
<p>Let us run the test query once for a rare year (1970) and once for
a frequent year (2012). Also, let us perform the test once without
a dedicated year index and once with it.</p>
<p>(Also let us verify that MongoDB and PostgreSQL both return the same
thing.)</p>
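<p>The verification itself needs only a few lines. The sketch below assumes that both select functions return plain lists of integer record IDs, as in the code above; the helper name <tt class="docutils literal">same_hits</tt> is made up for this example.</p>

```python
def same_hits(mongo_recids, postgres_recids):
    """Return True when both connectors returned the same record IDs."""
    # Neither query specifies an ordering, so compare sorted copies
    return sorted(mongo_recids) == sorted(postgres_recids)
```

<p>A check such as <tt class="docutils literal"><span class="pre">same_hits(mongodb_test_select_speed('1970'),</span> <span class="pre">postgresql_test_select_speed('1970'))</span></tt> can be run once before timing, so that the comparison cost does not enter the measurements.</p>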
<p>Here is example timing code:</p>
<div class="highlight"><pre><span></span><span class="n">x</span> <span class="o">=</span> <span class="n">mongodb_test_select_speed</span><span class="p">(</span><span class="s1">'1970'</span><span class="p">);</span> <span class="n">x</span><span class="o">.</span><span class="n">sort</span><span class="p">()</span>
<span class="k">print</span> <span class="s2">"*** mongodb year=1970 ...... </span><span class="si">%d</span><span class="s2"> [</span><span class="si">%d</span><span class="s2">,</span><span class="si">%d</span><span class="s2">,...,</span><span class="si">%d</span><span class="s2">,</span><span class="si">%d</span><span class="s2">]"</span> <span class="o">%</span> <span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">x</span><span class="p">),</span> <span class="n">x</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="n">x</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span> <span class="n">x</span><span class="p">[</span><span class="o">-</span><span class="mi">2</span><span class="p">],</span> <span class="n">x</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">])</span>
<span class="o">%</span><span class="n">timeit</span> <span class="n">x</span> <span class="o">=</span> <span class="n">mongodb_test_select_speed</span><span class="p">(</span><span class="s1">'1970'</span><span class="p">)</span>
<span class="o">%</span><span class="n">timeit</span> <span class="n">x</span> <span class="o">=</span> <span class="n">mongodb_test_select_speed</span><span class="p">(</span><span class="s1">'1970'</span><span class="p">)</span>
<span class="n">x</span> <span class="o">=</span> <span class="n">postgresql_test_select_speed</span><span class="p">(</span><span class="s1">'1970'</span><span class="p">);</span> <span class="n">x</span><span class="o">.</span><span class="n">sort</span><span class="p">()</span>
<span class="k">print</span> <span class="s2">"*** postgresql year=1970 ... </span><span class="si">%d</span><span class="s2"> [</span><span class="si">%d</span><span class="s2">,</span><span class="si">%d</span><span class="s2">,...,</span><span class="si">%d</span><span class="s2">,</span><span class="si">%d</span><span class="s2">]"</span> <span class="o">%</span> <span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">x</span><span class="p">),</span> <span class="n">x</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="n">x</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span> <span class="n">x</span><span class="p">[</span><span class="o">-</span><span class="mi">2</span><span class="p">],</span> <span class="n">x</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">])</span>
<span class="o">%</span><span class="n">timeit</span> <span class="n">x</span> <span class="o">=</span> <span class="n">postgresql_test_select_speed</span><span class="p">(</span><span class="s1">'1970'</span><span class="p">)</span>
<span class="o">%</span><span class="n">timeit</span> <span class="n">x</span> <span class="o">=</span> <span class="n">postgresql_test_select_speed</span><span class="p">(</span><span class="s1">'1970'</span><span class="p">)</span>
</pre></div>
<p>Here are the complete results:</p>
<pre class="literal-block">
========= ====== ======= ======= ========== ======= =========== =======
query     hits   MongoDB w/index PostgreSQL w/index uPostgreSQL w/index
--------- ------ ------- ------- ---------- ------- ----------- -------
year=1970     63   28 ms 0.56 ms     133 ms 0.19 ms      134 ms 0.19 ms
year=2012  5,110   43 ms 22.2 ms     137 ms 4.56 ms      137 ms 4.57 ms
year>2012  5,306   43 ms 22.7 ms     154 ms 5.78 ms      148 ms 5.76 ms
year>1970 49,563  184 ms 205. ms     181 ms 45.2 ms      174 ms 45.2 ms
========= ====== ======= ======= ========== ======= =========== =======
</pre>
<p>We can make several interesting observations already:</p>
<ul class="simple">
<li>MongoDB is faster than PostgreSQL for queries without an index,
especially when a small number of hits is returned (a speed-up of
about 4x). When a large number of hits is returned, the
difference is negligible.</li>
<li>PostgreSQL is faster than MongoDB for queries with an index, both when
a small and a large number of hits is returned (a speed-up of roughly
3x to 5x).</li>
<li>Creating an index is much more helpful for PostgreSQL than for
MongoDB.</li>
<li>As expected, PostgreSQL binary vs Unicode connections make little
difference here, since we are returning integers. The observed
difference in timings may serve as an indication of measurement
errors, e.g. the laptop operating system being more or less busy when
repeating the test in Unicode mode. (Naturally I've run the tests
in various orders a few times to make sure the main result trends are
reproducible, even though the numbers may differ.)</li>
</ul>
<p>How much of the difference is due to the database search speed
itself and how much to differences in connector object types? And what
will the timings look like if the query returned some other part of
the JSON structure?</p>
</div>
<div class="section" id="test-3-searching-for-years-returning-authors-back">
<h2>Test 3: searching for years, returning authors back</h2>
<p>In the previous test, record IDs were returned from JSON data in the case
of MongoDB, and from another table column in the case of PostgreSQL. This
simulates more realistically a scenario of a mixed SQL/noSQL database
application when using PostgreSQL. However, would the results change
if we returned data from JSON only?</p>
<p>To test this use case, let us return a list of book authors instead of
a list of record IDs:</p>
<div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">mongodb_test_select_speed_author</span><span class="p">(</span><span class="n">year</span><span class="p">):</span>
    <span class="n">res</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="k">for</span> <span class="n">book</span> <span class="ow">in</span> <span class="n">mg_col</span><span class="o">.</span><span class="n">find</span><span class="p">({</span><span class="s1">'year'</span><span class="p">:</span><span class="n">year</span><span class="p">},{</span><span class="s1">'author'</span><span class="p">:</span><span class="mi">1</span><span class="p">,</span> <span class="s1">'_id'</span><span class="p">:</span><span class="mi">0</span><span class="p">}):</span>
        <span class="n">res</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">book</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'author'</span><span class="p">,</span><span class="bp">None</span><span class="p">))</span>
    <span class="k">return</span> <span class="n">res</span>
<span class="k">def</span> <span class="nf">postgresql_test_select_speed_author</span><span class="p">(</span><span class="n">year</span><span class="p">):</span>
    <span class="n">res</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="n">pg_cur</span><span class="o">.</span><span class="n">execute</span><span class="p">(</span><span class="s2">"SELECT data->>'author' FROM test WHERE data->>'year'=</span><span class="si">%s</span><span class="s2">;"</span><span class="p">,</span> <span class="p">(</span><span class="n">year</span><span class="p">,))</span>
    <span class="k">for</span> <span class="n">row</span> <span class="ow">in</span> <span class="n">pg_cur</span><span class="o">.</span><span class="n">fetchall</span><span class="p">():</span>
        <span class="n">res</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">row</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span>
    <span class="k">return</span> <span class="n">res</span>
<span class="k">def</span> <span class="nf">mongodb_test_select_speed_greater_author</span><span class="p">(</span><span class="n">year</span><span class="p">):</span>
    <span class="n">res</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="k">for</span> <span class="n">book</span> <span class="ow">in</span> <span class="n">mg_col</span><span class="o">.</span><span class="n">find</span><span class="p">({</span><span class="s1">'year'</span><span class="p">:</span> <span class="p">{</span><span class="s1">'$gt'</span><span class="p">:</span> <span class="n">year</span><span class="p">}},{</span><span class="s1">'author'</span><span class="p">:</span><span class="mi">1</span><span class="p">,</span> <span class="s1">'_id'</span><span class="p">:</span><span class="mi">0</span><span class="p">}):</span>
        <span class="n">res</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">book</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'author'</span><span class="p">,</span><span class="bp">None</span><span class="p">))</span>
    <span class="k">return</span> <span class="n">res</span>
<span class="k">def</span> <span class="nf">postgresql_test_select_speed_greater_author</span><span class="p">(</span><span class="n">year</span><span class="p">):</span>
    <span class="n">res</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="n">pg_cur</span><span class="o">.</span><span class="n">execute</span><span class="p">(</span><span class="s2">"SELECT data->>'author' FROM test WHERE data->>'year'></span><span class="si">%s</span><span class="s2">;"</span><span class="p">,</span> <span class="p">(</span><span class="n">year</span><span class="p">,))</span>
    <span class="k">for</span> <span class="n">row</span> <span class="ow">in</span> <span class="n">pg_cur</span><span class="o">.</span><span class="n">fetchall</span><span class="p">():</span>
        <span class="n">res</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">row</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span>
    <span class="k">return</span> <span class="n">res</span>
</pre></div>
<p>The MongoDB connector internally returns a list of dictionaries:</p>
<pre class="literal-block">
{u'author': u'Wilf, Herbert S'} ... MongoDB
</pre>
<p>The PostgreSQL connector internally returns a list of tuples:</p>
<pre class="literal-block">
('Wilf, Herbert S',) .... PostgreSQL, binary strings
(u'Wilf, Herbert S',) ... PostgreSQL, Unicode extension
</pre>
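<p>The two row shapes above are what the test functions flatten into plain lists. A minimal sketch of that normalisation step, using only the sample row shown above; the helper names <tt class="docutils literal">authors_from_mongodb</tt> and <tt class="docutils literal">authors_from_postgresql</tt> are hypothetical and no database connection is involved:</p>

```python
def authors_from_mongodb(documents):
    """Flatten MongoDB-style result dictionaries into a list of authors."""
    # .get() yields None for documents that have no 'author' field.
    return [doc.get("author") for doc in documents]


def authors_from_postgresql(rows):
    """Flatten PostgreSQL-style one-column result tuples into a list of authors."""
    return [row[0] for row in rows]


# Sample rows shaped like the connector output shown above.
mongodb_rows = [{"author": "Wilf, Herbert S"}]
postgresql_rows = [("Wilf, Herbert S",)]

if __name__ == "__main__":
    print(authors_from_mongodb(mongodb_rows))
    print(authors_from_postgresql(postgresql_rows))
```

<p>Both helpers yield the same flat list, so any remaining timing difference comes from the database and the connector rather than from the result format handed to the application.</p>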
<p>Here are the results:</p>
<pre class="literal-block">
========= ====== ========== ========= ========== ========= =========== =========
query     hits   MongoDB    w/index   PostgreSQL w/index   uPostgreSQL w/index
--------- ------ ---------- --------- ---------- --------- ----------- ---------
year=1970     63    28.3 ms   0.59 ms     133 ms   0.35 ms      133 ms   0.43 ms
year=2012  5,110    45.8 ms   24.1 ms     153 ms   18.2 ms      155 ms   22.3 ms
year>2012  5,306    50.5 ms   24.8 ms     163 ms   19.7 ms      167 ms   24.0 ms
year>1970 49,563    199. ms   223. ms     308 ms   184. ms      352 ms   216. ms
========= ====== ========== ========= ========== ========= =========== =========
</pre>
<p>The results when returning authors confirm the general trends found
before: MongoDB is faster for queries without indexes; however, once
the indexes are switched on, PostgreSQL becomes faster. The
interesting phenomenon where index creation actually slows down
MongoDB for the largest result set (199 ms -> 223 ms) is still present.</p>
<p>As for the role of the PostgreSQL connector using binary strings vs
Unicode strings, we can see that Unicode connections lead to slightly
slower performance, which is understandable given the necessary UTF-8
conversions. It is interesting to note that, with indexes switched on,
PostgreSQL using Unicode strings slows down to become virtually
equal in speed to MongoDB.</p>
</div>
<div class="section" id="test-4-searching-for-years-returning-counts-back">
<h2>Test 4: searching for years, returning counts back</h2>
<p>What if we returned only the number of hits satisfying the given query
and measured that, instead of returning strings? Then the connectors
would not have to do various kinds of transformations and we would be
able to measure the "raw" search speed better.</p>
<div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">mongodb_test_select_speed_count</span><span class="p">(</span><span class="n">year</span><span class="p">):</span>
    <span class="k">return</span> <span class="n">mg_col</span><span class="o">.</span><span class="n">find</span><span class="p">({</span><span class="s1">'year'</span><span class="p">:</span><span class="n">year</span><span class="p">},{</span><span class="s1">'recid'</span><span class="p">:</span><span class="mi">1</span><span class="p">,</span> <span class="s1">'_id'</span><span class="p">:</span><span class="mi">0</span><span class="p">})</span><span class="o">.</span><span class="n">count</span><span class="p">()</span>
<span class="k">def</span> <span class="nf">postgresql_test_select_speed_count</span><span class="p">(</span><span class="n">year</span><span class="p">):</span>
    <span class="n">pg_cur</span><span class="o">.</span><span class="n">execute</span><span class="p">(</span><span class="s2">"SELECT COUNT(*) FROM test WHERE data->>'year'=</span><span class="si">%s</span><span class="s2">;"</span><span class="p">,</span> <span class="p">(</span><span class="n">year</span><span class="p">,))</span>
    <span class="k">return</span> <span class="n">pg_cur</span><span class="o">.</span><span class="n">fetchone</span><span class="p">()[</span><span class="mi">0</span><span class="p">]</span>
</pre></div>
<p>MongoDB connector internally returns an integer:</p>
<pre class="literal-block">
63
</pre>
<p>The PostgreSQL connector internally returns a single tuple:</p>
<pre class="literal-block">
(63L,)
</pre>
<p>Here are the results:</p>
<pre class="literal-block">
================ ====== ========== ========= ========== ========= =========== =========
query            result MongoDB    w/index   PostgreSQL w/index   uPostgreSQL w/index
---------------- ------ ---------- --------- ---------- --------- ----------- ---------
COUNT(year=1970)     63    25.4 ms  0.293 ms     133 ms  0.161 ms      133 ms  0.163 ms
COUNT(year=2012)  5,110    25.5 ms   1.07 ms     134 ms   1.56 ms      134 ms   1.60 ms
COUNT(year>2012)  5,306    25.5 ms   1.09 ms     145 ms   2.68 ms      145 ms   2.75 ms
COUNT(year>1970) 49,563    26.2 ms   6.82 ms     143 ms   19.2 ms      144 ms   18.9 ms
================ ====== ========== ========= ========== ========= =========== =========
</pre>
<p>In this test, virtually no data is being returned and converted by the
database connector. What we are measuring is close to pure JSON
select speed, though obviously influenced by database cache, file
system cache, logging system, and related settings. The tests confirm
that MongoDB is faster for queries without indexes. For the first
time, it is also faster than PostgreSQL for most queries with
indexes; only the smallest result set (year=1970) is still counted
faster by PostgreSQL.</p>
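<p>A note for readers repeating this test with a recent PyMongo: <tt class="docutils literal">Cursor.count()</tt> as used above is no longer available in PyMongo 4, where <tt class="docutils literal">Collection.count_documents()</tt> takes its place. The following is only a sketch under that assumption; <tt class="docutils literal">mongodb_count_by_year</tt> and the in-memory <tt class="docutils literal">FakeCollection</tt> are hypothetical names used so that the example runs without a MongoDB server, and timings obtained this way are not directly comparable with the table above.</p>

```python
def mongodb_count_by_year(collection, year):
    """Count the documents whose 'year' field equals `year`."""
    # count_documents() takes the query filter directly; no projection
    # is needed because no documents are returned to the client.
    return collection.count_documents({"year": year})


class FakeCollection:
    """Hypothetical in-memory stand-in for a PyMongo collection.

    Only simple equality filters are supported.
    """

    def __init__(self, documents):
        self.documents = documents

    def count_documents(self, query):
        return sum(
            all(doc.get(key) == value for key, value in query.items())
            for doc in self.documents
        )


if __name__ == "__main__":
    books = FakeCollection([{"year": "1970"}, {"year": "2012"}, {"year": "1970"}])
    print(mongodb_count_by_year(books, "1970"))
```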
</div>
<div class="section" id="test-5-mongodb-configuration">
<h2>Test 5: MongoDB configuration</h2>
<p>The above tests were performed with Debian GNU/Linux's default MongoDB
and PostgreSQL configurations. This was deliberate in order to have
<em>some</em> starting point of comparison. However, in real life, the
configurations would be tweaked to fit the application.</p>
<p>As an example, let us note that MongoDB logs every query taking longer
than 100 ms into its log files. How big an effect does this have in our
particular tests?</p>
<p>Looking at the MongoDB log file, several slow queries can be seen logged:</p>
<pre class="literal-block">
$ tail /var/log/mongodb/mongodb.log
[...]
[conn1] getmore test_database.test_collection query: { year: { $gt: "1970" } } cursorid:1600782125584581 ntoreturn:0 keyUpdates:0 locks(micros) r:103319 nreturned:49462 reslen:791412 103ms
[conn1] getmore test_database.test_collection query: { year: { $gt: "1970" } } cursorid:1606323121267136 ntoreturn:0 keyUpdates:0 locks(micros) r:101010 nreturned:49462 reslen:791412 101ms
[conn1] getmore test_database.test_collection query: { year: { $gt: "1970" } } cursorid:1742429610105362 ntoreturn:0 keyUpdates:0 locks(micros) r:102457 nreturned:49462 reslen:1592469 102ms
[conn1] getmore test_database.test_collection query: { year: { $gt: "1970" } } cursorid:1746596008196226 ntoreturn:0 keyUpdates:0 locks(micros) r:101370 nreturned:49462 reslen:1592469 101ms
</pre>
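<p>To see how often and how badly queries cross the threshold, the tail of each log line can be picked apart with a regular expression. This is a sketch that assumes the exact line layout shown above, which differs between MongoDB versions; <tt class="docutils literal">parse_slow_query</tt> is a hypothetical helper name.</p>

```python
import re

# Matches the tail of a slow-query log line: "... nreturned:N reslen:N NNNms"
SLOW_QUERY = re.compile(
    r"nreturned:(?P<nreturned>\d+) reslen:(?P<reslen>\d+) (?P<millis>\d+)ms$"
)


def parse_slow_query(line):
    """Return a dict of nreturned, reslen and millis, or None if no match."""
    match = SLOW_QUERY.search(line.strip())
    if match is None:
        return None
    return {key: int(value) for key, value in match.groupdict().items()}


# First slow-query line from the log excerpt above.
SAMPLE = (
    '[conn1] getmore test_database.test_collection query: '
    '{ year: { $gt: "1970" } } cursorid:1600782125584581 ntoreturn:0 '
    'keyUpdates:0 locks(micros) r:103319 nreturned:49462 reslen:791412 103ms'
)

if __name__ == "__main__":
    print(parse_slow_query(SAMPLE))
```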
<p>Let us make MongoDB's slow query logging less eager by raising the
slow-query threshold of the profiler from 100 ms to, say, 2 seconds:</p>
<pre class="literal-block">
$ mongo
> db.setProfilingLevel(1,2000)
{ "was" : 1, "slowms" : 100, "ok" : 1 }
> db.setProfilingLevel(1,2000)
{ "was" : 1, "slowms" : 2000, "ok" : 1 }
</pre>
<p>Let us re-test the "worst case scenario" query:</p>
<div class="highlight"><pre><span></span><span class="o">%</span><span class="n">timeit</span> <span class="n">x</span> <span class="o">=</span> <span class="n">mongodb_test_select_speed_greater_author</span><span class="p">(</span><span class="s1">'1970'</span><span class="p">)</span>
</pre></div>
<p>and compare the results.</p>
<p>We see that these more permissive query profiling settings have no
measurable effect on query timings. For our application, the
timings stay the same. This is simply because the log file is not
written to very often: fast queries (less than 100 ms) are not logged
at all, and slow queries are not repeated very frequently by IPython
when measuring their timings. Hence this particular
query-logging effect is not important for our particular set of tests.</p>
<p>We could go further in this direction, e.g. to study the effect of
query cache settings etc. However, let's stop here, because for our
relatively small data set (50k records of about 150MB total size) we
have gathered enough insight into our particular task at hand.</p>
</div>
<div class="section" id="test-6-postgresql-returning-sql-data-columns-vs-json-data-fields">
<h2>Test 6: PostgreSQL returning SQL data columns vs JSON data fields</h2>
<p>Let's perform one more test. Above, when querying for year and
returning record IDs that match the given year, PostgreSQL queries were
returning record IDs from the SQL table column, not from the JSON data
itself. This was done in order to simulate a mixed SQL/noSQL
application setup. Later, we tested returning JSON data from
PostgreSQL as well, but for author names, not for record IDs.</p>
<p>A question arises: how exactly would PostgreSQL performance differ
when returning record IDs from the SQL column and from the JSON data
itself?</p>
<p>A small IPython experiment reveals:</p>
<pre class="literal-block">
In [30]: %timeit pg_cur.execute("SELECT recid FROM test WHERE data->>'year'=%s;", ('2012',))
10 loops, best of 3: 137 ms per loop
In [31]: %timeit pg_cur.execute("SELECT data->>'recid' FROM test WHERE data->>'year'=%s;", ('2012',))
10 loops, best of 3: 151 ms per loop
</pre>
<p>Returning record IDs from the SQL table column is faster than
returning record IDs from the JSON data by about 10 per cent.</p>
</div>
<div class="section" id="conclusions">
<h2>Conclusions</h2>
<p>Let us represent the above tables in terms of the relative
MongoDB-time-over-PostgreSQL-time slowdown factor for each run
(without and with indexes, without and with Unicode PostgreSQL
connectors). Values below 1 mean that MongoDB was faster; values
above 1 mean that PostgreSQL was faster:</p>
<pre class="literal-block">
================ ======= ======= ======= =======
query            M/P     Mi/Pi   M/uP    Mi/uPi
---------------- ------- ------- ------- -------
RCIDS(year=1970)   0.211   2.947   0.209   2.947
RCIDS(year=2012)   0.314   4.868   0.314   4.858
RCIDS(year>2012)   0.279   3.927   0.291   3.941
RCIDS(year>1970)   1.017   4.535   1.057   4.535
---------------- ------- ------- ------- -------
AUTHS(year=1970)   0.213   1.686   0.213   1.372
AUTHS(year=2012)   0.299   1.324   0.295   1.081
AUTHS(year>2012)   0.310   1.259   0.302   1.033
AUTHS(year>1970)   0.646   1.212   0.565   1.032
---------------- ------- ------- ------- -------
COUNT(year=1970)   0.191   1.820   0.191   1.798
COUNT(year=2012)   0.190   0.686   0.190   0.669
COUNT(year>2012)   0.176   0.407   0.176   0.396
COUNT(year>1970)   0.183   0.355   0.182   0.361
================ ======= ======= ======= =======
</pre>
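<p>Each factor in this table is one timing divided by another. As a small sketch, the first two columns of the RCIDS rows can be recomputed from the Test 2 timings; the dictionary and function names below are hypothetical, while the numbers are copied from the Test 2 results table.</p>

```python
# Timings in milliseconds from the Test 2 table (record IDs returned):
# query -> (MongoDB, MongoDB w/index, PostgreSQL, PostgreSQL w/index)
RECID_TIMINGS_MS = {
    "year=1970": (28, 0.56, 133, 0.19),
    "year=2012": (43, 22.2, 137, 4.56),
    "year>2012": (43, 22.7, 154, 5.78),
    "year>1970": (184, 205.0, 181, 45.2),
}


def slowdown_factors(timings):
    """Return (M/P, Mi/Pi): MongoDB time divided by PostgreSQL time."""
    mongo, mongo_idx, postgres, postgres_idx = timings
    return mongo / postgres, mongo_idx / postgres_idx


if __name__ == "__main__":
    for query, timings in RECID_TIMINGS_MS.items():
        print("RCIDS(%s) %.3f %.3f" % ((query,) + slowdown_factors(timings)))
```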
<p>We can conclude that:</p>
<ul class="simple">
<li>MongoDB is almost always faster when returning query counts.</li>
<li>MongoDB is almost always faster for queries not using indexes.</li>
<li>PostgreSQL is almost always faster for queries using indexes.</li>
<li>It pays off to create indexes for frequently used JSON fields. Substantial
gains may be expected, typically 1000 per cent.</li>
<li>It pays off to return UTF-8 binary strings rather than Unicode
strings if the application supports it. Moderate gains may be
expected, typically 30 per cent.</li>
<li>It pays off to use SQL columns for ID-like fields. Small gains may
be expected, typically 10 per cent.</li>
</ul>
<p>Functionality-wise, PostgreSQL makes it possible to comfortably
combine relational and non-relational data in the same database. The
present tests showed that it achieves this in a very efficient manner,
too. Coupled with its user-level transaction mechanism,
PostgreSQL seems to be an excellent all-in-one choice for mixed
RDBMS-JSON applications.</p>
<p>Further outlook: it would be interesting to compare timings for the
"ElasticSearch-as-a-DB" JSON storage technique as well.</p>
</div>
SSH Tunnelling2013-05-27T10:32:00+02:002013-05-27T10:32:00+02:00Tibor Šimkotag:tiborsimko.org,2013-05-27:/ssh-tunnelling.html<p>Imagine being at home and wanting to connect to a web site sitting
behind the corporate firewall. The web site is not open to the
outside world; one can only use it from within the corporate IP domain
range. How to quickly set up an SSH tunnel to reach it?</p>
<!-- PELICAN_END_SUMMARY --><div class="section" id="reaching-an-internal-web-site-via-ssh-tunnel">
<h2>Reaching an internal web site via SSH tunnel</h2>
<p>If the web site is called <tt class="docutils literal"><span class="pre">http://internal.example.org</span></tt>, and if the
publicly connectable machine is <tt class="docutils literal">public.example.org</tt>, then you can
establish an SSH tunnel to the internal web site by running the
following command on your laptop:</p>
<pre class="literal-block">
ssh -f johndoe@public.example.org -L 3000:internal.example.org:80 -N
</pre>
<p>After this, opening <tt class="docutils literal"><span class="pre">http://localhost:3000/</span></tt> in your browser will
actually show <tt class="docutils literal"><span class="pre">http://internal.example.org/</span></tt> as seen from the
<tt class="docutils literal">public.example.org</tt> machine. That is, from within the corporate IP
domain range.</p>
<p>This technique provides a much faster browsing experience when compared
to logging in via <tt class="docutils literal">ssh <span class="pre">-Y</span></tt> to <tt class="docutils literal">public.example.org</tt> and starting a
remote browser there.</p>
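<p>When several such tunnels are needed, the command line can be assembled programmatically. The following sketch only builds the argument list with the same <tt class="docutils literal"><span class="pre">-f</span></tt>, <tt class="docutils literal"><span class="pre">-L</span></tt> and <tt class="docutils literal"><span class="pre">-N</span></tt> options as above; <tt class="docutils literal">build_tunnel_command</tt> is a hypothetical helper name, and actually opening the tunnel is left to the caller.</p>

```python
import shlex


def build_tunnel_command(gateway, local_port, target_host, target_port):
    """Build the ssh argument list for a local port forward via `gateway`."""
    forward = "%d:%s:%d" % (local_port, target_host, target_port)
    return ["ssh", "-f", gateway, "-L", forward, "-N"]


if __name__ == "__main__":
    command = build_tunnel_command(
        "johndoe@public.example.org", 3000, "internal.example.org", 80
    )
    # Print the command for inspection; to open the tunnel one could pass
    # the list to subprocess.check_call(command) instead.
    print(" ".join(shlex.quote(part) for part in command))
```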
</div>
<div class="section" id="jenkins-invenio-software-org">
<h2>jenkins.invenio-software.org</h2>
<p>For example, to access the Invenio Jenkins server from outside CERN,
first open an SSH tunnel like this:</p>
<pre class="literal-block">
ssh -f lxplus.cern.ch -L 3001:jenkins.invenio-software.org:443 -N
</pre>
<p>then open <tt class="docutils literal"><span class="pre">https://localhost:3001/</span></tt> in the browser.</p>
</div>
Chromium and PDF2013-04-08T02:57:00+02:002013-04-08T02:57:00+02:00Tibor Šimkotag:tiborsimko.org,2013-04-08:/chromium-and-pdf.html<p>Chrome displays PDF files natively inside the browser via a
closed-source <tt class="docutils literal">libpdf</tt> library. This library is naturally not
included in Chromium, the open source brother of Chrome. Can one
achieve the same functionality with Chromium?</p>
<!-- PELICAN_END_SUMMARY --><p>If one wants to display PDFs inline with Chromium too, one technique
is to extract <tt class="docutils literal">libpdf</tt> library from Chrome:</p>
<pre class="literal-block">
mkdir /tmp/xyzzy && cd /tmp/xyzzy
wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
ar vx google-chrome-stable_current_amd64.deb
tar --lzma -xvf data.tar.lzma
sudo cp ./opt/google/chrome/libpdf.so /usr/lib/chromium/
cd $HOME && rm -rf /tmp/xyzzy
</pre>
<p>After a restart of Chromium, one can go to <tt class="docutils literal">about:plugins</tt> to check
whether the libpdf plugin is recognised and enabled.</p>
<p>PDF files should now open inside the browser.</p>
Git Fetch Trouble Workaround2013-01-07T17:20:00+01:002013-01-07T17:20:00+01:00Tibor Šimkotag:tiborsimko.org,2013-01-07:/git-fetch-trouble-workaround.html<p>When fetching from a git repository such as the <a class="reference external" href="http://invenio-software.org/repo">Invenio</a> one, you may encounter a
situation where fetching a branch does not work due to missing
objects, all the while cloning the same repository anew works well.
Here is how to work around the fetching problem.</p>
<!-- PELICAN_END_SUMMARY --><p>If fetching replies:</p>
<pre class="literal-block">
$ cd ~/private/src/invenio
$ git fetch sam -p
error: Unable to find 990baa93fc7bfd2beda20a70daf28bb5eda43976 under http://invenio-software.org/repo/personal/invenio-sam
Cannot obtain needed tree 990baa93fc7bfd2beda20a70daf28bb5eda43976
while processing commit 01a973828cef3385f23f4a58d9f630c946426d83.
error: Fetch failed.
</pre>
<p>Then the workaround is to temporarily make a full clone of
the repository and fetch the missing tree from there:</p>
<pre class="literal-block">
$ cd /tmp
$ git clone --bare http://invenio-software.org/repo/personal/invenio-sam
$ cd -
$ git remote add samtmp /tmp/invenio-sam
$ git fetch samtmp
$ git remote rm samtmp
</pre>
<p>Now regular fetching should work again:</p>
<pre class="literal-block">
$ git fetch sam # now works
</pre>
Emacs Bindings in GTK Applications2012-06-06T04:20:00+02:002012-06-06T04:20:00+02:00Tibor Šimkotag:tiborsimko.org,2012-06-06:/emacs-bindings-in-gtk-apps.html<p>Emacs offers many functionalities natively; I routinely use it for
writing, email, news, chat, programming, web browsing, image viewing,
and much more. Sharing the same environment and the same keyboard
shortcuts everywhere is good for productivity. However, what if one
uses a standalone GTK application such as Chromium or Pidgin? Is it
possible to share most of Emacs keyboard shortcuts there as well?</p>
<!-- PELICAN_END_SUMMARY --><div class="section" id="setting-up-emacs-key-binding-theme">
<h2>Setting up Emacs key binding theme</h2>
<p>The first step is to configure the desktop environment to use
Emacs-like key bindings generally. Depending on the window manager,
it can be done in several ways, for example:</p>
<ul>
<li><p class="first">Gnome2 and GTK-2 apps:</p>
<pre class="literal-block">
$ grep gtk-key-theme-name ~/.gtkrc-2.0
gtk-key-theme-name = "Emacs"
</pre>
</li>
<li><p class="first">Gnome3 and GTK-3 apps:</p>
<pre class="literal-block">
$ gnome-tweak-tool # menu "Theme -> Keybinding Theme -> Emacs"
</pre>
</li>
<li><p class="first">Xfce:</p>
<pre class="literal-block">
$ xfconf-query -c xsettings -p /Gtk/KeyThemeName -s Emacs
</pre>
</li>
</ul>
<p>This will set many useful keyboard shortcuts such as <tt class="docutils literal"><span class="pre">C-a</span></tt> to go to
the beginning of line, <tt class="docutils literal"><span class="pre">C-e</span></tt> to go to the end of line, <tt class="docutils literal"><span class="pre">M-f</span></tt> to
move forward a word, <tt class="docutils literal"><span class="pre">M-b</span></tt> to move backward a word, etc.</p>
<p>However, not all key bindings are usually covered. For example, I'm
used to hitting <tt class="docutils literal"><span class="pre">M-<backspace></span></tt> frequently, and this keyboard
shortcut is not supported out of the box. Can it be added?</p>
</div>
<div class="section" id="example-chromium-and-pidgin">
<h2>Example: Chromium and Pidgin</h2>
<p>By way of example, let's discuss the Chromium and Pidgin GTK
applications, where we would like to make <tt class="docutils literal"><span class="pre">M-<backspace></span></tt> delete
the previous word. Let's start by checking which GTK library they
are linked to:</p>
<pre class="literal-block">
$ ldd /usr/bin/pidgin | grep gtk
libgtkspell.so.0 => /usr/lib/libgtkspell.so.0 (0x00007f054d4c9000)
libgtk-x11-2.0.so.0 => /usr/lib/x86_64-linux-gnu/libgtk-x11-2.0.so.0 (0x00007f054ce87000)
$ ldd /usr/lib/chromium/chromium | grep gtk
libgtk-x11-2.0.so.0 => /usr/lib/x86_64-linux-gnu/libgtk-x11-2.0.so.0 (0x00007ffa726bb000)
$ dpkg -S /usr/lib/x86_64-linux-gnu/libgtk-x11-2.0.so.0
libgtk2.0-0:amd64: /usr/lib/x86_64-linux-gnu/libgtk-x11-2.0.so.0
</pre>
<p>Now let's see which Emacs bindings are configured by default in the GTK2
Emacs key theme:</p>
<pre class="literal-block">
$ cat /usr/share/themes/Emacs/gtk-2.0-key/gtkrc
binding "gtk-emacs-text-entry"
{
bind "<ctrl>b" { "move-cursor" (logical-positions, -1, 0) }
bind "<shift><ctrl>b" { "move-cursor" (logical-positions, -1, 1) }
bind "<ctrl>f" { "move-cursor" (logical-positions, 1, 0) }
bind "<shift><ctrl>f" { "move-cursor" (logical-positions, 1, 1) }
bind "<alt>b" { "move-cursor" (words, -1, 0) }
bind "<shift><alt>b" { "move-cursor" (words, -1, 1) }
bind "<alt>f" { "move-cursor" (words, 1, 0) }
bind "<shift><alt>f" { "move-cursor" (words, 1, 1) }
[...]
bind "<ctrl>w" { "delete-from-cursor" (word-ends, -1) }
</pre>
<p>However, there is no binding for backspace:</p>
<pre class="literal-block">
$ grep -c -i backspace /usr/share/themes/Emacs/gtk-2.0-key/gtkrc
0
</pre>
</div>
<div class="section" id="adding-a-binding-for-m-backspace">
<h2>Adding a binding for <tt class="docutils literal"><span class="pre">M-<backspace></span></tt></h2>
<p>Let's edit <tt class="docutils literal"><span class="pre">~/.gtkrc-2.0</span></tt> and enter the above <tt class="docutils literal"><ctrl>w</tt> definition
also for <tt class="docutils literal"><alt>BackSpace</tt>:</p>
<pre class="literal-block">
$ cat ~/.gtkrc-2.0
binding "gtk-emacs-text-entry"
{
bind "<alt>BackSpace" { "delete-from-cursor" (word-ends, -1) }
}
</pre>
<p>Now, after re-login, the keyboard shortcut <tt class="docutils literal"><span class="pre">M-<backspace></span></tt> will work
perfectly well in Chromium or Pidgin.</p>
<p>This key binding is very useful, as it is often faster to delete the
whole word and re-type it than to correct wrong letters
one by one. If someone has <tt class="docutils literal"><span class="pre">M-<backspace></span></tt> hard-wired
in muscle memory, the same shortcut can now be used regardless of
whether one happens to be in Emacs or in Chromium.</p>
</div>
<div class="section" id="switching-off-some-native-application-bindings">
<h2>Switching off (some) native application bindings</h2>
<p>Going further, it is usually a good idea to alter native application
key bindings if they conflict with frequent Emacs ones. This is
usually customisable inside the given GTK application.</p>
<p>By way of example, let's take Pidgin. When editing a message,
<tt class="docutils literal"><span class="pre">M-f</span></tt> invokes by default a font menu, which is not a very useful
feature to me. Hence I can use "Ungroup items" to deactivate it; or I
can use "Options -> Show Formatting Toolbar" to disable it
persistently.</p>
</div>
<div class="section" id="conclusions">
<h2>Conclusions</h2>
<p>With a few lines of code, Emacs-like key bindings can be added to
well-behaved GTK applications such as Chromium or Pidgin, including
custom key bindings.</p>
<p>(P.S. When looking for better Emacs-like keyboard productivity
experience in GTK applications, it is usually interesting to go even
further. E.g. I've been using <em>Vimperator</em> extension for Firefox,
<em>Conkeror</em> XULRunner based browser, <em>Vimium</em> extension for Chromium,
<em>Edit with Emacs</em> extension for Chromium, etc. These are covered
elsewhere in this blog.)</p>
</div>
MySQL Replication Repairing2012-05-30T11:17:00+02:002012-05-30T11:17:00+02:00Tibor Šimkotag:tiborsimko.org,2012-05-30:/mysql-replication-repairing.html<p>MySQL replication stopped due to a disk full situation on MySQL
server. Here are notes on how I repaired it.</p>
<!-- PELICAN_END_SUMMARY --><p>Due to /opt being full on DB master, the replication crashed.
Symptoms, on DB slave:</p>
<pre class="literal-block">
PCUDSSW1513> sudo tail -100 /var/log/mysqld.log
120524 9:43:48 [ERROR] Got fatal error 1236: 'binlog truncated in the middle of event' from master when reading data from binary log
120524 9:43:48 [Note] Slave I/O thread exiting, read up to log 'mysql-bin.000063', position 399760643
</pre>
<p>Checked the given binary log file on DB master:</p>
<pre class="literal-block">
PCUDSSX1501 2 /opt/mysql-data> sudo -u mysql mysqlbinlog --start-position 399760643 'mysql-bin.000063'
/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
ERROR: Error in Log_event::read_log_event(): 'read error', data_len: 14597, event_type: 2
Could not read entry at offset 399760643:Error in log format or read error
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
</pre>
<p>Similar troubles for some positions before that. So looked at the
whole file:</p>
<pre class="literal-block">
PCUDSSX1501 2 /opt/mysql-data> sudo -u mysql mysqlbinlog 'mysql-bin.000063' > /opt/simko/zzz.sql
ERROR: Error in Log_event::read_log_event(): 'read error', data_len: 14597, event_type: 2
Could not read entry at offset 399760643:Error in log format or read error
</pre>
<p>and:</p>
<pre class="literal-block">
PCUDSSX1501 2 /opt/mysql-data> tail -10 /opt/simko/zzz.sql
/*!*/;
# at 399760393
#120523 22:22:40 server id 201101501 end_log_pos 399760643 Query thread_id=77556 exec_time=0 error_code=0
SET TIMESTAMP=1337804560/*!*/;
INSERT INTO rnkDOWNLOADS (id_bibrec,id_bibdoc,file_version,file_format,id_user,client_host,download_time) VALUES (1157741,94200,'','GIF;ICON',0,INET_ATON('24.222.171.86'),NOW())
/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
</pre>
<p>Checked whether this data exists on DB slave:</p>
<pre class="literal-block">
mysql> SELECT * FROM rnkDOWNLOADS WHERE id_bibrec=1157741 AND id_bibdoc=94200 AND client_host=INET_ATON('24.222.171.86');
+-----------+---------------------+-------------+---------+-----------+--------------+-------------+---------+------------------+
| id_bibrec | download_time | client_host | id_user | id_bibdoc | file_version | file_format | referer | display_position |
+-----------+---------------------+-------------+---------+-----------+--------------+-------------+---------+------------------+
| 1157741 | 2012-05-23 22:22:40 | 417246038 | 0 | 94200 | 0 | GIF;ICON | NULL | 0 |
+-----------+---------------------+-------------+---------+-----------+--------------+-------------+---------+------------------+
1 row in set (0.87 sec)
</pre>
<p>Seems it does.</p>
<p>Indeed, the <tt class="docutils literal">end_log_pos</tt> value of 399760643 in the <tt class="docutils literal">zzz.sql</tt>
file above corresponds exactly to the DB slave's position of 399760643:</p>
<pre class="literal-block">
mysql> SHOW SLAVE STATUS\G;
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: 137.138.198.204
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000063
Read_Master_Log_Pos: 399760643
Relay_Log_File: mysqld-relay-bin.000191
Relay_Log_Pos: 235
Relay_Master_Log_File: mysql-bin.000063
Exec_Master_Log_Pos: 399760643
</pre>
<p>Hence it seems that everything that DB master had in the binary log
file before the crash was correctly replicated on the DB slave.</p>
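<p>The comparison above can be sketched in a few lines of Python. This is
only an illustration, not part of the original repair: the helper name
<tt class="docutils literal">last_end_log_pos</tt> is hypothetical, and the input is the excerpt of
<tt class="docutils literal">zzz.sql</tt> quoted earlier.</p>

```python
import re


def last_end_log_pos(binlog_dump):
    """Return the last end_log_pos value found in mysqlbinlog text output."""
    positions = re.findall(r"end_log_pos\s+(\d+)", binlog_dump)
    if not positions:
        raise ValueError("no end_log_pos found in the dump")
    return int(positions[-1])


# Excerpt of the tail of zzz.sql as shown above.
dump_tail = """\
# at 399760393
#120523 22:22:40 server id 201101501 end_log_pos 399760643 Query thread_id=77556 exec_time=0 error_code=0
SET TIMESTAMP=1337804560/*!*/;
"""

# Exec_Master_Log_Pos as reported by SHOW SLAVE STATUS on the slave.
slave_exec_pos = 399760643

# True means the slave executed everything readable in the master's file.
print(last_end_log_pos(dump_tail) == slave_exec_pos)
```

<p>If the two numbers differed, some events readable on the master would
not have reached the slave, and skipping to the next file would lose them.</p>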
<p>Consequently let's try to start replication anew by forcing the next
available position. Not the best, but may work. (See also
P.S. below.)</p>
<p>The start of the next DB master binary log file looks like:</p>
<pre class="literal-block">
(sudo -u mysql mysqlbinlog 'mysql-bin.000064' | head -15)
/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 4
#120523 23:54:51 server id 201101501 end_log_pos 98 Start: binlog v 4, server v 5.0.95-log created 120523 23:54:51 at startup
ROLLBACK/*!*/;
# at 98
#120523 23:54:53 server id 201101501 end_log_pos 345 Query thread_id=5 exec_time=0 error_code=0
use cdsweb/*!*/;
SET TIMESTAMP=1337810093/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=1, @@session.unique_checks=1/*!*/;
SET @@session.sql_mode=0/*!*/;
/*!\C utf8 *//*!*/;
SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=8/*!*/;
SET @@session.time_zone='SYSTEM'/*!*/;
</pre>
<p>Let's try to restart DB replication from file 64 position 98 by
executing (on DB slave):</p>
<pre class="literal-block">
mysql> STOP SLAVE;
mysql> CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000064',MASTER_LOG_POS=98;
mysql> START SLAVE;
</pre>
<p>This restarted the replication.</p>
<p>As almost a week has elapsed since the DB master crashed, the DB slave
is quite far behind:</p>
<pre class="literal-block">
mysql> SHOW SLAVE STATUS\G;
Seconds_Behind_Master: 560616
</pre>
<p>But the replication is running again, catching up with the past.
Let's see how much time it will take for the slave to catch up.</p>
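<p>To put the <tt class="docutils literal">Seconds_Behind_Master</tt> value into perspective, it can be
converted into days and hours. A minimal sketch, using only the number
reported above:</p>

```python
from datetime import timedelta

# Seconds_Behind_Master as reported by SHOW SLAVE STATUS above.
lag = timedelta(seconds=560616)

print(lag)  # 6 days, 11:43:36
```

<p>That is about six and a half days, consistent with the "almost a week"
since the crash.</p>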
<p>P.S. For additional safety, one may prefer not to rely on such an
analysis of a truncated binary log file, but rather to set up DB
replication from scratch when service time permits.</p>
MySQL Replication Purging2012-05-30T11:07:00+02:002012-05-30T11:07:00+02:00Tibor Šimkotag:tiborsimko.org,2012-05-30:/mysql-replication-purging.html<p>How to purge old unnecessary MySQL binary log files on DB master,
after checking replication status.</p>
<!-- PELICAN_END_SUMMARY --><p>On DB master (PCUDSSX1501), did:</p>
<pre class="literal-block">
mysql> SHOW MASTER STATUS;
+------------------+----------+--------------+------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+----------+--------------+------------------+
| mysql-bin.000075 | 277790 | | |
+------------------+----------+--------------+------------------+
</pre>
<p>On DB slave (PCUDSSW1513), did:</p>
<pre class="literal-block">
mysql> SHOW SLAVE STATUS\G;
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: 137.138.198.204
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000063
Read_Master_Log_Pos: 399760643
Relay_Log_File: mysqld-relay-bin.000191
Relay_Log_Pos: 235
Relay_Master_Log_File: mysql-bin.000063
Exec_Master_Log_Pos: 399760643
</pre>
<p>As one can see, DB slave uses binary log file <tt class="docutils literal"><span class="pre">mysql-bin.000063</span></tt>.
Hence we can clean all previous DB master log files up to that one.</p>
<p>On DB master (PCUDSSX1501), did:</p>
<pre class="literal-block">
mysql> PURGE MASTER LOGS TO 'mysql-bin.000063';
Query OK, 0 rows affected, 13 warnings (1 min 1.80 sec)
</pre>
<p>This frees a considerable amount of disk space. (About 1 GB per
binary log file.)</p>
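<p>The rule applied above (purge only the files strictly older than the
one the slave is still reading) can be sketched as follows. The function
name and the list of file names are hypothetical; in practice the list
would come from the master's data directory or from its binary log
index.</p>

```python
def purgeable_binlogs(master_logs, slave_log_file):
    """Return master binary logs strictly older than the slave's current file."""

    def index(name):
        # 'mysql-bin.000063' -> 63
        return int(name.rsplit(".", 1)[1])

    limit = index(slave_log_file)
    older = [name for name in master_logs if limit > index(name)]
    return sorted(older, key=index)


# Hypothetical listing of binary logs present on the master.
logs = ["mysql-bin.%06d" % i for i in range(60, 76)]

# The slave reported Relay_Master_Log_File: mysql-bin.000063.
print(purgeable_binlogs(logs, "mysql-bin.000063"))
```

<p>The file named in <tt class="docutils literal">PURGE MASTER LOGS TO</tt> is itself kept; only the
earlier ones are removed, which is why the slave's current file is a safe
argument.</p>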
MySQL Replication Setup2012-04-18T23:18:00+02:002012-04-18T23:18:00+02:00Tibor Šimkotag:tiborsimko.org,2012-04-18:/mysql-replication-setup.html<p>Notes on how to set up MySQL replication.</p>
<!-- PELICAN_END_SUMMARY --><p>Firstly, stop the web application nodes and redirect users to a "please come
back later" web page.</p>
<p>Secondly, on DB node (PCUDSSX1501), switch on <tt class="docutils literal"><span class="pre">log-bin</span></tt> in
<tt class="docutils literal">/etc/my.cnf</tt> if not done yet, then perform in one DB client
connection:</p>
<pre class="literal-block">
mysql> TRUNCATE session;
mysql> FLUSH TABLES WITH READ LOCK;
mysql> SHOW MASTER STATUS;
+------------------+----------+--------------+------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+----------+--------------+------------------+
| mysql-bin.000001 | 1061 | | |
+------------------+----------+--------------+------------------+
1 row in set (0.00 sec)
</pre>
<p>Note the log file and position; they will be used further down to start
the replication.</p>
<p>In another screen window on PCUDSSX1501, take DB backup and restart DB
server via:</p>
<pre class="literal-block">
mysqladmin -u root -p shutdown
sudo rm /opt/mysql-data/mysqld-slow.log
sudo cp -a /opt/mysql-data /opt/mysql-data-20120418
sudo /etc/init.d/mysqld start
</pre>
<p>Note that the copy may take about 25 minutes, since the DB size is close
to 40 GB.</p>
<p>Now the web application can be restarted back to production.</p>
<p>Thirdly, set up a new replication user on DB master PCUDSSX1501:</p>
<pre class="literal-block">
mysql> CREATE USER repl@137.138.4.157 IDENTIFIED BY 'pass123';
mysql> GRANT REPLICATION SLAVE ON *.* TO repl@137.138.4.157;
</pre>
<p>Copy over DB master snapshot into DB slave node PCUDSSW1513:</p>
<pre class="literal-block">
PCUDSSX1501> sudo rsync -rlptDvz -e ssh /opt/mysql-data-20120418/ root@pcudssw1513:/opt/mysql-data-from-pcudssx1501-20120418
PCUDSSX1501> sudo rsync -rlptDvz -e ssh /opt/mysql-data-20120418/ root@pcudssw1513:/opt/mysql-data-from-pcudssx1501-20120418 # once more to test
PCUDSSX1501> sudo rm -rf /opt/mysql-data-20120418/ # free space
</pre>
<p>Install parts of DB master snapshot on DB slave node PCUDSSW1513:</p>
<pre class="literal-block">
sudo /etc/init.d/mysqld stop
sudo mv /opt/mysql-data-from-pcudssx1501-20120418/cdsweb /opt/mysql-data/
sudo cp -a /opt/mysql-data-from-pcudssx1501-20120418/ib* /opt/mysql-data/
sudo chown -R mysql.mysql /opt/mysql-data/ib*
sudo chown -R mysql.mysql /opt/mysql-data/cdsweb/
sudo /etc/init.d/mysqld start
</pre>
<p>Fourthly, set up and start replication on slave node PCUDSSW1513:</p>
<pre class="literal-block">
mysql> SHOW SLAVE STATUS\G;
mysql> CHANGE MASTER TO MASTER_HOST='137.138.198.204',MASTER_USER='repl',MASTER_PASSWORD='pass123',MASTER_LOG_FILE='mysql-bin.000001',MASTER_LOG_POS=1061;
mysql> START SLAVE;
mysql> SHOW SLAVE STATUS\G;
</pre>
<p>Monitor progress by watching how <tt class="docutils literal">Seconds_Behind_Master</tt> decreases as
the replication catches up. We are done.</p>
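<p>For scripted monitoring, the vertical (<tt class="docutils literal">\G</tt>) output of
<tt class="docutils literal">SHOW SLAVE STATUS</tt> is easy to turn into a dictionary. The following
is a rough sketch only: the function name is hypothetical and the sample
text is made up from field values seen in these notes.</p>

```python
def parse_vertical_output(text):
    """Parse 'Key: value' lines of mysql vertical output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        # Row separator lines ('*** 1. row ***') contain no colon.
        if sep:
            fields[key.strip()] = value.strip()
    return fields


# Sample text assembled for illustration.
sample = """\
*************************** 1. row ***************************
       Master_Log_File: mysql-bin.000001
   Read_Master_Log_Pos: 1061
 Seconds_Behind_Master: 560616
"""

status = parse_vertical_output(sample)
print(status["Seconds_Behind_Master"])
```

<p>Polling this value periodically shows whether the slave is actually
catching up.</p>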
MySQL In-Memory Temporary File System2012-04-11T21:45:00+02:002012-04-11T21:45:00+02:00Tibor Šimkotag:tiborsimko.org,2012-04-11:/mysql-tmpfs.html<p>If a MySQL application uses many tables containing TEXT columns, then
for various JOIN and GROUP BY operations on these tables, MySQL cannot
use in-memory temporary tables and needs to resort to using on-disk
ones. The in-memory vs on-disk speed difference is considerable in
these cases. Introducing a special in-memory temporary file system
can improve performance by a factor of 10 or more.</p>
<!-- PELICAN_END_SUMMARY --><p>Here is how I introduced an in-memory temporary file system for the
INSPIRE application.</p>
<p>(1) Looking at slow query log (<tt class="docutils literal"><span class="pre">/opt/mysql-data/mysqld-slow.log</span></tt>), I
could see that there were many slow queries like:</p>
<pre class="literal-block">
SELECT bx.value FROM bib70x AS bx, bibrec_bib70x AS bibx
WHERE bibx.id_bibrec IN (689533,865819,844129,778444,768538,)
AND bx.id=bibx.id_bibxxx AND bx.tag LIKE '700__a'
ORDER BY bibx.field_number, bx.tag ASC;
</pre>
<p>coming from <tt class="docutils literal">bibindex</tt> processes.</p>
<p>(2) <tt class="docutils literal">mysqltuner</tt> showed that our DB is running with too many temporary
tables created on the disk:</p>
<pre class="literal-block">
[!!] Temporary tables created on disk: 27% (107M on disk / 387M total)
</pre>
<p>It is highly probable that these two issues are related, because
<tt class="docutils literal">bibxxx</tt> tables use TEXT column definitions, and MySQL cannot use
in-memory temporary tables for various join and group operations in
these cases; it falls back to using on-disk temporary tables.</p>
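<p>The 27% figure is simply the share of on-disk temporary tables among
all temporary tables created. A small sketch of the arithmetic follows;
that the two counts correspond to the MySQL status variables
<tt class="docutils literal">Created_tmp_disk_tables</tt> and <tt class="docutils literal">Created_tmp_tables</tt> is an
assumption here, and the exact formula used by <tt class="docutils literal">mysqltuner</tt> may
differ.</p>

```python
def disk_tmp_table_pct(on_disk, total):
    """Integer percentage of temporary tables that ended up on disk."""
    return 100 * on_disk // total


# Rounded counts from the mysqltuner line above (107M on disk / 387M total).
print(disk_tmp_table_pct(107_000_000, 387_000_000))  # 27
```

<p>Re-running the same calculation some time after a configuration change
gives a quick indication of whether the on-disk share is going down.</p>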
<p>(3) Observing I/O activity on the DB server shows some <tt class="docutils literal">/tmp</tt>
activity, albeit not very much:</p>
<pre class="literal-block">
PCUDSSW1502 ~> sudo iostat -n
Linux 2.6.18-194.17.1.el5 (pcudssw1502.cern.ch) 11/04/12
avg-cpu: %user %nice %system %iowait %steal %idle
2.47 0.00 1.39 4.53 0.00 91.61
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 98.64 671.79 531.22 31215469644 24683531996
sda1 0.01 0.04 0.03 1643438 1299532
sda2 0.00 0.00 0.00 7768 4272
sda3 0.49 1.48 7.39 68689098 343189936
sda4 0.00 0.00 0.00 62 0
sda5 0.07 0.01 0.74 454417 34359264
sda6 96.34 669.28 480.12 31098923522 22309295328
sda7 0.33 0.00 24.45 218871 1136130448
sda8 1.39 0.98 18.49 45515876 859253216
PCUDSSW1502 ~> mount | grep sda[678]
/dev/sda8 on /var type ext3 (rw)
/dev/sda7 on /tmp type ext3 (rw)
/dev/sda6 on /opt type ext3 (rw)
</pre>
<p>Mostly <tt class="docutils literal">/opt</tt> is busy, but <tt class="docutils literal">/tmp/</tt> may get busy as well at
times... especially at indexing times.</p>
<p>(4) The preceding items indicate that using a dedicated in-memory
temporary file system to host temporary DB tables that MySQL would
otherwise have to create on disk (due to TEXT columns) should help
considerably in alleviating disk I/O in these indexing-heavy cases.</p>
<p>So I've edited <tt class="docutils literal">/etc/fstab</tt> to introduce a new dedicated <tt class="docutils literal">tmpfs</tt>
partition named <tt class="docutils literal">/mysqltmp</tt> having size 2 GB and then mounted it:</p>
<pre class="literal-block">
PCUDSSW1502 ~> id mysql
uid=27(mysql) gid=27(mysql) groups=27(mysql) context=user_u:system_r:unconfined_t
PCUDSSW1502 ~> sudo vim /etc/fstab # edit as follows
PCUDSSW1502 ~> tail -1 /etc/fstab
tmpfs /mysqltmp tmpfs size=2G,nr_inodes=10k,mode=700,uid=27,gid=27 0 0
PCUDSSW1502 ~> sudo mkdir /mysqltmp
PCUDSSW1502 ~> sudo chown 27.27 /mysqltmp
PCUDSSW1502 ~> sudo ls -ld /mysqltmp
drwxr-xr-x 2 mysql mysql 4096 Apr 11 21:48 /mysqltmp
PCUDSSW1502 ~> sudo mount /mysqltmp
PCUDSSW1502 ~> df -h /mysqltmp
Filesystem Size Used Avail Use% Mounted on
tmpfs 2.0G 0 2.0G 0% /mysqltmp
</pre>
<p>Then configured MySQL to use it:</p>
<pre class="literal-block">
PCUDSSW1502 ~> sudo vim /etc/my.cnf # edit as follows
PCUDSSW1502 ~> sudo grep tmpdir /etc/my.cnf
tmpdir = /mysqltmp/
</pre>
<p>Then restarted MySQL when task queue permitted it:</p>
<pre class="literal-block">
PCUDSSW1502 ~> sudo /sbin/service mysqld restart
</pre>
<p>(5) According to <tt class="docutils literal">iostat</tt>, the usage of <tt class="docutils literal">/tmp</tt> went to zero even
during indexing, and there are no slow indexing queries in
<tt class="docutils literal"><span class="pre">/opt/mysql-data/mysqld-slow.log</span></tt> anymore.</p>
<p>Now MySQL is much faster thanks to using in-memory temporary tables
instead of on-disk ones. As a further benefit, the disks are freed to
serve other purposes rather than creating many small temporary
files.</p>
Paul Erdös and the Game Show Problem2012-04-06T01:06:00+02:002012-04-06T01:06:00+02:00Tibor Šimkotag:tiborsimko.org,2012-04-06:/erdos-game-show-mistake.html<p>Suppose you're on a game show, and you're given the choice of three
doors. Behind one door is a car, behind the others, goats. You pick
a door, say #1, and the host, who knows what's behind the doors, opens
another door, say #3, which has a goat. He says to you, "Would you
rather like to pick door #2?" Is it to your advantage to switch your
choice of doors?</p>
<!-- PELICAN_END_SUMMARY --><p>Please ponder this problem for a few minutes. It is the classic "game
show" problem whose solution may seem intuitively surprising.</p>
<p>Marilyn vos Savant, "the highest recorded IQ" for many years, wrote
about this problem in her "Ask Marilyn" newspaper column, stirring
much debate across the US. Many readers were indignant. Marilyn
received 10,000+ protesting letters, including 1,000+ from PhDs and
math professors! She had to "fight" repeatedly for the correct
answer.</p>
<p><center>* * *</center></p><p>Even one famous mathematician is said to have been genuinely puzzled
by the problem. Paul Erdös -- yes, the one of the "Erdös number"
fame! -- reportedly got the problem wrong initially. Leonard
Mlodinow, in his book "The Drunkard's Walk: How Randomness Rules Our
Lives", writes:</p>
<blockquote>
"When told of this, Paul Erdös, one of the leading mathematicians of
the twentieth century, said: "That's impossible." Then, when
presented with a formal mathematical proof of the correct answer, he
still didn't believe it and grew angry. Only after a colleague
arranged for a computer simulation in which Erdös watched hundreds
of trials came out [...] in favour of [...] did Erdös concede he was
wrong."</blockquote>
<p>(Some words elided in order not to give away the correct answer too
easily.)</p>
<p>If you got the problem wrong, then do not worry; if Paul Erdös <em>also</em>
got the problem wrong initially, then the world still has a chance.</p>
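<p>In the spirit of the computer simulation mentioned in the quote,
here is a small sketch that readers can run to see the outcome for
themselves. It is not the program Erdös was shown; the function names,
the number of trials and the seed are arbitrary choices made for this
illustration.</p>

```python
import random


def play(switch, rng):
    """Play one round; return True if the contestant wins the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The host, who knows where the car is, opens a goat door
    # that is neither the contestant's pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the only door that is neither picked nor opened.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch, trials=100_000, seed=0):
    """Fraction of rounds won when always switching or always staying."""
    rng = random.Random(seed)
    wins = sum(play(switch, rng) for _ in range(trials))
    return wins / trials


print("stay:  ", win_rate(switch=False))
print("switch:", win_rate(switch=True))
```

<p>Running it prints the observed winning frequency for each strategy.</p>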
<p><center>* * *</center></p><p>Leonard Mlodinow cites Bruce Schechter's book "My Brain Is Open: The
Mathematical Journeys of Paul Erdös" as his source. I have not read
that one, but went directly to the original first-hand source of the
story.</p>
<p>It was Andrew Vazsonyi who tried to convince Paul Erdös about the game
show problem solution. His personal account of the events is
available in short format <a class="reference external" href="http://www.emis.de/classics/Erdos/textpdf/vazsonyi/bayes.pdf">in Zentralblatt</a> and
in longer format in a <a class="reference external" href="http://www.decisionsciences.org/decisionline/vol30/30_1/vazs30_1.pdf">Decision Line paper</a>
and in an even longer format in Vazsonyi's book "Which Door has the
Cadillac: Adventures of a Real-Life Mathematician". The book contains
more side details and other stylistic changes, but the heart of the
story and the quotes about Erdös's reactions are the same as in the
above Decision Line paper.</p>
<p>Vazsonyi's account seems to suggest that Erdös was genuinely puzzled
about the problem.</p>
<p>(P.S. You may also be interested in Andrew Vazsonyi's <a class="reference external" href="http://www.emis.de/classics/Erdos/textpdf/vazsonyi/genius.pdf">obituary for Paul Erdös</a>
containing more (unrelated) stories about their encounters. The
stories also appear in Vazsonyi's above-cited book "Which
Door has the Cadillac: Adventures of a Real-Life Mathematician", which
makes for interesting reading in its own right.)</p>
Python Garbage Collection Issue2012-01-17T00:05:00+01:002012-01-17T00:05:00+01:00Tibor Šimkotag:tiborsimko.org,2012-01-17:/python-garbage-collection-issue.html<p>For performance reasons, it may be interesting to switch off Python
garbage collection in inner loops. Here is a story behind doing that
in Invenio software.</p>
<!-- PELICAN_END_SUMMARY -->
<p>On the ADS instance of Invenio, which contains about 10M records, it
was observed that <tt class="docutils literal">run_sql()</tt> sometimes ran much longer than at other
times, especially when memory was being consumed by other objects.
Here is an initial illustrative example posted by Benoit:</p>
<pre class="literal-block">
In [1]: from invenio.dbquery import run_sql
In [2]: %time res = run_sql("SELECT id_bibrec FROM bibrec_bib03x LIMIT 1000000")
CPU times: user 1.44 s, sys: 0.08 s, total: 1.52 s
Wall time: 1.92 s
In [3]: i = range(50000000)
In [4]: %time res = run_sql("SELECT id_bibrec FROM bibrec_bib03x LIMIT 1000000")
CPU times: user 11.36 s, sys: 0.07 s, total: 11.43 s
Wall time: 11.67 s
In [5]: j = range(50000000)
In [6]: %time res = run_sql("SELECT id_bibrec FROM bibrec_bib03x LIMIT 1000000")
CPU times: user 21.21 s, sys: 0.06 s, total: 21.27 s
Wall time: 21.54 s
</pre>
<p>The issue was discussed on <em>project-invenio-devel</em> mailing list in
December 2011 in the thread entitled "Slow MySQL queries with large
data structures in memory". It was eventually tracked down to a
<a class="reference external" href="http://bugs.python.org/issue4074">Python GC issue</a>. One graphical
illustration of this issue can be found at
<a class="reference external" href="http://is.runcode.us/q/is-there-a-way-to-circumvent-python-list-append-becoming-progressively-slower-in-a-loop-as-the-list-grows">http://is.runcode.us/q/is-there-a-way-to-circumvent-python-list-append-becoming-progressively-slower-in-a-loop-as-the-list-grows</a>.</p>
<p>Here are my findings comparing behaviour of Python-2.6 and Python-2.7
in this respect.</p>
<p>The test code:</p>
<div class="highlight"><pre><span></span><span class="ch">#!python</span>
<span class="kn">import</span> <span class="nn">sys</span>
<span class="kn">import</span> <span class="nn">time</span>
<span class="kn">import</span> <span class="nn">gc</span>
<span class="k">class</span> <span class="nc">A</span><span class="p">:</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="bp">self</span><span class="o">.</span><span class="n">x</span> <span class="o">=</span> <span class="mi">1</span>
<span class="bp">self</span><span class="o">.</span><span class="n">y</span> <span class="o">=</span> <span class="mi">2</span>
<span class="bp">self</span><span class="o">.</span><span class="n">why</span> <span class="o">=</span> <span class="s1">'no reason'</span>
<span class="k">def</span> <span class="nf">time_to_append</span><span class="p">(</span><span class="n">size</span><span class="p">,</span> <span class="n">append_list</span><span class="p">,</span> <span class="n">item_gen</span><span class="p">):</span>
<span class="n">t0</span> <span class="o">=</span> <span class="n">time</span><span class="o">.</span><span class="n">time</span><span class="p">()</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">xrange</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">size</span><span class="p">):</span>
<span class="n">append_list</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">item_gen</span><span class="p">())</span>
<span class="k">return</span> <span class="n">time</span><span class="o">.</span><span class="n">time</span><span class="p">()</span> <span class="o">-</span> <span class="n">t0</span>
<span class="k">def</span> <span class="nf">test</span><span class="p">():</span>
<span class="n">x</span> <span class="o">=</span> <span class="p">[]</span>
<span class="n">count</span> <span class="o">=</span> <span class="mi">5000</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">xrange</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span><span class="mi">1000</span><span class="p">):</span>
<span class="k">print</span> <span class="nb">len</span><span class="p">(</span><span class="n">x</span><span class="p">),</span> <span class="n">time_to_append</span><span class="p">(</span><span class="n">count</span><span class="p">,</span> <span class="n">x</span><span class="p">,</span> <span class="k">lambda</span><span class="p">:</span> <span class="n">A</span><span class="p">())</span>
<span class="k">def</span> <span class="nf">test_nogc</span><span class="p">():</span>
<span class="n">x</span> <span class="o">=</span> <span class="p">[]</span>
<span class="n">count</span> <span class="o">=</span> <span class="mi">5000</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">xrange</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span><span class="mi">1000</span><span class="p">):</span>
<span class="n">gc</span><span class="o">.</span><span class="n">disable</span><span class="p">()</span>
<span class="k">print</span> <span class="nb">len</span><span class="p">(</span><span class="n">x</span><span class="p">),</span> <span class="n">time_to_append</span><span class="p">(</span><span class="n">count</span><span class="p">,</span> <span class="n">x</span><span class="p">,</span> <span class="k">lambda</span><span class="p">:</span> <span class="n">A</span><span class="p">())</span>
<span class="n">gc</span><span class="o">.</span><span class="n">enable</span><span class="p">()</span>
<span class="k">if</span> <span class="s1">'--nogc'</span> <span class="ow">in</span> <span class="n">sys</span><span class="o">.</span><span class="n">argv</span><span class="p">:</span>
<span class="n">test_nogc</span><span class="p">()</span>
<span class="k">else</span><span class="p">:</span>
<span class="n">test</span><span class="p">()</span>
</pre></div>
<p>The test results measured on an HP EliteBook 8440p laptop, with
Python-2.6 and Python-2.7:</p>
<img alt="Python-2.6 GC issue" src="http://tiborsimko.org/images/python-gc-issue-2.6.png" />
<img alt="Python-2.7 GC issue" src="http://tiborsimko.org/images/python-gc-issue-2.7.png" />
<p>One can see that the cyclic garbage collector jumps in more reasonably
with Python-2.7 than with Python-2.6, and that when it jumps in, it
terminates more rapidly as well. However, one can also see that it is
still clearly preferable to switch GC off in the inner loop, during
the construction of tuples-of-tuples or lists-of-stuff objects. Hence
the patch for <tt class="docutils literal">run_sql()</tt> committed in
<a class="reference external" href="https://github.com/inveniosoftware/invenio/commit/7a2b03d232a04c054751bb003f4facbc8e214992">https://github.com/inveniosoftware/invenio/commit/7a2b03d232a04c054751bb003f4facbc8e214992</a>,
which applies regardless of the Python-2.x version.</p>
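<p>The general idea of pausing the cyclic collector around an
allocation-heavy loop can be sketched with a small context manager.
This is an illustration written for present-day Python 3, not the code
of the Invenio patch; <tt class="docutils literal">gc_paused</tt> and <tt class="docutils literal">build_rows</tt> are
hypothetical names.</p>

```python
import gc
from contextlib import contextmanager


@contextmanager
def gc_paused():
    """Disable the cyclic garbage collector for the duration of a block."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        # Restore the previous state, even if the block raised.
        if was_enabled:
            gc.enable()


def build_rows(n):
    """Build a large list of tuples without GC passes in the inner loop."""
    with gc_paused():
        return [(i, str(i)) for i in range(n)]


rows = build_rows(1000)
print(len(rows), gc.isenabled())
```

<p>Restoring the collector in a <tt class="docutils literal">finally</tt> clause keeps it from staying
disabled when the block raises, and remembering the previous state avoids
enabling it for callers who had switched it off on purpose.</p>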
USB 3G Modem Tips2011-12-22T16:40:00+01:002011-12-22T16:40:00+01:00Tibor Šimkotag:tiborsimko.org,2011-12-22:/usb-3g-modem.html<p>Recently I got to use a T-Mobile SK USB 3G Mobile Broadband Modem
device (ZTE MF190). Here are notes on how to make it work in
GNU/Linux.</p>
<!-- PELICAN_END_SUMMARY --><div class="section" id="prerequisites">
<h2>Prerequisites</h2>
<pre class="literal-block">
$ sudo aptitude install usbutils usb-modeswitch
</pre>
</div>
<div class="section" id="switching-the-device-from-usb-storage-mode-to-usb-modem-mode">
<h2>Switching the device from USB storage mode to USB modem mode</h2>
<p>When the given T-Mobile USB 3G dongle (ZTE MF190) was plugged in, it was
mounted as a USB storage device with Debian GNU/Linux "Wheezy"
defaults. It was not recognised as a USB modem by default. So we
need to switch it.</p>
<p>In <tt class="docutils literal">dmesg</tt> system logs, spotted <tt class="docutils literal">idVendor</tt> and <tt class="docutils literal">idProduct</tt> parameters
when mounted as a storage device:</p>
<pre class="literal-block">
[ 112.955608] usb 2-1.1: new high speed USB device number 3 using ehci_hcd
[ 113.049919] usb 2-1.1: New USB device found, idVendor=19d2, idProduct=2000
[ 113.049925] usb 2-1.1: New USB device strings: Mfr=3, Product=2, SerialNumber=4
[ 113.049930] usb 2-1.1: Product: ZTE WCDMA Technologies MSM
[ 113.049934] usb 2-1.1: Manufacturer: ZTE,Incorporated
</pre>
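<p>The two IDs can also be extracted from such a log line
programmatically, for instance to build the <tt class="docutils literal">vendor:product</tt> string
used by <tt class="docutils literal">lsusb</tt>. A minimal sketch; the regular expression is an
assumption based on the log format shown above.</p>

```python
import re

# One of the dmesg lines shown above.
line = "[  113.049919] usb 2-1.1: New USB device found, idVendor=19d2, idProduct=2000"

match = re.search(r"idVendor=([0-9a-f]{4}), idProduct=([0-9a-f]{4})", line)
vendor, product = match.groups()

print(f"{vendor}:{product}")  # 19d2:2000
```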
<p>Verified via <tt class="docutils literal">lsusb</tt>:</p>
<pre class="literal-block">
$ sudo lsusb -vvv
Bus 002 Device 003: ID 19d2:2000 ONDA Communication S.p.A. ZTE MF627/MF628/MF628+/MF636+ HSDPA/HSUPA
Device Descriptor:
bLength 18
bDescriptorType 1
bcdUSB 2.00
bDeviceClass 0 (Defined at Interface level)
bDeviceSubClass 0
bDeviceProtocol 0
bMaxPacketSize0 64
idVendor 0x19d2 ONDA Communication S.p.A.
idProduct 0x2000 ZTE MF627/MF628/MF628+/MF636+ HSDPA/HSUPA
bcdDevice 0.00
iManufacturer 3 ZTE,Incorporated
iProduct 2 ZTE WCDMA Technologies MSM
iSerial 4 MF1900TMOD010000
bNumConfigurations 1
</pre>
<p>This differs from default <tt class="docutils literal"><span class="pre">usb-modeswitch</span></tt> configurations for ZTE
MF190 devices that have the following product IDs:</p>
<pre class="literal-block">
$ grep -C2 MF190 /lib/udev/rules.d/40-usb_modeswitch.rules
# ZTE MF190 (Variant)
ATTRS{idVendor}=="19d2", ATTRS{idProduct}=="0149", RUN+="usb_modeswitch '%b/%k'"
# ZTE MF190
ATTRS{idVendor}=="19d2", ATTRS{idProduct}=="1224", RUN+="usb_modeswitch '%b/%k'"
</pre>
<p>So I cloned one:</p>
<pre class="literal-block">
$ cd /etc/usb_modeswitch.d
$ tar xzf /usr/share/usb_modeswitch/configPack.tar.gz 19d2:0149
$ cp 19d2\:0149 19d2\:2000
$ vim 19d2\:2000
$ diff 19d2\:0149 19d2\:2000
2c2
< # ZTE MF190 (Variant)
---
> # ZTE MF190 (T-Mobile SK, Tibor/2011-12-22)
5c5
< DefaultProduct=0x0149
---
> DefaultProduct=0x2000
</pre>
<p>Now, if the device is plugged in and gets mounted as a USB storage
device, eject the mounted device (e.g. in <tt class="docutils literal">nautilus</tt>) and wait
about 30 seconds: <tt class="docutils literal">usb_modeswitch</tt> will then kick in and switch the
device from the USB storage mode to the USB modem mode, and we are
good to go.</p>
</div>
<div class="section" id="connecting-to-3g-network">
<h2>Connecting to 3G network</h2>
<p>Now one can enable Mobile Broadband connection in <tt class="docutils literal"><span class="pre">nm-applet</span></tt> and
configure the connection there.</p>
</div>
Gnome3 First Look2011-11-24T00:00:00+01:002011-11-24T00:00:00+01:00Tibor Šimkotag:tiborsimko.org,2011-11-24:/gnome3-first-look.html<p>Now that Gnome 3.0.2 packages hit Debian Wheezy repository, I thought
of giving vanilla Gnome a try for a week or two. After many years of
using small, tiling, keyboard-friendly window managers
<a class="reference external" href="http://en.wikipedia.org/wiki/Ion_(window_manager)">ion</a>,
<a class="reference external" href="http://dwm.suckless.org/">dwm</a>,
<a class="reference external" href="http://awesome.naquadah.org/">awesome</a> and
<a class="reference external" href="http://xmonad.org/">xmonad</a>,
I was curious to see how a typical mainstream Desktop Environment user
experience evolved over time from a keyboard-oriented user point of
view.</p>
<!-- PELICAN_END_SUMMARY --><div class="section" id="install-gnome-shell-and-gnome-tweak-tool">
<h2>Install gnome-shell and gnome-tweak-tool</h2>
<p>Installed gnome-shell and gnome-tweak-tool:</p>
<pre class="literal-block">
$ sudo aptitude install gnome-shell gnome-tweak-tool
</pre>
<p>and configured keyboard shortcuts to my liking (e.g. <tt class="docutils literal"><span class="pre">s-2</span></tt> to switch
to desktop 2) and to be more Dvorak-friendly (e.g. <tt class="docutils literal"><span class="pre">s-c</span></tt> to switch
to the last focused window).</p>
<p>If running <tt class="docutils literal"><span class="pre">gnome-tweak-tool</span></tt> gives a message about
<tt class="docutils literal"><span class="pre">pk-gtk-module</span></tt> not being available, then install it:</p>
<pre class="literal-block">
$ sudo apt-file search pk-gtk-module
gnome-packagekit: /usr/lib/gnome-settings-daemon-3.0/gtk-modules/gpk-pk-gtk-module.desktop
packagekit-dbg: /usr/lib/debug/usr/lib/gtk-3.0/modules/libpk-gtk-module.so
packagekit-gtk3-module: /usr/lib/gtk-3.0/modules/libpk-gtk-module.so
$ sudo aptitude install gnome-packagekit packagekit-dbg packagekit-gtk3-module
</pre>
</div>
<div class="section" id="switch-off-blinking-cursor-in-gnome-terminal">
<h2>Switch off blinking cursor in gnome-terminal</h2>
<p>In gnome-terminal, the default blinking cursor is annoying. There
does not seem to be a UI option to disable the blinking cursor. One has
to use <tt class="docutils literal">gsettings</tt>:</p>
<pre class="literal-block">
$ gsettings set org.gnome.desktop.interface cursor-blink false
</pre>
<p>Otherwise the terminal scrolling speed seems to be on a par with
<tt class="docutils literal">urxvt</tt>, and looks are good with <em>Unifont Medium 12</em> and
green-on-black colour scheme.</p>
</div>
<div class="section" id="reduce-window-title-bar-height">
<h2>Reduce window title bar height</h2>
<p>Window title bars are too large and space consuming, especially since
I'm used to using window managers where they are typically absent.
So I did:</p>
<pre class="literal-block">
$ sudo sed -i '/title_vertical_pad/s|value="[0-9]\{1,2\}"|value="0"|g' \
/usr/share/themes/Adwaita/metacity-1/metacity-theme-3.xml
</pre>
<p>followed by restarting the gnome-shell via <tt class="docutils literal"><span class="pre">M-F2</span> r RET</tt>. This changes
vertical padding from 14 to 0, which gives windows a sleeker look. To
restore the original values, re-install package
<tt class="docutils literal"><span class="pre">gnome-themes-standard</span></tt>.</p>
<p>Tip seen at the fine <a class="reference external" href="https://wiki.archlinux.org/index.php/GNOME">ArchWiki Gnome</a> page.</p>
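<p>Before editing the theme file in place, the substitution can be tried on a
sample line. The sketch below assumes that the theme contains a line shaped
like the one stored in <tt class="docutils literal">sample</tt> (this shape is an assumption made for
illustration, not copied from the actual file) and runs the same expression
without <tt class="docutils literal"><span class="pre">-i</span></tt>, so that nothing on disk is modified:</p>

```shell
# Hypothetical line, assumed to resemble the entry in metacity-theme-3.xml.
sample='<distance name="title_vertical_pad" value="14"/>'

# Same expression as above, minus -i: read stdin, write the result to stdout.
result=$(printf '%s\n' "$sample" |
  sed '/title_vertical_pad/s|value="[0-9]\{1,2\}"|value="0"|g')

printf '%s\n' "$result"
```

<p>If the printed line shows <tt class="docutils literal"><span class="pre">value="0"</span></tt>, the expression matches
lines of that shape and only the padding value is touched.</p>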
</div>
<div class="section" id="hide-window-title-bar-when-maximised">
<h2>Hide window title bar when maximised</h2>
<p>Window title bars are better removed completely, especially in
maximised mode. I did:</p>
<pre class="literal-block">
$ sudo sed -i -r 's|(<frame_geometry name="max")|\1 has_title="false"|' \
/usr/share/themes/Adwaita/metacity-1/metacity-theme-3.xml
</pre>
<p>followed by restarting the gnome-shell via <tt class="docutils literal"><span class="pre">M-F2</span> r RET</tt>. After this
tweak, you may find it difficult to un-maximize a window when there is
no title bar to grab. With suitable keybindings, you should be able
to use <tt class="docutils literal"><span class="pre">M-F5</span></tt>, <tt class="docutils literal"><span class="pre">M-F10</span></tt> or <tt class="docutils literal"><span class="pre">M-SPC</span></tt> to remedy the situation.</p>
<p>Tip seen at the fine <a class="reference external" href="https://wiki.archlinux.org/index.php/GNOME">ArchWiki Gnome</a> page.</p>
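<p>One thing to watch with this tweak is that running it a second time would
add the attribute again. A possible guard, sketched here on a made-up element
(the attributes in <tt class="docutils literal">sample</tt> are assumptions, not the real theme
contents), is to skip lines that already contain <tt class="docutils literal">has_title</tt>:</p>

```shell
# Hypothetical element, assumed to resemble the one in metacity-theme-3.xml.
sample='<frame_geometry name="max" parent="normal">'

# -E enables extended regexps (GNU sed's -r is a synonym); the /has_title/!
# address skips lines that already carry the attribute.
add_attr() {
  sed -E '/has_title/!s|(<frame_geometry name="max")|\1 has_title="false"|'
}

once=$(printf '%s\n' "$sample" | add_attr)
twice=$(printf '%s\n' "$once" | add_attr)

printf '%s\n' "$once" "$twice"
```

<p>Both printed lines should be identical, i.e. the second pass leaves the
element alone.</p>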
</div>
<div class="section" id="keyboard-friendly-window-navigation">
<h2>Keyboard-friendly window navigation</h2>
<p>It is rather essential to install <tt class="docutils literal"><span class="pre">gnome-shell-extensions</span></tt>.
The <em>windows navigator</em> extension in particular is essential for
keyboard-friendly navigation in the Gnome Overview "Exposé" mode.
However, <tt class="docutils literal"><span class="pre">gnome-shell-extensions</span></tt> did not seem to have hit the
Debian Wheezy software repository yet. So I built Debian packages in
this way:</p>
<pre class="literal-block">
mkdir -p /home/simko-local/apps/gnome-shell-extensions
cd /home/simko-local/apps/gnome-shell-extensions
git clone https://github.com/bilalakhtar/gnome-shell-extensions-debian.git
cd gnome-shell-extensions-debian
dpkg-buildpackage -rfakeroot -uc -b
sudo dpkg -i ../gnome-shell-extensions-common_3.0.2-1_all.deb \
../gnome-shell-extensions-windows-navigator_3.0.2-1_all.deb
</pre>
<p>followed by restarting gnome-shell by <tt class="docutils literal"><span class="pre">M-F2</span> r RET</tt> and checking
extensions in <tt class="docutils literal"><span class="pre">gnome-tweak-tool</span></tt> and Gnome Looking Glass (<tt class="docutils literal"><span class="pre">M-F2</span> lg
RET</tt>).</p>
<p>After installing and enabling the <em>windows navigator</em> extension one
can switch windows via <tt class="docutils literal">s <span class="pre">M-3</span></tt> like keyboard combinations in the Gnome
Overview "Exposé" mode.</p>
</div>
<div class="section" id="start-some-applications-in-some-desktops">
<h2>Start some applications in some desktops</h2>
<p>Another useful productivity extension is <em>auto-move-windows</em>, which
allows starting certain applications on certain desktops. It got
compiled in the previous step. Install via:</p>
<pre class="literal-block">
$ sudo dpkg -i ../gnome-shell-extensions-auto-move-windows_3.0.2-1_all.deb
</pre>
<p>Here is an example of a configuration that will start Chromium on
desktop three and Skype on desktop four:</p>
<pre class="literal-block">
$ gsettings set org.gnome.shell.extensions.auto-move-windows \
application-list "['chromium.desktop:3','skype.desktop:4']"
</pre>
<p>BTW, in order to start some applications automatically at Gnome
start-up time, when not using <tt class="docutils literal"><span class="pre">~/.xinitrc</span></tt>, one can configure them by
running <tt class="docutils literal"><span class="pre">gnome-session-properties</span></tt>, or else by editing files such as
<tt class="docutils literal"><span class="pre">~/.config/autostart/foo.desktop</span></tt>.</p>
</div>
<div class="section" id="tiling">
<h2>Tiling</h2>
<p>gnome-shell comes with basic tiling functionality, covering a 2x1 setup.
This is especially suitable for the laptop-only use case, where I usually
either maximise windows, or else use them tiled precisely in a 2x1
manner. However, when using bigger screens, the 2x1 tiling mode is
not sufficient. I would have preferred to have at least one more
grid-like 2x2 tiling mode option.</p>
<p>The tiling is achieved by mouse-dragging windows to the left or right.
There does not seem to be a possibility to define keyboard shortcuts
to tile windows.</p>
<p>However, there is the <tt class="docutils literal">shellshape</tt> Gnome extension that provides some
xmonad-like tiling capabilities. It is less smooth, though, and rough around the edges.</p>
</div>
<div class="section" id="drop-down-scratchpad-terminal">
<h2>Drop-down scratchpad terminal</h2>
<p>In xmonad I'm using "named scratchpads", one of the primary use cases
of which has been to press a key and bring up a terminal with a
permanent <tt class="docutils literal">screen</tt> or <tt class="docutils literal">tmux</tt> session running in it. The same
functionality can be achieved in Gnome 3 by installing <tt class="docutils literal">guake</tt> which
offers a drop-down terminal after pressing <tt class="docutils literal">F12</tt>.</p>
</div>
<div class="section" id="summary">
<h2>Summary</h2>
<p>Impressions after using this setup for a week? Pleasant UI with good
looks, modern feel, nice overview exposé-like mode. With the
above-mentioned tweaks, also relatively space efficient and keyboard
friendly. I was pleased to see that basic tiling capabilities started
to hit the main desktop environments in a native manner. (And I hope
that native tiling gets extended much more in next gnome-shell
versions.) However, the productivity and ergonomics are still far from
those of a native keyboard-friendly tiling window manager for me. This
is especially noticeable on setups with bigger screens. After I went
back to <a class="reference external" href="http://xmonad.org/">xmonad</a>, I felt immediately much more
productive and efficient. Home, sweet home.</p>
</div>
Tee2011-06-06T01:30:00+02:002011-06-06T01:30:00+02:00Tibor Šimkotag:tiborsimko.org,2011-06-06:/tee.html<p>Have you ever tried to use <strong>echo</strong> with <strong>sudo</strong> to write some text
to files owned by another user? Did not work? <strong>Tee</strong>, a handy tool
that reads from standard input and writes to standard output and
files, is here to help.</p>
<!-- PELICAN_END_SUMMARY --><p>Imagine the following directory permission situation:</p>
<pre class="literal-block">
$ ls -ld /var/www
drwxr-xr-x 3 www-data www-data 4096 Dec 15 14:09 /var/www
</pre>
<p>This does not work:</p>
<pre class="literal-block">
$ echo Hello > /var/www/index.html
bash: /var/www/index.html: Permission denied
</pre>
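<p>This does not work either, because the redirection to the file is still
performed by the calling shell, not by the command run under <strong>sudo</strong>:</p>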
<pre class="literal-block">
$ sudo -u www-data echo Hello > /var/www/index.html
bash: /var/www/index.html: Permission denied
</pre>
<p>The solution is to use <strong>tee</strong>:</p>
<pre class="literal-block">
$ echo Hello | sudo -u www-data tee -a /var/www/index.html
Hello
</pre>
<p>Another, possibly more common, example involving the superuser:</p>
<pre class="literal-block">
$ cat /sys/module/video/parameters/brightness_switch_enabled
Y
$ echo N | sudo tee /sys/module/video/parameters/brightness_switch_enabled
N
$ cat /sys/module/video/parameters/brightness_switch_enabled
N
</pre>
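<p>The behaviour of <strong>tee</strong> itself can be tried without any privileges. The
sketch below is only an illustration: it writes to a scratch directory created
with <tt class="docutils literal">mktemp</tt> (an assumption made for the sake of the example, not the
<tt class="docutils literal">/var/www</tt> setup above) and shows that <tt class="docutils literal">tee</tt> copies its input both to
standard output and to the file, and that <tt class="docutils literal"><span class="pre">-a</span></tt> appends rather than
overwrites:</p>

```shell
# Sketch only: a scratch directory stands in for /var/www, and no sudo is
# used, so this runs unprivileged.
workdir=$(mktemp -d)
target="$workdir/index.html"

# tee copies stdin to stdout *and* to the named file (truncating it first).
echo Hello | tee "$target"

# With -a, tee appends to the file instead of truncating it.
echo World | tee -a "$target" > /dev/null

cat "$target"
```

<p>With <strong>sudo</strong> placed in front of <tt class="docutils literal">tee</tt>, it is <tt class="docutils literal">tee</tt> rather than the
calling shell that opens the target file, which is why the permissions of the
target user apply.</p>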
Git Subversion Mirroring2011-02-24T07:23:00+01:002011-02-24T07:23:00+01:00Tibor Šimkotag:tiborsimko.org,2011-02-24:/git-subversion-mirroring.html<p>Consider a project using git source code management system. Now
imagine a need arises to automatically mirror git commits to a
(read-only) subversion repository. How can one achieve such an
automated push?</p>
<!-- PELICAN_END_SUMMARY --><p>One technique is to use the <strong>git-svn</strong> extension and prepare git
<strong>grafts</strong> to link the two repositories together. Here is a
functional example demonstrating the concept:</p>
<div class="highlight"><pre><span></span><span class="c1">## prepare test space:</span>
rm -rf /tmp/test-git-to-svn
mkdir -p /tmp/test-git-to-svn
<span class="nb">cd</span> /tmp/test-git-to-svn
<span class="c1">## create test git project:</span>
mkdir test-project-git
<span class="nb">cd</span> test-project-git
git init
<span class="nb">echo</span> a > README
git add README
git commit -a -m a --author<span class="o">=</span><span class="s1">'Erika Mustermann <erika.mustermann@example.org>'</span>
<span class="nb">echo</span> b > README
git commit -a -m b --author<span class="o">=</span><span class="s1">'John Doe <john.doe@example.org>'</span>
git log
<span class="nb">cd</span> ..
<span class="c1">## create test SVN repo and its structure:</span>
svnadmin create test-svn-repo
svn co file:///tmp/test-git-to-svn/test-svn-repo work-svn-repo
<span class="nb">cd</span> work-svn-repo
svn mkdir trunk tags branches
svn commit -m <span class="s2">"initial repo structure"</span>
<span class="nb">cd</span> ..
<span class="c1">## make test SVN checkout:</span>
svn co file:///tmp/test-git-to-svn/test-svn-repo/trunk test-1
<span class="nb">cd</span> test-1
svn log
<span class="nb">cd</span> ..
<span class="c1">## link SVN repo to Git repo:</span>
<span class="nb">cd</span> test-project-git
git svn init -s file:///tmp/test-git-to-svn/test-svn-repo
git svn fetch
git branch -a
<span class="c1">## prepare git graft:</span>
<span class="nv">A</span><span class="o">=</span><span class="sb">`</span>git show-ref trunk <span class="p">|</span> awk <span class="s1">'{print $1;}'</span><span class="sb">`</span>
<span class="nv">B</span><span class="o">=</span><span class="sb">`</span>git log --pretty<span class="o">=</span>oneline master <span class="p">|</span> tail -n1 <span class="p">|</span> awk <span class="s1">'{print $1;}'</span><span class="sb">`</span>
<span class="nb">echo</span> <span class="nv">$A</span>
<span class="nb">echo</span> <span class="nv">$B</span>
<span class="nb">echo</span> <span class="s2">"</span><span class="nv">$B</span><span class="s2"> </span><span class="nv">$A</span><span class="s2">"</span> >> .git/info/grafts
<span class="c1">## now try first commit:</span>
git svn dcommit
<span class="c1">## test new checkout from SVN:</span>
<span class="nb">cd</span> ..
svn co file:///tmp/test-git-to-svn/test-svn-repo/trunk test-2
<span class="nb">cd</span> test-2
svn log
<span class="nb">cd</span> ..
<span class="c1">## now emulate more work in git:</span>
<span class="nb">cd</span> test-project-git
<span class="nb">echo</span> c > README
git commit -a -m c
git svn dcommit
<span class="c1">## now try SVN update:</span>
<span class="nb">cd</span> ..
<span class="nb">cd</span> test-1
svn up
svn log
</pre></div>
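<p>The graft line written above has the form "commit parent": <tt class="docutils literal">B</tt> is the
oldest commit on the git side and <tt class="docutils literal">A</tt> is the tip of the SVN trunk that
becomes its parent. The following sketch, run in a throw-away repository (the
commit identity and file contents are made up for illustration), shows what the
<tt class="docutils literal">B</tt> pipeline selects, and compares it with
<tt class="docutils literal">git <span class="pre">rev-list</span> <span class="pre">--max-parents=0</span></tt>, which asks for parentless commits
directly:</p>

```shell
# Scratch repository with two commits; the identity is passed inline so no
# global git configuration is needed.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
for msg in a b; do
  echo "$msg" > README
  git add README
  git -c user.name='Test' -c user.email='test@example.org' commit -q -m "$msg"
done

# The pipeline from the script above: last line of the log = oldest commit.
B=$(git log --pretty=oneline HEAD | tail -n1 | awk '{print $1;}')

# A more direct way to ask for commits that have no parents.
root=$(git rev-list --max-parents=0 HEAD)

echo "$B"
echo "$root"
```

<p>In a linear history like this one, both commands should print the same
commit id.</p>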
Sed2010-07-11T10:00:00+02:002010-07-11T10:00:00+02:00Tibor Šimkotag:tiborsimko.org,2010-07-11:/sed.html<p><strong>Sed</strong>, the stream editor, is a useful tool in one's command-line-fu.
Here are some nifty one-liners.</p>
<!-- PELICAN_END_SUMMARY --><p>Replace stuff in a file:</p>
<pre class="literal-block">
$ sed -i 's/oldstring/newstring/g' filename
</pre>
<p>Replace stuff in many files:</p>
<pre class="literal-block">
$ find modules -name "*.py" -exec sed -i 's,CFG_SOME_VARIABLE,CFG_OTHER_VARIABLE,g' {} \;
</pre>
<p>Print line number 123 and quit:</p>
<pre class="literal-block">
$ sed -n '123{p;q}' filename
</pre>
<p>Comment out line number 123:</p>
<pre class="literal-block">
$ sed -i '123s/\(.*\)/#\1/' filename
</pre>
<p>Delete line number 123:</p>
<pre class="literal-block">
$ sed -i 123d ~/.ssh/known_hosts
</pre>
<p>Print from line 1 until regexp:</p>
<pre class="literal-block">
$ sed -n '1,/regex/p' filename
</pre>
<p>Print from regexp until end of file:</p>
<pre class="literal-block">
$ sed -n '/regex/,$p' filename
</pre>
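<p>Since <tt class="docutils literal"><span class="pre">-i</span></tt> rewrites files in place, it can be worth trying an
expression on a scratch file first. The sketch below does that for some of the
one-liners above; the four-line sample file and the line number 3 are made up
for illustration, and <tt class="docutils literal"><span class="pre">-i</span></tt> is left out so that results go to standard
output:</p>

```shell
# Scratch file with four known lines.
sample=$(mktemp)
printf 'alpha\nbeta\ngamma\ndelta\n' > "$sample"

# Print line 3 and quit (the ';' before '}' keeps this portable to BSD sed).
third=$(sed -n '3{p;q;}' "$sample")

# Comment out line 3.
commented=$(sed '3s/\(.*\)/#\1/' "$sample")

# Print from line 1 until the first line matching the regexp.
until_match=$(sed -n '1,/gamma/p' "$sample")

# Print from the first matching line until end of file.
from_match=$(sed -n '/gamma/,$p' "$sample")

printf '%s\n' "$third" "$commented" "$until_match" "$from_match"
```

<p>Once the output looks right, the same expression can be re-run with
<tt class="docutils literal"><span class="pre">-i</span></tt> on the real file.</p>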
Rambam's Daily Schedule2010-04-07T18:00:00+02:002010-04-07T18:00:00+02:00Tibor Šimkotag:tiborsimko.org,2010-04-07:/rambam-daily-schedule.html<p>Life's busy. We may feel being overwhelmed by our personal schedules,
at times. We may feel being too tired to work it all out, at times.
When this happens, why don't we look outside and ponder what a really
busy person's schedule looks like? Shall we still feel too busy and
too tired to do our own part?</p>
<!-- PELICAN_END_SUMMARY --><p>Rabbi Mosheh ben Maimon, aka Rambam, was born in 1132 in Cordoba and
died in 1204 in Cairo. Rambam worked as the court physician for the
Sultan Saladin and the royal family in Cairo. Rambam authored many
works in medicine, philosophy, and the Jewish law. He is best known
for the monumental <em>Mishneh Torah</em>, the fourteen-volume codification
of Jewish law and ethics, and the philosophical treatise <em>Guide for
the Perplexed</em> on the Aristotelian philosophy from the Jewish
perspective. The greatness of Rambam's stature is apparent from the
epitaph "From Mosheh to Mosheh there has been none like Mosheh".</p>
<p>How was Rambam's day, then? Glimpses of Rambam's daily schedule were
revealed in his letter to Samuel ibn Tibbon, the translator of the
"Guide for the Perplexed". Rambam writes (in 1199):</p>
<blockquote>
<p>Now G-d knows that in order to write this to you I have escaped to a
secluded spot, where people would not think to find me, sometimes
leaning for support against the wall, sometimes lying down on account
of my excessive weakness, for I have grown old and feeble.</p>
<p>With regard to your wish to come here to me, I cannot but say how
greatly your visit would delight me, for I truly long to commune with
you, and would anticipate our meeting with even greater joy than you.
Yet I must advise you not to expose yourself to the perils of the
voyage, for beyond seeing me, and my doing all I could to honor you,
you would not derive any advantage from your visit. Do not expect to
be able to confer with me on any scientific subject, for even one hour
either by day or by night, for the following is my daily occupation.
I dwell at Misr [Fostat] and the Sultan resides at Kahira [Cairo];
these two places are two Shabbath days' journey [about one mile and a
half] distant from each other. My duties to the Sultan are very
heavy. I am obliged to visit him every day, early in the morning; and
when he or any of his children, or any of the inmates of his harem,
are indisposed, I dare not quit Kahira, but must stay during the
greater part of the day in the palace. It also frequently happens
that one or two of the royal officers fall sick, and I must attend to
their healing. Hence, as a rule, I repair to Kahira very early in the
day, and even if nothing unusual happens I do not return to Misr until
the afternoon. Then I am almost dying with hunger. I find the
antechamber filled with people, both Jews and Gentiles, nobles and
common people, judges and bailiffs, friends and foes --- a mixed
multitude, who await the time of my return.</p>
<p>I dismount from my animal, wash my hands, go forth to my patients, and
entreat them to bear with me while I partake of some slight
refreshment, the only meal I take in the twenty-four hours. Then I
attend to my patients, write prescriptions for their various ailments.
Patients go in and out until nightfall, and sometimes even, I solemnly
assure you, until two hours and more in the night. I converse and
prescribe for them while lying down from sheer fatigue, and when night
falls, I am so exhausted that I can scarcely speak.</p>
<p>In consequence of this, no Israelite can have any private interview
with me, except on the Shabbath. On that day the whole congregation,
or at least the majority of the members, come unto me after the
morning service, when I instruct them as to their proceedings during
the whole week; we study together a little until noon, when they
depart. Some of them return, and read with me after the afternoon
service until evening prayers. In this manner I spend that day. I
have here related to you only a part of what you would see if you were
to visit me. Now, when you have completed for our brethren the
translation you have commenced, I beg that you will come to me but not
with the hope of deriving any advantage from your visit as regards
your studies; for my time is, as I have shown you, excessively
occupied.</p>
</blockquote>
<p>(Quoted from "A Maimonides Reader", by Isadore Twersky, pp. 6-8.)</p>
Emacs Jabber2009-09-23T12:00:00+02:002009-09-23T12:00:00+02:00Tibor Šimkotag:tiborsimko.org,2009-09-23:/emacs-jabber.html<p>The <a class="reference external" href="http://emacs-jabber.sourceforge.net/">jabber.el</a> package
provides an Emacs interface to the Jabber/XMPP chatting ecosystem. I
use it regularly with Google Talk and other Jabber services.</p>
<!-- PELICAN_END_SUMMARY --><p>You can use commands like <tt class="docutils literal"><span class="pre">C-x</span> <span class="pre">C-j</span> <span class="pre">C-c</span></tt> to connect to a server or
two, then <tt class="docutils literal"><span class="pre">C-x</span> <span class="pre">C-j</span> <span class="pre">C-r</span></tt> to visit your roster, then <tt class="docutils literal">RET</tt> on a name
to launch a chat with someone, then <tt class="docutils literal"><span class="pre">C-x</span> <span class="pre">C-j</span> <span class="pre">C-a</span></tt> to set presence
status to "away", etc.</p>
<div class="section" id="unobtrusive-im-workflow">
<h2>Unobtrusive IM workflow</h2>
<p>When you've had enough chatting, and you go back to writing some code
(or some email, or ...), your chat buffers are no longer visible.
Then, if you have some new incoming IM activity, be it a private
message or a groupchat message or a subscription request, your Emacs
mode line will gently change and display the name of the incoming
contacts/groupchats that have seen new IM activity. You can then use
<tt class="docutils literal"><span class="pre">C-x</span> <span class="pre">C-j</span> <span class="pre">C-l</span></tt> to switch right to the correct jabber buffer to
continue the chat. After you are done, press <tt class="docutils literal"><span class="pre">C-x</span> <span class="pre">C-j</span> <span class="pre">C-l</span></tt> again to
get back to coding at the very same place you had been before
answering the chat call.</p>
<p>This is a nicely unobtrusive way of IM operations. Work as usual,
answer incoming activity requests with a keypress, and use the same
keypress to get back to your work exactly where you left it.</p>
<p>Needless to say, one can also configure things to be as obtrusive as
desired, say pop up a notification window upon receiving an incoming
message.</p>
</div>
<div class="section" id="activity-patch">
<h2>Activity patch</h2>
<p>A few days ago I made a small patch for <tt class="docutils literal">jabber.el</tt> related to
activity tracking.</p>
<p>Consider a situation where you hack on a project in a frame located on
Desktop 2 while your jabber buffers are open in another frame on
Desktop 8 that you rarely visit. While hacking in Desktop 2, the
jabber activity flags would not be raised in that frame's mode line,
because jabber buffers are considered "visible", even though in
reality they are visible on another desktop, and hence "invisible" in
this desktop in this user scenario.</p>
<p>I've created a new custom variable
<tt class="docutils literal"><span class="pre">jabber-activity-all-frames-visible</span></tt> that allows configuring the
activity alarm raising behaviour on all frames. You can get the patch
from <a class="reference external" href="http://article.gmane.org/gmane.emacs.jabber.general/913">http://article.gmane.org/gmane.emacs.jabber.general/913</a>.</p>
</div>
<div class="section" id="groupchat-history-logging">
<h2>Groupchat history logging</h2>
<p>Another patch I created a few days ago is related to logging chat
history.</p>
<p>The current behaviour of the message history logger seems to be not to
log groupchat messages. In my use case scenario, I would like to be
able to log messages of <em>some</em> particular groupchats, such as our
"invenio" developer chat room.</p>
<p>I've addressed this need by proposing a new custom variable
<tt class="docutils literal"><span class="pre">jabber-history-enabled-groupchats</span></tt> that is basically a regexp of
groupchats we want to enable logging for. You can get the patch from
<a class="reference external" href="http://article.gmane.org/gmane.emacs.jabber.general/915">http://article.gmane.org/gmane.emacs.jabber.general/915</a>.</p>
</div>
Emacs Multifile Operations2009-07-05T18:00:00+02:002009-07-05T18:00:00+02:00Tibor Šimkotag:tiborsimko.org,2009-07-05:/emacs-multifile-operations.html<p>The <strong>dired</strong> mode is an Emacs interface to the filesystem, enabling
one to perform certain operations on multiple files. The dired mode
can be combined with other Emacs goodies such as the search and
replace tool into a powerful multi-file editing tool.</p>
<!-- PELICAN_END_SUMMARY --><div class="section" id="dired-example-one">
<h2>Dired Example One</h2>
<p>Say we would like to rename all files named <tt class="docutils literal">*.shtml.wml</tt> into
<tt class="docutils literal">*.php.wml</tt> in a certain directory such as <tt class="docutils literal">/tmp</tt>. You may know
of the Linux CLI tool <tt class="docutils literal">mmv</tt> with which we could do:</p>
<pre class="literal-block">
$ mmv '*.shtml.wml' '#1.php.wml'
</pre>
<p>but what if we are on a system where mmv is not available? Shall we
try to install it or write a little script to achieve what we want?</p>
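<p>For comparison, such a little script could look roughly like the following
sketch in plain POSIX shell. The file names and the scratch directory are made
up for illustration, so that nothing real gets renamed:</p>

```shell
# Scratch directory with made-up file names.
workdir=$(mktemp -d)
touch "$workdir/index.shtml.wml" "$workdir/about.shtml.wml" "$workdir/notes.txt"

# ${f%.shtml.wml} strips the old suffix; the new suffix is then appended.
for f in "$workdir"/*.shtml.wml; do
  mv -- "$f" "${f%.shtml.wml}.php.wml"
done

ls "$workdir"
```

<p>This works, but it renames everything that matches in one go, without a
chance to review each file.</p>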
<p>What about using Emacs's dired and its <strong>dired-do-rename-regexp</strong>
command (bound to <tt class="docutils literal">% R</tt>) instead? Firstly, press <tt class="docutils literal"><span class="pre">C-x</span> d /tmp RET</tt> to
launch dired on our working directory (or <tt class="docutils literal">dired /tmp</tt> in the
<tt class="docutils literal">eshell</tt>). Then use this dired file regexp renaming command:</p>
<pre class="literal-block">
% R ^\(.*\)\.shtml\.wml$ RET \1.php.wml RET
</pre>
<p>and press <tt class="docutils literal">y</tt> or <tt class="docutils literal">n</tt> or <tt class="docutils literal">!</tt> to rename some or all of the matched
files, and we are done.</p>
</div>
<div class="section" id="dired-example-two">
<h2>Dired Example Two</h2>
<p>We are moving away from javadoc-style docstrings
to epydoc-style docstrings in our coding project, so we would like
to replace all occurrences of <tt class="docutils literal">@param foo bar</tt> by <tt class="docutils literal">@param foo: bar</tt>
recursively in all our sources.</p>
<p>Firstly, search for all the files containing <tt class="docutils literal">@param something space</tt>,
i.e. when the variable name is not followed by a colon, and let us make a
dired buffer out of these files, by means of <strong>find-grep-dired</strong>:</p>
<pre class="literal-block">
M-x find-grep-dired RET ~/src/cds-invenio/modules RET @param \(\w*\) SPC RET
</pre>
<p>Secondly, in the dired buffer, mark only the Python files for further
processing:</p>
<pre class="literal-block">
% m \.py$ RET
</pre>
<p>Thirdly, replace wanted expressions in the preselected files:</p>
<pre class="literal-block">
Q @param \(\w+\) SPC RET @param \1: SPC RET
</pre>
<p>Now choose interactively <tt class="docutils literal">y</tt> or <tt class="docutils literal">n</tt> to replace or not the given
occurrence of the param regexp, or use <tt class="docutils literal">!</tt> to replace silently every
occurrence, etc.</p>
<p>Finally, save all buffers:</p>
<pre class="literal-block">
C-x s !
</pre>
<p>and we are done.</p>
<p>Another advantage of doing these replacements inside Emacs itself
rather than via CLI one-liners is much better interactivity: we can
easily test our regexps, replace only some occurrences while not
touching others, revert some of the edits back, etc.</p>
</div>
Emacs Keyboard Macros2009-07-05T12:00:00+02:002009-07-05T12:00:00+02:00Tibor Šimkotag:tiborsimko.org,2009-07-05:/emacs-keyboard-macros.html<p>Need to run a certain operation on a bunch of lines? E.g. add some
text after a certain column or to the end of every line in a buffer?
With keyboard macros you can perform your line operations "live" for a
line, while recording them, and then replay them for the other lines.</p>
<!-- PELICAN_END_SUMMARY --><p>How to record a macro: position yourself at the beginning of a line,
press <tt class="docutils literal"><span class="pre">C-x</span> (</tt> to start recording keyboard macro, then do some stuff
such as <tt class="docutils literal"><span class="pre">C-e</span></tt> to jump to end of line to write some text, then press
<tt class="docutils literal"><span class="pre">C-a</span> <span class="pre">C-n</span></tt> to jump to start of the next line, then <tt class="docutils literal"><span class="pre">C-x</span> )</tt> to end
macro recording.</p>
<p>How to replay a macro: <tt class="docutils literal"><span class="pre">C-x</span> e</tt> to replay it once, then keep pressing
e to replay it for other lines. Or, to replay it for the following
1234 lines, do <tt class="docutils literal"><span class="pre">C-u</span> 1234 <span class="pre">C-x</span> e</tt>. Or, to apply the macro to all lines in
a region, do <tt class="docutils literal"><span class="pre">C-x</span> <span class="pre">C-k</span> r</tt>.</p>
<p>One can combine macros with register counters in order to rapidly help
Bart Simpson:</p>
<pre class="literal-block">
C-u 1 C-x r n a ;; store number 1 in register `a'
C-x ( ;; start recording macro
C-x r i a ;; insert contents of register `a'
C-f . I will not waste chalk. RET ;; enter our text
C-x r + a ;; increment register `a'
C-x ) ;; end recording macro
C-u 7 C-x e ;; apply macro 7 more times (line 1 was produced while recording)
</pre>
<p>which will produce this output:</p>
<pre class="literal-block">
1. I will not waste chalk.
2. I will not waste chalk.
3. I will not waste chalk.
4. I will not waste chalk.
5. I will not waste chalk.
6. I will not waste chalk.
7. I will not waste chalk.
8. I will not waste chalk.
</pre>
Writing Python Docstrings with Emacs2009-06-08T12:00:00+02:002009-06-08T12:00:00+02:00Tibor Šimkotag:tiborsimko.org,2009-06-08:/emacs-epydoc-snippets.html<p>Writing documentation alongside coding is one of the best ways to
ensure that the documentation stays up to date with the code. In
Python, such a documentation can be achieved by writing rich
docstrings. How can Emacs help us with writing rich docstrings?</p>
<!-- PELICAN_END_SUMMARY --><div class="section" id="code-skeleton-and-templates">
<h2>Code skeleton and templates</h2>
<p>Emacs offers several packages providing "code skeletons" or "code
templates" that help with writing repetitive patterns. For example,
if you type <tt class="docutils literal">def</tt> in a Python buffer and press <tt class="docutils literal">TAB</tt> afterwards,
the editor can expand it into a generic function skeleton,
including a skeleton docstring.</p>
<p>There are several ways to get skeletons or templates in Emacs,
out of which I've settled on <a class="reference external" href="http://www.emacswiki.org/emacs/Yasnippet">yasnippet</a>. Yasnippet comes with
predefined snippets for many languages. In our example at hand, we'd
like it to generate rich, appropriate docstrings when writing Python
functions or class methods.</p>
</div>
<div class="section" id="epytext">
<h2>Epytext</h2>
<p>One popular way to write richly formatted Python docstrings is to use
the <a class="reference external" href="http://epydoc.sourceforge.net/epytext.html">epytext</a> markup,
from which <a class="reference external" href="http://epydoc.sourceforge.net/">epydoc</a> can generate
nicely formatted static documentation pages for the code.</p>
<p>A function documented in the epytext markup looks like this:</p>
<div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">area</span><span class="p">(</span><span class="n">length</span><span class="p">,</span> <span class="n">width</span><span class="p">):</span>
    <span class="sd">"""</span>
<span class="sd">    Return rectangle area.</span>
<span class="sd">    @param length: rectangle length</span>
<span class="sd">    @type length: float</span>
<span class="sd">    @param width: rectangle width</span>
<span class="sd">    @type width: float</span>
<span class="sd">    @return: rectangle area</span>
<span class="sd">    @rtype: float</span>
<span class="sd">    """</span>
    <span class="k">return</span> <span class="n">length</span><span class="o">*</span><span class="n">width</span>
</pre></div>
<p>Notice how <tt class="docutils literal">@param</tt> serves to document function parameters,
<tt class="docutils literal">@type</tt> their types, <tt class="docutils literal">@return</tt> what the function returns, and
<tt class="docutils literal">@rtype</tt> its type. Other useful markup is <tt class="docutils literal">@raise</tt> to document
which exceptions might be raised by the function, or <tt class="docutils literal">@note</tt> to add
extra notes.</p>
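<p>As a minimal sketch of the <tt class="docutils literal">@raise</tt> and <tt class="docutils literal">@note</tt> fields just mentioned,
here is a second function documented in the same style. The function
<tt class="docutils literal">safe_ratio</tt> is a hypothetical example made up for illustration; it is
not part of epydoc.</p>

```python
def safe_ratio(numerator, denominator):
    """
    Return the ratio of two numbers.

    @param numerator: value to divide
    @type numerator: float
    @param denominator: value to divide by
    @type denominator: float
    @return: numerator divided by denominator
    @rtype: float
    @raise ValueError: if denominator is zero
    @note: hypothetical example written for illustration only
    """
    if denominator == 0:
        # Documented above via the @raise field.
        raise ValueError("denominator must not be zero")
    return numerator / denominator
```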
<p>Yasnippet's native <tt class="docutils literal">def</tt> snippet does not support epytext markup.
It is easy to define a new custom snippet (say called <tt class="docutils literal">de</tt>) that
<em>will</em> support it.</p>
</div>
<div class="section" id="defining-custom-snippet">
<h2>Defining custom snippet</h2>
<p>Here is how to load custom snippets from some directory:</p>
<pre class="literal-block">
(require 'yasnippet)
(yas/initialize)
(yas/load-directory "~/.emacs.d/snippets")
</pre>
<p>Here is how I defined the custom <tt class="docutils literal">de</tt> snippet, located in
<tt class="docutils literal"><span class="pre">~/.emacs.d/snippets/text-mode/python-mode/de</span></tt>:</p>
<pre class="literal-block">
# -*- coding: utf-8 -*-
# name: de
# contributor: Orestis Markou
# contributor: Yasser González Fernández <yglez@uh.cu>
# contributor: Tibor Simko <tibor.simko@cern.ch>
# --
def ${1:name}($2):
    """
    $3
    ${2:$
    (let* ((indent
            (concat "\n" (make-string (current-column) 32)))
           (args
            (mapconcat
             '(lambda (x)
                (if (not (string= (nth 0 x) ""))
                    (concat "@param " (nth 0 x) ": " indent
                            "@type " (nth 0 x) ": ")))
             (mapcar
              '(lambda (x)
                 (mapcar
                  '(lambda (x)
                     (replace-regexp-in-string "[[:blank:]]*$" ""
                      (replace-regexp-in-string "^[[:blank:]]*" "" x)))
                  x))
              (mapcar '(lambda (x) (split-string x "="))
                      (split-string text ",")))
             indent)))
      (if (string= args "")
          (concat indent "@return: " indent "@rtype: " indent (make-string 3 34))
        (mapconcat
         'identity
         (list "" args "@return: " "@rtype: " (make-string 3 34))
         indent)))
    }
    $0
</pre>
<p>Now, when typing <tt class="docutils literal">de RET</tt>, Emacs will insert the function skeleton
and, as we type the function parameters, will pre-fill the docstring
with the matching epytext fields.</p>
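<p>For readers less used to Emacs Lisp, the text transformation done by the
snippet can be sketched in Python. This is only an approximation of the
logic above, not the snippet itself: the name <tt class="docutils literal">epytext_fields</tt> is
hypothetical, and the sketch returns a list of lines instead of handling
indentation and the closing quotes.</p>

```python
def epytext_fields(arglist):
    """Return epytext docstring field lines for a Python argument list.

    Roughly mirrors the snippet's Emacs Lisp: split the argument text
    on commas, drop default values after "=", strip blanks, and emit
    @param/@type for each argument, followed by @return and @rtype.
    """
    lines = []
    for arg in arglist.split(","):
        name = arg.split("=")[0].strip()
        if name:  # skip empty pieces, e.g. for a function without arguments
            lines.append("@param %s: " % name)
            lines.append("@type %s: " % name)
    lines.append("@return: ")
    lines.append("@rtype: ")
    return lines


if __name__ == "__main__":
    for line in epytext_fields("length, width=1.0"):
        print(line)
```

<p>Keeping one <tt class="docutils literal">@param</tt>/<tt class="docutils literal">@type</tt> pair per argument is what lets the
docstring follow the parameter list as it is being typed.</p>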
</div>
<div class="section" id="live-demo">
<h2>Live demo!</h2>
<img alt="Emacs Epydoc Snippets" src="http://tiborsimko.org/images/emacs-epydoc-yasnippet-docstrings.gif" />
</div>
<div class="section" id="conclusions">
<h2>Conclusions?</h2>
<p>Writing up-to-date, feature-rich code documentation can be both easy
and fun provided one uses powerful extensible editors.</p>
</div>
Quotes on Programming2007-06-29T12:00:00+02:002007-06-29T12:00:00+02:00Tibor Šimkotag:tiborsimko.org,2007-06-29:/quotes-programming.html<p>Quotes on computer programming to meditate upon.</p>
<!-- PELICAN_END_SUMMARY --><div class="section" id="performance">
<h2>Performance</h2>
<div class="line-block">
<div class="line">We should forget about small efficiencies, say about 97% of the time:</div>
<div class="line">premature optimization is the root of all evil.</div>
<div class="line">--Donald Knuth</div>
<div class="line"><br /></div>
<div class="line">You're bound to be unhappy if you optimize everything.</div>
<div class="line">--Donald Knuth</div>
<div class="line"><br /></div>
<div class="line">The best performance improvement is the transition from the …</div></div></div><p>Quotes on computer programming to meditate upon.</p>
<!-- PELICAN_END_SUMMARY --><div class="section" id="performance">
<h2>Performance</h2>
<div class="line-block">
<div class="line">We should forget about small efficiencies, say about 97% of the time:</div>
<div class="line">premature optimization is the root of all evil.</div>
<div class="line">--Donald Knuth</div>
<div class="line"><br /></div>
<div class="line">You're bound to be unhappy if you optimize everything.</div>
<div class="line">--Donald Knuth</div>
<div class="line"><br /></div>
<div class="line">The best performance improvement is the transition from the nonworking</div>
<div class="line">state to the working state.</div>
<div class="line">--John Ousterhout</div>
<div class="line"><br /></div>
<div class="line">Rules of Optimization:</div>
<div class="line">Rule 1: Don't do it.</div>
<div class="line">Rule 2 (for experts only): Don't do it yet.</div>
<div class="line">--M.A. Jackson</div>
<div class="line"><br /></div>
<div class="line">More computing sins are committed in the name of efficiency (without</div>
<div class="line">necessarily achieving it) than for any other single reason - including</div>
<div class="line">blind stupidity.</div>
<div class="line">--W.A. Wulf</div>
</div>
</div>
<div class="section" id="quality">
<h2>Quality</h2>
<div class="line-block">
<div class="line">If builders built buildings the way programmers wrote programs, then</div>
<div class="line">the first woodpecker that came along would destroy civilisation.</div>
<div class="line">--Gerald Weinberg</div>
<div class="line"><br /></div>
<div class="line">Since human beings themselves are not fully debugged yet, there will</div>
<div class="line">be bugs in your code no matter what you do.</div>
<div class="line">--Chris Mason, Microsoft, "Zero Defects" memo</div>
<div class="line-block">
<div class="line"><a class="reference external" href="http://blogs.msdn.com/rick_schaut/archive/2004/05/19/135315.aspx">http://blogs.msdn.com/rick_schaut/archive/2004/05/19/135315.aspx</a></div>
<div class="line"><br /></div>
</div>
<div class="line">There is nothing that would make me happier than to fix every bug that</div>
<div class="line">is found during testing before the product officially ships.</div>
<div class="line">Unfortunately, the realities of product development always dash my</div>
<div class="line">happy dreams and I wind up taking a lot of aspirin. I don't care what</div>
<div class="line">model of software development you use -- waterfall, spiral, eXtreme,</div>
<div class="line">or otherwise -- the process of shipping a piece of software boils down</div>
<div class="line">to the same thing: trying to control chaos.</div>
<div class="line">--Joe Bork, Microsoft software tester, "The anatomy of a bug", Oct 2003</div>
<div class="line-block">
<div class="line"><a class="reference external" href="http://headblender.com/joe/blog/old/001280.html">http://headblender.com/joe/blog/old/001280.html</a></div>
<div class="line"><br /></div>
</div>
<div class="line">What is a good module? That's hard to say. What is good code? That's</div>
<div class="line">also hard to say. "Quality" is not a well-defined term in</div>
<div class="line">computing... and especially not Perl. One man's Thing of Beauty is</div>
<div class="line">another's man's Evil Hack. Since we can't define quality, how do we</div>
<div class="line">write a program to assure it?</div>
<div class="line"><br /></div>
<div class="line">Kwalitee: It looks like quality, it sounds like quality, but it's not</div>
<div class="line">quite quality.</div>
<div class="line">--CPAN Testing Service (quoting Schwern)</div>
</div>
</div>
<div class="section" id="metrics">
<h2>Metrics</h2>
<div class="line-block">
<div class="line">Measuring programming progress by lines of code is like measuring</div>
<div class="line">aircraft building progress by weight.</div>
<div class="line">--Bill Gates</div>
</div>
</div>
<div class="section" id="documentation">
<h2>Documentation</h2>
<div class="line-block">
<div class="line">It's not finished until it's documented.</div>
<div class="line">--This may originally have been said by Tom Limoncelli.</div>
<div class="line"><br /></div>
<div class="line">Documentation isn't done until someone else understands it.</div>
<div class="line">--Originally submitted by William S. Annis on 12jan2000.</div>
<div class="line"><br /></div>
<div class="line">Good code is its own best documentation. As you're about to add a</div>
<div class="line">comment, ask yourself, 'How can I improve the code so that this</div>
<div class="line">comment isn't needed?' Improve the code and then document it to make</div>
<div class="line">it even clearer.</div>
<div class="line">--Steve McConnell, "Code Complete"</div>
<div class="line"><br /></div>
<div class="line">If the code and the comments disagree, then both are probably wrong.</div>
<div class="line">--attributed to Norm Schryer</div>
<div class="line"><br /></div>
<div class="line">Incorrect documentation is often worse than no documentation.</div>
<div class="line">--Bertrand Meyer</div>
<div class="line"><br /></div>
<div class="line">Any code of your own that you haven't looked at for six or more months</div>
<div class="line">might as well have been written by someone else.</div>
<div class="line">--Eagleson's law</div>
</div>
</div>
<div class="section" id="coding-standards">
<h2>Coding standards</h2>
<div class="line-block">
<div class="line">Good programmers use their brains, but good guidelines save us having</div>
<div class="line">to think out every case.</div>
<div class="line">--Francis Glassborow</div>
<div class="line"><br /></div>
<div class="line">Just because the standard provides a cliff in front of you, you are</div>
<div class="line">not necessarily required to jump off it.</div>
<div class="line">--Norman Diamond</div>
</div>
</div>
<div class="section" id="debugging">
<h2>Debugging</h2>
<div class="line-block">
<div class="line">Debugging is twice as hard as writing the code in the first</div>
<div class="line">place. Therefore, if you write the code as cleverly as possible, you</div>
<div class="line">are, by definition, not smart enough to debug it.</div>
<div class="line">--Brian W. Kernighan</div>
<div class="line"><br /></div>
<div class="line">There are two ways to write error-free programs; only the third works.</div>
<div class="line">--Alan J. Perlis</div>
</div>
</div>
<div class="section" id="quick-fixing">
<h2>Quick fixing</h2>
<div class="line-block">
<div class="line">There's no such thing as a temporary fix.</div>
<div class="line">--Originally submitted by David Todd on 21dec99</div>
</div>
</div>
<div class="section" id="naming">
<h2>Naming</h2>
<div class="line-block">
<div class="line">There are only two hard problems in Computer Science: naming things</div>
<div class="line">and cache invalidation.</div>
<div class="line">--Phil Karlton</div>
</div>
</div>
<div class="section" id="testing">
<h2>Testing</h2>
<div class="line-block">
<div class="line">Testing by itself does not improve software quality. Test results are</div>
<div class="line">an indicator of quality, but in and of themselves, they don't improve</div>
<div class="line">it. Trying to improve software quality by increasing the amount of</div>
<div class="line">testing is like trying to lose weight by weighing yourself more</div>
<div class="line">often. What you eat before you step onto the scale determines how much</div>
<div class="line">you will weigh, and the software development techniques you use</div>
<div class="line">determine how many errors testing will find. If you want to lose</div>
<div class="line">weight, don't buy a new scale; change your diet. If you want to</div>
<div class="line">improve your software, don't test more; develop better.</div>
<div class="line">--Steve McConnell, "Code Complete"</div>
<div class="line"><br /></div>
<div class="line">If testing costs more than not testing, then don't test.</div>
<div class="line">--Kent Beck</div>
</div>
</div>
<div class="section" id="en-guise-of-conclusion">
<h2>En guise of conclusion</h2>
<div class="line-block">
<div class="line">...well over half of the time you spend working on a project (on the</div>
<div class="line">order of 70 percent) is spent thinking, and no tool, no matter how</div>
<div class="line">advanced, can think for you. Consequently, even if a tool did</div>
<div class="line">everything except the thinking for you -- if it wrote 100 percent of</div>
<div class="line">the code, wrote 100 percent of the documentation, did 100 percent of</div>
<div class="line">the testing, burned the CD-ROMs, put them in boxes, and mailed them to</div>
<div class="line">your customers -- the best you could hope for would be a 30 percent</div>
<div class="line">improvement in productivity. In order to do better than that, you have</div>
<div class="line">to change the way you think.</div>
<div class="line">--Fred Brooks, "No Silver Bullet", in "The Mythical Man Month",</div>
<div class="line-block">
<div class="line-block">
<div class="line">paraphrased by Allen Holub</div>
</div>
<div class="line"><a class="reference external" href="http://www.javaworld.com/javaworld/jw-07-1999/jw-07-toolbox.html">http://www.javaworld.com/javaworld/jw-07-1999/jw-07-toolbox.html</a></div>
</div>
</div>
</div>
Python Micro Benchmarking2007-02-22T14:10:00+01:002007-02-22T14:10:00+01:00Tibor Šimkotag:tiborsimko.org,2007-02-22:/python-micro-benchmarking.html<p><strong>timeit</strong> lets you micro-benchmark code snippets, e.g. to compare the
efficiency of list comprehensions across Python versions.</p>
<!-- PELICAN_END_SUMMARY --><p>Here are the results for Python 2.2, 2.3 and 2.4; note how the
performance improves:</p>
<pre class="literal-block">
$ python2.2 /usr/lib/python2.4/timeit.py -n 10000 -r 5 "[i for i …</pre><p><strong>timeit</strong> lets you micro-benchmark code snippets, e.g. to compare the
efficiency of list comprehensions across Python versions.</p>
<!-- PELICAN_END_SUMMARY --><p>Here are the results for Python 2.2, 2.3 and 2.4; note how the
performance improves:</p>
<pre class="literal-block">
$ python2.2 /usr/lib/python2.4/timeit.py -n 10000 -r 5 "[i for i in range(1000)]"
10000 loops, best of 5: 348 usec per loop
$ python2.3 /usr/lib/python2.4/timeit.py -n 10000 -r 5 "[i for i in range(1000)]"
10000 loops, best of 5: 283 usec per loop
$ python2.4 /usr/lib/python2.4/timeit.py -n 10000 -r 5 "[i for i in range(1000)]"
10000 loops, best of 5: 137 usec per loop
</pre>
<p>Or, using the same Python version, one can compare the speed of the
list comprehension approach against the classical functional
approach:</p>
<pre class="literal-block">
$ python2.4 /usr/lib/python2.4/timeit.py -n 10000 -r 5 "[i for i in range(1000)]"
10000 loops, best of 5: 137 usec per loop
$ python2.4 /usr/lib/python2.4/timeit.py -n 10000 -r 5 "map(lambda x: x, range(1000))"
10000 loops, best of 5: 398 usec per loop
</pre>
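<p>The same kind of measurement can also be scripted with the <tt class="docutils literal">timeit</tt>
module directly. The following is a minimal sketch for a current Python 3
interpreter, not the setup used for the numbers above; the helper name
<tt class="docutils literal">best_usec_per_loop</tt> is an assumption made for this example. Note that
in Python 3 <tt class="docutils literal">map()</tt> is lazy, so it is wrapped in <tt class="docutils literal">list()</tt> to do
comparable work.</p>

```python
import timeit


def best_usec_per_loop(stmt, number=1000, repeat=5):
    """Return the best time per loop, in microseconds, for a statement."""
    # timeit.repeat returns one total time (in seconds) per repetition.
    totals = timeit.repeat(stmt, number=number, repeat=repeat)
    return min(totals) / number * 1e6


if __name__ == "__main__":
    statements = (
        "[i for i in range(1000)]",
        "list(map(lambda x: x, range(1000)))",
    )
    for stmt in statements:
        print("%-40s %8.1f usec per loop" % (stmt, best_usec_per_loop(stmt)))
```

<p>Taking the minimum over several repetitions follows what the command-line
tool reports as "best of N".</p>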
Python Speed on AMD vs Intel2006-08-31T19:00:00+02:002006-08-31T19:00:00+02:00Tibor Šimkotag:tiborsimko.org,2006-08-31:/python-speed-amd-vs-intel.html<p>The speed of the Python interpreter on the Intel Core 2 Duo test
system seems to be better by about 20-25 percent when compared to our
hitherto-fastest AMD Opteron system, at an equivalent CPU speed.</p>
<!-- PELICAN_END_SUMMARY --><p>I did two Python benchmarks, pystones (simple) and pybench (preferred,
complete benchmark suite). The results …</p><p>The speed of the Python interpreter on the Intel Core 2 Duo test
system seems to be better by about 20-25 percent when compared to our
hitherto-fastest AMD Opteron system, at an equivalent CPU speed.</p>
<!-- PELICAN_END_SUMMARY --><p>I did two Python benchmarks, pystones (simple) and pybench (preferred,
complete benchmark suite). The results are:</p>
<table border="1" class="docutils">
<colgroup>
<col width="16%" />
<col width="21%" />
<col width="21%" />
<col width="21%" />
<col width="21%" />
</colgroup>
<thead valign="bottom">
<tr><th class="head">Machine</th>
<th class="head">Pystones/sec</th>
<th class="head">Pystns/sec/GHz</th>
<th class="head">Pybench [sec]</th>
<th class="head">Pybench*GHz</th>
</tr>
</thead>
<tbody valign="top">
<tr><td>PCUDS17</td>
<td>38759 (~ 99%)</td>
<td>22866 (~179%)</td>
<td>48.405 ( -0%)</td>
<td>82.046 (-46%)</td>
</tr>
<tr><td>SUNUDS99</td>
<td>39062 (=100%)</td>
<td>12777 (=100%)</td>
<td>49.808 ( =0%)</td>
<td>152.263 ( =0%)</td>
</tr>
<tr><td>SUNUDS93</td>
<td>+53763 (~138%)</td>
<td>+22469 (~176%)</td>
<td>-33.007 (-34%)</td>
<td>-78.985 (-48%)</td>
</tr>
<tr><td>SUNUDS94</td>
<td>58139 (~149%)</td>
<td>24325 (~190%)</td>
<td>31.730 (-36%)</td>
<td>75.834 (-50%)</td>
</tr>
<tr><td>PCITFIOT02</td>
<td>75757 (~194%)</td>
<td>25311 (~198%)</td>
<td>20.260 (-59%)</td>
<td>60.638 (-60%)</td>
</tr>
</tbody>
</table>
<p>At an equivalent CPU speed (the per-GHz columns), PCITFIOT02 (Intel
Core 2 Duo) is faster than SUNUDS9{3,4} (AMD Opteron) by
20-22 percent in pybench and by 4-12 percent in pystones.</p>
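<p>The per-GHz normalisation can be reproduced with a few lines of Python.
This is a sketch under stated assumptions: the raw figures are copied from
the table and the clock speeds from the machine details in this post, while
the exact percentage formula used for the numbers quoted above is not
stated, so the one below is only one plausible reading. The function names
are made up for this example.</p>

```python
# Figures copied from this post: (pystones/sec, pybench seconds, CPU GHz).
MACHINES = {
    "SUNUDS93": (53763, 33.007, 2.393),
    "SUNUDS94": (58139, 31.730, 2.390),
    "PCITFIOT02": (75757, 20.260, 2.993),
}


def per_ghz(name):
    """Return (pystones/sec/GHz, pybench*GHz) for a machine."""
    pystones, pybench, ghz = MACHINES[name]
    return pystones / ghz, pybench * ghz


def advantage(fast, slow):
    """Return percent gains of `fast` over `slow` as (pystones, pybench)."""
    fast_stones, fast_bench = per_ghz(fast)
    slow_stones, slow_bench = per_ghz(slow)
    stones_gain = (fast_stones / slow_stones - 1) * 100  # higher is better
    bench_gain = (1 - fast_bench / slow_bench) * 100     # lower is better
    return stones_gain, bench_gain


if __name__ == "__main__":
    for amd in ("SUNUDS93", "SUNUDS94"):
        stones, bench = advantage("PCITFIOT02", amd)
        print("%s: pystones +%.1f%%, pybench -%.1f%%" % (amd, stones, bench))
```

<p>With this reading the results come out close to the ranges quoted above,
with small differences due to rounding.</p>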
<p>I have also run some other simple crunch tests, e.g. a stress test of
the garbage collection on binary trees, that gave ~25-30 percent
speedup for the Core 2 Duo system when compared to AMD Opteron running
at an equivalent CPU speed. Therefore I'd guess that the pybench
result of 20-22 percent is fairly trustworthy.</p>
<p>Machine details:</p>
<dl class="docutils">
<dt>PCUDS17: IBM ThinkPad T42</dt>
<dd>CPU = Intel(R) Pentium(R) M processor 1.695 GHz
OS = Debian GNU/Linux "Sarge" i386 32-bit
Python = Python 2.3.5 (#2, Sep 4 2005, 22:01:42) [GCC 3.3.5 (Debian 1:3.3.5-13)] on linux2</dd>
<dt>SUNUDS99: Sun Fire V65x</dt>
<dd>CPU = Intel(R) Xeon(TM) CPU 3.057 GHz
OS = Debian GNU/Linux "Sarge" i386 32-bit
Python = Python 2.3.4 (#1, Dec 8 2004, 16:51:14) [GCC 2.95.4 20011002 (Debian prerelease)] on linux2</dd>
<dt>SUNUDS93: Sun Fire V20z</dt>
<dd>CPU = AMD Opteron(tm) Processor 250 2.393 GHz
OS = Debian GNU/Linux "Sarge" amd64 64-bit
Python = Python 2.3.4 (#2, Feb 21 2006, 17:44:05) [GCC 3.3.5 (Debian 1:3.3.5-13)] on linux2</dd>
<dt>SUNUDS94: Sun Fire V20z</dt>
<dd>CPU = AMD Opteron(tm) Processor 250 2.390 GHz
OS = Debian GNU/Linux "Sarge" amd64 64-bit
Python = Python 2.3.5 (#2, Sep 9 2005, 21:37:55) [GCC 3.3.5 (Debian 1:3.3.5-13)] on linux2</dd>
<dt>PCITFIOT02: ? Core 2 Duo</dt>
<dd>CPU = Genuine Intel(R) CPU [Woodcrests 5160?] 2.993 GHz
OS = Scientific Linux CERN 4 64-bit
Python = Python 2.3.4 (#1, Mar 12 2006, 16:28:27) [GCC 3.4.5 20051201 (Red Hat 3.4.5-2)] on linux2</dd>
<dt>Machine notes:</dt>
<dd>The Python version numbers and the compiler options differed a bit,
so the comparison is not perfectly accurate, but the
difference due to this factor should be within ~5% or so, as shown
by the difference between the SUNUDS9{3,4} results. Moreover, some of the
machines such as SUNUDS93 were tested under moderate load, which
may account for a couple of percent. Ideally one should install
the same OS on all machines and test them when idle, and also run
the CDS server software itself to get precise performance numbers for
our concrete application. But the numbers cited above already give
quite a good speed estimate.</dd>
</dl>
Python Exception Handling Overhead2006-06-13T12:00:00+02:002006-06-13T12:00:00+02:00Tibor Šimkotag:tiborsimko.org,2006-06-13:/python-exception-handling-overhead.html<p>In a previous blog post, I've estimated Python OO method call overhead
to be about 10% over function calls. What about exception handling?</p>
<!-- PELICAN_END_SUMMARY --><p>Here the situation differs. Exceptions are so pervasive in Python
core language that handling exceptional cases of your code via adding
more exceptions do not add much …</p><p>In a previous blog post, I've estimated Python OO method call overhead
to be about 10% over function calls. What about exception handling?</p>
<!-- PELICAN_END_SUMMARY --><p>Here the situation differs. Exceptions are so pervasive in the Python
core language that handling exceptional cases of your code by raising
exceptions adds little overhead when errors are rare, although raising is
noticeably slower when every call fails (see the timings in the docstring
below). Here is a simple
brute-force comparison of returning C-style <tt class="docutils literal">(res, err, wrn)</tt> tuples
versus returning <tt class="docutils literal">res</tt> and raising exceptions:</p>
<div class="highlight"><pre><span></span><span class="sd">"""</span>
<span class="sd">Measure the speed of the exception handling mechanism in Python, by</span>
<span class="sd">comparing returning of (res, err, wrn) tuples to returning res only</span>
<span class="sd">with raising exceptions.</span>
<span class="sd">Results run on PCUDS17 on 2006-06-13 are:</span>
<span class="sd">$ python2.3 exception_handling_speed.py</span>
<span class="sd">testing speed of exception handling versus returning (res, err, wrn) tuples:</span>
<span class="sd">f_tuples() returning never errors ........ 1.700 sec</span>
<span class="sd">f_exception() returning never errors ..... 1.000 sec</span>
<span class="sd">f_tuples() returning always errors ....... 1.580 sec</span>
<span class="sd">f_exception() returning always errors .... 3.940 sec</span>
<span class="sd">"""</span>

<span class="n">m</span> <span class="o">=</span> <span class="mi">10000</span>

<span class="k">class</span> <span class="nc">MyError</span><span class="p">(</span><span class="ne">Exception</span><span class="p">):</span>
    <span class="k">pass</span>

<span class="k">def</span> <span class="nf">f_tuples_slave</span><span class="p">(</span><span class="n">n</span><span class="p">,</span><span class="n">okay</span><span class="o">=</span><span class="mi">1</span><span class="p">):</span>
    <span class="sd">"""Do some calculations and return res, err, wrn tuple."""</span>
    <span class="k">if</span> <span class="n">okay</span><span class="p">:</span>
        <span class="k">return</span> <span class="n">n</span><span class="o">*</span><span class="n">n</span><span class="p">,</span> <span class="p">[],</span> <span class="p">[]</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="k">return</span> <span class="mi">0</span><span class="p">,</span> <span class="p">[</span><span class="s1">'foo'</span><span class="p">],</span> <span class="p">[</span><span class="s1">'bar'</span><span class="p">]</span>

<span class="k">def</span> <span class="nf">f_exceptions_slave</span><span class="p">(</span><span class="n">n</span><span class="p">,</span><span class="n">okay</span><span class="o">=</span><span class="mi">1</span><span class="p">):</span>
    <span class="sd">"""Do some calculations and return res plus raise exception."""</span>
    <span class="k">if</span> <span class="n">okay</span><span class="p">:</span>
        <span class="k">return</span> <span class="n">n</span><span class="o">*</span><span class="n">n</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="k">raise</span> <span class="n">MyError</span>

<span class="k">def</span> <span class="nf">f_tuples</span><span class="p">(</span><span class="n">okay</span><span class="o">=</span><span class="mi">1</span><span class="p">):</span>
    <span class="k">global</span> <span class="n">m</span>
    <span class="n">x</span> <span class="o">=</span> <span class="mi">0</span>
    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span><span class="n">m</span><span class="p">):</span>
        <span class="n">res</span><span class="p">,</span> <span class="n">err</span><span class="p">,</span> <span class="n">wrn</span> <span class="o">=</span> <span class="n">f_tuples_slave</span><span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="n">okay</span><span class="p">)</span>
        <span class="k">if</span> <span class="n">err</span> <span class="o">==</span> <span class="p">[]:</span>
            <span class="n">x</span> <span class="o">+=</span> <span class="n">res</span>
    <span class="k">return</span> <span class="n">x</span>

<span class="k">def</span> <span class="nf">f_exceptions</span><span class="p">(</span><span class="n">okay</span><span class="o">=</span><span class="mi">1</span><span class="p">):</span>
    <span class="k">global</span> <span class="n">m</span>
    <span class="n">x</span> <span class="o">=</span> <span class="mi">0</span>
    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span><span class="n">m</span><span class="p">):</span>
        <span class="k">try</span><span class="p">:</span>
            <span class="n">x</span> <span class="o">+=</span> <span class="n">f_exceptions_slave</span><span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="n">okay</span><span class="p">)</span>
        <span class="k">except</span> <span class="n">MyError</span><span class="p">:</span>
            <span class="k">pass</span>
    <span class="k">return</span> <span class="n">x</span>

<span class="kn">import</span> <span class="nn">time</span>

<span class="k">def</span> <span class="nf">timing</span><span class="p">(</span><span class="n">f</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">n</span><span class="o">=</span><span class="mi">10</span><span class="p">):</span>
    <span class="sd">"""Return timing of function F on argument A run N times.</span>
<span class="sd">    Taken from <http://www.python.org/doc/essays/list2str.html>.</span>
<span class="sd">    """</span>
    <span class="n">r</span> <span class="o">=</span> <span class="nb">range</span><span class="p">(</span><span class="n">n</span><span class="p">)</span>
    <span class="n">t1</span> <span class="o">=</span> <span class="n">time</span><span class="o">.</span><span class="n">clock</span><span class="p">()</span>
    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">r</span><span class="p">:</span>
        <span class="n">f</span><span class="p">(</span><span class="n">a</span><span class="p">);</span> <span class="n">f</span><span class="p">(</span><span class="n">a</span><span class="p">);</span> <span class="n">f</span><span class="p">(</span><span class="n">a</span><span class="p">);</span> <span class="n">f</span><span class="p">(</span><span class="n">a</span><span class="p">);</span> <span class="n">f</span><span class="p">(</span><span class="n">a</span><span class="p">);</span> <span class="n">f</span><span class="p">(</span><span class="n">a</span><span class="p">);</span> <span class="n">f</span><span class="p">(</span><span class="n">a</span><span class="p">);</span> <span class="n">f</span><span class="p">(</span><span class="n">a</span><span class="p">);</span> <span class="n">f</span><span class="p">(</span><span class="n">a</span><span class="p">);</span> <span class="n">f</span><span class="p">(</span><span class="n">a</span><span class="p">)</span>
    <span class="n">t2</span> <span class="o">=</span> <span class="n">time</span><span class="o">.</span><span class="n">clock</span><span class="p">()</span>
    <span class="k">return</span> <span class="nb">round</span><span class="p">(</span><span class="n">t2</span><span class="o">-</span><span class="n">t1</span><span class="p">,</span> <span class="mi">3</span><span class="p">)</span>

<span class="k">def</span> <span class="nf">main</span><span class="p">():</span>
    <span class="sd">"""Time both approaches with and without errors and print the results."""</span>
    <span class="k">print</span> <span class="s2">"testing speed of exception handling versus returning (res, err, wrn) tuples:"</span>
    <span class="c1"># test returning 0% exceptions case:</span>
    <span class="n">time</span> <span class="o">=</span> <span class="n">timing</span><span class="p">(</span><span class="n">f_tuples</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span>
    <span class="k">print</span> <span class="s2">"f_tuples() returning never errors ........ </span><span class="si">%.3f</span><span class="s2"> sec"</span> <span class="o">%</span> <span class="n">time</span>
    <span class="n">time</span> <span class="o">=</span> <span class="n">timing</span><span class="p">(</span><span class="n">f_exceptions</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span>
    <span class="k">print</span> <span class="s2">"f_exception() returning never errors ..... </span><span class="si">%.3f</span><span class="s2"> sec"</span> <span class="o">%</span> <span class="n">time</span>
    <span class="c1"># test returning 100% exceptions case:</span>
    <span class="n">time</span> <span class="o">=</span> <span class="n">timing</span><span class="p">(</span><span class="n">f_tuples</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span>
    <span class="k">print</span> <span class="s2">"f_tuples() returning always errors ....... </span><span class="si">%.3f</span><span class="s2"> sec"</span> <span class="o">%</span> <span class="n">time</span>
    <span class="n">time</span> <span class="o">=</span> <span class="n">timing</span><span class="p">(</span><span class="n">f_exceptions</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span>
    <span class="k">print</span> <span class="s2">"f_exception() returning always errors .... </span><span class="si">%.3f</span><span class="s2"> sec"</span> <span class="o">%</span> <span class="n">time</span>
    <span class="k">return</span>

<span class="k">if</span> <span class="vm">__name__</span> <span class="o">==</span> <span class="s1">'__main__'</span><span class="p">:</span>
    <span class="n">main</span><span class="p">()</span>
</pre></div>
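<p>The script above targets Python 2 (<tt class="docutils literal">print</tt> statements,
<tt class="docutils literal">time.clock()</tt>). For readers who want to repeat the comparison today,
here is a minimal sketch of the same idea for Python 3 using the <tt class="docutils literal">timeit</tt>
module. It is not the original benchmark: the function names are chosen for
this example, and the timings will of course differ from the 2006 numbers.</p>

```python
import timeit


class MyError(Exception):
    pass


def tuples_worker(n, okay=True):
    """Return a C-style (res, err, wrn) tuple."""
    if okay:
        return n * n, [], []
    return 0, ["foo"], ["bar"]


def exceptions_worker(n, okay=True):
    """Return res only, raising MyError on failure."""
    if okay:
        return n * n
    raise MyError


def sum_with_tuples(okay, m=10000):
    """Sum the results, checking the error list after every call."""
    total = 0
    for i in range(m):
        res, err, wrn = tuples_worker(i, okay)
        if not err:
            total += res
    return total


def sum_with_exceptions(okay, m=10000):
    """Sum the results, catching MyError around every call."""
    total = 0
    for i in range(m):
        try:
            total += exceptions_worker(i, okay)
        except MyError:
            pass
    return total


if __name__ == "__main__":
    for func in (sum_with_tuples, sum_with_exceptions):
        for okay in (True, False):
            secs = timeit.timeit(lambda: func(okay), number=100)
            print("%-20s okay=%-5s %.3f sec" % (func.__name__, okay, secs))
```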
Python Memoisation2004-12-06T12:00:00+01:002004-12-06T12:00:00+01:00Tibor Šimkotag:tiborsimko.org,2004-12-06:/python-memoisation.html<p>If the profiling shows that you call some function a lot of times for
the same arguments, then memoise it. The canonical example is
memoising the intermediate results of the Fibonacci recursive
calculator, presented below. But beware: if you do memoise, better
make sure that you don't eat up the …</p><p>If the profiling shows that you call some function a lot of times for
the same arguments, then memoise it. The canonical example is
memoising the intermediate results of the Fibonacci recursive
calculator, presented below. But beware: if you do memoise, better
make sure that you don't eat up the full memory! Memoising is a
classic speed versus memory dilemma.</p>
<!-- PELICAN_END_SUMMARY --><p>Here is example code showing how memoisation sped up the Fibonacci
calculator from 1.64 to 0.01 seconds:</p>
<div class="highlight"><pre><span></span><span class="ch">#!/usr/bin/python</span>
<span class="sd">"""</span>
<span class="sd">Demonstrate the memoization technique for the Fibonacci calculator.</span>
<span class="sd">Results run on PCDH23 on 2004-12-06 are:</span>
<span class="sd"> $ python memoize-demo.py</span>
<span class="sd"> testing memoization for fib(20)...</span>
<span class="sd"> fib took 1.640 sec</span>
<span class="sd"> fib_memoized took 0.010 sec</span>
<span class="sd"> fib_memoized stats: calls in cache: 117 out of 138 (84.8%)</span>
<span class="sd">"""</span>
<span class="k">class</span> <span class="nc">Memoize</span><span class="p">:</span>
<span class="sd">"""Memoizator. Based on <http://www.norvig.com/python-iaq.html>."""</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">function</span><span class="p">):</span>
<span class="bp">self</span><span class="o">.</span><span class="n">memo</span> <span class="o">=</span> <span class="p">{}</span>
<span class="bp">self</span><span class="o">.</span><span class="n">function</span> <span class="o">=</span> <span class="n">function</span>
<span class="bp">self</span><span class="o">.</span><span class="n">count_calls_total</span> <span class="o">=</span> <span class="mi">0</span>
<span class="bp">self</span><span class="o">.</span><span class="n">count_calls_in_cache</span> <span class="o">=</span> <span class="mi">0</span>
<span class="k">def</span> <span class="fm">__call__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="o">*</span><span class="n">args</span><span class="p">):</span>
<span class="bp">self</span><span class="o">.</span><span class="n">count_calls_total</span> <span class="o">+=</span> <span class="mi">1</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">memo</span><span class="o">.</span><span class="n">has_key</span><span class="p">(</span><span class="n">args</span><span class="p">):</span>
<span class="bp">self</span><span class="o">.</span><span class="n">count_calls_in_cache</span> <span class="o">+=</span> <span class="mi">1</span>
<span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">memo</span><span class="p">[</span><span class="n">args</span><span class="p">]</span>
<span class="k">else</span><span class="p">:</span>
<span class="nb">object</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">memo</span><span class="p">[</span><span class="n">args</span><span class="p">]</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">function</span><span class="p">(</span><span class="o">*</span><span class="n">args</span><span class="p">)</span>
<span class="k">return</span> <span class="nb">object</span>
<span class="k">def</span> <span class="nf">get_stats</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">return</span> <span class="s2">"calls in cache: </span><span class="si">%d</span><span class="s2"> out of </span><span class="si">%d</span><span class="s2"> (</span><span class="si">%.1f%%</span><span class="s2">)"</span> <span class="o">%</span> \
<span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">count_calls_in_cache</span><span class="p">,</span>
<span class="bp">self</span><span class="o">.</span><span class="n">count_calls_total</span><span class="p">,</span>
<span class="bp">self</span><span class="o">.</span><span class="n">count_calls_in_cache</span><span class="o">*</span><span class="mf">100.0</span><span class="o">/</span><span class="bp">self</span><span class="o">.</span><span class="n">count_calls_total</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">fib</span><span class="p">(</span><span class="n">n</span><span class="p">):</span>
<span class="sd">"""Calculate Fibonacci number for N recursively."""</span>
<span class="k">if</span> <span class="n">n</span> <span class="o"><</span> <span class="mi">2</span><span class="p">:</span>
<span class="k">return</span> <span class="mi">1</span>
<span class="k">else</span><span class="p">:</span>
<span class="k">return</span> <span class="n">fib</span><span class="p">(</span><span class="n">n</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span> <span class="o">+</span> <span class="n">fib</span><span class="p">(</span><span class="n">n</span><span class="o">-</span><span class="mi">2</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">fib_memoized</span><span class="p">(</span><span class="n">n</span><span class="p">):</span>
<span class="sd">"""Calculate Fibonacci number for N recursively.</span>
<span class="sd"> Identical to fib() defined above. This one will be memoized later.</span>
<span class="sd"> """</span>
<span class="k">if</span> <span class="n">n</span> <span class="o"><</span> <span class="mi">2</span><span class="p">:</span>
<span class="k">return</span> <span class="mi">1</span>
<span class="k">else</span><span class="p">:</span>
<span class="k">return</span> <span class="n">fib_memoized</span><span class="p">(</span><span class="n">n</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span> <span class="o">+</span> <span class="n">fib_memoized</span><span class="p">(</span><span class="n">n</span><span class="o">-</span><span class="mi">2</span><span class="p">)</span>
<span class="c1"># memoize one of the two identical fib() functions</span>
<span class="n">fib_memoized</span> <span class="o">=</span> <span class="n">Memoize</span><span class="p">(</span><span class="n">fib_memoized</span><span class="p">)</span>
<span class="kn">import</span> <span class="nn">time</span>
<span class="k">def</span> <span class="nf">timing</span><span class="p">(</span><span class="n">f</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">n</span><span class="o">=</span><span class="mi">10</span><span class="p">):</span>
<span class="sd">"""Return timing of function F on argument A run N times.</span>
<span class="sd"> Taken from <http://www.python.org/doc/essays/list2str.html>.</span>
<span class="sd"> """</span>
<span class="n">r</span> <span class="o">=</span> <span class="nb">range</span><span class="p">(</span><span class="n">n</span><span class="p">)</span>
<span class="n">t1</span> <span class="o">=</span> <span class="n">time</span><span class="o">.</span><span class="n">clock</span><span class="p">()</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">r</span><span class="p">:</span>
<span class="n">f</span><span class="p">(</span><span class="n">a</span><span class="p">);</span> <span class="n">f</span><span class="p">(</span><span class="n">a</span><span class="p">);</span> <span class="n">f</span><span class="p">(</span><span class="n">a</span><span class="p">);</span> <span class="n">f</span><span class="p">(</span><span class="n">a</span><span class="p">);</span> <span class="n">f</span><span class="p">(</span><span class="n">a</span><span class="p">);</span> <span class="n">f</span><span class="p">(</span><span class="n">a</span><span class="p">);</span> <span class="n">f</span><span class="p">(</span><span class="n">a</span><span class="p">);</span> <span class="n">f</span><span class="p">(</span><span class="n">a</span><span class="p">);</span> <span class="n">f</span><span class="p">(</span><span class="n">a</span><span class="p">);</span> <span class="n">f</span><span class="p">(</span><span class="n">a</span><span class="p">)</span>
<span class="n">t2</span> <span class="o">=</span> <span class="n">time</span><span class="o">.</span><span class="n">clock</span><span class="p">()</span>
<span class="k">return</span> <span class="nb">round</span><span class="p">(</span><span class="n">t2</span><span class="o">-</span><span class="n">t1</span><span class="p">,</span> <span class="mi">3</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">main</span><span class="p">():</span>
<span class="sd">"""Demonstrate the memoization technique for the Fibonacci calculator."""</span>
<span class="n">n</span> <span class="o">=</span> <span class="mi">20</span>
<span class="k">print</span> <span class="s2">"testing memoization for fib(</span><span class="si">%d</span><span class="s2">)..."</span> <span class="o">%</span> <span class="n">n</span>
<span class="n">fib_time</span> <span class="o">=</span> <span class="n">timing</span><span class="p">(</span><span class="n">fib</span><span class="p">,</span> <span class="n">n</span><span class="p">)</span>
<span class="n">fib_memoized_time</span> <span class="o">=</span> <span class="n">timing</span><span class="p">(</span><span class="n">fib_memoized</span><span class="p">,</span> <span class="n">n</span><span class="p">)</span>
<span class="k">print</span> <span class="s2">"fib took </span><span class="si">%.3f</span><span class="s2"> sec"</span> <span class="o">%</span> <span class="n">fib_time</span>
<span class="k">print</span> <span class="s2">"fib_memoized took </span><span class="si">%.3f</span><span class="s2"> sec"</span> <span class="o">%</span> <span class="n">fib_memoized_time</span>
<span class="k">print</span> <span class="s2">"fib_memoized stats: </span><span class="si">%s</span><span class="s2">"</span> <span class="o">%</span> <span class="n">fib_memoized</span><span class="o">.</span><span class="n">get_stats</span><span class="p">()</span>
<span class="k">return</span>
<span class="k">if</span> <span class="vm">__name__</span> <span class="o">==</span> <span class="s1">'__main__'</span><span class="p">:</span>
<span class="n">main</span><span class="p">()</span>
</pre></div>
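<p>For readers on Python 3, the same idea can be sketched with the
standard library's <tt class="docutils literal">functools.lru_cache</tt> decorator instead of a
hand-written class. This is only an illustration under stated assumptions,
not the original test script: the <tt class="docutils literal">timing()</tt> helper below is simplified
(100 plain calls rather than ten batches of ten), and timings will differ
from the 2004 figures. Note that <tt class="docutils literal">maxsize=None</tt> makes the cache
unbounded; passing a finite <tt class="docutils literal">maxsize</tt> is one way to cap the memory
used.</p>

```python
import functools
import time


def fib(n):
    """Calculate Fibonacci number for n recursively (fib(0) == fib(1) == 1)."""
    if n < 2:
        return 1
    return fib(n - 1) + fib(n - 2)


@functools.lru_cache(maxsize=None)  # unbounded cache; use a finite maxsize to cap memory
def fib_memoized(n):
    """Same recursion as fib(), with results cached by functools.lru_cache."""
    if n < 2:
        return 1
    return fib_memoized(n - 1) + fib_memoized(n - 2)


def timing(f, a, n=100):
    """Return the seconds taken to call f(a) n times (simplified helper)."""
    t1 = time.perf_counter()
    for _ in range(n):
        f(a)
    return time.perf_counter() - t1


if __name__ == "__main__":
    print("fib took %.3f sec" % timing(fib, 20))
    print("fib_memoized took %.3f sec" % timing(fib_memoized, 20))
    # cache_info() reports hits and misses, much like get_stats() above
    print("fib_memoized stats:", fib_memoized.cache_info())
```

<p>Because the recursive calls also go through the cached wrapper, 100
calls of <tt class="docutils literal">fib_memoized(20)</tt> on a fresh cache should give the same
117 hits out of 138 calls as the hand-written version.</p>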
Common Lisp Interpreted vs Compiled2004-12-02T10:00:00+01:002004-12-02T10:00:00+01:00Tibor Šimkotag:tiborsimko.org,2004-12-02:/common-lisp-interpreted-compiled.html<p>Is there a difference between running interpreted and compiled Common
Lisp code?</p>
<!-- PELICAN_END_SUMMARY --><p>Your collaborators told you that:</p>
<blockquote>
I believe that the disappointment is largely related to the
misunderstanding regarding Common LISP compilation method: Common
LISP can NOT be executed interpreted - it is always compiled. [...]
Accordingly, by proclaiming compilation of a …</blockquote><p>Is there a difference between running interpreted and compiled Common
Lisp code?</p>
<!-- PELICAN_END_SUMMARY --><p>Your collaborators told you that:</p>
<blockquote>
I believe that the disappointment is largely related to the
misunderstanding regarding Common LISP compilation method: Common
LISP can NOT be executed interpreted - it is always compiled. [...]
Accordingly, by proclaiming compilation of a file nothing is in fact
accomplished, except that the compiled mirror is saved in a seperate
file. This certainly cannot improve the execution [...]</blockquote>
<p>This is wrong. Common Lisp offers both interpreted and compiled
functions. See the language standard or any other Lisp book for that
matter. (Some Common Lisp implementations may decide to compile
everything though.)</p>
<p>You are using Allegro Common Lisp, so here's a short example with
Allegro CL 6.2 on GNU/Linux, timing a recursive Fibonacci number
calculator for an illustration.</p>
<div class="section" id="fibonacci-interpreted-vs-compiled">
<h2>Fibonacci interpreted vs compiled</h2>
<div class="highlight"><pre><span></span><span class="p">(</span><span class="nb">defun</span> <span class="nv">fib</span> <span class="p">(</span><span class="nv">n</span><span class="p">)</span>
<span class="s">"Calculate Fibonacci number for N recursively."</span>
<span class="p">(</span><span class="k">declare</span> <span class="p">(</span><span class="nb">values</span> <span class="kt">fixnum</span><span class="p">))</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nb"><</span> <span class="nv">n</span> <span class="mi">2</span><span class="p">)</span>
<span class="mi">1</span>
<span class="p">(</span><span class="nb">+</span> <span class="p">(</span><span class="nv">fib</span> <span class="p">(</span><span class="nb">-</span> <span class="nv">n</span> <span class="mi">1</span><span class="p">))</span> <span class="p">(</span><span class="nv">fib</span> <span class="p">(</span><span class="nb">-</span> <span class="nv">n</span> <span class="mi">2</span><span class="p">)))))</span>
</pre></div>
<p>Firstly, fib.lisp running interpreted:</p>
<pre class="literal-block">
$ acl
CL-USER(1): (load "fib.lisp")
; Loading /home/simko/private/work/lang/lisp-tests/fib-test/fib.lisp
T
CL-USER(2): (describe #'fib)
#<Interpreted Function FIB> is a FUNCTION.
The arguments are (N)
CL-USER(3): (time (fib 28))
; cpu time (non-gc) 14,810 msec user, 20 msec system
; cpu time (gc) 3,070 msec user, 0 msec system
; cpu time (total) 17,880 msec user, 20 msec system
; real time 18,046 msec
; space allocation:
; 21,597,594 cons cells, 1,143,644,184 other bytes, 5,224 static bytes
514229
</pre>
<p>Secondly, fib.lisp running compiled:</p>
<pre class="literal-block">
CL-USER(4): (load (compile-file "fib.lisp"))
;;; Compiling file fib.lisp
;;; Writing fasl file fib.fasla16
Warning: No IN-PACKAGE form seen in
/home/simko/private/work/lang/lisp-tests/fib-test/fib.lisp.
(Allegro Presto will be ineffective when loading a file having
no IN-PACKAGE form.)
;;; Fasl write complete
; Fast loading
; /home/simko/private/work/lang/lisp-tests/fib-test/fib.fasla16
T
CL-USER(5): (describe #'fib)
#<Function FIB> is a COMPILED-FUNCTION.
The arguments are (N)
CL-USER(6): (time (fib 28))
; cpu time (non-gc) 20 msec user, 0 msec system
; cpu time (gc) 0 msec user, 0 msec system
; cpu time (total) 20 msec user, 0 msec system
; real time 17 msec
; space allocation:
; 1 cons cell, 0 other bytes, 0 static bytes
514229
</pre>
<p>In other words, <tt class="docutils literal">(load <span class="pre">...)</span></tt> makes the function <em>interpreted</em> and
leads to a runtime of 17,880 ms, while <tt class="docutils literal">(load <span class="pre">(compile-file</span> <span class="pre">...))</span></tt>
makes the function <em>compiled</em> and leads to a runtime of 20 ms, roughly 900 times faster.</p>
</div>
<div class="section" id="general-speedup-tips">
<h2>General speedup tips</h2>
<p>So, if you want to speed up your CL application, then:</p>
<ul class="simple">
<li>make sure you're running compiled code, see <tt class="docutils literal">(describe <span class="pre">...)</span></tt> in the
example above, or better yet use <tt class="docutils literal">(disassemble <span class="pre">...)</span></tt> and check the
machine code (or bytecode) produced by the compiler</li>
<li>declare proper optimization settings, e.g.
<tt class="docutils literal">(declaim (optimize (speed 3) (safety 1) (debug 0) <span class="pre">(compilation-speed</span> 0) (space <span class="pre">0)))</span></tt></li>
<li>profile the code to find weak spots, for example with CMU CL do
<tt class="docutils literal"><span class="pre">(profile:profile-all)</span></tt> before calling your code, and
<tt class="docutils literal"><span class="pre">(profile:report-time)</span></tt> after calling it. This will give you the
list of weak spots, and for each weak spot do...<ul>
<li>... reduce consing as much as possible, think of better algorithms
and data structures</li>
<li>... declare variable types, e.g. a simple string:
<tt class="docutils literal">(declare (type <span class="pre">simple-base-string</span> x))</tt>
or a positive fixnum integer:
<tt class="docutils literal">(declare (type (integer 0 <span class="pre">#.(-</span> <span class="pre">most-positive-fixnum</span> 1)) n))</tt></li>
<li>... read compiler cost hints and make promises to the compiler,
e.g. <tt class="docutils literal">(the fixnum <span class="pre">...)</span></tt> so that it can stay with fixnum
arithmetic</li>
</ul>
</li>
</ul>
<p>With these measures, Common Lisp implementations can usually produce
machine code that runs about as fast as that produced by C++ compilers.</p>
</div>
<div class="section" id="optimisation-anecdote">
<h2>Optimisation anecdote</h2>
<p>BTW, some time ago somebody challenged comp.lang.lisp about Lisp's
presumed slowness with respect to a Coyote Gulch floating-point
benchmark. The claim was that Common Lisp implementations will yield
much slower code than C/C++. The result was that CMUCL and SBCL
produced code that was faster than the GNU C++ reference. The debate
can give you ideas on how to get things optimized in Lisp.</p>
<p>For the full thread, see:
<a class="reference external" href="http://groups.google.com/groups?q=g:thl1632776609d&dq=&hl=en&lr=&ie=UTF-8&selm=165b3efa.0403011540.6edea34c%40posting.google.com&rnum=1">http://groups.google.com/groups?q=g:thl1632776609d&dq=&hl=en&lr=&ie=UTF-8&selm=165b3efa.0403011540.6edea34c%40posting.google.com&rnum=1</a>.</p>
<p>For a concise summary and lessons to retain, read the <a class="reference external" href="http://home.comcast.net/~bc19191/blog/040308.html">story in Bill
Clementson's blog</a>.</p>
</div>
Python Method Call Overhead2004-11-22T12:00:00+01:002004-11-22T12:00:00+01:00Tibor Šimkotag:tiborsimko.org,2004-11-22:/python-method-call-overhead.html<p>Python function calls are expensive. Python object oriented method
calls are even more expensive. How can we estimate method call
overhead?</p>
<!-- PELICAN_END_SUMMARY --><p>By measuring performance of functional redefinition vs performance of
class method calls.</p>
<p>One can find out that the overhead of using the OO method calls over
functional calls seems …</p><p>Python function calls are expensive. Python object oriented method
calls are even more expensive. How can we estimate method call
overhead?</p>
<!-- PELICAN_END_SUMMARY --><p>By measuring performance of functional redefinition vs performance of
class method calls.</p>
<p>It turns out that the overhead of OO method calls over plain
function calls is about <strong>10%</strong>; see the test code below.
Also, while the cost of functional redefinition is practically zero,
one more OO subclass level adds a further ~1%, so that the cost of OO
one-subclass inheritance redefinition over functional redefinition is
about 11%.</p>
<p>Hence, if one chooses OO, it pays to keep the class
hierarchy tree as simple as possible, so that the cost of OO stays at about
10%. In practice, this is often acceptable.</p>
<p>(Note that this does not say much about the slowness of Python OO
method call performance for simple class precedence trees, but rather
about the slowness of the classical Python function call.)</p>
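<p>As a quick check of the percentages quoted above, the arithmetic can
be written out as follows. The timings are copied from the first run
reported in the docstring of the test script; the names
<tt class="docutils literal">TIMINGS</tt> and <tt class="docutils literal">overhead_percent</tt> are made up for this illustration.</p>

```python
# Timings in seconds, copied from the first run reported in the
# redef_test.py docstring (measured in 2004; not re-measured here).
TIMINGS = {"fun": 5.21, "fun_redef": 5.10, "fun_oo": 5.68}


def overhead_percent(slow, fast):
    """Return how much longer `slow` took than `fast`, in percent."""
    return (slow - fast) * 100.0 / fast


if __name__ == "__main__":
    print("OO vs plain function:     %.1f%%"
          % overhead_percent(TIMINGS["fun_oo"], TIMINGS["fun"]))
    print("OO vs redefined function: %.1f%%"
          % overhead_percent(TIMINGS["fun_oo"], TIMINGS["fun_redef"]))
```

<p>The two ratios come out at roughly 9% and 11%, which is where the
"about 10%" and "about 11%" figures come from.</p>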
<p>Here is the <strong>redef_test.py</strong> testing code:</p>
<div class="highlight"><pre><span></span><span class="ch">#!/usr/bin/env python</span>
<span class="sd">"""</span>
<span class="sd">Simple testing of performance cost of functional redefinition vs OO.</span>
<span class="sd">Results on PCDH23 run on 20041118 are:</span>
<span class="sd"> $ ./redef_test</span>
<span class="sd"> testing performance cost of functional redefinition vs OO ...</span>
<span class="sd"> fun 5.21</span>
<span class="sd"> fun_redef 5.1</span>
<span class="sd"> fun_oo 5.68</span>
<span class="sd"> $ ./redef_test</span>
<span class="sd"> testing performance cost of functional redefinition vs OO ...</span>
<span class="sd"> fun 5.21</span>
<span class="sd"> fun_redef 5.11</span>
<span class="sd"> fun_oo 5.66</span>
<span class="sd"> $ ./redef_test</span>
<span class="sd"> testing performance cost of functional redefinition vs OO ...</span>
<span class="sd"> fun 5.19</span>
<span class="sd"> fun_redef 5.1</span>
<span class="sd"> fun_oo 5.67</span>
<span class="sd">This means that the function call cost in case of functional</span>
<span class="sd">redefinition is zero, and in case of OO for a simple one-level class</span>
<span class="sd">hierarchy is 11%. Of course it will be more in case of deeper/complex</span>
<span class="sd">class precedence lists, but I haven't measured how much.</span>
<span class="sd">P.S. On 20041122 fun_oo_subclass was added to confirm that fun_oo and</span>
<span class="sd"> fun_oo_subclass give the same results, as it's mostly due to the</span>
<span class="sd"> OO overhead. One subclass does not add anything big to the</span>
<span class="sd"> overhead; I measured it to be ~1% difference.</span>
<span class="sd">"""</span>
<span class="kn">import</span> <span class="nn">time</span>
<span class="kn">from</span> <span class="nn">redef_test_slave</span> <span class="kn">import</span> <span class="n">fun</span><span class="p">,</span> <span class="n">fun_redef</span><span class="p">,</span> <span class="n">fun_oo</span><span class="p">,</span> <span class="n">fun_oo_subclass</span>
<span class="k">def</span> <span class="nf">timing</span><span class="p">(</span><span class="n">f</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">n</span><span class="o">=</span><span class="mi">1000000</span><span class="p">):</span>
<span class="sd">"""Return timing of function F on argument A run N times.</span>
<span class="sd"> Taken from <http://www.python.org/doc/essays/list2str.html>.</span>
<span class="sd"> """</span>
<span class="k">print</span> <span class="n">f</span><span class="o">.</span><span class="vm">__name__</span><span class="p">,</span>
<span class="n">r</span> <span class="o">=</span> <span class="nb">range</span><span class="p">(</span><span class="n">n</span><span class="p">)</span>
<span class="n">t1</span> <span class="o">=</span> <span class="n">time</span><span class="o">.</span><span class="n">clock</span><span class="p">()</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">r</span><span class="p">:</span>
<span class="n">f</span><span class="p">(</span><span class="n">a</span><span class="p">);</span> <span class="n">f</span><span class="p">(</span><span class="n">a</span><span class="p">);</span> <span class="n">f</span><span class="p">(</span><span class="n">a</span><span class="p">);</span> <span class="n">f</span><span class="p">(</span><span class="n">a</span><span class="p">);</span> <span class="n">f</span><span class="p">(</span><span class="n">a</span><span class="p">);</span> <span class="n">f</span><span class="p">(</span><span class="n">a</span><span class="p">);</span> <span class="n">f</span><span class="p">(</span><span class="n">a</span><span class="p">);</span> <span class="n">f</span><span class="p">(</span><span class="n">a</span><span class="p">);</span> <span class="n">f</span><span class="p">(</span><span class="n">a</span><span class="p">);</span> <span class="n">f</span><span class="p">(</span><span class="n">a</span><span class="p">)</span>
<span class="n">t2</span> <span class="o">=</span> <span class="n">time</span><span class="o">.</span><span class="n">clock</span><span class="p">()</span>
<span class="k">print</span> <span class="nb">round</span><span class="p">(</span><span class="n">t2</span><span class="o">-</span><span class="n">t1</span><span class="p">,</span> <span class="mi">3</span><span class="p">)</span>
<span class="k">return</span>
<span class="k">def</span> <span class="nf">main</span><span class="p">():</span>
<span class="sd">"""Simple testing of performance cost of functional redefinition vs OO.</span>
<span class="sd"> """</span>
<span class="k">print</span> <span class="s2">"testing performance cost of functional redefinition vs OO ..."</span>
<span class="n">timing</span><span class="p">(</span><span class="n">fun</span><span class="p">,</span> <span class="mi">3</span><span class="p">)</span>
<span class="n">timing</span><span class="p">(</span><span class="n">fun_redef</span><span class="p">,</span> <span class="mi">3</span><span class="p">)</span>
<span class="n">timing</span><span class="p">(</span><span class="n">fun_oo</span><span class="p">()</span><span class="o">.</span><span class="n">fun_oo</span><span class="p">,</span> <span class="mi">3</span><span class="p">)</span>
<span class="n">timing</span><span class="p">(</span><span class="n">fun_oo_subclass</span><span class="p">()</span><span class="o">.</span><span class="n">fun_oo_subclass</span><span class="p">,</span> <span class="mi">3</span><span class="p">)</span>
<span class="k">return</span>
<span class="k">if</span> <span class="vm">__name__</span> <span class="o">==</span> <span class="s1">'__main__'</span><span class="p">:</span>
<span class="n">main</span><span class="p">()</span>
</pre></div>
<p>Here is <strong>redef_test_slave.py</strong>:</p>
<div class="highlight"><pre><span></span><span class="ch">#!/usr/bin/env python</span>
<span class="k">def</span> <span class="nf">fun</span><span class="p">(</span><span class="n">x</span><span class="p">):</span>
<span class="k">return</span>
<span class="k">def</span> <span class="nf">fun_redef</span><span class="p">(</span><span class="n">x</span><span class="p">):</span>
<span class="kn">import</span> <span class="nn">sys</span>
<span class="n">sys</span><span class="o">.</span><span class="n">exit</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">fun_redef</span><span class="p">(</span><span class="n">x</span><span class="p">):</span>
<span class="k">return</span>
<span class="k">class</span> <span class="nc">fun_oo</span><span class="p">:</span>
<span class="k">def</span> <span class="nf">fun_oo</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">x</span><span class="p">):</span>
<span class="k">return</span>
<span class="k">def</span> <span class="nf">fun_oo_subclass</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">x</span><span class="p">):</span>
<span class="kn">import</span> <span class="nn">sys</span>
<span class="n">sys</span><span class="o">.</span><span class="n">exit</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>
<span class="k">class</span> <span class="nc">fun_oo_subclass</span><span class="p">(</span><span class="n">fun_oo</span><span class="p">):</span>
<span class="k">def</span> <span class="nf">fun_oo_subclass</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">x</span><span class="p">):</span>
<span class="k">return</span>
</pre></div>
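<p>On a current Python 3 interpreter the measurement could be repeated
with the standard <tt class="docutils literal">timeit</tt> module, along the lines of the sketch below.
It is not the original test harness: the class names
<tt class="docutils literal">FunOO</tt> and <tt class="docutils literal">FunOOSubclass</tt> and the
<tt class="docutils literal">measure()</tt> helper are assumptions made for illustration, and the
ratios on modern interpreters may well differ from the 2004 results.</p>

```python
import timeit


def fun(x):
    return


class FunOO:
    def fun_oo(self, x):
        return


class FunOOSubclass(FunOO):
    def fun_oo_subclass(self, x):
        return


def measure(f, number=200000, repeat=3):
    """Return the best-of-`repeat` time in seconds for `number` calls of f(3)."""
    # The callable is handed to timeit via `globals`, so no extra lambda
    # call is added to the measured statement.
    return min(timeit.repeat("f(3)", globals={"f": f},
                             number=number, repeat=repeat))


if __name__ == "__main__":
    base = measure(fun)
    for name, f in [("fun", fun),
                    ("fun_oo", FunOO().fun_oo),
                    ("fun_oo_subclass", FunOOSubclass().fun_oo_subclass)]:
        t = measure(f)
        print("%-16s %.3f sec (%+.1f%% vs fun)" % (name, t, (t - base) * 100.0 / base))
```

<p>As in the original script, the bound method is looked up once before
timing, so only the call itself is measured, not the attribute
lookup.</p>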
Arthur Bolstein or Science Reporting in Popular Newspapers2004-06-15T00:42:00+02:002004-06-15T00:42:00+02:00Tibor Šimkotag:tiborsimko.org,2004-06-15:/arthur-bolstein.html<p>When reading newspapers, are you choosing your sources well?</p>
<!-- PELICAN_END_SUMMARY --><p>Years ago I happened to read an interview in a popular Slovak weekly
magazine. The interview was with Arthur Bolčo aka Arthur Bolstein (an
apparent wordplay on Albert Einstein) who claimed to have
mathematically proven the theory of relativity to …</p><p>When reading newspapers, are you choosing your sources well?</p>
<!-- PELICAN_END_SUMMARY --><p>Years ago I happened to read an interview in a popular Slovak weekly
magazine. The interview was with Arthur Bolčo aka Arthur Bolstein (an
apparent wordplay on Albert Einstein) who claimed to have
mathematically proven the theory of relativity to be false. He wrote
a book called "Ordinary Failure of an Extraordinary Theory" about
the proof. To get a feeling for how the newspaper reporting roughly
went, here is a similar <a class="reference external" href="http://spectator.sme.sk/articles/view/4813">The Slovak Spectator article</a> on Arthur Bolstein (in
English). A mix of serious and somewhat light tone, one might say.</p>
<p>Now this is not very surprising. There are people claiming to have
proven or disproven a famous theory every now and again. It is not
surprising either that such people find willing publishers or that they
self-publish. And it is not surprising either that a
self-made-genius-causing-a-scientific-revolution story would be a topic
attractive to popular newspapers.</p>
<p>What may be more surprising, though, is that several years before the
newspaper interview, Mr. Bolstein received a 1999 anti-prize called
<a class="reference external" href="http://cs.wikipedia.org/wiki/Seznam_nositel%C5%AF_Bludn%C3%A9ho_balvanu">"Bludný balvan"</a>
("Erratic Boulder") awarded by the Czech Sceptics Club <a class="reference external" href="http://sysifos.cz/">"Sisyfos"</a>, a part of the world-wide <a class="reference external" href="http://www.skeptic.com/">The Skeptics Society</a> movement that aims at debunking
pseudoscience. Apparently Mr. Bolstein's proof contains an error very early
on... If you read Czech, see <a class="reference external" href="http://www.vesmir.cz/clanek/arthur-bolstein-obycejne-selhani-jedne-neobycejne-teorie">a review</a>
by Professor Vopěnka in the popular science magazine Vesmír ("The
Universe").[1]</p>
<p>The journalists in The Slovak Spectator and elsewhere omitted to
mention this interesting news bit. Perhaps they did not run a web
search prior to publishing their articles; perhaps they did but did
not consider it worthy of mention; perhaps they did not have time to
ask the traditional "scientific establishment" for a counter opinion;
I don't know. What I do know are the feelings of bitterness or
sadness that remained; I think society deserves better science
coverage than that.</p>
<p>The moral of the story? How are you choosing the newspapers you
read?</p>
<p><center>* * *</center></p><p>[1] And, to close the loop, let me add that later, in 2006, the
popular science magazine Vesmír itself received an "Erratic Boulder"
prize! This was for a completely unrelated matter, a story about a
magnetokinesis humidity drying device; see the <a class="reference external" href="http://www.sysifos.cz/index.php?id=vypis&sec=1173691336">award text</a> (in
Czech).</p>
Fun with Phonetics2004-06-14T18:04:00+02:002004-06-14T18:04:00+02:00Tibor Šimkotag:tiborsimko.org,2004-06-14:/fun-with-phonetics.html<p>Sound-play is another sort of word-play used in creating product
names.</p>
<!-- PELICAN_END_SUMMARY --><p>Coincidentally I was reading Steven Pinker's "The Language Instinct"
book today where he relates an amusing anecdote:</p>
<blockquote>
George Bernard Shaw led a vigorous campaign to reform the English
alphabet, a system so illogical, he said, that it could spell …</blockquote><p>Sound-play is another sort of word-play used in creating product
names.</p>
<!-- PELICAN_END_SUMMARY --><p>Coincidentally I was reading Steven Pinker's "The Language Instinct"
book today where he relates an amusing anecdote:</p>
<blockquote>
George Bernard Shaw led a vigorous campaign to reform the English
alphabet, a system so illogical, he said, that it could spell
"fish" as "ghoti" --- "gh" as in "tough", "o" as in "women", "ti"
as in "nation". ("Mnomnoupte" for "minute" and "mnopspteiche" for
"mistake" are other examples.) In his will Shaw bequeathed a cash
prize to be awarded to the designer of a replacement alphabet for
English, in which each sound in the spoken language would be
recognizable by a single symbol. [...]</blockquote>
<p>(I would add that founders of the Slovak language codification process
in the 18th and 19th centuries, such as Anton Bernolák and Ľudovít
Štúr, used precisely the phonological rule "write-as-you-hear" as one
of the founding principles behind the codification of various Slovak
dialects into the common modern language.)</p>
<p><center>* * *</center></p><p>We don't have to go far for examples of phonetic fun within software
program names. The Free Software Foundation's <a class="reference external" href="http://www.gnu.org/">GNU project</a> sure bears the logo of the gnu animal, and is
an acronym of "GNU's Not Unix", and is pronounced "guh-new";
nonetheless the dictionaries say that the leading "g" in the word
"gnu" can be silent, in which case the pronunciation goes like "new",
referring to a new free way of software making-and-sharing. An idea
taken by my favourite mail/news reader, <a class="reference external" href="http://gnus.org/">Gnus</a>,
which -- with a silent leading "g" -- gets pronounced akin to "news",
a very appropriate name for a news reader.</p>
<p>Much fun is to be had with acronyms and/or phonetics.</p>
<p>P.S. See also <a class="reference external" href="http://www.spellingsociety.org/news/media/poems.php">Poems showing the absurdities of English spelling</a>.</p>
Acquired and Innate2004-05-16T00:00:00+02:002004-05-16T00:00:00+02:00Tibor Šimkotag:tiborsimko.org,2004-05-16:/acquis-et-inne.html<p>Regarding our recent debate on nurture versus nature in the
domain of intelligence, here are the information and sources that
I mentioned.</p>
<!-- PELICAN_END_SUMMARY --><p>One of the strongest lines of evidence in this domain is the study of
identical (monozygotic) twins. As their genes are 100% identical, and
as they are sometimes separated after …</p><p>Regarding our recent debate on nurture versus nature in the
domain of intelligence, here are the information and sources that
I mentioned.</p>
<!-- PELICAN_END_SUMMARY --><p>One of the strongest lines of evidence in this domain is the study of
identical (monozygotic) twins. As their genes are 100% identical, and
as they are sometimes separated after birth, they
represent a model case for research on the influence of the
environment as compared to the influence of heredity.</p>
<p>But let us first see what Encyclopaedia Britannica says about
the heritability of intelligence:</p>
<blockquote>
Intelligence is a very complex human trait, the genetics of which
has been a subject of controversy for some time. [...] Even roughly
measured as IQ, intelligence shows a strong contribution from the
environment. Fraternal twins, however, show relatively great
dissimilarity in IQ, suggesting an important contribution from
heredity as well. In fact, it has been estimated that on the
average between 60 and 80 percent of the variance in IQ test scores
could be genetic. It is important to note that intelligence is
polygenically inherited and that it has the highest degree of
assortative mating of any trait; in other words, people tend to mate
with people having similar IQ's.</blockquote>
<p>So the consensus of scientific research seems to be that 60-80
percent of IQ is due to the genetic contribution.</p>
<p>Here is an example of such a study, led by Thomas Bouchard of the
University of Minnesota from 1979 onwards, cited by Matt Ridley
in his book "Genome". (I only have a Czech translation, which I
shall hastily translate here.) They studied the
correlation between IQ test results among tens of thousands of
men and women, mostly white and middle-class, and
found the following: [0% = no correlation/no link, 100% =
perfect correlation/total identity]</p>
<pre class="literal-block">
same person tested twice ................ 87%
monozygotic twins reared together ....... 86%
monozygotic twins reared apart .......... 76%
dizygotic twins reared together ......... 55%
biological siblings ..................... 47%
parents and children living together .... 40%
parents and children living apart ....... 31%
adopted children living together ........  0%
unrelated people living apart ...........  0%
</pre>
<p>Several interesting things can be seen here. (1) First of all, the
results of monozygotic twins living within the same family
are practically indistinguishable from the results of the same person
taking the test several times. (2) Even if the twins live
apart, the correlation is still very high, indicating that the
family is not a decisive factor here. This result alone is
very hard, if not impossible, to explain for people who believe
only in the thesis of influence by environment alone, without invoking
the influence of heredity. Note that the correlations in the
behaviour of monozygotic twins do not stop at IQ, but
extend to such personality traits as opinions on
apartheid and on working mothers, the choice of career,
hobbies, the way of going swimming, etc., as cited by Steven
Pinker in his "How the Mind Works". But I would not like to
dwell on this subject today, for the numbers speak for
themselves. (3) If we look at the results for adopted children, we
may even conclude that "living in the same family has no
influence on IQ"! (4) The higher results for dizygotic
twins as compared to biological siblings also suggest
a strong importance of prenatal life in the womb.</p>
<p>In short and roughly speaking, the conclusions were the following:</p>
<ul class="simple">
<li>~50% of IQ is hereditary;</li>
<li>less than 20% of IQ comes from family life;</li>
<li>the rest comes from prenatal life, school, friends, etc.</li>
</ul>
<p>(Important note: one should not forget that the people tested here
came from roughly the same environment: middle class, white
race. "When a study was made on adoptions of children of
another race, a weak correlation of 19% was detected between
the IQ of the parents and the IQ of the adopted children". Other
authors are more precise still. I shall come back to this
towards the end.)</p>
<p>But there is more. The correlation grows with the age of the children
rather than diminishing. This goes against popular belief, for with
time, despite studies and life experience, the
preponderance of heredity <em>increases</em> rather than decreases (children 45%,
young men 75%). This shows that the influence of genes does not
"stop" at birth and that the influence of other
people on us diminishes with time. The more we grow up, the more
we choose our environment to suit our innate faculties,
rather than adapting our innate faculties to the environment in which we
are placed. One may consider this a sort of genetic emergency
brake that protects us against possible influences of
a hostile environment, such as overly possessive guardians or an
overly brainwashing totalitarian state, etc.</p>
<p>But there is more still. Even if 50% of IQ is determined by
genes, the figures above demonstrate the capital importance of
prenatal life, the developmental phase. Ridley even says that "the
events that happened to us in the womb have an
influence on our intelligence three times greater than anything
our parents could do with us after birth". Where are these
genes? Certainly there is no single "gene for intelligence";
rather, the influence is due to many genes in their collective
and complicated work. Studies have shown a certain influence
of "direct genes" that code for the proteins helping to burn
glucose in the brain during learning. But there is also
the influence of "indirect genes" that help to react against stress
and toxins during our prenatal life. In fact, it has been shown
that the IQ value correlates with a more or less harmonious
development in the mother's womb. In other words, the lower the
stress of prenatal growth, the higher the IQ.
Hence the importance of the indirect genes that "master" stress or
toxins during this stage of development. But these indirect
genes can only function <em>with</em> the surrounding
environment, not all alone of their own accord! So, in sum:</p>
<blockquote>
"One does not inherit IQ, but only the capacity to acquire a
certain IQ in a certain environment. How then to separate
biology and environment? Frankly, this is not
possible."</blockquote>
<p>This has led us to the heart of the innate-versus-acquired dispute: the
two contributions are inseparable. Steven Pinker says in his "How
the Mind Works":</p>
<blockquote>
If the mind has a complex innate structure, that does <em>not</em> mean
that learning is unimportant. Framing the issue in such a way
that innate structure and learning are pitted against each other,
either as alternatives or, almost as bad, as complementary
ingredients or interacting forces, is a colossal mistake. [...]
Yes, every part of human intelligence involves culture and
learning. But learning is not a surrounding gas or force field,
and it does not happen by magic. It is made possible by innate
machinery designed to do the learning.</blockquote>
<p>Matt Ridley concluded his chapter on intelligence as follows:</p>
<blockquote>
After two million years of our culture, during which our ancestors
passed on acquired local traditions, the human brain
could acquire (by evolution) the capacity to seek out and
put to work the skills that were taught by the local
culture and in which the individual excelled. The environment
in which a child grows up is as much the result of his
genes as of external factors; the child seeks out and shapes his
own environment. If he has technical interests, he
will develop his mechanical skills; if he likes reading, he
will seek out books. Genes can create tastes, not
aptitudes. After all, the high heritability of myopia is
caused not only by the heredity of eye shape, but by
the heredity of reading habits. The heredity of
intelligence may then be in equal measure a question of
environment and a question of biology. This is a
very satisfying conclusion for the 100 years of disputes
started by Galton.</blockquote>
<p>The principle that a child or a person shapes his own environment
has been known since the 1890s as "the Baldwin effect" and says roughly
that "people specialise in what they are good at, and
in doing so they create the conditions that favour their
genes", as Matt Ridley rephrased it in his other book "The Red
Queen. Sex and the Evolution of Human Nature".</p>
<p>As a conclusion, I would like to repeat that:</p>
<blockquote>
One does not inherit IQ, but only the capacity to acquire a
certain IQ in a certain environment.</blockquote>
<p>Let us now briefly return to the philosophical interpretations.
The correlation tests mentioned above assumed a
roughly similar environment (white race, middle class).
Other studies with different races and poor/rich
classes found other figures and higher
correlations (more than the 19% cited above). In a favourable
environment one can develop these capacities; in a less
favourable or downright hostile environment one cannot. (It is like with
people's height; this trait of appearance is highly hereditary
(much more so than weight, for example), but in the past
food was scarce or not nutritious enough, so the majority
of people did not reach their "genetic height", which is
no longer observed in today's world.) The more or less
favourable environment is the key to avoiding the trap of "biological
fatality". As Stephen Jay Gould said:</p>
<blockquote>
The error of those who believe in heredity does not lie in
the claim that IQ is "heritable" to a certain
degree, but in the fact that they put an equals sign
between "heritable" and "inevitable".</blockquote>
<p>Genes do not purely predetermine anything; they only express the
possibilities and capacities that we may or may not develop or
attain in the reality of our environment. An easy example
could be the medical problem of the sensitivity of the metabolism to
lactose in certain people. This lactose sensitivity
is indeed genetic, but we are quite free to
decide either to consume quantities of milk and thus
let this gene express itself through medical problems, or not to
consume any and let this gene "sleep".</p>
<p>As shown above, research seems to say that IQ has
a strong genetic influence (~50%) and a strong environmental
influence (~50%), that the two cannot be dissociated, and that
one simply cannot speak of one or the other in
isolation. This seems to me very fortunate from the philosophical point of view, for
we are guided: (1) away from the myth of pure genetic determination,
which could lead to eugenics and other racisms of those who
have, for example, forgotten that differences in IQ within races are
much greater than any differences in average IQ
between races; but also (2) away from the myth of pure
environmental determination and of the "tabula rasa", which could lead either
to fatalism or to the totalitarian world of Big Brother with
its totalitarian education for forming good citizens of the fatherland
by drilling the brains of little children. As
Matt Ridley put it so well in his "The Red Queen. Sex and the Evolution of Human
Nature":</p>
<blockquote>
Cultural or environmental determinism of the kind to which
most sociologists adhere would be as cruel, and its credo
as frightening, as the biological determinism that it
attacks. The truth is, thank goodness, that we are an
indivisible and flexible mixture of the two. We are the product
of our genes only in the sense that genetic influences
develop and take shape under the influence of experience,
exactly as an eye learns to make out shapes or
the mind learns vocabulary. And we are the product of
the cultural environment only insofar as our brains
learn something from that environment.</blockquote>
<p>Acquired with innate, together, working hard to learn something
from our environment!</p>
<p>P.S. The sources cited above:</p>
<ul class="simple">
<li>Matt Ridley: "Genome"</li>
<li>Matt Ridley: "The Red Queen. Sex and the Evolution of Human Nature"</li>
<li>Steven Pinker: "How the Mind Works"</li>
</ul>
<p>(These authors write in a rather captivating, clear, and
amusing manner. Even if you do not agree with them, I can
only recommend them. Solid food for thought for
critical reading.)</p>
Software RAID2004-03-31T12:00:00+02:002004-03-31T12:00:00+02:00Tibor Šimkotag:tiborsimko.org,2004-03-31:/software-raid.html<p>How to install, monitor and repair Software RAID on Debian GNU/Linux.</p>
<!-- PELICAN_END_SUMMARY --><div class="section" id="installing-software-raid-on-debian-gnu-linux">
<h2>Installing Software RAID on Debian GNU/Linux</h2>
<p>Care must be taken when installing Software RAID 1 on Debian/woody
onto the boot partition. One of the best recent guides is written by
Marcus Schoppen and is …</p></div><p>How to install, monitor and repair Software RAID on Debian GNU/Linux.</p>
<!-- PELICAN_END_SUMMARY --><div class="section" id="installing-software-raid-on-debian-gnu-linux">
<h2>Installing Software RAID on Debian GNU/Linux</h2>
<p>Care must be taken when installing Software RAID 1 on Debian/woody
onto the boot partition. One of the best recent guides is written by
Marcus Schoppen and is available at
<a class="reference external" href="http://wwwhomes.uni-bielefeld.de/schoppa/raid/woody-raid-howto.html">http://wwwhomes.uni-bielefeld.de/schoppa/raid/woody-raid-howto.html</a>.
With small changes, you can follow his procedure and install Software
RAID 1 onto a non-RAID system remotely.</p>
<p>Some comments to his guide:</p>
<ul>
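<li><p class="first">in step 1, you had better do this:</p>
<pre class="literal-block">
$ sfdisk -d /dev/sda > partitions.sda          # dump the partition table of the first disk
$ cp -a partitions.sda partitions.sdb
$ perl -pi -e 's,/sda,/sdb,g' partitions.sdb   # rename the devices for the second disk
$ sfdisk /dev/sdb < partitions.sdb             # apply the same layout to the second disk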
</pre>
</li>
<li><p class="first">in step 7, you had better do this:</p>
<pre class="literal-block">
$ mount -v /dev/md0 /mnt # let's start with md0
$ cd / # since md0=/, see note below
$ find . -xdev | cpio -pm /mnt
$ umount /mnt
</pre>
</li>
</ul>
<p>for each filesystem (md0=/, md1=/var, md2=/tmp, ...). But beware:
<tt class="docutils literal">cpio</tt> and <tt class="docutils literal">mirrordir</tt> do not work for files greater than 2GB! You have to
use <tt class="docutils literal">cp</tt> for those.</p>
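<p>To know in advance which files need the <tt class="docutils literal">cp</tt> treatment, one could list the large files first. The following is only a minimal Python sketch, not part of the guide referred to above: the helper name <tt class="docutils literal">find_large_files</tt> and the exact 2 GiB threshold are assumptions made for illustration. Like <tt class="docutils literal">find <span class="pre">-xdev</span></tt>, it stays on the filesystem it started from.</p>

```python
import os
import stat

# Assumed threshold: 2 GiB, standing in for the "2GB" limit mentioned above.
TWO_GIB = 2 * 1024 ** 3


def find_large_files(root, limit=TWO_GIB):
    """Return sorted paths of regular files under root with size >= limit.

    Directories living on another filesystem are skipped, in the spirit
    of "find -xdev".
    """
    root_dev = os.lstat(root).st_dev
    large = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune mount points so that os.walk does not cross filesystems.
        dirnames[:] = [
            d for d in dirnames
            if os.lstat(os.path.join(dirpath, d)).st_dev == root_dev
        ]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; ignore it
            if stat.S_ISREG(info.st_mode) and info.st_size >= limit:
                large.append(path)
    return sorted(large)

# Hypothetical usage, run from the root of the filesystem being copied:
#     for path in find_large_files("."):
#         print(path)
```

<p>The resulting list can then be copied by hand with <tt class="docutils literal">cp</tt> after the <tt class="docutils literal">cpio</tt> pass has finished.</p>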
<ul>
<li><p class="first">in step 9, it is not necessary to make a boot floppy if you do:</p>
<pre class="literal-block">
$ cp /etc/lilo.conf /tmp # to keep "good" lilo.conf handy
$ vi /tmp/lilo.conf # and put there root arg, like this:
# image=/boot/....
# label=Linux
# root=/dev/md0
# read-only
$ raidstop /dev/md0 # otherwise may have problems
$ raidstop /dev/md1 # stop raid for each mdX filesystem
$ lilo -C /tmp/lilo.conf
$ reboot # FIRST REBOOT if you started from RAID-capable kernel
</pre>
</li>
</ul>
<p>Then the installation can be fully remote. (tested)</p>
<ul>
<li><p class="first">in step 11, do not put the <tt class="docutils literal">partition</tt> argument in <tt class="docutils literal">lilo.conf</tt>.
An alternative working configuration is as follows:</p>
<pre class="literal-block">
$ cat lilo.conf
lba32
restricted
boot=/dev/md0
root=/dev/md0
install=/boot/boot-menu.b
map=/boot/map
password=foobar
delay=20
vga=normal
raid-extra-boot="/dev/sda,/dev/sdb"
default=Linux
image=/vmlinuz
label=Linux
read-only
image=/vmlinuz.old
label=LinuxOLD
read-only
optional
</pre>
</li>
</ul>
<p>After step 11 is done, do SECOND (and final) REBOOT. You are done.</p>
</div>
<div class="section" id="software-raid-runtime-monitoring">
<h2>Software RAID Runtime Monitoring</h2>
<p>All our Linux servers run Software RAID-1 disks. So beware in case of
failure. A command to check the status of the RAID array is:</p>
<pre class="literal-block">
$ cat /proc/mdstat
</pre>
<p>and should show "[UU]" for each volume when everything is fine:</p>
<pre class="literal-block">
$ cat /proc/mdstat
Personalities : [raid1]
read_ahead 1024 sectors
md0 : active raid1 sdb1[1] sda1[0]
96256 blocks [2/2] [UU]
md1 : active raid1 sdb2[1] sda2[0]
995904 blocks [2/2] [UU]
md2 : active raid1 sdb5[1] sda5[0]
586240 blocks [2/2] [UU]
md3 : active raid1 sdb6[1] sda6[0]
995904 blocks [2/2] [UU]
md4 : active raid1 sdb7[1] sda7[0]
2931712 blocks [2/2] [UU]
md5 : active raid1 sdb8[1] sda8[0]
5855552 blocks [2/2] [UU]
md6 : active raid1 sdb9[1] sda9[0]
6457984 blocks [2/2] [UU]
unused devices: <none>
</pre>
<p>If this is not the case, read the section on repairing below.</p>
<p>Note that our machines usually run the <tt class="docutils literal">mdadm</tt> daemon that
periodically scans the health status of RAID devices and that alerts
<tt class="docutils literal">root</tt> by email in case it spots something wrong.</p>
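<p>For an ad-hoc check without the daemon, the <tt class="docutils literal">[UU]</tt> test described above can also be scripted. The following Python sketch is only an illustration under stated assumptions: the function name <tt class="docutils literal">degraded_arrays</tt> is made up here, and the parsing relies solely on the <tt class="docutils literal">/proc/mdstat</tt> layout shown in the listing above.</p>

```python
import re


def degraded_arrays(mdstat_text):
    """Map md device name -> status string (e.g. 'U_') for degraded arrays.

    An array counts as degraded when its status field, such as [UU] or
    [U_], contains at least one underscore.
    """
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"\s*(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # Matches [UU] or [U_], but neither [2/2] nor a recovery bar.
        status = re.search(r"\[([U_]+)\]", line)
        if current and status:
            if "_" in status.group(1):
                degraded[current] = status.group(1)
            current = None
    return degraded

# Hypothetical usage on a live system:
#     with open("/proc/mdstat") as f:
#         print(degraded_arrays(f.read()))
```

<p>An empty dictionary means that every array reported all of its members as up.</p>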
</div>
<div class="section" id="software-raid-repairing">
<h2>Software RAID Repairing</h2>
<p>How to repair a degraded RAID device? If you do:</p>
<pre class="literal-block">
$ cat /proc/mdstat
</pre>
<p>and you see a line containing <tt class="docutils literal">U_</tt> such as:</p>
<pre class="literal-block">
md3 : active raid1 sdb6[1] sda6[0]
979840 blocks [2/2] [UU]
md4 : active raid1 sda7[0]
2931712 blocks [2/1] [U_]
</pre>
<p>then it means that <tt class="docutils literal">md4</tt> is running in a degraded mode and that <tt class="docutils literal">sdb7</tt>
has crashed.</p>
<p>Firstly you should check whether the disk is physically okay. Look
into <tt class="docutils literal">/var/log/messages</tt> and search for lines like:</p>
<pre class="literal-block">
$ sudo grep I/O /var/log/messages
Sep 15 02:32:06 pcwebc00 kernel: I/O error: dev 08:21, sector 139017744
Sep 15 02:32:32 pcwebc00 kernel: I/O error: dev 08:21, sector 139017752
</pre>
<p>If you see this, then the disk should be physically replaced before
continuing, and repartitioned exactly like the old one or the one it
is going to mirror.</p>
<p>(Sometimes the system can detect the disk as faulty and will mark it
as <tt class="docutils literal">(F)</tt> in <tt class="docutils literal">/proc/mdstat</tt> output, for example:</p>
<pre class="literal-block">
$ sudo cat /proc/mdstat
[...]
md6 : active raid1 sdb9[1](F) sda9[0]
12329792 blocks [2/1] [U_]
</pre>
<p>and you can double-check that <tt class="docutils literal">/var/log/messages</tt> indeed indicates an
I/O error:</p>
<pre class="literal-block">
Jan 4 00:16:26 pcdh90 kernel: scsi2: ERROR on channel 0, id 1, lun 0, CDB: Read (10) 00 01 a8 d9 ef 00 00 50 00
Jan 4 00:16:28 pcdh90 kernel: Info fld=0x1a8da02, Current sd08:19: sense key Medium Error
Jan 4 00:16:28 pcdh90 kernel: Additional sense indicates Unrecovered read error
Jan 4 00:16:28 pcdh90 kernel: I/O error: dev 08:19, sector 16661768
</pre>
<p>asking for physical disk examination, as stated above.)</p>
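<p>The <tt class="docutils literal">dev 08:19</tt> notation in these kernel messages can be mapped back to a partition name. The Python sketch below assumes that the two numbers are the hexadecimal major and minor device numbers, and it handles only SCSI disk major 8, where each disk owns 16 consecutive minors; the function names are invented for this illustration. Under these assumptions <tt class="docutils literal">08:19</tt> decodes to <tt class="docutils literal">sdb9</tt>, which agrees with the <tt class="docutils literal">sdb9[1](F)</tt> entry shown above.</p>

```python
import re

SCSI_DISK_MAJOR = 8    # first block of SCSI disks: sda .. sdp
MINORS_PER_DISK = 16   # minor 0 is the whole disk, 1..15 its partitions


def decode_kernel_dev(dev):
    """Turn a hexadecimal 'major:minor' pair such as '08:19' into 'sdb9'."""
    major_hex, minor_hex = dev.split(":")
    major, minor = int(major_hex, 16), int(minor_hex, 16)
    if major != SCSI_DISK_MAJOR:
        raise ValueError("only SCSI disk major 8 is handled in this sketch")
    disk, partition = divmod(minor, MINORS_PER_DISK)
    name = "sd" + chr(ord("a") + disk)
    return name + str(partition) if partition else name


def failing_devices(log_text):
    """Return the sorted device names mentioned in 'I/O error: dev ..' lines."""
    pairs = re.findall(
        r"I/O error: dev ([0-9a-fA-F]{2}:[0-9a-fA-F]{2})", log_text)
    return sorted({decode_kernel_dev(pair) for pair in pairs})
```

<p>Feeding it the relevant part of <tt class="docutils literal">/var/log/messages</tt> gives a short list of partitions whose disks deserve a physical examination.</p>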
<p>If you don't see any symptoms of a disk failure, then you may repair
the RAID device onto the same disk and onto the same partition.</p>
<p>To repair the RAID <tt class="docutils literal">/dev/md4</tt> of the example above, do:</p>
<pre class="literal-block">
$ sudo raidhotadd /dev/md4 /dev/sdb7
</pre>
<p>or, if you use <tt class="docutils literal">mdadm</tt> instead of <tt class="docutils literal">raidtools</tt>, like this:</p>
<pre class="literal-block">
$ sudo mdadm /dev/md4 -a /dev/sdb7
</pre>
<p>and watch the progress:</p>
<pre class="literal-block">
$ cat /proc/mdstat
md4 : active raid1 sdb7[2] sda7[0]
2931712 blocks [2/1] [U_]
[====>................] recovery = 21.4% (629440/2931712) finish=1.5min speed=25177K/sec
</pre>
<p>After a while, the RAID should be repaired:</p>
<pre class="literal-block">
$ cat /proc/mdstat
md4 : active raid1 sdb7[1] sda7[0]
2931712 blocks [2/2] [UU]
</pre>
<p>You are done.</p>
</div>
Python Psyco2003-09-23T12:00:00+02:002003-09-23T12:00:00+02:00Tibor Šimkotag:tiborsimko.org,2003-09-23:/python-psyco.html<p>A simple (and sometimes <em>very</em> efficient) way to speed up your Python
programs is via the <a class="reference external" href="http://psyco.sourceforge.net/">Psyco</a> module.
But beware, Psyco only runs on 32-bit OSes.</p>
<!-- PELICAN_END_SUMMARY --><p>Basic Psyco usage is very simple: just do:</p>
<div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">psyco</span>
</pre></div>
<p>and later:</p>
<div class="highlight"><pre><span></span><span class="n">psyco</span><span class="o">.</span><span class="n">bind</span><span class="p">(</span><span class="n">my_slow_function</span><span class="p">)</span>
</pre></div>
<p>for each function/class you want to speed up …</p><p>A simple (and sometimes <em>very</em> efficient) way to speed up your Python
programs is via the <a class="reference external" href="http://psyco.sourceforge.net/">Psyco</a> module.
But beware, Psyco only runs on 32-bit OSes.</p>
<!-- PELICAN_END_SUMMARY --><p>Basic Psyco usage is very simple: just do:</p>
<div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">psyco</span>
</pre></div>
<p>and later:</p>
<div class="highlight"><pre><span></span><span class="n">psyco</span><span class="o">.</span><span class="n">bind</span><span class="p">(</span><span class="n">my_slow_function</span><span class="p">)</span>
</pre></div>
<p>for each function/class you want to speed up. Benefits depend a lot
on the nature of your problem: Psyco helps mainly with simple code
logic and loops like <tt class="docutils literal">for i in foo</tt>, and does not help much (or may
even slow things down) with complex code logic.</p>
<div class="section" id="simple-code-logic-example">
<h2>Simple code logic example</h2>
<p>In my experience, plain Python is typically 5x to 50x slower than
Common Lisp (CMUCL). Similar numbers are cited in Peter Norvig's
<a class="reference external" href="http://www.norvig.com/python-lisp.html">Python for Lisp Programmers</a>. With simple code logic
such as many <tt class="docutils literal">for</tt> loops, Psyco can provide a very significant speed
improvement (2x-20x, in my experience), bringing the difference
between Python and Common Lisp down to the same order of magnitude,
say 2x-5x. For example, for the Fibonacci numbers program from the
<a class="reference external" href="http://www.bagley.org/~doug/shootout/bench/fibo/">Great Computer Language Shootout</a>, if I add to the
Python source:</p>
<div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">psyco</span>
<span class="n">psyco</span><span class="o">.</span><span class="n">bind</span><span class="p">(</span><span class="n">fib</span><span class="p">)</span>
</pre></div>
<p>I obtain, for fib(37)=39088169, practically the CMUCL time:</p>
<pre class="literal-block">
Python raw ....... 77.71 sec
Python + Psyco ... 1.63 sec
CMUCL ............ 1.42 sec
OCaml ............ 0.42 sec
</pre>
<p>This just illustrates how far one can get when the bottlenecks in the
Python source code are well "psycoable".</p>
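<p>For reference, the benchmarked function can be sketched as follows. The doubly recursive definition with <tt class="docutils literal">fib(0) = fib(1) = 1</tt> is an assumption, chosen because it reproduces the value fib(37)=39088169 quoted above. The <tt class="docutils literal">try</tt>/<tt class="docutils literal">except</tt> guard and the <tt class="docutils literal">HAVE_PSYCO</tt> flag are likewise illustrative additions, so that the same file still runs where Psyco is not installed.</p>

```python
def fib(n):
    """Naive doubly recursive Fibonacci, with fib(0) = fib(1) = 1."""
    if n < 2:
        return 1
    return fib(n - 2) + fib(n - 1)


# Bind Psyco only when it can be imported; otherwise keep the plain
# interpreter version of fib().
try:
    import psyco
    psyco.bind(fib)
    HAVE_PSYCO = True
except ImportError:
    HAVE_PSYCO = False
```

<p>The guard costs nothing when Psyco is present and keeps the program portable when it is not.</p>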
</div>
<div class="section" id="complex-code-logic-example">
<h2>Complex code logic example</h2>
<p>For cases with complex code logic (not even speaking of including
database connections and stuff) I have observed little help from
Psyco. It could even slow things down.</p>
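<p>Since the effect can go either way, it is worth measuring. Here is a minimal timing sketch using only the standard library; the helper name <tt class="docutils literal">best_time</tt> and the choice of three repetitions are assumptions made for this illustration.</p>

```python
import time


def best_time(func, *args, repeat=3):
    """Return the best wall-clock time in seconds over `repeat` calls."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        func(*args)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best

# Hypothetical usage: time the candidate before and after binding it.
#     before = best_time(my_slow_function, some_argument)
#     psyco.bind(my_slow_function)
#     after = best_time(my_slow_function, some_argument)
#     print("speed-up: %.1fx" % (before / after))
```

<p>Taking the best of several runs reduces the noise from other activity on the machine.</p>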
<p>Each candidate piece of code for a speed-up should therefore be tested for its
"psycoability".</p>
</div>
Fastest Ants2003-07-10T01:22:00+02:002003-07-10T01:22:00+02:00Tibor Šimkotag:tiborsimko.org,2003-07-10:/ants-fastest.html<p>Ants are fast. <em>Really</em> fast. Who is the ant champion of speed? And
what would it take for a human to move the same way?</p>
<!-- PELICAN_END_SUMMARY --><p><em>Odontomachus bauri</em>. The fastest movement ever measured in the
animal world. Mandible closure time from open wide to fully closed in
0.0003-0.001 sec …</p><p>Ants are fast. <em>Really</em> fast. Who is the ant champion of speed? And
what would it take for a human to move the same way?</p>
<!-- PELICAN_END_SUMMARY --><p><em>Odontomachus bauri</em>. The fastest movement ever measured in the
animal world. Mandible closure time from open wide to fully closed in
0.0003-0.001 sec. This serves to capture small springtails (Collembola), or to
"catapult himself up" over a distance of 40 cm and more, by snapping
his mandibles against a stone rather than at prey. Scaled to human
size, the mandible closure speed would correspond to a fist movement
of 3 km/s, i.e. a fist faster than a rifle bullet!</p>
<p>Pretty amazing, the ants. There are more than 9,000 ant species, with
a very great diversity of ways of life.</p>
<p><em>Gigantiops destructor</em>. Extremely good mosaic eyes with 4000 facets,
capable of memorising and recognising terrain perfectly. In the lab
they learn fast to memorise geometrical symbols like ellipse, circle,
triangle, square, etc.</p>
<p><em>Cataglyphis</em>. Very good at orientation: capable of detecting polarised
light and of calculating its position with respect to the sun and the nest.</p>
<p>Just a few examples of ant champions.</p>
Vampire Bats and the Food Sharing Dilemma2003-06-23T15:38:00+02:002003-06-23T15:38:00+02:00Tibor Šimkotag:tiborsimko.org,2003-06-23:/vampire-bats.html<p class="first last">Vampire bats are known for sharing their food with friends.
Is this an altogether altruistic process? Can a lazy bat
profit from the generosity of his friends all the while
never paying them back?</p>
<p>Previously I mentioned the ants and the process of sharing food known
as trophallaxis, a kind of regurgitation or vomiting happening per
request, to "altruistically" share the food between a hungry and a
satiated ant.</p>
<p>The vampire bats do that as well. When a bat gets some blood and has
had his fill of it, upon returning home he can "altruistically" vomit
a bit and share it with his bat friends.</p>
<p>Now one may think that such a system would tend to profit a selfish
cheater kind of bat, since he could always try to get a share of the
common blood from his friends when he's hungry, and refuse to give
away any of his own victuals should he be the happy camper returning
home with the full stomach. Better to share, expecting the same
favour later on, or better to refuse, profiting from the present lucky
instant?</p>
<p>A prisoner's dilemma kind of game, in which bats have a complex
decision to make.</p>
<p>Luckily, nature has provided them with complex brains to help in their
deciding business. The fact is, vampire bats have an unusually large brain
and the biggest neocortex, "the thinking part" of the brain,
among all bat species. This enables them to better memorise
which bat colleague was willing to share in the past and which was
selfish... and so the solitary selfish cheaters, who do not return a
favour, are remembered and avoided next time.</p>
<p>It's thanks to their bigger brain that such a sharing system can work
in reality.</p>
<p><em>After Matt Ridley's "On the origin of virtue".</em></p>
Shabby Ants2003-06-23T15:37:00+02:002003-06-23T15:37:00+02:00Tibor Šimkotag:tiborsimko.org,2003-06-23:/ants-shabby.html<p>In popular knowledge, ant societies are an amazing example of animal
collaboration. A working-together society full of brave actions.
Less known are examples of working-against society full of shabby
actions.</p>
<!-- PELICAN_END_SUMMARY --><p>Consider the spectacular <em>Polyergus</em>, the Amazon ant, the queen of
which penetrates into the nest of <em>Formica</em> species, kills the …</p><p>In popular knowledge, ant societies are an amazing example of animal
collaboration. A working-together society full of brave actions.
Less known are examples of working-against society full of shabby
actions.</p>
<!-- PELICAN_END_SUMMARY --><p>Consider the spectacular <em>Polyergus</em>, the Amazon ant, the queen of
which penetrates into the nest of a <em>Formica</em> species, kills the
original queen, and subordinates the existing workers to serve her and her
own species. The original <em>Formica</em> ants literally become their
slaves: the kids of <em>Polyergus</em> (warriors with huge mandibles) are
so big that they are unable to feed themselves. They need their
<em>Formica</em> slaves to feed them. And, since the original queen is dead,
they have to find some fresh supplies of new <em>Formica</em> workers
somewhere every now and again, in order to survive. So, in the
evenings they go off out of the nest on a wild ride searching for
other <em>Formica</em> nests, fight them, take out their larvae, bring them
home to their <em>Formica</em> slaves who then take care of them and thus
raise more and more slaves for their "cruel" <em>Polyergus</em> masters.</p>
<p>This last part was not quite thought through, as it were, by <em>Anergates</em>,
whose queen does not produce any workers at all. She gives birth
only to queens and males. So, when she penetrates into the nest of
the <em>Tetramorium</em> ant and kills the host queen (some parasitic
species instead try to strike a deal with the host queen so that they can
live side by side with her in the same nest), the whole <em>Anergates</em>
colony dies as well once all the host <em>Tetramorium</em> workers have slowly
died off. There are simply no more workers left to get
food.</p>
<p>Or take the tiny <em>Diplorhoptrum</em> ant, which lives inside the nest of a
bigger <em>Lasius</em> species, viciously stealing food from them from time
to time. And, when the <em>Lasius</em> soldiers finally get angry and rush after
them, <em>Diplorhoptrum</em> simply hides in its tiny narrow
corridors inside the <em>Lasius</em> nest, where the bigger <em>Lasius</em> cannot
penetrate because of their size. Tiny but smart, <em>Diplorhoptrum</em>!</p>
Pompous Fools2003-06-20T16:40:00+02:002003-06-20T16:40:00+02:00Tibor Šimkotag:tiborsimko.org,2003-06-20:/pompous-fools.html<p>Ponder the following sentence. "The individual member of the social
community often receives his information via visual, symbolic
channels." Thumbs up or thumbs down?</p>
<!-- PELICAN_END_SUMMARY --><p>One of my favourite stories on "glorious ways of communicating" comes
from the pen of Richard Feynman. He once participated in a
cross-disciplinary kind of conference on ethics and wrote the
following on his experience.</p>
<p><center>* * *</center></p><p>There was a sociologist who had written a paper for us all to read
- something he had written ahead of time. I started to read the
damn thing, and my eyes were coming out: I couldn't make head nor
tail of it! I figured it was because I hadn't read any of the books
on that list. I had this uneasy feeling of "I'm not adequate,"
until I finally said to myself, "I'm gonna stop, and read one
sentence slowly, so I can figure out what the hell it means."</p>
<p>So I stopped - at random - and read the next sentence very
carefully. I can't remember it precisely, but it was very close to
this: "The individual member of the social community often receives
his information via visual, symbolic channels." I went back and
forth over it, and translated. You know what it means? "People
read."</p>
<p>Then I went over the next sentence, and I realized that I could
translate that one also. Then it became a kind of empty business:
"Sometimes people read; sometimes people listen to the radio," and
so on, but written in such a fancy way that I couldn't understand
it at first, and when I finally deciphered it, there was nothing to
it.</p>
<p>[...]</p><p>There were a lot of fools at that conference - pompous fools - and
pompous fools drive me up the wall. Ordinary fools are all right;
you can talk to them, and try to help them out. But pompous fools -
guys who are fools and are covering it all over and impressing
people as to how wonderful they are with all this hocus pocus -
THAT, I CANNOT STAND! An ordinary fool isn't a faker; an honest
fool is all right. But a dishonest fool is terrible!</p>
<p>[...]</p><p>There was only one thing that happened at that meeting that was
pleasant or amusing. At this conference, every word that every guy
said at the plenary session was so important that they had a
stenotypist there, typing every damn thing. Somewhere on the second
day the stenotypist came up to me and said, "What profession are
you? Surely not a professor."</p>
<p>"I am a professor," I said.</p>
<p>"Of what?"</p>
<p>"Of physics - science."</p>
<p>"Oh! That must be the reason," he said.</p>
<p>"Reason for what?"</p>
<p>He said, "You see, I'm a stenotypist, and I type everything that is
said here. Now, when the other fellas talk, I type what they say,
but I don't understand what they're saying. But every time you get
up to ask a question or to say something, I understand exactly what
you mean - what the question is, and what you're saying - so I
thought you can't be a professor!"</p>
<p><em>Quoted from Richard Feynman's "Surely You're Joking, Mr. Feynman!"</em></p>
Common Lisp Extensibility2003-03-20T09:00:00+01:002003-03-20T09:00:00+01:00Tibor Šimkotag:tiborsimko.org,2003-03-20:/common-lisp-extensibility.html<p>A friend suggested that object orientation may have been added to Lisp
in an "usine à gaz" (French for a needlessly complicated contraption) sort of way. This post is an extended answer to
that claim.</p>
<!-- PELICAN_END_SUMMARY --><p>Firstly I'll mention how extensibility is at the very heart of
Lisp and how easy it is to add OO abstractions to Common Lisp.
(This aims to counter the proposition that adding OO to Lisp was an
"usine à gaz" sort of thing.) Then I'll pause to express some
scepticism on whether OO is really an upgrade as far as the Lisp
abstraction possibilities are concerned. At the end I'll try to
introduce some of the ideas behind the Common Lisp Object System
(CLOS) to demonstrate why I think that this kind of object system is
superior to, and more natural for modelling the real world than, the
object systems found in mainstream languages such as Java, or, err, Python.</p>
<div class="section" id="lisp-design-and-extensibility">
<h2>Lisp design and extensibility</h2>
<p>Lisp's design principles make the language naturally extensible,
and lead to naturally extensible products: just look at
Emacs or AutoCAD. Why is this so?</p>
<blockquote>
<p>Lisp is designed to be extensible; it lets you define new operators
yourself. This is possible because the Lisp language is made out
of the same functions and macros as your own programs. So it's no
more difficult to extend Lisp than to write a program in it. In
fact, it's so easy (and so useful) that extending the language is
standard practice. As you're writing your program down toward the
language, you build the language up toward your program. You work
bottom-up, as well as top-down.</p>
<p>Almost any program can benefit from having the language tailored to
suit its needs, but the more complex the program, the more valuable
bottom-up programming becomes. A bottom-up program can be written
as a series of layers, each one acting as a sort of programming
language for the one above. TeX was one of the earliest programs
to be written this way. You can write programs bottom-up in any
language, but Lisp is by far the most natural vehicle for this
style.</p>
<p>Bottom-up programming leads to naturally extensible software. If
you take the principle of bottom-up programming all the way up to
the topmost layer of your program, then that layer becomes a
programming language for the user. Because the idea of
extensibility is so deeply rooted in Lisp, it makes the ideal
language for writing extensible software.</p>
<p class="attribution">—Paul Graham, ACL (1996)</p>
</blockquote>
<p>Lisp can lead to extensible software, of which Emacs is real-life
proof. So far so good. But what about its extensibility and
adaptability as far as programming paradigms are concerned?</p>
<blockquote>
<p>When Lisp was invented in 1958, nobody could have foreseen the
advances in programming theory and language design that have taken
place in the last thirty years. Other early languages have been
discarded, replaced by ones based on newer ideas. However, Lisp
has been able to survive, because it has been able to adapt.
Because Lisp is extensible, it has been changed to incorporate the
newest features as they become popular. [...] Flexibility of Lisp
goes beyond adding individual constructs. The brand new styles of
programming can easily be implemented. Many AI applications are
based on the idea of rule-based programming. Another new style is
object-oriented programming, which has been incorporated with the
Common Lisp Object System (CLOS), a set of macros, functions and
data types that have been integrated into ANSI Common Lisp.</p>
<p class="attribution">—Peter Norvig, PAIP (1992)</p>
</blockquote>
<p>The beginning of the above-mentioned quote from Paul Graham gives the
answer as to why this is so: because Lisp can be written in Lisp.</p>
<blockquote>
<p>The unusual thing about Lisp -- in fact, the defining quality of
Lisp -- is that it can be written in itself. [...] Using just
quote, atom, eq, car, cdr, cons, and cond, we can define a
function, eval, that actually implements our language, and then
using that we can define any additional function we want. [...]
Given a handful of simple operators and a notation for functions,
you can build a whole programming language [...] What we have here
is a remarkably elegant model of computation.</p>
<p class="attribution">—Paul Graham, "The Roots of Lisp" (2001)
<<a class="reference external" href="http://paulgraham.com/rootsoflisp.html">http://paulgraham.com/rootsoflisp.html</a>></p>
</blockquote>
<p>The key idea of Lisp, which makes it all possible, is the code-is-data
philosophy: the fact that Lisp programs use the same structure as Lisp
data. Using syntactically the same structure for code and data means
that in Lisp it is natural to write programs that manipulate code; to
write programs that write programs. I'm thinking of Lisp macros, of
course. Together with runtime typing, these features make the
language very flexible and adaptable, unlike most other languages.
It is important to note that these features have been present in Lisp
since the 1960s but are still rare elsewhere in 2003! As a result,
while extending Lisp is a natural way of working in the
Lisp world, extending other languages isn't that natural. For other
languages it is much harder to adapt to new trends, and trying to do
so often gives rise to an "usine à gaz" class of phenomena, i.e. a more
or less unnatural stretching of the language beyond its original
domain.</p>
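<p>For readers who do not speak Lisp, the code-is-data idea can be sketched
in Python, assuming we represent expressions as nested lists the way Lisp
does. The names <tt class="docutils literal">expand</tt>, <tt class="docutils literal">evaluate</tt> and the <tt class="docutils literal">unless</tt>
operator are hypothetical illustrations invented for this sketch, not part of
any library, and real Lisp macros are far more general than this toy
rewriting pass.</p>

```python
import operator

# Hypothetical toy language: code is just nested Python lists,
# e.g. ["+", 1, ["*", 2, 3]] stands for the Lisp form (+ 1 (* 2 3)).
OPERATORS = {"+": operator.add, "*": operator.mul, "<": operator.lt}


def expand(expr):
    """Macro-like pass: rewrite ["unless", test, body] into an "if" form."""
    if not isinstance(expr, list):
        return expr  # atoms (numbers, operator names) are left alone
    expr = [expand(part) for part in expr]
    if expr and expr[0] == "unless":
        _, test, body = expr
        return ["if", test, None, body]
    return expr


def evaluate(expr):
    """Evaluate an already expanded expression."""
    if not isinstance(expr, list):
        return expr
    head, *args = expr
    if head == "if":
        test, then_branch, else_branch = args
        return evaluate(then_branch if evaluate(test) else else_branch)
    return OPERATORS[head](*[evaluate(arg) for arg in args])


program = ["unless", ["<", 2, 1], ["+", 1, ["*", 2, 3]]]
expanded = expand(program)   # the program was rewritten as ordinary data
result = evaluate(expanded)
```

<p>The point of the sketch is only that <tt class="docutils literal">expand</tt> is an ordinary function
over ordinary data, which happens to be code. In Lisp this comes for free,
because the language's own syntax already is such a data structure.</p>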
<p>Thanks to the code-is-data philosophy, extensibility is a core,
natural thing in Lisp <em>by its very design</em>, so that the "usine à gaz"
phenomenon is a beautiful stranger here. Let me demonstrate it with a
concrete example: the object-oriented paradigm.</p>
</div>
<div class="section" id="lisp-abstractions-and-oo-message-passing-model">
<h2>Lisp abstractions and OO message-passing model</h2>
<p>How hard would it be to extend Lisp to encompass the mainstream OO
message-passing model? That is, let's forget that Common Lisp has its
own OO system, and let's try to introduce a new mainstream OO
message-passing model into a non-OO Common Lisp. How many man-months
and lines of code do you think such a system would require?</p>
<p>This is exactly the kind of exercise Paul Graham undertook in the last
chapter of his "ANSI Common Lisp" book, in order to illustrate
embedded-language programming in Lisp. And the answer is: eight lines
of code to introduce a basic message-passing single-inheritance model:</p>
<blockquote>
<p>It's worth pausing to consider what we have done here. With eight
lines of code we have made plain old pre-CLOS Lisp into an
object-oriented language. How did we manage to achieve such a
feat? There must be some trick involved, to implement
object-oriented programming in eight lines of code.</p>
<p>There is a trick, but it is not a programming trick. The trick
is, Lisp already was an object-oriented language, or rather,
something more general. All we had to do was put a new facade on
the abstractions that were already there.</p>
</blockquote>
<p>Which abstractions did Paul Graham mean? Objects were modelled as
hash tables that stored both the attributes and the object
methods, the latter as lambda functions. This works because functions are
first-class citizens in Lisp (meaning that they can be worked with as
data, passed into and returned from functions, etc). What needed
to be added "on top" of pre-CLOS Lisp was mainly basic support for
inheritance, for which those eight lines of code were enough.</p>
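<p>The same idea can be sketched in Python, with dictionaries standing in
for hash tables and lambdas for the methods. This is not Graham's code but a
loose analogue under my own assumptions: the helper names
<tt class="docutils literal">make_object</tt>, <tt class="docutils literal">lookup</tt> and <tt class="docutils literal">tell</tt>, and the
<tt class="docutils literal">__parent__</tt> key, are hypothetical.</p>

```python
def make_object(parent=None, **slots):
    """Create an object as a plain dict of slots with an optional parent."""
    obj = dict(slots)
    obj["__parent__"] = parent
    return obj


def lookup(obj, name):
    """Find a slot on the object or, failing that, on its ancestors."""
    while obj is not None:
        if name in obj:
            return obj[name]
        obj = obj["__parent__"]  # single inheritance: climb to the parent
    raise AttributeError(name)


def tell(obj, message, *args):
    """Send a message: look the method up and call it with the receiver."""
    return lookup(obj, message)(obj, *args)


# Methods are just functions stored in slots, next to the attributes.
circle_class = make_object(
    area=lambda self: 3.14159 * lookup(self, "radius") ** 2)
our_circle = make_object(parent=circle_class, radius=2)
area = tell(our_circle, "area")
```

<p>Here <tt class="docutils literal">our_circle</tt> has no <tt class="docutils literal">area</tt> slot of its own, so the message
is answered by the method inherited from <tt class="docutils literal">circle_class</tt>. The whole
"object system" is a lookup loop over data structures that already existed.</p>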
<p>The chapter then goes on to expand and "beautify" the model, adding
user-friendly language constructs for using objects etc, to end up with
a complete full-featured OO message-passing system, including multiple
inheritance(!), in something like seventy lines of code.</p>
<blockquote>
We now have an embedded language suitable for writing real
object-oriented programs. It is simple, but for its size quite
powerful. And in typical applications it will also be fast. [...]
We see what a wide latitude the term "object-oriented" has. Our
program is more powerful than a lot of things that have been called
object-oriented, and yet it has only a fraction of the power of
CLOS [...] One of the disadvantages of CLOS is that it is so large
and elaborate that it conceals the extent to which object-oriented
programming is a paraphrase of Lisp. The example in this chapter
does at least make that clear. If we were content to implement the
old message-passing model, we could do it in a little over a page of
code. Object-oriented programming is one thing Lisp can do. A
more interesting question is, what else can it do?</blockquote>
<p>Seventy lines of code... this does not sound like an "usine à gaz"
sort of thing, does it?</p>
</div>
<div class="section" id="does-lisp-need-oo">
<h2>Does Lisp need OO?</h2>
<p>What does the above exercise mean? No more and no less than the fact
that Lisp has its own powerful set of abstractions, and that the OO
point of view on the world doesn't provide any kind of "substantial
upgrade" to the Lisp way of thinking. It suffices to "map" the new OO
constructs onto pre-existing Lisp abstractions, and the job is done.
What are these pre-existing key Lisp abstractions?</p>
<blockquote>
With macros, closures, and run-time typing, Lisp transcends
object-oriented programming.</blockquote>
<p>I've already spoken briefly about (i) the code-is-data philosophy and (ii)
macros, which operate on code and produce code. It's worth underlining that
Lisp macros have full access to the language, unlike say C "macros"
and other "templating" systems that may operate simply textually. As
for (iii) runtime typing, it of course means that the types are
associated with values rather than with variables: it's a kind of
strong dynamic typing that is now fairly popular in many mainstream
languages. (iv) Lexical closures have been present in Lisp since the
1970s, but are, similarly to macros, still quite rare outside of Lisp.
Essentially, lexical closures are functions that "close over"
lexically bound variables so that these variables are then
inaccessible to outside code except via calls to the closure
function. In other words, a closure is a function that can "capture and
encapsulate" variables in local state. Sounds like an OO concept,
hm? (Peter Norvig in the above-cited PAIP does the same exercise
as Paul Graham in ACL -- how to introduce OO into non-OO Lisp -- for
which he expressly uses closures and lambda functions, with similar
ease. And, in the full-featured CLOS, the methods are actually
closures, underneath the syntax.)</p>
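<p>To see why a closure "sounds like an OO concept", here is a minimal
Python sketch of a counter whose state is captured by a function rather than
stored in an object. The names <tt class="docutils literal">make_counter</tt> and <tt class="docutils literal">increment</tt> are
made up for this illustration.</p>

```python
def make_counter(start=0):
    """Return a function that owns a private, mutable count."""
    count = start

    def increment(step=1):
        nonlocal count  # rebind the variable captured from make_counter
        count += step
        return count

    return increment


first = make_counter()
second = make_counter(100)
# Each closure keeps its own count; neither can see the other's state.
values = [first(), first(), second(), first(5)]
```

<p>The variable <tt class="docutils literal">count</tt> behaves like a private attribute and
<tt class="docutils literal">increment</tt> like the only method allowed to touch it, yet no class
was defined anywhere.</p>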
<p>And I'm not going to elaborate on (v) Lisp symbols having
their own "property lists" that could be used to store
object attributes; (vi) encapsulation and information hiding, which
can be provided not only via closures but also via interning symbols; (vii)
etc, etc... The list is already long enough, and I hope to have
shown why OO does not bring that many "new" abstraction
concepts, which is why it was possible to introduce the
message-passing OO model, including multiple inheritance, in about
seventy lines of code.</p>
<p>Consequently, it's not a big surprise that many Lisp programmers do not
consider OO to be any kind of "ultimate" programming paradigm. Lisp
does not really "need" it. This is in stark contrast to the non-Lisp
world where OO brought significant improvements:</p>
<blockquote>
<p>Object-oriented programming is exciting if you have a
statically-typed language without lexical closures or macros. To
some degree, it offers a way around these limitations.</p>
<p class="attribution">—Paul Graham, "Why Arc Isn't Especially Object-Oriented"
<<a class="reference external" href="http://paulgraham.com/noop.html">http://paulgraham.com/noop.html</a>></p>
</blockquote>
<p>while in the Lisp world itself:</p>
<blockquote>
With the addition of CLOS, Common Lisp has become the most powerful
object-oriented language in widespread use. Ironically, it is also
the language in which object-oriented programming is least
necessary.</blockquote>
<p>That said, let's look at CLOS, the Common Lisp Object System, in a
later post.</p>
</div>
Common Lisp Object System2003-03-18T09:00:00+01:002003-03-18T09:00:00+01:00Tibor Šimkotag:tiborsimko.org,2003-03-18:/common-lisp-object-system.html<p>In a previous post, I looked at Common Lisp extensibility, how easy
and how natural it is to implement single-method dispatch in the
language; and how "unnecessary" it actually is, given the other
powerful constructs of the language. Let's now look at CLOS, the
Common Lisp Object System, to see how object orientation is
<em>really</em> done in the Lisp world.</p>
<!-- PELICAN_END_SUMMARY --><div class="section" id="single-method-dispatch">
<h2>Single-method dispatch</h2>
<p>Single-method dispatch is based on a model where objects send each
other messages. We know it from Java, C++, Python, etc. However, is
it really the best way to model the real world?</p>
<p>Consider an object colliding with another object, such as two
strawberries launched against each other. Which one sends the message
to whom? Who's the sender and who's the receiver? Aren't they both
equal partners in the collision event? If so, isn't it better to
model this situation differently?</p>
<p>Enter generic functions and multi-method dispatch.</p>
</div>
<div class="section" id="multi-method-dispatch">
<h2>Multi-method dispatch</h2>
<p><em>Example added in August 2010</em></p>
<p>Q: "So, I don't know about CLisp's OO or multi-methods. Could you
give example pythonic syntax that makes clear how it would work?"</p>
<p>A: A Pythonic pseudo-code example defining the rock-paper-scissors game,
emulating the Common Lisp way of doing things, would go like this:</p>
<div class="highlight"><pre><span></span><span class="k">class</span> <span class="nc">Rock</span><span class="p">:</span>
<span class="k">pass</span>
<span class="k">class</span> <span class="nc">Paper</span><span class="p">:</span>
<span class="k">pass</span>
<span class="k">class</span> <span class="nc">Scissors</span><span class="p">:</span>
<span class="k">pass</span>
<span class="n">defgeneric</span> <span class="n">game</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span>
<span class="n">defmethod</span> <span class="n">game</span><span class="p">(</span><span class="n">Rock</span> <span class="n">r</span><span class="p">,</span> <span class="n">Paper</span> <span class="n">p</span><span class="p">):</span>
<span class="k">print</span> <span class="s2">"</span><span class="si">%s</span><span class="s2"> wins"</span> <span class="o">%</span> <span class="n">p</span>
<span class="n">defmethod</span> <span class="n">game</span><span class="p">(</span><span class="n">Rock</span> <span class="n">r</span><span class="p">,</span> <span class="n">Scissors</span> <span class="n">s</span><span class="p">):</span>
<span class="k">print</span> <span class="s2">"</span><span class="si">%s</span><span class="s2"> wins"</span> <span class="o">%</span> <span class="n">r</span>
<span class="n">defmethod</span> <span class="n">game</span><span class="p">(</span><span class="n">Paper</span> <span class="n">p</span><span class="p">,</span> <span class="n">Scissors</span> <span class="n">s</span><span class="p">):</span>
<span class="k">print</span> <span class="s2">"</span><span class="si">%s</span><span class="s2"> wins"</span> <span class="o">%</span> <span class="n">s</span>
</pre></div>
<p>A concrete rock-paper-scissors game round between John and Jane:</p>
<div class="highlight"><pre><span></span><span class="n">john_fist</span> <span class="o">=</span> <span class="n">Paper</span><span class="p">()</span>
<span class="n">jane_fist</span> <span class="o">=</span> <span class="n">Scissors</span><span class="p">()</span>
<span class="n">game</span><span class="p">(</span><span class="n">john_fist</span><span class="p">,</span> <span class="n">jane_fist</span><span class="p">)</span>
<span class="c1"># Jane wins</span>
</pre></div>
<p>For more, see e.g. <a class="reference external" href="http://www.c2.com/cgi/wiki?MultiMethods">http://www.c2.com/cgi/wiki?MultiMethods</a>.</p>
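<p>The pseudo code above can be approximated in runnable Python with a small
registry keyed on the types of both arguments. This is only a sketch under
simplifying assumptions: the <tt class="docutils literal">defmethod</tt> decorator and the
<tt class="docutils literal">_methods</tt> table are hypothetical helpers written for this example, the
lookup matches exact types only (CLOS also walks the class hierarchy), and
trying the swapped argument order is my own addition. The standard library's
<tt class="docutils literal">functools.singledispatch</tt> is not used because it dispatches on the
first argument only.</p>

```python
class Rock:
    pass


class Paper:
    pass


class Scissors:
    pass


_methods = {}  # maps (type of x, type of y) to the function to call


def defmethod(type_x, type_y):
    """Decorator registering one method of the generic function `game`."""
    def register(func):
        _methods[(type_x, type_y)] = func
        return func
    return register


def game(x, y):
    """Generic function: pick the method from the types of BOTH arguments."""
    key = (type(x), type(y))
    if key in _methods:
        return _methods[key](x, y)
    if key[::-1] in _methods:  # same pairing, arguments the other way round
        return _methods[key[::-1]](y, x)
    raise TypeError("no applicable method for %s vs %s"
                    % (key[0].__name__, key[1].__name__))


@defmethod(Rock, Paper)
def rock_vs_paper(r, p):
    return "Paper wins"


@defmethod(Rock, Scissors)
def rock_vs_scissors(r, s):
    return "Rock wins"


@defmethod(Paper, Scissors)
def paper_vs_scissors(p, s):
    return "Scissors wins"


john_fist = Paper()
jane_fist = Scissors()
outcome = game(john_fist, jane_fist)
```

<p>Note that no class "owns" the <tt class="docutils literal">game</tt> behaviour: the methods live
outside the classes and are selected by the combination of argument types,
which is the essence of the generic-function model.</p>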
</div>
<div class="section" id="clos">
<h2>CLOS</h2>
<p><em>FIXME</em></p>
</div>
<div class="section" id="everything-should-be-made-simple">
<h2>Everything should be made... simple?</h2>
<p>Einstein once said that "Everything should be made as simple as
possible, but not simpler". Sometimes people forget this "but not
simpler" part. The simplifying point of view misses the fact that
life is inherently complex and you cannot model it well using
oversimplifications. The real world is full of multiple inheritance
and multi-method dispatch: it suffices to just look around. In
single-method dispatch language environments, people try to circumvent
this, inventing artificial workarounds to get the job done. I argue
that this is not the best approach. At some point in time one needs
to stop simplifying and accept the inherent complexity of the problem.
Then, of course, you need to tackle it with the right tools, the best
ones available.</p>
</div>
Common Lisp Runtime Redefinition2002-12-18T20:00:00+01:002002-12-18T20:00:00+01:00Tibor Šimkotag:tiborsimko.org,2002-12-18:/common-lisp-runtime-redefinition.html<p>I think Lisp is the ideal environment for rapid prototyping: even
better than Python. One example is that you can modify your code
on-the-fly while running it; there is no better debugging tool.</p>
<!-- PELICAN_END_SUMMARY --><div class="section" id="dynamic-redefinition-of-class-methods">
<h2>Dynamic redefinition of class methods</h2>
<p>Consider the following rectangle class [you can type "lisp" on pcdh91
and then enter my examples]:</p>
<div class="highlight"><pre><span></span><span class="p">(</span><span class="nb">defclass</span> <span class="nv">rectangle</span> <span class="p">()</span>
<span class="p">((</span><span class="nv">width</span> <span class="ss">:accessor</span> <span class="nv">rectangle-width</span> <span class="ss">:initarg</span> <span class="ss">:width</span><span class="p">)</span>
<span class="p">(</span><span class="nv">height</span> <span class="ss">:accessor</span> <span class="nv">rectangle-height</span> <span class="ss">:initarg</span> <span class="ss">:height</span><span class="p">)))</span>
</pre></div>
<p>and the following method to calculate the rectangle's area:</p>
<div class="highlight"><pre><span></span><span class="p">(</span><span class="nb">defmethod</span> <span class="nv">rectangle-area</span> <span class="p">((</span><span class="nv">r</span> <span class="nv">rectangle</span><span class="p">))</span>
<span class="p">(</span><span class="nb">*</span> <span class="p">(</span><span class="nv">rectangle-height</span> <span class="nv">r</span><span class="p">)</span> <span class="c1">; typo intended!</span>
<span class="p">(</span><span class="nv">rectangle-height</span> <span class="nv">r</span><span class="p">)))</span>
</pre></div>
<p>Now if you test it:</p>
<div class="highlight"><pre><span></span><span class="p">(</span><span class="nb">setf</span> <span class="nv">my-test-rect</span> <span class="p">(</span><span class="nb">make-instance</span> <span class="ss">'rectangle</span> <span class="ss">:width</span> <span class="mi">80</span> <span class="ss">:height</span> <span class="mi">20</span><span class="p">))</span>
<span class="p">(</span><span class="nv">rectangle-area</span> <span class="nv">my-test-rect</span><span class="p">)</span>
</pre></div>
<p>Lisp prints "400" for you... wait, that's not good... oops, you
spot a typo in the rectangle-area method, so you rewrite it
<strong>without quitting</strong> the Lisp runtime environment thusly:</p>
<div class="highlight"><pre><span></span><span class="p">(</span><span class="nb">defmethod</span> <span class="nv">rectangle-area</span> <span class="p">((</span><span class="nv">r</span> <span class="nv">rectangle</span><span class="p">))</span>
<span class="p">(</span><span class="nb">*</span> <span class="p">(</span><span class="nv">rectangle-width</span> <span class="nv">r</span><span class="p">)</span>
<span class="p">(</span><span class="nv">rectangle-height</span> <span class="nv">r</span><span class="p">)))</span>
</pre></div>
<p>and now you can call the test function again, without creating a new
instance:</p>
<div class="highlight"><pre><span></span><span class="p">(</span><span class="nv">rectangle-area</span> <span class="nv">my-test-rect</span><span class="p">)</span>
</pre></div>
<p>You will receive the correct answer, i.e. 1600.</p>
<p>This small example demonstrates that you can modify a class <em>during
runtime</em> and all the existing instances of that class are corrected
accordingly! You do not need to quit, recompile, restart, recreate
instances and restore the same testing state. This saves a lot of
developer time, and I think that the debugging models of
Python/OCaml/C/C++/Foo/Bar do not even come close.</p>
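<p>For comparison, the method-level part of this example has a rough
analogue in Python, sketched below with a hypothetical <tt class="docutils literal">Rectangle</tt>
class of my own. It covers only rebinding a method on a live class; it does
not cover redefining the class's slots with existing instances being
updated, which CLOS also supports, nor the break-and-continue workflow shown
in the next section.</p>

```python
class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.height * self.height  # typo intended, as in the Lisp version


my_test_rect = Rectangle(width=80, height=20)
buggy_area = my_test_rect.area()


def corrected_area(self):
    return self.width * self.height


# Rebind the method on the class; the existing instance is not recreated.
Rectangle.area = corrected_area
fixed_area = my_test_rect.area()
```

<p>The existing instance picks up the new definition because methods are
looked up on the class at call time, which is the same reason the Lisp
instance above needed no recreation.</p>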
</div>
<div class="section" id="dynamic-redefinition-with-winding-unwinding-stack">
<h2>Dynamic redefinition with winding/unwinding stack</h2>
<p>It's even more impressive if we combine dynamic redefinition with
winding/unwinding of the stack. Unlike other languages that in case of a
runtime error just stop and unwind the stack, CL offers you full
inspecting, patching and recompiling capabilities on the current
stack. IOW, you can debug and fix any problem you have and simply go
on running. I'll try a small example on this theme.</p>
<p>Consider the classical school drill:</p>
<pre class="literal-block">
I won't speak loudly during a lesson.
I won't speak loudly during a lesson.
I won't speak loudly during a lesson.
[...]
</pre>
<p>Let us write a little drill function that calls some phrase generator
function f n times and prints out the generated phrases, with some
pauses added for demo purposes:</p>
<pre class="literal-block">
simko@pcdh91:~$ clisp
[1]> (defun drill (n f)
"Repeat n times the drill phrase. Call f to obtain i-th drill
phrase."
(dotimes (i n)
(format t "~%Phrase ~d begins. " i)
(sleep 1)
(format t "~s" (funcall f i))
(sleep 1)
(format t " Phrase ~d ends." i)))
</pre>
<p>and let us write a phrase generator to praise my favourite language
and to do some simple arithmetic:</p>
<pre class="literal-block">
[2]> (defun drill-phrase-generator (i)
"Return i-th drill phrase."
(format nil "I love Java! BTW, ~d*~d=~d." i i (* i i)))
</pre>
<p>Now let us run it 10 times:</p>
<pre class="literal-block">
[3]> (drill 10 'drill-phrase-generator)
Phrase 0 begins. "I love Java! BTW, 0*0=0." Phrase 0 ends.
Phrase 1 begins. "I love Java! BTW, 1*1=1." Phrase 1 ends.
Phrase 2 begins. "I love Java! BTW, 2*2=4." Phrase 2 ends.
Phrase 3 begins.
</pre>
<p>but oops, after several repetitions we notice that my favourite
programming language was misspelled, and the arithmetic "broken"
(say), so let's stop the app with <tt class="docutils literal">^C</tt> here:</p>
<pre class="literal-block">
** - Continuable Error
SYSTEM::%SLEEP: User break
If you continue (by typing 'continue'): Continue execution
1. Break [4]>
</pre>
<p>which brings us to the "application listener" where we can correct
ourselves:</p>
<pre class="literal-block">
1. Break [4]> (defun drill-phrase-generator (i)
(format nil "I love Common Lisp! BTW, ~d+~d=~d." i i (+ i i)))
DRILL-PHRASE-GENERATOR
</pre>
<p>and simply continue the execution:</p>
<pre class="literal-block">
1. Break [4]> continue
"I love Common Lisp! BTW, 3+3=6." Phrase 3 ends.
Phrase 4 begins. "I love Common Lisp! BTW, 4+4=8." Phrase 4 ends.
Phrase 5 begins. "I love Common Lisp! BTW, 5+5=10." Phrase 5 ends.
Phrase 6 begins. "I love Common Lisp! BTW, 6+6=12." Phrase 6 ends.
Phrase 7 begins. "I love Common Lisp! BTW, 7+7=14." Phrase 7 ends.
Phrase 8 begins. "I love Common Lisp! BTW, 8+8=16." Phrase 8 ends.
Phrase 9 begins. "I love Common Lisp! BTW, 9+9=18." Phrase 9 ends.
</pre>
<p>See the CL power? The application was patched during runtime and
continued its course as if nothing had happened. Now imagine this example
in the middle of a big, slow application, and I hope it's
clear how much it helps not to have to quit, recompile, relink, rerun,
restore state, and redebug.</p>
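<p>One detail makes the transcript above work: <tt class="docutils literal">drill</tt> is given the
symbol <tt class="docutils literal">'drill-phrase-generator</tt>, so each <tt class="docutils literal">funcall</tt> resolves
the symbol's current function definition at the moment of the call. The
following Python sketch imitates just that late lookup, assuming a
hypothetical <tt class="docutils literal">generators</tt> table in place of the symbol and a callback in
place of the developer typing at the break prompt. It does not reproduce the
interactive interrupt itself.</p>

```python
generators = {}  # name -> current definition, like a Lisp symbol's function


def phrase_v1(i):
    return "I love Java! BTW, %d*%d=%d." % (i, i, i * i)


def phrase_v2(i):
    return "I love Common Lisp! BTW, %d+%d=%d." % (i, i, i + i)


def drill(n, name, before_each=None):
    """Collect n phrases, resolving `name` afresh on every iteration."""
    phrases = []
    for i in range(n):
        if before_each is not None:
            before_each(i)  # stands in for the developer at the break prompt
        phrases.append(generators[name](i))
    return phrases


def patch_at_fourth_phrase(i):
    """Swap in the corrected generator just before phrase number 3."""
    if i == 3:
        generators["drill-phrase-generator"] = phrase_v2


generators["drill-phrase-generator"] = phrase_v1
phrases = drill(5, "drill-phrase-generator", patch_at_fourth_phrase)
```

<p>Had <tt class="docutils literal">drill</tt> received the function object itself rather than a name
to look up, the loop would have kept calling the old definition until it
finished.</p>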
</div>
<div class="section" id="let-s-redefine-drill-as-well">
<h2>Let's redefine drill as well</h2>
<p>To see even more CL power, let us redefine the drill function too:</p>
<pre class="literal-block">
simko@pcdh91:~$ clisp
[1]> (defun drill (n f)
"Repeat n times the drill phrase. Call f to obtain i-th drill
phrase."
(dotimes (i n)
(format t "~%Phrase ~d begins. " i)
(sleep 1)
(format t "~s" (funcall f i))
(sleep 1)
(format t " Phrase ~d ends." i)))
[2]> (defun drill-phrase-generator (i)
"Return i-th drill phrase."
(format nil "I love Java! BTW, ~d*~d=~d." i i (* i i)))
DRILL-PHRASE-GENERATOR
[3]> (dotimes (i 10) (drill 3 'drill-phrase-generator))
Phrase 0 begins. "I love Java! BTW, 0*0=0." Phrase 0 ends.
Phrase 1 begins. "I love Java! BTW, 1*1=1." Phrase 1 ends.
Phrase 2 begins. "I love Java! BTW, 2*2=4." Phrase 2 ends.
Phrase 0 begins. "I love Java! BTW, 0*0=0." Phrase 0 ends.
Phrase 1 begins.
** - Continuable Error
SYSTEM::%SLEEP: User break
If you continue (by typing 'continue'): Continue execution
1. Break [4]> (defun drill-phrase-generator (i)
(format nil "I love Common Lisp! BTW, ~d+~d=~d." i i (+ i i)))
DRILL-PHRASE-GENERATOR
1. Break [4]> (defun drill (n f)
"Repeat n times the drill phrase. Call f to obtain i-th drill
phrase."
(dotimes (i n)
(format t "~%Drill ~d begins. " i)
(sleep 1)
(format t "~s" (funcall f i))
(sleep 1)
(format t " Drill ~d ends." i)))
DRILL
1. Break [4]> continue
"I love Common Lisp! BTW, 1+1=2." Phrase 1 ends.
Phrase 2 begins. "I love Common Lisp! BTW, 2+2=4." Phrase 2 ends.
Drill 0 begins. "I love Common Lisp! BTW, 0+0=0." Drill 0 ends.
Drill 1 begins. "I love Common Lisp! BTW, 1+1=2." Drill 1 ends.
Drill 2 begins. "I love Common Lisp! BTW, 2+2=4." Drill 2 ends.
</pre>
</div>
<div class="section" id="incremental-compilation">
<h2>Incremental compilation</h2>
<p>Now several languages (including Java) try to offer "incremental
compilation" IDEs that are aimed precisely at this dynamic
redefinition area. Kent Pitman has a column on how programming terms
often get "redefined" for commercial purposes, and wrote in 1993:</p>
<pre class="literal-block">
The term ``incremental compilation'' is another one that I've been
sad to see recycled in the marketplace. Here again is another key
feature of Lisp not duplicated in most of its competitors: The
ability to compile and load new code without exiting your running
application. [...] To those who have only had full file
compilation before, even this stripped down kind of service may seem
like a real step up. But to me it's a step down from what I expect
from ``incremental compilation'' since it forces me to exit my
running application. So I think it's an abuse of long-standing
terminology, perhaps even in some cases with a deliberate intent to
mislead.
-- Kent Pitman, "What's in a Name? Uses and Abuses of Lispy Terminology"
<http://world.std.com/~pitman/PS/Name.html>
</pre>
<p>Now in 2002 for Java there is a good-looking candidate, judging by the
<a class="reference external" href="http://www-3.ibm.com/software/ad/vajava/about/v40/vaj40faq.html#22">FAQ of IBM VisualAge + WebSphere Test Environment</a>.</p>
<p>This reminds me again of Greenspun's tenth rule of programming.
In CL the dynamic redefinition process is very natural since it is built
into the language by design. In popular languages, Java and Python
included, there is no such powerful built-in capability, so they are
left to simulate it, via IDEs or "dirty" language tricks such as
Python's <tt class="docutils literal">new.instancemethod()</tt> and friends. Dunno how
successful these simulations are; but why wait for a feature X
provided by IDE Y in year Z, if everything is already built in right
now in Common Lisp? :-)</p>
</div>
Programming Language for Rapid Prototyping2002-12-18T09:00:00+01:002002-12-18T09:00:00+01:00Tibor Šimkotag:tiborsimko.org,2002-12-18:/programming-rapid-prototyping.html<p>Programming languages suitable for rapid prototyping or why I prefer
Lisp and Python to Java.</p>
<!-- PELICAN_END_SUMMARY --><p>I think one of the essential needs in programming is the ability to
create prototypes quickly, to test ideas, to throw some of them away,
and to retain the good ones. Java is not good at this. It's far behind
Common Lisp, Objective Caml or Python. I particularly like Common
Lisp as it enables you to change your code dynamically
<em>at runtime</em>. For example, you modify a class in a running Lisp
image, and all its existing instances are updated accordingly, without
a restart. No need to remember the current debugging state, quit, edit,
recompile, restore state, and retest if it works this time!</p>
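<p>As a minimal sketch of what this looks like in practice (the class
<tt class="docutils literal">point</tt>, its slots and the variable
<tt class="docutils literal">*p*</tt> are made up for illustration), one can
redefine a class in a running image and then inspect an instance that
was created before the redefinition:</p>
<pre class="literal-block">
;; Hypothetical example class, defined in a running Lisp image.
(defclass point ()
  ((x :initarg :x :accessor point-x)
   (y :initarg :y :accessor point-y)))

;; An instance created before the class is changed.
(defvar *p* (make-instance 'point :x 1 :y 2))

;; Later on, without quitting, the class gains a third slot.
(defclass point ()
  ((x :initarg :x :accessor point-x)
   (y :initarg :y :accessor point-y)
   (z :initarg :z :initform 0 :accessor point-z)))

;; The old instance follows the new definition: the old slot keeps
;; its value and the added slot is filled from its :initform.
(point-x *p*)   ; returns 1
(point-z *p*)   ; returns 0
</pre>
<p>The standard CLOS machinery updates existing instances lazily, no
later than the next time one of their slots is accessed. If the default
behaviour is not enough, it can be customised by specialising
<tt class="docutils literal"><span class="pre">update-instance-for-redefined-class</span></tt>.</p>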
<p>A good article on this is "Accelerating Hindsight: Lisp as a Vehicle
for Rapid Prototyping" by Kent Pitman. The article was written in 1994,
so it says "weird" things like "Hash Tables: widely accepted data
structures for fast access to large tables of data are not present in
most languages, but are standard in Common Lisp". Nowadays most
languages have "caught up" in this respect. But I think they have not
caught up on most of the other points, far from it. His conclusions are
still valid. <a class="reference external" href="http://world.std.com/%7Epitman/PS/Hindsight.html">http://world.std.com/%7Epitman/PS/Hindsight.html</a></p>
<p>There is a nice real-world "proof" of this Lispy claim: see a somewhat
simplified measurement of programmer efficiency in Java, C++ and Lisp
for a sample task, presented in the paper "Lisp as an Alternative to
Java" by Erann Gat
<a class="reference external" href="http://www.flownet.com/gat/papers/lisp-java.pdf">http://www.flownet.com/gat/papers/lisp-java.pdf</a>. Concerning this
study, Peter Norvig writes:</p>
<blockquote>
<p>I did not participate in the study, but after I saw it, I wrote my
version in Lisp. It took me about 2 hours (compared to a range of 2
to 8.5 hours for the other Lisp programmers in the study, 3 to 25
for C/C++ and 4 to 63 for Java) and I ended up with 45 non-comment
non-blank lines (compared with a range of 51 to 182 for Lisp, and
107 to 614 for the other languages). (That means that some Java
programmer was spending 13 lines and 84 minutes to provide the
functionality of each line of my Lisp program.)</p>
<p><a class="reference external" href="http://www.norvig.com/java-lisp.html">http://www.norvig.com/java-lisp.html</a></p>
</blockquote>
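<p>The arithmetic behind that last parenthesis can be checked at the
REPL. This is only a sketch: the helper name
<tt class="docutils literal"><span class="pre">per-line-cost</span></tt> is made up here, and it assumes, as the
quote implies, that the 614-line and 63-hour upper bounds both belong
to a Java entry:</p>
<pre class="literal-block">
;; Hypothetical helper: how many lines and how many minutes of the
;; other program correspond to one line of the reference program.
(defun per-line-cost (reference-lines other-lines other-hours)
  (list (floor other-lines reference-lines)             ; whole lines
        (/ (* other-hours 60) reference-lines)))        ; minutes

;; 45 lines for Norvig; 614 lines and 63 hours at the upper end.
(per-line-cost 45 614 63)   ; returns (13 84)
</pre>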
Average Style of Programming2002-10-23T09:00:00+02:002002-10-23T09:00:00+02:00Tibor Šimkotag:tiborsimko.org,2002-10-23:/programming-average-style.html<p>A danger of machinable software engineering factory: average style of
programming.</p>
<!-- PELICAN_END_SUMMARY --><p>The problem with software engineering methodologies is that they often
promote a kind of "average" style of programming, often leading to
mediocrity, or even worse, unnecessary bloat. My favourite
quote illustrating this problem comes from an excellent book by Jon
Bentley, "Programming Pearls":</p>
<blockquote>
<p>Problem 5 describes a class exercise that I graded on programming
style. Most students turned in one-page solutions and received
mediocre grades. Two students who had spent the previous summer on
a large software development project turned in beautifully
documented five-page programs, broken into a dozen procedures, each
with an elaborate heading. They received failing grades. My
program worked in five lines of code, and their inflation factor of
sixty was too much for a passing grade. When they complained that
they were employing standard software engineering tools, I should
have quoted Pamela Zave: "The purpose of software engineering is to
control complexity, not to create it."</p>
<p class="attribution">—Jon Bentley, "Programming Pearls", p.123</p>
</blockquote>
<p>I think that Java-like languages coupled with "enterprise"
engineering technologies <em>pretty often</em> lead to such a bloated
programming style. They psychologically <em>promote</em> such a style,
which is one of the major reasons I don't like Java, and, I think, one
of the major reasons one can often see poorly-performing Java programs
around.</p>
<p>Don't misunderstand me: these technologies have their place and are
probably a good thing in teams with 100+ members, but I think they are
disadvantageous in smaller teams such as ours. Here, the organic
growth way of software development, with frequent redesign, rapid
prototyping, trials and rewrites, is what I prefer. I think this
method leads to a better product. And while it's relatively expensive
to do such organic development in languages like Java, it's very
natural in Common Lisp or Python.</p>
<p>Paul Graham puts it this way:</p>
<blockquote>
<ol class="arabic simple" start="2">
<li>Object-oriented programming is popular in big companies, because
it suits the way they write software. At big companies, software
tends to be written by large (and frequently changing) teams of
mediocre programmers. Object-oriented programming imposes a
discipline on these programmers that prevents any one of them
from doing too much damage. The price is that the resulting code
is bloated with protocols and full of duplication. This is not
too high a price for big companies, because their software is
probably going to be bloated and full of duplication anyway.</li>
<li>Object-oriented programming generates a lot of what looks like
work. Back in the days of fanfold, there was a type of
programmer who would only put five or ten lines of code on a
page, preceded by twenty lines of elaborately formatted
comments. Object-oriented programming is like crack for these
people: it lets you incorporate all this scaffolding right into
your source code. Something that a Lisp hacker might handle by
pushing a symbol onto a list becomes a whole file of classes and
methods. So it is a good tool if you want to convince yourself,
or someone else, that you are doing a lot of work.</li>
</ol>
<p class="attribution">—Paul Graham, "Why Arc Isn't Especially Object-Oriented"
<a class="reference external" href="http://www.paulgraham.com/noop.html">http://www.paulgraham.com/noop.html</a></p>
</blockquote>
<p>Let me finish with another favourite quote from Paul Graham, the
author of what has become the Yahoo! Store online system:</p>
<blockquote>
<p>"A new competitor seemed to emerge out of the woodwork every month
or so. The first thing I would do, after checking to see if they
had a live online demo, was look at their job listings. After a
couple years of this I could tell which companies to worry about
and which not to. The more of an IT flavor the job descriptions
had, the less dangerous the company was. The safest kind were the
ones that wanted Oracle experience. You never had to worry about
those. You were also safe if they said they wanted C++ or Java
developers. If they wanted Perl or Python programmers, that would
be a bit frightening-- that's starting to sound like a company
where the technical side, at least, is run by real hackers. If I
had ever seen a job posting looking for Lisp hackers, I would have
been really worried."</p>
<p class="attribution">—Paul Graham, "Beating the Averages"
<a class="reference external" href="http://www.paulgraham.com/avg.html">http://www.paulgraham.com/avg.html</a></p>
</blockquote>
Programming: Engineering or Poetry?2002-10-23T09:00:00+02:002002-10-23T09:00:00+02:00Tibor Šimkotag:tiborsimko.org,2002-10-23:/programming-engineering-or-poetry.html<p>Is programming more like engineering, or is it more like an art?</p>
<!-- PELICAN_END_SUMMARY --><p>I just came across a recent interview with Richard Gabriel for
java.sun.com where he argues precisely against considering programming
as <em>solely</em> an engineering activity, underlines the role of creativity
and diversity, and draws parallels between writing programs and writing
poetry. An excerpt:</p>
<blockquote>
<p>Traditions of computer science and software engineering have tried
to turn all aspects of software creation into a pure engineering
discipline, when they clearly are not. [...] Writing software
should be treated as a creative activity. [...] My view is that we
should train developers the way we train creative people like poets
and artists.</p>
<p class="attribution">—Richard Gabriel, "The Poetry of Programming"
<a class="reference external" href="http://java.sun.com/features/2002/11/gabriel_qa.html">http://java.sun.com/features/2002/11/gabriel_qa.html</a></p>
</blockquote>
<p>Let me give you a chess analogy.</p>
<p>If you learn chess, there are a couple of theoretically known patterns
like the king opposition in K+P endings, good-vs-bad bishop, etc.
This corresponds to the good ol' "engineering knowledge". But you
cannot just go ahead and try to blindly apply these principles in the
real game. You need a lot of creativity to apply them, and even more
so to discover these patterns in the game. The games where players
like Botvinnik apply the "engineering principles" are not devoid of
creativity. They are very creative, fascinating and very pedagogical.
I like them since they give the impression that chess is actually very
simple. But the most spectacular chess games are not these ones. The
most spectacular games are those where players manage to get outside
the mindset of the usual "engineering patterns", get out of the
ordinary, and think different. This is the "artistic part" of chess
playing. It goes hand-in-hand with the "engineering part"; chess
cannot be reduced to either of these aspects.</p>
<p>To get back to Richard Gabriel's analogy, the usual advice to somebody
learning chess is to study a book or two on the theory and patterns,
but not to overdo it: one profits much more by playing a lot and by
studying the games of Grandmasters. Which is exactly what Richard
Gabriel says that people do when studying poetry. The art of chess
looks closer to the art of poetry than to the art of bridge building.
He argues that this is the case for the art of computer programming
too, where the creativity and diversity factor is often overlooked
in favour of the usual engineering mindset. I cannot but subscribe to this
plea for diversity and creativity. That's why I like the freedom and
power of Lisp or Python and why I dislike the "narrow-mindedness" of
Java's point of view. Freedom to think different is the path to
really excellent chess games, excellent poems, and excellent programs!</p>
Programming Language Psychology2002-10-22T09:00:00+02:002002-10-22T09:00:00+02:00Tibor Šimkotag:tiborsimko.org,2002-10-22:/programming-language-psychology.html<p>What role does a programming language play in the mind of the
programmer?</p>
<!-- PELICAN_END_SUMMARY --><div class="section" id="language-for-the-masses">
<h2>Language for the masses?</h2>
<p>Michael Vanier writes:</p>
<blockquote>
<p>Writing Java code, though not particularly painful in the sense that
C is painful (core dumps etc.), puts me to sleep. Writing Ocaml
(which is a "language designed for smart people" if there ever was
one) is exciting. My motivation to tackle the project has tripled
overnight. The interesting question is: why is Ocaml so much more
fun than Java? Why are "languages designed for smart people"
(LFSPs) so much more fun to program in than "languages designed for
the masses" (LFMs)?</p>
<p>One possibility is that LFSPs tend to be more unusual, and hence
are more novel. I'll admit that this is part of the answer, but it
misses the main point. Any new language is going to be novel, but
the novelty usually wears off quickly. The real point is that LFSPs
have a much greater support for abstraction, and in particular for
defining your own abstractions, than LFMs.</p>
<p><a class="reference external" href="http://www.paulgraham.com/vanlfsp.html">http://www.paulgraham.com/vanlfsp.html</a></p>
</blockquote>
</div>
<div class="section" id="hacker-languages-winning">
<h2>Hacker languages winning?</h2>
<p>Paul Graham has an interesting point about languages limiting one's
freedom and how they were proven "historically bad":</p>
<blockquote>
<ol class="arabic simple" start="9">
<li>[Java] is designed for large organizations. Large organizations
have different aims from hackers. They want languages that are
(believed to be) suitable for use by large teams of mediocre
programmers-- languages with features that, like the speed
limiters in U-Haul trucks, prevent fools from doing too much
damage. Hackers don't like a language that talks down to
them. Hackers just want power. Historically, languages designed
for large organizations (PL/I, Ada) have lost, while hacker
languages (C, Perl) have won. The reason: today's teenage hacker
is tomorrow's CTO.</li>
</ol>
<ol class="arabic simple" start="2">
<li>[...] Like the creators of sitcoms or junk food or package tours,
Java's designers were consciously designing a product for people
not as smart as them. Historically, languages designed for other
people to use have been bad: Cobol, PL/I, Pascal, Ada, C++. The
good languages have been those that were designed for their own
creators: C, Perl, Smalltalk, Lisp.</li>
</ol>
<p><a class="reference external" href="http://paulgraham.com/javacover.html">http://paulgraham.com/javacover.html</a></p>
</blockquote>
</div>
<div class="section" id="first-language-matters">
<h2>First language matters</h2>
<p>An oldie-goldie book on this topic is "The Psychology of Computer
Programming" by G. M. Weinberg. It is such a great read.</p>
<blockquote>
<p>"Humanists often contend that machines tend to dehumanize people by
forcing them to have rigid personalities, but really, the contrary
is true. Because the machines are rigid, the people who use them
must -- if they are to be successful -- supply more than their
share of flexibility. Perhaps this is the effect that the
humanists are describing as "dehumanization", for in ordinary human
intercourse, each party gives and takes his share. Relationships
in which one party does all the giving or all the taking are not
fully human, and tend to produce personality distortions in one or
the other.</p>
<p>In making our adjustments to our particular programming language,
we can easily become attached to it simply because we now have so
much invested in it. We often listen to a man complaining about
his nagging, slovenly, and prodigal wife, only to find that when
asked why he doesn't leave her, he replies that he cannot live
without her. Most people would prefer almost any amount of pain to
giving up the familiarity of some constant companion for an unknown
quantity. We see this effect when we try to teach a programmer his
<em>second</em> language. Teaching the first is no great problem, for he
has no investment in any other. By the time he has learned two or
more, he is aware that more things exist in this world than he has
dreamed of. But letting go of the first is, to him, just a promise
of pain with no promise of compensating pleasure.</p>
<p>Perhaps this situation could be improved if we could enunciate and
teach certain principles that are not tied to particular languages,
so that even the beginner would have some less relative measure to
hold up against the language he is learning. But teaching practice
today [i.e. in 1971] in our universities and programming schools
seems to be pointing in exactly the opposite direction. Instead of
trying to teach principles [...] the objective seems to be to get
the student writing some kind of program as soon as possible -- a
not unworthy aim -- but at the expense of limiting the future
growth of the programmer.</p>
<p>To be fair, we should recall our distinction between the
professional and amateur. The schools, it seems, are devoting
themselves primarily to turning out vast quantities of amateurs --
perhaps under the assumption that the professionals can and should
take care of themselves. But when the language designers begin to
believe that the principles underlying the design of an amateur's
language are the same as those upon which a professional's are
based, then we have a trouble." (PCP, p.212)</p>
</blockquote>
<p>Has anything changed since 1971? Sure, in the technology part, e.g. we now
have Java instead of COBOL (reminder: COBOL stands for "Common
Business Oriented Language"), but otherwise very little in the human
part.</p>
<p>People don't change much: that's why Homer is still relevant today,
and the PCP book got published in a silver anniversary edition
in 1998.</p>
<p>As for the computer languages, it's again all about flexibility and
freedom.</p>
</div>
Programming for Reusability: Cheap or Expensive?2002-10-22T09:00:00+02:002002-10-22T09:00:00+02:00Tibor Šimkotag:tiborsimko.org,2002-10-22:/programming-reusability-price.html<p>When programming, one can dash off a task quickly, or one can make
components modular and reusable for the future. The latter obviously
takes more time. But how much? And when is it worth the effort?</p>
<!-- PELICAN_END_SUMMARY --><div class="section" id="mythical-man-month">
<h2>Mythical Man-Month</h2>
<p>Fred Brooks in MMM summarises that reuse is nice in theory, but hard
in practice. The price to pay for "writing for reusability" is about
threefold:</p>
<blockquote>
"Parnas writes: "Reuse is something that is far easier to say than to
do. Doing it requires both good design and very good
documentation. Even when we see good design, which is still
infrequently, we won't see the components reused without good
documentation". Ken Brooks comments on the difficulty of
anticipating <em>which</em> generalization will prove necessary: "I keep
having to bend things even on the fifth use of my own personal
user-interface library." [...] DeMarco says: "I am becoming very
discouraged about the whole reuse phenomenon. There is almost a
total absence of an existence theorem for reuse. Time has
confirmed that there is a big expense in making things reusable."
Yourdon estimates the big expense: "A good rule of thumb is that
such reusable components will take twice the effort of a 'one-shot'
component." I see that expense as exactly the effort of
productizing the component, discussed in Chapter 1. So my estimate
of the effort ratio would be threefold."
(MMM, p.224)</blockquote>
</div>
Programming: Technology or People?2002-10-22T09:00:00+02:002002-10-22T09:00:00+02:00Tibor Šimkotag:tiborsimko.org,2002-10-22:/programming-technology-people.html<p>What matters more in advancing a software project, a sound technology
or a sound team? The team, obviously. But how much?</p>
<!-- PELICAN_END_SUMMARY --><div class="section" id="mythical-man-month">
<h2>Mythical Man-Month</h2>
<p>In his comments in the 20th-anniversary edition of MMM, F. Brooks writes:</p>
<blockquote>
Some readers have found it curious that MMM devotes most of the
essays to the managerial aspects of software engineering, rather
than many technical issues. [...] it sprang from a conviction
that the quality of the people on a project, and their
organisation and management, are much more important factors in
success than are the tools they use or the technical approaches
they take. Subsequent researches have supported that conviction.
Boehm's COCOMO model finds that the quality of the team is by far
the largest factor in its success, indeed four times more potent
than are the tools they use or the technical approaches they
make. [...] (MMM, p.276)</blockquote>
<p>What is the role of the language/technology, then?</p>
<blockquote>
The central question of how to improve the software art centers,
as it always has, on people. We can get good designs by following
good practices instead of poor ones. Good design practices can be
taught. Programmers are among the most intelligent part of the
population, so they can learn good practice. [...] Nevertheless I
do not believe we can make the next step upward in the same way.
Whereas the difference between poor conceptual designs and good
ones may lie in the soundness of design method, the difference
between good designs and great ones surely does not. Great
designs come from great designers. Software construction is a
<em>creative</em> process. Sound methodology can empower and liberate
the creative mind; it cannot enflame or inspire the drudge. The
differences are not minor -- it is rather like Salieri and Mozart.
Study after study shows that the very best designers produce
structures that are faster, smaller, simpler, cleaner, and
produced with less effort. The differences between the great and
the average approach an order of magnitude. (MMM, p.202)</blockquote>
<p>Note the phrase on "sound methodology"; this is the soundness and
freedom that some languages tend to offer and some tend to forbid.</p>
</div>
<div class="section" id="peopleware">
<h2>Peopleware</h2>
<p>T. DeMarco and T. Lister in "Peopleware" look at the topic at length.
For several years they conducted a study of programmer productivity
via "coding war games" and found that there is little correlation with
language, years of experience, number of bugs people made, and salary.
Interestingly enough, the correlation was mainly with the organisation
the people worked at. They discovered a "clustering" effect: two
people from the same organisation tended to perform alike. The best
organisations attracted the best programmers, and vice versa.
Productivity ranged from 1 to 10 across different organisations, so the
effect was not negligible. When you are programming in the large,
it's all about people.</p>
</div>
Common Lisp and Python2002-10-08T09:00:00+02:002002-10-08T09:00:00+02:00Tibor Šimkotag:tiborsimko.org,2002-10-08:/common-lisp-python.html<p>I consider Python to be quite a Lisp-y language, clean and nice for
rapid development, with lots of libraries. One message on
comp.lang.python said "I never understood why LISP was a good idea
until I started playing with python".</p>
<!-- PELICAN_END_SUMMARY --><div class="section" id="peter-norvig">
<h2>Peter Norvig</h2>
<p>A nice comparison of Python and Lisp by Peter Norvig: <a class="reference external" href="http://www.norvig.com/python-lisp.html">Python for Lisp
Programmers</a>.</p>
</div>
<div class="section" id="erann-gat">
<h2>Erann Gat</h2>
<p>Although Python's runtime speed is much slower than that of Lisp or OCaml, one of
its advantages is that it's easier to find Python programmers than
Lisp or ML ones. See the interesting story of the previously quoted Erann
Gat and "how he lost his faith in Lisp" ... in favour of precisely
Python, for some tasks: <a class="reference external" href="http://groups.google.com/groups?selm=gat-1902021257120001%40eglaptop.jpl.nasa.gov">How I lost my faith (very long)</a>.</p>
</div>
<div class="section" id="paul-graham">
<h2>Paul Graham</h2>
<div class="line-block">
<div class="line">Q: "I like Lisp but my company won't let me use it. What should I do?"</div>
<div class="line"><br /></div>
<div class="line">A: "Try to get them to let you use Python. Often when your employer</div>
<div class="line">won't let you use Lisp it's because (whatever the official reason)</div>
<div class="line">the guy in charge of your department is afraid of the way Lisp</div>
<div class="line">source code looks. Python looks like an ordinary dumb language, but</div>
<div class="line">semantically it has a lot in common with Lisp, and has been getting</div>
<div class="line">closer to Lisp over time."</div>
<div class="line"><br /></div>
<div class="line">--Paul Graham</div>
<div class="line"><<a class="reference external" href="http://www.paulgraham.com/faq.html">http://www.paulgraham.com/faq.html</a>></div>
</div>
</div>
<div class="section" id="eric-raymond">
<h2>Eric Raymond</h2>
<div class="line-block">
<div class="line">"LISP is worth learning for a different reason -- the profound</div>
<div class="line">enlightenment experience you will have when you finally get it.</div>
<div class="line">That experience will make you a better programmer for the rest of</div>
<div class="line">your days, even if you never actually use LISP itself a lot."</div>
<div class="line"><br /></div>
<div class="line">--Eric Raymond, "How to become a hacker"</div>
<div class="line"><<a class="reference external" href="http://www.tuxedo.org/~esr/faqs/hacker-howto.html">http://www.tuxedo.org/~esr/faqs/hacker-howto.html</a>></div>
</div>
</div>
<div class="section" id="alan-perlis">
<h2>Alan Perlis</h2>
<div class="line-block">
<div class="line">"A language that doesn't affect the way you think about</div>
<div class="line">programming, is not worth knowing".</div>
<div class="line"><br /></div>
<div class="line">-- Alan Perlis, "Epigrams in Programming"</div>
<div class="line"><<a class="reference external" href="http://www.cs.yale.edu/homes/perlis-alan/quotes.txt">http://www.cs.yale.edu/homes/perlis-alan/quotes.txt</a>></div>
</div>
</div>
Common Lisp Runtime Speed2002-09-17T20:00:00+02:002002-09-17T20:00:00+02:00Tibor Šimkotag:tiborsimko.org,2002-09-17:/common-lisp-runtime-speed.html<p>Common Lisp has got fancy compilers that compile to native code, so it
runs typically ~10 times faster than Python, see e.g. Peter Norvig's
<a class="reference external" href="http://www.norvig.com/python-lisp.html">Python for Lisp Programmers</a>. It's a dynamically typed
language though, so it runs somewhat slower (say 50%-100%) than
statically typed OCaml/C++. Can one optimise things further?</p>
<!-- PELICAN_END_SUMMARY --><p>For example, consider a simple square function:</p>
<div class="highlight"><pre><span></span><span class="p">(</span><span class="nb">defun</span> <span class="nv">square</span> <span class="p">(</span><span class="nv">x</span><span class="p">)</span>
<span class="p">(</span><span class="nb">*</span> <span class="nv">x</span> <span class="nv">x</span><span class="p">))</span>
</pre></div>
<p>you can easily see the assembler code for it:</p>
<div class="highlight"><pre><span></span><span class="p">(</span><span class="nb">compile</span> <span class="ss">'square</span><span class="p">)</span>
<span class="p">(</span><span class="nb">disassemble</span> <span class="ss">'square</span><span class="p">)</span>
</pre></div>
<p>to find out that relatively expensive GENERIC-* stuff is called:</p>
<pre class="literal-block">
480B22B8: .ENTRY SQUARE(x) ; (FUNCTION (T) NUMBER)
2D0: POP DWORD PTR [EBP-8]
2D3: LEA ESP, [EBP-32]
2D6: CMP ECX, 4
2D9: JNE L0
2DB: MOV ESI, EDX
2DD: MOV [EBP-12], ESI ; No-arg-parsing entry point
2E0: MOV EDX, ESI
2E2: MOV EDI, ESI
2E4: CALL #x100002C8 ; GENERIC-*
2E9: MOV ESP, EBX
2EB: MOV ESI, [EBP-12]
2EE: MOV ECX, [EBP-8]
2F1: MOV EAX, [EBP-4]
2F4: ADD ECX, 2
2F7: MOV ESP, EBP
2F9: MOV EBP, EAX
2FB: JMP ECX
2FD: NOP
2FE: NOP
2FF: NOP
300: L0: BREAK 10 ; Error trap
302: BYTE #x02
303: BYTE #x19 ; INVALID-ARGUMENT-COUNT-ERROR
304: BYTE #x4D ; ECX
</pre>
<p>If you measure its performance:</p>
<div class="highlight"><pre><span></span><span class="p">(</span><span class="nb">time</span> <span class="p">(</span><span class="nb">dotimes</span> <span class="p">(</span><span class="nv">i</span> <span class="mi">100000000</span><span class="p">)</span> <span class="p">(</span><span class="nv">square</span> <span class="ss">'1234</span><span class="p">)))</span>
</pre></div>
<p>it takes 1.80 sec. Now if you set some optimization parameters and
declare the argument type:</p>
<div class="highlight"><pre><span></span><span class="p">(</span><span class="nb">defun</span> <span class="nv">square2</span> <span class="p">(</span><span class="nv">x</span><span class="p">)</span>
<span class="p">(</span><span class="k">declare</span> <span class="p">(</span><span class="k">optimize</span> <span class="p">(</span><span class="nv">speed</span> <span class="mi">3</span><span class="p">)</span> <span class="p">(</span><span class="nv">safety</span> <span class="mi">0</span><span class="p">)</span> <span class="p">(</span><span class="nv">debug</span> <span class="mi">0</span><span class="p">))</span>
<span class="p">(</span><span class="k">type</span> <span class="p">(</span><span class="kt">unsigned-byte</span> <span class="mi">32</span><span class="p">)</span> <span class="nv">x</span><span class="p">))</span>
<span class="p">(</span><span class="nb">*</span> <span class="nv">x</span> <span class="nv">x</span><span class="p">))</span>
</pre></div>
<p>you would get these hints from the compiler:</p>
<pre class="literal-block">
Note: Unable to recode as shift and add due to type uncertainty:
The result is a (MOD 18446744065119617026), not a (UNSIGNED-BYTE 32).
Note: Forced to do GENERIC-* (cost 30).
Unable to do inline fixnum arithmetic (cost 4) because:
The first argument is a (UNSIGNED-BYTE 32), not a FIXNUM.
The second argument is a (UNSIGNED-BYTE 32), not a FIXNUM.
The result is a (MOD 18446744065119617026), not a FIXNUM.
Unable to do inline (signed-byte 32) arithmetic (cost 5) because:
The first argument is a (UNSIGNED-BYTE 32), not a (SIGNED-BYTE 32).
The second argument is a (UNSIGNED-BYTE 32), not a (SIGNED-BYTE 32).
The result is a (MOD 18446744065119617026), not a (SIGNED-BYTE 32).
etc.
</pre>
<p>and if you take these hints into account, you could rewrite your
function as, for example:</p>
<div class="highlight"><pre><span></span><span class="p">(</span><span class="nb">defun</span> <span class="nv">square3</span> <span class="p">(</span><span class="nv">x</span><span class="p">)</span>
<span class="p">(</span><span class="k">declare</span> <span class="p">(</span><span class="k">optimize</span> <span class="p">(</span><span class="nv">speed</span> <span class="mi">3</span><span class="p">)</span> <span class="p">(</span><span class="nv">safety</span> <span class="mi">0</span><span class="p">)</span> <span class="p">(</span><span class="nv">debug</span> <span class="mi">0</span><span class="p">))</span>
<span class="p">(</span><span class="nb">values</span> <span class="p">(</span><span class="kt">unsigned-byte</span> <span class="mi">32</span><span class="p">))</span>
<span class="p">(</span><span class="k">type</span> <span class="p">(</span><span class="kt">unsigned-byte</span> <span class="mi">32</span><span class="p">)</span> <span class="nv">x</span><span class="p">))</span>
<span class="p">(</span><span class="nb">*</span> <span class="nv">x</span> <span class="nv">x</span><span class="p">))</span>
</pre></div>
<p>and you would get down to 1.28 sec (compared to 1.80 sec), and the
resulting assembly code would become:</p>
<pre class="literal-block">
(disassemble 'square3)
4811C888: .ENTRY SQUARE3() ; FUNCTION
8A0: POP DWORD PTR [EBP-8]
8A3: LEA ESP, [EBP-32]
8A6: MOV EAX, EDX
8A8: TEST AL, 3
8AA: JEQ L0
8AC: MOV EAX, [EAX-3]
8AF: JMP L1
8B1: L0: SAR EAX, 2
8B4: L1: MOV ECX, EAX
8B6: MOV EAX, ECX ; No-arg-parsing entry point
8B8: MUL EAX, ECX
8BA: MOV ECX, EAX
8BC: TEST ECX, 3758096384
8C2: JNE L3
8C4: LEA EDX, [ECX*4]
8CB: L2: MOV ECX, [EBP-8]
8CE: MOV EAX, [EBP-4]
8D1: ADD ECX, 2
8D4: MOV ESP, EBP
8D6: MOV EBP, EAX
8D8: JMP ECX
8DA: NOP
8DB: NOP
8DC: NOP
8DD: NOP
8DE: NOP
8DF: NOP
8E0: L3: JNS L4
8E2: MOV EDX, 522
8E7: JMP L5
8E9: L4: MOV EDX, 266
8EE: L5: MOV BYTE PTR [#x280001D4], 0 ; COMMON-LISP::*PSEUDO-ATOMIC-INTERRUPTED*
8F5: MOV BYTE PTR [#x280001BC], 4 ; COMMON-LISP::*PSEUDO-ATOMIC-ATOMIC*
8FC: MOV EAX, 16
901: ADD EAX, [#x806A404] ; current_region_free_pointer
907: CMP EAX, [#x806A3D8] ; current_region_end_addr
90D: JBE L6
90F: CALL #x8053358 ; alloc_overflow_eax
914: L6: XCHG EAX, [#x806A404] ; current_region_free_pointer
91A: MOV [EAX], EDX
91C: LEA EDX, [EAX+7]
91F: MOV [EDX-3], ECX
922: MOV BYTE PTR [#x280001BC], 0 ; COMMON-LISP::*PSEUDO-ATOMIC-ATOMIC*
929: CMP BYTE PTR [#x280001D4], 0 ; COMMON-LISP::*PSEUDO-ATOMIC-INTERRUPTED*
930: JEQ L7
932: BREAK 9 ; Pending interrupt trap
934: L7: JMP L2
</pre>
<p>which you could optimize further by inlining assembler or similar, if
you wish. (Of course, nothing like this is possible in Python, but
it is quite possible with OCaml/C/C++.)</p>
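<p>The numbers in the compiler notes and in the disassembly can be
cross-checked with a little arithmetic. The following is a minimal
sketch that uses Python purely as a calculator. It assumes the usual
CMUCL x86 representation, in which a fixnum is stored shifted left by
two bits with both low tag bits clear; the function names are made up
for illustration and are not part of any Lisp implementation.</p>

```python
# Illustrative arithmetic only. The constants are the ones visible in the
# compiler notes and in the disassembly of square3; the helper names are
# hypothetical.

UB32_MAX = 2**32 - 1  # largest value of type (unsigned-byte 32)


def product_type_bound():
    """Exclusive upper bound of x*x for x in (unsigned-byte 32).

    This is the N in the compiler's "(MOD N)" note.
    """
    return UB32_MAX * UB32_MAX + 1


# Assumed tagging scheme: value in the upper 30 bits, two low tag bits clear.
FIXNUM_TAG_MASK = 3                # TEST AL, 3
FIXNUM_OVERFLOW_MASK = 3758096384  # TEST ECX, 3758096384 (top three bits)


def tag_fixnum(value):
    """Box a small non-negative integer as a tagged word: LEA EDX, [ECX*4]."""
    return value * 4


def untag_fixnum(word):
    """Recover the integer from a tagged word: SAR EAX, 2."""
    assert word & FIXNUM_TAG_MASK == 0, "not a fixnum under the assumed scheme"
    return word >> 2


def fits_in_fixnum(value):
    """True if a non-negative 32-bit result can be re-tagged without a bignum."""
    return value & FIXNUM_OVERFLOW_MASK == 0


if __name__ == "__main__":
    print(product_type_bound())
    print(hex(FIXNUM_OVERFLOW_MASK))
    print(fits_in_fixnum(1234 * 1234))
```

<p>Under these assumptions, the product of two 32-bit values does not fit
in 32 bits in general, which is why the compiler falls back to
<tt class="docutils literal">GENERIC-*</tt> unless the result type is declared as well, and why
the tail of the <tt class="docutils literal">square3</tt> listing still contains an allocation path for
results too large for a fixnum.</p>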
<p>If one prototypes in Lisp, one does not have to worry about speed
the way one might with Python.</p>
Apache Watchdog1999-08-13T12:00:00+02:001999-08-13T12:00:00+02:00Tibor Šimkotag:tiborsimko.org,1999-08-13:/apache-watchdog.html<p>An example of how to fork Unix processes and set the process group in order
to write a simple Apache watchdog that connects to a web page and
restarts the web server if the web page does not respond
within a given time limit.</p>
<!-- PELICAN_END_SUMMARY --><div class="highlight"><pre><span></span><span class="ch">#! /usr/local/bin/perl -w</span>
<span class="c1">## name: watchdog_httpd</span>
<span class="c1">## author: tibor.simko@cern.ch</span>
<span class="c1">## revision: 19990813</span>
<span class="c1">## description: checks if webserver is running okay by downloading a page</span>
<span class="c1">## (if no response from the server is received in X secs => restart it)</span>
<span class="c1">## note: to be called from root's crontab every 30 minutes or so</span>
<span class="c1">## -- configuration section starts here</span>
<span class="nv">$httpd_test</span><span class="o">=</span><span class="s">"/soft/bin/wget -q --spider http://example.org/"</span><span class="p">;</span>
<span class="nv">$httpd_test_timeout</span><span class="o">=</span><span class="mi">6</span><span class="p">;</span> <span class="c1"># in seconds</span>
<span class="nv">$httpd_restart</span><span class="o">=</span><span class="s">"/soft/bin/apachectl restart"</span><span class="p">;</span>
<span class="c1">## -- configuration section ends here</span>
<span class="c1">###############################################</span>
<span class="c1">### do not change anything below this line ###</span>
<span class="c1">###############################################</span>
<span class="c1"># scalar localtime replaces the obsolete ctime.pl; it has no trailing newline to chop</span>
<span class="nv">$now</span> <span class="o">=</span> <span class="nb">scalar</span> <span class="nb">localtime</span><span class="p">;</span>
<span class="nb">umask</span> <span class="mo">022</span><span class="p">;</span>
<span class="k">unless</span> <span class="p">(</span><span class="nv">$pid</span> <span class="o">=</span> <span class="nb">open</span> <span class="n">CMD</span><span class="p">,</span> <span class="s">"-|"</span><span class="p">)</span> <span class="p">{</span>
<span class="nb">defined</span> <span class="nv">$pid</span> <span class="o">||</span> <span class="nb">die</span> <span class="s">"E: fork failed $!"</span><span class="p">;</span>
<span class="nb">setpgrp</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="vg">$$</span><span class="p">)</span> <span class="o">||</span> <span class="nb">die</span> <span class="s">"E: set process group failed $!"</span><span class="p">;</span>
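<span class="c1"># "or" binds more loosely than exec's argument list, so die runs if exec fails</span>
<span class="nb">exec</span> <span class="nv">$httpd_test</span> <span class="ow">or</span> <span class="nb">die</span> <span class="s">"E: cannot execute httpd test query $!"</span><span class="p">;</span>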
<span class="p">}</span>
<span class="k">if</span> <span class="p">(</span><span class="o">&</span><span class="n">timed_out</span><span class="p">(</span><span class="nv">$httpd_test_timeout</span><span class="p">,</span> <span class="o">\*</span><span class="n">CMD</span><span class="p">))</span> <span class="p">{</span>
<span class="nb">kill</span> <span class="o">-</span><span class="mi">1</span> <span class="o">=></span> <span class="nv">$pid</span><span class="p">;</span>
<span class="nb">warn</span> <span class="s">"W: test query timed-out, restarting httpd ("</span><span class="p">,</span> <span class="nv">$now</span><span class="p">,</span> <span class="s">")\n"</span><span class="p">;</span>
<span class="nb">system</span> <span class="nv">$httpd_restart</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">else</span> <span class="p">{</span>
<span class="c1"># check the exit status only if the query did not time out, to avoid restarting twice</span>
<span class="nb">close</span> <span class="n">CMD</span><span class="p">;</span>
<span class="k">if</span> <span class="p">(</span><span class="vg">$?</span><span class="p">)</span> <span class="p">{</span>
<span class="nb">warn</span> <span class="s">"W: test query failed, restarting httpd ("</span><span class="p">,</span> <span class="nv">$now</span><span class="p">,</span> <span class="s">")\n"</span><span class="p">;</span>
<span class="nb">system</span> <span class="nv">$httpd_restart</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="c1"># some subroutines:</span>
<span class="k">sub</span> <span class="nf">timed_out</span> <span class="p">{</span>
<span class="k">my</span><span class="p">(</span><span class="nv">$timeout_secs</span><span class="p">,</span> <span class="nv">$handle</span><span class="p">)</span> <span class="o">=</span> <span class="nv">@_</span><span class="p">;</span>
<span class="k">my</span> <span class="nv">$rin</span><span class="p">;</span>
<span class="nb">vec</span><span class="p">(</span><span class="nv">$rin</span><span class="o">=</span><span class="s">''</span><span class="p">,</span> <span class="nb">fileno</span><span class="p">(</span><span class="nv">$handle</span><span class="p">),</span> <span class="mi">1</span><span class="p">)</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>
<span class="k">if</span> <span class="p">(</span><span class="nb">select</span> <span class="nv">$rin</span><span class="p">,</span> <span class="nb">undef</span><span class="p">,</span> <span class="nb">undef</span><span class="p">,</span> <span class="nv">$timeout_secs</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span> <span class="s">''</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">else</span> <span class="p">{</span>
<span class="k">return</span> <span class="mi">1</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
</pre></div>
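<p>For comparison, the same idea (run the probe in its own process
group, wait with a time limit, kill the whole group on timeout) can be
sketched in Python. This is only an illustration under stated
assumptions: the function name <tt class="docutils literal">probe</tt> and the commands passed to it are
hypothetical, and <tt class="docutils literal">start_new_session=True</tt> calls <tt class="docutils literal">setsid()</tt> rather than
<tt class="docutils literal">setpgrp()</tt> as the Perl script does, which likewise makes the child the
leader of a fresh process group. It needs a POSIX system.</p>

```python
import os
import signal
import subprocess
import sys


def probe(command, timeout_secs):
    """Run command in its own process group and classify the outcome.

    Returns "ok" for exit status 0, "failed" for a non-zero exit status and
    "timed-out" if the command had to be killed after timeout_secs seconds.
    """
    # The child becomes a session and process group leader, so its
    # process group id equals its pid.
    child = subprocess.Popen(command, start_new_session=True)
    try:
        status = child.wait(timeout=timeout_secs)
    except subprocess.TimeoutExpired:
        # Kill the whole group so that helpers spawned by the probe die too.
        os.killpg(child.pid, signal.SIGKILL)
        child.wait()  # reap the child to avoid leaving a zombie
        return "timed-out"
    return "ok" if status == 0 else "failed"


if __name__ == "__main__":
    # Hypothetical stand-in for the wget test query of the Perl script.
    slow = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(probe(slow, 1))
```

<p>A caller would restart the web server whenever the result is anything
other than <tt class="docutils literal">"ok"</tt>, which keeps the timeout case and the failure case
from triggering two restarts in a row.</p>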
Vatikán a Slovenská Republika 1942-19441998-08-03T00:00:00+02:001998-08-03T00:00:00+02:00Tibor Šimkotag:tiborsimko.org,1998-08-03:/vatikan-a-slovenska-republika-1942-1944.html<p>The Slovak Jewish community was heavily decimated in the Shoah.
During 1942-1944, about 70,000 Jews were deported by Slovak and German
authorities; fewer than 5,000 returned. The following text was written
for the <a class="reference external" href="http://www.angelfire.com/hi/xcampaign/">Poznanie-Knowledge Campaign</a>, aiming to provide a
transcription of historical documents offering an outlook on the role
played by the highest Slovak state authorities.</p>
<!-- PELICAN_END_SUMMARY -->
<p><center>* * *</center><p>
<center>
<font size=+2>
<b> THE VATICAN AND THE SLOVAK REPUBLIC (1942-1944) </b>
</font>
<br>
<font size=+1>
<b> <i> (A selection of documents) </i> </b>
</font>
</center>
<p><p><i> The following partial selection from the documents [<a
href="#vasr">VASR</a>] presents the correspondence of the Vatican chargé
d'affaires in Slovakia, Giuseppe Burzio, with his superiors
in the Vatican during the years 1942-44. Our selection concerns specifically
the attitude of the highest representatives of the Slovak State towards the
deportations, both those carried out and those not carried out, and aims to
document the unequivocally dismissive stance of the Slovak State towards the
mitigating efforts of the Vatican. We believe that in the following period
materials the reader will find answers to the typical arguments of the
defenders of Tiso's regime, which claim for example (i) that the deportations
were a "necessary wartime evil" and that the Slovak State acted solely under
pressure from Germany and not of its own will (as a counter-argument see, among others, <a
href="#answer1">report no. 1558/43</a>); or (ii) that President Tiso
inwardly disagreed with the deportations and tried in every way to mitigate
their impact (as a counter-argument see, among others, <a href="#answer2">telegram no. 103</a>).
We note that the documents presented here are exclusively
period correspondence of Catholic diplomats.</i>
<p>
<a name="obsah"><h3>Contents</h3></a>
<a href="#1942">I. The period of the first wave of deportations (1942)</a><br>
<a href="#1943">II. The period of prepared but unrealised deportations (1943)</a><br>
<a href="#1944">III. The period of the second wave of deportations (1944)</a><br>
<a href="#pramene">IV. Sources</a>
<p>
<a name="1942">
<h3>I. The period of the first wave of deportations (1942)</h3>
</a>
Telegram from Msgr Burzio (chargé d'affaires of the Vatican in Bratislava)
to Cardinal Maglione (Vatican Secretary of State):
<p>Telegram no. 19<br>
<div align=right>Bratislava, 9 March 1942, 10h40<br>
delivered 9 March at 20h00</div>
<p> News has leaked out that a mass deportation of all Slovak Jews
to Galicia and to the Lublin region is imminent,
regardless of age, sex or religion, and it is expected that
men, women and children will be deported separately. The first contingent
would leave (?) as early as next month.
<p> I am assured that this cruel plan is the work of the Prime Minister,
Mr Tuka, in agreement with the Minister of the Interior and without pressure
from the German side, which even demands from Slovakia 500 marks and
two weeks' provisions for each deportee.
<p> On Saturday I visited the Prime Minister, who confirmed the above
news to me, vehemently defended the legitimacy of the measure and dared
to say (he, who so parades his Catholicism) that he
sees nothing inhumane or anti-democratic in it.
<p> The deportation of 80,000 persons to Poland, at the mercy of the Germans,
is tantamount to condemning a large part of them to certain death.
<p> (<a href="#actes">Actes</a>, vol. VIII, no. 298, p. 453 [<a href="#aes">AES</a> 2141/41]). <a href="#vasr">VASR</a>, p. 79.
<p><center>* * *</center><p>
Telegram from Msgr Burzio to Cardinal Maglione:
<p>Telegram no. 21<br>
<div align=right> Bratislava, 24 March 1942, 20h40<br>
received 25 March, 07h55</div>
<p> Rumours are spreading that the government here has halted the planned
deportation of the Jews following the intervention of the Holy See. Last night,
however, many Jewish women aged 16 to 25 were torn from their families, and it
is believed that they are destined for prostitution in the German rear on the Russian front.
<p> (<a href="#actes">Actes</a>, vol. VIII, no. 324, p. 476 [<a href="#aes">AES</a> 2553/42]). <a href="#vasr">VASR</a>, p. 88.
<p><center>* * *</center><p>
Telegram from Msgr Burzio to Cardinal Maglione:
<p>Telegram no. 22<br>
<div align=right> Bratislava, 25 March 1942, 18h10<br>
received 26 March, 09h30</div>
<p> Contrary to the rumours that spread yesterday, the government has not
renounced its inhuman intention, and ten thousand men and women are currently
being concentrated as the first contingent. Further transports will be
prepared in stages, up to total deportation. The above facts
were told (?) to me at the ministry.
<p> Note by Msgr Tardini (papal under-secretary) on an attached sheet
of paper: 27 March 1942. Eae. Telegraph Msgr Burzio, in reply
to his telegrams, to inform him of the steps that have been taken here and
to instruct him to intervene personally with Tiso. (I do not know whether
interventions can stop... madmen! And there are two madmen: Tuka, who acts,
and Tiso, a priest, who lets him act!)
<p> (<a href="#actes">Actes</a>, vol. VIII, no. 326, pp. 478-9 [<a href="#aes">AES</a> 2388/42]). <a href="#vasr">VASR</a>, p. 91.
<p><center>* * *</center><p>
Notes by Msgr Tardini (Vatican Secretary of State):
<div align=right>Vatican, 13 July 1942</div>
<p>[...] It is a misfortune that the President of Slovakia is a priest. Everyone
understands that the Holy See cannot stop Hitler. But who will understand that
it cannot keep a priest in check?
<p> (<a href="#actes">Actes</a>, vol. VIII, no. 426, pp. 597-8 [<a href="#aes">AES</a> 5085/42 ms]). <a href="#vasr">VASR</a>, p. 117.
<p>
<a name="1943">
<h3>II. The period of prepared but unrealised deportations (1943)</h3>
</a>
<i>Let us first recall the Slovak State's preparations for a new wave of
deportations in spring 1943 with a short quotation from a speech by A. Mach at the congress
of the Hlinka Guard:
<blockquote> One of our first duties, now that we have
removed 80% of the Jews, will be to deal with the remaining ones as well... March
will come, April will come, and the transports will go. (Gardista, 9 February 1943).
</blockquote>
Concrete preparations for the deportations began in February 1943 (SNA.MV,
166/1942, 14.odd.). <a href="#vasr">VASR</a>, p. 120. </i>
<p><center>* * *</center><p>
Telegram from Msgr Burzio to Cardinal Maglione:
<p>Telegram no. 34 <br> <div align=right> Bratislava, 11 March
1943, 11h10 <br> received 20h15 </div>
<p>
[...] The deportation of the last twenty thousand Jews who have remained
(in Slovakia?) is very probable, though apparently not in the
near future; it is not possible to obtain precise information from government
circles, which are very reserved and answer evasively.
<p>
(<a href="#actes">Actes</a>, vol. IX, no. 89, p. 181 [<a href="#aes">AES</a> 1596/43]). <a href="#vasr">VASR</a>, p. 128.
<p><center>* * *</center><p>
<a name="answer1">
Report from Msgr Burzio to Cardinal Maglione:
</a>
<p>Report no. 1558/43 <br> <div align=right> Bratislava,
10 April 1943</div>
<p>
[...] I take the liberty of informing Your Most Reverend Eminence
that the feared new deportations of Jews from Slovakia have not yet
begun, but the danger has not yet passed; on the contrary, it
seems to be only a question of time and means. [...]
[of his intervention with Tuka on 7 April 1943 he further writes:] There is
nothing more repugnant and humiliating than a conversation with this
person [...] He said [...] "You mentioned the judgment of history: when
history one day speaks of today's Slovakia, it will mention that at the head
of the government stood an honest and courageous man who had the strength to free his
homeland from the greatest scourge" [...] I put one last question to him:
"May I at least inform the Holy See that the deportations of Jews
from Slovakia are not taking place on the initiative of the Slovak government but under
external pressure, since this is the general opinion, indeed
the conviction?". "I assure you on my Christian honour that it is
our will and our initiative". [...] He further sought
to stress his firm conviction that forcible mass
deportation is the only means of freeing Slovakia from
the "Jewish plague". "Prison is not enough, prison reforms no one,
believe me, I have nine years' experience of it". Unwittingly,
Mr Tuka uttered the most convincing truth and the only sincere thing in the whole
conversation. [...]
<p> The President of the Republic [...] had me summoned after he had
been informed of the conversation, and expressed to me his regret at
the conduct and the reply of the Minister of Foreign Affairs. He also made
a certain confidential statement, but urgently asked me not to
report it in writing, only orally if at all. [...]
<p>
(<a href="#actes">Actes</a>, vol. IX, no. 147, pp. 245-51 [<a href="#aes">AES</a> 3084/43 orig]). <a href="#vasr">VASR</a>, p. 137.
<p><center>* * *</center><p>
Telegram from Msgr Burzio to Cardinal Maglione:
<p>Telegram no. 38<br><div align=right> Bratislava, 4 June
1943, 15h00 <br> received 5 June, 11h15 </div>
<p>
[...] The removal of the Jews is still suspended. The Ministry
of the Interior informs me that the resettlement in question is only at the
preparatory stage [...]
<p>
(<a href="#actes">Actes</a>, vol. IX, no. 217, p. 329 [<a href="#aes">AES</a> 3697/43]). <a href="#vasr">VASR</a>, p. 151.
<p>
<a name="1944">
<h3>III. The period of the second wave of deportations (1944)</h3>
</a>
Telegram from Msgr Burzio to the Secretariat of State:
<p>Telegram no. 98<br> <div align=right> Bratislava,
15 September 1944, 09h00 <br> delivered 21h30 </div>
<p>
I report the following: After the arrival of the occupation troops,
the Gestapo began carrying out mass arrests of Jews, particularly in
localities taken back from the partisans. No raids have yet
taken place in Bratislava, but it is feared that this may happen in
the coming days. [...]
<p>
(<a href="#actes">Actes</a>, vol. X, no. 324, pp. 418-9 [<a href="#aes">AES</a> 5881/44]). <a href="#vasr">VASR</a>, p. 191.
<p><center>* * *</center><p>
<a name="answer2">
Telegram from Msgr Burzio to the Secretariat of State:
</a>
<p>Telegram no. 103 <br> <div align=right> Bratislava,
6 October 1944, 10h30 <br> delivered 18h00 </div>
<p>
[...] On the twenty-second I intervened with the government here and
on 24 September with the President of the Republic [...] in Bratislava the
feared arrests took place in the night leading to 29 September; about 2000
Jews were maltreated [...] That same day I again visited the President
of the Republic and tried to get him to intervene at least on behalf of
the baptised... I found in him no understanding and not
a single word of compassion for the persecuted: he sees in the Jews the cause
of every evil and defends the measures of the Germans against the Jews as dictated
by the highest interests of war [...]
<p>
(<a href="#actes">Actes</a>, vol. X, no. 341, p. 433 [<a href="#aes">AES</a> 6524/44]). <a href="#vasr">VASR</a>, p. 196.
<p><center>* * *</center><p>
Telegram from Msgr Burzio to the Secretariat of State:
<p>Telegram no. 106 <br> <div align=right> Bratislava,
26 October 1944, 08h25 <br> delivered 26 October 1944, 20h00
</div>
<p>
[...] The steps taken to save the Jews from deportation have remained without
effect; the deportation is in full swing and the hunt for hidden Jews continues.
As a result of the occupation, even the remnants of Slovak independence have vanished.
The government and the President servilely carry out the orders of the occupation authorities.
Good Catholics are disgusted by the attitude of the President of the Republic and
ask themselves what he is still waiting for, why he does not resign already [...]
<p>
(<a href="#actes">Actes</a>, vol. X, no. 377, p. 461 [<a href="#aes">AES</a> 6992/44]). <a href="#vasr">VASR</a>, p. 202.
<p><center>* * *</center><p>
Telegram from Msgr Tardini to Bernardini, the nuncio in Bern:
<p>Tel. no. 717<br><div align=right> Vatican, 28 October 1944
</div>
<p>
Please communicate the following to the Apostolic Nunciature in Slovakia:
[...] Let Your Lordship go at once to President
Tiso, inform him of the Holy Father's deep sorrow at the suffering
to which so many persons are exposed in that nation, contrary to the
principles of humanity and justice, on account of their nationality
or race, and in the name of the Holy Father call upon him to bring his sentiments and
intentions into harmony with his priestly dignity and conscience.
At the same time, point out to him that the wrongs committed under his government harm
the prestige of his country, and that adversaries exploit them to
discredit the clergy and the Church throughout the world.
<p>
(<a href="#actes">Actes</a>, vol. X, no. 378, pp. 461-2 [<a href="#aes">AES</a> 6992/44, autograph]). <a href="#vasr">VASR</a>, p. 203.
<p><center>* * *</center><p>
Tiso to Pope Pius XII:
<div align=right> Bratislava, 8 November 1944 </div>
<p>
With filial devotion and sincere gratitude I received
the message of the Holy See as a sign of the Holy Father's paternal
care for our Slovak nation, which we have felt so many
times already.
<p> Guided by this sentiment and also mindful of my priestly state,
I report in all humility:
<p> Propaganda hostile to us exaggerates the rumours of
cruel measures by the government of the Slovak Republic, contrary to
the principles of humanity and justice, against persons on account of their
nationality and race.
<p> The truth, on the contrary, is that in the course of the five years of
our independence not a single death sentence has been passed.
The political changes of 6 October 1938 as well as of 14 March 1939,
and also the government's latest incriminated measures, were carried out
without a single drop of blood being shed. That the government sent home
the Czechs who were superfluous in Slovakia, and that it released the Jews for
work in Germany, where it also sent a large number of Slovaks to work,
cannot be held against the government.
<p> The government did not carry out the incriminated actions against Czechs and Jews
because of their national or racial affiliation, but out of the duty
to defend its nation against enemies who for centuries had acted
perniciously in its midst, and who are, moreover, not few in number and are
well, indeed very well, off. By this the government has shown that
in its measures it did not stray from the path of vigilant exercise of
legitimate care for the defence and safeguarding of the national,
social and cultural existence of its nation.
<p> It should also be noted that the Czechs and Jews, who fared well
during the five years of the Slovak Republic's existence,
at the end of August this year openly joined hostile
parachutists of various nationalities who had been dropped onto Slovakia from
the air, and began an open revolt against the Slovak Republic.
Small Slovakia, attacked unexpectedly and unjustly and unable
to defend itself alone, asked its protector, the government of the
German Reich, for help. Since then, therefore, the actions in Slovakia have had
a military, wartime character, taking place outside the sphere of power
of the Slovak government and likewise outside its responsibility. This is shown by
the wording of the note verbale sent by the Slovak government to the German government
concerning the actions.
<p> Our guilt lies in our gratitude and loyalty towards the Germans,
who not only recognised and approved the existence of our nation and
its natural right to independence and national freedom, but
also helped it against the Czechs and Jews, the enemies of our nation.
We are, however, quite certain that this "guilt" is, in the eyes of Catholics,
our greatest honour.
<p> I always strive for sentiments and views in keeping with priestly
dignity and conscience, because not only I but also the other
Slovak priests see in political service a special form of
pastoral care, whose usefulness is best shown by
the state of the Church in Slovakia.
<p> The enemies' effort to fabricate from our activity a cause
for disparaging the honour of the clergy and the honour of the Church before the world is thoroughly
pharisaical. The greatest adornment of the loving Mother Church is that she
herself entrusts her priests with serving small nations; thus
the Church herself esteems small nations and does not leave them at the mercy of
ravening wolves. A priest who is the protector and labourer of his people certainly
stands in the way of those who would like to devour and exploit small nations.
<p> Holy Father! With filial respect and with the deepest devotion
to the Holy Roman Church I declare: The honour and good name of the clergy and
of the loving Mother Church is at the same time the highest honour of myself and
of the Slovak nation. We shall remain faithful to our motto, "For God, for
the Nation", so that in the eyes of the Holy Father we may always be worthy of his
kindness and paternal love.
<p> I remain, with the kiss of St Peter, your most humble son, Dr. Jozef Tiso,
priest.
<p>
(<a href="#actes">Actes</a>, vol. X, no. 389, pp. 475-8 [<a href="#aes">AES</a> 8674/44, autograph, 7281/44]).
<a href="#vasr">VASR</a>, p. 207.
<p><center>* * *</center><p>
Telegram from Msgr Burzio to Msgr Tardini:
<p>Telegram no. 124 <br> <div align=right> Bratislava,
11 December 1944, 09h30 <br> delivered 12 December 1944, 12h45 </div>
<p>
It is almost certain that, should the front approach Bratislava,
the President of the Republic and the government will take refuge in Germany [...]. It is
probable that they will call upon me and press me to
follow the government... I ask Your Most Reverend Excellency to
give me instructions in this matter, and I ask for a statement as to whether I may
legitimately resist such calls [...]
<p>
(<a href="#actes">Actes</a>, vol. XI, no. 462, p. 642 [<a href="#aes">AES</a> 8971/44]). <a href="#vasr">VASR</a>, p. 217.
<p><center>* * *</center><p>
Telegram from Msgr Tardini to Msgr Burzio:
<p>Telegram no. 118 <br> <div align=right> Vatican,
16 December 1944 </div>
<p>
[...] All things considered, let Your Most Reverend Lordship
do everything possible to be able to remain there. [...] Should it be necessary,
Your Lordship may make it known, in whatever manner you
consider most suitable, that you see no reason why you should
follow a government that did not wish to recognise the papal representative
and on several occasions did not heed the advice of the Holy See. [...]
<p>
(<a href="#actes">Actes</a>, vol. XI, no. 470, pp. 649-50 [<a href="#aes">AES</a> 8971/44]). <a href="#vasr">VASR</a>, p. 219.
<p>
<a name="pramene">
<h3>IV. Sources</h3>
</a>
<dl>
<dt>[<a name="actes">Actes</a>]
<dd>"Actes et documents du Saint Siege relatifs a la Seconde
guerre mondiale 1939-1945".
Ed: P.Blet, R.A.Graham, A.Martini, B.Schneider.
Libreria Editrice Vaticana, Vatican City, 1965-81. (11 volumes).
<p>
<dt>[<a name="aes">AES</a>]
<dd>"Archivio della Congregazione degli Affari Ecclesiastici
Straordinari".
<p>
<dt>[<a name="vasr">VASR</a>]
<dd>"Vatikan a Slovenska Republika (1939-1945). Dokumenty."
Ed: I.Kamenec, V.Precan, S.Skovranek.
Slovak Academic Press, Bratislava, 1992.
</dl>