John Banana Qwerty

Monday, August 30, 2010

A simple distributed lock with memcached... in Python

Following Sylvain's advice I just wrote a memcache lock in Python:

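import time

# getClient() is assumed to return a shared memcached client, defined elsewhere.

class Lock(object):
    def __init__(self, key):
        self.key = "lock.%s" % key

    def __enter__(self):
        # add() only succeeds if the key does not exist yet, so it works as
        # an atomic test-and-set; the third argument is the expiry time.
        while True:
            if getClient().add(self.key, "1", 60000):
                return
            time.sleep(.1)

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Release the lock, then let any exception propagate.
        getClient().delete(self.key)
        return False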

Using the lock is as simple as doing:

with Lock("critical1"):
   doSomeExpensiveStuff()

The advantage over Sylvain's Java version is that the locking code is reusable, thanks to the use of a Python context manager.
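To try the idea without a memcached server, here is a minimal sketch. `FakeClient` is a hypothetical in-memory stand-in that only mimics the `add`/`delete` behaviour the lock relies on (`add` fails when the key already exists). The `timeout` and `poll` parameters and the `LockTimeout` exception are additions for illustration, not part of the code above.

```python
import threading
import time


class FakeClient:
    """Hypothetical in-memory stand-in for a memcached client (assumption)."""

    def __init__(self):
        self._data = {}
        self._mutex = threading.Lock()

    def add(self, key, value, expiry=0):
        # Like memcached's add: store only if the key is absent.
        # The expiry is ignored in this sketch.
        with self._mutex:
            if key in self._data:
                return False
            self._data[key] = value
            return True

    def delete(self, key):
        with self._mutex:
            return self._data.pop(key, None) is not None


class LockTimeout(Exception):
    """Raised when the lock cannot be acquired in time (hypothetical name)."""


class TimedLock:
    def __init__(self, client, key, timeout=5.0, poll=0.1):
        self.client = client
        self.key = "lock.%s" % key
        self.timeout = timeout
        self.poll = poll

    def __enter__(self):
        deadline = time.monotonic() + self.timeout
        while not self.client.add(self.key, "1", 60):
            if time.monotonic() >= deadline:
                raise LockTimeout(self.key)
            time.sleep(self.poll)
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.client.delete(self.key)
        return False


client = FakeClient()
with TimedLock(client, "critical1"):
    # While the lock is held, a second attempt on the same key fails.
    second_attempt = client.add("lock.critical1", "1", 60)
# Once the block exits, the key is free again.
released = client.add("lock.critical1", "1", 60)
print(second_attempt, released)
```

Giving up after a timeout is a design choice: the original loop waits forever, which is simpler but can hang a caller if the lock holder never releases the key.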

Monday, September 15, 2008

An unusual encounter

On the road from the ruins of Wilcahuaín to Huaraz, the capital of Andean mountaineering, I had an unusual encounter. What some privileged people see as an agricultural anachronism is, for the majority, a means of subsistence. The video speaks for itself, so have a look.

Thursday, June 26, 2008

Public Service 2.0

I was pleasantly surprised to receive an SMS from the Mairie de Toulouse (the city hall) telling me that my passport was ready. Now that's public service 2.0! Incredible!

Thursday, May 15, 2008

Hailstorm over Toulouse

A hailstorm over Toulouse is a very rare sight. Too bad for the plants: I'll have to replant new tomato seedlings!

Thursday, April 10, 2008

Beware, Puppet power users!

Puppet is a tool to automate system administration and to save you from boring and repetitive tasks. The various services, packages and configuration files of the various hosts are organized into nodes. It's pretty interesting because nodes can be arranged in a tree, so that the configuration that is common between several machines is written only once, and inherited by the child nodes. Also, definitions let you write modular pieces of system configuration, with parameters, and default values.
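To make the inheritance idea concrete, here is a small Python model of it. This is not Puppet syntax and not how Puppet is implemented; the node names and settings are made up. It only illustrates the principle of child nodes inheriting shared configuration and overriding parts of it.

```python
# Hypothetical node tree: each node names its parent and its own settings.
NODES = {
    "base": {"parent": None, "config": {"ntp": "installed", "ssh_port": 22}},
    "webserver": {"parent": "base", "config": {"apache": "running"}},
    "web1": {"parent": "webserver", "config": {"ssh_port": 2222}},
}


def resolve(name, nodes=NODES):
    """Merge settings from the root of the tree down to the given node."""
    node = nodes[name]
    merged = resolve(node["parent"], nodes) if node["parent"] else {}
    merged.update(node["config"])  # the child's values win over inherited ones
    return merged


print(resolve("web1"))
```

The common settings live only in `base`, and `web1` states nothing but what differs from its ancestors.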

Puppet is written in Ruby, so it may sound like fantastic news for power users who wish to extend Puppet's capabilities. But there are some limitations that even a novice Puppet user will run into:

Configuration is not written in Ruby, so you have to stick with the plain Puppet vocabulary. It's actually good to have a well-defined format that models basic concepts like package, service, file, etc., and it's a guarantee that your configuration will survive a Ruby upgrade or even a complete Puppet rewrite. But Puppet has reinvented variables, for example, without any object-oriented design. So you're stuck with some kind of artificial global variables following poorly documented rules, and I would actually warn against using Puppet variables.

Classes, as they are called in Puppet's terminology, are not classes in the object-oriented sense you would expect. Classes in Puppet just encapsulate other configuration bits, and using them simply consists of including their definition. Do not expect any sophisticated inheritance or any other OOP feature here. I would recommend using only definitions, which are reusable wrappers with support for parameters and default values.

Puppet templates (actually ERB templates) can reference Puppet variables defined in the configuration, but this is a hack, so don't expect sophisticated constructs here. Template writers don't have access to any high-level Puppet-oriented API. You can use plain Ruby constructs however, like string manipulation, math functions, etc. And the error reporting is so weak, see for yourself:
```
err: Could not retrieve catalog: Uncaught exception compile error
(erb):73: syntax error, unexpected kEND, expecting $end
 end ; _erbout.concat "\n"
    ^ in method puppetmaster.getconfig
```
You have to restart puppetmasterd in debug mode and use erb -x -T '-' mytemplate.erb to find out what goes wrong!

Documentation is written in a wiki. It's based on user contributions, so don't expect to receive more than you can give yourself. I suggest reading Pulling Strings with Puppet: I haven't read it myself, but I hope it offers a more in-depth understanding of Puppet. (On a side note, the wiki uses a Trac plugin providing reStructuredText syntax, not the Trac syntax, so make sure to preview your changes.)

Now let's look at the promise of extensibility.

Writing a custom function

That looks pretty appealing at first glance, but it just doesn't work:

All so-called functions are run on the server. It's a bit misleading, and I think it is a wrong design. Not being able to write a custom function that will be run on the client side does not make sense. When I'm writing bits of configuration, I'm communicating to Puppet the details that make sense on the local machine, I'm never referencing files that are local to the server! At least the Puppet syntax for a function call should reflect the fact that functions are actually parser functions, not general-purpose functions.

Again, there is no OO design. For example, creating a function is achieved through calling Puppet::Parser::Functions::newfunction(:myfunction)

As you can see in the newfunction example above, Puppet internals make extensive use of Ruby symbols, which in my opinion reflects the absence of a real API and the lack of a real internal design.

Also, note that the Ruby code has to be deployed both on the server and on the client. The fileserver, along with the pluginsync mechanism, makes it possible to distribute the code automatically to all clients, although I haven't tested this myself.

Writing a custom type

So, you want to write a custom thing that will be executed on the client? There is no choice: you need to write a custom type. The problem is that I can't say much about it. After spending 6 hours trying to hack together a custom type with a lot of copy and paste, I'm not satisfied with the result. Although I'm new to both Ruby and Puppet internals, I don't think anyone can actually write anything useful as a custom extension to Puppet. Puppet really needs a true API, be it in Ruby or not. The fact that manifests are written in an independent syntax makes it possible to remain compatible in the event of a complete rewrite of the Puppet API.

The last word

If you need to extend the core capabilities of Puppet, be warned! There is no reliable and documented way to achieve it. Puppet is a great tool to automate system deployment and maintenance, but still lacks a true API with clearly documented extension points. Stick with simple documented Puppet constructs, use the exec type with hardcoded paths if there is no other way, or better: prepare your data on the server-side beforehand outside the control of Puppet if you need something more elaborate. For example, to concatenate Apache htpasswd files, I ended up writing a two-line shell script in the post-update hook of my Git repository. Much simpler than trying to concatenate files with Puppet! Extending Puppet is far from being a trivial task, even for an expert programmer!
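The hook itself was a shell script. As an illustration of the same idea, here is a rough Python equivalent that concatenates password fragments into a single file. The directory layout, the `*.htpasswd` naming pattern and the function name are assumptions made for the example.

```python
import glob
import os
import tempfile


def concatenate(fragment_dir, target, pattern="*.htpasswd"):
    """Write every fragment matching pattern, in sorted order, to target."""
    paths = sorted(glob.glob(os.path.join(fragment_dir, pattern)))
    with open(target, "w") as out:
        for path in paths:
            with open(path) as fragment:
                content = fragment.read()
            out.write(content)
            # Make sure entries from two files never end up on the same line.
            if content and not content.endswith("\n"):
                out.write("\n")
    return paths


# Demo in a throwaway directory; a real hook would point at the checked-out
# repository instead (hypothetical file names and contents).
demo_dir = tempfile.mkdtemp()
for name, line in [("a.htpasswd", "alice:HASH1\n"), ("b.htpasswd", "bob:HASH2")]:
    with open(os.path.join(demo_dir, name), "w") as f:
        f.write(line)

target = os.path.join(demo_dir, "htpasswd")
concatenate(demo_dir, target)
with open(target) as f:
    combined = f.read()
print(combined)
```

Preparing the file outside Puppet like this leaves Puppet with a single, ordinary file to distribute.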

Wednesday, January 23, 2008

Refactoring without branching

Developers using the conventional Subversion source control management system (SCM) often come to a point where they cannot continue committing their work on trunk because their changes are impacting the work of others. When you're knee-deep in the code, you don't necessarily plan to create a branch beforehand. A refactoring is sometimes unforeseen, and once you realize it's going to be the big blast of the month, you refrain from committing your work to trunk and realize you'll have to decide: either to keep a huge set of modified files for several weeks at your own risk, or to create a branch and redo your changes there, with all the hassle of managing branches with Subversion (did you ever find yourself entering into endless discussions about branching?). Either way, you feel bad.

Fortunately, there's a new player in the world of source control: Git.

Git is a popular version control system designed to handle very large projects with speed and efficiency. Git falls in the category of distributed source code management tools. Every Git working directory is a full-fledged repository with full revision tracking capabilities, not dependent on network access or a central server.

Sounds great, huh? But there's no chance your CTO will drop SVN and adopt Git. This is where git-svn comes to the rescue! In a word, git-svn is a bridge between Subversion and Git:

1. Create a Git workspace by importing the SVN repository once

2. Make changes and commit them locally in your Git workspace

3. Repeat step 2 as often as you like

4. Push your changes back to SVN when you're done
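The four steps above can be sketched as a small Python script that builds the corresponding commands. The repository URL, directory name and commit message are placeholders. The `git svn rebase` step is not in the list above: it is added here as an assumption, because it picks up new SVN revisions before pushing.

```python
import shlex
import subprocess


def git_svn_workflow(svn_url, workdir, message):
    """Return the commands for the workflow, as argument lists."""
    return [
        # 1. Import the SVN repository once.
        ["git", "svn", "clone", svn_url, workdir],
        # 2-3. Commit locally, as often as needed.
        ["git", "commit", "-a", "-m", message],
        # Extra step (assumption): replay local commits on top of new SVN revisions.
        ["git", "svn", "rebase"],
        # 4. Push each local commit back to SVN.
        ["git", "svn", "dcommit"],
    ]


def run(commands, workdir, dry_run=True):
    for index, command in enumerate(commands):
        print(shlex.join(command))
        if not dry_run:
            # The clone runs from the current directory; the later commands
            # run inside the new workspace.
            cwd = None if index == 0 else workdir
            subprocess.run(command, cwd=cwd, check=True)


commands = git_svn_workflow(
    "http://svn.example.org/project/trunk", "project", "Refactor step"
)
run(commands, "project")  # dry run: only prints the commands
```

With `dry_run=True` nothing is executed, so the script is safe to run as a reminder of the sequence.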

I have been using this approach at work for a few weeks with success. When I push my changes back to the SVN repository once a week, a bunch of commits are made within a few seconds. That's impressive! My colleagues know when I'll be pushing the next changes, so they can work safely until the next iteration. No more branches, no more hassle, only happy refactoring!

If you want to know more, I suggest looking at:

Introduction to Git for Subversion users

An introduction to git-svn for Subversion users and deserters

Sunday, October 28, 2007

Is Java really suited to web application development?

An odd question coming from an Apache Cocoon and Apache Wicket committer: both are written in Java and offer a Java API for developing web applications. But let's play along: after more than six years of J2EE, what are Java's real strengths for web application development?

Threads and the likes of TimerTask are very useful (compared with PHP)

The tooling (Eclipse JDT, JUnit) is very "pro"; you quickly forget about the text editor and the command line

Some very good programs exist only in Java: Lucene, FOP

Surely other advantages that I have forgotten?

On the other hand, the Java approach also has its drawbacks:

The everything-is-an-object approach multiplies the number of classes and increases the overall size and structural complexity of the application: a build tool (Ant or Maven) has become almost mandatory, even when the code being written does not justify that complexity.

Java is not open source, which has two main drawbacks. For a long time it was difficult to integrate Java with the various operating systems. In addition, a component distribution network was very slow to appear, and it was not initiated by Java's designers. Faced with the "obscure complexity" of Maven, one can only envy Perl the simplicity of its CPAN, PHP its PEAR and Ruby its Gems.

Java does not integrate directly with the Apache web server; you have to maintain a separate server running on a JVM (that said, Ruby on Rails seems to be heading the same way).

The development cycle is heavier: good old save+refresh does not work with Java and cannot be implemented simply and reliably (although JavaRebel seems to be gaining popularity at the moment). A number of frameworks write classloaders to compile or reload classes on the fly, but it is far from an exact science.

Furthermore, MVC frameworks such as Apache Wicket are very sophisticated and very appealing to the developer (the API is close to Swing), but they are problematic for Web 2.0 applications:

The everything-is-a-component approach takes the developer further and further away from the original problem: the server must deliver the right piece of HTML or JavaScript to the client at the right time. The framework then becomes an obstacle to work around, which can prove difficult and fragile.

Ajax features are implemented in a very abstract way, and the response is wrapped in XML. Why not simply send back JavaScript code that is interpreted directly (JSON)?

The MVC model does not always fit well with the current constraints of web development: JavaScript in browsers is not standardized and formalized enough to build web applications out of components. The loading stages of the elements of a page, for example, are hard to model with an API.
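To illustrate the JSON point made above, here is a small sketch comparing the two kinds of Ajax response. The XML envelope is a made-up shape: the element names are hypothetical and are not Wicket's actual format, and the payload is invented for the example.

```python
import json
import xml.etree.ElementTree as ET

payload = {"component": "cart", "items": 3}

# Hypothetical XML envelope carrying an HTML fragment for one component.
envelope = ET.Element("ajax-response")
part = ET.SubElement(envelope, "component", id=payload["component"])
part.text = "<span>%d items</span>" % payload["items"]
xml_body = ET.tostring(envelope, encoding="unicode")

# JSON alternative: the browser receives plain data it can use directly.
json_body = json.dumps(payload)

print(xml_body)
print(json_body)
```

In the XML variant the client has to parse the envelope and then unescape the fragment inside it, whereas the JSON variant is already the data structure the script needs.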

Although it lends itself less well to building web applications, especially so-called "2.0" web applications where the user interacts with the server in a much more fine-grained way than in classic web applications that merely serve pages, the Java environment remains particularly attractive for a development team building component libraries with complex business logic, accessible for example through a REST API.

However, even when confined to the backend, the JVM is falling behind its competitors, and because of its "closed" nature, Java does not necessarily get the contributions that would help modernize the language. Its competitors Stackless Python, Erlang and Scala follow the current trend of distributing applications across several servers and taking advantage of the growing number of processors. Making intelligent use of the processing power of clusters of machines is the challenge facing today's web applications.

So why is Java so popular in the IT services world? It meets strong demand from professionals, and the whole industry adapts to that demand. Java has become the accepted common reference in the world of large companies, the "politically correct" technology you can commit to with your eyes closed. Very often scalability is not a concern expressed at the outset, so optimizing after the fact is very often tedious. With Java it is so easy to build cathedrals, reusing libraries here and there, that the developer (or even the architect) very often forgets about efficiency in favor of features.

On the other hand, when you are free to create a web application outside this corporate world, you are better off using more "approachable" technologies to stay in the race: approachable in terms of development (who hasn't regularly run into classpath problems?), but also in terms of deployment: restarting after a crash, monitoring, making use of log files, etc. These "approachable" technologies, with their short development cycles, help keep a significant competitive advantage on this constantly evolving web.

A few interesting articles (in English):

http://www.process-one.net/en/blogs/article/web_20_shifting_from_get_fast_to_get_massive/

http://public.yahoo.com/bfrance/radwin/talks/yahoo-phpcon2002.htm

http://raibledesigns.com/rd/entry/php_vs_java_which_is

http://www.epicserve.com/blog/69/python-and-django-ruby-on-rails-and-php

And a special shout-out to Sylvain Wallez, who has pointed me toward all these interesting lines of thought since 2004.