18 March 2009

Jaiku.com uses Django

A few days ago the news went around the industry sites that Google had released the Jaiku.com code and stopped developing it as its own application, offering the whole thing as JaikuEngine under open source terms. The articles mentioned that the new version runs on Google App Engine, but failed to mention that it is an application written in Django that uses the helper for GAE. For me Jaiku.com as Twitter-style code is not particularly interesting, but an analysis of code that combines GAE and Django, written by Google employees, may be an interesting read (I hope to see efficient use of GAE there). The application code is available at http://code.google.com/p/jaikuengine.

15 March 2009

Additional Scrum materials on YouTube

I think that almost everyone interested in Scrum has seen the two YouTube videos about it: Scrum et al. by Ken Schwaber and Scrum Tuning by Jeff Sutherland.

Several days ago I found two more by Jeff Sutherland, each running 1.5 hours with half an hour reserved for questions: Hyperproductive Distributed Scrum Teams and Self-Organization: The Secret Sauce for Improving your Scrum team. They share some elements but are not the same. I also think they are not for beginners in Agile and Scrum.

By the way, the Scrum Coach at MySpace (mentioned in the two videos) has a very interesting blog entry explaining that Scrum is not for everyone. From my experience, I can only say that this is true.

8 March 2009

The third style of Django multilingual data handling

In the last month I saw new versions of code for multilingual/translation handling in Django. Current solutions can be grouped into two categories:

  1. The use of two tables: general nontranslatable data plus translatable data in a 1:N relation.
  2. The use of additional per-language columns for translatable data.
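
To make the two categories concrete, here is a minimal sketch using only the standard sqlite3 module instead of the Django ORM. The table names, column names and helper functions are assumptions made for illustration; they are not taken from any of the existing Django translation packages.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Style 1: shared data in one table, translations in a 1:N child table.
    CREATE TABLE page (id INTEGER PRIMARY KEY, url TEXT);
    CREATE TABLE page_translation (
        page_id INTEGER REFERENCES page(id),
        lang TEXT,
        title TEXT,
        PRIMARY KEY (page_id, lang)
    );
    INSERT INTO page VALUES (1, '/contact/');
    INSERT INTO page_translation VALUES (1, 'en', 'Contact');
    INSERT INTO page_translation VALUES (1, 'de', 'Kontakt');

    -- Style 2: a single table with one column per language.
    CREATE TABLE page_columns (
        id INTEGER PRIMARY KEY, url TEXT, title_en TEXT, title_de TEXT
    );
    INSERT INTO page_columns VALUES (1, '/contact/', 'Contact', 'Kontakt');
""")

# Hypothetical whitelist: style 2 has to map a language to a column name.
TITLE_COLUMNS = {"en": "title_en", "de": "title_de"}


def title_two_tables(url, lang):
    # Every read of translated data needs a JOIN.
    row = conn.execute(
        "SELECT t.title FROM page p "
        "JOIN page_translation t ON t.page_id = p.id "
        "WHERE p.url = ? AND t.lang = ?",
        (url, lang),
    ).fetchone()
    return row[0] if row else None


def title_per_column(url, lang):
    # No JOIN, but a new language means a new column (ALTER TABLE).
    column = TITLE_COLUMNS[lang]
    row = conn.execute(
        "SELECT %s FROM page_columns WHERE url = ?" % column, (url,)
    ).fetchone()
    return row[0] if row else None
```

Both helpers return the same title for a given language; the difference is where the cost lands: a JOIN on every read in the first style, a schema change per language in the second.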

Both solutions work well in many situations and have their own pluses and minuses, but there is another way that seems less common even though it has existed in Django from the beginning: use the sites M:N relation as the main mechanism for multilingual handling.
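
As a rough illustration of the idea, the sketch below again uses plain sqlite3 rather than Django. Every language version is a full row of its own, and an M:N link table decides on which sites the row is visible. All table, column and function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE site (id INTEGER PRIMARY KEY, domain TEXT);
    -- One full row per language version; source_id points at the row
    -- it was translated from (NULL for an independent entry).
    CREATE TABLE page (
        id INTEGER PRIMARY KEY, url TEXT, lang TEXT, title TEXT,
        source_id INTEGER REFERENCES page(id)
    );
    -- M:N link: on which sites is a given row visible?
    CREATE TABLE page_sites (page_id INTEGER, site_id INTEGER);

    INSERT INTO site VALUES (1, 'example.com');
    INSERT INTO site VALUES (2, 'example.de');
    INSERT INTO page VALUES (1, '/contact/', 'en', 'Contact', NULL);
    INSERT INTO page VALUES (2, '/contact/', 'de', 'Kontakt', 1);
    INSERT INTO page VALUES (3, '/promo/', 'en', 'Spring promotion', NULL);
    INSERT INTO page_sites VALUES (1, 1);
    INSERT INTO page_sites VALUES (2, 2);
    INSERT INTO page_sites VALUES (3, 1);
""")


def pages_for_site(site_id):
    # Only rows linked to the site are returned; there is no language fallback.
    rows = conn.execute(
        "SELECT p.url, p.lang, p.title FROM page p "
        "JOIN page_sites ps ON ps.page_id = p.id "
        "WHERE ps.site_id = ? ORDER BY p.url",
        (site_id,),
    )
    return rows.fetchall()


def link_to_site(page_id, site_id):
    # Optionally show an existing row on one more site.
    conn.execute("INSERT INTO page_sites VALUES (?, ?)", (page_id, site_id))
```

With this data, `pages_for_site(2)` contains only the German contact page: the untranslated promotion stays hidden on the de site until someone calls `link_to_site(3, 2)` or adds a German row.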

When may the sites solution be a good choice?

The sites solution is a better option than the other two when you need to address some real-world requirements that sometimes emerge:

  • The choice to hide some data from the site version in another language (no fallback to the default language when there is no version in the current language). Suppose we have en and de versions of the site, but on the de site an entry should not be visible until it has been translated (the en version should not take its place). This is completely optional -- you can still link the en version on the de site if required!

  • The possibility to choose what to show on each localized portal: the en version may have additional events/promotions/news that are not important for the de version.

  • Full control over which fields are translatable, which should always be the same (global), and which are nontranslatable yet must differ for reasons that a translator may not be aware of (domain specific).

  • No single primary language from which all other translations are created. Even if most of the time you translate from en to de, there may be times when you need to do the opposite and later update the en version when the de version changes.

  • Handling of a territory language structure, e.g. en -> de -> de-at (Austrian German), and remembering that structure when changes are needed.

  • Easy addition and removal of languages (no table alterations as in 2.).

  • If you are interested in read speed improvements, the possibility to remove translation joins (the drawback of 1.) and to read only the needed data (the drawback of 2.).
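
The territory structure from the list above can be pictured as a chain of entries, each pointing at the entry it was translated from. The following is a small plain-Python sketch; the `Entry` class and both helper functions are hypothetical and exist only to show how such a chain can be walked.

```python
class Entry:
    """Hypothetical stand-in for one language version of an entry."""

    def __init__(self, lang, source=None):
        self.lang = lang
        self.source = source  # entry this one was translated from, or None


def root_of(entry):
    # Follow the source links up to the independent entry.
    while entry.source is not None:
        entry = entry.source
    return entry


def chain(entry):
    # Languages from the root down to the given entry.
    langs = []
    while entry is not None:
        langs.append(entry.lang)
        entry = entry.source
    return list(reversed(langs))


en = Entry("en")
de = Entry("de", source=en)
de_at = Entry("de-at", source=de)
```

Because the structure is stored with the data, a change to the de entry can later be pushed to de-at without guessing which version was derived from which.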

Are there any side effects?

Yes, there are several tradeoffs, especially:

  • More work on writes (row synchronization).

  • More space needed (global fields are duplicated).

  • More work and logic are needed in the CMS/admin part to show the structure in a clear way.
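
A back-of-the-envelope sketch of the first two tradeoffs, counting only field values and rows. Keys, indexes and M:N link rows are ignored, and the function names are made up for this illustration.

```python
def stored_values(languages, global_fields, translatable_fields):
    """Field values stored for one logical entry under each layout."""
    # Styles 1 and 2 keep global fields once and translated fields per language.
    shared_once = global_fields + languages * translatable_fields
    return {
        "two_tables": shared_once,
        "per_language_columns": shared_once,
        # The sites style duplicates global fields in every language row.
        "site_rows": languages * (global_fields + translatable_fields),
    }


def rows_written_on_global_change(languages):
    """Rows touched when a single global field changes."""
    return {
        "two_tables": 1,
        "per_language_columns": 1,
        "site_rows": languages,  # every language row must be synchronized
    }
```

For three languages, two global fields and two translatable fields this gives 8 stored values for the first two styles and 12 for site rows, and a global change touches three rows instead of one.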

I'm not saying that the sites-based solution is good for everything and every case, but I have worked on two multilingual systems (several languages, several sites), and in both it was the only way to avoid the 2xJOINs-everywhere mess and keep the flexibility that management wanted without sacrificing the general structure.

The reference implementation

Below I show one possible implementation of sites-based translations. It is an abstract class that you can use to create translatable models. The system is efficient on reads; writes are costly. In addition, the current solution does not copy M:N global fields. Maybe in Django 1.1 there will be room to handle the copy procedure better using update() and F() objects.


from django.db import models
from django.db.models.base import ModelBase
from django.contrib.sites.models import Site
from django.utils.translation import ugettext_lazy as _
from django.conf import settings

class TranslationModelManager(models.Manager):
    def get_query_set(self):
        # Limit results to entries linked to the current site.
        return super(TranslationModelManager, self).get_query_set().filter(
            sites__id__exact=settings.SITE_ID)

class TranslationModelBase(ModelBase):
    """
    Metaclass for all translation models.
    Used to properly set TranslationModelManager and to decode Translation class.
    """
    def __new__(cls, name, bases, attrs):
        trans_class = super(TranslationModelBase, cls).__new__(cls, name, bases, attrs)

        # Decode and copy Translation class.
        attr_translation = attrs.pop('Translation', None)
        if not attr_translation:
            translation = getattr(trans_class, 'Translation', None)
        else:
            translation = attr_translation
        trans_class.add_to_class('_translation', { 'fields': [], 'excludes': [] })

        if getattr(translation, 'fields', None) is not None:
            trans_class._translation['fields'] = translation.fields
        if getattr(translation, 'excludes', None) is not None:
            trans_class._translation['excludes'] = translation.excludes

        models.Manager().contribute_to_class(trans_class, 'objects')
        TranslationModelManager().contribute_to_class(trans_class, 'site_objects')
        return trans_class

class TranslationModel(models.Model):
    """
    Model for providing per site translations.

    Usage:
        Subclass it and use class Translation to enumerate translatable fields.
        By default all nontranslatable fields are copied from parent to children
        and vice versa, but you can use excludes feature to disable this behaviour
        for some fields. Below is an example of Translation inner class::

            class Translation:
                fields = ['title', 'content']
                excludes = ['edited_by']
    """
    __metaclass__ = TranslationModelBase

    sites = models.ManyToManyField(Site)
    lang = models.CharField(_('Language'), max_length=5, default=u'en')
    source = models.ForeignKey('self', related_name='translations',
        null=True, blank=True,
        verbose_name=_("Source of translation"),
        help_text=_("If not set, this is an independent entry.")
    )

    class Meta:
        abstract = True

    def __init__(self, *args, **kwargs):
        super(TranslationModel, self).__init__(*args, **kwargs)
        # If source is set, copy nontranslatable fields from it also when initializing.
        if self.pk is None:
            self.copy_nontranslatable_from_source()

    def save(self, force_insert=False, force_update=False, pass_going_up=False):
        super(TranslationModel, self).save(force_insert=force_insert, force_update=force_update)

        # Go up if it wasn't done.
        if not pass_going_up:
            tip = self
            while tip.source is not None:
                tip = tip.source
            # Cheat source to do the copying.
            if tip != self:
                tip.source = self
                tip.copy_nontranslatable_from_source()
                tip.source = None
                # Saving the root propagates the change down to all children.
                tip.save(pass_going_up=True)
                return

        # Propagate to children.
        slaves = self._default_manager.filter(source=self.pk)
        for slave in slaves:
            slave.copy_nontranslatable_from_source()
            slave.save(pass_going_up=True)

    def copy_nontranslatable_from_source(self, exclude=None):
        """Copy all nontranslatable fields from source to child entry, overriding old data."""
        if self.source is not None:
            original = self.source
            all_excludes = ['sites', 'source', 'lang'] + \
                self._translation['fields'] + self._translation['excludes'] + \
                list(exclude or [])

            for field in self._meta.fields:
                field_name = field.name
                if field_name in all_excludes or field.primary_key:
                    continue
                setattr(self, field_name, getattr(original, field_name))
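
The save logic is easier to follow without the database. The sketch below is a plain-Python imitation of it, not the Django code itself: the `Page` class, its field lists, `copy_from` and the `going_up_done` flag are all stand-ins invented for this example, and children are kept in a list instead of being queried.

```python
class Page:
    """Database-free imitation of a TranslationModel subclass."""

    translatable = ["title"]       # plays the role of Translation.fields
    excludes = ["template_name"]   # plays the role of Translation.excludes
    all_fields = ["url", "title", "template_name"]

    def __init__(self, source=None, **values):
        self.source = source
        self.children = []
        if source is not None:
            source.children.append(self)
        for name in self.all_fields:
            setattr(self, name, values.get(name, ""))
        # New entries start with the global fields of their source.
        self.copy_from(source)

    def copy_from(self, original):
        # Copy only global fields: neither translatable nor excluded.
        if original is None:
            return
        for name in self.all_fields:
            if name not in self.translatable and name not in self.excludes:
                setattr(self, name, getattr(original, name))

    def save(self, going_up_done=False):
        if not going_up_done:
            tip = self
            while tip.source is not None:
                tip = tip.source
            if tip is not self:
                # Push the change to the root, which then propagates it down.
                tip.copy_from(self)
                tip.save(going_up_done=True)
                return
        for child in self.children:
            child.copy_from(self)
            child.save(going_up_done=True)


en = Page(url="/contact/", title="Contact", template_name="wide.html")
de = Page(source=en, title="Kontakt")
de_at = Page(source=de, title="Kontakt (AT)")

# Change a global field on the deepest translation and save it.
de_at.url = "/contact-us/"
de_at.save()
```

After the save, all three entries share the new url, while each title and each template_name stays as it was set. This is the behaviour the docstring describes as copying "from parent to children and vice versa".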

Below is an example static pages model built on the translation model:


class TranslatableStaticPage(TranslationModel):
    url = models.CharField(max_length=100, db_index=True)
    title = models.CharField(max_length=200)
    content = models.TextField(blank=True)
    template_name = models.CharField(max_length=70, blank=True)

    class Meta:
        ordering = ('url',)

    class Translation:
        fields = ['title', 'content']
        excludes = ['template_name']

Conclusions

The first two models mentioned have their own use cases, pros and cons. Here I just wanted to show another way to do the job, one that in some situations is more flexible than the other solutions but also has its own costs (especially in writes and space).