fíam

(rhymes with liam)

  • Postback: Horror story with GeoDjango

    Aug. 15, 2008 at 20:19:15 CEST

    Instead of replying to the comments, I decided to write a postback, since there are a few points I want to address.

    As John De Rosa guessed, I expected to receive some flamage, because I've been on both sides of situations like this before and I know it works. I've been the flamer and the flamed, but every time the issue under discussion ended up solved much sooner than expected. We (people involved in FOSS) don't like flames, so when people start flaming, we look at the issue much quicker. That's how it works. Remember the Linus vs. GNOME flame?

    As D (BTW, it would be nice if people posted with their real names) pointed out, I'm possibly using the wrong approach for the distance queries. But the truth is I didn't come up with that ORM call myself. I got it from the GeoDjango documentation. So we have another problem here: the documentation has wrong examples. If you put something in the examples, don't yell when people tell you it doesn't work.

    And finally, as M pointed out, I haven't got in touch with the GeoDjango developers. But that's because I have nothing useful to report. Should I report a bug saying: "Hey, I loaded GeoDjango with FastCGI and 50% of the requests crashed deep inside flup while accessing the DB"? Someone would come over and either close the bug as WORKSFORME or request more information. Before I can report this bug I need to investigate the issue, find some pointers and guess where the error is coming from. It may not be a bug in GeoDjango itself, but in the libraries GeoDjango loads, so I'll possibly have to rebuild the libraries with debug symbols, recreate my production environment and spend some hours doing tests. That takes time and I'm currently very busy (I run a hotel and August is the favourite month for holidays in Spain, so we are at 100% every day). Instead of filing a crappy bug report which only adds random noise to the bug tracker, I prefer waiting until I have something which really helps the developers and does not waste their time.

  • Horror story with GeoDjango

    Aug. 15, 2008 at 01:59:47 CEST

    Last week I ported my geonames application to GeoDjango, and I ended up reverting all the changes and going back to custom SQL.

    First, let me tell you what the geonames database is about. It contains almost 7 million geographical places and you can download the full dump under a Creative Commons BY-SA license.

    The first problem I encountered was importing the dataset. Since I need to do additional processing on their database dumps, I can't simply COPY the files into the database. Back when GeoDjango didn't exist, I decided to write a Python script using the DB API and insert all the records that way. It worked fine, but it wasn't portable between different databases. So, when porting django-geonames to GeoDjango, I started by modifying the import script to use the Django ORM. First fail! The script started running (with settings.DEBUG = False) and two hours later (importing the data with the old Python script took around 1 hour) the OOM killer started killing processes. Well, I decided to leave the script as it was before, but port the models and the custom SQL calls to the GeoDjango ORM.
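
    For reference, a minimal sketch of a DB API style import that inserts records in fixed-size batches, so that only one batch is held in memory at a time. It is not the original script: sqlite3 stands in for the real database so the example runs anywhere, the table and column names are hypothetical, and the column positions (id, name, latitude, longitude in tab-separated fields 0, 1, 4 and 5) are an assumption about the dump layout. The post does not identify what actually exhausted the memory in the ORM version, so this only illustrates the general technique.

```python
import io
import itertools
import sqlite3


def parse_rows(lines):
    """Yield (id, name, latitude, longitude) tuples from tab-separated lines."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Field positions are an assumption about the dump layout.
        yield int(fields[0]), fields[1], float(fields[4]), float(fields[5])


def import_in_batches(conn, rows, batch_size=1000):
    """Insert rows batch by batch and return the number of rows inserted."""
    rows = iter(rows)
    total = 0
    while True:
        # Pull at most batch_size rows from the generator.
        batch = list(itertools.islice(rows, batch_size))
        if not batch:
            break
        conn.executemany(
            "INSERT INTO geoname (id, name, latitude, longitude) "
            "VALUES (?, ?, ?, ?)",
            batch,
        )
        conn.commit()
        total += len(batch)
    return total


# Tiny made-up sample standing in for the real dump file.
sample = io.StringIO(
    "1\tAlpha\tAlpha\t\t40.0\t-3.5\n"
    "2\tBeta\tBeta\t\t41.5\t2.0\n"
    "3\tGamma\tGamma\t\t37.25\t-6.0\n"
)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE geoname ("
    "id INTEGER PRIMARY KEY, name TEXT, latitude REAL, longitude REAL)"
)
imported = import_in_batches(conn, parse_rows(sample), batch_size=2)
```

    Because parse_rows is a generator, the additional processing of each record can happen inside it without ever loading the whole dump.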

    Then I modified the models, adding some spatial fields, and removed all the custom SQL in favor of calls to the ORM. When I finished I started the browser, opened the main page in the test server and waited. Second fail! The query for finding the nearest geoname to a given latitude,longitude pair ran for 1 minute before I hit control+C. I checked the PostgreSQL logs and found it was using Distance_sphere while my custom SQL was using ST_DWithin. I didn't remember exactly why I had chosen ST_DWithin, so I looked it up in the PostGIS docs and I remembered the reason: ST_DWithin uses the index, while Distance_sphere() needs to go through the whole table calculating the distance to the given point. That's not very fast when you have 7M rows. So I ditched all the code using the ORM and went back to custom SQL, but decided to keep the PointFields in the Geoname model thinking they could be handy at some point in the future.
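
    To illustrate the difference, here is a rough sketch of the kind of custom SQL that restricts the candidates with ST_DWithin before ordering by distance. It is not the query from django-geonames: the table name geoname, the geometry column point, SRID 4326 and the helper function are all hypothetical, and with that SRID the radius is assumed to be expressed in degrees. The function only builds the statement and its parameters, so it runs without a database.

```python
def nearest_geoname_query(latitude, longitude, radius):
    """Return (sql, params) for a nearest-place lookup limited by ST_DWithin."""
    sql = (
        "SELECT id, name "
        "FROM geoname "
        # ST_DWithin can use the spatial index to discard far-away rows.
        "WHERE ST_DWithin(point, ST_SetSRID(ST_MakePoint(%s, %s), 4326), %s) "
        # Only the remaining candidates need an exact distance calculation.
        "ORDER BY ST_Distance(point, ST_SetSRID(ST_MakePoint(%s, %s), 4326)) "
        "LIMIT 1"
    )
    # ST_MakePoint takes x (longitude) first, then y (latitude).
    params = (longitude, latitude, radius, longitude, latitude)
    return sql, params


# Example coordinates and radius are arbitrary.
sql, params = nearest_geoname_query(40.4168, -3.7038, 0.5)
```

    With a DB API driver that uses the %s parameter style, such as psycopg2, the result would be passed as cursor.execute(sql, params). If nothing lies within the radius the query returns no row, so the caller would have to retry with a larger radius.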

    So I continued working on other features and three days later, after testing the new features, I moved the code into production. Third fail! 50% of the requests ended with "An unhandled exception was thrown by the application." which is really bad, since you don't even get an email notification. I activated debugging in the fastcgi server and the exception was ": no results to fetch" with the traceback starting while fetching the user session from the database.

    With help from git-bisect I tracked the problem down to a single commit in my code: "Port django-geonames to GeoDjango". I was shocked, since my code was only importing django.contrib.gis.db.models and declaring a PointField(). The rest of the code was the same as before, but that was enough to crash my whole application in a way that is really difficult to debug.

    Don't get me wrong. I think GeoDjango will end up as a nice geographical framework, but it's not ready. My point is I wasted four hours writing code, testing it and debugging, and I ended up with exactly the same code because GeoDjango still has some problems. Yes, I know it's still in beta, but the Django developers have been telling us how conservative they are with trunk, so you, as I did, may feel tempted to try it in production. The only thing I want to accomplish with this post is making sure you don't waste another four hours as I did.