sixtyPercent: Cochlear Implants, Aviation, Technology, and Philosophy 2005/12/09
Building Scalable Web Applications With Python and PostgreSQL
During the workday, I work as part of a tiny team building the MerchantCircle web application. Our application is built entirely in Python, XHTML, JavaScript, and PostgreSQL. Someday we hope our site will be home to millions of users, but right now we're just getting going.
We started designing and building this application before the current crop of "hot" Python web application frameworks (TurboGears, Django, and Subway among others) hit the streets. It's very cool to see such rapid work and great thinking out loud about the various approaches to building these frameworks and the applications they support. Since we started before these tools were public, we built our own system. Since we're not crazy, we built it out of many of the same parts used by other frameworks: CherryPy, SQLObject, FormEncode, Cheetah Templates, memcached, Lighttpd and of course Linux.
But I'm in more of a "heads down, get it done" sort of role, and while I greatly enjoy learning about these systems and perhaps contributing a tiny bit to them, I mostly need to focus on our application. Fortunately, the Python language and the philosophy of its library and tools have made integrating all of these pieces relatively easy. This "philosophy" isn't new, I suppose -- the Python library tends to be composed of lots of little pieces that generally do one or two things very well, and that reminds me of the Unix way. When we needed wiki formatters or RSS parsers or soundex libraries, high-quality implementations were almost always available.
Since nice frameworks like TurboGears and Django weren't yet available, we had to work through some issues of integration and scaling when combining all of these pieces. If I were starting now, this wouldn't be the case. In any event, as I said in my first sentence -- I spend my days building the application. I don't spend my time figuring out how to validate form input or read data from the database or set cache control headers.
So how do you build scalable, Python-based, Postgres-backed web applications? Come back in a year and I'm sure I'll know by then :-) In the meantime, I'll post more in-depth articles over the next days and weeks about what we're doing. Here are a few things that our team has learned so far:
- Start with a solid base, such as Django, my current favorite. As fun as it may be (for some) to write the next great web application framework, that's usually not the main point of the exercise.
- Think about database schema and queries all of the time. Spend a lot of time reducing database hits. We learned how to log queries taking longer than 500 ms, and we study the logs. But don't worry too much about exotic, clustered databases, etc. It's amazing how many users can be handled by one (nice) box. Don't be afraid of stored procedures, PL/pgSQL, triggers, and so on -- in fact, embrace them and push as much as is reasonable to the DB layer.
- Don't be afraid of schema evolution -- worse is better. Develop good practices for migrating schema and data and then keep developing.
- Use Lighttpd and Squid, and aggressively cache static content and dynamic pages (where possible). Lighttpd is awesome.
- Design and build the middle layer (the application server) to be federated. SQLObject in particular can get in the way here if you're not careful -- watch out for cache inconsistencies.
- Log everything and watch the logs all the time.
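As a rough illustration of the slow-query logging mentioned in the list above, here is a minimal application-side sketch. It is not necessarily how our system does it: PostgreSQL can also log slow statements on the server side through its `log_min_duration_statement` setting. The names `timed_query`, `SLOW_QUERY_THRESHOLD_MS`, and the `slow_queries` logger are hypothetical and exist only for this example; the 500 ms threshold is the one from the list.

```python
import logging
import time
from contextlib import contextmanager

# Threshold taken from the post; the constant name is hypothetical.
SLOW_QUERY_THRESHOLD_MS = 500

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("slow_queries")


@contextmanager
def timed_query(sql, threshold_ms=SLOW_QUERY_THRESHOLD_MS, clock=time.perf_counter):
    """Time the wrapped block and log a warning if it exceeds threshold_ms.

    `sql` is only used as a label in the log message; the caller runs the
    actual query inside the `with` block using whatever database API it has.
    `clock` is injectable so the timing can be faked in tests.
    """
    start = clock()
    try:
        yield
    finally:
        # Runs even if the query raises, so failed slow queries are logged too.
        elapsed_ms = (clock() - start) * 1000.0
        if elapsed_ms > threshold_ms:
            logger.warning("slow query (%.0f ms): %s", elapsed_ms, sql)
```

Wrapping a call looks like `with timed_query("SELECT ..."): cursor.execute(...)`, where `cursor` stands for whatever database cursor the application already has. Because the check sits in a `finally` clause, queries that fail slowly still show up in the log.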
by David Creemer : 2005/12/09 : Categories technology python