Archive for the ‘hacking’ Category

What I’ve been up to: Zenexity, Play

November 30th, 2009 2 comments

Busy as I was, I realized I hadn’t blogged about my recent change of employer. I left Yoono two months ago to join a company called Zenexity (site in French). It’s really cool because after Flock and Yoono, which were very similar (consumer-oriented, social mashups, Mozilla technologies), I get to work on really different stuff: more server-side and more business-oriented, but still with a strong R&D component. That is what really motivated me to get on board with Zenexity: they’re independent because they earn their own money (i.e. they don’t live on VC money) but still put a lot of effort into R&D projects. Their customer projects are also really state-of-the-art web work.

Specifically, they (I mean “we”) have an open source project called the Play! Framework. It’s an MVC framework similar to Django or Ruby on Rails, but in Java. Within the Java world, I think it’s pretty disruptive. It contrasts with bloated stacks, and manages to bring simplicity and productivity to Java web development. Also, it speaks the language of the web by making it easy to create RESTful web apps, pretty URLs and web services.

Here is a screencast I did last month for the 1.0 release.

A web app in 10 minutes using Play! from zenexity on Vimeo.

Categories: hacking, life, tech

Simple Tips to Build Scalable Websites

July 1st, 2009 3 comments

A few days ago I was invited to a launch party for a web product in Paris. While the product was nice and polished, it seemed like the developers didn’t understand anything about scalability. They didn’t even understand my question when I asked them whether the product could scale.

It’s probably not a big deal for them: they were presenting a CMS, so most of the time it will be installed for a limited user base. I guess most people will be happy to use it on a single server, so it’s probably OK for them not to be able to scale. However, I noticed that while scalability is now a fairly solved problem, there are not that many articles explaining how to prepare for scalability on the web. So here I go. I will not try to replace a good book, just give the very basics.

What is scalability?

It’s important to get that out of the way. Scalability is not performance: it’s not about making good use of CPU and bandwidth, and it’s not about having the page load quickly in the user’s browser. It’s about being able to balance the load between several servers. So when the load increases (more users creating accounts, more visitors, more page views) you can add servers to share it. You can’t just throw in a server, though: you need to design your software to work on a cluster of servers.

Another point is that you will rarely create a cluster of machines from scratch: when you launch a new website you will have few users and so few machines (one or two), and as your load increases you will increase the number of servers. You will have to scale different parts of your system one after the other.

#1: the web front-end

Most of the time you start with a front-end (PHP, Python, Ruby, Java…) and a data layer (MySQL, PostgreSQL, CouchDB…). As your load increases, the front-end will be the first to break. Of course server-side caching will help, but at some point you will need several front-end servers.

The key for that is to ensure you don’t store any data on the front-end. The problem sometimes arises with sessions: a lot of PHP libraries store session information locally on the server, and that prevents you from balancing the load. The idea is that in a session a user may hit one server for a given page, then another for the next page. If the session is only accessible to the first server, you’re screwed. You want it to be somewhere else. That can be in the data layer or in a special sessions server. If you write a Facebook app you don’t need to care, because Facebook takes care of the session.

Now we can have as many front-ends as we want, but we still have a single database server.
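To make the session point concrete, here is a minimal sketch in Python. It assumes a hypothetical `SharedSessionStore` that stands in for the data layer or a sessions server (here it is just a dict in one process, so it only illustrates the idea), and two hypothetical front-ends that keep no session state of their own.

```python
import uuid


class SharedSessionStore:
    """Stand-in for the data layer or a dedicated sessions server.

    Hypothetical: a real deployment would back this with a database or a
    network cache, not a dict living in one process.
    """

    def __init__(self):
        self._sessions = {}

    def create(self, data):
        session_id = uuid.uuid4().hex
        self._sessions[session_id] = dict(data)
        return session_id

    def load(self, session_id):
        return self._sessions.get(session_id)


class FrontEnd:
    """A front-end server that keeps no session state of its own."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def login(self, username):
        # Only the session id travels back to the browser (e.g. in a cookie).
        return self.store.create({"user": username})

    def render_page(self, session_id):
        session = self.store.load(session_id)
        if session is None:
            return f"{self.name}: please log in"
        return f"{self.name}: hello {session['user']}"


store = SharedSessionStore()
front1 = FrontEnd("front1", store)
front2 = FrontEnd("front2", store)

cookie = front1.login("alice")     # first request lands on front1
page = front2.render_page(cookie)  # next request lands on front2
print(page)
```

Because both front-ends read the session from the same place, it no longer matters which one the load balancer picks for a given request.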

#2: the read operations on the database

Most applications will have many more reads than writes. For example, in blogging software each visitor will trigger a read on the database (OK, not each visitor if there is a good cache), but writes only occur when the author writes a new post or someone leaves a comment.

That’s good, because it’s much easier to scale reads than writes. Just make sure that in your code you have different settings for reads and writes. They can point to the same database at launch time, but when the time comes you can separate them. Writes will go to your “main” database, and reads will go to a copy. There are other approaches, but for example MySQL offers replication features. Once set up, the slaves will stay in sync with the master. You can have as many slaves as you need.

OK – several front-ends, several read-only databases, but still one master database for writes. If your application has few writes, it may be fine with a beefy database server (and some major websites just have one master database), but if you have a lot of writes (highly social applications like Facebook or Twitter) you may want to continue the scaling process.
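Here is a rough sketch of what “different settings for reads and writes” could look like in Python. The `DATABASES` dict, the server addresses and the `ConnectionRouter` class are all made up for illustration; they are not taken from any particular framework.

```python
import itertools

# Hypothetical settings. At launch, "read" can simply list the same
# server as "write"; later it can list the replicated copies instead.
DATABASES = {
    "write": "mysql://main-db/blog",
    "read": ["mysql://replica1/blog", "mysql://replica2/blog"],
}


class ConnectionRouter:
    """Hands out a database address depending on the kind of query."""

    def __init__(self, settings):
        self._write = settings["write"]
        # Rotate through the read copies so each one gets a share of the load.
        self._reads = itertools.cycle(settings["read"])

    def for_write(self):
        return self._write

    def for_read(self):
        return next(self._reads)


router = ConnectionRouter(DATABASES)
reads = [router.for_read() for _ in range(3)]
print(router.for_write())
print(reads)
```

The point is only that the calling code asks for a “read” or a “write” connection from day one, so separating the two later is a settings change rather than a rewrite.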

#3: the write database

Now we want to have several databases that we can write to. Obviously, we have to be careful not to introduce inconsistencies in the process. Having an old version of a blog post on one server and the new version on another is not great: what if some users see an old version of your post and others see the most recent one?

There are various strategies to divide data in a safe and consistent way, including:

  • Depending on the userid (or blogid, or whatever makes sense in your application), put the data on one server or another. For example, all users with an even id go to server1 and all users with an odd id go to server2. Hint: make sure your algorithm lets you add more servers later, which is not the case with my example where you will be stuck at 2 servers :)
  • Put some tables on one server and others on another. It doesn’t help you when a single table is growing too much, but it can be combined with the previous point.
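One possible way to split by userid without getting stuck at 2 servers is sketched below in Python. It is only one approach among several, and the names (`NUM_BUCKETS`, `bucket_for`, `build_bucket_map`, `server_for`) are invented for this example: each user is hashed into a fixed number of buckets, and a separate table says which server owns each bucket.

```python
import hashlib

NUM_BUCKETS = 1024  # fixed once; chosen much larger than the number of servers


def bucket_for(user_id):
    """Map a user id to a stable bucket number."""
    digest = hashlib.md5(str(user_id).encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_BUCKETS


def build_bucket_map(servers):
    """Assign each bucket to a server, round-robin."""
    return {b: servers[b % len(servers)] for b in range(NUM_BUCKETS)}


def server_for(user_id, bucket_map):
    """Look up which server holds the data for this user."""
    return bucket_map[bucket_for(user_id)]


bucket_map = build_bucket_map(["server1", "server2"])
before = server_for(42, bucket_map)

# Adding a third server: hand it some of the buckets. Only the data in
# those buckets has to be copied over; every other user stays put.
moved = [b for b in range(NUM_BUCKETS) if b % 3 == 2]
for b in moved:
    bucket_map[b] = "server3"

after = server_for(42, bucket_map)
print(before, after, len(moved))
```

The user-to-bucket mapping never changes; only the bucket-to-server table does, which is what leaves room to add servers later.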

Conclusion

Here you go, the basics for building a scalable website. That’s not all you have to do: if your website keeps growing you will face more problems, such as having to scale your network. I’m not talking about outgoing bandwidth but about communication between your servers (front-end and data layers). But if your code is efficient, these simple recommendations will get you to a setup that can handle a fairly big load. I really recommend Building Scalable Websites, from O’Reilly, if you want to know more.

FAQ

Q: Language X doesn’t scale, but language Y does!

A: Bullshit. It’s not the language that scales, it’s your code. Some languages may not perform as well as others, so you will have to add boxes more often, but the way you scale is still the same.

Q: What about cloud computing? Virtualization? All these fancy buzzwords?

A: Virtualization means you run on virtual machines rather than on physical ones. The benefit is that you can easily add or remove machines. For example, using Amazon EC2 you can add as many machines as you want in a few minutes, and then remove them just as quickly. With a classical hosting company, you need to make a phone call, ask for the machines, and you get them in maybe one week. They’ll charge you for the set-up too, and if you no longer want a machine you still have to pay for a full term. So cloud computing offers are generally more flexible.

Q: Does Google App Engine make it easier to scale?

A: In short, yes. By not letting you access the machines, Google App Engine constrains you into writing scalable code. You also don’t have to request new machines when you need them or release them when you no longer do; you just pay for what you use, depending on the load of your application.

I am a big fan of Google App Engine, but be careful: since it’s programmed in a particular way, it’s not easy to move your project out of it. You may feel locked in after your project has started.

Thoughts on Google App Engine

May 26th, 2009 No comments

I’ve been playing with Google App Engine recently. It’s actually pretty cool, to the point that I’m almost ashamed to have ignored it when it was released. I kind of felt like it would be too restrictive with just Python, just their own database and so on.

But so far, I like it:

  • It’s only Python (or Java), but you can do pretty much anything you would do in a non-App Engine Python project. You can load pure-Python third-party libraries by just including them in your project.
  • The free quotas are really big. It’s enough for a hobby project, and if it becomes successful enough to hit the ceiling you should be able to figure out a way to monetize it to pay your Google bill.
  • You can use your own domain name even with a free account
  • There is no SQL, but Google’s BigTable seems to be good enough. Heck, that’s what they use for most of their products!

And you get all the App Engine specific goodness: easy authentication with Google Accounts, free hosting with huge quotas, and most importantly easy scalability on Google’s infrastructure… Having to call your hosting company to add new servers is a pain in the ass (and in the wallet), having to create and delete instances on Amazon EC2 is much better, but not having to think about it at all is just pure joy.

Categories: hacking, tech

Thank you, AMO!

April 15th, 2009 No comments

The website addons.mozilla.org recently changed the rule for sandboxed add-ons: you no longer need to register and log in to install one. That was a big issue because the review process is a bit heavy, and a lot of add-ons were stuck in that Limbo of Firefox.

The difference for my recent sandboxed Video Games Spy (a sidebar to get aggregated info about games) is huge. It went from 0-ish to more than 50 downloads a day! It’s still not a lot, especially compared to the thousand a day Moji is still getting, but it shows there is some interest in the add-on.

Video Games Spy

Super Mario Galaxy!

Should Firefox toolbars get “Text besides icons”?

December 1st, 2008 No comments

This is a feature that I really love in Gnome (Linux), and that I wish Firefox had too. It would make Firefox even more integrated into the Gnome desktop.

It is about the options for text and icons on toolbars. Currently, Firefox offers three options:

  • Icons only (the default)
  • Icons and text
  • Text only

Gnome and Toolbars

Gnome, on the other hand, has one more option: text besides icons, while the icons and text option of Firefox is called text below icons. My preference goes to text besides icons. Let’s see how each option looks on Epiphany, Gnome’s own web browser:

Gnome's different toolbar options on Epiphany

From top to bottom: text below icons, text besides icons, icons only and text only. As you can see, in the “text besides icons” option, not all icons have a label: only the most important ones. It’s not unlike IE6, so IE6 was not completely garbage. Yes it has a shitty rendering engine, no tabs and no popup blocker, but it has text besides icons :) .

“Text besides icons” has several advantages. Beyond teaching the meaning of the buttons, it can:

  • Create a hierarchy between important buttons and secondary buttons (for example, “back” is more important than “forward”)
  • Give more real estate to important buttons, making them easier to click

Those two goals have been achieved on Mac and Windows for the back and forward buttons only; on Linux, however, back and forward are still given the same importance.

I believe the patch would be simple enough: it is just one entry to add to the options list and a few CSS rules to apply in this case (see below).

The Cherry on the Cake

If Firefox gets that feature, the cherry on the cake would be an additional option for Linux users: “System Default”. With this option, Firefox would just use whatever the user (or the distribution) set in the system preferences. It would be tempting to get rid of the option altogether and set everyone to the system default, but I guess that wouldn’t please KDE users, who don’t have access to this preference.

That requires (1) reading the gconf option for toolbars and (2) listening for changes in order to refresh the UI as soon as the user changes the system preference.

It seems like the Mozilla codebase already has some code related to gconf, in nsGNOMEShellService.cpp, to set Firefox as the default browser.

Get it Today

You can easily get the text besides icons on your Firefox, by adding the following lines to your userChrome.css:

/* Text besides icons */
toolbar:not([mode=full]) #back-button,
toolbar:not([mode=full]) #home-button {
   -moz-box-orient: horizontal !important;
}

#back-button .toolbarbutton-text,
#home-button .toolbarbutton-text {
   display: block !important;
}

You will have to set your toolbars to “icons only”.

Text Besides Icons in Firefox

Categories: browsers, hacking, tech

France2 News for XBMC and Plex

October 26th, 2008 3 comments

I just released a plugin for the media center software XBMC and Plex to watch the France2 news from your couch. Details in French here.

Categories: hacking

Getting more media sites in Flock’s mediabar, with Media RSS

July 17th, 2008 1 comment

Flock 2.0 is on its way to the final release, and many of you have noticed that besides all the Firefox 3 goodness, the experience is pretty much the same as in Flock 1.2. Well, it’s pretty much the same, not exactly the same. One discreet feature is the recognition of Media RSS feeds for the mediabar.

Media RSS on the French website lemonde.fr

While Flock has been doing a lot of service-specific integration, that has never been the intent for the long term. We do service-specific work because we have no choice, but we are eager to support open standards (and promote them) as they become available. Our blog editor had support for MetaWeblog and the Atom Publishing Protocol from the beginning, and now it’s the turn of the mediabar to get some open standard love.

Media Discovery

What does it mean for you, the user? Well, it means that besides the 7 supported services, you can consume content from any website that advertises a Media RSS feed. You can try it in Flock 2.0beta2. Here is a selection of websites:

Custom Search

So when you visit a page with a Media RSS feed, you can see it in the mediabar and subscribe to it. The experience is similar to Flock’s news reader, but more tailored to media content (images and videos). But there is more. It’s really an advanced feature, but if a website provides a Media RSS feed for a given search result, you can use that to add search in the mediabar.

Example: Hulu
Hulu is a website with TV content from the major networks (FOX, NBC, PBS…) with limited advertisement. You can add Hulu search to Flock’s mediabar (again, this is for Flock 2.0beta2):

  1. Open the URL “about:config”
  2. Search for “rssSearch”
  3. Change the value of flock.photo.rssSearch to:
    [{"hulu":{"id":"hulu","title":"Hulu Videos","url":"http://www.hulu.com/feed/search/%s","icon":"http://www.hulu.com/images/hulu.ico"}}]
  4. Open the mediabar

Voilà! You can now search for your favorite TV shows in the mediabar.
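If you want to check a preference value before pasting it, the small Python sketch below parses the JSON from step 3 and shows the feed URL you would get for a given query. It assumes that `%s` in the `url` field is simply replaced by the URL-encoded search terms; that is my reading of the format, not a description of Flock’s internals, and `search_urls` is a made-up helper.

```python
import json
from urllib.parse import quote

# The preference value from step 3, copied verbatim.
PREF = ('[{"hulu":{"id":"hulu","title":"Hulu Videos",'
        '"url":"http://www.hulu.com/feed/search/%s",'
        '"icon":"http://www.hulu.com/images/hulu.ico"}}]')


def search_urls(pref_value, query):
    """Return {service id: feed URL} with the query substituted for %s."""
    urls = {}
    for entry in json.loads(pref_value):  # raises ValueError on broken JSON
        for service in entry.values():
            urls[service["id"]] = service["url"].replace("%s", quote(query))
    return urls


print(search_urls(PREF, "Homer"))
```

A typo in the JSON (a missing quote or bracket) will make `json.loads` fail here, which is easier to debug than a search box that silently does not show up.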

Search for "Homer" on Hulu.com

Get support for your site

If you have a feed with images or videos on your website/blog, you can get it in Flock’s mediabar pretty easily.

The easiest way is to pipe your feed through Feedburner, making sure you enable their SmartCast feature. Feedburner will nicely add the required markup to your feed (and they have a lot of other features too).

If you’re tech savvy and you’d rather do it your way, there is some documentation that I wrote for that.
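For the do-it-yourself route, here is a hedged sketch of what the markup can look like, generated with Python’s standard library. The `media:content` and `media:thumbnail` elements and the namespace URI come from the Media RSS specification; which of them Flock actually reads is not covered here, and the example.com URLs and the `build_feed` helper are placeholders.

```python
import xml.etree.ElementTree as ET

MEDIA_NS = "http://search.yahoo.com/mrss/"  # Media RSS namespace
ET.register_namespace("media", MEDIA_NS)


def build_feed(title, link, photos):
    """Build an RSS 2.0 feed whose items carry Media RSS elements."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = title
    for photo in photos:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = photo["title"]
        ET.SubElement(item, "link").text = photo["page"]
        # The media itself, plus a small preview image.
        ET.SubElement(item, f"{{{MEDIA_NS}}}content",
                      {"url": photo["image"], "type": "image/jpeg",
                       "medium": "image"})
        ET.SubElement(item, f"{{{MEDIA_NS}}}thumbnail",
                      {"url": photo["thumb"]})
    return ET.tostring(rss, encoding="unicode")


feed_xml = build_feed(
    "My photos",
    "http://example.com/",
    [{"title": "Sunset",
      "page": "http://example.com/sunset",
      "image": "http://example.com/sunset.jpg",
      "thumb": "http://example.com/sunset_thumb.jpg"}],
)
print(feed_xml)
```

The feed also needs to be advertised from your pages with the usual RSS `<link rel="alternate">` tag so that a browser can discover it.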

Categories: flock, hacking

Week-End Hacking

July 6th, 2008 No comments

I’ve spent a day hacking on a new extension. It’s for gamers (like me!) who like to check reviews of new games before they buy. So far I’ve included the Amazon customer review score and the mandatory Metacritic score. More to come – let me know what you think should be there.

Also, when you visit Metacritic, IGN or Gamespot, games get detected, so all you need when you’re viewing a page about a game is to click on the famicom icon, and the info will open on the left. Pretty cool, eh?

It’s in the AMO sandbox now, so if you don’t have an AMO account with sandbox access you can also download it here. And if you like it, don’t forget to write a review on AMO, so it can get out of the sandbox!

Video Games Spy

"Boom Blox" in VGSpy

Categories: flock, hacking