Archive for the ‘memcache’ Category

Stop Disabling SELinux!

Декабрь 22nd, 2011

I see a lot of people coming by #centos and similar channels asking for help when they’re experiencing a problem with their Linux system. It amazes me how many people describe their problem, and then say something along the lines of, “and I disabled SELinux...”. Most of the time SELinux has nothing to do with the problem, and if SELinux is the cause of the problem, why would you throw out the extra security by disabling it completely rather than configuring it to work with your application?

read more


PlanetMySQL Voting: Vote UP / Vote DOWN

High Performance PHP Session Storage on Scale

Ноябрь 17th, 2011

One of the great things about the HTTP protocol, besides status code 418, is that it's stateless. A web server therefore is not required to store any information on the user or allocate resources for a user after the individual request is done. By that a single web server can handle many many many different users easily, and well if it can't anymore one can add a new server, put a simple load balancer in front and scale out. Each of those web servers then handles its requests without the need for communication which leads to linear scaling (assuming network provides enough bandwidth etc.).

Now the Web isn't used for serving static documents only anymore but we have all these fancy web apps. And those applications often have the need for a state. The most trivial information they need is the current user. HTTP is a great protocol and provides a way to do authentication which works well with its stateless nature - unfortunately this authentication is implemented badly in current clients. Ugly popups, no logout button, ... I don't have to tell more I think. For having nicer login systems people want web forms. Now the stateless nature of HTTP is a problem: The user may login and then browse around. On later requests it should still be known who that user is - with a custom HTML form based login alone this is not possible. A solution might be cookies. At least one might think so for a second. But setting a cookie "this is an authorized user" alone doesn't make sense as it could easily be faked. Better is to simply store a random identifier in a cookie and then keep a state information on the server. Then all session data is protected and only the user who knows this random identifier is authenticated. If this identifier is wisely chosen and hard to guess this works quite well. Luckily this is a mostly PHP- and MySQL-focused blog and as PHP is a system for building web applications this functionality is part of the core language: The PHP session module.

The session module, which was introduced in PHP 4, partly based on work on the famous phplib library, is quite a fascinating piece of code. It is open and extendable in so many directions but still so simple to use that everybody uses it, often newcomers learn about it on their first day in PHP land. Of course you can not only store the information whether the user is logged in but cache some user-specific data or keep the state on some transactions by the user, like multi-page forms or such.

In its default configuration session state will be stored on the web server's file system. Each session's data in its own file in serialized form. If the filesystem does some caching or one uses a ramdisk or something this can be quite efficient. But as we suddenly have a state on the web server we can't scale as easily as before anymore: If we add a new server and then route a user with an existing session to the new server all the session data won't be there. That is bad. This is often solved by a configuration of the load balancer to route all requests from the same user to the same web server. In some cases this works quite ok, but it is often seen that this might cause problems. Let's assume you want to take a machine down for maintenance. All sessions there will die. Or imagine there's a bunch of users who do complex and expensive tasks - then one of your servers will have a hard time, giving these users bad response times which feels like bad service, even though your other systems are mostly idle.

A nice solution for this would be to store the sessions in a central repository which can be accessed from all web servers.


Continue reading "High Performance PHP Session Storage on Scale"
PlanetMySQL Voting: Vote UP / Vote DOWN

Caching could be the last thing you want to do

Июль 24th, 2010

I recently had a run-in with very popular PHP ecommerce package which makes me want to voice a recurring mistake I see in how many web applications are architected.

What is that mistake?

The ecommerce package I was working with depended on caching.  Out of the box it couldn’t serve 10 pages/second unless I enabled some features which were designed to be “optional” (but clearly they weren’t).

I think with great tools like memcached it is easy to get carried away and use it as the mallet for every performance problem, but in many cases it should not be your first choice.  Here is why:

  • Caching might not work for all visitors - You look at a page, it loads fast.  But is this the same for every user?  Caching can sometimes be an optimization that makes the average user have a faster experience, but in reality you should be caring more that all users get a good experience (Peter explains why here, taking about six sigma).  In practice it can often be the same user that has all the cache misses, which can make this problem even worse.
  • Caching can reduce visibility – You look at the performance profile of what takes the most time for a page to load and start trying to apply optimization.  The problem is that the profile you are looking at may skew what you should really be optimizing.  The real need (thinking six sigma again) is to know what the miss path costs, but it is somewhat hidden.
  • Cache management is really hard – have you planned for cache stampeding, or many cache items being invalidated at the same time?

What alternative approach should be taken?

Caching should be seen more as a burden that many applications just can’t live without.  You don’t want that burden until you have exhausted all other easily reachable optimizations.

What other optimizations are possible?

Before implementing caching, here is a non-exhaustive checklist to run through:

  • Do you understand every execution plan of every query? If you don’t, set long_query_time=0 and use mk-query-digest to capture queries.  Run them through MySQL’s EXPLAIN command.
  • Do your queries SELECT *, only to use subset of columns?  Or do you extract many rows, only to use a subset? If so, you are extracting too much data, and (potentially) limiting further optimizations like covering indexes.
  • Do you have information about how many queries were required to generate each page? Or more specifically do you know that each one of those queries is required, and that none of those queries could potentially be eliminated or merged?

I believe this post can be summed up as “Optimization rarely decreases complexity. Avoid adding complexity by only optimizing what is necessary to meet your goals.”  – a quote from Justin’s slides on instrumentation-for-php.  In terms of future-proofing design, many applications are better off keeping it simple and refusing the temptation to try and solve some problems “like the big guys do”.


Entry posted by Morgan Tocker | No comment

Add to: delicious | digg | reddit | netscape | Google Bookmarks


PlanetMySQL Voting: Vote UP / Vote DOWN

Storage Miniconf Deadline Extended!

Сентябрь 30th, 2009

The linux.conf.au organisers have given all miniconfs an additional few weeks to spruik for more proposal submissions, huzzah!

So if you didn’t submit a proposal because you weren’t sure whether you’d be able to attend LCA2010, you now have until October 23 to convince your boss to send you and get your proposal in.


PlanetMySQL Voting: Vote UP / Vote DOWN

Storage Miniconf Deadline Extended!

Сентябрь 30th, 2009

The linux.conf.au organisers have given all miniconfs an additional few weeks to spruik for more proposal submissions, huzzah!

So if you didn’t submit a proposal because you weren’t sure whether you’d be able to attend LCA2010, you now have until October 23 to convince your boss to send you and get your proposal in.


PlanetMySQL Voting: Vote UP / Vote DOWN