Archive for the ‘memcached’ Category
MySQL and memcached Guide
November 2nd, 2010
Introduction to memcached
November 2nd, 2010
CB1 Ubuntu 10.10 Linux Development Setup
October 17th, 2010
I use a MacBook Pro for my day-to-day operations here at CB1, INC. I’m a huge believer that a development environment should mimic the production environment, so I find myself running a couple of virtual machines in VMware Fusion.
The following guide is a reference for myself as well as possibly a helpful resource for setting up your own Linux development environment. Here’s a checklist of the tasks to perform and software to install:
- Operating System
- Ubuntu 10.10 64-bit: I use Ubuntu Desktop in dev and Ubuntu Server in production
- Package updates and upgrades
- Network configuration (at least 2 static IP addresses)
- Development Tools
- C/C++ development environment
- Autotools
- Sun Java JDK
- Valgrind
- Version control: Subversion, Bazaar, git
- Android SDK
- Servers
- Samba (file sharing)
- SSH (remote shell access)
- Apache 2.2 (web server)
- nginx 0.8 (web server)
- PHP 5.3.3 (application server)
- PHP-FPM (PHP’s FastCGI process manager)
- MySQL 5.1 (database server)
- PostgreSQL (database server)
- memcached 1.4.5 (caching layer)
- Gearman (job queue manager)
- PHP Extensions
- Desktop Applications
- Google Chrome
- KCachegrind
- Appcelerator Titanium
Operating System

Start by installing Ubuntu 10.10 Desktop (or server). I’m not going to cover installing Ubuntu since there are already several other resources out there. Once Ubuntu is installed, open a Terminal:
user@ubuntu:~# sudo passwd root
[sudo] password for user: <type your password>
Enter new UNIX password: <type new root password>
Retype new UNIX password: <type new root password again>
passwd: password updated successfully
user@ubuntu:~# sudo apt-get update
user@ubuntu:~# sudo apt-get upgrade
user@ubuntu:~# mkdir ~/src
New File Permissions
user@ubuntu:~# sudo pico /etc/profile
Change the umask from 022 to 002. This setting controls the default permissions when a new file or directory is created: with 002, new files and directories stay writable by the group. This is mostly useful when managing files over Samba.
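To see what the change does, here is a small Python sketch of the umask arithmetic. It assumes the usual defaults of 0666 for newly requested files and 0777 for directories; the function name effective_mode is only for illustration.

```python
import stat

def effective_mode(requested: int, umask: int) -> int:
    """Return the permission bits left after the umask clears its bits."""
    return requested & ~umask

# Assumption: programs request 0o666 for new files and 0o777 for new directories.
for umask in (0o022, 0o002):
    file_mode = effective_mode(0o666, umask)
    dir_mode = effective_mode(0o777, umask)
    print(f"umask {umask:03o}: "
          f"files {stat.filemode(stat.S_IFREG | file_mode)}, "
          f"dirs {stat.filemode(stat.S_IFDIR | dir_mode)}")
```

With 022 the group loses write access; with 002 it keeps it, which is what makes shared editing over Samba less painful.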
Network IP Addresses
Optionally, you may want to assign a static IP address. I set up one IP address for Apache and another for nginx.
user@ubuntu:~# sudo pico /etc/network/interfaces
The following is a reference for adding two static IPs. Change the IPs to meet your needs.
auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static
    address 192.168.1.200
    netmask 255.255.255.0
    gateway 192.168.1.1

auto eth0:1
iface eth0:1 inet static
    address 192.168.1.201
    netmask 255.255.255.0
user@ubuntu:~# sudo /etc/init.d/networking restart
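If you change the addresses, it is worth checking that each one still sits in the same subnet as the gateway. The following Python sketch does that with the standard ipaddress module; the addresses are the example values from above and the helper name same_subnet is made up for this illustration.

```python
import ipaddress

NETMASK = "255.255.255.0"
GATEWAY = ipaddress.ip_address("192.168.1.1")
ADDRESSES = ["192.168.1.200", "192.168.1.201"]  # eth0 and eth0:1 from the example

def same_subnet(address: str, netmask: str, gateway) -> bool:
    """True if the gateway lies inside the network defined by address/netmask."""
    iface = ipaddress.ip_interface(f"{address}/{netmask}")
    return gateway in iface.network

for addr in ADDRESSES:
    print(addr, "shares a subnet with the gateway:",
          same_subnet(addr, NETMASK, GATEWAY))
```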
Packages
Here’s a bunch of packages that will set up compilers, version control, Java, MySQL, Apache, PHP, Memcache, Gearman, Samba, and more.
user@ubuntu:~# sudo apt-get install build-essential autotools-dev autoconf \
autoconf2.13 openssh-server ethtool traceroute openjdk-6-jdk \
mysql-server-5.1 bzr subversion subversion-tools ntp ntpdate \
libpcre3-dev libevent-dev automake bison libtool scons g++ \
ncurses-dev libreadline-dev libz-dev libssl-dev libcurl4-openssl-dev \
ruby rubygems libzip-ruby1.8 libzip-ruby1.9.1 python-dev ruby-dev \
libdbus-glib-1-dev uuid-dev libpam0g libpam0g-dev gperf samba valgrind \
libxml2-dev libfreetype6-dev curl libcurl4-openssl-dev \
libjpeg62-dev libpng12-dev sqlite3 libsqlite3-dev git-core \
postgresql postgis gearman libgearman-dev php5 \
libapache2-mod-php5 php5-dev memcached php5-memcached \
php5-curl php5-gd php5-mysql php5-pgsql php-apc \
php5-xdebug php5-fpm libapache2-mod-fastcgi
MySQL
During the package install above, MySQL will prompt you for the root password.
After the packages are installed, we need to allow remote MySQL connections.
user@ubuntu:~# sudo pico /etc/mysql/my.cnf
Comment out the bind-address line.
# bind-address = 127.0.0.1
SSH
Next, you may optionally increase the connection keep-alive interval for remote SSH connections. Timeouts aren’t really an issue when SSH’ing into a local VM, but this really helps for remote installs. Note that the redirect has to run as root, so pipe through sudo tee rather than using sudo echo:
user@ubuntu:~# echo "ClientAliveInterval 60" | sudo tee -a /etc/ssh/sshd_config
Samba
Samba allows me to drag and drop files between my Mac and Linux VM. I personally do not enable/install Samba on production servers.
user@ubuntu:~# sudo cp /etc/samba/smb.conf /etc/samba/smb.conf.orig
user@ubuntu:~# sudo pico /etc/samba/smb.conf
You can add a share such as the following:
[ubuntu]
force user = <your username>
writeable = yes
create mode = 644
path = /home/<your username>
directory mode = 755
force group = <your username>
Then create yourself a Samba user:
user@ubuntu:~# sudo smbpasswd -a <your username>
Apache 2
Apache is mostly configured out of the box, but I like to enable rewrite and SSL so I can test production features.
user@ubuntu:~# sudo a2enmod rewrite
user@ubuntu:~# sudo a2enmod ssl
Since I’m going to run both Apache and nginx, I’m going to bind Apache to eth0’s IP address.
user@ubuntu:~# sudo pico /etc/apache2/ports.conf
NameVirtualHost 192.168.1.200:80
Listen 192.168.1.200:80
<IfModule mod_ssl.c>
Listen 192.168.1.200:443
</IfModule>
Now we need to add eth0’s IP to the default host:
user@ubuntu:~# sudo pico /etc/apache2/sites-enabled/000-default
<VirtualHost 192.168.1.200:80>
ServerAdmin webmaster@localhost
DocumentRoot /var/www
<Directory />
Options FollowSymLinks
AllowOverride None
</Directory>
<Directory /var/www/>
Options Indexes FollowSymLinks MultiViews
AllowOverride None
Order allow,deny
allow from all
</Directory>
ErrorLog ${APACHE_LOG_DIR}/error.log
LogLevel warn
CustomLog ${APACHE_LOG_DIR}/access.log combined
</VirtualHost>
Restart Apache for the changes to take effect.
user@ubuntu:~# sudo apache2ctl restart
Gearman
By default, Gearman uses memory to store pending jobs in the queue, but I prefer to use MySQL for persistent storage. To do this, first create the queue database and table:
user@ubuntu:~# mysqladmin -uroot -p123123 create gearman
user@ubuntu:~# mysql -uroot -p123123 -e "CREATE TABLE gearman.gearman_queue (
    unique_key VARCHAR(64) NOT NULL,
    function_name VARCHAR(255) NULL,
    priority INT NULL,
    data LONGBLOB NULL,
    PRIMARY KEY (unique_key)
) ENGINE = InnoDB;"
Next update the init script to tell Gearman to use the database:
user@ubuntu:~# sudo mv /etc/default/gearman-job-server /etc/default/gearman-job-server.bak
user@ubuntu:~# echo "PARAMS=\"-q libdrizzle --libdrizzle-host=127.0.0.1" \
"--libdrizzle-user=root --libdrizzle-password=123123 --libdrizzle-db=gearman" \
"--libdrizzle-table=gearman_queue --libdrizzle-mysql\"" | sudo tee /etc/default/gearman-job-server
user@ubuntu:~# sudo /etc/init.d/gearman-job-server restart
Gearman PHP Extension
We need to download and install the Gearman PHP extension if we want to write PHP workers or post jobs to the queue.
user@ubuntu:~# cd ~/src
user@ubuntu:~/src# wget http://pecl.php.net/get/gearman-0.7.0.tgz
user@ubuntu:~/src# tar xzf gearman-0.7.0.tgz
user@ubuntu:~/src# rm gearman-0.7.0.tgz package.xml
user@ubuntu:~/src# cd gearman-0.7.0
user@ubuntu:~/src/gearman-0.7.0# phpize
user@ubuntu:~/src/gearman-0.7.0# ./configure
user@ubuntu:~/src/gearman-0.7.0# make
user@ubuntu:~/src/gearman-0.7.0# sudo make install
Next, add the config file to load the Gearman PHP extension:
user@ubuntu:~# echo "extension=gearman.so" | sudo tee -a /etc/php5/conf.d/gearman.ini
memcached PHP Extension
Since we have memcached and the memcached PHP extension installed, let’s use it for storing session data:
user@ubuntu:~/src# echo "session.save_handler = memcached
session.save_path = \"127.0.0.1:11211\"" | sudo tee -a /etc/php5/conf.d/memcached.ini
nginx
nginx is a really fast web server. I use nginx as my primary development web server unless I’m running a web app that only works with Apache. You can choose to install nginx from a package, but I like to live life on the bleeding edge, so I’ll be building nginx from source. To install nginx, we need to download the source, compile it, install it, and configure it.
user@ubuntu:~# cd ~/src
user@ubuntu:~/src# wget http://nginx.org/download/nginx-0.8.52.tar.gz
user@ubuntu:~/src# tar xzf nginx-0.8.52.tar.gz
user@ubuntu:~/src# rm nginx-0.8.52.tar.gz
user@ubuntu:~/src# cd nginx-0.8.52
user@ubuntu:~/src/nginx-0.8.52# sudo mkdir /var/lib/nginx
user@ubuntu:~/src/nginx-0.8.52# ./configure \
--sbin-path=/usr/sbin \
--conf-path=/etc/nginx/nginx.conf \
--error-log-path=/var/log/nginx/error.log \
--pid-path=/var/run/nginx.pid \
--lock-path=/var/lock/nginx.lock \
--http-log-path=/var/log/nginx/access.log \
--http-client-body-temp-path=/var/lib/nginx/body \
--http-proxy-temp-path=/var/lib/nginx/proxy \
--http-fastcgi-temp-path=/var/lib/nginx/fastcgi \
--http-uwsgi-temp-path=/var/lib/nginx/uwsgi \
--http-scgi-temp-path=/var/lib/nginx/scgi \
--with-http_stub_status_module
user@ubuntu:~/src/nginx-0.8.52# make
user@ubuntu:~/src/nginx-0.8.52# sudo make install
user@ubuntu:~# sudo pico /etc/init.d/nginx
Here’s the init script that will start nginx for us:
#! /bin/sh
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
DAEMON=/usr/sbin/nginx
NAME=nginx
DESC=nginx
test -x $DAEMON || exit 0
case "$1" in
  start)
    echo -n "Starting $DESC: "
    start-stop-daemon --start --quiet --pidfile /var/run/$NAME.pid --exec $DAEMON -- $DAEMON_OPTS
    echo "$NAME."
    ;;
  stop)
    echo -n "Stopping $DESC: "
    start-stop-daemon --stop --quiet --pidfile /var/run/$NAME.pid --exec $DAEMON
    echo "$NAME."
    ;;
  restart|force-reload)
    echo -n "Restarting $DESC: "
    start-stop-daemon --stop --quiet --pidfile /var/run/$NAME.pid --exec $DAEMON
    sleep 1
    start-stop-daemon --start --quiet --pidfile /var/run/$NAME.pid --exec $DAEMON -- $DAEMON_OPTS
    echo "$NAME."
    ;;
  reload)
    echo -n "Reloading $DESC configuration: "
    start-stop-daemon --stop --signal HUP --quiet --pidfile /var/run/$NAME.pid --exec $DAEMON
    echo "$NAME."
    ;;
  *)
    echo "Usage: /etc/init.d/$NAME {start|stop|restart|reload|force-reload}" >&2
    exit 1
    ;;
esac
exit 0
Now we need to make the init script executable and enable it:
user@ubuntu:~# sudo chmod +x /etc/init.d/nginx
user@ubuntu:~# sudo update-rc.d nginx defaults
user@ubuntu:~# sudo pico /etc/nginx/nginx.conf
Here’s a starter nginx.conf with some basic settings:
user www-data www-data;
worker_processes 2;
events {
    worker_connections 1024;
}
http {
    include mime.types;
    default_type application/octet-stream;
    sendfile on;
    tcp_nodelay on;
    tcp_nopush on;
    keepalive_timeout 65;
    server_name_in_redirect off;
    server_tokens off;
    add_header Strict-Transport-Security max-age=1800;
    add_header X-Frame-Options deny;
    gzip on;
    gzip_buffers 16 8k;
    gzip_comp_level 9;
    gzip_types text/plain text/xml application/x-javascript text/css;
    include /etc/nginx/sites/*;
}
user@ubuntu:~# sudo mkdir /etc/nginx/sites
user@ubuntu:~# sudo pico /etc/nginx/sites/default
Now we need to set up a default host that supports PHP (via PHP-FPM, PHP’s FastCGI Process Manager) and we want the default host to use the eth0:1 IP address:
server {
    listen 192.168.1.201:80 default;
    server_name _;
    root /var/www;
    index index.php;
    location / {
        if (!-e $request_filename) {
            rewrite ^/(.*)$ /index.php?q=$1 last;
            break;
        }
    }
    location ~ \.php$ {
        fastcgi_pass 127.0.0.1:9000;
        fastcgi_index index.php;
        fastcgi_param SCRIPT_FILENAME /var/www$fastcgi_script_name;
        include fastcgi_params;
    }
    location ~* (\.(htaccess|engine|inc|info|install|module|profile|po|sh|.*sql|theme|tpl(\.php)?|xtmpl)|code-style\.pl|Entries.*|Repository|Root|Tag|Template)$ {
        deny all;
    }
}
After the config files are good to go, start nginx:
user@ubuntu:~# sudo /etc/init.d/nginx start
Service Names
I also like to add service names so I can see what ports are in use when I run netstat. I added drizzle and Cassandra for fun despite this post not including them.
user@ubuntu:~# sudo cp /etc/services /etc/services.bak
user@ubuntu:~# su
root@ubuntu:~# echo "drizzle 4427/tcp
drizzle 4427/udp
memcached 11211/tcp
memcached 11211/udp
gearmand 4730/tcp
gearmand 4730/udp
fastcgi 9000/tcp
cassandra 9160/tcp" >> /etc/services
root@ubuntu:~# exit
Android SDK
The Android SDK is unfortunately not available as a package, so you’ll need to download it from the Android Developer site: http://developer.android.com/sdk/index.html.
user@ubuntu:~# wget http://dl.google.com/android/android-sdk_r07-linux_x86.tgz
user@ubuntu:~# tar xzf android-sdk_r07-linux_x86.tgz
user@ubuntu:~# rm android-sdk_r07-linux_x86.tgz
user@ubuntu:~# sudo mv android-sdk-linux_x86 /usr/local
user@ubuntu:~# sudo find /usr/local/android-sdk-linux_x86 -type d -exec chmod 777 {} \;
You’ll need to add the Android SDK path near the top of your ~/.bash_profile or ~/.bashrc:
export PATH=${PATH}:/usr/local/android-sdk-linux_x86/tools
To manage your Android SDK packages and virtual devices, you’ll need to run the android app:
user@ubuntu:~# android
First go to Available Packages and download version 1.6 and 2.2 Android SDK packages. You can also choose to download the documentation, samples, and Google APIs.


Downloading the package may take several minutes. You don’t have to create a virtual device right now if you are planning on installing Appcelerator’s Titanium platform. You can exit the Android app when you’re done.
Desktop Apps
If you’re running Ubuntu Desktop, there are a couple of handy apps I install. The first is Google Chrome, which can be downloaded directly from the Google Chrome download page.
I find KCachegrind and GHex to be useful:
user@ubuntu:~# sudo apt-get install kcachegrind ghex
Appcelerator Titanium
Titanium is an awesome platform for developing desktop applications for Linux, Mac OS X, and Windows as well as mobile apps for iPhone and Android. We use Titanium Developer to create Titanium projects. Begin by downloading the 64-bit version of Titanium:
user@ubuntu:~# wget -O titanium.tgz http://www.appcelerator.com/download-linux64
There’s also a 32-bit version available at http://www.appcelerator.com/download-linux32.
Next we unpack Titanium Developer and move it to a safe place:
user@ubuntu:~# tar xzf titanium.tgz
user@ubuntu:~# rm titanium.tgz
Next you need to run the installer by double-clicking the Titanium Developer executable. Run the executable and then click the Install button. You can try installing to /opt/titanium, but you might need root privileges.


Next, there are a few issues with outdated libraries, so we simply delete them:
user@ubuntu:~# rm ~/.titanium/runtime/linux/1.0.0/libgobject-2.0.*
user@ubuntu:~# rm ~/.titanium/runtime/linux/1.0.0/libglib-2.0.*
user@ubuntu:~# rm ~/.titanium/runtime/linux/1.0.0/libgio-2.0.*
user@ubuntu:~# rm ~/.titanium/runtime/linux/1.0.0/libgthread-2.0.*
Titanium Developer also complains if /bin/java doesn’t exist, so create a quick link:
user@ubuntu:~# sudo ln -s /usr/bin/java /bin/java
Relaunch Titanium Developer and enter your login credentials. If you don’t have a login, you can get a free account.

After signing in, you may notice there are some updates available in the upper right corner of the window. Click in the box and the updates will be downloaded and installed.

Optionally you can create a launcher icon for your GNOME panel. Don’t forget to escape spaces in the command with a backslash!

Finishing Touches
Lastly, I like to re-arrange my desktop to maximize my coding real estate.

Conclusion
That should get you up and running with a neato dev environment. If you need to run SSL, I wrote a post on Creating Self-Signed Certs on Apache 2.2 and Virtual Hosts and Wildcard SSL Certificates with Apache 2.2.
If you find any typos or additions, please feel free to sound off in the comments!
LCA Miniconf Call for Papers: Data Storage: Databases, Filesystems, Cloud Storage, SQL and NoSQL
September 29th, 2010
This miniconf aims to cover many of the current methods of data storage and retrieval and attempt to bring order to the universe. We’re aiming to cover what various systems do, what the latest developments are and what you should use for various applications.
We aim for talks from developers of and developers using the software in question.
Aiming for some combination of: PostgreSQL, Drizzle, MySQL, XFS, ext[34], Swift (open source cloud storage, part of OpenStack), memcached, TokyoCabinet, TDB/CTDB, CouchDB, MongoDB, Cassandra, HBase….. and more!
Call for Papers open NOW (Until 22nd October).
More on dangers of the caches
September 24th, 2010
A couple of weeks ago I wrote about the dangers of bad cache design. Today I was troubleshooting a production-down case that had a fair number of issues related to how the cache was used.
The situation was as follows. An update to the codebase was deployed and caused performance issues, so it was rolled back, yet the problem remained. This is a very common case: the customer tells you everything is the same as it was yesterday… but it does not work today.
When I hear these words I like to tell people that computers are state machines and work in a predictable way. If it does not work today the way it worked yesterday, something changed… you just may not recognize WHAT changed. It may be something as subtle as a change in a query plan or an increase in search engine bot activity. It may be the RAID write-back cache being disabled while the battery goes through a learning cycle, but there must be something. This is where trending often comes in handy: graphs will often expose which metrics became different; they just need to be detailed enough.
So back to this case… MySQL was getting overloaded with thousands of identical queries… which corresponded to a cache miss storm, but why was this not a problem before? The answer lies in caching as well. When software is deployed, memcached is cleared to avoid potential issues with different cache content, so the system has to start with a cold cache, which overloads it, and it never recovers. An expiration-based cache increases the chance that the system will not gradually recover by populating the cache: if cache misses make performance so bad that new items are added to the cache more slowly than existing items expire, the system may never warm up.
But wait… was this the first change? Was the code never updated before? Of course it was. As is often the case with serious failures, more than one cause pushed the system over the top. A normal deployment is done at night when traffic is low, so even if the system has higher load and worse response time for several minutes after the code is updated, the traffic is not high enough to push it into a state from which it cannot recover. This time the code update was not successful, and by the time the rollback was completed the traffic was already high enough to cause problems.
So the immediate solution to bring the system up was surprisingly simple. We just had to let traffic onto the system in stages, allowing memcached to warm up. There was no code that would allow us to do this on the application side, so we did it on the MySQL side instead: “SET GLOBAL max_connections=20” limits the number of connections to MySQL, so the application gets errors when it tries to put too much load on MySQL. As the MySQL load stabilizes, you raise the number of connections until you can finally serve all traffic without problems.
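The staged increase can be sketched as a simple schedule. The Python below only prints the statements; the starting value of 20 comes from the case above, while the target of 500 and the doubling factor are assumptions chosen for illustration, and the function name connection_ramp is hypothetical.

```python
def connection_ramp(start: int, target: int, factor: float = 2.0):
    """Yield increasing connection limits from start up to target."""
    if start < 1 or target < start or factor <= 1:
        raise ValueError("need 1 <= start <= target and factor > 1")
    limit = start
    while limit < target:
        yield limit
        # Grow geometrically, but always by at least 1 and never past target.
        limit = min(target, max(limit + 1, int(limit * factor)))
    yield target

for limit in connection_ramp(20, 500):
    # In practice you would wait for MySQL load to stabilize before each step.
    print(f"SET GLOBAL max_connections={limit};")
```

How long to wait between steps is a judgment call; watching the cache hit ratio and MySQL load is probably a better signal than a fixed timer.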
So what can we learn from this, besides the cache design issues I mentioned in the previous post?
Include Rollback in the Maintenance Window. Ensure you plan a maintenance window long enough that you can do the rollback inside it, and do not hesitate to roll back if you’re running out of time. Know how long the rollback takes and have it well prepared. Way too often I see people trying to make things work until the time allocated for the operation is up, and then the rollback has to be done outside the allowed time window.
Know Your Cold Cache Performance and Behavior. Know how your application behaves with a cold cache. Does it recover, or does it just die under high traffic? How high is the response time penalty, and how long does it take to reach normal performance?
Have a Way to Increase Traffic Gradually. There are many reasons beyond caching why you may want to slowly ramp up the traffic on a system. Make sure you have some means to do that. I’d recommend doing it per user session, so some users are in and can use the system fully while others have to wait for their turn to get in. That is a lot better than doing it on a per-page basis, where random pages give error messages. In some cases you can also ramp up feature by feature.
Consider Pre-Priming Caches. In some cases, when cold-cache response time is too bad, you may want to prime the caches by running or replaying some production workload on the system before it is put online. In this case all the ramp-up and suffering from bad response time can be done by a script… which does not care.
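As an illustration of ramping up by user session, here is a minimal Python sketch that admits a stable percentage of sessions based on a hash of the session id. The function name is_admitted and the session id format are assumptions made for this example; a real application would hook this into its own session handling.

```python
import hashlib

def is_admitted(session_id: str, percent: int) -> bool:
    """Admit a stable subset of sessions: the same id always gets the same answer."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # bucket in 0..99
    return bucket < percent

sessions = [f"session-{n}" for n in range(1000)]  # made-up session ids
for percent in (10, 50, 100):
    admitted = sum(is_admitted(s, percent) for s in sessions)
    print(f"{percent}% ramp: {admitted} of {len(sessions)} sessions admitted")
```

Because the bucket depends only on the session id, a user admitted at 10% stays admitted as the percentage is raised, so nobody is thrown out mid-session during the ramp-up.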
Entry posted by peter | No comment
Caching could be the last thing you want to do
July 24th, 2010
I recently had a run-in with a very popular PHP ecommerce package, which makes me want to voice a recurring mistake I see in how many web applications are architected.
What is that mistake?
The ecommerce package I was working with depended on caching. Out of the box it couldn’t serve 10 pages/second unless I enabled some features which were designed to be “optional” (but clearly they weren’t).
I think with great tools like memcached it is easy to get carried away and use it as the mallet for every performance problem, but in many cases it should not be your first choice. Here is why:
- Caching might not work for all visitors - You look at a page, it loads fast. But is this the same for every user? Caching can sometimes be an optimization that makes the average user have a faster experience, but in reality you should be caring more that all users get a good experience (Peter explains why here, talking about six sigma). In practice it can often be the same user that has all the cache misses, which can make this problem even worse.
- Caching can reduce visibility – You look at the performance profile of what takes the most time for a page to load and start trying to apply optimization. The problem is that the profile you are looking at may skew what you should really be optimizing. The real need (thinking six sigma again) is to know what the miss path costs, but it is somewhat hidden.
- Cache management is really hard – have you planned for cache stampeding, or many cache items being invalidated at the same time?
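To make the stampede point concrete, here is a minimal in-process Python sketch in which only one thread recomputes a missing key while the others wait for its result. It is an illustration under stated assumptions, not how memcached itself behaves: the class SingleFlightCache and the function expensive_query are hypothetical, there is no expiry, and a multi-server deployment would need a shared lock rather than a thread lock.

```python
import threading

class SingleFlightCache:
    """In-process cache where only one thread recomputes a missing key."""

    def __init__(self):
        self._values = {}
        self._key_locks = {}
        self._guard = threading.Lock()

    def get(self, key, compute):
        if key in self._values:              # fast path: cache hit
            return self._values[key]
        with self._guard:                    # create one lock object per key
            lock = self._key_locks.setdefault(key, threading.Lock())
        with lock:
            if key not in self._values:      # re-check: another thread may have filled it
                self._values[key] = compute()
            return self._values[key]

calls = []

def expensive_query():
    calls.append(1)                          # stands in for a slow MySQL query
    return "result"

cache = SingleFlightCache()
threads = [threading.Thread(target=cache.get, args=("page:1", expensive_query))
           for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("backend calls:", len(calls))          # 20 concurrent misses, 1 backend call
```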
What alternative approach should be taken?
Caching should be seen more as a burden that many applications just can’t live without. You don’t want that burden until you have exhausted all other easily reachable optimizations.
What other optimizations are possible?
Before implementing caching, here is a non-exhaustive checklist to run through:
- Do you understand every execution plan of every query? If you don’t, set long_query_time=0 and use mk-query-digest to capture queries. Run them through MySQL’s EXPLAIN command.
- Do your queries SELECT *, only to use a subset of the columns? Or do you extract many rows, only to use a subset? If so, you are extracting too much data, and (potentially) limiting further optimizations like covering indexes.
- Do you have information about how many queries were required to generate each page? Or more specifically do you know that each one of those queries is required, and that none of those queries could potentially be eliminated or merged?
I believe this post can be summed up as “Optimization rarely decreases complexity. Avoid adding complexity by only optimizing what is necessary to meet your goals.” – a quote from Justin’s slides on instrumentation-for-php. In terms of future-proofing design, many applications are better off keeping it simple and refusing the temptation to try and solve some problems “like the big guys do”.
Entry posted by Morgan Tocker | No comment
As of late…
July 15th, 2010
* Fixing bugs in DBD::mysql: just released 4.015 and 4.016, with 4.017 next. I had a patch sent yesterday from a user/developer that I want to get out there
* Memcached::libmemcached - 0.4201 version - now using latest libmemcached 0.42. This is the only Perl client that supports binary protocol!
patg@patg-desktop:~/code_dev/perl-libmemcached$ PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib/lib', 'blib/arch')" t/12-set-get-binary.t
t/12-set-get-binary....ok
All tests successful.
Files=1, Tests=5, 0 wallclock secs ( 0.04 cusr + 0.01 csys = 0.05 CPU)
Whoot!
* FederatedX (in Maria) - fixing MySQL bug 32426, https://bugs.launchpad.net/maria/+bug/571200 . This involves a little work as it is fixed in Federated (not FederatedX) and FederatedX has a whole new design using an IO class to abstract database driver details as well as numerous other changes. But it will happen.
* Delving into C++ Boost libraries. These look quite useful!
Webinar today – Scaling Web Services with MySQL Cluster, Part 1: An Alternative to MySQL Server & memcached
June 9th, 2010
MySQL and memcached have become, and will remain, the foundation for many dynamic web services with proven deployments in some of the largest and most prolific names on the web. There are classes of web services, however, that are update-intensive, demanding real-time responsiveness and continuous availability. In these cases, MySQL Cluster provides the familiarity and ease-of-use of the regular MySQL Server, while delivering significantly higher levels of write performance with less complexity, lower latency and 99.999% availability. This webinar will discuss the use-cases for both approaches, and provide an insight into how MySQL Cluster is enabling users to scale their update-intensive web services.
The webinar starts at 09:00 Pacific/17:00 UK/18:00 CET today (June 9th 2010).
Still time to register (for free) at http://www.mysql.com/news-and-events/web-seminars/display-545.html – even if you can’t attend, this way you’ll get sent a link to the charts and replay.
On Good Instrumentation
June 1st, 2010
In so many cases when troubleshooting applications, I keep thinking how much more efficient things would be if only good instrumentation were available. Most applications out there have very little code to help understand what is going on, and where it exists it frequently looks at metrics which are not very helpful.
If you look at a system from a bird’s-eye view, it needs to process transactions. You want it to successfully complete a large share of the transactions it gets (this is what is called availability), and you want it to serve them within a certain response time, which is what is called performance. Many variables in the environment may change (load, number of concurrent users, the database, the way users use the system), but in a nutshell all you really care about is having predictable response time within a certain range. So if we care about response time, this is exactly what our instrumentation should measure.
Response Time Summary. We want to understand exactly where response time comes from. For example, if we define a transaction as generating an HTML page, we want to understand how much of its time was spent waiting on the database, memcached, and other external services, as well as how much CPU time it consumed.
What is important is that we need this information for individual transactions. It may be every transaction, which is best and easily achievable for small to medium systems, or at least a large enough sample. It is very important that this information is available for individual transactions, not as an average. The average is useless because 100 transactions taking 1 sec each, and 99 transactions taking 1 ms plus one taking 99.901 sec, have the same average, while for the sake of performance analysis they are completely different. When you take a transaction sample, make sure it contains a fair population of transactions: getting only slow transactions is not helpful, as we may want to compare them to fast transactions to understand why they are slow.
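The point about averages is easy to check with a few lines of Python. The sketch below builds the two samples (using an outlier of 99.901 sec so that both means come out at 1 sec) and compares the mean with the median and the maximum; the nearest-rank percentile helper is just one simple way to compute it.

```python
from statistics import mean

uniform = [1.0] * 100                  # 100 transactions of 1 s each
skewed = [0.001] * 99 + [99.901]       # 99 fast transactions and a single outlier

def percentile(samples, pct):
    """Nearest-rank percentile of a list of response times."""
    ordered = sorted(samples)
    rank = max(1, -(-len(ordered) * pct // 100))  # ceil(n * pct / 100)
    return ordered[rank - 1]

for name, sample in (("uniform", uniform), ("skewed", skewed)):
    print(f"{name}: mean={mean(sample):.3f}s "
          f"p50={percentile(sample, 50):.3f}s max={max(sample):.3f}s")
```

Both samples report a mean of about one second, yet in the second one 99% of users saw a millisecond response and one user waited over a minute and a half.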
What components do you need in a response time summary? All components which are significant enough. If your instrumentation leaves 95% of response time unaccounted for, it is useless. You also do not want blocks that mix apples with oranges. For example, a “mysql and memcache” block would not be helpful. I would even prefer to split “mysql time” into “connect time” and “query time”, as there are situations when one but not the other is affected.
It is important for the response time summary to be stored in logs which are easy to query, so you can analyze the data in many different ways. Sometimes you may find the response time is impacted by queries from a certain user; in other cases it may be attributed to a particular application or web server.
The goal of the Response Time Summary is to quickly point in the direction where the problem happens. Whenever you have a spike in response time, or bad response time for a certain kind of request, you can quickly understand where it comes from. Is it wasted CPU time? A slow response from MySQL or memcached?
I also like to see the number of calls stored together with the attributed response time. For example, I’d like to see the number of MySQL calls in addition to MySQL response time. This helps to understand whether the issue is the number of queries or their performance. If I see 2 queries taking 30 seconds, it is clearly slow queries. If 10,000 queries were executed and total response time is 4 sec (0.4 ms per query), I know this is pretty much as good as it gets on a standard Ethernet network, and finding ways to reduce the number of queries is going to be more helpful.
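A response time summary with call counts can be quite small. The following Python sketch accumulates wall-clock time and number of calls per component for one request; the class name ResponseTimeSummary and the component labels are assumptions for this example, and the sleep calls merely stand in for real MySQL and memcached work.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

class ResponseTimeSummary:
    """Accumulates wall-clock time and call counts per component for one request."""

    def __init__(self):
        self.seconds = defaultdict(float)
        self.calls = defaultdict(int)

    @contextmanager
    def measure(self, component):
        started = time.perf_counter()
        try:
            yield
        finally:
            # Record the call even if the wrapped code raised an exception.
            self.seconds[component] += time.perf_counter() - started
            self.calls[component] += 1

summary = ResponseTimeSummary()
for _ in range(3):
    with summary.measure("mysql_query"):
        time.sleep(0.001)              # stands in for a real query
with summary.measure("memcache"):
    time.sleep(0.001)                  # stands in for a real cache lookup

for component in summary.calls:
    print(component, summary.calls[component], f"{summary.seconds[component]:.4f}s")
```

Writing one such summary line per request to a queryable log gives you both the per-component time and the call counts discussed above.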
The Glue. Our applications involve multiple layers, and typically a higher layer can only report the response time of a call to a lower layer, not the reason for that response time. For example, we can report the time it took to execute an SQL query from the PHP application side, but we can’t say why it took so long. Was it a row-level lock? Waiting on disk IO? Or was it simply a matter of burning a lot of CPU? On the other hand, this information may be available in the instrumentation stats of the lower layer, for example in the MySQL query log. What is important, however, is to be able to connect the data from these logs, to glue them together. The easy way to do it is to assign a unique identifier to every request and put it in the logs along with the requests to the lower levels. With MySQL, the simple way to do this is to put it in a comment in the queries you execute.
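Tagging queries with a request identifier might look like the Python sketch below. The comment format request_id=… and the helper names are made up for illustration, and it assumes your client library passes SQL comments through to the server unchanged (some clients strip them by default).

```python
import uuid

def new_request_id() -> str:
    """Generate an identifier to share across all log lines of one request."""
    return uuid.uuid4().hex

def tag_query(sql: str, request_id: str) -> str:
    """Prefix a query with a comment carrying the request id."""
    # Keep only harmless characters so the id can never close the comment early.
    safe_id = "".join(ch for ch in request_id if ch.isalnum() or ch in "-_")
    return f"/* request_id={safe_id} */ {sql}"

request_id = new_request_id()
print(tag_query("SELECT name FROM users WHERE id = 42", request_id))
```

The same identifier would then be written to the application log, so a slow entry in the MySQL log can be matched to the page request that issued it.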
Optional Tracing. The information in lower-layer logs is very helpful; however, it typically has two problems. First, not every layer has good logs. For example, if you’re running memcached you probably do not have logs detailing all requests and their response times. Second, the lower layer may only know response time from its own vantage point, which in many cases does not include network communication time, which can be very important.
Tracing should be optional and normally applied to a small sample of requests, though it needs to be detailed. Typically you would include the calls to the lower-level services together with a timestamp, and the response summary with a timestamp again. The information about a request has to be complete enough to identify the target action and the response completely. For example, if I’m speaking about memcached, I’d like to know which server:port the request was issued to and which key was requested, and on response I’d like to know whether it was a hit, a miss, or an error.
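A sampled trace of memcached calls could be recorded along the following lines. This Python sketch is only one possible shape: the class names, the 1% default sample rate, and the example server name and key are assumptions, and the elapsed time is passed in rather than measured.

```python
import random
import time
from dataclasses import dataclass, field

@dataclass
class CacheTraceEvent:
    server: str          # "host:port" the request was sent to
    key: str             # which key was requested
    outcome: str         # "hit", "miss" or "error"
    started_at: float    # timestamp when the call was issued
    elapsed: float       # seconds until the response arrived

@dataclass
class RequestTrace:
    sample_rate: float = 0.01                       # trace about 1% of requests
    rng: random.Random = field(default_factory=random.Random)
    events: list = field(default_factory=list)
    enabled: bool = field(init=False, default=False)

    def __post_init__(self):
        # Decide once per request whether it is traced at all.
        self.enabled = self.rng.random() < self.sample_rate

    def record(self, server, key, outcome, started_at, elapsed):
        if self.enabled:
            self.events.append(
                CacheTraceEvent(server, key, outcome, started_at, elapsed))

trace = RequestTrace(sample_rate=1.0)               # trace every request for the demo
trace.record("memcache01:11211", "user:42", "miss", time.time(), 0.0004)
print(trace.events)
```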
The way I use it may be as follows. I see increased response time for a given kind of request. I see the response time is coming from MySQL. I check the number of MySQL queries and it is 5x what it usually is for this kind of request. Looking at memcached stats, I can see a high number of misses. Looking at some available traces shows that the server memcache01 has a very high miss ratio. Checking what is going on with memcache01 shows it was just restarted (and hence has an almost empty cache). This is an important example, as it shows that increased response time from MySQL may have nothing to do with MySQL itself, but you would not know unless you’re capturing the right data.
If you’re looking for a nice example framework for instrumentation, check out instrumentation for PHP. It has everything mentioned but tracing, which is trivial to add.
Entry posted by peter | No comment




