Archive for the ‘Linux’ Category

Mixed signals in IT’s great war over IP

Май 10th, 2012

Recent news that Microsoft and Barnes & Noble agreed to partner on the Nook e-reader line rather than keep fighting over intellectual property suggests the prospect of more settlement and fewer IP suits in the industry. However, the deal further obscures the blurry IP and patent landscape currently impacting both enterprise IT and consumer technology.

It is good to see settlement — something I’ve been calling for, while also warning against patent and IP aggression. However, this settlment comes from the one conflict in this ongoing war that was actually shedding some light on the matter, rather than further complicating it.

See the full article at TechNewsWorld.


PlanetMySQL Voting: Vote UP / Vote DOWN

Mixed signals in IT’s great war over IP

Май 10th, 2012

Recent news that Microsoft and Barnes & Noble agreed to partner on the Nook e-reader line rather than keep fighting over intellectual property suggests the prospect of more settlement and fewer IP suits in the industry. However, the deal further obscures the blurry IP and patent landscape currently impacting both enterprise IT and consumer technology.

It is good to see settlement — something I’ve been calling for, while also warning against patent and IP aggression. However, this settlment comes from the one conflict in this ongoing war that was actually shedding some light on the matter, rather than further complicating it.

See the full article at TechNewsWorld.


PlanetMySQL Voting: Vote UP / Vote DOWN

Installing MySQL from source/binary tarball as a Linux service

Май 1st, 2012

I've written before I prefer to do a manual install of MySQL over a repository one. I still do: I typically install from binary tarball or by compiling from source.

I'd like to share my setup procedure for Linux installation and service setup. I've done this dozens of times, on different Linux flavors, and it works well for me.

Installing from source

To get this straight: you sometimes have to compile the source files. I, for example, happen to use the Sphinx MySQLSE extension. You can only use it if compiled with MySQL. You had to compile a "vanilla" 5.1 version without query cache in order to completely remove the cache's mutex contention.

Anyway, I find the easiest way is to install onto a path associated with the server version. For example, I would install a 5.5 server onto /usr/local/mysql55

This way, a new version gets its own path, and no ambiguity.

To do that, use the prefix option on configuration step:

cd /path/to/extracted/source/tarball
sh BUILD/autorun.sh
./configure --prefix=/usr/local/mysql55
make
sudo make install

Once this is complete, you have everything under /usr/local/mysql55. This means binaries, libraries, scripts, etc.

To install the MySQL server as a service, copy the mysql.server script to /etc/init.d:

sudo cp /usr/local/mysql55/support-files/mysql.server /etc/init.d/mysql55

Again, I'm naming the script after the MySQL version. This avoids conflict with possible past or future installations of the MySQL server, which typically create a service named mysql or mysqld.

A thing to note about the mysql.server script is that it allows you (at around line #45) to set two variables:

  • basedir: path to your installation directory. When compiling from source this is already setup with the path provided to the configure script. Thus, in our example, you can expect this variable to read /usr/local/mysql55. So basically nothing to do here.
  • datadir: path to your data directory. If you're putting your my.cnf file in /etc or /etc/mysql, then setting datadir in my.cnf suffices. However, if you're going to put my.cnf itself on the data directory (e.g. so as to avoid collisions) then make sure to set the variable in the mysql.server init script.

Depending on your $PATH configuration, it is also a good idea to specify basedir variable on your my.cnf's [mysqld] section.

Which leads us to $PATH: your linux system is still unaware of the many binaries you've got in there. I typically add the following line at the end of /etc/bash.bashrc:

export PATH=/usr/local/mysql55:${PATH}

This is the most global PATH settings one can do. Alternatively, use /etc/profile, ~/.bashrc etc. (you may have noticed by now I'm working with bash).

Finally, need to setup the init script to run at startup and stop at shutdown.

  • On Debian/Ubuntu/related I use rcconf (I'm too lazy to remember the command line setup).
  • On RedHat/CentOS/related I use chkconfig --add mysql55, or  linuxconf (since I'm lazy).

Installing from binary tarball

The only difference is that the mysql.server script is unaware of our deployment path. So the basedir variable must be set in that file. Other than that, follow same steps as for source installation (oh, of course no need to configure & make...).


PlanetMySQL Voting: Vote UP / Vote DOWN

CAOS Theory Podcast 2012.04.20

Апрель 20th, 2012

Topics for this podcast:

*OpenStack, Amazon, Eucalyptus and Citrix engage in open cloud warfare
*Microsoft spins off new company for openness
*Updates on automation players Puppet Labs and Opscode with Chef
*Percona turns attention to MySQL high availability
*Open APIs as the fifth pillar of modern IT openness

iTunes or direct download (28:42, 4.9MB)


PlanetMySQL Voting: Vote UP / Vote DOWN

Installing MySQL-Frontend Chive (A phpMyAdmin Alternative)

Апрель 19th, 2012

Installing MySQL-Frontend Chive (A phpMyAdmin Alternative)

This guide explains how to install the phpMyAdmin alternative Chive. Chive is a free, open source, web-based database management tool with easy administration, super fast UI and state of the art web technologies. It takes advantage of the capabilities of modern browsers. Features include an SQL editor with syntax highlighting and built-in profiling of SQL queries.


PlanetMySQL Voting: Vote UP / Vote DOWN

Installing MySQL-Frontend Chive (A phpMyAdmin Alternative)

Апрель 19th, 2012

Installing MySQL-Frontend Chive (A phpMyAdmin Alternative)

This guide explains how to install the phpMyAdmin alternative Chive. Chive is a free, open source, web-based database management tool with easy administration, super fast UI and state of the art web technologies. It takes advantage of the capabilities of modern browsers. Features include an SQL editor with syntax highlighting and built-in profiling of SQL queries.


PlanetMySQL Voting: Vote UP / Vote DOWN

Analyzing I/O performance

Апрель 19th, 2012

There are probably thousands of articles on the Internet about disk statistics in Linux, what various columns mean, how accurate the information is, and so on. I decided to attack the problem from a little bit more practical side. Hopefully this will be just the first of many future posts on identifying various I/O related performance problems on a MySQL server.

Linux exposes disk statistics through /proc/diskstats. However the contents of this file isn’t something anyone can understand quickly. It needs a tool to transform the information into something human readable. A tool that is available for any Linux distribution is called iostat and comes with sysstat package.

How to access and read I/O statistics

Usually you want to call iostat one way:

iostat -xkd <interval> <block device>

The interval should typically be one second as it is the base unit for many things, but also because it gives the best level of detail. With longer intervals some of it could be averaged out. This is a typical example of where any analysis begins:

bash-4.2 # iostat -xkd 1 /dev/drbd0

[..]

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          22.35    0.12    9.61    3.12    0.00   64.79

Device: rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
drbd0     0.00     0.00   16.00   79.00   192.00  1408.00    16.84     2.66   27.96   3.75  35.60

Note: Make sure to ignore the first round of output iostat produces as it shows the statistics concerning the time since the system was booted rather than only a current period.

What does it say? In interval of the specified length, /dev/drbd0 device was doing 16 reads per second (r/s) and 79 writes per second (w/s). The average I/O request size was roughly 8.5 kilobytes (avgrq-sz * 512 bytes), while the average wait time 27.96 milliseconds (await).

How to use the information?

Out of all these metrics, one is definitely more important than others. The average wait time says how much time an average I/O request spent both waiting in queue and executing, which you should read as how long my database may had to wait on a disk access. It’s the response time. This is only an average, so by definition not a very accurate information, however in most cases it tells enough to recognize a potential problem. For example, nearly thirty milliseconds could be a lot if one expects queries to return within only a few or less. The queries that find information buffered in memory may still meet the expectations, however the disk-bound ones could very well become many times slower.

It’s important to relate the response time to the capabilities of the storage on which it is measured. You should not expect the same range of values from an EBS volume as from a locally attached RAID array.

Newer versions of iostat provide even more useful information by splitting await into two separate columns: one for reads (r_await) and one for writes (w_await). This way you can see how each type of activity contributes to the global average, but more importantly, it helps to understand more precisely where the problems may be. Unfortunately this version is not available on many Linux distributions, at least not by default.

But there is another way to get the extra insight. A script called pt-diskstats from Percona Toolkit reads information from /proc/diskstats directly and generates an output that includes these more detailed statistics.

bash-4.2 # pt-diskstats -d 'drbd0$' -c 'rd_s|rd_rt|rd_cnc|wr_s|wr_rt|wr_cnc'

#ts device   rd_s rd_cnc   rd_rt    wr_s wr_cnc   wr_rt
{1} drbd0     7.9    0.0     1.0    62.2    0.9    15.1
{1} drbd0     4.6    0.0     1.1    57.9    1.4    24.2
[..]

Note: The example uses pt-diskstats 1.0. The current version is 2.1 and implements a different naming scheme for the command line parameters, however I discovered the newer versions sometimes used to produce incorrect results and therefore I postponed my upgrade until later time.

The output is a bit different from iostat, but it shows the same information. rd_rt and wr_rt are response times of the respective I/O request types and they carry the same information as await.

The next step would be looking at how many I/O operations happen every second. If the sum of r/s and w/s is close to the storage theoretical capacity, it can very well explain the elevated await times. In order words, this could mean the system is running out of the I/O capacity. Being able to identify such situation is important as it can save you from investigating a performance problem when its direct cause is just in front of your eyes. This is why you should always benchmark a storage before it is put to use, although please remember that you need to benchmark for IOPS rather than for throughput.

Afterwards it may also be worth checking at how much data is being pushed in or out. A very long sequential read or write may need significantly more time compared to a much shorter one, so a larger avgrq-sz could explain the extended waits. Until you learn what values make sense and which look odd, this is where a good graphing system comes in handy as it would enable you to compare the historical sizes to current ones. In practice, however, this metric is mostly useful in discovering problems that are not coming from database itself as the only sequential I/O it can be doing are read-aheads and they aren’t very common. A running backup, some runaway rsync – these are more typical examples of things that could noticeably affect it. Related values are shown in rkB/s (or rsec/s) and wkB/s (or wsec/s), although they show the accumulated size of reads or writes respectively rather than just per-request average.

One last item is %util. It shows what percentage of the interval the block device was busy serving requests. People often tend jump to conclusions about a storage performance only by looking at this number, because it is easy. I almost never look at it. The value here can sometimes be very misleading as Linux does not recognize that a block device it sees, may be a logical RAID volume that down below consists of multiple physical disks. Therefore any parallel I/O execution will never be reflected properly in this metric, so while it can be used as a hint in certain cases (i.e. when it is close to 100%), you should always verify the actual utilization through other metrics. The simplest way is calculating the real value through the following formula:

concurrency = (r/s + w/s) * (svctm / 1000)
%util = concurrency * 100%

There are two important pieces of information in here. One is that request concurrency can be calculated. Knowing the value may become important when figuring out problems with I/O requests serialization which occurs in applications (e.g. MySQL working as replication slave where queries execute in a single thread). Or it may show that the concurrency is higher than the number of disks in an array, which would imply the storage cannot keep servicing incoming requests. And the second is that utilization can in fact be higher than 100% with a storage built from multiple disks, which is what iostat would never show.

Working with the information

In the final section, let’s go through an example.

Good graphs can be useful. I like to graph the iostat or pt-diskstats output over longer periods, a few minutes at least, because it allows seeing patterns and spotting problems without even bothering yourself with looking at numbers until there is actually a reason.

What we can see is that write response times are quite high and periodically spike to 200 milliseconds or even higher, while at the same time reads seem to be doing well. There appears to be no unusual activity around the spikes, though. The I/O utilization remains in a reasonable range of 100 – 150 IOPS, which is something even a single disk should handle without delays.

With locally attached, ordinary spinning disks the response times should hardly ever cross the ten millisecond threshold. That limit is sufficient for most disks that are commonly installed in database servers. Assuming a RAID array with some write-back caching, write response time shouldn’t even be anywhere near that, because everything should be going straight to a much faster storage – the cache memory. But for some reason we see 15ms or more.

The Throughput and Requests Size graph doesn’t add anything interesting. Nothing there can be correlated with the increased response times.

However when we zoom in on one of the spikes in the I/O Utilization and Performance graph, we can make one interesting observation. Reads not only work without any issues, but seem completely unaffected by the problem (blue and red lines on the graph). Why?

We should focus on the block device for a moment. The statistics were pulled from a DRBD volume, which holds the MySQL data files. DRBD is a block device driver working on top of an another block device, which intercepts writes and synchronously sends modified blocks to another system. Synchronously means that it blocks an I/O request until the write completes in both locations. From there we can quickly conclude that reads are fine, because they always come from a locally attached disk. Writes are slower, because they need to be synchronized over the network to the secondary storage. So perhaps it is the other server that isn’t performing well, or maybe there are problems on the network layer, or maybe DRBD has been poorly configured… Of course, at this point all we have are guesses, but together with the data collected, these guesses set a pretty good direction for further investigation.


PlanetMySQL Voting: Vote UP / Vote DOWN

Creating a local repository for Yum to work

Апрель 17th, 2012


You cannot use Yum to an unregistered Linux by default after installation. As a workaround, you will have to create a repo from your installation CD or ISO file.

1. Mount your DVD/CDROM. Run this command from shell.

mount /dev/cdrom /mnt


2. Or if you have no DVD/CDROM, you can copy your ISO file to the server and mount like this.

mount -o loop -t iso9660 yourisofile.iso /mnt


3. Change directory to /mnt and run this command

yum clean all


5. Edit /etc/yum.repos.d/iso.repo. Use nano or vi.

nano /etc/yum.repos.d/iso.repo


6. Paste below and save.

[local]
name=Local CD Repo
baseurl=file:///mnt
gpgcheck=1
gpgkey=file:///mnt/RPM-GPG-KEY


Now try installing using yum.

PlanetMySQL Voting: Vote UP / Vote DOWN

How to install and configure a Linux server for MySQL?

Апрель 12th, 2012

Have you ever spent a lot of time thinking about how to install and configure a Linux server for MySQL database? I will try to highlight all the critical steps and some of the decisions you may need to make.

Linux distribution.

Unless you have a really good experience in systems administration, choose a widely supported Linux distribution. The best choices usually are RedHat or its free cousin called CentOS. Compatible alternatives you could also consider are Scientific Linux and Oracle Linux. Make sure you will be installing a 64-bit version, unless you have a very good reason not to.

Storage.

If you have multiple disks available in the server, create a single array from all disks. Choose RAID level that offers better performance rather than more disk space, so either RAID 1 or RAID 10 depending on the number of disks (“Should RAID 5 be used in a MySQL server?”). Use stripe (chunk) size of at least 128KB.

LVM.

You may skip it if database carries InnoDB tables only (other than system tables). Otherwise, or if unsure, LVM should be configured as it may help in creating consistent and lock-free MySQL backups. Make sure to leave some free space in the volume group where MySQL logical volume will be. 10% of the logical volume size could be a good default. An example of what you should see after installation:

garfield ~ # vgdisplay <your volume group name>
  --- Volume group ---
  VG Name               <your volume group name>
  System ID
  Format                lvm2
[..]
  VG Size               1,58 TiB
  PE Size               4,00 MiB
  Total PE              413626
  Alloc PE / Size       394426 / 1,51 TiB
  Free  PE / Size       19200 / 75,00 GiB
  VG UUID               aZcyWM-B1mc-5PEc-VeNf-W0fw-hLvV-12w29x

Partitioning.

For a dedicated database server, it is commonly enough to create four partitions:

  • /boot – tiny, to hold boot files – please refer to the system documentation for details
  • / – 20GB or 30GB for the system files and logs
  • swap – nothing larger than a few gigabytes makes sense
  • /mnt/db – all remaining space for MySQL data and binary logs

Filesystem.

Use whatever you want for system partitions. For MySQL partition, ext3 or ext4 are fine, but if you need to squeeze that last ounce of performance out of the system, go with xfs. It can deal much better with concurrent I/O.

Time synchronization.

All your servers should have their clocks updated periodically. Install NTP client and make sure it updates time daily. How to check whether clock is out of sync or not? Once you install ntpdate package, try the following:

garfield ~ # ntpdate -q pool.ntp.org
server 77.65.7.58, stratum 2, offset -29.763656, delay 0.04489
server 91.217.142.1, stratum 2, offset -29.776960, delay 0.08609
server 195.8.52.8, stratum 2, offset -29.761839, delay 0.04979
12 Apr 12:21:25 ntpdate[8676]: step time server 77.65.7.58 offset -29.763656 sec

The clock in my computer was clearly not synchronized :-)

Swap.

Prevent needless swap activity. Add vm.swappiness = 0 into /etc/sysctl.conf. Please refer to “How to prevent swapping on a MySQL server?” for details.

I/O scheduler.

By default any Linux uses CFQ algorithm for I/O scheduling. CFQ does not really care about I/O latency and may serialize requests, which means it can work against database performance. This is why it should be replaced with deadline, or even noop when server has a good hardware RAID controller, on physical volumes where MySQL data files will be stored.

In order to change the I/O scheduler at run time, simply write into a special pseudo-file /sys/block/<block device>/queue/scheduler, for example:

echo "noop" > /sys/block/sda/queue/scheduler

The change can be made permanent by adding such line into an init script such as /etc/rc.local or other that is appropriate.

MySQL configuration.

  1. Create MySQL data directory inside the volume mount point (e.g. /mnt/db/mysql). Do not use the mount point directly or you may be seeing the annoying lost+found database inside MySQL.
  2. Create a directory for binary logs (e.g. /mnt/db/binlog).

Alternatively, instead of making new /mnt/db/mysql, you may simply move the data directory created by the install script from a MySQL package (e.g. mv /var/lib/mysql /mnt/db/).

And do not forget about changing these directories ownership to mysql user!

In my.cnf, set the following:

  1. datadir to /mnt/db/mysql
  2. log-bin to /mnt/db/binlog/mysql-bin
  3. relay-log to /mnt/db/binlog/mysql-relay

Done.

At this point the server is ready for use and you can begin taking care of setting up privileges and tuning the MySQL configuration.


PlanetMySQL Voting: Vote UP / Vote DOWN

Fun with Bash :: one liners

Апрель 10th, 2012

Here are some quick and easy bash commands to solve every day problems I run into. Comment and leave some of your own if you like. I might update this post with new ones over time. These are just some common ones.

Iterate through directory listing and remove the file extension from each file
ls -1 | while read each; do new=`echo $each |sed 's/\(.*\)\..*/\1/'` && echo $new && mv "$each" "$new"; done

Output relevant process info, and nothing else
ps axo "user,pid,ppid,%cpu,%mem,tty,stime,state,command"| grep -v "grep" | grep $your-string-here

Setup a SOCKS5 proxy on localhost port 5050, to tunnel all traffic through a destination server
ssh -N -D 5050 username@destination_server'

Setup a SOCKS5 proxy via a remote TOR connection, using local port 5050 and remote TOR port 9050
ssh -L 5050:127.0.0.1:9050 username@destination_server'

Display text or code file contents to screen but don't display any # comment lines
sed -e '/^#/d' $1 < $file_name_here

Same as above but replacing # lines with blank lines
sed -e '/^#/g' $1 < $file_name_here

Find all symlinks in the current directory and subdirs
find ./ -type l -exec ls -l {} \;

Find all executable files in current directory and subdirs
find ./ -type f -perm -o+rx -exec ls -ld '{}' \;

Remove all files matching the input string
echo -n "filename match to remove [rm -i]: " && read f; find ./ -name ${f} -exec rm -i {} \;

Display largest ten files in current dir and subdirs
du -a ./ | sort -n -r | head -n 10

Display all files in current dir and subdirs in order of filesize
du -a ./ | sort -n -r

Generate a MD5 hash for the input string (not file)
Linux: echo -n "str: " && read x && echo -n "$x" | md5sum
OSX: echo -n "str: " && read x && echo -n "$x" | md5

Display a summary of all files in current and subdirs
for t in files links directories; do echo `find . -type ${t:0:1} | wc -l` $t; done 2> /dev/null

PlanetMySQL Voting: Vote UP / Vote DOWN