Archive for the ‘admin’ Category

How to quickly identify queries with pt-query-digest and pt-query-advisor from rules ?

Январь 16th, 2012

Today I’m working on integrate the Percona toolkit (instead of maatkit) in my own tools and I’m playing with pt-query-digest and pt-query-advisor.
These tools can be very interesting to identify some queries from established rules.
The  –review option is available for two of them and helps me to store a sample of each class of query and match them with an advice.

The rules (or advices) are available in the pt-query-advisor documentation and let you identify various problems such as queries with an argument with leading wildcard or with a table joined twice, for example.
There are 3 types of rules : note, warning and  critical.

The goal here is to quickly identify the queries corresponding to a rule, from the slow query log.
So, let me explain what was my method to do that, and, please, give me yours.

First, I need to create a table to store the queries (you can also use –create-review-table option) :

CREATE TABLE query_review (
   checksum     BIGINT UNSIGNED NOT NULL PRIMARY KEY,
   fingerprint  TEXT NOT NULL,
   sample       TEXT NOT NULL,
   first_seen   DATETIME,
   last_seen    DATETIME,
   reviewed_by  VARCHAR(20),
   reviewed_on  DATETIME,
   comments     TEXT
)

It’s interesting to note that the 3 last columns are used to annotate the queries. A simple PHP interface or script can be used to fill these fields (sorry, not enough time to write this script now).

Second easy step, put the queries from the slow query log to the review table with pt-query-digest :
(Note that my review table is in the mysql database and my slow query log is /var/log/slowq.log)

pt-query-digest /var/log/slowq.log --review <DSN OPTIONS>
DSN OPTIONS : u=<user>,p=<pass>,D=mysql,t=query_review,S=/tmp/mysql.sock

And obtain this result in database :
(fingerprint, sample and number of queries truncated for readability)

 
Now, generate an analyze file with pt-query-advisor from the review table :

pt-query-advisor --review <DSN OPTIONS> >/tmp/analyze_queries.log
DSN OPTIONS : u=<user>,p=<pass>,D=mysql,t=query_review,S=/tmp/mysql.sock

In this file, you can find a list of rules with associated queries (but you don’t need to edit the file, go to next step) :

CLA.007 0x4D97479E1774609A 0x6040F6055F330176 0x6BF38A80507513C8 0x7A4ACE71E642E539 0xCE21735A6006AB3D 0xEE13BCBC394FCD83

Finaly, to retrieve all the queries for a particular rule, use the command below :

cat /tmp/analyze_queries.log | grep <RULE ID>
| awk '{for ( i = 2 ; i <= NF ; i++ ) print substr($i,3)}'
| xargs -i mysql -u<user> -p<password> -E
-e"SELECT * FROM mysql.query_review WHERE CHECKSUM=CONV('{}',16,10)"

Replace the RULE ID, user and password with your informations.
The rules ID can be find in the pt-query-advisor documentation.

This is where the real work begins !

Hope that can help.


PlanetMySQL Voting: Vote UP / Vote DOWN

MySQL Cluster training Jan 24 — 26 in DC

Январь 13th, 2012

We still have a few seats left for our MySQL Cluster training in Washington DC January 24 - 26. If you're interested in learning about MySQL Cluster, the architecture, how to install it, administer it and troubleshoot it this is the course for you. The course will also cover replication and optimization and we will also discuss the exciting new features coming in version 7.2 of MySQL Cluster. For more information about the course contents visit http://www.skysql.com/services/training/courses/administering-mysql-cluster and to sign up to the course go to http://www.skysql.com/services/training/schedule/administering-mysql-cluster-1.

See you there!

 


PlanetMySQL Voting: Vote UP / Vote DOWN

[RELOADED] Vote for MySQL+ community awards 2011 !

Январь 5th, 2012

[UPDATE 2011/01/11] : New poll added, vote for the best GUI client tool ! (And continue to vote for other polls)
And thanks again for your involvement. It’s time to vote again… 

First of all, I wish you a happy new year.
Many things happened last year, it was really exciting to be involved in the MySQL ecosystem.
I hope this enthusiasm will be increased this year, up to you !

To start the year, I propose the MySQL+ Community Awards 2011
It will only take 5 minutes to fill out these polls.
Answer with your heart first and then with your experience with some of these tools or services.

Polls will be closed January 31, so, vote now !
For “other” answers, please,  let me a comment with details.

Don’t hesitate to submit proposal for tools or services in the comments.
And, please, share these polls !

 

Note: There is a poll embedded within this post, please visit the site to participate in this post's poll.
Note: There is a poll embedded within this post, please visit the site to participate in this post's poll.
Note: There is a poll embedded within this post, please visit the site to participate in this post's poll.
Note: There is a poll embedded within this post, please visit the site to participate in this post's poll.
Note: There is a poll embedded within this post, please visit the site to participate in this post's poll.
Note: There is a poll embedded within this post, please visit the site to participate in this post's poll.
Note: There is a poll embedded within this post, please visit the site to participate in this post's poll.
Note: There is a poll embedded within this post, please visit the site to participate in this post's poll.
Note: There is a poll embedded within this post, please visit the site to participate in this post's poll.

Happy 2012 !
Cédric

This article is obviously not sponsored !
(MySQL is a trademark of Oracle Corporation and/or its affiliates)

Sources :


PlanetMySQL Voting: Vote UP / Vote DOWN

Monitoring your monitoring tools (MONyog inside) !

Декабрь 2nd, 2011

Regardless of the monitoring tool you use to monitor your databases, it can be better to monitor this tool.
No, it’s not a joke ! Do you think you can have a benefit with a monitoring tool not connected to your servers ? ( without being alerted )

I choose to talk about MONyog here but this can apply to all existing monitoring tools.
I just want to share the message, the tool does not matter, so, do it !

So, let me explain how to control if you have fresh data with MONyog.
With MONyog it’s easy because it’s an agentless monitoring tool.

There are two ways to check that :

Per server general info :


 

 

 

 

 

 

For each server, you can find the last data updated date in the Monitors/Advisors tab.
The MySQL Availability in the dashboard is also usefull.

But it’s a per server information, so, that can be boring if you have a lot of databases servers.

General log file :

 

The default location for this file is MONyog_PATH/MONyog.log on your MONyog server.
You can find usefull informations there about MySQL, SFTP or SSH connexion failures for all your databases servers, in one place.

This log can also be used to detect client side errors if you don’t have access to the application servers.

Hope that can help, tell us what is the best method to do that with others monitoring tools like MySQL Enterprise Monitor.

Have a nice week-end.


PlanetMySQL Voting: Vote UP / Vote DOWN

My slides of MySQL Meetup Viadeo / LeMUG Paris

Ноябрь 22nd, 2011

I was glad to present how to schedule and monitor mysqldump with ZRM community last week in Paris as part of the MySQL Meetup Viadeo / LeMUG

You can find my slides below, enjoy ! :

Thanks to Olivier and all the viadeo team in Paris for this event.
And, of course, thanks to all attendees.
We need more events like that in France !

PlanetMySQL Voting: Vote UP / Vote DOWN

Why a new memory engine may change everything ?

Сентябрь 26th, 2011

I’m sure you are aware that the last Percona server release includes a new improved MEMORY storage engine for MySQL.
This new engine is based on Dynamic Row Format and offers some of great features, specialy for VARCHAR, VARBINARY, TEXT and BLOB fields in MEMORY tables.

But because this new MEMORY engine by Percona has some limitations and because Percona server hasn’t used it for its internal temporary tables yet, I would like to talk about what can be the real benefits to have a brand new MEMORY engine based on Dynamic row format specialy for internal memory tables.

Just remember or discover how MySQL uses internal memory tables

And the MEMORY storage engine characteristics and its limitations

So, the memory storage engine transforms all varchar fields in char fields for internal temporary tables or for user created memory tables. 

1. Let me explain what is the problem with a simple exemple :

I’ve created an InnoDB table (without index) with two varchar fields (50 & 100) :

mysql> show table status like 'test_memory5'\G
*************************** 1. row ***************************
           Name: test_memory5
         Engine: InnoDB
        Version: 10
     Row_format: Compact
           Rows: 621089
 Avg_row_length: 66
    Data_length: 41484288
Max_data_length: 0
   Index_length: 0
      Data_free: 5242880
 Auto_increment: NULL
    Create_time: 2011-09-21 13:10:17
    Update_time: NULL
     Check_time: NULL
      Collation: latin1_swedish_ci
       Checksum: NULL
 Create_options:
        Comment:

The size of this table is about 48Mb (and more than 600.000 rows) :

-rw-rw---- 1 mysql mysql  48M 2011-09-21 13:11 test_memory5.ibd

Now, I’m creating a new memory table with exactly the same structure and I’m setting paramters for memory tables like this :

  • set tmp_table_size=50*1024*1024;
  • set max_heap_table_size=50*1024*1024;

That’s mean I can create a 50Mb max memory table.
Thus, let me insert my 600.000 rows in this table :

mysql> insert into test_memory6 select * from test_memory5;
ERROR 1114 (HY000): The table 'test_memory6' is full

My 50Mb memory table can’t  contain the 48Mb of the InnoDB table !
Let’s try with 80Mb :

mysql> set tmp_table_size=80*1024*1024;
Query OK, 0 rows affected (0.00 sec)
mysql> set max_heap_table_size=80*1024*1024;
Query OK, 0 rows affected (0.00 sec)
mysql> insert into test_memory6 select * from test_memory5;
ERROR 1114 (HY000): The table 'test_memory6' is full

And this error occurs until the memory tables can have a 110Mb maximum size !

Why ? Because the two varchar fields of the InnoDB table are converted in char fields with the memory storage engine.
Let’s see this example from MySQL documentation :

Value CHAR(4) Storage Required VARCHAR(4) Storage Required
'' '    ' 4 bytes '' 1 byte
'ab' 'ab  ' 4 bytes 'ab' 3 bytes
'abcd' 'abcd' 4 bytes 'abcd' 5 bytes
'abcdefgh' 'abcd' 4 bytes 'abcd' 5 bytes

By the way, this is a very good reason to take care of your varchar fields.

Conclusion : A memory table can be really bigger than an InnoDB table

2. Let  me explain why a new memory engine may change everything :

Changing the rules for memory tables may change everything for, at least, two reasons :

  • VARCHARVARBINARYTEXT and BLOB fields will be supported by this new engine for user created memory tables (Percona server can do it with restrictions)
  • Internal memory tables could be more efficient with Dynamic Row Format
The real benefits will come with internal memory tables, how often do you see that when you “explain” your queries :
Extra: Using where; Using temporary; Using filesort

Thus, for each query using a temporary table, MySQL could use less of memory (RAM) !
I don’t know what would be the real benefit but I am convinced that it can be really significant.

I look forward to see more benchmark about that with last percona server release and I hope that Percona server or MariaDB will support dynamic row format for internal memory tables soon.

Please, let us know if you have already tested this new Percona Memory engine.


PlanetMySQL Voting: Vote UP / Vote DOWN

Finding tables without primary keys

Сентябрь 5th, 2011
I was checking a third party server, and I needed to find if there were tables without primary keys. This is important to know, not only because the lack of primary keys affects performance and data accuracy in general, but also because in row-based replication performance can degrade beyond belief when updating tables without primary keys.Anyway, I did not remember off the bat any method to get this information from a server with thousands of tables, and thus I went to find a solution on my own.My first instinct called for using the COLUMNS table from the INFORMATIOn_SCHEMA, and so I came up with this query, where I sum the number of columns that are inside either a PRIMARY or UNIQUE key and filter only the ones where such sum is zero (i.e. no primary or unique keys):

select
table_schema,table_name
from
information_schema.columns
group by
table_schema,table_name
having
sum(if(column_key in ('PRI','UNI'), 1,0)) = 0;
This query got the job done, and it was quite quick as well.Then, since I was chatting with Sheeri Cabral about other matters, I asked her if she could come up with an alternative solution. She suggested a LEFT JOIN between the information_schema.tables and information_schema.statistics, which I translated into this query:

select
t.table_schema, t.table_name
from
information_schema.tables t
left join information_schema. statistics s
on t.table_schema=s.table_schema and t.table_name=s.table_name
and s.non_unique=0
where
s.table_name is null;
This query works on the principle that it removes from the tables list all the ones for which there is no corresponding table in the statistics table.This query also works. Using both queries in a relatively empty server did not show any significant difference. But since I knew that I had to use this method on a very busy server, with a lot of tables, I quickly created 1,000 databases, each containing 5 tables, two of which did not have any primary or unique key.Now came the first surprise.The query with GROUP BY took about 0.5 seconds, while the one using LEFT JOIN used 11 seconds.I was about to congratulate myself for my acumen, when I realized that, in addition to schema and table names, I also needed the table engine.For the second query, that is not a problem. Adding the engine to the columns list works OK, and the query runs in 11 seconds like before.The first query, though, can't list the engine. There is no 'engine' in the COLUMNS table. So I needed a JOIN. Thus my query became

select
t.table_schema,t.table_name,engine
from
information_schema.tables t
inner join information_schema .columns c
on t.table_schema=c.table_schema and t.table_name=c.table_name
group by
t.table_schema,t.table_name
having
sum(if(column_key in ('PRI','UNI'), 1,0)) =0;
Guess what? This query ran in 17 seconds (!). So much for my instinct!Joins without keys are not efficient in MySQL, and tables in the information schema are no exception.If anyone has a more efficient method of getting a list of tables without primary key (the list including schema name, table name, and engine), I am curious to know.

PlanetMySQL Voting: Vote UP / Vote DOWN

Scribd is Hiring (I’m Looking for an Operations Engineer to Join My Team)

Август 17th, 2010

Scribd is a top 100 site on the web and one of the largest sites built using Ruby on Rails. As one of the first rails sites to reach scale, we’ve built a lot of infrastructure and solved a lot of challenges to get Scribd to where it is today. We actively try to push the envelope and have contributed substantial work back to the open source community.

Scribd has an agile, startup culture and an unusually close working relationship between engineering and ops. You’ll regularly find cross-over work at Scribd, with ops people writing application-layer code and engineers figuring out operations-level problems. We think we’re able to make that work because of the uniquely talented people we have on the team.

To allow us to keep scaling, we’re now looking to add a strong, experienced operations guru to the team. As a member of Scribd operations, you’ll have tremendous ownership and responsibility for one of the web’s most popular applications. Because Scribd is a startup, you will wear many hats and have broader responsibility than you would at a larger company.

If you read this blog, you should already have a good sense of the kind of work you’ll be doing on this position.

The Ideal Profile

You are an experienced operations professional and have run ops at at least one large-scale website. You have comprehensive knowledge of a broad variety of system tools, from MySQL and Nginx to Squid and Memcached. You should also have strong software development skills and be well-versed in major programming languages. You should be strongly motivated, a creative solution finder, and ready to jump into the thorniest technical problems whenever necessary.

Responsibilities

  • Develop and maintain all aspects of Scribd’s operations infrastructure, including system monitoring, backups, server configuration, databases, and caching systems
  • Collaborate with engineering to create next generation infrastructure to support changing requirements
  • Predict scaling problems before they occur and work with engineering to prevent them
  • Write and debug application level ruby code
  • Participate in an on-call rotation
  • Quickly diagnose server problems and employ preventive measures to maintain high availability servers

Qualifications

  • Bachelors degree in CS or equivalent experience
  • 3-5 years of professional experience in site operations
  • Strong software engineering skills, including knowledge of major programming languages
  • Strong database skills, preferably with MySQL, and overall linux knowledge
  • Experience with most of the following technologies: MySQL, Nginx, Ruby, Memcached, Squid, git, Solr, HBase, Postfix
  • Proven ability to quickly learn and implement unfamiliar technologies
  • Strong desire to work hard at a rapidly growing company

Location: You are preferably located near San Francisco, CA. Relocation assistance is designed on a per-case basis. In short, we’ll be creative to get you here.

Contact: Please send your email cover letter and resume with the subject “Your name – Senior Site Operations Engineer – via Kovyrin.net” to jobs@scribd.com or contact me directly using any of my contacts. All communication and correspondence is held in the strictest confidence to ensure that you can connect and learn more without exposure.



PlanetMySQL Voting: Vote UP / Vote DOWN

Enabling IPv6 Support in nginx

Январь 16th, 2010

This is going to be a really short post, but for someone it could save an hour of life.

So, you’ve nothing to do and you’ve decided to play around with IPv6 or maybe you’re happened to be an administrator of a web service that needs to support IPv6 connectivity and you need to make your nginx server work nicely with this protocol.

First thing you need to do is to enable IPv6 in nginx by recompiling it with --with-ipv6 configure option and reinstalling it. If you use some pre-built package, check if your nginx already has this key enabled by running nginx -V.

The results should have --with-ipv6 option in configure arguments:

1
2
3
4
5
[root@node ~]# nginx -V
nginx version: nginx/0.7.64
built by gcc 4.1.2 20080704 (Red Hat 4.1.2-46)
TLS SNI support disabled
configure arguments: --with-ipv6 ... --prefix=/opt/nginx

After you’ve got your nginx binary with IPv6 support, you need to enable it by changing listen directives in your configuration file.

If your server binds to all interfaces/IPs, you already have listen 80 or something like that in your file. Those lines should be changed to make sure you tell your nginx to bind on both IPv4 and IPv6 addresses:

1
listen [::]:80;

For situations when you do not want to listen on IPv4 interfaces, there is ipv6only=on parameter:

1
listen [::]:443 default ipv6only=on;

For configurations that need to bind to specific ip addresses you could use similar notation:

1
listen [2607:f0d0:1004:2::2]:80;

After changing your configs and testing them you need to restart (not reload) your nginx process and then check your system port bindings to make sure it works as expected:

1
2
3
[root@node ~]# netstat -nlp | grep nginx
tcp   0    0 :::80        :::*         LISTEN    23817/nginx
tcp   0    0 :::443       :::*         LISTEN    23817/nginx

This is it, now you can add AAAA records to your main domain name or just create a dedicated ipv6.yourcompany.com sub-domain and show it to your friends :-)



PlanetMySQL Voting: Vote UP / Vote DOWN