Archive for the ‘monitoring’ Category

CRIB – CentRal Information Base for MySQL

Декабрь 10th, 2010

CRIB is a CentRal Information Base for MySQL, a long time coming project since I worked on it on and off for a few weeks and now I decided it’s time to hand it over to the global MySQL community.

So, what is CRIB?

CRIB is a central database which collects information about all your MySQL instances which you set as clients. It is monitoring in a way, but not the typical number of connections, memory, index usage, table scans, cpu usage and such, but rather consists of a repository where, if you have tens, hundreds or even thousands of clients, you can see where a particular user was created, where a certain database name features or which tables does a database consists of. It also features a script which logs table sizes periodically (customizable) so you can graph disk usage over time and be able to forecast future disk space requirements easily.

Download the latest code with: bzr branch lp:crib

First of all lets create the user used to write to the server (mother instance)


[dcassar@ubt]$ use5151
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 56
Server version: 5.1.51 MySQL Community Server (GPL)


Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved.
This software comes with ABSOLUTELY NO WARRANTY. This is free software,
and you are welcome to modify and redistribute it under the GPL v2 license


Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.


mysql > grant create, insert, alter, update, delete, select, drop, execute, create temporary tables, create routine, alter routine, lock tables on crib.* to "mother"@"%" identified by "teacher";
Query OK, 0 rows affected (0.02 sec)


mysql > exit
Bye
[dcassar@ubt]$

Lets go ahead and install CRIB on both server and client


[dcassar@ubt]$ cd INSTALL
[dcassar@ubt]$ ./crib_install.sh

brought to you by Darren Cassar

Anytime you need to cancel installation just press ( Ctrl + C )

Would you like to setup source and destination or just source?
1. Client and Server
2. New Client
Enter choice (default 2): 1
Installation starting

This section you'll supply the destination (CentRal Information Base) details

Enter CRIB user name (default root): mother
Enter CRIB password (default 'toor'):
Enter CRIB hostname (default localhost): 127.0.0.1
Enter CRIB port (default 3306): 5151

This section you'll supply the client details

Enter client user name (default root): dcassar
Enter client password (default 'toor'):
Enter client hostname (default localhost): dolphin
Enter client port (default 3306): 3331
Installation complete
[dcassar@ubt]$

Now is the time to gather the data from the clients and log it onto the server.


[dcassar@ubt]$ ./populate.sh CONFIG/crib_3331.cnf
[dcassar@ubt]$ ./tablesize.sh CONFIG/crib_3331.cnf

Lets log into the CRIB database on the mother server and see what details we have at our disposal.


[dcassar@ubt]$ use5151
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 56
Server version: 5.1.51 MySQL Community Server (GPL)


Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved.
This software comes with ABSOLUTELY NO WARRANTY. This is free software,
and you are welcome to modify and redistribute it under the GPL v2 license


Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql > use crib
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql > show tables;
+------------------+
| Tables_in_crib   |
+------------------+
| info_instances   |
| info_object_size |
| info_objects     |
| info_privileges  |
| info_users       |
| version          |
+------------------+
6 rows in set (0.00 sec)

mysql > select * from version;
+----+---------+---------------------+
| ID | VERSION | UPDATED_TIMESTAMP   |
+----+---------+---------------------+
|  1 | 0.1     | 2010-12-09 16:24:05 |
+----+---------+---------------------+
1 row in set (0.00 sec)

mysql > select * from info_instances;
+----+-------------------------------------+------+-----------+---------------+-----------------+------------+--------------+------+-------+
| id | hostname                            | port | server_id | mysql_version | os              | os_version | architecture | bits | notes |
+----+-------------------------------------+------+-----------+---------------+-----------------+------------+--------------+------+-------+
|  1 | dolphin                             | 3331 |  14123561 | 5.1.50        | sun-solaris2.10 | 10         | sparc        |   64 | NULL  |
+----+-------------------------------------+------+-----------+---------------+-----------------+------------+--------------+------+-------+
1 row in set (0.00 sec)

From the above you realize you can identify all versions used in your setup, what hardware you are using, port numbers used and much more!

Lets see what users we have so far.

mysql > select * from info_users;
+----+-------------+------------+---------------------------+-------------------------------------------+
| id | instance_id | username   | hostname                  | password                                  |
+----+-------------+------------+---------------------------+-------------------------------------------+
|  1 |           1 | root       | localhost                 | *BA44AFCA02B64C198FAEB3043F47CA3797638975 |
|  2 |           1 | root       | %.dev.domain.com          | *BA44AFCA02B64C198FAEB3043F47CA3797638975 |
|  3 |           1 | root       | 127.0.0.1                 | *BA44AFCA02B64C198FAEB3043F47CA3797638975 |
|  4 |           1 | abc        | %                         | *A4456E2A03EC0EE4E164BFA533443E82691DA163 |
|  5 |           1 | def        | localhost                 | *5162BA4456B42A01FD8DDBB4CDBE46AFAD06C5AB |
|  6 |           1 | ghi        | %                         | *95E6C48AFC85167C37A24130DD4F5FE0F48AB658 |
|  7 |           1 | jkl        | %                         | *7A2D095E620E354F216EE00635E163406AD47392 |
|  8 |           1 | mno        | %                         |                                           |
|  9 |           1 | qpr        | localhost                 | *5162BA4456B42A01FD8DDBB4CDBE46AFAD06C5AB |
+----+-------------+------------+---------------------------+-------------------------------------------+
9 rows in set (0.00 sec)

If we had multiple machines set up as clients then we would have another group with instance_id 2, so on and so forth.

Remember this is an alpha tool and whatever you do with it is your responsibility. You are advised NOT TO set this up on production for now.


PlanetMySQL Voting: Vote UP / Vote DOWN

Monitoring MySQL SQL statements the way it SHOULD be done!

Ноябрь 23rd, 2010
You may have read a previous post of mine, back in April this year, where I wrote about using the MySQL 5.5 Audit interface to SQL Statement monitoring. There was a bunch of comments and some ideas, but not much happened. Until now that is.

Hereby I release the first version of SQLStats, which is a much enhanced version of what I described in the previous post. This is a MySQL Plugin for MySQL 5.5 that allows you to monitor the most recently executed, and the most frequently executed statements using 2 INFORMATION_SCHEMA tables. The thing is not very complex, to be honest, but it does do the job. So what was the job then? Well, looking at what this plugin does, it goes something like this:
  • Allows you to monitor ALL SQL statements executed by the server.
  • The SQL statements are "normalized", meaning that literals / constants are removed before comparison.
  • Data is saved in memory. No disk access and very little overhead.
  • Data is retrieved from INFORMATION_SCHEMA tables, just a simple SELECT and you know what is going on in the server.
I have done some basic testing of the overhead of this plugin, and it wasn't even noticable. If you have used the general query log for this stuff before, know what overhead I'm talking about. With SQLStats, the overhead is close to 0.

There is no need for MySQL Proxy or anything like that. There is no need to change something in the Client or in the Connector. To be honest, there are a couple of things I want to add to the plugin eventually, but this is a starting point at least. To use it: download it, install the tomcat / mysql monitoring server, install the monitoring agent and ... No wait, that was MySQL Enterprise Monitor, this is how you do it: download, build, install and use it. That's it.

So where can I get it, you ask (or maybe you don't, but I'm gonna tell you anyway). It's on sourceforge, and you can download it from here:
http://sourceforge.net/projects/sqlstats/

There are two files to download: A simple PDF documents with some basic usage and configuration tips, and a source-code package (which also includes the same PDF).

What would I want from you? Ideas for future development, bug reports and a few beers, that's it, not too much to ask, eh?

/Karlsson
Also, did I mention that the overhead is VERY low...

PlanetMySQL Voting: Vote UP / Vote DOWN

mycheckpoint (rev 208): aggregation tables, enhanced charting, RPM distribution

Ноябрь 8th, 2010

Revision 208 of mycheckpoint, a MySQL monitoring solution, has been released. New and updated in this revision:

  • Aggregation tables: aggregated data makes for fast reports on previously slow queries.
  • Enhanced charting: interactive charts now present time stamps dynamically (see demo); “Zoom in” charts are available (see demo) on mycheckpoint‘s HTTP server.
  • RPM distribution: a “noarch” RPM mycheckpoint build is now available.
  • Initial work on formalizing test environment

mycheckpoint celebrates one year of existence!

Aggregation tables

I really wanted to avoid using these: everything was so more beautiful with one single dataset and dozens of supporting views (OK, the views themselves are hardly “beautiful”).

However it was impossible (for my level of expertise) to optimize query performance what with all those views on per-hour and per-day aggregation. The GROUP BYs and the JOINs did not make it possible for condition pushdown (i.e. using MERGE algorithm) where desired.

As result, mycheckpoint now manages aggregation tables: per-hour and per-day. The impact on sample taking is neglect able (making for two additional fast queries), but the impact on reading aggregated data is overwhelming. Generating a HTML full report could take a few minutes to complete. It now returns in no time. This makes charting more attractive, and allows for enhanced charting, such as zooming in on charts, as described following.

Aggregation tables will automatically be created and retroactively populated upon using revision 208. There’s nothing special to do; be advised that for one single execution of mycheckpoint, many INSERT queries are going to be executed. Shouldn’t take more than a couple minutes on commodity hardware and a few months of history.

It is possible to disable aggregation tables, or make for a complete rebuild of tables; by default, though, aggregation is ON.

Enhanced charting

Two enhancements here:

  1. The interactive line charts already know how to update legend data as mouse hovers over them. Now they also present accurate date & time. This provides with fully informative charts.
  2. As with other monitoring tools, it is possible to “zoom in” on a chart: zooming in will present any chart in “last 24 hours”, “last 10 days” and “complete history” views, magnified on screen. See demo here.

RPM distribution

No excuse for this being so late, I know. But RPM distribution is now available. Yeepee!

This is a noarch distribution, courtesy of Python’s distutils; you should be able to install the package on any RPM supporting platform. I have only tested in on CentOS; feedback is welcome.

Future plans

Work is going on. These are the non-scheduled future tasks I see:

  • Monitoring InnoDB Plugin & XtraDB status.
  • A proper man page.
  • Anything else that interests me & the users.

Try it out

Try out mycheckpoint. It’s a different kind of monitoring solution. Simple monitoring (charting) is immediate. For more interesting results you will need basic SQL skills, and in return you’ll get a lot of power under your hands.

mycheckpoint is released under the New BSD License.

Umm, I’ll repeat this last one: mycheckpoint is released under the New BSD License. Still, and will continue to be. Thanks for the good advice by Lenz, Domas and others.


PlanetMySQL Voting: Vote UP / Vote DOWN

Not excited about paying for MySQL monitoring for your enterprise?

Ноябрь 8th, 2010
I think most people will agree that one of the biggest advantages of MySQL Community Server is that it’s free. Being free doesn’t get you a multi-million user community though; MySQL offers a great array of transactional engines, advanced high-availability features, robust I/O performance, and it powers many of the top-500 internet sites. When it [...]
PlanetMySQL Voting: Vote UP / Vote DOWN

How To Send One Billion Email Marketing Messages Per Month

Октябрь 7th, 2010
One... *billion* emails!

One... *billion* emails!

One *Billion* Emails

In email marketing there are senders of all shapes and sizes, from small businesses using self-serve ESPs to the largest web properties self-sending to massive user bases. While only a few senders will reach or exceed volumes of one billion messages per month, the tools and practices needed to achieve such a volume level are applicable to all senders who want to succeed in email marketing.

Who Am I?

My name is Mike Hillyer (click here for bio and social links). I manage a team of Sales Engineers for Message Systems, a leading provider of digital messaging solutions for both senders and receivers. In my work over the last several years I have helped a number of clients reach the billion messages per month level and even more clients successfully deploy email marketing solutions ranging in scale from hundreds of thousands to millions of messages per month.

Who Needs To Send This Much Mail?

Contrary to what the image to the image above implies, there's nothing inherently evil about sending a billion messages a month. Some of the businesses that move a billion messages a month include ESPs, social networks (some move more than a billion a day for that matter), social gaming sites and large online retailers.

Any time you have a fairly large number of users (5-20 million) who receive multiple messages per day, or a really large number of users (40-50 million) receiving one message per day you are heading into the billion messages per month territory.

What Are The Numbers?

So exactly how much mail are we talking about here? That will depend on sending patterns:

In a lot of high-volume environments the sending will be to a world-wide audience, resulting in round-the-clock sending with no significant bursts of traffic. In such an environment the hourly volume will be 1,000,000,000 messages divided by 30 days divided by 24 hours equaling 1,388,889 messages per hour (386 messages per second), assuming 30 days in a month.

In an environment with inconsistent hourly volumes, we have to allow for both an average hourly volume and a maximum hourly volume and then design our solution to address the maximum hourly volume.

We need to look at seasonal factors: Does your social network move a lot of extra messages around Mother's Day? Does your dating site move a lot of extra messages around Valentine's day? Does your web shopping portal do a lot of extra business around Christmas?

We need to look at growth: If you are sending a billion messages a month it is very likely due to successful growth of your user base, something which you certainly have no intention of slowing. Look at how you have grown your email volume so far and extrapolate it out for the next year or two (especially if you only get budget for your infrastructure every two years).

Let's assume for the sake of this article that you have an average volume of one million messages per hour with a peak volume of two and a half million messages per hour during your busiest season. You have expect to double your user base each year for the next two years. At the end of two years you expect to be sending ten million messages per hour, or 7.2 billion messages per month (I've seen just this kind of growth several times with customers and prospects).

You Will Need to Send In-House

If you are aiming for a billion email marketing messages a month and are using an ESP it's time to plan your move to in-house sending. Assuming a $1.00 CPM (Cost Per Mille with Mille being Latin for thousand, so cost per thousand) you are looking at paying an ESP a million dollars a month to handle this kind of volume. Naturally you can probably secure a better rate than $1.00 CPM at these volume levels but regardless of the discount at this volume level you will pay less to buy the infrastructure and hire the people needed to do this yourself, gaining the control you need when sending at these volume levels.

Start With a Good Reputation

In order to hit the volume levels we're talking about it is going to be vital that you have a solid sending reputation. This means you need to follow best practices for list acquisition, list hygiene, segmentation and relevancy. There's a wealth of information online and an excellent catalog of it at Email Marketing Reports. This article will focus primarily on the technical aspects of sending one billion email messages per month but keep in mind that reaching one billion messages a month without a solid reputation on your domain and sending IPs is very difficult. A number of tools for checking the reputation of your IP addresses can be found at Word to the Wise. At this volume level email is key to your business and a solid reputation is going to be essential.

From a technical perspective there's a number of bases we need to cover regarding authentication, whitelisting, bounce processing and complaint handling.

Authentication

As a reputable sender you will want to associate your IP addresses with your domain using the authentication standards available to you. These include SPF, SenderID, DomainKeys (DK) and DomainKeys Identified Mail (DKIM). There are indications that SPF (and SenderID by association) is ineffective but given the low effort required to implement it I would recommend doing so anyway. While SPF and SenderID are purely DNS-based, DK and DKIM require an implementation either during message creation or during relay by the MTA and as a result will impact the maximum throughput of your infrastructure (more about this later).

DomainKeys is quickly being superseded by DomainKeys Identified Mail but with most solutions supporting both DK and DKIM it is simple enough to use both when sending to an ISP that supports one standard or the other. Implementation details will vary based on your sending solution. While some recommend selectively signing DK and DKIM for only messages sent to ISPs that are known to check authentication (in order to lower the impact signing has on throughput on a solution that takes a significant performance hit from signing), I recommend signing all messages; you never know who is checking for authentication without announcing it.

Whitelisting

One benefit of getting on the various whitelists provided by ISPs and reputation providers is that in some cases you can send higher volumes on whitelisted IP addresses than would otherwise be possible. Keep in mind that in most situations whitelisting is something that comes after sending has already begun in order to allow the provider of the whitelist to examine your sending patterns as part of the whitelisting process, so put your best foot forward (and follow it up with consistent behavior).

Bounce Processing

One quick way to lose reputation is to repeatedly send mail to recipients that do not exist. The ISPs will track how many non-existing addresses you send to and throttle you accordingly. Even more seriously, ISPs will occasionally take inactive email addresses and re-activate them as spam traps; any mail sent to the address will immediately get the source classified as a spam source and filtered accordingly.

To prevent this it is necessary to capture and act on the responses sent by the ISPs and unsubscribe those addresses identified as non-existent or inactive, while retaining those with responses that identify users on vacation and other not-fatal errors. Commercial sending solutions will perform this automatically with varying levels of effectiveness while other platforms will require a third-party solution such as Boogie Tools. Keep in mind that the more you send, the more you receive back in the form of automated responses and bounce notifications. As your reply addresses reach more and more users you the flow of notifications will become contaminated with spam and virus-carrying messages, requiring the implementation of Anti-Virus/Anti-Spam solutions for your incoming mail stream.

Complaint Handling

In an effort to help senders improve their practices, a number of ISPs have implemented ARF formatted Feedback Loop programs. When a user on a supported ISP clicks the "This is Spam" button, an automated message is sent to an address you define in advance (when signing up with the ISP for the Feedback Loop program). By processing these messages and un-subscribing the relevant users, you prevent further reputation damage that may result when sending them future messages.

The ARF format used by the ISPs makes it relatively straightforward to process Feedback Loop messages and use them to unsubscribe the users who have complained about your messages. There are tools available to process ARF formatted messages and some sending solutions will handle FBL messages natively.

Infrastructure Considerations

There are a number of architectural components that come into play to make it possible to send email marketing messages at volume levels of one billion email messages per month (or more) including network connectivity, server hardware and software.

Connectivity

Most professional sending operations are based in rented datacenters, simplifying the provisioning of network connectivity. In our initial example of a maximum throughput of 2.5 million messages per hour we'll use a sample message size of 50 kilobytes (51,200 bytes), meaning that we need to send at a rate of  2,500,000 * 51,200 = 128,000,000,000 bytes per hour or 271.2 megabits per second.

With the throughput we're talking about we certainly need to use gigabit speed networking within the datacenter and, more importantly, need backbone connectivity that can support not only a sustained throughput of 271 megabits per second but than can handle our future needs of 7.2 billion messages per month. You need to look at a datacenter that will be able to provide sustained gigabit speeds to the backbone.

Keep in mind that when you are sending a billion messages per month it means that email has significant impact on your bottom line and you won't be able to tolerate extended outages. You need to not only make sure that the datacenter you choose has redundant power and backbone connections, you also need to consider using redundant datacenters.

Server Hardware

Moving over a million messages per hour does not require the purchase of custom server hardware but it does require making a proper investment in hardwar. Generally speaking you will be using an infrastructure similar to the following:

An example of a basic sending infrastructure

The Message Injector queries the database and uses the results to assemble one or more messages which it relays to the Outbound Mail Server. The Outbound Mail Server queues the message, performs any necessary manipulations on the message and then sends it via the Internet to the destination server. In the event of a delivery failure message or a feedback loop message, the incoming message arrives via the Internet to the Inbound Mail Server. The Inbound Mail Server performs anti-virus/anti-spam scanning and then, in the case of a legitimate message, processes the message and updates the subscriber information in the database (not all server solutions can perform this processing in-stream, when using such solutions an intermediate server will be needed to accept the clean message from the Inbound Mail Server and process it using custom code).

In a production deployment there can be several variations on this example, typically with multiple servers used on the outbound and inbound roles, with multiple message injectors pushing to the outbound machines and often specialty servers on the inbound side dedicated to processing incoming feedback loop and bounce messages.

I generally recommend mail server similar to the following:

  • 2x multi-core, 64-bit processors
  • 16-32 GB of RAM
  • 8x 15K RPM hard-disks
  • Battery-backed RAID-10 controller

The specific details of your hardware selection will depend on the ability of your specific software to leverage the resources provided. A large number of fast disks in a RAID-10 array is recommended for the message spool as standards-compliant mail servers must write messages to disk before accepting them for delivery, placing significant demands on storage resources.

Software

As an employee of a leading software provider for high-volume senders you would rightly expect me to recommend a commercial solution, and specifically my company's solution. I'd like to take a moment to point out why:

Performance

We need to send at a rate in excess of one million messages per hour. I've dealt with a number of solutions and my experience has shown that most Open Source MTAs such as Postfix and Sendmail are limited to around 100,000 messages per hour. Commercial sending solutions typically show real-world performance ranging from 500,000 messages per hour to over two million messages per hour.

I have helped several companies that were operating dozens of Open Source servers to consolidate down to one-tenth as many servers running a commercial solution.

Segregation

In addition to limited throughput, Open Source MTAs are usually limited to sending through a single IP address, meaning that to send through ten IP addresses you need ten separate server instances. Commercial solutions support sending through multiple IP addresses simultaneously.

Advanced Functionality

Commercial solutions go beyond basic message queuing and sending, providing the additional functionality required for a high-volume sender. This includes features such as APIs, bounce classification, feedback loop processing, internal scripting, automated throttling, and database integration.

Availability

If you're sending a billion emails a month, you absolutely need a solution that provides high availability out of the box. If a server goes down you can't afford to be frantically activating a warm spare, just to find out that it too has some issue. You need an active-active solution that reacts automatically to server failures and keeps the mail flowing.

Manageability

You need a solution that can be easily managed on your terms, whether you prefer editing configuration files or using a web interface. In addition, you need something that grows with you, providing centralized management of an entire cluster of servers. Commercial solutions will provide easier, centralized management.

Reporting

One key to successful sending at high volumes is keeping tabs on how your server is performing and how your mailings are doing. You need to know what is passing through your server, how quickly messages are moving, whether queues are backed up, how the various ISPs are treating your traffic, all with the ability to drill down on specific source IPs and specific destination ISPs. You need to be able to see all of this in real-time and across your entire infrastructure. A good commercial solution provides all of this out of the box.

Time is Money

On multiple occasions I've seen organizations choose a free or low-cost solution and then spend countless hours building workarounds to the weaknesses of their chosen platform, writing scripts to automate administration, reporting tools to fill their needs, failover scripts to provide redundancy, etc.

While a lot of this work was impressive, it required time to implement and time to maintain. Time spent creating tools that are already provided in an alternative solution is time (and money) wasted. You are always better off using your time to create your "special sauce"; that which makes your business unique and gives you a competitive advantage.

The Price of Success is Continued Vigilance

When sending at a rate of one billion messages a month (or more), you can't just use a 'fire and forget' mentality. You are going to have to have people around to keep a constant eye on what is happening in your environment, monitoring multiple key factors to ensure you can continue to successfully send.

Reputation Monitoring

Remember earlier when I said you need to start with a good reputation? You also need to keep a good reputation, and the only way to do that is to know what your reputation is. You will need to take advantage of reputation monitoring tools provided by companies like Return Path and Pivotal Veracity as well as keep a close eye on the reporting produced by your sending solution (remember when I said you need good reporting?)

You need to watch things such as bounce rates, FBL hit rates, blacklist hits, transient failures and response rates.

Infrastructure Monitoring

If email is your company's lifeblood (and if you're sending a billion messages a month it certainly is) then you need to make sure to keep it flowing no matter what happens, and that means making sure your email infrastructure stays online. I spoke earlier of  the need for high-availability, active monitoring goes hand-in-hand with this need. You will need to monitor the health of the servers that support your infrastructure, the network components that carry your messages and the software that creates and relays your messages.

There are a number of monitoring solutions available to accomodate any platform and budget, be sure to implement one that meets your needs and get monitoring. Make sure to test simulated failures to confirm that monitoring is working successfully. Consider setting up a simulated mailing that runs on a regular basis using your full infrastructure stack: a monitoring script can check an inbox and if the test message fails to appear, something is potentially wrong in your sending infrastructure. This approach can help identify issues that may pass by other monitoring systems unnoticed (and can be integrated into some monitoring solutions directly).

Response Monitoring

Keep in mind that you get what you monitor for; if you focus too much on one metric it may improve without helping the big picture. In addition to making sure all the underlying pieces are in place, don't forget to keep an eye on things where the rubber meets the road. You may be sending at phenomenal rates with great metrics but failing to generate customer actions that lead to revenue.

Conclusion

While by no means an exhaustive list, I hope this gives you some idea as to the scope of sending in a high-volume environment of one billion email messages per month. Watch this space over the coming weeks for deeper dives into some of the subjects covered here.

Questions? Did I miss something? Let me know in the comments!

Disclaimer

The opinions and information in this post are my own and do not necessarily reflect those of my employer.


PlanetMySQL Voting: Vote UP / Vote DOWN

mycheckpoint (rev. 190): HTTP server; interactive charts

Сентябрь 7th, 2010

Revision 190 of mycheckpoint, a MySQL monitoring solution, has been released. New and updated in this revision:

  • HTTP server: mycheckpoint can now act as a web server. Point your browser and start browsing through HTML reports. See mock up demo.
  • Interactive charts: HTML line charts are now interactive, presenting with accurate data as you move over them. See sample.
  • Enhanced auto-deploy: now auto-recognizing failed upgrades.
  • Reduced footprint: much code taken out of the views, leading to faster loading times.
  • Better configuration file use: now supporting all command line options in config file.
  • Remote host monitoring accessibility: now supporting complete configurable accessibility details.
  • Bug fixes: thanks to the bug reporters!

mycheckpoint is free, simple, easy to use (now easier with HTTP server) and useful. I encourage you to try it out: even compared with other existing and emerging monitoring tools, I believe you will find it a breeze; it’s low impact and lightness appealing; it’s alerts mechanism assuring; its geeky SQL-based nature with ability to drill down to fine details — geeky-kind-of-attractive.

</encouragement>

HTTP server

You can now run mycheckpoint in http mode:

bash$ mycheckpoint http

mycheckpoint will listen on port 12306, and will present you with easy browsing through the reports of your mycheckpoint databases.

The http server automatically detects those schemata used by mycheckpoint, and utilizes the existing HTML views, integrating them into the greater web framework.

While in http mode, mycheckpoint does nothing besides serving web pages. It does not actively exercise monitoring: you must still use the usual cron jobs or other scheduled tasks by which you invoke mycheckpoint for monitoring.

The http server is directed at a single MySQL server, as with the following example:

bash$ mycheckpoint --host=slave1.localdomain --port=3306 --http-port=12306 http

It is assumed that this server has the monitoring schemata.

See mock up demo. The demo uses presents with real output from a mycheckpoint HTTP server; I haven’t got the means to put up a live demo.

Interactive charts

The openark line charts, used in the HTML reports, are now interactive. As you scroll over, the legend presents you with series values.

No more “I have this huge spike once every 4 hours, which reduces all other values to something that looks like zero but is actually NOT”. Hover, and see the real values.

See sample.

Enhanced auto-deploy

The idea with mycheckpoint is that it should know how to self upgrade the schema on version upgrade (much like automatic WordPress upgrades). mycheckpoint does bookkeeping of installed versions within the database, and upgrades by simple comparison.

It now, following a couple of reported bugs, also recognizes failure of partial, failed upgrades. This adds to the automation of mycheckpoint‘s installation.

Reduced footprint

Some of mycheckpoint‘s views are complicated, and lead to a large amount of code in view declaration. This leads to increased table definition size (large .frm files). There has been some work to reduce this size where possible. Work is still ongoing, but some 30% has been taken off already. This leads to faster table (view) load time.

Better configuration file use

Any argument supported on the command line is now also supported in the config style. Much like is handled with MySQL. For example, one can issue:

mycheckpoint --monitored-host=sql02.mydb.com  --monitored-user=monitor --monitored-password=123456

But now also:

mycheckpoint

With the following in /etc/mycheckpoint.cnf:

[mycheckpoint]
monitored_host     = sql02.mydb.com
monitored_user     = monitor
monitored_password = 123456

Rules are:

  • If an option is specified on command line, it takes precedence over anything else.
  • Otherwise, if it’s specified in the configuration file, value is read from file.
  • Otherwise use default value is used.
  • On command line, option format is xxx-yyy-zzz: words split with dash/minus character.
  • On configuration file, option format is xxx_yyy_zzz: words split with underscore. Unlike MySQL configuration format, dashes cannot be used.
  • If an option is specified multiple times on configuration file — well — I have the answer, but I won’t tell. Just don’t do it. It’s bad for your health.

Future plans

Work is going on. These are the non-scheduled future tasks I see:

  • Monitoring InnoDB Plugin & XtraDB status.
  • A proper man page.
  • Anything else that interests me & the users.

Try it out

Try out mycheckpoint. It’s a different kind of monitoring solution. Simple monitoring (charting) is immediate. For more interesting results you will need basic SQL skills, and in return you’ll get a lot of power under your hands.

mycheckpoint is released under the New BSD License.


PlanetMySQL Voting: Vote UP / Vote DOWN

Kontrollbase – new version available with improved analytics

Август 23rd, 2010
A new version of Kontrollbase – the enterprise monitoring, analytics, reporting, and historical analysis webapp for MySQL database administrators and advanced users of MySQL databases – is available for download. There are several upgrades to the reporting code with improved alert algorithms as well as a new script for auto-archiving of the statistics table based [...]
PlanetMySQL Voting: Vote UP / Vote DOWN

mycheckpoint (rev. 170): improved custom queries; local charting; page/swap I/O monitoring; improved HTML reports

Июль 16th, 2010

Revision 170 of mycheckpoint, a MySQL monitoring solution, has been released. New and updated in this revision:

  • Improved custom queries: lifting of limitations from previous, introductory revision; better HTML presentation
  • Local, inline charting: no rendering of Google Charts, unless explicitly requested. All charts are now rendered locally using JavaScript. No data is now sent over the network.
  • Page/Swap I/O monitoring: now monitoring for page ins and outs, swap ins and outs (Linux only).
  • Improved HTML reports: several improvements on presentation (see sample, more follow).

Improved custom queries

Some limitations, introduced in revision 132, are now lifted. New features are introduced.

  • There is now no limit to the number of custom queries (well, an INT limit).
  • In fact, the data tables adjust themselves to the existing custom queries in the form of auto-deploy: once a new custom query is added or an old one removed, mycheckpoint will add or remove the relevant columns from the data tables.
  • The chart_order column is now utilized: HTML reports which include custom query charts now order those charts according to chart_order values. This makes for nicer reports.
  • The standard HTML brief report (SELECT html FROM sv_report_html_brief) now automatically includes all custom charts. The HTML brief report is the report one usually wants to look at: it provides with the latest 24 hours metrics for selected values. It now becomes a centralized place for all that is interesting in the past 24 hours.
  • Custom queries are now allowed to return NULL, treated as a missing value. This is a bugfix from previous revisions.

Local charting

Motivation for local charting is clear: no one likes having their data being sent over the network. And no one likes Google to know about their DML values.

I’ve been playing around with quite a few charting solutions, and have gone into depths with two of them, adding and rewriting quite a lot of code. Eventually, I settled on my very own rendering. Here’s what I’ve seen & tested:

  • dygraphs: a very nice time series charting library. I’ve presented a use case on a previous post.
    • Pros: slick, easy to work with.
    • Cons: uses HTML Canvas for rendering. This is fine on Firefox, Chrome, Safari, you name it. This isn’t fine on IE, which does not support Canvas. There’s ExplorerCanvas, a hack tool which converts canvas to IE’s VML, but it is far from being satisfactory: it is sloooow. Very, very slow. It is slow with one chart; but loading of 21 charts, as I do in some of mycheckpoint‘s reports can take long minutes on Internet explorer.
    • Cons: Only provides with a time series chart. No scatter plots.
  • Because they’re using ExplorerCanvas for IE, flot, jqPlot etc., are all unacceptable.
  • gRaphael: very slick charts based on Raphael. The original line charts are very basic, and I have invested a lot of time rewriting a great deal (you can find it all here). Raphael uses VML on IE, and SVG for all other browsers.
    • Pros: very slick. Supports various chart types, including line (though not time-series) and scatter.
    • Cons: slooooooooow when instantiating multiple charts. Unbearably slow, both on Firefox and IE. Slow as in minutes of waiting.

In addition, all of the above solutions were quite heavyweight: at about 45KB to start with, then add ExplorerCanvas or jQuery, or Raphael as supporting libraries, these became a real burden.

So, I had some time to spare (business is fine, thank you. I was a bit Ill. I’m feeling well now, thank you), and was upset what with all the time I invested in the above coding. And I decided to invest even more time, and build my own charts.

Enter openark-charts.



Currently, these line charts and scatter charts know how to parse a Google Image chart URL (only some features supported — only those I’m actually using with mycheckpoint). These are not full blown solutions: they come to serve mycheckpoint. And they do so nicely, if I may say so. Using Canvas for most browsers, or VML for IE, these very small pieces of code (10K for line chart, 6K for scatter chart, minified) load fast, use very little memory, and do their work well.

Granted, neither provides with interactive features: this is planned for the future.

Page/swap I/O monitoring

(Linux only) mycheckpoint now reads /proc/vmstat to get the pageins, pageouts, swapins and swapouts (since last reboot). I was actually looking at completely different places on the /proc file system to get swap info, and was frustrated with the complexity involved, till I bumped on /proc/vmstat… New tricks every day!

Improved HTML reports

This is mostly HTML make-up. Some minimal design, some more details thrown into the HTML pages (name of DB, MySQL version, mycheckpoint version). A little more verbosity; all sorts of stuff which was neglected so far.

Here are some show off examples of the new HTML views: [full report], [brief report], [24/7 report], [custom full report], [custom brief report], [alert pending report].

All HTML views now utilize the new openark-charts, and none renders charts with Google charts. This means when you use your HTML view, your data is safe. No data is sent over the net. All charts are rendered using Javascript, which is loaded and executed locally.

But if you like, there’s a [url] link next to each chart, which leads to a (online) Google chart image. Why? Because neither HTML Canvas nor VML allow for a complete rendering of the charts to an image. So this is a way for one to retrieve & store a chart’s image. Don’t use it if you see no reason for it; it’s just there.

And I even threw in rounded corners (IE users: only as of Windows 7).

Future plans

Work is going on. These are the non-scheduled future tasks I see:

  • Monitoring InnoDB Plugin & XtraDB status.
  • Interactive charts. See my earlier post.
  • A proper man page.
  • Anything else that interests me.

Try it out

Try out mycheckpoint. It’s a different kind of monitoring solution. You will need basic SQL skills, and in return you’ll get a lot of power under your hands.

mycheckpoint is released under the New BSD License.


PlanetMySQL Voting: Vote UP / Vote DOWN

What do MySQL Consultants do?

Июль 8th, 2010

One role of a MySQL consultant is to review an existing production system. Sometimes you have sufficient time and access, and other times you don’t. If I am given a limited time here is a general list of things I look at.

  1. Review Server architecture, OS, Memory, Disks (including raid and partition type), Network etc
  2. Review server load and identify physical bottleneck
  3. Look at all running processes
  4. Look specifically at MySQL processes
  5. Review MySQL Error Log
  6. Determine MySQL version
  7. Look at MySQL configuration (e.g. /etc/my.cnf)
  8. Look at running MySQL Variables
  9. Look at running MySQL status (x n times)
  10. Look at running MySQL INNODB status (x n times) if used
  11. Get Database and Schema Sizes
  12. Get Database Schema
  13. Review Slow Query Log
  14. Capture query sample via SHOW FULL PROCESSLIST (locked and long running)
  15. Analyze Binary Log file
  16. Capture all running SQL

Here are some of the commands I would run.

2. Review server load and identify physical bottleneck

$ vmstat 5 720 > vmstat.`date +%y%m%d.%H%M%S`.txt

4. Look at MySQL processes

$ ps -eopid,fname,rss,vsz,user,command | grep -e "RSS" -e "mysql"
  PID COMMAND    RSS    VSZ USER     COMMAND
 5463 grep       764   5204 ronald   grep -e RSS -e mysql
13894 mysqld_s   596   3936 root     /bin/sh /usr/bin/mysqld_safe
13933 mysqld   4787812 5127208 mysql /usr/sbin/mysqld --basedir=/usr --datadir=/vol/mysql/mysqldata --user=mysql --pid-file=/var/run/mysqld/mysqld.pid --skip-external-locking --port=3306 --socket=/var/run/mysqld/mysqld.sock
13934 logger     608   3840 root     logger -p daemon.err -t mysqld_safe -i -t mysqld

$ ps -eopid,fname,rss,vsz,user,command | grep " mysqld " | grep -v grep | awk '{print $3,$4}'
4787820 5127208

5. Review MySQL Error Log

The error log can be found in various different places based on the operating system and configuration. It is important to find the right log, the SHOW GLOBAL VARIABLES LIKE ‘log_error’ will determine the location.

This is generally overlooked, however this can quickly identify some underlying problems with a MySQL environment.

7. Look at MySQL configuration

$ [ -f /etc/my.cnf ] &&  cat /etc/my.cnf
$ [ -f /etc/mysql/my.cnf ] &&  cat /etc/mysql/my.cnf
$ find / -name  "*my*cnf" 2>/dev/null

8. Look at running MySQL Variables

$ mysqladmin -uroot -p variables

9. Look at running MySQL status (x n times)

$ mysqladmin -uroot -p extended-status

It is important to run this several times at regular intervals, say 60 seconds, 60 minutes, or 24 hours.

I also have dedicated scripts that can perform this. Check out Log MySQL Stats.

11. Get Database and Schema Sizes

Check out my scripts on my MySQL DBA page

14. Capture Locked statements

Check out my script for Capturing MySQL sessions.

15. Analyze Binary Log file

Check out my post on using mk-query-digest.

16. Capture all SQL

Check out my post on DML Stats per table

Moving forward

Of course the commands I run exceeds this initial list, and gathering this information is only


PlanetMySQL Voting: Vote UP / Vote DOWN

Apache And MySQL Monitoring With Bijk On Debian Lenny

Июнь 25th, 2010

Apache And MySQL Monitoring With Bijk On Debian Lenny

This tutorial describes how you can monitor your server with the tool Bijk. Bijk creates online 30 graphs about load, CPU, memory, traffic, Apache, NginX, PostreSQL and others with alerts. Bijk can be used on Debian, Ubuntu, CentOS, RedHat and Gentoo. In this article I will explain how to install Bijk on Debian.


PlanetMySQL Voting: Vote UP / Vote DOWN