Archive for the ‘load’ Category

Improving MySQL Insert thoughput

Ноябрь 5th, 2010

There are three simple practices that can improve general INSERT throughput. Each requires consideration on how the data is collected and what is acceptable data loss in a disaster.

General inserting of rows can be performed as single INSERT’s for example.

INSERT INTO table (col1, col2, col3) VALUES (?, ?, ?);
INSERT INTO table (col1, col2, col3) VALUES (?, ?, ?);
INSERT INTO table (col1, col2, col3) VALUES (?, ?, ?);

While this works, there are two scalability limitations. First is the network overhead of the back and forth of each SQL statement, the second is the synchronous nature, that is your code can not continue until your INSERT is successfully completed.

The first improvement is to use MySQL’s multi values capability with INSERT. That is you can insert multiple rows with a single INSERT statement. For example:

INSERT INTO table (col1, col2, col3) VALUES (?, ?, ?),  (?, ?, ?), (?, ?, ?);

Depending on how you collect the information to be inserted, you may be able to easily implement this. The benefit, as previously mentioned in The value of multi insert values shows an 88% improvement in performance.

One system variable to be aware of is max_allowed_packet. You may have to adjust this for larger INSERT statements.

Next is the ability to INSERT data based on information already in another table. You can also leverage for example another storage engine like MEMORY to batch up data to be inserted via this approach.

INSERT INTO table (col1, col2, col3) SELECT col1, col2, col3 FROM anothertable

The third option is to batch load your data from a flat file. Depending on how you source the information you are inserting, that may also be a significant improvement in throughput in bulk inserting data.

LOAD DATA [CONCURRENT] INFILE 'file'
INTO TABLE (col1, col2, col3)

On a closing note, the choice of storage engine can also have a significant effect on INSERT throughput. MySQL also offers other non ANSI options including DELAYED, IGNORE and LOW_PRIORITY. These should definitely be avoided.


PlanetMySQL Voting: Vote UP / Vote DOWN

Linux Open Administration Days 2010

Апрель 20th, 2010

So about 4 monts ago there was the crazy idea to start a new FOSS event in Belgium targeted at sysadmins.

What started out as an event for local people to meet local people with some local speakers actually ended up being a small local event with some top international speakers on onfiguration mananagement and system administration mixed with a bunch of good local ones !

I had the honour to open the conference with an extremely short version of the Devops talk I gave earlier last year.. extremely short as I knew that over the course of the weekend the topic would reoccur a lot.

We had the first european talk on Chef, by Joshua Timberman, and we had Puppet talks amongst by Dan Bode from Puppetlabs and CFengine talks , devops was a frequently dropped word,

We had a book raffle where we handed out O'Reilly's .. we had a great free pizza party (got the idea from the saturday pizza event at LCA 2005) , and we had some free beer. Sounds like a good combination for a geeky weekend.

Apart from the regular talks there were plenty of Open Spaces where interesting topics were discussed ... we had spaces on Open Source vs Open Core , strong voices were heard when we discussed what we should do with the Open Core companies that claim to value Open Source , some people think we should actually list the fauxpensource ones somewhere and make sure the world knows about them

We had an awesome configuration management discussion session discussing Chef vs Puppet vs CFengine . And much much more ...

Some people owe me plenty of Sushi as I had to do my MySQL HA talk before their Managing MySQL talk , but other than that .. things just went fine..

Trackback URL for this post:

http://www.krisbuytaert.be/blog/trackback/998

PlanetMySQL Voting: Vote UP / Vote DOWN