Archive for the ‘postgres’ Category

New England Database Summit

Ноябрь 15th, 2011

The New England Database Summit is an all day conference-style event where participants from the research community and industry in the New England area can come together to present ideas and discuss their research and experiences working with on data-related problems.  It is an academic conference with applications to real life, and includes any type of database.

The 5th annual NEDB will be held in Cambridge, MA MIT (in 32-123) on Friday, February 3, 2012.  Anyone who would like is welcome to present a poster (registration required), or submit a short paper for review.  We plan to accept 8--10 papers for presentation (15 minutes) at the meeting.   All posters will be accepted.

For more details, and to register and / or upload a paper, see:

http://db.csail.mit.edu/nedbday12/


PlanetMySQL Voting: Vote UP / Vote DOWN

On Clouds And Data

Август 26th, 2011
I'm sitting in SFO tonight, awaiting my return trip back to Hurricane Pending Maryland. (As a former Floridian, I must of course scoff at any notions that this hurricane is significant). Walking through the airport I noticed a large billboard about "Big Data and the Cloud". This is the kind of billboard you only see in Silicon Valley; I don't see signs like that in Portland or Ottawa, and certainly not when I had to change flights in Detroit this year.

Anyway, these two buzz words aren't a local phenomenon, and are actually taking the tech world by storm. Big Data has become serious enough that there are multiple conferences now for folks interested in the topic. And cloud, well, perhaps harder to define, but more and more businesses are moving to the cloud every day. The problem here is that, most of the traditional ideas on big data run entirely counter to the ideas that work well in the cloud.

Last spring I moderated a panel PGEast in New York that focused on Postgres in the cloud. As someone who works on multi-terabyte systems, and someone who deals with cloud servers on at least a semi-regular basis, I tried to prod and poke my panelists into sharing their take on how they see Postgres's role in the cloud. Not too surprisingly, the idea behind "Big Data" on Postgres in the cloud was not a particularly popular one. The tools you need to do the job effectively with Postgres just aren't there. Not to say you can't try, but so far I haven't seen many wild successes.

Next month at Surge though, I'm going to be involved in another panel focusing on "Pushing Big Data To The Cloud". This time though I'm turning over moderating duties to long-time thought leader in the MySQL community Baron Schwartz. Joining me on the panel are several folks who all have a stake in the idea of Big Data in the cloud; John Hugg and Philip Wickline from VoltDB and Hadapt, respectivly, two new database vendors built with scale-out in mind; Bryan Cantrill, VP of Engineering at Joyant, a cloud provider with thier own strong opinions on dealing with data in the clouds, and Kate Matsudaira, someone who is currently managing those multi-TB databases, all in the cloud, over at SEOMoz. This should be a really good mix of people using different technology, with different biases against the problems involved. If you're looking to work on Big Data in The Cloud, I hope you'll join us, it should be a lot of fun.
PlanetMySQL Voting: Vote UP / Vote DOWN

Paying Attention Pays Off

Август 15th, 2011
I often run my ops like I take care of data; a bit overzealously. Case in point, when setting up a new database, I like to throw on a metric for database size, which gets turned into both a graph for trending, but also an alert on database size. Everyone is always on board with trending database size in a graph, but the alert is one people tend to question. This is not entirely without justification.

On a new database, with no data or activity, deciding when to alert is pretty fuzzy. When we set up a new client within our managed hosting service, I usually just toss up an arbitrary number, like 2GB or something. The idea isn't that a 2GB database is a problem, it's that when we cross 2GB, we should probably take a look at the trending graph and do a projection. Depending on how things look, we'll bump up the threshold on the alert to a new level, based on when we think we might want to look at things again. For example, in this graph we take a month long sample, and then project it out for three months. We can then set a new threshold somewhere along that line.

projected db size

While this is good for capacity planning, there's more that can be gained from this process. The act of alerting forces us to pay attention. And if we get notices before our expectations, we go back in and re-evaluate the data patterns. Of course, some times people will question this. Getting a notice that your database has passed 4GB can seem pointless when you have 100+ GB of free space on your disks. And besides, isn't that what free space monitors are for?

Here is a graph of another of our clients database growth. Their data size is not particularly large (don't confuse scalability with size; it doesn't take a large database to have scalability issues), but what's important is that we kept getting notices that the size was growing, and when talking with the developers, no one thought it should be growing at nearly this rate. Eventually we were able to track down the problem to purging job that had gone awry. Once that was fixed, the growth pattern leveled off completely (and the database size returned to the tiny amount that was expected!)

Fix DB Size



PlanetMySQL Voting: Vote UP / Vote DOWN

Maybe they just like it better?

Август 8th, 2011
There has been a lot of chatter the past week about Apple replacing MySQL with Postgres in the new OSX Lion Server [U.S. | England | New Zealand ]. Most of it seems to tie things back to Oracle's new stewardship over the MySQL project, a lot of that stemming from what I would say is FUD from the EnterpriseDB folks, regarding doom and gloom about the way Oracle might handle the project in the future. Not that the FUD is entirely unwarrented; While Oracle has done a pretty decent job with MySQL so far, looking at what Oracle has done to projects like Open Solaris certainly would make one queasy. And yes, we've seen an uptick in people asking for help with Oracle/MySQL to Postgres migrations since the acquisition of Sun. That said, I have an alternative theory. Maybe they just like it better?


Continue reading "Maybe they just like it better?"
PlanetMySQL Voting: Vote UP / Vote DOWN

Developer Week in Review: Lion drops pre-installed MySQL

Август 3rd, 2011


A busy week at Casa Turner, as the infamous Home Renovations of Doom wrap up, I finish the final chapters of "Developing Enterprise iOS Applications" (buy a copy for all your friends, it's a real page turner!), pack for two weeks of vacation with the family in California (Palm Springs in August, 120 degrees, woohoo!), and celebrate both a birthday and an anniversary.



But never fear, WIR fans, I'll continue to supply the news, even as my MacBook melts in the sun and the buzzards start to circle overhead.

The law of unintended consequences

Lion ServerIf you decide to install Lion Server, you may notice something missing from the included software: MySQL. Previous releases of OS X server offered pre-installed MySQL command line and GUI tools, but they are AWOL from Lion. Instead, the geek-loved but less widely used Postgres database is installed.

It seems pretty obvious to the casual observer why Apple would make this move. With Oracle suing Google over Java, and Oracle's open source philosophy in doubt, I know I wouldn't want to stake my bottom line on an Oracle package bundled with my premiere operating system. Apple could have used one of the non-Oracle forks of MySQL, but it appears they decided to skirt the issue entirely by going with Postgres, which has a clear history of non-litigiousness.

Meanwhile, Oracle had better be asking themselves if they can afford to play the games they've been playing without alienating their market base.

South Korea fines Apple 3 million won, which works out to ...

Apple has bee been hit with a penalty from the South Korean government that's a result of the iPhone location-tracking story that broke earlier this year. Now, Apple may have more money than the U.S. Treasury sitting in petty cash right now, but it will be difficult for them to recover from such a significant hit to their bottom line: a whopping 3 million won, which works out to a staggering ... um ... $2,830. Never mind.

Strata Conference New York 2011, being held Sept. 22-23, covers the latest and best tools and technologies for data science -- from gathering, cleaning, analyzing, and storing data to communicating data intelligence effectively.

Save 20% on registration with the code STN11RAD

Java 7 and the risks of X.0 software

Java 7 was recently released to the world with great fanfare and todo. This week, we got a reminder why using an X.0 version of software is a risky endeavor. It turns out that the optimized compiler is really a pessimized compiler, and that programs compiled with it stand a chance of crashing. Even better, there's a chance they'll just go off and do the wrong thing.

Java 7 seems to be breaking new ground in non-deterministic programming, which will be very helpful for physics researchers working with the Heisenberg uncertainty principle. What could be more appropriate for simulating the random behavior of particles than a randomly behaving compiler?

Got news?

Please send tips and leads here.

Related:


PlanetMySQL Voting: Vote UP / Vote DOWN

How are index-only scans implemented in InnoDB?

Ноябрь 23rd, 2010
There have been interesting discussions in the PostgreSQL community about adding support for index only scans. On several occasions people were curious about how InnoDB supports this. A recent post by the InnoDB team is an excellent overview. A brief summary of that post and other material is:

  • records in the clustered (primary) index store hidden columns (DB_TRX_ID, DB_ROLL_PTR)
  • records in the non-clustered (secondary) index do not store hidden columns
  • records in clustered and non-clustered indexes have a delete-mark flag
  • records are not updated in the secondary index, they are delete-marked on delete, inserted on insert and delete-marked/inserted on update
  • delete-marked records are removed from indexes by the purge thread when it is safe to do so
When a secondary index page is read, if the max transaction ID on the page is less than the max transaction ID visible to the reading transaction, then the page can be used as is and the page read is index-only. If this condition is not true, then for any entry read from this page the record is read from the clustered index page to determine whether the index entry is visibile. In that case the secondary index read is not index only. Index only matters because when things are not index only there can be an additional random disk read to the clustered index for each entry read from the secondary index.

I began to read the code for this today as I want to add a counter for the number of secondary index page reads that are and are not index only. If you want to read the code too the function lock_sec_rec_cons_read_sees determines whether all entries on a secondary index page are definitely visible to a transaction (read view).

If you are interested in this topic, I recommend these books:


PlanetMySQL Voting: Vote UP / Vote DOWN

random poking

Ноябрь 8th, 2010

These are some of my notes from some sysbench in-memory r/o testing in past day or so:

  • At ‘fetch data by primary key’ benchmark with separate read snapshots at each statement, MySQL shines until ~200 concurrent threads, then performance starts dropping slightly faster than one would want, I think mostly from table cache LOCK_open contention
  • auto-commit cost (establishing read snapshot per statement) for SELECTs is ~10% for MySQL, but for PG it can be +50% in plain SQL mode and +130% (!!!!!!!) when using prepared statements (this can be seen in a graph – obviously the global lock PG has during this operation is held for too long and maybe is too costly to acquire.)
  • Some benchmarks went up by 10% when using jemalloc
  • MySQL could accept 10x more connections per second than PG (15000 vs 1500)
  • Most confusing behavior MySQL exhibited was at 100-record range scans in PK order:
    • At innodb_thread_concurrency=0 it did around 70k range reads, both fetching data and aggregation (SUM())
    • At innodb_thread_concurrency>0 it did only 10k range reads returning data but still was able to do 70k aggregations/s
    • PG was doing ~35k ops/s at that test

    It seems that at least for systems that do lots of range scans (or joins) I guess, managed concurrency kills performance entirely due to giving up tickets too often, need to review it more.


PlanetMySQL Voting: Vote UP / Vote DOWN

Postgres kernel regressions

Сентябрь 13th, 2010

While Mark was looking at MongoDB, I was playing comparing various aspects of MySQL and Postgres performance. Certain PG performance numbers I saw (40kqps vs 110kqps from MySQL) made me really upset, so I ended up discussing with people at #postgresql – and started comparing various versions/configurations/machines/etc.

Apparently 2.6.32 kernel, which is in Ubuntu 10.04 LTS (Lucid Lynx) and is also basis for future RHEL6 kernel has nearly 20% performance degradation for PG (though not for MySQL, phew) – and apparently it was news to their community (I have started an email thread, will see where it goes).

While I was doing my research, of course I could observe plenty of gems of wisdom:

<davidfetter> domas, well, as tom lane once said, there’s no limit to how quickly you can get an answer if it doesn’t have to be correct

And an hour later:

<rawtaz> are you one of those mysql lovers?

<dim> nobody wants performance at the price of correctness, right?
<dim> as Tom said, I can be as quick as you want as providing an answer if you don’t want it to be correct (hint: 42.)

Apparently chasing these performance variations is…

<Snow-Man> tbh, I’m of the opinion that we’re losing the forest for the trees.

Can’t say that all of them were encouraging:

<sterncera> If you can’t be bothered to subscribe to -hackers, maybe you shouldn’t be mailing it

Special thanks goes to mastermind, who not only didn’t lose his temper, but stayed focused on the topic and resisted my trolling :-) I really want PG to be greatly performing database (and I’ve seen some great numbers from it), but the amazing amount of ignorance and animosity they have against MySQL makes it somewhat complicated there though :(

P.S. Now all benchmarks I did are tainted and will need full rerun…


PlanetMySQL Voting: Vote UP / Vote DOWN

Now What? (wrt OpenSolaris and your database)

Август 16th, 2010
Last week's "announcement" of the death of OpenSolaris has steered a lot of questions my way about where people should go, and/or where OmniTI will go, now that OpenSolaris future looks non-existent. As one of the more open users of Solaris related technology, and running some beefy loads on top of it, it makes sense that people would be curious as to what we might be doing next. I would start with saying that as a company, we don't have an official policy on this yet, and probably won't. We evaluate each situation on a customer by customer basis, so what follows here is more my personal feelings on what people should do at this current point in time.

The one thing I have noticed from the people I have already spoken with is that there seem to be two major camps, an over simplification to be sure, but I break this down into the free software camp (those motivated by a desire to remain on open source, and/or support, free software as a primary driver of technology decisions), and those more interested in the technology than the ideals behind it. Depending on where you fall into that spectrum, you have different options available to you, and will likely reach very different conclusions.

Too Soon?


The first thing I have said to everyone is that it is honestly too soon to make any moves. Oracle is notorious for being poor communicators, and at this point I don't think we've seen enough official communication to really know what's going to happen. This doesn't mean you can't start planning though! We've been looking at some of the available options since before the Oracle/Sun merger was closed, so it doesn't hurt to start evaluating the options out there. However there's no need to rush in to things; it is possible that the announcement of OpenSolaris's death might be premature. I personally don't believe Solaris can't survive based on the model we've just seen laid out; there are too many people learning the gnu tool chain who won't be willing to invest big money into a tool that is hard for them to use. They need a low cost / free option for people to familiarize themselves on (and all the better if it installs gnu tools by default). There's an outside chance Oracle might come to this conclusion, which would give new life to OpenSolaris.

A more likely alternative to that theory is that some other group might pick up OpenSolaris maintenance and start pushing it forward. Certainly not an easy task, but there are already several different distribution of OpenSolaris available, so the userland level management has the resources, we mostly would need to figure out how to handle the more core technologies that have been maintained by Sun. I think this might also be possible, as there are numerous companies already heavily invested in OpenSolaris technology, and there are Solaris internals hackers looking to move out of Oracle, it's not an impossible leap to think we might see something worked out. And if Oracle continuous to make technology available via the CDDL (which most of the current signs seem to indicate), this could work out. I would say that this might not resemble the OpenSolaris as it is now, but could definitely be an option for current users who'd like to remain on the OpenSolaris platform.

Other Options?



Of course, you might not want to put all your eggs in that basket. So what other options do we have? Well, that mostly depends on what you're getting out of OpenSolaris now, and what you want out of your OS going forward. For many people, I suspect that Solaris 11 Express might be a suitable replacement, especially for those running mixed OpenSolaris / Solaris environments. Migrating up to full Solaris 11 will also cover most of your technology needs, so depending on pricing I suspect people may find that a cheaper alternative to migrating to a new platform. Of course, if you want to stick with a free software solution, this won't really be an option.

FreeBSD seems to be the most obvious alternative platform. If you're currently taking advantage of dtrace, zfs, and zones, FreeBSD gives you options to cover all three. It won't be the same; the dtrace and zfs implementations are pretty close aiui, but for zones you'll probably have to use either Jails or OpenVS, neither of which am I a fan of. I think you'd also find a larger overlap in system utilities (tar, find, grep, etc..) between FreeBSD and Solaris, so for people (and scripts) making the transition, this might be an easier move. The big question here is probably hardware support; if you can't get FreeBSD running on your hardware, that's likely to be a show stopper, unless you can work out a new hardware purchase in the transition :-)

So, if you don't want to go closed Solaris, and FreeBSD isn't an option, that probably leaves you on Linux. People sometimes think I don't like Linux; I'm actually very comfortable on it. My first "unix" was Linux, and we run some extremely demanding systems on Linux and it has performed well in those cases. However if you're trying to do deep introspection, systemtap is a poor man's dtrace. And if you are relying on zfs, you'll have a hard time finding a suitable replacement amongst the current Linux options. Personally I am most comfortable on ext3, but I tend to give up on file system snapshots, which is a painful submission if you have to make it. XFS is probably the next most common option, and generally I've no bones about using it if you want to avoid ext3. Of the three "advanced" replacements; ext4, btrfs, and zfs on linux; I think ext4 is probably your best bet, but only because zfs is too new for any serious database systems, and if you are moving off OpenSolaris to get away from Oracle, "butter" seems like an odd choice.

And so...



I think it's wise to keep things in perspective. There are some cases where you want to be a technology leader (we've been running Postgres 9 for months on some systems), but generally speaking when it comes to picking the operating system and filesystem for your database, it's best to tread lightly. Now is a fine time to start evaluating your options; at least figure out what features are critical to your enterprise that you'll need to replace (and don't just think about database, you might be relying on crossbow for something, or who knows what else). We'll certainly be watching the current options available, and I suspect diversifying a little, over the next 6 months, as we wait for the picture to clear up where we can. We're not in a hurry (after all, we do have the source code of what we're running now), and I don't see much reason for others to be either.
PlanetMySQL Voting: Vote UP / Vote DOWN

Databases, The End User Experience

Июль 29th, 2010
Does it matter if the end user knows what the database is?

Recently I got a wonderful view of a database from the end user perspective.

While I was traveling I had found a restaurant where I had decided to let friends who live locally know where I was at. Part way through my food I got a message from a local friend that said "Don't eat there, their food always makes people sick!"

"Always" is a word that I would think would be a little too strong when applied to a restaurant, right?

Nope, the next day I got to feel the full truth of the word.

A couple of days later I am telling some friends about this and a local asked me "Where was this, I want to avoid them." I didn't get asked this question once, I got it asked a dozen times.

I don't know where the place is. Why is that? Because the system I was using lost the entire day worth of my data. I don't know how often they loose data, but from asking a few other folks it appeared to be that it is more frequent then not.

It came up in casual conversation the other day that the site had moved off Postgres to another system recently. Which suddenly made everything make sense, because the particular solution they moved too is not very durable.

We talk about databases being "transactional" or not. We talk about them being "durable". What matters in the end, to me as an end user, is that when I put my data in a system, I want a confirmation that the system stored it. I don't want to retype my data, and I don't want to collect it again. If I was the operator for the site? I certainly wouldn't want to be losing my users data.

In the MySQL world? MyISAM is the most abandoned storage engine in the stack. People will pick it initially because it is fast, but the first time they discover data corruption or have to deal with multiple hours of recovery time they quickly move away from it.

As an operator I wouldn't want to be having to explain to my users or my boss, that we had to wait 12 hours until the database recovered itself (or that it had corrupted itself). "Transactional" systems know how to handle recovery. People will wave their arms about and talk about disk controllers, disk failures, etc... That is hand waving. A properly configured system is redundant and sure it can be hit by lighting, but the real issue is most likely going to be that a plug gets pulled or a program crashes.

I look at, and even work with, some of the "no-sql" solutions. Some of them I recommend, and other's of them I don't. I look at scale out needs, usage patterns, and a wide variety of other details.

As end user though?

I would like to know that my data was stored, and that I will reliably be able to retrieve it when I want. I don't like outages. Of the services online that I pay for or that I have integrated into my life? I can't imagine wanting to deal with a system which was unreliable. A free service which does not work most of the time, is not free. It will consume my time whenever it is not available.

There is an end user experience for the database, site operators ought to remember this.
PlanetMySQL Voting: Vote UP / Vote DOWN