Bratty Redhead

the sarcasm is free!

The Alien Technology Tuning Challenge

A friend challenged me tonight to write a brilliant blog post on tuning a technology about which I know nothing. Actually I don’t think you can really do that. I don’t think you can write a brilliant blog post unless you’ve participated in some kind of failure/stress activity with a product. 

Until you’ve seen real life interact with your infrastructure, it’s all just theories and beer.  But after a long day debugging Chef code I wrote in a way I wish I’d never written (no really, I don’t want my name anywhere on some of this stuff!), I thought I’d take a little downtime to read about a product I’m utterly clueless about.  Because that’s fun.  YES IT IS.

How clueless? After he challenged me, I opted to get in a bike ride while the weather was nice (rumor predicts snow this weekend). While pondering things on my bike, I finally stopped for a sec, took out my phone and googled ‘Redis Rescue Throughput.’  And when I found nothing at all useful, I sent him a text: “Did you actually use the words ‘rescue throughput’ earlier?”

Can you guess what he said to me? He said, “Go write a brilliant blog post on Optimizing Redis for Resque Throughput.”  Now you know how much I know about Resque.  That would be a big, embarrassing ZERO.  As for Redis, I know it’s an in-memory data store.  That’s it.

So now it might not surprise anyone who knows me that this evening did not turn into a big brainstorming session for Redis tuning params.  Once I tracked down the Resque github page I was all

 

And it’s written in Ruby. This is kind of dorky of me, but I still get a little thrill whenever I read source code and know what’s going on. I looked at some of the examples and thought, OH HEY, I see what you did there! Not that I’m any kind of a genius. But it’s fun to realize I can read it.

And so then I spent the evening reading about Resque, looking at the source code and reading blog posts about it.  I adore queuing software. I love the idea that we can pop little bits of data into a store and have it consumed asynchronously, without having other processes blocked while waiting for something to complete. It always makes me happy to have asynchronicity in place.
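For the code-minded, the idea looks roughly like the sketch below. It uses Python's standard library queue and a thread as a stand-in for a real store like Redis. The function name run_demo and the job names are made up for illustration, and this is not how Resque itself is implemented.

```python
import queue
import threading


def run_demo(jobs):
    """Enqueue jobs without waiting on them; a worker thread consumes them."""
    q = queue.Queue()
    results = []

    def worker():
        while True:
            job = q.get()          # blocks only the worker, not the producer
            if job is None:        # sentinel value: no more work is coming
                break
            results.append(job.upper())  # pretend this is the slow part

    t = threading.Thread(target=worker)
    t.start()

    for job in jobs:
        q.put(job)                 # returns right away; producer moves on

    q.put(None)                    # tell the worker to stop
    t.join()                       # wait here only so the demo can return results
    return results
```

The producer never waits for a job to finish; it only drops work into the queue. With a single worker the results come back in the order they were queued.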

This, of course, comes mostly from years of supporting projects back when devs didn’t understand or know about asynchronous communication. Lots of our problems back in the day were related to synchronous calls blocking until the app crashed.  Good times.

When I encounter open source queuing applications, I get a warm fuzzy.  I grew up professionally in a world that only acknowledged one queuing tool: IBM MQ. IBM built an enormous industry around high availability messaging and I had no idea there were other, easy-to-use tools available in the wild, until I got involved with open source.  When I first came across RabbitMQ I was enchanted, just because it was the first free, easy-to-use queuing tool I had met. I come from a world where so many are led to believe that you should pay millions of dollars for a decent, reliable tool.

Then I remembered! o craps! I’m supposed to be thinking about in memory database optimization for this queuing stuff, NOT reading about the queuing!  Unfortunately, it’s now late and I have to get up early tomorrow, so I guess I lose the alien tuning challenge. But I can leave you with common sense and thoughts based on what I see in the redis.conf.

Tuning your in-memory data store for throughput:

Don’t be stupid.

  • Read the config options.  Am I the only one who loves reading config files? Probably.
  • Also read the Redis page on virtual memory.
  • Dedicate instances to your queuing activities.  Don’t make data with disparate requirements co-exist. Competing data sets could also cause developer hair pulling fights over whose app is breaking things.

Disk I/O and resource contention

  • Avoid frequent disk writes, especially if you have multiple instances on the same host, because you risk I/O contention
  • Avoid excessive logging for the same reason
  • Avoid virtual machines
  • Get a fucking SSD? Maybe not if you avoid needing to write to disk much, since we’re more concerned with ALL THE RAMS.
  • Know how much memory you will need and MONITOR usage and trending BEFORE you use it all up.
  • Your data lives in a memory-based container.  Understand the max memory policy, meaning what the data store does once it hits its memory limit.
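To make the "monitor usage and trending" point concrete, here is a minimal Python sketch that projects how long you have before you hit your memory limit. It assumes you already collect (timestamp, used bytes) samples somewhere, for example from the used_memory field of the Redis INFO command. The function name seconds_until_full and the straight-line projection are my own simplifications, not anything Redis provides.

```python
def seconds_until_full(samples, max_memory_bytes):
    """Project seconds until memory use reaches the limit.

    samples: list of (timestamp_seconds, used_bytes), oldest first.
    Returns None if usage is flat or shrinking.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples to see a trend")

    (t0, m0), (t1, m1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")

    # Average growth in bytes per second between first and last sample.
    rate = (m1 - m0) / (t1 - t0)
    if rate <= 0:
        return None

    remaining = max(max_memory_bytes - m1, 0)
    return remaining / rate
```

A straight line through two points is crude, but it is enough to alert on "at this rate we are full in under an hour" instead of finding out when the limit is already hit.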

Connections

  • Manage your max clients - just because it defaults to 10,000 doesn’t make it right.
  • Ensure you have enough file descriptors for all your connections plus whatever else the DB needs to keep functioning.  If you limit your max connections, you can probably leave the FDs unlimited. Or you can limit your FDs, since it looks like Redis is smart enough to lower its max clients to fit within the FD limit.
  • Either way - be aware of the relationship between number of potential clients connecting, max file descriptors and max client connections.  Or you will be sad.
  • Connection timeouts - this is a tricky topic. If your data store is separated from your queue manager by a firewall, you probably can’t leave the timeout on infinite. Firewalls will time out your idle connection and not tell either side about it. This will either confuse the queue manager and cause it to error, or the queue manager will possibly be smart enough to open a new connection.

     If the latter, the data store still thinks the old connections are alive, so you will eventually run out of file descriptors or allowed connections on the data store side unless the data store is also smart enough to clean them up.
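The relationship between file descriptors and max clients can be sketched in a few lines of Python. This is my reading of the Redis documentation, not a copy of its code: I am assuming Redis keeps 32 file descriptors back for its own use and, when it cannot raise the OS limit, lowers max clients to whatever fits. The names effective_max_clients and RESERVED_FDS are made up for this example.

```python
# Assumption: number of file descriptors the data store keeps for itself
# (logs, persistence files, replication links, and so on).
RESERVED_FDS = 32


def effective_max_clients(requested_max_clients, fd_soft_limit,
                          reserved=RESERVED_FDS):
    """Return how many client connections can actually be served.

    requested_max_clients: the configured max clients value.
    fd_soft_limit: the per-process soft limit on open file descriptors.
    """
    available = fd_soft_limit - reserved
    if available < 1:
        raise ValueError("FD limit too low to accept any clients")
    # You get the smaller of what you asked for and what the OS allows.
    return min(requested_max_clients, available)
```

With a soft limit of 1024, asking for 10,000 clients gets you 992. On a Unix box, Python's resource.getrlimit(resource.RLIMIT_NOFILE) returns the soft and hard limits if you want to plug in real numbers.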

Redis

(Did I mention I know nothing about Redis? It’s an in-memory data store, right?) I read the redis.conf and skimmed the Virtual Memory page.

  • Disable active rehashing
  • Understand your VM options
  • Understand your typical message sizes and size your paging accordingly

That’s all I got. You should verify anything you read here against your own requirements and get a second opinion.  Every situation is unique. All of these relate to production environments and could be specific to a low latency goal. Memory conservation and data criticality may be conflicting priorities or require compromise.