My Favorite JVM Arg EVER

This is partially a post for posterity because I can never remember the format of this when I want it and I have to search all over the Internets for it.  And for some reason, it's really hard to find with keywords.  Possibly my google-fu sucks, but I actually think it's a Java developer conspiracy.

The first time I found this JVM arg, it was by accident.  I was reading a page of comprehensive JVM args.  No, it wasn't for fun, although I was enjoying it.  I was on a gig to help a client get rid of 20-second garbage collections that were crashing their site.  So I was reading through the page just to see what might be useful and I found this:

-XX:MaxJavaStackTraceDepth=<num>

It was love at first sight.  See, I know developers lurve their stack traces and exceptions.  Truly.  But I've met some truly horrendous apps.  You know the kind.  They throw 500-line stack traces every time someone types in a bad password. Or you have an overrun team that, once an app is in production, doesn't have time to tune it, so you are stuck with 2000 lines of stack trace for every exception, even though it takes fewer than 10 to figure out where to look for trouble.
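For my own future reference, here is roughly how the arg ends up on a command line. This is a minimal sketch, not anything official: trace_depth_flag is a helper name I made up, the depth of 10 is just the number from the paragraph above, and app.jar is a placeholder.

```shell
#!/bin/sh
# Sketch only: build the JVM option string for a given stack trace depth.
# trace_depth_flag is a hypothetical helper, not part of any JDK tooling.
trace_depth_flag() {
  depth="${1:-10}"
  # Refuse anything that is not a plain run of digits.
  case "$depth" in
    *[!0-9]*) echo "depth must be a non-negative integer" >&2; return 1 ;;
  esac
  printf '%s\n' "-XX:MaxJavaStackTraceDepth=$depth"
}

# Usage (app.jar is a placeholder, so this line is left commented out):
#   java "$(trace_depth_flag 10)" -jar app.jar
```

Check what your particular JVM version does with very small or special values before relying on them; I have only used ordinary small positive numbers.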

After I found this and used it, I had bookmarked the page of wonderful jvm options, but it was dead the next time I went there. So I ended up searching the internet for it later and found that my googling led me all over.  I found a stackoverflow discussion where a harried admin was asking about it and a herd of VERY ANGRY developers jumped all over him saying things like

"How dare you truncate our wonderful stack traces!", and

"An exception is called that because it's exceptional. There's a reason for it and you shouldn't truncate it!" 

Oh they were like an angry mob.

But I have to tell you, this arg is an admin's best friend if you have apps in your care that can't be bothered with proper tuning on the dev side.  Often this was because a client had a contract dev team come in, deliver an app and then leave.  There was no one to fix it.

There's no shame in cleaning up your log files so they are readable.  It's tough to actually dig into a problem if you have thousands of lines of "benign" errors filling your world.  This option allows you to limit the damage done by excessive logging while still seeing that an exception is being generated. Of course, with all the wonderful logging packages and tools with filters out there today, this sort of thing is less important if you are filtering, but there are lots of admins out there with nothing but the CLI, syslog and vi even today. This post is dedicated to them.

 

The Alien Technology Tuning Challenge

A friend challenged me tonight to write a brilliant blog post on tuning a technology about which I know nothing. Actually I don’t think you can really do that. I don’t think you can write a brilliant blog post unless you’ve participated in some kind of failure/stress activity with a product. 

Until you’ve seen real life interact with your infrastructure, it’s all just theories and beer.  But after a long day debugging Chef code I wrote in a way I wish I’d never written (no really, I don’t want my name anywhere on some of this stuff!), I thought I’d take a little downtime to read about a product I’m utterly clueless about.  Because that's fun.  YES IT IS.

How clueless? After he challenged me, I opted to get in a bike ride while the weather was nice (rumor predicts snow this weekend). While pondering things on my bike, I finally stopped for a sec, took out my phone and googled 'Redis Rescue Throughput.'  And when I found nothing at all useful, I sent him a text “Did you actually use the words 'rescue throughput' earlier?”

Can you guess what he said to me? He said, “Go write a brilliant blog post on Optimizing Redis for Resque Throughput.”  Now you know how much I know about Resque.  That would be a big, embarrassing ZERO.  As for Redis, I know it's an in-memory data store.  That's it.

So now it might not surprise anyone who knows me that this evening did not turn into a big brainstorming session for Redis tuning params.  Once I tracked down the Resque github page I was all

 

[image]

And it’s written in Ruby. This is kind of dorky of me, but I still get a little thrill whenever I read source code and know what’s going on. I looked at some of the examples and thought, OH HEY, I see what you did there! Not that I’m any kind of a genius. But it’s fun to realize I can read it.

And so then I spent the evening reading about Resque, looking at the source code and reading blog posts about it.  I adore queuing software. I love the idea that we can pop little bits of data into a store and have it consumed asynchronously, without having other processes blocked while waiting for something to complete. It always makes me happy to have asynchronicity in place.

This of course is mostly from years of supporting projects in the early years where devs didn’t understand or know about asynchronous communication. Lots of our problems back in the day were related to synchronous calls blocking until the app crashed.  Good times.

When I encounter open source queuing applications, I get a warm fuzzy.  I grew up professionally in a world that only acknowledged one queuing tool: IBM MQ. IBM built an enormous industry around high availability messaging and I had no idea there were other, easy-to-use tools available in the wild, until I got involved with open source.  When I first came across RabbitMQ I was enchanted just because it was the first free, easy-to-use queuing tool I met.  I come from a world where so many are led to believe that you should pay millions of dollars for a decent, reliable tool.

Then I remembered! o craps! I’m supposed to be thinking about in memory database optimization for this queuing stuff, NOT reading about the queuing!  Unfortunately, it’s now late and I have to get up early tomorrow, so I guess I lose the alien tuning challenge. But I can leave you with common sense and thoughts based on what I see in the redis.conf.

Tuning your in memory data store for performance throughput:

Don’t be stupid.

  • Read the config options.  Am I the only one who loves reading config files? Probably.
  • Also read the Redis page on virtual memory
  • Dedicate instances to your queuing activities.  Don’t cause data with disparate requirements to co-exist. Competing data sets could also cause developer hair pulling fights over whose app is breaking things.

Disk I/O and resource contention

  • Avoid frequent disk writes, especially if you have multiple instances, because you risk I/O contention
  • Avoid excessive logging for the same reason
  • Avoid virtual machines
  • Get a fucking ssd? Maybe not if you avoid needing to write to disk much, since we're more concerned with ALL THE RAMS.
  • Know how much memory you will need and MONITOR usage and trending BEFORE you use it all up.
  • Your data lives in a Memory-based container.  Understand the Max Memory policy

Connections

  • Manage your max clients - just because it defaults to 10,000 doesn’t make it right.
  • Ensure you have enough file descriptors for all your connections plus whatever else the DB needs to keep functioning.  If you limit your max connections, you can probably leave the FD unlimited. Or you can limit your FDs as it looks like Redis is smart enough to use those to set connections.
  • Either way - be aware of the relationship between number of potential clients connecting, max file descriptors and max client connections.  Or you will be sad.
  • Connection timeouts - this is a tricky topic. If your data store is separated from your queue manager by a firewall, you probably can’t leave it on infinite. Firewalls will timeout your connection and not tell you about it. This will either confuse the queue manager and cause it to error or it will possibly be smart enough to open a new connection.

     If the latter, you will eventually run out of file descriptors or allowed connections on the data store side unless the data store is also smart.
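To make the file descriptor relationship concrete, here is a small sketch of the arithmetic. It assumes, as I understand the Redis clients documentation, that Redis keeps 32 descriptors for its own use and trims max clients to whatever is left; treat that number as an assumption and check it for your version. effective_maxclients is a name I made up.

```shell
#!/bin/sh
# Sketch only: estimate how many clients Redis can actually accept.
# $1 = file descriptor limit (e.g. from `ulimit -n`, must be a number)
# $2 = the maxclients value you asked for
# $3 = descriptors reserved for Redis itself (assumed 32, see lead-in)
effective_maxclients() {
  fd_limit="$1"; wanted="$2"; reserved="${3:-32}"
  usable=$((fd_limit - reserved))
  if [ "$usable" -lt 1 ]; then
    echo 0
  elif [ "$wanted" -gt "$usable" ]; then
    echo "$usable"      # the descriptor limit wins
  else
    echo "$wanted"      # the configured limit wins
  fi
}

# Usage (only meaningful when `ulimit -n` prints a number, not "unlimited"):
#   effective_maxclients "$(ulimit -n)" 10000
```

With a descriptor limit of 1024 and the default 10,000 max clients, this works out to 992, which is exactly the kind of surprise that makes you sad.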

Redis (did I mention I know nothing about Redis? It's an in-memory data store, right?) I read the redis.conf and skimmed the Virtual Memory page.

  • Disable active rehashing
  • Understand your VM options
  • Understand your typical message sizes and size your paging accordingly
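Here is how a few of the bullets above might look as a redis.conf fragment. The directive names are real ones from redis.conf, but every value is a placeholder I picked for illustration, not a recommendation. I left the virtual memory directives out because, as far as I know, later Redis releases dropped that feature.

```
# Sketch only: placeholder values, tune against your own workload.

# Cap memory before the box starts swapping.
maxmemory 2gb

# Assumption: for a job queue you would rather get an error than lose jobs.
maxmemory-policy noeviction

# Size this to your real worker count instead of trusting the default.
maxclients 1000

# Close idle client connections after this many seconds (0 disables).
timeout 300

# Keep logging quiet to stay off the disk.
loglevel warning

# The low latency choice described in the config file comments.
activerehashing no
```

Note that comments sit on their own lines; redis.conf does not take trailing comments after a directive.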

That's all I got. You should verify anything you read here against your own requirements and get a second opinion.  Every situation is unique. All of these relate to production environments and could be specific to a low latency goal. Memory conservation and data criticality may be conflicting priorities or require compromise. 

Yak Hacking - A Story

I have a couple of half-written blog posts started this weekend.  Real wordy things about scrummifying your infrastructure and my experience with it.  I'll probably get to that in a few days but I needed to vent about my weekend of yak shaving first.

Moral of my story: The story you're about to read is not the worst yak shave ever, and the problem is not a hard problem.  It's all in a day's work for any halfway decent sysadmin.  What we're seeing here is a small problem exacerbated by a combination of technical debt and inadequate tooling.

Technical debt is a choice and can happen to anyone.  Here the client allowed their configuration management to run away from them.  They haven't been maintaining their Puppet nodes and so don't have a good list of what servers they are managing.  They also let some config files slip through unmanaged.  I generally don't point fingers about it as there's usually a sane tradeoff involved, but the first issue makes fixing the second one harder.

However, inadequate tooling does frustrate me.  They've gone to the trouble of automating with Puppet and managing application configs with subversion and scripting, but do not seem to have considered holistic server management. The only way to perform administrative tasks is by hand on each server or with Puppet. This seems a rather gaping hole in long term planning.  100 servers is long past what I consider manageable by hand.  But you read the story and decide for yourself.

Situation
An org uses Puppet. The org has files unmanaged by Puppet that need to be gathered, analyzed and brought under Puppet control. I expected this to take a couple of hours.

What made it challenging:
Actual list of servers, uncertain. Generated from Puppet but unverified.

The files are secured from being read by anyone except root. 600 as it were.

There are no other management tools - no Rundeck, no Func, no Ansible, no Salt, no Knife, not even a home grown, lovingly maintained perl management script, no passwordless ssh.

The client apparently expects admins to log into boxes one by one for administrative activities.

What I have:
My user ID. 
Sudo on any server I can log into.
Passwordless SSH works even if it's considered insecure.  (really?)
SSHPass
SSHSudo
Ruby 1.8.5 installed from RPM

With the thought of using sshsudo, I wrote a ruby script that runs on the client node, checks for file existence, copies the files to a temp folder, makes them available for scp and even tries to scp them back to the admin server.  The script should be runnable via sshsudo/sshpass.  BUT...

The servers also do not allow sudo without a tty. SSH -t doesn't work.  SSH -t -t kind of works but hangs without the occasional keyboard intervention. Trying to scp from the script invokes another request for password which can't be handled discreetly on the client.

SSHpass works but can't execute anything with sudo on the other end because of the secure tty issue.  So this doesn't do me any good because all the files are root-readable only.

SSHSudo works but suffers from similar issues.

The servers are RH5 running puppet installed with RPMs so no rubygem installs exist, even if I were rude enough to install things on servers belonging to someone else. I had been looking at the use of net-ssh/scp for some of my scripting, but it wasn't really useful.

Did any of that make sense?

None of this addresses the pre-work I did either.  I was given a list of 200 or so servers (generated from Puppet I believe), with the caveat that "some may have been retired."  So I wrote a tcp script to check listening sockets on port 22 and a few others if desired.  I then sorted my servers into ones that responded, ones that timed out and ones that issued a 'name or service not known' message and consulted with the client's full time sysadmin. It turns out that the ones in DNS but not responding were retired and the ones with connection timeouts needed to be reached from another server. omg.
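The sorting described above can be sketched in shell roughly like this. It is a stand-in, not my actual tcp script: it assumes bash and the coreutils timeout command are available on the admin box, the function names are made up, and the error text it matches on is what a typical Linux resolver prints, so verify it on your own systems.

```shell
#!/bin/sh
# Sketch only: sort hosts into buckets by how a TCP probe fails.

# Map one probe result to a bucket name.
# $1 = exit status of the probe, $2 = its captured error text.
bucket_for() {
  status="$1"; msg="$2"
  if [ "$status" -eq 0 ]; then
    echo good
  elif [ "$status" -eq 124 ]; then
    echo timeout                      # 124 is what timeout(1) returns on expiry
  else
    case "$msg" in
      *"Name or service not known"*) echo nodns ;;
      *"timed out"*)                 echo timeout ;;
      *)                             echo other ;;   # e.g. connection refused
    esac
  fi
}

# Probe host:port using bash's /dev/tcp, giving up after 3 seconds.
probe() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>&1 </dev/null
}

# Read a host list and append each host to servers.<bucket>.
sort_servers() {
  list="$1"; port="${2:-22}"
  while IFS= read -r host; do
    [ -n "$host" ] || continue
    msg=$(probe "$host" "$port"); status=$?
    echo "$host" >> "servers.$(bucket_for "$status" "$msg")"
  done < "$list"
}

# Usage (servers.all is a hypothetical input file):
#   sort_servers servers.all 22
```

Keeping bucket_for separate from probe means the sorting rules can be checked without touching the network.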

What I finally did:
Several unscripted actions on the command line because I was doing them as troubleshooting/discovery steps while figuring out wtf to do to get the files I needed.

Assembled a servers.good list based on the tcp testing.

Updated the sshsudo script to -t -t wherever it ssh'd (oi!)

Made a local dir for the files, separated by host name:
for i in $(cat servers.good)
  do echo "$i"
  mkdir -p "/tmp/sascha/files/$i"
done

Put the file manipulation script down on all the servers and ran it; originally it was going to scp them back to the admin server too, but that wasn't working out for me
./sshsudo -r -v -u sascha servers.good getfiles.rb
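For the curious, the staging half of getfiles.rb boils down to something like the following. This is a shell stand-in written for this post, not the Ruby script itself; stage_files and the paths in the usage line are made up, and it hands the copies to the login user instead of opening them to everyone.

```shell
#!/bin/sh
# Sketch only: copy root-only files somewhere a normal user can scp them from.
# $1 = user who will fetch the files, $2 = staging dir, rest = files to stage.
stage_files() {
  owner="$1"; dest="$2"; shift 2
  mkdir -p "$dest" || return 1
  for src in "$@"; do
    if [ -f "$src" ]; then
      cp -p "$src" "$dest/" && echo "staged $src"
    else
      echo "missing $src" >&2       # report it and keep going
    fi
  done
  # Give the copies to the fetching user and lock everyone else out.
  chown -R "$owner" "$dest" && chmod -R u+rwX,go-rwx "$dest"
}

# Usage (run as root on the remote box; the file path is a placeholder):
#   stage_files sascha /tmp/files /etc/example.conf
```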

Got the files I snagged out of their root-read-only existence
for i in $(cat servers.good)
  do echo "$i"
  sshpass -f ~/mypassfile scp -q "sascha@$i:/tmp/files/*" "/tmp/sascha/files/$i"
done

Deleted the files from remote tmp
for i in $(cat servers.good)
  do echo "$i"
  sshpass -f ~/mypassfile ssh -q "sascha@$i" 'rm -rf /tmp/files'
done

I also spent some time making a temporary keyset for passwordless ssh but that turned out to be no real diff from using sshpass. But messing around with moving it to the servers highlighted another issue - I could only log into about 10 of 100.  Lovely.

OMG.  I need an orchestration tool or something, STAT.  I should probably go script this, but I will probably never do anything similar for this client again unless they ask me to implement an orchestration tool for them. 

All of this work, just to get the files to me so I can work with them. How do people live like this???

****Lesson Learned: ask more questions when accepting work, even when it's ad hoc, tiny project work for someone I know.  I assumed they would have management tooling.  After all, they were smart enough to use configuration management.

Weekend Chef and Puppet Projects

I have a few clients that I keep in touch with and take on small projects from time to time.  This has been a strange week.  I’m technically full time on a project at the moment and I haven’t gotten any calls for new work in a couple of months.  It’s been quiet.  Then all of a sudden Wednesday I had 3 different people contact me about work. 


One call was actually for some awesome Chef work at a place I’ve been before that I’d love to go back to.  It really pained me to tell them that I thought they had in-house resources of which they weren’t aware and that these resources would be more than sufficient to Chef their project without my help. Sadface Sascha.


Another call was from someone I do work for from time to time.  Their projects source my “Puppet for Chef” series, if 2 blog posts can be called a series.  In my head I have a 3rd one on templates but haven’t gotten around to it yet. This 3wk run may source some if I’m lucky. I agreed to do some ad hoc sysadmin work for these guys because they’re in a swivet, getting ready for a PCI audit in 3 weeks.

The job: There are about 150 CentOS 5 servers. Caveat: some of the servers on the list may be retired already.  Examine all of the configuration files for ConfigServer Firewall, bring them under Puppet control. Individual node config sets are ok (ugh).  All servers are running a puppet client.  I don’t know if it’s the same client.  The master was at 2.7.9 last I checked, I think.

I have a mini Chef project too for the weekend, but I think I’ll save that for another blog post.  I actually wrote most of this post on Saturday morning and am just finishing it up. I haven’t actually started my Puppet stuff except to wonder if they have some command line tools and think I’d better get to writing some tcp testers and comparison scripts if there isn’t such a thing.


I actually spent a chunk of time working on my Chef thing which was an exercise in frustration.  It’s for a Windows Ohai plugin and the actual ruby work was a piece of cake. Testing has made me crazy though.  Windows servers don’t fire up as fast as Linux for one and they are a huge pain in the ass to interact with.  On my todo list is an SSH server for Windows cookbook, unless my ultimate wish of hoping that all Windows servers DIAF happens prior to that.  The ssh cookbook is half done, like so many things.

I also am not sure I understand the testing strategy.  I didn't write any new code so I shouldn't need to author any tests (relief).  But it turns out I can't see any Windows testing happening in the run, or at least, I think that's what's not happening.  It ran happily on my Mac and then I thought maybe I should run the test on the Windows box too to see if the output was different.  It was different.  But I think that's because there are no Windows plugin tests in there?  Wondering if I missed them?  I was planning to email the chef-dev list tomorrow to see if I can get some love.  Unfortunately the time I have to work on pet projects is the weekends and the IRC channels were like graveyards this weekend.

But I digress.  I also ran manual tests - you know, the kind where you do stuff by hand and watch the output? My Ohai changes tested perfectly sanely on my Windows 7 64bit workstation.  They tested insanely on a 2008r2 VMware VM I had available to me.  And when I went into ec2 to fire up a neutral 2008r2, I discovered Ohai hangs on gathering ec2 metadata when run from the command line.  I have no idea why.  I hand-edited the ec2 plugin and the mixin/ec2_metadata to try and eke out some debug info but I got no joy.  While the problem fascinated me, that was as far down that yak shave as I wanted to go today.  So I put away the Chef and cleaned up the yard for 2 hours.

Now I guess it's time to play with Puppet!

Annoying Recruiter Call #45

 

I enjoy independent consulting.  For that matter, I enjoyed consulting for a consulting company.  I’ve been doing it in one form or another since 2006.  I love the variety and the challenge that comes from never knowing wtf will be thrown at me on any given day or project.  Some days it’s terrifying and some days I want to slap people around, but I won’t deny that I like it.

Currently my LinkedIn profile, a business card and a terse web site are all I use for self-promotion. I never lack for work. I get calls every week from larger businesses looking to recruit for FTEs.  This is rarely something I’m interested in.  I dislike the inevitable silo and I’m not able to engage in the cognitive dissonance required to believe in all the meaningless BS that is part of corporate baggage.  So I politely tell all these recruiters that, while I appreciate the contact, I’m not looking for full time work at this time.

Once in a while I get contacted by recruiters looking to hire for smaller startups.  I always listen to the pitch for these because, if the right small company comes along, I'd probably go work for them. 

Sometimes they’re way off the mark, either on my skillset or on the culture match they want - like the recruiter looking to hire someone to work at a social network sports startup.  When I told her that, unless it was bicycle racing, I wasn’t interested, I could hear the awkward just floating through the phone.  Obviously these guys trying to use ESPN APIs and social networking to bring people together would have nothing to say to me.

I had a call in December from a recruiter from a brand new startup local to the area, using a tech stack of which I approved.  They tried to set me up but it never worked out.  The place was all chaos and key people kept going on vacation and one thing after another.  Eventually I forgot about it.  Until I got a call the other day from the same recruiting company.  They left me a voice mail, telling me they had a question and would I call.  So I did.  I called them today.

I got on the phone with the recruiter, a new one, and he literally stumbled all over himself for over a minute, trying to spit out a complete sentence.  He finally did and I nearly hung up on him.  The conversation went something like this:

Him: “ah, um...so...you know that company we were trying to...remember you worked with my boss last year and that company....they um, were looking for...remember we tried to set you up with X company?”
Me: Yes
Him: Would you happen to....they’re looking for...it’s really hard to find people who are into startup culture....um....ah...they’re looking for....um...developers....and so um...I was wondering....if, um
Me: I don’t know any developers looking for work
Him: Oh, um...right, well, if you happen to think of anyone...could I um...send you a followup email...just in um case?
Me: sure <click>

Then I sent an email to his manager, the very nice, polite and professional person I’d been talking to in December:

“Hi Harry, I would appreciate it if you could let your colleagues know not to call me for the sole purpose of asking if I know anyone looking for work.  It would probably also help their delivery if they did not preface their requests with ‘remember the job last year we were trying to get for you but couldn't?’ and follow up with ‘do you know anyone else who might like to work there?’

Thanks,
Sascha”

I am not a fangirl

Apple-1
In the beginning, there was no Apple...
I have an uneven history of geekiness.  Most of my geek comes from my dad.  When I was little, sometimes my parents would let me sit up late on the weekends and play Dungeons and Dragons with them and their friends.  My dad was a bona fide war gamer.  He even played them by mail, with maps hanging up on the wall, using sticky putty to hang counters on them.  We didn’t have a lot of money, but I remember Pong and the first computer I played on was a TRS-80.  I played Zork, a little.  My dad took me to wargame conventions sometimes. There were no Apple products in the house.

Still no Apple...
Left to my own devices, I was mostly just a bookworm and music junkie.  I knew computers because they were always around at home, but I never programmed for fun.  I made money in college tutoring computer subjects and took a few “computer classes,” because that was the only way to get an email address at my tiny college.  But I didn’t take them seriously.  I was a Medieval Lit major and only interested in computers for email, browsing and word processing. I used a Unix terminal long before I mastered DOS or even saw an Apple up close.

Who the hell is Apple?
As an undergrad I took a job doing tech support because it beat the hell out of working in the library, both fun-wise and money-wise.  Seriously, ever mended a book binding or filed the catalog cards that come with new books?  Helping the clueless figure things out won for me every time and tech support usually means Microsoft systems.  So still no Apple.

Isn’t Apple going to die a horrible corporate death?
What I’m trying to say is, I never even saw a Commodore or an Apple PC until I was an adult.  I never had any reason for interest in Apple products.  For years to me they were just a company about to go broke.  All my experience was Microsoft-based.  Then the internet boom happened and I left the Lit studies behind. 

Oh look, an iP....
I never wanted a Mac. I eventually bought a 4th gen iPod and really liked it. I'd never really clicked with any mobile music devices.  But I loved the iPod and used it extensively for running.

I was skeptical about smartphones and I didn’t buy one at all until the iPhone 3GS. At first I didn't care for it as I was used to my Blackberry that I had for work.  The transition from keyboard to touchscreen was rough.  I still don't care for chat clients that much although I'm a proficient typist now. It really grew on me and I eventually became quite attached to it.

When the 3GS caught a dunking last year, I picked up the iPhone 4 and only gave that up because AT&T didn’t work at all in my urban Minneapolis neighborhood.  Since I needed a new phone and service, I took the opportunity to test drive an Android phone and, while it has a few good things, I’ll gladly give it up for an iPhone 5.  iOS 5 brought the pulldown notifications which I looooved on the Android.  And while the phone still doesn't have a blinky light, I'm so attached to my social software these days that I almost don't need one any more.  That may not be a good thing....

Something I've never been able to get used to on my Android is the keyboards.  They do a few cool things I wish for on iOS (holding down the key for the shift character), but in general I mistype on them constantly and they never seem to figure out the trends.  I always feel like iOS is reading my mind when I type.

Me likely teh Shuffle....
Eventually, I discovered the Shuffle, when it first came in the small square.  I've gone through 3 of them as I firmly believe they are the best music player for running ever invented.  Apple got this so right.  It’s tiny, clips on to my sports bra strap and has an easy-to-identify pause button and fast forward button.  It shuffles my playlist too. This is all I want in a running player.  Anything else complicates tactile navigation.  I skipped the funny rectangle shuffle - it made things complicated.

It's a little irritating that it costs so much for the memory it comes with, but it really approaches perfection for me.  So I pay for it.

Still, who needed a Mac?  Not I.
Despite all of this, I never wanted a Mac.  Among other things, I had some experience with Linux touchpad drivers and some older Mac touchpad drivers and they made me want to headdesk myself bloody.  I had years of Windows experience. I knew how to fix it and I understood its quirks.  I liked it.  There was no Mac software for certain things I love (hello Quicken) and they aren't really made for games.  And their single mouse buttons gave me the creeps.  Little did I know...

I always thought of Macs the way I do Volkswagen: Nice, but overpriced for what you’re getting.  I’ve always liked a high screen resolution on my laptops and Macs could never give me that.  My 11” Sony had a 1366x800 screen resolution which I loved.  My 13” had 1600x1024 (or something like that).  It was awesome and lightweight to boot.  I could even snag a unix-like command line with Cygwin. It did everything I needed it to do.

Until I started programming....
2 years ago, when I started in on Chef and Ruby, needing Rubygems and all the other disparate parts that come with managing laptop development environments, it all fell apart.  I’ve heard rumors that it’s possible to get ruby environments running with Cygwin.  As a matter of fact, I know you can because a couple of the Thoughtworks consultants forced to use client-supplied Windows laptops made it work.  But it was painful.  And I wanted no part of it. 

I ad-hoc’d my way through for a while by using a Linux VM.  That worked ok, but it wasn’t really flexible enough for me.  Also, I ran into funky problems like some gems/tools worked on some Linux distros but not others.  But one thing almost everything in the ENTIRE WORLD works on is Mac OSX.  BECAUSE ALL THE DEVELOPERS ARE USING IT! And so all the Ruby and Python tools in the world work on OSX as well as common Linux distros (read: Ubuntu)

I eventually discovered that I really wanted a Mac because I was so tired of devoting countless cycles to shoe-horning (yak shaving) things into a workable state on my Windows laptop when I knew it would all just work on a Mac.  So I waited for the new Macbook Air to surface last July and grabbed one.  I’ve had it for 8 months and am pretty used to it now.  It doesn’t have the screen real estate that my last 13” had, but I’ve come to really love the touchpad and gestures.  I also love hot key empowerment and being able to type at a comfortable command line.  I love that, when I don’t know wtf the OS is doing, I can drop to a command line and figure it out.  I love that OSX comes with awesome built in tools like iPhoto and the VPN client.

I still have days when I’m all key smashy omgwtf-fu mac, but they are rare.

And then there was iPad....
I’ve never felt the need for a tablet.  I’m not on the road that much these days.  My laptop is fairly mobile.  I have tiny hands. No seriously, I shop for winter gloves in the boys section because all the women’s size small gloves are TOO BIG.  WTF are my tiny hands going to do with a tablet? 

It turns out, strain to reach the letters in the middle of the keyboard.

My recent experiences with different phones made me think I could be ready for a tablet.  I’d canceled my iPhone contract and was going to sell it.  But I found myself continuing to read on the iPhone and use it for things at home via wifi.  I read a lot at night in bed in the dark. The new Samsung was too bright even with the brightness all the way down and the battery needed to be saved for when I actually needed to use the phone. I eventually got a light filter app whose sole duty is to dim the backlight even more.

5 months later, I’m still using the iPhone.  I figured I might find some use for a tablet and maybe sell the iPhone.  Plus, I love the iPhone keyboards and the way iOS seems to read my mind when it’s autocorrecting (most of the time).  I’ve never connected well with the Android keyboards, even with auxiliary apps like SwiftKey.

I now have a new iPad.  It’s really big.  I can’t really reach the letters in the center of the keyboard with my thumbs although it’s nice to type on when flat.  It’s also really slippery and hard to use in bed.  My case will come tomorrow.  This will hopefully keep it from sliding out of my hands and give me some texture to hang on to.

I like that Tweetdeck will sync across all my devices.  I like that I can now read PDFs without pain on a mobile device.  I don't like that everyone seems to think that Retina means everything should stay the same size and look nicer.  I want smaller stuff.  SHRINK ALL THE THINGS APPLE!

The iPad is not yet earning its keep.  I'm on the extended work from home plan so it's doing light duty at the moment.  I am reserving judgement; although it's possible that, if you buy the rumored 7 inch iPad later this year, I might trade you.

And the paraphernalia...
I picked up a stand for my Macbook and a bluetooth keyboard.  And then the Airport Express.  I researched wireless print servers and couldn't find a decent one for much less than the Airport.  And the Airport will stream my iTunes too.

And the iTunes...
Yeah I forgot to mention iTunes.  I've been using it for years, even though it sucked up massive resources on my Windows boxes.  It's the easiest, most intuitive music manager I've ever met.  And it manages my i-devices.  I asked Android owners before I bought my phone, "how do you manage your phone?" They just looked at me like I was crazy.  I still can’t figure out if I’ve been conditioned by Apple, but I really like having the iTunes interface to manage my phone, etc.

And then there was the iTunes Store...
I loathe the iTunes store.  There’s no way to filter searches.  There’s no way to make it show you what you want.  You get only what it wants you to see. 

So I don’t know, am I a fangirl? 
I've always thought Apple owners came across rather smug. But then I kind of fell down the slippery slope of Apple product ownership.  I don't feel smug.

I still won’t buy iTunes content except for a few things here and there.  I won’t buy the Apple TV.  I have a Roku that streams everything I need and for less money. 

I don’t care to be locked into one content provider, which is why I refused to buy a Kindle.  Also, I love me my backlight.  Ironically I like a smartphone/tablet for reading because it allows me to have apps for all content providers.  In general I tend to use Amazon, but I refuse to be required to use them through the mishap of buying one of their devices.  If Apple ever started restricting other content provider apps, I would have to take a serious look at my hardware collection.

I still have a Windows desktop.  My Quicken is there.  My (tiny) gaming collection is there.  Calibre is there.  Other large footprint software is there.  I haven’t rid myself of my 13” Sony yet although I only open it when I need to look for something.  It’s my only mobile HDMI-out device at the moment.  Apple devices don’t play Flash and that’s what Amazon is streaming in at the moment.

I’m a big fan of their hardware.  I love all my Apple stuff.  But I’m still comfortable with Windows and have nothing against it as a user (although Windows Server Administration makes me want to cut off my fingers).  I wouldn’t urge someone else to get an Apple device.  I might or might not gush about it, depending on what it was.  I kind of wish Apple had designed my treadmill. But then, I probably couldn’t have afforded it.

To an outsider, I probably look like a fangirl. But if another company made something as awesome to use as an Apple product, I’m sure I’d buy it.  My foray into that arena though was my Samsung Galaxy S2 which has just not been as lovable as I’d hoped.

I’ve been all “I don’t need Apple stuff” my whole life. But I kinda like it.  It’s reliable and it’s all so very comfortable to use.  So while I may not gush, I might be a bit of a closet fangirl.  It’s possible....

Puppet for Chef Users - Part 2 - When is a node a node?

Disclaimer: I have been working with Chef for about 18 months and Puppet for about 5 minutes.  I don't claim to be an expert on Puppet nor warranty this information in any way.  It could all be egregiously wrong.  This is simply a blog exploring what I'm learning.

When is a node a node?  And when is it a thought?  Or a regex? When is it a description of a certain state of being?

When I first started working in someone else's existing Puppet domain, I was confused.  Everything was described in puppet/manifests/site.pp.  There was a lot of description.  Everything was called a node, regardless of what kind of object it was describing. Lists of manifests/classes were nodes, named servers were nodes, and some regexes were also designated nodes.  I was confused.

Check it out.  In Chef there are nodes.  A node is always the endpoint in question.  It's a concrete item.  It just....is.  A VM is a node.  An AWS instance is a node.  If you want to describe a node, you add recipes or roles to it.  A recipe is a single thing.  A role is a collection of recipes, attributes and information that comprises a functional entity.

In Puppet, a node is a much more nebulous concept.  A node is still always the client endpoint.  But, the node can also be a described end state.

Let's use the ubiquitous webserver example.  In my webserver, I will probably need a set of base OS modules/cookbooks and some webserver stuff.  I might have a requirement set that looks like this:

[ dns, ntp, ssh, syslog, puppet-agent|chef-client, apache, openssl, memcached ]

There are some differences in Puppet and Chef management structure.  In Chef, a node is a client, an endpoint. It's pretty much always a server, whether hardware, VM or cloud instance.

In Chef I would write cookbooks for each of the items in my set.  Then I would probably create two roles: dotcom_base and webserver

roles/dotcom_base.rb
--------------------
name "dotcom_base"
description "The base role for ALL systems"
run_list(
    "recipe[chef-client]",
    "recipe[syslog]",  
    "recipe[ssh]",
    "recipe[ntp]",
    "recipe[dns]"
)

roles/webserver.rb
---------------------
name "webserver"
description "Role for Webservers"
run_list(
    "recipe[apache]",
    "recipe[openssl]",  
    "recipe[memcached]"
)

At this point I would then add both roles to any webserver node in my farm.  Or the roles could be added at bootstrap, or however you choose.  This information goes into node attributes stored in the attribute database.  There is no flat file representation of actual named nodes.
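To make the role-to-recipe relationship concrete, here is a minimal plain-Ruby sketch of how a node's run_list of roles could expand into recipes.  This is not Chef's actual implementation.  The ROLES hash and the expand_run_list helper are my own hypothetical names; only the role and recipe names come from the two role files above.

```ruby
# Plain-Ruby sketch of run_list expansion; NOT Chef's real code.
# ROLES is a hypothetical stand-in that mirrors the two role files above.
ROLES = {
  "dotcom_base" => ["recipe[chef-client]", "recipe[syslog]", "recipe[ssh]",
                    "recipe[ntp]", "recipe[dns]"],
  "webserver"   => ["recipe[apache]", "recipe[openssl]", "recipe[memcached]"]
}

# Expand "role[...]" and "recipe[...]" entries into an ordered recipe list.
def expand_run_list(run_list, roles = ROLES)
  run_list.flat_map do |item|
    case item
    when /\Arole\[(.+)\]\z/
      expand_run_list(roles.fetch($1), roles) # a role may contain other roles
    when /\Arecipe\[(.+)\]\z/
      [$1]
    else
      raise ArgumentError, "unrecognized run_list item: #{item}"
    end
  end.uniq
end

# A webserver node carries both roles.
node_run_list = ["role[dotcom_base]", "role[webserver]"]
puts expand_run_list(node_run_list).join(", ")
```

The point is that the node itself only ever holds a list of roles and recipes.  Everything descriptive lives in the roles.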

Puppet is all flat file management, requiring node management to be described in flat files.  Puppet has also chosen a flexible concept of node.

A node can be a collection of similar items:
node "dotcom_base" {
  include dns
  include ntp
  include syslog
  include puppet_agent
}

A node can be a type:
node "webserver" inherits "dotcom_base" {
  include apache
  include openssl
  include memcached
}

Given dotcom_web1.domain.com, dotcom_web2.domain.com, dotcom_web3.domain.com:

A node can just be a single endpoint:
node "dotcom_web1.domain.com" inherits "webserver" { }

A node can be a collection of nodes matched with a regex:
node /^dotcom_web\d+\.domain\.com$/ inherits "webserver" { }

A thought on node construction: To be more transparent, these could also say

inherits "dotcom_base" { webserver }

While I have not seen this notation, it's what I would do if I were designing from the bottom up.  This may or may not be correct style-wise, but I'm a fan of transparency in my config statements.

Puppet node information can either be described in the top level puppet/manifests/site.pp or site.pp can include a second file traditionally called nodes.pp.  Site.pp can also include puppet conditionals, environment descriptions and params that might be specific to an environment, similar to Chef's override attributes.

So, to recap, Chef and Puppet both have nodes.  Puppet describes many different objects which are all designated nodes.  Chef has chosen to break out object descriptions into roles and runlists for nodes.

Chef progression: cookbook/recipe => role => node
Puppet progression: module/manifest => typed node description => specific/regex matched node description
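
The Puppet progression can be modeled in a few lines of plain Ruby.  This is only a sketch of the idea, not how Puppet resolves nodes internally.  The NODES table, find_node and classes_for are hypothetical names of mine, and I am assuming that an exact node name wins over a regex match.

```ruby
# Plain-Ruby model of the node definitions above; NOT Puppet's real code.
# Keys are node names (String) or patterns (Regexp).
NODES = {
  "dotcom_base" => { inherits: nil,           classes: %w[dns ntp syslog puppet_agent] },
  "webserver"   => { inherits: "dotcom_base", classes: %w[apache openssl memcached] },
  /^dotcom_web\d+\.domain\.com$/ => { inherits: "webserver", classes: [] }
}

# Find the definition for a hostname.
# Assumption: exact names are tried first, then regex patterns.
def find_node(hostname, nodes = NODES)
  return nodes[hostname] if nodes.key?(hostname)
  _, definition = nodes.find { |name, _| name.is_a?(Regexp) && name.match?(hostname) }
  definition
end

# Walk the inherits chain, parents first, collecting the included classes.
def classes_for(hostname, nodes = NODES)
  definition = find_node(hostname, nodes)
  chain = []
  while definition
    chain.unshift(definition[:classes])
    definition = definition[:inherits] && nodes[definition[:inherits]]
  end
  chain.flatten
end

puts classes_for("dotcom_web2.domain.com").join(", ")
```

Seen this way, the typed "nodes" are doing the job that roles do in Chef, and only the last entry describes real machines.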

Puppet for Chef Users

I picked up a Puppet gig recently.  The client already has an OS level Puppet install working.  My job is to use my understanding of automation tooling to configure an app stack.  I was pretty cocky.  I thought, well, I really know Chef after over a year working with it and extending it at another client, how hard can Puppet be? 

It's not hard really.  But it's not cooperative either.  They look really similar, Puppet and Chef, but I'm finding it hard to get Puppet to do what I want.  First, I need to figure out what I want to do.  Then I automatically consider how I would do it in Chef.  Then I have to figure out how to do it in Puppet.  This is harder than it sounds, as a direct translation of something is not always the most useful, graceful or appropriate method.

I keep wishing I could find an article or manual on the Interwebs called something like "Puppet for Chef Users," where it would translate all the things done in Chef to what they do in Puppet.  I haven't found it.  And I don't have all the answers.  As you can see here, I actually have very few answers.  None really.  But I'm eking out answers as I go.

Example: I want to set a node attribute. 

Well, hm.  Ok.  A node is still a node (mostly) in Puppet.  Puppet has Facter instead of Ohai.  Right, I knew that.  So, then I have to address the philosophical question, is that the correct way to do what I want?  Do I add things like myapp_home, java_install_dir to Facter?  Yah, I think I do.  I want stuff outside the myapp module to know where the myapp log dir is, maybe (it could happen).

Let's say I do want to add custom facts.  Those are just like node attributes, aren't they?  I'm still not positive, but I've got to make a decision and start coding something.  I can always revise if I'm wrong.

Alright, where's the attributes file?  Whoops!  No attribute file.  I downloaded a sample Apache module from puppet labs in order to have an example to work with.  I needed the example to be complex enough to have custom attributes but not so complicated I couldn't read it.  I figured an Apache server was a good compromise. 

I found that Puppet modules can define new facts in the lib directory of the module.  Strangely, at first I thought it was defined in a directory called 'plugins' as I first noticed the new facts defined there.  But as I read the documentation on defining new facts, I realized that was the old directory. The new way dictates that all custom code goes in modules/myapp/lib. 

The directory storing new facts has small .rb files with code similar to Ohai snippets.  This seemed like more coding than I needed, but I'm still not sure how to just set an attribute without writing a code snippet.  Here's the code example I found:

require 'facter'
Facter.add(:gem_passenger_version) do
    setcode do %x{rpmquery --qf='%{VERSION}' rubygem-passenger} end
end

Here's my experimental code:

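require 'facter'

# Static facts for myapp: fact name => value.
myapp_facts = {
  "myapp_home"       => "/opt/myapp",
  "myapp_server_dir" => "/opt/myapp/server",
  "myapp_log_dir"    => "/var/log/myapp"
}

myapp_facts.each do |k, v|
  Facter.add(k) do
    # The value returned by the setcode block becomes the fact's value.
    setcode { v }
  end
end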

This is what came out of testing:

[root@pixel rubylib]#  RUBYLIB=. facter|grep myapp
myapp_home => /opt/myapp
myapp_log_dir => /var/log/myapp
myapp_server_dir => /opt/myapp/server

So yay!  I figured it out. I added my puppet equivalent of node attributes.  Kind of.  As far as I can tell, Facter is not hierarchical a la Ohai, so my node attributes are flattened instead of in a hierarchy.
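
To show what I mean by flattened, here is a small plain-Ruby sketch that turns an Ohai-style nested hash into Facter-style flat names.  The flatten_attributes helper and the nested layout are hypothetical, made up for illustration.  Neither tool ships this.

```ruby
# Hypothetical helper: flatten nested, Ohai-style attributes into
# flat, Facter-style names by joining the keys with underscores.
def flatten_attributes(attrs, prefix = nil)
  attrs.each_with_object({}) do |(key, value), flat|
    name = [prefix, key].compact.join("_")
    if value.is_a?(Hash)
      flat.merge!(flatten_attributes(value, name)) # recurse into sub-hashes
    else
      flat[name] = value
    end
  end
end

# How the same data might look as a hierarchy (assumed layout).
nested = {
  "myapp" => {
    "home"       => "/opt/myapp",
    "server_dir" => "/opt/myapp/server",
    "log_dir"    => "/var/log/myapp"
  }
}

flatten_attributes(nested).sort.each { |name, value| puts "#{name} => #{value}" }
```

The flat names it prints are the same three fact names I ended up defining by hand.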

To recap, in Chef/In Puppet:

  • Ohai/Facter
  • node attributes/facts
  • Ohai is hierarchical JSON / Facter is flat key => value pairs
  • Chef node attributes are defined in myapp/attributes/attribute.rb / Puppet facts are defined in myapp/lib/facter

Coming soon, my adventures with Hiera and data storage, also Puppet ruby DSL vs Puppet Puppet DSL.

Things I learned at Surge

Generalists are awesome.  Value them or you're doomed.  They make the best diagnosticians, especially when systems are large, complex, distributed and workgroups reside in silos.

Cloud is awesome.  Except when it's not.  Understand your vulnerabilities. Plan for failure. Don't abuse your ops guys. Don't abuse your devs.  Be interested in what others are doing even if it doesn't (seem to) directly apply to your job.

Buy a fucking SSD.  Hack your kernel.  

DNS is sexy.  

Just because it's up doesn't mean it's working.

Why use IBM when we have Rabbit?

Often the best solution is a hybrid.  Don't reinvent the internets unless you're feeling clever.

Sales guys employ dragons and fire to prosecute leads.  Shun assholes.  OS projects should never skip over the governance role.  You can always tell who's doing devops right because they are genuinely happy.

Etsy used to be the anti-pattern for awesome until they experienced a culture shift causing them to become our role model for even more awesome.  LOLcats and LOLgoats make presentations even better.

The best part about Surge is knowing that I was surrounded by so many smart people.  I love the hallway track and listening to others talk about their systems, the challenges and solutions.  I love hearing about project goals that would never have occurred to me.  I love all the ideas I get from all the smart people talking around me all the time.

The only unfortunate part of the week was that several talks I wanted to see were scheduled against each other and I would have liked to see both.  I hope that there are some videos posted eventually.