Simplicity, Troubleshooting, #JoelKallmanday
TL;DR: You don't have to be a genius to solve problems. And there is no shame in not-knowing or not understanding all the geeky details. The main Skill in troubleshooting is to Think Clearly and Systematically. But the tech-knowledge does come in Useful.
This post is for #JOELKALLMANDAY - Our yearly commemoration of Joel and to keep alive the Community he helped build, an initiative of Tim at oracle-base (link)
Warning: One German Quote incoming...
Image : the limitation shows the master... I have used that quote a lot.
Background: I am not That skilled...
Today's message is about Simplicity. Be careful with things you cannot Understand (yet), or things you cannot Explain clearly (yet) to colleagues.
I am often amazed when I sit in presentations from people like Jonathan Lewis, Christoph Lutz, Kamil Stawiarski and Frits Hoogland. They dissect CBO-paths, they run bpf-trace, they dissect logwriter-activity, or they copy credentials out of the memory of a running VM (well known trick, not widely acknowledged...). And they make it all look dead-simple (e.g. How come You didn't see this Evident Flaw in the Optimizer...?).
But I rarely walked away with a thorough Understanding of what they had just explained. After listening to one of those Gurus, I generally felt quite little. Some 20yrs ago, I tried reading trace-files, and I found out that there is No Way I could ever do that sort of thing myself.
I've done some C-Programming in my early days (+/- 1990..), and I know how Powerful (and how complicated) that can be. I've listened to hackers and notably to security-geeks, and I know how Risky, how Devious some tricks are.
I have Great Respect for these folks. And I sort-of know when to call them for help. But if I have to call for help, I have already "over complicated" my systems...
And it took me some time to finally stop blaming myself for not being that Skilled.
I am lazy, and not That Clever...
Back in about 2002, my boss wanted me to choose a "Specialty". In our team of Oracle-oriented engineers, we already had wannabe-gurus for things like "PL/SQL-Thought-Leader" and "High-Availability-Architect". And the general idea was you needed about 5 years of intense work in an area to become "Senior". I had already done a lot of reading and experimenting, notably with databases. But I didn't feel like digging into some "specialty" for 5 long years... (I am writing this 20+ years later...)
At the time, I had Great Fun doing several jobs in combination: DBA, system-rollouts, training local "expert users", converting legacy-data, and notably Troubleshooting database-related problems. I had no typical "specialty" (yet).
I did find that, even back then, most of the problems were solved by a combination of Asking questions, Verifying answers, and if possible Simplifying systems. My technical knowledge was important, and useful, but the solution often was not "more Tech", but rather "less complexity".
And when I needed deeper expertise on CBO, on RAC, or on IO-subsystems, I would either call on ppl with more knowledge, or try to find a way around to avoid the problem altogether.
We did call Frits Regularly; he was a Very Valued Colleague. I don't recall much about the other wannabe-gurus of that time.
Another quote from that time: "If we have to call Frits, we have done something very wrong". (I called him regularly).
After some thinking, I told my boss that my "Expertise" would be Simple.
With the KISS principle as leading moniker.
I found the "ability to explain" probably the Most Valuable Item of them all. In the end, you must not only solve the problem, but also Explain it to others to prevent re-occurrence, and often also to prevent punishing-of-the-innocent.
And that is when I took the tagline "The Simple Oracle DBA". It became the title of my company blog at the time. I later also claimed the blogspot-name (link?)
Another Lightbulb moment (with a smile)...
One morning I sat in on a presentation by Frits at an 08:00 DOAG-session (they schedule Gurus early to lure ppl to the conference). I picked up a few tidbits I could understand and which might come in useful. Besides, Frits is a friend and a colleague from way back, and he is generally interesting to listen to. Of course, somewhere towards the end I had lost the thread of his reasoning (he lost me when he started discussing dtrace-details around the IO-channel based on the overflow of some buffer...).
And while I was contemplating how small my brain was in comparison to this Genius, the young fellow next to me was still Eagerly Penning Notes.
He seemed to be still with the speaker, he took it all in like gospel...
Until I glanced at his paper-notepad...
His notes made no sense, to me.
From what I read, he was mis-interpreting the current story, based on some of the earlier remarks. He had Missed the Point(s) completely, and was blissfully unaware of it...
But that young man left the room Glowing in his new-found Insights, and he was probably going to apply dtrace or some dis-assembly tool to his database tomorrow in the hope of discovering valuable things in there...
I wondered how he was going to Explain his new insights to Colleagues.
Note: I realise full well that I might myself be completely wrong in interpreting his notes, cf. Dunning-Kruger (link)... But I don't think that was the case.
Some Lessons I learned (and it took me 20+ years)
In my presentations, I often use the quote by Goethe: In der Beschränkung zeigt sich erst der Meister (roughly: it is in limitation that the master first shows himself).
And my learnings, with that quote, are:
1. Don't over-complicate your systems. (duh... but you'd be surprised how many ppl copy scripts from the internet. And geeks love complexity ...). Remember: You have to explain it to your colleagues and your successors.
2. You don't need to know "Everything", but you need to know enough to ask questions, judge answers, and solve problems.
3. Build out your knowledge from What you Know, with what you Have, in steps that you can Understand and Explain (link to famous quote...).
Professor Dijkstra (google him!) saw this clearly when he produced some of his famous Quotes:
It is not difficult to explain...
Simple systems are cheaper to build (RAC was a complex beast at the time...)
Simple systems are easier to explain and to administrate.
Simple systems are easier to upgrade, to repair and to troubleshoot.
Dijkstra followed it with a nice quote on Elegance as well...
I never forgot that simple advice...
And even after absorbing a lot of geeky knowledge, which I was able to use when trying to understand (too-)complicated systems... my aim was still to be an expert in Simplicity.
Recommended Reading.
I'll mention two books that had some impact on my reasoning as well ...
Thinking Clearly about Performance, by Cary Millsap. About how to approach (IT-)problems.
The Mythical Man-Month, by F. P. Brooks. A classic about IT-systems. It may look old, but it is still very much Valuable.
(tempted to say: Phoenix Projects come and go, but the Mythical Man-Month will always be so.)
Summary KISS: Keep It Simple (not Stupid)
You generally don't need the Advanced High-Tech tricks... (in my opinion). But it is good to know they exist.
Myself, after so many years in the Oracle ecosystem, I am still not skilled in the really deep-tech stuff, such as 10053-CBO-insights or using dtrace on an executable.
But I can listen to those who "Know More", and I can call some of them when I need to.
And for now, I'll remain an expert in Simplicity.