Tuesday, October 10, 2023

A Community needs to grow New Speakers, new members #JoelKallmanDay

TL;DR: We should encourage more new speakers (and bloggers and writers...) to come mix with "the elders" and share findings and opinions. That is how "the Community" works, and how some of our knowledge is disseminated.


Background: Just some Events.

Last week, the NL-OUG hosted the EMEA-Tour of "speakers", and next week a number of those speakers will visit JavaCro and HrOUG as well. I noticed there were, finally, some younger faces next to the usual suspects in the known group of grey, grumpy DBAs. Some of the newbies had only recently completed a course and were sent off to their first customer just at the start of this month.

Great to have those folks visiting an EMEA Tour.  Now we need these ppl to Talk! 

But how to get these ppl to talk and present? How to develop a pool of interesting, entertaining, useful, educational, communicative speakers?


Pick a Topic... Help some colleagues.

(I've repeated this often already, some of you will recognize the list, but it is not you I need to  talk to...).

There is no excuse of "no topic". Anything you have learned is Relevant, and is probably interesting to others. Hence, to pick a topic for a talk, one of my tricks is these 3 steps:

1. Look back 3-9 months, and find your biggest obstacle or frustration (creating a k8s pod from helm-charts, convincing the architect to use #SmartDB, upgrading to python3, etc...)

2. Write down what you would tell your colleagues to "fix this" (explain pods/helm/operators - again, explain smartDB, import the correct libraries and drivers, etc...)

3. Create 20 slides for a 45min ppt, possibly compose a simple demo. The closing part of the presentation should be something that the audience can remember, something they feel they can take home and use themselves.

And Voila, you have "a presentation". Now just go see how ppl react to it, and what discussion ensues when you present it.

Of course you are perfectly free to follow your own system. You can also talk about, for example... a) the latest marketing hype, either support or debunk it, b) whatever your boss is on about this week, c) dissect and rehash a chapter of RTFM (e.g. optimizer-hints...), d) discuss your dogs, puppies, cats, kittens, with illustrations, e) etcetera...

(Insert mandatory cat-picture? Because a lot of the elder community members are "cat servants" rather than "dog owners"? Nah, better not.)


Development programs... Don't Impose: let ppl develop their own style.

I'm very much in favour of just letting ppl develop their own "brand". When encouraging a new speaker to stand up, firstly, let them decide for themselves on topic and style. But for those who prefer some guidance, there is also the MASH program (link). Do Contact those mentors if you feel it can help you! 

I remember how, back when my employer sent everyone off to "the development course", we were instructed on "how to present". I was lucky: the instructor at the time noticed a lot of my failures (jacket not closed, hand in pocket, too informal, too many jokes)... And the "up and coming management" was quick to point out all those failures. Thx guys.

The instructor reacted positively though; he summarised: 1) you all grinned at the jokes, 2) he got the main message Squarely Across, and you won't forget it, and 3) this is the "Genuine Persona", and not some plastic-barbie-ken-in-suit. That instructor showed me I could just stand in front of any audience, relaxed, and more or less "just be me". That was my style from that point on. I think it worked (partly also because I always bothered to learn some of the local language and culture as well. In the Balkans: refer to no-smoking, the miracles of pyramids and rivers, mention Fata, Mujo and the Topalovic family, and you are good in most countries).


How to Start: Just Fix them up to Do It.

Here is a task for both employers and community-folks: give your ppl the time and space to go present. But a lot also depends on whether a person sees this opportunity and uses it. Some have that ambition, others maybe don't, and that doesn't make them any less a meaningful and valued member of the team.

(hope most of you are old enough to know those movies...)


Start Small.

The UKOUG used to have SIGs and Meetups, relatively small events where speakers could "cut their teeth". I still think that is a good system.

Ideally, a community has small-scale events, meetups, gatherings, where ppl can talk freely to peers. In the 1990s, my then-employer had company meetings where ppl were encouraged to mix and also had to speak up. Next to that, several groups also held "meetups". The "oracle nerds" held "pancake-meals" where they discussed nerdy stuff.

And one of the tricks at those small scale meetings was to Interrogate every participant: That way you found out what they were doing, how they could possibly help colleagues, and if they might need some help themselves. Call it "knowledge management" if you like.

By provoking every person present to "introduce" themselves, they started their first "speaking assignment" right at the dinner table. And by giving them some air-time, both the speaker and the listeners could find out how this person could benefit "the community" at that point. Often, this led to telephone-connections and/or help-sessions between colleagues. And that is also community.


Summary: Please encourage your "juniors"...

Only by keeping "the mix" and the socializing alive can a profession and a community remain in existence. That can be in-person (my pref), but also partly online in Forums and lists and groups.




Tuesday, August 15, 2023

Some things about sharding.

TL;DR: a podcast on Sharding set me Thinking. I summarised some of the info I learned, and I offer some comments, notably for distributed systems.


First I should Thank Nicolay and Michael for their podcast (links below). I learned a few things, and I re-studied some of the material on the topic.

As they point out, naming (taxonomy) is important. It is helpful to state good definitions. I am not sure I agree with all of the wiki-page, as it seems written mainly from a noSQL point of view (normalisation is not the same as vertical partitioning, in my book??). But the definitions on that page (Aug 2023) are a good start.

https://en.wikipedia.org/wiki/Shard_(database_architecture)

Note: Using a wiki-page on a somewhat vague and fluid concept is risky, I know...


Definitions related to Sharded systems

What I now (re)understand (and thought I knew since the mid-90s):

Sharding always involves Multiple Nodes.

Nuance: Often Sharding implies that one Shard corresponds exactly to one Node (or container or machine), but this is (IMHO) not a given: a Node can hold multiple shards, and a shard can be held on multiple Nodes.

Horizontal Sharding: records are kept in shards as "whole records" (e.g. empno, employee-name, emp-email, date_of_birth are kept together on 1 shard, on 1 node). The record is assigned to a shard depending on some property, often based on the key or a hash-of-key, but it can also be date-range-based, or list-based on, say, ISO country code.

Vertical Sharding: columns or groups of columns are assigned to shards. In this case, the contents of a single record can be divided over multiple shards or nodes. Here we enter the domain of Columnar Databases.
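To make the distinction concrete, two minimal sketches (Postgres-flavoured; the table names, columns and the shard-count of 4 are invented purely for illustration):

-- horizontal: a hash of the key decides which of 4 shards the whole record belongs to
-- (hashtext() is a Postgres built-in, used here only as an example hash function)
select empno
,      abs(hashtext(empno::text)) % 4 as shard_id
from   emp ;

-- vertical: one logical employee record, split column-wise over two shards/nodes;
-- re-assembling the full record means a (possibly cross-node) join
create table emp_core    (empno int primary key, emp_name text, date_of_birth date);  -- shard/node A
create table emp_contact (empno int primary key, emp_email text);                     -- shard/node B

select c.empno, c.emp_name, e.emp_email
from   emp_core    c
join   emp_contact e on (e.empno = c.empno) ;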

A Node: Generally a combination of compute-capacity and storage-capacity, can be raw-iron, VM, or container, or other incarnations of IT-capabilities. I sometimes say: Node=workhorse, with given capacity (cpu, memory, storage)

A Shard: I'll see that as a collection of stored records or data, grouped by some property (hash, range, list...). Sometimes also called Segment or Tablet. The relation between Shard and Node doesn't have to be 1:1, but can be any of n:m (in my opinion...).

A Table: a set of records adhering to "some" definitions or constraints. In noSQL it is "anything goes" (greybeard-DB view), and in an RDBMS the definitions tend to be fairly strict, with well-defined tables, columns and relations between them. A Table may or may not be sharded, i.e. it can be in 1 or more Shards, and thus on 1 or more Nodes if there are multiple.

When confronted with the term "Sharded", I have generally observed some hash-based, even distribution over 2, 4, 16, 27 or 128 shards, mostly over multiple machines or nodes. I recall one notable exception: a system with "alphabetical shards"... 27 of them. The buckets for Jones/Johnson and Sing/Smith grew rather larger than the others. Long story: very Cool Idea, the Vendor made a Fortune, but the system never was a great success.


Note the overlap with Partitioning:

Partitioning has its own, more elaborate wiki-page.

https://en.wikipedia.org/wiki/Partition_(database)

And there you find (Aug 2023) very similar definitions of "horizontal partitioning" and "vertical partitioning", possibly spread over 1 or more Nodes. But no mention of Shards.

When we spoke of Partitioning in a database, it generally meant: creating smaller tables and using a select-union (possibly via a view) to treat them as 1 table. In most modern RDBMS this is now nicely hidden "under the SQL layer". It would classify as Horizontal Partitioning, or even Horizontal Sharding if designated nodes were involved.
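As an old-school sketch of that (table names invented): per-period tables, glued together with a view:

-- manual "horizontal partitioning", names invented
create view orders as
select * from orders_2022 union all
select * from orders_2023 union all
select * from orders_2024 ;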


Downsides of sharding and my view on how to "mitigate" them:

The negatives of Sharding mentioned (in the wiki and other sources) are notably: Complexity, SPoF and Re-Balancing. Allow me to comment on each.


SQL-complexity:

I would reply: Hidden Under the SQL. The underlying SQL-engine and the storage mechanism Should be completely responsible for storing and finding the data behind Inserts, Updates, Selects and Deletes. Finding the correct shard and returning the result set should happen outside the Dev's view (but the DBA may want some insight, some monitoring and Controls...). Admittedly, there will often be an impact when data from far-apart components needs to be joined, but I would expect Zero impact on the SQL-code when writing queries.


Single Points of Failure (a shard/node/storageblock and the metadata):

Same mitigations apply as for conventional RDBMS or other data-stores. Ideally, data is verified, checksummed, logged, stored, possibly replicated, and maybe even guarded by a quorum-mechanism. I would expect some monitoring notably by the team responsible for infrastructure (the provider) but also some by the Dev/Ops team. Users will definitely do "monitoring work", and will complain if things go missing. 


Re-sharding or re-balancing:

Ideally this also happens under the covers, and should not be a concern of the Dev/Ops team. But "manual controls" are always appreciated. A good implementation will allow for (semi-)automatic and near-invisible re-sharding If Needed. (I appreciate this can be a Big Ask, and the timing and impact of re-sharding will generally be something the Ops team will want to manage.)


In my (arrogant?) opinion, these downsides (and there are others) are by now "solvable problems". And the vendors of RDBMS and other storage-solutions will be happy to Suggest and Provide. The role of the customer is to Be Informed, and make Sensible Selections from the many options available. 


Instrumentation: Challenging with Distributed systems

One more word on "Observability". You (we) Want to be able to see, to Monitor, to Diagnose the Sharding system. And Ideally as an old greybeard-DBA, I want that data to be Query-Able via views or tables or functions, and preferably "Instant" and not with minutes++ of delay via logfile-parsers and log-mergers (which will probably fail under pressure!).

Postgres has the catalog-views, with for example pg_views, pg_tables and pg_database, mostly based on pg_class. SQL Server and Oracle have similar views to peek inside.
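For example, something as simple as this (standard Postgres catalog views) already answers a lot of questions:

-- standard Postgres catalog: how many tables per schema?
select   schemaname, count(*) as nr_tables
from     pg_tables
group by schemaname
order by nr_tables desc ;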

So, for a Sharded system, I would expect views to expose the relevant entities: Tables, Shards, Nodes, and whatever relevant objects the specific system has.  I would also expect to be able to join those views to compose information ad-hoc to investigate a problem. 

Join data to provide Information, that is the Nature of a Database.

And in the near future I would expect those views on internal objects to expose counters and timing-info as well.

This system uses Shards?  So... can I do something like:


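-- sh_shard and sh_node are hypothetical views: shards, their elapsed seconds, and the node they live on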
select 
  n.nodename              node_name
, sum   ( s.elapsed_sec ) scnds_ela
, count (*)               nr_shards
from sh_shard s
join sh_node n on ( n.id = s.node_id )
group by n.nodename
order by 2 ;

If not yet, why not?

I understand some of the answer is in the "Distributed" nature. If a Database is running, distributed, over 42+ nodes (machines, containers?), then fetching that data would possibly mean more than 42 RPC calls.

I must assume some of the required data, basic info on nodes, shards, tables is already "known" and available as metadata: The processing-software just below the SQL engine knows where to go to find the Shards, it knows which node they are supposed to be on... 

And as stated above: if I have to extract it from "merged logfiles", or from data scraped off some admin-interface at port 13001 of every node, that is too much of a roundabout way, and too slow. Web-interfaces are good, even Necessary. But some SQL-access to the information is, in the long run, much more valuable.

It took Oracle 15+ years (from 1990 to 2005) to create their AWR and expose the wait-times of most components. I don't quite know how long it took Postgres to expose the catalog (and there is still not much timing-info in there!). I'm expecting some of the new, distributed systems to Learn from the Past...

To finalize, I'll repeat my motto: 

In Databases, Everything is SQL, and SQL is Everything.



Links:

Anyway, the Podcast that started this over-thinking is here (link to Nicolay and Michael). Bookmark and Enjoy! 



PS - Some Further Comments on some of the contents:

If too much pushback + flames come out, I might just remove this. Internet-expert-gurus and high-value sales ppl should be allowed their moment of fame... 

2PC and XA transactions: Just Don't. Too many problems.

Need for Sharding: You Are Not Netflix (or Amazon). Most systems that thought they needed Sharding had an over-inflated architect, or a pushy (hardware) vendor, or could solve their problem by waiting a few yrs: Hardware and Engineering have done some surprising Catchups.

Bi-Directional replication: I'm Very Skeptical, especially when promised "over continents". I _Know_ it is Possible. But I've Never seen that work properly, or not for long anyway. Maybe, just Maybe, some systems that have a mechanism "from design built in" may get this practical + working. 



Monday, January 09, 2023

The wonder of a Database and SQL.

They say the wonders of the world are African Pyramids and Indian Rivers... But no Miracle was equal to when I discovered SQL...


Suddenly I didn't have to mess with type-defs and linked lists anymore...

Suddenly I didn't have to worry about working-memory space anymore...


SQL allowed me to 

SELECT exactly what I needed 

FROM tables 

WHERE I could JOIN, link or combine the data, 

AND I could filter on conditions, and also immediately

ORDER it BY whatever I fancied.


I could even pre-format it in the select-list. My poor FORTRAN program at the time became a Pro*FORTRAN program and lost some 50% of its 10,000 lines... (and it precompiled, compiled and linked a lot faster).

We used to open flat-file data-sets with WMO (weather) data....
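Something along these lines (table and column names invented here) replaced pages of FORTRAN record-handling:

select   s.station_name
,        o.obs_date
,        round(o.temperature, 1) as temp_c
from     observations o
join     stations     s on (s.station_id = o.station_id)
where    o.temperature > 30.0
order by o.obs_date ;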

Suddenly that database did more than half the work for me.


That was a Good Time. It felt a bit like running "on credit of the RDBMS", but it was Heaven and it lasted forever. 30 years later, I still use SQL.

SQL!  





 


Monday, January 02, 2023

Operating and Diagnosing an RDBMS.

This is one of those "Back to Basics" posts, but one with a good number of years and projects of Experience written into it.


I've been playing a little with Postgres (again) and Yugabyte (link), notably with the explain-plans of queries. Of course, I am looking with the biased eyes of an oracle-dba, and that comes with some prior knowledge and possibly a strong personal preference from my education in the big-red environment.


Several findings come to mind; here are the 4 items I think you need in order to properly operate or diagnose an RDBMS:


  1. Know The Concepts, and have a mental picture of your RDBMS.
  2. SQL-access to ALL relevant information. 
  3. Measure Effort (and time)
  4. Establish “Normality”, your reference, your benchmark.

Anyone working with Databases, RDBMSes, especially as troubleshooter (“Consultant” if you wear a tie), should be aware of these items, and have them Mastered to some degree.



Let me elaborate on those..



re-1, The Concepts. You need a mental model, a "concept description" of the underlying layers to be able to understand your RDBMS. Take something as simple as:


select ename from emp where id = :emp_id ;


In this case,  you should realise that this query needs to be parsed and executed (different stages) and the execution-phase will traverse various “objects”. The RDBMS needs to use its knowledge of the emp-table, and decide to either scan the table or use an index if available. This Also means the RDBMS will have to read data, from memory or disk. And if a query is too slow, or not doing what you want, you need the background information to diagnose and fix your problem.
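A quick sketch of that (Postgres syntax, with a literal instead of the bind variable; the output is illustrative): asking the database for its plan tells you whether it picked the index or a full scan.

explain
select ename from emp where id = 42 ;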

Hence you need some conceptual knowledge of your RDBMS.



re-2: SQL-access to all relevant information is strongly recommended. Am I biased in this? It is not just me: re-read the 12 rules of Codd!

The best examples are the catalog-tables that expose how your RDBMS actually stores its objects. But you can dig much deeper using things like pg_stat_statements or by doing explain-plans of the query.


With the knowledge of the “concepts” (see above), you should be able to search for, and query, the relevant tables and views from the RDBMS to help you diagnose and fix your problem.
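A small sketch of that kind of digging, assuming the pg_stat_statements extension is installed (column names differ slightly between Postgres versions; these are the 13+ names):

-- which statements cost the most total time?
select   calls
,        round(total_exec_time) as total_ms
,        round(mean_exec_time)  as mean_ms
,        query
from     pg_stat_statements
order by total_exec_time desc
limit 10 ;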


As an ex-oracle-DBA, I know I can use:

v$sql : for basic info about the SQL.

v$sql_plan : which steps were included in the explain plan 

v$sql_bind_capture : I can even find recent values of :emp_id.


With so much information available, I can even re-create some of the executions of the statement above (future link to demo...)
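A rough sketch of what that looks like (Oracle; joining on sql_id alone may show multiple child cursors, which is fine for a quick look):

-- sketch: the statement, its effort, and recently captured bind values
select s.sql_id
,      s.executions
,      round(s.elapsed_time/1000000, 1) as ela_sec
,      b.name                           as bind_name
,      b.value_string                   as bind_value
from   v$sql              s
join   v$sql_bind_capture b on (b.sql_id = s.sql_id)
where  s.sql_text like 'select ename from emp%' ;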


I want to stress the point of using the SQL interface here, over, for example, using dtrace or the examining+grepping of log- or tracefiles. In many circumstances, you will not have access to the backend where those files are. So apart from the “Codd-Correctness" of exposing all data via SQL, there is the practical need for an RDBMS to expose its elements and its workings as tables/views/functions.

(Note: this is something that may pose a problem for a multi-node system, a clustered system, or a serverless RDBMS : the work is done in many places, and the SQL-front-end may not have easy access to all the components, Yuga, Cockroach, pay attention here).



re-3 Measure Effort (and time). Don't make the mistake of just keeping a stopwatch next to your system. Any complex IT system is notorious for inconsistent response times. It is relevant to measure the (elapsed) time, but that time is only really helpful if you also know "what happened" in that time. For example, a select query may be slow on the first retrieval, but faster on the next due to caching. Or an individual full-table-scan query may be fast when testing, just because your test-data is cached and the system is idle, whereas under higher load that response will slow down, because the inefficiency of the full scan quickly degrades overall system performance. Your hardware- or cloud-vendor will smile at you, though, as you will need more of their resources.

(this topic is worth a separate blog, sometime...)

(future: link to example...)
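In the meantime, a rough sketch of what "measuring effort" looks like in Postgres: not just the elapsed time, but also the buffer reads and hits behind it:

explain (analyze, buffers)
select ename from emp where id = 42 ;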



re-4 Establish what is "Normal", what you can expect from a certain system. This takes some time to get to know your environment, some time for logical reasoning, and some experience to recognise whether your observations are reasonable or not. What was "fast storage" in 1995 is now considered abysmal, and what was considered a slow network in 2005 is now considered Fast, due to the system being half a continent away in some cloud-region.

I tend to look for typical elapsed times (my "normal" is: better than 1-10 milliseconds per SQL), at used effort per SQL (cost or block-reads per SQL), and at some overall system metrics: the idle percentage of the CPU should typically be 30% or better. But for some systems, an idle of near-zero can be the norm, and that can be fine as well, as long as the users are happy and no degradation occurs.


The best examples I know for “normality” come from the Oracle-world: I can easily recognise a quiet looking OEM-graph, or a “regular looking” AWR report for databases that I am familiar with. 


And an anecdote about "normality": in my view, 5ms is a perfectly "normal" response-time for an employee lookup (actually a bit slow in 2022, but hey: blame network- and storage-latency for some of it…). Now, if a system has a legitimate need to do 1000 employee-lookups per second at 5ms each, that amounts to 5 seconds of busy time per wall-clock second, hence at least 5 CPUs (or 5 concurrent sessions) permanently occupied with just those lookups. There is no magic bullet to fix this on a 4-cpu system. However, if the 1000 lookups are out of touch with the actual 2 ppl working on the system, something else is wrong.



Why all this...


Well, I have some reflections on this "basic stuff", and some wishful thinking.


With single-server RDBMS, like conventional ms-sql, mySQL, Oracle or Postgres, we learned how they were built, and most experienced DBAs know the concepts of their systems. We can diagnose them and reason out where errors or bottlenecks are located. In the case of Oracle (which I am most familiar with), we can do that using purely SQL (e.g. I can diagnose a system using only SQL-queries and the scripts that I and others have developed for exactly that purpose).


Also, for those "known" systems, we have enough experience to say whether something is "normal" or not (e.g. 2ms disk-response is probably normal, slower raises questions, faster shows good SSDs or efficient use of cache).


For the newer systems, and notably the "serverless" systems, we are still in the process of learning this.

And if I am allowed some pedantic advice to the builders of those systems: Help us with the 4 items mentioned.



Here is what makers/vendors of an RDBMS need to think about:


Help us at item1, Concepts, by explaining your “concepts” clearly and in some detail. Documentation. 


Help us at item2, SQL-access, by making All Information accessible via SQL (yes, I know this is complicated if systems span multiple nodes)


Help us at item3, Measure Effort, by opening up the "effort underneath" via SQL (views). A good example is the Oracle "wait interface", where 99% of the "time spent" by the system can be attributed to activity using views like v$system_event and v$session_event.
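For illustration, roughly this kind of query (Oracle; the fetch-first syntax needs 12c or later):

-- where did the time go, system-wide?
select   event
,        total_waits
,        round(time_waited_micro/1000000) as seconds_waited
from     v$system_event
where    wait_class <> 'Idle'
order by time_waited_micro desc
fetch first 10 rows only ;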

 

Help us at item4, benchmarking Normality, by allowing easy reporting and possibly adding some statistics-info to the RDBMS.



(Disclaimer: disillusioned-old-oracle-DBA here…)


It took Oracle some 10-15 years (oracle version 6 and 7, around 1994) to stabilise their concept, document it in a “concepts guide” and to seriously open up and publish the v$ views, although the original bstat/estat scripts go back further. I’m expecting the new breed of RDBMSes, especially the ones that are Open Source and claim to be Planet-Scalable, to do that a little speedier.


TBC...


 
