Leadership Power Tools: SQL and Statistics

(matt.blwt.io)

44 points | by PaulHoule 4 hours ago

15 comments

  • conductr 2 hours ago

    As a somewhat technical leader/manager, I’m pretty comfortable with SQL, but that also means I know I could pretty easily goof up these queries. I don’t know much about the quality of the underlying data or the exceptions that may be present in it, either. I simply wouldn’t trust my own results for fear I was overlooking something. So, I’d rather ask the BI person to make this for me. They should be more intimately familiar with any footguns.

    For that reason, I see the technical part of this post at odds with the initial premise that a leadership role should need to learn how to query their data. If the resulting information, stats, etc are being used to answer business questions and make business decisions, it would be best that the person that specializes in this produces said information/queries.

    If there’s some GUI tool and the datasets are clean or well documented, then maybe the self-service approach is on the table again, but anything moderately complex will likely still be run through the BI team. In a sense, it’s just basic QC; it’s not that I’m completely helpless. I might even do a first pass and kick it to BI for them to review, but seldom do I find myself in a real-world situation where I’m confident enough in my knowledge of the underlying data, and that gives me pause most of the time.

    • simonw 2 hours ago

      I'm a little skeptical of dedicated BI teams as the sole oracles of correct queries. In my experience it's sadly common for an engineering team to make a schema or product design change that isn't instantly obvious to the BI team, resulting in incorrect queries.

      I think the best way to address this is to aim for a culture of transparency: any time anyone in a company presents a report or dashboard or similar, the query that was used to create it needs to be easily accessible. That way there's a much higher chance of mistakes being spotted. The BI equivalent of "given enough eyeballs, all bugs are shallow".

      Another cultural trick that's important is that engineering teams need to consider schema changes as part of their documented and supported API. If a change might affect BI reports the people who write and use those reports need to be told about it.

    • mble_ 2 hours ago

      Author here.

      One of the main things here is that you should know your data well enough to articulate the right request from BI. In my experience, BI often end up as pure order takers - if you ask the wrong question, you get a lovingly formatted but wrong answer.

      The other thing is that this assumes you have a BI team at hand - smaller teams/orgs often don't! Perhaps I should make this a little more explicit.

      My central thesis, also not made explicit, is that leaders should be appropriately curious _and_ leverage the tools they have to be able to do things like "hey, this looks weird, what's up?" and share the data and their methodology - that way it can be corrected/investigated etc.

      • conductr 2 hours ago

        Thanks for chiming in, great post, I like the premise - I just think we must have completely different working experiences. I'm typically in a larger org that has multiple systems feeding data into a data lake or something similar that has been normalized but usually still has some quirks. Articulating the right request to BI is certainly a skill, but my approach/experience is that I try to paint the picture of the end goal and let them fill in the gaps as needed. Sometimes that's literally drawing out a graph or chart that I want to exist.

        Even when no BI team is dedicated, there's usually someone who's wearing that hat. Someone set up those schemas and data pipelines, etc., or is responsible for maintaining them. That person is probably the one who knows "make sure you exclude the NULL items" or something similar.
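
        To illustrate the kind of NULL footgun mentioned above, here is a minimal sketch using Python's built-in `sqlite3`. The `tickets` table, its columns, and the data are all hypothetical and not taken from the article. It shows two behaviors that are easy to overlook: aggregates such as `AVG` silently skip NULLs, and a `<>` filter drops rows where the compared column is NULL.

```python
import sqlite3

# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tickets (id INTEGER PRIMARY KEY, team TEXT, hours_to_close REAL)"
)
conn.executemany(
    "INSERT INTO tickets (team, hours_to_close) VALUES (?, ?)",
    [("api", 2.0), ("api", 4.0), ("web", 6.0), (None, 8.0), ("web", None)],
)

# COUNT(*) counts every row; COUNT(col) and AVG(col) skip NULLs.
total_rows, rows_with_hours, avg_hours = conn.execute(
    "SELECT COUNT(*), COUNT(hours_to_close), AVG(hours_to_close) FROM tickets"
).fetchone()

# `team <> 'api'` evaluates to NULL (not true) when team is NULL,
# so the row with no team is silently excluded.
not_api_naive = conn.execute(
    "SELECT COUNT(*) FROM tickets WHERE team <> 'api'"
).fetchone()[0]

# Handling NULL explicitly keeps that row.
not_api_with_nulls = conn.execute(
    "SELECT COUNT(*) FROM tickets WHERE team <> 'api' OR team IS NULL"
).fetchone()[0]

print(total_rows, rows_with_hours, avg_hours)
print(not_api_naive, not_api_with_nulls)
```

        Whether the NULL rows should be included or excluded depends on what NULL means in the actual dataset, which is exactly the knowledge the person maintaining the pipeline tends to have.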

        I do like being in touch with changing data trends from a leadership perspective. It's either real and could be a valuable insight, or it's a bug that needs to be addressed before any ill-advised decisions are made from the 'info'. I find this can often be set up proactively and put into a dashboard. In that way, identifying it and raising concern can be 'my job', but investigating it can be a team effort.

        • mble_ 2 hours ago

          > I just think we must have completely different working experiences.

          Likely! I've generally worked in smaller orgs (including as part of a much larger org, as with my current employer) and there is less access to dedicated resources.

          > Even when no BI team is dedicated, there's usually someone that's wearing that hat.

          100%. Unfortunately, in my personal experience, this has commonly been me.

          > In that way, identifying it and raising concern can be 'my job' but when investigating it, it could be a team effort.

          Totally agreed.

          For some additional context, I've spent my working career on data systems so I likely feel a much stronger affinity to this type of self-serve analysis than your average bear.

      • PaulHoule 2 hours ago

        From my POV I have a choice of a database or a tool like pandas. Anybody who is interested in this sort of work has a choice of doing it with databases or with a specialized data analysis tool. What's your take on that?

  • mfdupuis 4 hours ago

    Love DuckDB. Definitely a great place to start.

    > A common pattern I’ve seen over the years have been folks in engineering leadership positions that are not super comfortable with extracting and interpreting data from stores

    I think this extends beyond just engineering, and I wish more data teams made the raw data (or at least some clean subset) more readily available for folks across organizations to explore. I've been part of orgs where I had access to read-only replicas, and I quickly got comfortable querying and analyzing data on my own, and I've been part of other orgs where everything had to go through the data team, and I had to be spoon-fed all the data and insights.

    • ryanwaldorf 3 hours ago

      Totally agree. In my last job I was able to create my own ETL jobs as a PM to get data for my own analyses and figured out a fairly minor configuration change could save us $10M per year. It was from one of many random ETL jobs I created myself out of curiosity that, if I had been forced to rely on other people, I may not ever have created.

      • wjnc 3 hours ago

        If you’d just had a business controller, you’d have x*$10M saved and have more time for your PM-role.

        Yes, calling BS on leadership running their own SQL. Bring strategy and tactics, find good people, create clear roles and expectations and sure don’t get lost in running naive scripts you’ve written because you can do all roles better than the people actually occupying those roles.

        • mble_ 2 hours ago

          Agreed, if you have the budget for it. There are often times where living off the land is necessary.

  • philipodonnell 3 hours ago

    I think a lot of times this is not about skill sets but more that data engineers don’t build datasets with UX in mind. The examples in this piece are not what show up when a leader browses a real database with hundreds of tables stuffed with abbreviations and numeric/short codes. If you want your leaders to use your data, you have to design your data to be used by leaders, not teach them SQL and statistics.

    • PaulHoule 2 hours ago

      You have to design your data tables in an operational system to support operations, in particular, to not get corrupted, which tends to lean towards

      https://en.wikipedia.org/wiki/Database_normalization

      but means you have to write queries with joins to get answers and many people find that difficult. Tooling to provide a better view for analytics is an interesting question.
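
      As a rough sketch of that trade-off, the example below uses Python's built-in `sqlite3` with two hypothetical normalized tables (`customers` and `orders`, invented here for illustration). Answering a simple revenue question needs a join, and one common way to offer a friendlier surface for analytics is a view that hides the join. This is only one possible approach, not a recommendation from the article.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        amount_cents INTEGER NOT NULL
    );
    INSERT INTO customers (id, name) VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO orders (customer_id, amount_cents)
    VALUES (1, 1500), (1, 2500), (2, 700);

    -- The view packages the join so readers can query one flat table.
    CREATE VIEW revenue_by_customer AS
    SELECT c.name AS customer,
           COUNT(o.id) AS order_count,
           SUM(o.amount_cents) AS revenue_cents
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name;
    """
)

rows = conn.execute(
    "SELECT customer, order_count, revenue_cents "
    "FROM revenue_by_customer ORDER BY customer"
).fetchall()
print(rows)
```

      The operational tables stay normalized, while someone exploring the data only needs `SELECT ... FROM revenue_by_customer`.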

      As for statistics, I think anybody making decisions should know a little about them. Myself, I am a fan of "nonparametric" methods because they only make me learn one probability distribution, so if I were stuck on a desert island (with pencil and paper) I could compute my own tables for methods like

      https://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U_test
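
      For a sense of how little machinery the test needs, here is a minimal pencil-and-paper style sketch of the U statistic in plain Python. The function name `mann_whitney_u` and the sample data are made up for illustration. It computes only the statistic, by counting pairs; turning U into a p-value still needs a table or a library routine such as `scipy.stats.mannwhitneyu`.

```python
def mann_whitney_u(xs, ys):
    """Return (U_x, U_y) for two samples.

    U_x counts the pairs (x, y) where x > y, with ties counted as 0.5.
    U_x + U_y always equals len(xs) * len(ys).
    """
    u_x = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                u_x += 1.0
            elif x == y:
                u_x += 0.5
    u_y = len(xs) * len(ys) - u_x
    return u_x, u_y


# Hypothetical cycle times in days for two teams.
team_a = [12, 15, 11, 18]
team_b = [9, 10, 14, 8]

u_a, u_b = mann_whitney_u(team_a, team_b)
# The smaller of the two values is what gets looked up in a
# critical-value table.
u = min(u_a, u_b)
print(u_a, u_b, u)
```

      Because the statistic depends only on the ordering of the values, it makes no assumption that the data are normally distributed.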

  • tired_and_awake 3 hours ago

    At my current place of work our entire Product org is tech illiterate and it's such a loss. In engineering we built a data science team to do their job for them and serve up pretty plots and constantly evolving dashboards instead of that team learning even a modicum of tech.

    Thanks for the article OP, I agree with the sentiment!

    • jddj 3 hours ago

      Do they use the plots and dashboards to inform their decisions?

  • simonw 3 hours ago

    I didn't take a statistics course at university and I've been regretting it ever since. I get by, but I feel like this is one area where more formal education would have paid off many times over during my career.