You can’t get through much datascrolling these days without hearing about the rise to dominance of so-called “low- and no-code” data tools. “It’s bringing data to the people,” the typical post reads. “You no longer need to learn how to code to perform sophisticated [insert analytics project here].”
I absolutely see the draw of low- and no-code data solutions: after enough tedious rounds of import pandas as pd and object of type 'closure' is not subsettable errors, who could blame you? That said, I don’t see this trend as revolutionary or as permanent as it’s often claimed to be. Here’s why.
It’s an expected part of the adoption cycle
Take, as an example of low- and no-code data products, the ability to build Azure machine learning models right from Excel. It’s wild stuff, and there’s no question that it’s democratizing machine learning like never before. What’s not asked, though, is what came before this innovation, and what comes next.
Innovation often arrives in waves, with each trend building on and ultimately supplanting the one before it. Analytics software is no exception. Each wave has been built on code, rolled out through low- and no-code graphical user interfaces (GUIs), and then supplanted again by code.
This is a greatly simplified adoption wave: doing it justice would require a dissertation (and I’ve studied innovation at the doctoral level, so I’m not kidding). The upshot is that low- and no-code has come and gone in analytics tools before; let’s look at some examples.
Case study: SPSS
SPSS (Statistical Package for the Social Sciences) began in the late 1960s and, by the next decade, was joined by S and SAS as part of a new wave of tools for exploratory data analysis and statistical programming.
Back then, computer scripts generally needed to be compiled into a machine-readable file and then run; this made it difficult to manipulate, visualize, and analyze data on the fly. These tools were novel in that they allowed bits and pieces of a script to be executed and printed immediately, which greatly sped up iteration. Analysts could now focus on the data analysis rather than on compiling the code.
At some point (and I’m not able to find the exact launch date, so if someone does know, please get in touch!) SPSS went a step further and added menu-driven options for writing programs. All menu choices generated syntax that could be saved, but the idea was that analysts could focus even less on the code and more on the data, hence democratizing analysis. Technically savvy statisticians no longer had a monopoly on working with data.
One fruit of this “no- and low-code” implementation is the menu screenshot above. There’s no one-size-fits-all answer to working with data, so trying to accommodate every need in a single menu can result in, let’s say, a bloated interface. I used SPSS in grad school, and while it was great to be able to point-and-click my way to data analysis, hence focusing on the conceptual bits of the work, I quickly found it easier just to write the code than to navigate the menus. So, SPSS’s syntax generation is a blessing… but it’s not the be-all and end-all.
SPSS’s menu was just one product of the computing revolution driven by the GUI. As point-and-click options, GUIs offered relatively low- or no-code features to data users. Another result of this revolution was the spreadsheet, which in the opinion of many was the first “killer app” of the personal computer. Business users now had computing ability at their fingertips, without necessarily needing to code.
Some assembly always required
Let’s stick with spreadsheets, because they’re facing the same GUI dilemma as SPSS in the age of what I am calling “personalized cloud computing”: computer applications that rely on cloud capabilities for storing and working with data.
Excel’s Power Query is a show-stopping innovation that lets users build extract, transform, load (ETL) pipelines right from a spreadsheet. (Similar low- and no-code data prep tools include Alteryx and Tableau Prep.) While it’s based on the M programming language, it includes menu-driven syntax generation, much like SPSS. Not a cloud application per se, Power Query is part of Microsoft’s larger “Power Platform,” which is largely marketed as a cloud (and low- and no-code) solution.
Its menus can be used for most of a user’s needs… but not all. And indeed, a rite of passage for Power Query users is that they begin writing their own code:
Combining files from a folder is one of my all-time favorite #PowerQuery functions. Check out Stephanie’s trick to clean up those helper queries and pull content in exactly how you want it! #PowerBI
“Don’t be afraid to tweak your M code just a tiny bit to solve some problems.” https://t.co/uID0Cj3yiY
— Shannon Lindsay (@shan_gsd) October 28, 2020
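For readers who’d like to see what the code-first version of that pattern looks like, here’s a minimal sketch of combining files from a folder in pandas; the folder path and the source_file column are hypothetical, just for illustration, and not taken from the tweet.

# Read every CSV in a (hypothetical) folder, tag each row with its source
# file name, and stack the results into one table.
from pathlib import Path

import pandas as pd

folder = Path("data/monthly_exports")  # hypothetical folder of CSV extracts

combined = pd.concat(
    (pd.read_csv(f).assign(source_file=f.name) for f in sorted(folder.glob("*.csv"))),
    ignore_index=True,
)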
Recently, I was trying to add an index number by sub-group in Power Query; this took quite a bit of doing between the Power Query menus and custom M code. By the end, I asked myself: was this really any easier than just writing the code? After all, Power Query doesn’t offer a dedicated integrated development environment with package management, like R or Python do. It’s an odd soup of GUI and code, somewhat like SPSS.
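For comparison, here’s roughly what that task looks like as a few lines of pandas; the data and column names below are hypothetical, just to illustrate a group-wise index.

# A minimal sketch of "add an index number by sub-group" in pandas,
# using made-up data.
import pandas as pd

df = pd.DataFrame({
    "region": ["East", "East", "West", "West", "West"],
    "sales": [100, 250, 80, 120, 200],
})

# cumcount() numbers the rows within each group starting at 0;
# adding 1 gives a 1-based index within each region.
df["group_index"] = df.groupby("region").cumcount() + 1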
Working with data is messy in more ways than any of us can count. And it’s this ambiguity that makes building a rigid set of menu options so difficult. I’ve yet to see a GUI that easily accommodates everything I want to do with my data in a simple user experience. Can it be done? Possibly, given future technologies. But learning to code has unlocked far more possibilities for me than any GUI ever has. So why wait for some idealized UX?
Code and GUIs, yin and yang
It’s at least worth pointing out that many attribute the rise of R and Python in part to the fact that they are code-first tools. With great development environments and large open-source code bases, it became possible to do nearly everything with these two languages… if you were willing to learn to code. There’s little debate that code should be the primary artifact for how analytics is done, even if it’s not always the layer the user interacts with.
So, why the change of heart toward low- and no-code analytics solutions? As I said earlier, it can get frustrating to write the same calls and hit the same trivial errors time and again, so I get why these could be seen as roadblocks to greater data democracy. GUIs have had their time and place in the waves of analytics innovation, often once an application has hit a certain level of maturity. Code also plays a part in building applications out so they can reach that maturity.
I don’t know what the next wave will be, but I’m certain that the current wave of low- and no-code solutions won’t outlast it. It may be that fewer coders will be needed to get an innovation to the low- and no-code stage of adoption, and that such a product can genuinely do everything required of its users.
Until that time, I recommend data professionals learn a bit about coding. Maybe not every data solution requires it; that’s fine. But given where we’ve come from in the data world, I’m not inclined to say that the future is all low- and no-code.
Charles N. Steele
I think you are correct. If I may add, somewhat cynically but (most likely) realistically, if AI does become sufficiently advanced that programming isn’t necessary, then neither will people be necessary. The mentality seems to be to remove decision making power from people. Increasingly I cannot even type without “autocorrect” taking over and changing everything. My input is decreasingly important, because a few programmers set up a system that increasingly overrides me.
For now I still have ultimate power to override that… for now.
George Mount
That’s a good observation. It also says something about employee autonomy. These low- and no-code platforms can be quite expensive, and it’s a shame that employers would rather invest in these tools when they could up-skill their workforce instead. The claim is that they lower the barrier to entry for everyone to do analytics, but as the examples above show, that often doesn’t hold up under stress-testing, and it can end up taking away individual decision-making and autonomy.
Charles N. Steele
It has become increasingly easy to run econometric regressions. Originally, an expert statistician (I mean a real expert, not someone with mere credentials) would set everything up and then rooms of people would calculate. Everything had to be done carefully because it was so expensive. When I was doing my master’s, the mainframe was in its dying days. I would make sure my code was right and the regression made sense, because I had to run it on the mainframe and then go to the computing center to pick up the paper output, usually at about 3:00 AM.
Today even a gender studies major could learn in five minutes to run regressions. GIGO.
(OK, I exaggerate. I know you can’t teach a gender studies major anything.)