Jupyter Notebooks as E2E Tests

(rakhim.exotext.com)

62 points | by freetonik 2 days ago

21 comments

  • tpoacher an hour ago

    I really don't get (jupyter) notebooks. It always feels like an over-engineered solution to a simple problem. The only advantage they have over 'proper' code generating a 'proper' report is being able to modify code directly on the report (i.e. as opposed to having to change the code in a separate file instead). Which sounds good and all, but this is rarely what I see notebooks actually used for. For the most part, people use them as an IDE, and a notebook is a terrible choice of IDE compared to even something as simple as editing in nano. And the disadvantages of using one generally completely outweigh the benefit of live-editing in the first place, and encourage all sorts of bad programming habits / hacky code structure.

    Whenever I've required a report which intermingles code, text, and outputs, a simple bash-script processing my (literate programming) codebase has always done a far more impressive job of generating a useful report. And, without having to sacrifice proper packaging / code-organisation semantics just for the sake of a report.

    I find it a big shame that the iodide / pyodide notebooks didn't take off in the same way, at least. Those seemed tightly integrated to html in a way that was very elegant.

    (they're not completely gone though, it was nice to see this example on the front page earlier: https://news.ycombinator.com/item?id=42425489 )

  • miohtama 5 hours ago

    Notebooks are a wonderful tool, especially now that Visual Studio Code has a superb editor for them. Before this, using web-based Jupyter was a bit of a pain. I use PyCharm for .py files, which is still a bit broken for notebooks, but I always go to VSC for notebooks.

    If someone needs it, here is some sample code to run notebooks programmatically, and tune the output and formatting:

    https://github.com/tradingstrategy-ai/trade-executor/blob/ma...

    • qsort 3 hours ago

      What's broken about them in pycharm? I've been using pycharm as my primary IDE for a while now and I haven't ever experienced major problems, this includes working on projects where other people were using VSC.

  • CJefferson 8 hours ago

    I've used this in a few systems, using (in my case) nbconvert.

    As I write more code, I increasingly find the most important thing about tests early on is that they are easy to write and maintain. To help with that, I find one of the best 'quick test suites' is "run program, save output, run 'git diff' to see if anything changes".

    This has several advantages. If you have lots of small programs it's trivial to parallelise. It's easy to see what outputs have changed. It's very easy to write weird one-off special tests that need to do something a bit unusual.

    Yes, eventually you will probably want some nicer test framework, but even then I often keep this framework around, as there will still often be a few tests that don't fit nicely in whatever fancy testing library I'm trying to use (for example, checking a program's front end produces correct error messages when given invalid input).
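
    A minimal sketch of the "run program, save output, compare" idea, using only the Python standard library. The function name `check_snapshot` and the `update` flag are made up for illustration, and the comparison is done in Python rather than with 'git diff' so the sketch works outside a git repository:

```python
import subprocess
import sys
from pathlib import Path


def check_snapshot(cmd, snapshot, update=False):
    """Run cmd and compare its stdout+stderr with a saved snapshot file.

    Returns True when the output matches the snapshot. When update=True,
    or when no snapshot exists yet, the snapshot is (re)written instead.
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    actual = result.stdout + result.stderr
    snapshot = Path(snapshot)
    if update or not snapshot.exists():
        snapshot.write_text(actual)
        return True
    return snapshot.read_text() == actual


if __name__ == "__main__":
    # Hypothetical usage: snapshot the output of a tiny one-off program.
    ok = check_snapshot([sys.executable, "-c", "print('hello')"], "hello.snapshot")
    print("unchanged" if ok else "output changed")
```

    If the snapshot files are committed, 'git diff' on them shows exactly which outputs changed after a run with update=True.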

  • abdullahkhalids 9 hours ago

    You can just use nbclient/nbconvert as a library instead of from the command line. Execute the notebook, and the cell outputs are directly available as Python objects [1]. Error handling is also built in.

    This should make integration with pytest etc. much simpler.

    [1] https://nbconvert.readthedocs.io/en/latest/execute_api.html#...
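
    A rough sketch of what this could look like, assuming nbformat and nbclient are installed and a kernel named "python3" is available. The helper names `run_notebook` and `text_outputs` are made up for illustration:

```python
def run_notebook(path, timeout=600):
    """Execute a notebook in memory and return the executed notebook object.

    Third-party imports live inside the function so that text_outputs()
    below can be used on already-executed notebooks without them.
    """
    import nbformat
    from nbclient import NotebookClient

    nb = nbformat.read(path, as_version=4)
    # Raises nbclient.exceptions.CellExecutionError if a cell fails.
    NotebookClient(nb, timeout=timeout, kernel_name="python3").execute()
    return nb


def text_outputs(nb):
    """Collect the plain-text output of every code cell, in order."""
    collected = []
    for cell in nb["cells"]:
        if cell["cell_type"] != "code":
            continue
        for out in cell.get("outputs", []):
            if out["output_type"] == "stream":
                text = out["text"]
            elif out["output_type"] in ("execute_result", "display_data"):
                text = out["data"].get("text/plain", "")
            else:  # "error" outputs carry ename/evalue/traceback instead
                text = f'{out["ename"]}: {out["evalue"]}'
            # On-disk JSON may store multi-line text as a list of strings.
            collected.append("".join(text) if isinstance(text, list) else text)
    return collected
```

    Inside a pytest test, one could then assert on `text_outputs(run_notebook(...))` for whichever notebook is under test.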

  • taeric 2 hours ago

    I find the gigantic swing of a lot of software developers from the pure separation of data and styling into notebooks rather amusing.

    In particular, there is very little that a "notebook" style environment can get you that you couldn't have gotten as output from any previous testing regime. Styling test results used to be a thing people spent a fair amount of time on so that test reports were pretty and informative. Reports on aggregate results could show trends in how they have been executing in the past.

    Now, I grant that this article is subtly different. In particular, the notebooks are an artifact that they are testing anyway. So, having more reliance on that technology may itself be a good end goal. I still have a hard time shaking the feeling that notebooks are being treated as a hammer looking for nails by so many teams.

  • batmansmk 3 hours ago

    Maintaining e2e tests is a pain. Maintaining a notebook is a pain. It seems it was a given somebody would make this match made in heaven!

    • vvladymyrov 2 hours ago

      I can suggest an improvement: combining long-running e2e tests with notebooks is an even better match

  • pplonski86 2 hours ago

    Mixing code, markdown and execution results in one output file gives Jupyter notebooks a superpower. You can really have anything in the ipynb file, stored in JSON. I wish there was two types of ipynb files, one for file with just code and markdown (for example ipynbc), and one for keeping code+markdown+results.

    BTW, some time ago I wrote an article about surprising things that you can build with Jupyter notebook: https://mljar.com/blog/how-to-use-jupyter-notebook/ You will find in the list: blog, book, website, web app, dashboards, REST API, even full packages :)

    • armanboyaci 2 hours ago

      > I wish there was two types of ipynb files, one for file with just code and markdown (for example ipynbc), and one for keeping code+markdown+results.

      I believe you can achieve that if you use the jupytext library, right?
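
      jupytext gets there by converting the notebook to a text format that holds only the inputs. The sketch below does not use jupytext. It only illustrates the same "code and markdown without results" split directly on the notebook JSON, with the standard library. The function name `strip_outputs` is made up:

```python
import copy


def strip_outputs(nb):
    """Return a copy of a notebook dict with all results removed.

    Code and markdown sources are kept; outputs and execution counts
    of code cells are cleared.
    """
    nb = copy.deepcopy(nb)
    for cell in nb["cells"]:
        if cell["cell_type"] == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return nb
```

      Loading an .ipynb with json.load, passing it through this function, and writing it back with json.dump would give the "just code and markdown" variant of the file.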

  • wodenokoto 10 hours ago

    It looks like the final solution will have to run notebooks twice - once to check for errors and once more to render to documentation.

    The concept of running code examples inside documentation as a part of tests is well known, and extending it to end-to-end tests / user guides is a good idea.

    Next step might be to add hidden code cells with asserts, to check that the code not only runs, but creates the expected output.
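
    To make that concrete, here is a sketch of what such an assert cell could look like in the notebook JSON. The tag name "remove-cell" and both helper names are assumptions for illustration. A doc builder would have to be configured to drop cells carrying that tag (nbconvert's TagRemovePreprocessor is one way to do it), and whether the article's Sphinx setup does so is not something I know:

```python
import copy


def with_assert_cell(nb, source, tag="remove-cell"):
    """Return a copy of the notebook dict with a tagged assert cell appended."""
    nb = copy.deepcopy(nb)
    nb["cells"].append({
        "cell_type": "code",
        "execution_count": None,
        # The tag marks the cell as test-only so a renderer can hide it.
        # Notebooks in nbformat 4.5+ also expect a unique "id" per cell,
        # which is left out of this sketch.
        "metadata": {"tags": [tag]},
        "outputs": [],
        "source": source,
    })
    return nb


def visible_cells(nb, tag="remove-cell"):
    """Cells a renderer would keep if it drops everything carrying the tag."""
    return [
        cell for cell in nb["cells"]
        if tag not in cell.get("metadata", {}).get("tags", [])
    ]
```

    The test run executes every cell including the asserts, while the rendered documentation only shows what visible_cells() would return.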

    • freetonik 10 hours ago

      The documentation is rendered without running the notebook in my case, because we store notebooks with all the outputs, so the doc builder (Sphinx) just converts them to HTML as is.

  • udioron 8 hours ago

    Very cool! One can write a pytest plugin that executes notebooks from a folder with custom pytest fixtures support.

  • zurfer 8 hours ago

    for anybody who wants to schedule or automatically run jupyter notebooks I recommend also looking into papermill: https://papermill.readthedocs.io/en/latest/

    • rmholt 3 hours ago

      Agreed! I use it and it's a breath of fresh air. I have to use Jupyter because of colleagues, but I really prefer Python scripts, and this lets me kind of run a Jupyter notebook as if it were a script, even with CLI flags.
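
      For anyone curious what "as if it were a script" can look like, here is a small sketch that builds a papermill command line with `-p name value` parameters and runs it via subprocess. It assumes the papermill CLI is installed and that the notebook has a cell tagged "parameters". The helper names are made up:

```python
import subprocess


def papermill_command(input_nb, output_nb, params):
    """Build the argv list for a papermill run with -p name value pairs."""
    cmd = ["papermill", str(input_nb), str(output_nb)]
    for name, value in params.items():
        cmd += ["-p", name, str(value)]
    return cmd


def run_with_papermill(input_nb, output_nb, **params):
    """Run the notebook; raises CalledProcessError on a non-zero exit status."""
    return subprocess.run(papermill_command(input_nb, output_nb, params), check=True)
```

      The executed copy, with outputs, ends up in the output path, so the input notebook stays untouched.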

  • freetonik 10 hours ago

    I’m confused. I submitted this link almost two days ago, but now it says “3 hours ago”, as if the timestamp was modified.

    • abdullahkhalids 10 hours ago

      This is intended HN behavior. Someone else submitted it again today and due to the closeness of times (within a few days), a new item is not created but the old item is boosted.

      • freetonik 9 hours ago

        Ah, thanks for clarifying, did not know this. I thought when someone submits an existing link it either has no effect (and the submitter is redirected to the old post), or a new post is created if enough time had passed since the original submission.

        • bobnamob 9 hours ago

          There’s also a chance it was placed on the second chance queue and reposted “automatically”