Presentation: Communicating code and data

Using the notebook format and app frameworks to communicate code and data

Thomas H. Simm

Content

Notebooks
- What are they?
- Examples
- Pros and cons
Apps
- Voila
- Streamlit
Websites and HTML
- Converting notebooks to HTML and websites
Presentations
- Using notebooks for presentations
Tabular Data
- Comments on Excel
- Thoughts on code alternatives

Overview Communicating code

Communicating when code is a large element of the presentation
- Microsoft Word/ppt- type methods aren’t set-up well to include code
- Programming files (e.g. .py) aren’t set-up well to share or understand what is going on in the code
- Videoing code with outputs is an option, but doesn’t translate to other formats (i.e. we may also need to do a written format of this)
- Excel is a good way to present tabular data but if we are only viewing is slow and inefficient
Better ways?
- Programming notebooks (e.g..ipynb) offer a good and easy to share code alongside written content
  - And can be converted to other formats easily such as html
- Apps (e.g. streamlit) can be good
  - particularly when interactivity is required

Notebooks General

Jupyter Notebooks

From TalkPython: Awesome Jupyter Libraries and Extensions

Jupyter is an amazing environment for exploring data and generating executable reports with Python. But there are many external tools, extensions, and libraries to make it so much better and make you more productive.

A notebook consists of two parts
- markdown part where we can:
  - write text, add images, links, html, LaTeX etc
- code part which runs and displays output of code

Some links:

Example of a notebook

An example notebook

Markdown in a notebook 1

Some useful commands:

# Notebooks General and ## Markdown in a notebook 1
![](ghtop_images/pest.png) looks like this

Markdown in a notebook 2

And the same with a mp4 file ![](ghtop_images/revealjs.mp4)
Or a youtube video ""

Markdown in a notebook 3

> If we want text like this

If we want text like this

Or if we want code use `a = b + c`

or:

```

a = b

a = a + c

```


a = b

a = a + c

Markdown in a notebook 4

HTML works too

<img src="ghtop_images/pest.png"></img>

Code in a notebook

Example interactive format using altair:

Simpler code output

x = np.arange(0,np.pi,.01)
y = np.sin(x)
plt.plot(x,y)

Other code stuff

Can use shell commands e.g. !pip install pandas
Can use magic commands e.g. %%time to time a cell

%%time
y=0
for x in range(0,100):
    y+=x
print(f"y is {y}")

y is 4950
CPU times: total: 0 ns
Wall time: 0 ns

But Not everyone loves notebooks :(

Notebooks have their validish detractors I don’t like notebooks.- Joel Grus Youtube

Notebooks Opinion

Although notebooks have their validish detractors I don’t like notebooks.- Joel Grus Youtube I think if you approach them in the right way they are a super powerful tool.

The negatives seem to be:

encourage bad practice in code (a genuine problem)
issues around order of what cell is run (easily got around with good practice)
issues around lack of auto complete (I don’t see the issue, use in visual studio autocomplete is there)
no grammar/spelling correction
issues with using git and version control
- there are ways around this though
anything else?

Notebook Benefits

Notebooks are intuitive
- You have the code then the result of the code
- Can add text or images
- And it’s linear
Can get things up and working quickly
Aid with communicating code
Encourages Writing
- and writing things down aids thinking in the now and understanding what you did and why in the future
- FastAI guide for better blogs
Easy to convert code to a pipeline
Treat it as a notebook
- in the same way an artist would have a sketchbook to help make final piece
With many companies moving towards Python/R from Excel and a varied level of skills.
- “Aid with communicating code” is particularly important

Example Useage

Example: Documenting Code

Here is my website for my research project on pesticides in UK food
This is not the same as documentation for a package but there are parallels

This does a few things:

Documents the analysis steps I have taken including the code and outputs
- Useful for data transparency, useability of the code if needs modifiying/adapting, and why I did XYZ
Provides a way to present the data
- There is a streamlit app, but sometimes I like to be able to see the code

Example: Tool to aid learning

A big area I have been using Jupyter Notebooks for is to aid learning

If you want to understand something it helps to write it down
Having the code next to it is a big advantage
And if stored on github you can access it anywhere

Tensoflow cheat sheet

Example: Debugging Code

Since starting at ONS I have been working with understanding an existing project and latterly adding code to it
The project consists of multiple python files across several folders
- My Python was good but lots of the functions and their useage weren’t immediately obvious to me
break-points in VS Studio is really good to step through the code and work out what happens in the code.
- I had not used before with Python (but had lots with MATLAB), and it’s really useful
But it can be limited what you can do
- difficult to probe code if want to write more than 1 line of code
- the experience/knowledge exists as you go through it but no documentation to refer to later, e.g. function X does this when I give it Y etc

Example: Debugging Code 2

By copying and pasting code into Jupyter cells I could see and document how they worked (e.g. changing inputs)
- This (copying and pasting) would get around code changes too (which would be an issue if modules were just imported)
- because this was all done in Jupyter notebook I can have a ipynb code file and a html file showing how the code works
- I could even save a pickle file of the variables at a particularly point to understand how the code would work from this point

Apps

Apps Overview

There are many packages that can be used to convert python code to an app

Using Notebooks directly

Apps without notebooks

PySimpleGUI
- Simple and useful but not the best for displaying data
Dash
- Looks really good, but also super complicated
Streamlit
- Easy and looks good

Voila

Voila is relatively simple to use
run with something like voila .\Excel_Voila.ipynb
converts notebook to an app
can use things like ipywidgets for interactivity
the reason I’m interested in it’s use is streamlit doesn’t seem to give flexibility to modify table output I’d like

Streamlit Overview

Streamlit is an open-source Python library that makes it easy to create and share beautiful, custom web apps for machine learning and data science. In just a few minutes you can build and deploy powerful data apps. So let’s get started!

Principally used to create apps, but some of the functionality works well for code/data presentations

Streamlit Functionality: overview

Streamlit allows various functionality:

textbox
images/videos
charts/tables
menus/buttons
etc

Streamlit Functionality: streamlit_layout

But unlike some apps (am thinking MATLAB GUIs) you can’t create the look and functionality separately. So if you want something in a certain position it can be tricky. HTML can be used with st.markdown to give more control but it isn’t recommended to use by streamlit.

Instead, to create the layout as you would like they have the following features:

Streamlit Functionality: columns and sidebar

The most useable are the first two: columns and sidebar

Columns allows us to split the app vertically. The code is fairly simple:

Either colL, colM, colR = st.columns(3) for 3 equal columns or to split columns with different sizes:

colL, _, colR = st.columns((10, 5, 20))
with colL:
    st.write('On the left')
with colR:
    st.write('On the right twice as big as left')

st.sidebar just adds a sidebar to the app that can be hidden or shown.

Anything in the sidebar is just prefixed by st.sidebar so:

st.sidebar.write('I am in the sidebar')
st.write('I am in the main app')
st.sidebar.write('I am back in the sidebar')

Streamlit Functionality: html

It is possible to add various additional personalisations using html. - BUT it does come with security risks and so is [not recommended]](https://github.com/streamlit/streamlit/issues/152)

Allows much more control over the layout of the app that can be useful for a presentation: - Can add a background image - Can add background color to a textbox - Control over positioning of widgets - lots more

HTML is implementated using st.markdown with unsafe_allow_html=True inside the former

Streamlit Functionality: html examples

add background to a text box

text = "Code Examples"
        st.markdown(f'<center><p style=font-family:"Calibri";background-color:#FFFFFF;color:#000000;font-size:42px;border-radius:10%><b>{text}</b></p></center>', unsafe_allow_html=True)

Or to add a background image

import streamlit as st
import base64

@st.cache(allow_output_mutation=True)
def get_base64_of_bin_file(bin_file):
    with open(bin_file, 'rb') as f:
        data = f.read()
    return base64.b64encode(data).decode()

def set_png_as_page_bg(png_file):
    bin_str = get_base64_of_bin_file(png_file) 
    page_bg_img = '''
    <style>
    .stApp {
    background-image: url("data:image/png;base64,%s");
    background-size: contain;
    background-repeat: no-repeat;
    background-attachment: scroll; # doesn't work
    }
    </style>
    ''' % bin_str
    st.markdown(page_bg_img, unsafe_allow_html=True)
    return

Streamlit Functionality: echo

Sometimes you want your Streamlit app to contain both your usual Streamlit graphic elements and the code that generated those elements. That’s where st.echo() comes in

Easier to display this by an example:

In the example above the right of the image is given below (st.columns is used, where the input for the function is found from the left column).

st.echo is used with the with statement.
everything within the with is printed to the screen and executed

with st.echo():
    # Everything inside this block will be both printed to the screen
    # and executed.

    def do_pd_replace(text, text_search, text_sub):
        col_name = "Start string"
        df = pd.DataFrame(data=[text], columns=[col_name])

        df["Final String"] = df[col_name].replace(
            text_search, text_sub, regex=True
        )

        st.dataframe(df)
        st.write(f"text_search = '{text_search}' and text_sub = '{text_sub}'")
        st.write(f"Input string = '{text}'")
        st.write(f"Output string = '{df['Final String'].values[0]}'")

    do_pd_replace(text, text_search, text_sub)

Streamlit Functionality: pages

By simply creating a folder called pages and putting other streamlit .py files in the folder they can then be accessed in the sidebar.

A main file needs to be outside the pages folder
The .py files in pages behave as if they were outside the folder (i.e. when loading files/functions)

Example Streamlit Presentation

Notebooks to HTML

What I have used to convert notebooks to html

fastpages
- I have used fastpages, but this is now deprecated and they are recommending the use of quarto
quarto
- So far I have found quarto really good and flexible (N.B. R works too)
- Easy to convert a notebook to multiple formats, including html, powerpoint, pdf, word doc
- BUT Quarto is not possible if installing from non pip sources is an issue (as far as I can tell currently)
nbconvert is another option I tried
- but it doesn’t seem to have the functionality of fastpages or quarto
Jupyter Books seems to be the best option within companies with installation issues
- Maybe not as good as quarto but it works!

Others

I know some people use Sphinx,
- is recommended by QA
- From what I can tell sphinx on it’s own is not as easy to use as notebooks
- But there is a jupyter extension nbsphinx
- Jupyter Books uses Sphinx heavily under the hood
nbdev
- I think is connected to quarto
Voila
- Voilà turns Jupyter notebooks into standalone web applications.
- Looks good, bit like streamlit
- but seems to interfere with other libraries
- mercury seems similar

Creating html (& other formats)

Quarto

Installation is via a package i.e. .msi for Windows or .pkg for Mac. Which can cause issues.

Works with both ipynb and qmd files, which are both a mixture of markdown and executable code.

The only thing that needs to be done with the notebook is add a YAML block at the start of the notebook, like the following (raq not markdown was used):

---
title: "Communicating code: Website"
subtitle: "Using the notebook format for a website"
author: "Thomas H. Simm"
format:
  html:
    toc: true
title-slide-attributes:
  data-background-size: contain
  data-background-opacity: "0.5"
jupyter: python3
---

We can create different files from this .ipynb Jupyter notebook using the following code:

quarto render testPres.ipynb --to pptx
quarto render testPres.ipynb --to pdf
quarto render testPres.ipynb --to html
quarto render testPres.ipynb --to revealjs

Quarto

Further, formatting for projects (i.e. for website) can be done within the configuration file _quarto.yml

project:
  type: website
  output-dir: _site

website:
  title: "ThomasHSimm"
  favicon: /posts/Picture3.png
  body-header: <img src="/posts/header2.png" height=200>

  navbar:
    right:
      - about.qmd
      - icon: github
        href: https://github.com/ThomasHSimm
      - icon: mortarboard-fill
        href: https://scholar.google.com/citations?hl=en&user=HdPDn1sAAAAJ
format:
  html:
    theme: 
      light: flatly
      dark: darkly
    css: styles.css

Jupyter Books

We can create different files from this .ipynb Jupyter notebook using the following code:

jupyter-book build .\PesticideDocs\
jupyter-book build <path-to-book>
jupyter-book build <path-to-book> --builder pdfhtml
jupyter-book build <path-to-book> --builder singlehtml

The only difference in notebook is that it needs to have One header in a markdown cell for the table of contents, e.g.

# Title of page

Configuration file

A seperate files _config.yml is used to define how the html (or other) files will look

# Book settings
# Learn more at https://jupyterbook.org/customize/config.html

title: Defra Pesticide Testing, Data Analysis
author: Thomas Simm
logo: ONS-logo.png
exclude_patterns: [_build, Thumbs.db, .DS_Store, "**.ipynb_checkpoints"]


# Force re-execution of notebooks on each build.
# See https://jupyterbook.org/content/execute.html
execute:
  execute_notebooks: force

# Define the name of the latex output file for PDF builds
latex:
  latex_documents:
    targetname: book.tex

# Add a bibtex file so that we can create citations
bibtex_bibfiles:
  - references.bib

# Information about where the book exists on the web
repository:
  url: https://github.com/ThomasHSimm/Pesticide  # Online location of your book
  path_to_book: docs  # Optional path to your book, relative to the repository root
  branch: master  # Which branch of the repository should be used when creating links (optional)

# Add GitHub buttons to your book
# See https://jupyterbook.org/customize/config.html#add-a-link-to-your-repository
# HTML-specific settings
html:
  favicon                   : "_images/favicon.jpg"  # A path to a favicon image
  use_edit_page_button      : false  # Whether to add an "edit this page" button to pages. If `true`, repository information in repository: must be filled in
  use_repository_button     : false  # Whether to add a link to your repository button
  use_issues_button         : false  # Whether to add an "open an issue" button
  use_multitoc_numbering    : true   # Continuous numbering across parts/chapters
  extra_navbar              : Powered by <a href="https://jupyterbook.org">Jupyter Book</a>
                              <br>Home website <a href="https://thomashsimm.com/">thomashsimm.com</a> # Will be displayed underneath the left navbar.
  extra_footer              : ""  # Will be displayed underneath the footer.
  google_analytics_id       : ""  # A GA id that can be used to track book views.
  home_page_in_navbar       : true  # Whether to include your home page in the left Navigation Bar
  baseurl                   : ""  # The base URL where your book will be hosted. Used for creating image previews and social links. e.g.: https://mypage.com/mybook/
  comments:
    hypothesis              : false
    utterances              : false
  announcement              : "" # A banner announcement at the top of the site.

Table of content

And in addition to the config file a table of contents file is required _toc.yml:

# Table of contents
# Learn more at https://jupyterbook.org/customize/toc.html

format: jb-book
root: intro
chapters:
- file: Pesticide_Plots
- file: References
- file: UK_areas
- file: using_jupyter_books

Creating a webpage from this

Takes about 30 mins including installing the chosen converter. (But can be done much quicker)

create a Github repo for your website
choose the converter (e.g. Jupyter Books)
- And follow their instructions
go to settings -> Pages within the repo
- few options to do
Optional: add your own website url to it

Link how to do this here

In Quarto a command from your PC in the repo, publishes the website:

quarto publish quarto-pub

Or equivalently with Jupyter Books:

ghp-import -n -p -f _build/html

Creating directly from the repo

If we instead want to convert notebook files directly from a repo to create a website then this can be done with Netlify.

This is useful if using Gitlab (i.e. not Github) or don’t want all the extra html files cluttering the repo.

Steps:

https://jupyterbook.org/en/stable/publish/netlify.html

Sign up and connect Github/Gitlab
Add a requirements.txt file and also toc.yml to directory
On netlify -> Add new site -> import from an existing repo
Insert something like below
- N.B. the command:
- pip install -r requirements.txt && jupyter-book build .
- and folder location

On netlify

Example:

Presentations

Quarto Presentations

Quarto supports a variety of formats for creating presentations, including:

revealjs — reveal.js (HTML)
pptx — PowerPoint (MS Office)
beamer — Beamer (LaTeX/PDF)

I’ll consider the first two

Quarto PowerPoint overview

The steps to make a PowerPoint presentation from a notebook:

Create the inbuilt template.pptx file
Adjust it to match your own template
At the top of the notebook insert format for pptx including the template file
Choose how you will define a new page
You will probably need to manually check the slides and adjust as required
- especially for interactive content and code

Creating the template

(Office info correct for Office 365 Feb 2023, Version 2301 Build 16.0.16026.20002)

If your workplace has a custom template or you have one you always use, you can incorporate this into quarto.

However, quarto is quite specific on the form this template takes, and requires the following elements

Title Slide
Title and Content
Section Header
Two Content
Comparison
Content with Caption
Blank

Creating a ppt template

By selecting Layout from the Home tab in powerpoint the different layouts can be seen

They can then be modified by going to View tab - Slide Master.

Creating a ppt template 2

If using your own template you will need to match the names of the slides given above. These can be found by hovering over the slides on the left or right clicking on one and selecting “Rename Layout”

Creating a ppt template 3

Alternatively, create a custom template using quarto and then modify this. The following command creates the template:

quarto pandoc -o template.pptx --print-default-data-file reference.pptx

Then go to View tab - Slide Master and modify each slide layout.

Note if you are trying to match a template, some tips: - go to Design -> Slide Size and match this to your template - when View tab - Slide Master is selected go to first tab (see above it will be left indented) on one you are copying from and select all on this then paste to the new template - these will be background images and other things that want to be passed to all slides - Check other slides for images and font-styles etc to match to the new template

Load the template

To load the template the first cell in the notebook needs to be modified as follows to reference the template.pptx file.

format:
  pptx:
    reference-doc: template.pptx
    slide-level: 2

In addition, we can also specify here the rule by which a new slide is defined. If slide-level: 2 is used a new slide is defined by “##’ and a new section header by ‘#’. So if we used ‘###’ this would be a heading within the slide.

If slide-level: 1 is used a new slide is defined by “#’ and ‘##’ this would be a heading within the slide (this is normally the default).

Check the slides

I have found creation of slides to powerpoint more prone to strange results than if .doc/.pdf/.html are used.

So check the slides, see if interactive content or code has been included (probably not) and if the slide content goes outside the slide.

In the example above - There is overlap of text on a slide - Strange ouput of a slide - Code output not displayed

Quarto HTML overview

With quarto two different html formats can be produced by using either html or revealjs.

Whereas, html produces the standard html format revealjs produces an interactive presentation format. https://quarto.org/docs/presentations/revealjs/

revealjs does pretty much the same as a powerpoint file but is more robust - interactive content is included - less issues with getting format to fit within the slide

But - Can’t use the ppt template - And maybe(?) there are issues with sharing this format? - Interactive elements not as well implemeneted as within pure html

Adding style to revealjs

A simple way to add template like details to a revealjs file is to add a style.css sheet.

In the example below, the style sheet adds logo.png to the bottom right of each sheet

The file style.css looks like this:

.reveal .slide-logo {
  display: block;
  position: fixed;
  top: unset !important;
  left: unset !important;
  bottom: 50px;
  right: 12px;
  height: 100px !important;
  width: 100x !important;
  max-width: unset !important;
  max-height: unset !important;
}

Adding style to revealjs

And the revealjs part at the top of the jupyter notebook looks like this

revealjs:
    slide-number: true
    height: 1080
    width: 1920
    logo: logo.png
    css: style.css

So this would then look like the following, with the logo (logo.png) in the bottom right, and size and positioning given by the css file

What the revealjs file looks like

Tabular Data

How efficient is Excel when we are just viewing the data?
What other with code options to Excel are there?

Excel files

ChatGPT: What are the problems of Excel and xlsx files?

Limited scalability: Excel has a limit on the number of rows and columns it can handle effectively.
Inaccurate data: The manual nature of data entry in Excel can lead to errors and inaccuracies.
Fragility: Complex Excel spreadsheets can break easily if any formula or cell reference is altered.
Lack of security: Excel files are vulnerable to unauthorized access and malicious attacks.
Slow performance: Large and complex Excel files can become slow to open and use, leading to decreased productivity.
Compatibility issues: XLSX files may not be compatible with older versions of Excel, or with other software applications.
Limited collaboration: Sharing Excel files and making changes to them can be difficult, especially when multiple people are involved.

For me it is the Slow performance alongside: (a) we are doing the data manipulation outside Excel anyway and (b) having to have another application open

What aspect of tables I am considering

Loading data

`.ods` with Excel

About 10 s to open 3 ods files with Excel

`.ods` converted to html in firefox browser

About 6 s to open 3 converted ods files in a browser

`.ods` converted to html firefox browser no new tabs

Almost instant when open converted ods files in same browser

Convert xlsx to html?

Opening xlsx files in Excel is slow
Converting to html if we don’t want to edit could be an option
If we are moving to Python/R aren’t non-Excel options worth considering??

Converting xlsx files to html

Seems the most obvious course
But it doesn’t seem that easy with code
- but easy within Excel

pandas

Since (in Python) we are mainly working with pandas. Let’s consider how pandas outputs can be modified.
pandas options

Some code functionality

# precision of all columns
pd.set_option("display.precision", 2)
# Or map as a string
df2['amount_pc_str'] = df2['amount_pc'].map(lambda x: '%.3f' % x)
# some other options
pd.set_option('max_colwidth', 20)
pd.set_option('display.max_columns', None)
pd.set_option('display.expand_frame_repr', False)
pd.set_option('max_colwidth', 0)

pandas basic

	sample_id	date_of_sampling	description	country_of_origin	retail_outlet	address	brand_name	packer_/_manufacturer_/_importer	product	address_postcode	packer_postcode	address_area	packer_area	chem_name	amount_detected	mrl	amount_pc
0	1958/2016	2016-08-08	Bramley Apples	UK	Asda	Creechbarrow Road, Taunton TA1 2AN	Asda	Asda Stores Ltd Leeds, UK LS11 5AD	Apple	TA1 2AN	LS11 5AD	Somerset	West Yorkshire	boscalid	0.03	2.0	0.015
1	1958/2016	2016-08-08	Bramley Apples	UK	Asda	Creechbarrow Road, Taunton TA1 2AN	Asda	Asda Stores Ltd Leeds, UK LS11 5AD	Apple	TA1 2AN	LS11 5AD	Somerset	West Yorkshire	pyraclostrobin	0.01	0.5	0.020
2	0230/2016	2016-08-08	Bramley Apples	UK	Co-op	Northgate, Louth LN11 0LT	Co-op	Co-operative Group Ltd Manchester M60 0AG	Apple	LN11 0LT	M60 0AG	Lincolnshire	Greater Manchester	boscalid	0.05	2.0	0.025
3	0230/2016	2016-08-08	Bramley Apples	UK	Co-op	Northgate, Louth LN11 0LT	Co-op	Co-operative Group Ltd Manchester M60 0AG	Apple	LN11 0LT	M60 0AG	Lincolnshire	Greater Manchester	flonicamid (sum)	0.02	0.2	0.100
4	0230/2016	2016-08-08	Bramley Apples	UK	Co-op	Northgate, Louth LN11 0LT	Co-op	Co-operative Group Ltd Manchester M60 0AG	Apple	LN11 0LT	M60 0AG	Lincolnshire	Greater Manchester	pyraclostrobin	0.03	0.5	0.060
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
35155	2858/2020 Organic	2020-10-20	Organic Sweet Potatoes	Spain	Tesco	300 Beverley Way, New Malden KT3 4PJ	Tesco	Tesco Stores Ltd Welwyn Garden City AL7 1GA	Sweet_Potatoes_Q4_(BNA)	KT3 4PJ	AL7 1GA	Greater London	Hertfordshire	0	0.00	0.0	0.000
35156	0562/2020 Organic	2020-10-05	Organic Duchy Sweet Potatoes	Egypt	Waitrose	Mill Lane, Swindon SN1 7BX	Waitrose	Waitrose Ltd Doncastle Road, Bracknell, Berksh...	Sweet_Potatoes_Q4_(BNA)	SN1 7BX	RG12 8YA	Wiltshire	Berkshire	0	0.00	0.0	0.000
35157	0563/2020	2020-10-05	Sweet Potatoes	USA	Waitrose	Mill Lane, Swindon SN1 7BX	Waitrose	Waitrose Ltd Doncastle Road, Bracknell, Berksh...	Sweet_Potatoes_Q4_(BNA)	SN1 7BX	RG12 8YA	Wiltshire	Berkshire	0	0.00	0.0	0.000
35158	2601/2020	2020-10-14	Sweet Potatoes	USA	Waitrose	Ossington Way, Newark NG24 1FF	Waitrose	Waitrose Ltd Doncastle Road, Bracknell, Berksh...	Sweet_Potatoes_Q4_(BNA)	NG24 1FF	RG12 8YA	Nottinghamshire	Berkshire	0	0.00	0.0	0.000
35159	2601/2020	2020-10-14	Sweet Potatoes	USA	Waitrose	Ossington Way, Newark NG24 1FF	Waitrose	Waitrose Ltd Doncastle Road, Bracknell, Berksh...	Sweet_Potatoes_Q4_(BNA)	NG24 1FF	RG12 8YA	Nottinghamshire	Berkshire	0	0.00	0.0	0.000

35160 rows × 17 columns

pandas overview

Using pandas we can control various outputs
But these still need a format to display within
And display functionality is not easy

Or convert to a html file

df2.iloc[:500].to_html('df2_500.html')

But using a style sheet as shown in stack overflow by Parfait

ipydatagrid

https://github.com/bloomberg/ipydatagrid

itables code

from itables import init_notebook_mode

import itables
init_notebook_mode(all_interactive=True)

itables.show(df2)

	sample_id	date_of_sampling	description	country_of_origin	retail_outlet	address	brand_name	packer_/_manufacturer_/_importer	product	address_postcode	packer_postcode	address_area	packer_area	chem_name	amount_detected	mrl	amount_pc
Loading... (need help?)

Dash

https://dash.plotly.com/datatable

Downloaded 800,000 times per month, Dash is the original low-code framework for rapidly building data apps in Python, R, Julia, and F# (experimental).

https://medium.com/plotly/introducing-jupyterdash-811f1f57c02e

OSError: Address 'http://127.0.0.1:8050' already in use.
    Try passing a different port to run_server.

Streamlit

A faster way to build and share data apps

Dash can be run within a notebook but is principally an app.
Streamlit is a similar app.
But much easier to code.

import pandas as pd
import streamlit as st
all_dfs = pd.read_csv("./data/combined_df.csv")
st.dataframe(all_dfs.head())
# Or
st.tables(all_dfs)

And more

DataTables

DataTables is a plug-in for the jQuery Javascript library. It is a highly flexible tool, built upon the foundations of progressive enhancement, that adds all of these advanced features to any HTML table.

Jupyter widgets

If you are looking for Jupyter widgets, have a look at (taken from https://mwouts.github.io/itables/references.html) - QGrid by Quantopian - IPyaggrid by Louis Raison and Olivier Borderies - IPySheet by QuantStack.

More details on itables

From my brief review I found itables the best package

It works
It gives lots of control of table output to be consistent with good-practice
- Column width
- Number formatting
- Column alignment
Alongside
- Search
- Column ordering
- How many rows are shown
- Scrolling options
- etc

itables code1

def _indi_columnDefs(cols, format,col_width):
    if 'num_format' in format:
        if format['num_format'] == "#,##0":
            format_str = "',', '.', 0, ''"
        elif format['num_format'] ==  '0.0':
            format_str = "',', '.', 1, ''"
        else: 
            format_str = "',', '.', 3, ''"
    else:
        format_str = "',', '.', 3, ''"
    columnDefs= {
            "targets": cols,
            "className":f"dt-{format['align']}",
            "render": JavascriptCode(f"$.fn.dataTable.render.number({format_str})"),
            "width": f"{col_width}px",
        }
    return columnDefs

itables code2

style = {
    "general": {
        "table_output" :{
            "pct_cols" : {
                "regex_defined": ["%" , "[Pp]ercentage"],
                "format": {"num_format": "0.0", "align": "right"}
            },
            "total_cols" : {
                    "regex_defined": ["[Tt]otal" ],
                    "format": { "num_format": "0.0", "align": "center"}
            },
            "code_cols" : {
                "regex_defined": ["[Cc]ode"],
                "format": { "align": "left"}
            },
            "_else" : {
                "regex_defined": [],
                "format": {"num_format": "#,##0", "align": "right"}
            }
        
        },
        "col_width": 130.0
    }
}

itables code3


def _create_style_col(style: dict, df: pd.DataFrame):
    style_col = {}
    colsAll=[]
    for keys in style['general']['table_output'].keys():
        if keys !='_else':
            # print(">>",keys)
            cols=[]
            for i,column in enumerate(df.columns):
                # print(column)
                if [True for reg in style['general']['table_output'][keys]['regex_defined'] if re.search(reg,column)]:
                    # print(i,column)
                    cols.append(i)
            # print(cols)
            colsAll = colsAll + cols
            style_col[keys]=cols

    style_col['_else'] = [ i for i,column in enumerate(df.columns) if i not in colsAll]

    return style_col

itables code4

def create_columnDefs(style: dict, df: pd.DataFrame):
    
    style_col = _create_style_col(style, df)
    columnDefs = []
    for keys in style['general']['table_output'].keys():
        columnDefs.append( _indi_columnDefs(style_col[keys],
                                            style['general']['table_output'][keys]['format'],
                                            style['general']['col_width']) )
    return columnDefs

itables code5


columnDefs= create_columnDefs(style, data)
show(
    data,
    # style="table-layout:auto;width:80%;float:left",
    classes="display",
    
    # specify how many rows
    lengthMenu = [25,100,-1],
    # or to scroll through data
    scrollX=True,
    # scrollY="800px", 
    scrollCollapse=True, 
    # paging=False,

    style=f"width:{style['general']['col_width']*10}px",
    autoWidth=False,

    # add footer
    # footer=True,
    
    columnDefs=columnDefs,
    tags =f'<caption style="caption-side: Bottom">File: {files[0]} and Tab: {ansa.value}</caption>'
)

Presentation: Communicating code and data

Content

Overview Communicating code

Notebooks General

Jupyter Notebooks

Example of a notebook

Markdown in a notebook 1

Markdown in a notebook 2

Markdown in a notebook 3

Markdown in a notebook 4

Code in a notebook

Simpler code output

Other code stuff

But Not everyone loves notebooks :(

Notebooks Opinion

Notebook Benefits

Example Useage

Example: Documenting Code

Example: Tool to aid learning

Example: Debugging Code

Example: Debugging Code 2

Apps

Apps Overview

Voila

Streamlit Overview

Streamlit Functionality: overview

Streamlit Functionality: streamlit_layout

Streamlit Functionality: columns and sidebar

Streamlit Functionality: html

Streamlit Functionality: html examples

Streamlit Functionality: echo

Streamlit Functionality: pages

Example Streamlit Presentation

Notebooks to HTML

What I have used to convert notebooks to html

Others

Creating html (& other formats)

Quarto

Quarto

Jupyter Books

Configuration file

Table of content

Creating a webpage from this

Creating directly from the repo

Steps:

On netlify

Presentations

Quarto Presentations

Quarto PowerPoint overview

Creating the template

Creating a ppt template

Creating a ppt template 2

Creating a ppt template 3

Load the template

Check the slides

Quarto HTML overview

Adding style to revealjs

Adding style to revealjs

What the revealjs file looks like

Tabular Data

Excel files

What aspect of tables I am considering

Loading data

.ods with Excel

.ods converted to html in firefox browser

.ods converted to html firefox browser no new tabs

Convert xlsx to html?

pandas

pandas basic

pandas overview

ipydatagrid

itables code

Dash

Streamlit

And more

DataTables

Jupyter widgets

More details on itables

itables code1

itables code2

`.ods` with Excel

`.ods` converted to html in firefox browser

`.ods` converted to html firefox browser no new tabs