import pandas as pd
import os
from pathlib import Path
import sys
= Path( os.getcwd() )
module_path = module_path.parent.parent.parent.__str__() + '\\Pesticide'
module_path
= module_path
cwd
= os.path.join(cwd,'data')
folder_path
0, module_path)
sys.path.insert(
= pd.read_csv(os.path.join(folder_path,'combined_df.csv') ,index_col=0 )
df2 # change data type of columns
'date_of_sampling'] = pd.to_datetime(df2['date_of_sampling']) df2[
Communicating code: Tables
Communicating tabular data
Tabular Data
- How efficient is Excel when we are just viewing the data?
- What other with code options to Excel are there?
Excel files
ChatGPT: What are the problems of Excel and xlsx files?
Limited scalability: Excel has a limit on the number of rows and columns it can handle effectively.
Inaccurate data: The manual nature of data entry in Excel can lead to errors and inaccuracies.
Fragility: Complex Excel spreadsheets can break easily if any formula or cell reference is altered.
Lack of security: Excel files are vulnerable to unauthorized access and malicious attacks.
Slow performance: Large and complex Excel files can become slow to open and use, leading to decreased productivity.
Compatibility issues: XLSX files may not be compatible with older versions of Excel, or with other software applications.
Limited collaboration: Sharing Excel files and making changes to them can be difficult, especially when multiple people are involved.
For me it is the Slow performance alongside: (a) we are doing the data manipulation outside Excel anyway and (b) having to have another application open
ods
with Excel
About 10 s to open 3 ods
files with Excel
.ods
converted to html in firefox browser
About 6 s to open 3 converted ods
files in a browser
.ods
converted to html firefox browser no new tabs
Almost instant when open converted ods
files in same browser
What aspect of tables I am considering
Convert xlsx to html?
- Opening xlsx files in Excel is slow
- Converting to html if we don’t want to edit could be an option
- If we are moving to Python/R aren’t non-Excel options worth considering??
Converting xlsx files to html
- Seems the most obvious course
- But it doesn’t seem that easy with code
- but easy within Excel
What does chatgpt say to convert excel file?
Convert an excel file to html with python
What does chatgpt say without pandas?
convert excel file to html in python without pandas include the excel formatting such as column width
pandas
- Since (in Python) we are mainly working with pandas. Let’s consider how pandas outputs can be modified.
- pandas options
Some code functionality
# precision of all columns
pd.set_option("display.precision", 2)
# Or map as a string
df2['amount_pc_str'] = df2['amount_pc'].map(lambda x: '%.3f' % x)
# some other options
pd.set_option('max_colwidth', 20)
pd.set_option('display.max_columns', None)
pd.set_option('display.expand_frame_repr', False)
pd.set_option('max_colwidth', 0)
pandas basic
df2
sample_id | date_of_sampling | description | country_of_origin | retail_outlet | address | brand_name | packer_/_manufacturer_/_importer | product | address_postcode | packer_postcode | address_area | packer_area | chem_name | amount_detected | mrl | amount_pc | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1958/2016 | 2016-08-08 | Bramley Apples | UK | Asda | Creechbarrow Road, Taunton TA1 2AN | Asda | Asda Stores Ltd Leeds, UK LS11 5AD | Apple | TA1 2AN | LS11 5AD | Somerset | West Yorkshire | boscalid | 0.03 | 2.0 | 0.015 |
1 | 1958/2016 | 2016-08-08 | Bramley Apples | UK | Asda | Creechbarrow Road, Taunton TA1 2AN | Asda | Asda Stores Ltd Leeds, UK LS11 5AD | Apple | TA1 2AN | LS11 5AD | Somerset | West Yorkshire | pyraclostrobin | 0.01 | 0.5 | 0.020 |
2 | 0230/2016 | 2016-08-08 | Bramley Apples | UK | Co-op | Northgate, Louth LN11 0LT | Co-op | Co-operative Group Ltd Manchester M60 0AG | Apple | LN11 0LT | M60 0AG | Lincolnshire | Greater Manchester | boscalid | 0.05 | 2.0 | 0.025 |
3 | 0230/2016 | 2016-08-08 | Bramley Apples | UK | Co-op | Northgate, Louth LN11 0LT | Co-op | Co-operative Group Ltd Manchester M60 0AG | Apple | LN11 0LT | M60 0AG | Lincolnshire | Greater Manchester | flonicamid (sum) | 0.02 | 0.2 | 0.100 |
4 | 0230/2016 | 2016-08-08 | Bramley Apples | UK | Co-op | Northgate, Louth LN11 0LT | Co-op | Co-operative Group Ltd Manchester M60 0AG | Apple | LN11 0LT | M60 0AG | Lincolnshire | Greater Manchester | pyraclostrobin | 0.03 | 0.5 | 0.060 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
35155 | 2858/2020 Organic | 2020-10-20 | Organic Sweet Potatoes | Spain | Tesco | 300 Beverley Way, New Malden KT3 4PJ | Tesco | Tesco Stores Ltd Welwyn Garden City AL7 1GA | Sweet_Potatoes_Q4_(BNA) | KT3 4PJ | AL7 1GA | Greater London | Hertfordshire | 0 | 0.00 | 0.0 | 0.000 |
35156 | 0562/2020 Organic | 2020-10-05 | Organic Duchy Sweet Potatoes | Egypt | Waitrose | Mill Lane, Swindon SN1 7BX | Waitrose | Waitrose Ltd Doncastle Road, Bracknell, Berksh... | Sweet_Potatoes_Q4_(BNA) | SN1 7BX | RG12 8YA | Wiltshire | Berkshire | 0 | 0.00 | 0.0 | 0.000 |
35157 | 0563/2020 | 2020-10-05 | Sweet Potatoes | USA | Waitrose | Mill Lane, Swindon SN1 7BX | Waitrose | Waitrose Ltd Doncastle Road, Bracknell, Berksh... | Sweet_Potatoes_Q4_(BNA) | SN1 7BX | RG12 8YA | Wiltshire | Berkshire | 0 | 0.00 | 0.0 | 0.000 |
35158 | 2601/2020 | 2020-10-14 | Sweet Potatoes | USA | Waitrose | Ossington Way, Newark NG24 1FF | Waitrose | Waitrose Ltd Doncastle Road, Bracknell, Berksh... | Sweet_Potatoes_Q4_(BNA) | NG24 1FF | RG12 8YA | Nottinghamshire | Berkshire | 0 | 0.00 | 0.0 | 0.000 |
35159 | 2601/2020 | 2020-10-14 | Sweet Potatoes | USA | Waitrose | Ossington Way, Newark NG24 1FF | Waitrose | Waitrose Ltd Doncastle Road, Bracknell, Berksh... | Sweet_Potatoes_Q4_(BNA) | NG24 1FF | RG12 8YA | Nottinghamshire | Berkshire | 0 | 0.00 | 0.0 | 0.000 |
35160 rows × 17 columns
pandas overview
- Using pandas we can control various outputs
- But these still need a format to display within
- And display functionality is not easy
Or convert to a html file
df2.iloc[:500].to_html('df2_500.html')
But using a style sheet as shown in stack overflow by Parfait
= df2.iloc[:500].copy()
df_out
'colheader_justify', 'center') # FOR TABLE <th>
pd.set_option(
= '''
html_string <html>
<head><title>HTML Pandas Dataframe with CSS</title></head>
<link rel="stylesheet" type="text/css" href="df_style.css"/>
<body>
{table}
</body>
</html>.
'''
# OUTPUT AN HTML FILE
with open('df2_500.html', 'w') as f:
format(table=df_out.to_html(classes='mystyle'))) f.write(html_string.
https://www.python-excel.org/
ipydatagrid
https://github.com/bloomberg/ipydatagrid
from ipydatagrid import DataGrid, TextRenderer, VegaExpr
import ipydatagrid
= DataGrid(df2, selection_mode="cell", editable=True,
datagrid =32, base_column_size=150)
base_row_size
= DataGrid(df2, base_row_size=30, base_column_size=150)
datagrid datagrid
itables code
from itables import init_notebook_mode
import itables
init_notebook_mode(all_interactive=True)
itables.show(df2)
from itables import init_notebook_mode
import itables
=True)
init_notebook_mode(all_interactive
itables.show(df2)
sample_id | date_of_sampling | description | country_of_origin | retail_outlet | address | brand_name | packer_/_manufacturer_/_importer | product | address_postcode | packer_postcode | address_area | packer_area | chem_name | amount_detected | mrl | amount_pc | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Loading... (need help?) |
Dash
https://dash.plotly.com/datatable
Downloaded 800,000 times per month, Dash is the original low-code framework for rapidly building data apps in Python, R, Julia, and F# (experimental).
https://medium.com/plotly/introducing-jupyterdash-811f1f57c02e
import plotly.express as px
from jupyter_dash import JupyterDash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output# Load Data
= px.data.tips()# Build App
df = JupyterDash(__name__)
app = html.Div([
app.layout "JupyterDash Demo"),
html.H1(id='graph'),
dcc.Graph(
html.Label(["colorscale",
dcc.Dropdown(id='colorscale-dropdown', clearable=False,
='plasma', options=[
value'label': c, 'value': c}
{for c in px.colors.named_colorscales()
])
]),# Define callback to update graph
])@app.callback(
'graph', 'figure'),
Output("colorscale-dropdown", "value")]
[Input(
)def update_figure(colorscale):
return px.scatter(
="total_bill", y="tip", color="size",
df, x=colorscale,
color_continuous_scale="webgl", title="Tips"
render_mode# Run app and display result inline in the notebook
)='inline') app.run_server(mode
C:\Users\44781\AppData\Local\Temp\ipykernel_15260\3294666565.py:3: UserWarning:
The dash_core_components package is deprecated. Please replace
`import dash_core_components as dcc` with `from dash import dcc`
import dash_core_components as dcc
C:\Users\44781\AppData\Local\Temp\ipykernel_15260\3294666565.py:4: UserWarning:
The dash_html_components package is deprecated. Please replace
`import dash_html_components as html` with `from dash import html`
import dash_html_components as html
Streamlit
A faster way to build and share data apps
- Dash can be run within a notebook but is principally an app.
- Streamlit is a similar app.
- But much easier to code.
import pandas as pd
import streamlit as st
all_dfs = pd.read_csv("./data/combined_df.csv")
st.dataframe(all_dfs.head())
And more
DataTables
DataTables is a plug-in for the jQuery Javascript library. It is a highly flexible tool, built upon the foundations of progressive enhancement, that adds all of these advanced features to any HTML table.
Jupyter widgets
If you are looking for Jupyter widgets, have a look at (taken from https://mwouts.github.io/itables/references.html) - QGrid by Quantopian - IPyaggrid by Louis Raison and Olivier Borderies - IPySheet by QuantStack.