There are different ways to fell a tree. You can hack away at it until it eventually comes down, or you can take time to sharpen your axe and drop it with a few clean strikes.
My current client work involves filling complex Excel spreadsheets with text and getting approval on them. It’s error-prone and my approval partners find it messy too. So I prepare everything in Obsidian first, using the Excel-to-Markdown plugin and the Advanced Table plugin to make the work simpler.
Today the question arose: how do I get the Markdown table back into Excel?
With help from my virtual intern ChatGPT, I developed a Python script that automatically converts Markdown documents in a specified folder into Excel files while preserving line formatting. This lets me write my text in Markdown and edit it more easily after translation. I hope the effort pays off.
The workflow looks like this:
So I don’t forget (and maybe it helps someone out there on the web), here’s the Python script. I had to run many iterations until it worked. Ultimately “Beautiful Soup” does the heavy lifting.
import os
import pandas as pd
import numpy as np
import openpyxl
from bs4 import BeautifulSoup
def markdown_to_excel_with_linebreaks(md_filepath, excel_filepath):
# Create a DataFrame from the Markdown file
df = pd.read_csv(md_filepath, sep='|', quotechar='"', encoding='utf-8', skipinitialspace=True)
# Keep only the column with text data
df = df.iloc[:,1:2]
# Convert <br> and <div> tags to line breaks and convert <div><br></div> to a line break
df = df.applymap(lambda x: BeautifulSoup(x.replace('<div><br></div>', '\n'), 'lxml').get_text(separator="\n") if pd.notnull(x) else np.nan)
# Save DataFrame to Excel
with pd.ExcelWriter(excel_filepath, engine='openpyxl', mode='w') as writer:
df.to_excel(writer, index=False, header=False)
# Set text wrap for each cell
wb = openpyxl.load_workbook(excel_filepath)
ws = wb.active
for row in ws.iter_rows():
for cell in row:
cell.alignment = openpyxl.styles.Alignment(wrap_text=True)
wb.save(excel_filepath)
def process_directory(directory):
for filename in os.listdir(directory):
if filename.endswith('.md'):
md_filepath = os.path.join(directory, filename)
excel_filepath = os.path.join(directory, filename.replace('.md', '.xlsx'))
markdown_to_excel_with_linebreaks(md_filepath, excel_filepath)
print(f"Converting {md_filepath} to {excel_filepath}...")
process_directory('/Users/macbookpro/md_xls')
To make this run, you need to install a few things:
pip install pandas
pip install openpyxl
pip install beautifulsoup4
I called the script md_xls.py and it lives in the User/macbookpro directory. For the markdown files to convert, I set up a directory at this level (md_xls). I drop the markdown files in there. Then I run the command:
python3 md_xls.py
The script now looks in the md_xls directory for Markdown files and converts them to XLS. Done.
This was quite a specific solution, but maybe it helps you.
From the archives of reinergaertner.de, running since 1997. Translated with AI help and my questionable bilingual proofreading. If you spot a Germanismus — that’s a feature, not a bug.
