Slicing up text sausages

I’ve been dictating texts again lately. My current favourite is OpenAI’s Whisper model. When I’m out and about dictating longer pieces, I get this massive text sausage as a transcript. 5000 characters or more in a single line.

ChatGPT works brilliantly for chopping up these text sausages. I just type: “Please divide the following text into sensible paragraphs.” No idea why I’m so polite, but I’m always nice to ChatGPT. AI bots are only human after all. Well, they’re not.

Since ChatGPT only processes texts up to 4096 characters, I need to break particularly long text sausages into smaller packets. Tedious work. But there’s a CLI command that handles this beautifully:

fold -w 4000 -s longsausage.txt | sed G > nicelysliced.txt

This terminal command folds your text into lines of at most 4000 characters, breaking at spaces (that's the “-s” option) so no word gets chopped in half. To make the packets easier to spot, “sed G” adds a blank line after each one. Works a treat.
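
If you'd rather have each packet in its own file, ready to paste one at a time, something like the following should work. It's only a sketch: it assumes the standard fold, split, awk and wc tools are available, the “packet_” prefix is a name made up for this example, and the first line merely builds a dummy sausage so there is something to slice.

```shell
# Build a one-line test sausage of 9000 characters (1800 times "word ").
awk 'BEGIN { for (i = 0; i < 1800; i++) printf "word "; print "" }' > longsausage.txt

# Fold into lines of at most 4000 characters, breaking at spaces,
# then write each line to its own file: packet_aa, packet_ab, ...
# ("packet_" is a made-up prefix; "-" tells split to read from the pipe).
fold -w 4000 -s longsausage.txt | split -l 1 - packet_

# Show the character count of each packet (each count includes the line's newline).
wc -m packet_*
```

With a real transcript, skip the awk line and point fold at your own file.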


This started life in German on reinergaertner.de, my blog since 1997. The English version was AI-assisted. My German-trained eyes may have missed a few things along the way. She’ll be right.