How to export a Penzu journal
By Anatoly Mironov
I have used Penzu as my main journal app for many years. Recently, when Apple launched its Journal app, I started looking at it and at other competitors. Then I realized that I could not get my own data out of Penzu: there is no reasonable export function.
So I found my own way to get my journal data out. I could have named this blog post something like “How to export the unexportable” or “How to intercept XHR requests in Puppeteer”, but my case is about Penzu, so I’ll stick with this particular title.
As a matter of fact, Penzu can export your journal if you have a Pro subscription, and I have one. But the export just creates a PDF file, which is a poor solution. Better than nothing, of course, but I cannot import it into a new journal app. As an IT guy, I want my data, not a PDF.
Here is how you can export a Penzu journal. I’ll start with the solution, then explain some important decisions behind it.
Please see this as a starter, an example of how it can work. It did work for me, but you might need to adjust it to your specific situation and to possible future changes in the Penzu application.
Solution
1. Create a Node.js solution: initialize an npm project and install the axios and puppeteer packages.
2. Start Chrome with remote debugging enabled. (The exact commands for steps 1 and 2 are sketched after this list.)
3. Log in to your Penzu account and note the journalId and the id of your most recent journal entry.
4. Download my penzu-export.js file and run it. You’ll get all your posts.
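For concreteness, steps 1 and 2 boil down to a few commands. This is a minimal sketch: the Chrome path below is for macOS and port 9222 is just a common choice, so adjust both to your setup.

```bash
# Step 1: initialize an npm project and install the two dependencies.
npm init -y
npm install axios puppeteer

# Step 2: start Chrome with remote debugging enabled.
# The path is for macOS; on Linux the binary is typically google-chrome.
"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" \
  --remote-debugging-port=9222
# Note: newer Chrome versions may refuse remote debugging on the default
# profile; if so, add --user-data-dir=/tmp/chrome-debug and log in there.
```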
All the nitty-gritty details are in my gist: mirontoli/penzu-export.js.
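Before you dive into the gist, here is a minimal sketch of the core technique: attaching Puppeteer to the already-running, logged-in Chrome and recording the XHR/fetch responses that the Penzu web app itself makes. The entry URL shape, the /api/ filter, and the output file are my assumptions for illustration; the real navigation logic and timings live in penzu-export.js.

```javascript
const fs = require("fs");
const puppeteer = require("puppeteer");

// Placeholders: take these from the URL of your most recent entry (step 3).
const journalId = "YOUR_JOURNAL_ID";
const entryId = "YOUR_LATEST_ENTRY_ID";

(async () => {
  // Attach to the Chrome started with --remote-debugging-port=9222,
  // so Puppeteer rides on the session where you are already logged in.
  const browser = await puppeteer.connect({
    browserURL: "http://127.0.0.1:9222",
    defaultViewport: null,
  });
  const page = await browser.newPage();

  // Record the API responses the Penzu app makes on its own;
  // direct calls to the API are blocked, intercepted ones are not.
  page.on("response", async (response) => {
    const type = response.request().resourceType();
    if ((type === "xhr" || type === "fetch") && response.url().includes("/api/")) {
      try {
        const body = await response.json();
        fs.appendFileSync("entries.jsonl", JSON.stringify(body) + "\n");
      } catch {
        /* ignore non-JSON responses */
      }
    }
  });

  // Opening an entry triggers the app's own XHR requests.
  // The URL shape is an assumption; copy the real one from your browser.
  await page.goto(`https://penzu.com/journals/${journalId}/${entryId}`);
})();
```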
Decisions
An export of the data was not available, so I created my own.
Running Chrome with remote debugging enabled was the only way to point Puppeteer at an already-logged-in Chrome session.
Intercepting the requests was the only way to get data out of the Penzu APIs; direct calls are blocked.
Without delays between requests I bumped into throttling and got HTTP 429 errors.
It took a lot of trial and error to get the timings right and to figure out when to call what.
Some entries had identical timestamps ending in 00:00 UTC, so I needed to keep track of which posts were already saved; without that, the script could get stuck in an infinite loop between two journal entries. (A sketch of the delay and the bookkeeping follows this list.)
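As promised, here is a minimal sketch of those two safeguards: a delay between requests and a set of already-saved entry ids. The fetchEntry function and the previousEntryId field are hypothetical stand-ins for whatever the intercepted responses give you, and the 2-second delay is illustrative.

```javascript
const fs = require("fs");

// Pause between requests; without a delay Penzu answers with HTTP 429.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Remember which entries are already saved, so two entries sharing a
// 00:00 UTC timestamp cannot trap the walk in an infinite loop.
const seen = new Set();

// fetchEntry is a hypothetical function (e.g. built on the intercepted
// calls above) that returns one entry plus a pointer to the older one.
async function walkEntries(fetchEntry, startId) {
  let id = startId;
  while (id && !seen.has(id)) {
    seen.add(id);
    const entry = await fetchEntry(id);
    fs.appendFileSync("entries.jsonl", JSON.stringify(entry) + "\n");
    id = entry.previousEntryId; // hypothetical field name
    await sleep(2000); // illustrative delay to stay under the throttle
  }
}
```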
Getting my data was of course the main reason I wrote this export function, but another reason was that I wanted to learn more about Puppeteer and its great features for intercepting API calls. This is even more fun than classic web scraping.