A return to hand-written notes by learning to read & write


Digital note-taking is gaining popularity, offering a durable, editable, and easily indexable way of storing notes in a vectorized form. However, a substantial gap remains between digital note-taking and traditional pen-and-paper note-taking, a practice still favored by a majority of people.

Bridging this gap by converting a note taker’s physical writing into a digital form is a process called derendering. The result is a sequence of strokes, or trajectories of a writing instrument like a pen or finger, recorded as points and stored digitally. This is also known as an “online” representation of writing, or “digital ink”.

The conversion to digital ink offers users who still prefer traditional handwritten notes access to their notes in a digital form. Instead of simply using optical character recognition (OCR), which would allow the writing to be transcribed to a text document, by capturing the handwritten documents as a collection of strokes, it’s possible to reproduce them in a form that can be edited freely by hand in a way that is more natural. It allows the user to create documents with a realistic look that captures their handwriting style, rather than simply a collection of text. This representation allows the user to later inspect, modify or complete their handwritten notes, which gives their notes enhanced durability, seamless organization and integration with other digital content (images, text, links) or digital assistance.

For these reasons, this field has gained significant interest in both academia and industry, with software solutions that digitize handwriting and hardware solutions that leverage smart pens or special paper for capture. The need for additional hardware and accompanying software stack is, however, an obstacle for wider adoption, as it creates both onboarding friction and carries additional expense for the user.

With this in mind, in “InkSight: Offline-to-Online Handwriting Conversion by Learning to Read and Write”, we propose an approach to derendering that can take a picture of a handwritten note and extract the strokes that generated the writing without the need for specialized equipment. We also remove the reliance on typical geometric constructs, where gradients, contours, and shapes in an image are utilized to extract writing strokes. Instead, we train the model to build an understanding of “reading”, so it can recognize written words, and “writing”, so it can output strokes that resemble handwriting. This results in a more robust model that performs well across diverse scenarios and appearances, including challenging lighting conditions, occlusions, etc. You can access the model and the inference code on our GitHub repo.

Leave a Reply

Your email address will not be published. Required fields are marked *