Using Google Drive to Scan Documents
I recently discovered how useful the scan option in the Google Drive (GD) app for Android is. The scans turn out pretty decent. Better still, you can open a scanned PDF from GD in Google Docs, and the whole thing is OCR’d into text. If we can pipe all of this together into a single workflow, that would be awesome.
The goal is to be able to scan a document using the GD app, and have it appear in Podio along with a text version. Automagically.
The Setup
The first thing we’ll need for this is a Scans app in Podio. The app needs the following fields:
- Title - used for the name of the PDF
- Transcription - multi-line text field for the pretty OCR version of the document
- Plain Text - multi-line text field for a plain text version of the document
- File ID - single-line always-hidden text field to store the Google Drive file ID
In my setup, I also have a folder in GD called “Scans”. I don’t want to select a folder every time I scan from the app; I just want GlobiFlow (GF) to move the PDF into the Scans folder automatically once it’s been processed. The flow needs the ID of this Scans folder, so make a note of it for now.
To trigger the process you’ll also need Zapier, IFTTT, or another automation platform that can trigger on new Google Drive files. I chose IFTTT for this exercise, but the principle will be similar if you’re using another platform.
The Webhook
We’re going to need a webhook in GF to start the process. Create a new webhook and make a note of its URL:
Now go to IFTTT (or Zapier or whatever) and create a new applet to trigger when a new GD file is added and to do a POST to the GF webhook address:
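For reference, the POST that the automation platform sends can be sketched in Python as below. This is only an illustration under assumptions: the webhook address is a placeholder, and the field names `filename` and `url` are my guess at what you would map in the applet (the flow later reads a `url` variable and checks the filename). The request is built but not sent.

```python
import json
import urllib.request

# Placeholder address: substitute the URL GF shows for your webhook.
WEBHOOK_URL = "https://example.com/your-gf-webhook"


def build_webhook_request(filename: str, file_url: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST describing a newly added GD file."""
    # Field names are assumptions; use whatever your applet is configured to send.
    payload = {"filename": filename, "url": file_url}
    return urllib.request.Request(
        WEBHOOK_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_webhook_request(
    "Scanned_20190101-1200.pdf",                   # made-up filename
    "https://drive.google.com/open?id=EXAMPLE_ID"  # made-up link
)
# To actually fire it: urllib.request.urlopen(request)
```

Sending this by hand is also a handy way to re-test the webhook flow without scanning a new page each time.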
Now open up the GD app on your mobile device and scan something. Anything. Doesn’t matter what you scan. The point is just to get the trigger to fire and for GF to receive some payload.
Once the trigger has fired, go back to your webhook flow in GF and click on the “(refresh)” link in the trigger. You should see the result of the last event and the variables that were passed:
The rest of the webhook flow is pretty straightforward:
- Make sure the filename starts with “Scanned_”
- Create a new item in the Scans app passing the file ID
Note that the code to parse the file ID out of the URL is preg_match_gf("/id=([a-zA-Z0-9-_]+)/", [(WebHook) url], 1).
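If you want to check the filter and the pattern outside of GF first, here is a minimal Python sketch of the same two steps. The function names and the example URL are made up for illustration; only the “Scanned_” prefix and the regular expression come from the flow above.

```python
import re
from typing import Optional

# Same character set as the pattern used in the flow. The hyphen is moved to
# the end of the class so it is unambiguously a literal hyphen.
FILE_ID_PATTERN = re.compile(r"id=([a-zA-Z0-9_-]+)")


def is_scan(filename: str) -> bool:
    """Mirror the flow's filter: only files named 'Scanned_...' continue."""
    return filename.startswith("Scanned_")


def extract_file_id(url: str) -> Optional[str]:
    """Return the first capture group (the file ID), or None if no match."""
    match = FILE_ID_PATTERN.search(url)
    return match.group(1) if match else None


# Made-up URL in the shape of a Drive link carrying an id= parameter.
example_url = "https://drive.google.com/open?id=1AbC_dEf-123&usp=drivesdk"
print(extract_file_id(example_url))  # prints 1AbC_dEf-123
```

The capture stops at the first character outside the set, so trailing query parameters such as `&usp=...` are not included in the ID.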
Important: make sure that the hook event is turned on for the create action so that the subsequent flow is triggered.
The Main App Flow
In the Scans app, we now need a flow to do the heavy lifting. This flow should trigger on create (and will be triggered by the previous webhook flow):
We’ll want this flow to:
- Copy the file from GD to Podio
- Get the OCR version of the file in HTML
- Move the file to the Scans folder in GD
For extra credit we also convert the HTML version into plain text for easier parsing later on if required.
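The HTML-to-plain-text step is done inside the flow, but the idea is easy to sketch in Python with only the standard library. This is an illustration of the technique, not what the flow runs internally; the class and function names are hypothetical, and the choice of which tags end a line is my own assumption.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, starting a new line after block-level elements."""

    # Assumed set of tags that should end a line in the plain text output.
    BLOCK_TAGS = {"p", "div", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}
    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)  # turns &amp; into & for us
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1
        elif tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth:
            self._skip_depth -= 1
        elif tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def html_to_text(html: str) -> str:
    """Strip tags, collapse runs of whitespace, and drop empty lines."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    lines = (" ".join(line.split()) for line in "".join(parser.parts).splitlines())
    return "\n".join(line for line in lines if line)


sample = "<body><h1>Podio API</h1><p>Fast &amp; <b>simple</b></p></body>"
print(html_to_text(sample))
```

Inline tags such as `<b>` are simply dropped, so the text inside them stays on the same line as its surroundings.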
The flow looks like this:
To achieve the desired result, the flow uses the following ProcFu scripts:
- google_file_to_podio.pf - to copy the PDF file from GD to the Podio item
- google_drive_pdf_to_html.pf - to get the HTML version of the PDF (note that ProcFu will create a temporary Google Docs file and delete it again afterwards)
- google_drive_move.pf - to move the PDF into the Scans folder in GD
The Result
That’s all there is to it. Now it’s time to try this out.
Find a document to scan, and scan it using the GD app.
I decided to scan the Podio API fact sheet (again):
Which turned up in Podio less than a minute later:
The transcription wasn’t too bad:
And the plain text version of that was incredibly accurate:
So there you have it. Easy Scanning + OCR using Google Drive :-)