Bookscanner

From Noisebridge
The DIY Book Scanner is a project of the Noisebridge [[Digital Archivists]] group. It's based on the open-source [http://diybookscanner.myshopify.com/products/diy-book-scanner-kit DIY Book Scanner Kit] designed by Daniel Reetz.
  
The Bookscanner was discussed in our first meeting: [[Digital Archivists 2013-04-21]]

Here's the scanner first being assembled:

[[Image:diybookscanner.jpg|400px]]
  
How do we deal with the digital output from the scanner? Someone suggested [http://opencv.org/ OpenCV], the Open Source Computer Vision library.

The completed scanner, ready to scan:

[[Image:diybookscanner1.jpg|300px]]

=== Lighting ===

The original LED light was replaced with a larger LED array. To avoid glare, the array is mounted perpendicularly so the long side is parallel with the spine of the book. All of the interior surfaces of the book scanner are painted black, to avoid casting reflections on the glass. The glass was taken from two flatbed scanners and cut down to size by hand.

=== Cameras ===

We use two Canon cameras connected to a computer over USB and remote-controlled by a [https://github.com/danyq/diybookscanner Python script] that uses the gphoto2 library. The gphoto2 project maintains a [http://www.gphoto.org/proj/libgphoto2/support.php full list of supported cameras]. Pictures are transferred to the computer as soon as they are taken, rather than stored on an SD card.
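
As a rough illustration of driving two cameras over USB, here is a minimal sketch that shells out to the <code>gphoto2</code> command-line tool. It is not taken from the linked script, which may work quite differently. The helper names, the file naming scheme, and the idea that the first detected port is the "left" camera are all assumptions made for the example.

```python
import re
import subprocess

# Matches the port column printed by `gphoto2 --auto-detect`, e.g. "usb:001,005".
PORT_RE = re.compile(r"(usb:\d+,\d+)\s*$")


def parse_ports(auto_detect_output):
    """Return the USB port of every camera listed in `gphoto2 --auto-detect` output."""
    ports = []
    for line in auto_detect_output.splitlines():
        match = PORT_RE.search(line)
        if match:
            ports.append(match.group(1))
    return ports


def capture_command(port, filename):
    """Build the gphoto2 command that shoots one frame and downloads it to the computer."""
    return [
        "gphoto2",
        "--port", port,
        "--capture-image-and-download",
        "--filename", filename,
        "--force-overwrite",
    ]


def capture_pair(ports, page_number):
    """Trigger each camera once for one page spread (needs real cameras attached)."""
    # Assumption: the first detected port is the "left" camera, the second the "right".
    for side, port in zip(("left", "right"), ports):
        filename = f"{side}_{page_number:04d}.jpg"
        subprocess.run(capture_command(port, filename), check=True)


def main():
    """Detect the attached cameras and capture a single spread."""
    detected = subprocess.run(
        ["gphoto2", "--auto-detect"], capture_output=True, text=True, check=True
    ).stdout
    capture_pair(parse_ports(detected), page_number=1)
```

Calling <code>main()</code> with both cameras plugged in should leave one image per camera in the current directory. The port order reported by <code>gphoto2</code> is not guaranteed to match the physical left and right cameras, so a real setup would need to check that.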

=== Trigger ===

The cameras are triggered by a button mounted next to the handle on the scanner. The button is wired to the circuit board from a USB keyboard, so the computer sees each press as the "enter" key.
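
Because the button looks like an ordinary "enter" key press, the software side can be as simple as reading lines from standard input. The sketch below shows that idea. The function names and the convention of typing <code>q</code> to stop are assumptions for illustration, not details of the actual script.

```python
import sys


def scan_loop(lines, capture):
    """Call capture(page_number) once per Enter press; stop on 'q' or end of input."""
    page_number = 0
    for line in lines:
        if line.strip().lower() == "q":
            break
        page_number += 1
        capture(page_number)
    return page_number


def main():
    """Read button presses from the keyboard and report each one."""
    # Stand-in capture step that only prints; a real one would trigger the cameras.
    total = scan_loop(sys.stdin, lambda n: print(f"captured spread {n}"))
    print(f"{total} spreads scanned")
```

Passing the capture step in as a function keeps the loop independent of the camera code, so it can be tried out at a keyboard without the scanner attached.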

=== Software ===

Scanning is handled by a [https://github.com/danyq/diybookscanner Python script] that uses gphoto2 to connect to the cameras and displays images of the scanned pages in an HTML view. The monitoring computer needs three USB ports: one for each of the two cameras and one for the trigger. Post-processing is handled with [http://scantailor.sourceforge.net/ ScanTailor], an [https://gist.github.com/WD-42/7595388 OCR script using Tesseract], and [https://github.com/thisisparker/bookscanning various other scripts].
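
To give a feel for the HTML view, here is a small sketch that puts the two cameras' images into reading order and renders each spread as a row of two images. It is only an illustration. The function names, the page layout, and the assumption that the "left" list holds the left-hand pages are all hypothetical and not drawn from the linked script.

```python
from html import escape


def interleave(left_images, right_images):
    """Merge the two cameras' files into reading order: left, right, left, right, ..."""
    pages = []
    for left, right in zip(left_images, right_images):
        pages.extend([left, right])
    return pages


def render_preview(left_images, right_images, title="Scan preview"):
    """Return a minimal HTML page that shows each spread as a row of two images."""
    rows = []
    for left, right in zip(left_images, right_images):
        rows.append(
            '<div class="spread">'
            f'<img src="{escape(left, quote=True)}" width="45%">'
            f'<img src="{escape(right, quote=True)}" width="45%">'
            "</div>"
        )
    body = "\n".join(rows)
    return (
        "<!DOCTYPE html>\n"
        f"<html><head><title>{escape(title)}</title></head>\n"
        f"<body>\n{body}\n</body></html>\n"
    )
```

Writing the returned string to a file and opening it in a browser gives a quick visual check that no spread was skipped or captured twice before the images go on to post-processing.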

Latest revision as of 21:34, 19 December 2013
