Girish Mahajan (Editor)

Captricity

Operating system: All

Type: Optical character recognition (OCR), ICR, handwriting recognition, redaction

Captricity is a data capture software program (and the company that sells it) that uses a combination of machine learning and human verification to perform OCR data capture from hand-filled forms.

Background

Captricity was incubated in Code for America's incubator program and is used by government agencies, health clinics, global health practitioners, and researchers such as NYU's Center for Technology and Economic Development.

Captricity was founded in 2011 by Kuang Chen and former Harvey Danger musician Jeff J. Lin. The idea for Captricity came from Chen’s PhD dissertation at UC Berkeley. His research focused on data-centric approaches to increase the efficiency of low-resource organizations, so they could better serve disadvantaged clients.

Company

Captricity is currently headquartered in downtown Oakland, CA, and according to its LinkedIn profile, it has 51-200 employees.

Technology

Captricity relies on crowdsourcing, parceling out OCR verification tasks to human operators. The company claims that its technology achieves 99.9% accuracy. Its machine learning components combine OCR, intelligent character recognition (ICR), and optical mark recognition (OMR).

Captricity captures handwritten information from forms. The data then populates searchable spreadsheets (for example, a .csv file that can be opened in Excel). Captricity does not support unstructured data.

Privacy

To maintain the privacy of the information in the forms, each form is “shredded” into distinct fields, and each field is verified by one or more different people. Captricity claims that because no one person can see more than one field from a document, privacy is maintained. Captricity uses Amazon Mechanical Turk to perform this human verification step. For example, a worker may see a stream of 4-digit numbers without knowing that they are the last four digits of a set of US Social Security numbers.
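
The shredding idea can be illustrated with a minimal Python sketch. It is not Captricity's implementation: the function names `shred` and `assign`, the form layout, and the worker identifiers are all hypothetical, and the only rule it enforces is the one stated above, that no single worker receives more than one field from the same form.

```python
import random
from collections import defaultdict


def shred(forms):
    """Split each form (field name -> value) into independent field tasks."""
    return [
        {"form_id": form_id, "field": name, "value": value}
        for form_id, fields in forms.items()
        for name, value in fields.items()
    ]


def assign(tasks, workers, seed=0):
    """Assign each task to a worker who has not yet seen a field of that form."""
    rng = random.Random(seed)
    seen = defaultdict(set)  # worker -> ids of forms the worker already saw
    assignment = []
    for task in tasks:
        eligible = [w for w in workers if task["form_id"] not in seen[w]]
        if not eligible:
            raise ValueError("not enough workers to keep a form's fields apart")
        worker = rng.choice(eligible)
        seen[worker].add(task["form_id"])
        assignment.append((worker, task))
    return assignment


# Hypothetical example data: two forms with two fields each.
forms = {
    "form-1": {"name": "A. Example", "ssn_last4": "1234"},
    "form-2": {"name": "B. Example", "ssn_last4": "5678"},
}
workers = ["w1", "w2", "w3"]
tasks = shred(forms)
assignment = assign(tasks, workers)
```

In this sketch a worker only ever needs the `value` of a task; the `form_id` is kept on the coordinating side so that verified values can be reassembled into a record afterwards.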

Data redaction

Captricity performs redaction in addition to OCR. Redaction is a service in which any field or collection of fields can be “blacked out” in the document template. Any information contained in those fields will not be read by the system. For example, if a courthouse wants to release its records to the public but keep the arresting officer’s name private, the field containing this information can be redacted.
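
As a rough illustration of template-level redaction, the Python sketch below skips any field that the template marks as redacted. The template layout, the `redacted` flag, and the `capture` function are assumptions made for this example and do not describe Captricity's actual template format or API.

```python
def capture(form, template):
    """Return only the fields that the template defines and does not redact."""
    return {
        name: value
        for name, value in form.items()
        if name in template and not template[name]["redacted"]
    }


# Hypothetical courthouse template: the officer's name is blacked out.
template = {
    "case_number": {"redacted": False},
    "arresting_officer": {"redacted": True},
    "charge": {"redacted": False},
}

# Hypothetical record with placeholder values.
record = {
    "case_number": "0000",
    "arresting_officer": "J. Doe",
    "charge": "example charge",
}

released = capture(record, template)
```

Because the redacted field is dropped before any value is read, its contents never reach the output or a human verifier in this sketch.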

Captricity and Non-profits

Non-profit and academic researchers often use surveys for monitoring and evaluation of their programs or projects. The Center for Effective Global Action (CEGA), which is affiliated with UC Berkeley, announced a partnership with Captricity in August 2012. Captricity donates digitization services to non-profits via its Data for Communities program and offers discounts to non-profit organizations such as CEGA members.
