My prior collaborator, Lee Stearns, presented his dissertation talk today.

HandSight: A Touch-Based Wearable System to Increase Information Accessibility for People with Visual Impairments

Please refer to http://www.leestearns.com for more details. A partial outline is summarized below:


Related Work

  • OrCam, Access Lens, OmniTouch, VizLens (UIST 2016), ForeSee, Google Glass, eSight, NuEyes, IrisVision

Reading / Exploring Text

  • Advantages of Touch-Based Reading
    • Does not require framing an overhead camera
    • Allows direct access to spatial information
    • Provides better control over pace and rereading
  • New Challenges
    • How to precisely trace a line of text? (a sketch follows this list)
    • How to support physical navigation?
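
To make the line-tracing challenge concrete, here is a minimal sketch of one common approach: continuous audio feedback proportional to how far the fingertip drifts from the detected text baseline. This is my own illustration in Python; the function and variable names are hypothetical, not HandSight's actual code.

```python
# Map vertical drift from the detected baseline of the current text line to an
# audio pitch cue: on the line -> base pitch, drifting up -> higher pitch,
# drifting down -> lower pitch. Names and constants are illustrative only.

def drift_to_pitch(finger_y: float, baseline_y: float,
                   line_height: float, base_hz: float = 440.0) -> float:
    """Return a feedback frequency for the current fingertip position."""
    drift = (baseline_y - finger_y) / line_height   # positive = finger above the line (image coords)
    drift = max(-1.0, min(1.0, drift))              # clamp to at most one line of drift
    return base_hz * (2.0 ** (drift * 0.5))         # at most half an octave up or down

if __name__ == "__main__":
    # Fingertip 8 px above the baseline of a 30 px tall line -> slightly raised pitch.
    print(round(drift_to_pitch(finger_y=112, baseline_y=120, line_height=30), 1))
```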

HandSight HoloLens version 

  • Augment rather than replace
  • Camera resolution too low
  • Turning head to look at desired content was uncomfortable
  • Voice commands are cumbersome and imprecise
  • Fixed 2D
    • Screen billboards
  • Fixed 3D
    • Vertical and horizontal mode
  • Finger tracking design
    • Follow where the finger is pointing in 3D (a geometric sketch follows this list)
  • Three participants
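
Geometrically, the "follow the finger" design comes down to casting a ray from the tracked fingertip and anchoring content where the ray meets the document. A rough sketch, with made-up coordinates and names (not the prototype's HoloLens code):

```python
# Ray-plane intersection: cast a ray from the fingertip along its pointing
# direction and return the 3D point on the document plane where magnified
# content would be anchored. All values below are illustrative.
import numpy as np

def finger_ray_hit(finger_pos, finger_dir, plane_point, plane_normal):
    """Return the hit point on the plane, or None if the ray misses it."""
    finger_dir = finger_dir / np.linalg.norm(finger_dir)
    denom = float(np.dot(plane_normal, finger_dir))
    if abs(denom) < 1e-6:
        return None                                   # pointing parallel to the plane
    t = float(np.dot(plane_normal, plane_point - finger_pos)) / denom
    return finger_pos + t * finger_dir if t > 0 else None   # None: plane behind the finger

if __name__ == "__main__":
    hit = finger_ray_hit(np.array([0.0, 1.2, 0.0]),    # fingertip (metres, world frame)
                         np.array([0.0, -0.5, 1.0]),   # pointing down and forward
                         np.array([0.0, 0.7, 0.0]),    # a point on the desk surface
                         np.array([0.0, 1.0, 0.0]))    # desk plane normal (straight up)
    print(hit)                                         # approximately [0, 0.7, 1.0]
```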

Findings

  • Finger-worn camera
    • [+] Flexible, allows hands-free use
    • [-] Requires moving finger to read
  • HoloLens for low-vision
    • Low contrast due to transparency
    • Narrow field of view, limited to the center of vision

Handheld Camera / Mobile Phone

  • 6 low vision participants
  • Participants were more successful and more positive about their experience
  • [+] Better camera
  • [+] More usable interactions
  • [-] No longer hands-free

Conclusion / Strengths and Weaknesses of 3D AR

  • [+] Enables new interactions not possible with other approaches
  • [+] Good for multitasking

Design space exploration: AR magnification & enhancement

Implementation and evaluation

On-body Input using finger-worn sensors

  • Preprocessing
  • Coarse-Grained Classification
    • Textures
    • Feature Extraction
    • Localization (SVM), sketched after this list
    • Accelerometer
    • Gyroscope
    • Magnetometer
  • Fine-Grained Classification
  • Geometric Verification
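
To make the coarse-grained localization step concrete: per-touch texture features from the finger-worn camera are combined with accelerometer, gyroscope, and magnetometer readings and classified with an SVM. The sketch below is my own; the feature dimensions, class labels, and scikit-learn pipeline are assumptions, not the dissertation's implementation.

```python
# Coarse-grained on-body localization, sketched with scikit-learn: concatenate a
# texture descriptor from the finger-worn camera with the 9 IMU values and train
# an SVM over the touched locations. Toy random data stands in for real features.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def touch_features(texture_vec, accel, gyro, mag):
    """One feature vector per touch: texture descriptor + accel + gyro + mag."""
    return np.concatenate([texture_vec, accel, gyro, mag])

rng = np.random.default_rng(0)
X = np.stack([touch_features(rng.normal(size=64), rng.normal(size=3),
                             rng.normal(size=3), rng.normal(size=3))
              for _ in range(100)])          # 100 toy touches
y = rng.integers(0, 3, size=100)             # 3 coarse locations (e.g., palm/wrist/thigh)

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X, y)
print(clf.predict(X[:5]))                    # predicted locations for five touches
```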

Within-person classification Experiment

  • Coarse-Grained: 99.1%, 88.0%, 96.4%

Interface Designs

  • Real-time processing (~60 fps)
  • Five applications: clock, health & activities, daily summary, notifications, and location-specific gestures on the body (a dispatch sketch follows this list)
  • 12 visually impaired participants
    • 5 participants preferred location-independent gestures
    • 6 participants preferred location-specific gestures on the palm
    • 1 participant preferred location-specific gestures on the body
  • Mitigate camera framing issues
  • Demonstrated feasibility of the approach with high accuracy
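
As an illustration of how recognized (location, gesture) pairs might drive the demo applications, here is a small dispatch-table sketch. The specific bindings below are invented for illustration, not the study's actual mappings.

```python
# Illustrative dispatch from a recognized on-body (location, gesture) pair to one
# of the demo applications (clock, health & activities, daily summary,
# notifications). The bindings here are made up.
COMMANDS = {
    ("palm", "tap"):          "speak_time",            # clock
    ("palm", "swipe_up"):     "read_daily_summary",
    ("wrist", "tap"):         "read_notifications",
    ("thigh", "swipe_down"):  "read_health_activity",
}

def dispatch(location: str, gesture: str) -> str:
    """Return the command for a recognized gesture, or a fallback prompt."""
    return COMMANDS.get((location, gesture), "announce_available_gestures")

print(dispatch("palm", "tap"))          # -> speak_time
print(dispatch("wrist", "swipe_up"))    # -> announce_available_gestures
```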

Color Recognition

  • Limitations: cannot recognize patterns, only color
  • Do not allow users to quickly inspect multiple locations
  • Accuracy affected by ambient lighting and distance.
  • Both colors and visual patterns
  • Deep convolutional activation features (DeCAF)
  • Dense SIFT features combined in an Improved Fisher Vector (IFV)
  • Highly controlled dataset – risks overfitting, limits robustness
  • Solid, striped, checkered, dotted, zigzag, textured; rotations (30° increments), scales (1-4)
  • An End-to-End Deep Learning Approach
  • Fine-tuning the classifier with ~half of the HandSight images (N=36 per class) increases the accuracy to 96.5% (a fine-tuning sketch follows this list)
  • Identify multiple colors in a single image
  • User-configurable level of detail
  • Two datasets of fabric pattern images
    • 529 images from HandSight
    • 77,052 external images
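
The fine-tuning step above can be sketched as the usual transfer-learning recipe: take an ImageNet-pretrained CNN, replace its head with the six pattern classes, and retrain. The backbone (ResNet-18), optimizer, and hyperparameters below are my assumptions; the dissertation's exact network and training setup may differ.

```python
# Transfer-learning sketch for the six fabric pattern classes
# (solid, striped, checkered, dotted, zigzag, textured).
import torch
import torch.nn as nn
from torchvision import models

NUM_PATTERNS = 6
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, NUM_PATTERNS)   # new classification head

# Freeze the pretrained backbone; train only the new head to start.
for name, p in model.named_parameters():
    p.requires_grad = name.startswith("fc.")

optimizer = torch.optim.Adam((p for p in model.parameters() if p.requires_grad), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on a dummy batch of eight 224x224 RGB crops.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, NUM_PATTERNS, (8,))
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
print(float(loss))
```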

Future Work

  • Alternative or supplementary camera locations
  • Camera on the user’s finger or wrist
  • Camera on the user's upper body
  • Wider field of view provides more contextual information
  • Easier to localize and track hand/finger position
  • Spatial exploration of documents and other surfaces
  • Maps, charts, and graphs are hard to explain (a very interesting deep learning topic)
  • Translating to other languages